Different Ways of Handling File Uploads on Web Application Servers

Divam Technologies
5 min read · Dec 9, 2021

Nowadays asset/data storage is a major concern because traffic grows day by day. In the early days we stored assets on the same server that hosted the site, which created both storage and request-serving problems. For example, suppose your server has 1024 MB of RAM and 10 GB of storage, and a normal request takes 1 MB of RAM to serve: 100 concurrent requests consume 100 MB of RAM. If a 5 MB asset is uploaded, that single request consumes 6 MB, so 100 concurrent upload requests spike the RAM to 600 MB and fill 500 MB of storage. The server slows down and requests soon start to queue. What would you suggest as a solution to this problem?
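A quick back-of-the-envelope check of those numbers (all figures are the illustrative ones above, not measurements):

```python
ram_per_request_mb = 1      # RAM to serve one normal request
asset_size_mb = 5           # size of one uploaded asset
concurrent_uploads = 100

ram_spike_mb = concurrent_uploads * (ram_per_request_mb + asset_size_mb)
storage_used_mb = concurrent_uploads * asset_size_mb

print(f"RAM spike: {ram_spike_mb} MB of 1024 MB")        # 600 MB
print(f"Storage filled: {storage_used_mb} MB of 10 GB")  # 500 MB
```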

Vertical scaling is one solution: increase the capacity of the existing hardware, such as the RAM or the storage. But this solution strains the budget, and it does not address durability: if the server crashes for any reason, the stored assets may be lost forever.

If we go for horizontal scaling instead, it creates another level of problem: asset-A lives on server-A while asset-B lives on server-B, so what happens when a request for asset-A lands on server-B, where the file is not stored? In short, horizontal scaling does not work with this upload approach. (Typically a proxy server is introduced as a load balancer to distribute traffic across the servers; we will discuss vertical and horizontal scaling, load balancers, and proxy servers in separate articles.) The sketch below shows the naive approach that breaks here.
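A minimal sketch of that naive approach, assuming Flask (the route and upload directory are hypothetical). Each server writes uploads to its own disk, so behind a load balancer a later read may hit a server that never received the file:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/upload", methods=["POST"])
def upload():
    f = request.files["asset"]
    f.save(f"/var/uploads/{f.filename}")  # exists only on THIS server
    return {"status": "uploaded"}
```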

To protect assets from a single-server crash under the vertical-scaling approach, and to enable horizontal scaling in our application logic, we can extract storage to dedicated servers: servers designed specifically to store and process assets, with replication and high availability. The application server, whether scaled vertically or horizontally, then never stores assets itself; it only forwards them to the storage servers.
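A minimal sketch of this forwarding step, assuming Flask and boto3 with S3 as the storage server (the bucket name is hypothetical):

```python
import boto3
from flask import Flask, request

app = Flask(__name__)
s3 = boto3.client("s3")  # reads AWS credentials from the environment

@app.route("/upload", methods=["POST"])
def upload():
    f = request.files["asset"]
    # The file still flows through this server's RAM/temp storage on
    # its way to S3; that is exactly the caveat discussed next.
    s3.upload_fileobj(f, "example-assets-bucket", f.filename)
    return {"status": "stored", "key": f.filename}
```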

This approach still has a caveat: the upload request passes through the application server, where the asset consumes temporary storage and RAM. The core problem remains. We have protected our assets from server crashes, but 100 upload requests still spike the server's RAM to 600 MB. Why not send upload requests directly to the storage server instead of routing them through the application server? Sounds good, doesn't it?

Now we have two challenges:

  1. Security: now anybody can upload to our storage server. We could require basic auth credentials for uploads, but sharing those credentials with the front end is itself a security breach, and we would still lack validation of the assets, access control over user privileges, etc. We can solve this with an ephemeral signed request that is only capable of uploading an asset to the storage server; the storage server validates the asset's type, size, and a few other constraints before allowing the request, and no credentials ever reach the front end. This approach looks nice, doesn't it?
  2. If the upload request goes directly to the storage server, how does the application server learn that something was uploaded? Two options are available:

2.1 Immediately after the upload, the front end sends a request to the application server saying that the asset was uploaded successfully to the storage server at a given location (a sketch of this callback follows after 2.2).

2.2 When an asset is successfully uploaded to the storage server, the storage server itself informs the application server that it received the asset.
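For option 2.1, the application server only needs an endpoint the front end can call once the direct upload finishes. A minimal sketch, assuming Flask; the route and the `save_asset_record` helper are hypothetical:

```python
from flask import Flask, request

app = Flask(__name__)

def save_asset_record(bucket: str, key: str) -> None:
    """Hypothetical helper: persist the asset's location, e.g. in a DB."""
    ...

@app.route("/assets/confirm", methods=["POST"])
def confirm_upload():
    data = request.get_json()
    # e.g. {"bucket": "example-assets-bucket", "key": "avatars/u42.png"}
    save_asset_record(data["bucket"], data["key"])
    return {"status": "recorded"}
```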

Looks like everything is sorted now, doesn't it? Better still, we don't need to build this storage server ourselves or write any of that code. AWS Simple Storage Service (aka AWS S3) provides all of the features above at very low cost. Let's discuss how S3 can help us.

The user sends the asset's information (name, size, etc.) to the application server. The application server forwards this information, along with the desired public/private access level, to S3 to generate a pre-signed URL. S3 builds the URL from the given information: the bucket, the asset name, the expiration time until which the URL is valid, and the object's read access (public or private). The user then uploads the asset directly with the pre-signed URL. Public assets are accessible simply by hitting the full S3 file-path URL, while private assets are read through a signed URL that grants the user time-limited read access. Public assets suit things like profile photos and brand logos; private assets suit reports or images that should only be accessible to an authenticated user, a specific email link, or targeted users.
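A sketch of this flow using boto3's `generate_presigned_url` (the bucket name and expiry are example values):

```python
import boto3

s3 = boto3.client("s3")

def presigned_upload_url(key: str, expires_in: int = 300) -> str:
    # Lets the client PUT the object straight to S3; no AWS
    # credentials ever reach the front end.
    return s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": "example-assets-bucket", "Key": key},
        ExpiresIn=expires_in,  # seconds until the URL stops working
    )

def presigned_read_url(key: str, expires_in: int = 300) -> str:
    # Time-limited read access to a private asset.
    return s3.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": "example-assets-bucket", "Key": key},
        ExpiresIn=expires_in,
    )
```

The front end can then upload with a plain HTTP PUT to the returned URL, for example `curl -X PUT --upload-file photo.jpg "<presigned-url>"`.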

In approach 2.2, once the asset has been uploaded to S3, an AWS Lambda function (a serverless compute service designed for lightweight request processing) kicks in. Triggered by the bucket's event notification, it tells the main server that the asset was uploaded successfully. We will discuss Lambda in detail in a separate article.
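A minimal sketch of such a Lambda function, subscribed to the bucket's ObjectCreated events (the callback URL is hypothetical; `urllib` keeps it dependency-free):

```python
import json
import urllib.request

def handler(event, context):
    # S3 delivers one or more records per event notification.
    for record in event["Records"]:
        payload = {
            "bucket": record["s3"]["bucket"]["name"],
            "key": record["s3"]["object"]["key"],
            "size": record["s3"]["object"]["size"],
        }
        req = urllib.request.Request(
            "https://api.example.com/assets/uploaded",  # app-server callback
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
```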

I hope we covered all the aspects of the file uploading process. Please share your feedback:

Divam Technologies

contact@divamtech.com

https://divamtech.com

