Wednesday, May 14, 2008
Scaling an image upload service

Hi,
First of all, I want to say that this is an extremely interesting and informative website. I have enjoyed reading the various posts on how the big sites scale to meet the needs of their customers.
The service we are developing is a webcam service. The client application sends images to the server via HTTP POST, and they are saved in a folder named after the user's ID. When a new image is sent to the server, it overwrites the current image.
Users can then view the images via our web server.
Ideally we want the images to upload as quickly as possible and allow users to view them as quickly as possible.
Would I be correct to assume that when the number of uploading clients exceeds the capacity of the server, the only way to scale is to add more hardware?
I also assume that HTTP accelerator caches will not speed up viewing the images, since each new image will invalidate the cached copy.
I appreciate any input on the subject.
Reader Comments (4)
Some ideas.
Separate your upload and download servers.
This way, you will know easily which servers to scale.
Use high-performance servers such as nginx for downloading, lighttpd for uploading, etc.
See http://trac.lighttpd.net/trac/wiki/Docs:ModUploadProgress if you want to attach a progress meter. It is pretty cool for large uploads, so the user knows that something is happening and the server/browser hasn't frozen.
You will also need to tune the upload/download servers accordingly.
On the upload server, perhaps you will allow a 1 MB POST, 60 seconds to receive the request, 60 seconds of processing time, etc.
On the download server, do not allow POST and kill everything that takes more than 1 second to finish.
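To make the two sets of limits concrete, here is a minimal Python sketch. The names ServerLimits, UPLOAD_LIMITS, DOWNLOAD_LIMITS and check_request are hypothetical and only for illustration; in practice these limits would normally be set in the web server's own configuration (nginx, lighttpd) rather than in application code, and the time limit in particular would be enforced by the server itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerLimits:
    """Per-role limits (hypothetical structure, for illustration only)."""
    allowed_methods: frozenset
    max_body_bytes: int
    max_request_seconds: float  # meant to be enforced by the server, not checked here


# Upload role: accept POST bodies up to 1 MB, allow up to 60 seconds.
UPLOAD_LIMITS = ServerLimits(frozenset({"POST"}), 1 * 1024 * 1024, 60.0)

# Download role: no POST, no request body, anything over 1 second gets killed.
DOWNLOAD_LIMITS = ServerLimits(frozenset({"GET", "HEAD"}), 0, 1.0)


def check_request(limits, method, content_length):
    """Return 200 if the request is acceptable, otherwise an HTTP error status."""
    if method not in limits.allowed_methods:
        return 405  # Method Not Allowed
    if content_length > limits.max_body_bytes:
        return 413  # Payload Too Large
    return 200
```

With these values, a 2 MB POST to the upload role is rejected with 413, and any POST to the download role is rejected with 405.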
Hope this helps.
Thanks for the input, Atif.
I just want to clarify a few things regarding the points you have made.
Regarding separating the download and upload servers: do you mean running both services on the same machine, each on a different port, such as 80 for download and 8080 for upload?
If you meant two separate machines, I assume I would need a SAN or some service like MogileFS to give the download server access to the uploaded images.
I assume that using lighttpd for uploading would require PHP, correct?
The uploads are quite small so I currently don't require a progress meter, however, I have wondered if version 1.5 of lighttpd could be used in a production server.
I will certainly look at tuning the performance.
Thanks again for your help.
Hello agallagher,
Regarding separating the download and upload servers: do you mean running both services on the same machine, each on a different port, such as 80 for download and 8080 for upload?
A. I meant multiple machines for each function (10 servers for uploads, 5 for downloads, etc.).
If you meant two separate machines, I assume I would need a SAN or some service like MogileFS to give the download server access to the uploaded images.
A. Yes, a NAS would do just fine.
I assume that using lighttpd for uploading would require PHP, correct?
A. No, PHP is not required to use lighttpd. You can write your upload logic in any language you wish. Perhaps choose a server designed specifically for uploads. Or just use an FTP server.
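As one illustration of upload logic written in something other than PHP, here is a minimal Python sketch of the save step described in the question: the image goes into a folder named after the user's ID and replaces the current image. The function name save_latest_image, the file name current.jpg and the assumption that user IDs are integers are all hypothetical. Writing to a temporary file and renaming it is an addition not discussed in this thread; on a local POSIX filesystem it keeps viewers from ever reading a half-written image, but the behaviour on a NAS mount may differ and should be checked.

```python
import os
import tempfile


def save_latest_image(root, user_id, data, filename="current.jpg"):
    """Store `data` as the user's current image, replacing any previous one."""
    # Assumption: user IDs are integers. int() also rejects values such as
    # "../x" that could otherwise escape the storage root.
    user_dir = os.path.join(root, str(int(user_id)))
    os.makedirs(user_dir, exist_ok=True)

    # Write to a temporary file in the same directory, then rename it over
    # the old image so readers see either the old file or the new one.
    fd, tmp_path = tempfile.mkstemp(dir=user_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # mkstemp creates the file readable by its owner only; relax that
        # so a separate web server process can read it.
        os.chmod(tmp_path, 0o644)
        final_path = os.path.join(user_dir, filename)
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

The handler that receives the HTTP POST would call this with the request body, for example save_latest_image("/srv/webcam", 42, body), where the storage root is whatever path the shared storage is mounted at.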
I have wondered if version 1.5 of lighttpd could be used in a production server.
A. We have been using lighttpd 1.5 in production for two months and it's doing quite well. We are moving all our scattered upload functionality to this upload server.
Very interesting information. I am looking forward to implementing such a system.
Thanks again, Atif.