Tuesday, Feb 26, 2008

Architecture to Allow High Availability File Upload

Hi,

I was wondering if anyone has found any information on how to architect a system to support high availability file uploads. My scenario: I have an Apache server proxying requests to a bunch of Tomcat Java application servers. When I need to upgrade my site, I stop and upgrade each of the Tomcat servers one at a time. This seems to work well as Apache automatically routes subsequent requests for the stopped app server to the remaining app servers that are up. The problem is that if a user is uploading a file when the app server is stopped, the upload fails and the user has to upload the file again. This is problematic as uploading files is an integral feature of the site and it's frustrating for the users to have to restart their uploads every time I upgrade the site (which I want to be able to do frequently).

Has anyone seen any information on how this can be done, or does anyone have ideas on how it could be architected? I imagine sites like Flickr must have a solution to this problem, as I have seen presentations in which they say they are able to upgrade their site several times a day without users noticing.

Thanks!
Tuyen

Reader Comments (7)

Tuyen,

Separate the file uploads to a different server that does only that.
Use lighttpd which has a file progress meter module.

December 31, 1999 | Unregistered Commenter atif.ghaffar

Typically during an upgrade you shed work to other servers, wait for a server to go idle, and then kill it. A quick search on Apache didn't tell me if this is possible or not. If not, you can track currently outstanding sessions for a server on your own and wait for them to finish before upgrading.
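As a rough illustration of tracking outstanding requests yourself, here is a minimal Python sketch. It assumes every request handler can be wrapped in a counter; `InFlightTracker` and `handle_upload` are hypothetical names, not part of Apache or Tomcat.

```python
import threading
import time


class InFlightTracker:
    """Counts requests currently being handled by one app server."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    def __enter__(self):
        # A request has started.
        with self._cond:
            self._count += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        # A request has finished (successfully or not).
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()
        return False

    def wait_until_idle(self, timeout=None):
        """Block until no requests are in flight; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout)


tracker = InFlightTracker()


def handle_upload(seconds):
    # Hypothetical request handler: the sleep stands in for a slow upload.
    with tracker:
        time.sleep(seconds)
```

An upgrade script would first stop routing new requests to the server, then call `tracker.wait_until_idle(timeout=...)` and shut the server down only if it returns `True`.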

December 31, 1999 | Unregistered Commenter Todd Hoff

Thank you for your reply. I was thinking about separating out the file upload but I don't think it works in my instance because the file upload is part of a form submission that includes many other fields that need processing. I don't think it's possible to divert just the file upload to another server, right? I could set one server to handle all form submissions that have an upload field but then I have the same problem of trying to update that instance. Am I missing something?

Regards,
Tuyen

December 31, 1999 | Unregistered Commenter tuyen

Todd,

Great site by the way. This is by far the best site I have found on building scalable, high availability websites and on my daily must-read list. Kudos!

I think your suggestion will work. I connect Apache to Tomcat using the JK connector. The JK connector has a web interface that shows which servers have busy connections. I just learned that more recent versions of the connector also have an Ant library for checking the status of a server and for disabling and enabling it. Using the Ant library, I believe it will be possible to automate the following:
1. Remove the app server from the list of active servers that can process new requests
2. Monitor the connector until all busy requests for the app server have completed
3. Stop the app server, upgrade it, and then restart it
4. Enable the app server so it can process new requests again
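
The four steps above could be sketched roughly as follows in Python. This is only an outline under assumptions: `balancer` is a hypothetical object exposing `disable()`, `busy_count()`, and `enable()`, and `upgrade` is a hypothetical callable that stops, upgrades, and restarts the server. None of these are the JK connector's actual Ant tasks; `FakeBalancer` exists only to exercise the function.

```python
import time


def rolling_upgrade(worker, balancer, upgrade, poll_interval=1.0, timeout=600.0):
    """Drain one app server, upgrade it, and put it back in rotation."""
    balancer.disable(worker)                    # step 1: stop new requests
    deadline = time.monotonic() + timeout
    while balancer.busy_count(worker) > 0:      # step 2: wait for busy requests
        if time.monotonic() >= deadline:
            balancer.enable(worker)             # give up and restore service
            raise TimeoutError(f"{worker} still busy after {timeout}s")
        time.sleep(poll_interval)
    upgrade(worker)                             # step 3: stop, upgrade, restart
    balancer.enable(worker)                     # step 4: accept requests again


class FakeBalancer:
    """In-memory stand-in for the connector, for illustration only."""

    def __init__(self, busy):
        self.busy = dict(busy)
        self.log = []

    def disable(self, worker):
        self.log.append(("disable", worker))

    def busy_count(self, worker):
        n = self.busy[worker]
        self.busy[worker] = max(0, n - 1)       # pretend one request ends per poll
        return n

    def enable(self, worker):
        self.log.append(("enable", worker))
```

If `upgrade` raises, the server is deliberately left disabled so that a half-upgraded instance does not receive traffic.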

Thanks again for the help!
Tuyen

December 31, 1999 | Unregistered Commenter tuyen

Tuyen,

We do it with forms that have many fields.
You have to design an upload widget. When the widget is clicked, it opens a window to your upload server, where the person chooses a file and uploads it. When the upload is finished, the upload window goes to a callback URL (provided when calling the widget). At this URL, the window passes the name of the uploaded file to the opening window and closes itself.

The file resource is provided in a format that the application servers can understand.
It could be something like

resource:/data/shared/files/yyyy/mm/dd/tmpfilename
or it can be a url such as

http://uploadserver.com/files/yyyy/mm/dd/tmpfilename
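
To make the two reference formats concrete, here is a small Python sketch of how an app server might tell them apart. The function name `classify_upload_reference` and the `"shared"`/`"remote"` labels are made up for illustration.

```python
from urllib.parse import urlparse


def classify_upload_reference(ref):
    """Return ("shared", path) or ("remote", url) for an upload reference."""
    parsed = urlparse(ref)
    if parsed.scheme == "resource":
        # File already sits on storage shared with the upload server.
        return ("shared", parsed.path)
    if parsed.scheme in ("http", "https"):
        # File must be fetched from the upload server.
        return ("remote", ref)
    raise ValueError(f"unsupported upload reference: {ref!r}")
```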

The user then fills out the rest of the form and submits.

The app server gets the form data, including the file reference, and can decide whether to fetch the file or, if using shared storage, just move it to another directory.

For example, if you want to upload a picture to your profile, the upload server will return something like
resource:/data/shared/files/yyyy/mm/dd/tmpfilename
which is on storage shared between the app servers and the upload server. The app server may then do
mv /data/shared/files/yyyy/mm/dd/tmpfilename /data/shared/profiles/user_id/me.jpg
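
On the app server side, the move could look something like this Python sketch. It assumes shared storage mounted on both machines; `claim_upload` and its parameters are hypothetical names. Because the file reference comes back through the browser, the sketch checks that the path really is inside the upload area before moving anything.

```python
import shutil
from pathlib import Path


def claim_upload(tmp_path, upload_root, profile_root, user_id, final_name):
    """Move an uploaded temp file from shared storage into a user's profile directory."""
    root = Path(upload_root).resolve()
    src = Path(tmp_path).resolve()
    # Refuse anything outside the upload area (e.g. "../" tricks).
    if root not in src.parents:
        raise ValueError(f"{tmp_path} is not under {upload_root}")
    if not src.is_file():
        raise FileNotFoundError(src)
    dest_dir = Path(profile_root) / str(user_id)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / Path(final_name).name     # drop any directory parts
    shutil.move(str(src), str(dest))
    return dest
```

A call such as `claim_upload("/data/shared/files/yyyy/mm/dd/tmpfilename", "/data/shared/files", "/data/shared/profiles", user_id, "me.jpg")` would correspond to the `mv` shown above.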

Let me know if this technique works for you.

December 31, 1999 | Unregistered Commenter atif.ghaffar

Thank you for your help. I will try it and let you know how it works.

Regards,
Tuyen

December 31, 1999 | Unregistered Commenter tuyen

Tuyen,

Let me know if you need some code examples.

December 31, 1999 | Unregistered Commenter atif.ghaffar
