Google and Netflix Strategy: Use Partial Responses to Reduce Request Sizes
This strategy reduces the amount of protocol data in packets by sending only the attributes that are needed. Google calls this Partial Response and Partial Update.
Netflix posted about adopting this strategy in their recent Netflix API redesign. We've seen previously how Netflix improved performance by creating less chatty protocols.
As a consequence, packet sizes rise as more data is stuffed into each packet to reduce the number of round trips. But we don't like large packets either (memory usage and packet processing overhead), so we have to think of creative ways to shrink them back down.
The change Netflix is making is to conceptualize their API as a database. What does this mean?
A partial response is like a SQL select statement where you can specify only the fields you want back. Only the attributes of interest are requested. Previously all fields for objects were returned, even if the client didn't need them. So the goal is to reduce payload sizes by being more selective about what data is returned.
An example Google uses is:
GET /myFeed?fields=id,entry(author)
The fields parameter selects the attributes to return out of a much larger feeds resource.
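To make the field selection concrete, here is a minimal sketch of how a server might apply a fields expression to a resource before sending it. It handles only the small subset of syntax used in the example above (comma-separated names and parenthesized sub-selections). The names parse_fields and select, and the sample feed, are hypothetical and not part of Google's or Netflix's APIs.

```python
def parse_fields(spec):
    """Parse 'id,entry(author)' into {'id': {}, 'entry': {'author': {}}}."""
    tree = {}
    node, stack, name = tree, [], ""
    for ch in spec + ",":              # trailing comma flushes the last name
        if ch not in ",()":
            name += ch
            continue
        name = name.strip()
        if name:
            node.setdefault(name, {})
        if ch == "(":
            if not name:
                raise ValueError("'(' must follow a field name")
            stack.append(node)         # descend into the sub-selection
            node = node[name]
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')'")
            node = stack.pop()         # return to the enclosing selection
        name = ""
    if stack:
        raise ValueError("unbalanced '('")
    return tree


def select(resource, tree):
    """Return a copy of resource containing only the fields named in tree."""
    if isinstance(resource, list):
        return [select(item, tree) for item in resource]
    if not isinstance(resource, dict):
        return resource
    out = {}
    for key, sub in tree.items():
        if key in resource:
            value = resource[key]
            # An empty sub-selection means "return this field whole".
            out[key] = select(value, sub) if sub else value
    return out


# Hypothetical feed resource, much larger than what the client needs.
feed = {
    "id": "feed-1",
    "title": "My feed",
    "entry": [
        {"author": "ann", "body": "first post"},
        {"author": "bob", "body": "second post"},
    ],
}
partial = select(feed, parse_fields("id,entry(author)"))
```

Here partial keeps the feed id and each entry's author, and drops the title and the entry bodies.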
A partial update works similarly: you send only the fields you want to update.
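As a rough illustration, a partial update can be thought of as merging a small set of changed fields into the stored resource and leaving everything else alone. The sketch below assumes resources are plain nested dictionaries; apply_partial_update and the sample account are hypothetical names, and a real API would define its own rules for things like deleting fields.

```python
def apply_partial_update(resource, changes):
    """Return a new resource with only the fields in changes replaced."""
    updated = dict(resource)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(updated.get(key), dict):
            # Merge nested objects so untouched sub-fields survive.
            updated[key] = apply_partial_update(updated[key], value)
        else:
            updated[key] = value
    return updated


# Hypothetical stored account; the client sends only the one field it changed.
account = {
    "id": 7,
    "name": "Ann",
    "preferences": {"theme": "light", "language": "en"},
}
changed = apply_partial_update(account, {"preferences": {"theme": "dark"}})
```

Only preferences.theme crosses the wire; name and preferences.language are preserved on the server side.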
Synchronization Issues?
Clearly this will reduce the number of attributes in flight, but are there any problems with this strategy? A major problem I've experienced when using partial data is synchronization.
Thinking of an API as a database moves the problem into the general class of keeping two distributed models in sync when both sides are changing and the connection between them is unreliable. Now each model receives only partial data. If any data is lost or retransmitted between requests, resources get out of sync, which can lead to integrity problems.
A user account model on the client side, for example, could ask for the user's preferences just once. Those preferences could then change via another client or via some backend system, and the first client would never pick up the changes. During that interval the client could be making a lot of bad choices. Then the user could make a change to those preferences on the client and overwrite any updates that have happened since the last request. If you are dealing with alarms and alerts this all gets a lot worse. With a state synchronization model where the entire resource is returned, the window for these errors is much smaller, as updated preferences are always being returned.
This brings up an entire rat's nest of workarounds, like doing a complete resource sync before writes, but the system gets a lot more complicated. Another direction to go is to drop responses completely. In a database sync model there is really no need for direct responses anymore. What's needed is for all aggregated changes to sync back to clients so the client can bring the client-side model back in sync with the Netflix model.
Just some things to think about if you are considering this approach.
Reader Comments (3)
I don't see how this changes how you would develop your application WRT synchronization in the face of multiple concurrent processes modifying shared state. All that matters is that the window exists. Who would want to develop an application that relies on the window being small?
Also there is a typo in the header for synchronization issues.
In one window you can overwrite an old record with a new record that is at least presumably consistent. And a conflict can be resolved with a CAS-type check. When exchanging partial data the record can be inconsistent, so even if you do have a last-write-wins approach the data will be frankensteinish. It can help to have a merge protocol when differences are detected. In some cases different attributes have different masters. An age attribute might be mastered by the server side, but a current alarm might be mastered by the component closest to the device. Full state syncs make it easier to keep the data consistent.
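One way to picture a merge protocol with per-attribute masters is a table that says which side owns each attribute, consulted whenever the two copies differ. This is only a sketch under that assumption; the MASTERS table, the merge function, and the sample records are hypothetical.

```python
# Hypothetical ownership table: which side is authoritative per attribute.
MASTERS = {"age": "server", "current_alarm": "device"}


def merge(server_copy, device_copy, masters, default_owner="server"):
    """Combine two copies of a record, taking each attribute from its master."""
    merged = {}
    for key in server_copy.keys() | device_copy.keys():
        owner = masters.get(key, default_owner)
        if owner == "server":
            preferred, fallback = server_copy, device_copy
        else:
            preferred, fallback = device_copy, server_copy
        # Use the master's value; fall back if the master has no value at all.
        merged[key] = preferred[key] if key in preferred else fallback[key]
    return merged


server_copy = {"name": "pump-3", "age": 41, "current_alarm": None}
device_copy = {"age": 40, "current_alarm": "overheat"}
merged = merge(server_copy, device_copy, MASTERS)
```

In this example the server's age wins and the device's current alarm wins, so neither side's stale attribute leaks into the merged record.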
Optimistic concurrency control can be used for partial updates. The server should provide a distinct generation counter for each entity, which must not change between the read and the subsequent write operation from the same client. The generation counter should be returned to the client when it reads entity fields. The client should send the generation counter back to the server when performing a partial update. The server then compares the current generation counter for the entity with the generation counter obtained from the client. If they match, then the entity wasn't changed between the read and the update from the client, so it is safe to update fields in it. If the generation counter on the server doesn't match the generation counter received from the client, then the update cannot proceed. In this case the client should repeat the read-update cycle for the entity. The generation counter for the entity is incremented by the server after each successful update.
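The scheme described in this comment can be sketched with a small in-memory store. EntityStore, StaleGeneration, and the method names are hypothetical, and a real server would need to make the compare-and-increment step atomic (for example inside a database transaction), which this single-threaded sketch does not attempt.

```python
class StaleGeneration(Exception):
    """Raised when the client's generation counter is out of date."""


class EntityStore:
    def __init__(self):
        self._entities = {}  # entity_id -> (generation, fields)

    def create(self, entity_id, fields):
        self._entities[entity_id] = (0, dict(fields))

    def read(self, entity_id, names):
        """Partial read: return the generation plus only the requested fields."""
        generation, fields = self._entities[entity_id]
        return generation, {n: fields[n] for n in names if n in fields}

    def partial_update(self, entity_id, expected_generation, changes):
        """Apply changes only if the entity is unchanged since the client's read."""
        generation, fields = self._entities[entity_id]
        if generation != expected_generation:
            raise StaleGeneration(
                f"expected generation {expected_generation}, found {generation}"
            )
        self._entities[entity_id] = (generation + 1, {**fields, **changes})
        return generation + 1


store = EntityStore()
store.create("user-1", {"theme": "light", "language": "en"})

# Two clients read different fields of the same entity at generation 0.
gen_a, _ = store.read("user-1", ["theme"])
gen_b, _ = store.read("user-1", ["language"])

store.partial_update("user-1", gen_a, {"theme": "dark"})    # succeeds
try:
    store.partial_update("user-1", gen_b, {"language": "fr"})
    second_write_rejected = False
except StaleGeneration:
    second_write_rejected = True
    # Client B repeats the read-update cycle with a fresh generation.
    gen_b, _ = store.read("user-1", ["language"])
    store.partial_update("user-1", gen_b, {"language": "fr"})
```

Note that the second write is rejected even though the two clients touched different fields, because the counter is per entity rather than per field. That is the trade-off in this design: it is simple and safe, at the cost of some unnecessary retries.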