This is a guest post by Marcel Panse and Sander Nagtegaal from Teletext.io.
In our early Peecho days, we wrote an article explaining how to build a really scalable architecture for next to nothing, using Amazon Web Services. Auto-scaling, merciless decoupling and even automated bidding on unused server capacity were the tricks we used back then to operate on a shoestring. Now, it is time to take it one step further.
We would like to introduce Teletext.io, also known as the serverless start-up - again, entirely built around AWS, but leveraging only the Amazon API Gateway, Lambda functions, DynamoDb, S3 and Cloudfront.
We like rules. At our previous start-up Peecho, product owners had to do fifty push-ups as payment for each user story that they wanted to add to an ongoing sprint. Now, at our current company myTomorrows, our developer dance-offs are legendary: during the daily stand-ups, you are only allowed to speak while dancing - leading to the most efficient meetings ever.
This way of thinking goes all the way into our product development. It may seem counter-intuitive at first, but constraints fuel creativity. For example, all our logo design is done with technical diagramming tool Omnigraffle, so there is no way we could use hideous lens flares and such. Anyway - recently, we launched yet another initiative called Teletext.io. So, we needed a new restriction.
At Teletext.io, we are not allowed to use servers. Not even one.
It was a good choice. We will explain why.
In the past years, the Amazon cloud already made us very happy with the ability to use auto-scaled clustering of EC2 server instances with a load balancer on top. This means that if one of your servers goes down, a new one can be initiated automatically. The same happens when unexpectedly high peaks of traffic occur. Extra servers will then be initiated.
Although this is pretty cool, there are disadvantages.
In an architecture based on an API with micro-services, this means a lot of overhead for small tasks. Luckily, there are now options to solve this. Let's first take a look at the problem at hand.
Our brand new solution was born out of personal frustration about content management within custom software. In most start-ups, HTML texts inside buttons, dashboards, help sections or even entire web pages have to be managed and deployed by programmers instead of the writers of the texts. This is really annoying for both developers and editors.
For years, we tried to find a distributed content management service that would solve this issue properly. We failed. So, a couple of months ago, we were fed up and decided that we would just have to build something ourselves.
We planned to create a really simple, Javascript-based service that could inject centrally hosted content, using the common data attributes of HTML for labeling of elements. It should be able to handle localization and dynamic insertion of data. Most importantly, it should give content writers a way to just change texts whenever they want, using an inline WYSIWYG editor in their own app.
Apart from some UX caveats, there are three technical challenges here.
The third issue is not related to architecture per se. The trusty search engine moves in mysterious ways, so all we could do was to test the assumption. Spoiler: yes, it works. The solution to the first two problems is in your own hands, though. These can be solved at low cost - with a clever architecture.
Let's dive into the building blocks of our system, which are based on several of the latest features of Amazon Web Services.
Amazon API Gateway is a managed AWS service that allows developers to create APIs at any scale, with just a few clicks in the AWS Management Console. The service can trigger other AWS services, including Lambda functions.
Instead of running cloud instances, we use AWS Lambda. The name derives from the Greek letter lambda (λ) used to denote binding a variable in a function. AWS Lambda lets you run code without maintaining any server instances. You may think of an atomic, stateless function with a single task, that may run for a limited amount of time (one minute, currently). The functions can be written in Javascript (Node.js), Python or Java.
If you upload your Lambda code, Amazon will take care of everything required to run and scale your code with high availability. Lambda executes in parallel. So, if a million requests are made, a million Lambda functions will execute without loss of speed or capacity. According to Amazon, "there are no fundamental limits to scaling a function".
The best thing is, that from a developer's perspective, Lambda functions are not even there when they don't get executed. They only appear when needed. And what isn't up, can't come down.
The Lambda functions put their data in a data store. Because our rule says that we are not allowed to use any servers, we can't use the Relational Database Service (RDS). Instead, we use DynamoDB, Amazon's giant managed data store.
Amazon DynamoDB is a NoSQL database service for all applications at any scale. It is fully managed and supports both document and key-value store models. Scalability of the service has been proven with other users like Netflix, AirBnB and IMDb.
Our system is decoupled into three parts:
This separation of concerns is by design. A problem in either the content management API or our own website should never lead to an outage of the customer's websites. Therefore, the delivery of content should be entirely autonomous.
The content management part is what editors use to edit HTML and texts. Editors can log in using their Google account, via an AWS Cognito identity pool connected to AWS IAM using Javascript only. For loading draft content, storing edits and publishing the result, the Amazon API Gateway gets called, triggering Lambda functions. The Lambda functions all relate to a single API call and store their data in DynamoDb.
As you can see, there are no servers that could crash or get stuck.
As mentioned, we decided to decouple the delivery of content entirely from the editing features, so your app will work even if disaster strikes. When an editor decides to publish new content to the live version of his app, another Lambda immediately copies the draft content as flat JSON files to S3, Amazon's data store for files. The JSON file contains meta data and i18n localized HTML strings that describe the content.
From there, the Teletext.io script in your app can access these files through the Cloudfront CDN, ensuring high availability and performance. We added a clever prefetching algorithm to ensure the most popular files get retrieved and cached in your browser before you need them, so the next clicks are provisioned without actual load of content.
Since there is no server-side logic involved in the delivery of published content, it is really fast and practically bullet proof.
What about our website, though? We chose a simple, but effective concept - again, without servers. The website was built using the React framework as a single-page app and deployed to S3 as a single, static file. Then, Cloudfront was configured as the content delivery mechanism on top, ensuring super-fast content delivery from many endpoints around the world.
Again, this approach is based on flat file delivery and therefore very robust.
The static app uses HTML5 pushState and React Router for URL handling. Usually, there is one problem with that. If you access a specific URL other than the root, a web server must dynamically render the same routes that the front end is rendering dynamically. This is currently impossible in S3. We found a trick, though, that we would like to share here.
The result is that all URL paths (except for the root) lead to a 404 response in S3, which then triggers the cached custom error response of Cloudfront. This final response is just your single-page app again. Now, in the browser, you can handle all routing based on the current path.
There is only one disadvantage. You can't return an actual 404 HTTP response code in any case. However, in return, you will get an ultra-cheap, ultra-scalable single-page app.
Working with Lambda has impact on your development process. Support is improving, though. For example, previously, Lambda functions couldn't be versioned. This had the effect that testing and deploying was very risky. However, Amazon recently rolled out their versioning system and now we can use mutable aliasing. This means that it is now possible to have different versions of the same function that can be updated independently, similar to a test environment versus a production environment.
Our freemium service is now open for customers. We eat our own dog food, so we use it, too. In the following GIF you can see the functionality at work in our own website.
However, the truly scalable nature of the system shows in our monthly AWS bill.
The cost is entirely defined by actual usage. For simplicity's sake, we will ignore temporary free tiers, as well as the slew of smaller services that we use.
We just started - so to calculate the costs, we made some assumptions. Let's consider a fairly big website to be our average customer. We guess that such a client uses 1000 API calls per month (that's editing only), requires therefore 1GB of data in, and needs about 10 GB of traffic-related data out. Persistent storage we estimate to be 500MB. We don't expect the Lambda execution time to get over 2 seconds.
For several different numbers of this type of customer, our monthly cost would look like this (rounded, in American dollars).
Customers | API Gateway | Lambda | DynamoDb | S3 | Cloudfront | Total |
---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0.25 | 1.00 | 1.25 |
100 | 9.50 | 0 | 3.50 | 1.50 | 85.00 | 99.50 |
1000 | 93.50 | 4.00 | 3.50 | 15.00 | 850.00 | 966.00 |
10000 | 935.00 | 35.00 | 3.50 | 150.00 | 8050.00 | 9173.50 |
100000 | 9350.00 | 410.00 | 3.50 | 1500.00 | 76700.00 | 87963.50 |
As you can see, the cost is largely determined by the CDN, which is becoming cheaper at larger traffic numbers. The API and associated Lambdas attribute for a much smaller portion. So, if you would build a service that is less dependent on Cloudfront, you can do much better.
Using our love for creative constraint, we managed to launch a start-up without servers, auto-scaling, load balancers and operating systems to maintain. This is not only a really scalable solution with an almost infinite peak capacity, but also a very cheap solution - especially in the early days before traction hits.
So... Down with servers!