Tuesday, September 13, 2016

The Dollar Shave Club Architecture Unilever Bought for $1 Billion

This is a guest post by Jason Bosco, the Dollar Shave Club’s Director of Engineering, Core Platform & Infrastructure, on the infrastructure behind its ecommerce technology.

With more than 3 million members, Dollar Shave Club will do over $200 million in revenue this year. Although most people know the company for its marketing, this immense growth in just a few years since launch is largely the work of its team of 45 engineers.

Dollar Shave Club engineering by the numbers:

Core Stats

  • Super Bowl Ads served with no downtime: 1

  • Monthly Traffic Bandwidth: 9 TB

  • Orders processed via Arm: 38 million

  • Total Bugs Found: 4,566

  • Automation Tests Run: 312,000

  • Emails sent via Voice: 195 million

  • Analytics data points processed and stored in Hippocampus: 534 million

  • Size of dataset in Hippocampus: 1.5 TB

  • Currently Deployed Apps / Services: 22

  • Number of servers: 325

Technology Stack

  • Ember as the front-end framework

  • Primarily Ruby on Rails on the backend

  • Node.js for high-throughput background processing needs (e.g., in Voice)

  • Golang for infrastructure software

  • Python for infrastructure & data science

  • Elixir for one internal app

  • Ruby for Test Automation

  • Swift and Objective-C for the native iOS app

Infrastructure

  • Fully Hosted on AWS

  • Ubuntu & CoreOS

  • Ansible & Terraform for Configuration Management

  • Transitioning to Docker-based deployments

  • Jenkins for deployment coordination

  • Nginx & Varnish

  • Fastly for application delivery

  • Sumo Logic for log aggregation

  • CloudPassage for security monitoring

  • Vault by HashiCorp for secrets storage & provisioning

Data Stores

  • Primarily MySQL hosted on RDS

  • Memcached hosted on Elasticache for caching

  • Self-hosted Redis servers primarily for queuing

  • A dash of Kinesis for handling orders from spiky traffic (see the sketch after this list)

  • Amazon Redshift for a data warehouse
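
To make the Kinesis bullet concrete: a stream sits in front of order processing as a buffer, so traffic spikes are absorbed and drained at a steady pace. A minimal sketch of that pattern using the aws-sdk-kinesis gem; the stream name and payload shape here are hypothetical, not from the post:

    require 'aws-sdk-kinesis'
    require 'json'

    kinesis = Aws::Kinesis::Client.new(region: 'us-east-1')

    # Hypothetical order payload.
    order = { order_id: 12_345, sku: 'executive-razor', quantity: 1 }

    # put_record appends to a shard chosen by partition_key; consumers
    # drain the stream at their own pace, smoothing out traffic spikes.
    kinesis.put_record(
      stream_name:   'orders', # hypothetical stream name
      data:          order.to_json,
      partition_key: order[:order_id].to_s
    )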

Messaging & Queuing

  • Resque and Sidekiq for async job processing & messaging

  • RabbitMQ for messaging (a publish sketch follows this list)

  • Kafka for stream processing
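
As a concrete sketch of the RabbitMQ piece, here is a minimal publish using the Bunny gem; the exchange name and event payload are hypothetical:

    require 'bunny'
    require 'json'

    conn = Bunny.new # defaults to amqp://guest:guest@localhost:5672
    conn.start

    channel = conn.create_channel

    # A fanout exchange broadcasts each event to every bound queue,
    # which is the shape a common message bus usually takes.
    exchange = channel.fanout('dsc.events') # hypothetical exchange name

    exchange.publish({ event: 'subscription.renewed', member_id: 42 }.to_json)

    conn.close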

Analytics & Business Intelligence

  • Snowplow & Adobe Analytics for web/mobile analytics

  • AWS Elastic MapReduce

  • FlyData to ETL data from MySQL into Redshift

  • Databricks (Hosted Spark)

  • Looker as the BI front-end

  • Near-realtime data availability for reporting

Monitoring

  • Rollbar, Sentry & Crashlytics for exception tracking

  • DataDog for custom application metrics & metrics aggregation (see the sketch after this list)

  • SysDig for infrastructure metrics & monitoring

  • NewRelic for application performance monitoring

  • Site24x7 for availability monitoring

  • PagerDuty for on-call alerting
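
For a sense of what custom application metrics look like in practice, a minimal sketch with the dogstatsd-ruby gem; the metric names and tags are made up for illustration:

    require 'datadog/statsd'

    # The DogStatsD agent normally listens on localhost:8125.
    statsd = Datadog::Statsd.new('localhost', 8125)

    # Counters and histograms, tagged so DataDog can slice by dimension.
    statsd.increment('orders.processed', tags: ['service:arm'])
    statsd.histogram('checkout.duration_ms', 187, tags: ['endpoint:checkout'])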

QA and Test Automation

  • CircleCI for running unit tests

  • Jenkins + TestUnit + Selenium + SauceLabs for browser-based automated tests (a minimal sketch follows this list)

  • Jenkins + TestUnit + Selenium + SauceLabs for Brain automated tests

  • Jenkins + TestUnit for API functional tests

  • Jenkins + TestUnit + Appium + SauceLabs for native Android automated tests

  • Jenkins + TestUnit + Appium + SauceLabs for native iOS automated tests

  • Jenkins + TestUnit + Selenium + SauceLabs + proxy server for BI test automation

  • SOASTA + regex scripts for stress, soak, load, and performance testing
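
A minimal sketch of the TestUnit + Selenium combination, run here against a local browser rather than SauceLabs; the target page and assertion are illustrative:

    require 'test/unit'
    require 'selenium-webdriver'

    class HomepageTest < Test::Unit::TestCase
      def setup
        # On CI this would be a remote SauceLabs session; locally, Chrome.
        @driver = Selenium::WebDriver.for :chrome
      end

      def test_homepage_shows_brand
        @driver.get 'https://www.dollarshaveclub.com'
        assert_match(/Dollar Shave Club/i, @driver.title)
      end

      def teardown
        @driver.quit
      end
    end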

Engineering Workflow

  • Slack for cross-team communication

  • Trello for task tracking

  • Hubot with custom plugins as our chat bot

  • GitHub as our code repository

  • ReviewNinja integrated with the GitHub Status API for code reviews

  • Continuous deployment: typically multiple deployments per day

  • Moving to continuous delivery

  • On-the-fly sandbox environments for feature development

  • Currently, single-button push deployment using Jenkins

  • A Vagrant box running Docker containers gives new engineers a fully functioning development environment on their first day (a minimal sketch follows this list)
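
A minimal Vagrantfile sketch of that day-one environment; the base box and the containers chosen here are assumptions, not the actual setup:

    # Vagrantfile (a Ruby DSL): boot a VM, then run app dependencies as
    # Docker containers so a fresh laptop is productive immediately.
    Vagrant.configure('2') do |config|
      config.vm.box = 'ubuntu/trusty64' # hypothetical base box

      config.vm.provision 'docker' do |d|
        d.run 'mysql',     image: 'mysql:5.6', args: '-p 3306:3306'
        d.run 'memcached', image: 'memcached', args: '-p 11211:11211'
        d.run 'redis',     image: 'redis',     args: '-p 6379:6379'
      end
    end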

Architecture

  • Event-driven architecture

  • Moving from a monolithic architecture to “medium” services interacting through a common message bus

  • VCL-based routing at the CDN edge, deployed just like any other app

  • Web and Mobile frontends talk to an API layer

  • API layer talks to services, aggregates data and formats it for clients

  • Services talk to the data stores and message bus

  • Scheduled tasks run as one master job that breaks itself up into smaller jobs in Resque/Sidekiq (see the sketch after this list)

  • Technology components include internal tools for customer service (Brain), a marketing automation platform (Voice), a fulfillment system (Arm), a subscription billing system (Baby Boy), and our data infrastructure (Hippocampus).
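
A minimal sketch of that master-job pattern with Sidekiq; the model and service names (Member, BillingService) are hypothetical stand-ins:

    require 'sidekiq'

    # Master job: enumerate the work once, then fan out one small,
    # independently retryable job per member.
    class RenewalMasterJob
      include Sidekiq::Worker

      def perform
        Member.due_for_renewal.pluck(:id).each do |member_id|
          RenewalJob.perform_async(member_id)
        end
      end
    end

    class RenewalJob
      include Sidekiq::Worker

      def perform(member_id)
        BillingService.charge(member_id) # a failure here retries alone
      end
    end

Splitting the work this way keeps each unit small: one bad record fails and retries on its own instead of restarting the whole batch.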

Team

  • 45 top-notch, entrepreneurial, highly skilled engineers working out of our Marina del Rey, CA headquarters

  • Engineers participate in cross-functional teams called squads along with product managers, designers, UX and stakeholders to deliver end-to-end features.

  • Teams are divided vertically by domain into Frontend, Backend, QA & IT.

  • The front-end team owns the web UI for DSC.com & internal tools, plus our iOS & Android apps.

  • The backend team owns the web backends for DSC.com & internal tools, internal services (billing and fulfillment), and the data platform & infrastructure.

  • The QA team owns testing and automation infrastructure for all digital products.

  • The IT team owns office & warehouse IT.

  • Engineers get to attend one company-sponsored conference every year.

  • Engineers get to buy as many books / learning resources as they need.

  • Standing desks for all. One treadmill desk currently available as a pilot.

  • Weekly engineering team lunches.

  • Tech Belly events every other week where engineers present talks on technology topics over lunch.

  • Engineers are encouraged to experiment with bleeding-edge technology and create proposals through Requests for Comments (RFCs).

  • Engineers are encouraged to open source tools and libraries where it makes sense

  • Every engineer gets a standard issue of a 15” MacBook Pro, a 27” Mac display, and a 24” monitor.

  • One 3D printer available to print props (and more 3D printers).

Lessons Learned

  • Scaling becomes easier when the components you’re trying to scale are composed of small, simple services.

  • Documentation & knowledge sharing are important for fast-growing teams.

  • A well-nurtured test-suite is critical to fast-evolving systems.

  • Redis uses an approximate LRU eviction algorithm, so it’s not suitable if you have precise LRU requirements for caching (see the config sketch after this list)

  • Web performance is critical, especially on mobile: every millisecond costs us revenue

  • Usability & User Experience are important even for internal tools: efficient tools = more productive teams
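
On the Redis point: eviction is governed by maxmemory-policy, and the LRU approximation can be tightened (at extra CPU cost) by raising maxmemory-samples. A sketch using the redis-rb gem, assuming a local Redis on the default port:

    require 'redis'

    redis = Redis.new # assumes Redis on localhost:6379

    # Evict the approximately least-recently-used key across all keys
    # once maxmemory is reached; this is sampled, not exact LRU.
    redis.config(:set, 'maxmemory-policy', 'allkeys-lru')

    # More samples per eviction gets closer to true LRU, at CPU cost.
    redis.config(:set, 'maxmemory-samples', 10)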

On HackerNews

Reader Comments (6)

Why did you decide to host Redis yourselves? I would assume you'd run Redis on ElastiCache (with Replication Groups for better availability). Also, why host Memcached when it's also available on ElastiCache?

September 14, 2016 | Unregistered Commenter Hugo Lopes Tavares

Seems like a pretty standard modern stack. Glad it's working out for them! The real question is what their transition into Unilever's corporate technology fortress will look like.

September 14, 2016 | Unregistered Commenter Carlos Nunez

I guess my question is why do you need 45 engineers for what is basically a tiny catalog with a subscription option?

September 22, 2016 | Unregistered Commenter Brad

Documentation makes its first appearance as something important. What infrastructure and process do you use to keep it relevant?

September 25, 2016 | Unregistered Commenter Peter Schaafsma

Why does this company need such a complex system to run? It is not some tech company where hundreds or thousands of requests are arriving in real time. It just seems like an over-engineered solution to sound cool. Maybe they are doing something big and complex behind the scenes that I don't see. So I am just curious.

December 8, 2016 | Unregistered Commenter ARK

Yeah, this is more of a "Life at Dollar Shave Club" piece than a software architecture breakdown. Just one question for the author: why do you need 45 engineers and such a bloated solution? You could easily move your site to Shopify.

October 11, 2020 | Unregistered Commenter Mark
