Monday, November 2, 2020

ShiftLeft on Refactoring a Live SaaS Environment

 

This is a guest post by Preetam Jinka, Senior Infrastructure Engineer at ShiftLeft. Originally published here.

ShiftLeft NextGen Static Analysis (NG SAST) is a software-as-a-service static analysis solution that allows developers to scan every pull request for security issues. Earlier this year we released Secrets, Security Insights, and a v4 API. Secrets and Security Insights are two new types of results we extract from code analysis, and the v4 API is a brand new RESTful JSON API with an OpenAPI/Swagger specification that you can use to access all of your results. Read more about these features in our announcement post.

NG SAST was initially designed only for vulnerabilities. To implement Secrets and Security Insights, we had to either retrofit these new result types into our existing implementation or significantly refactor our back-end to support their unique characteristics. Even though it would take longer and be more difficult, we chose the latter. We rewrote almost all of the storage for code analysis results while maintaining backwards compatibility and without any outages. It was like changing the engine on an airplane in flight without the passengers noticing.

We could’ve saved a lot of time by hacking things together and making it work, but instead chose to take a step back and use this opportunity to redesign, clean up some technical debt, and establish a solid foundation for future work. It wasn’t easy: in addition to re-implementing large parts of our back-end, the UI was also significantly refactored to move to a new API. It took several weeks of intense collaboration, designing, iterative implementation, and testing in production.

Design

Process

As software engineers our job is not to produce code per se, but rather to solve problems. Unstructured text, like in the form of a design doc, may be the better tool for solving problems early in a project lifecycle, as it may be more concise and easier to comprehend, and communicates the problems and solutions at a higher level than code.
https://www.industrialempathy.com/posts/design-docs-at-google/

Every big engineering project at ShiftLeft starts as a design doc. These are Google Drive documents that describe what we want to implement and the implementation plan. This allows us to collaborate and iterate on a design with a large audience so everyone has an idea of (and a say in) what’s about to be implemented.

Even though only a handful of people were responsible for the implementation, the whole team participated in the review and refinement through design docs. One of the main objectives was to avoid surprises during implementation. We spent a lot of time on the design doc and API spec for this project. As a result, most of the design changes happened early, and we had no major surprises later.

Data model

The right data model makes everything else easier.

With the new data model, a lot of things became much easier. The mapping to database tables is more obvious, so database queries became much clearer, and in some cases, faster. It was much easier to implement certain features like tags, customizable severities, and scan comparison. Finally, when the data model is simpler you end up with fewer bugs.

Vulnerabilities with user-friendly IDs

We were also able to implement user-friendly IDs for scans and findings. Instead of identifying them with long IDs or hashes, scans and findings now have GitHub-style incrementing numeric IDs. The first scan and the first finding of each project have ID 1, the second have ID 2, and so on. Customers had requested this feature for a long time, and this refactor finally gave us an easy way to implement it.
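As a rough illustration of per-project incrementing IDs, here is a minimal Go sketch. The `IDAllocator` type and its in-memory counter are hypothetical and are not ShiftLeft's implementation; a real service would more likely keep the counter in the database and increment it in the same transaction that inserts the scan or finding.

```go
package main

import (
	"fmt"
	"sync"
)

// IDAllocator hands out incrementing numeric IDs scoped to a project.
// Hypothetical in-memory stand-in for illustration only.
type IDAllocator struct {
	mu   sync.Mutex
	last map[string]int64 // project ID -> last ID handed out
}

// NewIDAllocator returns an allocator with no IDs issued yet.
func NewIDAllocator() *IDAllocator {
	return &IDAllocator{last: make(map[string]int64)}
}

// Next returns the next ID for the given project, starting at 1.
func (a *IDAllocator) Next(projectID string) int64 {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.last[projectID]++
	return a.last[projectID]
}

func main() {
	scans := NewIDAllocator()
	fmt.Println(scans.Next("project-a")) // first scan of project-a
	fmt.Println(scans.Next("project-a")) // second scan of project-a
	fmt.Println(scans.Next("project-b")) // project-b has its own sequence
}
```

Keeping one allocator per record type (one for scans, one for findings) would give each type its own sequence within a project.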

Some other features that depend on the new design:

  • User-friendly API
  • User-friendly IDs for scans and findings
  • Expanded language support via ShiftLeft Scan
  • Better build rules that use the v4 API
  • Compare scans
  • Customizable severities and tags

Unified API

Old and New

 

We started by writing data from new customer scans to the old and new systems in parallel. Secrets and Security Insights were new features with no old data, so they could be served from the new API alone. Vulnerabilities still had to be served from the old API because their historical data wasn’t yet available to the new API.
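The dual-write step can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Scan` type, the `ScanStore` interface, the in-memory stores, and in particular the choice to treat the old store as authoritative and only report failures from the new one.

```go
package main

import (
	"errors"
	"fmt"
)

// Scan is a hypothetical, simplified scan result.
type Scan struct {
	ProjectID string
	Findings  []string
}

// ScanStore is a hypothetical interface implemented by both systems.
type ScanStore interface {
	SaveScan(s Scan) error
}

// memoryStore is an in-memory stand-in for a real storage back-end.
type memoryStore struct {
	name  string
	scans []Scan
	fail  bool // simulate a failing back-end
}

func (m *memoryStore) SaveScan(s Scan) error {
	if m.fail {
		return errors.New(m.name + ": write failed")
	}
	m.scans = append(m.scans, s)
	return nil
}

// DualWriter writes every scan to both stores. In this sketch the old
// store stays authoritative: an error from the new store is reported
// through OnNewError but does not fail the write.
type DualWriter struct {
	Old, New   ScanStore
	OnNewError func(error)
}

func (d DualWriter) SaveScan(s Scan) error {
	if err := d.Old.SaveScan(s); err != nil {
		return err
	}
	if err := d.New.SaveScan(s); err != nil && d.OnNewError != nil {
		d.OnNewError(err)
	}
	return nil
}

func main() {
	oldStore := &memoryStore{name: "old"}
	newStore := &memoryStore{name: "new", fail: true}
	w := DualWriter{Old: oldStore, New: newStore, OnNewError: func(err error) {
		fmt.Println("new store error:", err)
	}}
	err := w.SaveScan(Scan{ProjectID: "demo", Findings: []string{"hardcoded-secret"}})
	fmt.Println(err, len(oldStore.scans), len(newStore.scans))
}
```

Isolating failures in the new path this way means a bug in the new system cannot break existing customers while it is still being built.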

As the UI was implemented using the new API, we slowly made bug fixes and added anything we missed during the initial implementation. Eventually, the new API was done, the UI was using it for Secrets and Security Insights, and it was time to get the last part of the new API working: Vulnerabilities.

Data migration

     Column      |           Type           | Nullable
-----------------+--------------------------+----------
 sp_id           | text                     | not null
 organization_id | text                     | not null
 project_id      | text                     | not null
 scan_id         | bigint                   | not null
 processed_time  | timestamp with time zone |
Indexes:
    "vuln_migration_status_pkey" PRIMARY KEY, btree (sp_id)

Next we created a small (<250 line) Go script to iterate through this table and migrate vulnerabilities one scan at a time. It runs the same functions as the live production service to create scans and findings records, except with a timestamp in the past instead of the current time.

After a few hours of carefully running the migration in batches, we managed to migrate several thousand scans with zero failures! Even if we had hit a failure, we could have tracked our progress with the vuln_migration_status table and skipped over or manually fixed any problematic scans. In the worst case, because the new data lived only in new tables, we could have wiped everything and started over without any consequences.

Feature flags

Our UI uses feature flags toggled by URL parameters. To enable the new views that work with the new API and back-end, we simply had to add ?findings=enable to the URL. This allowed us to compare results side-by-side.
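A URL-parameter flag check of this kind can be sketched in a few lines. The UI itself presumably does this in browser code; the version below is in Go, the language the migration script was written in, and the `findingsEnabled` helper and example URLs are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// findingsEnabled reports whether the URL carries ?findings=enable.
// Hypothetical helper; any unparsable URL is treated as "flag off".
func findingsEnabled(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	return u.Query().Get("findings") == "enable"
}

func main() {
	fmt.Println(findingsEnabled("https://app.example.com/apps/demo?findings=enable"))
	fmt.Println(findingsEnabled("https://app.example.com/apps/demo"))
}
```

Defaulting to "off" when the parameter is missing or malformed keeps the old views as the safe fallback.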

   

This was one of the biggest projects at ShiftLeft. We decided to change things that had been a certain way since the beginning of the company. It wasn’t an easy project, but several things made it easier: a culture of collaboration, processes like design docs, and tooling like migration scripts and feature flags. Not only did this project give us a foundation for other features, it also gave us a template for tackling other big projects.
