The Design Review

Monday, September 10, 2018 by Louie Bacaj

The design review and, later in the development phase, the production readiness review are two incredibly important parts of our development cycle on the Jet.com engineering team. These two events are a huge part of our success; in fact, I will argue that the design review process has been instrumental to us building a world-class engineering organization. As you’ll read in this post, I attribute much of our uptime, the cross-pollination of technical ideas, and the reliability of our systems to these two engineering-wide meetings. They help catch edge cases and push our systems in the right direction.

I would like to talk you through what the design review process looks like for us at Jet. I would also like to get into some of my favorite design reviews that I have seen over the last 4 years here at Jet.com and some of the learnings.

What is a design review?

A design review is an engineering-wide meeting in which a team, or an individual from a team, showcases an architecture design and gets approval from the rest of engineering to build a new system, introduce a new feature to jet.com, or on-board a new platform that will impact all of engineering in some way. These meetings are a staple of our engineering culture. Folks can, and should, spend days or weeks preparing for one of these. Many documents are drafted and iterated on before they are presented in these meetings. Lunch is always served.

For us at Jet, design reviews happen every single Thursday. One or two design reviews are scheduled by different teams across engineering every week. Each is an incredible opportunity for new engineers to learn how scalable distributed systems are built, from the design phase to implementation. These design and production readiness reviews are tremendously helpful for catching edge cases that the designing team may have missed. They are also a great opportunity for teams to understand the tradeoffs of certain technologies without having to learn them the hard way.

For example, if a team is preparing to introduce a new piece of technology such as Cosmos DB to their system, and that team has never worked with this technology before, this is an opportunity for engineers from other teams who have experience with it to chime in and talk about the edge cases they have seen in a production setting. The value of this cannot be overstated. People are often blinded by the promises of new technology, but it’s only when you are woken up in the middle of the night because it broke down that you really understand what the design review and production readiness review processes are doing for you.

Design reviews have a few rules at Jet.

1. Be on time. They start promptly at the scheduled time. A lot of engineers can and do attend, and engineering time is valuable.

2. Be prepared. You must come prepared with the design document, architecture diagrams, and the risk management thought through. If you have not thought through your system well enough, this is where that will be made evident in front of all of engineering, and you don’t want that. At the same time, it is OK if you missed some edge cases; that is what this process is for.

3. Be ready to answer questions. It is a given that you should be ready to answer questions about your design. If your system is particularly interesting, questions will occasionally come during the presentation, and there will definitely be questions after it.

Learnings and benefits.

For me personally, I always seem to learn something new with each design review that I am able to attend. I am either learning about a new area of the business that I never touch, such as Supply Chain, and how they are going to solve a deeply technical challenge using a new system, or I learn about new techniques and technologies.

Decisions are made in the design review process that can and will impact other teams. Decisions are made to approve the on-boarding of new technologies or new ways of doing things. Not attending means you don’t have your concerns heard, and this can have consequences (but that is for another blog post).

Being selected to present by your team is an honor. It means they trust you to deliver the architecture of this system to the rest of engineering. It is a great opportunity to hone your presentation and public speaking skills in a safe and friendly environment.

Some of my favorite design reviews.

1. The push for containerization and the Nomad design review. This particular design review was perhaps one of the most contentious at Jet, because it impacted every single engineering team and the way they write their code. This led to some funny outbursts during the design review itself, and a few calls for an engineering-wide vote, but ultimately it yielded some great results. The back and forth led to the rethinking of certain aspects of this system and ultimately to a better rollout of this technology, providing flexibility for teams.

2. I presented a design review, in the early days of Jet, for the segmentation engine my team was building for marketing, and I was not able to get approval from the teams that would be affected on the first try. The system required a particularly large set of data, delivered in as close to real time as possible. At the time, this conflicted with our Tiering system for SLAs (Service Level Agreements) between teams, as it required pulling data from what was considered a big data analytics system with loose SLAs (measured in minutes) into a production system that had tight SLAs (measured in milliseconds). This was a huge no-no, but we needed something, because the marketing business was desperate for this segmentation engine. Ultimately this led to many compromises, which in turn resulted in the formation of a brand new engineering team to support the type of large data aggregation our system required. This new team would go on to use Kafka, Spark, and Cassandra to handle the near-real-time aggregation of all production data at Jet and make it available to our Segmentation Engine.

3. The Equinox design review was a particularly interesting one that I think many folks at Jet learned a great deal from. We are known at Jet for having one of the largest event sourcing systems in the world, but that hasn’t come without cost. We have had many issues in the past with the backbone of our event storage layer, and as we barreled down the path of millions of users and the load kept going up, we needed to build something ourselves. Equinox was our answer. It was a well thought-out event store that sits on top of Cosmos DB on Azure and enables event sourcing. I think a good number of engineers learned a great deal about distributed systems and event sourcing from this particular design review.

4. The original Inventory System design review and the introduction of DocumentDB (now known as Cosmos DB) to our tech stack. A new real-time inventory system was needed at Jet and, as such, a group of folks was pulled together to get this done. As they explored their options, they landed on a technology that they felt could fit their needs for the hundreds of millions of daily updates to inventory, their persistence, expiration, and many other aspects of their system. However, they couldn’t be too sure, since the technology on Azure was very new at the time. So they set out to run some massive load tests that would prove the technology could handle the load. These tests were so large that their cost ran into the hundreds of thousands of dollars, and they sharply inflated our technology costs for the week they were running. I will always remember how interesting the findings were, and how great it was that the team, with the OK of the business, was willing to go to this length to make sure the system was sound.
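
To make the tiering conflict from the segmentation engine review (the second one above) a bit more concrete, here is a minimal Python sketch of the rule that was being violated: a service should not depend on another service whose SLA is looser than its own. This is not Jet's actual tooling. The `Service` type, the `find_sla_violations` helper, the service names, and the latency numbers are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """A hypothetical service with a latency SLA and named dependencies."""
    name: str
    sla_ms: float           # assumed latency SLA, in milliseconds
    depends_on: tuple = ()  # names of the services this one reads from


def find_sla_violations(services):
    """Return (consumer, dependency) pairs where the dependency's SLA
    is looser (a larger number) than the SLA of the service using it."""
    by_name = {s.name: s for s in services}
    violations = []
    for service in services:
        for dep_name in service.depends_on:
            dep = by_name[dep_name]
            if dep.sla_ms > service.sla_ms:
                violations.append((service.name, dep.name))
    return violations


# Illustrative numbers only: an analytics system measured in minutes
# feeding a production system measured in milliseconds.
analytics = Service("analytics-warehouse", sla_ms=5 * 60 * 1000)
segmentation = Service(
    "segmentation-engine", sla_ms=200, depends_on=("analytics-warehouse",)
)

# The kind of compromise described above: a dedicated aggregation
# service with a tight SLA sits between the two.
aggregation = Service("realtime-aggregation", sla_ms=100)
segmentation_v2 = Service(
    "segmentation-engine", sla_ms=200, depends_on=("realtime-aggregation",)
)

original_design = find_sla_violations([analytics, segmentation])
revised_design = find_sla_violations([analytics, aggregation, segmentation_v2])
print(original_design)
print(revised_design)
```

In this sketch the original design is flagged because the production service reads directly from the analytics system, while the revised design passes because its only dependency has a tighter SLA than its own. A real tiering policy would likely involve more than a single latency number, so treat this as a way to picture the problem and not as a description of how approvals were decided.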

There are many other fantastic design reviews at Jet, too many to mention in one blog post, but these are a few that stand out and come to my mind right away.

The thing is, every single Thursday is an opportunity to learn from fellow engineers at Jet, all while getting free lunch and participating in educated (and sometimes passionate) discussions about architecture and design. This is an engineering dream and all a part of our incredibly strong learning culture at Jet.com.

You can follow me on Twitter where I started posting regularly about this and many other tech topics.


P.S. If you are interested in seeing a template of what our design review process looks like, take a look at this wonderful template from my colleague Avianne Tan which can be found here.