Thumbtack Engineering

A romance of a single dimension: linear git history in practice

Electrical wires in Bangkok

"I call our world Flatland, not because we call it so, but to make its nature clearer to you, my happy readers, who are privileged to live in Space."
(Edwin A. Abbott, Flatland)

A linear commit history is a fine, beautiful thing. It keeps developers sane. It keeps beastly merge commits at bay. It removes pollution from the history. It enables faster debugging. Above all, it is a useful mental construct for thinking about code and changes to that code.

A linear commit history relies on a powerful git mechanism called the rebase. You might have heard fairy tales of how rebasing is dangerous or how it can corrupt your history. Yes: like most powerful tools, the rebase can be problematic in the hands of a novice. However, as with any powerful tool, a master craftsman can wield it carefully and precisely to achieve great ends.

This post is targeted at new developers (or non-engineers) who are looking to understand a git workflow in the context of drawing a straight line: a linear commit history.

Components of the line

You can think about the code as a series of commit objects that are organized linearly and stack on top of one another. If you were to start the codebase over and apply each commit sequentially, you'd end up with exactly the same codebase we have now. Each commit has a parent commit. This is useful because you can think of each commit as just being a difference between two states: the state of the codebase at the parent commit versus the state of the codebase at the new commit.
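To make that mental model concrete, here is a minimal sketch in JavaScript. It is a toy, not how git works internally (git actually stores full snapshots and computes diffs on demand), and the names makeCommit, historyOf, and checkout are made up for illustration.

```javascript
// Toy model: each commit records its parent and a diff (here, a map of
// file name -> new contents, with null meaning "delete the file").
function makeCommit(id, parent, diff) {
  return { id: id, parent: parent, diff: diff };
}

// Walk parent pointers back to the root, then return commits oldest-first.
function historyOf(commit) {
  var chain = [];
  for (var c = commit; c !== null; c = c.parent) {
    chain.unshift(c);
  }
  return chain;
}

// Rebuild the codebase by applying every diff in order, starting from nothing.
function checkout(commit) {
  var files = {};
  historyOf(commit).forEach(function (c) {
    Object.keys(c.diff).forEach(function (name) {
      if (c.diff[name] === null) {
        delete files[name];
      } else {
        files[name] = c.diff[name];
      }
    });
  });
  return files;
}

// Three hypothetical commits stacked on top of one another.
var a = makeCommit("A", null, { "README": "hello" });
var b = makeCommit("B", a, { "app.js": "v1" });
var c = makeCommit("C", b, { "app.js": "v2", "README": null });
```

Calling checkout(c) replays A, B, and C in order and yields the state of the codebase at C; calling checkout(b) yields the state one commit earlier.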

This difference has a more formal name in git: a diff. The concept of a diff is widely used, and can refer to any difference between file(s) in one state and the same files in a different state. Git even has special commands for viewing differences: git diff and its variations, but we won't go into those at the moment.

Each commit is given a unique hash, which is a string of letters and numbers like 83ddf3be77b58395d2f00b7f51a7cec8bafd2ac8. Because two commits are very unlikely to share the same first few characters, you can usually refer to a commit by just the first part of its hash, like 83ddf3b.
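The idea of referring to a commit by a prefix can be sketched as a simple lookup. This is only an illustration of the concept, not git's actual lookup code; resolvePrefix is a hypothetical helper, and the second hash in the list is invented for the example.

```javascript
// Return the single full hash that starts with `prefix`.
// Throws if the prefix matches nothing or matches more than one hash.
function resolvePrefix(hashes, prefix) {
  var matches = hashes.filter(function (hash) {
    return hash.indexOf(prefix) === 0;
  });
  if (matches.length === 0) {
    throw new Error("unknown revision: " + prefix);
  }
  if (matches.length > 1) {
    throw new Error("ambiguous prefix: " + prefix);
  }
  return matches[0];
}

var hashes = [
  "83ddf3be77b58395d2f00b7f51a7cec8bafd2ac8", // the hash from the text
  "83d1a0c4e5f60718293a4b5c6d7e8f9012345678"  // invented for this example
];

var full = resolvePrefix(hashes, "83ddf3b");
```

Here "83ddf3b" identifies exactly one commit, while the shorter prefix "83d" would be rejected as ambiguous because both hashes start with it.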

Linear commit history

In the following diagrams, each commit will be referred to with a one-letter code. Each machine references the commit history. You can think of production, staging, and the two versions of master as pointers into the history.

production represents what our users are currently seeing in production. staging is the code about to be deployed. master is the code we are working on currently in development.

A <- B <- C <- D <- E <- F <- G
^              ^              ^
|              |              |
production     staging        master on remote
                              master on your machine

Great! So we can see here that production is behind staging, which is what we expect. So if we were to deploy what's currently on staging to production, the diagram would now show both pointing to D:

A <- B <- C <- D <- E <- F <- G
               ^              ^
               |              |
               staging        master on remote
               production     master on your machine

Now you make several new changes to the codebase on your local dev instance, and you commit them into master. Those commits have the codes H, I, J, K, and L. The commit history now looks like this:

A <- B <- C <- D <- E <- F <- G <- H <- I <- J <- K <- L
               ^              ^                        ^
               |              |                        |
               staging        master on remote         master on your machine
               production

Great. Now your local copy of master includes the changes you made: H, I, J, K, and L. The next step is to issue a git push origin master to get these 5 commits into a central place where they can be accessed by other developers and by other systems.

A <- B <- C <- D <- E <- F <- G <- H <- I <- J <- K <- L
               ^                                       ^
               |                                       |
               staging                                 master on your machine
               production                              master on remote

Now you want to update the staging environment to the latest version of master, so you'll go to your internal deployment tool and start the deployment process. The result will be that staging now points to the commit history at commit L, and the staging environment will showcase the newer version of the codebase.

A <- B <- C <- D <- E <- F <- G <- H <- I <- J <- K <- L
               ^                                       ^
               |                                       |
               production                              master on your machine
                                                       master on remote
                                                       staging

If you want to also put those commits into production, you'd now issue another deployment that will update the production pointer to master:

A <- B <- C <- D <- E <- F <- G <- H <- I <- J <- K <- L
                                                       ^
                                                       |
                                                       master on your machine
                                                       master on remote
                                                       staging
                                                       production

Feature branches

The linear commit history described above works great when you're just working in master, but what happens when you want to make a big feature that has a multi-week development timeline? You don't want to make big features in master, because you'd prevent yourself from making any small bugfixes or tweaks in master, so you create a new branch. If you think about the main commit history diagrammed above as the trunk of a tree, it makes sense how you might branch off that trunk.

Let's say the new branch is going to be a new interface for the app that uses smoke signals to communicate with old-fashioned users, so we'll call the new branch smoke-signals and we'll create the branch with git checkout -b smoke-signals (do this while master is checked out, so your new branch will start at master). Detailed versions of this command (and its caveats) are below. We'll focus on the high level diagram here.

Working from the diagrams above, we hide the earlier part of the history to make it easier to read. Your new branch smoke-signals will start at the same commit as master:

J <- K <- L
          ^
          |
          master on your machine
          master on remote
          staging
          production
          smoke-signals

That's exactly what we expected. You start working on smoke signals, and make your first commit that can show a "hello" signal. You commit this and it has commit-id M.

J <- K <- L <- M
          ^    ^
          |    |
          |    smoke-signals
          |
          master on your machine
          master on remote
          staging
          production

You make a few more commits into the smoke-signals branch:

J <- K <- L <- M <- N <- O <- P
          ^                   ^
          |                   |
          |                   smoke-signals
          |
          master on your machine
          master on remote
          staging
          production

Suddenly the support team emails you and notifies you of a high-priority bug that you need to fix. You switch branches from smoke-signals to master (git checkout master), and get to work on the bugfix. (Note that to switch branches you should have a clean working tree, which means you need to either commit your work or stash it before switching.) When you commit the bugfix it is given a new unique ID Q. Q is based on master, so now the commit history has two very distinct branches, indicated in the diagram with +. Also note that master on your machine is now at Q.

                                 smoke-signals
             / <- M <- N <- O <- P
J <- K <- L +
          ^  \ <- Q
          |       ^
          |       |
          |       master on your machine
          |
          master on remote
          staging
          production

You want to get the bugfix Q out to production right away, so you first git push origin master to get your commit into the remote server.

                                 smoke-signals
             / <- M <- N <- O <- P
J <- K <- L +
          ^  \ <- Q
          |       ^
          |       |
          |       master on your machine
          |       master on remote
          |
          staging
          production

Then you issue a deploy to update the staging and production pointers as well. After these steps, the diagram looks like this:

                                 smoke-signals
             / <- M <- N <- O <- P
J <- K <- L +
             \ <- Q
                  ^
                  |
                  master on your machine
                  master on remote
                  staging
                  production

Awesome, your bugfix is in production and now you can go back to working on smoke-signals. You issue git checkout smoke-signals to get back to your project, and write some more code and get smoke-signals ready for primetime. You issue a final commit and it has an ID of R:

                                      smoke-signals
             / <- M <- N <- O <- P <- R
J <- K <- L +
             \ <- Q
                  ^
                  |
                  master on your machine
                  master on remote
                  staging
                  production

In the meantime, your colleague has been working on master and created another bugfix S:

                                      smoke-signals
             / <- M <- N <- O <- P <- R
J <- K <- L +
             \ <- Q <- S
                  ^    ^
                  |    |
                  |    master on remote
                  |
                  master on your machine
                  staging
                  production

This is starting to get complicated, so take a deep breath and look closely at the diagram. Your goal in the next three steps will be to get smoke-signals on production.

First: update your local copy of master to what your colleague has with git checkout master and git pull --rebase origin master. Note that you might also choose to run git fetch if git reports that your branch is ahead of origin/master by N commits (this is not actually a bug or an issue, so you can skip the fetch if you choose).

                                      smoke-signals
             / <- M <- N <- O <- P <- R
J <- K <- L +
             \ <- Q <- S
                  ^    ^
                  |    |
                  |    master on remote
                  |    master on your machine
                  |
                  staging
                  production

Second, and this is important, you will stack smoke-signals on top of master by first going to the branch with git checkout smoke-signals and then issuing a rebase with git rebase master. By doing this you essentially detach your branch from where it departed from master and rewire the diagram so your feature branch is now sitting on top of master. (Strictly speaking, git replays M through R as new commits with new hashes; the diagrams keep the same letters for simplicity.)

$ git checkout master
$ git pull --rebase origin master
$ git checkout smoke-signals
$ git rebase master
# ... resolve any conflicts, then run `git rebase --continue`
                 / <- (nothing here!)
    J <- K <- L +
                 \ <- Q <- S <- M <- N <- O <- P <- R
                      ^    ^                        ^
                      |    |                        |
                      |    master on remote         smoke-signals
                      |    master on your machine
                      |
                      staging
                      production

During a rebase, you might have to resolve conflicts if any files changed in one branch were also changed in the other branch. This is totally normal, and you should carefully consider the conflicts to make sure they're resolved correctly. After you resolve all conflicted files, type git rebase --continue to continue the rebasing process.
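The "detach and restack" behavior of a rebase can be sketched with a toy model in JavaScript. This is a simplification made for illustration, not git's implementation: it ignores file contents and conflicts entirely, and the helper names (ancestors, rebase, ids, chain) are hypothetical. Replayed commits are marked with a prime to reflect that git gives them new hashes.

```javascript
// Collect the ids of every commit reachable from `tip` by following parents.
function ancestors(tip) {
  var seen = {};
  for (var c = tip; c !== null; c = c.parent) {
    seen[c.id] = true;
  }
  return seen;
}

// Replay the commits that are on `branchTip` but not on `newBase` on top of
// `newBase`. Each replayed commit is a new object (marked with a prime).
function rebase(branchTip, newBase) {
  var onBase = ancestors(newBase);
  var toReplay = [];
  for (var c = branchTip; c !== null && !onBase[c.id]; c = c.parent) {
    toReplay.unshift(c);
  }
  var tip = newBase;
  toReplay.forEach(function (commit) {
    tip = { id: commit.id + "'", parent: tip };
  });
  return tip;
}

// Render a history oldest-first, in the style of the diagrams.
function ids(tip) {
  var out = [];
  for (var c = tip; c !== null; c = c.parent) {
    out.unshift(c.id);
  }
  return out.join(" <- ");
}

// Build a run of commits on top of `parent`.
function chain(names, parent) {
  var tip = parent;
  names.forEach(function (name) {
    tip = { id: name, parent: tip };
  });
  return tip;
}

// The history from the diagrams: both branches depart from L.
var base = chain(["J", "K", "L"], null);
var master = chain(["Q", "S"], base);
var smokeSignals = chain(["M", "N", "O", "P", "R"], base);

var rebased = rebase(smokeSignals, master);
console.log(ids(rebased));
// prints "J <- K <- L <- Q <- S <- M' <- N' <- O' <- P' <- R'"
```

The original M through R objects are left untouched; the rebased branch is a fresh chain whose first commit has S as its parent, which is why the result is a single straight line.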

Now you'll want to move your master to be at R. This is the only time you should issue a merge command. Be sure to use the --ff-only flag.

$ git checkout master
$ git merge --ff-only smoke-signals
             / <- (nothing here!)
J <- K <- L +
             \ <- Q <- S <- M <- N <- O <- P <- R
                  ^    ^                        ^
                  |    |                        |
                  |    master on remote         smoke-signals
                  |                             master on your machine
                  |
                  staging
                  production

Now your master is at R. Push it to the remote with git push origin master, and then use the same deploy process described above to get your new smoke signals feature into production.

J <- K <- L <- Q <- S <- M <- N <- O <- P <- R
                                             ^
                                             |
                                             smoke-signals
                                             master on your machine
                                             master on remote
                                             staging
                                             production

Notice how you've now flattened the commit history into a single line again. Exactly what we wanted.
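The --ff-only merge used above only succeeds when master's current commit is already an ancestor of the branch being merged, so master can simply slide forward along the line. A minimal sketch of that check follows. It is a toy model, not git's code: isAncestor, mergeFfOnly, and commit are hypothetical helpers, and the error message is made up.

```javascript
// True when `maybeAncestor` is reachable from `tip` by following parents.
function isAncestor(maybeAncestor, tip) {
  for (var c = tip; c !== null; c = c.parent) {
    if (c === maybeAncestor) {
      return true;
    }
  }
  return false;
}

// Move refs[branch] to refs[other], but only if that is a pure fast-forward.
function mergeFfOnly(refs, branch, other) {
  if (!isAncestor(refs[branch], refs[other])) {
    throw new Error("cannot fast-forward " + branch + " to " + other);
  }
  refs[branch] = refs[other];
}

function commit(id, parent) {
  return { id: id, parent: parent };
}

var l = commit("L", null);
var s = commit("S", commit("Q", l));
var unrebased = commit("M", l);   // branched from L; does not contain Q or S
var rebasedTip = commit("M'", s); // the same change replayed on top of S

var refs = { master: s, "smoke-signals": rebasedTip };
mergeFfOnly(refs, "master", "smoke-signals"); // master now points at M'
```

If smoke-signals still pointed at the unrebased M, the same call would throw, because S is not in M's history. That refusal is what keeps the history a single line: the merge only goes through once the rebase has been done.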

Conclusion

The linear commit history is a useful tool for managing a complex codebase. We have found that it scales well with a growing codebase and engineering team, and we recommend it as a strategy for anyone considering ways of managing workflows.

Thumbtack in Three Quotes

In everything that we do, Thumbtack strives to be comprehensive, super-helpful, easy, dependable, and encouraging. These core values inform our decisions across the board, from product-based changes to social interactions with each other, and they manifest themselves in our passion for our work, our obsession with our customers, and our relentless self-improvement. Here's why.

I'm a rising sophomore Computer Science and Robotics major at Carnegie Mellon University. I like robots. They're great. But I love Thumbtack - the people, the company, the product. I'm finishing a summer internship here at Thumbtack, and it's been one of the greatest experiences of my life. I joined because I felt like I could do the following: first, work on something that has a profound impact on people's lives; second, work at a company that really believes in the power of a great culture; and third, learn from some of the most ridiculously intelligent (yet remarkably humble) engineers I'd ever met.

But you've already heard from Brandon about why you should definitely want to intern (or work full-time!) at Thumbtack, so instead, I'll do my best to describe our culture. As with any company's culture, it's nearly impossible to describe in a few words; alternatively, I'll provide a few quotes that I think aptly encompass what's unique about Thumbtack Engineering.

"For, firstly, the social instincts lead an animal to take pleasure in the society of its fellows, to feel a certain amount of sympathy with them..." -Charles Darwin

We've got a diverse group of interesting people who are sociable and intellectually curious. We love spending time with each other, and have tons of fun studying CS theory together (if that's your thing, of course!), reading and debating literature at a book club, and playing foosball at an almost-amateur level. We've gone urban hiking in the city, biked to Sausalito, and explored the 'Desolation Wilderness' at our engineering retreat in Lake Tahoe. The fact that we're so close to each other socially allows us to truly feel comfortable working with, understanding, and having fun with each other. It's important to everyone at Thumbtack that there's a balance between work and life, and that makes us more excited to work every day.

"Everything is an experiment." -Tibor Kalman

At Thumbtack, everything – everything – is an experiment. We truly believe that data are a vehicle that we can use to answer the toughest questions, whether the data be quantitative, anecdotal, or anything in between. This belief pervades everything we do, both internally and from a product standpoint.

We even experiment in how we organize and review ourselves. In our Product Process Review, we review positive and negative aspects of the current product process (the way we, as members of the product team – for us this includes PMs, designers, and engineers – organize ourselves and the way projects travel through the pipeline), and develop (if needed) a new methodology to solve the problems we faced and the problems we think we'll face in the future. One of the greatest things about this is that although we, as engineers, are able to give feedback about product and engineering decisions, we also get an opportunity to give impactful feedback about the way in which we give feedback (inception, right?). The fact that this meeting between every person working on our product is so successful is a testament to how truly collaborative we are, and how well engineers, designers, and project managers are able to effectively communicate with each other.

We're also strictly data-driven when it comes to product changes. Our Engineering Technical Lead, Steve Howard, is an avid statistician when he's not talking to our engineers, interviewing candidates, or playing foosball (what a nerd!). Steve has created an environment of statistical rigor, where we analyze everything we learn with a critical eye. Interacting with him has grown within many of us a love for the fundamentals of statistics, experimental rigor, and data – and that has made us better engineers and better people.

"Love is a quality, not a quantity." -Vanna Bonta

There are two important things that I can draw from this. First, Thumbtack Engineering emphasizes quality as an important factor to balance with speed. While many startups push for fast release cycles and development of a minimum viable product as quickly as possible, we recognize the importance – the necessity – of maintainable, readable, and high quality code. We focus on developing our codebase and our individual coding style every single day – engineers code review every change that's made to the website (in order to promote high quality code and guard against knowledge siloing), and improve each other's performance with every review.

Secondly, and perhaps more importantly, we, as a company and as individuals, love what we do. It's that simple. We're truly passionate about changing people's lives and making a difference. The passion that pervades our work is apparent, and there's nothing that can stand in the way of us doing our best to improve the lives of our customers. There's simply nothing that's more powerful than real passion about helping people, and that's what makes us tick.

Thumbtack is a unique and wonderful place. The culture here, without a doubt, is one of the things that makes us come back for more, day after day. If company culture interests you, or if this sounds great to you, let us know.

Eng Retreat in Tahoe: Robots and Desolation

Within engineering, we like to have consensus on decisions, especially those that have significant and lasting import. So before we left for a multi-day retreat in Tahoe, the engineering team asked itself an important question.

robots-question

The results were indicative of just how great a team we have.

survey-results

And with that, we packed our robot boxes and set off for the Sierras.

We hiked in Desolation Wilderness, spent time stargazing, caught (and ate) crawfish, tossed Frisbees, played Resistance, cooked and ate incredible food, and all-in-all had a fantastic time—and no one got too hurt.

We'll let the following pictures tell more of the story.

[Photos: everyone, hike-1 to hike-4, lake-1, crawfish, lake-2, building-1 to building-3, robots]

You May Be Interested in an Internship at Thumbtack

Hello! If you're reading this, you're likely a young software engineer, probably in college and looking for a summer internship. You want to intern somewhere that will look good on a resume, but also somewhere fun with real software work. You'll likely want to learn a lot and get along with your coworkers. You've got high standards, and you should.

I'm not going to tell you Thumbtack can be all those things for you, because not everyone sees those things the same way. I only know what's fun and interesting for me. But I can explain what I gained from working at Thumbtack; then, you can decide for yourself.

First, a little context. I study Computer Science at Carnegie Mellon University. When I return this fall, I'll be a junior working towards my bachelor's degree. I first found out about Thumbtack at a tech industry job fair hosted on campus at CMU. Thumbtack's booth was like any other: an engineer turned recruiter collected resumes, handed out t-shirts, and chatted with students. A few days later they reached out to me; I interviewed, and I decided it was a good place to be.

Starting at Thumbtack

Before I started at Thumbtack, I had certain expectations for what I thought the experience would be like. I'd met several of the engineers, so I knew that I was going to get along with the team and that I was a good 'culture fit'. I understood that they worked on a pretty wide variety of problems, so there would be problems which would be fun to solve and whose solutions would have some impact. And, finally, I saw that the team was genuinely interested in engineering as a practice in general and the problems they were working on in particular. These are, essentially, the reasons why I took the job in the first place.

I had these expectations going in, and this isn't a story with a twist. To spoil the ending, this is a story where everything goes well, with a few pleasant surprises along the way.

First Impressions

Thumbtack pairs each new engineer and intern with a mentor, another engineer. My mentor was Alexander Kojevnikov and he was incredibly intelligent, always helpful, and at least as excited about my projects as I was. While I didn't always work directly with Alex, he was my go-to resource for any questions and advice.

I spent most of my time at Thumbtack working on a team called Marketplace Mechanics, or M2. Our mission statement contained the phrase 'high leverage', and essentially meant 'Do the little things that help a lot.' This mission encouraged lean projects and quick results; perfect for someone with only a couple months in which to work. What was interesting about this team was that when I joined Thumbtack, M2 didn't exist. In fact, no teams existed. Engineering was one more or less cohesive unit, which gave the advantages of less overhead and greater personal freedom. However, there was a distinct lack of scalability, and Thumbtack is very keen on scaling.

So, in only my second week, Engineering participated in what we call a Product Process meeting. In these meetings, we decide how to organize ourselves and proceed in getting work done. It was in this meeting that we created the M2 team, among others. I had the reasonable assumption that with only a few days at Thumbtack, I wasn't going to have that much authority on how to run the engineering team. As it turned out, people wanted to listen to what I had to say. Even more surprising, I felt perfectly comfortable contributing in this doubly new environment. I'd like to chalk it up to my own assertiveness and personal charisma, but I'm not that good of a liar. Whatever the real reason, it's a moment that stands out to me, as much as a meeting can be particularly memorable.

Another measure of the atmosphere at Thumbtack is just how much time people spend together outside of work. I've gone rock climbing, biking, and hiking with my coworkers, and played board games late into the night. I've played an irresponsible amount of foosball in the office. I did, however, occasionally get some work done.

Working at Thumbtack

To understand some of the work I did at Thumbtack, it's necessary to understand that Thumbtack deals with massive amounts of human-generated data: messages, requests, profiles, quotes, everything involved with users buying and selling services to each other. Much of this data needs to be moderated, categorized, or otherwise administered in ways that are difficult or impossible for a computer. Consequently, Thumbtack has a team of people performing these tasks, a human API. Where we originally had just one person, now we have hundreds, and this will continue to scale with our traffic.

To me, as an engineer, this was and is a huge opportunity. Even if the various manual tasks really can't be automated, preprocessing and filtering the data as much as possible before handing it off can still be an improvement. Even displaying existing data in a new way can help the human operation be more effective.

My very first project was along those lines. When customers send requests for their service needs, those requests are checked manually to ensure that we only send high quality requests to professionals. I wrote code to automatically check for the most common features of bad requests, such as duplication of an existing request. As a result, we will no longer have to check manually for those features, both saving us work and decreasing the processing time for requests.

I did other work like this, and it's something I'll miss about Thumbtack. There's a certain allure in making changes that save people time, in writing tools that people need and doing it in just a few days.

On the other hand, it's satisfying to build and own a big project. Mine involved tracking and aggregating credits. Credits are an intermediate currency sold by Thumbtack which service professionals use to purchase bids. Since professionals can hold their credits for a while before using them, those of you familiar with accounting will understand that Thumbtack has significant deferred revenue at any given time. Since we sell credits at varying discounts for bulk purchases, and since credits and bids can both be purchased and refunded in certain circumstances, it turns out to be difficult to precisely determine our deferred revenue without some fairly fine-grained tracking. This tracking is exactly what I implemented, providing financial data that we simply didn't have before. It was a fun and meaningful project, and I worked closely with our CFO. I learned quite a bit from him, and from this project, about what businesses need when it comes to financial data.

Learning at Thumbtack

I actually learned a lot more at Thumbtack than I expected. Obviously I learned about our stack, became more familiar with the technologies we use, et cetera. That's a given. But engineers here go out of their way to learn new things; possibly because it'll make them better engineers, but mostly because they want to know. I attended regular classes in two subjects: I met with other engineers to work through MIT's Operating System Engineering and Coursera's Machine Learning. It struck a nice balance to take these courses with the motivation of a group but without the commitment of a real college course.

Leaving Thumbtack

At the end of my internship, I'm happy with what I accomplished, but I'll miss Thumbtack. I'll miss the awesome people, the big opportunities, and the hundred fun projects I never got to start.

I hope that you see something you like in my internship, that you want to take advantage of what Thumbtack has to offer.

Because if that's the case...

You may be interested in an internship at Thumbtack.

Thought you had seen enough scroll-based animation? Think again, but this time in AngularJS!

During the latest redesign of our homepage I implemented a framework in AngularJS for building scrolling animations. The framework is called angular-scrollery and it makes implementing a scrolling animation as simple as defining a series of steps and element behaviors specific to each step. In this post I will describe the process of creating a scrolling animation (specifically, this demo) with angular-scrollery.

Step 1: Set-up

The first thing you need to do is standard set-up for using an angular module. This includes:

  • including angular-scrollery-config.js and angular-scroller.js on your page
  • initializing the angular-scrollery app
  • including the controller attribute on the animation container like so:

    <div class='animation-container' ng-controller='ScrollerController'>

Step 2: Design animation

In angular-scrollery, animations are broken down into steps that you define in angular-scrollery-config.js, following the convention that user-defined settings exist in a separate config file. Breaking down the animation into steps makes it easier to tweak your animation as you're developing it. Rather than defining the beginning and end points of every sub-animation of every element, these sub-animations are grouped into steps, which altogether make up the whole animation. This shrinks the number of variables you have to keep track of when designing the animation.

The angular-scrollery-config.js for the demo animation is copied below. Each step has a name (which isn't actually used anywhere but can help you stay organized), a starting point (defined as the vertical distance from the top of the page in pixels), and a duration (defined as the length of the scroll in pixels). For example, in our demo config we have the second step lasting from when the user has scrolled 200 pixels down the page until the user has scrolled 200 more pixels. Note that the start of each step is the sum of the durations of all previous steps. You will probably continue to tweak this as you work on your animation.

angular-scrollery-config.js

(function() {
"use strict";

var app = angular.module("angular-scrollery-config", []);

app.service("scrolleryConfig", function() {
    // steps is an array of objects formatted like: {name: "stepName", start: 0, duration: 200}
    var steps = [
        {
            name: "fireAppear",
            start: 0,
            duration: 200
        },
        {
            name: "smokeAppear",
            start: 200,
            duration: 200
        },
        {
            name: "rocketLaunch",
            start: 400,
            duration: 400
        },
        {
            name: "theEnd",
            start: 800,
            duration: 100
        }
    ];
    return {
        getSteps: function() {
            return steps;
        }
    };
});
})();
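Because each step's start is just the sum of the durations before it, the start values can be derived instead of maintained by hand. The following is a small sketch of that rule; withStarts is a hypothetical helper and not part of angular-scrollery.

```javascript
// Given steps that only specify a name and a duration, fill in each start
// as the running total of all earlier durations.
function withStarts(steps) {
  var offset = 0;
  return steps.map(function (step) {
    var placed = { name: step.name, start: offset, duration: step.duration };
    offset += step.duration;
    return placed;
  });
}

var demoSteps = withStarts([
  { name: "fireAppear", duration: 200 },
  { name: "smokeAppear", duration: 200 },
  { name: "rocketLaunch", duration: 400 },
  { name: "theEnd", duration: 100 }
]);
// The starts come out as 0, 200, 400, 800, matching the demo config.
```

With a helper like this, changing one duration while tweaking the animation shifts every later step automatically.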

Step 3: Add animation behavior to DOM elements

Telling the DOM elements how you'd like them to animate requires adding the scroll-behavior directive to the element, as well as an attribute called animations. animations is JSON defined as {stepIndex: {property: [startValue, endValue]}}. Here is an example from our tests:

<div class="fire" scroll-behavior animations='[[{0: {opacity: [0, 1]}, 2: {translateY: [0, -875]}}]]'></div>

Angular-scrollery parses animations and stores the animations and affected properties for each element. When the user scrolls, the directive on each animated element is notified, the affected steps for each element are calculated, and then those steps are applied. This can mean finishing or reversing steps, calculating an intermediate value for a step currently in progress, or both depending on how far the user has scrolled.
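The calculation of an intermediate value can be sketched as follows. This is an illustration under the assumption of plain linear interpolation, not a copy of angular-scrollery's internals; progress and valueAt are hypothetical names.

```javascript
// How far through `step` the user is, clamped to the range 0..1.
// Before the step starts this is 0 (reversed); after it ends it is 1 (finished).
function progress(step, scrollTop) {
  var raw = (scrollTop - step.start) / step.duration;
  return Math.min(1, Math.max(0, raw));
}

// Linearly interpolate a property between its [startValue, endValue] pair.
function valueAt(step, range, scrollTop) {
  return range[0] + (range[1] - range[0]) * progress(step, scrollTop);
}

var fireAppear = { name: "fireAppear", start: 0, duration: 200 };
var rocketLaunch = { name: "rocketLaunch", start: 400, duration: 400 };

var opacityHalfway = valueAt(fireAppear, [0, 1], 100);        // 0.5
var opacityFinished = valueAt(fireAppear, [0, 1], 500);       // 1
var translateMidway = valueAt(rocketLaunch, [0, -875], 600);  // -437.5
```

Clamping the progress to 0..1 is what lets one function cover all three cases from the paragraph above: a step not yet reached, a step in progress, and a step already finished.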

One-time animations

You can build two types of scrolling animations with angular-scrollery: a scrolling animation that continues to animate any time the user passes that portion of the page and reverses when the user scrolls backward (like the demo), or a simpler animation that animates once and remains static for the rest of the time the user spends on the page.

If you're building a one-time animation, you need to configure two additional variables. Set animateOnlyOnce equal to true and change animationEnd to hold the location on the page that marks the end of the animation (vertical distance in pixels).

The angular-scrollery-config.js for the demo is copied below, altered so that the animation would run only once.

angular-scrollery-config.js (one-time animation)

(function() {
"use strict";

var app = angular.module("angular-scrollery-config", []);

app.service("scrolleryConfig", function() {
    // steps is an array of objects formatted like: {name: "stepName", start: 0, duration: 200}
    var steps = [
        {
            name: "fireAppear",
            start: 0,
            duration: 200
        },
        {
            name: "smokeAppear",
            start: 200,
            duration: 200
        },
        {
            name: "rocketLaunch",
            start: 400,
            duration: 400
        },
        {
            name: "theEnd",
            start: 800,
            duration: 100
        }
    ];
    // For a one-time animation
    var animateOnlyOnce = true;
    // The end (in DOM height) of the animation (for one-time animation only).
    // Start from 0 and add up the step durations, which gives 900 for these steps.
    var animationEnd = 0;
    for (var i = 0; i < steps.length; i++) {
        animationEnd += steps[i]["duration"];
    }
    return {
        getSteps: function() {
            return steps;
        },
        getAnimationOnlyOnce: function() {
            return animateOnlyOnce;
        },
        getAnimationEnd: function() {
            return animationEnd;
        }
    };
});
})();

Currently supported animations

Currently, angular-scrollery supports animations that change the class of an element, use translateY or rotate transforms, or alter opacity. One limitation of angular-scrollery is that you can only apply one transform (translateY or rotate) at a time for a given element and a given step. It could easily be extended to support multiple transform animations at once as well as other types of animations, and if you're interested in doing so you should!
