YouTube placeholder

Hints for Computer System Design

Statistics: That’s Math, Right

Computer systems researchers have a somewhat tortured relationship with statistics and math. (Many of us are computer systems researchers because we weren’t smart enough to do mathematics.)

  • On a good day: "I’ll rerun my experiment a few times and compute an average."

  • For extra-special bonus points: "I’ll put error bars on my graph."

First, Predict

Before performing an experiment and collecting data it is helpful to make predictions.

  • One way to do this is to draw sketches of the graphs you expect to produce.

  • After collecting real results you can compare them against your predictions as a way of developing intuition about your system.

  • (Predictions about simple cases are also good ways to validate models and simulators.)

Understand Your Data

Beware the premature use of summary statistics--means, medians, etc.

As an example, the following data from two experiments could have the same mean and median:

narrow

bimodal

Clearly they would have very different implications for performance improvement, so examine your raw data!

Love Your Outliers

Outliers are special and deserve special treatment.
  • They may just be weird remnants of your measurement harness.

  • They may have a lot more to tell you.

  • Understand them either way.

Deciding What to Improve

Improve the slowest part, right?

  • (Even if it were this simple, getting programmers to work on any one specific thing can be hard. Frequently they decide to optimize something simply because "they want to".)

Say your code has two functions:
  1. foo which takes 5 minutes to execute.

  2. bar which takes 5 seconds to execute.

Clearly you should immediately get to work improving foo, right?

Why Not foo?

What two elements have we missed in our overly-simplistic decision?

  1. Significance: how much does foo matter?

  2. Difficulty: how hard is it going to be to improve foo?

What do we likely know more about at this point after our experimentation?
  • Significance, so let’s start there.

Amdahl’s Law

The impact of any effort to improve system performance is constrained by the performance of the parts of the system not targeted by the improvement.

Imagine that have the choice between:
  1. Reducing the execution time of foo from + 5 minutes → 1 minute.

  2. Reducing the execution time of bar from + 5 seconds → 4 seconds.

Note that the improvement to foo is better:
  • absolutely (4 minutes v. 1 second) and

  • proportionally (80% v. 20%).

Not So Fast (Pun Intended)

Imagine that have the choice between:
  1. Reducing the execution time of foo from + 5 minutes → 1 minute.

  2. Reducing the execution time of bar from + 5 seconds → 4 seconds.

But what if our program spends 95% of its time running bar but only 0.1% running foo?

  • foo speedup: 0.001 * 240 seconds = 0.24 seconds.

  • bar speedup: 0.95 * 1 = 0.95 seconds.

This is why server performance geeks take a month vacation every time they trim one instruction off of a hot path.

Amdahl’s Law

Even more colloquially:

Ignore the thing that looks the worst and fix the thing that is doing the most damage.

And the unfortunate corollary to Amdahl’s law:

The more you improve one part of a system the less likely it is that you are still working on the right problem!

Hints for Computer System Design

Systems Are More Complicated Than Algorithms

(Don’t tell Atri.)

Why is designing a computer system different from designing an algorithm?
  • "The external interface (that is, the requirement) is less precisely defined, more complex, and more subject to change."

  • "The system has much more internal structure, and hence many internal interfaces."

  • "The measure of success is much less clear."

I have designed and built a number of computer systems, some that worked and some that didn’t. I have also used and studied many other systems, both successful and unsuccessful. From this experience come some general hints for designing successful systems. I claim no originality for them; most are part of the folk wisdom of experienced designers. Nonetheless, even the expert often forgets, and after the second system [6] comes the fourth one.

— Butler Lampson

Three Goals

What are the three goals Lampson focuses on?
  • Functionality

  • Speed

  • Fault-tolerance

What are the three parts of the design task he identifies?
  • Ensuring completeness

  • Choosing interfaces

  • Designing implementations

Summary of the Hints

hints table

To The Paper


Created 2/17/2017
Updated 9/18/2020
Commit 4eceaab // History // View
Built 4/26/2016 @ 20:00 EDT