From a Random Walk to Conscious Improvement in Software Engineering

You're not satisfied with how quickly you're improving as a software engineer. Make the shift from a random walk to conscious improvement.

(This is part 2 in a series on systematic improvement as a software engineer. See part 1 here)

So, you're not satisfied with how quickly you’re growing as a software engineer. And you’re reading this, so you don't want to continue the random walk towards improvement that has characterized your road so far. I want to help you with a method I’ve used to quickly improve how I ship code, and that I’ve used with others to help them ship code faster, with fewer bugs, and less pain during code review.

You've written down your personal software process. (If you haven’t, see part 1, and take a minute to write your process down.)

With that in place, we have a base we can start working from. We can look objectively at the process and start to understand how it works, how it fails, and how to improve it.

Defects

Defects, for the purposes of improving your software development process, are any deviations from the process going perfectly. And I mean cartoonishly perfect. Unrealistically perfect.

Why?

Each defect represents an opportunity for improvement.

Examples:

  • Typos
  • Missing or incomplete tests
  • Logical defects
  • Improper handling of error cases
  • Functionality problems
  • Performance problems
  • Documentation is missing, misleading, or incorrect
  • Defects your process missed that were released to production
  • Defects your process missed that others noticed
  • Skipping a step in the process

You’ll probably want to start general, and make these defect types more specific as you go, because the specificity allows you to better address the problem in your process.

Finding common failure modes

A single defect, taken on its own, isn't necessarily a reason to change your process.

To understand why, you need the ideas of normal and special variance (known in statistical process control as common-cause and special-cause variation).

Normal variance is the statistical outcome of the process when everything is moving along essentially as expected. It's not that nothing ever goes wrong. It's that things go wrong in a predictable way, because of how the process is structured and the problems built into it.

Special variance is due to things outside the process. AWS US-East-1 going down isn’t something you can control. (But it could be something worth thinking about.) It’s outside the process.

Tweaking your process due to special variance isn't going to actually improve your process, because it's not due to the process. It's due to the chaos and complexity of the universe.
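
If you want a rough, mechanical way to tell the two kinds of variance apart, one option is a control-chart style check borrowed from statistical process control. This is a sketch under stated assumptions, not a step in the process described here. It takes hypothetical weekly defect counts, treats the mean plus or minus three times the square root of the mean as the band of normal variance (the usual c-chart rule of thumb), and flags weeks outside that band as candidates for special variance. The function names and sample numbers are made up for illustration.

```python
import math


def control_limits(weekly_counts):
    """Return (lower, upper) c-chart style limits for a list of counts."""
    mean = sum(weekly_counts) / len(weekly_counts)
    spread = 3 * math.sqrt(mean)
    # A defect count can't be negative, so clamp the lower limit at zero.
    return max(0.0, mean - spread), mean + spread


def flag_unusual(weekly_counts):
    """Return the indexes of weeks whose count falls outside the limits."""
    lower, upper = control_limits(weekly_counts)
    return [i for i, c in enumerate(weekly_counts) if c < lower or c > upper]


# Hypothetical data: defects logged per week over eight weeks.
weekly_counts = [4, 5, 3, 6, 4, 5, 20, 4]
print(flag_unusual(weekly_counts))  # → [6]
```

A flagged week is a prompt to go look at what happened, not proof of anything. If the cause turns out to be outside your process, leave the process alone.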

So, in a spreadsheet or notebook, track each defect. Below I describe two approaches: a basic, lightweight one and an advanced, heavier-weight one.

I personally prefer the advanced approach because it generates better insights. If it seems intimidating, start with the basic one and decide later whether the advanced one is worth it for you.

Recording - Basic

  • The date
  • The phase
  • A link to the defect context
  • Notes

Recording - Advanced

  • The date
  • The phase
  • The defect type
  • The phase owner
  • The original issue/ticket number
  • A link to the defect context
  • Notes

[Image: An example line of my defect tracking spreadsheet]

In either case you’ll want to include a link to the defect in context, such as the code review comment, or the comment on the issue in the issue tracker, and personal notes reminding you about what was happening here.
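
If you'd rather keep the log as a file than a spreadsheet, the advanced fields map onto a small record type. This is a minimal sketch, assuming a CSV file is good enough. The class name, field names, and example values below are hypothetical and only mirror the list above.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from datetime import date


@dataclass
class DefectRecord:
    """One row of the advanced defect log."""
    logged_on: date      # the date
    phase: str           # the phase where the defect was found
    defect_type: str     # e.g. "typo", "missing test"
    phase_owner: str     # who owned that phase
    ticket: str          # the original issue/ticket number
    context_link: str    # link to the review comment or issue comment
    notes: str = ""      # personal reminder of what was happening


def write_log(records, out):
    """Write the records as CSV, with a header row, to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(DefectRecord)])
    writer.writeheader()
    for record in records:
        row = asdict(record)
        row["logged_on"] = record.logged_on.isoformat()
        writer.writerow(row)


# Hypothetical example row; the ticket and URL are placeholders.
example = DefectRecord(
    logged_on=date(2024, 1, 8),
    phase="code review",
    defect_type="missing test",
    phase_owner="me",
    ticket="PROJ-1",
    context_link="https://example.com/review/1",
    notes="Reviewer asked for a test on the excluded return values.",
)
buffer = io.StringIO()
write_log([example], buffer)
print(buffer.getvalue())
```

For the basic approach, drop the defect type, phase owner, and ticket fields.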

Using the data to guide changes

After you've tracked the defects your process creates for a week, you'll have a long enough list to start separating signal from noise. You'll see that you reliably make the same kinds of mistakes. Now you can change your written process to combat them. As long as you adhere to the written process, you can begin to see improvements from those changes.
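
Spotting the repeats can be as simple as counting. Here is a small sketch, assuming each logged defect is a dict with "phase" and "defect_type" keys (hypothetical names that match the advanced recording fields). It tallies defects by phase and type and keeps only the combinations that show up at least a chosen number of times.

```python
from collections import Counter


def recurring_defects(log, min_count=3):
    """Return ((phase, defect_type), count) pairs seen at least min_count times."""
    counts = Counter((row["phase"], row["defect_type"]) for row in log)
    # most_common() sorts from most to least frequent.
    return [(key, n) for key, n in counts.most_common() if n >= min_count]


# Hypothetical week of logged defects.
log = [
    {"phase": "code review", "defect_type": "missing test"},
    {"phase": "code review", "defect_type": "typo"},
    {"phase": "code review", "defect_type": "missing test"},
    {"phase": "production", "defect_type": "error handling"},
    {"phase": "code review", "defect_type": "missing test"},
    {"phase": "code review", "defect_type": "typo"},
]

for (phase, defect_type), n in recurring_defects(log, min_count=2):
    print(f"{n}x {defect_type} found in {phase}")
```

The threshold is a judgment call. A one-off is more likely to be noise, while the combinations at the top of the list are the ones worth a process change.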

For example, if you constantly forget to add tests to show certain return values are excluded, you can add an explicit check to your process before you commit code that those tests are present.

Or if you notice that you often begin work before you fully understand the problem and paint yourself into a corner, you can experiment with adding a step to check your potential solution with others on your team before you begin cutting code.

If you're not sure about a change you're considering, conduct a thought experiment. If your process had already included this step, would it have caught the defect? If yes, add it. You can remove it later if it proves ineffective.

After making changes, let the process run again for a bit, and continue collecting data. This way you can improve your process in a systematic, if rudimentary, way.

Next steps

Congratulations are in order. Even this rudimentary system of personal process improvement will put you way ahead of your peers who are adhering to the random walk method of improvement. But there is more you can do to improve the improvement cycle.

You can improve even faster by taking the next step, adding the stopwatch, which I'll detail in part three of this series.