Sayed's Blog

Scientific Debugging

Posted June 30th, 2020

The development of my debugging technique looks a lot like the development of the scientific method.

Today we know that the Sun is the centre of the Solar System, and that the Earth and the other planets orbit the Sun. Before this was accepted, people used to believe that the Earth was at the centre of the universe.

Eventually, however, people observed things that contradicted the idea of the planets orbiting the Earth. The apparent distances between the Earth and the other planets varied. The motion of the planets looked regular, so it seemed that it could be described by mathematical formulae, but not by formulae that assumed the Earth was at the centre and the planets were orbiting it in perfect circles.

Ptolemy tried to resolve this with the idea of epicycles. These were circles within circles, designed to account for the apparently varying distances of the planets from the Earth.

It looked something like this:

[Image: diagram of epicycles]

This accounted for the motions of various celestial bodies, at least for a while. However, even with epicycles, the observations did not match what the model predicted.

To resolve this, even more epicycles were added. The planets were orbiting Earth in cycles within cycles.

The data kept contradicting the notion that the Earth was at the centre of the universe, but astronomers kept trying to reinterpret the data in a way that was consistent with their biases.

Eventually, Copernicus looked at Ptolemy's models and the Babylonian observations and found that everything was much simpler if the Sun was at the centre of the Solar System and the planets orbited the Sun.

For a programmer, debugging is part of the job. When I first started to program, I was more like Ptolemy. I would hold preconceived notions, and if the outputs contradicted them, I assumed there must be something more elaborate at play that would still make my preconceived notions correct.

This made debugging take much longer than it should have. It limited my ability to focus on the rest of the code.

Eventually, I moved towards an approach that more closely resembles the scientific method. I start with a hypothesis and come up with a test that would either confirm or refute it. If the hypothesis is refuted, I move on to the next one. Once a hypothesis is confirmed, I have identified the cause and can begin to fix the bug.
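To make that loop concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: `parse_price` stands in for some buggy function, `raises` is a small helper that runs one experiment, and `first_confirmed` just walks through a list of hypotheses until one of them is confirmed. It is one possible way to write the idea down, not a prescribed tool.

```python
from typing import Callable, List, Optional, Tuple

# A hypothesis is a description plus an experiment that returns True
# when the hypothesis is confirmed.
Hypothesis = Tuple[str, Callable[[], bool]]


def parse_price(text: str) -> float:
    """Hypothetical buggy function: users report that it crashes on some prices."""
    return float(text)


def raises(text: str) -> bool:
    """Run one experiment: does parse_price fail on this input?"""
    try:
        parse_price(text)
    except ValueError:
        return True
    return False


def first_confirmed(hypotheses: List[Hypothesis]) -> Optional[str]:
    """Test each hypothesis in order and return the first one confirmed."""
    for description, experiment in hypotheses:
        if experiment():
            return description
        # Refuted: drop it and move on instead of defending it.
    return None


hypotheses: List[Hypothesis] = [
    ("surrounding whitespace breaks parsing", lambda: raises(" 12.50 ")),
    ("a thousands separator breaks parsing", lambda: raises("1,299.50")),
]

print(first_confirmed(hypotheses))
```

The first hypothesis is refuted, because `float` tolerates surrounding whitespace, so the loop moves on rather than inventing an epicycle to rescue it. The second is confirmed, and that is the cause to fix.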

With all that said, there are good reasons to depart from proper scientific practice when writing software.

Software is not merely an academic exercise where the goal is to have the best possible understanding of a situation. In software there is often a time constraint. Debugging is intended to fix a problem, ideally quickly. There could be thousands of hypotheses we can generate when debugging to identify the cause of a bug, but there isn't time to test all of them.

We can sacrifice a little accuracy for speed. In computer science, a shortcut of this kind is known as a heuristic.

Another way to arrive at viable hypotheses faster is through dividing and conquering, like in binary search.

Imagine a file with 1000 lines of code.

At some point in this code there is a bug: the output is not what would be expected given the input.

A naive way of finding the bug would be to start from the first line and check, line by line, that the state of the program is still valid. This corresponds to linear search.

A faster way would be to start from line 500 and check whether the bug has already occurred. If the state at that point is not what a correct program would produce, then do the same for line 250; otherwise look at line 750. Each check halves the remaining range, so the location of the bug can be identified in at most 10 checks, since 2^10 = 1024 is more than 1000.
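Here is a rough sketch of that halving search in Python. The function name `first_bad_line` and the callback `state_ok_after` are my own inventions for illustration; in practice the "check" would be a breakpoint, a print statement or an assertion. The sketch assumes that there is exactly one point where things go wrong and that the state stays invalid from then on.

```python
from typing import Callable, Tuple


def first_bad_line(num_lines: int,
                   state_ok_after: Callable[[int], bool]) -> Tuple[int, int]:
    """Find the first line after which the program state is invalid.

    Assumes the state is valid before the bug and invalid from the bug
    onwards, and that the bug is somewhere in lines 1..num_lines.
    Returns (line_number, number_of_checks_made).
    """
    lo, hi = 1, num_lines  # the first bad line is always within [lo, hi]
    checks = 0
    while lo < hi:
        mid = (lo + hi) // 2
        checks += 1
        if state_ok_after(mid):
            lo = mid + 1   # still valid here, so the bug is later
        else:
            hi = mid       # already invalid, so the bug is here or earlier
    return lo, checks


# Pretend the bug is on line 731 of a 1000-line file.
BUG_LINE = 731
line, checks = first_bad_line(1000, lambda n: n < BUG_LINE)
print(line, checks)
```

A linear scan would have needed 731 checks to reach the same line. The catch is the assumption: if the state can become valid again after the bug, the halving step may discard the wrong half.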

Understanding the history of science can go a long way when debugging, especially when it comes to knowing when to deviate from the process.

Debugging, like science, involves finding stuff out. But science and empiricism are not the only way to find stuff out. Various fields, like mathematics and history, have come up with their own ways of finding things out within their domains (the study of how we find stuff out is the branch of philosophy known as epistemology). Many of these methods can transfer to software engineering and debugging.