Code Read 4 - Dijkstra's Notes on Structured Programming
Edsgar W. Dijkstra's "Notes on Structured Programming", which is Code Read 4, stuck me as shockingly prescient - or perhaps it is just that we creators of software are very slow to learn these lessons. Surely something written almost 40 years ago should feel more dated than this., and we should have learned or discarded all the lessons in it. But a careful reading shows that, as we have seen before, Dijkstra was one smart cookie.
Overall, he starts by making the point that techniques which work for small programs do not scale to large programs, and therefor we need to be careful how we approach large programs. He then given some techniques and examples how to actually build large programs. Finally he describes a way to think about programs that makes it easy to understand its degree of complexity and the scope of required changes. In all of these sections I found useful insights for my own work.
Misunderstanding Dijkstra
My experience when reading the notes was to continually mistake what Dijkstra was saying for something that has become a hot-button contemporary issue. (From the comments to Scott Rosenberg's post, I think this is a common experience.) This is similar to Dijkstra's own experience with his famous "Go To Statement Considered Harmful" paper (see my earlier post), which he later described as being misunderstood by people who took one relatively superficial idea to be its entire point. But a careful and pensive read reveals surprising depth to what Dijkstra was saying, and what are to me some new (or rather forgotten) insights.
Differences between small, medium and large programs
Dijkstra's central point is that what works for small programs does not for large ones, and that only through great care can we successfully work with large software. Given that he described "page", "chapter", and possibly "book" length programs as large, whereas now even a mid-sized project might have several hundred thousand lines, and thus thousands of pages of code, we should recognize that we are dealing with yet another layer of scaling.
What we will find, though, is that the issues Dijkstra was dealing with are precisely the issues we now face in our day to day work of designing and coding software. So even tough some of his techniques might not work for us, and new techniques have been invented, his insights are at the core of what has driven our field for the last 40 years, and are still quite valuable.
Demonstrating correctness and proofs
Although a lot of what he says early in the paper is a restatement of things we've covered is the first Code Reads, some points stand out. Ensuring that a program is correct is one of the most important ones. Dijkstra gives a complete and mathematically rigorous proof of a very simple of programs. It takes several pages, is largely an exercise in logical reduction, and is mind-numbingly boring. Dijkstra goes on to say that such proofs are not the point - that the great difficultly and length for even a simple program is in fact the point he was making. He also shows how it is impossible to completely test a program. So we must be able to convincingly argue the correctness of a program without being able to rigorously prove it, or test every possible case. To so this, he says, we must structure our programs so that their correctness is clear. This is a repeated theme of his - keep it simple, make it clear. It is something that takes hard intellectual work on our part, but it pays off.
He also makes a very valuable point about abstractions - when using a tool such as a library, or sophisticated programming language, it is very important to know the operational limits of that abstraction. For instance, you can not always add two integers and assume the result will be correct, because the sum could be larger than the integer representation will allow. This is one of the key points about using such layers of abstraction - although you do not need to know how it works, you do need to know exactly what it does and what its limits are. This is such an important point it might be worth its own post in the future.
Understanding and comparing programs
Dijkstra starts leading to his goal of program design by talking about how we understand programs. I think we all have experienced that it is much harder to understand someone else's code than it is to write our own. One touchstone of a really good software designer is that their code and design are easy to understand. Dijkstra looks at exactly what coding techniques make code easier to follow, and also proposes some techniques for figuring out what was happening after a critical failure (a "core dump"). Remembering that this was written almost 40 years ago, I am frankly awed. Even though I doubt he invented it all, almost everything he supported is now so commonly accepted as best-practice that most of us have forgotten there ever was even a question. The basics of programming, "if-then-else", "case" statements, "while-do" and "repeat-until" loops, and stack-based subroutines are all there. The only thing he proposed which has not caught on widely is the idea of keeping an absolute count of the number of times each loop has been executed. And his reasoning is very important - he supports all of these because they make the structure and flow of the program easier to understand.
Dijkstra also points out that it is hard to compare programs that are anything but superficially different, and that anything other than such superficial differences constitutes a design decision. He's foreshadowing the rest of the paper, so this will have to wait a little bit.
It is in this section, and the next, however, that the limits of what Dijkstra was talking about are felt, however loosely. Many of the more modern techniques for coping with large systems evolved as approaches for coping with the issues Dijkstra addresses in this paper, and for much larger systems. The value of Dijkstra's paper is that it identifies the core issue, not wrapped in the theory of a particular solution.
Actually writing code
A lot of the paper is devoted to how one actually writes code - an examination of the thought processes involved in going from the English instruction "print a list of the first 1000 primes" to the software instructions that actually do just that. (He does this twice, the second time with equally pedagogical problem). Although I tend to prefer a more intuitive approach, his process of "step-wise" construction of code is a very good technique to use if one gets stuck. Moreover, it is easily generalized and applied to designing larger software systems.
Near the end of the paper, once he has laid the foundation for it, he makes a very interesting point which comes from his step-wise technique: "Programming (or problem solving in general?) is the judicious postponement of decisions and commitments!" By working with unimplemented abstractions, we are free to understand the problem better by the time we need to actually decide on an implementation. This is a valuable idea, which can be applied almost every day. Before you decide that everything looks like a nail, see if maybe you have more than a hammer.
There's a lot more to say about this, but this post is already quite long, so I'm going to put this one out as is, and say more in my next post.
No comments:
Post a Comment