Code Read 9 - John Backus and Functional Programming
The 1977 Turing Award went to John Backus, and in his Turing Lecture, "Can Programming Be Liberated from the von Neumann Style?", he made a vigorous case against traditional "von Neumann" programming (VNP), and for functional programming (FP). Unfortunately, the vigorous rhetorical style of this paper has infused the discussion of FP, resulting in countless flame wars over the last 30 years. The overly broad and weakly supported statements that make up the bulk of the discussion often boil down to these:
"Traditional programming languages are clumsy. FP will solve all your problems, if only you are smart enough to use it."
versus
"FP is a toy. Nobody has done any real work with it, so it must be useless."
Both sides are throwing out the baby with the bathwater, and the good ideas of FP are tainted by their association with their more problematic brethren.
Backus's Paper
Let's start with Backus's paper. From the beginning, he clearly has an ax to grind - he describes conventional programming languages foundations as "complex, bulky, not useful", and conventional programs, while "moderately clear, are not very useful conceptually". Whereas, he says later in his paper, FP programs "can accommodate a great range of changeable parts, parts whose power and flexibility exceed that of any von Neumann language so far." Yet he makes very few points to support these statements.
Expressive Power and Simplicity
One of his recurrent themes is the superior "expressive power" of FP. When he does defines "expressive power", it is not clear that even under his definition FP is any more expressive than VNP. His definition revolves around on having functions that are expressible in one language and not in another, yet he offers no supporting example.
He also asserts that VNP languages are necessarily complex, requiring new features to be added to the language itself, rather than implemented using existing language features. As one of the creators of many VNP languages, he's more than entitled to make such statements. But newer languages (like C, and many OO languages) have reversed this trend by simplifying the underlying language and making it easier to define new functions. They do this without using using FP, so this argument also falls apart.
An Algebra of Programs
The most coherent argument in the paper, and also longest, is that FP allows a relatively straightforward mathematical theory of programming, which is much more difficult in VNP. This is true, but I'm not convinced it is that important. He proves the equivalence of two algorithms for matrix multiplication, and also discusses using this algebra of programs to optimize algorithms safely.
However, formal proofs of programs are only marginally useful, usually taking far longer to do well than to write an algorithm, and are just a prone to misunderstanding of the problem to be addresses. So while it may be easier to prove an FP algorithm does what it tires to do, it is just as hard to prove that what it tries to do correctly solves the underlying problem.
Optimization is important, but as I've argued before, the most effective optimizations involve using an newer algorithm that is not equivalent in a formal way, but only by virtue of a deeper understanding of the problem. For example I once had to optimize an algorithm which had to go through a list doing pairwise operations on each element and all previous elements (e.g. f(1,2), f(1,3), f(2,3), f(1,4), f(2,4), f(3,4),...) This approach takes big-O n-squared operations, so the implementation ran in seconds for n = 100, but took hours for n = 1000, and days for n = 10000. No amount of code optimization was going to improve that. But when I realized that there was a way to run it operating on the elements of the list and only a few previous elements, the program went from needing big-O n-squared operations to big-O n operations, which meant a run time of a few minutes even for n = 100000. No formal manipulations of the algorithm would ever have gotten me there, only the insight that most of the operations were unnecessary by virtue of the problem itself, not by virtue of the comparison algorithm.
Oddly, enough, when talking about this algebra of programs, Backus himself says that "severely restricted FP systems" are easier to analyze, "as compared with the much more powerful classical systems." He seems to be implying that the ease of analyzing FP systems justifies their restrictions, but I think the consensus view today is the other way. Outside of academic circles, formal analysis has never caught on.
Rethinking Computers from the Ground Up
One of the most visionary themes in his paper is the proposal that we change the way we relate to state (aka history, or storage). VNP relies on making many small, incremental, changes to the stored data in a computer. Thus the overall state of that data is consistent and meaningful only rarely - usually it represents a job half done. Moreover, subsequent, or concurrent execution runs the risk of operating on that inconsistent data, resulting in what is called "side-effects". Instead of operating piecewise on small bits of data, leaving the overall state inconsistent during the calculation, Backus proposed not just a programming style but a change in hardware that prevents these side-effects. Not only does this make it easier to analyze the code, it becomes trivial to split larger programs into smaller pieces of work, and run those pieces in parallel. Indeed, this idea seems to be the root of many the "FP" successes - for instance its use in telecommunications software, and in Google's MapReduce. However, I put FP in quotes, because this idea is a fringe benefit of FP, and can easily be implemented in traditional languages (in the right context), and even Ericssons's Erlang projects make extensive use of C libraries. Ironically, the most successful idea of FP - a formalized way to eliminate side-effects - seems to be a side effect itself.
Conclusions
Since Turing and Church proved that "Turning Machine Computable" and "Lambda Calculus Computable" are the same thing, and thus FP and VNP can perform exactly the same computations (with suitable extensions to FP), there are two basic questions. Is either one inherently better than the other in:
1) simplicity/ease of writing code (fewer bugs, easier to understand, better at expressing underlying problem space, easier to fix bugs)
2) speed of execution?
I do not find FP to be any more "expressive" or easier to understand than VNP (and my first programming language was LISP). If anything, the restrictions of FP make it harder to model some problems in code, and thus make it harder to understand that code. It is easier to optimize FP in some cases, and thus optimizations bugs are less likely. But these bugs are a small fraction of all bugs, and among the easier to fix once identified.
As for speed of execution, on existing hardware FP programs tend to be slower, whereas they may be faster on purpose build hardware. This is because of the elimination of side-effects. On conventional hardware, eliminating side-effects often requires duplicating data. On purpose-built hardware, eliminating side-effects allows automatic parallelization. So FP programs may be faster in cases where both problem and the hardware support simple parallelization. But parallelization is notoriously difficult, and automatic parallelization technique are only modestly successful on general problems. The marginal payoff of adding more processors quickly diminishes. Instead, special parallel algorithms need to be built which carefully manage data movement in the system, minimize duplication, and so on. These techniques are equally suited to FP and VNP.
So in the end, I think FP will fade away, except for leaving its name attached to extensions to VNP languages which encourage side-effect free programming for easier application to distributed systems. These extensions will prove to be very useful, especially as programming for the web, and programming for multicore processors becomes more and more common.
Hopefully the dogmatic proponents of FP will be satisfied by this, and the dogmatic detractors will be satisfied, so we can all move on to more productive discussions.
Read more...