Pluggable Type Systems
Following a considerable variety of discussion on Lambda the Ultimate, I grabbed a copy of "Pluggable Type Systems" by Gilad Bracha. It is dated Oct. 17, 2004, although I do not have publication data for it at the moment.
The problems I see with this paper are similar to many I have seen in discussions of static vs. dynamic typing, in addition to the specific suggestions that Bracha discusses, so I am going to attempt to address them as well.
Initially, Bracha writes, "the fact is that a mandatory type system greatly reduces the number of legal programs that can be expressed in a language." This is true, for what it is worth. It is also irrelevant. Essentially all of the "legal programs" in an untyped (but not un-typesafe) language such as Lisp, Python, and Perl are meaningless---it is the purpose of a type system to limit the legal programs in some fashion, usually to reduce the number of programs whose sole feature is to generate runtime exceptions.
I do not mean to suggest that many type systems do not rule out programs which run correctly, or even rule out those that a programmer might actually wish to write. They do, and this is one of the weaknesses of static type systems. But that is a plea for better type systems, not a reason to throw them out entirely. Bracha would assert that pluggable type systems would be the better system, and I will get to them below.
Complexity and brittleness
Bracha writes, "mandatory type systems inhibit software reliability and security, because they contribute to system complexity and brittleness." He goes on to write, "[a mandatory type system] becomes a basis for optimizations and for security guarantees that fail ungracefully if the underlying assumptions of the type system do not hold....Real systems tend to be too complex to formalize...[and the] assumptions tend to...invalidate any reasoning based on the formal model. In addition, implementations tend to have bugs...."
Unfortunately, this can be said for any aspect of programming language implementation, presumably the "system" that Bracha is discussing---the same arguments apply to program semantics, optimization, and essentially all other aspects of programming language that rely on formalisms of any kind. Beyound that, the general argument is true of any formal system.
Bracha does make some good points. The importance of performance is often overstated, the performance of dynamic typing is often more than adequate, and the use of typing information for optimization is hardly the greatest benefit of it. On the other hand, Bracha is again both true and false when discussing the consequences of an optional type system that "has no effect on the run-time semantics of the programming language." Python and Perl, for instance, support public fields in objects, although they do indeed do so by looking the fields up dynamically. The consequence of this lack of type information, in these cases, is a bit worse performance.
(I will admit I am unsure of the distinction between class-based encapsulation and object-based encapsulation that Bracha also mentions in the consequences of the lack of static typing. I can see that private data within a class is impossible to statically check, although I am not sure that it is impossible to do, by restricting access to data to methods on the object. How an automatic, dynamic check would be prohibitively costly, while method-based access is not, I am not getting. On the other hand, neither Perl nor Python offer strong encapsulation---in both cases, that comes under the header of "Don't do stupid things.")
Another good point Bracha makes is that typing rules do not usually effect the evaluation semantics for the common model of programming language. Typed lambda calculi do indeed share evaluation rules with untyped lambda calculi, with the type system only serving to reject some programs. This is, actually, an important point that I will return to later: modulo the consequences of type/run-time semantic independence, types don't do much at run-time. They can't: often typing information is completely discarded at compile time, often well before code generation.
Pluggable type systems
Bracha's main concern is pluggable type systems, with the following features:
- No interaction with semantic evaluation.
- Replacable type checkers, for different purposes. Bracha mentions aliasing, ownership, and information flow. The choice of the checking is up to the developer.
I have some practical concerns about this technique.
One is mentioned by Bracha: necessary metadata. Type annotations are one of the aspects of static typing that everyone loves to hate. They are, however, often necessary: for type inference, bottom-level operations have to be annotated, as well as module boundries, for most languages supporting separate compilation. Pluggable type systems would require significantly more annotations than traditional type sysetms: potentially, every pluggable type system will require its own set of annotations, and since type inference is optional, these annotations may need to be sprinkled throughout the code. Bracha allows that metadata would need to be associated with every node of the program's AST, as well as management of namespace and version information. Will that not make the system more brittle and less reliable, in exactly the same way that Bracha asserts mandatory type systems do?
Additionally, manipulating the metadata in the code is problematic. Bracha mentions the possibility of overwhelming syntactic clutter. He suggests, indirectly, that the syntactic clutter could be managed by an IDE. But IDEs are problematic themselves. There are, after all, reasons why "many users are still reluctant to part with ASCII based editors." IDEs are often brittle and unreliable. They often hide complexity rather than reducing it. As well, ultimately, there is the fact that I am typing this message in plain text, and that Bracha did not insert any metadata smileys in his paper at all.
Consider the history of optional type checking-like systems: lint, "use warnings" and "use strict" in Perl, "gcc -Wall", and Erlang's experiments with optional typing checks.
- When these tools are present, programmers are guided, often forcefully, towards using them. Why are all the directives needed? Because, in practice, no one uses them. Because, as dynamic typing proponents assert, it's often easier to get something working initially without worrying about satisfying the type checker. Because, often, they complain about meaningless things: the program would run perfectly correctly even with the checker spewing warning messages all over the place. Because it is possible to come up with a design you might want to use that cannot be checked by the system.
- Beginning to use these systems after the program has achieved a certain size is remarkably painful. They complain massively, often about constructs that are perfectly correct. They require you to change working code to satisfy them. They require you to insert syntactic clutter to tell them what you are doing. (I seem to recall several open-source projects that reject out-of-hand patches that do nothing but fix "gcc -Wall" complaints. Keeping track of the patches is deemed to be more trouble than the patches are worth.) Ultimately, these systems sometimes require complete redesigns---the approach the programmer has taken cannot be made acceptible to the checker.
- In practice, in thoroughly exercised code, the problems that static checking systems find do not exist. (This is where the Erlang results are important.) In well designed, thoroughly tested code, type errors are rare---a type error is a type error, no matter whether it is discovered at compile time or at run time. Taking the time to add the necessary metadata and satisfy the checker frequently has only a small return. (This may not necessarily be true for specialized type systems, such as security assertion checkers; see the work on the Linux kernel. But I believe even in this case, most discoverable problems are avoided or removed before the checker was used.)
Optional type systems, as I read it, either are not used initially or are made mandatory by administrative fiat. They are not added to the development process later, due to the excessive cost, or they are added later, but with reduced returns.
Typing and programming
In my view, static type checking is primarily a developmental aid. (It would have to be, since the run-time effects are either unnecessary or completely absent.) I am afraid I do not believe the assertion that mandatory type systems are harmful, or that development under dynamic type systems is necessarily quicker. Further, I do not see how an optional type system is likely to be of much use. (Being absolutely certain, modulo bugs in the language implementation of course, that something cannot occur is often a good feeling.)
In my opinion, dynamic type systems are primarily beneficial to programming language builders---good static type systems are hard and still open research topics, and getting even decent ones right is a considerable amount of work. The primary advantage of dynamically typed languages lies in higher-level constructs and better facilities for abstraction. Until relatively recently, those advantages could not be approached at all in static type systems, and the ease of constructing a dynamically typed language still gives them a head start.
I do see the utility of more flexible type systems, in the kinds of assertions that the type checker can prove about the code. A pluggable type system might be a good way to approach a more flexible system. Or, it might be better to skip the intermediate steps and head for Curry-Howard-land directly.