Posts Tagged ‘Language Design’

The Obvious Truth

Friday, March 7th, 2008

One of the most important qualities of a good programming language is the first-class function. This critical language feature makes possible a whole class of behavioural and data abstraction techniques. Without it we could not incorporate many modern language features like continuations, generators, or callbacks.

There are of course workarounds for any of these features. Articles have been written on simulating continuations in Object-Oriented languages, and callbacks can in most cases be replaced by the somewhat more complicated event or message passing mechanisms. Generators, however, really are not much use without proper first-class functions.

A generator is simply a function that sends a number, possibly infinite, of values to an independent language construct, which can then take advantage of these values, one at a time. The proverbial construct associated with generators is the for loop. The loop invokes the generator, a value function within the generator produces each value, and the value function itself calls a function parameter containing the code body of the loop itself. There are usually some extra mechanisms to handle premature termination of the loop (among other things).
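The arrangement described above can be sketched roughly as follows. This is only a minimal illustration in Python, not how any particular language implements its for loop; the names `for_each`, `count_from`, `emit`, and `StopLoop` are made up for the example, and an exception is assumed as the mechanism for premature termination.

```python
class StopLoop(Exception):
    """Raised by a loop body to leave the loop early (hypothetical name)."""


def count_from(start, emit):
    """A generator in the sense used above: pushes an unbounded
    series of values, one at a time, to the value function `emit`."""
    n = start
    while True:
        emit(n)
        n += 1


def for_each(generator, body):
    """Plays the role of the for loop: it invokes the generator and hands it
    a value function that calls the loop body for each value produced."""
    def emit(value):
        body(value)

    try:
        generator(emit)
    except StopLoop:
        # Premature termination: the body asked to stop, so unwind quietly.
        pass


seen = []


def body(value):
    # The code body of the loop, passed around as a first-class function.
    if value > 3:
        raise StopLoop
    seen.append(value)


for_each(lambda emit: count_from(1, emit), body)
print(seen)  # [1, 2, 3]
```

Note that everything here hinges on being able to pass `body` and `emit` around as ordinary values, which is the point about first-class functions.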

I digress, however, since what I really want to discuss here is the language elements left behind: features such as environment management, function dispatch mechanisms, exception handling, memory management, concrete syntax, and so on. What I really want to see is language designers moving towards a more unified system, providing programmers, or at least library implementers, with abstractions for these features, rather than fixed semantics that limit the expressivity of the language.

You see, despite so many languages taking great pride in providing first-class functions, most actually just inherited them due to the popularisation of functional programming techniques. The real hidden gem here, I feel, is the reification of language semantics. This can allow programmers to selectively enhance or extend their favourite language without losing the built-in compiler or interpreter, possibly at the cost of more complex optimisation.

My favourite example here is the environment, an element of language design that has historically encountered limitless debate. Yet, for reasons that I cannot comprehend, even languages promoting expressivity over implementation consistently refuse to provide language level abstractions over these features.

I’m not saying that most programming requires such levels of expressivity. Quite the opposite: most programming tasks will never require it. But there are some tasks that would benefit greatly from the presence of such features. Rather than re-implementing these, it should be possible to just tap into the abstractions provided by the programming language.

One style of programming that would definitely benefit from these extensions is programming language research. It would be much simpler to test out new environment models, construct prototype virtual machines and interpreters, and even mess around with less common things like dispatch algorithms or meta object protocols, if instead of building entirely new environment abstractions, the existing ones could be utilized and re-engineered in a controlled way. This requires both abstractions that limit direct manipulation of the host language implementation (otherwise you are really just re-implementing the host language) and an interface that is preferably more powerful than the one likely used inside the host language’s compiler or interpreter.

Using Scheme as an example, it would be trivial to add some intermediate abstractions for managing the environment. Scheme already implements something very close to first-class environments, and in fact many Lisps do provide procedures for inspecting and controlling where evaluation takes place, but it’s really not quite there yet.
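To make the idea of an environment as an ordinary value a little more concrete, here is a rough sketch in Python rather than Scheme. It leans on two things from the Python standard library: the built-in `eval`, which accepts a mapping to use for name lookup, and `collections.ChainMap`, which gives nested scopes. The `evaluate` helper and the variable names are purely illustrative, and this is nowhere near the kind of language level abstraction I am asking for; it only shows the flavour of controlling where evaluation takes place.

```python
from collections import ChainMap

# An environment is just a value: a chain of scopes mapping names to values.
global_env = ChainMap({"x": 1, "y": 2})

# A nested environment that shadows `x` but still sees `y` from its parent.
inner_env = global_env.new_child({"x": 10})


def evaluate(expression, env):
    """Evaluate `expression` with names resolved in the given environment.
    (Illustrative helper: the empty dict keeps the caller's globals out.)"""
    return eval(expression, {}, env)


print(evaluate("x + y", global_env))  # 3
print(evaluate("x + y", inner_env))   # 12
```

The same expression gives different results depending on which environment it is handed, and the program gets to build, nest, and pass those environments around itself.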

The reason I singled out Scheme in this case, even compared to other dialects of Lisp, is that the larger dialects typically include an impressive continuations facility. This combines with the hygienic macro feature, producing extremely impressive language extension capabilities. Unfortunately, when writing domain specific languages you still need to generate raw code if you wish to allow fragments of host language code to be embedded into the DSL. In my case these fragments correspond to actions and predicates in a parsing language I’ve been implementing.

Perhaps I should attempt a Scheme dialect with such features, and just maybe I’ll even end up with something new?

– Lorenz

Growing Pains - Bootstrapping a new Language

Friday, January 25th, 2008

Creating a programming language is hard. I’m not talking about the kind of hard that’s involved in math problems or even getting pwned with that rather expensive new internet connection (no, it’s not the mouse either… you’re really just not that good;). No, I’m talking about the sheer amount of effort required to first come up with your amazing, world changing programming language, and then realizing that you actually need to implement the whole thing a few times over, using variations of that language that are in fact worse than what you started with (unless you’re using Java, in which case, jump right ahead).

As you might have guessed I’m mainly writing this out of mere frustration, that even the dumbed down, simplified variation of Mention that I’m attempting to bootstrap is going, well, slowly. Very slowly.

So far I’ve come to the conclusion that there may even be some general rules of thumb, which to the enlightened few, might not actually seem all that obvious. Firstly, the current attempt to bootstrap Mention is utilizing the Squeak programming language/tool-chain, in the hopes that it would help to speed up the development process. It didn’t. Actually, I’m even wondering if it’s slowing me down… Which brings me to the obvious question. Why?

The most useful languages that spring to mind after asking this question probably come from the Lisp family, along the lines of some dialect of Scheme + a parsing language to complete the picture. But what would make a language like Lisp outgun its successors like Smalltalk or perhaps even Python? Well, actually, I think Python might even work, but Smalltalk has issues… Namespace issues.

The issues with Smalltalk I’m referring to are the conventions surrounding the naming of global variables, or at least the lack thereof. Many scripting languages and most Lisps allow the programmer to assign to any old variable name, and it’s simply there, for the entire world to see. But of course, when you have to think in objects, you can’t expect to have globals. After all, good Object-Oriented software should be based on the interaction of objects, and not glorified globals, singletons, or otherwise. Although you could be mistaken for thinking that the Application object is actually a global, but of course it’s not. Why? Well, I’m not sure really, it’s just not.

I’d rather not go on any further with this, but the point is simple. We know that software systems evolve, and don’t simply get designed and then implemented. And we know that software is not actually re-used until at least some refactoring has been performed (generally more than some). So why can’t we just use our globals while we work out the problem we’re dealing with? If they do the job we can just keep them, and if they really are flaky code smells bent on silicon armageddon then we can simply refactor them once we understand how the problem breaks down into the individual problem level (application, subsystem, et cetera) objects.

– Lorenz (feeling slightly better ;)