[pro] Lisp and DSLs

Paul Tarvydas paul.tarvydas at rogers.com
Wed Jul 20 09:47:48 PDT 2011

It seems to me that most people don't understand the rudiments of CL's facilities for creating DSLs and live in the land of Blub.

If you're going to discuss creation of DSLs, it might be an idea to first relate the aspects of CL using compiler terminology.

Here are some random thoughts (not all are accurate, but they represent an attempt to translate CL concepts into Blub):

- CL is a language that has Lex and Yacc built into it.  You can change the syntax of CL to suit your problem.

- Lisp s-expressions are concrete parse trees.

- The CL "reader" is a scanner.  You don't need to build a scanner, since CL already provides you with one.

- CL symbols are "tokens".

- The CL scanner can read programs from a number of sources - not just files - e.g. strings, keyboard, streams, etc.

- Physicists commonly create DSLs.  They define a notation with which to describe a domain, then use the notation to solve problems in the domain.

- Few other languages can change their own syntax.  Examples of self-modifying syntax can be found in term-rewriting systems, such as the language TXL.

- CL macros are parsers.

- CL macros are code emitters.

- CL macros are source-to-source program transformers.

- CL backquote/comma/at-sign notation is used to splice and modify concrete parse trees into other shapes.

- The CL compiler compiles s-expressions into machine code.

- If one creates s-expressions, one can use the CL compiler to compile them into machine code at runtime.

- The CL compiler is a JIT compiler.  

- The CL repl interactively compiles s-expressions to machine code, then executes them, then prints the results of the execution.

- The CL language defines a rich library of operators for modifying concrete parse trees.

- To a CL'er, infix notation is just unnecessary syntactic sugar.  Many novice CL'ers try to create macros that understand infix notation early on in their careers, but as they become more familiar with the power of CL, they drop the syntactic sugar and use parse trees directly.

- The CL printer contains a notation for displaying structure sharing.  This facilitates debugging of optimizers which transform trees into graphs.

- To write a DSL in CL, one first writes a macro that parses s-expressions.  Then, the macro is extended with backquotes to emit new s-expressions.  The emitted s-expressions are automatically fed to the CL compiler and compiled to machine code.

- Unfortunately, the C language uses the name "macro" for a facility that is far less powerful than that of CL.  In C, the macro facility can only parse a syntax that looks like simple symbols and function calls and a special #if syntax that is not even part of the C language definition.  In CL, the macro facility can parse any s-expression and utilize any CL function during code emission.

- With C macros, the programmer does not have the option to change the syntax of the macro language.  With CL macros, the programmer has full control of the macro syntax.

- CL has the ability to link/load modules at runtime, "on the fly".

- CL invented the concept of JIT (is this factually true?).

- Forward references in CL are automatically handled by the runtime linker.  The linker fixes up references on the fly to improve runtime efficiency (I know that I first saw this facility in UTAH Lisp in 1977).

- The CL scanner, compiler and linker sit in memory all (most) of the time, hence, CL is an all-in-one bundled DSL/compiler toolkit.  Some CL implementations give you the ability to remove the compiler when creating a .exe, if you no longer need it.

- The builtin CL macroexpand function gives the programmer the ability to debug his/her DSL compiler by viewing the emitted code (much like CPP's command line option that leaves the expanded code in a text file).

- CL's rich suite of syntactic and compiler tools came about due to early AI research into automatic code generation.

- CL lambdas (closures) are thunks.  

- CL closures are trampolines.

- CL closures and Scheme's call-with-current-continuation are the ultimate goto's.  Any control flow can be modelled and compiled with these facilities.

- CL closures and Scheme's call-with-current-continuation can be used to create compilers using the concepts of denotational semantics.

- CL closures are (kind of) like Smalltalk's "blocks".

- CL closures are another example of the rich language-building facilities of CL, not commonly found in most languages.

- CL closures can be used to create anonymous functions, similarly to anonymous classes used in Java.

- The facilities of CL make it possible to model and compile just about any kind of runtime semantics, including very non-linear ones, e.g. PAIP Prolog.

- Prolog provides a simple way to build parsers, if efficiency is not of utmost concern.  A number of Prologs are available to CL programmers, so one can have the best of both worlds.  [I use PAIP Prolog to parse diagrammatic syntax.]
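
As a rough illustration of several of the points above (macros as parsers and code emitters, macroexpand for viewing the emitted code, the reader accepting strings, and compiling s-expressions at runtime), here is a minimal sketch.  The defrule form, its :test and :result keywords, and the names classify-small, *form* and *square* are invented for this example; they do not come from any library.

```lisp
;;; Hypothetical toy DSL, invented purely for illustration:
;;;   (defrule NAME (ARG) :test TEST-FORM :result RESULT-FORM)

;; The macro lambda list does the "parsing": it destructures the
;; incoming s-expression.  The backquote template does the "emitting".
(defmacro defrule (name (arg) &key test result)
  `(defun ,name (,arg)
     (if ,test ,result nil)))

;; Using the DSL.  The emitted DEFUN is handed to the compiler or
;; evaluator like any other top-level form.
(defrule classify-small (n) :test (< n 10) :result :small)

;; Viewing the emitted code, much like inspecting preprocessor output.
(macroexpand-1 '(defrule classify-small (n) :test (< n 10) :result :small))
;; => (DEFUN CLASSIFY-SMALL (N) (IF (< N 10) :SMALL NIL))

;; The reader as scanner: read a form from a string instead of a file.
(defparameter *form* (read-from-string "(lambda (x) (* x x))"))

;; Compile an s-expression that only exists at runtime.
(defparameter *square* (compile nil *form*))

(funcall *square* 7)   ; => 49
```

Whether COMPILE produces native machine code depends on the implementation; the standard only requires that it return a compiled function.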


