This is part of a series of questions which focuses on a project called the Abstraction Project, which aims to abstract the concepts used in language design in the form of a framework.

A related question about structural typing can be viewed here. The meta-topic asking about the framework and the proper place to post can be found here.

How easy should it be to use a Language Development Framework?

I've written large-scale code generation frameworks which also included the ability to send the result to the language-specific compiler. The topic of ease of use comes up because of one such framework: CodeDOM, or the Code Document Object Model.

It is a framework written by Microsoft that describes common code structures, but it leaves a lot out (expression coercions, for example), represents certain constructs rather abstractly, and in some cases emits outright bad code depending on what you are doing: earlier versions of CodeDOM poorly handled emitting PrivateImplementationType on CodeMemberMethod when the type used was a generic interface. CodeDOM was my original reason for writing my first code generator.

One thing I'm trying to do to simplify the framework is reduce the amount of work needed to get something done, focusing on actions rather than on the specific types that make up those actions.

Here's a side-by-side comparison. First, how the framework I'm writing works:

//Truncated...
/* *
 * From a project that generates a lexer, this is the 
 * state->state transition character range selection logic.
 * */
var nextChar = nextMethod.Parameters.Add(new TypedName("currentChar", typeof(char).GetTypeReference()));
//...
char start = rangeElement.B.Value.Start;
char end = rangeElement.B.Value.End;
/* *
 * 'start' <= nextChar && nextChar <= 'end'
 * */
currentExpression = start.LessThanOrEqualTo(nextChar).LogicalAnd(nextChar.LessThanOrEqualTo(end));

Versus CodeDOM:

//Truncated...
var nextChar = new CodeVariableReferenceExpression("nextChar");
//...
var start = new CodePrimitiveExpression(rangeElement.B.Value.Start);
var end = new CodePrimitiveExpression(rangeElement.B.Value.End);
currentExpression = new CodeBinaryOperatorExpression(
    new CodeBinaryOperatorExpression(start, CodeBinaryOperatorType.LessThanOrEqual, nextChar),
    CodeBinaryOperatorType.BooleanAnd,
    new CodeBinaryOperatorExpression(nextChar, CodeBinaryOperatorType.LessThanOrEqual, end));
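
To illustrate the "actions versus types" idea, here is a minimal sketch of how action-style helpers could be layered over CodeDOM itself using extension methods. This is not how the framework above is implemented. The names `CodeDomFluentExtensions`, `ToPrimitive`, and `RangeCheckExample` are hypothetical and chosen only for illustration. It assumes the `System.CodeDom` namespace is available (on newer .NET versions this may require referencing the System.CodeDom package).

```csharp
using System.CodeDom;

// Hypothetical helpers: not part of CodeDOM, nor of the framework
// described in the question. They only wrap existing CodeDOM constructors.
public static class CodeDomFluentExtensions
{
    // Builds: left <= right
    public static CodeBinaryOperatorExpression LessThanOrEqualTo(
        this CodeExpression left, CodeExpression right)
    {
        return new CodeBinaryOperatorExpression(
            left, CodeBinaryOperatorType.LessThanOrEqual, right);
    }

    // Builds: left && right
    public static CodeBinaryOperatorExpression LogicalAnd(
        this CodeExpression left, CodeExpression right)
    {
        return new CodeBinaryOperatorExpression(
            left, CodeBinaryOperatorType.BooleanAnd, right);
    }

    // Wraps a char value in a CodePrimitiveExpression.
    public static CodePrimitiveExpression ToPrimitive(this char value)
    {
        return new CodePrimitiveExpression(value);
    }
}

public static class RangeCheckExample
{
    // Builds the expression tree for: start <= variable && variable <= end
    public static CodeBinaryOperatorExpression Build(
        char start, char end, string variableName)
    {
        var nextChar = new CodeVariableReferenceExpression(variableName);
        return start.ToPrimitive().LessThanOrEqualTo(nextChar)
            .LogicalAnd(nextChar.LessThanOrEqualTo(end.ToPrimitive()));
    }
}
```

With helpers like these, `RangeCheckExample.Build('a', 'z', "nextChar")` produces the same nested `CodeBinaryOperatorExpression` shape as the hand-written CodeDOM version, while the call site reads as a chain of actions. The trade-off is that the underlying node types are hidden from the caller, which is exactly the ease-of-use versus raw-power question being asked.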

The focus of the framework is language enthusiasts, as well as those interested in generating code or applications. Given its focus on compilation, code generation, and language development, should the framework focus on ease of use or raw power?

My primary goal is to increase the availability of such tools, so those interested in the domain don't require a lot of experience in the language theory domain before they can start to work on their own language-centric projects.

Given that I'm the author of the framework, my view of "usability" is biased. So I'd like to ask whether the focus and goal make sense to people who aren't associated with the project.

2 Answers

It's tough to build a language-development framework. You have to decide what kinds of things you'd like it to support, then you have to decide which of those you sort of know how to do, and how to integrate those together into a coherent whole. Finally, you have to make enough of an investment that it works with real languages (e.g., typical computer languages as well as DSLs), and actually does something useful. My hat is off to you for trying.

You might compare your effort with the one I started 15 years ago, the DMS Software Reengineering Toolkit. DMS is intended to provide general-purpose parsing, analysis, and transformation of code. Given an explicit language specification, it will parse code, build ASTs, regenerate code from ASTs (prettyprint), transform code using patterns written in the targeted programming language, build symbol tables, compute control and data flow, etc. By adding custom code, one can make DMS carry off a wide variety of effects. (See the tools at the site; they're all DMS in one form or another.)

Here's a technical paper on DMS as it was several years ago. (We keep improving it.)

While DMS itself has been hard to build, we found that it took a correspondingly large chunk of engineering to define real languages to DMS, including IBM COBOL, C# 4.0, Java 1.7, C++11 (and many others).

What we think it does (reasonably well): lower the cost of building tools by 1-2 orders of magnitude. This means that tasks that might otherwise take 1-10 years can be contemplated by mere mortals as 1-month to 1-year projects. What still isn't easy:

  • Defining new languages
  • Handling all the idiocies of current languages
  • Making it easy to write the custom code specific to your task
  • Defining new, complex analyses
  • Handling partial programs, or programs containing errors
  • (To your initial point) Making it easy for nonexperts to use these tools

So, there's lots of room for improvement. Let many flowers bloom.

Ira Baxter
  • 1,930

This question may have been answered in The Mythical Man-Month, in the "Conceptual Integrity" section. If not, that section is at least highly relevant to your question. Even though Brooks describes architecting a whole computing system, the essay applies perfectly well to frameworks and new languages.

I believe a positive correlation exists between the rate of adoption of any technology and its conceptual integrity and ease of use. There should be a case study of recent technologies such as languages, frameworks, and OSs to prove this correlation, but I know of none yet.

maxpolk
  • 433