6

I'm in a position where we've got some brittle code that constructs SQL-like queries via text concatenation with parameters for inputs. The data source that it queries is fast and scalable but lacking tool support. Over time, additional entities and properties have been added to the data source; some need changing and some have made others obsolete, so the queries need updating.

I can see that this will happen again in a few months and then again.

To reduce errors introduced into the text queries, I suggested writing each query in a separate file (e.g. a .SQL file) and then running a code generator tool that reads the schema from the data source and generates a code wrapper around the SQL-like query. The wrapper would be easy to regenerate at any time and would give compile errors for any out-of-date client code.
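To make the idea concrete, here is a minimal sketch of such a generator in Python. Everything in it is an assumption for illustration: the `SCHEMA` dictionary stands in for whatever the real data source reports, `QUERY` stands in for the contents of a query file, and `generate_wrapper` and the `run` callback are hypothetical names. It only checks the column list of a simple `SELECT ... FROM table` query, and it fails at generation time rather than at compile time, since Python has no compile step in that sense.

```python
import re

# Hypothetical schema; a real tool would fetch this from the data source.
SCHEMA = {"orders": {"id": "int", "customer": "str", "total": "float"}}

# Hypothetical contents of a query file such as orders_by_customer.sql.
QUERY = "SELECT id, total FROM orders WHERE customer = :customer"


def generate_wrapper(name, query, schema):
    """Return Python source for a function that wraps `query`.

    Raises ValueError if the table or a selected column is missing
    from `schema`, so stale queries are caught when regenerating.
    """
    match = re.match(r"SELECT (.+?) FROM (\w+)", query)
    if match is None:
        raise ValueError("unsupported query shape: " + query)
    columns = [c.strip() for c in match.group(1).split(",")]
    table = match.group(2)
    if table not in schema:
        raise ValueError("unknown table: " + table)
    missing = [c for c in columns if c not in schema[table]]
    if missing:
        raise ValueError(f"{table} has no column(s): {missing}")

    # Every :name placeholder becomes a parameter of the wrapper.
    params = re.findall(r":(\w+)", query)
    args = ", ".join(["run"] + params)
    bindings = ", ".join(f"{p!r}: {p}" for p in params)
    return (
        f"def {name}({args}):\n"
        f"    # Generated from a query file; do not edit.\n"
        f"    return run({query!r}, {{{bindings}}})\n"
    )


if __name__ == "__main__":
    print(generate_wrapper("orders_by_customer", QUERY, SCHEMA))
```

Client code would call the generated `orders_by_customer(run, customer=...)` instead of concatenating text. If a later schema drops `total`, regeneration fails with a `ValueError` naming the column, which is the early warning I'm after.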

This idea was met with some skepticism and resistance, even when I offered to fund the development myself.

What are the reasons against doing this, and, for balance, the reasons to go ahead and do it?

(I already saw this post with a couple of answers, but it's not comprehensive.)

JBRWilkinson
  • 6,769

4 Answers

6

Reasons for:

  1. Lots of boilerplate code can be generated (getters/setters, toString(), clear)
  2. Automated solution is less likely to miss schema changes if you're reading the schema to generate the code. In a large set of tables/POJOs this can prevent bugs.
  3. Ability to generate API/schema documentation from your code.
  4. Time saving in the future maintenance of your code base because you can generate new items quickly.

Reasons against:

  1. Takes time to write a code generator (and you still have to write code for the requirements). My argument against this is the time I save in maintenance will make up for it.
  2. Secondary set of code outside of your requirements to maintain.
  3. You can never cover every case in your generator (so you end up with something that allows you to inject custom code)
  4. If the project is small (less than 25 tables), it may be overkill and the time savings may not be as great as expected.

EDIT: I have written 3 different code generators for projects and the greatest factor in deciding whether to do it was the size of the project. I did it for a smaller project and it wasn't as effective as the generator for maintaining the larger projects (maybe that's obvious, but I thought I would throw it out there). If the project is small to medium, I would lean toward not using one.

jmq
  • 6,108
3

Because I created and maintain a code generation framework (ABSE) my answers are probably biased, but here go my arguments:

For:

  • You can "reuse" yourself. Create one generator, use it over and over.
  • You can repeatedly update the generator and generate code. Change in one place, update everywhere.
  • Custom code is usually seen as a deterrent, but if your generator supports the inclusion of custom code, this "mix" can suddenly become a good thing.
  • An expert creates a code generator, the rest of the team can then use it. Everyone becomes as good as the expert. The expert may not like it though :).

Against:

  • Not suitable for newbies. Raising abstraction requires good thinking and some experience.
  • Not suitable for one-off development. If you are creating, say, a PHP script just this once, a code generator is never a good idea. Still, if you use one continuously, you can generate common language patterns.
2

In some respects, it's already been done in other languages. See Hibernate and NHibernate, two libraries which are in use throughout the Java and .NET enterprise worlds respectively, and which generate SQL.

They do this on the fly, rather than by producing code files; however, they're solving the same problem.

I would be surprised if something similar hadn't already been created for C++.

Lunivore
  • 4,242
2

Code generation has a tendency to become something like an automated copy-paste. Therefore, before you generate code, you should always consider writing a piece of code that does the same thing on the fly; e.g. instead of generating code that creates a CRUD form for a table, write something that generates the form at runtime. If this is possible and feasible, it will result in far less code and you avoid all the headaches that copy-paste comes with.
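As a rough illustration of the runtime approach, the sketch below builds a form from column metadata on each call, so there is no generated file to keep in sync. The `COLUMNS` dictionary, the `INPUT_TYPES` mapping, and `render_form` are all made up for this example; a real program would read the metadata from the database.

```python
from html import escape

# Hypothetical column metadata; normally read from the database at runtime.
COLUMNS = {"name": "str", "age": "int", "active": "bool"}

# Assumed mapping from column kinds to HTML input types.
INPUT_TYPES = {"str": "text", "int": "number", "bool": "checkbox"}


def render_form(table, columns):
    """Build an HTML form for `table` from its column metadata."""
    rows = []
    for column, kind in columns.items():
        input_type = INPUT_TYPES.get(kind, "text")  # fall back to plain text
        rows.append(
            f'<label>{escape(column)} '
            f'<input type="{input_type}" name="{escape(column)}"></label>'
        )
    body = "\n".join(rows)
    return f'<form data-table="{escape(table)}">\n{body}\n</form>'


if __name__ == "__main__":
    print(render_form("person", COLUMNS))
```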

But in the real world, we have to admit that it is not always possible or feasible; in such a situation it's still much better to generate code than to write the same thing by hand. A decision that has to be made very early is whether or not you plan to manually edit the generated code. If so, the output should be simple and straightforward, so it's easy to change. If not, you must include extension points that allow for customizations without editing the code. Doing that right is much harder, and if you can really do it, check whether you can avoid code generation completely and create something that constructs the desired object at runtime.
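One common shape for such extension points, sketched here under my own assumptions, is a generated base class that is never edited plus a hand-written subclass that overrides designated hooks. The names `CustomerBase`, `Customer`, `validate`, and `save` are hypothetical, and the two "files" are shown in one block for brevity.

```python
# --- generated file: rewritten on every regeneration, never edited ---
class CustomerBase:
    def __init__(self, name, email):
        self.name = name
        self.email = email

    def validate(self):
        """Extension point: subclasses return a list of error messages."""
        return []

    def save(self, store):
        """Append the record to `store` if validation passes."""
        errors = self.validate()
        if errors:
            raise ValueError("; ".join(errors))
        store.append({"name": self.name, "email": self.email})


# --- hand-written file: survives regeneration ---
class Customer(CustomerBase):
    def validate(self):
        # Custom rule lives outside the generated code.
        return [] if "@" in self.email else ["email must contain @"]


if __name__ == "__main__":
    store = []
    Customer("Ann", "ann@example.com").save(store)
    print(store)
```

Regenerating `CustomerBase` after a schema change leaves `Customer` untouched, as long as the hook names stay stable.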

user281377
  • 28,434