7

Note: Yes, I know that storing functions in databases should be punishable by law.

We are developing a financial web application using PostgreSQL, Sinatra and AngularJS. As you may have guessed, a financial application may need some calculations.

There is a certain database model (called 'Hypothesis') that has a certain function that needs to be executed overnight. The problem is that this function is different for each instance of the model. That being said, whenever a new hypothesis is added a corresponding function also needs to be added.

Because of the nature of financial applications, the function can either do simple arithmetic operations or integrate over an area. This leaves us with the following options:

  1. Implement a DSL (domain-specific language): This would be ideal, but adding branching and looping logic to a DSL seems more like creating a new programming language.

  2. STDIO pipe: Simply allow the execution of a program in a sandbox on a server. This is most flexible, but the gods of software security would not be merciful. Docker I guess?

  3. Pure sandboxed ruby code: Create a sandbox that allows only functionally pure Ruby code to be executed in it. Then simply evaluate the function inside the sandbox whenever required. This would be a security flaw as well, but not half as much as options 2. For now this option seems to be implementable using trusted-sandbox and pure.

  4. Use the octave-ruby gem: Beautiful idea. But it seems that nobody cares much about the project and we will probably need to fork it and work on it. This seem like the best option, since writing complex math expression is trivial in Octave.

  5. ???

  6. Profit

Another concern would be implementing the ability to test the functions with mock data in the admin panel, but this is not mandatory.

Can somebody suggest a better/more flexible/more secure option? Or at least one that doesn't store a function string in the database? I hope that I have explained the problem well enough.

4 Answers4

2

I don't know how different the functions you want to store are, but if you can isolate few different "types of functions" you could do something like this :

  • Create several generic methods in your model which can make every special calculus needed with the help of some arguments :

    def integrate_over_an_area(args={})
      #Do your calculus with your args.
    end
    
    def simple_arithmetics(args={})
      #Do your calculus with your args.
    end
    
  • Create - for each model instance - a hash describing which generic function to use and the parameters :

    { 
     function: "integrate_over_an_area",
     args: {
       arg1: "value",
       arg2: "value",
       ...
      }
    }
    
  • Store the hashes in a column named "special_function_hash" or something like this, in your model's table

  • Create a "special_function" method in your model which calls the function described in the hash, with the arguments described in the hash
Caillou
  • 256
1

Let's note that how the code is stored is inconsequential; how it is executed is essential.

Most probably your functions have a limited repertoire. Implement them properly, without storing in the DB. (I leave alone the question if a numeric integration implemented in Ruby a good idea. Let's assume you use something like Numpy of the Ruby world.)

Now you can have proper code revisions, tests, etc.

Then you can store a very limited form of a program in your database: which functions to call with which field values. The choice is limited by your blessed set of functions.

What if you need to store simple but unpredictable formulas?

I'd limit this feature to formulas that do simple arithmetics with no side effects, no loops, and no function definitions. It's pretty easy to write a parser for the basic arithmetic expressions with parentheses; most probably ready-made gems exist.

Then I'd store these expressions in the DB as a text. I'd only let in the expressions that can be parsed (no code constructs beyond the tiny allowed subset), and only execute the expressions that again can be parsed (so that putting exploit code directly into the DB won't work). This is apparently what you call the DSL approach.

If the above is not enough in your case, I'd pick a well-known embeddable language that can be stripped of unneeded parts and only allow functions you say it should allow. Lua is an obvious choice, Python is a slightly less obvious choice. Be prepared: it would take quite some work.

Docker could be a simpler approach, though: you can hide anything not strictly needed, like network and disk access, from the process.

But learn first if you are going to need these complexities. Chances are high that you don't.

9000
  • 24,342
0

You could embed an interpreter (like Lua or GNU guile) into the program using the database.

(I am not sure that coding your entire software in Ruby is best; coding it in C++ might be worthwhile)

You would either store in the database expressions (that is string containing Guile or Lua source) in Lua or Guile, or you might store some bytecode or some serialized closures.

0

I would use Jupyter as my backend execution engine & you then can write your functions in several languages e.g. Julia, python or octave. https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages

Bon Ami
  • 317
  • 2
  • 7