Saturday, September 30, 2006

ErlyDB 0.7

In the past couple of weeks, I've been hacking away at ErlyDB and the libraries it uses (the Revamped MySQL driver, ErlSQL and Smerl). I added a bunch of features and optimizations and also refactored and/or rewrote most of the original ErlyDB code until I felt the picture looked good.

The last release was v0.1, but I decided to name this release v0.7 because ErlyDB getting pretty close to 1.0 quality. Plus, I just like the number 0.7 :)

Here are the main new features:


Each generated module has the function transaction/1, which lets you execute a group of statements in a transaction against the module's driver. Here's an example:

fun() ->
P = person:new("Menahem"),
P1 = person:save(P),
city:add_person(TelAviv, P1)

Transactions are automatically rolled back if the function throws an error or crashes. For example, this function

fun() ->

would cause the delete statement to be rolled back.

Note: all auto-generated CRUD operations are executed in transactions, so you can set your database's auto-commit option to 'false' and nothing will break.

Protection against SQL injection attacks

ErlyDB uses ErlSQL under the hood to automatically quote all string and binary values in SQL expressions. All functions that let the user define a Where condition expect an ErlSQL expression by default. Here are some examples:

person:find({name, '=', "Joe"}).

{{name,'=',"Joe"}, 'or', {'not', {age, '<', 26}}}).

You can also use ErlSQL expressions for LIMIT and ORDER BY clauses, such as

person:find({name, like, "Joe%"},
[{order_by, [{age, desc}, country]},
{limit, 3, 4}]).

If you want to use string and/or binary expressions, you can define the {allow_unsafe_statements, true} option when calling erlydb:code_gen/3. This will tell ErlyDB to accept statements such as

person:find("name = 'Joe'").


person:find("name LIKE 'Joe%'", "LIMIT 17").

However, this usage is discouraged because it makes you more vulnerable to SQL injection attacks. If you turn this feature on, make sure to escape all your strings using erlsql:encode/1.


You can implement in your modules the following functions, which ErlyDB will use as hooks for different operations:


Fine-grained control on field visibility

By implementing the fields/0 function in your modules, you can specify exactly which database fields ErlyDB will use for the module. For example, if you have the function

fields() -> [name, age, country].

in the module 'person', ErlyDB will only use those fields even if the database table has additional fields (the 'id' field is always assumed to be the primary key).

User-defined table assignment

You can implement the table/0 function to indicate which table ErlyDB should use as the database table for your module. For example, the function

table() -> person.

in the module 'artist' would tell ErlyDB to use the 'person' table in all SQL statements for the module 'artist'.

Multiple modules per table

You can implement the type_field/0 function to indicate which column in the database table holds the module type for each record. In combination with the table/0 function, this lets you store records for multiple modules in a single table. By implementing the fields/0 function, you can also control which fields are exposed for each module.

For example, suppose you have the table 'person' defined as

id integer auto_increment primary key,
type char(10),
name varchar(30),
age integer,
country varchar(20),
office integer,
department varchar(30),
genre varchar(30),
instrument varchar(30),
created_on timestamp,

You can create 3 modules for accessing this table as follows:


-export([fields/0, type_field/0]).

fields() -> [name, age, country].

type_field() -> type.


-export([table/0, fields/0, type_field/0]).

table() -> person.
fields() -> person:fields() ++ [office, department].
type_field() -> type.


-export([table/0, fields/0, type_field/0]).

table() -> person.
fields() -> person:fields() ++ [genre, instrument].
type_field() -> type.

The created_on field will not be exposed to any of the modules.

Note: The fields/0 function isn't limited to defining supersets. You can define any relation you want, as long is it's expressible in Erlang :)

More auto-generated functions

In addition to the find(Where, Extras) and find_id(Id) functions, ErlyDB now generates find_max(Max, Where, Extras) for getting at most Max records, and find_range(First, Max, Where, Extras) for getting at most Max records starting from offset First. Each function has 4 variants, e.g.

find(Where, Extras)

Functions for finding related records also have such variants. Examples:

developer:projects(D, Where)
developer:projects_with(D, Extras)
developer:projects(D, Where, Extras)

developer:projects_first(D, Where)
developer:projects_first_with(D, Extras)
developer:projects_first(D, Where, Extras)

developer:projects_max(D, Max),
developer:projects_max(D, Max, Where),
developer:projects_max_with(D, Max, Extras)
developer:projects_max(D, Max, Where, Extras)

developer:projects_range(D, First, Max),
developer:projects_range(D, First, Max, Where),
developer:projects_range_with(D, First, Max, Extras)
developer:projects_range(D, First, Max, Where, Extras)

Aggregate functions

ErlyDB generates functions for getting aggregate data about records from a module. The functions currently supported are 'count', 'max', 'min', 'avg', 'sum' and 'stddev' (it's easy to add more functions to this list by changing aggregate_functions/0 in erlydb.erl). For example, you can make calls such as

employee:count('distinct name').
city:sum(population, {country,'=',"USA"}).

to get aggregate values from the database. The 'count' function has a special version that takes no arguments.


is equivalent to


In addition, ErlyDB lets you query aggregate data about related records (both in one_to_many and many_to_many relations). Examples:

developer:sum_of_projects(Joe, language).
apple:avg_of_oranges(MyApple, color,
{{state,'=',"ripe"},'and',{size, '>', 3}}).

The special form of 'count' also exists for related records:


Better error handling

In the last version of ErlyDB, functions returned {ok, Result} or {error, Err}. In version 0.7, I changed it so functions return Result or they crash by calling exit(Err). This makes it easier to execute a group of statements and let the enclosing function trap all the errors (it also follows the Mnesia API more closely).

Multiple drivers, multiple databases in one session

The last version of ErlyDB supported only one driver per session. In version 0.7, each module can have a different driver. In addition, each module can have a list of driver options defined when calling erlydb:code_gen/3. All generated functions pass these options to the driver. For example, the MySQL driver accepts the {pool_id, Val} option, which defines against which connection pool ErlyDB should execute SQL statements. This allows you to work with multiple MySQL databases in one ErlyDB session.

Efficient SQL generation

In the last version, ErlyDB created SQL statements by string concatenation. In version 0.7, ErlyDB uses ErlSQL to generate statements as iolists of binaries, which are more efficient than strings because they consume less memory.

Many internal improvements

I rewrote or refactored most of the code from the alpha version. I also optimized the code generation process so it's much faster now.

That's it :)

You can get the distribution from branches/v0.7 in the subversion repository.

Note: ErlyDB requires the Revamped MySQL Driver v0.9.5 or above, which you can get from the Subversion repository under branches/v0.9.5. This driver is based on the original MySQL driver from the YXA project with a improvements I made such as support of transactions, prepared statements, binary queries and more efficient connection pooling. Once the changes I made to this driver are integrated back into the YXA source repository, the Revamped driver will no longer be supported. This is to ensure that the Erlang community doesn't have to deal with multiple forks of the same driver.

Please let me know if you find any bugs or if you have any suggestions.


ajrw said...

Looks good! Do you have any plans to allow automatic creation of tables from a set of field specifications, or will ErlyDB alwas require an existing database schema?

Nick Thomas said...

Wow! This stuff is looking better and better! A question, though: are there any plans to support database engines other than MySQL?

Yariv said...

arjw -- I don't really see a compelling value in schema creation from Erlang. It can't be that much easier than writing CREATE TABLE statements. Am I missing something?

Nick -- there are such plans, esp with Mnesia, Postgres and ODBC. However, these efforts will probably rely mostly on contributions from other developers as I don't have a strong need for them at the moment.

ke han said...

I like where erlyDB is going. My biggest concern is namespace pollution. In anything but simple apps where your model objects are only used as rdb record wrappers, models end up with lots of functions for specialized domain behavior.
You are taking up lots of function names without having an approach for namespace collision. I assume you've thought of this and decided things looked more beautiful without prefixing functions with something like '_db_', so you would have '_db_table'/1 or db_table/1 instead of table/1.

ke han said...

your before_save and after_save functions do not provide enough resolution. In many cases, you need to know if the save was an update or a create. I recommend either offering additional before_update, before_create, etc... or have an optional parameter to the save hooks to pass the context of the operation. You could have a before_save/2 which can be used like:
before_save(Model, Context) where Context == update.

ke han said...

in addition to the resolution on save hooks, having access to information in the after hooks for fetch, save and delete would be useful. This additional info would inlcude the number of rows effected by the operation.

ajrw said...

Yariv: The nice thing about programmatic table creation is that you can write to one (common subset) schema and maintain portability across different database servers. There's also the possibility for automatic database upgrades as new fields are added or existing ones are modified.

In particular I'm thinking of SugarCRM's "vardefs.php" files, which define various metadata for each of the database (and virtual) fields, such as the column type and size. The field descriptor also covers things like formatting conventions, for automatic conversion from the user's display/edit format to the database storage format - useful when handling lots of dates and currency amounts.

Perhaps this sort of thing would be better implemented as a layer over ErlyDB.

ajrw said...

Hm, clicking on the submit button seems to add a comment without giving any feedback (such as moving to a new page). Inappropriate use of Ajax? :)

Dmitrii Dimandt aka Mamut said...

One word: Whoah! Automatic table generation or table migrations (a la Rails?) could be a plus, though

Yariv said...

Thanks for everybody's input!

Yariv said...

Regarding schema generation -- it's simple enough to do. Maybe I'll try to hack something for the next ErlyDB release. About namespace collisions -- I think the likelyhood of a collision is low enough for this to not be such a big concern. However, if it does happen, it can be annoying. I don't see it as an urgent issue, but maybe I'll find an easy way to add an ErlyDB prefix to generated functions for a future release. For hooks -- you can call the is_new/1 function of the domain modules to check for INSERT or UPDATE statements in the before_save hook. In the after_X hooks, they always apply to a single record. If the DELETE or UPDATE statements don't affect a single record, an error is thrown telling you how many records were affected in the database. Isn't that sufficient?

Norbert Klamann said...

@arjw wrt. database independence.

IMHO not a great idea at all. Use the database to its fullest extent ! You never get as much security and performance for the buck as in the database itself.

[Disclaimer] I do Oracle DB-Development for a living.

jwewymolusa said...

But then stopped, malin akerman nude that much longer, be doing your facts.

Bestiality porn zoo sex private photo and movies. said...

Dog porn movies zoo porn movies animal porn movies....

Beast sex zoo sex animal porn. Zoo sex beastiality horse porn dog fucking....