As I recently said, I switched TigerEvents over from using a MySQL input file to migrations. Several reasons were behind this choice:
* I wanted the code to be database neutral. Using migrations would allow me to automatically support MySQL, PostgreSQL, and SQLite by making changes to one file instead of 3.
* I know that the database for TigerEvents is going to change at some point in the near future. Rather than write multiple upgrade scripts manually, I could just use the built-in migration functions to handle the upgrades.
* If for some reason I make a mistake and problems occur on the production site, it is easier to roll back to a previous version using migrations than to dump the database, roll back the code, reload the database, etc. (Note, we still make sure to back up the database, because not doing that is just tempting fate.)
* Migrations will, in theory, make development easier. Someone made a change to the schema? Just run rake migrate, and your database is automatically updated. I think this is easier than reinitializing a database and then loading it with data repeatedly (though the second part of this can be handled with Fixtures).
While an awesome introduction to getting started with Migrations has already been written, I feel that the ‘database neutral’ stuff is missing. From my own experience, several items must be taken into account, mostly due to MySQL usage and certain programming practices that were being used.
Already Existing Database
So we already had a database. A somewhat large database with a number of tables. I didn’t want to spend a lot of time manually creating the migration code from scratch. Was there a quicker way? You betcha. There is a rake task called db_schema_dump which dumps all the table definitions from your database to a file called schema.rb. So, once we do that, all we have to do is copy this information to the self.up section of a migration and we are done, right? Unfortunately, it is NOT that easy. The problem (in my case) is that MySQL data types do not necessarily behave the same as their counterparts in other databases. Mostly, this deals with constraints, but it also has to do with the boolean data type itself.
Constraints
In MySQL, you can do int(8). In Rails, this would be translated into something like :integer, :limit => 8. However, PostgreSQL only has smallint, integer, and bigint, not constraint values. As a result, the previous Rails code returns an error, as PostgreSQL chokes on the SQL command that Rails sends it. So, while it is true that you don’t need to know specific SQL syntax, you DO need to know what kind of data you can feed various databases. For TigerEvents, I just generalized items and made the following changes:
:integer, :limit => 8 became :integer
:integer, :limit => 1 which translated to tinyint(1) in MySQL became :boolean
:timestamp became :datetime (which gave me extra functionality)
Now, I by no means know if these are the optimal solutions, but they did work.
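To make the three changes above concrete, here is a minimal sketch in plain Ruby. The helpers neutralize_column and column_line are hypothetical names made up for this example (they are not part of Rails), and the column names are invented. The sketch assumes that any integer limit other than 1 can simply be dropped, which is what worked for TigerEvents but may not suit every schema.

```ruby
# Hypothetical helper (not part of Rails): rewrites a MySQL-flavoured column
# definition, as it might appear in a dumped schema.rb, into a more general one.
def neutralize_column(type, options = {})
  opts = options.dup
  case type
  when :integer
    limit = opts.delete(:limit)
    # tinyint(1) is MySQL's stand-in for a boolean flag.
    return [:boolean, opts] if limit == 1
    # Assumption: any other integer limit is simply dropped.
    [:integer, opts]
  when :timestamp
    [:datetime, opts]
  else
    [type, opts]
  end
end

# Hypothetical helper: renders one line for the body of a create_table block.
def column_line(name, type, options = {})
  new_type, new_opts = neutralize_column(type, options)
  line = "t.column #{name.inspect}, #{new_type.inspect}"
  new_opts.each { |key, value| line += ", #{key.inspect} => #{value.inspect}" }
  line
end

# Invented column names, purely for illustration.
puts column_line(:priority, :integer, :limit => 8)
puts column_line(:approved, :integer, :limit => 1, :null => false)
puts column_line(:created_on, :timestamp)
```

The printed lines can then be pasted into the self.up section of a migration in place of the MySQL-specific ones.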
Boolean Data
We have a few boolean flags in our database. However, MySQL does not have a boolean data type. Instead, tinyint(1) is generally used. Well, generally that is ok, as ActiveRecord will automatically use tinyint(1) when connecting to MySQL and boolean when connecting to PostgreSQL and SQLite. However, since we started out with MySQL, there were numerous statements such as
@newgroups = Group.find(:all, :conditions => ["approved = 0"])
where 0 is supposed to indicate false. However, since this does not translate to false with PostgreSQL and SQLite, we have a problem. Solution: use placeholders. You are actually allowed to do this:
@newgroups = Group.find(:all, :conditions => ["approved = ?", false])
ActiveRecord takes the value and automatically translates the value into the proper data type, depending on the database type.
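A rough way to picture why the placeholder version is portable is the toy model below. It is NOT ActiveRecord's implementation: bind_conditions and BOOLEAN_LITERALS are hypothetical names, and the per-database literals are assumptions chosen for the example. The point is only that the same Ruby value, false, can end up as different SQL text depending on the database.

```ruby
# Illustrative model only: this is NOT ActiveRecord's implementation.
# The per-database literals below are assumptions chosen for the example.
BOOLEAN_LITERALS = {
  :mysql      => { true => "1",   false => "0" },
  :postgresql => { true => "'t'", false => "'f'" },
  :sqlite     => { true => "'t'", false => "'f'" }
}

# Hypothetical helper: replaces each "?" in the template with a literal
# suited to the given adapter.
def bind_conditions(adapter, template, *values)
  remaining = values.dup
  template.gsub("?") do
    value = remaining.shift
    case value
    when true, false
      BOOLEAN_LITERALS.fetch(adapter).fetch(value)
    when Numeric
      value.to_s
    else
      # Everything else is quoted as a string, doubling embedded quotes.
      "'" + value.to_s.gsub("'", "''") + "'"
    end
  end
end

puts bind_conditions(:mysql, "approved = ?", false)
puts bind_conditions(:postgresql, "approved = ?", false)
```

Hard-coding 0 in the conditions string skips this translation step entirely, which is why the original query only worked on MySQL.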
So once again, while you don’t need to know specific SQL query usage, it is important to keep data types general, and passed values abstracted if you want to create database code which is database neutral (at least for MySQL, PostgreSQL, and SQLite).