Wednesday, December 17, 2008

Powerful MySQL 5.1 - Cool is an understatement

MySQL 5.1 is here - as I was reading through MySQL's white paper on the new features, I occasionally thought, "yeah, that is cool." The most exciting things though were when I read about some features that made me get excited enough to want to blog about it. There are some totally powerful and powerfully cool features in 5.1

SQL Profiling

For example, the SQL Profiling Utility which lets you see exactly how much time the SQL server did doing what while it logged a profiling session. You can start and stop the session, then look back at each query that happened and then see in microsecond detail what things are taking a long time and which are taking... well, microseconds.
You'll be able to see in certain terms what things are taking a long time and what queries, indexes, or settings or hardware should be improved.

Replication

There are great replication features available that make it more reliable and able to ensure that the master and slave or a whole cluster are correctly in sync. Hurray for row-based replication!

Query Logs

The Slow Query Log and General Query Log are now logged as CSV files, which are accessible (along with any other CSV file) through the MySQL server, which makes it easy to see exactly which queries are taking the longest.

Process List

The show processlist command can now also be accessed through SQL (someone read my thoughts!) which means I can have one thread check whether another thread is currently running a query and how long it has been running, and how many other things are running and then the thread could decide whether it should go ahead or wait or abort. That would have been so hard before, and now it would be pretty simple.

CSV Storage Engine

The CSV Storage Engine gives us a whole new way to think about how to deal with CSV files. You can load any file instantly into MySQL and access it there as a normal table with SQL commands. I could have really used this 5 or 10 years ago, and all my manual work importing CSV files would have been handled in a fraction of the time. A ton of the work that we have to do with programming can now be done with SQL queries instead.

Event Scheduler

There is now an Event Scheduler which makes it easy to optimize tables or execute any other command and set them on a MySQL cron rather than have to depend
on (or remember to setup) Linux crons and their accompanying programs.

Table and Index Partitions

There are awesome new indexing features - the coolest is the ability to create Table and Index Partitions. Now queries against sets of data only need to look for the data in a fraction of the table.
Say we have web logs that are being added to a database or click through tracking data. Over time it grows and grows until we have millions of records. In the past we'd have to delete data or else summarize it and archive it, at which point we don't actually have all the data available to us anymore, just the summary, or we don't have it at all. If we setup table partitions instead, we can migrate partitions off to other physical locations (meaning we could move a set of the table to another physical disk partition), and the table would just keep plugging along. Meanwhile, the queries that only care about the last week or month, or the last year, as well as the queries that only care about January 3 years ago don't have to look through huge indexes or scan the whole table, they just look in the partitions they need data from and use the indexes for those partitions. That is a huge, HUGE benefit. This is the kind of thing that makes a Google search engine run quickly.

Full-text search enhancements and Plug-ins!

There is enhanced full-text search capability, with even the option to build your own plug-ins.


I haven't read about new features this cool for a long time. It will mean a lot of people being able to use a lot of data more efficiently and more quickly than ever before.