Monday, February 28, 2011

Working around MySQL dain bramage

I'm currently in the process of porting a Perl app that was written for MySQL over to a PostgreSQL backend. Originally it was targeted at MySQL 3.x (yes, the app is that old), long before any kind of transactions or constraints were available, so data updates use this general idiom, sketched in code after the list (those of you used to full-featured databases may want to avert your eyes):

  1. High level function locks tables
  2. Perform queries on database to check for consistency (uniqueness, business rule checks, etc)
  3. Pass parameters to lower level update function
    1. Perform global permissions checking
    2. Perform single update/insert/delete as appropriate
  4. Call additional lower level functions as needed
  5. Unlock tables
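
In Perl DBI terms, a minimal sketch of that idiom looks something like this (table and function names here are hypothetical, not the app's real ones):

sub update_widget {
    my ($dbh, $id, $name) = @_;

    # Step 1: lock every table the checks and updates will touch.
    $dbh->do("LOCK TABLES widgets WRITE, audit WRITE");

    # Step 2: consistency checks live in the application, not the database.
    my ($dup) = $dbh->selectrow_array(
        "SELECT COUNT(*) FROM widgets WHERE name = ? AND id <> ?",
        undef, $name, $id);
    if ($dup) {
        $dbh->do("UNLOCK TABLES");
        die "duplicate widget name '$name'\n";
    }

    # Steps 3-4: hand the actual work to the lower level function(s).
    _do_update($dbh, $id, $name);

    # Step 5: unlock.
    $dbh->do("UNLOCK TABLES");
}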

Later on, in the 4.1 days, rudimentary transactions were added, though constraints still ranged from nonexistent to useless. Simple transaction support was shoehorned in by adding it to the lower level functions, as sketched after the list.

  1. High level function locks tables
  2. Perform queries on database to check for consistency (uniqueness, business rule checks, etc)
  3. Pass parameters to lower level update function
    1. Begin transaction
    2. Perform global permissions checking
    3. Perform update/insert/delete as appropriate
    4. Commit or rollback transaction
  4. Unlock tables
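
The lower level function picked up the begin/commit, roughly like this sketch (again with invented names; check_permissions stands in for the app's global permission check). Note that the transaction can never cover more than its own single statement plus the audit record:

sub _do_update {
    my ($dbh, $id, $name) = @_;

    $dbh->begin_work;                        # step 1
    eval {
        check_permissions($dbh, 'widgets')   # step 2 (app-level helper)
            or die "permission denied\n";
        $dbh->do("UPDATE widgets SET name = ? WHERE id = ?",
                 undef, $name, $id);         # step 3
        $dbh->do("INSERT INTO audit (tbl, row_id) VALUES (?, ?)",
                 undef, 'widgets', $id);     # the silently created audit row
        $dbh->commit;                        # step 4
    };
    if ($@) {
        $dbh->rollback;
        die $@;
    }
}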

(I warned you not to keep reading, didn't I?)

So let's recap where we are now, thanks to sticking with what Celko describes as "not really a relational database, but does sequential processing behind a 'near-SQL dialect' language":

  • No constraints in the database, making direct edits (and let's face it, they always happen sooner or later) quite dangerous.
  • Because there are no constraints, all of the checking has to be done in the application layer.
  • Because the checking is done in the application, all involved tables still have to be locked - MVCC would actually make concurrency issues worse, not better.
  • Because transactions are handled in the lower level functions, they can't encompass multiple statements. What's the point? (Okay, to be fair they do cover auditing records that the lower level functions silently create, but that's all.)

Ugh. The ideal path going forward would be to move all of those constraints down into the database where they belong. This is far from a trivial task, however, and one that I probably don't have the resources for. The checks are scattered throughout the codebase, with odd conditions awkward to express in SQL. One common idiom is a column that references table A if the value is positive, table B if negative, or nothing at all if it's zero. If I could wave my magic keyboard, that's the road I'd go down, but it's out for repair, and I need a working app sooner rather than later.
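
To give a flavor of the job: that positive/negative/zero idiom can't be expressed as a plain foreign key at all. The least awkward translation I know of splits it into two nullable foreign key columns plus a CHECK, something like this hypothetical DDL (table and column names invented for illustration):

$dbh->do(<<'SQL');
ALTER TABLE things
    ADD COLUMN a_id integer REFERENCES table_a (id),
    ADD COLUMN b_id integer REFERENCES table_b (id),
    -- at most one reference may be set; both NULL means "no reference"
    ADD CONSTRAINT one_ref_at_most CHECK (a_id IS NULL OR b_id IS NULL)
SQL

Multiply that by every odd check in the codebase and you can see why the magic keyboard stays in the shop.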

Instead, I'm going to have to sprinkle begin/commits throughout the higher level functions, while still preserving the table locks everywhere. To make things even better, some of the higher level functions may call each other, so I need to simulate nested transactions, which PostgreSQL doesn't quite support (yeah, I know about savepoints.) So now I'm looking at something roughly like this, with a sketch in code after the list:

  1. If not already in a transaction, begin one.
  2. Lock tables.
  3. Perform checks.
  4. Call low level functions, or other higher level functions as needed.
  5. If we began our own transaction, commit, else pass control back up.
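
That "begin only if we aren't already in one" dance boils down to a depth counter around DBI's transaction calls. A minimal sketch, with helper names of my own invention (this assumes AutoCommit is on outside these helpers; savepoints could be layered in later if partial rollback ever matters):

my $txn_depth = 0;

sub txn_begin {
    my ($dbh) = @_;
    # Only the outermost caller really begins a transaction (step 1).
    $dbh->begin_work if $txn_depth++ == 0;
}

sub txn_end {
    my ($dbh, $ok) = @_;
    # Inner calls just pass control back up (step 5).
    return if --$txn_depth > 0;
    $ok ? $dbh->commit : $dbh->rollback;
}

sub high_level_update {
    my ($dbh, @args) = @_;
    txn_begin($dbh);                                       # step 1
    my $ok = eval {
        $dbh->do("LOCK TABLE widgets IN EXCLUSIVE MODE");  # step 2
        # ... step 3 checks, step 4 low/high level calls ...
        1;
    };
    txn_end($dbh, $ok);                                    # step 5
    die $@ unless $ok;
}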

Yeah, it's still pretty fugly, but that's the design tax we're still paying for all of the features left out way back in MySQL 3. Thanks, guys.

Converting MySQL blob data to PostgreSQL bytea

I just went through several annoying iterations of preparing to load a MySQL data set into a PostgreSQL server, with somewhat less than helpful error messages (why on earth would a bytea column complain about charset errors anyway?) Now that it's working, I thought I'd share this little function. It's suitable for use in a script that's post-processing the txt files from running mysqldump with the -T option. It takes one argument, a binary string from a blob column, and returns the same data in an escaped form suitable for loading into a bytea column with COPY FROM.

May it save you from a few of the headaches I have suffered.

sub encode_bytea {
    my ($blob) = @_;
    # Emit each byte as \\ooo: backslash plus three octal digits is the
    # bytea escape format, and the backslash itself must be doubled for
    # COPY's text format (so the four backslashes here come out as two).
    return join '', map { sprintf '\\\\%03o', $_ } unpack 'C*', $blob;
}
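
For example, to post-process a tab-delimited dump file whose third column is the blob (a hypothetical layout; this also glosses over mysqldump's own backslash escaping of embedded tabs, newlines, and backslashes, which a real script would need to undo first):

open my $in,  '<:raw', 'widgets.txt'    or die $!;
open my $out, '>',     'widgets.pg.txt' or die $!;
while (my $line = <$in>) {
    chomp $line;
    my @cols = split /\t/, $line, -1;
    $cols[2] = encode_bytea($cols[2]);    # blob column -> bytea escapes
    print {$out} join("\t", @cols), "\n";
}

The rewritten file can then be pulled in with COPY widgets FROM '/path/to/widgets.pg.txt'.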

Saturday, February 19, 2011

How to Lose A Customer in 7 Seconds or WTF, Mozy?

Ah, Mozy. I first signed up with you a few years ago, when it was basically between you and Carbonite. Almost immediately, you saved my bacon (or, more specifically, a couple dozen gigs of photos and music) from a hard drive crash. Not long after that, I laughed at the poor Carbonite users, their backups trashed by a combination of faulty hardware and lack of cross data center replication.

But that was then, and this is now. Hard drives are cheap, bandwidth is plentiful, and everyone is doing "cloud stuff". So how do you respond to the flocks of competitors? Lowering prices? Adding innovative new features? (Yeah, backing up to local hard drives is a good new feature, but hardly one I'd call innovative.) No, not you! Instead you not only drop the unlimited option, you also raise prices!

I'm currently backing up about 75G of data from a single PC (man, those high-megapixel cameras can crank up the file size!) This means the price for me doubled from about $50 to $100 per year. And what do I get for that extra $50? Nothing. Nothing at all. I can't run it on Linux or back up to friends' machines for free, like CrashPlan. I don't get to keep my costs down around $50, like Carbonite. I can't sync my backup set between multiple machines, like Livedrive. I can't play my music or view my photos online or on mobile devices, like SugarSync. I get nothing.

I've never been a big fan of paying something for nothing, so farewell Mozy. I've chosen to go with CrashPlan. The browser-based file access, weekly status reports, and cross-platform support hit all of my needs at the right cost. Although, to be fair, I never would have bothered to find CrashPlan if Mozy hadn't decided to shoot themselves in the foot. I'd still be giving Mozy my money, for fewer features. I guess that puts things in a different light.

Mozy, you may be doing your best to bleed customers, but your sacrifice shall not go completely unappreciated. As a formerly happy, but now happily former customer of yours, you have my gratitude.