Ruby-binlog based MySQL replication listener

jenseng · on Dec 1, 2012

This looks promising. The MySQL -> Postgres replication possibilities are particularly interesting for people looking to switch over a huge database with (almost) zero downtime.

At Instructure we essentially did this last year when we made the switch. We loaded in the initial dataset using pygmy (https://github.com/instructure/pygmy ... basically a wrapper around COPY and friends) and then used a kodama-like script to keep it in sync until we pulled the trigger. We haven't open sourced the replicator yet, as it's a shameful hack ... we were doing statement-based replication, which makes query translation super brittle (yay regexes!) if you do much beyond basic ORM stuff, due to differences between the two dbs.
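To illustrate the kind of brittleness meant here, below is a minimal sketch of regex-only statement translation. It is not the replicator described above (which is unreleased); the function name `naive_translate` and the two rewrite rules are assumptions chosen only for illustration.

```python
import re


def naive_translate(mysql_sql: str) -> str:
    """Hypothetical regex-only translation of a MySQL statement to Postgres."""
    # MySQL quotes identifiers with backticks; Postgres uses double quotes.
    sql = re.sub(r"`([^`]*)`", r'"\1"', mysql_sql)
    # MySQL's "LIMIT offset, count" becomes "LIMIT count OFFSET offset".
    sql = re.sub(
        r"LIMIT\s+(\d+)\s*,\s*(\d+)",
        r"LIMIT \2 OFFSET \1",
        sql,
        flags=re.IGNORECASE,
    )
    return sql


# Plain ORM-style statements translate fine.
simple = naive_translate("UPDATE `users` SET `name` = 'bob' WHERE `id` = 1")

# A string literal that happens to contain backticks is rewritten too,
# which silently changes the data being written.
broken = naive_translate(
    "UPDATE `notes` SET `body` = 'run `ls` first' WHERE `id` = 2"
)

print(simple)
print(broken)
```

The second statement shows the problem: the regex cannot tell an identifier from text inside a string literal, so a robust translator would need a real SQL parser (or row-based replication) rather than pattern matching.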

No matter how you go about it, there are lots of little gotchas like updating your sequences, converting tinyints to booleans, etc. Should be interesting to see where this project goes, since MySQL -> Postgres is the current trend.
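As a rough sketch of those two gotchas, assuming the target tables use serial columns and the MySQL flags are TINYINT(1) values (the helper names `tinyint_to_bool` and `sequence_reset_sql` are hypothetical, not part of any tool mentioned here):

```python
def tinyint_to_bool(value):
    """Map a MySQL TINYINT(1) value to a Python bool, keeping NULL as None."""
    if value is None:
        return None
    return value != 0


def sequence_reset_sql(table: str, column: str = "id") -> str:
    """Build a Postgres statement that moves a serial column's sequence
    past the rows that were bulk-loaded with explicit ids.

    Assumes trusted table/column names; they are interpolated directly.
    """
    return (
        f"SELECT setval(pg_get_serial_sequence('{table}', '{column}'), "
        f"COALESCE(MAX({column}), 1), MAX({column}) IS NOT NULL) "
        f"FROM {table};"
    )


print(tinyint_to_bool(1), tinyint_to_bool(0), tinyint_to_bool(None))
print(sequence_reset_sql("users"))
```

The third argument to `setval` marks the sequence as not yet called when the table is empty, so the first generated id is still 1 in that case.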

kppullin · on Dec 2, 2012

I'm curious as to what other factors drove the MySQL to Postgres conversion. Can you share any additional details (I assume it's not just because Postgres is the new hotness :] )?

jenseng · on Dec 2, 2012

There were several factors, but the two big categories were:

1. Familiarity. The original decision to use MySQL was arbitrary, and as the team grew, we just so happened to hire lots of Postgres-philes.

2. Death by a million paper cuts. Locking/slowdowns as load increased. Useless EXPLAINs and naive query planning. Truncation (rather than rejection) of VARCHARs and INTs that exceed length constraints. Having to rewrite queries in weird ways and/or use MySQL-specific syntax to work around performance issues. Lack of partial and expression indexes. No transactional DDL support.

Additionally, the stars aligned. We had a very large customer -- about as big as our other customers combined (up to that point) -- that wanted an on-premise Postgres install, so we added support for it. Eventually the customer opted to move it into our cloud. We didn't want to have to maintain multiple types of database clusters, so we just moved everything to Postgres.

noplay · on Dec 2, 2012

A similar work-in-progress project for Python: https://github.com/noplay/python-mysql-replication

It's fun to see we are working on similar projects at the same time.

alfiejohn_ · on Dec 2, 2012

Here's the one that I wrote a while back in Perl that we use at Opera:

  https://github.com/alfie/MySQL--Replication

It's only statement based, but it is serving us well.

sonier · on Dec 2, 2012

This is great; I would love to use it to keep MySQL and Hive in sync. I will play around with this soon. Thanks for the great gem!