Recently I needed to compare a specific set of tables in two MySQL databases on a remote system where I did not have Admin privileges (but DB user credentials). Both databases had the same structure, just some different data. I didn’t want to see all the differences in all the tables because there were probably hundreds of them. Not having access to the server logs, I was considering doing some brute-force regex-driven diff of dump files or perhaps a complex SQL query. I knew there had to be a better way.
After fishing around I found a relatively stand-alone tool (Perl dependency, not much else) that could be set up and used quickly on a non-Admin *nix account: mysql_coldiff. It’s easy-to-use, well-documented, and most-importantly, scriptable..
Since I had about a dozen tables that I wanted to compare I wrote a wrapper script to automate the process and produce a nice report of the differences.
#!/bin/bash DIFF_SCRIPT="$HOME/mysql_coldiff-1_0/mysql_coldiff" db4credentials="$1" query_list="$2" out_file="$3" if [ ! -e "$query_list" -o -z "$out_file" ]; then echo 'usage: '`basename $0`' <db4credentials> <query_list_file> <out_file>' echo ' db: seems only 1 set of credentials can be used for both DBs' echo ' query_list_file: list of tables to diff. 1st 2 lines are DB names' echo ' remaining lines have 3 fields per record:' echo ' <table_name> <id_column_name> [IGNORE]' exit 1 else db1=`sed -ne '1 p' "$query_list"` db2=`sed -ne '2 p' "$query_list"` fi read -p "username for $db4credentials: " us read -p "password for $db4credentials: " ps awk ' ($0!="" && $NF!="IGNORE" && NR>2){ gsub(/[\047"]/,"",$0) # strip devious chars for now print "------------------------------\n" $1 "\n------------------------------" cmd="'$DIFF_SCRIPT' -h localhost -u '$us' -p '$ps' -i " $2 " -n '$db1'." $1 " '$db2'." $1 system(cmd); close(cmd) }' "$query_list" > "$out_file"
This strategy seems to work best in situations where differences are minimal (but important). Case-in-point (or bird in hand?), I quickly found the modified field values in a table I had been looking for and changed them with phpMyAdmin. Here is the report I used to find those diffs (output condensed where “…” appears):
... ------------------------------ eav_attribute ------------------------------ We're comparing the the_database_number_1.eav_attribute using the attribute_id column and the database_number2.eav_attribute using the attribute_id column. ... ------------------------------ eav_attribute_group ------------------------------ ... ------------------------------ eav_attribute_option ------------------------------ ... ------------------------------ eav_entity_text ------------------------------ We're comparing the the_database_number_1.eav_entity_text using the value_id column and the database_number2.eav_entity_text using the value_id column. ... ------------------------------ eav_entity_type ------------------------------ We're comparing the the_database_number_1.eav_entity_type using the entity_type_id column and the database_number2.eav_entity_type using the entity_type_id column. ... +---------------------------------------+----------------+------------------------------+ | | entity_type_id | increment_model | | the_database_number_1.eav_entity_type +----------------+------------------------------+ |_______________________________________| 11 | eav/entity_increment_numeric | | | 11 | eav/entity_increment_alphabetic | database_number2.eav_entity_type +----------------+------------------------------+ | | entity_type_id | increment_model | +---------------------------------------+----------------+------------------------------+ ...
Gotchas?
- Had to use the same login credentials for both DBs. May be possible to use 2 sets?
- Not sure if mysql_coldiff handles tables where the primary key is not a single identity column, i.e.; tables that use a composite primary key. From glancing over the mysql_coldiff documentation I think it may not matter too much what you choose as the “index column”, but I figured it probably did, that’s why I added the “IGNORE” param to the input file for my wrapper script when I found that one of the tables I was hoping to compare had a composite primary key.
- There’s always ways to further investigate certain problems and solve them better. But if it ain’t broke neither are you.