I had to play a bit with MySQL, saving and restoring databases with
mysql. These two tools have one
severe discrepancy, which I would like to comment here because I think
it is an excellent example of what not to do when designing tools to be
mysqldump(1) is a
database backup tool, that takes a MySQL database and prints the
necessary SQL statements to restore it to a new server. It has one interesting option:
Use charset_name as the default character set. See Section 10.5, “Character Set Configuration”. If no character set is specified, mysqldump uses utf8.
As you can see, it is quite modern in its default choice: if you do not specify which encoding you want to encode your database content into, it uses UTF-8. This way, it can save any database content, whatever characters it may contain.
A more sensible choice may have been to use the
LC_CTYPE environment variable.
mysql(1) is the command-line MySQL client, which can is used to restore such dumps. It has a similar option:
Use charset_name as the default character set for the client and connection.
A common issue that can occur when the operating system uses utf8 or another multi-byte character set is that output from the mysql client is formatted incorrectly, due to the fact that the MySQL client uses the latin1 character set by default. You can usually fix such issues by using this option to force the client to use the system character set instead.
As you can see, this utility is still has an antique design: if you do not specify a character set, it assumes you are giving it data encoded in latin-1.
You can see the obvious discrepancy here:
mysqldump encodes your database backups in UTF-8 by
mysql restores them
as if they were encoded in latin-1. This way, when you have to restore
your databases, if you thought you could do it without wondering,
congratulations: voilÃ , you have just fucked your databases
up! And as an additional present, chances are high that this mistake is
discreet enoughto be detected only after a while, when modifications
have been done to the restored database and the original backups have
So, if you are dealing with MySQL databases, remember that you must take care of the encoding of your backups, because MySQL utilities will not by itself, or rather it will do just what should not be done. And if you are designing similar tools, please think, and if you want an example of what not to do, here is one!