Fixing our true Unicodeness
We recently moved zen.org to a different server, and in the process my dump and reload of our MySQL database worked, mostly. However, any posts with UTF-8 Unicode characters were no longer displayed correctly.
After spending too much time trying to figure out how to make mysql and mysqldump help me, I realized I should look around for others who’ve had the same problem.
Voila, Jonkepon in Japan gave the fix for exactly the problem we had. The fix has to do with the collation of the entries in the database, not the actual dumping and importing of the content itself.
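For anyone wondering how a collation setting can garble text when the content itself came across intact, here is a small Python sketch. It assumes (this is my guess at the mechanism, not something I verified on our server) that the UTF-8 bytes were stored unchanged but the column was labeled latin1, so every byte was displayed as its own character. The sample string and variable names are only for illustration.

```python
# Sketch: the same bytes read under two different character set labels.
# Assumption: the stored bytes are valid UTF-8, but the column claims latin1.

original = "caf\u00e9"                 # "café", as the author typed it
stored = original.encode("utf-8")      # the bytes that actually sit in the table

# Reading the bytes as latin1 turns the two-byte "é" into two characters.
misread = stored.decode("latin-1")

# Reading the very same bytes as UTF-8 gives the text back untouched.
recovered = stored.decode("utf-8")

if __name__ == "__main__":
    print(repr(misread))      # 'cafÃ©'
    print(repr(recovered))    # 'café'
```

Nothing in the stored bytes changes between the two reads. Only the label does, which is why relabeling the column is enough and no re-import is needed.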
Since the newer WordPress already does the first step of that fix with SET TABLE, I just had to go in via phpMyAdmin. For post_content in wp_posts and comment_content in wp_comments, I changed the collation to binary (noting whether the type was LONGTEXT or TEXT) and saved. Then I edited each column again, set it to utf8_unicode_ci, and saved.
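If you would rather not click through phpMyAdmin, the same two-step change can be written as plain SQL. The sketch below is a small Python helper that only builds the ALTER TABLE statements and prints them; it does not connect to any database. The helper name conversion_statements and the COLUMNS list are mine, and the column types (LONGTEXT for post_content, TEXT for comment_content) are an assumption about a stock WordPress schema, so check yours first. Note also that MODIFY redefines the whole column, so attributes such as NOT NULL would have to be restated, and a backup beforehand is a good idea.

```python
# Sketch: build the SQL for the "binary, then utf8" column conversion.
# Nothing here talks to MySQL; it only produces statements to review and run.

# Each text type and the binary type of the same capacity.
BINARY_TYPE = {
    "TEXT": "BLOB",
    "MEDIUMTEXT": "MEDIUMBLOB",
    "LONGTEXT": "LONGBLOB",
}

# Assumed column types for a stock WordPress install; verify before use.
COLUMNS = [
    ("wp_posts", "post_content", "LONGTEXT"),
    ("wp_comments", "comment_content", "TEXT"),
]


def conversion_statements(table, column, text_type,
                          charset="utf8", collation="utf8_unicode_ci"):
    """Return the two ALTER TABLE statements for one column."""
    text_type = text_type.upper()
    blob_type = BINARY_TYPE[text_type]
    return [
        # Step 1: drop the character set label by going through a binary type.
        f"ALTER TABLE `{table}` MODIFY `{column}` {blob_type};",
        # Step 2: restore the text type, now labeled with the correct charset.
        f"ALTER TABLE `{table}` MODIFY `{column}` {text_type} "
        f"CHARACTER SET {charset} COLLATE {collation};",
    ]


if __name__ == "__main__":
    for table, column, text_type in COLUMNS:
        for statement in conversion_statements(table, column, text_type):
            print(statement)
```

Going through the binary type matters: converting straight from latin1 to utf8 would make MySQL transcode the bytes, while the detour through a BLOB type leaves them as they are and only changes how they are interpreted.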
Bingo! All is happy and good again. The other tables are all still latin1_swedish_ci (?!), but I’ll leave them alone until we run into another place where it’s a problem.