User:Απεργός/MySQL troubleshooting

Introduction
This series of notes is closer to a self-guided set of exercises than real course materials. The reason for this is that I'm writing this set of notes with the specific goal of learning the answers to the questions posed, and more specifically, answers that apply to the MySQL setup on the Wikimedia projects.

If you happen to be following along for some reason -- and feel free to do so -- I would just ask that you not edit these pages directly, but leave notes on the talk pages, since it is akin to homework that I need to slog through. Thanks!

Getting more MyQSL-Fu
We run into a variety of issues fairly often, which include: db X has high replag, db Y is unresponsive, db Z crashed and now its tables are in some bad state, db Q ran out of disk space and is now unhappy, etc.

It would be nice to have a guide to deal with these issues as they come up. To that end, I have put together a preliminary list of "homework" questions below. These questions shuld be answered by referring to Internet and other resources, trial and error on a local installation of MySql, and by double-checking with folks who know this material well. As new questions turn up in the course of investigation, they will be added to the list.

As I find information that seems helpful towards finding answers, the information will be added below as well.

These problems may be done in any order, and it's fine to skip around doing pieces of them at a time. The point is to learn something while doing them, rather than to produce a nicely formatted set of answers to turn in.

"Homework" problems
In no particular order, here iis the first draft of the homework questions.

0. What are the basic steps for setting up a nice virtual environment with a MySQL master and a couple of slave servers for testing?

1. Someone reports in #wikimedia-tech that they see a warning message when checking recent changes: "Due to high database server lag, changes newer than 458 seconds may not appear in this list." What do you do to diagnose this? Once you have diagnosed it, how do you resolve the issue?

2. The db responsible for the parser cache ran out of space. (It may or may not be unresponsive to queries.) What do you do to restore the host to service?

3. One of the db slaves crashed hard. After pwercycling the db from the management console, what do you do to make sure MySQL is happy and in sync with the master?

4. What long-running queries should you expect to see on databases that house the Wikimedia projects... on the masters? On the slaves? On db12 (responsible for the watchlist)?

5. Why do we prefer InnoDB to MyISAM tables? What features of InnoDB do we rely on that make it the best choice for us? What MyISAM features do we lose by going with InnoDB, and what do we do about that?

6. The statistics table for Special:Statistics on one of the wikis has been reset to empty values, and the next page edit forced a recomputation of the entries, the sloooow way. You figure that the old information is probably somewhere in the MySQL logs. How can you dig it out of there? Once you have found the values, how can you re-enter the information and make sure that any page edits in the meantime are accounted for in the new values?

7. What are some warning signs when looking at SHOW PROCESSLIST or SHOW FULL PROCESSLIST that a query is liable to be a hog?

8. People often report issues with character sets when trying to import XML dumps of their chosen wiki into their installation. What should these settings be in order to avoid problems, anyways?

9. What are the normal stats for the Wikimedia db servers (queries per second, network usage, load, memory use)? What can our servers comfortaly handle? Where is the tipping point?

10. How can we tell when a db's unresponsiveness is likely due to a denial of service attack and not one (or several) poorly designed queries?

Reference material that could be useful

 * http://wikitech.wikimedia.org/view/Category:MySQL (wikitech pages dealing with the Wikimedia MySQL setup)
 * http://www.highperfmysql.com/ (the O'Reilly book)
 * https://launchpad.net/mysql-sandbox/ (creating multiple mysql instances without affecting what's already on the host)
 * ...? (Add new references here)