We had a fun time killing our site in an unusual manner last night and again this morning.
We recently changed how we handle our sports games and scores on our website. When we started out, we used the Pods CMS Framework to create some new tables to store schools, teams and games. What we found is that this added a lot of overhead to every page load, it required a lot of maintenance and the value wasn’t there, so we regrouped.
Now, we have a custom post type Games and custom taxonomies Schools and Sports. Each game comes in as a post with the relevant schools and sport attached. Then it becomes pretty easy to build a page of scores for the day or display a game without any extra queries or database tables. Yay custom post types.
We built everything as a plugin and tested it as a proof of concept. It worked pretty well so we went ahead and imported all 2,400 high school sports games scheduled this season.
The trouble starts
Last night, just as I was dozing off, our site crashed. After a few frantic minutes of everyone trying to figure out what was going on, our sysadmin diagnosed the problem: We were out of disc space on our master database server. He also found a 10 GB test file, which we promptly deleted, and everything came back. This morning, though, the site went down again just as I was coming into work. Again, we were out of disc space on the master db server. And so we started to try and diagnose how we used 10 GB of file space while we were all asleep and almost nothing was being posted to the website.
The problem ended up being our binlogs, which are filling up at an astronomical rate. And the reason was because almost all of the 2,400 games we imported into WordPress are marked as future posts and [for each of them, the Disqus plugin] spawned a one-off WP-Cron job, which is saved in the *_options table. Each time WP-Cron runs, which is sometimes multiple times a minute for us, the array of cron jobs is actually saved back to the *_options table, and each time our master db is updated the command is written to a binlog so it will propagate to our slave databases. Each time WP-Cron runs, we were adding almost 33 KB to the bin file.
The fix (see below for the real fix)
Well, we don’t have a permanent one yet. First, we disabled WP-Cron in favor of true cron jobs to make sure we’re not running things more than once a minute. One option is to remove all the future posts out of cron, or to change them to a different status such as To Be Played that doesn’t trigger a cron job. I think a better solution, though, is to patch WordPress so each new post doesn’t trigger a new cron job. My proposed solution is to search for stories with a status of future and a publish date less than or equal to now. There can also be a flag set every time a post is set to have a future date and then turned off when that post is actually published so that the query doesn’t turn up nothing. This would help avoid clogging up the cron job list with one-offs and would also help avoid cases where cron fails to run correctly and future posts don’t get published.
The true fix
At first we thought that WordPress spawned a new cron job for every future post. The culprit actually turned out to be Disqus, which was creating a new cron job to sync the comments on thousands of posts. The simple solution: In the Disqus settings page, disable comment sync.