New Server

Archived development update discussion from past versions
Locked
Nicholas
Posts: 12471

Post by Nicholas »

The website (and database) now live inside Amazon's AWS infrastructure. It was time to renew the site's security cert, so I took the opportunity to shimmy over to a million-times-more-flexible infrastructure.

Let me know if you spot any problems. :D

Thanks!
jimhenry
Posts: 1821

Post by jimhenry »

Board is very slow for me.

And I got this trying to post a reply.

General Error

SQL ERROR [ mysql4 ]

Lock wait timeout exceeded; try restarting transaction [1205]

An SQL error occurred while fetching this page. Please contact the Board Administrator if this problem persists.
Jim Henry
Author of the Miditzer, a free virtual theatre pipe organ
http://www.Miditzer.org/
jimhenry
Posts: 1821

Post by jimhenry »

Of course, no sooner do I complain than everything is back to normal. :roll:
Nicholas
Posts: 12471

Post by Nicholas »

Well, there is definitely something going on. Along with the new infrastructure, I get way better reporting so I can see totally disconcerting things like this:
[Attachment: database.png (24.38 KiB)]
That is to say every few hours it looks like things go crazy for a few minutes. I wonder what might be causing it. Last night I tried to stress out the (embarrassingly slow) scoreboard with a bunch of simultaneous clicks -- thinking that might be the culprit -- and was barely able to make a blip on those charts. So it must be something else.
Nicholas
Posts: 12471

Post by Nicholas »

Alright, I just bumped the specs on the database machine.

One of the very cool parts about Amazon's infrastructure is that it takes a single click and about 4 minutes of downtime to completely change the hardware your stuff is running on. I just stepped up to the next tier that has double the memory and CPU. I'll keep an eye on those charts to see if things are still getting memory starved.
jimhenry
Posts: 1821

Post by jimhenry »

Based on the timing, my suspicion would be a web crawler.
Nicholas
Posts: 12471

Post by Nicholas »

I dug through the IIS logs and discovered your hypothesis was correct. Each spike lined up exactly with a crawler that completely ignored the site's robots.txt (which forbids indexing the scoreboard) and flooded the database with requests for those really slow scoreboard queries.
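
For anyone who wants to do the same kind of digging, here is a minimal sketch of how request spikes could be matched to a user agent. It is not the script used on this site. It assumes the logs are in the W3C extended format (a `#Fields:` header followed by space-separated rows), and the sample rows, dates, paths, and the `ExampleBot` name are all made up for illustration.

```python
from collections import Counter


def requests_per_minute(log_lines):
    """Count requests per (minute, user agent) in W3C extended format log lines."""
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns for the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        parts = line.split(" ")
        if len(parts) != len(fields):
            continue  # skip malformed rows rather than guessing
        row = dict(zip(fields, parts))
        minute = f"{row.get('date', '')} {row.get('time', '')[:5]}"
        counts[(minute, row.get("cs(User-Agent)", "-"))] += 1
    return counts


# Hypothetical sample data, not taken from the real logs.
SAMPLE = """#Fields: date time cs-method cs-uri-stem cs(User-Agent) sc-status
2013-01-01 03:00:01 GET /scoreboard ExampleBot/1.0 200
2013-01-01 03:00:02 GET /scoreboard ExampleBot/1.0 200
2013-01-01 03:00:03 GET /forum Mozilla/5.0 200
""".splitlines()

for (minute, agent), n in requests_per_minute(SAMPLE).most_common(2):
    print(minute, agent, n)
```

Sorting the counts and comparing the busiest minutes against the database charts is usually enough to tell whether a single agent is responsible for a spike.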

I fiddled with the site's robots.txt to see if I couldn't make it a little more explicit, and I also optimized the scoreboard queries a bit. Now that the scoreboard has almost a million scores, the longest-running pages were (embarrassingly) taking 20+ seconds. After optimizing, the longest is only around 3-5 seconds (with most of the rest at sub-second). Finally, when Amazon makes the next version of MySQL available (5.6 instead of 5.5), those times will be cut in half again due to some optimizations the MySQL folks have added to the DB engine.
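
As a rough illustration of what a well-behaved crawler does with a rule like that, the sketch below runs a robots.txt through Python's standard `urllib.robotparser`. The `/scoreboard` path, the `ExampleBot` agent name, and the rules themselves are assumptions for the example, not the site's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every crawler away from the scoreboard pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /scoreboard
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if the given agent may fetch the path under ROBOTS_TXT."""
    return parser.can_fetch(agent, path)


print(allowed("/scoreboard/top"))  # disallowed by the rule above
print(allowed("/forum"))           # no rule matches, so allowed
```

Keep in mind that robots.txt is only a request. A crawler that ignores it, like the one in this thread, has to be dealt with some other way.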

So it was a bunch of work, but hopefully things will be smooth from here on out. I'm going to keep an eye on it for a few days and then probably step back down to the smaller hardware, since the beefier DB server was (way!) overkill in all but the spikiest conditions.
Nicholas
Posts: 12471

Post by Nicholas »

The scoreboard fixes look like they did the trick. We've been back on the smaller hardware for a few days now with no hiccups at all.
[Attachment: fixes.png (26.53 KiB)]
Even after updating the site's robots.txt a second time to tell that crawler it's not allowed to look at any page, it has still been misbehaving. So, I just blocked it outright. Now even those tiny bumps should disappear.
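
The post doesn't say how the block was done, so the following is only a sketch of one common approach: refusing requests by User-Agent before they reach the application. It is written as generic Python WSGI middleware purely for illustration; the `examplebot` substring and the function names are hypothetical, and on this site the block could just as easily live in the web server or firewall configuration.

```python
# Hypothetical list of user-agent substrings to refuse (compared in lowercase).
BLOCKED_AGENT_SUBSTRINGS = ("examplebot",)


def block_crawlers(app, blocked=BLOCKED_AGENT_SUBSTRINGS):
    """Wrap a WSGI app so requests from blocked user agents get a 403."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(fragment in agent for fragment in blocked):
            # Answer immediately so the request never touches the database.
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

Matching on the user agent only stops crawlers that identify themselves honestly; blocking by IP range is the usual fallback when they don't.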