SimCity: "your city is experiencing brownouts, build another power plant"
The entire South of Market (SOMA) area of San Francisco is experiencing repeated power outages, which began shortly before 2pm. This is the tech hub of SF, and a number of high-traffic sites and hosts are completely off the grid right now, including 365 Main (a major colo), Craigslist, Netflix, Technorati, Yelp, and Six Apart (Movable Type, TypePad, LiveJournal, Vox, etc.).
InfoWorld's building near South Park is down as well, but our website, hosted at Hosting.com (formerly Verio) on 3rd, is still up and running as of 3pm.
Details on SFGate
Apparently 20,000+ in SF are currently without power. Glad I worked from home (San Mateo) today. =) Just hope the power comes back on before we have to start shutting down servers for heat reasons. And if you see InfoWorld drop off the 'net you'll know why....
They use a flywheel-driven 'CPS' system, coupled with diesel engines, instead of a battery-backup system. It seems the error was not leaving the generators running long enough after the power came back on. There was a succession of short power outages, and the CPS flywheels weren't able to recharge in time. At least, that's what I suspect.
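To make that suspicion concrete, here is a rough toy model in Python. Every number in it (flywheel capacity, generator start time, recharge rate, outage lengths) is invented for illustration and is not taken from 365 Main's actual equipment; the function name and parameters are hypothetical too. It only shows how a flywheel that rides through one outage can come up short on the next if the generators shut down between them.

```python
def first_unprotected_outage(outages, capacity_s=15.0, start_s=10.0,
                             recharge_per_s=0.05, hold_s=0.0):
    """Return the 1-based index of the first outage the flywheel cannot
    cover, or None if every outage is covered.

    outages        -- list of (outage_duration_s, gap_before_next_s) pairs
    capacity_s     -- seconds of load a full flywheel can carry (assumed)
    start_s        -- seconds the diesel needs to take the load (assumed)
    recharge_per_s -- seconds of ride-through regained per second of
                      utility power (assumed)
    hold_s         -- how long the generators keep running after utility
                      power returns (assumed policy knob)
    """
    charge = capacity_s
    generator_running = False
    for index, (duration_s, gap_s) in enumerate(outages, start=1):
        if not generator_running:
            # The flywheel carries the load until the diesel takes over,
            # or until utility power returns, whichever comes first.
            needed = min(duration_s, start_s)
            if needed > charge:
                return index
            charge -= needed
        # Utility power is back: the flywheel spins up again during the gap.
        charge = min(capacity_s, charge + gap_s * recharge_per_s)
        # Simplification: the diesel is still online at the next outage
        # only if the gap is shorter than the hold time.
        generator_running = gap_s < hold_s
    return None


# Six 30-second outages, each followed by a 20-second gap (made-up values).
outages = [(30.0, 20.0)] * 6
print(first_unprotected_outage(outages))               # generators stop at once
print(first_unprotected_outage(outages, hold_s=60.0))  # generators held for 60 s
```

With these made-up values the first call reports that the second outage is unprotected, while holding the generators for a minute covers all six. The real system is certainly more complicated than this.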
According to our server logs (we're in SF Colo 7 of 365 Main), the power was out for at most 2 minutes, and once you account for boot time, the outage itself was likely less than 1 minute. The real problem was that most sites were not designed to come up automatically from scratch; that's why so many had web servers responding with generic error messages. Also, sites that use 365/GNi's SAN service apparently had a longer outage, as the SAN didn't come back up cleanly.
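As a rough illustration of what "coming up from scratch" can involve, here is a minimal Python sketch of one common approach: have the web tier wait for its backend (a database, a SAN-backed service, and so on) to accept connections before it starts serving. The function name, host, port, and timings are placeholders of mine, not anything the affected sites are known to run.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=300.0, interval_s=2.0):
    """Poll until a TCP connection to (host, port) succeeds.

    Returns True once the dependency accepts a connection, or False if
    timeout_s elapses first. All defaults are illustrative.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection as a liveness probe.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused, unreachable, or timed out: wait and try again.
            time.sleep(interval_s)
    return False


if __name__ == "__main__":
    # Hypothetical backend address; substitute your own dependency.
    if wait_for_port("127.0.0.1", 5432, timeout_s=10.0):
        print("backend is up, starting the web server")
    else:
        print("backend never came up, refusing to serve error pages")
```

Run from a boot script ahead of the web server, a check like this means a cold start after a power loss waits for its dependencies instead of serving generic errors until someone restarts things by hand.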
Here's the message to customers from their NOC:
This afternoon a power outage in San Francisco affected the 365 Main St. data
center. In the process of 6 cascading outages, one of the outages was not
protected and reset systems in many of the colo facilities of that building.
This resulted in the following:
- Some of our routers were momentarily down, causing network issues. These
were resolved within minutes. Network issues would have been noticed in our
San Francisco, San Jose, and Oakland facilities.
- DNS servers lost power and did not properly come back up. This has been
resolved after about an hour of downtime and may have caused issues for many
GNi customers that would appear as network issues.
- Blades in the BC environment were reset as a result of the power loss.
While all boxes seem to be back up we are investigating issues as they come in
- One of our SAN systems may have been affected. This is being checked on
right now
If you have been experiencing network or DNS issues, please test your
connections again. Note that blades in the DVB environment were not affected.
Posted by:
Rusty Hodge at July 24, 2007 07:46 PM