March 26, 2008 | Comments: (0)
AMD's ready to scale you up
When it comes to scaling x86 servers, it's smarter to think inside the box
Architectural traits reaching back to Pentium remain present in the Intel-powered servers of today. The limitations of those servers aren't likely to be noticed as long as the routine of IT and commercial server buyers is to add capacity by scaling out, purchasing new two-socket servers. But the time will come when adding a rack server, or a rack of servers, is no longer the wise person's path to increased capacity. Smart planning will lead you to handle bigger workloads without more servers.
The terms "scale up" and "scale out" are sometimes unfamiliar to x86 buyers. They refer to the locale of capacity expansion, computing ("thinking") capacity in particular. A server that scales up can be made to handle substantially higher workloads through upgrades inside the chassis. These systems cost more at first, but they're designed to have untapped capabilities that you can turn on with an incremental investment far less than that of a new server.
Scale up is the factor that has kept proprietary Unix big iron in business. Linux on a commodity two-socket Intel server was supposed to push HP, IBM, and Sun out of business. It looks that way if you see a rack chassis as a rack chassis without regard for what's inside. But scale-up maximizes everything from power savings and server consolidation ratio to server longevity, with the bonus of lower long-term costs and higher availability. All AMD Opteron servers scale up. It's baked into the CPU, the bus, and the total system architecture. AMD's strategy is to make it possible to scale up any Opteron server for five years with only a CPU swap, no new server required. This stands in stark contrast to Intel's "tick tock" plan that attempts to nail IT to the stereotypical two-year purchasing cycle. Intel's two-year cycle of obsoleting chips makes parts scarce and expensive, so that if you do buy an Intel-based server with empty sockets with plans to scale it up, it's unlikely that CPUs precisely matching the models you have now will be available, and the availability of FB-DIMM memory at your existing Intel servers' speed may be rare as well. AMD's five-year plan is more in line with the way IBM treats, and retains, its customers.
Scale out means bigger racks, more servers, more heat, higher power and cooling costs, another tick on your service contract, another hand to hold in the middle of the night, and so on. The only thing going for it is convenience, and that's a powerful motivator. Most shops have the deployment of new rack servers down to a science, and there's rarely a need to even remove the cover on a server before you slide it into the rack. Opteron servers yield to the very same plug-and-play initial deployment, but in a few months when you'd ordinarily add a new server, you can take the scale-up route of your choice: Swap out your Opteron CPUs for ones with higher clock speeds or more cores, add RAM or use faster RAM, or fill empty CPU sockets with new CPUs. It really is as simple as it sounds, and when you (or your field service person) button up the case, you have a new server, or two, or two and a half, where your two-socket server used to be.
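To put a rough number on "a new server, or two, or two and a half," here is a back-of-the-envelope sketch. It assumes capacity grows linearly with sockets, cores, and clock speed, which real workloads rarely do, and the before-and-after configurations are hypothetical rather than actual Opteron models.

```python
def relative_capacity(sockets: int, cores_per_socket: int, clock_ghz: float) -> float:
    """Crude capacity estimate: total cores times clock speed.

    Assumption: throughput scales linearly with cores and clock.
    Memory, I/O, and software limits are ignored.
    """
    return sockets * cores_per_socket * clock_ghz


# Hypothetical two-socket server before and after an in-chassis CPU swap.
before = relative_capacity(sockets=2, cores_per_socket=2, clock_ghz=2.0)
after = relative_capacity(sockets=2, cores_per_socket=4, clock_ghz=2.5)

upgrade_ratio = after / before
print(f"Estimated capacity after upgrade: {upgrade_ratio:.1f}x the original")
```

Treat the result as an upper bound. The point is only that the multiplier comes from inside the chassis, with no new rack unit consumed.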
You have to adopt a long-term view to justify buying x86 servers that you can grow without filling more rack units, but the economy has a way of fast-forwarding reality such that the present suddenly laps the plan. If you're not already in spend-it-while-we-have-it mode, all forecasts indicate that you will be. Servers that you buy from now on should put you on course to grow your capacity, or to ready yourself for an overnight recovery, while you gently apply the brakes by reducing your costs now.
If that's too wishy-washy for you, I'll give you a hard example: A copy of Windows Server 2008 costs the same for a one-socket, four-way server as it does for an eight-socket, 32-way server. Each unit of Windows Server 2008 carries a license that permits the operation of an unlimited number of Windows virtual machines on one physical server. Today, expanding Windows server capacity means buying more servers, and therefore more Windows licenses. It may be that you have so many servers that a volume license, as costly as it is, is cheaper or more convenient than one license per server. Using any Opteron scale-up scenario, one Windows license covers all the cores and virtual servers you can squeeze into one physical box. As a bonus, any variety of distributed computing is done faster on scale-up hardware because far more server-to-server communication is handled at the speed of memory rather than the speed of Ethernet.
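The licensing arithmetic can be sketched in a few lines. This assumes one license per physical server regardless of socket count, as described above, and counts licenses rather than dollars because no price is given here. The 32-core target and the quad-core socket count come from the example above; the function name is illustrative.

```python
import math


def licenses_needed(target_cores: int, sockets_per_server: int, cores_per_socket: int) -> int:
    """Per-server licenses required to reach a target core count.

    Assumption: one OS license per physical server, independent of
    how many sockets or cores that server holds.
    """
    cores_per_server = sockets_per_server * cores_per_socket
    return math.ceil(target_cores / cores_per_server)


target = 32  # cores of capacity you want to end up with

# Scale out: add two-socket, quad-core servers until you reach the target.
scale_out = licenses_needed(target, sockets_per_server=2, cores_per_socket=4)

# Scale up: one eight-socket, quad-core server.
scale_up = licenses_needed(target, sockets_per_server=8, cores_per_socket=4)

print(f"Scale out: {scale_out} licenses; scale up: {scale_up} license")
```

For 32 cores, the scale-out route needs four servers and four licenses, while the scale-up route needs one of each.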
That scenario can be carried further. When you get to know Opteron, especially the quad-core Opteron CPU nicknamed Barcelona (revision B, with the TLB flaw repaired, is now shipping), I'll explain how AMD's redesign of the x86 architecture not only scales up through added components, but scales up through evolved software as well. There are many more features in quad-core Opteron than generic x86 and x64 operating systems use. You will scale up your quad-core Opteron servers merely by installing a Windows or Linux point release that includes Opteron-specific optimizations, or changing the architectural target of the projects you compile in-house. I realize that my strong position on Opteron and desktop derivatives, like the amazing Phenom, might appear to some like bias. Please understand that when I dig into AMD CPUs and platforms as technology and foundation for IT strategy and investment, I simply see so many changes for the better.
Posted by Tom Yager on March 26, 2008 03:00 AM
- COMMENTS
Tom,
I'm quoting you
"I realize that my strong position on Opteron and desktop derivatives, like the amazing Phenom, might appear to some like bias."
I'm going to sic George Ou on you.
You are an AMD fanboi who spews drivel.
Posted by: HugoChavez at March 26, 2008 09:26 AM
Thanks for the great Opteron article, Tom.
Hope you liked the recent trip to Austin - my people said not to worry about the boys in your room.
Hector
Posted by: Hector Ruiz, Austin, Texas at March 26, 2008 12:44 PM
Come on! Trying to call that CPU anything but a disaster at this point is quite comical. Its performance is terrible. It took far too long to come out and is only now stable enough to run in a real-time server environment. Few are developing AMD-optimized code, and such a concept is almost silly nowadays. Their best optimization is to emulate Intel's SSE standards. Let's not get carried away here. The Opterons "should" scale up well, but saying you are good for 5 years by purchasing one of those things is one of the most ridiculous ideas I have ever heard. I'm sorry, but technology turns around too fast for such a silly idea. Here's a clue for you: in the past 5 years, we've seen improvements in SATA, PCI, memory, and hard drive performance. DDR3 was just a concept 5 years ago and will likely be the standard in 2009. Go back 5 years and try to find a motherboard with DDR3 memory support. Go back 5 years and try to find PCI-E 2.0. Go back 5 years and try to find eSATA, much less eSATA 2.0. While you can add a card for eSATA, it won't give you the bandwidth on a 5-year-old motherboard that you would get from a port built into a motherboard of today.
NO SYSTEM is 5 years safe. Let's not get ridiculous. 3 years is about as good as you can hope for nowadays. In 3 years, AMD will have a completely new CPU and architecture that won't likely be compatible with what they are doing now. It will likely need the latest memory and system bus to work at its optimum, if at all. They had better! Intel's next CPU, the Nehalem, will be out in less than 6 months and will completely blow the Phenom/Opteron generation (which doesn't even compete with Intel's current offerings) out of the water. Higher clock speeds, multithreaded cores, and an integrated memory controller await. My gut feeling is AMD is already hell-bent on redesigning the Phenom generation and will be out with something quite new in less than 16 months. It will likely be something that requires a new motherboard to be effective. This may not be IBM's model (according to you), but this is the "winning" model. You snooze, you lose in this CPU rat race. Quit being such a fanboy and be more realistic next time you write an article, please.
Posted by: John at March 26, 2008 01:41 PM
The concept of a platform that can be truly superior for 5 years is at best arguable, and probably sheer folly. The system board these days directly supports so many things, and indirectly so many more. Take a look at the onset of high-speed, high-capacity, low-cost SATA RAID drives. Your system board either supported it or it didn't. USB 2.0? There was a time when in one year every computer sold switched from USB 1.0 to 2.0. Want to sit for another four years with that? How about memory speed and capacity? These don't stay stagnant over the years either.
The AMD chips are good when they work - and there are clearly times when they don't - but I am not going to purchase a processor that allows me to change out to another processor on a board I would just as soon jettison, and damn if that processor costs considerably more than the competition.
And we haven't even touched new Backside bus standards as they emerge!
Tom, IMHO, you did drink the Kool-Aid.
Simon, I am a wise consumer and don't buy Tom's propaganda a bit.
Don't be emotionally attached to AMD like Tom is :)
Posted by: HugoChavez at March 26, 2008 05:12 PM
John-
Mostly good points, but I think PCI-e will be with us for much longer than 5 years. New revisions will be backward-compatible, and a current-day 16x slot has quite a bit of headroom for most applications.
Posted by: Neal at March 26, 2008 07:12 PM
Your comments about Tom and the little boys in his hotel room at AMD's Austin headquarters are mean-spirited.
Being gay doesn't make Tom Yager any less authoritative about AMD and Hector.
I happen to know that both Tom and Hector are very comfortable with their positions.
Gayvin
Posted by: Gayvin Nerwsome at March 26, 2008 08:33 PM
Talk about emotional over a chip company, you Intel groupies talk like you change hardware like underwear. I'm a small business, and my server was just a humble four-way board with two empty sockets and a lot of empty memory slots. My database has gotten huge, and I just populated the last memory slot in December and the last processor socket in August. The system is reliable and the uptime with Linux is excellent. The AMD support is also good. I now serve 30 terminals and it's like the data is coming off your hard drive. I bet I will get another 2 years out of this system before I will even think about the next server. That puts me over 5 years. It was a great economic move for my company. Your needs might be different, but I think Tom wrote about businesses on a budget for their IT needs.
Posted by: wmark at March 27, 2008 09:18 AM
Hugo, you're the one who started spewing drivel, without thinking about the merits of Tom's arguments. Really, how much is intel paying you to troll websites like these?
Intel wants the public to think that processors should be replaced along with the motherboard and memory every YEAR, while AMD's approach is that you can upgrade your (AM2) PC by simply replacing the CPU or the mobo at a time - you can use an older Athlon 64 X2 on the new 780G mobo, or you can also use the new Phenom on an old AM2 mobo...
you intel fanbois should rejoice when AMD folds up, when Intel the monopolist can charge you an arm and a leg for their new Nehalems - which btw, copied the AMD design.
im not an AMD fanboi, im using VIA chips...
Posted by: Simon at March 27, 2008 10:46 AM
AMD is going to have to prove their business case. While scale up has some merit, there are problems, and this isn't exactly uncharted territory.
I have considerable IBM System i experience. That's a platform based upon IBM's Power CPUs, and it has scale-up capabilities in spades. However, the system has come under serious pressure in the commercial marketplace, and one of the reasons is the costs that Tom mentions.
Creating all these customer upgrade opportunities comes at the cost of engineering those opportunities in the first place. Many customers won't access the upgrades and so, from the vendor's point of view, a sale is lost. Also, the opportunity to spread costs around and achieve economies of scale is reduced.
Then there's the issue of mismatched components. Sure, you might be able to drop a new CPU model into a compatible socket, but all the support circuitry, buses, external cache, the RAM, the disk, everything else hasn't changed.
How do you keep the CPU's pipelines filled? Sure, you can mitigate with caching, but the system is already heavily dependent upon caching just to achieve the old performance levels. Those pipeline stalls due to cache misses are very expensive in cycle and performance efficiency terms.
And let's not forget, nearly all systems require precisely matched CPUs, down to the stepping level for goodness' sake. If you want a new CPU release you're going to have to pull ALL the old ones out.
What you wind up with is the fact that the vendor has to stock old CPUs for long periods of time, just to keep the upgrade promise alive. However, those CPUs look very expensive, and the customers are aware that they are old too. Not a good combination--premium prices for old technology!
I believe that perhaps a superior technology strategy is to sell "computing units". This is essentially the Google data centre model. Each computing node is a self-contained bundle with processor, memory and disk resources. The application dynamically spreads itself over multiple compute nodes and is loosely coupled, permitting mismatched hardware on the network.
The customer can upgrade either by simply adding nodes, or replacing existing nodes, and it's their choice either way (or even both ways).
The downside is that it requires a new software processing model. But I've seen it done, in more than one case, and it works. Systems like this blow the doors off of anything else that is out there, because they also achieve fault tolerance as a major bonus.
Brian asked, "How do you keep the CPU's pipelines filled? ... Those pipeline stalls due to cache misses are very expensive in cycle and performance efficiency terms."
The Opteron architecture uses NUMA, so memory contention should not be significant. Each CPU gets its own memory controller and its own memory.