Grid Meter » November 2005

November 29, 2005 | Comments: (0)

Recommended Grid Reading for Enterprise IT Pros

We keep hearing about Grid computing evolving from big science to enterprise -- but there are precious few new ideas about the specific opportunities and business cases behind the evolution. If you're tired of the insipid 'aligning IT with business goals' market-ese on Grid and want some new perspectives, you might want to check out a book that was just released: "Grid Computing: The Savvy Manager's Guide."

The book is co-authored by Pawel Plaszczak and Richard Wellner -- both insiders in the open source Globus Toolkit development project, which began ten years ago and is the nucleus of the Grid computing evolution. These aren't marketing guys; they have street cred and hands-on experience with Grids -- and the book reads that way.

Initial chapters provide the reader with some good foundation for understanding the fundamentals of Grid. Particularly useful to enterprise end users are the explanations of the importance of OGSA (open grid service architecture) and WSRF (web services resource framework) to the enterprise Grid evolution. The set-up also gives a good amount of discussion to virtualization's tie-in to Grid, and some clarity in distinguishing Grid from other technologies / concepts (clusters, cycle scavenging, distributed computing, web services, peer-to-peer, etc.). The book goes on to examine what, exactly, enterprise Grid will look like when it becomes more mainstream. From departmental Grids to enterprise and partner Grids -- the authors explain how different affinity groups and industries will participate.

What sets the book apart from other available literature on Grids is the practical context that it gives for would-be enterprise Grid decisionmakers. From 'building the business case' to understanding the motivations / requirements for 'Grid-enabling' a product or application -- this is a pretty comprehensive look at some of the key business issues related to enterprise Grid.

Posted by Greg Nawrocki on November 29, 2005 09:07 AM


November 28, 2005 | Comments: (0)

Data virtualization advances could be catalyst for Grid adoption

Grid computing is most commonly understood in terms of increasing the CPU horsepower and driving the performance of distributed applications. But really, it's the data virtualization capabilities of Grid -- as evidenced by new directions from vendors like NetApp and EMC -- that have the most possible downstream benefits for enterprise applications, and are stirring up the most excitement in the industry these days.

An article published today offers a timely sneak preview of some of NetApp's new data virtualization Grid efforts, and of the company's supporting Data ONTAP GX operating system.

IT's concept of storage has generally been associated with dumb physical devices tied to an intelligent system (Disks hanging off a CPU). As the author of the article notes, NetApp's data virtualization technology decouples the physical device and turns it into an autonomous resource -- which is a critical 'next step' in the Grid evolution.

Taking this evolution even further would mean providing transformation functionality inherent in the data itself, since Grids require very complex data migration without service interruption. Consider that in manufacturing, the same quanta of data need to be presented differently to an inventory application than to a CAD application. Although not specifically mentioned, one would suspect that with the emphasis on being able to scale out horizontally and "intertwined data management services," NetApp has this well in the works.

But what I believe will be most interesting to meter is which aspects of this operating system NetApp decides to expose, both from a functional standpoint and in terms of open source. If the Data ONTAP GX operating system is effective and becomes a commonly accepted method of data virtualization, it could prove revolutionary to enterprise Grid adoption.

Posted by Greg Nawrocki on November 28, 2005 11:54 AM


November 23, 2005 | Comments: (0)

With Grid, it's all in the timing ...

Tomorrow, many of us embark on one of the most complex problems known to queuing theory: getting Thanksgiving dinner timed. It's a complex problem of resource management, job priority, and timing. There are distributions for service times and inter-arrival times, deterministic and non-deterministic jobs, and FCFS, LCFS, and SIRO queues. And if you can't align your mashed potatoes and stuffing with your dining goals, you're in big trouble.

I often say that queuing theory is more important than the alphabet and we should be teaching it in preschool. I usually say this while I'm standing in line or sitting in a traffic jam and feel that if store managers and traffic engineers had even an elementary understanding of queuing theory, lines and traffic jams would be a thing of the past. I certainly recall from grad school that some of the math that goes into describing complex ordered systems can get a bit hairy, but there are simple ways to get preschoolers to understand the need to control process ordering with a funnel and a bucket of marbles.

Similar to the complexities of the Thanksgiving dinner, Grid computing, with shared resources that often present non-deterministic availability times, adds levels of complexity far beyond that of simple linear systems. I haven't seen many mathematical descriptions of Grid-type systems, but perhaps that is because these systems are quite unique and lack common denominators, and admittedly I haven't been looking all that hard. Regardless, these are going to be some pretty complex systems to understand, and those of us whose recall of Little's Law seems just a distant memory may have to work harder than others to understand them.
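For anyone whose recall is equally distant: Little's Law says that in a stable system, the long-run average number of jobs in the system equals the average arrival rate multiplied by the average time a job spends in the system (L = lambda * W). A minimal sketch follows; the function name and the scheduler numbers are hypothetical, chosen only for illustration.

```python
def jobs_in_system(arrival_rate_per_hour: float, avg_hours_in_system: float) -> float:
    """Little's Law, L = lambda * W, for a stable system in steady state."""
    if arrival_rate_per_hour < 0 or avg_hours_in_system < 0:
        raise ValueError("rate and time must be non-negative")
    return arrival_rate_per_hour * avg_hours_in_system


# Hypothetical grid scheduler: 120 jobs arrive per hour, and each job
# spends half an hour in the system (queued plus running) on average.
print(jobs_in_system(120, 0.5))  # 60.0
```

The law holds regardless of the arrival distribution or queue discipline (FCFS, LCFS, SIRO), which is part of why it is usually the first result taught.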

So while I'm attempting to get Thanksgiving dinner to all come together at the right time tomorrow (and yes, I'm doing the cooking), I'll be thinking of a way to teach queuing theory to preschoolers in a clever jingle to the tune of "Twinkle, Twinkle, Little Star".

Does anyone know a word that rhymes with Markovian?

Posted by Greg Nawrocki on November 23, 2005 08:49 AM


November 21, 2005 | Comments: (0)

Still searching for enterprise Grid's first 'killer app'

When the IT industry scrutinizes nascent technology platforms, one of our favorite criteria is often the emergence (or lack of emergence) of a "killer app."

From mobile devices to interactive television - the burden of proof for a new platform is whether it enables people to do something compelling that they couldn't do before, and to what extent that new capability creates a groundswell of people actually adopting the new platform.

Finding cures for diseases, finding extraterrestrial life, being able to predict the weather and the location of the next big earthquake, and even being able to predict the direction of financial markets are all the noblest of goals - and these have been Grid's first 'killer apps' in research and science, and the financial services industry. But these applications are hardly common scenarios for the average InfoWorld reader, and Grid computing has yet to truly compel the mainstream enterprise end user with a new capability that cannot be derived through other means.

So what will Grid's first killer app for mainstream acceptance be?

One area that I find to be an interesting opportunity for Grid computing in enterprise is digital rights management. With the availability of new hardware devices and mobile data and media networks, our formerly home-bound / office-bound data is now "on the road." The DRM security issues (whether you're talking about protecting copyrighted entertainment media or intellectual property) are complex. In addition, inherent differences between devices can require data translation from the distribution channel, through the primary consumer device, to the secondary portable consumer device, and that translation can at times be computationally heavy. These common themes are all in the "wheelhouse" of Grid computing.

While Grid computing has traditionally been tied to various quests for knowledge, the killer app for Grid may be one that more reflects our desire as a society to turn off our brains, sit back, and be entertained.

Posted by Greg Nawrocki on November 21, 2005 11:23 AM


November 16, 2005 | Comments: (0)

Grid-enabling an application -- what does it involve?

A theme that we've touched on before, after the GridWorld conference last month, is the need to make Grid "all about the applications". Indeed, the difficulty of 'Grid-enabling' applications has been cited as a major initial obstacle to widespread commercial Grid adoption.

At the University of Buffalo, Grid researchers have been making progress on grid-enabling research and science applications, with a project called GAT ("Grid-Enabling Application Template"). According to Mark Green, Computational Scientist at SUNY Buffalo's Center for Computational Research:

"A GAT essentially takes research code or even a commercial code and builds a small, very simple GUI on top of it. That GUI is presented within a grid portal. So as soon as you port your application to the grid portal, you can immediately use all the backend grids that the GAT portal has incorporated. That includes Grid 3, Open Science Grid, Open Science Grid ITB, TeraGrid, ACDC Grid, Western New York grid, and ultimately New York State Grid. So you port it once and you can use all the resources that the portal uses."

Green says that there are currently two earthquake engineering applications, some numerical methods applications, two quantum chemistry application suites, and several different bioinformatics & structural biology applications that have been grid-enabled with GAT.

This is a classic example where enterprise could again take a page from the books of research and academia. Similar portals and templates focused on enterprise applications would be quite the catalyst for making Grid commonplace.

Posted by Greg Nawrocki on November 16, 2005 09:13 AM


November 15, 2005 | Comments: (0)

Grid Community Reaching for First OSCAR

The Energy Sciences Network (ESnet, serving the DOE) is working on a new network service that provides 'On-Demand Secure Circuits and Advance Reservation Systems' (OSCARS). To put it simply, these are virtual circuits that provide guaranteed bandwidth between sites or hosts (rather than mixing with commodity Internet traffic).

William Johnston, Senior Scientist at Lawrence Berkeley National Laboratory, explains why the science Grid community requires virtual circuits:

"You may have an experiment that produces a petabyte of data per year - and you know that you have to transfer a couple of terabytes per day from, say, CERN to Brookhaven. In order to do those transfers, you need virtual circuits with bandwidth guarantees, because otherwise you might not get enough of the bandwidth along the path to keep ahead of the data that's arriving.

The other area which I think virtual circuits will be needed for bandwidth guarantees, which is a completely different area, are Grid-based workflow systems. The high energy physics community will require a great deal of bandwidth to analyze all the data that's coming off the LHC. They're putting together large-scale Grids that involve hundreds of machines scattered across dozens of institutions. They will have workflow systems that manage the movement of work and the various steps at which it's being processed. Of course, many of the processing steps are in different machines, so the work flows into one machine, gets transformed, sent out to another machine, gets transformed, sent out to another machine, etc., etc. So you get these networks of workflow systems. Unless you have sufficient bandwidth between those systems, which may be scattered around a number of different institutions, you won't be able to keep the workflow moving steadily."
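To put Johnston's numbers in network terms, here is a back-of-the-envelope sketch of the sustained rate those transfers imply. It assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and a perfectly even transfer around the clock; the function name is my own.

```python
def sustained_mbps(bytes_per_day: float) -> float:
    """Average rate, in megabits per second, needed to move bytes_per_day in 24 hours."""
    seconds_per_day = 24 * 60 * 60
    return bytes_per_day * 8 / seconds_per_day / 1e6


TB = 1e12  # decimal terabyte (assumption)
PB = 1e15  # decimal petabyte (assumption)

print(round(sustained_mbps(2 * TB)))    # 185 -> "a couple of terabytes per day"
print(round(sustained_mbps(PB / 365)))  # 254 -> "a petabyte of data per year"
```

Those are averages with no headroom for retransmission or bursts, which is why a guaranteed slice of the path matters more than the nominal link speed.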

Posted by Greg Nawrocki on November 15, 2005 09:18 AM


November 14, 2005 | Comments: (0)

No more babysitting large file transfers for Grid pros ...

We've all felt the pain of sitting in front of a computer screen, watching a large file slowly eke its way over the public Internet to its destination. Watching the progress of the 'percent complete' is as agonizing as it was to watch the clock during our most despised class in school. And most of us have similarly experienced the agony of having a download or upload 'drop off' near the finish line ... and having to re-start the process from scratch.

Many of Grid computing's early adopters are busy science researchers that simply don't have the time (do any of us?) to babysit enormous file transfers to make sure they go through ok. And for a particle physicist sending or receiving petabytes of data, for example -- there certainly isn't time to restart that transfer from scratch after a failure.

Hence the excitement / interest in the "Reliable File Transfer" service project out of the Globus Alliance. RFT is a Web Services Resource Framework-compliant service that provides 'job scheduler'-like functionality for data movement. Researchers simply provide a list of data source and destination URLs; RFT then writes the job description into a database and moves the files on the user's behalf.

And what if there is a drop off ...?

According to Ravi Madduri, the software developer at Argonne National Laboratory who's been driving the project:

"Say you're transferring a million files. And you are transferring the data from, say, Argonne to ISI, at the University of California. And somewhere during the transfer, some router in between the University of Chicago and Argonne and University of California goes down. The user does not have to do anything. RFT tries to transfer the file and it gets a transient failure, saying that I'm not able to reach this host because network is down somewhere. And RFT will wait and retry after some time. And the number of retries is a configurable option. You can tell RFT to retry forever, or at pre-determined intervals."

With RFT the overhead of maintaining and monitoring file transfers becomes an automated and reliable process.
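The retry behavior Madduri describes can be sketched in a few lines. This is not RFT's API or implementation; `TransientError`, `transfer_with_retry`, and its parameters are hypothetical names for illustration. The sketch also keeps its state in memory, whereas RFT writes the job description into a database.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a recoverable failure, e.g. an unreachable host."""


def transfer_with_retry(transfer, pairs, max_retries=3, wait_seconds=0.0, sleep=time.sleep):
    """Attempt each (source, destination) pair, retrying transient failures.

    max_retries=None means retry forever. Returns the pairs that still failed
    after the initial attempt plus max_retries retries.
    """
    failed = []
    for src, dst in pairs:
        failures = 0
        while True:
            try:
                transfer(src, dst)
                break  # this file made it; move on to the next pair
            except TransientError:
                failures += 1
                if max_retries is not None and failures > max_retries:
                    failed.append((src, dst))
                    break
                sleep(wait_seconds)  # wait, then try the same file again
    return failed
```

The caller supplies `transfer`, a function that moves one file and raises `TransientError` when the network is down. The user hands over the list once and checks only the returned failures.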

Posted by Greg Nawrocki on November 14, 2005 07:48 AM


November 10, 2005 | Comments: (0)

Why cancer researchers care about Grid computing

One of the characteristics of life science and biomedical science is the diversity of data types - the heterogeneity of data and the way it's described. In cancer research in particular, this presents interesting challenges for collaboration between different scientists and exchange of data sets.

A really interesting project to watch is caBIG, which is focused on allowing better sharing of data and tools for cancer research. According to one of caBIG's participants, Peter Covitz, Director for Core Infrastructure at NCI Center for Bioinformatics:

"Even within a given type of data from, say, a measurement technology, or a theoretical description of biology - even within an area that is 'the same' from a conceptual standpoint, there is often a diversity of terminology or subtle differences of meaning. This is a problem that commonly confronts informaticists who want to integrate resources in life sciences.

Other scientists - such as those in high energy physics, for example - may have tremendous amounts of data and separate challenges with large computational loads, but they tend to deal with a relatively modest number of 'well understood' data types in their domains. They don't have this diversity and heterogeneity problem that we face.

With caBIG, we're taking the best possible technology for integrating and sharing resources - namely, the Grid technology that's evolved over the years, driven by physics and astronomy's cases - and we've extended it to a common base of needs for the life sciences community. The extensions that we've put in have been largely about better support for descriptions of data and diverse data types, and semantic control of those data types by binding them to structured ontologies.

We've integrated everything into the grid framework that Globus already provided and thus created a massive data Grid. The locations are the NCI, the Georgetown Lombardi Cancer Center, the Duke Cancer Center, and the University of Pittsburgh Medical Center. Some locations have more than one node, so there are 6 or 7 total nodes.

Given the distributed nature of the sites, the diversity of the data, its sheer volume and the way it is presented and manipulated, CaGrid is probably one of the more sophisticated data Grid architectures out there today."

Posted by Greg Nawrocki on November 10, 2005 07:56 AM


November 09, 2005 | Comments: (0)

Energy Sector Ready to Buy Utility Computing?

Two years ago, everywhere you looked, a VP of marketing at a major systems vendor was proselytizing the future of compute resources on demand. At a certain point, when it became somewhat obvious that the end user market for utility computing had not yet arrived, it seemed that utility computing noise / frenzy was placed on the backburner.

Today, Sun and a company called Virtual Compute Corporation announced a partnership that would suggest they are preparing for a groundswell of utility computing customers in the energy sector. The announcement is more cheerleading than specifics, but the nuts and bolts of it are that VCC has agreed to use more than 1 million hours of CPUs on the Sun Grid -- via their energy customers.

Earlier this year, VCC also announced an agreement with MCI, 'allowing the company to provide global reach to clients of their high performance computing resources.' So in a short period of time, they've announced a partnership with one of the largest service providers, and one of the largest systems vendors. It's pretty aggressive posturing -- it will be interesting to meter how quickly the company can actually sell utility computing in the energy sector.

Several months ago, Ian Foster explained a Grid computing broker / consumer continuum. He explained that as the utility computing model grows, we'll not only see the very large compute resource brokers (Sun, in this case), and the actual end user consumers -- we'll also see a new class of middleman / solution provider pop up. Virtual Compute Corporation falls under that category, and I have no doubt that specific markets (like oil and gas) have very specific computing needs, and that specialized companies like VCC will continue to crop up as a new type of mega compute reseller.

But I also think the market is still very early-stage ... and in particular, there's no general consensus on the end user side about when and why to decide to outsource Grid, rather than building in-house. Is the timing right for this particular company? Or is the model still a few years before its time? Only time will tell.

Posted by Greg Nawrocki on November 9, 2005 07:39 AM


November 08, 2005 | Comments: (0)

Keeping an eye on Apple's Grid computing efforts ...

Yesterday I spoke about the Windows / Linux dichotomy and how it relates to Grid. However, there is another OS player that really can't be discounted: Mac OS X.

Sure, one could argue that OS X is simply BSD Unix all dressed up, but in this case the clothes may indeed make the man. Here are some reasons that OS X needs to remain on the Grid radar.

Xgrid

Admittedly I don't know a whole lot about Xgrid because I don't know of any enterprise scale deployments where it is in use. However, it ships standard with the latest version of OS X which means it has the blessing of the "powers that be" at Apple, which is no small feat. It has a very user friendly graphical configuration widget that makes deploying such a Grid a snap. There are some community cycle sharing implementations deployed, and I'm sure there are some local Grids out there where it is in use (if you know of any, please drop me a note).

Commercial Grid Software

The GridIron XLR8 application manager runtime has supported Mac OS X for some time now. There are also several commercial schedulers and resource managers including Sun Grid Engine and Platform LSF which also run on OS X.

Open Source Software

Given the Unix base of OS X, Apple has made Open Source development part of its development strategy. Applications and build tools that are utilized by open source Grid middleware, such as MySQL, Ant, and the GNU development tools, run on OS X. The Globus Toolkit builds easily under OS X, with the binaries soon to follow.

OSx86

Lastly, and this may indeed be the most significant point, OS X for the x86 architecture is currently under serious development in both Apple-sanctioned and community development forms. When ready for prime time, OS X may indeed displace a non-trivial number of Linux installs. With open source, commercial offerings, and Xgrid ready to go, this could be a whole new audience for Grid.

Posted by Greg Nawrocki on November 8, 2005 10:43 AM


November 07, 2005 | Comments: (0)

Grid Computing and Clustering -- the next OS battleground?

At Gartner's recent IT Symposium event, Steve Ballmer called high performance clusters a "Linux stronghold," and explained that Microsoft is gunning for the market with a cluster edition of Windows Server. As George Foreman once said, "generally when there's a lot of smoke ... there's just a whole lot more smoke." But in this case, I believe Grids and clusters really will be presented as major battlegrounds between the OS's in the near future.

As we look ahead to the release of the Windows Compute Cluster Server 2003 product, I'm anticipating a lot of interesting debate around whether it's Linux or Windows that's inherently better suited for Grids and clusters. The debate not only raises systems management comparisons (i.e., which OS is easier to manage in a scale-out environment), but also continues the full "TCO" range of discussion about the economics of the two OSes.

Certainly for Grid computing environments -- my perception has been that the lion's share of major deployments out there today are running on Linux. This applies to both research / academic type grids and those deployed in enterprise.

I suspect that this has more to do with the big picture, including applications and underpinnings (the LAMP stack), than with the OS alone. These components have simply become the industry standard for databases and the engines that power web services, and they happen to be the foundation of the most popular Grid middleware and deployments as well.

Two things that I am absolutely sure of are that the strength and staying power of the Linux community cannot be discounted, and one can never discount Microsoft. Although the picture of a battleground may be drawn, the picture at the end of the day may be far more peaceful than the David and Goliath analogies usually applied to discussions where Linux and Microsoft are mentioned.

The holy grail of Grid computing is true heterogeneity and interoperability, with CPU cycles and applications for all. Not only is there a market and an opportunity for either flavor of OS, Grid computing may actually be the common ground rather than the battleground.

Posted by Greg Nawrocki on November 7, 2005 12:36 PM

