InfoWorld
THE STORAGE NETWORK HOSTED BY MARIO APICELLA



May 13, 2005

Securing your data on the road

Have you noticed that USB drives are getting smaller? I am talking about their physical size, of course, because their capacity seems to break a new record every time I check.

Speaking of their outside dimensions, don't settle for one of the 3-inch-long variety: they are out.

Actually, I have to find a better use for the ones I have. Perhaps hang them from a laundry line to create a makeshift organizer for USB cords? Probably not; it's not a good idea after all.

I am obviously joking, because my drives are still working well, but someone must have overstocked the large-sized models, because I keep seeing creative new ways of using USB drives, for example disguised as bracelets, flashlights, and other similar nonsense.

To see for yourself, subscribe to Engadget: they keep a close watch on these and other more compelling novelties.

Regardless, if you have or have seen a miniature-format USB drive, the old ones will seem as antiquated as the Pyramids. Courtesy of PMC Sierra I got a 64K one, which is slightly longer than 2 inches and has a semi-transparent blue shell that offers a really cool view of the inside electronics.

However, my favorite USB drive is now the CryptoStick, sold by Research Triangle Software, Inc.

The name is a giveaway: you can actually encrypt and compress data on that drive, using the software that comes, you guessed it, on the drive. They sent me a 128MB unit for a test ride, and I am having a hard time filling it with data because with compression the capacity jumps to about 400MB. However, you can save files to CryptoStick without using encryption, just as you do with any other drive.

To use encryption, you start a middle-man application that will prompt you to set up a password on first use.

After that, the proxy app will offer two choices: using CryptoBuddy, your encryption software, or covering your tracks with a secure browser session that doesn't leave any trace behind.

If you are paranoid and browsing on someone else's computer, the latter app is for you.

To save data in encrypted (and at the same time compressed) format, you start the CryptoBuddy, which opens a Windows Explorer look-alike window with two panes, one for input on the left, and the other one, obviously on the right, for output.

In essence, you navigate to your input files on the left and choose the target location on the right. Select the files, click the counterintuitive "Translate Process" button, and your files will be compressed and encrypted in no time. Reverse that process to bring your files back in the open.

I like CryptoStick, and to be fair the software does what's expected, but the GUI could use a face-lift: it currently has a very spartan look and some unfortunate word choices.

Hey folks, how about labeling that button "Encrypt" instead of "Translate Process" in the next version of CryptoBuddy? It took me a while to figure that one out. Also, why do I have to set a password (twice of course) every time I encrypt a file? A choice to use the master password by default would be nice.

Other than that, CryptoStick rocks. You can choose from many models, with capacity ranging from 32MB to 4GB. The price also jumps accordingly, from 10 bucks to nearly $500 for the largest model.

The slick aluminum case of the drive is just too cool. More important, the encryption works: filenames and directory structure are still in the clear, but the file content becomes a meaningless jumble, as it should. Even if you are not paranoid, CryptoStick is an effective protection from embarrassing disclosures of both business and personal data. Give it a try.


Posted by Mario Apicella on May 13, 2005 03:17 PM | TrackBack

May 04, 2005

The promise of OSA

This is the second part of a fascinating essay on object storage by Joe Breher of lingua data.

If you missed it, the first part is here.

The fundamental innovation which enables this architecture is, as you noted in your article, the reassignment of the responsibility for the mapping of streams to sectors from the client or server to the data store itself (OSD).

Once the OSD can perform the stream to sector mapping, the clients no longer need to agree what algorithm to employ for this mapping. This is one of the aspects that makes OSD so suitable for data sharing in a cross-platform environment. (I will discuss the other major aspect of data sharing below.)
This cross-platform data sharing addresses the major benefit of NAS over SAN, and is inherent in OSA.
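
To make the idea concrete, here is a minimal Python sketch of a store that owns the stream-to-sector mapping itself. Everything in it (the `ToyOSD` class, the 512-byte sector size, the first-free allocation policy) is an assumption made for illustration, not anything taken from the SCSI OSD spec. The point is only that callers name an object and a byte range, and never see sector numbers.

```python
SECTOR_SIZE = 512  # bytes per sector; an assumed value for this sketch


class ToyOSD:
    """Hypothetical object store that keeps the stream-to-sector mapping private."""

    def __init__(self, num_sectors):
        self.sectors = [bytes(SECTOR_SIZE)] * num_sectors
        self.free = list(range(num_sectors))  # unallocated sector numbers
        self.extent_map = {}                  # object ID -> list of sector numbers
        self.lengths = {}                     # object ID -> object length in bytes

    def write(self, oid, offset, data):
        """Write data at a byte offset within an object, allocating sectors as needed."""
        sectors = self.extent_map.setdefault(oid, [])
        end = offset + len(data)
        while len(sectors) * SECTOR_SIZE < end:
            sectors.append(self.free.pop(0))  # raises IndexError when the store is full
        pos = offset
        while pos < end:
            index, within = divmod(pos, SECTOR_SIZE)
            chunk = min(SECTOR_SIZE - within, end - pos)
            old = self.sectors[sectors[index]]
            piece = data[pos - offset:pos - offset + chunk]
            self.sectors[sectors[index]] = old[:within] + piece + old[within + chunk:]
            pos += chunk
        self.lengths[oid] = max(self.lengths.get(oid, 0), end)

    def read(self, oid, offset, length):
        """Return up to `length` bytes starting at a byte offset within an object."""
        sectors = self.extent_map[oid]
        end = min(offset + length, self.lengths[oid])
        out = bytearray()
        pos = offset
        while pos < end:
            index, within = divmod(pos, SECTOR_SIZE)
            chunk = min(SECTOR_SIZE - within, end - pos)
            out += self.sectors[sectors[index]][within:within + chunk]
            pos += chunk
        return bytes(out)
```

Two clients on different platforms could both call `write(7, 500, ...)` and `read(7, 500, 20)` against the same store without agreeing on any on-disk layout, because the layout lives only in `extent_map`.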

The scalable performance is encompassed in OSA by virtue of the fact that the data flow is direct between the client(s) and the OSD(s), without any need to pass through a server. This allows aggregate system performance to rise to the inherent capacity of the underlying fabric. This direct pipe from clients to storage addresses the major benefit of SAN with respect to NAS, and is inherent in OSA.

I have so far described several benefits of OSA, without having yet described the structure thereof. It is important to note that these benefits require a rethinking of the parties involved in any data transaction. OSA is a tripartite architecture. In addition to the clients and storage units (OSDs) that we are familiar with, we need to introduce a third actor - the MetaData Server (MDS).

While it is an over-simplification, it is helpful to think of the MDS as having responsibility for maintaining all of the filesystem other than the mapping of streams to sectors. The MDS is where any hierarchical directory structure would be maintained, along with permissions, file-scope locking, etc.

Without getting into the security aspects (this email is already trending towards the large...) a typical scenario would have a client walking a directory tree to find a file, by means of communications solely with the MDS. Once the file is located in the file system tree, the MDS would return the name of the SCSI target and LUN of the OSD which houses the relevant object, and the Object ID (OID) of the object within the OSD.

The client then builds a SCSI CDB specifying the OID, and the byte range of interest within that OID, along with an op code (READ, WRITE, etc), and sends it to the relevant OSD. The OSD responds then with the appropriate data transfer operation (after checking the Credential sent as part of the CDB).
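
The scenario just described can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the SCSI OSD wire format: `Location`, `Command`, `ToyMDS`, `ToyOSD`, and `transfer` are hypothetical names, the "CDB" is a plain data class, and the credential is an opaque string that the MDS registers with the OSD directly (a real OSD would validate it cryptographically).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """What the MDS hands back: where the object lives (names are illustrative)."""
    target: str  # SCSI target name
    lun: int
    oid: int     # Object ID within that OSD


@dataclass(frozen=True)
class Command:
    """Stand-in for the CDB: op code, OID, byte range, and credential."""
    opcode: str
    oid: int
    offset: int
    length: int
    credential: str
    payload: bytes = b""


class ToyOSD:
    def __init__(self):
        self.objects = {}      # OID -> bytearray
        self.accepted = set()  # credentials this OSD will honor (a simplification)

    def execute(self, cmd):
        if cmd.credential not in self.accepted:
            raise PermissionError("credential rejected")
        obj = self.objects.setdefault(cmd.oid, bytearray())
        if cmd.opcode == "WRITE":
            if len(obj) < cmd.offset:
                obj.extend(bytes(cmd.offset - len(obj)))  # zero-fill any gap
            obj[cmd.offset:cmd.offset + len(cmd.payload)] = cmd.payload
            return b""
        if cmd.opcode == "READ":
            return bytes(obj[cmd.offset:cmd.offset + cmd.length])
        raise ValueError("unsupported op code: " + cmd.opcode)


class ToyMDS:
    def __init__(self, osds):
        self.osds = osds  # (target, lun) -> ToyOSD
        self.tree = {}    # path -> Location

    def create(self, path, location):
        self.tree[path] = location

    def lookup(self, path):
        """Resolve a path and issue a credential for the object it names."""
        loc = self.tree[path]
        credential = "cred:%s:%d:%d" % (loc.target, loc.lun, loc.oid)
        self.osds[(loc.target, loc.lun)].accepted.add(credential)
        return loc, credential


def transfer(mds, osds, path, opcode, offset, length=0, payload=b""):
    loc, credential = mds.lookup(path)                # 1. metadata: client <-> MDS
    cmd = Command(opcode, loc.oid, offset, length, credential, payload)  # 2. build the command
    return osds[(loc.target, loc.lun)].execute(cmd)   # 3. data: client <-> OSD, directly


osds = {("target0", 0): ToyOSD()}
mds = ToyMDS(osds)
mds.create("/reports/q1.txt", Location("target0", 0, oid=0x10))
transfer(mds, osds, "/reports/q1.txt", "WRITE", 0, payload=b"object storage")
print(transfer(mds, osds, "/reports/q1.txt", "READ", 7, length=7))
```

Note that the file's bytes appear only in step 3; the MDS sees the path and hands out a location and a credential, nothing more.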

So we have identified three primary classes in the architecture - client, OSD, and MDS. In any system, there are any number of clients, any number of OSDs, and one MDS. So is it not true that, due to the presence of a *single* logical MDS in any system, OSA suffers from the same scalability problem as NAS does? While it is true that all communications must hit the MDS at some point, this is at a radically different scale than the situation embodied in NAS.

First and foremost, the data itself never passes through the MDS. All data flow is directly between clients and OSDs. Additionally, once granted access to an object, the client may use the MDS-supplied credential (permission) across multiple accesses. (This capability is granted by the MDS for a certain time interval, which it may revoke before the stated expiration time).
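
One way to picture a reusable, expiring, revocable credential is a keyed hash over the granted rights. The Python sketch below is an illustration only: the field layout, the use of HMAC-SHA256, and the `revoked` set are assumptions of this example, not the credential format defined by the OSD spec. It assumes the MDS and OSD already share a secret key, and it takes the current time as a plain number so the logic is easy to follow.

```python
import hashlib
import hmac


def _tag(secret, oid, rights, expires_at):
    """Keyed hash binding the object ID, the rights, and the expiration time."""
    message = f"{oid}:{rights}:{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def issue_credential(secret, oid, rights, expires_at):
    """MDS side: grant `rights` (e.g. "r" or "rw") on an object until `expires_at`."""
    return {
        "oid": oid,
        "rights": rights,
        "expires_at": expires_at,
        "tag": _tag(secret, oid, rights, expires_at),
    }


def check_credential(secret, cred, oid, op, now, revoked=frozenset()):
    """OSD side: accept the credential only if it is genuine, in scope, and current."""
    expected = _tag(secret, cred["oid"], cred["rights"], cred["expires_at"])
    return (
        hmac.compare_digest(expected, cred["tag"])  # not forged or altered
        and cred["oid"] == oid                      # issued for this object
        and op in cred["rights"]                    # operation was granted
        and now < cred["expires_at"]                # not yet expired
        and cred["tag"] not in revoked              # not revoked early
    )
```

A client holding such a credential can present it on every access until it expires, without going back to the MDS each time; how a revocation would actually reach the OSD is left open here.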

Research has shown that a single MDS runs out of bandwidth at a scale several orders of magnitude greater than a NAS server does. In other words, a single MDS is capable of serving on the order of 100x to 1000x more clients and 100x to 1000x more back end storage than a NAS server is capable of, before running out of bandwidth. Additionally, for performance or reliability reasons, the logical MDS may be implemented as a cluster of machines, much as multiple machines can be clustered into a single logical NAS head.


I have glossed over a lot of detail above, but hope that I have shed some light on other aspects of OSA for you.

With the above in mind, where is OSA today, and what are the next steps? There are products now shipping which encompass fundamental aspects of this architecture. Most so-called SAN File systems incorporate some aspect of OSA, but are proprietary. Approximately half of the world's ten fastest supercomputers employ the Lustre filesystem.

The Lustre file system is an implementation of OSA, created as an open source project (though not now, and perhaps never, compliant with SCSI-OSD).

Panasas is shipping an OSA product, and is also active in the standardization effort for the various pieces of the architecture. A while back they demonstrated 11GB/s performance to a single directory.

You mentioned the Emulex and Seagate demo that you witnessed at SNW. There are many more players in this effort, including my organization, lingua data, which is developing an initiative labeled obstor.

As we go forward from here, much work remains to be done to deliver on the promise of this superior architecture. As I mentioned, the SCSI OSD spec is ratified.

However, it defines only the characteristics of OSDs and the communications between the clients and the OSDs. About the overall architecture, the implementation of MDSs, and the communications between client and MDS, it states only what is necessary to frame the responsibility of the OSD.

This limited scope of the OSD spec is largely due to the legacy of T10 specifications, which focus on the target, while saying as little about the initiator as possible.

There is an effort afoot in the IETF to define the communications between the clients and the MDS. So far, it appears that this work will be incorporated into a minor versioning of the NFSv4 spec - perhaps NFSv4.2. This has been dubbed within the nfsv4 group as pNFS (for parallel NFS), and several internet drafts are available on the topic.

As far as I know, there has not yet been any standardization effort on the private channel between MDS and OSDs. This channel is used merely for the maintenance of a shared master secret key used in security mechanisms. The MDS and OSD collaborate on lower level (working) secret key maintenance over the same interface that the clients invoke upon the OSD - SCSI OSD.

So some time will pass before all necessary parts of the architecture are standardized. However, many of us who are aware of OSD feel that its long term prospects include relegating both NAS and SAN to legacy environments. This will of course take time - more than a decade, but I feel it is destined to happen.

If there is one point I would like you to take from this discussion, it would be: Yes, it is true that OSA brings the benefits of device managed replication and other stuff that you mention. However, I believe the big hitter is combining the cross-platform data sharing of NAS with the scalable performance of SAN, abandoning the limitations of both, and wrapping it all in strong, fine-grained security. *That's* the promise of OSA.

If you would like to discuss this topic in more depth, please feel free to contact me.

Posted by Mario Apicella on May 4, 2005 12:08 PM | TrackBack

What's wrong with SAN and NAS

"Thank you for your column inches on OSD. If I may expound a bit..." says Joe Breher of lingua data in response to Moving toward object storage devices.

And expound he did. I am breaking Joe's interesting essay on object storage over two Weblog entries, but don't let their length scare you away.

The first part is a lucid analysis of current networked storage architectures and their shortcomings. Here it goes, enjoy.

You stated that "If you are thinking that to accommodate object storage devices the set of SCSI commands needs to be expanded, you're absolutely right." While this is certainly a true statement, a reader may have the impression that there is no defined way to speak to an OSD over SCSI.

To the contrary, we in the SNIA's OSD Technical Working Group have written a specification for the SCSI Model and Command Set for OSD. This work was passed on to the INCITS T10 Technical Committee, which completed a letter ballot on the document. This is now a standard within INCITS, document number INCITS 400-2004, published since mid-2004.

The upshot of all this is that there is a ratified SCSI spec for OSD that is a full peer to SBC-x (disks), SSC-x (tapes), etc. Work continues upon version 2.0 of the spec, which will lend additional functionality to the specification.

Concerning your statement "the winner will be serving files, or to be more accurate, serving objects" - I would like to point out that objects are potentially quite different from files. It is certainly true that one natural mapping from a host space to the object space is representing the data associated with a file as an object. Indeed, this is the initial thrust of lingua data's approach to OSD. (Note that the metadata of the file may be represented as attributes associated with the object within the OSD, as metadata maintained on the MetaData Server (MDS), or some combination thereof.)

However, other natural mappings spring to mind. One intuitive mapping is a 1:1 relationship between objects and records within a database. At least one company is looking at encapsulating an entire file system within each object.

In essence, I just wish to convey that there are more applications for OSD than a 1:1 mapping between files and objects. It will take some time for the industry to learn which of these other mappings may have wide applicability.

I would wholeheartedly agree with your observation that "object storage promises to be the greatest revolution since networked storage". While work remains to be done in order to implement the entire architecture in a standardized manner, OSD offers the best aspects of both NAS and SAN architectures, while leaving behind the shortcomings of either.

We are faced today with the unnatural situation of two predominant architectures for networked storage. When discussing architectures aimed at solving a given problem, two is an unnatural cardinality. A cardinality of one would indicate convergence on a solution. A cardinality of many would indicate there are really multiple problems in the space. However, we currently have two approaches - SAN and NAS. Why do we have two? In essence, SAN does some things better than NAS, and NAS some better than SAN.

SAN's primary drawbacks center around data sharing. For any meaningful data sharing to occur, the clients must coordinate their access to shared data through communications with each other. This can happen in a limited fashion today - but only on certain client platforms. Native cross-platform shared data access is a distant wish.

NAS's primary drawbacks center around performance. All data must pass through the server. The bandwidth of the server's bus becomes a bottleneck as the number of clients and/or the size of the served file set scale up.

These problems are architectural in nature. Steps have been taken to alleviate these issues somewhat, but these approaches ultimately fall flat.

Some have tried to couple multiple machines into a single logical NAS server. This can scale performance of a NAS system somewhat. However, the number of communication links between members of the server cluster rises roughly with the square of the number of servers in the cluster. Ultimately, the bandwidth necessary for maintenance of cache coherency outpaces the capacity of the communications mechanism.
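
A quick back-of-the-envelope check shows how fast those links pile up. The sketch below assumes every server in the cluster talks directly to every other one (a full mesh, which is an assumption of this example; the essay does not name a topology). In that case n servers need n(n-1)/2 links, so the count grows with the square of the cluster size.

```python
def full_mesh_links(n):
    """Number of pairwise links among n servers when each talks to every other."""
    return n * (n - 1) // 2


for n in (2, 4, 8, 16, 32):
    print(n, "servers ->", full_mesh_links(n), "links")
```

Doubling the cluster from 16 to 32 servers roughly quadruples the links, from 120 to 496, while the serving capacity at best doubles.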

Other companies have created file system components meant to be installed by all clients on a SAN. These file system components coordinate each client with all other clients such that they have access to a shared data set. However, these efforts are proprietary in nature, and are therefore limited in application.

In contrast, OSA (Object Storage Architecture) manages to combine the best attributes of NAS and SAN, while overcoming their aforementioned limitations, by means of a novel architecture.

Posted by Mario Apicella on May 4, 2005 11:50 AM | TrackBack

May 02, 2005

More sensitive tapes get lost

According to CNN, data tapes containing sensitive information about Time Warner employees have disappeared.

Don't you have a feeling of deja vu? So do I.

Probably it's because not long ago we heard of another case of missing tapes, that time from Bank of America.

Are companies becoming careless when doing media management? I don't think so. Unless something really disturbing emerges from the investigation, I will maintain that no company should be suspected of negligence.

Nevertheless, stuff happens and a trivial clerical error such as mislabeling or misplacing a tape can quickly become breaking news when big names are involved.

Worse yet, it can affect the life and the finances of innocent people whose data happens to be on those tapes.

For the sake and peace of mind of all parties involved, I hope that the content of those tapes was encrypted.

However, it may be time for a different approach to data vaulting. Moving tapes from one location to another is after all expensive and risky. For the bad guys those media may be worth their weight in gold, which suggests moving them around as little as possible.

A viable and safer alternative could be electronic vaulting, essentially transferring information over a secure connection rather than making media travel around the globe.

Perhaps once the dust from this incident settles Time Warner and Iron Mountain should start planning for a future when tapes move around only when strictly needed. After all, isn't this what banks do with their gold bars?

Posted by Mario Apicella on May 2, 2005 03:08 PM | TrackBack

