IT Troubleshooter | Harper Mann » April 2006

April 25, 2006 | Comments: (0)

Open Source Tools: "data" versus "information"

In a recent Businessweek article, an IBM exec pointed out that "Today, the amount of information we produce increases by about 800 megabytes per year for every man, woman, and child on the planet." The article goes on to point out enterprises' ongoing need to translate "unstructured data" into truly useful "information."

This need to translate raw data into actual information is becoming a pretty critical issue for open source network and systems management projects today too.

Say you're looking for "information" on an open source network monitoring tool that would be good for your environment. The logical first step would be to go to SourceForge. So go there and search for "network monitor."

You get a ton of results, but which of these tools is the best for your environment? Good luck figuring that out, based on the search results. The most popular and proven open source network monitoring tool -- Nagios -- doesn't even appear in the first screen of the retrieved search results. In the open source discovery process, it's easy to find a lot of data, but not that easy to find the exact information that you need.

The old complaint around open source was that the documentation was lousy. People were quick to complain that open source is an engineering phenomenon -- that the documentation was aimed at open source hobbyists, not enterprise end users. For the end user who was used to unpacking a Dell box and finding crystal clear quick-start guides and troubleshooting menus, open source documentation was a little frightening.

The new news around open source documentation is that with the popularization of Wikis and other group sharing tools, the documentation has become dynamic. Documents even weeks old can be out of date given the speed of open source projects. Documentation is often written in places where user comments can greatly enhance its usability. Project participants are actively submitting end user-oriented feedback, new entries and fixes to existing entries, which clarifies the documents and makes them user-friendly. Wikis also play nicely with documentation processes such as indexing, which results in more standardized, intuitive information.

Open source has a new documentation problem today: its own popularity. The enterprise IT pro needs to have a better method to understand what's relevant for their unique requirements, and to let the rest of the data / noise fall away.

Posted by Harper Mann on April 25, 2006 03:16 PM


April 20, 2006 | Comments: (0)

Gearing up for Interop ...

Early next month in Vegas, the Interop event (formerly known as "NetWorld+Interop," or "N+I") kicks off. As always, this year the InteropNet -- the actual event network, built from scratch by 20+ different vendors -- will be a focal point for attendees to check out the latest technologies.

I recently spoke with Thomas Stocking (GroundWork co-founder), to hear his thoughts on why installing and configuring network monitoring on an extremely heterogeneous network like the InteropNet can be challenging. Here's what he had to say:

>There's a tremendous diversity of different equipment. That's the main challenge for any monitoring technology you choose. For instance, if you're going to do something with SNMP, you have to poll a lot of different MIB values just to get a consistent set of data you want to monitor. It's a lot easier if you're doing just one type of vendor equipment -- where you can pull CPU utilization across several routers, for instance. When you're pulling across multiple vendors with all sorts of different MIB values for CPU utilization, there are all sorts of differences, sometimes even between models of the same product line from a single vendor -- that's where a truly heterogeneous environment like that found in the InteropLab and InteropNet can become more challenging.

>If you want to get down to the deeper level of server monitoring (counting running processes, picking out CPU, disk and that sort of thing), you can get it off of SNMP -- but SNMP isn't always on. It's nearly always installed, but a lot of the time the different participants are using different community strings, and it's not practical to standardize, so you have to adjust the monitoring system to use those settings.

>You also have to deal with very rich network security -- firewalls blocking the monitoring protocols, honeypots trapping your discovery attempts, IPS systems shutting down your monitoring because it looks similar to a hacker trying to map the network. It's not uncommon to see no support for ICMP (one of the networking protocols under TCP/IP) echo requests, or pings coming back from certain equipment. So you might have to use a different protocol, like HTTP for instance, just to see if a device is up. You might have a router built in '99, and it will respond to SNMP and ICMP -- but it may not have a web interface or HTTP. A lot of the time the configuration interfaces on the routers are running different software versions, which impacts the sorts of monitoring data you can gather from them.

>The whole goal with network and server monitoring is to be able to monitor the right things ("garbage in, garbage out," as they say). Not too much data or you'll be flooded. And not too little, or you'll miss important information. If you over-monitor, you cannot manage and resolve your issues -- you get data, not information. That's why open source tools are so popular. They're point solutions, but they get very specific information on that point, and often make it easier to "right size" your monitoring solution.
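
The fallback Stocking describes (try ICMP first, then another protocol such as HTTP when pings are blocked) can be sketched roughly as below. This is only an illustration, not how GroundWork or any particular monitoring tool does it: the names `icmp_probe`, `tcp_probe` and `first_responding` are made up for this sketch, the ping flags assume a Linux-style `ping` command, and a plain TCP connect to port 80 stands in for a real HTTP check.

```python
import socket
import subprocess
from typing import Callable, Optional, Sequence, Tuple

# A probe is a (label, function) pair; the function returns True if the host answered.
Probe = Tuple[str, Callable[[str], bool]]


def icmp_probe(host: str) -> bool:
    """Send one echo request with the system ping command (Linux-style flags assumed)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def tcp_probe(port: int, timeout: float = 2.0) -> Callable[[str], bool]:
    """Build a probe that succeeds if a TCP connection to the given port opens."""
    def probe(host: str) -> bool:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    return probe


def first_responding(host: str, probes: Sequence[Probe]) -> Optional[str]:
    """Try each probe in order and return the label of the first one that succeeds."""
    for label, probe in probes:
        if probe(host):
            return label
    return None  # no probe got an answer; the device may still be up


# Hypothetical ordering: ping first, then fall back to the web port.
DEFAULT_PROBES: Sequence[Probe] = (("icmp", icmp_probe), ("http", tcp_probe(80)))
```

Calling `first_responding("192.0.2.10", DEFAULT_PROBES)` would return `"icmp"`, `"http"` or `None` depending on what the device answers. Keeping the probe list as data makes it easy to reorder or swap probes per device, which is the kind of per-device adjustment a heterogeneous network calls for.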

Every year, the event also runs some InteropLabs -- separate projects sponsored by Network World to dig deep into emerging networking technologies, investigating them and testing what is possible. In the past, they have focused on emerging protocols like SIP (Session Initiation Protocol, used for VoIP), advanced wireless technologies, and other "cutting edge" emerging technologies. This year, one of the InteropLabs was designated as an "Open Source Software Initiative" -- which sets out to help networking professionals answer (through extensive testing) how ready open source is for prime time networking use.

There are a whole lot of cool open source network monitoring tools that will be on display this year at Interop. I'll be reporting back some of the key findings live from the event the week of 5/1.

Posted by Harper Mann on April 20, 2006 10:42 AM


April 18, 2006 | Comments: (0)

Open source network monitoring tools you should care about: MRTG and RRDtool

In today's "mash-up" application development craze, the innovation is being driven largely by the fact that APIs are now more open and accessible, and presentation layer technologies such as AJAX are creating very compelling new ways to visualize data streams from multiple services. Developers can get to services more easily, and with XML they can deliver the data in much more dynamic ways via web applications.

Similar trends are driving software development innovation today in network monitoring. Tools such as MRTG (Multi-Router Traffic Grapher) and RRD (Round Robin Database) make it possible to more easily collect data from a greater number of devices on the network, and convert the data into XML for easy consumption on the front end.

According to Alex van den Bogaerdt, who wrote a very good tutorial on RRD -- "MRTG started as a tiny little script for graphing the use of a university's connection to the Internet. MRTG was later used as a tool for graphing other data sources including temperature, speed, voltage, number of printouts and the like."

Free under the GNU GPL, network professionals started using MRTG to poll network devices, retrieve MIB (Management Information Base) and SNMP (Simple Network Management Protocol) values, and use Perl scripts to post the results on graphs on web pages. Because MRTG is so good at polling devices and producing graphs, it quickly became widely used not only by the open source folks cobbling their own solutions together, but also by very large proprietary vendors such as HP who, according to this site, borrow from some of MRTG's capabilities for OpenView.

In network monitoring, the whole goal is to be able to monitor the right things. You don't want too much data, or you'll be flooded. Too little, and you miss important info. So there's this very fine line you have to walk to "right size" your monitoring system.

With MRTG and RRDtool (creator Tobias Oetiker's next generation MRTG, which extends the scalability and functionality), polling devices and getting just the information you want has become much, much easier. And because they use open C APIs and the info is dumped into XML format, these open source tools are interoperable with just about everything.
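
The "round robin" idea behind RRD is that an archive holds a fixed number of samples and the newest sample overwrites the oldest, so storage never grows. The sketch below only illustrates that idea in Python. It is not RRDtool's API or file format, and the class name `RoundRobinArchive` and its methods are invented for this example.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Sample = Tuple[int, float]  # (unix timestamp, measured value)


class RoundRobinArchive:
    """Fixed-size store: once full, each new sample pushes out the oldest one."""

    def __init__(self, slots: int) -> None:
        if slots < 1:
            raise ValueError("an archive needs at least one slot")
        self._samples: Deque[Sample] = deque(maxlen=slots)

    def update(self, timestamp: int, value: float) -> None:
        """Record one polled value, e.g. bytes per second on a router interface."""
        self._samples.append((timestamp, value))

    def fetch(self) -> List[Sample]:
        """Return the retained samples, oldest first."""
        return list(self._samples)

    def average(self) -> Optional[float]:
        """Mean of the retained values, or None if nothing has been stored yet."""
        if not self._samples:
            return None
        return sum(value for _, value in self._samples) / len(self._samples)


# Keep only the three most recent five-minute polls (values are made up).
archive = RoundRobinArchive(slots=3)
for step, value in enumerate([10.0, 20.0, 30.0, 40.0, 50.0]):
    archive.update(step * 300, value)
```

After the five updates, `archive.fetch()` holds only the last three polls, which is what keeps the size of such a store constant no matter how long the monitoring runs.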

Posted by Harper Mann on April 18, 2006 12:33 PM


April 03, 2006 | Comments: (0)

Working with Log Files Hands on Lab Part 1: Syslog

I'm attending a fantastic session at LinuxWorld today titled "Working with Log Files." The session is being run by Mark Cohen (Quote.com, LookSmart, Penguin Computing) and Patrick McGovern (SourceForge.net and Splunk). The lab is designed for system administrators, developers and support people who want to learn more about how to use log files to troubleshoot IT infrastructures.

In a three hour session Mark and Patrick are providing a series of hands on labs, each lasting about 15 minutes, covering LAMP stacks, syslog, Linux, VOIP and J2EE. The lab participants ssh over wireless to a central server in the room running 50 Linux sessions using VMWare. We're using a variety of tools including Logwatcher, Swatch and Splunk Server to figure out what's going on with each of our Linux sessions using the log files.

The first lab involved ssh access to the VMWare server to get a virtual machine session, then setting up and configuring syslog and syslogng. Syslog is the network logging workhorse of Linux and Unix systems. It is both a communications protocol and a set of actual programs and libraries. Syslog, unlike syslogng, is included in all Linux and Unix distros. Syslogng (next generation) is a newer and improved version of syslog, adding better message filtering, better forwarding, message integrity and encryption, and remote logging over TCP and UDP.

- Syslog can take messages generated by operating systems, processes and applications and save the messages to a log file or send them to another syslog server over the network.
- Syslog can organize messages into categories according to facility and priority.
- Syslog can display messages directly to the console.

Syslog can be used for troubleshooting:

- Routers, firewalls and devices during installation or in problem situations.
- Intrusion detection.
- Operations management.
- Auditing.
- Tracking user and administrative activities.

One of the important things the lab reinforced is that setting up and deploying syslog requires some careful thought about your particular needs. Syslog can be set up on specific services and devices in a number of different modes including:

- Device: generates messages (may be a program).
- Collector: receives and optionally stores messages.
- Relay: receives and forwards messages.
- Sender: any program or device sending messages (device & relay).
- Receiver: any running syslog program that receives syslog messages (relay & collector).

A typical syslog set-up involves one or more senders sending data to a single, central syslog server, which stores the received data, usually all on one network. A larger syslog set-up can involve senders sending data to intermediary syslog servers. The intermediary servers receive, optionally filter and forward messages to a final syslog destination.

Syslog has a number of fields that you should be familiar with, including facility, severity, timestamp, host, tag and message. Depending upon the implementation, not all of them are always present. The configuration can be controlled with the syslog.conf file in /etc.

Facility is a numerical indicator ranging from 0 to 23 that identifies the sender of the message. 0 is traditionally used for kernel messages, 2 for email subsystem messages, and 16-23 are reserved for your own customization. Severity is a numerical code from 0 to 7 indicating the level of the message. The lower the number, the more important the message is deemed to be (0 is "emergency" and 7 is "debug").
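
On the wire, the traditional BSD syslog protocol (RFC 3164) packs these two numbers into a single priority value at the start of each message: facility times 8, plus severity. That formula comes from the RFC rather than from the lab, and the helper names below are made up for this sketch.

```python
from typing import Tuple


def encode_priority(facility: int, severity: int) -> int:
    """Combine facility (0-23) and severity (0-7) into one priority value."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return facility * 8 + severity


def decode_priority(priority: int) -> Tuple[int, int]:
    """Split a priority value back into (facility, severity)."""
    if not 0 <= priority <= 191:  # 23 * 8 + 7 is the largest valid value
        raise ValueError("priority must be between 0 and 191")
    return divmod(priority, 8)


# Kernel (facility 0) emergency (severity 0) is the most urgent combination.
kernel_emergency = encode_priority(0, 0)
# Mail subsystem (facility 2) at debug level (severity 7).
mail_debug = encode_priority(2, 7)
# First of the locally defined facilities (16) at debug level.
local_debug = encode_priority(16, 7)
```

Here `kernel_emergency` is 0, `mail_debug` is 23 and `local_debug` is 135, and `decode_priority(135)` gives back `(16, 7)`.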

We also learned in the lab that syslog timestamps are funny: they contain no time zone and no year, and their resolution is no finer than one second. Times in syslog can also often be out of sync unless your servers are synchronizing their clocks with a protocol like the Network Time Protocol (NTP).
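
Because the timestamp carries neither a year nor a time zone, whatever reads the log has to supply both. A minimal sketch, assuming the classic "Mmm dd hh:mm:ss" layout and assuming the reader knows (or guesses) the year and the sender's UTC offset; the function name `parse_syslog_timestamp` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_syslog_timestamp(stamp: str, year: int, utc_offset_hours: float = 0.0) -> datetime:
    """Turn e.g. 'Apr  3 11:34:05' into an aware datetime.

    The year and UTC offset are not in the message, so the caller must supply them.
    """
    # Prepend the year so the parser never has to guess it.
    naive = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))


# Hypothetical values: the log line was written in 2006 on a host seven hours behind UTC.
when = parse_syslog_timestamp("Apr  3 11:34:05", year=2006, utc_offset_hours=-7)
```

If the assumed year or offset is wrong the result is silently wrong too, which is one more reason to keep sender clocks in sync and to know where each log came from.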

The syslog host is the name or IP address of the sender of the messages. The intention is to preserve the name of the original sender when a message passes through syslog relays. The syslog tag is a short ID meant to identify the program or process that sent the message. The syslog message contains the actual content of the message and can be highly unstructured data.
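
Putting the fields together, a raw BSD-style line can be split with a regular expression. This is a rough sketch for well-formed lines only. As noted above, real implementations omit or reshape fields, so the pattern and the `SyslogRecord` and `parse_line` names are illustrative assumptions, and the sample line is the example message from RFC 3164.

```python
import re
from typing import NamedTuple, Optional


class SyslogRecord(NamedTuple):
    priority: int   # facility and severity packed together
    timestamp: str  # kept as text: no year or time zone to interpret
    host: str
    tag: str
    message: str


_LINE = re.compile(
    r"^<(?P<priority>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^\s:\[]+)(?:\[\d+\])?: ?"  # optional "[pid]" after the tag is skipped
    r"(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[SyslogRecord]:
    """Return the fields of one syslog line, or None if it does not fit the pattern."""
    match = _LINE.match(line)
    if match is None:
        return None
    return SyslogRecord(
        priority=int(match.group("priority")),
        timestamp=match.group("timestamp"),
        host=match.group("host"),
        tag=match.group("tag"),
        message=match.group("message"),
    )


record = parse_line(
    "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
)
```

Returning `None` for lines that do not match, rather than raising, reflects the point that unstructured or partial messages are normal in syslog and should not stop a log reader.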

Our presenters were well-informed syslog users and happily pointed out its shortcomings.

- Syslog is not the most reliable as it runs over UDP.
- Syslog is not terribly secure as the sender can be faked and it is open to relay attacks.
- The content is not standardized and there is a large variety of unstructured message contents.

The words of wisdom: if you deploy syslog, spend some time thinking about how to do it for your environment. Syslogng offers a number of improvements over syslog, and in a later post I'll talk about Part 2 of the lab, covering the benefits of syslogng and some hands-on experience with it.

If you have questions or thoughts about syslog or syslogng email me at thebaum@splunk.com.

Posted by Michael Baum on April 3, 2006 11:34 AM

