March 08, 2008 | Comments: (0)
/etc/hosts.deny, hackers, and automation run amok
3AM. It's always 3AM when these things happen.
Last night, my cellphone started beeping, and after it finally woke me up, I cracked open an eye and checked the screen. Text messages from Nagios, telling me that my main FreeBSD mail/Web server was incommunicado. Lovely.
I crawled out of bed and logged into my MacBook Pro. I had an open SSH session to that box, but it was all but unusable, echoing back a character every few seconds. An eventual 'uptime' showed the 5-minute load at over 300. Three hundred processes in the run queue basically means the box is thrashing wildly... but why?
The Nagios client had respawned a hundred or so times, and sshd, snmpd, and inetd were each running at 60-70% CPU utilization, completely consuming both CPUs. Everything had come to a standstill. I killed the offending processes from the console (hooray for Raritan KVM-over-IP!) and the box settled back down.
I first started sshd back up, and didn't see the load rise, but as soon as I attempted to SSH back into the box, it spiked to 100% utilization. I killed it, and rebuilt openssh-portable from ports, wondering if I'd been hacked, or the sshd binary had somehow become corrupt. I ran the newly-built sshd manually in debug mode, and watched the same problems occur. Obviously, this wasn't good. Checks of dmesg and /var/log/messages showed literally no problems whatsoever. The I/O subsystem seemed fine, as did all normal server operations -- I could SSH out, Apache, MySQL and sendmail were working, but there was obviously something very wrong.
The uptime on this server was 525 days. Generally speaking, I refrain from rebooting a box unless absolutely necessary, but in this case, I felt that I had to start with a clean slate. For the first time since September of 2006, I rebooted my main workhorse server.
It came back up without issue, other than the same sshd, snmpd, and inetd problems. The reboot was ultimately unnecessary. But what could be causing this problem? As I was making a cup of coffee, I thought that I might try removing hosts.deny to see if that made a difference. That did the trick -- all was well without it. But what caused that?
A while ago, I wrote a quick script to scan /var/log/auth.log for spurious brute-force SSH login attempts, and to add the offending IP address to /etc/hosts.deny for sshd. This worked extremely well, reducing the potential effectiveness of these attacks to all but zero. The problem, as it turned out, was that the script eventually wrote over 140 IPs onto a single line in /etc/hosts.deny, which either triggered a bug, or exceeded a line-length limit that I'm unaware of. Removing that line caused all previously-misbehaving services to return to normal, and after some time to settle down, the server was back to handling a few hundred thousand emails a day, alongside Web and DNS services. I rewrote the brute-force detection script to add IPs to a pf table instead of /etc/hosts.deny, and parsed the previous hosts.deny list into the table to retain that information. Of course, this is how I should have done it to begin with. It took two cups of coffee, but I was out of the woods.
This was a decidedly non-obvious solution to a decidedly bizarre problem. I'd still like to know if I hit a bug in the BSD stack, or what the hosts.deny line-length limits are. Anyone? Bueller?
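For anyone curious what the pf-table approach looks like, here is a minimal sketch in Python. It is not the original script: the log line phrasing matched by the regular expression, the threshold of five failures, and the table name "bruteforce" are all assumptions made for illustration, and real auth.log output varies between OpenSSH versions.

```python
import re
from collections import Counter

# Assumed sshd log phrasing; adjust to match the lines your sshd actually writes.
FAILED_RE = re.compile(
    r"sshd\[\d+\]: (?:Failed password|Invalid user) .* from (\d{1,3}(?:\.\d{1,3}){3})"
)

def offending_ips(log_lines, threshold=5):
    """Return sorted IPs seen in at least `threshold` failed SSH login lines."""
    counts = Counter()
    for line in log_lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(ip for ip, n in counts.items() if n >= threshold)

def pf_commands(ips, table="bruteforce"):
    """Build one 'pfctl -t <table> -T add <ip>' argument list per IP.

    The table name is hypothetical; it must match a table declared in pf.conf.
    """
    return [["pfctl", "-t", table, "-T", "add", ip] for ip in ips]
```

Feeding the function an open handle on /var/log/auth.log and passing each returned argument list to subprocess.run would complete the loop. Because each address is its own table entry, nothing grows into one enormous line the way the hosts.deny entry did.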
Posted by Paul Venezia on March 8, 2008 02:14 PM
February 15, 2008 | Comments: (0)
Summary: Make sure you uninstall SideTrack 1.5 before doing a Tiger-to-Leopard upgrade.
Tolstoy:
I'm not the kind of guy that leaps on new operating systems before the shrink wrap has shrunk. I like to let others take the lumps of a .0 release before I subject my core laptops and workstations to the latest and greatest. Thus, I kept my 17" MacBook Pro on Tiger until this evening.
I probably would have stayed there for a while longer if I hadn't picked up a MacBook Air. I've been using it daily since I got it, switching back to the 17" when I needed the screen space (heavy coding, lots of RDP connections, etc), and I found that several of the features in Leopard were too good to pass up, especially Spaces and the spring-loaded dock folders. So I mounted an NFS share to my trusty Adaptec Snap 650 filer, and backed up all 140GB of data from my MacBook Pro. Feeling relatively safe, I dropped the Leopard DVD into the drive (making that the second time I've used the DVD drive in at least six months, maybe longer), and let the installer do its thing.
The system updated successfully and rebooted. Happy that things looked like they had gone well, I started to log in -- but had no keyboard. The trackpad worked fine, but the keyboard was as dead as a doornail. No capslock lights, nothing. Obviously, this was a big problem.
I grabbed the Air and checked some of the sites I'd seen a month or so ago discussing intermittent keyboard problems with MacBooks and MacBook Pros on Leopard. Apple had released a fix for 10.5.1, and I had nothing better to try, so I plugged in an external USB keyboard, logged in, and fired up Software Update. The 10.5.2 update came down along with a passel of other updates. The subsequent reboot... did nothing to fix the problem.
Of course, the Apple update (MacBook/MacBook Pro Software Update 1.1) was rolled into 10.5.2, so it wasn't that... and this is my main machine. My workstations are great when I need the dual 24" LCDs, but when I need to find a comfy chair and get some serious writing or coding done, I'll grab the 17" and never look back -- except that without a keyboard, it's obviously useless. I was worried.
Then I remembered that I'd installed SideTrack. SideTrack is a trackpad helper app from Raging Menace. I've used it for years to enable single-finger scrolling on Mac trackpads, along with a few other nice additions not provided with the standard Apple trackpad driver. There's currently a Leopard update for SideTrack, version 1.6, but I hadn't checked that before the upgrade. So, I did the uninstallation and the necessary reboot, and voila, all is now well.
So if you're using SideTrack 1.5 on a Tiger system, save yourself a headache and uninstall it before you do the upgrade. Now my only decision is whether or not to reinstall the Leopard-compatible version of SideTrack. I doubt it supports the newfangled touchpad on the Air, and switching scrolling reflexes from laptop to laptop will drive me nuts. Decisions, decisions.
Posted by Paul Venezia on February 15, 2008 08:52 PM
February 08, 2008 | Comments: (0)
Once in a while, I reflect on some of the tools that I use constantly, and the fact that there are an awful lot of unsung heroes out there. Last night I started thinking about it and compiling a simple list of tools and some specific people that fit this bill. Here they are, in no particular order.
PHP
This one should be obvious. PHP has developed into an extremely strong, functional, stable, and fast Web development framework. If Perl makes easy things hard and hard things possible, PHP makes everything easy. I've even taken to writing backend scripts in PHP that would have been Perl not too long ago. A recent IMAP mailbox scanning, parsing, and spam blocking database interaction script springs to mind. It's around 30 lines of PHP and works like a charm.
MySQL
Again, another obvious entry here. Where would we be without MySQL? It's far more powerful and flexible than many DBAs will admit, and scales extremely well. Think Wikipedia.
phpMyAdmin
I don't know how many times I've used phpMyAdmin, or on how many servers I've installed it, but it's simply a phenomenal tool for working with MySQL.
Linux
'nuff said.
FreeBSD
FreeBSD (and NetBSD, OpenBSD, etc) are the unsung heroes of the unsung heroes. I operate several high-powered and heavily-loaded FreeBSD boxes, and it's a welcome change from the cult of Linux on occasion. It might not be as admin-friendly to the uninitiated, but once you grok it, there are features in FreeBSD that you wish your Linux boxes had.
DarwinPorts
For the past 7 years or so, I've been using Mac OS X, and never have I used the Fink package system. It just seemed, well, not quite right to me. Enter DarwinPorts. I use this all the time, and find it fast, flexible, and simple.
Larry Wall
I want to live on whatever planet Larry's from. It's hard to picture the world without Perl... and we wouldn't have Perl without Larry, that's for sure.
OpenSSL/OpenSSH
The deployed base of OpenSSL and OpenSSH is probably incalculable. From my cellphone to my TiVo, to my workstations, laptops, servers, across all operating systems and devices, there's OpenSSL and probably OpenSSH. It's become as ubiquitous as the air we breathe.
Bram Moolenaar and Vim
Another hidden hero, Bram Moolenaar (et al) is responsible for the best editor ever -- Vim. It's my mail reader on some boxes, obviously my editor of choice, and my IDE all rolled into one. I've been using Vim for years and years, and probably still only know and use 20% of the functions. I'm constantly using Vim reflexes in other editors (like Microsoft Word, or in ecto, which I'm using to write this post). If I can find Vim keybindings for an app, I'll use them. Firefox already supports several, such as the / search.
There are many, many more than those listed here, but these are the ones that topped my list last night while I reflected on this post, a few fingers of Lagavulin warming my belly and my brain. Have some more? Drop me a line.
Posted by Paul Venezia on February 8, 2008 12:57 PM
February 07, 2008 | Comments: (0)
Dear Apple, Never mind, someone already did
There are times that I need to get my hands dirty fixing problems all by myself, then telling everyone else how to do it. Then, there are times that I post a blog entry complaining about something or other, and the solution finds me. This, thankfully, is one of those times.
While I'm definitely an OS X guy for the moment (don't screw it up, Apple... please), I was relatively upset about the essentially useless X11 code in Leopard, and posted about it this morning. This afternoon, I received an email from Tevor Zylstra with a pointer to the Leopard and X11 blog, which contains links to the Xquartz Project. There, I found an unofficial X11 2.1.3 release that fixes all the bugs I've come across with Leopard's X11, from the window focus to the Option-click pasting. Huzzah!
Either I need to do more research or less -- I'm not sure which. All I know right now is that I'm a happy guy. Many, many thanks to Ben Byer and his group. Hopefully, 10.5.1 will incorporate this update and make it official.
Posted by Paul Venezia on February 7, 2008 11:07 PM
February 07, 2008 | Comments: (0)
Dear Apple, please fix X11. Again.
I'll admit -- I've been reticent to migrate my MacBook Pro to Leopard. The Tiger installation on that system has been stable and reliable, and I didn't want to visit any unnecessary demons on it if I could avoid it. However, I just received my MacBook Air that came with Leopard. This has given me a few more reasons not to upgrade, unfortunately. The biggest issue is X11.
Tiger's version of X11 was cranky right out of the box: X11 windows didn't gain focus when Command-Tabbing to them, middle clicks to paste the clipboard buffer didn't work, and there were a few other inconsistencies. After a few months, an update was released that fixed these problems. It seems that the Leopard version of X11 again suffers from a nearly identical set of problems. The version included with Leopard is v2.0, where the version running on my Tiger MacBook Pro is v1.1.3 -- and it seems that I may need to downgrade to Tiger's version to resolve these issues, though obviously, I'd rather not do that.
The vast majority of Leopard users will never know about, nor need, X11, but there's a large contingent of deep geeks that use Mac OS X and X11 constantly -- and this is the audience that greatly influences purchasing decisions made by others. It's certainly in Apple's best interests to fix these problems, especially since we've been down this road before. So please, Apple, please fix Leopard's X11. Pretty please? With sugar on top?
Posted by Paul Venezia on February 7, 2008 08:59 AM
November 25, 2007 | Comments: (0)
Having endured the Vongo ads during various football games the past few days, I figured I'd at least check it out. I wasn't sure what to expect, and boy was I surprised. If you don't already know, Vongo is a new digital movie distribution site that allows users to download as many movies as they want for $9.99 a month. Intriguing, for sure, but riddled with artificial restrictions, apparently.
For one thing, Vongo is deeply, deeply Microsoft-centric. So deeply, in fact, that you can't even view their website with a browser claiming to come from another OS. With a Linux or Mac browser, the only possible option is to enter your email address to be notified if/when your OS is supported.
This means that you can't do any research on Vongo from anything but a Windows box -- switching Firefox to identify itself as IE on Windows XP completely broke the site rendering, and hitting the site from my Nokia N95 (as I would imagine lots of people will do when seeing the ads on football games in bars or at a friend's house) gave me the nice "Incompatible OS" page as well, preventing me from getting any more information about Vongo. Handy.
Also, if you enter in 'vongo.com' to go directly to the site, it redirects to 'www.vongo.com.', a typo that thankfully most browsers ignore, but does show a certain lack of attention to detail.
So I wandered around the site with my Windows XP VM, looking for some answers to what's really happening on the back end. It seems that the only compatible playback devices are Windows XP, 2000, and Vista, or an Xbox 360. I could find no mention of playback on portable devices, although the commercials made a point of referencing this ability (and a point of not mentioning/showing an iPod). I'd guess that the Zune is supported, but I've seen no specific information on that issue.
But the fact that they were expecting to support portable devices without specifically mentioning the Zune gave me a flicker of hope that these movies might not be horrendously crippled for playback on other devices, like my iPod, Nokia N95, and Mac. Those hopes were dashed when I read this on their site:
"In order to enjoy the full experience of Vongo and Media Center Edition feature integration with Windows Vista, we strongly recommend that you uninstall the Vongo application software prior to upgrading to Vista. (Please Note: When you uninstall Vongo you will lose movies and videos already downloaded to your library. Because Windows 2000 and XP are separate and distinct operating systems from Vista, there is simply no technical means of porting Vongo videos across operating systems. As a Vongo subscriber you can always replace the videos in your library at no charge.)"
Really? The movies you download on XP won't be playable on Vista due to technical reasons? Please. Pull the other one, it's got bells on. This little lie is probably in place just to convince potential users to upgrade to Vista first, with Vongo as the proverbial carrot.
So it seems to me that Vongo has been designed as a Vista delivery catalyst and little more...and why would I want to artificially restrict myself so heavily, to the point where upgrading to another Microsoft OS will cause me to lose the movies I've already downloaded?
Amazon recently started offering $8.99 non-DRM MP3 albums. I've bought several so far, since I can use them on any of my playback devices, from my Sonos system to my Linux workstations and laptops to my iPod. It's this reason that I don't use the iTunes store, or any other crippled delivery system. So sorry, Vongo, but I'm completely uninterested.
Posted by Paul Venezia on November 25, 2007 11:09 AM
November 12, 2007 | Comments: (0)
I'll come right out and say this: I'm an AMD kinda guy. My main workstation runs Opteron 2220s, I prefer Opterons in my servers, and I've been looking forward to Barcelona for, well, far too long.
My attraction to the Opteron has been solely based on performance. In the past few years, the Opteron has consistently outperformed Intel's offerings, at least with my workloads. It seems that this may no longer be the case. My Stoakley reference system running two 45nm quad-core 3.0GHz CPUs simply screams. I've had it running VMware ESX 3.0.2 for the past month, running a LAMP-based Web app load simulator I wrote in PHP. Even though, on the face of it, Intel's design seems inferior, placing far too much emphasis on shared busses, the reality of the Stoakley platform is that it delivers.
Today's announcement from Intel regarding the new 45nm chips marks a reaffirmation of the next generation of processors. In reflecting on the fact that the new 45nm chips have transistor densities that place 30 million transistors on a space the size of a pinhead, I realize that the first Intel processors had around 2,300 transistors total. That was only thirty years ago, or so.
But enough reminiscing. Penryn is here, and it's going to make a splash if the performance of my reference system is any indication. The 6MB cache per die is definitely helping the numbers here, along with the 1.6GHz FSB. If you want an exhaustive account of the ins-and-outs of the Stoakley platform and the new 45nm chips, check out Scott Wasson's Tech Report entry from September. I was at Intel in Oregon, standing a few feet from him when he took that beautiful photo of that wafer with a video camera of all things. I think I was laughing at the sight of three geeks desperately trying to photograph a largely reflective surface with digital cameras, and even took a photo of them trying to take photos.
On the outside of the box, I'm more concerned with what Stoakley can do for me. My Web app test hasn't changed much since I used it to test blade servers and VMware VI3 late last year. It's a relatively simple PHP script and a 500,000-record MySQL database. When a Web load generator is pointed at the script, it either serves up a static page, or makes a database call to generate a dynamic page. Tuning the parameters of the script can lean more towards Web server or SQL server performance, but normal operation has the predominant load shifting between the two as the test runs.
So I built a VMware ESX 3.0.2 server on the Stoakley box. The box itself had 16GB RAM, two 136GB U320 SCSI drives, and two Intel gigabit NICs. The onboard NICs weren't supported by VMware until just a week ago or so, and the SuperMicro case was low-profile, so I was limited to the dual-port LP Intel NIC I had. I dedicated one NIC to VM storage, and the other as the front end for all servers. I then built out six CentOS 5-based VMs: A two-CPU MySQL server, four single-CPU Apache servers, and a single-CPU load-balancer. Using LVS for load balancing, I nailed the box with HTTP requests for the Web test app installed on all four Apache servers. The load on all boxes grew substantially, delivering up to 4,000 requests per second, a number that's very dependent on the parameters of the test script, but certainly showed that the box was running on all eight cores, and running them hard. Over 200 million requests and nearly 15 hours later, the box showed no signs of distress, and each VM was still running like mad.
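As a rough sanity check on those figures, the sustained average can be worked out in a couple of lines. This rounds "over 200 million" and "nearly 15 hours" to exactly 200 million and 15 hours, which is an assumption, so treat the result as a ballpark rather than a measurement.

```python
def average_rate(total_requests, hours):
    """Average requests per second over a run of the given length."""
    return total_requests / (hours * 3600)

rate = average_rate(200_000_000, 15)
print(f"{rate:.0f} requests/second on average")  # prints "3704 requests/second on average"
```

That average sits just under the 4,000 requests per second peak, which suggests the box spent most of the run close to its ceiling.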
I kept the tests running over the next week, simulating a scenario that should never happen in real life, but occasionally does. After finally stopping the attack, I let the system quiesce, and then proceeded to load it up with more VMs for other tests I needed to do in the lab. I've wanted to run other benchmarks on the server, like compression and encoding tests, but that means that I have to power it down. I won't be able to realistically do that for another few days -- at this point, it's already indispensable.
So for now, I'm lacking raw numbers on standard tests, but I can tell you that my love affair with AMD is on shaky ground. I don't yet have a Barcelona system to run against Stoakley, and I really need to run those tests before I make any hasty moves. From what I've seen of Stoakley and Penryn this past month, my AMD honeymoon may be over.
Posted by Paul Venezia on November 12, 2007 07:12 PM
November 07, 2007 | Comments: (0)
So for all the virtual infrastructures that I've built in the lab and in the field, for every time I've PXE booted a new VMware ESX server into a production environment, I hadn't virtualized my own core systems... until yesterday.
It was a spur of the moment kinda thing, and I went with it. I started the day with five physical core servers of varying age, and ended the day with two, having collapsed the others onto a single Dell PowerEdge running VMware ESX 3.0.2. These were a few Windows boxes, three Linux boxes, and a FreeBSD system, all running on relatively ancient hardware.
It might be odd to think that just feet away from an Intel reference system running the new Stoakley platform and several 8- and 16-core servers from HP and Sun, a 6-year-old HP Kayak XU800 sat, a Fedora Core 3 system with a Pentium III-866 and two 40GB PATA drives in software RAID1, running Cyrus IMAPD for my 5GB mailbox, as well as primary DNS, DHCP, and NTP tasks for the entire lab on all VLANs. Around the corner from that were several other servers running various backup tasks, public Web apps, and so forth. I decided that everything had to go, and by 5pm, I'd rebuilt all the systems on the VMware box, including a somewhat annoying Berkeley DB 4.2->4.3 migration for Cyrus, since the new server was built on CentOS 5. Essentially, this was a 5-hour server consolidation project from conception to reality, and I don't seem to be the worse for it. In fact, I've lightened the power and heat loads in the lab, and am making far better use of the Dell PowerEdge 2800 that's holding down these new systems.
The PE2800 isn't the highest-spec box, especially for VM tasks. It has two HyperThreaded single-core 3.6GHz Xeons with 4GB RAM and a bunch of disk in a RAID 5, but it's now running 6 VMs like a champ, including my Asterisk PBX build. I plan on moving over a few more boxes today, but the bulk of the work is done.
One of the only issues that I have with the new environment is that all the VMware management tools are Windows-based. I really wish that VMware had continued their practice of producing Windows and Linux management tools, since my only Windows XP system is a VM running on my workstation and I now have a slightly disconcerting dependency there. Add to that the fact that the VirtualCenter server is running on a VM of its own on another physical system under VMware Server, and I probably should build a standalone Windows server to handle those tasks. It would be ideal if VMware could bring simple VM management for ESX hosts into the VMware Server Console. I don't need all the bells and whistles there, but I do need to be able to powerup/powerdown servers and access their console. Since there's been consternation regarding the management split between VMware Server and ESX, this might actually happen, but I'm not holding my breath.
On the other side of the coin, migrating a VMware Server 1.0.3 VM to ESX 3.0.2 was very simple -- move the files into place on the ESX host, run vmkfstools on the vmdk, and import the VM via VirtualCenter. I found that it's best to delete the NIC from the VM and re-add it since there's some issue with variable syntax in the .vmx file between VMware Server and ESX, but all told, it was quick and easy, just the way it should be.
Posted by Paul Venezia on November 7, 2007 10:42 AM
November 02, 2007 | Comments: (0)
I made a few changes to the code, enough to warrant a sub-version bump. These were a few cosmetic changes and I added a -i option that will force fileages.pl to ignore .snapshot directories. In the future, I might expand that syntax to handle a definable ignore string, but in the short term I needed to run it on a filer that had snapshot support, and I didn't want to count the snapshots, so there you go.
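The idea behind the -i option can be sketched in a few lines. This is Python rather than the Perl that fileages.pl is written in, and it is not the script's actual code; the function name and the ignore_dirs parameter are made up for illustration.

```python
import os

def walk_files(root, ignore_dirs=(".snapshot",)):
    """Yield file paths under root, skipping any directory named in ignore_dirs."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in ignore_dirs]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Pruning at the directory level matters on a filer: each snapshot can look like a complete copy of the tree, so filtering individual files after the fact would still mean walking all of them.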
It's also been tested on Mac OS X, and Martin Heller wrote in to tell me that it works just fine under Windows if you have a Win32 perl install. I ran it on a few 10-million-file directory hierarchies without issue, though it did take quite some time to run, as you might expect.
You can grab fileages.pl here.
Posted by Paul Venezia on November 2, 2007 03:41 PM
October 27, 2007 | Comments: (0)
Help for file packrats - fileages.pl
If you're like me, you would rather buy a few new disks than cull through your stuff to delete files you probably don't need anymore. I think that in my case, whenever I've decided to clean house I've deleted files that I found myself desperately needing a few weeks later. These would be things like an FC2 ISO set, for instance. It's large and I'm not planning on installing or using FC2 anytime soon, so out it goes. Then two weeks later, out of nowhere, I get pulled into a problem with someone's FC2 box, and I need to build a VM to replicate it, or something like that. Back to the Web to pull down another set.
Well, I decided to at least get an idea of the ages of the files in one of my main filestores. Not just creation time, but last access time, and last modified time. This is obviously a job for Perl, and I'm sure there are one hundred similar scripts floating around the Internet like flotsam, but a cursory Google search didn't pull up anything promising. So I fired up vim and typed up fileages.pl. It should run on any POSIX OS with Perl 5, but I've only tested it on Linux and FreeBSD 6.1.
It's very simple: Walk a directory structure with File::Find, and note the mtime, atime, and ctime of each file. Then, compile all that info and dump out a summary. Optionally, dump out the info for every file. The usage is also simple:
Usage: fileages.pl [-dhs] [-t (atime|ctime|mtime)] [-p <path>]
  -d  detailed output
  -h  this help
  -s  supplemental ages
  -t (atime|ctime|mtime)  type of scan. Default is atime
  -p <path>  path to scan
If -p isn't specified, then the current directory is used.
By default it runs on the current directory, looking for atime, and only outputs a summary for files 30, 60, 90, 180, 365, and 730 days old. Optionally, using the -s flag, additional times are recorded, including 1, 3, 7, 14 days old. The -d flag will display info for every file seen, so on a large filesystem the output can be enormous. The script is fairly CPU intensive, but used less than 9MB RAM on a million-file run.
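The core of the approach can be sketched as follows. This is a Python approximation, not fileages.pl itself (which is Perl and uses File::Find). One assumption is worth flagging: each file is counted once, in the oldest bucket it qualifies for. The function and constant names are invented for the example.

```python
import os
import time

THRESHOLDS = (30, 60, 90, 180, 365, 730)   # default summary buckets, in days
SUPPLEMENTAL = (1, 3, 7, 14)               # extra buckets enabled by -s

def bucket_for(age_days, thresholds):
    """Return the largest threshold the age exceeds, or None if younger than all."""
    matched = None
    for limit in sorted(thresholds):
        if age_days > limit:
            matched = limit
    return matched

def summarize(root, stamp="atime", supplemental=False, now=None):
    """Walk root and return {threshold: [file_count, total_bytes]}."""
    thresholds = THRESHOLDS + (SUPPLEMENTAL if supplemental else ())
    now = time.time() if now is None else now
    summary = {limit: [0, 0] for limit in sorted(thresholds)}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # stamp selects st_atime, st_mtime, or st_ctime from the stat result.
            age_days = (now - getattr(info, "st_" + stamp)) / 86400
            limit = bucket_for(age_days, thresholds)
            if limit is not None:
                summary[limit][0] += 1
                summary[limit][1] += info.st_size
    return summary
```

Keeping only a pair of counters per bucket, rather than a record per file, is one way a scan like this could stay within a few megabytes of RAM even on a million-file tree.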
I've run this on some large filestores, and found some interesting results (to me, anyway). Walking an 850GB store with over 1 million files resulted in this:
[pvenezia@bop ~]$ fileages.pl -s -p /bigdisk
---------------------------------
Path: /bigdisk
Scanning for atime
Total files scanned: 1006417
-----
Total files older than 1 days: 2929
Total size: 14.20 GB
-----
Total files older than 3 days: 2023
Total size: 17.07 GB
-----
Total files older than 7 days: 991
Total size: 59.58 GB
-----
Total files older than 14 days: 21
Total size: 11.28 GB
-----
Total files older than 30 days: 1803
Total size: 24.35 GB
-----
Total files older than 60 days: 1672
Total size: 42.91 GB
-----
Total files older than 90 days: 52161
Total size: 83.05 GB
-----
Total files older than 180 days: 14011
Total size: 86.97 GB
-----
Total files older than 365 days: 360239
Total size: 146.05 GB
-----
Total files older than 730 days: 561901
Total size: 214.23 GB
Note that this scan took 9 minutes running on a RAID5 array accessed via an NFS mount on a gigabit network. Obviously, I have lots and lots of stuff that I haven't even touched in two or more years.
In the corporate world, I've used this script to prove to a few folks exactly how old their files are, and to make the case for deleting or offlining gigs and gigs of relatively useless data. After all, nothing speeds up the backup window more than backing up fewer files.
Might be time for me to bite the bullet too. Or not.
You can download fileages.pl here.
Posted by Paul Venezia on October 27, 2007 02:52 PM
July 25, 2007 | Comments: (0)
Intel had a fairly big announcement today, highlighting their work in the server MP space with a new multi-processor framework dubbed Caneland. This is a new one on Intel, and definitely new in the industry, marking the first time a four-socket quad-core offering has reached this level. The basis of the system is the Tigerton CPU, running four cores at up to 2.93GHz per core. This is married to the Clarksboro chipset, and Intel is claiming a 2x speed improvement over existing Xeon-based chips. I had a chance to play with the Tigerton/Clarksboro framework in Intel's lab today, and it really is rather odd to see 16 cores on a 4U, four-socket Windows box. I ran a few benchmarks, but I can't publish any of the data -- yet. Suffice it to say that the official launch of Caneland later this year will be quite interesting. Manufacturers have been getting shipments for over a month now, so Intel isn't worried about quantities.
I happened to be in Intel's Oregon location to attend a workshop centered around some new products that will be announced in the coming months. It was a very enlightening few days, and left me truly wondering about AMD's delayed quad-core, Barcelona. It's clear to me that Intel's technology isn't quite as good as AMD's Opteron and Barcelona, but then again, they've had their version of a quad-core x86_64 CPU for quite some time, while AMD's still waiting on the official launch of their quad.
The differences in CPU design are significant. Where Intel basically bolts two dual-cores together to make a quad-core, AMD is placing all four cores on a single chip, in all its HyperTransport and NUMA glory. I've found the Opteron to be the better choice for lots of workloads, especially RAM-intensive applications, and found Intel's new Xeons to be speedy, but challenged in key areas, such as bus performance and memory access. Those points are moot, however, if AMD delays Barcelona too much longer. I know I'm eager to set these new chips against each other, but it will be a bit of a wait on both fronts.
Whatever else is happening, it's certain that on every level, CPU development is full steam ahead... just in time for everyone to start spec'ing gear for their virtualization rollouts.
Posted by Paul Venezia on July 25, 2007 12:56 AM
June 01, 2007 | Comments: (0)
Introducing ASAP - Automated Switchport Access Provisioning
...or something like that.
ASAP is a PHP/Perl application that automates switchport VLAN assignments for Cisco switches.
The good stuff:
o- Web and telnet interface
o- Can handle multiple switches across multiple sites
o- All switch interaction is via SNMP
o- Forces selected switchport down/up, causing Windows systems to automatically release/renew a DHCP lease
o- Prunes the ARP table following a VLAN modification
o- Can prevent certain subnets and IP addresses from being modified
o- Can be easily used by end-users
o- Admin interface provides instant system location by hostname, IP or MAC address
o- LDAP authentication for admin interface
o- MySQL logging
o- Basic reporting tools
o- Has been tested on CatOS and IOS with Cisco 6500-series and 3500-series switches
The bad stuff:
o- Rudimentary reporting needs work
o- Unsure of scalability. Sites with dozens of switches may require code tweaks
o- Hasn't been tested on several switch classes
o- Configuration could be more straightforward
Overview:
Once a network has been built and is fully operational, the vast majority of configuration tasks are simple VLAN assignments. Usually, these assignments happen only once, when a workstation is first introduced into the network, but in lab environments, VLAN assignments can occur constantly. ASAP was designed to remove the burden of system switchport location and VLAN modification from IT, and allow general users to easily perform these changes. Alternatively, ASAP can be configured to only allow admin access, and given a MAC address, IP address, or hostname, a specific system's current switchport can be located and modified without telnetting to a switch, and with an audit trail.
I originally wrote this right before I moved two large sites from one building to another. Each site had over 800 switchports and I was lazy enough to not want to deal with VLAN assignments. I wrote the ASAP Web and telnet applications, and placed every switchport into a VLAN with ACLs preventing access to any internal resources other than the Linux server running the apps, a dhcpd and a wildcard DNS server. Thus, whenever a client is plugged into an unknown switchport, they're given an IP in the "deadzone" range, and any Web site they try to visit brings up the ASAP app. They can then select the appropriate VLAN for their system, and 45 seconds later, they're fully up and running, without rebooting. *nix systems that don't run a GUI interface can also do their own VLAN assignments via the telnet application. Telnetting to the IP/hostname of the ASAP server brings up a CLI version of the Web application.
Both apps are autonomous, and can be configured independently. This permits greater modularity as well as security, since it's likely that the Web app will be used in a general corporate setting, where the telnet app will probably be used more in a lab setting. Both the Web and telnet apps log to a common MySQL database.
The apps work by determining the IP address of the connecting client, then polling each switch in turn until the IP, MAC address, switch, and switchport information is determined. Then, if the IP doesn't match a denysubnets definition, and all necessary info has been gathered, the user can select from a list of VLANs, and the app will change that specific port to the new VLAN, disable and re-enable the port (Web app only), and remove the original ARP entry from the router. With most browsers, the user will be sent to a "Please wait..." page that will refresh after 45 seconds showing that all is well. In the background, the switchport has been changed to the right VLAN, and the port disable/enable action forces Windows and Mac systems to release/renew their DHCP lease. This forces the system into the correct VLAN without requiring any user interaction or reboots. Note that the telnet app does not perform the disable/enable action though it could certainly be coded to do so.
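The gatekeeping step described above (all info gathered, IP not in a denysubnets definition) can be sketched in a few lines of Python. This is only an illustration of the logic, not the ASAP code itself: the subnet values and the function names `is_modifiable` and `can_offer_vlan_list` are hypothetical.

```python
import ipaddress

# Hypothetical deny list for illustration; the real definitions live in
# the ASAP configuration files.
DENY_SUBNETS = ["10.10.0.0/16", "192.168.50.0/24"]


def is_modifiable(client_ip, deny_subnets=DENY_SUBNETS):
    """Return True if client_ip falls outside every denied subnet."""
    ip = ipaddress.ip_address(client_ip)
    return not any(ip in ipaddress.ip_network(net) for net in deny_subnets)


def can_offer_vlan_list(client_ip, mac, switch, port):
    """Offer the VLAN list only when everything was found and the IP is allowed."""
    have_everything = all([client_ip, mac, switch, port])
    return have_everything and is_modifiable(client_ip)
```

A client in a protected range, or one whose switchport could not be located, never reaches the VLAN selection step.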
To date, this code has been used by several hundred people to change several hundred switchports, but needs testing in lots of other settings. There are probably bugs that will be triggered by older IOS/CatOS revisions, among other things.
Configuration:
Configuration is relatively manual for now. Read through the asapd.pl and index.php files to configure the application. The most important bits are obviously the switch IP/SNMP settings, denysubnet definitions, and other site information. Note that you'll have to manually pull the VLAN index numbers from your switches. Info on how to do this is in the script comments. The included asap.sql file should be imported into a new database and general access granted to a username/pass pair matching that found in the db.inc file and the asapd.pl file via mysql -u root -p < ./asap.sql and an accompanying grant all privileges on asap.* to asap@localhost identified by 'passwd'. The help.html file can be modified to show whatever help info you wish on the main app page. login.php needs to be modified with the appropriate LDAP/AD configuration matching your site. It's currently built for non-anonymous binding to a normal Windows 2003 AD server.
So there's a bit of work to do to configure the app, but if you're at all familiar with Perl, PHP, and MySQL, it shouldn't take more than a few minutes.
Troubleshooting:
There are no debugging facilities to speak of. Since real men debug with print statements, that's what you'll find. Enjoy.
If there's enough interest in this tool, I'll put more time into tightening up the configuration and reporting, and work on any bugs that might get dug up. Either way, if you're using ASAP in your network, I'd love to hear about it. You can find the code linked below.
Download asap-0.0.1.tgz
Posted by Paul Venezia on June 1, 2007 02:44 PM
May 30, 2007 | Comments: (0)
3Ware's 9650SE and the Sun Ultra 40 M2
For the past few months, I've been running a Sun Ultra 40 M2 coupled with a 3Ware 9650SE SATA RAID controller. It would seem that this is a marriage made in heaven.
As I've remarked before, the Ultra 40 M2 is simply the most powerful workstation available from a mainstream vendor today. Armed with two AMD Opteron 2218 dual-core CPUs, up to 16GB RAM, eight hot-swap SAS or SATA drive bays, two PCI-X slots, built-in 7.1 sound, S/PDIF optical input and output, a dual-layer DVD burner, and (in my case) an nVidia Quadro 5500 graphics card, this system is the creme de la creme of the workstation world. The only downside is the relatively anemic nVidia SATA RAID controller built into the mainboard. The performance of this controller isn't terrible, but the Linux driver support simply isn't there. Enter the 3Ware 9650SE.
The 3Ware 9650SE-8LPML I have running in this system is a full-on 8-port SATA RAID controller with 256MB RAM and an optional battery-backup unit. There are two four-port SATA multilane connectors on the card, which can be used to marry the 9650SE to a multilane-capable disk array, or to individual SATA drives with multi-lane to discrete cabling. In the case of the Ultra 40 M2, however, multilane to SAS cabling is needed. Fortunately, the built-in nVidia controller uses multilane connectors to feed the disk backplanes within the Ultra 40 chassis, but the included cables aren't long enough to reach the 9650SE. Sun can supply cables of appropriate length to reach the card, however.
Once I had the right cables, it was simply a matter of cable routing to each backplane connector and then back into the 9650SE. The fan tray that sits to the left of the disk bays can get in the way here, but some creative cable management within the case made everything fit and look like it was meant to be there. I placed eight 250GB SATA drives in the disk cages, and powered the system on. The 9650SE posted, found all the drives, and all was well.
I configured the eight drives into a RAID50 set, giving me high throughput on 1.36TB of usable space while providing significant fault-tolerance. The configuration through the 3Ware BIOS tools is quick and easy. Unfortunately, installing and running Fedora Core 6 (or any reasonably recent distro) on the 3Ware 9650SE isn't as simple. The 9650SE and the more recent cards from 3Ware aren't supported in the included 3w-9xxx driver found in stock 2.6 kernels. Historically, 3Ware has been extremely good at providing support for Linux and FreeBSD, so I would think that this problem will be rectified shortly, but in the interim, there are a few steps involved in getting everything working right on Fedora and RedHat. The first is to download the right install disk from 3Ware. You can find the files for just about every major distro on their site. These are just zipfiles with driver sets. Format a floppy with mformat (mformat a:), and unzip the installdisk file to the floppy. Then, boot the system as you would for a normal installation. At the boot: line, enter linux dd and the installer will prompt for a boot disk. Select the floppy drive, and it should load the appropriate driver. Continue the installation normally. On the Ultra 40 M2, I had to use a USB floppy drive, which appears as /dev/sda.
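The 1.36TB figure for the RAID50 set can be checked with some quick arithmetic. The sketch below assumes the eight drives are split into two RAID5 groups of four (the post doesn't state the grouping) and that the usable space is reported in binary units; the function name is made up for the example.

```python
def raid50_usable_bytes(drive_count, drive_bytes, groups=2):
    """Usable capacity of a RAID50 set: each RAID5 group gives up one drive to parity."""
    if drive_count % groups != 0:
        raise ValueError("drives must divide evenly into groups")
    per_group = drive_count // groups
    if per_group < 3:
        raise ValueError("each RAID5 group needs at least three drives")
    return groups * (per_group - 1) * drive_bytes


# Eight 250GB drives (decimal gigabytes, as drive vendors count them).
usable = raid50_usable_bytes(8, 250 * 10**9)
usable_tib = usable / 2**40  # convert to binary terabytes
print(f"{usable_tib:.2f}")   # prints 1.36
```

Six drives' worth of data capacity is 1.5 trillion bytes, which works out to roughly 1.36 binary terabytes.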
Following the initial boot, the system needs to be updated. Be aware that updating the kernel may result in a non-bootable system since the new kernel will not have the right driver for the 9650SE. Fortunately, it's easy to remedy this problem. Run the yum update to pull in all the new packages, including a new kernel and kernel-devel package. Then, download the upstream driver for the 2.6.19+ kernels from 3Ware's download site. Extract the driver source into a new directory, such as /usr/local/src/3w-9xxx (mkdir -p /usr/local/src/3w-9xxx; cd /usr/local/src/3w-9xxx; tar zxf /path/to/source.tgz; tar zxf ./3w-9xxx.tgz), move into the driver directory, and edit the Makefile to pull in the right kernel path. In my case, the SRC := line at the top of the Makefile should be modified to SRC := /lib/modules/2.6.20-1.2948.fc6/source/. This will tell the compiler to build the driver with the source tree of the new kernel, not the running kernel. Then, simply build the driver, place the resulting module in /lib/modules/2.6.20-1.2948.fc6/updates, and you should be all set.
Once this was done, I rsync'd 190GB to the fresh install (yes, my home directory is 190GB), and saw write throughput to the RAID50 set at around 100MB/s. Reads were slightly higher than that at 110MB/s. I've been beating up the 9650SE and the Ultra 40 M2 with my normal brand of workstation torture -- cyclic MD5 sums on multi-gigabyte files, kernel recompilations, DVD ripping, MP3 encoding, and two virtual systems running under VMware Workstation 6, all while playing movies from NFS shares and running Beryl with all the widgets enabled. Between the stellar performance of the 9650SE and the calm and steady power of the Ultra 40 M2, all of these tasks were handled with aplomb. Suffice it to say, you'd be hard-pressed to equal or surpass the performance of this box with any computing hardware available today.
As far as longevity and survivability go, the 9650SE has been running for a few months without a problem, and my several-year-old 9500 8-port SATA RAID controller has been driving a four-disk RAID5 set without a hiccup. If history is any indicator, reliability isn't an issue with 3Ware cards. I'll be posting more on this power duo as time and events warrant, but for right now, I'm a very happy guy.
Posted by Paul Venezia on May 30, 2007 10:17 AM
May 02, 2007 | Comments: (0)
Check it out: Deep into APC hardware management
I just barely finished turning up two new datacenters in two different states within two weeks. Exhausting? Definitely. On the plus side, however, I wrote several new tools and plugins to manage all of the APC gear that went into both sites with Nagios and Cacti.
First, a little background. Both datacenters were built to be nearly identical to each other -- from rack layout to equipment, to color-coded patch cabling. The major difference is that one site is cooled with APC ACSC100 In-row air units, and the other is cooled with ACRC100 In-row water-cooling units. Both sites are powered from APC Symmetra PX UPSes and PDUs, and use APC racks and 3-phase zero-U rackmount PDUs. In addition, several NetBotz WallBotz 500 units were implemented to provide external environmental monitoring and surveillance of the rooms. Basically, it's all APC gear. I'll be posting more on the build process over the next few weeks, but I wanted to get some of the code out there first.
I wrote two main plugins for Nagios and Cacti to assist in monitoring all this new stuff. The Nagios plugin checks the most pertinent data on the ACRC and ACSC units, as well as the main sensors on the NetBotz units, and the load on each phase on the PDUs. It's come in very handy since the sites were turned up, because I have an easily digested central view of all PDUs, or all AC units, on one page. Tweaking parameters on the AC units becomes very simple when you have all the data in one place, versus having to log into each unit to get status info, or even using APC's Infrastruxure Central Console.
I've released the Nagios plugin, check_apcext, and will be posting the Cacti templates soon. Here's the overview of the Nagios plugin, and a link to the NagiosExchange page. Enjoy.
Usage: ./check_apcext.pl -H <hostip> -C <community> -p <parameter> -w <warnval> -c <critval>
Parameters:
APC NetBotz
nbmstemp NetBotz main sensor temp
nbmshum NetBotz main sensor humidity
nbmsairflow NetBotz main sensor airflow
APC Metered Rack PDU (3 phase)
rpduamps Amps on each phase
APC ACSC In-Row
acscstatus System status (on/standby)
acscload Cooling load
acscoutput Cooling output
acscsupair Supply air
acscairflow Air flow
acscracktemp Rack inlet temp
acsccondin Condenser input temp
acsccondout Condenser outlet temp
APC ACRC In-Row
acrcstatus System status (on/standby)
acrcload Cooling load
acrcoutput Cooling output
acrcairflow Air flow
acrcracktemp Rack inlet temp
acrcsupair Supply air
acrcretair Return air
acrcfanspeed Fan speed
acrcfluidflow Fluid flow
acrcflenttemp Fluid entering temp
acrcflrettemp Fluid return temp
Thus, in checkcommands.cfg, place the following:
define command{
command_name check_apcext
command_line $USER1$/check_apcext.pl -H $HOSTADDRESS$ -C $ARG1$ -p $ARG2$ -w $ARG3$ -c $ARG4$
}
and in services.cfg, you'll have something similar to the following:
define service{
use generic-service
hostgroup_name acsc
service_description ACSC Status
is_volatile 0
contact_groups admins
check_command check_apcext!public!acscstatus
}
define service{
use generic-service
hostgroup_name acsc
service_description ACSC Rack Temps
is_volatile 0
contact_groups admins
check_command check_apcext!public!acscracktemp!90!95
}
... and so on, for all parameters you wish to inspect. There are two special cases:
1) ACSC and ACRC status has no warn/critical values -- it's OK if the unit is operating, and WARNING if it's on standby
2) Rack PDUs will flag as WARNING or CRITICAL if any of the three phases is beyond the threshold.
TODO:
1) NetBotz external sensor monitoring
2) Other rack PDUs (although I don't have any to test)
3) Bugfixes?
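The two special cases listed above, along with the ordinary warn/critical comparison, can be sketched in Python as follows. This is not the plugin's actual Perl; the function names are invented for illustration, and treating a reading equal to the threshold as a trip is an assumption. The 0/1/2 values are the standard Nagios plugin return codes.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL = 0, 1, 2


def threshold_state(value, warn, crit):
    """Ordinary check: compare one reading against warn/critical values."""
    # Assumption: a reading equal to the threshold counts as tripping it.
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def status_state(is_on):
    """Special case 1: OK if the unit is operating, WARNING if on standby."""
    return OK if is_on else WARNING


def pdu_state(phase_amps, warn, crit):
    """Special case 2: the worst of the three phases decides the result."""
    return max(threshold_state(amps, warn, crit) for amps in phase_amps)
```

Because the return codes are ordered by severity, taking the maximum across phases is enough to make any single overloaded phase flag the whole PDU.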
Posted by Paul Venezia on May 2, 2007 11:29 AM
March 12, 2007 | Comments: (0)
Where's Waldo? Locating the OID you need.
A few days ago I decided to write a little Cisco-centric SNMP query/modify tool. I didn't need or want anything beyond simply finding the switch and switchport a MAC or IP address was plugged into, and to be able to set that port to another VLAN, and then enable/disable the port to force the system to renew its DHCP lease. Most of the OIDs I needed were simple to find, others not so for some reason. Here's my short list:
Pull the MAC address table: .1.3.6.1.2.1.17.4.3.1.1
o- Used in conjunction with community@vlan syntax.
Pull the bridge port number table: .1.3.6.1.2.1.17.4.3.1.2
Find the ifIndex number: .1.3.6.1.2.1.17.1.4.1.2.<bridge port number>
Find the assigned VLAN: .1.3.6.1.4.1.9.9.68.1.2.2.1.2.<ifIndex>
Find the real port name: .1.3.6.1.2.1.31.1.1.1.1.<ifIndex>
Set a port to another VLAN: .1.3.6.1.4.1.9.9.68.1.2.2.1.2.<ifIndex> integer <VLAN ID>
Enable/disable a switchport: .1.3.6.1.2.1.2.2.1.7.<ifIndex> integer [ 1 = enabled | 2 = disabled ]
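The lookups in the list above chain together: MAC address, then bridge port number, then ifIndex, then port name. The Python sketch below walks that chain against plain dictionaries standing in for SNMP walk results, so it makes no network calls; the table contents and the function names are hypothetical. The one non-obvious detail is that the forwarding table is indexed by the MAC address written as six decimal octets.

```python
# OID bases from the list above.
OIDS = {
    "mac_table": ".1.3.6.1.2.1.17.4.3.1.1",
    "bridge_port": ".1.3.6.1.2.1.17.4.3.1.2",
    "if_index": ".1.3.6.1.2.1.17.1.4.1.2",
    "vlan": ".1.3.6.1.4.1.9.9.68.1.2.2.1.2",
    "port_name": ".1.3.6.1.2.1.31.1.1.1.1",
}


def mac_to_oid_suffix(mac):
    """Convert 00:11:22:aa:bb:cc into the decimal-dotted table index."""
    octets = mac.replace("-", ":").split(":")
    return ".".join(str(int(octet, 16)) for octet in octets)


def locate_port(mac, bridge_ports, if_indexes, port_names):
    """Chain the lookups: MAC -> bridge port -> ifIndex -> port name."""
    bridge_port = bridge_ports[mac_to_oid_suffix(mac)]
    if_index = if_indexes[bridge_port]
    return if_index, port_names[if_index]


# Hypothetical walk results, keyed the way the switch would index them.
bridge_ports = {"0.17.34.170.187.204": 25}
if_indexes = {25: 10101}
port_names = {10101: "Gi1/0/1"}

print(locate_port("00:11:22:aa:bb:cc", bridge_ports, if_indexes, port_names))
```

In a real tool each dictionary would be filled from a walk of the matching OID base, with the bridge tables queried once per VLAN using the community@vlan syntax.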
I'm still writing this tool, so there's sure to be more in the near future.
Posted by Paul Venezia on March 12, 2007 07:22 PM
February 07, 2007 | Comments: (0)
So I installed Vista Enterprise on a VMware Workstation instance over the weekend. After five days or so, I find that, well, it's certainly more attractive than Windows XP, and certainly more annoying with the constant barrage of confirmation dialogs, but I don't hate it. In fact, I'm liking lots of it so far.
A few brief notes, since I'm short on time:
1) It's very pretty. Still not as fluid as OS X or Beryl, to be sure, but much prettier than XP.
2) The things I really like (gadgets, searching, smooth look) are the same things I like about Mac OS X. Funny that.
3) There are going to be lots of people looking desperately for the Start and file menus. ("Oh, you can click on that circle thingy?")
4) It's huge. A default install with Office 2007 is 14GB for some reason.
5) The file system sure ain't WinFS, but it's also more fluid than XP. For instance, I originally installed it on a 16GB VMware disk, but quickly had only 500MB free. Running vmware-vdiskmanager -x 40Gb Vista.vmdk, then booting Vista, opening the Disk Manager and selecting Extend Partition did the trick on the fly, on the boot partition. Nice.
6) Some apps that work perfectly on XP bail on Vista, or refuse to install at all. It appears that they're having issues with the new security measures, but I can't be 100% sure. I can be 100% sure that it's a big pain, however.
7) I have managed to bluescreen one instance already, but that might have been due to VM issues.
8) Apple's new Get A Mac commercial is pretty much right on target. Allow. I wouldn't be surprised if in the home setting, most people disable the confirmation dialogs since they will be perceived as constant annoyances. That will mostly obviate this new security model.
9) I would definitely prefer Vista to XP for those times that I need a Windows system for work in the lab, but I'm currently running an XP instance and a Vista instance simultaneously, since I can't be guaranteed that all the apps I need will function. This will certainly hinder enterprise adoption, at least in the short term. Bummer.
10) It's very pretty.
Posted by Paul Venezia on February 7, 2007 07:16 AM
February 04, 2007 | Comments: (0)
I've been following the development of Beryl for nearly its entire life, but never got around to installing it on any of my workstations. This was primarily due to the fact that they were stable under Fedora Core 5, and I had no real reason to upgrade, until now.
Beryl is an open-source X11 window manager that is truly breathtaking in scope and execution, especially for such a young project. Beryl was forked from Novell's Compiz project originally as Quinnstorm, but only really became its own project in October of 2006, when it was officially separated from its roots. That was version 0.1.1, and in the three months since, it has been subject to a massive development effort, with the 0.2.0 release coming very soon. To say that the project has advanced substantially since October would be a massive understatement.
I first saw Beryl and compiz as a nice idea, but thought they were going to follow the same path as Enlightenment, with extreme window decorations and visual effects, but relatively unstable and a massive resource hog. Not so, at least with the 0.1.99 release I'm running now. It was a little struggle to originally implement, requiring a few relatively minor but hard-to-find changes to xorg.conf, but for a total package of around 5MB, it's astonishing in performance and integration. Basically, I'm having lots and lots of fun with it on my main workstation. Aside from a few little bugs generally relating to intelligent window placement underneath panels and such, it has been very stable and responsive. I've had extremely good luck with nVidia cards, specifically the Quadro line, but it also runs well on my Dell Latitude D800 with a GeForce 5600.
I'm definitely not easy on my workstations, with generally 300 processes running at any one time, and at least a hundred windows open -- gnome-terminals, Firefox, VMware Workstation, xmms, OpenOffice.org, GAIM, etc, etc. I've had an occasional slowdown with effects rendering on a virtual desktop with a massive window load, but generally there's no sluggishness or delay with Beryl 0.1.99. I had a v0.1.4 build of Beryl crash on me once awhile back, but it faithfully fell back to Metacity (the default Gnome window manager) and all I lost was the Beryl effects -- not any work.
InfoWorld Senior Contributing Editor Oliver Rist was here this weekend, and asked me three or four times "What video card do you have in that thing again?", in between all the "Holy crap that's cool" comments. I think he's going to be installing FC6 on a box next week. This will be a watershed event -- in the last month, I've somehow managed to get him to buy a MacBook Pro and install Linux on a workstation simply by showing him how I work. There's a lesson there.
Beryl 0.1.99 is now in Fedora Core 6 extras, so a yum install beryl-gnome should install everything necessary, and modifying the /etc/X11/xorg.conf file should be performed according to these instructions. If you just want to see what the fuss is about, check out the video.
Posted by Paul Venezia on February 4, 2007 04:00 PM
January 29, 2007 | Comments: (0)
Forcedeth issues in 2.6.19/FC6
If you happen to be running Fedora Core 6 on an AMD64 system with the nVidia MCP55 chipset, specifically the MCP55 Ethernet controllers, the update to kernel 2.6.19-1.2895 will probably break your Ethernet driver. In my minimal testing, the 0.57 forcedeth driver hangs after some number of bytes are passed through the interface, requiring an up/down to reset. This problem is not evidenced in 2.6.18-1.2798 with forcedeth 0.56, at least as far as I can tell.
The more you know.
Posted by Paul Venezia on January 29, 2007 04:11 PM
October 30, 2006 | Comments: (0)
I was planning on upgrading my Dell D800 to Fedora Core 6 at some point. I just didn't expect it to be yesterday.
Due to the fickle finger of fate, the 80GB 2.5" drive in the D800 decided to develop amnesia right in the middle of running tests on a collection of blade systems at the ANCL lab. Figures.
A quick trip to CompUSA, a new 80GB drive and an hour later, FC6 was installing. It's very pretty. A few things are relatively annoying, some traceable to Fedora/RedHat, and some to other sources. Here's my quick punchlist of problems and resolutions:
1) config.h? We don't need no stinking config.h
Yeah, it's not there. Lots of drivers and tools won't compile properly even with kernel-devel installed. Just grab a copy of config.h from a recent kernel and drop it into /lib/modules/[KERNEL REV]/build/include/linux.
2) System beep is painful, at least on this system.
I'm still unsure as to what's happening here. I just disabled it.
3) Cisco VPN Client 4.8 won't build
This is the config.h problem. See above.
4) nVidia drivers won't build
Yep, config.h problem. See above. The official nVidia drivers are significantly better than the provided xorg 'nv' driver. Well worth the effort. Also, ensure that xorg-x11-server-sdk is installed before installing the nVidia drivers. I used the x86-1.0-8776 rev without problem.
5) Repos haven't caught up yet
It'll be a little while before all your packages are in your favorite repos. I just tossed in Livna and haven't had a problem so far.
6) What do you mean 'another copy of yum is running'?
The new yum-updatesd daemon is set to run at startup and will prevent a manual yum call. chkconfig --level 345 yum-updatesd off and service yum-updatesd stop will fix that permanently. I turned it back on after I had all my packages installed, since it kinda makes sense to do so.
Other than these items, it's been pretty clean sailing today, and I've been stressing the system something fierce, though I haven't had much chance to play with the new features. I did note the fact that the Broadcom BCM4306-based 802.11b/g card was detected and configured automatically. Finally.
A great site for locating packages and tips for FC6 is Mauriat Miranda's Install Guide, which has links for a bunch of Firefox plugins, repos, et al.
Posted by Paul Venezia on October 30, 2006 11:36 PM
September 06, 2006 | Comments: (0)
Although I've run BSD-based production servers for 15 years or so, I find that I tend to get rusty since they basically just sit there doing their thing until there's a hardware failure. Being the proactive fellow that I am, I tend to fire hardware before it can quit, so I decided to take the weekend and build a FreeBSD 6.1-RELEASE server to replace one that had been running 4.9-RELEASE for years.
This server does just about everything, from handling a massive mail volume and the associated filters and virus scanning duties, to mailing lists served via mailman, to hosting over 80 domains for DNS, mail, and Web hosting. This upgrade would be major indeed, upgrading to PHP5, MySQL5, Apache 2.0, and on and on. Also, the disks in this server are standard ATA/133 PATA drives using software RAID. What follows are brief notes on my migration, some FreeBSD basics, and things I wish I'd known at the time.
Installation-
Boot from the bootonly ISO, standard install, construct your partitions on one of the drives (ad0), set the MBR, basically all the defaults, and select the Developer package set. Let the installer do its thing, but don't bother installing any specific ports yet. When it's all over, set up a local user, root password, timezone, and the like. Then, before rebooting, configure the RAID.
RAID1 with gmirror-
Before rebooting the box, type Alt-F4 and get to a shell. Type
sysctl kern.geom.debugflags=16
to remove the mount checks, and then
gmirror label -v -b round-robin gm0 /dev/ad0
will set the mirror up on ad0. Now,
echo 'geom_mirror_load="YES"' >> /boot/loader.conf
to instruct the bootloader to head for the mirror, and now, edit /etc/fstab, replacing "/dev/ad0" with "/dev/mirror/gm0" to mount the RAID device instead of the raw device on boot. If all is well, reboot. Following the initial boot, assume root and enter
gmirror insert gm0 /dev/ad2
which will make /dev/ad2 part of the gm0 mirror.
gmirror status will show you the resync status and tell you when the array has completed the rebuild. Also, gstat will show you how hard the mirror's working, and which disks are in use, measured in usecs.
If you're anything like me, one of the first things you'll do is install bonnie from ports and test the mirror's I/O. I found a particularly nasty IRQ problem this way, which resulted in 5.5MB/s writes to the mirror. Fixing that brought the performance into the 26MB/s write, 80MB/s read territory since reads are striped from each disk. Quite nice.
Installing ports
After the first boot, cvsup your ports and src trees, then install portupgrade. I found very little change to the src tree, but plenty of ports updates. I dislike the prompting in the installer to install ports directly from there -- I'd much rather do it following the first boot, though portupgrade makes life lots easier.
The FreeBSD Kernel
I wanted to run pf as the firewall, since it's the slickest firewall available on any OS. To do this, cd /usr/src/sys/i386/conf and cp GENERIC HOSTNAME, substituting the system's hostname for HOSTNAME. Use the SMP kernel file if it's a multi-CPU system.
Add these lines to the file:
device pf
device pflog
device pfsync
options ALTQ
and recompile the kernel with make buildkernel KERNCONF=HOSTNAME from the /usr/src directory. Install the kernel with make installkernel KERNCONF=HOSTNAME, and reboot.
pf
I really really like pf. The tables structure and configuration file syntax bring happiness to my heart, as does the use of variables within the config, as seen in this example:
ext_if="fxp0"
loop="lo0"
table <smtpblock> { 10.0.0.0/8 }
tcp_services = "{ 25, 53, 20, 21, 22, 80, 443, 110, 143, 993, 995 }"
udp_services = "{ 53 }"
block all
pass quick on $loop all
block drop in quick on $ext_if inet proto tcp from <smtpblock> to ($ext_if) port 25
pass in on $ext_if inet proto tcp from any to ($ext_if) port $tcp_services flags S/SA keep state
pass in on $ext_if inet proto udp from any to ($ext_if) port $udp_services
pass out quick on $ext_if inet proto { tcp, udp, icmp } all keep state
antispoof for $loop
antispoof for $ext_if
That's it, a full configuration with a table (no NAT).
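The rule order in that config matters: pf checks every rule from top to bottom and the last match wins, unless a matching rule is marked quick, in which case evaluation stops right there. That's why "block all" can sit at the top. The Python sketch below is a deliberately simplified model of the inbound rules only. It ignores state tracking, TCP flags, interfaces, and antispoof, and the Packet and Rule types are invented for the illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable


@dataclass
class Packet:
    src: str
    dport: int
    proto: str = "tcp"


@dataclass
class Rule:
    action: str                         # "pass" or "block"
    quick: bool                         # stop evaluating on match
    matches: Callable[[Packet], bool]


SMTPBLOCK = [ipaddress.ip_network("10.0.0.0/8")]
TCP_SERVICES = {25, 53, 20, 21, 22, 80, 443, 110, 143, 993, 995}
UDP_SERVICES = {53}


def in_table(addr, table):
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in table)


# Inbound rules from the example config, in the same order.
RULES = [
    Rule("block", False, lambda p: True),
    Rule("block", True,
         lambda p: p.proto == "tcp" and p.dport == 25 and in_table(p.src, SMTPBLOCK)),
    Rule("pass", False, lambda p: p.proto == "tcp" and p.dport in TCP_SERVICES),
    Rule("pass", False, lambda p: p.proto == "udp" and p.dport in UDP_SERVICES),
]


def evaluate(packet, rules=RULES):
    """Last matching rule wins; a matching quick rule ends evaluation early."""
    verdict = "pass"  # pf passes a packet that matches no rule at all
    for rule in rules:
        if rule.matches(packet):
            verdict = rule.action
            if rule.quick:
                break
    return verdict
```

SMTP from an address in the smtpblock table hits the quick block and never reaches the later pass rule, while SMTP from anywhere else falls through to it.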
Make sure that you have
pf_enable="YES"
pf_rules="/etc/pf.conf"
pf_flags=""
in /etc/rc.conf, and you're all set. Some handy pf commands are
pfctl -s rules                          Show the current rules
pfctl -sa                               Show the current rules, connection tables, and statistics
pfctl -vvsa                             Show the current rules, connection tables, and statistics, with extra verbosity
pfctl -vvsTable                         Show the currently defined table statistics
pfctl -t smtpblock -T show              Show the entries in table smtpblock
pfctl -t smtpblock -T add -f /tmp/bl    Add a list of IPs from file /tmp/bl to the table
There's much more to pf, and to this build, but it'll have to wait for another entry. I hope to have time to detail more of the migration, including gotchas encountered when moving between versions of common services, system-level changes and so on.
Suffice it to say, the server was rebuilt and put in place in a few hours' time, and I'm sitting back enjoying the knowledge that aside from minor patches, I won't have to touch it again for another few years.
Posted by Paul Venezia on September 6, 2006 11:23 AM
August 31, 2006 | Comments: (0)
I've been messing around with OpenVZ for the past few weeks and having a blast. I really liked Virtuozzo, its commercial sibling, and I find that OpenVZ lacks only the management tools available in Virtuozzo. Since I was going to be building quite a few VPSes on the host servers, I wrote some very simple shell scripts to automate the process, from the actual VPS creation to custom package installation, NIS and NFS mount configuration, and so forth. Also, I found that I wanted to be able to run several commands across all VPSes simultaneously, or in series, so there's a script that handles that.
I also decided that although the /etc/sysconfig/vz-scripts/$VPSID.mount and umount scripts are handy, in my case they would all be the same, so there's a global script for those that is run from a symlink. These were written for one specific purpose, but I imagine that there are others that could use these tools as a starting point at least, so have at it. You can download the tools here: ovztools-0.0.1.tgz
ovztools-0.0.1
This is a selection of simple bash-based OpenVZ tools, written to streamline VPS builds and configuration. They're written in bash for portability. They have been tested on Red Hat-based distributions such as Fedora and CentOS. Modifications will need to be made for other distributions.
INSTALL
Simply place all of the scripts in a directory in your path, such as /usr/local/sbin. Then move the global.* scripts to /etc/sysconfig/vz-scripts. You will need to modify variables in most of them to reflect your environment, and probably make significant modifications to vzsetup.sh.
TOOLS
mkvz -
This script condenses the creation of a VPS into a few prompts for hostname, IP address and VPSID, then builds the VPS to spec, adding packages specified in the $packages variable as well as performing post-install work via the vzsetup.sh script, such as modifying files to provide for NIS authentication and configuring various NFS mounts.
vzsetup.sh -
This is called from mkvz to perform various tasks within the new VPS, such as NIS setup and whatever else is needed.
vzdo -
This script is used to run global commands across multiple VPSes, such as installing new packages to all VPSes via vzyum.
global.mount, global.umount -
These scripts control all bind mounts to the VPSes when they're started and stopped. Rather than using a single script for each, these scripts are called via symbolic links in the /etc/sysconfig/vz-scripts directory, such as 101.mount -> global.mount. The script parses out the VPS id from $0 and performs the requisite mount --bind operations. Originally written as a single script, I broke it into two for simplicity.
vzdomount -
This script can be called manually to mount --bind any directories within a VPS.
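The symlink trick used by global.mount and global.umount relies on the script seeing the name it was invoked as, such as 101.mount, rather than its real filename. The actual tools are bash; the snippet below is a Python sketch of the same name-parsing idea, with a made-up function name.

```python
import os


def vpsid_from_script(path):
    """Split an invoked name like /etc/sysconfig/vz-scripts/101.mount into (101, 'mount')."""
    name = os.path.basename(path)
    vpsid, _, action = name.partition(".")
    if not vpsid.isdigit() or action not in ("mount", "umount"):
        raise ValueError(f"not a per-VPS mount script name: {name}")
    return int(vpsid), action


print(vpsid_from_script("/etc/sysconfig/vz-scripts/101.mount"))
```

Invoking the real filename directly fails the check, since "global" isn't a numeric VPS id, which is a cheap guard against running the script outside of its symlinks.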
As always, YMMV.
Posted by Paul Venezia on August 31, 2006 07:47 PM
August 05, 2006 | Comments: (0)
After I don't know how many years using and coding on Evan Harris's relaydelay Sendmail milter, I just recently found and fixed a small bug in the recipient whitelisting code on version 0.04. Prior to this fix, recipient whitelisting wouldn't work if the database entry for the address wasn't enclosed in <> brackets. This was due to the way that the rcpt_to variable was handled prior to the database call. The patch below yanks those out of the string if they exist, rendering normal addresses viable for recipient whitelisting.
Note that this patch was made against a heavily modified milter, so it might fail on the stock version. Just add the regex in the patch in the right place in your version.
--- relaydelay.rcpt_to.pl Sat Aug 5 12:27:25 2006
+++ relaydelay.pl Thu Aug 3 10:18:41 2006
@@ -743,6 +743,7 @@
# See if this recipient (or domain/subdomain) is wildcard white/blacklisted
# Do the check in such a way that more exact matches are returned first
if ($check_wildcard_rcpt_to) {
+ $rcpt_to =~ s/(?:<|>)//g;
my $subquery = "rcpt_to = " . $dbh->quote($rcpt_to);
my $tstr = $rcpt_domain;
while(index($tstr, ".") > 0) {
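For anyone who wants to check what the added regex does before patching, here is a rough Python equivalent of the substitution. The function name is made up for the example, and the milter itself stays in Perl.

```python
import re


def strip_angle_brackets(rcpt_to):
    """Remove every '<' and '>' so bracketed and bare addresses compare equal."""
    return re.sub(r"[<>]", "", rcpt_to)


if __name__ == "__main__":
    for address in ("<user@example.com>", "user@example.com"):
        print(strip_angle_brackets(address))
```

Both forms reduce to the same string, which is what lets a whitelist entry stored without brackets match.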
Posted by Paul Venezia on August 5, 2006 03:34 PM
July 15, 2006 | Comments: (0)
There are several things that make me cringe with most Web-based applications. Platform really doesn't matter that much, since I've seen more than my share of bad apps written in Java, JavaScript/DHTML, Shockwave or Flash; it's usually a matter of UI and functionality failings, but either one can be exacerbated by a misbehaving stack. Having just taken Gliffy for a quick spin, I can say that it doesn't seem to suffer from any of these ailments. So far, I'm incredibly impressed with the UI, the functionality is superb, and the whole application feels like a look at the next level of application delivery.
If you haven't played with Gliffy yet, take a few minutes and register. It provides 80% of the functionality of Visio without the hassle of even installing an application, exports in SVG, JPEG, and PNG formats, and is simply a great example of an extremely well executed Flash application. Bravo.
Posted by Paul Venezia on July 15, 2006 02:11 PM
June 03, 2006 | Comments: (0)
More than a few readers sent me notes asking about the availability of the code that I mentioned in last week's Enterprise Hacks feature. I wrote two pieces in that article, the first one regarding timezone settings for thin clients, and the second on using Perl, PHP, and MySQL to map Windows shares and permissions across an enterprise network. The code for the first is easy, since I posted it on my blog back in 2003.
The share mapping code will be trickier. For one, it's fairly involved, including Win32 and Linux-based perl scripts, PHP scripts, and a database schema. Also, I'll have to ensure that the code is distributable. If I can get all these pieces together for that app, I'll post it here, but be forewarned -- it was written for one specific purpose, and for one specific network. It should be easily portable to other networks, but it'll have to be screened for security information and probably dressed up a bit. Stay tuned -- I can't promise anything, but I'll try.
Posted by Paul Venezia on June 3, 2006 09:54 AM
May 30, 2006 | Comments: (0)
Just a few days ago I put the finishing touches on the homegrown Ultra40 project. Many folks had mentioned that while the Ultra40 was quite an impressive workstation, you could build one for far less. So I did.
I chose solid server-class parts, since that's really what this system required, with the obvious exception of the video card. Antec's Titan550 case seemed the best bet to fit the large Tyan Thunder K8WE mainboard, and included the TruePower2 550W power supply. Two dual-core AMD Opteron 285s fit the bill, along with two Zalman CNPS9500LED CPU coolers. Disk was handled by a pair of Western Digital 250GB SATA drives and a Sony 16x DVD-RW drive. Following this I threw in two gigs of DDR400 registered ECC RAM and a SoundBlaster Audigy2 Platinum Pro soundcard to provide the 5.1 and S/PDIF support. Completing the picture was a brand-spanking new nVidia Quadro 3500 PCI Express video card. Mix well with a fresh installation of Fedora Core 5 and serve.
I've been running this system for a few days and it's been perfectly stable and incredibly responsive. The nVidia Quadro 3500 is the newest in the Quadro line, and simply blew me away. It's driving two 21" Sun GDM-5410 CRTs at 1920x1440, and pushes glxgears at 1,200fps at 1920x1440x75. At 640x480, it's well over 6,000fps. Using the nVidia Linux x86_64 driver version 1.0-8756, I ran into a little bit of trouble, since they deprecated IgnoreEDID in favor of UseEDID, causing the card to step down to the max resolution claimed by the monitors, which is 1600x1200. That just wouldn't do. Setting Option "UseEDID" "false" cleaned that up nicely. The only other problem here was the cursor animation flicker that's a known bug in the nVidia Linux driver. Setting Option "SWCursor" "true" helped a bit, but that's not a great solution. Hopefully this will be fixed in a later rev of the driver.
The Tyan Thunder K8WE uses the nVidia chipset and is actually quite similar to the Sun Ultra40 in this respect, including the use of the nVidia SATA RAID chipset, which doesn't have native Linux support. Since I'm actually running mirrored 250GB SATA drives in the box, I opted for the Linux software mirroring, which is extremely fast -- and more responsive than the 3ware 8002 SATA RAID controller I'd been using in a previous workstation. In fact, I consistently see buffered reads from the RAID1 device at over 60MB/s. I liked the parallel CPU layout on this mainboard as well, since in the Titan550 the CPU sockets line up with the 120mm rear fan, and using the simply enormous Zalman CPU coolers, it's possible to get all the CPU heat to flow directly out the rear of the case in a straight line. I did put an 80mm intake fan in the front, which was a bit of a challenge given the front-loading nature of the Titan550 case, but the overall result was a system that runs remarkably cool. With the mainboard ambient temp at 93F, the CPUs run around 103F each, slightly higher under extreme load. The downside was that one of the Zalman fans quit about 12 hours after I installed it. It was on the rear CPU, and I didn't notice that it had happened for an hour or so. Normally, a fan dying on a CPU cooler is a recipe for disaster, but when I noticed that the fan wasn't spinning, I checked the temps; that CPU was running 118F, even without the fan running. Undoubtedly this was due to the working fan on the front CPU and the large exhaust fan right behind the bad unit, but it's still quite impressive that a passive heatsink could work so well, even if it's not supposed to be passive.
Of course one of the most dangerous tasks in building custom systems is replacing a CPU cooler. Since the thermal paste hardens after use, it's not a great idea to just wrench the heatsink off the CPU cold, so powering up the system to melt the paste somewhat generally works, but you have to hit that window perfectly between melting the paste and removing the cooler or cooking the CPU. Otherwise, you'll have a nifty Opteron 285 keychain. Added to that possibility was the fact that I had to do this while the mainboard was installed in the case, since I'd already finished building the whole system, and I wasn't about to remove everything including the mainboard to replace this part. After reapplying the paste with a foam peanut (by far the best tool for the job), I fired everything back up. The replacement cooler has been working well since it was installed, but I've been watching the fan RPMs and set gkrellm to trigger a warning notification if the RPMs on that fan drop below 2,000. While I was quite unpleased that one of the Zalman units failed almost immediately, I like them overall -- they look great and function very well in this system... so far.
The Titan550 is really a server case, so there's relatively little in the way of bells and whistles, but it's a solid platform with plenty of internal space, and the front-loading 3.5" drive bays are a nice touch: you don't have to drag hard drives across a crowded interior, since they slide out the front on rails. The only problem there is that the intake fan mounts are on the hinged door covering the disks, and the distance from the fans to the mainboard fan headers is quite far. With a few modifications to the fan and power cabling, I made it work, but you have to disconnect the fan from the mainboard to open the front cage, and retrieving the cable after this can be a pain. I do like the quiet nature of the case though, with shock-mounted rails for the disks, a quiet power supply with internal fan RPM leads, and a large 120mm rear exhaust fan. The rear fan doesn't provide a tach though, so I'll be replacing it with one that does shortly.
The installation of FC5 went smoothly and very quickly, as you might expect. Following the first boot, I installed yum repos for livna, RPMForge, FreshRPMS, dag, and ATrpms.net, took a quick list of RPMs installed on my current FC3 system with rpm -qa --qf '%{NAME}\n' > origrpms.txt and did the same on the new system. Then it was a matter of running them through comm to find the differences, a quick manual perusal of the final list, and then yum install `cat ./rpmlist.txt | tr '\n' ' '`. Within a few minutes, all my apps and their dependencies were installing, including mplayer, xmms-mp3, easytag, and so forth. What's better is that pirut, the software installation manager, searches enabled yum repos, so firing up that tool will let you search for packages across multiple repositories. Nicely done.
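The comparison step is just a set difference: packages named in the old list that the new system doesn't have yet. Here is a small Python sketch of that step, assuming one package name per line as produced by the rpm query above. The function name and newrpms.txt are hypothetical; origrpms.txt and rpmlist.txt come from the commands above.

```python
def missing_packages(old_names, new_names):
    """Return names present in old_names but absent from new_names, sorted."""
    old = {name.strip() for name in old_names if name.strip()}
    new = {name.strip() for name in new_names if name.strip()}
    return sorted(old - new)


if __name__ == "__main__":
    # newrpms.txt is a hypothetical name for the list taken on the new system.
    with open("origrpms.txt") as old_file, open("newrpms.txt") as new_file:
        todo = missing_packages(old_file, new_file)
    with open("rpmlist.txt", "w") as out:
        out.write("\n".join(todo) + "\n")
```

Unlike comm, this doesn't need sorted input. The manual read-through of the result is still worth doing before handing it to yum.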
As far as raw performance, I no longer have the Ultra40 in the lab, but I can say with certainty that this box is actually faster than the Ultra40. This is due in no small part to the Opteron 285s vs the 280s in the Ultra40, but regardless, even if it measured up equally, it's still cheaper. You forego the Sun tools, support, grid applications and so forth, but if you don't need them, you can build your own Ultra40 and save yourself the equivalent of a new MacBook Pro.
| Part | Cost |
|---|---|
| Antec Titan550 | $180 |
| Tyan Thunder K8WE | $500 |
| AMD Opteron 285 CPUs | $2,000 |
| nVidia Quadro 3500 | $1,500 |
| Western Digital 250GB SATA drive | $100 |
| Crucial 8GB DDR400 RAM | $800 |
| SoundBlaster Audigy2 Platinum Pro | $175 |
| Sony 16x DVD-RW | $50 |
| Zalman CNPS9500LED CPU cooler | $100 |
| Misc Parts | $15 |
| Total | $5,430 |
| Sun Ultra40 | $7,000 |
all prices approximate
Posted by Paul Venezia on May 30, 2006 01:50 PM
May 25, 2006 | Comments: (0)
If you have a MacBook or PowerBook with the embedded motion sensor, you have to see Erling Ellingsen's SmackBook. Desktop paging with a tap of the hand; so very cool. I normally see things like this and appreciate the inventive nature of the author, but rarely do I bother to actually implement them. This was an exception.
If you read the comments you'll find patched binaries of Desktop Manager (a great app that I've been using for eons) and some hints on getting everything working. In my case, I'm running a 1.67GHz 15" PowerBook G4 and I had to do some fiddling with the thresholds after building the patched Desktop Pager. I'm still working on getting the settings just right, but if you're having trouble, try this modified smack.pl:
#!/usr/bin/perl
use strict;
my $stable = 0;
open F,"./AMSTracker -s -u0.01 |";
while(<F>) {
    my @a = /(-?\d+)/g;
    print, next if @a != 3;
    # we get a signed short written as two unsigned bytes
    my $x = $a[0];
    if(abs($x) < 10) {
        $stable++;
    }
    if(abs($x) > 15 && $stable > 15) {
        $stable = 0;
        my $foo = $x < 0 ? 'Prev' : 'Next';
        system "./notify SwitchTo${foo}Workspace\n";
    }
}
It's a bit trying to find the line between breaking your screen hinges to shift desktops and having them switch too easily. The easiest way to gauge what's happening is to run AMSTracker -s -u0.01 > test and tap each side of the screen at an appropriate level, then take a look at the resulting values. Nice work, Erling!
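Rather than eyeballing the capture, the range of x readings can be summarized with a short script. This Python sketch assumes, as the Perl above does, that each sensor line carries exactly three integers and that the first is the x axis. The function names are invented for the example.

```python
import re


def x_readings(lines):
    """Yield the first of three integers on each sensor line; skip other lines."""
    for line in lines:
        values = re.findall(r"-?\d+", line)
        if len(values) == 3:
            yield int(values[0])


def summarize(lines):
    """Return the sample count and the extreme x values, or None if no data."""
    xs = list(x_readings(lines))
    if not xs:
        return None
    return {"samples": len(xs), "min": min(xs), "max": max(xs)}


if __name__ == "__main__":
    # 'test' is the capture file written by the AMSTracker command above.
    with open("test") as capture:
        print(summarize(capture))
```

Comparing the reported min and max for a comfortable tap against the 10 and 15 used in the script gives a starting point for new thresholds.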
Posted by Paul Venezia on May 25, 2006 03:52 PM
May 22, 2006 | Comments: (0)
The thought occurred to me the other day that it might be cool to have a gkrellm monitor on my main workstation displaying throughput on my IPCop firewall. I couldn't find a gkrellmd addon for IPCop, so I put one together. This is based on gkrellm-daemon 2.2.5 and includes the necessary glib2 2.4.7 libraries.
Instructions are in the tarball, but essentially you just
Posted by Paul Venezia on May 22, 2006 04:14 PM
April 20, 2006 | Comments: (0)
If you're looking to make IE respond like all the other browsers when it comes to forced downloads, standard Content-type and Content-disposition headers won't do the trick. However, this PHP snippet shows what will. Be sure to modify the last Content-type header to whatever MIME type the file actually is; PHP's header() replaces an earlier header of the same name unless its second argument is false, so the last Content-Type is the one that is sent.
header("Pragma: public"); // required
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Cache-Control: private",false);
header('Content-Description: File Transfer');
header('Content-Type: application/force-download');
header("Content-Type: application/download");
header("Content-Type: a-xzip-compressed");
header("Content-disposition: attachment; filename=" . $name);
Posted by Paul Venezia on April 20, 2006 12:44 PM
April 20, 2006 | Comments: (0)
Microsoft Virtual Server 2005 R2 supports Linux Guest OSes
It's official. Is this maybe the beginning of a change of tactics for Microsoft? Between this, SFU 3.5, which was released over a year ago, and the recent peeks into Microsoft's Linux labs, we might be watching Redmond switch strategies. If you can't beat them, join them, I suppose. This hurdle was definitely not a significant technical achievement, but politically, it's much more interesting.
What's next? Opening the SMB protocol? True XML support in Office? "Rivers and seas boiling" says Dr. Peter Venkman, "Human sacrifice, dogs and cats, living together... mass hysteria!"
Posted by Paul Venezia on April 20, 2006 09:37 AM
March 13, 2006 | Comments: (0)
If you haven't already seen Cacti, take a look now. I'll wait...
Back already? Great.
For what seems like eons, MRTG has been the graphing/trending tool du jour for throughput data. Tons of hacks have turned MRTG into a graphing tool for other data, such as temperature probes, disk utilization, mail statistics, ad infinitum. Tobi Oetiker's RRDTool has been around nearly as long, providing a solid round-robin database backend to store the data for just about anything that fits as a counter, an absolute, or what have you. Cacti is an extensive PHP framework around RRDTool, providing a Web GUI to add/delete/manage monitored devices and present graphs in a very elegant fashion.
I'd played with Cacti years ago when it was in its infancy. Now, it's all grown up. The newest version supports the most common data gathering tools, and has facilities for custom additions, however Byzantine the structure may be. If you want to graph router/firewall ifInOctet/ifOutOctet data, that's simple. Graphing CPU utilization from Net-SNMP hosts is equally simple, and built into the base distribution. Adding custom data is a bit of a challenge, as it requires intimate knowledge of the inner workings of Cacti, and a bit of a brain-bend on how exactly the Cacti magic happens. Make no mistake, you will need to make several attempts at custom code unless it's simple SNMP queries. The usage of non-SNMP data for per-instance queries is decidedly non-obvious.
I found that I had a need to graph FLEXlm license utilization across multiple servers in multiple locations, with multiple sub-applications that needed individual graphs. Inputting all this data by hand would have taken eons, so being lazy, I wrote a series of Cacti XML templates and perl scripts to do all this work for me.
Basically, parsing FLEXlm license daemon output is rather nasty. It's a textual adventure and slow. So I wrote a poller daemon in perl that populates a shared hash using Tie::ShareLite. The poller runs every 4.5 minutes, while the accompanying query script runs during the Cacti polling run, referencing that shared hash. The result is that gathering data on the license utilization of hundreds of individual applications is instantaneous to Cacti, and only requires a single query to each lmgrd process on a license server from the poller.
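The split between a slow poller and a fast query script can be sketched roughly as follows. This is not the posted Perl code: it swaps the Tie::ShareLite shared hash for a JSON cache file, and it assumes the license daemon's status output contains lines of the form "Users of FEATURE: (Total of N licenses issued; Total of M licenses in use)". All function names and the cache path are hypothetical.

```python
import json
import os
import re

# Assumed shape of one usage line in the license status output.
USAGE_LINE = re.compile(
    r"Users of (\S+):\s+\(Total of (\d+) licenses? issued;"
    r"\s+Total of (\d+) licenses? in use\)"
)


def parse_usage(status_text):
    """Map each feature name to its issued and in-use license counts."""
    usage = {}
    for match in USAGE_LINE.finditer(status_text):
        feature, issued, in_use = match.groups()
        usage[feature] = {"issued": int(issued), "in_use": int(in_use)}
    return usage


def write_cache(usage, path):
    """Poller side: replace the cache atomically so readers never see half a file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(usage, handle)
    os.replace(tmp_path, path)


def query_cache(path, feature, field="in_use"):
    """Query side: answer from the cache without contacting the license server."""
    with open(path) as handle:
        return json.load(handle)[feature][field]
```

The poller pays for the slow parse once per cycle, and each per-feature query during the Cacti run is only a cheap read of the cached snapshot.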
Also, I wound up writing a query script and XML template to track datacenter temperatures via Sensatronics TempTrax probes with Cacti.
All of this work has been posted to the Cacti forums. You can find the FLEXlm code here, and the TempTrax code here.
In any event, check this project out. It's better than many commercial packages, and with RRDTool 1.2, the visuals are stunning. For the first time in forever, there aren't any MRTG scripts in my crontab. Amazing.
Posted by Paul Venezia on March 13, 2006 06:08 PM
March 03, 2006 | Comments: (0)
For an upcoming article, I found that I wanted to gather some statistics on DNSBLs such as spamhaus.org, sorbs.net, and so forth. In order to get that data, I wrote a sendmail milter in perl that simply matches inbound relay IP addresses and catalogues the data in a MySQL database. So in short, this code doesn't block anything whatsoever, but will give an indication of what would be blocked by each DNSBL, and by classification within that DNSBL if provided.
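For readers unfamiliar with how a DNSBL is consulted, the lookup itself is an ordinary DNS query: the relay's IPv4 octets are reversed and prepended to the list's zone, and an answer (conventionally in 127.0.0.0/8) means the address is listed. Here is a minimal Python sketch of that mechanism. It is not taken from the milter; the function names and the zone in the usage line are placeholders.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the lookup name: reversed IPv4 octets followed by the DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def listed_code(ip, zone):
    """Return the A record the DNSBL answers with, or None if the IP is not listed."""
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return None


if __name__ == "__main__":
    # dnsbl.example is a placeholder zone, not a real list.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example"))
```

Lists that classify their entries usually encode the category in the returned address, which is what makes per-classification counts possible.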
Suffice it to say, this code hasn't been thoroughly tested. It works fine on my FreeBSD mailserver with perl 5.8.5, MySQL 4.1.1, and sendmail 8.12.9pl2 with Sendmail::Milter 0.18, but YMMV. Check the README for installation and hacking instructions.
You can download dnsblcheck-0.0.1 here.
Posted by Paul Venezia on March 3, 2006 12:30 PM
November 06, 2005 | Comments: (0)


