Big Bubbles (no troubles)

What sucks, who sucks and you suck

Become a System Knowall

One message that is coming through loud and clear from the various Sun blogs is that their engineers are feeling incredibly bullish about Solaris 10, and rightly so. It’s a good sign when a product is trailed not (just) by meaningless marketese about “e-enabling IT and business paradigm integration” but by the backroom boys saying “Hey, cool stuff, look!” It’s even better when you can download a free beta and check for hot air leaks.

The Solaris Express releases demonstrate the most ballyhooed new features, such as Dynamic Tracing (dtrace) and Zones, but they also highlight the incredible amount of Cool Stuff going into Sol 10: NET-SNMP (i.e. a functional SNMP implementation at last), updated and extended GNU tools, a fileset checksumming and comparator util (BART), Mozilla (including all the required libs), Samba… Oh, and GNOME, which appears to be an attempt to replace the CDE debacle with something less distasteful. If you squint, it’s almost as nice as administering a Linux server. (Most of this stuff is under /usr/sfw/.)

Of all this, DTrace appears to be the stimulus of most engineering orgasms: the ability to safely enable debug output for any aspect of a running production system in real time, with almost no overhead. Being an aficionado of the brute pleasures of truss (no Google hits, please), I was keen to acquire DTrace skills. Unfortunately, reading kernel engineers’ blog entries about it is rather like watching a gifted pianist playing a note-perfect version of The Entertainer then saying “See? Easy! Now you try!”, when you can’t even find middle C on the keyboard (as far as I’m concerned, C is in the middle of ‘X’ and ‘V’). Being able to instrument any part of the kernel or OS is great if you’re intimately familiar with the code, otherwise it leaves you staring blankly and wondering where to stick your probes.

When in doubt, start with the Fine Manual. The DTrace guide is exceptionally fine, with the first section forming an easy tutorial and the later parts a good reference. It won’t turn you into a guru (that will require a proper book and possibly a copy of Solaris Internals), but it got me up and running. I decided to start by writing a DTrace script to trace all the open() calls in a selected process; very useful if you suspect that a program is looking for a file that isn’t there. With truss, this would simply be a matter of:
# truss -t open -f cmd …

The DTrace equivalent is:

#!/usr/sbin/dtrace -s

/* trace opened files for a named process */

syscall::open:entry,
syscall::open64:entry
/execname == $$1/
{
        printf("%d %s = ", pid, copyinstr(arg0));
}

syscall::open:return,
syscall::open64:return
/execname == $$1 && arg1 != -1/
{
        printf("%d\n", arg1);
}

syscall::open:return,
syscall::open64:return
/execname == $$1 && arg1 == -1/
{
        printf("Err#%d\n", errno);
}

Run this with:
# ./openfiles.d process name
(To trace by process ID instead, replace “execname == $$1” with “pid == $1”.) The only difference in the output is that DTrace can’t (yet?) translate errno values into text. However, the DTrace version, while harder to comprehend, has several advantages:

* It can be left running to pick up all instances of the named program as they execute; truss relies on either running the program yourself or grabbing the PID of any instances as they run.
* It has less overhead on the process than truss, which traces the entire thread of execution.
* It’s extensible; you can edit it to take other actions, focus on particular arguments or exit values (e.g. ENOENT), combine it with other trace criteria to narrow things down, etc.
* While not evident in this noddy example, there are a hell of a lot more things that DTrace can do that truss cannot.
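
On the errno point: until DTrace learns to do the translation itself, one workaround is to pipe the script's output through a small filter that expands each “Err#N” token. What follows is only a rough sketch, not anything from the DTrace guide. The file name errno_names.py and the function annotate() are my own inventions, and the filter uses the errno numbering of whichever machine runs Python, so it should run on the same host as the trace.

```python
import errno
import os
import re
import sys

# Matches the "Err#<n>" token printed by the failure clause of the D script.
ERR_TOKEN = re.compile(r"Err#(\d+)")


def annotate(line):
    """Return line with every Err#<n> expanded to its errno name and message.

    annotate() is a hypothetical helper for this sketch. The lookup uses the
    errno table of the machine running Python, which is an assumption: the
    numbers only mean the same thing on the host where the trace was taken.
    """
    def expand(match):
        code = int(match.group(1))
        name = errno.errorcode.get(code, "unknown")
        return "Err#%d %s (%s)" % (code, name, os.strerror(code))

    return ERR_TOKEN.sub(expand, line)


if __name__ == "__main__":
    # Filter mode: copy stdin to stdout, annotating failures as they appear.
    for raw_line in sys.stdin:
        sys.stdout.write(annotate(raw_line))
```

Assuming the filter is saved as errno_names.py, it would sit on the end of the pipeline:
# ./openfiles.d process name | python errno_names.py
Lines for successful opens pass through untouched; only the Err# lines grow a symbolic name and message.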

That said, I can see that DTrace’s appeal will mainly be to two groups: Sun’s support engineers will love it, as they can develop scripts to pinpoint the cause of obscure customer issues very quickly; and performance consultants should find that DTrace knowledge gives them a competitive edge. I suspect that most other users will find themselves only using it to run third-party scripts, like Brendan Gregg’s excellent utilities, at least until the O’Reilly DTrace book or its equivalent comes out.

One other interesting tidbit from the Sun blogs: apparently, the linker patches for all the currently supported Solaris versions are always built from the latest source. So installing the latest ld patch for Solaris 8, for example, gives you many of the features in the Solaris 10 linker. (One trusts that Sun use good source code management to avoid release-dependent incompatibilities!)

Other bubbles

* BigAdmin DTrace page, including those scary blogs.