The UNIX family design philosophy is to have a large toolbox filled with simple tools, each of which does one thing well. Then you can plug them together into pipelines and build arbitrarily complex tools.
Someone says “Let’s read the Apache web server log and extract just the requested URLs, count how many there are of each unique URL, and list those in order from most popular down.”
The UNIX person hears ”
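As an illustration of how such a request might translate into a pipeline, here is a minimal sketch. It assumes the Apache common log format, where the requested URL is the seventh whitespace-separated field, and it uses a small made-up sample log rather than a real one. The variable names `log` and `top_urls` are only for this example.

```shell
# Hypothetical sample log in Apache common log format.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
192.0.2.2 - - [10/Oct/2023:13:55:40 +0000] "GET /about.html HTTP/1.1" 200 1200
192.0.2.1 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2326
192.0.2.3 - - [10/Oct/2023:13:57:12 +0000] "GET /index.html HTTP/1.1" 304 0
192.0.2.2 - - [10/Oct/2023:13:58:30 +0000] "GET /about.html HTTP/1.1" 200 1200
192.0.2.4 - - [10/Oct/2023:13:59:02 +0000] "GET /contact.html HTTP/1.1" 200 900
EOF

# Field 7 is the requested URL in this format (an assumption about the log).
# awk extracts it, sort groups duplicates together, uniq -c counts each
# group, and sort -rn lists the highest counts first.
top_urls=$(awk '{print $7}' "$log" | sort | uniq -c | sort -rn)
echo "$top_urls"

rm -f "$log"
```

Each stage is one simple tool doing one job, and any stage can be swapped out, for example adding `head` at the end to keep only the top few URLs.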
Each of those simple tools can be used in several ways. Let’s look at grep. The output of the tricks I’ll describe is often too large to fit into this blog format, so I’ll leave it to you to do the test driving!
The obvious point of grep is to simply output the matching lines. That’s what it does with no options. But you can ask for:
-q because you want to use it as a test within a script, seeing if $? is zero (found a match) or not
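Here is a small sketch of using -q as a test in a script. The sample file and the variable names (`conf`, `port_status`, `missing_rc`) are made up for the illustration.

```shell
# Hypothetical sample file standing in for whatever you want to test.
conf=$(mktemp)
printf 'PermitRootLogin no\nPort 22\n' > "$conf"

# -q prints nothing; only the exit status says whether a match was found.
if grep -q '^Port' "$conf"; then
    port_status="found"
else
    port_status="missing"
fi
echo "Port line: $port_status"

# The same idea, capturing $? explicitly: 0 means a match, 1 means none.
missing_rc=0
grep -q 'NoSuchSetting' "$conf" || missing_rc=$?
echo "exit status for a non-matching pattern: $missing_rc"

rm -f "$conf"
```

Putting grep directly in the `if` is usually the tidier form; reading $? afterward is handy when you need to tell "no match" (1) apart from an error such as an unreadable file (2).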
Sometimes I want to see the matching lines but I would also like to see what was immediately on either side. That’s the -C (that’s capital “C”) option, for “context”. I know that my kernel ring buffer will have one line each for the driver detections of my two Ethernet interfaces. But I want to see a line or maybe two on each side, so I can see what else that driver reported. Let’s ask for two lines of context before and after, using a pattern that still works if there are more than two Ethernet interfaces:
$ dmesg | grep -C 2 'eth[0-9]'
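To try the context option without depending on what your own kernel reports, here is a sketch using a made-up stand-in for dmesg output. The file contents and variable names are hypothetical. It also shows -B, which asks for context before the match only (-A is its counterpart for lines after).

```shell
# Hypothetical stand-in for kernel messages; real dmesg output will differ.
sample=$(mktemp)
printf '%s\n' \
    'line 1: usb hub found' \
    'line 2: pci bridge enabled' \
    'line 3: network driver loaded' \
    'line 4: eth0 link is up' \
    'line 5: filesystem mounted' \
    'line 6: audit started' \
    'line 7: random pool seeded' > "$sample"

# One line of context on each side of the match: lines 3, 4 and 5.
context=$(grep -C 1 'eth[0-9]' "$sample")
echo "$context"

# Two lines before the match and none after: lines 2, 3 and 4.
before=$(grep -B 2 'eth[0-9]' "$sample")
echo "$before"

rm -f "$sample"
```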
Richly featured desktop environments like Gnome and KDE have components that can get in your way if you accidentally start them and don’t want them. There are various indexing systems intended to be like ”Google for your desktop” but which can cause enough disk I/O to slow down system performance.
The problem is that you discover one of these using iotop, maybe it’s gam_server or who knows what else. It’s running, you can terminate it, but then it keeps getting respawned. Maybe the next time you log in, or maybe immediately.
What you need to do is trace the problem back: what is starting the thing that starts the thing that causes your problem? Then terminate the troublesome grandmother (or earlier) process. We need more than simple context to solve this. The solution is in three parts:
grep/egrep color the matches
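pstree shows the family tree of all processes. Use it with the -w option to allow wide output, otherwise it’s likely to truncate what you’re looking for. (The -w spelling comes from the standalone pstree common on Mac OS X and BSD systems; the psmisc pstree on most Linux distributions uses -l for the same purpose.)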
egrep lets you do things like this:
$ egrep 'this|that' myfile
The pipe symbol means logical “OR”, so the above would output all lines in myfile containing either this or that.
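A quick way to see the alternation at work is with a throwaway file, as in the sketch below. The file contents and variable names are made up. The example uses `grep -E`, which is the POSIX spelling of egrep and behaves the same way. For contrast it also runs plain grep, where an unescaped pipe is just a literal character.

```shell
# Hypothetical sample file.
notes=$(mktemp)
printf '%s\n' \
    'this is the first line' \
    'nothing to see here' \
    'that is the third line' > "$notes"

# Extended regular expressions: | means OR, so two lines match.
either=$(grep -E 'this|that' "$notes")
echo "$either"

# Basic regular expressions: | is literal, so grep looks for the
# seven-character string "this|that" and finds nothing.
plain=$(grep 'this|that' "$notes" || true)
echo "plain grep matched: [$plain]"

rm -f "$notes"
```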
Third, the coloring. Your distribution may have set up an alias so that grep and egrep always run as if they have the --color=auto option. If not, use it. That changes the color of the matched pattern if output is going to the terminal, but not if it is going into a file or into a pipeline to another program. Unfortunately, programs like less don’t handle the ANSI color-change characters as we would like.
Advanced note: If you are doing this on a UNIX-family OS other than Linux or Mac OS X, maybe Solaris or OpenBSD, use gegrep to get the GNU version of egrep, which supports coloring.
Every line has to have a beginning, and “^” is the regular expression meaning “beginning of a line”. Let’s assume we suspect our disk I/O problem is being caused by one of those indexing programs. We want to see what series of programs inadvertently started by our graphical login has led to the problem. Let’s put it all together:
$ pstree -w | egrep --color=auto '^|akonadi|fam|gam_server'
The output contains all of the process tree, because every line has a beginning (^). The strings akonadi, fam, and gam_server will be highlighted in bright red wherever they appear. Now you can look for red text with the scroll bars or by pressing Shift-PageUp and Shift-PageDown. Then look up the tree to trace where it came from.
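If you want to convince yourself that the “^” alternative really keeps every line, the sketch below counts lines with and without it. The process names in the sample file are made up, and `grep -c` is used only to count matching lines. Coloring is left out here because it only appears when output goes to a terminal.

```shell
# Hypothetical, simplified stand-in for pstree output.
tree=$(mktemp)
printf '%s\n' \
    'init' \
    '  login bash' \
    '  kdeinit4 akonadi_control' \
    '  gam_server' > "$tree"

# Without ^, only the lines that mention a suspect are kept.
only_count=$(grep -c -E 'akonadi|gam_server' "$tree")

# With ^ as one alternative, every line matches, so nothing is filtered out.
all_count=$(grep -c -E '^|akonadi|gam_server' "$tree")

echo "lines naming a suspect: $only_count"
echo "lines kept with ^ added: $all_count"

rm -f "$tree"
```

On a terminal, dropping -c and adding --color=auto to the second command shows the whole file with only the suspect names highlighted, since the empty match at the start of each line has nothing to color.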
Simple tools and concepts are easily combined into one very specialized tool! If you would like to learn lots more about doing this on-the-fly tool design, check out Learning Tree’s Tools and Utilities course.