Tuesday, April 29, 2008

Bash Tip: Reverse Sorting Lists in the Shell

Every once in a while I check my site logs and find a common search phrase in referrals from search engines. Often the visitors appear to leave immediately, presumably because the page they landed on didn't answer their question.

A phrase that has been appearing quite frequently lately is "bash reverse sort list".

I can't tell exactly what they mean by their search query, so we'll take a couple of cracks at it.

My first thought is that they might be looking to reverse the output of the command line tool ls.

Say we have a directory and we see the following files when we run ls:


a aa ab b bd be c cqw cw d de defg h hi hij hijk i ii iii ij


To get a simple reverse listing of those files, we should use the -r switch for ls. Typing ls -r in the same directory yields:


ij iii ii i hijk hij hi h defg de d cw cqw c be bd b ab aa a
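If you'd like to try this without touching a real directory, here is a minimal sketch. It assumes a scratch directory from mktemp, and the names demo_dir and reverse_listing are mine, purely for illustration. I pin LC_ALL=C so the ordering doesn't depend on your locale.

```shell
# Build a scratch directory holding the same file names used above.
demo_dir=$(mktemp -d)
( cd "$demo_dir" && touch a aa ab b bd be c cqw cw d de defg h hi hij hijk i ii iii ij )

# List the names in reverse order. LC_ALL=C pins the collation order so the
# result does not depend on the local locale settings.
reverse_listing() {
    ( cd "$1" && LC_ALL=C ls -r )
}

reverse_listing "$demo_dir"
```

In a terminal the names come out in columns or on one line, depending on the window width; when the output is captured or piped, ls prints one name per line.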


Having your file list reversed in a horizontal line isn't always useful when you are looking for a vertical list. It just takes a little bit of extra work to turn your list on its side if that's what you need.

First, we'll use the -l switch of ls to show the long listing for the files. Typing ls -lr gives us a reverse listing of our files in a vertical format.


total 0
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 ij
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 iii
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 ii
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 i
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 hijk
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 hij
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 hi
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 h
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 defg
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 de
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 d
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 cw
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 cqw
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 c
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 be
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 bd
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 b
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 ab
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 aa
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 a


That's a bit more verbose than what I expect we're looking for, so we'll need to employ a couple more tools to trim away the fat.

All we want is the last column (8, if you consider the columns delimited by spaces) of information. This is where the cut command comes in handy. It does exactly what the name implies by slicing and dicing data in multiple handy ways.

By default, the cut command treats data as fields separated by tabs. By sending the output of our ls -lr command as input to the cut command while changing the delimiter to a single space with the -d switch, we can filter out all but the 8th column. So far our command looks like this, ls -lr | cut -d" " -f8, and our output looks like this:



ij
iii
ii
i
hijk
hij
hi
h
defg
de
d
cw
cqw
c
be
bd
b
ab
aa
a


Almost perfect. However, you'll notice one small problem at the top of the list: there's an extra blank line. If you look at the original output of ls -lr, it quickly becomes clear where the blank line came from. The total 0 line in the original output had only two fields, total and 0, leaving nothing but a blank when cut went looking for the eighth field.

It's not too difficult to clean this up with a little creative application of the grep command. We'll use grep's -v (inverse match) switch, otherwise known as "show me everything but", with a pattern that matches an empty line: a beginning, represented by the caret (^) symbol, immediately followed by an end, represented by the dollar sign ($), with nothing in between. That gives us grep -v ^$.

Putting it all together as ls -lr | cut -d" " -f8 | grep -v ^$ successfully removes the blank line from our vertical reverse sorted list of files.


ij
iii
ii
i
hijk
hij
hi
h
defg
de
d
cw
cqw
c
be
bd
b
ab
aa
a
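The cut and grep steps can be tried on their own against a saved copy of the listing, which sidesteps the fact that the date format of ls -l differs between systems (so the file name is not always field 8). This is only a sketch: the listing variable and the names_only function are hypothetical names of mine, and the sample lines are copied from the long listing earlier in the post.

```shell
# A few lines copied from the ls -lr output, saved in a variable so the
# pipeline can be tried without depending on the local ls date format.
listing='total 0
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 ij
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 iii
-rw-r--r-- 1 jjones jjones 0 2007-05-07 21:23 ii'

# Keep only the 8th space-delimited field, then drop the empty line that the
# "total 0" row leaves behind. Quoting '^$' keeps the shell from touching it.
names_only() {
    cut -d" " -f8 | grep -v '^$'
}

printf '%s\n' "$listing" | names_only
```

Quoting the pattern as '^$' isn't strictly required in bash, but it is a safe habit for any pattern containing characters the shell might interpret.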


Another list you might like to sort is one contained in a file. ls isn't going to help us with this one, but the sort command is here to help.

By default, sort compares entire lines and orders them according to the current locale's collation rules. Take an example file (sort.txt) containing the following:


a
b
bd
hij
be
aa
cqw
ab
c
cw
d
de
iii
defg
h
hi
hijk
i
ii
ij


Running sort against sort.txt (sort sort.txt) results in:


a
aa
ab
b
bd
be
c
cqw
cw
d
de
defg
h
hi
hij
hijk
i
ii
iii
ij


The sort command also offers a reverse sort option through the -r switch. Running sort -r against sort.txt (sort -r sort.txt) results in:


ij
iii
ii
i
hijk
hij
hi
h
defg
de
d
cw
cqw
c
be
bd
b
ab
aa
a
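Here is a small sketch of both forms side by side. It assumes a scratch file created with mktemp as a stand-in for sort.txt (list_file is a name I made up), and it sets LC_ALL=C so the comparison is plain byte order regardless of your locale.

```shell
# Write a small unsorted list to a scratch file (a hypothetical stand-in
# for sort.txt).
list_file=$(mktemp)
printf '%s\n' b hij a cw aa > "$list_file"

# Ascending order; LC_ALL=C makes the comparison plain byte order.
LC_ALL=C sort "$list_file"

# Descending order: the same comparison, reversed by -r.
LC_ALL=C sort -r "$list_file"
```

Neither command modifies the file itself; both write the sorted lines to standard output.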


I hope this answers some of the basic questions about reverse sorting lists. For more information check out the manual pages for ls and sort (man ls and man sort).

However, you might just want your list flipped on its head, with no sorting whatsoever. Say you have the list:


a
d
c
b


You want it to look like this:


b
c
d
a


As it turns out, there is a command just for that purpose called tac. Where cat will concatenate the contents of a file to the screen (standard output), tac will do the same after reversing the order of the file's lines.
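To see the difference between flipping and reverse sorting, here is a quick sketch using the four-item list from above. It assumes GNU coreutils is installed (tac is not part of every Unix), and the items variable is just an illustrative name.

```shell
# The unsorted list from the text, one item per line.
items=$(printf '%s\n' a d c b)

# tac only flips the line order; nothing is compared.
printf '%s\n' "$items" | tac

# sort -r, by contrast, reorders the lines into descending order.
printf '%s\n' "$items" | LC_ALL=C sort -r
```

The first command prints b, c, d, a, while the second prints d, c, b, a.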

Take the text of the 1st Amendment to the US Constitution, for example.


Congress shall make no law respecting an establishment of religion,
or prohibiting the free exercise thereof;
or abridging the freedom of speech,
or of the press;
or the right of the people peaceably to assemble,
and to petition the Government for a redress of grievances.


Running tac against these lines completely reverses their order:


and to petition the Government for a redress of grievances.
or the right of the people peaceably to assemble,
or of the press;
or abridging the freedom of speech,
or prohibiting the free exercise thereof;
Congress shall make no law respecting an establishment of religion,


If we had used sort instead, the lines would have been put in alphabetical order rather than simply flipped:


and to petition the Government for a redress of grievances.
Congress shall make no law respecting an establishment of religion,
or abridging the freedom of speech,
or of the press;
or prohibiting the free exercise thereof;
or the right of the people peaceably to assemble,



If your list isn't vertical with items separated by a newline, you can use tac's -s switch, similar to cut's -d switch, to identify a different separator.
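As a rough sketch of the -s switch, assuming GNU tac and a comma-separated list of my own invention (csv_items and reversed are illustrative names): tac treats the separator as belonging to the end of each item, so the example below ends every item, including the last, with a comma.

```shell
# A comma-separated list where every item, including the last, ends with
# a comma.
csv_items='a,d,c,b,'

# -s , tells tac to treat the comma as the record separator instead of
# the newline.
reversed=$(printf '%s' "$csv_items" | tac -s ,)
printf '%s\n' "$reversed"
```

If the last item has no trailing separator, the output will not be separated cleanly, so it's worth checking your input first.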

Update 1: A helpful reader pointed out that the ls examples could be a lot smaller with the application of the -1 switch to the ls command. This switch tells the standard ls command to print one file per line. When combined with the reverse switch, -r, we get a reverse list of files in a vertical layout as opposed to the standard horizontal one.

In the end, just typing


ls -1r


will result in this list of files


a aa ab b bd be c cqw cw d de defg h hi hij hijk i ii iii ij


being printed like this


ij
iii
ii
i
hijk
hij
hi
h
defg
de
d
cw
cqw
c
be
bd
b
ab
aa
a
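A short sketch of the shorter form, again in a scratch directory (one_per_line_dir is a made-up name, and LC_ALL=C is only there to pin the ordering). It also shows a related behavior: when its output goes to a pipe rather than a terminal, ls prints one name per line on its own.

```shell
# Scratch directory with a handful of the file names from the article.
one_per_line_dir=$(mktemp -d)
( cd "$one_per_line_dir" && touch a aa ab b )

# -1 forces one name per line, -r reverses the order.
( cd "$one_per_line_dir" && LC_ALL=C ls -1r )

# When the output goes to a pipe instead of a terminal, ls already prints
# one name per line, so this gives the same result.
( cd "$one_per_line_dir" && LC_ALL=C ls -r | cat )
```

Both commands print b, ab, aa, a, one per line.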


Update 2: It turns out that I didn't cover how to reverse sort a horizontal line. Since it is a little long, you can check out my solution in this post, Bash Tip: Reverse Sorting Lists Revisted; Reversing a Horizontal List.

--

I hope these tips help everyone out. If you want more resources on shell scripting, I highly recommend Unix Shell Programming (3rd edition) by Stephen Kochan and Patrick Wood. Kochan and Wood do a very thorough job (using plenty of code examples) of exploring various aspects of essential shell scripting tools and techniques.

2 comments:

Prasinos said...

Although "ls -1" solves your original problem, I would like to point out a flaw in your original solution.

Filenames may contain spaces or other "strange" characters. "cut -f8" will truncate such a filename. You should use "cut -f8-" to cut all the parts of the name (very brittle) or "ls -Q" that puts quotes around names. Even better is to use find and its "-print0" option.

Not accounting for the possibility of spaces in file names is one of the most common errors in shell scripting.

Jim said...

@prasinos,

Thanks for the catch!

I will put it on my list of things to fix.