Sidestepping the question of C++'s reputation

Recently, Jussi Pakkanen asked, "Does C++ still deserve the bad rap it has had for so long?" His thesis is essentially that it does not. Of course, this drew dissent, with Jason Moiron bluntly stating that "C++ Deserves Its Bad Reputation", and David Andersen coming out with "No, C++ still isn't cutting it."

The interesting thing about the two rebuttals is that they essentially say the same thing: "I can write that same sample program more concisely in $my_preferred_language." The sample program in question solves the following problem:

Find all files with the extension .txt recursively in the subdirectories of the current directory, count the number of times each word appears in them and print the ten most common words and the number of times they are used.

This sounded strangely familiar. I thought about it, and it turns out to be a slight variation of the program Donald Knuth used to demonstrate Literate Programming, which Douglas McIlroy famously countered with a six-command UNIX pipeline in the Programming Pearls column of the June 1986 issue of Communications of the ACM.

So, to sidestep the question about C++'s reputation entirely, I remind you all that the best code is the code you don't have to write, because it's already written. Here is my POSIX solution to the sample problem, a tweak of McIlroy's original:


find . -name \*.txt -exec cat {} + |            # dump contents of all .txt files
tr '[:upper:][:space:]' '[:lower:][\n*]' |      # make everything lowercase, one word per line
grep -E '^[[:lower:]]+$' |                      # only take words consisting solely of letters
sort |                                          # sort all words (obviously)
uniq -c |                                       # count unique occurrences of each word
sort -rn |                                      # sort by count of each word, descending
head                                            # print only the first 10

That is seven commands (McIlroy's six assumed the input came from stdin), and it is arguably more correct than the other examples, since it handles more than just ASCII if the system's tr and grep are sufficiently equipped. It took me less time to come up with the pipeline (even before looking up McIlroy's example for comparison) than it did to document what each step does.
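
For comparison, here is McIlroy's six-command pipeline as it is usually reproduced, wrapped in a shell function so it can be tried directly. The function name top_words is my own invention for illustration; the original was a bare script that took the number of words to print as its first argument. Treat this as a sketch for side-by-side reading rather than a verbatim quotation of the column.

```shell
# McIlroy's pipeline as commonly quoted; the wrapper name top_words is hypothetical.
# Reads text on stdin; $1 is the number of words to print.
top_words() {
    tr -cs A-Za-z '\n' |   # replace each run of non-letters with a single newline
    tr A-Z a-z |           # fold everything to lowercase
    sort |                 # bring identical words together
    uniq -c |              # count each distinct word
    sort -rn |             # sort by count, descending
    sed "${1}q"            # quit after printing $1 lines
}
```

Because it reads stdin, feeding it the .txt files takes one more command in front, for example: find . -name '*.txt' -exec cat {} + | top_words 10. Note that its A-Za-z ranges only cover ASCII letters, which is the difference the character classes in my version are meant to address.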

Copyright © 2020 Jakob Kaivo <jakob@kaivo.net>