The Linux Philosophy for SysAdmins, Tenet 18 — Find the simplicity
Author’s note: This article is excerpted in part from chapter 17 of my book, The Linux Philosophy for SysAdmins, with some changes to update the information in it and to better fit this format.
“UNIX is basically a simple operating system, but you have to be a genius to understand the simplicity.” 1
— Dennis Ritchie
I would never deign to disagree with one of the creators of Unix. However, my own perspective has evolved since I began using Unix and Linux. The tenets of the Linux Philosophy helped me solidify my understanding that Linux is simple, and that the simplicity is illuminated by the philosophy.
In this article we search for the simplicity of Linux.
Complexity in numbers
Yes, GNU/Linux is complex on the surface. One book I know of, Linux in a Nutshell 2, contains a list of 372 Linux commands. Yes, I counted them. Another book, my favorite for beginners, A Practical Guide to Linux, Commands, Editors, and Shell Programming3, covers “… 98 utilities …”.
But those numbers are trivial compared to another number I came up with. Let’s estimate the total number of commands on your Linux computer. Most of the executable files that are command line commands are located in the /usr/bin directory so counting the number of files in that directory gives a pretty good estimate. Do this as a non-root user. Determine how many executables are located in /usr/bin.
$ ls -l /usr/bin | wc -l
2633
Yup – that is a lot of commands. Of course the number you see will be different because you’ll have different programs and tools installed.
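If you want a slightly more careful count, a sketch using GNU find can skip directories, symlinks, and the summary line that ls -l prints at the top of its output:

```shell
# Count only the regular, executable files directly in /usr/bin.
# Unlike `ls -l /usr/bin | wc -l`, this excludes the "total" summary
# line and any entries that are not executable regular files.
find /usr/bin -maxdepth 1 -type f -executable | wc -l
```

The number will again differ from system to system; the point is the order of magnitude, not the exact figure.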
The test VM that I am using to create and test these experiments is a pretty basic installation with the KDE and MATE desktops and a few applications like LibreOffice. That VM has 2,633 executable Linux files, most of which are CLI commands. Those numbers seem overwhelming to someone just learning Linux. They did to me when I was just starting as a baby SysAdmin.
When I was just beginning to learn Linux, back around 1996 or 1997, I picked up a couple books about Linux – not that there were that many available back then – and discovered what seemed to me at the time an unimaginable number of commands. I thought it would be impossible for me to learn all of those commands.
I cringe when I see articles with titles like, 83 useful Linux commands4, and 50 Most Frequently Used UNIX / Linux Commands (With Examples)5. These titles imply that there are sets of commands that you must memorize, or that knowing large numbers of commands is important.
I do read many of these articles, but I am usually looking for new and interesting commands; commands that might help me resolve a problem or simplify a command line program.
Simplicity in basics
I’m not a genius, I really am not. But I am persistent. I never tried to learn all of those Linux commands, regardless of what numbers you might come up with as the total for “all.”
I just started by learning the commands I needed at any given moment for whatever project was at hand. I started to learn more commands because I took on personal projects, and ones for work, that stretched my knowledge to the limit and forced me to find commands previously unknown to me in order to complete those projects. My repertoire of commands grew over time, and as I became more proficient at applying those commands to resolving problems, I began finding jobs that paid me more and more money to play with Linux, my favorite toy.
As I learned about piping and redirection, about Standard Streams and Standard I/O, as I read about the Unix Philosophy and then the Linux Philosophy, I started to understand how and why the command line made Linux and the Core Utilities so powerful. I learned about the elegance of writing command line programs that manipulated data streams in amazing ways.
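As a small illustration of that power, here is a throwaway pipeline in the same spirit: each simple utility reads a stream on standard input, transforms it, and hands the result to the next. The sample text is, of course, arbitrary:

```shell
# Break a line of text into one word per line, then count and rank the words.
echo "the quick brown fox jumps over the lazy dog the end" \
    | tr ' ' '\n' \
    | sort \
    | uniq -c \
    | sort -rn \
    | head -3
# "the" appears three times, so it tops the ranked list.
```

None of these five programs knows anything about the others; standard streams are the only interface, and that is the whole trick.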
I also discovered that some commands are, if not completely obsolete, then seldom used and only in unusual circumstances. For this reason alone it does not make sense to find a list of Linux commands and memorize them. It is not an efficient use of your time as a SysAdmin to learn many commands that may never be needed.
The simplicity here is to learn what you need to do the task at hand. There will be plenty of tasks in the future which will require you to learn other commands. There are always methods for discovering and learning those commands when you need them. I have found that discovering and learning new commands as the need arose works very well for me. Almost any new project, including writing this book, leads to finding new commands to learn.
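When a new need does arise, one way to discover what is already installed is to let the shell enumerate everything it can run. This sketch uses the bash builtin compgen; the grep pattern is just an example:

```shell
# compgen -c lists every command name bash can resolve: builtins,
# functions, aliases, and executables found on $PATH.
# (compgen is a bash builtin, so this assumes a bash shell.)
compgen -c | sort -u | wc -l

# Narrow the list when you half-remember a name, e.g. anything
# starting with "ls":
compgen -c | sort -u | grep '^ls' || true
```

From there, the man and info pages for a promising name tell you whether it is the tool for the task at hand.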
Simple programs do one thing well
Think about the GNU/Unix/Linux utilities. What is the ls program supposed to accomplish? Its only function is to list the files contained in a directory, remembering that directories themselves are nothing more or less than files. It can do this task in a number of different ways by using one or more of its several options – or no options at all.
Without options, the ls command lists only non-hidden filenames in the current directory (PWD), fitting as many as possible on each line of output. The -l option produces a long listing that shows the permissions, size, and other data about the files in a columnar format that is easy to read. The -a option shows all files, including the hidden ones. The -R option lists files recursively, descending into each subdirectory and listing its files as well. Without an argument, the ls command lists the files in the PWD. Given a different directory path as an argument, it lists the files in that directory instead. Other variations of the argument let you list specific files.
The ls utility has a number of other interesting options and variations on the arguments that can be used with it. Read the man page for ls to see all of the possibilities.
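The behavior described above is easy to see with a scratch directory; the /tmp path here is arbitrary:

```shell
# Build a small directory tree with a hidden file and a subdirectory.
mkdir -p /tmp/lsdemo/sub
touch /tmp/lsdemo/.hidden /tmp/lsdemo/file1 /tmp/lsdemo/sub/file2

ls /tmp/lsdemo        # non-hidden entries only: file1 and sub
ls -a /tmp/lsdemo     # adds . .. and .hidden
ls -R /tmp/lsdemo     # recurses into sub, so file2 appears too
```

Every one of those invocations does the same single job – listing files – just with a different slice of the truth shown.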
Note that file globbing is handled by the shell and not by the ls command. Because the shell handles file globbing for all programs and scripts that take file names as arguments, none of those programs needs to do it. The shell expands the globs into a list of matching file names, and it is that list on which the programs and scripts operate. This, too, is simplicity. Why build file globbing into each program when it need only be in one place – the shell?
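You can watch the shell do that expansion. Give a glob to echo – a program with no file-matching code of its own – and the file names still come out, because the shell substituted them before echo ever ran. The /tmp path is arbitrary:

```shell
mkdir -p /tmp/globdemo
cd /tmp/globdemo
touch a.txt b.txt notes.md

# echo never sees "*.txt"; the shell has already replaced it with
# the sorted list of matching names.
echo *.txt    # prints: a.txt b.txt
```

This is why quoting a glob ('*.txt') changes behavior: quoting suppresses the shell's expansion, and the program receives the literal asterisk instead.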
The thing you should observe about the ls utility is that every option and every argument variation is in aid of producing a list of files. That’s it – that is all it does: list files. Therein lies its simplicity: it does one thing and it does it very well. There is no point in adding more features to this program because it does not need them.
Simplicity and the Philosophy
“At first I hoped that such a technically unsound project would collapse but I soon realized it was doomed to success. Almost anything in software can be implemented, sold, and even used given enough determination. There is nothing a mere scientist can say that will stand against the flood of a hundred million dollars. But there is one quality that cannot be purchased in this way — and that is reliability. The price of reliability is the pursuit of the utmost simplicity. It is a price which the very rich find most hard to pay.6”
– C. A. R. Hoare7, writing about the development of the programming language PL/I8 (Emphasis is mine.)
Many of the more interesting software problems I have encountered have involved the simplification of existing code – especially my own code. Adding new functions to a program increases its complexity. A quick new feature added to existing code and employed to meet a deadline increases complexity.
One of the hardest things to do is to reduce the complexity of code. But it pays dividends in the long run. One of my own programs, a Bash shell script that I had written to perform a number of tasks after a basic Fedora installation, has grown out of control more than once.
Due to the changes between Fedora releases, the needs of the program changed. The program needed to be modified to install some packages that are no longer installed during a default installation. Sometimes I needed to add code that would remove packages that were installed automatically because I did not want or need them.
Adding new code to do these things added to the complexity of the program. In some cases I added more options to be evaluated as the program initialized in order to leave my options open – as it were – with regard to the changes required to my program. Over a period of several years this program grew to be quite large with plenty of cruft. I recently took some time to use the shellcheck utility and my own eyes on the code to remove cruft – mostly unused and no longer needed procedures – that reduced the size of the code by a few hundred lines.
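Part of that cleanup pass can be automated. A minimal sketch, assuming shellcheck is installed; the script path and contents here are hypothetical stand-ins:

```shell
# Create a trivial script to inspect.
cat > /tmp/check-demo.sh <<'EOF'
#!/usr/bin/env bash
greet() { echo "hello, $1"; }
greet world
EOF

# bash -n parses the script without executing it: a cheap syntax check.
bash -n /tmp/check-demo.sh && echo "syntax OK"

# shellcheck goes much further, flagging quoting bugs, unused variables,
# and other cruft. Skip it gracefully if it is not installed.
command -v shellcheck >/dev/null && shellcheck /tmp/check-demo.sh || true
```

Neither tool replaces reading the code with your own eyes, but both catch classes of cruft that eyes skim past.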
Hardware
Hardware is an appropriate topic when discussing simplicity, too. Hardware is, after all, the engines on which our software runs.
Hardware is not particularly complex these days. There are standard motherboard sizes, ATX, Mini ATX, Micro ATX, and Extended ATX. Most desktop and tower computer cases are standardized to accept any of these sizes, except perhaps the Extended ATX.
With a little research it is possible to purchase a CPU and RAM memory DIMMs that are compatible with any standard motherboard on the market. Additional adapters such as GPUs, SATA and USB plug-in adapters, and others are facilitated by the standardized PCI Express bus common to the standard motherboards.
Power supplies are standardized and all fit in spaces specifically allotted to them. The only real difference is the total wattage they are capable of supplying. The power connectors have long been standardized, as are the voltages they supply.
USB and SATA connectors make attaching devices from hard drives to mice trivially easy and fast. Devices such as hard drives are standard sizes and fit easily in the space designed for them in today’s cases.
I did say that hardware is not especially complex these days, but that is not strictly true. On the macro level of motherboards, cases, adapters, power supplies, and so on, that is true. But each of those devices becomes more complex at the micro- and nano-levels. As the chips get smaller and more complex they contain more and more of the logic necessary to make life simpler for the end user.
Perhaps you were not around in the early ‘80s when the original IBM PC was first released. Integrated circuits (ICs) could contain only a fraction of the components that they do now, and they ran at a tiny fraction of the speeds we now take for granted, let alone those speeds attainable by the extreme overclocking crowd.
In 1981, the Intel 8088 CPU with a single core held 29,000 transistors in an area of 33 square millimeters9. The 20-core Sapphire Rapids quad-chip module, the latest of the Intel CPUs listed on the Wikipedia page in footnote 9, contains 48 billion transistors in 1,600 square millimeters.
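Those two data points make for striking arithmetic; a quick awk one-liner, using only the figures quoted above, works out the change in density:

```shell
# Transistor density: the 1981 Intel 8088 vs. a current multi-chip
# server package, using the figures from the text above.
awk 'BEGIN {
    old = 29000 / 33          # transistors per mm^2, Intel 8088
    new = 48e9  / 1600        # transistors per mm^2, Sapphire Rapids
    printf "1981: %.0f/mm^2  today: %.0f/mm^2  increase: ~%.0fx\n",
           old, new, new / old
}'
```

That is roughly a 34,000-fold increase in density in about four decades, which is the complexity that buys the simplicity discussed below.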
In those early days the ICs were simpler and had far fewer transistors. Jumper pins and DIP switches were common and confusing ways to configure the hardware. Today I can boot the computer into a BIOS configuration mode and make changes in a GUI environment. But in most cases, even this is not required as both the hardware and the operating system pretty much configure themselves.
Linux and hardware
Today’s Linux brings amazing levels of simplicity to configuring hardware. Most of the time user intervention is not required. In the past it was often necessary for the Linux user to install device drivers for some hardware. In the present, Linux almost always does all of the work for us.
The Udev daemon and its mechanisms enable Linux to identify hardware at boot time and when it is hot-plugged some arbitrary time after boot. Here’s a somewhat simplified version of what takes place when a new device is connected to the host. I stipulate here that the host system is already booted and running at multi-user.target (run level 3) or graphical.target (run level 5).
- The user plugs in a new device, usually into an external USB, SATA, or eSATA connector.
- The kernel detects this and sends a message to Udev to announce the new device.
- Based on the device properties and its location in the hardware bus tree, Udev creates a name for the new device if one does not already exist.
- The Udev system creates the device special file in /dev.
- If a new device driver is required it is loaded.
- The device is initialized.
- Udev may send a notification to the desktop so that the desktop may display a notification of the new device to the user.
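The device special files from step 4 are visible in /dev, and udevadm can watch the whole sequence happen live. A sketch – the monitor command runs until interrupted, so it is shown as a comment rather than executed:

```shell
# To watch udev events in real time while hot-plugging a device, run
# the following and press Ctrl-C to stop:
#   udevadm monitor --udev --property

# The device special files Udev manages live in /dev. In the long
# listing, a leading "c" marks a character device and "b" a block device.
ls -l /dev/null /dev/zero
```

Plug in a USB stick while udevadm monitor is running and you can read the add events, the assigned device name, and the properties the kernel reported, exactly as the numbered steps above describe.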
The overall process of hot-plugging a new hardware device into a running Linux system and making it ready is very complex – for the operating system. For the user who just wants to plug in a new device and have it work, it is very simple. For USB and SATA hard drives, USB thumb drives, keyboards, mice, printers, displays, and nearly anything else, all I need to do as a user is plug the device into the appropriate USB or SATA port and it will work.
The goal
To me, the ultimate goal is to make things as simple for the end user as possible. However, let’s not forget that we SysAdmins are also end users. I would much prefer getting actual work accomplished rather than fiddling with a new device for hours just to get it to work. That is the old way of doing things. The new way moves the complexity from the human side of the equation to the software side, and that software complexity is aided by the manifold increase in hardware complexity.
Thus our quandary: on the one hand we have been told that our programs should be simple, yet on the other that we should move complexity into the software – or get rid of it entirely – so that the user does not need to deal with it.
Reconciling this tension between complexity and simplicity is the task of both the developer and the System Administrator. The programs and scripts that we create to “automate everything” do need to be as simple as possible. But they also need to be able to perform the tasks at hand in order to simplify the end users’ tasks as much as possible.
“Computers are unreliable, but humans are even more unreliable.”
— Gilb’s Laws Of Unreliability
When you have been a SysAdmin for any length of time, the truth of the preceding quote becomes obvious. Our users will, at some point, always find a way to do something unexpected which will create more damage and havoc than anything we could possibly do in our programs and scripts. That means our objective must be to follow the basic tenets: write small programs that each do one thing well and interact using STDIO.
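A minimal example of that objective in practice: a tiny filter that does exactly one thing and communicates only through STDIO, so it composes with everything else. The script name and path are arbitrary:

```shell
# A one-job filter: upper-case whatever arrives on standard input.
cat > /tmp/upper.sh <<'EOF'
#!/usr/bin/env bash
tr '[:lower:]' '[:upper:]'
EOF
chmod +x /tmp/upper.sh

echo "make it loud" | /tmp/upper.sh    # prints: MAKE IT LOUD
```

Because it reads STDIN and writes STDOUT and does nothing else, it slots into any pipeline without knowing or caring what comes before or after it.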
The last word
“Fools ignore complexity; pragmatists suffer it; experts avoid it; geniuses remove it.”
– Alan Perlis10
- azquotes.com, http://www.azquotes.com/quote/246027?ref=unix ↩︎
- Siever, Figgins, Love & Robbins, Linux in a Nutshell 6th Edition, (O’Reilly, 2009), ISBN 978-0-596-15448-6 ↩︎
- Sobell, A Practical Guide to Linux, Commands, Editors, and Shell Programming, 3rd Edition, (Prentice Hall, 2013), ISBN 978-0-13-308504-4 ↩︎
- TechTarget.com, http://searchdatacenter.techtarget.com/tutorial/77-Linux-commands-and-utilities-youll-actually-use ↩︎
- The Geek Stuff, http://www.thegeekstuff.com/2010/11/50-linux-commands/?utm_source=feedburner ↩︎
- Wikiquote, C. A. R. Hoare, https://en.wikiquote.org/wiki/C._A._R._Hoare ↩︎
- Wikipedia, Tony Hoare, https://en.wikipedia.org/wiki/Tony_Hoare ↩︎
- Wikipedia, PL/I, https://en.wikipedia.org/wiki/PL/I ↩︎
- Wikipedia, Transistor count, https://en.wikipedia.org/wiki/Transistor_count ↩︎
- Wikipedia, Alan Perlis, https://en.wikipedia.org/wiki/Alan_Perlis ↩︎