What is an operating system?
The operating system manages the operation of the computer and of the application software which runs on the computer.
The definition
A simple definition of an operating system is that it is a program, much like any other program. It is different only in that its primary function is to manage the movement of data in the computer. This definition refers specifically to the kernel of the operating system.
The operating system kernel manages access to the hardware devices of the computer by utility and application programs. The operating system also manages system services such as memory allocation (the assignment of specific virtual memory locations to programs when they request memory), the movement of data from storage devices into memory where it can be accessed by the CPU, communications with other computers and devices via the network, display of data in text or graphic format on the display, printing, and much more.
The Linux kernel provides an API – application programming interface – for other programs to use in order to access the kernel functions. For example, a program that needs to have more memory allocated to its data structures uses a kernel function call to request that memory. The kernel then allocates the memory and notifies the program that the additional memory is available.
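As a minimal sketch of such a request, assuming Python 3 is available, the standard mmap module can ask the kernel for an anonymous block of memory. The function name allocate_anonymous_memory is made up for this illustration; most programs reach the kernel indirectly through library routines rather than calling it this directly.

```python
import mmap


def allocate_anonymous_memory(size_bytes: int) -> mmap.mmap:
    """Ask the kernel for a block of memory that is not backed by any file."""
    # A file descriptor of -1 requests an anonymous mapping; the kernel
    # decides where in the process's address space the block will live.
    return mmap.mmap(-1, size_bytes)


region = allocate_anonymous_memory(mmap.PAGESIZE * 4)
region[:5] = b"hello"   # the memory is usable as soon as the call returns
print(len(region), bytes(region[:5]))
region.close()          # hand the memory back to the kernel
```

The program never chooses an address itself; it only states how much memory it needs and the kernel does the rest.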
The Linux kernel also manages access to the CPUs as computing resources. It uses a complex algorithm to determine which processes are allocated some CPU time, when, and for how long. If necessary, the kernel can interrupt a running program in order to allow another program to have some CPU time.
An operating system kernel like Linux can do little on its own. It requires other programs – utilities – to perform basic functions such as creating a directory on the hard drive, accessing that directory, creating files in it, and then managing those files. These utility programs perform functions like creating files, deleting files, copying files from one place to another, setting display resolution, and complex processing of textual data. We will cover the use of many of these utilities as we proceed through this book.
Typical operating system functions
Any operating system has a set of core functions which are the primary reason for its existence. These are the functions that enable the operating system to manage itself, the hardware on which it runs, and the application programs and utilities that depend upon it to allocate system resources to them:
- Memory management
- Managing multitasking
- Managing multiple users
- Process management
- Interprocess communication
- Device management
- Error handling and logging
Let’s look briefly at these functions.
Memory management
Linux and other modern operating systems use advanced memory management strategies to virtualize real memory – random access memory (RAM) [1] and swap memory (disk) – into a single virtual memory space which can be used as if it were all physical RAM. Portions of this virtual memory [2] can be allocated by the memory management functions of the kernel to programs that request memory.
The memory management components of the operating system are responsible for assigning virtual memory space to applications and utilities, and for translation between virtual memory spaces and physical memory. The kernel allocates and deallocates memory and assigns physical memory locations based upon requests, either implicit or explicit, from application programs. In cooperation with the CPU, the kernel also manages access to memory to ensure that programs only access those regions of memory which have been assigned to them. Part of memory management includes managing the swap partition or file and the movement of memory pages between RAM and the swap space on the hard drive.
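On a Linux host, the kernel reports how much RAM and swap space it is managing through the /proc/meminfo file. The following is a small sketch, assuming Python 3.9 or later; the helper name parse_meminfo is invented for this example, and the file is only read if it exists.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Convert /proc/meminfo-style text into {field name: numeric value}.

    Most fields are reported in kB; a few, such as HugePages_Total,
    are plain counts with no unit.
    """
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


meminfo = Path("/proc/meminfo")
if meminfo.exists():  # this file is only present on Linux
    info = parse_meminfo(meminfo.read_text())
    for key in ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree"):
        print(key, info.get(key), "kB")
```

The MemTotal and SwapTotal values together give a rough idea of how much real storage stands behind the virtual memory space.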
Virtual memory eliminates the need for the application programmer to deal directly with memory management because it provides a single virtual memory address space for each program. It also isolates each application’s memory space from that of every other, thus making the program’s memory space safe from being overwritten or viewed by other programs.
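This isolation can be observed with a short experiment. The sketch below assumes Python 3 on a Unix-like system where os.fork is available; the function name is made up for the example. A child process starts with a copy of the parent's data, changes its copy, and reports back through a pipe, while the parent's copy stays as it was.

```python
import os


def child_change_is_invisible_to_parent() -> bool:
    """Return True if a change made in a child process leaves the parent untouched."""
    data = ["original"]
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: it has its own copy of `data` in its own memory space.
        try:
            os.close(read_end)
            data[0] = "changed by child"
            os.write(write_end, data[0].encode())
        finally:
            os._exit(0)
    # Parent process: read what the child saw, then check our own copy.
    os.close(write_end)
    child_view = os.read(read_end, 1024).decode()
    os.close(read_end)
    os.waitpid(pid, 0)
    return child_view == "changed by child" and data[0] == "original"


print("Parent memory unaffected:", child_change_is_invisible_to_parent())
```

The only way the two processes exchange anything here is through the pipe, which is itself a service provided by the kernel.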
Multitasking
Like most modern operating systems, Linux can multitask. That means that it can manage two, three, or hundreds of processes at the same time. Part of process management is managing multiple processes that are all running on a Linux computer.
I usually have several programs running at one time such as LibreOffice Writer which is a word processor, an email program, a spreadsheet, a file manager, a web browser, and usually multiple terminal sessions in which I interact with the Linux command-line interface (CLI). Right now, as I write this sentence, I have multiple documents open in several LibreOffice Writer windows. This enables me to see what I have written in other documents and to work on multiple documents at the same time.
But those programs usually do little or nothing until we give them things to do by typing words into the word processor or clicking an email to display it. I also have several terminal emulators running and use them to log in to various local and remote computers that I manage and for which I am responsible.
Linux itself always has many programs running in the background – called daemons – programs that help Linux manage the hardware and other software running on the host. These programs are usually not noticed by users unless we specifically look for them. Some of the tools you will learn about in this course can reveal these otherwise hidden programs.
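As a rough illustration of where such tools get their information, the sketch below walks the /proc directory, in which a Linux kernel exposes one numbered subdirectory per running process. It assumes Python 3.9 or later; the function name list_processes is invented for this example, and on a system without /proc it simply returns an empty list.

```python
from pathlib import Path


def list_processes(proc_root: str = "/proc") -> list[tuple[int, str]]:
    """Return (PID, command name) pairs for the processes visible under proc_root."""
    processes = []
    root = Path(proc_root)
    if not root.is_dir():
        return processes  # not a Linux host, or /proc is not mounted
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # only the numeric directories represent processes
        try:
            name = (entry / "comm").read_text().strip()
        except OSError:
            continue  # the process ended while we were looking at it
        processes.append((int(entry.name), name))
    return sorted(processes)


for pid, name in list_processes()[:10]:
    print(pid, name)
```

On a typical desktop this list contains far more entries than the handful of programs visible on the screen; many of the rest are daemons.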
Even with all of its own programs running in the background and users’ programs running, a modern Linux computer uses only a few compute cycles and spends most of its CPU cycles waiting for things to happen. Linux can download and install its own updates while performing any or all of the preceding tasks simultaneously – without the need for a reboot. Wait – what?! That’s right. Linux does not always need to reboot before, during or after installing updates or when installing new software. After a new kernel or glibc (the GNU C Library) is installed, however, you may wish to reboot the computer to activate it, but you can do that whenever you want; you are not forced to reboot multiple times during an update or even to stop doing your work while the updates are installed.
Multiuser
The multitasking functionality of Linux extends to its ability to host multiple users – tens or hundreds of them – all running the same or different programs at the same time on one single computer.
Multiuser capability means a number of different things. First, it can mean a single user who has logged in multiple times via a combination of the GUI desktop interface and via the command line using one or more terminal sessions. Second, multiuser means just that – many different users logged in at the same time, each doing their own thing, and each isolated and protected from the activities of the others. Some users can be logged in locally and others from anywhere in the world with an Internet connection if the host computer is properly configured.
The role of the operating system is to allocate resources to each user and to ensure that any tasks, that is, processes, they have running have sufficient resources without impinging upon the resources allocated to other users.
Process management
The Linux kernel manages the execution of all tasks running on the system. The Linux operating system is multitasking from the moment it boots up. Many of those tasks are the background tasks required to manage a multitasking and – for Linux – a multiuser environment. These tasks take only a small fraction of the CPU resources available on even modest computers.
Each running program is a process. It is the responsibility of the Linux kernel to perform process management [3].
The scheduler portion of the kernel allocates CPU time to each running process based on its priority and whether it is capable of running. A task which is blocked – perhaps it is waiting for data to be delivered from the disk, or for input from the keyboard – does not receive CPU time. The Linux kernel will also preempt a lower priority task when a task with a higher priority becomes unblocked and capable of running.
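The effect of blocking can be measured from inside a program. In this sketch, which assumes Python 3, a call to time.sleep stands in for waiting on the disk or the keyboard; the function name is made up for the example. The wall-clock time advances while the process is blocked, but the CPU time charged to the process barely moves.

```python
import time


def cpu_and_wall_time_while_sleeping(seconds: float) -> tuple[float, float]:
    """Return (CPU seconds used, wall-clock seconds elapsed) across a sleep."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    time.sleep(seconds)  # the process is blocked and is not scheduled on a CPU
    cpu_used = time.process_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return cpu_used, wall_elapsed


cpu, wall = cpu_and_wall_time_while_sleeping(0.5)
print(f"CPU time used: {cpu:.4f} s, wall-clock time elapsed: {wall:.4f} s")
```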
In order to manage processes, the kernel creates data abstractions that represent each process. Part of the data required is the set of memory maps that define the memory allocated to the process and whether it holds data or executable code. The kernel maintains information about the execution status such as how recently the program had some CPU time, how much time it had, and a number called the “nice” number. It uses that information and the nice number to calculate the priority of the process. The kernel uses the priority of all of the processes to determine which process(es) will be allocated some CPU time.
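A process can read and raise its own nice number. The following sketch assumes Python 3 on a Unix-like system; the function name is invented for the example. It forks a child so that the change does not affect the program running the example, and it only raises the nice number, because lowering it normally requires administrative privileges.

```python
import os


def niceness_after_increment(increment: int) -> tuple[int, int]:
    """Return (parent nice number, child nice number after adding increment)."""
    parent_nice = os.nice(0)  # an increment of 0 just reports the current value
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: make ourselves "nicer", i.e., lower our own priority.
        try:
            os.close(read_end)
            child_nice = os.nice(increment)
            os.write(write_end, str(child_nice).encode())
        finally:
            os._exit(0)
    os.close(write_end)
    child_nice = int(os.read(read_end, 16).decode())
    os.close(read_end)
    os.waitpid(pid, 0)
    return parent_nice, child_nice


parent, child = niceness_after_increment(5)
print("parent nice number:", parent, "child nice number:", child)
```

A higher nice number means the process is being "nicer" to others, so the kernel gives it a lower priority when CPU time is scarce.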
Note that not all processes need CPU time simultaneously. In fact, for most desktop workstations in normal circumstances, usually only two or three processes at the most need to be on the CPU at any given time. This means that a simple quad-core processor can easily handle this type of CPU load.
If there are more programs – processes – running than there are CPUs in the system, the kernel is responsible for determining which process to interrupt in order to replace it with a different one that needs some CPU time.
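The kernel keeps a count of how often it has switched a process off the CPU. As a small sketch, assuming Python 3 on a Unix-like system, the standard resource module exposes those counters for the current process; the function name context_switch_counts is made up for this example.

```python
import os
import resource


def context_switch_counts() -> tuple[int, int]:
    """Return (voluntary, involuntary) context switches for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # Voluntary: the process gave up the CPU itself, e.g., to wait for I/O.
    # Involuntary: the kernel interrupted it so that another process could run.
    return usage.ru_nvcsw, usage.ru_nivcsw


print("CPUs in this system:", os.cpu_count())
voluntary, involuntary = context_switch_counts()
print("voluntary switches:", voluntary, "involuntary switches:", involuntary)
```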
Interprocess communication
Interprocess communication (IPC) is vital to any multitasking operating system. Many programs must be synchronized with each other to ensure that their work is properly coordinated. Interprocess communication is the tool that enables this type of inter-program cooperation.
The kernel manages a number of IPC methods. Shared memory is used when two tasks need to pass data between them. The Linux clipboard is a good example of shared memory. Data which is cut or copied to the clipboard is stored in shared memory. When the stored data is pasted into another application, that application looks for the data in the clipboard’s shared memory area.
Named pipes can be used to communicate data between two programs. Data can be pushed into the pipe by one program and the other program can pull the data out of the other end of the pipe. A program may collect data very quickly and push it into the pipe. Another program may take the data out of the other end of the pipe and either display it on the screen or store it to the disk, but it can handle the data at its own rate.
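A named pipe can be sketched in a few lines. This example assumes Python 3.9 or later on a Unix-like system; the function name and the pipe name demo_fifo are invented for the illustration. To keep it self-contained, a second thread stands in for the program that pushes data into the pipe, where in practice the two ends would normally belong to two separate programs.

```python
import os
import tempfile
import threading


def send_through_named_pipe(messages: list[str]) -> list[str]:
    """Push messages into a named pipe and return what comes out the other end."""
    with tempfile.TemporaryDirectory() as workdir:
        fifo_path = os.path.join(workdir, "demo_fifo")
        os.mkfifo(fifo_path)  # the kernel creates the pipe as a special file

        def producer() -> None:
            # Opening for writing blocks until a reader opens the other end.
            with open(fifo_path, "w") as pipe:
                for message in messages:
                    pipe.write(message + "\n")

        writer = threading.Thread(target=producer)
        writer.start()
        # The consumer reads at its own rate until the writer closes the pipe.
        with open(fifo_path) as pipe:
            received = [line.rstrip("\n") for line in pipe]
        writer.join()
        return received


print(send_through_named_pipe(["first", "second", "third"]))
```

Neither side needs to know anything about the other beyond the name of the pipe; the kernel handles the buffering and the synchronization.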
Device management
The kernel manages access to the physical hardware through the use of device drivers. Although we tend to think of this in terms of various types of storage devices, it also manages other Input/Output (I/O) devices such as the keyboard, mouse, display, printers, and so on. This includes management of pluggable devices such as external USB and eSATA storage devices.
Access to physical devices must be managed carefully or more than one application might attempt to control the same device at the same time. The Linux kernel manages devices so that only one program actually has control of or access to a device at any given moment. One example of this is a COM port [4]. Only one program can communicate through a COM port at any given time. If you are using the COM port to get your e-mail from the Internet, for example, and try to start another program which attempts to use the same COM port, the Linux kernel detects that the COM port is already in use. The kernel then uses the hardware error handler to display a message on the screen that the COM port is in use.
For managing disk I/O devices, including USB, parallel and serial port I/O, and file system I/O, the kernel does not actually handle physical access to the disk, but rather manages the requests for disk I/O submitted by the various running programs. It passes these requests on to the file system, whether it be EXT[2,3,4], VFAT, HPFS, CDFS (CD-ROM file system), NFS (Network File System), or some other filesystem type, and manages the transfer of data between the file system and the requesting programs.
All types of hardware – whether they are storage devices or something else attached to a Linux host – are handled as if they were files. In Linux, Everything is a File [5]. This results in some amazing capabilities and interesting possibilities.
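A short sketch makes the idea concrete. It assumes Python 3 on a Unix-like system that provides the /dev/null and /dev/urandom device files; the two function names are made up for the example. The same open, read, and write calls that work on ordinary files also work on these devices.

```python
import os
import stat


def is_character_device(path: str) -> bool:
    """Return True if the file at path is a character device, not a regular file."""
    return stat.S_ISCHR(os.stat(path).st_mode)


def read_random_bytes(count: int) -> bytes:
    """Read bytes from the kernel's random number device like any other file."""
    with open("/dev/urandom", "rb") as device:
        return device.read(count)


print("/dev/null is a device:", is_character_device("/dev/null"))
print("random bytes:", read_random_bytes(8).hex())

# Writing to a device uses the ordinary file interface too.
with open("/dev/null", "w") as sink:
    sink.write("this text is discarded by the kernel\n")
```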
Error handling
Errors happen, so the kernel needs to identify them when they occur. The kernel may take some action such as retrying the failing operation, displaying an error message to the user, or logging the error message to a log file.
In many cases the kernel can recover from errors without human intervention. In others, human intervention may be required. For example, if the user attempts to unmount [6] a USB storage device that is in use, the kernel will detect this and post a message to the umount program, which usually sends the error message to the user interface. The user must then take whatever action is necessary to ensure that the storage device is no longer in use and then attempt to unmount the device again.
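Unmounting a device requires administrative privileges, so the sketch below uses a harmless analogue instead: asking the kernel to remove a directory that still contains a file. It assumes Python 3 on a Unix-like system, and the function name is invented for the example. The pattern is the same as in the umount case: the kernel refuses the request and returns an error code, and it is the calling program that turns that code into a message for the user.

```python
import errno
import os
import tempfile


def try_to_remove_nonempty_directory() -> str:
    """Ask the kernel to remove a directory that is not empty; return the error name."""
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "in_use")
        os.mkdir(target)
        with open(os.path.join(target, "data.txt"), "w") as handle:
            handle.write("still here\n")
        try:
            os.rmdir(target)  # the kernel refuses because the directory has contents
        except OSError as error:
            # The kernel reported an error number; translate it to its symbolic name.
            return errno.errorcode[error.errno]
        return "removed"


print("kernel reported:", try_to_remove_nonempty_directory())
```

After the user removes the file, the same request succeeds, just as a second unmount attempt succeeds once the device is no longer in use.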
Utilities
In addition to its kernel functions, most operating systems provide a number of basic utility programs which enable users to manage the computer on which the operating system resides. These are the commands such as cp, ls, mv, and so on, as well as the various shells, such as bash, ksh, csh and so on, which make using and managing the computer so much easier.
These utilities are not technically part of the operating system; they are merely provided as useful tools that can be used by the SysAdmin to perform administrative tasks. In Linux, often these are the GNU core utilities. However, common usage groups the kernel together with the utilities into a single conceptual entity that we call the operating system.
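Because the utilities are ordinary programs, any other program can start them. The following sketch assumes Python 3.9 or later and a host where cp, mv, and ls are installed on the search path; the function and file names are made up for the example. Each utility runs as its own process in a temporary directory.

```python
import subprocess
import tempfile
from pathlib import Path


def run_core_utilities() -> list[str]:
    """Copy, rename, and list files by running cp, mv, and ls as separate processes."""
    with tempfile.TemporaryDirectory() as workdir:
        work = Path(workdir)
        (work / "original.txt").write_text("sample data\n")
        # Each call starts a separate utility program and waits for it to finish.
        subprocess.run(["cp", "original.txt", "copy.txt"], cwd=work, check=True)
        subprocess.run(["mv", "copy.txt", "renamed.txt"], cwd=work, check=True)
        listing = subprocess.run(["ls"], cwd=work, check=True,
                                 capture_output=True, text=True)
        return listing.stdout.split()


print(run_core_utilities())
```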
[1] Wikipedia, Random-access memory, https://en.wikipedia.org/wiki/Random-access_memory
[2] Wikipedia, Virtual memory, https://en.wikipedia.org/wiki/Virtual_memory
[3] Wikipedia, Process management, https://en.wikipedia.org/wiki/Process_management_(computing)
[4] A COM (communications) port is used with serial communications such as a serial modem to connect to the Internet over telephone lines when a cable connection is not available.
[5] Both.org, The Linux Philosophy for SysAdmins, Tenet 03 — Everything is a File, https://www.both.org/?p=6843
[6] The Linux command to unmount a device is actually umount.