Randomness, zero and more


Photo by Breakingpic from Pexels

Randomness is an important tool in any computing environment. It’s critical to creating public/private keypairs (PPKP) for tools like SSH, and to data encryption, security, statistics, and much more. Zeros matter too, and a good source of both zeros and randomness is critical to modern computers.

One practical example is wiping storage with random data. Friends have gifted me with old computers or had me replace old hard drives with new ones. I always assure them that, as part of my work for them, I’ll scrub all of the data from their old hard drives. I use a neat tool, shred, which is part of the GNU Core Utilities included with nearly every Linux distribution. The shred tool allows me to overwrite all of the data on a hard drive, from the first sector to the last, irrespective of any existing partitioning, with various bit patterns, streams of zeros, and a random data stream.
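As a minimal sketch of what shred does, the example below runs it against a throwaway scratch file instead of a real drive, so it is safe to try. It assumes GNU shred is installed; the scratch file and the variable names are hypothetical and exist only for this illustration.

```shell
# Create a scratch file and fill it with some data to destroy.
scratch=$(mktemp)
head -c 4096 /dev/urandom > "$scratch"

# -n 3 : make three overwrite passes with random data
# -z   : add a final pass of zeros
# -u   : remove the file after overwriting it
shred -n 3 -z -u "$scratch"

# Confirm that the file is gone.
if [ -e "$scratch" ]; then
    shred_result="still present"
else
    shred_result="removed"
fi
echo "scratch file: $shred_result"
```

To wipe a whole drive, the target would be the drive’s device file rather than a regular file, and the -u option would be omitted. That destroys everything on the drive, so check the device name carefully before running it.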

Where do these random and zero data streams come from?

There are some very interesting device files in /dev. The article, The Linux Philosophy for SysAdmins, Tenet 03 — Everything is a File, looks at some of those device files and shows how they can be used. If you haven’t already done so, read that article.

Unlike most other device special files, the null, zero, random, and urandom files are not associated with any physical devices. You can list the contents of /dev and use the search facility of the less pager to locate those four files:

tuser1@testvm1:~$ ls -l /dev | less
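If you would rather not page through the whole directory, a short sketch like the following lists just those four files. In the ls -l output, the leading c on each line marks a character special file. The loop and the char_count variable are only illustrative.

```shell
# List the four device files by name.
for dev in null zero random urandom; do
    ls -l "/dev/$dev"
done

# The -c test is true when the path is a character special file.
char_count=0
for dev in null zero random urandom; do
    if [ -c "/dev/$dev" ]; then
        char_count=$((char_count + 1))
    fi
done
echo "character devices found: $char_count"
```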

/dev/urandom

The /dev/random and /dev/urandom devices are both useful as data stream sources. As their names imply, they both produce random output: not just numbers but any and all byte combinations. The /dev/urandom device produces deterministic[1] random output and is very fast, producing a constant stream of random data. Use this command to view typical output from /dev/urandom. You can use Ctrl-C to break out.

tuser1@testvm1:~$ cat /dev/urandom
D���$�\�̗/>T��|�p��XE9��}9��/ۧ����#I�>���]N&��V������v���▒�^��g��Ѝ�h��V�k݆R![��(+���%՞Rk�B�}u+������M��=�0���A�D�G���RAZ�vk4Lun�����7Vא,�Јz�&��F��*�\5n�I�לOQ����?�Y��$���.O�`X�0b �z��+~ ߩ��y▒^i���J� x&{�▒D�U▒�|����i�֮)�r�[�m����l@�
                                                      '�ڐ�|)�����i���N�#>�ii
��څ)bE[�Yɿ��a<▒%[�5s��F��q�nZ$�x�K�a▒�:
�F%`<��䶥����
�΀
C[q�17�n�R7�l�rN��O93�JR��cMg��q�w▒έ����j?{��F��Ǘ,A�V�Q%��=��&�L-
+Ko��:%��3;r�`��#���2�2�M���&�`R֡]�'�*6b                          �v�I噪H�@c��F�Z�eUA�.B��%�s}��▒▒�T%>�c�XQ▒c3��Ú�!��d�+,6���M��^4^�͡2t992@̿5��(�C�=#6�S�Dz�N�N�
                                       ���+۪���-��qpv,����
<SNIP>

I have shown only a small part of the data stream from the command, but it should give you a sense of what you should see on your system. You could also pipe the output of that command through the od command to make it a little more human-readable as a stream of octal numbers. That makes little sense for most real-world applications because it is, after all, random data, but it should give you an idea of the tools available for processing streams of data.

tuser1@testvm1:~$ cat /dev/urandom | od
0000000 017570 137210 034305 133265 027400 107245 060537 102713
0000020 112266 161045 006546 030266 030456 067340 005411 160413
0000040 163537 075761 171570 022222 156410 075100 132012 027743
0000060 071457 071132 025022 176146 127637 071244 025743 043710
0000100 170672 044575 074315 170167 012554 037754 104755 020341
0000120 161572 120263 114173 073643 147430 112140 033245 124225
0000140 120100 113000 163245 122414 070633 000165 066746 001100
0000160 022546 133134 103050 030734 173131 116773 137717 063004
0000200 131226 000664 173012 006753 147014 015603 071252 105322
0000220 043426 155213 023304 146375 101131 064615 157263 125476
<SNIP>

This output is in octal, but we can use the -c option to display the printable ASCII characters that exist in the data stream. Bytes that are not printable are shown as backslash escapes or as octal values.

tuser1@testvm1:~$ cat /dev/urandom | od -c
0000000   O   u   % 203 263   _ 310   ?   6 202   0   c 006   L 316 263
0000020   * 250 023  \r 344 357   G   e   w   z 375 265  \0 257 237   b
0000040 342 372 253 340 330 253   :  \v 032   h 263   ! 314 203 234 335
0000060 337 376   S   _   = 343 302 345 222   I 016   , 360 340   @   ?
0000100 311 223 316   ] 275 275   r   K 200 360 275  \0 203 206 020   l
0000120 204 356 023 256 360 343   9   D 345   8      \a 225 246   \   '
0000140 321 310 217   H 233 375   2   d   u 233   > 322   T 357   Q   \
0000160  \t 003   /   5   z 245   L 334 332   o 240 376 303   ) 213   s
0000200 244   ) 363  \a   c   A 004   S 307 322   @ 364 275   + 223   -
0000220 214  \a 330 313 263   G   / 357 237 036 205   b 035 371 321 006
<SNIP>

The man page for od shows that it can read data directly from a file and can limit the amount of data that it reads. In this case I have used -N 132 to limit the output of the data stream to 132 bytes.

tuser1@testvm1:~$ od /dev/urandom -N 132
0000000 022742 054271 170171 017422 120423 114551 107374 014750
0000020 042375 141566 115120 054152 141734 126021 106777 102566
0000040 135547 004421 141340 150635 037144 036731 001656 030474
0000060 135227 126736 003004 167231 050742 053132 063645 125716
0000100 044615 006454 172070 013663 004744 113741 141466 021137
0000120 156702 044270 047240 071546 111221 047205 140257 141536
0000140 117170 173447 037536 031436 157700 170546 073424 057123
0000160 027166 076711 130656 036561 034131 117365 123637 025320
0000200 117447 001053
0000204
tuser1@testvm1:~$
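The same options can be combined to turn a few random bytes into something directly usable. The sketch below builds a 32-character hexadecimal string from 16 random bytes; the variable name token is just an illustration.

```shell
# -An  : suppress the offset column at the start of each line
# -N16 : read exactly 16 bytes
# -tx1 : print each byte as a two-digit hexadecimal value
# tr then strips the spaces and newlines that od puts between the values.
token=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
echo "$token"
```

Each run prints a different string, so no sample output is shown here.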

The dd command could also be used to specify a limit to the amount of data taken from the [u]random devices but it cannot directly format the data.

tuser1@testvm1:~$ dd if=/dev/urandom bs=1048 count=1 | od 
1+0 records in
1+0 records out
0000000 047027 007520 071711 112111 104652 137721 113412 104143
1048 bytes (1.0 kB, 1.0 KiB) copied, 5.7878e-05 s, 18.1 MB/s
0000020 172124 004706 116413 105363 024260 056031 004254 104661
0000040 015536 042206 036062 033605 152017 046626 107505 164461
0000060 125052 147070 046412 160067 031736 017152 020754 035632
0000100 071350 065570 070252 150112 076160 135303 026463 013425
0000120 000244 033523 110346 044077 164634 145545 145764 174702
<SNIP>
0002000 121612 060320 111433 130366 150047 036117 161017 120467
0002020 041745 153357 145104 131171
0002030
tuser1@testvm1:~$

Can you see the problem here? The status data from the dd command is displayed along with the stream of random data. But there is a way to deal with that problem using the null device.

/dev/null

The null device, /dev/null, can be used as a target for the redirection of all or some of the output from shell commands or programs so that they are not displayed on the terminal. I frequently use /dev/null in my bash scripts to prevent users from being presented with output that might be confusing to them. Think of the null device as a place to send a data stream or parts of one that we explicitly don’t want to be passed on any further.

Enter the commands below. The first displays its output as usual. The second redirects its output to the null device, so nothing is displayed on the terminal.

tuser1@testvm1:~$ echo "Hello world" 
Hello world
tuser1@testvm1:~$ echo "Hello world" > /dev/null
tuser1@testvm1:~$

There is no visible output because the null device discards everything that is written to it. Reading from /dev/null also produces no data; the read immediately reports end of file (EOF). The null device is useful as a place to redirect unwanted output so that it is removed from the data stream.
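A short sketch can show the behavior of the null device in both directions, writing to it and reading from it. The variable names are only for illustration.

```shell
# Writing: the data is accepted and thrown away, and the write succeeds.
echo "Hello world" > /dev/null
write_status=$?

# Reading: the read reports end of file at once, so zero bytes arrive.
bytes_read=$(wc -c < /dev/null | tr -d ' ')

echo "write exit status: $write_status, bytes read: $bytes_read"
```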

By default, the > operator redirects only the part of the data stream sent to STDOUT, that is, file descriptor 1. STDERR, file descriptor 2, is still sent on to the display, which is the default destination for both STDOUT and STDERR. In a pipeline, only STDOUT is piped to the next tool, so STDERR still goes to the display.

Let’s return to the problem of the intermingled data from the dd command. The key to understanding how to fix this is knowing that the data from any command travels on its Standard I/O (STDIO) streams. The desired result of a command is sent to standard output (STDOUT), while any status or error messages from the command are sent to standard error (STDERR). This gives us a solution to our problem. We simply redirect the STDERR part of the data stream to the null device. Since STDERR is file descriptor 2, we do that using 2> as our redirection operator.

tuser1@testvm1:~$ dd if=/dev/urandom bs=1048 count=1 2>/dev/null | od 
0000000 074106 037115 170604 031006 063323 164327 152337 046053
0000020 065750 174066 030371 145462 002570 063045 010672 067217
0000040 024770 070370 057735 132074 124434 124050 037533 174740
0000060 010261 103544 120266 153641 137140 164100 127710 073605
0000100 117226 115452 052025 157543 137713 126743 147074 051617
0000120 067501 046623 177452 050015 170526 072576 160061 122676
0000140 120635 073574 072141 144124 073511 076610 106633 101275
0000160 055575 167746 130347 120143 171414 120155 144111 167461
0000200 024702 033657 133477 064423 077307 101332 031653 026743
0000220 050650 014337 177170 176314 054507 026500 002763 036727
0000240 106071 106616 060333 022106 137716 047155 017604 033322

So this use of redirection sends the error and status messages to the null device while the rest of the data stream is sent on through STDOUT, in this case to the od command and then to the display.
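To see the effect of each redirection in isolation, here is a small sketch built around a hypothetical function, emit_both, that writes one line to STDOUT and one to STDERR. The function and variable names are invented for this example.

```shell
# Hypothetical helper: one line to STDOUT, one line to STDERR.
emit_both() {
    echo "result data"
    echo "status message" >&2
}

# Discard only STDERR; STDOUT is captured.
only_stdout=$(emit_both 2>/dev/null)

# Point STDERR at the captured stream first, then discard STDOUT.
only_stderr=$(emit_both 2>&1 >/dev/null)

# Discard both streams.
neither=$(emit_both >/dev/null 2>&1)

echo "STDOUT only: $only_stdout"
echo "STDERR only: $only_stderr"
echo "neither:     [$neither]"
```

The order of the redirections matters: they are processed from left to right, which is why 2>&1 >/dev/null keeps the status message while >/dev/null 2>&1 discards everything. For dd specifically, the GNU version also accepts status=none to suppress the transfer statistics without discarding genuine error messages.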

/dev/random

The /dev/random device file produces non-deterministic[2] random output. This output is not determined solely by an algorithm that depends only upon the previous number that was generated. Instead, it is generated in response to keystrokes, mouse movements, device drivers, and other “environmental noise”[3], which serve as sources of randomness for the random device. That random noise is stored in the random data pool, from which it’s read when needed by the /dev/random device. This method makes it far more difficult to duplicate a specific series of random numbers.

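The primary functional difference between the two random devices is that urandom will never block, as it doesn’t depend on there being data in the random data pool. The random device may block, that is, stop producing a data stream, if there is no data left in the random data pool. On Linux kernels 5.6 and later, the random device blocks only until the kernel’s random number generator has been initialized early in the boot process; after that, it no longer blocks.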
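As a rough sketch, the commands below read a fixed number of bytes from the non-blocking urandom device and then print the kernel’s own estimate of the entropy in the pool. The /proc file used here is Linux-specific, so the script checks that it exists before reading it; the variable names are illustrative.

```shell
# Ask urandom for a fixed number of bytes and count what arrives.
n=32
got=$(head -c "$n" /dev/urandom | wc -c | tr -d ' ')
echo "requested $n bytes from /dev/urandom, received $got"

# On Linux, the kernel reports its entropy estimate, in bits, here.
entropy_file=/proc/sys/kernel/random/entropy_avail
if [ -r "$entropy_file" ]; then
    echo "kernel entropy estimate: $(cat "$entropy_file") bits"
fi
```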

/dev/zero

As its name implies, the /dev/zero device file produces an unending string of zeroes as output. Note that these are null bytes, each with a binary value of zero (shown by od -c as \0), and not the ASCII character zero (0). Use the dd command to view some output from the /dev/zero device file. Note that the byte count for this command is non-zero.

tuser1@testvm1:~$ dd if=/dev/zero bs=512 count=500 | od -c
0000000  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
500+0 records in
500+0 records out
256000 bytes (256 kB, 250 KiB) copied, 0.0031656 s, 80.9 MB/s
0764000
tuser1@testvm1:~$
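One common use of the zero device is to create a file of a known size that contains nothing but null bytes. The sketch below does that with a temporary file and then verifies the contents; the file and variable names are hypothetical.

```shell
# Write 4 blocks of 512 null bytes to a temporary file.
zero_file=$(mktemp)
dd if=/dev/zero of="$zero_file" bs=512 count=4 2>/dev/null

# Count all of the bytes, then count the bytes left after deleting the nulls.
total=$(wc -c < "$zero_file" | tr -d ' ')
nonzero=$(tr -d '\000' < "$zero_file" | wc -c | tr -d ' ')
echo "total bytes: $total, non-null bytes: $nonzero"

rm -f "$zero_file"
```

The file holds 2048 bytes, and none of them survive the removal of null bytes, which confirms that the data is binary zeros rather than the ASCII character 0.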

Summary

These four interesting devices are integral to many operations in modern computers. Most of those functions are performed internally by the operating system, and we users never encounter them directly.

Except for the zero device, I use all of them myself: I sometimes need to generate random number streams and frequently need to redirect STDOUT or STDERR to the null device. Most programming languages have methods that return a random number. This includes scripting languages like Bash, which provides the $RANDOM shell variable.

dboth@david:~$ echo $RANDOM
9885
dboth@david:~$ echo $RANDOM
1836
dboth@david:~$ echo $RANDOM
20361
dboth@david:~$
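In Bash, $RANDOM yields an integer from 0 to 32767 each time it is referenced. The sketch below scales that to a smaller range and, for comparison, pulls a 32-bit value from /dev/urandom with od. The variable names roll and big are only for illustration.

```shell
# Scale $RANDOM (0 to 32767 in Bash) to a die roll from 1 to 6.
roll=$(( RANDOM % 6 + 1 ))

# Read 4 bytes from urandom and print them as one unsigned decimal integer.
big=$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')

echo "die roll: $roll, 32-bit value: $big"
```

The modulo approach is slightly biased toward the lower values because 32768 is not evenly divisible by 6. That is fine for casual scripting, but neither $RANDOM nor this scaling should be used where security matters.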

  1. Deterministic means the output is determined by a known algorithm and uses a seed string as a starting point. Each unit of output is dependent upon the previous output and the algorithm, so if you know both the seed and the algorithm, the entire data stream can be reproduced. As a result it is possible, although difficult, for a hacker to reproduce the output if the original seed is known. ↩︎
  2. Non-deterministic results are not dependent upon the previous data in the random data stream. Thus they are more truly random than if they were deterministic. ↩︎
  3. See the “random” man page for a little more detail. ↩︎
