Use ‘grep’ to solve a 2-dimensional word puzzle

As a systems administrator, you need to keep a lot of “tools” on your “tool belt” of skills. Understanding how to best apply different tools and concepts to new problems can turn a difficult problem into a small one. One skill I used frequently when I worked as a systems administrator – early in my career – was regular expressions.

You can learn more about regular expressions from the regex manual page, in section 7 of the online manual, with man 7 regex. The basics of regular expressions are:

Character   Meaning
^           The start of a line
$           The end of a line
.           Any single character
[ and ]     Match any of a set of characters, such as [ab] to match either a or b, or [a-z] to match any lowercase letter
*           Zero or more of the thing before it, like a* to match zero or more of a
?           Zero or one of the thing, like a? to match an a that may or may not be there
+           One or more of the thing, like a+ to match one or more of a
Regular expression characters and what they mean
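
One detail is worth knowing before trying these characters with grep: by default, grep uses basic regular expressions, where ? and + are ordinary characters. The -E option switches to extended regular expressions, where they behave as the table describes. Here is a minimal sketch, using a few made-up sample words instead of a real file:

```shell
# Sample words, one per line (made up for this illustration)
words='color
colour
colouur
ct
cat
caat'

# ? makes the preceding u optional: matches color and colour
printf '%s\n' "$words" | grep -E '^colou?r$'

# + requires at least one a: matches cat and caat, but not ct
printf '%s\n' "$words" | grep -E '^ca+t$'

# * allows zero or more a, and works without -E: matches ct, cat, and caat
printf '%s\n' "$words" | grep '^ca*t$'
```

The puzzle patterns later in this article use only ^, $, . and [a-z], so they work the same with or without -E.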

Playing a word puzzle game

I like to play puzzle games to relax, especially word puzzles. I wrote before about playing the Wordle game, but recently I discovered another game, Keyword. It presents six words of different lengths, written vertically, each missing a single letter. The missing letters also form a word that runs horizontally. You need to pick the right letters to both complete the six words and spell a new six-letter word.

This is an excellent opportunity to practice using grep with regular expressions to find words in the system dictionary that match the six words and, at the same time, uncover the hidden word.

screenshot of Keyword, showing six incomplete words and a hidden secret word

In this puzzle, we have six words, each missing one letter:

  1. at?lete
  2. d?al
  3. dec?de
  4. p?ot
  5. ?ear
  6. parc?ment

Some of these may be immediately obvious. I can’t imagine the first word as being anything other than athlete. The last word seems obvious as parchment. But the third word could be either decide or decade.

List the possible words

Let’s use grep to find the possible matches for each of the six incomplete words in a dictionary. On Linux, the /usr/share/dict/words file contains a list of over a half-million correctly spelled words, one per line, such as you might use in a spell-checking application; the exact count varies by distribution and by which word list is installed. To match each incomplete word, we only need a few regular expression characters: ^ and $ to anchor the pattern to the whole line, and [a-z] to stand in for the missing letter.

$ grep '^at[a-z]lete$' /usr/share/dict/words
athlete

$ grep '^d[a-z]al$' /usr/share/dict/words
deal
dhal
dial
dual

$ grep '^dec[a-z]de$' /usr/share/dict/words
decade
decede
decide
decode

$ grep '^p[a-z]ot$' /usr/share/dict/words
phot
plot
poot

$ grep '^[a-z]ear$' /usr/share/dict/words
bear
dear
fear
gear
hear
jear
lear
mear
near
pear
rear
sear
tear
wear
year

$ grep '^parc[a-z]ment$' /usr/share/dict/words
parchment
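
The six commands above differ only in the clue, so they can be wrapped in a small helper. This is a minimal sketch, assuming a shell function named candidates (a hypothetical name, not a standard command) that takes a clue written with ? for the missing letter and, optionally, a different word list file:

```shell
# candidates CLUE [WORDLIST]
# Print every word that fits a puzzle clue such as 'd?al', where ?
# stands for the one missing letter. (candidates is a hypothetical
# helper name chosen for this illustration.)
candidates() {
    clue=$1
    wordlist=${2:-/usr/share/dict/words}
    # Turn the ? placeholder into [a-z]
    regex=$(printf '%s\n' "$clue" | sed 's/?/[a-z]/')
    # Anchor the pattern so it must match the whole line
    grep "^${regex}\$" "$wordlist"
}
```

With the function defined, candidates 'd?al' runs the same search as the second grep command above, so it should print the same words when given the same words file.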

Some of the incomplete words have only one match in the words file, while others have quite a few. Still, matching these words with grep gives us a lot of information to fill in the missing word. The matched words from the dictionary tell us what letters can appear in each position:

  1. The first letter can only be h
  2. The second letter can be e, h, i, or u
  3. The third letter might be a, e, i, or o
  4. The fourth letter is one of h, l, or o
  5. The fifth letter has fifteen possibilities: b, d, f, g, h, j, l, m, n, p, r, s, t, w, or y
  6. The last letter can only be h
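
The list above can also be produced mechanically instead of by reading the grep output. This is a rough sketch, assuming a hypothetical shell function named letters that finds where the ? sits in a clue and uses cut to keep only that character from each matching word:

```shell
# letters CLUE [WORDLIST]
# Print, on one line, the letters that can fill the ? in CLUE.
# (letters is a hypothetical helper name chosen for this illustration.)
letters() {
    clue=$1
    wordlist=${2:-/usr/share/dict/words}
    # Position of the ?: the length of the text before it, plus one
    prefix=${clue%%\?*}
    pos=$(( ${#prefix} + 1 ))
    # Same pattern as before: ? becomes [a-z], anchored to the whole line
    regex=$(printf '%s\n' "$clue" | sed 's/?/[a-z]/')
    # Keep only the character at that position, without duplicates
    grep "^${regex}\$" "$wordlist" | cut -c "$pos" | sort -u | tr -d '\n'
    echo
}
```

For the clue 'd?al' and a word list containing deal, dhal, dial, and dual, this prints ehiu, which is exactly the set of letters for the second position.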

Narrowing the options

We can build a regular expression to match a six-letter word from the dictionary, where each position allows only the letters we have uncovered. For each position, we’ll use a bracket expression that lists the allowed letters, like [ehiu] to match one of e, h, i, or u. I’m not sure I want to type in all fifteen possible letters for the fifth letter, so let’s see how far we can get by specifying . for that letter, to match any character at that position.

Our grep command to find the hidden word looks like this:

$ grep '^h[ehiu][aeio][hlo].h$' /usr/share/dict/words
health

So now we have discovered exactly one six-letter word that also fills in the missing letter for each incomplete word: health. And that’s correct!
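
If the . shortcut had matched more than one word, the next step would be to spell out all fifteen letters for the fifth position. As a sketch of that stricter search, here is a hypothetical shell function named hidden_word that takes a word list file followed by one set of letters per position, and wraps each set in brackets to build the pattern:

```shell
# hidden_word WORDLIST LETTERS...
# Build one bracket expression per argument and search WORDLIST for
# words that match all of them in order. (hidden_word is a hypothetical
# helper name; each LETTERS argument must be a non-empty run of
# lowercase letters.)
hidden_word() {
    wordlist=$1
    shift
    pattern=''
    for letters in "$@"; do
        # 'ehiu' becomes '[ehiu]', appended to the pattern so far
        pattern="${pattern}[${letters}]"
    done
    grep "^${pattern}\$" "$wordlist"
}

# Example call, equivalent to:
#   grep '^[h][ehiu][aeio][hlo][bdfghjlmnprstwy][h]$' /usr/share/dict/words
# hidden_word /usr/share/dict/words h ehiu aeio hlo bdfghjlmnprstwy h
```

A single-letter bracket expression like [h] matches the same thing as a plain h, so the function does not need a special case for positions with only one candidate.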

screenshot of Keyword, showing HEALTH as the missing word

Matching patterns

Regular expressions match patterns in text. Here, we’ve used grep and regular expressions in a fun way to play a word puzzle game. With a well-formed regular expression, you can also find specific messages in a log file, or uncover errors in output files. Practice using regular expressions so you can add this great “tool” to your virtual “tool belt” of system administrator skills.
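
As a small illustration of the log file case, the sketch below searches a few invented log lines for failed login messages. The host name, process IDs, user names, and addresses are all made up for this example; real log formats differ between systems, so the pattern would need adjusting.

```shell
# Invented sample log lines (not taken from a real system)
log='Jan 12 03:14:07 host1 sshd[811]: Failed password for root from 192.0.2.10 port 50022 ssh2
Jan 12 03:14:09 host1 sshd[811]: Accepted password for alice from 192.0.2.20 port 50311 ssh2
Jan 12 03:15:30 host1 cron[902]: (root) CMD (run-parts /etc/cron.hourly)
Jan 12 03:16:02 host1 sshd[815]: Failed password for invalid user admin from 192.0.2.10 port 50040 ssh2'

# \[ and \] match literal brackets; [0-9]+ matches the process ID.
# Prints the first and last sample lines.
printf '%s\n' "$log" | grep -E 'sshd\[[0-9]+\]: Failed password'
```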
