Wednesday, March 9, 2022

Greedy strategies for Wordle

Today, I wanted to see how well the following greedy strategy works on Wordle. Given a set H of words, a word w partitions H into equivalence classes or “cells”, where two words count as equivalent if they give the same evaluation if w is played. We can measure the fineness of a partition by the size of its largest cell. Then, the greedy strategy is this:

  1. Start with the 12792 five-letter Scrabble words of the original Wordle (things will be a bit different for the bowdlerized New York Times version).

  2. Guess a word that results in a finest partition of the remaining words, with ties broken by favoring guesses that are themselves one of the remaining words, and remaining ties broken alphabetically.

  3. If not guessed correctly, repeat 2–3 with the remaining words replaced by the cell of the partition that fits the guess.

A quick and dirty python (I recommend pypy) implementation is here.

Note that this greedy strategy does not cheat by using the secret list of answer words, but only uses the publicly available list of five-letter Scrabble words.

Here some observations:

  • The greedy strategy succeeds within six guesses for all but one of the 2315 answer words in the original Wordle. The remaining word needs seven guesses.

  • A human might be able to guess the answer word that the algorithm fails for, because a real human would not be partitioning all the five-letter Scrabble words, but only the more common words that are a better bet for being on the answer list.

  • The greedy algorithm’s first guess word is “serai”, and the maximum cell size in the resulting partition is only 697 words.

  • While only one answer list word remained that the greedy strategy fails for, it looks like Wardle may have deliberately removed from the answer list some other common, clean and ordinary English words that the greedy strategy does not guess in six guesses.

  • It would be a cheat, but one can optimize the algorithm by starting with only the 2315 answer words. Then indeed the greedy strategy guesses every answer word in six guesses, and all but two can be done in five.

  • A human can do something in-between, by having some idea of what words an ordinary person is likely to know.


Alexander R Pruss said...

Playing "ikons" and "ratel" results in a partition with maximum cell size 94 and this is an optimal pairing if we don't allow the second word to depend on the first. I coded this in C for greater efficiency:

Continuing with the greedy strategy allows success in six moves for all the (original) answer words.

Alexander R Pruss said...
This comment has been removed by the author.
Alexander R Pruss said...
This comment has been removed by the author.