tldr: Wordle except only using fake words. Warning: will not keep guesses or progress if you leave the page. Like that other game, only has one puzzle per day. Post your favourite words and what you think their definitions should be on social media!
Okay so there's this word game right, and this person I know was all "you should play this", so I did. And like, it's okay and stuff? Pretty smooth game design, quality content, somewhat addictive, all that good juice. And I was thinking, and I was really thinking you gotta understand, and I was totally all "yoo I could make a version of this". That's definitely not an idea that's played to death or anything. So I did, with just one small teensy tiny little catch: Nonsurdle only uses words that don't exist. Nice.
How It Works (Math Nerd Version)
I used a two-step text-generation Markov chain to generate 6000 fake words, with an almost 6000-word-long list of common 5-letter words as input. We then play Wordle as normal using the new list as a dictionary. I check to make sure no actual words are generated, although it's possible for more obscure words or words in languages other than English to slip through. The RNG takes the date as a seed, so the dictionary changes each day, but is the same for every player on a given day. You could quite easily fool it by changing the date on your computer. Stopping this felt like more effort than it was worth, because why would anyone want to cheat at some obscure game on the interwebs?
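For the curious, here's a minimal sketch of the date-as-seed trick in Python. It's an illustration and not the game's actual code: `daily_rng` is a made-up name, and using the date's ordinal number as the seed is an assumption.

```python
import datetime
import random

def daily_rng(day: datetime.date) -> random.Random:
    """Return a random number generator seeded only by the calendar date."""
    # toordinal() gives one integer per day, so everyone who plays on the
    # same date gets exactly the same sequence of "random" numbers.
    return random.Random(day.toordinal())

today = datetime.date.today()
rng_a = daily_rng(today)
rng_b = daily_rng(today)
# Two players on the same day draw identical numbers.
print(rng_a.randrange(6000) == rng_b.randrange(6000))
```

Since the seed comes from whatever date the computer reports, changing the clock changes the seed, which is exactly the cheat described above.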
How It Works (Normal Person Version)
I want you to consider a word as a list of letters in order. If you think about it, every letter has some frequency of following every other letter. For example, the pair "cz" is significantly less common than "ch". Therefore, the frequency of an "h" after a "c" is higher than that of a "z". The theory behind the random word generation is to follow this through logically. It generates words one letter at a time, picking each new letter according to how likely it is to follow the previous letter in normal English. The probabilities are generated by reading a list of nearly 6000 common 5-letter words, and counting how often each letter follows every other letter. The program uses that to generate a list that looks something like: "a" - "a" 0.01, "b" 0.04, "c" 0.02, etc.
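If you'd rather see the counting step as code, here's a rough Python sketch. The function name `letter_probabilities` and the three-word sample list are invented for illustration; the real input is the list of nearly 6000 words.

```python
from collections import Counter, defaultdict

def letter_probabilities(words):
    """Map each letter to {next_letter: probability}, counted from a word list."""
    counts = defaultdict(Counter)
    for word in words:
        # Walk over every adjacent pair of letters in the word.
        for prev, nxt in zip(word, word[1:]):
            counts[prev][nxt] += 1
    table = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        table[prev] = {nxt: n / total for nxt, n in followers.items()}
    return table

sample = ["chart", "chase", "czars"]  # tiny stand-in for the real word list
table = letter_probabilities(sample)
print(table["c"])
```

In this tiny sample "c" is followed by "h" twice and by "z" once, so "h" comes out twice as likely as "z".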
(Sidenote: This is an example of a Markov chain, a system that has states and probabilities of going from each state to another state. The letters are the states, and the path after letting it run for 5 steps is the word. Markov chains are a powerful tool, and can be applied to a variety of problems besides this.)
Now, some of you might have already spotted a problem with this. When generating the next letter, it only considers the previous letter. So, it might start by picking an "r". It might then notice that "rt" as seen in "chart" is a somewhat common pair, and so pick a "t" as the next letter. It might then look at the "t", see that "tr" as seen in "tripe" is also a common pair, and pick "r" as the next letter. It could keep going like this for all 5 letters, generating "rtrtr" as the word, which is clearly unpronounceable gibberish. Since my goal was for words that kind of look real and could probably be pronounced, this is clearly unacceptable (and was happening way too often in my test runs).
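To make the problem concrete, here's a small Python check of whether a one-letter-of-memory chain could ever produce a given word. It's a sketch with made-up function names, and it ignores how the first letter gets chosen; it only asks whether every adjacent pair in the candidate was seen somewhere in the training words.

```python
def pairs_seen(words):
    """Collect every adjacent letter pair that appears in the word list."""
    return {word[i:i + 2] for word in words for i in range(len(word) - 1)}

def one_step_chain_allows(candidate, words):
    """True if every adjacent pair in candidate appears in some training word."""
    seen = pairs_seen(words)
    return all(candidate[i:i + 2] in seen for i in range(len(candidate) - 1))

# "rt" comes from "chart" and "tr" comes from "tripe", so the chain
# can happily bounce between them forever.
print(one_step_chain_allows("rtrtr", ["chart", "tripe"]))  # prints True
```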
Which leads us to our next innovation: a two-step Markov chain. This is really just a fancy way of saying that we consider the previous two letters instead of just the previous letter. Apart from that, it works just the same. So the computer has a list that reads like: "aa" - "a" 0, "b" 0.01, "c" 0.02... etc. It has one version for every two-letter pair. And that is how you use a Markov chain to generate words. Actually building the Wordle game on top of this is an exercise left to the reader. Have fun!
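Here's roughly what a two-step generator could look like in Python. Treat it as a sketch under assumptions, not the game's real code: the "^" padding used to pick the first two letters, the retry when the chain hits a pair with no followers, the `max_attempts` cap, and all the function names are my own inventions for the example.

```python
import random
from collections import Counter, defaultdict

START = "^"  # hypothetical padding symbol marking the start of a word

def build_table(words):
    """Count which letter follows each pair of letters (or start padding)."""
    table = defaultdict(Counter)
    for word in words:
        padded = START * 2 + word
        for i in range(len(word)):
            table[padded[i:i + 2]][padded[i + 2]] += 1
    return table

def generate(table, rng, length=5):
    """Generate one word, or None if the chain reaches a pair with no followers."""
    word = START * 2
    for _ in range(length):
        followers = table.get(word[-2:])
        if not followers:
            return None  # dead end, e.g. a pair only ever seen at the end of a word
        letters = list(followers)
        weights = list(followers.values())
        word += rng.choices(letters, weights=weights)[0]
    return word[2:]

def fake_words(real_words, count, rng, max_attempts=10000):
    """Collect up to `count` generated words that are not in the real list."""
    table = build_table(real_words)
    real = set(real_words)
    found = set()
    for _ in range(max_attempts):
        if len(found) >= count:
            break
        word = generate(table, rng)
        if word is not None and word not in real:
            found.add(word)
    return sorted(found)

sample = ["chart", "chase", "chant", "shark", "share", "shame",
          "tripe", "trick", "track", "crane", "crate", "plate"]
print(fake_words(sample, 5, random.Random(0)))
```

With two letters of memory, "rtrtr" can no longer happen from "chart" and "tripe" alone, because the pair "rt" is never followed by anything in those words.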
I could've used a three-step chain for even more pronounceable words, but higher-order chains take more space, take longer to run, and have less randomness. I did some testing and decided two was the best option.
Game design by Cahatstrophe Games. Based on the popular game by Josh Wardle. Uses mathematics invented by Andrey Markov over a hundred years ago. Uses the font Open Sans by Steve Matteson, licensed under the Apache License 2.0.
Follow me on Twitter to be notified when I do things! Or itch, you could also follow me on itch. Either works.