So, Voynicheros know well enough that the composition of Voynichese words is governed by a set of rules which determine which characters are allowed or required to follow which. These rules form either a kind of “grammar” or a set of building blocks which make up the VM words, depending on which way you look at it. Of course everybody has their own ideas of what these rules actually look like, from the Core-Mantle-Crust paradigm of Jorge Stolfi to the feeble attempts of yours truly. As usual, there is no universally accepted view of what the grammar actually looks like, and for every set of rules there appear to be any number of counterexamples.
Either way, it appears that there are “mandatory” letter groups in each word, and “optional” letter groups, and, at least as far as text in the Currier B language is concerned, it appears to me that the optional groups precede the mandatory groups in each word. In other words, an optional start group is followed by a mandatory end group.1
Now, combine this with the observation that Voynichese entropy is below that of natural languages, i.e. the “predictability” of the next letter in a word is higher than would be anticipated.2 This in turn indicates to me that a Voynichese “word” is not equivalent to a plaintext word (as I have pointed out in my Face value fallacy page), but rather “less”: only a number of letters? (This would fit in with my Strokes theory, in case you haven’t noticed.)
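For readers who want to see what “predictability of the next letter” means in numbers, here is a minimal sketch of the two usual measures: plain character entropy, and the entropy of a character given the one before it. The function names are my own, and which transcription alphabet you feed in (and whether you keep the word spaces) is entirely your assumption, so treat any resulting figure with the same caution as footnote 2.

```python
from collections import Counter
from math import log2


def char_entropy(text: str) -> float:
    """Zero-order entropy in bits per character: how evenly the characters are used."""
    counts = Counter(text)
    total = len(text)
    return -sum(n / total * log2(n / total) for n in counts.values())


def conditional_entropy(text: str) -> float:
    """Entropy of the next character given the previous one, in bits.

    Estimated from adjacent character pairs. Lower values mean the next
    character is easier to predict from its predecessor.
    """
    pairs = Counter(zip(text, text[1:]))   # counts of (previous, next)
    firsts = Counter(text[:-1])            # counts of each character as 'previous'
    total = sum(pairs.values())
    return -sum(
        n / total * log2(n / firsts[prev])
        for (prev, _nxt), n in pairs.items()
    )
```

A perfectly alternating string such as "abababab" uses both characters equally often, yet its conditional entropy is zero, because each character fully determines the next one. That is the kind of gap between the two measures that the entropy observation is about.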
Let’s assume for the moment that each VM “word” represents two plaintext letters (one is enciphered in the start group, and one in the end group), but the letters are sorted backward. This backward sorting could happen for the whole text (the whole paragraph, the whole line, …), or it could happen only within one VM word. So, for example, let’s encipher “voynich” this way: We reverse the letter order, either across the whole word or within each chunk, and group the letters into “chunks” of two. The cipher sequence could in this case be
hc in yo v (reverse word-wise)
or
ov ny ci h (reverse chunk-wise)
From here we proceed with individually enciphering each letter into one Voynichese group (start group or end group), however that is actually done.
But we notice that, since “voynich” consists of an odd number of letters, besides three two-letter chunks there is also one single-letter chunk. This is in accordance with our observation (hoorah!), but unfortunately it doesn’t explain why the end groups should be any different from the start groups (boo!), as both should be governed by the same set of rules. It only explains why some words consist of two groups, and some of only one.3
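The two reversal variants from the “voynich” example can be sketched in a few lines. The function names are made up for illustration, and the sketch stops at the chunking step: how each letter is then turned into a Voynichese group is left open, just as in the text.

```python
def chunk_pairs(s: str) -> list[str]:
    """Cut a string into chunks of two; an odd length leaves one single-letter chunk."""
    return [s[i:i + 2] for i in range(0, len(s), 2)]


def reverse_word_wise(plain: str) -> list[str]:
    """Reverse the whole word first, then cut it into chunks of two."""
    return chunk_pairs(plain[::-1])


def reverse_chunk_wise(plain: str) -> list[str]:
    """Cut the word into chunks of two first, then reverse each chunk."""
    return [chunk[::-1] for chunk in chunk_pairs(plain)]


print(" ".join(reverse_word_wise("voynich")))   # hc in yo v
print(" ".join(reverse_chunk_wise("voynich")))  # ov ny ci h
```

Either way, a plaintext word with an odd number of letters always ends in a single-letter chunk, which is the one-group “word” discussed above.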
But what if odd and even letters in the plaintext were enciphered differently? For example, the plaintext letters could be converted to CaMeLcAsE before enciphering. In the case of the chunk-wise reversal (second example above) that would render
oV nY cI H
And with one final leap of faith, namely, that capital letters are enciphered somehow differently from lower case letters, ta-dah, we’re there: We have words of one start group and one end group, and words of only an end group, and the start and end groups are distinctly different from each other. Note that the difference doesn’t mean that start and end groups are made up of completely different character sets. Rather, there is considerable overlap between the VM characters used in start and end groups. The difference lies in the possible (“legal”) arrangements/sequences of the characters.
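As a sanity check on the bookkeeping, here is a small sketch of the CaMeLcAsE step followed by chunk-wise reversal, plus a helper that splits each chunk into its start-group letter and end-group letter. All names are hypothetical, and the assumption that the lower case letter lands in the start group and the capital in the end group is simply the reading proposed above.

```python
from typing import Optional


def alternate_case(plain: str) -> str:
    """Upper-case the 1st, 3rd, 5th, ... letter and lower-case the rest."""
    return "".join(
        ch.upper() if i % 2 == 0 else ch.lower()
        for i, ch in enumerate(plain)
    )


def camel_chunks(plain: str) -> list[str]:
    """Alternate the case, cut into chunks of two, then reverse each chunk."""
    cased = alternate_case(plain)
    chunks = [cased[i:i + 2] for i in range(0, len(cased), 2)]
    return [chunk[::-1] for chunk in chunks]


def split_slots(chunk: str) -> tuple[Optional[str], str]:
    """Return (start letter, end letter); the start is None for a one-letter chunk."""
    start = chunk[0] if len(chunk) == 2 else None
    return start, chunk[-1]


print(" ".join(camel_chunks("voynich")))  # oV nY cI H
```

With this arrangement every chunk ends in a capital, so the end-group slot is always filled, while the start-group slot is empty exactly when the chunk has only one letter.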
This could be explained in view of the strokes: some strokes occur only in upper case letters, some occur in both upper and lower case letters. Slashes “/” and backslashes “\” for example show up mostly in capital letters (“M”, “N”, “W”, “V”) and rarely or not at all in lower case letters. A vertical line “I” will show up in both sets of letters (“B”, “D”, “E”, … as well as “b”, “d”, “l”, …), but more frequently in upper case. A tiny circle “o” is mostly relegated to lower case letters, and so on.
And once more, here I’m at the end of my wisdom. This all looks fine and reasonable to me, but how to put it to the test? The “forward” way of synthesizing how the plaintext letters were broken up into their constituent strokes, and how these strokes were in turn substituted with VM letters, offers too many alternatives. The “backward” way of deducing what exactly constitutes a start and an end group is difficult: lots of people have tried, but none seems to have come to a convincing conclusion.
Hm.
P.S. And just when I hit the send button, it occurred to me: letter reversal isn’t even necessary. As long as you make sure that the last letter in a word is enciphered as a capital letter, you’re fine. In other words, encipher your plaintext such that each chunk is “lower case, upper case”, and only at the very last chunk make it “lower, upper” if there are two letters left to encipher, or only “upper” if it’s only one letter.
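The reversal-free variant from the P.S. is even shorter to write down. Again this is only a sketch under my own naming, and it covers the case assignment only, not the actual substitution into Voynichese characters.

```python
def lower_upper_chunks(plain: str) -> list[str]:
    """Cut the word into chunks of two, cased 'lower, upper'.

    A leftover single letter at the end becomes an upper-case chunk on its
    own, so every chunk still ends in a capital.
    """
    chunks = []
    for i in range(0, len(plain), 2):
        pair = plain[i:i + 2]
        if len(pair) == 2:
            chunks.append(pair[0].lower() + pair[1].upper())
        else:
            chunks.append(pair.upper())
    return chunks


print(" ".join(lower_upper_chunks("voynich")))  # vO yN iC H
```

The letters now stay in plaintext order, and the result has the same shape as before: an optional lower case letter followed by a mandatory capital.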
1. This determination was mostly made by “gut feeling”. I haven’t given it a thorough statistical check.
2. This can be attributed to any number of reasons, starting with a faulty recognition of the VM character set.
3. One could ask: Why even bother with the single letter chunks? They could be introduced to denote plaintext word boundaries, but obviously they would fulfill that purpose only in 50% of all cases.










