The Chaocipher algorithm is finally revealed!

mosher
Super member
Joined: May 26th, 2009, 10:24 am

July 2nd, 2010, 10:37 am #1

Working on John F. Byrne's Chaocipher challenge messages starting in August 2008 has been a long but enlightening, educational, and enjoyable journey for me. Together with many other amateur cryptanalysts and correspondents, I burned plenty of midnight oil and went down many dead ends trying to determine the underlying Chaocipher system. This forum contains many thoughts and hypotheses raised and disproved along the way. Speaking for myself, I am a better cryptanalyst now than I was when I began.

The story of how John F. Byrne's Chaocipher material came to be acquired by the National Cryptologic Museum has been told elsewhere. Now, based on John F. Byrne's personal papers, ninety-two years after it was invented and fifty-seven years after it was published, it's time to reveal how the true Chaocipher system worked.

You can download the first of several papers, this one entitled "Chaocipher Revealed: The Algorithm", from the Chaocipher Clearing House web site.

I hope to follow up very shortly with full descriptions of how Exhibits 1 and 4 in "Silent Years" were enciphered, revealing the unknown challenge plaintexts embedded within each of these exhibits.

Enjoy!

Moshe
Last edited by mosher on October 11th, 2010, 12:08 pm, edited 1 time in total.

jdege
NSA worthy
Joined: December 7th, 2006, 8:43 pm

July 2nd, 2010, 4:29 pm #2

I'd sort of figured that he was doing something completely off-the-wall - something that nobody had considered. The auto-key aspect was hinted at in some of the descriptions. The continual permuting of the alphabets on each disk is - or so it appears to me - an entirely new idea, and not one that I've ever heard anyone mention as a possibility.
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.

james
Elite member
Joined: May 14th, 2010, 3:48 pm

July 3rd, 2010, 12:29 am #3

Yes, the Chao algorithm is novel in the sense that none of the proposed models got anywhere near it. It's also quite impractical, and it's now quite understandable why Byrne had no success in selling it.

The interest is, I suggest, how to solve it given ciphertext only (I mean without the plaintext given in Byrne's exhibits) and no knowledge of the starting alphabets on the two wheels.

fiziwig
Elite member
Joined: June 17th, 2010, 12:31 am

July 3rd, 2010, 3:43 am #4

Very cool.

Here are some sketches preliminary to possibly building a cylindrical model with tiles in tracks around an oatmeal box, or perhaps a short length of PVC pipe. There's no need, of course, for the two alphabets to rotate in opposite directions. They arrive at the same place no matter what the direction of rotation.


[figs 1-5: sketches of the cylindrical model]

On Edit: The shuffling action visualized in this manner suggests other "feedback track" arrangements. The top and bottom "feedback tracks" don't need to start and end where they do, nor do they have to be the lengths they are. And the two "index spots" on this model could be just about anywhere also. This could be fertile ground for experimentation. For that matter, the top and bottom feedback tracks could feed back in opposite directions instead of both going left to right.
Last edited by fiziwig on July 3rd, 2010, 4:06 am, edited 3 times in total.

james
Elite member
Joined: May 14th, 2010, 3:48 pm

July 3rd, 2010, 8:30 am #5

Superb fiziwig!

It's possible that Byrne's different Exhibits are enciphered with tracks of different lengths by shifting the index to different positions. He used 14 for Exhibit 1, but perhaps he used 7 for Exhibit 2.
Last edited by james on July 3rd, 2010, 8:39 am, edited 1 time in total.

mosher
Super member
Joined: May 26th, 2009, 10:24 am

July 3rd, 2010, 10:25 pm #6

It's still early on to categorize Chaocipher, but I believe it is an early (earliest?) exposition of Terry Ritter's patented concept of "Dynamic Substitution" (DS). As far as I know, Byrne did not have formal training in cryptography or cryptanalysis. He stumbled upon the Chaocipher concept which was ahead of its time, in the same way Thomas Jefferson invented the wheel cipher without any formal crypto education.

jdege: Truth be told, the possibility of Chaocipher being DS was raised on 23 October 2009. I was intrigued with DS being able to explain how the first 100 lines of Exhibit 1 consisted of the same 55-letter phrase ("ALLGOODQQUICK...") without repetition. Although I proposed DS, I changed my mind because we thought DS was too chaotic in its output and would not match the characteristic probability graph. Even had I pursued it, I don't believe I would have hit on Byrne's precise system. There is much to be learned when the Chaocipher post-mortem document is written <g>.

james: Chaocipher could be considered impractical mechanically, but the algorithm is simplicity itself. Performed manually, the system is highly error-prone, and the extremely high error propagation precludes it as a military cipher. Nonetheless, one wonders whether the US Signal Corps or the Navy could have implemented it mechanically and used it as a high-grade cipher system. In the Cryptologia article "The SIGCUM Story: Cryptographic Failure, Cryptologic Success" (Volume 21, Issue 4, October 1997, pages 289-316), Stephen J. Kelley tells the story of how Frank Rowlett leveraged two back-to-back messages sent using the SIGCUM (or M-228) to determine the complete rotor wirings. With Chaocipher, plaintext and ciphertext are so closely intertwined that the difference of a single character between two messages results in highly diffused and nonlinear outputs. I'd like to believe that it is possible to solve Chaocipher, but its strength versus other rotor systems needs to be reckoned with.

fiziwig: a most excellent graphic -- thank you! You are right that the system can be extended by allowing the basic parameters to change.

james: Exhibits 1 and 4 use the (zenith/nadir) = (1/14) intervals. It is certainly conceivable that Exhibits 2 and 3 may have changed these parameters. Another possibility, and one I will discuss in an upcoming paper, is that Byrne occasionally located the plaintext letter in either the left or the right alphabet, based on a prearranged pattern. This might explain why Exhibits 2 and 3 show pt/ct intervals < 9 not found in Exhibit 1.

Here's a basic challenge to this forum: now that we know how Chaocipher works, and we know the 13,500 corresponding plaintext and ciphertext pairs of Exhibit 1, it should be a simple task to derive the starting alphabets (i.e., the alphabets used to begin enciphering the very first pt/ct pair). To date it is not obvious how to begin. Here are the plaintext and ciphertext of the first 1100 characters:
ALLGOODQQUICKBROWNFOXESJUMPOVERLAZYDOGTOSAVETHEIRPARTYW
-------------------------------------------------------
CLYTZPNZKLDDQGFBOOTYSNEPUAGKIUNKNCRINRCVKJNHTOAFQPDPNCV
LTVFICOTSSLWYYIHBICFUTHXNUVKGIMVEZYWSTHEPIEWXNNGFTOGHSR
TBZXTMVGLTJXCSQXLNJTENCSVLCWRTBENZLSUVYIDAXLAFATQSRNZOP
HKYGQJTOGYSDBNVDJOWHKECRMLYWIQIFIKSCYJGCVXNSKYHRYVYEDSZ
RIFFZAQNHSOMJPORWTJOIJIPKVHZGPWQKRXDMAUEFFXIACFLCZMAFZS
JEOZIFKJCFMETESYYHZUVLFFURRHRIIFFDZMTTOVKLZOVLPVPPGVGEW
WEFRFYHKXOPKXRQSZKLCZKHZWXRJXLMVFGGFGYIFDAEINIWPOMOUVRF
BUZLAGDBCUAMFQLACRWWTUGSMPPZBRFASROYIRCAGVEYNSRTOQTDLFJ
RUTKFKASGVLVYYFVRAIYNIVJKIUWPFZBVRUEOTEJGLCGYSSNHHQTIQW
UKQASXKGSPWHRYMTQSOQBAMAPFQRLIIUGTIVBEBYXFBIUSEYHMLKGOE
CSWUHTBIZZHLBNDIWTQAMAZBMYMBEKCYKCABLYQYMELPJOWNRVFZVBR
EBVUJEQIAEMOHTGFHFFIDIQQJUAWDHLUYREUGSKTIMDWRRNONJKDPTC
JDCJNBVEOUTWXOFGRXNDKITNLOXSLZWQRDERERHLXWAMYLRVPRJFHRA
SDJWWOIWEVAVMRRNLRJMIFDHHADDQCBZWYKDVPAYNPIAXBYUKIJGVUC
ACJHFXRALOVRLZUVANABNZDZTPFQRIYCLLZYILTWJBPAFLPOIOZTBPI
USRXCDCITEEKMJBHPPYONYEGSZWGURIFIPWUMTLJYVYNEACGJXJAGCX
QPDLABSYMUDOKYDWRXCJUFPXCPBWYQPHMTAXNROBASQRZYVJXOHUXFP
BIHGGPKRFDMWTOTMKBOLBRRNOCHWLQDVNEEVXBNEGHJQQCVIEFYMEQR
XSYEWVJZTQXDEWKWSWIEEHDSNRHRCVDUYOGNGVDPRHUTYKPRAOIVCUJ
DYVLOWBMGSTFTXUVOXGZZUIIRYXSAVEPRWPKQJMSVGYBNECJOKCNMFP
Can anyone derive the starting alphabets from these 1100 characters, where the plaintext for each line of 55 characters is the same "ALLGOODQQUICK..."?
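
One building block for this challenge is a checker that tells how far a candidate pair of starting alphabets gets before it fails to reproduce the known pt/ct pairs. The following is only a minimal sketch, assuming the enciphering step as published in the Chaocipher description (zenith at position 1, nadir at position 14). The names `permute` and `first_mismatch` are made up for illustration, and the sample alphabets and text are a commonly quoted worked example, not the unknown Exhibit 1 alphabets.

```python
def permute(left, right, i):
    """Permute both 26-letter alphabets after the pt/ct pair at index i is used."""
    # Left (ciphertext) alphabet: bring the ct letter to the zenith, then
    # move the letter at zenith+1 down to the nadir (position 14, index 13).
    left = left[i:] + left[:i]
    left = left[0] + left[2:14] + left[1] + left[14:]
    # Right (plaintext) alphabet: bring the pt letter to the zenith, rotate one
    # more step, then move the letter at zenith+2 down to the nadir.
    right = right[i + 1:] + right[:i + 1]
    right = right[:2] + right[3:14] + right[2] + right[14:]
    return left, right


def first_mismatch(left, right, plaintext, ciphertext):
    """Return the index of the first pt/ct pair the candidate alphabets fail
    to reproduce, or None if they reproduce every pair."""
    for n, (p, c) in enumerate(zip(plaintext, ciphertext)):
        i = right.index(p)
        if left[i] != c:
            return n
        left, right = permute(left, right, i)
    return None


# Sample key and text (assumed worked example, NOT the Exhibit 1 key).
LEFT = "HXUCZVAMDSLKPEFJRIGTWOBNYQ"
RIGHT = "PTLNBQDEOYSFAVZKGJRIHWXUMC"
PT = "WELLDONEISBETTERTHANWELLSAID"
CT = "OAHQHCNYNXTSZJRRHJBYHQKSOUJY"

print(first_mismatch(LEFT, RIGHT, PT, CT))
```

For Exhibit 1, `plaintext` would be the 55-letter line repeated and `ciphertext` the lines above run together. A search over candidate alphabets could use the returned index to discard a wrong guess as soon as it contradicts a known pair.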

I cannot think off-hand of a classical cipher system where, given 13,500 matching pt/ct pairs, one cannot derive the internal settings. Can it be done with Chaocipher?

Moshe

Last edited by mosher on July 4th, 2010, 8:35 am, edited 3 times in total.

fiziwig
Elite member
Joined: June 17th, 2010, 12:31 am

July 4th, 2010, 3:38 pm #7

Thinking about the challenge, I built a little quick-and-dirty model with some cardboard and a bunch of 3/4" wooden cubes I had lying around the workshop. (I know, as a retired programmer I should just code up a simulation, but I get a better feel for things by having something physical to touch and manipulate.)

Anyway, the first and most obvious thing I discovered is that the shuffling process is NOT reversible unless you also know where each pt/ct pair was located relative to the index marks (zenith/nadir). The second is that when two consecutive but different pt letters encrypt to the same ct letter, those two pt letters were immediately adjacent before the first one was encrypted, with the second one encrypted being to the right of the first (using my sliding model rather than a rotating one). See the first line at pt: WN / ct: OO.
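
The second observation can be checked in a few lines of code. This is a rough sketch, assuming the published Chaocipher enciphering step (zenith at position 1, nadir at position 14). The names `encipher_with_states` and `check_adjacency` are illustrative only, and "to the right of" is taken here to mean the next index in the plaintext alphabet as written, wrapping around. The sample alphabets are a commonly quoted worked example, not the Exhibit 1 key.

```python
def permute(left, right, i):
    """Permute both alphabets after the pt/ct pair at index i is used."""
    left = left[i:] + left[:i]                          # ct letter to zenith
    left = left[0] + left[2:14] + left[1] + left[14:]   # zenith+1 letter to nadir
    right = right[i + 1:] + right[:i + 1]               # pt letter to zenith, plus one step
    right = right[:2] + right[3:14] + right[2] + right[14:]  # zenith+2 letter to nadir
    return left, right


def encipher_with_states(left, right, plaintext):
    """Encipher, also recording the right alphabet as it stood before each letter."""
    ciphertext, rights = [], []
    for p in plaintext:
        rights.append(right)
        i = right.index(p)
        ciphertext.append(left[i])
        left, right = permute(left, right, i)
    return "".join(ciphertext), rights


def check_adjacency(plaintext, ciphertext, rights):
    """For each spot where two different consecutive pt letters give the same
    ct letter, report whether the second pt letter sat immediately after the
    first in the right alphabet before the first was enciphered."""
    results = {}
    for n in range(len(plaintext) - 1):
        a, b = plaintext[n], plaintext[n + 1]
        if a != b and ciphertext[n] == ciphertext[n + 1]:
            before = rights[n]
            results[n] = before[(before.index(a) + 1) % 26] == b
    return results


# Sample key and text (assumed worked example, NOT the Exhibit 1 key).
LEFT = "HXUCZVAMDSLKPEFJRIGTWOBNYQ"
RIGHT = "PTLNBQDEOYSFAVZKGJRIHWXUMC"
PT = "WELLDONEISBETTERTHANWELLSAID"

ct, rights = encipher_with_states(LEFT, RIGHT, PT)
print(ct)
print(check_adjacency(PT, ct, rights))
```

In this sample the only such spot is pt ER / ct RR, which plays the same role as pt WN / ct OO in the first line of Exhibit 1. Each such pair pins down two neighbouring letters of the plaintext alphabet at that moment.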

Anyway, here's my model:



I'm sure there's a rigorous mathematical way to recover the starting alphabets, but I'm just fool enough to think I might be able to recover them by fiddling around with my model for a while.

--gary

mosher
Super member
Joined: May 26th, 2009, 10:24 am

July 4th, 2010, 5:26 pm #8

Hi fiziwig,

I agree with you wholeheartedly: to understand any algorithm, mathematical formula, or cipher system, a tangible, manipulable model is invaluable. When I first played with the Chaocipher algorithm I used Scrabble tiles. I had to deal with the problem that a standard Scrabble set has only one tile of each of the following letters: J, K, Q, and X <g>.

Now that the paper is out of the way I started giving thought to solving for the alphabets given the pt + ct, and moved along the same lines as your second point. I imagine something similar could be said for identical plaintext with different ciphertext. And yes, the fact that the Chaocipher output is highly dependent on the pt AND ct letters and their distance from the zeniths makes recovery quite difficult. Nonetheless, there seems to be a wealth of information in every pt/ct pair. A computer will certainly come in handy, and I believe there is a method far more efficient than brute force.

IMHO, hill-climbing or simulated annealing algorithms will not help here because of the inability to score correctly. If an alphabet is even slightly off, the resulting ciphertext after 2-3 letters is already random. This is more the case when the alphabet, during the hill-climbing process, has many wrong letters.
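
A quick way to see the scoring problem is to encipher the same plaintext twice, once with the correct alphabets and once with a copy in which just two letters are swapped, and then count the positions where the two ciphertexts agree. This is a sketch only, assuming the published Chaocipher enciphering step (zenith at position 1, nadir at position 14). `encipher`, `swap`, and `agreement` are illustrative names, and the alphabets are a commonly quoted worked example rather than an Exhibit key.

```python
def permute(left, right, i):
    """Permute both alphabets after the pt/ct pair at index i is used."""
    left = left[i:] + left[:i]                          # ct letter to zenith
    left = left[0] + left[2:14] + left[1] + left[14:]   # zenith+1 letter to nadir
    right = right[i + 1:] + right[:i + 1]               # pt letter to zenith, plus one step
    right = right[:2] + right[3:14] + right[2] + right[14:]  # zenith+2 letter to nadir
    return left, right


def encipher(left, right, plaintext):
    """Encipher plaintext, starting from the given left/right alphabets."""
    out = []
    for p in plaintext:
        i = right.index(p)
        out.append(left[i])
        left, right = permute(left, right, i)
    return "".join(out)


def swap(alphabet, a, b):
    """Return a copy of the alphabet with letters a and b exchanged."""
    return alphabet.translate(str.maketrans(a + b, b + a))


def agreement(x, y):
    """Count positions at which two equal-length strings agree."""
    return sum(c == d for c, d in zip(x, y))


# Sample key and text (assumed worked example, NOT an Exhibit key).
LEFT = "HXUCZVAMDSLKPEFJRIGTWOBNYQ"
RIGHT = "PTLNBQDEOYSFAVZKGJRIHWXUMC"
PT = "WELLDONEISBETTERTHANWELLSAID"

good = encipher(LEFT, RIGHT, PT)
bad = encipher(swap(LEFT, "O", "B"), RIGHT, PT)  # key wrong in two places
print(good)
print(bad)
print(agreement(good, bad), "of", len(PT), "positions agree")
```

The point for hill-climbing is that a score based on `agreement` gives little credit to a key that is almost right, so it offers a poor gradient to climb.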

In any case, this is the challenge at the moment: recovering the alphabets given the pt+ct. When this is finally solved we can concentrate on the real challenge: solving a ciphertext-only Chaocipher message.

Moshe
Last edited by mosher on July 4th, 2010, 5:28 pm, edited 4 times in total.

fiziwig
Elite member
Joined: June 17th, 2010, 12:31 am

July 4th, 2010, 11:55 pm #9

mosher wrote:Hi fiziwig,

I agree with you wholeheartedly: to understand any algorithm, mathematical formula, or cipher system, a tangible, manipulable model is invaluable. When I first played with the Chaocipher algorithm I used Scrabble tiles. I had to deal with the problem that a standard Scrabble set has only one tile of each of the following letters: J, K, Q, and X <g>.

...

Nonetheless, there seems to be a wealth of information with every pt/ct pair. A computer will certainly come in handy, and I believe there is a method far more efficient than brute force.

...

In any case, this is the challenge at the moment: recovering the alphabets given the pt+ct. When this is finally solved we can concentrate on the real challenge: solving a ciphertext-only Chaocipher message.

Moshe

I visited a half-dozen thrift stores yesterday looking for used Scrabble sets. No luck at all. That's why I resorted to the method I used.

However, I then discovered that loads of people are selling large lots of Scrabble tiles on eBay: http://toys.shop.ebay.com/Scrabble-/19097/i.html
Get 'em while they're hot!

I agree, the computer will be invaluable, but only after we know how to attack the problem, and I'm hoping the answer to that will come from physically manipulating the model.

Yes, solving a ciphertext-only message will be the real challenge.

--gary

kryptosfan
NSA worthy
Joined: September 4th, 2011, 6:09 am

September 17th, 2011, 5:49 am #10

mosher wrote: Chaocipher could be considered impractical mechanically, but the algorithm is simplicity itself. Performed manually, the system is highly error-prone, and the extremely high error propagation precludes it as a military cipher.

But the military never makes mistakes :D
OBKR
UOXOGHULBSOLIFBBWFLRVQQPRNGKSSO
TWTQSJQSSEKZZWATJKLUDIAWINFBNYP
VTTMZFPKWGDKZXTJCDIGKUHUAUEKCAR