Moderator: mosher

Attempting to solve Exhibit 6 using hill-climbing/simulated annealing

mosher
Super member
mosher
Super member
Joined: May 26th, 2009, 10:24 am

January 29th, 2015, 4:25 am #1

I'd like to share my so-far unsuccessful attempt to solve Exhibit 6 using hill-climbing and simulated annealing. I believe the method I've chosen is a good one, but my software has not succeeded in solving the messages. In this post I explain the proposed method, the software I wrote (apologies if it seems long), and the results I've had so far. It is my hope that other forum readers will take the method to its successful conclusion.

What data do we have to work with?

To recap what is presented in this post, we are given fifty (50) short Chaocipher messages, from 22 to 34 characters per message. We are told that all fifty messages are "in-depth", meaning that all messages begin with the identical machine settings (i.e., the left and right alphabets, and the plaintext takeoff key).

What do we need to solve for?

As alluded to in the previous paragraph, the fifty messages will be considered solved when the following have been determined:
  • The left and right starting alphabets have been determined
  • The plaintext takeoff key has been determined
  • The decrypted plaintext is clearly correct
The left and right alphabets are each a 26-character jumbled sequence of the standard alphabet. The plaintext takeoff key is a string of unknown length, consisting of the letters 'L' and 'R', and denotes whether the corresponding original plaintext letter was located in the left or right alphabet, respectively.

For a complete description of Chaocipher encryption using a plaintext takeoff key, see the original post related to Exhibit 6 and the accompanying PDF document.

What's the problem solving the Exhibit?

The key space size for solving the Exhibit is S = 26! x 26! x (2^K), where K is the length of the plaintext takeoff key. If we assume for the moment that K=20, then S = 1.71E+59, precluding solving it by brute force. As mentioned in the original post, one can guess what 'TH' (pt) would be in ciphertext, but it is not clear how to proceed from that point onwards.

At first blush, Chaocipher is not amenable to solving using hill-climbing (HC) or simulated annealing (SA). This is due to the somewhat chaotic effects of alphabet permutation in Chaocipher, making it difficult to score candidate alphabets. As I will write in the following section, I believe one can score candidate alphabets well enough to implement a HC/SA algorithm.

The proposed method: Hill Climbing/Simulated Annealing

I propose using hill-climbing/simulated annealing for finding the left/right alphabets and the plaintext takeoff key.
  • The method begins by generating random left and right alphabets.
  • The method will concentrate solely on the first N columns of each message, with N being either 5 or 6.
  • For a proposed pair of left and right alphabets, we iterate through the 2^N possible plaintext takeoff keys (i.e., from "LL..LL" to "RR..RR"). For each plaintext takeoff key, we decrypt the first N columns. The scoring function consists of summing the logarithmic weights of all decrypted plaintext for the first N columns using the particular plaintext takeoff key. We will use a separate log weight table for the initial letters.
  • In each successive cycle of the HC/SA, one or both of the alphabets is modified slightly in HC/SA fashion.
  • We use simulated annealing to avoid getting stuck in local maximas.
  • We let the software run for a large number of cycles (e.g., 1,500,000). After that we start the program over from the beginning, generating new random left and right alphabets, etc.
The method in action: An example

In this example we will relate to the first N=5 columns only.

(0) We'll use the following initial and regular letter log weights:

       Log Weights
     Initial   Regular
a:    2.041     1.863
b:    1.664     0.954
c:    1.777     1.477
d:    1.447     1.644
e:    1.398     2.114
f:    1.608     1.447
g:    1.252     1.204
h:    1.588     1.544
i:    1.750     1.869
j:    0.763     0.301
k:    0.729     0.477
l:    1.326     1.544
m:    1.547     1.398
n:    1.388     1.892
o:    1.855     1.869
p:    1.670     1.431
q:    0.276     0.477
r:    1.500     1.887
s:    1.869     1.799
t:    2.202     1.969
u:    1.135     1.431
v:    0.785     1.114
w:    1.706     1.204
x:   -0.215     0.699
y:    0.855     1.279
z:   -0.437     0.000

(1) We begin the run with the fifty (50) messages in Exhibit 6, and will only relate to the first five (5) columns of each message:

TIHUL
OYRBQ
XANQD
XANQD
XMIRE
TCYXM
. . .
TCQNO
TCBXI
BKRBO

(2) Initializing the run, generate random left and right alphabets:

Left Alphabet:  BJCKRUSHGDEQPOFXTZYIMLWANV
Right Alphabet: XCRUSQGAVOTKMPILJFEWDHYBZN

(3) Iterate through all 2^5=32 possible plaintext takeoff keys (i.e., "LLLLL", "LLLLR", "LLLRL", ... "RRRRL", "RRRRR"). For each takeoff key, decrypt the first five columns of each message, compute the sub-score for each message 5-gram using the log weights given above. By summing up all fifty sub-scores we arrive at a score for each candidate triplet (left alphabet, right alphabet, plaintext takeoff key). The highest fitness score is the fitness score for the candidate left/right alphabets.

Here are the computations for the first three takeoff keys, given the initial left/right alphabets:

Plaintext TakeOff: "LLLLL"
Message 0 (score: 7.122000): "tihul" -> "jdobn"
Message 1 (score: 7.509000): "oyrbq" -> "pwgrl"
Message 2 (score: 5.632000): "xanqd" -> "lzcpm"
Message 3 (score: 5.632000): "xanqd" -> "lzcpm"
Message 4 (score: 8.146000): "xmire" -> "lhhai"
Message 5 (score: 5.952000): "tcyxm" -> "judez"
. . .
Message 47 (score: 5.579000): "tcqno" -> "jupcq"
Message 48 (score: 7.149000): "tcbxi" -> "jureb"
Message 49 (score: 5.547000): "bkrbo" -> "xsasj"
***** Score: 347.886000

Plaintext TakeOff: "LLLLR"
Message 0 (score: 5.707000): "tihul" -> "jdobq"
Message 1 (score: 7.412000): "oyrbq" -> "pwgrf"
Message 2 (score: 4.933000): "xanqd" -> "lzcpx"
Message 3 (score: 4.933000): "xanqd" -> "lzcpx"
Message 4 (score: 7.231000): "xmire" -> "lhhab"
Message 5 (score: 7.156000): "tcyxm" -> "judeg"
. . .
Message 47 (score: 6.533000): "tcqno" -> "jupcu"
Message 48 (score: 8.309000): "tcbxi" -> "juree"
Message 49 (score: 6.677000): "bkrbo" -> "xsasu"
***** Score: 343.487000

Plaintext TakeOff: "LLLRL"
Message 0 (score: 7.122000): "tihul" -> "jdobn"
Message 1 (score: 7.020000): "oyrbq" -> "pwgml"
Message 2 (score: 6.170000): "xanqd" -> "lzctm"
Message 3 (score: 6.170000): "xanqd" -> "lzctm"
Message 4 (score: 7.397000): "xmire" -> "lhhvi"
Message 5 (score: 5.701000): "tcyxm" -> "judaz"
. . .
Message 47 (score: 5.306000): "tcqno" -> "jupwq"
Message 48 (score: 6.898000): "tcbxi" -> "jurab"
Message 49 (score: 5.146000): "bkrbo" -> "xsamj"
***** Score: 335.951000

The highest scoring is found for plaintext takeoff key "RRLLL":

Plaintext TakeOff: "RRLLL"
Message 0 (score: 7.681000): "tihul" -> "eoovp"
Message 1 (score: 7.170000): "oyrbq" -> "dlgul"
Message 2 (score: 7.462000): "xanqd" -> "bsxip"
Message 3 (score: 7.462000): "xanqd" -> "bsxip"
Message 4 (score: 7.417000): "xmire" -> "bqhai"
Message 5 (score: 6.110000): "tcyxm" -> "ebdez"
. . .
Message 47 (score: 7.578000): "tcqno" -> "ebnrf"
Message 48 (score: 7.307000): "tcbxi" -> "ebreb"
Message 49 (score: 8.237000): "bkrbo" -> "aeguf"
***** Score: 365.815000

(4) We now begin the journey of modifying one or both alphabets, scoring the candidate alphabets, and keeping track of the highest score and the associated left/right alphabets. We use Mike Cowan's CHURN technique to enable the software to extricate itself from local maximas. Here are the results of a complete run of 1,500,000 iterations of permuting the alphabets and scoring:

Command line:
argv[0] = ..\Release\SolveInDepthMessages.exe
argv[1] = -messagepath
argv[2] = ..\Data\Exhibit-6.txt
argv[3] = -cols
argv[4] = 5
argv[5] = -changeratios
argv[6] = 1:40/2:15/3:15/CO:10/L1:4/L2:4/FN:4/FO:4/CUT:4
argv[7] = -roundslimit
argv[8] = 1500000

=================================================================
Beginning a new search (#1) ...
=================================================================
bestScore (initial): 353.532000
Left Alphabet: tgdskhrxumijypvlwfqbezocan
Right Alphabet: xwhbsyznqcdvlrfomkjtuipaeg
Decryptions (RLLLR): "xvnvt", "prqpp", "nghiz", "ncrca", "xefdl", "gzgyp"
----------------------------------------------------------------
2: Accepting new best score (353.532000 -> 364.051000)
Left Alphabet: tiuazhrjgmdsypvlofqkecwxbn
Right Alphabet: iaxpqyzdvcnelrfohkjtubsmwg
Decryptions (LLLRL): "ipdyt", "hrvas", "mqabr", "mnpcl", "isfil", "gzmrs"
----------------------------------------------------------------
8: Accepting new best score (364.051000 -> 370.831000)
Left Alphabet: tibazlhnmwkjqrvdgpefscuxyo
Right Alphabet: iyxpblogvucjtkfrzednshamqw
Decryptions (LRRLR): "ipvwu", "wwrlf", "mburd", "mupza", "iaiiw", "ggmdz"
----------------------------------------------------------------
30: Accepting new best score (370.831000 -> 372.465000)
Left Alphabet: tsimaguepkhfqyjxrowvczldnb
Right Alphabet: sfcixgbuolraeztkjvnhdmqwpy
Decryptions (LLRRR): "scels", "vtncj", "kgskf", "kxnba", "sminu", "paqsj"
----------------------------------------------------------------
38: Accepting new best score (372.465000 -> 374.709000)
Left Alphabet: msitazuepjfhqylgokwbcxvdnr
Right Alphabet: sbcixgquzljfeotrmknhdavpwy
Decryptions (LLRRR): "iiolq", "mtbvm", "agsrb", "abicj", "iarfq", "wevtm"
----------------------------------------------------------------
39: Accepting new best score (374.709000 -> 376.167000)
Left Alphabet: msitazuepjfhqycgokwblxvdnr
Right Alphabet: sbcixgquzljfeotrmknhdavpwy
Decryptions (LLRRR): "iiolw", "mtbvm", "agsrb", "abicj", "irrwn", "wemtm"
----------------------------------------------------------------
43: Accepting new best score (376.167000 -> 377.690000)
Left Alphabet: msitazuepjfhqycgokwblxvndr
Right Alphabet: sbcixgquzljfeotrmknhdavpwy
Decryptions (LLRRL): "iiolw", "mtbvm", "agyrt", "abicj", "irrwn", "pemoi"
----------------------------------------------------------------
45: Accepting new best score (377.690000 -> 380.064000)
Left Alphabet: msitazuebjfhqycgovwplxkndr
Right Alphabet: sbcixgquzljfeotrmknhdavpwy
Decryptions (RLRLL): "iiolw", "mtbfm", "agyrt", "abicj", "irrwn", "pemoi"
----------------------------------------------------------------
138: Accepting new best score (380.064000 -> 384.216000)
Left Alphabet: lcgsnyavoqxmzwhjburtkpefdi
Right Alphabet: eniboaxwtjlvfmqryuzdskcpgh
Decryptions (RLLRR): "deysr", "txsdh", "lwxfi", "lmekh", "diwma", "orbnm"
----------------------------------------------------------------
142: Accepting new best score (384.216000 -> 384.480000)
Left Alphabet: lqgsnyavojxmzwhcburtkpedfi
Right Alphabet: wniboaxetjlvfmqryuzdskcpgh
Decryptions (RLLLL): "dwysr", "txsda", "lexow", "lmwkh", "dyemy", "orzwe"
----------------------------------------------------------------
262: Accepting new best score (384.480000 -> 388.308000)
Left Alphabet: nepqdxshakguvjctmbrlyziwfo
Right Alphabet: qfihkrdvmetxlpojzwsgcunbay
Decryptions (RRLRL): "jbeob", "yuccv", "reidt", "rwacm", "jjnmd", "qmzvv"
----------------------------------------------------------------
993: Accepting new best score (388.308000 -> 390.164000)
Left Alphabet: mgqdwktjanfcvyhzsiperxluob
Right Alphabet: vqoxbztwdncaurlgfkmjipyehs
Decryptions (RRRRL): "tmfvv", "hlylt", "pnczd", "pqjee", "tulhb", "ngrtt"
----------------------------------------------------------------
1583: Accepting new best score (390.164000 -> 396.077000)
Left Alphabet: hqbgrtxjepkwmiyvfsaculoznd
Right Alphabet: ahdvqyrckzwinmguobjtesflxp
Decryptions (RLLRR): "ygdla", "furjj", "rtaqu", "rmocn", "yeoso", "xasgu"
----------------------------------------------------------------
1584: Accepting new best score (396.077000 -> 397.473000)
Left Alphabet: hqbgrtxjepkwmiyvfsaculoznd
Right Alphabet: ahdvqjrckzwinmguobytesflxp
Decryptions (RLLRR): "jgdla", "furyy", "rtaqu", "rmocn", "jeoso", "xasgu"
----------------------------------------------------------------
1587: Accepting new best score (397.473000 -> 397.698000)
Left Alphabet: hqbgrtxjepkwmiydfsaculoznv
Right Alphabet: ahdvqjrckzwinmguobytesflxp
Decryptions (RLLRR): "jgdla", "furyy", "rtaqt", "rmocn", "jeoso", "xasyo"
----------------------------------------------------------------
15408: Accepting new best score (397.698000 -> 397.902000)
Left Alphabet: oaczfhpsgmuiryewdqbnxkjvtl
Right Alphabet: oebrystkvcpjhaixnwqflugzdm
Decryptions (RLLRR): "dhkeb", "oiiuu", "lbull", "lphnq", "drxze", "ftyfu"
20000: ... (bestScoreToDate=397.902000)
40000: ... (bestScoreToDate=397.902000)
----------------------------------------------------------------
40582: Accepting new best score (397.902000 -> 399.255000)
Left Alphabet: sqbycjgkrnovmzdwelfuitpaxh
Right Alphabet: optvafnkczqhlsbujgimxdrewy
Decryptions (RLRLR): "ddprd", "qaqff", "wwhai", "wsrhx", "dffpj", "zongf"
----------------------------------------------------------------
47329: Accepting new best score (399.255000 -> 400.387000)
Left Alphabet: dzjslyunhfkicarqwxbvgemotp
Right Alphabet: bgeqxcujaprokdnmhlszywfitv
Decryptions (RRLLR): "tkrpa", "iuhwc", "lnpsl", "lidlv", "tdjyb", "jrnqz"
60000: ... (bestScoreToDate=400.387000)
----------------------------------------------------------------
72873: Accepting new best score (400.387000 -> 402.128000)
Left Alphabet: klxjctwvohqambizdnsgferyup
Right Alphabet: notbmavprzkwicfhsqujglxyed
Decryptions (LLLLL): "ahwoa", "reeso", "tijcg", "tchdd", "aadat", "qkgjo"
80000: ... (bestScoreToDate=402.128000)
100000: ... (bestScoreToDate=402.128000)
120000: ... (bestScoreToDate=402.128000)
140000: ... (bestScoreToDate=402.128000)
160000: ... (bestScoreToDate=402.128000)
180000: ... (bestScoreToDate=402.128000)
200000: ... (bestScoreToDate=402.128000)
220000: ... (bestScoreToDate=402.128000)
240000: ... (bestScoreToDate=402.128000)
260000: ... (bestScoreToDate=402.128000)
280000: ... (bestScoreToDate=402.128000)
----------------------------------------------------------------
296146: Accepting new best score (402.128000 -> 403.708000)
Left Alphabet: fqikmbpuozrxghyvencaslwtjd
Right Alphabet: pqjnoukfsvxegzradmlchbytiw
Decryptions (LLLRL): "tnaxw", "sagsu", "ehcon", "euozh", "tcdrs", "mrkjm"
300000: ... (bestScoreToDate=403.708000)
320000: ... (bestScoreToDate=403.708000)
----------------------------------------------------------------
327389: Accepting new best score (403.708000 -> 409.399000)
Left Alphabet: geyairzvopmhjdqxltufnkswcb
Right Alphabet: lgkdniusojzhcvafwtrbqpyxme
Decryptions (RLLRL): "tivbb", "odskr", "fnytt", "fhujn", "tenrv", "qclwb"
340000: ... (bestScoreToDate=409.399000)
360000: ... (bestScoreToDate=409.399000)
380000: ... (bestScoreToDate=409.399000)
400000: ... (bestScoreToDate=409.399000)
420000: ... (bestScoreToDate=409.399000)
440000: ... (bestScoreToDate=409.399000)
460000: ... (bestScoreToDate=409.399000)
480000: ... (bestScoreToDate=409.399000)
500000: ... (bestScoreToDate=409.399000)
520000: ... (bestScoreToDate=409.399000)
540000: ... (bestScoreToDate=409.399000)
560000: ... (bestScoreToDate=409.399000)
580000: ... (bestScoreToDate=409.399000)
600000: ... (bestScoreToDate=409.399000)
620000: ... (bestScoreToDate=409.399000)
640000: ... (bestScoreToDate=409.399000)
660000: ... (bestScoreToDate=409.399000)
680000: ... (bestScoreToDate=409.399000)
700000: ... (bestScoreToDate=409.399000)
720000: ... (bestScoreToDate=409.399000)
740000: ... (bestScoreToDate=409.399000)
760000: ... (bestScoreToDate=409.399000)
780000: ... (bestScoreToDate=409.399000)
800000: ... (bestScoreToDate=409.399000)
820000: ... (bestScoreToDate=409.399000)
840000: ... (bestScoreToDate=409.399000)
860000: ... (bestScoreToDate=409.399000)
880000: ... (bestScoreToDate=409.399000)
900000: ... (bestScoreToDate=409.399000)
920000: ... (bestScoreToDate=409.399000)
940000: ... (bestScoreToDate=409.399000)
960000: ... (bestScoreToDate=409.399000)
980000: ... (bestScoreToDate=409.399000)
1000000: ... (bestScoreToDate=409.399000)
1020000: ... (bestScoreToDate=409.399000)
1040000: ... (bestScoreToDate=409.399000)
1060000: ... (bestScoreToDate=409.399000)
1080000: ... (bestScoreToDate=409.399000)
1100000: ... (bestScoreToDate=409.399000)
1120000: ... (bestScoreToDate=409.399000)
1140000: ... (bestScoreToDate=409.399000)
1160000: ... (bestScoreToDate=409.399000)
1180000: ... (bestScoreToDate=409.399000)
1200000: ... (bestScoreToDate=409.399000)
1220000: ... (bestScoreToDate=409.399000)
1240000: ... (bestScoreToDate=409.399000)
1260000: ... (bestScoreToDate=409.399000)
1280000: ... (bestScoreToDate=409.399000)
1300000: ... (bestScoreToDate=409.399000)
1320000: ... (bestScoreToDate=409.399000)
1340000: ... (bestScoreToDate=409.399000)
1360000: ... (bestScoreToDate=409.399000)
1380000: ... (bestScoreToDate=409.399000)
1400000: ... (bestScoreToDate=409.399000)
1420000: ... (bestScoreToDate=409.399000)
1440000: ... (bestScoreToDate=409.399000)
1460000: ... (bestScoreToDate=409.399000)
1480000: ... (bestScoreToDate=409.399000)
1500000: ... (bestScoreToDate=409.399000)

Methods used for permuting the alphabets

The software I wrote currently implements ten (10) ways for permuting an alphabet. Nine of them permute a single alphabet, while one method permutes both alphabets at the same time. Here are examples of each permuting method.

(a) SwapAlphabet (N=1,2,3,4) : Swap N pairs of letters

Example of N=1:

Before: JYOUKGQXSDTWEAZIBVCLFMPNRH
After:  JYOUKBQXSDTWEAZIGVCLFMPNRH

Example of N=2:

Before: JYOUKGQXSDTWEAZIBVCLFMPNRH
After:  JYOUKBQXMDTWEAZIGVCLFSPNRH

(b) Cyclic Shift Left (shift=1,2): Shift the alphabet <shift> places to the left

Example of shift=1:

Before: RAJBYSLIMTONXDKHEFWPGQCVUZ   <-- shift one position
After:  AJBYSLIMTONXDKHEFWPGQCVUZR

(c) Flip neighboring cells: Exchange adjacent letters

M  P  G  Z  C  W  K  F  O  Y  A  H  J  R  E  V  T  I  U  B  X  N  D  S  Q  L
^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^
|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
+--+  |  |  +--+  |  |  +--+  |  |  +--+  |  |  +--+  |  |  +--+  |  |  +--+
  1   +--+    3   +--+    5   +--+    7   +--+    9   +--+   11   +--+   13
        2           4           6           8          10          12

Before: MPGZCWKFOYAHJREVTIUBXNDSQL
After:  PMZGWCFKYOHARJVEITBUNXSDLQ

(d) Flip opposing cells: Flip alphabet horizontally

M  P  G  Z  C  W  K  F  O  Y  A  H  J  R  E  V  T  I  U  B  X  N  D  S  Q  L
^  ^  ^                          ^  ^  ^  ^                          ^  ^  ^
|  |  |                          |  |  |  |                          |  |  |
|  |  |  ......................  |  +--+  |  ......................  |  |  |
|  |  |                          +--------+                          |  |  |
|  |  +--------------------------------------------------------------+  |  |
|  +--------------------------------------------------------------------+  |
+--------------------------------------------------------------------------+

Before: MPGZCWKFOYAHJREVTIUBXNDSQL
After:  LQSDNXBUITVERJHAYOFKWCZGPM

(e) Cut the alphabet: Equivalent to a "Cyclic Shift Left of 13"

Before: RAJBYSLIMTONXDKHEFWPGQCVUZ   <-- shift 13 positions
After:  DKHEFWPGQCVUZRAJBYSLIMTONX

(f) Cross-over: Genetic operator on both alphabets

This is the only operation that affects both alphabets simultaneously. The operation is borrowed from genetic programming and consists of the following steps:
  • Generate a random mask of 0s and 1s of length 26.
  • All characters in both alphabets corresponding to the 0s will stay as they are.
  • Those corresponding to the 1s will be rearranged in random order in the alphabet.
For example, we begin with the following left and right alphabets:

Left Alphabet:      WINFUJMGCDXYVRPEOQLABKTZSH
Right Alphabet:     YRLGPAQUKOZBHEXCVIMTSNDFJW

We generate a random 26-character random string consisting of 0s and 1s:

Crossover String: 01001000110010100011010100

Extract the pairs of letters in the left and right alphabets corresponding to 0s, leaving them in their current positions. All the other letters in the alphabets are jumbled randomly and distributed back into their respective alphabets:

Left Alphabet:      WINFUJMGCDXYVRPEOQLABKTZSH
Right Alphabet:     YRLGPAQUKOZBHEXCVIMTSNDFJW
Crossover String:   01001000110010100011010100

(Transfer pairs corresponding to 0s)

Left Alphabet (0):  W.NF.JMG..XY.R.EOQ..B.T.SH
Right Alphabet (0): Y.LG.AQU..ZB.E.CVI..S.D.JW

(Identify pairs corresponding to 1s)

Left Alphabet (1):  .I..U...CD..V.P...LA.K.Z..
Right Alphabet (1): .R..P...KO..H.X...MT.N.F..

(Scramble letters corresponding to 1s to get permuted alphabets)

Left Alphabet (*):  WZNFIJMGDCXYARPEOQKLBUTVSH
Right Alphabet (*): YMLGHAQUOKZBPENCVIXFSTDRJW

The software I wrote

I wrote a C++ program to implement the proposed simulated annealing method described above, and am uploading the source code. The code consists of two C++ source files: main.cpp and CGetCmdOpts.cpp. In addition, there is a file called Exhibit-6.txt which contains all fifty (50) Exhibit 6 messages, one message per line.
  • main.cpp: implements the proposed method
  • CGetCmdOpts.cpp: code for parsing the command line options
To run, compile the two source files to produce an executable (e.g., SolveExhibit6.exe), then execute the program by passing the appropriate command line options:
  • -messagepath: absolute or relative path to the Exhibit 6 messages
  • -cols: specifies the number of starting columns to process
  • -changeratios: specifies which alphabet permutaters to use, and how frequently to use each. Frequencies must add up to exactly 100.
  • -roundslimit: number of cycles to perform before resetting the test and starting over again
Regarding "changeratios", here is a list of recognized permutaters:
  • <n> : Swap <n> alphabet pairs in one alphabet, n=1,2,3,4
  • L<n> : Cyclic shift left <n> places (one alphabet), n=1,2
  • FN : Flip neighboring cells
  • FO : Flip opposing cells
  • CUT : Cut the alphabet
  • CO : Crossover
A "changeratios" string consists of one or more permutaters together with its usage ratio, where the sum of all the ratios must equal 100. The permutaters are separated by '/'. Here are examples of "changeratios" strings:
  • 1:40/2:15/3:15/CO:10/L1:4/L2:4/FN:4/FO:4/CUT:4
  • 1:10/2:15/3:20/4:25/CO:30
  • 1:40/2:15/3:15/CO:10/L1:4/L2:4/FN:4/FO:4/CUT:4
  • 1:30/2:15/3:15/4:10/CO:25/FN:2/FO:2/CUT:1
  • 1:100
In the first example in the list, the permutaters will be selected randomly with the following ratios: 40%, 15%, 15%, 10%, 4%, 4%, 4%, 4%, and 4% of the time.

And finally, here are full command line examples:
  • -messagepath Exhibit-6.txt -cols 5 -changeratios 1:40/2:15/3:15/CO:10/L1:4/L2:4/FN:4/FO:4/CUT:4 -roundslimit 1500000
  • -messagepath Exhibit-6.txt -cols 6 -changeratios 1:10/2:15/3:20/4:25/CO:30 -roundslimit 700000
  • -messagepath ..\Data\Exhibit-6.txt -cols 5 -changeratios 1:50/CO:50 -roundslimit 1000000
My results so far

Although the theory seems right to me, in practice the program is not capable of locating the correct solution. In all runs, the maximal score reached when processing the first five (5) letters of messages averages about 408, which corresponds to random letters tending towards plaintext. Although each test runs through 1,500,000 cycles, no significant maximas are being located. The permutaters are meant to perform shotgun hill-climbing to spread out the search, while the simulated annealing prevents us from getting stuck in local maximas.

In an earlier version of the program I used Mike Cowan's method of scoring using tetragrams, but found that the intermediate candidate plaintexts were so far from viable English that the scoring was useless. I therefore changed to scoring individual letters, with special scoring for initial letters.

So my question is: why isn't the program finding the correct solution? Am I doing something wrong? Can someone improve the algorithm?

Moving forward: The Challenge to You

Having presented the general proposal for solving Exhibit 6 using hill-climbing and simulated annealing, I intend to continue improving the hill-climbing algorithm. Nonetheless I'm handing the baton to other forum readers, and look forward to hearing that someone reaches better results than I have.
Exhibit_6.txt (1.67 KiB)
CGetCmdOpt.h (1 KiB)
CGetCmdOpt.cpp (5.2 KiB)
main.cpp (24.14 KiB)
Exhibit_6.txt (1.67 KiB)
CGetCmdOpt.h (1 KiB)
CGetCmdOpt.cpp (5.2 KiB)
main.cpp (24.14 KiB)
Last edited by mosher on April 8th, 2015, 6:26 pm, edited 20 times in total.
Quote
Like
Share

james
Elite member
james
Elite member
Joined: May 14th, 2010, 3:48 pm

February 14th, 2015, 12:31 pm #2

This looks like the original cipher text put up by D & K -- just as impossible to solve.
Quote
Like
Share

mosher
Super member
mosher
Super member
Joined: May 26th, 2009, 10:24 am

February 14th, 2015, 6:57 pm #3

Hi james,

Can you elaborate on your remark? We know the precise Chaocipher flavor used by K&D in their Exhibit 5 (i.e., classic Chaocipher with a plaintext takeoff key). I'm assuming they used the same system for Exhibit 6. I'm assuming my SA method, although logical, is not finding the correct solution, and the question is why not? Unlike Exhibit 5, which consisted of three short messages, here we have a depth of fifty (50) messages. My feeling is not to point a finger at K&D (yet!).

You're a hill-climbing/SA veteran. Where am I going wrong?
Quote
Like
Share