Cryptography is the science of writing messages that no one except the intended receiver can read. Cryptanalysis is the science of reading them anyway. "Crypto" comes from the Greek 'krypte' meaning hidden or vault and "Graphy" comes from the Greek 'grafik' meaning writing. The words, characters or letters of the original intelligible message constitute the Plain Text (PT). The words, characters or letters of the secret form of the message are called Cipher Text (CT) and together constitute a Cryptogram.
Cryptograms are roughly divided into Ciphers and Codes.
William F. Friedman defines a Cipher message as one produced by applying a method of cryptography to the individual letters of the plain text taken either singly or in groups of constant length. Practically every cipher message is the result of the joint application of a General System (or Algorithm) or method of treatment, which is invariable, and a Specific Key, which is variable at the will of the correspondents and controls the exact steps followed under the general system. It is assumed that the general system is known by the correspondents and the cryptanalyst. [FRE1]

A Code message is a cryptogram which has been produced by using a code book consisting of arbitrary combinations of letters, entire words, figures substituted for words, partial words, or phrases of PT. Whereas a cipher system acts upon individual letters or definite groups taken as units, a code deals with entire words or phrases or even sentences taken as units. We will look at both types of systems in this course.

The process of converting PT into CT is Encipherment. The reverse process of reducing CT into PT is Decipherment.

Cipher systems are divided into two classes: substitution and transposition. A Substitution cipher is a cryptogram in which the original letters of the plain text, taken either singly or in groups of constant length, have been replaced by other letters, figures, signs, or combinations of them in accordance with a definite system and key. A Transposition cipher is a cryptogram in which the original letters of the plain text have merely been rearranged according to a definite system. Modern cipher systems use both substitution and transposition to create secret messages.
The fundamental difference between substitution and transposition methods is that in the former the normal or conventional values of the letters of the PT are changed, without any change in the relative positions of the letters in their original sequences, whereas in the latter only the relative positions of the letters of the PT in the original sequences are changed, without any changes to the conventional values of the letters. Since the methods of encipherment are radically different in the two cases, the principles involved in the cryptanalysis of the two types of ciphers are fundamentally different. We will look at methods for determining whether a cipher has been enciphered by substitution or transposition.
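The contrast can be checked mechanically. The sketch below is only an illustration: the three-place shift and the reverse-the-text rearrangement are toy stand-ins chosen for brevity, not systems from this lecture, and the function names are hypothetical. It shows that a transposition leaves the letter counts of the plain text untouched, while a substitution keeps every letter in place but changes its value.

```python
from collections import Counter

def shift_substitute(text, shift=3):
    """Toy substitution: replace each lowercase letter by the one `shift` places later."""
    return "".join(chr((ord(c) - 97 + shift) % 26 + 65) for c in text)

def reverse_transpose(text):
    """Toy transposition: keep the letters, rearrange them (here, simply reversed)."""
    return text[::-1].upper()

pt = "attackatdawn"
sub = shift_substitute(pt)
trans = reverse_transpose(pt)

# Transposition preserves the letter counts of the plain text.
same_counts = Counter(trans.lower()) == Counter(pt)
# Substitution preserves the repetition pattern, position by position.
same_pattern = [pt.index(c) for c in pt] == [sub.index(c) for c in sub]

print(sub, trans, same_counts, same_pattern)
```

This is why a frequency count that looks like normal English, with E, T, A on top, points toward transposition, while a count with the normal shape but the wrong letters on top points toward substitution.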
Probably the most popular amateur cipher is the simple substitution cipher. We see them in newspapers. Kids use them to fool teachers, lovers send them to each other to arrange special meetings, and they have been used by the Masons, secret Greek societies and fraternal organizations. Current gangs in the Southwest use them to do drug deals. They are found in literature, like "The Gold-Bug" by Edgar Allan Poe, and in death threats by the infamous Zodiac killer in San Francisco in the late 1960's.
The Aristocrats (A1-A25) in the Aristocrats Column of "The Cryptogram" are all simple substitution ciphers in English. Each English plain text letter in all its occurrences in the message is replaced by a unique English ciphertext letter. Mathematically, this is a one-to-one mapping. It is considered unethical (and a possible wedge for the analyst) to let a plaintext letter stand for itself, that is, to use the same letter as its ciphertext substitute.
A recurring theme of my lectures is that all substitution ciphers have a common basis in mathematics and probability theory. The basis language of the cipher doesn't matter as long as it can be characterized mathematically. Mathematics is the common link for deciphering any language substitution cipher. Based on mathematical principles, we can identify the language of the cryptogram and then break open its contents.

William F. Friedman presents the fundamental operations for the solution of practically every cryptogram:

According to the Navy Department OP-20-G Course in Cryptanalysis, the solution of a substitution cipher generally progresses through the following stages:

Since this is a course in Cryptanalysis, let's start cracking some open.

While reading the newspaper you see the following cryptogram. Train your eye to look for wedges or 'ins' into the cryptogram. Assume that we are dealing with English and that we have simple substitution. What do we know? Although short, there are several entries for solution. Number the words. Note that it is a quotation (12, 13 words with * represent a proper name in ACA lingo).
A-1. Elevated thinker. K2 (71) LANAKI
Note words 1 and 6 could be: ' The....That' and words 3 and 5 use the same 4 letters I T A M . Note that there is a flow to this cryptogram The _ _ is? _ _ and? _ _. Titles either help or should be ignored as red herrings. Elevated might mean "high" and the thinker could be the proper person. We also could attack this cipher using pattern words (lists of words with repeated letters put into thesaurus form and referenced by pattern and word length) for words 2, 3, 6, 9, and 11.

Filling in the cryptogram using the [ The... That] assumption we have:

Not bad for a start. We find the ending e_t might be 'est'. A two letter word starting with t_ is 'to'. Word 8 is 'are'. So we add this part of the puzzle. Note how each wedge leads to the next wedge. Always look for confirmation that your assumptions are correct. Have an eraser ready to go back a step if necessary. Keep a tally of which letters have been placed correctly. Mark those that are unconfirmed guesses with a ?. Piece by piece, we build on the opening wedge.

Now we have some bigger wedges. The s_h is a possible 'sch' from German. Word 9 could be 'surrounded.' Z = i. The name could be Albert Schweitzer. Let's try these guesses. Word 2 might be 'highest' which goes with the title.

The final message is: The highest knowledge is to know that we are surrounded by mystery. Albert Schweitzer.

OK, that's the message, but what do we know about the keying method?

Ciphertext alphabets are generally mixed for more security, with an easy mnemonic to remember as a translation key. ACA ciphers are keyed in K1, K2, K3, K4 or K()M for mixed variety. K1 means that a keyword is used in the PT alphabet to scramble it. K2 is the most popular and uses a keyword to scramble the CT alphabet. K3 uses the same keyword in both PT and CT alphabets; K4 uses different keywords in the PT and CT alphabets. A keyword or phrase is chosen that can easily be remembered. Duplicate letters after the first occurrence are deleted.

Following the keyword, the balance of the letters are written out in normal order. A one-to-one correspondence with the regular alphabet is maintained. A K2M mixed keyword sequence using the word METAL and key DEMOCRAT might look like this:
4 2 5 1 3
M E T A L
=========
D E M O C
R A T B F
G H I J K
L N P Q S
U V W X Y
Z
The CT alphabet would be taken off by columns and used:
CT: OBJQX EAHNV CFKSY DRGLUZ MTIPW
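The take-off above can be reproduced in a few lines. This is a minimal sketch, assuming the column word has no repeated letters (true for METAL); the helper names `keyed_sequence` and `columnar_takeoff` are made up for illustration.

```python
import string

def keyed_sequence(keyword):
    """Keyword with repeated letters dropped, followed by the unused letters in normal order."""
    seen = []
    for c in keyword.upper() + string.ascii_uppercase:
        if c not in seen:
            seen.append(c)
    return "".join(seen)

def columnar_takeoff(sequence, column_word):
    """Write `sequence` in rows under `column_word`, then read the columns
    in the alphabetical order of the letters of `column_word`."""
    width = len(column_word)
    columns = [sequence[i::width] for i in range(width)]
    order = sorted(range(width), key=lambda i: column_word[i])
    return [columns[i] for i in order]

groups = columnar_takeoff(keyed_sequence("DEMOCRAT"), "METAL")
print(" ".join(groups))
```

The first column taken is the one under A (numbered 1), then E, L, M and T, which matches the numbering 4 2 5 1 3 written over METAL.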
Going back to A-1. Since it is keyed as a K2, we set up the PT alphabet as a normal sequence and fill in the CT letters below it. Do you see the keyword LIGHT?
PT  a b c d e f g h i j k l m n o p q r s t u v w x y z
CT  Q R S U V W X Y Z L I G H T A B C D E F J K M N O P
                      ---------
KW = LIGHT
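The recovered alphabet can be rebuilt from the keyword alone. The sketch below assumes the K2 alphabet is the keyed sequence slid nine places to the right, so that L falls under j as in the table above; the function names are hypothetical.

```python
import string

def keyed_sequence(keyword):
    """Keyword with repeated letters dropped, followed by the unused letters in normal order."""
    seen = []
    for c in keyword.upper() + string.ascii_uppercase:
        if c not in seen:
            seen.append(c)
    return "".join(seen)

def k2_alphabet(keyword, offset):
    """CT alphabet: the keyed sequence rotated right so the keyword starts at `offset`."""
    seq = keyed_sequence(keyword)
    return seq[-offset:] + seq[:-offset]

ct_alphabet = k2_alphabet("LIGHT", 9)   # L sits under j (offset 9)
encipher_table = str.maketrans(string.ascii_lowercase, ct_alphabet)
decipher_table = str.maketrans(ct_alphabet, string.ascii_lowercase)

print(ct_alphabet)
print("know".translate(encipher_table))
print("FYV".translate(decipher_table))
```

Enciphering "know" gives the four letters I T A M noted earlier in the solution of A-1.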
In tough ciphers, we use the above key recovery procedure to go back and forth between the cryptogram and keying alphabet to yield additional information.
To summarize the eyeball method:
A working knowledge of the letters, their characteristics, their relations with each other, and their favorite positions in words is very valuable in solving substitution ciphers.

Friedman was the first to employ the principle that English letters are mathematically distributed in a uniliteral frequency distribution:
13  9  8  8  7  7  7  6  6  4  4  3  3  3  3  2  2  2  1  1  1  -  -  -  -  -
 E  T  A  O  N  I  R  S  H  L  D  C  U  P  F  M  W  Y  B  G  V  K  Q  X  J  Z

That is, in each 100 letters of text, E has a frequency (or number of appearances) of about 13; T, a frequency of about 9; K Q X J Z appear so seldom that their frequency is a low decimal.
Other important data on English ( based on Hitt's MilitaryText):
 6 Vowels: A E I O U Y                            = 40 %
20 Consonants:
    5 High Frequency   (D N R S T)                = 35 %
   10 Medium Frequency (B C F G H L M P V W)      = 24 %
    5 Low Frequency    (J K Q X Z)                =  1 %
                                                    ====
                                                    100.%

The four vowels A, E, I, O and the four consonants N, R, S, T form 2/3 of the normal English plain text. [FR1]

Friedman gives a Digraph chart taken from Parker Hitt's Manual on p22 of reference [FR2].
The most frequent English digraphs per 200 letters are:
TH--50    AT--25    ST--20
ER--40    EN--25    IO--18
ON--39    ES--25    LE--18
AN--38    OF--25    IS--17
RE--36    OR--25    OU--17
HE--33    NT--24    AR--16
IN--31    EA--22    AS--16
ED--30    TI--22    DE--16
ND--30    TO--22    RT--16
HA--26    IT--20    VE--16
The most frequent English trigraphs per 200 letters are:
THE--89    TIO--33    EDT--27
AND--54    FOR--33    TIS--25
THA--47    NDE--31    OFT--23
ENT--39    HAS--28    STH--21
ION--36    NCE--27    MEN--20
Frequency of Initial and Final Letters:

Letters-- A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Initial-- 9 6 6 5 2 4 2 3 3 1 1 2 4 2 10 2 - 4 5 17 2 - 7 - 3 -
Final  -- 1 - 1017 6 4 2 - - 1 6 1 9 4 1 - 8 9 11 1 - 1 - 8 -
Relative Frequencies of Vowels:
A 19.5% E 32.0% I 16.7% O 20.2% U 8.0% Y 3.6%
Average number of vowels per 20 letters, 8.
Beker and Piper partition the English letters into 5 groups based on their Table 1.1. [STIN], [BP82]
Table 1.1   Probability Of Occurrence of 26 Letters

Letter  Probability     Letter  Probability
  A        .082           N        .067
  B        .015           O        .075
  C        .028           P        .019
  D        .043           Q        .001
  E        .127           R        .060
  F        .022           S        .063
  G        .020           T        .091
  H        .061           U        .028
  I        .070           V        .010
  J        .002           W        .023
  K        .008           X        .001
  L        .040           Y        .020
  M        .024           Z        .001
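As a quick cross-check of Table 1.1 against the Hitt figures quoted earlier, the probabilities can be summed by group. This is only a sketch: the dictionary is copied from the table, and the two groupings (the six vowels, and the nine letters E T A O N I R S H) are the ones named elsewhere in this lecture.

```python
# Probabilities copied from Table 1.1.
P = {
    "A": .082, "B": .015, "C": .028, "D": .043, "E": .127, "F": .022,
    "G": .020, "H": .061, "I": .070, "J": .002, "K": .008, "L": .040,
    "M": .024, "N": .067, "O": .075, "P": .019, "Q": .001, "R": .060,
    "S": .063, "T": .091, "U": .028, "V": .010, "W": .023, "X": .001,
    "Y": .020, "Z": .001,
}

vowel_share = sum(P[c] for c in "AEIOUY")        # compare with Hitt's 40 %
high_share = sum(P[c] for c in "ETAONIRSH")      # the nine most frequent letters

print(round(vowel_share, 3), round(high_share, 3))
```

The vowel share comes out at about 40 percent and the nine-letter group at about 70 percent, so the two sources agree closely even though they were counted on different texts.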
Groups:
ELCY gives data for English, German, French, Italian, Spanish and Portuguese in her Appendices, p218 ff. She also gives tables of letter contact data. [ELCY]

LANAKI published data on English and 10 different languages as well as expanded work on Chinese. It is available at the CDB. [NIC1] [NIC2]

S-TUCK gives detailed English, French and Spanish letter characteristics in her book. [TUCK]

Friedman in his Military Cryptanalytics Part I - Volume 1 gives charts showing the lower and upper limits of deviation from theoretical (random) for the number of vowels, high, low, medium frequency consonants, and blanks in distributions for plain text and random text for messages of various lengths. [FR1]

Friedman in his Military Cryptanalytics Part I - Volume 2 gives a veritable potpourri of statistical data on letter frequencies, digraphs, trigraphs, tetragraphs, grouped letters, relative log data, special purpose data, pattern words, idiomorphic data, standard endings, initials, foreign language data [German, French, Italian, Spanish, Portuguese and Russian], classification of systems used in concealment, nulls and literals. [FR2]

Sinkov assigns log frequencies to digraphs to aid in identification. The procedure is explained by Friedman. [FR1] [SINK]
"ACA and You" presents general properties of English letters. [ACA]
Foster presents detailed letter characteristics based on the Brown Corpus. [CCF]

Don L. Dow puts out a clever computer cryptogram game which does frequency analysis and is user friendly for very simple Aristocrats. {Available as shareware} [DOW]

Depending on the basis text we choose, we find variations in the frequency of letters. For example, literary English gives slightly different results than frequencies based on military or ordinary English text.

Hahn presented Literary English Letter Usage Statistics based on "A Tale of Two Cities" by Charles Dickens as follows: [HAHN]
Total letter count = 586747      Total doubled letter count = 14421

Letter use frequencies:          Doubled letter frequencies:
E: 72881  12.4%                  LL: 2979  20.6%
T: 52397   8.9%                  EE: 2146  14.8%
A: 47072   8.0%                  SS: 2128  14.7%
O: 45116   7.6%                  OO: 2064  14.3%
N: 41316   7.0%                  TT: 1169   8.1%
I: 39710   6.7%                  RR: 1068   7.4%
H: 38334   6.5%                  PP:  628   4.3%
S: 36770   6.2%                  FF:  430   2.9%
R: 35946   6.1%                  NN:  301   2.0%
D: 27487   4.6%                  CC:  243   1.6%
L: 21479   3.6%                  MM:  207   1.4%
U: 16218   2.7%                  DD:  201   1.3%
M: 14928   2.5%                  GG:   99   0.6%
W: 13835   2.3%                  BB:   41   0.2%
C: 13223   2.2%                  ZZ:   13   0.0%
F: 13152   2.2%                  AA:    2   0.0%
G: 12121   2.0%                  HH:    1   0.0%
Y: 11849   2.0%
P:  9452   1.6%
B:  8163   1.3%
V:  5044   0.8%
K:  4631   0.7%
Q:   655   0.1%
X:   637   0.1%
J:   623   0.1%
Z:   213   0.0%
Total initial letters = 135664 Total ending letters = 135759
Initial letter frequencies:      Ending letter frequencies:
T: 20665  15.2%                  E: 26439  19.4%
A: 15564  11.4%                  D: 17313  12.7%
H: 11623   8.5%                  S: 14737  10.8%
W:  9597   7.0%                  T: 13685  10.0%
I:  9468   6.9%                  N: 10525   7.7%
S:  9376   6.9%                  R:  9491   6.9%
O:  8205   6.0%                  Y:  7915   5.8%
M:  6293   4.6%                  O:  6226   4.5%
B:  5831   4.2%                  F:  5133   3.7%
C:  4962   3.6%                  G:  4463   3.2%
F:  4843   3.5%                  H:  3579   2.6%

Top digraphs:
TH: 17783    RE: 8139    ED: 6217    IS: 5566
HE: 17226    ND: 7793    AT: 6200    NG: 5564
IN: 10783    HA: 6611    EN: 5849    IT: 5559
ER: 10172    ON: 6464    HI: 5730    OR: 4915
AN:  9974    OU: 6418    TO: 5703    AS: 4836

Time to put to good use the barrage of data presented. Given the next slightly harder cryptogram, and ignoring again a pattern word attack, we can develop some useful tools. [Much of what I am covering can be done automatically by computer but then your brain goes mushy for failure to understand the process.]
A-2. [no clue] S-TUCK
V W H A Z S J X I H   S K I M F   M W C G M V   W O J S I F -
A G F J A Q   Q M N R J K Z M G R S W M F.  J A T W   X H -
A W F.  F I Q Q W F F X I H   F K H B A O Z   J S M A H H F.
T G A H P K D   X M A W O V F S A R F   X H K I M A F S.

[ Hyphens mean a continuation of a word.]
First we perform a CT Frequency Count.
 F  A  H  M  W  S  I  J  K  X  G  Q  O  R  V  Z  T  B  C  D  N  P
13 11  9  9  8  7  6  6  5  5  4  4  3  3  3  3  2  1  1  1  1  1
We have 106 letters. 20% are considered low frequency. 20% of 106 = 21. Counting from right to left we have O, R, V, Z, T, B, C, D, N, P. We mark A-2 with a dot over each appearance. We also enter the frequency data under the CT.
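The count and the low-frequency cut can be done by machine as a check on the hand tally. This is a sketch under two assumptions: the word divisions are the ones that fit the solution given later in this lecture, and the 20% rule is read as "take letters from the rare end for as long as the running total stays within 20% of the text".

```python
from collections import Counter

# A-2 with word divisions; the hyphenated line breaks are rejoined.
A2 = ("VWHAZSJXIH SKIMF MWCGMV WOJSIFAGFJAQ QMNRJKZMGRSWMF JATW XHAWF "
      "FIQQWFFXIH FKHBAOZ JSMAHHF TGAHPKD XMAWOVFSARF XHKIMAFS")

counts = Counter(A2.replace(" ", ""))
total = sum(counts.values())

# Collect letters from the rare end until the 20% budget would be exceeded.
budget = round(0.20 * total)
low, used = [], 0
for letter, n in sorted(counts.items(), key=lambda kv: kv[1]):
    if used + n > budget:
        break
    low.append(letter)
    used += n

print(total, counts.most_common(5))
print(sorted(low), used)
```

The ten letters selected account for 19 of the 106 letters; adding either of the letters with a count of 4 would pass the budget of 21.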
Next we develop a CT Letter Position Chart.
                                          deduced
      F  :  I 2 3 - 3 2 E                 PT equiv's
A    11  :  / / ..... /// /                  i
B     1  :  .                                v
C     1  :  /                                w
D     1  :  /                                x
F    13  :  / / ..... / /////                s
G     4  :  / /                              a
H     9  :  // // . / / //                   l
I     6  :  / ... //                         u
J     6  :  // / .. /                        t
K     5  :  // / . /                         o
M     9  :  / // / .. //                     r
N     1  :  /                                y
O     3  :  / /                              n
P     1  :  /                                b
Q     4  :  / / . /                          c
R     3  :  .. /                             p
S     7  :  / / .... /                       h
T     2  :  / /                              m
V     3  :  / . /                            d
W     8  :  / // .. / / /                    e
X     5  :  /// //                           f
Z     3  :  .. /                             g
    ===
    106
Columns give the frequency (F), then tallies for the initial letter (I), the second and third letters, the antepenultimate (3) and penultimate (2) letters, and the end letter (E). Dots mark any other position in the word.
ANALYSIS of A-2. Using Vowel Selection Method.
The Vowel Selection Method is: 1) separate the vowels from the consonants, 2) assign vowel identities, 3) assign identities to consonants.
A-2. [no clue]  S-TUCK

1                     2           3             4
.       .                             .     .     .
V W H A Z S J X I H   S K I M F   M W C G M V   W O J S I F -
3 8 9 + 3 7 6 5 6 9   7 5 6 9 *   9 8 1 4 9 3   8 3 6 7 6 *

              5                             6         7
                  . .     .     .               .
A G F J A Q   Q M N R J K Z M G R S W M F.  J A T W   X H -
+ 4 * 6 + 4   4 9 1 3 6 5 3 9 4 3 7 8 9 *   6 + 2 8   5 9

        8                     9               10
                                    .   . .
A W F.  F I Q Q W F F X I H   F K H B A O Z   J S M A H H F.
+ 8 *   * 6 4 4 8 * * 5 6 9   * 5 9 1 + 3 3   6 7 9 + 9 9 *

11              12                      13
.       .   .           . .       .
T G A H P K D   X M A W O V F S A R F   X H K I M A F S.
2 4 + 9 1 5 1   5 9 + 8 3 3 * 7 + 3 *   5 9 5 6 9 + * 7
(two digit figures F=13=* ; A=11=+)
Vowels contact the low frequency letters more often than do consonants, about 80% of the time. We use the S-TUCK method combined with our text. [ELCY] [TUCK]

We go through A-2, writing down the contact letters on both sides of each low frequency CT letter. We tally one for each contact. If a CT letter is between two low frequency letters we tally 2. Contacts for low frequency letters touching each other = 0. We do not count N or R in word 2, and in word 1, W contacts V, so W is tallied with 1. A and S contact Z, so both A and S are credited. We get:
/////  ////   //     ///    ///    //     ///    //     //
W      A      S      G      M      J      K      H      F
Low Frequency Contacts for A-2.
                     Second
            A    E    I    O    U    Y
        A   0    0   .4    0   .1   .3     Total nonpairs = 5.1%
        E  .7   .4   .2   .1    0   .2     pairs          = 0.7%
First   I  .2   .4    0   .7    0    0
        O  .1   .1   .1   .3  1.0    0
        U  .1   .1   .1    0    0    0
        Y   0   .1    0   .2    0    0

ELCY tells us quite a bit about vowel behavior.
ELCY defines high frequency letter behavior.
About 70% of the language is made up of E, T, A, O, N, I, R, S,H. This high frequency group has three cliques.
Class I.   T, O, S appear frequently both as initials and finals; terminal O in short words like "to". All double freely.
Class II.  A, I, H appear frequently as initials, but rarely as finals, especially A, I. They do not readily double.
Class III. E, N, R appear frequently as finals, less frequently as initials, and frequently double, especially E; N and R not so often.

When one of these letters changes its class, the least likely exchange is one occurring between Class II and III.
ELCY gives us tips for identifying consonants:
We return now to solution of A-2.
From the number of their contacts, W and A are most likely vowels. G, K, M are next most likely.
We look at these letters in the position table:
W. has the looks of E even though it is not the most frequent.
A. cannot be A, so it might be I, but its frequency may be too high.
G. and K. have inside positions and look like vowels but cannot be identified.
M. might be O by frequency but is confused with R.

A study of A-2 shows that W and A reverse, which might be ei and ie. AG reverses, which might be io or ia. M repeats, and reverses with W and G. It most likely is R, not O. K does not contact W, A, G or M. We mark the cipher with W A G K as vowels and M as a consonant, putting in the assumed values.
A-2. [no clue]  S-TUCK

1                     2           3             4
d e l i g h t f u l   h o u r s   r e   a r d   e   t h   s
. v c v .     c v c     v v c c   c v . v c .   v .     v c
V W H A Z S J X I H   S K I M F   M W C G M V   W O J S I F -
3 8 9 + 3 7 6 5 6 9   7 5 6 9 *   9 8 1 4 9 3   8 3 6 7 6 *

              5                             6         7
i a s t i c   c r     t o g r     h e r s   t i       f l
v v c   v c   c c . .   v . c v .   v c c     v . v   c c
A G F J A Q   Q M N R J K Z M G R S W M F.  J A T W   X H -
+ 4 * 6 + 4   4 9 1 3 6 5 3 9 4 3 7 8 9 *   6 + 2 8   5 9

        8                     9               10
i e s   s u c c e s s f u l   s o l   i   g   t h r i l l s
v v c   c v c c v c c c v c   c v c . v . .       c v c c c
A W F.  F I Q Q W F F X I H   F K H B A O Z   J S M A H H F.
+ 8 *   * 6 4 4 8 * * 5 6 9   * 5 9 1 + 3 3   6 7 9 + 9 9 *

11              12                      13
  a i l   o     f r i e   d s h i   s   f l   u   i s h
. v v c . v .   c c v v . . c   v . c   c c v v c v c
T G A H P K D   X M A W O V F S A R F   X H K I M A F S.
2 4 + 9 1 5 1   5 9 + 8 3 3 * 7 + 3 *   5 9 5 6 9 + * 7
Using NYPHO's robot rule, in Word 1, J X I H, one must be a vowel. Word 8 shows that F X I H contains a vowel. Word 1 suggests the ending 'ful', so X = f and H = l. Examine X I H: the I is in the vowel position (inner position). So the vowels are now W A G K I. From its end position, F = s. In words 4 and 11, GA reverses, so G cannot be u, for ui is not a reversal. We try KI = ou, therefore G = a. Put these into the above cipher tableau. Word 5 breaks the two c's, so Q = c. Word 1 might be delightful, so V = d and ZSJ = ght. Remember the second letter position favors vowels. [ROBO]

The message reads: Delightful hours reward enthusiastic cryptographers. Time flies. Successful solving thrills. Mailbox friendships flourish. KW = K1 = salutory.

Pattern words are words for which one or more letters are repeated such as awkward, successful, interesting, unusually. Aegean Park Press publishes pattern word books from 3 - 16 letters. Pattern word lists are indexed by key letters or figures or by vowel consonant relationships. [BARK] Pattern words give a quick wedge into the cryptogram. One of the best Pattern Word Dictionaries is the Cryptodyct. [GODD]

The Crypto Drop Box has the TEA computer program which gives automated pattern searching and anagramming up to 20 words. It is a very effective tool.
In A-2 we find a prize in word 8. Using a key letter approach:

A B C C D A A E B F
F I Q Q W F F X I H

or

1 2 3 3 4 1 1 5 2 6   = (334) 11526  [10L]
F I Q Q W F F X I H
The first pattern found on page 310 Appendix of [CCF] is successful. The Cryptodyct uses the latter indexing method and under 10 letter words we find that the 334 11526 pattern equals successful.
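The figure pattern is easy to compute, which is all a pattern word search needs. The sketch below numbers each letter by its order of first appearance; the short `candidates` list is a hypothetical stand-in for a real pattern word dictionary.

```python
def pattern(word):
    """Number each letter by order of first appearance, e.g. 1 2 3 3 4 1 1 5 2 6."""
    first_seen = {}
    return [first_seen.setdefault(c, len(first_seen) + 1) for c in word.upper()]

# Hypothetical mini word list standing in for a pattern word dictionary.
candidates = ["successful", "juggernaut", "interesting", "thunderstorm"]

target = pattern("FIQQWFFXIH")
matches = [w for w in candidates if pattern(w) == target]

print(target)
print(matches)
```

Only words with the same length and the same repetition structure survive, so a ten letter word with a doubled letter and a tripled letter narrows the field very quickly.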
Cryptographers generate their own special lists:
Transposals: from, form; night, thing; mate, meat;
Queer words: adieu, crwth, eggglass, giaour, meaow
Consonant sequences: dths, lcht, ncht, rids, ngst, rths
Favorite ins: people, crypt, success,

Using the TEA model, it was necessary to assume the vowels at u and e for a 1u22e445u6 template to get successful and juggernaut on the first try.

Non Pattern word lists are those with words that do not have even one repeated letter, such as come, wrath, journey. They are very useful in attacking Patristocrats and very difficult Risties.
OMAR gave us this fine list in order of frequency:
CRYPT WORDS ABOUT KNOWS BELOW OKAPI SWORD BLACK ALONG AFTER NEGRO EXTRA PLACE THREW WATCH CRAZY CAUSE UNDER FIRST SIXTY WRONG WHILE CROWD DRUNK UPSET FOUND STUDY ANGRY PLUMB EMPTY YIELD
We will come back to it in the Patty section.
Also in the CDB is a program called ASOLVER which automates the Digram solution method to get the best fit.

Dr. Raj Wal summarized Barker's Vowel Preferences data. He also developed cross correlation coefficients for each letter. Foster details this work in his book. [CCF]

This handy little table gives us an entry when needed. It is correct more times than it fails.
Word Length    Position                  Preferences
one            1                         V
two            1 2                       V C
three          1 2 3                     C C -
four           1 2 3 4                   C V - C
five           1 2 3 4 5                 C C V C C
six            1 2 3 4 5 6               C V C - - C
seven          1 2 3 4 5 6 7             C V C C - - C
eight plus     1 2 3 4 5 . . Final       C C - - - - - C
Note the vowel preference in the second column. S-TUCK describes a method that uses the above table for long word cryptograms. She lines the words up under each other and compares the letter positions with each other. Using the columnar method (named by Sherlack) on A-2 we would have found an incredible four of the vowels! The same process of marking the low frequency consonants and word endings would have given us about half the letters. Wayne Barker developed a course based on this method. [BAR2]

CODEX, MICROPOD and ZYZZ are among the best tough "risties" constructors. A tough ristie is a fascinating form of simple substitution with word division in which the message is of no importance whatever and the encipherer's full attention has been given to the manipulation of letter characteristics. Both ELCY and S-TUCK present versions of George C. Lamb's Variety of Contact or Consonant Line Approach. I shall use ELCY's version and example and expand the consonant line approach to make it more understandable. We start with:
A-3. No clue. Author Bosley No. 19. CM. June 1936.

1                 2             3
U W Y M N X K A   E H X R B Z   U V X M U W B Z

4                       5           6
O Y Z T W H V C X Y A   C Y A U Z   D B R A H V K B A;

7                         8         9
Z W S V A H K U Z B K C,  M S C X   C Y X B S,

10
X V Z Y T R Y C X P.   (104L)

The object is to isolate a small group of consonants. Whereas frequency data can be manipulated, variety of contact data cannot. We start with 1) a list of CT contacts in order of appearance of the letters and 2) rearrange these CT letters in order of decreasing variety of contacts.
A-3. Contacts

5U6   4W7   7Y9   3M5   1N2   8X10  4K7   6A7   1E1   4H6   3R5   6B8
---   ---   ---   ---   ---   ---   ---   ---   ---   ---   ---   ---
-|W   U|Y   W|M   Y|N   M|X   N|K   X|A   K|-   -|H   E|X   X|B   R|Z
-|V   U|B   O|Z   X|U         H|R   V|B   Y|-         W|V   B|A   W|Z
M|W   T|H   X|A   -|S         V|M   H|U   Y|U         A|V   T|Y   D|R
A|Z   Z|S   C|A               C|Y   B|C   R|H         A|K         K|A
K|Z         C|X               C|-         B|-                     Z|K
            Z|T               Y|B         V|H                     X|S
            R|C               -|V
                              C|P

7Z6   5V8   1O1   2T4   6C5   1D1   3S5   1P1
---   ---   ---   ---   ---   ---   ---   ---
B|-   U|X   -|Y   Z|W   V|X   -|B   W|V   X|-
B|-   H|C         Y|R   -|Y         M|C
Y|T   H|K               K|-         B|-
U|-   S|A               S|X
-|W   X|Z               -|Y
U|B                     Y|X
V|Y

Variety of Contact Table (VOC):

Freq:  8  7  6  5  4  4  6  5  4  7  /  3  3  6  3  /  2  1  1  1  1  1
VOC:  10  9  8  8  7  7  7  6  6  6  /  5  5  5  5  /  4  2  1  1  1  1
CT:    X  Y  B  V  W  K  A  U  H  Z  /  M  R  C  S  /  T  N  E  O  D  P
We start with the position that 20% of the text represented by variety count are consonants. 20% of 104 = about 21. The line of demarcation is between R and C but 4 letters have the same VOC of 5: M, R, S, C. If we take one, we must take all, and one of these most likely is a vowel. The key to solution is the VOC "step up" versus "step down" observation. Vowels tend to step up and Consonants tend to step down. [i.e. 3M5 is a step up of 2 points and 6C5 is a step down of one point.]
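The frequency, variety of contact and step figures can all be derived from the cryptogram itself. The sketch below takes the variety of contact of a letter to be the number of different letters it touches inside words, which reproduces the figures in the contact table; the word list is A-3 split at its word divisions.

```python
from collections import Counter, defaultdict

A3 = ["UWYMNXKA", "EHXRBZ", "UVXMUWBZ", "OYZTWHVCXYA", "CYAUZ",
      "DBRAHVKBA", "ZWSVAHKUZBKC", "MSCX", "CYXBS", "XVZYTRYCXP"]

freq = Counter("".join(A3))
contacts = defaultdict(set)
for word in A3:
    for a, b in zip(word, word[1:]):   # adjacent letters inside a word
        contacts[a].add(b)
        contacts[b].add(a)

voc = {c: len(contacts[c]) for c in freq}
step = {c: voc[c] - freq[c] for c in freq}   # positive = step up, negative = step down

# Print in the 8X10 style of the contact table, highest variety first.
for c in sorted(freq, key=lambda c: (-voc[c], c)):
    print(f"{freq[c]}{c}{voc[c]}  step {step[c]:+d}")
```

Letters tied on variety are printed alphabetically here, so the order inside a tie may differ from the hand-made table.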
M, R, S all step up; C steps down 1 point and most likely is a consonant. We develop a separation line and place the contacts on each side of the consonant line, starting from the right of the VOC table.
First Consonant Line

   C T N E O D P
---------------------
       V |
       X | XXXX
      YY | YYY
       K |
       S |
       Z |
         | W
         | R
       M |
         | H
         | B
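The consonant line is a tally of what stands immediately before and immediately after each assumed consonant. A minimal sketch, with a hypothetical helper name, that rebuilds the first line for the seven letters C T N E O D P:

```python
from collections import Counter

A3 = ["UWYMNXKA", "EHXRBZ", "UVXMUWBZ", "OYZTWHVCXYA", "CYAUZ",
      "DBRAHVKBA", "ZWSVAHKUZBKC", "MSCX", "CYXBS", "XVZYTRYCXP"]

def consonant_line(words, consonants):
    """Tally the letters standing just left and just right of each assumed consonant."""
    left, right = Counter(), Counter()
    for word in words:
        for i, c in enumerate(word):
            if c in consonants:
                if i > 0:
                    left[word[i - 1]] += 1
                if i + 1 < len(word):
                    right[word[i + 1]] += 1
    return left, right

left, right = consonant_line(A3, "CTNEODP")
for letter in sorted(set(left) | set(right)):
    print(f"{letter * left[letter]:>6} | {letter * right[letter]}")
```

Calling the same function with "CTNEODPAU" gives the tallies for the next stage, after A and U have been accepted as consonants.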
If any letter does not appear at all below the line, that letter is most likely a consonant. A and U fall into this category. We add these to the analysis:
Second Consonant Line

   C T N E O D P A U
---------------------
      VV | V          mark X and Y as Vowels
       X | XXXX       (vowel) both step up
    YYYY | YYY        (vowel) with high VOC
     KKK |
       S |
       Z | ZZ         consonant (step down)
         | WWW        test as h
       R | R
      MM |
         | HHH
       B | B
         | U
       A |
We shift to A-3 and mark in the suspected consonants.
A-3. No clue. Author Bosley No. 19. CM. June 1936. cont

1                 2             3
U W Y M N X K A   E H X R B Z   U V X M U W B Z
- - o - - o - -   - o o - o -   - - o - - - o -

4                       5           6
O Y Z T W H V C X Y A   C Y A U Z   D B R A H V K B A;
- o - - - o - - o o -   - o - - -   - o - - o - - o -

7                         8         9
Z W S V A H K U Z B K C,  M S C X   C Y X B S,
- - o - - o - - - o - -   - o - o   - o o o o

10
X V Z Y T R Y C X P.   (104L)
o - - o - - o - o -
n and h turn up on the right and left side of the consonant line freely. w and h are candidates. Since h=H, then w might equal h. Digrams such as sh or ch are prevalent. W is the second position in word 7, which tentatively confirms the PT h and suggests that Z is a consonant (step down). B is a step up, as is S. The third word confirms, but the 9th word has four vowels. Hmm? K and H are both possibilities for vowels. Word 4 tends to favor the H. So:
Final Consonant Line

   C T N E O D P A U W Z
---------------------
     VVV | V          mark X and Y as Vowels
       X | XXXX       (vowel) both step up
   YYYYY | YYYYY      (vowel) with high VOC
     KKK |
       S | S          vowel low freq? =u?
      ZZ | ZZ         consonant (step down)
         | WWWW       test as h
       R | R
      MM |
         | HHHH
     BBB | BBB        vowel
    UUUU | U          consonant
       A |            consonant
       T | T          consonant
Let me fill in where ELCY stops. A-3 has vowels and consonants separated. We have the PT letter h. Word 9 is either clever or wrong. Using Barker's Pattern List on p39, we find bayou and miaou. The same reference gives us thunderclaps for word 7. Although that is not correct, we find thunderstorm matching the pattern under 819710/12W, and word 8 suggests puma. The final message reads: shipyard zealot snapshot kitchenmaid midst goldenrod; thunderstorm, puma miaou, anticlimax.

The TEA database yields words: thunderstorm and anticlimax. The reader is invited to reconstruct the keywords, if any.
Try this Aristocrat.
A-4. Fire, fire burning bright. by Ah Tin Dhu.

1           2           3           4           5
A B C D E   A C F G H   I C J F H   K C I B L   K F B H L

6           7           8           9           10
K C M J N   O M J P I   B H L M C   M R S P E   B C A I H

11          12          13          14          15
T I A U H.  K U M C E   V D U H P.  S C F G D   J W B I L

16          17          18          19
J S U M L   D U V N P,  V E O M L   C F G L E.
To solve by using non-pattern words, find 3 or 4 words in the cipher having several letters in common. Under one of these write 5 or 6 words from the non-pattern list. We will use OMAR's list given previously. Note the initials and final letters and letter positions of the trial words. In A-4, K is an initial and L is a terminal. Choose the non-pattern words to conform with this requirement. We write the common letters under the trial word and try to make a clear message out of the balance of the CT. Word 5 has K, BHL and F.
   K F B H L   A C F G H   K C I B L   B H L M C
1  b l a c k       l   c   b     a k   a c k
2  c r a z y       r   z   c     a y   a z y
3  w r o n g       r   n   w     o g   o n g
4  c r o w d       r   w   c     o d   o w d
5  d r u n k       r   n   d     u k   u n k
6  f o u n d       o   n   f     u d   u n d
Line 6 suggests arson, fraud and under. Putting this into the ristie we get:
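The trial table is mechanical to produce: each trial word fixes five CT letters, and those values are carried into the other cipher words that share letters with it. A small sketch, with a hypothetical function name and an underscore for every letter not yet known:

```python
def carry_over(trial_ct, trial_pt, other_words):
    """Assume trial_ct stands for trial_pt and fill the known letters into other CT words."""
    key = dict(zip(trial_ct, trial_pt))
    return [" ".join(key.get(c, "_") for c in word) for word in other_words]

others = ["ACFGH", "KCIBL", "BHLMC"]
for trial in ["black", "crazy", "wrong", "crowd", "drunk", "found"]:
    print(trial, carry_over("KFBHL", trial, others))
```

Only the last trial leaves fragments that read like English in all three words at once, which is what makes it the line to follow.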
1           2           3           4           5
b u r   y   b r o w n   a r s o n   f r a u d   f o u n d
A B C D E   A C F G H   I C J F H   K C I B L   K F B H L

6           7           8           9           10
f r e         e     a   u n d e r   e       y   u r b a n
K C M J N   O M J P I   B H L M C   M R S P E   B C A I H

11          12          13          14          15
c a b i n   f i e r y       i n       r o w         u a d
T I A U H.  K U M C E   V D U H P.  S C F G D   J W B I L

16          17          18          19
    i e d     i           y   e d   r o w d y
J S U M L   D U V N P,  V E O M L   C F G L E.
All the vowels are identified, along with r and n. The message is "Burly brown arson fraud found fresh vesta under empty urban cabin. Fiery glint. Prowl squad spied light, gyved rowdy."

PHOENIX has compiled a list of articles (page 2) concerning ARISTOCRATS between 1932 - 1993 in "The Cryptogram Index," available through the ACA. On page 27, he lists additional references on simple substitution. Articles by B.NATURAL and S-TUCK are especially useful. [INDE]

Solve these cryptograms, recover the keywords, and send your solutions to me for credit. Be sure to show how you cracked them. If you used a computer program, please provide "gut" details. Answers do not need to be typed but should be generously spaced and not in RED color. Let me know what part of the problem was the "ah ha", i.e. the light of inspiration that brought forth the message to you.
A-1. Bad design. K2 (91) AURION

V G S E U L Z K W U F G Z G O N G M V D G X Z A J U =
X U V B Z H B U K N D W V O N D K X D K U H H G D F =
N Z X U K Y D K V G U N A J U X O U B B S
X D K K G B P Z K D F N Y Z B U L Z .

A-2. Not now. K1 (92) BRASSPOUNDER

K D C Y L Q Z K T L J Q X C Y M D B C Y J Q L : " T R
H Y D F K X C , F Q M K X R L Q Q I Q H Y D L
M K L D X C T W R D C D L Q J Q M N K X T M B
P T B M Y E Q L K F K H C Y L Q Z K T L T C . "

A-3. Ms. Packman really works! K4 (101) APEX DX

* Z D D Y Y D Q T Q M A R P A C , * Q A K C M K
* T D V S V K . B P W V G Q N V O M C M V B : L D X V
K Q A M S P D L V Q U , L D B Z I U V K Q F P O
W A M U X V , E M U V P X Q N V , U A M O Z
N Q K L M O V ( S A P Z V O ) .

A-4. Money value. K4 (80) PETROUSHKA

D V T U W E F S Y Z C V S H W B D X P U Y T C Q P V
E V Z F D A E S T U W X Q V S P F D B Y P Q Y V D A F S ,
H Y B P Q P F Y V C D Q S F I T X P X B J D H W Y Z .

A-5. Zoology lesson. K4 (78) MICROPOD

A S P D G U L W , J Y C R S K U Q N B H Y Q I X S P I N
O C B Z A Y W N = O G S J Q O S R Y U W , J N Y X U
O B Z A ( B C W S D U R B C ) T B G A W U Q E S L.
* C B S W
[ACA]  ACA and You, Handbook For Members of the American Cryptogram Association, 1995.
[BARK] Barker, Wayne G., "Cryptanalysis of The Simple Substitution Cipher with Word Divisions," Aegean Park Press, Laguna Hills, CA, 1973.
[BAR1] Barker, Wayne G., "Course No 201, Cryptanalysis of The Simple Substitution Cipher with Word Divisions," Aegean Park Press, Laguna Hills, CA, 1975.
[B201] Barker, Wayne G., "Cryptanalysis of The Simple Substitution Cipher with Word Divisions," Course #201, Aegean Park Press, Laguna Hills, CA, 1982.
[BP82] Beker, H., and Piper, F., "Cipher Systems, The Protection of Communications," John Wiley and Sons, NY, 1982.
[CCF]  Foster, C. C., "Cryptanalysis for Microcomputers," Hayden Books, Rochelle Park, NJ, 1990.
[DOW]  Dow, Don. L., "Crypto-Mania, Version 3.0," Box 1111, Nashua, NH 03061-1111, (603) 880-6472. Cost $15 for registered version and available as shareware under CRYPTM.zip on CIS or zipnet.
[ELCY] Gaines, Helen Fouche, "Cryptanalysis," Dover, New York, 1956.
[GODD] Goddard, Eldridge and Thelma, "Cryptodyct," Marion, Iowa, 1976.
[FR1]  Friedman, William F. and Callimahos, Lambros D., "Military Cryptanalytics Part I - Volume 1," Aegean Park Press, Laguna Hills, CA, 1985.
[FR2]  Friedman, William F. and Callimahos, Lambros D., "Military Cryptanalytics Part I - Volume 2," Aegean Park Press, Laguna Hills, CA, 1985.
[FRE]  Friedman, William F., "Elements of Cryptanalysis," Aegean Park Press, Laguna Hills, CA, 1976.
[HAHN] Hahn, Karl, "Frequency of Letters," English Letter Usage Statistics using as a sample "A Tale of Two Cities" by Charles Dickens, Usenet SCI.Crypt, 4 Aug 1994.
[INDE] PHOENIX, "Index to the Cryptogram: 1932-1993," ACA, 1994.
[NIC1] Nichols, Randall K., "Xeno Data on 10 Different Languages," ACA-L, August 18, 1995.
[NIC2] Nichols, Randall K., "Chinese Cryptography Part 1," ACA-L, August 24, 1995.
[OP20] "Course in Cryptanalysis," OP-20-G, Navy Department, Office of Chief of Naval Operations, Washington, 1941.
[ROBO] NYPHO, The Cryptogram, Dec 1940, Feb 1941.
[SINK] Sinkov, Abraham, "Elementary Cryptanalysis," The Mathematical Assoc of America, NYU, 1966.
[STIN] Stinson, D. R., "Cryptography, Theory and Practice," CRC Press, London, 1995.
[TUCK] Harris, Frances A., "Solving Simple Substitution Ciphers," ACA, 1959.

Throughout my lectures, PT will be shown in lower case. CT will be shown in upper case. As a convention, Plain text will generally be shown above the Cipher text equivalent.

Notes
A = Aristocrats, P = Patristocrats, X = Xenocrypts

Any typo errors are my responsibility. I probably fell asleep at the keyboard. Please advise and I will correct them as well as put out an erratum sheet at the end of the course. Students may want to start a 3" permanent binder with separators for the various lectures and materials.
1. Intro - First Principles - Global Mathematical Nature
2. Keyword Systems and Conventions Used
3. Simple Substitution Cryptanalysis without/with Complexities
   a. Eyeball
   b. Frequency Distributions - General Nature of English Letters
   c. Friedman Techniques - Random vs Expected - Spaces and a Wealth of Tables: Digram, Trigram, and more
   d. C. C. Foster Techniques
   e. S-Tuck Techniques
   f. Pattern Words
   g. ELCY : Consonant Line Attack
   h. Sinkov Techniques
   i. Barker's Vowel Separation and Position Table
   j. Non Pattern Words: "Dooseys"
   k. SI SI Patterns
   l. CM References for Risties
   m. Relationship to XENOS: French and German Solutions
   n. Computer Program Aids - TEA Database, CDB, ABACUS, Computer Supplement
   o. References
4. Homework Problems
5. Variant Substitution Systems
   a. Friedman
   b. Waxton

Next lecture we will cover the balance of the outline material and jump into Patristocrats.

Text converted to HTML on April 25, 1998 by Joe Peschel.
Any mistakes you find are quite likely mine. Please let me know about them by e-mailing:
jpeschel@aol.com.
Thanks.
Joe Peschel