Attacking Turkish Texts Encrypted by Homophonic
Transkript
Attacking Turkish Texts Encrypted by Homophonic
ATTACKING TURKISH TEXTS ENCRYPTED BY HOMOPHONIC CIPHER (HOMOFONĠK ġĠFRELENMĠġ TÜRKÇE METĠNLER ÜZERĠNE ATAKLAR) by Şefik İlkin SERENGİL, B.S. Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE in COMPUTER ENGINEERING in the INSTITUTE OF SCIENCE AND ENGINEERING of GALATASARAY UNIVERSITY June 2011 ATTACKING TURKISH TEXTS ENCRYPTED BY HOMOPHONIC CIPHER (HOMOFONĠK ġĠFRELENMĠġ TÜRKÇE METĠNLER ÜZERĠNE ATAKLAR) by Şefik İlkin SERENGİL, B.S. Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE Date of Submission : May 18, 2011 Date of Defense Examination: June 17, 2011 Supervisor : Asst. Prof. Dr. Murat AKIN Committee Members : Assoc. Prof. Dr. Tankut ACARMAN Assoc. Prof. Dr. Y. Esra ALBAYRAK Acknowledgements I would like to thank with my whole heart to my supervisor, Asst. Prof. Dr. Murat Akın, who taught me a lot of things in my academic life. I am grateful to my beloved mother and father who do not avoid sacrificing anything from the day I was born to ensure me to have the best conditions. May, 2011 ġefik Ġlkin Serengil ii Table of Contents Acknowledgements ........................................................................................................... ii Table of Contents ............................................................................................................. iii List of Figures ................................................................................................................... v List of Tables ................................................................................................................... vi Abstract .......................................................................................................................... viii Résumé............................................................................................................................. ix Özet ................................................................................................................................... x 1. Introduction ............................................................................................................... 1 2. Classical Cryptography ............................................................................................. 2 2.1. Monoalphabetic Substitution ............................................................................. 2 2.1.1. Shift Cipher ................................................................................................. 2 2.1.2. Substitution Cipher ..................................................................................... 3 2.1.3. Affine Cipher ............................................................................................ 15 2.2. Block Ciphers ................................................................................................... 16 2.2.1. Permutation Cipher ................................................................................... 16 2.2.2. Polygraphic Substitution ........................................................................... 17 2.2.2.1. Playfair Cipher................................................................................... 17 2.2.2.2. Hill Cipher ......................................................................................... 18 2.2.3. Polyalphabetic Substitution Ciphers ......................................................... 22 2.2.3.1. Vigenere Cipher................................................................................. 22 2.2.3.2. Kasiski Attack ................................................................................... 25 3. Homophonic Cipher ................................................................................................ 27 4. A New Approach On Attacking Homophonic Cipher ............................................ 32 4.1. 5. High Frequent N-grams Consisting of Low Frequent Unigrams ..................... 33 Conclusion ............................................................................................................... 49 References ....................................................................................................................... 50 Appendix ......................................................................................................................... 52 Biographical Sketch ........................................................................................................ 54 iv List of Figures Figure 2.1 The Encryption Schema of the Shift Cipher ................................................... 2 Figure 2.2 An Illustration of Frequency Values of the Turkish Unigrams ....................... 5 Figure 2.3 Illustration of the Permutation Cipher ........................................................... 17 Figure 2.4 Pseudocode of Dividing Ciphertext Into Blocks to Attack ........................... 24 Figure 3.1 Domain Set and Target Set of Homophonic Cipher ...................................... 28 Figure 3.2 Illustration of Frequency Distribution of a Homophonic Encrypted Text .... 28 Figure 3.3 Substitution Table of Homophonic Cipher ................................................... 29 Figure 3.4 Algorithm for Computing the Key Space of Homophonic Cipher for Turkish ........................................................................................................................................ 30 List of Tables Table 2.1 Substitution Cipher Encryption Table .............................................................. 3 Table 2.2 Percentage of Frequencies of Turkish Unigrams [2] ........................................ 4 Table 2.3 The Most Frequent 100 Turkish Words [1] ...................................................... 6 Table 2.4 Frequencies of 600 Most Frequent Bigrams in Turkish within 11M ............... 7 Table 2.5 Frequencies of 600 Most Frequent Trigrams in Turkish within 11M .............. 9 Table 2.6 Frequencies of 600 Most Frequent Tetragrams in Turkish within 11M ......... 11 Table 2.7 Frequencies of 600 Most Frequent Pentagrams in Turkish within 11M ........ 13 Table 2.8 Index Values of the Letter of the Turkish Alphabet ....................................... 15 Table 2.9 Encryption Process of Affine Cipher .............................................................. 15 Table 2.10 Decryption Process of Affine Cipher ........................................................... 15 Table 2.11 The Encryption Key of Playfair Cipher ........................................................ 18 Table 2.12 Mathematical Illustration of Encyrption Process of Vigenere Cipher .......... 22 Table 2.13 Mathematical Illustration of Decryption Process of Vigenere Cipher ......... 23 Table 2.14 Encryption Process of Vigenere Cipher by Using of Vigenere Table .......... 23 Table 2.15 Vigenere Table for Turkish Alphabet ........................................................... 23 Table 2.16 Distances Between Repating Characters ...................................................... 25 Table 3.1 Percentage of Turkish Letter Frequencies and Expression Count in Homophonic Cipher [2] .................................................................................................. 29 Table 3.2 Comparison of Key Space Values of Common Algorithms ........................... 31 Table 4.1 Expression Count of Most Common 100 Words of Turkish within Homophonic Cipher ........................................................................................................ 33 Table 4.2 Frequencies and Expression Count of High Frequent Bigrams Consisting of Low Frequent Unigrams in 11M .................................................................................... 34 Table 4.3 Frequencies and Expression Count of High Frequent Trigrams Consisting of Low Frequent Unigrams in 11M .................................................................................... 35 Table 4.4 Frequencies and Expression Count of High Frequent Tetragrams Consisting of Low Frequent Unigrams in 11M ................................................................................ 36 Table 4.5 Frequencies and Expression Count of High Frequent Pentagrams Consisting of Low Frequent Unigrams in 11M ................................................................................ 37 Table A.1 List of Novels Used in the Corpus ................................................................. 53 vii Abstract Emerging technologies make the vital operations performing through insecure channels. At this point, transferring private information between parties and authentication of transferred data can only be possible by the use of cryptography. The security of the cryptographic methods should be examined to test for perfection. Therefore, it is always supposed that the cryptographic method is known on cryptographic attacks. In this work, firstly classical encryption methods and vulnerabilities for Turkish language are reviewed. Herein, homophonic cipher comes one step forward in the alternative classical encryption methods because it generates ciphertexts consisting of variable block sizes. This makes well known attacking models invalid. Therefore, a novel attacking model is aimed to develop for Homophonic cipher in Turkish. This model is investigated on a large data source aiming to detect the characteristic features of Turkish language. Key words: Turkish n-gram Frequencies, Homophonic Substitution, Cryptoanalysis of Encrypted Turkish Texts. Résumé Le développement de la technologie offre la possibilité d’effectuer des opérations vitales sur des canaux dont la sécurité n’est pas sûre. A ce point-là, le transfert de l’information privée entre les parties et l’authentification de ces données transférées sont possibles seulement grâce à l’utilisation de la cryptographie. Pour pouvoir examiner la perfection d’une méthode cryptographique, il faut d’abord questionner la sécurité qu’il offre à l’utilisateur. Par conséquence, on suppose toujours que la méthode du chiffrement est connue quand on parle d’une attaque cryptographique. Dans ce travail, premièrement on révise les méthodes de chiffrement classique et on parle des côtés vulnérables de la langue turque. Le chiffrement homophonique se diffère des autres méthodes de chiffrement grâce à sa particularité qui rend possible de produire des chiffres de blocs de taille variable. Cette caractéristique rend inefficaces les attaques effectuées en utilisant des méthodes bien connues. Pour cette raison, le but de ce travail est de proposer un nouveau modèle d’attaque propre aux textes turcs cryptés par le chiffrement homophonique. Le modèle est examiné sur une grande base de données afin d’évaluer les caractéristiques de la langue turque. Mots clés: Les fréquences des n-gram turcs, la substitution homophonique, cryptanalyse des textes turcs. Özet GeliĢen teknoloji ile güvenli olmayan kanallardan hayati iĢlemlerin gerçekleĢtirilmesi günümüz dünyasında oldukça yaygın bir Ģekilde yer almaktadır. Bu noktada, gizli bir bilginin partiler arasında aktarılması ve aktarılan bilgilerin doğruluğunun taahhüt edilmesi ancak ancak kriptografi ile mümkün olmaktadır. Bir kriptografik yöntemin mükemmelliğinin sınanabilmesi için güvenliği sorgulanabilmelidir. Bu nedenle, kriptografik ataklarda hep kriptografik yöntemlerin bilindikleri varsayılır. Bu çalıĢmada öncelikle kriptografinin temelini oluĢturan klasik yöntemler ve Türkçe’ye özgü zafiyetlere değinilmiĢtir. Bu noktada homofonik Ģifreleme, değiĢken blok boyutunda Ģifreler üretmesi sebebiyle alternatif Ģifreleme yöntemleri arasında kendini belli etmektedir. Bu da bilinen yöntemler ile saldırıları etkisiz kılmaktadır. Bu nedenle, homofonik ĢifrelenmiĢ Türkçe metinler için özgün bir saldırı modeli geliĢtirilmesi amaçlanmıĢtır. Bunun için oldukça geniĢ bir veri kaynağı üzerinde inceleme yapılmıĢ ve Türkçe’nin karakteristik özellikleri kestirilmeye çalıĢılmıĢtır. Anahtar Sözcükler: Türkçe n-gram Frekansları, Homofonik Yer DeğiĢtirme, Türkçe ġifrelenmiĢ Metinlerin Kriptoanalizi. 1 1. Introduction Cryptography could be implemented to provide data security, integrity and authentication. Moreover, the cryptosystems have to be resistant against to attacks. Therefore, detection of the vulnerebalities of a cryptographic method is a valueable processus. Hence, it is aimed to mention the main concepts of secret key cryptography and it is planned to focus on the attacking approaches of the secret key systems in this work. Statistical features of the source language are used to solve classical methods because classical methods depend on the source language. Related work presented by Dalkılıç [1] investigates Turkish language patterns and frequencies of Turkish. This contribution solves some of classical methods but this study is limited by the analysis about extraction of most frequent trigrams. Classical cryptography is presented in the first section. The main concepts of well known classical encryption methods are considered and attacking models are illustrated. Most particularly, there seems to have been no previous work on the subject that analyses homophonic cipher for Turkish. Therefore, developing an attacking model for Turkish is notably aimed in the second section. In this work, the corpus of size 13.4 MB is used to attain the statistical features of Turkish to solve some of classical cryptographic methods. Also, the corpus consists of 120 articles of a columnist, Çetin Altan, from the Turkish daily newspaper Milliyet and 37 novels of 9 different authors, which are Orhan Kemal, Orhan Pamuk, Çetin Altan, Aziz Nesin, Rıfat Ilgaz, Gülse Birsel, Ahmet Altan, Yılmaz Erdoğan and Soner Yalçın. 2. Classical Cryptography Classical cipher is an encryption method that performs on letters of alphabet. Classical ciphers depend on the source language. Therefore, source language specifies the key space of the method. More generally, they are implemented by hand or they are implemented with basic machines. Substitution or transposition techniques are often included in classical ciphers. In this section, most popular methods of classical cryptography and main concepts are presented. 2.1. Monoalphabetic Substitution Monoalphabetic substitution is a historical encryption method that depends on replacing each plaintext letter with another letter. The cipher alphabet remains unchanged throughout the encryption process. Therefore, it is named as Monoalphabetic Substitution Cipher. 2.1.1. Shift Cipher Shift cipher depends on the principle that each character of the message is replaced by a substitute with a shift of specified positions down or up in the alphabet. Figure 2.1 shows an example of the shift cipher. The method is also known as Caesar Cipher. Figure 2.1 The Encryption Schema of the Shift Cipher 3 If each character of the alphabet is presented by an integer that corresponds to its position in the alphabet, the formula of the method for replacing each character p of the plaintext with a character C of the ciphertext with a shift of k in an alphabet consisting of n characters could be expressed as Ci = (pi + k) mod n (2.1) pi = (Ci - k) mod n (2.2) As an illustrative example, the following text is chosen: Plaintext: GELECEĞĠKESTĠREBĠLMENĠNENGÜZELYOLUONUĠCATETMEKTĠR And by using the plaintext, it is encrypted as: Ciphertext: IĞOĞEĞĠLNĞUVLTĞDLOÖĞPLPĞPIZCĞOBROYRPYLEÇVĞVÖĞNVLT The method is weak against to brute force attack. In worst case, an attacker could detect the plaintext in (n-1) steps. 2.1.2. Substitution Cipher Substitution cipher depends on the principle that replacing each letter by another letter. Substitution characters are composed by a random permutation of the letters of the source language. Caesar cipher is a special form of a substitution cipher. Table 2.1 Substitution Cipher Encryption Table a b c ç d e f g ğ h ı i j k l m n o ö p r s Ģ t u ü v y z E R T Y U I O P Ğ Ü A S D F G H J K L ġ Ġ Z C V B N M Ö Ç Table 2.1 demonstrates an instance of substitution cipher. Each plaintext letter is looked up in the encryption table and written the corresponding ciphertext below. The decryption process depends on finding each ciphertext letter in the second row and writing down the corresponding letter from the top row. 4 As an illustrative example, the following text is chosen: : TEKNOLOJĠSENDOĞDUKTANSONRAĠCATEDĠLENHERġEYDĠR Plaintext And by using the plaintext, it is encrypted as: Ciphertext : VIJKGKDSZIJUKĞUBFVEJZKJĠESTEVIUSGIJÜIĠCIÖUSĠ The key space of substitution cipher is equal to n! on an alphabet consisting of n letters. For Turkish, the key space is equal to 29!, that is a number larger than 1030. This is larger than many times of key space of DES which is equal to 1016. Suppose that the attacker is able to check a possibility per microsecond, it would take time more than 1016 years to solve ciphertext in the worst case1. Obviously, the encipherment rules out a brute force attack. However, the attacker does not need to check all possibilities to solve ciphertext. It is a fact that the characteristic distribution of the texts is similar. Table 2.2 and Figure 2.2 illustrate the unigram frequencies of Turkish language. Therefore attacking by using an analysis of frequencies can lead to reveal encryption table. Table 2.2 Percentage of Frequencies of Turkish Unigrams [2] Letter A B C Ç D E F G Ğ H 1 10 30 10 −6 60.60.24.265 > 1016 Freq. 11,92 2,844 0,963 1,156 4,706 8,912 0,461 1,253 1,125 1,212 Letter I Ġ J K L M N O Ö P Freq. 5,114 8,6 0,034 4,683 5,922 3,752 7,484 2,476 0,777 0,886 Letter R S ġ T U Ü V Y Z Freq. 6,722 3,014 1,78 3,314 3,235 1,854 0,959 3,336 1,5 5 Figure 2.2 An Illustration of Frequency Values of the Turkish Unigrams However, the letter frequencies of the ciphertext could not exactly match the letter frequencies of the source language in a short message. Looking at the most common words as listed in Table 2.3 and the most common n-grams is a valuable tool to encrypt the data. The rest of the operation depends on the heuristic approach. The structure of the words and stream of the sentence would help to detect the plaintext. 6 Table 2.3 The Most Frequent 100 Turkish Words [1] Word Bir Ve Bu De Da Ne O Gibi Ġçin Çok Sonra Daha Ki Kadar Ben Her Diye Dedi Ama Hiç Ya Ġle En Var Türkiye Word Mi Ġki Değil Gün Büyük Böyle Nin Mı In Zaman Ġn Ġçinde Olan Bile Olarak ġimdi Kendi Bütün Yok Nasıl ġey Sen BaĢka Onun Bana Word Önce Nın Ġyi Onu Doğru Benim Öyle Beni Hem Hemen Yeni Fakat Bizim Küçük Artık Ġlk Olduğunu ġu Kadın KarĢı Türk Olduğu ĠĢte Çocuk Son Word Biz Vardı Oldu Aynı Adam Ancak Olur Ona Biraz Tek Bey Eski Yıl Bunu Tam Ġnsan Göre Uzun Ġse Güzel Yine Kız Biri Çünkü Gece Table 2.4, Table 2.5, Table 2.6 and Table 2.7 illustrate the most common n-grams in Turkish. The n-gram data presented below is collected from the data set of size 13.4 MB. 7 Table 2.4 Frequencies of 600 Most Frequent Bigrams in Turkish within 11M Ngr AR LA AN ER ĠN LE EN DE IN DA BĠ ĠR KA YA DĠ MA ND RA AL AK RĠ ĠL NĠ BA RD AY OR NI LĠ ME RI TA NE EL AM EK DI YO KĠ UN ĠM AD ĠY HA SĠ YE NA OL TE LI Fre 210907 193023 190681 167490 166362 147573 145677 138507 132807 127637 125824 120936 113213 107833 101972 100389 98136 94205 92166 87361 84497 76889 73277 71504 71256 70636 70300 69437 69339 68578 66743 65232 64209 64209 63808 62698 62359 61044 59905 59182 59020 58667 57222 56075 55239 55184 54770 54389 53881 53814 Ngr SA RE MĠ DU TI TĠ AT SI ET VE ED NL AS Aġ ON BE KE SE BU EY IR KL ÜN EM IK GE IL ES ĠK Ġġ UR ĞI LD ĠS IM CE RL MI NU ĞĠ GĠ ĠZ CA AH AZ Iġ YL LM ST KI Fre 53696 53009 52476 52283 51386 50968 50661 50362 49927 49863 49401 47564 47413 46294 45874 45640 45296 44712 44624 44573 43789 43444 43059 42649 40865 40841 39898 39387 38939 38227 38171 37718 37375 37263 37245 37156 36959 36784 36391 36283 35787 35238 33993 33171 32755 32622 32373 32197 31918 31841 Ngr ĠÇ RK UL KT RU SO AB IY OK ÇĠ LU AP ÜR ġT AĞ UM KO EV DÜ GÖ YI IĞ RM ZA TÜ KU VA MU IZ ġI ZE TU ġA LL SU ĠĞ NC PA HE ĠT UY RT GÜ YĠ ġE TL EC EĞ TT AC Fre 31429 29528 29371 28964 28453 27989 27564 27516 27389 27377 27338 27085 27057 26972 26617 26139 25764 25568 25363 25203 24913 24106 24077 23774 23620 23617 23616 23551 23091 22692 22563 22533 22239 22196 22168 21975 21614 21525 21182 20510 20367 20288 20124 19854 19742 19492 19480 19472 19323 18985 Ngr HĠ ĞU ġL ÇI ÖR ÇE ÜZ ġĠ ÇA UĞ ZL ĠB RS ÖN UZ UK ÇO ÖY Uġ ZI YÜ ZĠ RÜ TM Eġ AV ÜK LÜ YD UT ÜL NM Üġ AÇ ÖZ DO ML ĠD FA ġK MÜ ÜM CĠ BÜ MD KÜ ÜY YU AF NÜ Fre 18821 18753 18154 18112 17778 17736 17636 17282 17193 16907 16237 15851 15816 15278 15172 15139 14880 14744 14641 14561 14156 13862 13654 13595 13465 13357 13284 13272 13222 13198 13181 13100 13030 12668 12477 12350 12350 11953 11877 11863 11630 11563 11387 11348 11337 11279 11253 11219 11150 10971 Ngr NR ĠP OĞ IS GA VĠ CI SÖ PE ÜS US IP NS EZ LK CU ġM TO BO NT EÇ PI OY FE ZD EF ÜT NK SL LT SÜ KS OT ĞR OC EH KM PL ĞL ZÜ DÖ ÖL ĠH ĞA SK ġÜ FĠ LG RG ġU Fre 10949 10803 10648 10610 10575 10524 10287 10196 10171 10146 10074 10041 9885 9758 9755 9701 9568 9312 9184 8977 8691 8685 8640 8636 8012 7925 7852 7820 7774 7738 7729 7683 7353 7347 7324 7137 7097 7068 7029 7025 6918 6801 6744 6727 6723 6549 6520 6372 6210 6102 Ngr EP KÖ KK BI NG EB YR ĠF ÜÇ OM YN ÇL ZU AA Rġ ÇÜ RB Oġ OP PT RO MS OS PO VL UP BÖ AĠ OD ĞÜ NB ÜĞ NN GU TR SS LS LO RÇ ID FI SM UD ĠC VR NÇ VU PĠ HI KR Fre 6092 5975 5957 5760 5692 5667 5622 5528 5469 5460 5457 5262 5205 5193 5193 5186 5141 5093 5075 5061 4899 4790 4774 4765 4745 4701 4683 4625 4565 4516 4482 4463 4404 4374 4288 4284 4266 4242 4136 4104 4036 4030 3968 3936 3925 3916 3907 3837 3786 3777 8 NY HU ĞE HT ÖĞ KÇ RC HO ÖT ZM UH CÜ YM UC HM ÇT ÇM UÇ IC ÜC PR ĠV IT YG DD FO HL PM VD FT MO GI LN ÜD MB ÜP ÇB FL ZO NO ÖP ÖS RH SY RN UV HÜ ĠĠ YS FÜ 3776 3772 3688 3651 3603 3575 3536 3481 3447 3369 3327 3312 3282 3174 3131 3122 3122 3043 2972 2954 2926 2888 2823 2798 2723 2692 2646 2626 2621 2587 2566 2534 2528 2508 2470 2425 2412 2401 2386 2385 2371 2368 2368 2367 2353 2325 2320 2309 2280 2208 ZG TS OZ HR LY Kġ UB ĞM PS NZ HÇ ZC FR ÖK RP LB PK MC OB DR TÇ ÖM LC YK VG BR HB TK OF ÜV ÇU YB ÖV PU Öġ UA ĠG LÇ ZY AJ YÖ BD HV FU FK SR ġÖ ĠA MN IF 2195 2172 2171 2058 2043 2033 2008 1998 1994 1976 1975 1923 1908 1877 1853 1820 1805 1777 1620 1594 1573 1570 1541 1531 1525 1522 1515 1512 1492 1476 1434 1426 1414 1407 1381 1381 1369 1352 1351 1341 1319 1292 1289 1275 1263 1257 1235 1235 1199 1137 IV GR MM HS MR ZS SP ÜF ġS JA IÇ HK ġB VK RY RV RF ÇÖ UG YF ÖF ÖB MP DL OV YV JĠ ġY ĞD YT MH RR MK RZ VV TÖ ĞZ VÜ PÜ AG GO TF ZB PH OO BL TH HN ġÇ DY 1134 1123 1121 1106 1098 1088 1075 1070 1066 1054 1042 1029 1028 1021 1008 1005 991 990 963 960 954 938 931 916 906 896 883 876 864 859 851 842 815 814 808 804 799 791 784 745 739 730 726 713 713 710 709 709 703 683 CO EE ġO YH VI UF VM TB ġV OG OH VC ÇK FF Mġ MZ LV Hġ LP IA CH ÖD AE EG KN EĠ CR AU SN FÖ SH ÜB IH IB KY ZZ EO FÇ EA HP ĠE OJ RÖ MY KD ġH SV MT CL JE 672 655 644 640 639 629 602 596 596 593 571 568 558 536 534 533 533 529 523 522 519 516 516 514 514 513 502 495 491 483 481 481 478 478 473 468 467 456 448 447 446 438 435 434 432 425 425 423 416 415 Nġ VS UĠ OÇ YC ZK ÇS ĞC ÇR TV PÇ EJ HZ FS ġR OĠ VO KB YY MG FY ĞN ġG Vġ HD LH CD TY UE BB ZR ÜH SB ÖC KH VN ĠO YZ ġF VY NP ZN ĞS ÜE NF RJ II NÖ UU JI 413 401 393 393 391 382 381 381 372 365 362 361 357 352 349 344 332 330 327 327 326 324 321 319 318 317 315 315 301 300 298 297 292 291 290 287 284 279 277 274 274 273 270 270 262 250 250 249 235 235 AO GL BN TN KG TD PY SF CC JD IO VH ġN LF VZ NH HY LZ ÇG BZ IG KF JL OU CK JU ZT DN NV ĠJ SC CB ÖH UO ÖÇ TG ġġ ÇH HH TP OE ĠU PP JO ĞB KV TC MF Yġ ZH 234 231 227 223 222 220 218 208 203 201 200 200 197 195 183 183 178 177 176 170 166 165 163 163 161 156 155 154 152 151 143 142 137 134 132 131 129 126 126 125 118 118 118 117 114 114 109 108 102 97 9 Table 2.5 Frequencies of 600 Most Frequent Trigrams in Turkish within 11M Ngr LAR BĠR LER ERĠ ARI YOR ARA NDA ĠNĠ INI RĠN DEN AMA NDE EDĠ ANI ASI DAN NLA AYA RIN IND ENĠ ADA ĠLE ĠND ALA NIN ANL KAR LAN IĞI ADI SIN SĠN ESĠ YLE KEN RDU ELE ĠYO ĠĞĠ ĠNE ARD KLA ĠÇĠ TAN NĠN BĠL ERE Fre 89715 77192 68463 58199 56640 46374 39442 38533 36926 34233 32092 31700 31046 30920 29711 29369 28332 28109 28015 27995 27197 26780 26210 25992 25278 24708 24559 23573 23470 23010 22870 22842 22334 22148 21714 21362 21319 21223 21110 20913 20215 20109 19994 19356 19224 19160 18914 18606 18502 18440 Ngr ORD ĠYE BAġ EDE ALI UNU END BEN NDĠ ÇĠN ELĠ MIġ EĞĠ RLA MAN YAN ĠLĠ SON IYO ONU ECE KAD UĞU EME ĠST AĞI OLA INA ERD ANA AYI MĠġ KAL LMA DĠY EYE ÖYL ĠRĠ RDI RAK KLE ĠBĠ ABA ATI LDU ILA EKĠ ACA ĞIN GEL Fre 18348 18295 18165 18132 17950 17894 17848 17793 17595 17466 17380 16453 16136 16134 16042 15605 15586 15500 15455 15454 15382 15378 15363 14961 14871 14823 14792 14787 14758 14707 14639 14564 14496 14485 14100 13914 13810 13799 13746 13357 13276 13215 13100 12815 12749 12665 12626 12607 12555 12534 Ngr SAN YAP ORU AġI AHA GÖR EMĠ DĠĞ VER RDĠ RKE OLD GĠB AKL LEN KAN ĠMĠ MAK ÇIK STE DER DED DIĞ MEK ĞĠN NUN ġLA RLE AND ĠSĠ DUĞ MAY RDE NRA ONR LAM NCE LĠK LLA AMI VAR EYĠ ġTI BUL AKĠ LIK BAK ETĠ SUN TTĠ Fre 12433 12361 12334 12303 12211 12199 12047 11989 11988 11935 11897 11673 11642 11636 11505 11473 11433 11343 11312 11290 11222 11197 11092 11090 11013 10991 10966 10783 10735 10718 10666 10661 10552 10535 10532 10391 10390 10323 10321 10253 10250 10232 10166 10087 10042 10011 10007 9978 9885 9836 Ngr ÇOK CAK RDA DĠM EKL AKI MAS ARK OLM HER LDI GÜN AġL RUM URU YLA YER MAD DEĞ ĠġT DAR PAR ġEY YAZ TÜR AZI REK NLE AKA ULA IMI DAH LAY TIR HAY ARL STA ULU AKT ERL YAR IġT RAN TĠR TER ĠZĠ ÜZE REN IKL ILI Fre 9685 9670 9602 9595 9211 9195 9191 9173 9072 9055 9052 8955 8928 8917 8887 8818 8793 8788 8728 8673 8657 8558 8555 8511 8501 8491 8457 8440 8404 8404 8366 8348 8336 8325 8312 8309 8261 8249 8218 8198 8153 8101 8039 7980 7975 7951 7943 7922 7896 7881 Ngr AġK ĞUN DIN ĠKĠ YEN LDĠ ġTĠ TEN KTA RMA OLU ENE DIR SEN IRA ĠLM DAK ETT YIL MEY AġA GEÇ HAL API ISI YAT NLI ERK HĠÇ TAR KON SĠZ RME YAL LĠY BAN LĠN UND ART EVĠ DÜġ ĠġĠ DIM TLE TÜN ATL DUR ZAM MUġ SÖY Fre 7880 7798 7781 7729 7718 7710 7687 7679 7655 7648 7646 7566 7561 7524 7521 7493 7486 7422 7367 7323 7319 7184 7150 7113 7107 7107 7106 7084 7076 7068 7066 7050 7010 6996 6993 6993 6979 6965 6934 6931 6913 6894 6881 6795 6788 6759 6718 6688 6665 6645 Ngr KTI ĞIM ĠRD YET ÜRÜ LAD RAD BĠZ ġKA ÜND DĠN OKU ĠRL RAS TĠN LĠR ĞĠL KTE EKT RKA ALD IRI CEK DĠK ALĠ KUR ÜNÜ LIĞ MED TUR NDI GER BUN ĞĠM DĠL ALL APA LEM DĠR KAY HAN LAT ĠYĠ IġI MĠN KĠM KAP URA YAK NĠZ Fre 6634 6633 6624 6601 6600 6597 6581 6558 6553 6552 6550 6541 6541 6510 6489 6475 6469 6450 6425 6408 6403 6402 6396 6394 6378 6374 6358 6354 6340 6339 6336 6326 6301 6279 6269 6264 6244 6223 6200 6196 6158 6084 6065 6043 6041 6030 6015 5976 5970 5970 10 RAL MĠZ SEV TIN ĠKA ATA UYO ĠRA ÜRK ASA ĠDE YOK TME ETM BEL TEK NAN LME MES TLA GEN ÜġÜ ZLE ġIN RĠM MAL LAġ GÖZ ÇEK KIN UNA ATT MEN ġMA ÜTÜ ĠKL TIK TEL MLE NIZ NCA YÜZ USU DEM ĠMD LLE YAġ HAT DEK RIM 5950 5940 5934 5920 5915 5895 5892 5890 5850 5849 5841 5827 5817 5816 5815 5813 5808 5807 5803 5792 5791 5781 5781 5779 5772 5770 5765 5755 5729 5699 5677 5657 5641 5637 5629 5622 5614 5589 5585 5575 5570 5561 5550 5541 5524 5489 5446 5435 5412 5358 ĠRE ĠLD MIZ ABĠ IKA MET ORL AVA RES KAT ETE RĠY ANM MER AZA LIN ÜST SOR MAM ĠYA LEY UKL LED KLI RAR LIY KIZ MEM RTI RLĠ DUM ĠNC CAN BEY ILM PLA TIĞ BÜT ÇAL ARĠ BAS DAM ZEL ONL NĠM LIġ ERS AÇI LUN ETL 5347 5343 5288 5264 5253 5247 5241 5235 5233 5228 5182 5156 5153 5148 5111 5066 5059 5046 5039 4988 4987 4931 4930 4929 4927 4922 4908 4893 4887 4887 4886 4866 4864 4863 4860 4859 4856 4849 4831 4831 4829 4827 4823 4821 4793 4790 4786 4762 4753 4733 LĠM IZI NER ILD KĠL CAĞ TAK IRL ACI MAZ ÖRÜ ALM IYL ALT ĠNS DÖN MĠY ZER ġAR RET ELL ĠYL UMU NSA ġÜN ĠZL TĠM KOR ġLE DOĞ LAH NEM ZLA ĠRM KES CEĞ SES ĠNL GĠT KUL ĠKT YÜK UYU NMA ÜNE ANT TĠĞ ARġ AYD RTA 4722 4709 4696 4693 4674 4672 4655 4650 4645 4629 4600 4600 4589 4586 4586 4583 4573 4571 4566 4558 4554 4550 4525 4519 4498 4488 4477 4476 4460 4452 4450 4436 4434 4431 4418 4410 4392 4392 4392 4389 4375 4374 4351 4344 4338 4327 4325 4323 4317 4302 GĠR ÜYO RAM RIL EVE LAC SAL YDI LĠĞ SIL MIN DUY HEM YĠN IKT ÜĞÜ MUT LMĠ TIL EVL ARE GĠD ġĠM STĠ ÜYÜ KLĠ APT TIM ENL KĠN UNL KER BAB ÖNC AĞA RDÜ BĠN BAL MLA SAR SÜR IRD KAÇ MDA ERM ANE DIK ÖNE EYL BUR 4301 4299 4292 4243 4233 4210 4207 4205 4201 4194 4180 4180 4177 4163 4160 4160 4153 4144 4134 4123 4118 4114 4112 4110 4106 4104 4089 4085 4084 4080 4059 4042 4017 4016 4008 3998 3983 3973 3966 3960 3954 3932 3930 3927 3920 3918 3917 3917 3914 3912 OCU EKE ÜRE TĠK URD TĠY ÖRE RġI IKI TMA BER RĠL KIR ÜZÜ ÇOC RAY YDĠ NME NĠY ADE LEC ANK ġTU SIZ HAR UZU KTĠ ERA YOL ĠġL IRM ARM NED LMI KAS ANN EġĠ ELD SĠY IZL PEK SER ĠNA NAS ZĠN YON SIR UġT LET ALE 3910 3906 3906 3904 3904 3902 3901 3901 3897 3892 3883 3878 3874 3851 3833 3833 3832 3829 3819 3813 3809 3808 3803 3789 3783 3780 3770 3754 3749 3743 3742 3735 3722 3721 3718 3715 3699 3697 3688 3682 3670 3661 3654 3646 3644 3623 3620 3615 3615 3608 ĠTA LTI SAY KKA TEM NNE LLĠ BÜY NIM SEL EBĠ TTI TAL MEL ÖNÜ ORA ONA BAH DÜR OYU MDE INL AYR AYN RMĠ ĠLG MEZ BEK RIY LĠS DOL ZAR ANB DAY ĠML NEL ĠZE OTU TAB AST ĠDĠ LUK SEY OCA RAF ORT KET PTI EFE RSU 3599 3599 3588 3587 3583 3583 3581 3577 3571 3567 3563 3547 3546 3545 3537 3529 3524 3523 3520 3519 3517 3494 3493 3478 3475 3468 3467 3453 3439 3437 3435 3426 3423 3420 3408 3406 3395 3383 3381 3370 3353 3349 3345 3344 3343 3330 3327 3326 3324 3316 11 Table 2.6 Frequencies of 600 Most Frequent Tetragrams in Turkish within 11M Ngr LARI LERĠ ERĠN ARIN INDA ĠNDE ĠYOR ORDU ANLA ĠÇĠN NLAR YORD ENDĠ IYOR ASIN ÖYLE KLAR ALAR NDAN ANIN GĠBĠ DĠĞĠ OLDU DIĞI ININ DUĞU ĠLER ESĠN ELER DEDĠ ONRA SONR ARDI ĠNĠN KEND NDEN RĠNĠ LARD KADA RINI RLAR KLER DĠYE SĠNĠ IĞIN LDUĞ DAHA AKLA IġTI DEĞĠ Fre 43055 36077 26911 26117 25969 22658 19578 17770 17328 16930 16745 16018 15616 15398 14773 12783 12718 12303 12068 12011 11641 11316 11130 11066 10737 10666 10665 10658 10580 10528 10527 10525 10515 10499 10462 10226 10148 10110 9380 9329 8666 8628 8587 8576 8441 8212 8071 7957 7948 7914 Ngr LARA BENĠ YORU ILAR AMAN MAYA EDEN ĠĞĠN AġLA ARDA SINI ARAK ERDE KARA SIND OLMA UĞUN ACAK ERDĠ RĠND ENĠN ERLE ĠSTE ĞINI SÖYL ORUM RKEN DAKĠ BĠRĠ ERKE MASI EĞĠL ĠġTĠ LERD RLER IKLA ĞĠNĠ LADI ARLA ZAMA EKLE BAġK ETTĠ ADAR AġKA ANLI ADIN BĠLĠ UYOR UNDA Fre 7760 7737 7724 7684 7633 7631 7603 7583 7523 7507 7449 7434 7387 7375 7361 7267 7229 7067 6895 6817 6815 6737 6723 6721 6638 6634 6627 6596 6585 6575 6487 6466 6459 6405 6398 6377 6342 6309 6278 6262 6261 6236 6220 6171 6152 6125 6108 6067 5887 5828 Ngr BĠLE RĠNE EREK LIĞI ġLAR MIġT ONUN ECEK NLER ANDI MADI ĞUNU ĠSĠN LAMA ADAN ALDI TÜRK RIND MEYE HAYA RADA KONU BAġL ADIĞ RASI ÜTÜN DEKĠ MĠġT ARKA ORLA EDĠĞ ĠMĠZ TIĞI ĠKLE BÜTÜ ÜNDE ĠRDĠ DÜġÜ LĠYO DĠYO ONLA ÇIKA YANI KADI CAĞI STAN ĠLDĠ TLER IYLA ANDA Fre 5796 5782 5771 5697 5661 5623 5615 5589 5532 5526 5480 5384 5370 5364 5327 5282 5234 5201 5166 5148 5137 5135 5124 5096 5053 5028 5003 4976 4914 4901 4845 4816 4806 4799 4768 4766 4765 4740 4698 4691 4676 4635 4629 4625 4623 4608 4583 4583 4575 4574 Ngr ĠYLE IĞIM ÜġÜN ĠĞĠM ENĠM LEDĠ SĠNE CEĞĠ ECEĞ KALA KARI UNUN TĠĞĠ ÜYOR RDEN MESĠ ACAĞ YENĠ ERĠM ARIM ÇĠND AġIN MEDĠ SĠND ETME LLAR ZLER ALAN AĞIN YORL KTAN GÖRÜ AYAN ALIġ ĠNSA AMIġ NSAN BANA BAġI ĠSTA AKTI OLAN ÖNCE BABA ARġI BĠRL ÇOCU BULU GELĠ UKLA Fre 4546 4538 4495 4494 4474 4378 4361 4351 4324 4312 4301 4298 4288 4287 4284 4237 4232 4232 4226 4211 4209 4204 4189 4154 4153 4143 4135 4092 4088 4081 4080 4073 4065 4025 4014 4013 4000 3998 3993 3990 3981 3964 3964 3933 3901 3843 3833 3821 3817 3813 Ngr LĠĞĠ IRLA EDĠM ILDI ASIL ĠRĠN ABĠL LECE SINA AYAT ETLE IMIZ NLAT LERE STER LAYA ARTI ISIN ARAS ULAR TANI VARD KARġ ADAM IKAR RDAN ÇALI LIYO BÜYÜ ĠLME ALLA AKTA ĠKTE ELĠN EMĠġ UNLA LMIġ DĠLE BAKA LACA YAPI EDER EMEK ĠLĠR UġTU NDAK ĠRLĠ EĞĠN ÜZER ĠġLE Fre 3804 3775 3763 3759 3731 3728 3726 3715 3715 3715 3715 3694 3693 3686 3685 3680 3678 3676 3674 3658 3654 3642 3635 3615 3612 3608 3591 3585 3575 3568 3567 3559 3553 3547 3540 3513 3509 3507 3505 3488 3483 3476 3465 3464 3451 3434 3421 3420 3420 3412 Ngr GÖRE KAPI ĠNĠZ ġLER YAZI ANIM OLAR ĠNLE RDUM LANI ALTI URDU NASI SENĠ MĠYO TANB ANBU NBUL IRDI ĠSTĠ LEME GELE AKLI ETĠN APTI EYLE ATIR ĠMDĠ TELE ENLE KORK ĠRLE YAPA ANNE BĠLM GENE ELDĠ OCUK ATTI SUNU KANI AĞIR ĠLMĠ EVLE ĠLGĠ ÜZEL RESĠ RINA DENĠ ALMA Fre 3399 3395 3380 3377 3376 3367 3366 3361 3356 3327 3319 3315 3286 3286 3281 3251 3240 3236 3222 3212 3205 3203 3202 3186 3185 3174 3167 3141 3134 3130 3116 3107 3098 3096 3092 3092 3088 3087 3086 3086 3081 3075 3066 3053 3052 3045 3045 3042 3038 3029 12 ANMA BÖYL GECE EDĠY YORS YAġA VERĠ ĠTTĠ NDĠM TLAR DĠSĠ YAPT OTUR ġĠMD LLER BĠZĠ NEDE NDEK ONUġ LAMI ÜYÜK ADIM NDĠS ORTA ZERĠ YATI YACA TTĠĞ MANI ATLA ĠYET LLAH ÖZLE KTEN RALA PARA RSUN MLER ĠLĠY BAKI GÜZE BUNU RKAD RLĠK ESKĠ MEKT ġTIR ġTĠR EDĠL MANL 3024 3012 3002 2997 2984 2983 2978 2978 2978 2974 2971 2956 2945 2940 2925 2919 2908 2906 2903 2902 2902 2887 2884 2881 2875 2873 2867 2865 2857 2853 2824 2822 2819 2817 2805 2802 2800 2784 2783 2781 2774 2773 2769 2762 2751 2750 2749 2743 2737 2735 AYIN KĠTA GELD DÜĞÜ ULUN DURU LĠKT IRAK ILMA LIKL EKLĠ AKIN MADA LMAS ERĠY RTIK EVĠN ELĠM LANM NLIK ATIN ELLĠ EKTĠ LMĠġ ARAR YÜZÜ LEYE AYDI ĠMLE ÖRDÜ ĠNCE DOĞR APIL YARA GEÇĠ ATLI DEMĠ OĞRU ELEN EYEN INIZ LAND UĞUM ARIY ADAġ LAYI VERD BASI INLA ĞĠMĠ 2732 2727 2722 2719 2708 2702 2695 2688 2682 2680 2672 2666 2663 2652 2652 2648 2646 2641 2641 2638 2636 2624 2619 2611 2607 2595 2591 2585 2579 2573 2562 2562 2560 2553 2546 2543 2541 2540 2531 2512 2507 2504 2499 2497 2497 2493 2488 2484 2484 2480 OYUN USUN ULLA ĠZLĠ EYĠN NĠYE EMĠN ALNI ORSU ĠDEN SANI ÜZÜN MALA ÖĞRE EYEC DERĠ AYAC AYLA RMĠġ GÖRD MLAR BEKL UZUN HMET ĠLEN KASI LARL ĠRME EBĠL EMEN ARAN BIRA AMIN HĠÇB ĞIMI ĠÇBĠ GERE YILL ĠMDE ILAN IRMA RDĠĞ LMUġ GĠTT KÜÇÜ ABAS ÜNYA AĞLA DÜNY AYNI 2477 2473 2471 2470 2461 2461 2459 2459 2453 2452 2448 2446 2446 2436 2427 2422 2421 2415 2413 2411 2410 2408 2406 2403 2402 2402 2394 2391 2388 2374 2373 2373 2372 2367 2366 2365 2358 2356 2347 2345 2341 2332 2331 2324 2320 2316 2315 2315 2311 2307 MIYO YECE ÜSTÜ LMAD EKTE LMAK LDĠĞ ÖSTE LNIZ YALN KĠġĠ SONU RĠMĠ ÇBĠR RECE ġIND GÖZL ANIY AMAY SORU LDIR MAYI YAPM AMAD BURA ENCE ERME ILLA GĠDE GÖST ELĠR REDE DEME ALIN IġLA LDIĞ ARMA DĠNĠ ĠZĠN KANL YERĠ ALIK OLAY ETĠR LERL ENĠZ VERM AġTI TEDĠ BĠRA 2304 2300 2298 2295 2290 2289 2287 2286 2286 2286 2283 2283 2281 2276 2276 2275 2267 2266 2263 2260 2260 2258 2258 2257 2257 2256 2255 2250 2248 2246 2246 2244 2239 2239 2237 2232 2231 2224 2222 2222 2222 2219 2217 2208 2202 2202 2199 2198 2195 2193 TARA LTIN ĞIND NDĠN HATI YLER IKTI NESĠ ILMI KALI ĠNAN AKAL MAMI RDIM ĠNCĠ YERE ZETE RAYA AKAN SIRA ĠMSE ĠġTE ġKAN GETĠ AZET ARKE URMA HEME KALD ġLAD GAZE IZLA ĠRAZ MUTL ĠYDĠ LĠKL KESĠ ARAL AMLA DINI ATIL AĞIM ĠZĠM RMUġ IMDA ĠLEC ĞREN ENDE ERSE CUKL 2190 2185 2182 2181 2176 2176 2175 2173 2171 2170 2164 2164 2164 2155 2155 2153 2147 2146 2140 2138 2135 2131 2126 2126 2124 2124 2120 2119 2118 2116 2115 2113 2113 2104 2100 2099 2098 2095 2088 2087 2087 2087 2082 2077 2074 2068 2068 2067 2066 2064 ERÇE HĠSS MAKT YLED LUĞU ANCA LMAY ĠSSE NIYO FEND MELE RABA TMĠġ GELM EFEN SÜRE EDĠK OLUR YÜRÜ TILA YAKI SEVĠ ORKU SOKA LANA MASA ARAY ĠZLE SAAT ZLAR RMIġ ARAB KARD YARI GÖRM ATTA ERĠL IKTA KĠYE NERE BĠRB AYRI ÜNKÜ AZIL HANE ĠRMĠ MIZI RBĠR RIMI AZAN 2063 2062 2062 2054 2046 2044 2043 2041 2037 2033 2026 2026 2025 2025 2025 2024 2024 2023 2023 2022 2022 2021 2019 2016 2015 2014 2009 2006 2005 2004 2000 1994 1993 1986 1985 1983 1979 1979 1977 1972 1969 1968 1963 1963 1962 1961 1961 1956 1956 1953 13 Table 2.7 Frequencies of 600 Most Frequent Pentagrams in Turkish within 11M Ngr LARIN LERĠN YORDU SONRA KENDĠ ARINI KLARI ERĠNĠ LDUĞU INDAN ANLAR OLDUĞ NLARI SINDA ALARI ĠNDEN ĠYORD RĠNDE DEĞĠL LARDA ZAMAN ERĠND KADAR BAġKA YORUM ASIND SÖYLE KLERĠ ELERĠ MIġTI IYORD DUĞUN UĞUNU RINDA ARIND ADIĞI ERĠNE LERDE MĠġTĠ EDĠĞĠ AKLAR BÜTÜN LĠYOR ĠLERĠ DĠYOR BAġLA IĞINI ASINI ĠĞĠNĠ IKLAR Fre 23638 19659 16000 10524 10266 9092 8788 8428 8212 7993 7940 7706 7546 7329 7133 7047 6793 6778 6254 6239 6158 6144 6113 6106 6049 5970 5963 5814 5801 5606 5448 5346 5192 5148 5101 5089 4906 4894 4892 4844 4806 4767 4693 4689 4671 4638 4627 4582 4535 4519 Ngr ONLAR ORLAR KADIN ECEĞĠ DÜġÜN BENĠM ACAĞI ĠÇĠND ÇĠNDE YORLA SĠNDE ĠNSAN ESĠNĠ LARDI DĠĞĠN RASIN ĠSTAN HAYAT KARġI ĠYORU ARASI LIYOR DIĞIN ANLAT VARDI ĠKLER EKLER ÇIKAR NDAKĠ RĠNĠN RININ MASIN ERLER MĠYOR ANBUL STANB TANBU ÇALIġ ILARI LARAK LARIM ARDAN ÇOCUK RLERĠ ESĠND BÖYLE UKLAR ARINA YAPTI AġLAR Fre 4418 4406 4359 4269 4234 4232 4212 4181 4147 4081 4041 3937 3920 3858 3785 3785 3711 3673 3633 3615 3592 3574 3560 3543 3535 3516 3504 3490 3401 3366 3337 3326 3323 3267 3235 3234 3234 3219 3176 3175 3174 3095 3085 3077 3039 3012 2999 2989 2948 2947 Ngr BĠLĠR ASINA BÜYÜK KONUġ ORDUM NDEKĠ DĠĞĠM TTĠĞĠ LIĞIN ġĠMDĠ ENDĠS ERDEN ZLERĠ NASIL RKADA BĠRĠN ARKAD NDĠSĠ YANIN GÜZEL DIĞIM ENDĠM GELDĠ ÜZERĠ UNLAR OLARA BĠRLĠ ETLER TLERĠ ġLARI ĠSĠNĠ ĠLĠYO LĠKTE MADIĞ ĠġLER SĠNĠN ERKEN INDAK DĠLER ARTIK DOĞRU ZERĠN KADAġ VERDĠ ENLER ORSUN YORSU NEDEN AYACA GÖRDÜ Fre 2942 2931 2902 2897 2889 2884 2866 2862 2857 2839 2830 2823 2781 2776 2768 2768 2766 2762 2756 2753 2742 2727 2720 2716 2710 2693 2687 2670 2669 2653 2645 2633 2632 2607 2599 2587 2586 2579 2556 2551 2536 2529 2485 2475 2450 2448 2438 2417 2417 2408 Ngr RLARD ĠRLĠK LMASI BIRAK LARLA HĠÇBĠ ĠNDEK LERĠM ĠRLER RALAR RDĠĞĠ GĠTTĠ ÖZLER LANDI DÜNYA IYORU SININ LIKLA MIYOR BULUN LDĠĞĠ YALNI ALNIZ EDĠYO MESĠN ĠÇBĠR LACAK BAġIN GÖSTE ÖSTER MADAN LDIĞI GEREK ARLAR ESĠNE ġINDA RLĠKT NLERĠ ALTIN ILLAR MALAR LERLE ĞINDA AYATI ENDĠN HATIR MANLA GETĠR HEMEN AZETE Fre 2396 2386 2377 2372 2371 2364 2363 2357 2352 2337 2331 2323 2323 2317 2305 2301 2300 2299 2299 2295 2287 2286 2286 2285 2277 2272 2270 2252 2246 2246 2232 2229 2213 2199 2199 2188 2188 2187 2182 2180 2180 2172 2161 2149 2145 2145 2129 2126 2119 2106 Ngr YILLA GAZET BĠLME ġLADI KALDI AĞINI UYORD ILMIġ BĠLĠY AġLAD ĠLECE ETTĠĞ YLEDĠ LĠKLE DĠSĠN HĠSSE ÖYLED LĠĞĠN BĠZĠM OCUKL FENDĠ NIYOR KORKU DEDĠM ALLAH ANLIK EFEND BĠRAZ NINDA BĠRBĠ ĠRBĠR TĠYOR ÖĞREN LLERĠ BUNLA GÖZLE BEKLE IġLAR UNDAN LMADI MANIN LAMIġ KÜÇÜK TLARI ATIRL SENĠN CAĞIN RĠYOR GERÇE ANIYO Fre 2106 2105 2103 2101 2081 2080 2076 2069 2066 2063 2062 2053 2053 2052 2039 2034 2025 2025 2024 2023 2023 2013 2012 2002 2002 2002 1993 1942 1933 1927 1926 1920 1915 1912 1912 1904 1902 1901 1891 1881 1879 1877 1876 1865 1863 1854 1853 1846 1846 1846 Ngr ALDIR ARIMI KĠMSE YÜZÜN AMAYA YAPIL MAKTA ADINI ĠĞĠMĠ STEDĠ ANIND ĠSTED EĞĠNĠ FÜSUN ĠġTĠR NDIĞI LENDĠ AMIġT RLARI ÇÜNKÜ KARAR ANLAM ÜNDEN AMADI IĞIMI IġTIR OLACA BAKAN ĠSTĠY MUTLU ĠLMĠġ AġKAN ÇIKTI ERKES HERKE ĠKAYE TÜRKĠ MUġTU RKĠYE YERĠN ÜRKĠY ABASI MEDEN LMAYA BELKĠ ERĠMĠ TARĠH ĠMLER LĠNDE CEĞĠN Fre 1841 1834 1833 1827 1823 1818 1817 1811 1806 1806 1803 1801 1801 1797 1796 1795 1795 1793 1790 1786 1785 1780 1768 1768 1763 1761 1750 1743 1741 1740 1736 1734 1733 1733 1726 1722 1719 1715 1708 1707 1706 1706 1704 1701 1701 1699 1696 1695 1693 1686 14 ĞĠNDE INLAR ARKEN DUĞUM RANLI EHMET STĠYO MEHME EYECE LLARI MEDĠĞ NLIĞI TINDA YAPMA BĠLDĠ BĠLEC ARABA ERÇEK ANDIĞ ĠYORL DĠKLE ANMIġ ÜYORD AġIND OLMUġ DENĠZ ULARI NSANL AKTAN ANDIR GELEN GÖRÜN ġEYLE TIRLA ÜSTÜN AKġAM MLERĠ IĞIND KARAN KANLI MAMIġ SABAH YACAK EYLER ABĠLĠ LADIĞ RIYLA ARDIM SANLA TILAR 1679 1676 1675 1672 1662 1658 1658 1655 1654 1654 1647 1644 1644 1643 1642 1639 1636 1631 1631 1631 1630 1626 1622 1620 1619 1617 1616 1613 1611 1595 1595 1592 1590 1590 1587 1579 1572 1572 1567 1565 1564 1557 1550 1549 1548 1547 1546 1545 1542 1541 LANMA LECEĞ NLARD MEKTE ĠSTEM KĠTAP GÖRME OLMAD LINDA ĠNLER MELER ARIYL DEĞĠġ ANLIĞ EMĠġT CAĞIM ĠLGĠL RĠYLE LUYOR TĠĞĠN LECEK ġIYOR LERDĠ RUYOR ÖLDÜR ġLERĠ KASIN LAġTI ĠYORS LABĠL SOKAK BAKTI KIYOR ĠMĠZĠ CEĞĠM HABER YORUZ NLARA DINLA LARIY YAKIN ĠLDĠĞ NELER IYORL ÜNLER ATLAR ASKER POLĠS HĠKAY HAYAL 1540 1540 1539 1539 1538 1535 1535 1533 1528 1527 1527 1522 1518 1516 1515 1511 1511 1511 1511 1508 1507 1506 1505 1491 1491 1489 1484 1481 1480 1480 1479 1478 1475 1474 1469 1469 1468 1467 1464 1461 1455 1453 1452 1450 1449 1448 1443 1443 1442 1438 ALMIġ DIKLA IKTAN PTIĞI GEÇĠR ÖYLEM SANKĠ EMEDĠ SONUN APTIĞ BASIN EMEYE YERLE MAYAC ANCAK ARALA ARANL ĠĞĠND RĠMĠZ ETMĠġ ISINI BABAM TARAF ÖYLEY DUYGU RIYOR OLMAS VERME ERKEK AMLAR ĠRDEN ISIND MLARI TĠRDĠ NUNDA DILAR KULLA GÜNLE ĠRĠNĠ ORMUġ TIĞIN LATTI ARDIR EREDE ENDEN AMANL DĠNLE LERĠY ĞĠMĠZ RADAN 1433 1423 1422 1418 1411 1410 1410 1408 1406 1405 1404 1399 1398 1397 1395 1388 1384 1378 1372 1370 1369 1368 1364 1363 1360 1356 1355 1353 1352 1348 1346 1340 1339 1339 1333 1332 1329 1329 1327 1326 1326 1325 1324 1324 1320 1319 1316 1312 1312 1312 CUKLA ÇLARI SEYRE SEVER TELEF ELEFO LEFON TOPLA EKTEN ARġIL YECEK ARMIġ NDĠNĠ ERĠYL YACAĞ ORADA YATIN ALARD CAKLA ĞIMIZ ÜġÜNÜ UġTUR RDÜĞÜ TIĞIM ĠSTER AZILA URADA ELLER RESĠM YOKTU SĠNĠZ EMĠYO SUNUZ LAYAN MIġLA ULLAN TIYOR KINDA EĞĠLD ĞĠLDĠ ĠSĠNE ARISI LENME YAKLA SĠZLĠ ANNEM ĠRMĠġ HANGĠ MAYAN CAKTI 1311 1310 1309 1309 1309 1303 1303 1301 1299 1295 1291 1285 1281 1280 1280 1280 1277 1276 1275 1273 1264 1263 1261 1259 1258 1257 1257 1247 1246 1241 1239 1238 1236 1236 1232 1231 1230 1230 1225 1223 1220 1219 1217 1215 1214 1214 1211 1209 1206 1205 ENDĠL ÜġÜND ZÜNDE ERMĠġ LMIġT ĠRDĠĞ KARIġ LEDĠĞ DERDĠ SĠYLE TĠĞĠM LADIM BĠRDE AÇLAR ARADA LAMAY LEġTĠ BAĞIR AMIYO BELĠR SORDU ADINL IRLAR KĠLER BURAD ONUġM SESSĠ ACAKT ALDIĞ EDĠLE ESSĠZ URUYO ĠKTEN IMIZI UĞUMU KĠTAB ÇATLI MEMĠġ ĠÇERĠ YAZAR ONUND KANIN GĠRDĠ HAZIR IZLAR DAġLA ELERD BĠRLE HANIM SĠLAH 1205 1202 1200 1200 1200 1199 1199 1199 1198 1198 1193 1192 1190 1190 1190 1189 1189 1188 1183 1182 1182 1182 1182 1181 1179 1179 1178 1177 1175 1174 1174 1169 1169 1168 1167 1166 1165 1164 1162 1162 1160 1160 1158 1158 1157 1157 1156 1155 1155 1152 HAREK NLATI OLMAK MĠġLE RESĠN LACAĞ YARDI YORMU NDĠMĠ ADAġL ENCER ġÜNDÜ NCERE NĠDEN AKLAġ ELĠYO ERDĠĞ LANLA LÜYOR LMESĠ YÜKSE FAZLA BENZE RÜYOR YERDE ALIġI GÖTÜR DEVLE GEÇTĠ OLSUN NĠYET ANDAN GENEL CĠLER ELĠND LĠġKĠ HATTA HEPSĠ DEMĠġ ERDĠM MERAK EVLET BENDE ACAKL PARÇA GÖREV ALABA BALAR APLAR ĠSSET 1151 1151 1151 1147 1145 1144 1143 1142 1141 1138 1137 1137 1134 1134 1133 1133 1132 1131 1131 1130 1129 1129 1128 1128 1126 1126 1125 1125 1125 1122 1121 1120 1119 1119 1118 1117 1116 1114 1109 1106 1106 1104 1102 1102 1102 1101 1098 1097 1096 1092 15 2.1.3. Affine Cipher The encryption method aims to perform the encryption process on the linear equation of yi = axi+b (mod n) where xi refers to plaintext, yi refers to ciphertext, n refers to size of the alphabet and pair of (a, b) refer to key. In the equation, (a, n) must be coprime. Substitution cipher is a special form of Affine cipher where a is equal to 1. And vice versa, affine cipher is a special form of substitution cipher where composing substitution table is formulised. Table 2.8 Index Values of the Letter of the Turkish Alphabet A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 Suppose that the plaintext is ALANKAY and the key is (2, -4)2. The plaintext would have the corresponding values found in Table 2.8. By using the plaintext, it is encrypted as illustrated in Table 2.9 and the ciphertext is decrypted as demonstrated in Table 2.10. Table 2.9 Encryption Process of Affine Cipher Plaintext Index values y= 2x - 4 y= 2x - 4 Mod 29 Ciphertext A 0 0.2-4 -4 25 Ü L 14 14.2-4 24 24 U A 0 0.2-4 -4 25 Ü N 16 16.2-4 28 28 Z K 13 13.2-4 22 22 Ş A 0 0.2-4 -4 25 Ü Y 27 27.2-4 50 21 S Table 2.10 Decryption Process of Affine Cipher Ciphertext Index values x=(y+4)/2 x=(y+4)/2 Mod 29 Mod 29 Plaintext 2 Ü U Ü Z Ş 25 24 25 28 22 (25+4)/2 (24+4)/2 (25+4)/2 (28+4)/2 (22+4)/2 14,5 14 14,5 16 13 145+5.29 145+5.29 14 16 13 10 0 A 10 14 L 0 A 16 N 13 K Ü 25 (25+4)/2 14,5 S 21 (21+4)/2 12,5 0 A 27 Y 145+5.29 10 125+5.29 10 The key (2, -4) is equal to (2, 25) because of the principle y mod n = ax+b mod n = ax mod n + b mod n 16 The key space of the affine cipher is 812 for Turkish3. Therefore, the method is weak against to brute force attack. Moreover, the method is also insecure against to frequency analysis attack because the method is a special form of substitution cipher. Furthermore, if two ciphertext symbols of the plaintext are detected, the key could be calculated from the equation easily. 2.2. Block Ciphers Most generally, plaintext is divided into blocks, and each block is encrypted respectively in block ciphers. Block ciphers manipulates the frequency distribution of the ciphertext and makes the frequency analysis attacks difficult. 2.2.1. Permutation Cipher Permutation cipher depends on the principle that transposing the plaintext letters each other. In order to implement the method, a permutation rule is specified and the plaintext divided into blocks. Finally, each block is permuted within the permutation rule. Figure 2.3 illustrates an example of permutation cipher. The method is weak against to frequency analysis attack because letters of the plaintext would not be changed in the ciphertext. Moreover, the key space of the permutation cipher is equal to m! and m could be equal to the length of the plaintext in worst case. Therefore, the method could be revealed by brute force attack easily. 3 The number of letters of the Turkish alphabet is equal to 29 which is a prime number. In the equation y= ax+b mod n, a has to be coprime with 29 to equation be reversible. That’s why, key space of the a is equal to 28 according to restriction. Moreover, key space of the b is always equal to n. Thus, key space of the affine cipher 28x29=812 for Turkish. 17 Figure 2.3 Illustration of the Permutation Cipher 2.2.2. Polygraphic Substitution Polygraphic substitution performs substitutions on blocks instead of substituting a letter. Substituting multiple characters at a time destroys the structure of the plaintext. That makes the frequency analysis attacks impossible. 2.2.2.1. Playfair Cipher Playfair cipher is one of the best known encryption techniques of multi-character substitution. Firstly, the encryption key is picked up. Secondly, the key would be filled into the 5x5 matrix. However, duplicated letters has to be dropped in the key. Remaining letters in the alphabet would be filled into the rest of the cells of matrix alphabeticlly. The letters of the alphabet has to be filled into 25 cells. Therefore, some letters can be filled into same cells. Encryption rule is based on a simple technique. The ciphertext is grouped into blocks of length 2. The encryption rule performs on each block. Specific letters should be inserted between repeating letters in the plaintext. If the letters of the current block are located in the same row of the matrix, the current block is replaced by letters to the right of each letter in the row. If the letters of the current block are located in the same column, the current block is replaced by letters below them. Otherwise, letters of the current block are located in neither same row nor same column, a virtual rectangle is composed. The rectangle has to contain the letters of the current block in the corner. The block is replaced by the the other corner elements. 18 Suppose that the key is İstanbul and the plaintext is Florya. The key is filled into the matrix as illustrated in Table 2.11. The plaintext is grouped into blocks of length 2. FL OR YA Table 2.11 The Encryption Key of Playfair Cipher Ġ B E K ġ S U F M/N Ü T L G/Ğ O/Ö V A C/Ç H P Y PK AC N D I/J R Z Finally, the ciphertext is represented as: GU Thus, the plaintext FLORYA is encrypted as GUPKAC in Playfair Cipher. Frequency analysis attacks fail against to playfair cipher [4]. It is a fact that the playfair cipher is based on bigram substitution. Therefore, the cipher would not balance out the bigram frequencies of ciphertext. Bigram frequencies of the source language could be used to attack playfair ciphers. 2.2.2.2. Hill Cipher Hill Cipher is a block cipher model based on matrix manipulations. The plaintext is divided into blocks and each block would be encrypted respectively. Suppose that block size is decided as n. A (nxn) matrix is specified as key and the each block of the plaintext is multiply to the matrix to encrypt the plaintext. C=[K] P mod m (2.3) 19 C and P are column vectors of length n, representing the plaintext and ciphertext respectively, and K is a nxn matrix, which is the encryption key. Decryption requires using the inverse matrix of the matrix K. P=[K]-1 C mod m (2.4) The inverse matrix K-1 is defined by the equation K K-1 = I, where I is the Identity matrix. It is a fact that the inverse matrix does not always exist. Suppose that the size of the key matrix is 2x2, and the key matrix is 𝐾= 1 2 3 4 Assume the plaintext would be alankay. First, the plaintext is grouped into blocks of length 2. The first letter of the plaintext is appended at the end of the plaintext because the plaintext consists of odd number. al an ka ya Secondly, letters of the plaintext are replaced by the corresponding numbers 0 14 0 16 13 0 27 0 Thirdly, each block is multiplied by K 0 1 14 3 0.1 + 14.2 2 28 28 = = 𝑚𝑜𝑑 29 = 0.3 + 14.4 4 56 27 0.1 + 16.2 0 1 2 32 3 = = 𝑚𝑜𝑑 29 = 0.3 + 16.4 16 3 4 64 6 13 1 0 3 13.1 + 0.2 2 13 13 = = 𝑚𝑜𝑑 29 = 13.3 + 0.4 4 39 10 20 27 1 0 3 27.1 + 0.2 2 27 27 = = 𝑚𝑜𝑑 29 = 27.3 + 0.4 4 81 23 Fourth, the numbers 28 27 3 6 13 10 27 23 represented as: ZYÇFKIYT Thus, the ciphertext ALANKAYA would be encrypted as ZYÇFKIYT in Hill Cipher. The inverse matrix is used to implement the decryption process. A matrix has an inverse matrix if and only if its determinant is not equal to 0. K= a b , K =ad-bc c d 1 K-1 = |K| d -c -b , |K|≠0 a 𝐾𝐾 −1 = 1 0 0 1 Firstly, the inverse matrix is calculated to decrypt the ciphertext. K = 1.4 − 2.3 = −2 ≠ 0 𝐾 −1 = 1 4 −2 −3 −2 −2 = 1,5 1 27 𝐾 −1 = 15 + 5.29 10 𝐾 −1 = 1 𝑚𝑜𝑑 29 −0,5 1 5.29 − 5 𝑚𝑜𝑑29 10 27 16 1 14 (2.5) (2.6) 21 Secondly, the letters is grouped into blocks ZY ÇF KI YT Thirdly, the letters is replaced by the corresponding numbers 28 27 3 6 13 10 27 23 Fourthly, each block is multiplied by K-1 28 27 27 16 3 27 6 16 28.27 + 27.1 1 783 0 = = 𝑚𝑜𝑑 29 = 28.16 + 27.14 14 826 14 3.27 + 1.6 1 87 0 = = 𝑚𝑜𝑑 29 = 3.16 + 6.14 14 132 16 13 27 10 16 13.27 + 10.1 1 361 13 = = 𝑚𝑜𝑑 29 = 13.16 + 10.14 14 348 0 27 27 23 16 27.27 + 23.1 1 27 752 = = 𝑚𝑜𝑑 29 = 27.16 + 23.14 14 0 754 Finally, the numbers 0 14 0 16 13 0 27 0 represented as: alankay Hill cipher is secure against to frequency analysis attack. Moreover, the Hill cipher is strong against to brute force attack. The key space of the hill cipher is equal to 29nxn which is an efficiently large value. However, Hill cipher succumbs to a known plaintext attack [3]. The key matrix can be calculated easily from a set of known 𝑃 , 𝐶 . 22 2.2.3. Polyalphabetic Substitution Ciphers Remaining the substitution rules same for every substitution makes the frequency analysis attacks applicable on monoalphabetic ciphers. In a polyalphabetic substitution cipher, multiple substitution alphabets are used throughout the encryption process. Therefore, same plaintext character could be encrypted to different ciphertext symbols. That makes the frequency analysis difficult. 2.2.3.1. Vigenere Cipher Suppose that an n character alphabet and an m character key, K = (k1, k2, …, km) is given. The Vigenere cipher consists of m shift ciphers and ki specifies the monoalphabetic substitution [10]. Encryption process could be implemented by adding plaintext and key values. Accordingly, decryption process could be performed by substruction of ciphertext and key values. Ci = ( Pi + ki ) mod n (2.7) Pi = ( Ci - ki ) mod n (2.8) Suppose that the plaintext is MUSTAFAKEMALATATÜRK and the key is İSTANBUL. The plaintext is encrypted as illustrated in Table 2.12 and the ciphertext is decrypted as demonstrated in Table 2.13. Table 2.12 Mathematical Illustration of Encyrption Process of Vigenere Cipher Plaintext Key Plaintext Key Plaintext+Key mod 29 Ciphertext M Ġ 15 11 26 26 V U S 24 21 45 16 N S T 21 23 44 15 M T A 23 0 23 23 T A N 0 16 16 16 N F B 6 1 7 7 G A U 0 24 24 24 U K L 13 14 27 27 Y E Ġ 5 11 16 16 N M S 15 21 36 7 G A T 0 23 23 23 T L A 14 0 14 14 L A N 0 16 16 16 N T B 23 1 24 24 U A U 0 24 24 24 U T L 23 14 37 8 Ğ Ü Ġ 25 11 36 7 G R S 20 21 41 12 J K T 13 23 36 7 G 23 Table 2.13 Mathematical Illustration of Decryption Process of Vigenere Cipher Ciphertext Key Ciphertext Key Ciphertext-Key mod 29 Plaintext V Ġ 26 11 15 15 M N S 16 21 -5 24 U M T 15 23 -8 21 S T A 23 0 23 23 T N N 16 16 0 0 A G B 7 1 6 6 F U U 24 24 0 0 A Y L 27 14 13 13 K N Ġ 16 11 5 5 E G S 7 21 -14 15 M T T 23 23 0 0 A L A 14 0 14 14 L N N 16 16 0 0 A U B 24 1 23 23 T U U 24 24 0 0 A Ğ L 8 14 -6 23 T G Ġ 7 11 -4 25 Ü J S 12 21 -9 20 R G T 7 23 -16 13 K Table 2.14 Encryption Process of Vigenere Cipher by Using of Vigenere Table M U S T A F A K E M A L A T A T Ü R K Ġ S T A N B U L Ġ S T A N B U L Ġ S T V N M T N G U Y N G T L N U U Ğ G J G Plaintext Key : Ciphertext : : Table 2.15 Vigenere Table for Turkish Alphabet A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B A B B C C Ç Ç D D E E F F G G Ğ Ğ H H I I Ġ Ġ J J K K L L M MN N O O Ö Ö P P R R S S ġ ġ T T U U Ü Ü V V Y Y Z Z A C C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B Ç Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B C D D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç E E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D F F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E G G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F Ğ Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G H H I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ I I Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H Ġ Ġ J K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I J J K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ K K L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J L L M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K M M N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L N N O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M O O Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N Ö Ö P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O P P R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö R R S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P S S ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R ġ ġ T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S T T U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ U U Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T Ü Ü V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U V V Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü Y Y Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Z Z A B C Ç D E F G Ğ H I Ġ J K L M N O Ö P R S ġ T U Ü V Y 24 Intersection of the plaintext and the key on Vigenere table states the ciphertext. Similarly, the plaintext could be obtained by performing inverse process. A sample scenario is illustrated in Table 2.14 while Table 2.15 is used as Vigenere table. Note that A would be replaced by N, U, T, N, and U respectively. The action manipulates the frequency distribution of the ciphertext. Therefore, frequency analysis attacks cannot be performed as easily on Vigenere cipher. // C is an array consisting of elements of ciphertext respectively // K is an array consisting of elements of key respectively for from i=0 to K.length by 1 print ‘Block ’+i+’:’ for from j=i to C.length by K.size print C[j] end for end for Figure 2.4 Pseudocode of Dividing Ciphertext Into Blocks to Attack Suppose that the length of the key is n. In other words, the ciphertext consists of n monoalphabetic substitution ciphers. Note that, the letters of the plaintext at positions 1, n+1, 2n+1, 3n+1 would be encrypted by the same monoalphabetic substitution cipher. Thereby, the ciphertext could be divided into blocks of key length. Implementing frequency analysis to each block respectively makes frequency analysis attacks possible. Figure 2.4 illustrates the algorithm of dividing ciphertext into blocks. An attacker needs to check all possibilities of key between length of 2 and 28 if the key size is not known. Thus, the key space of the Vigenere cipher would be larger than affine cipher and substitution cipher. 25 2.2.3.2. Kasiski Attack Kasiski attack is a method that tries to determine the key size of the polyalpahabatic substitution cipher. The ciphertext could contain a recurrent pattern if the letters of the plaintext are encrypted by the same sequence of key. It is assumed that the distance between the repeating groups of characters could be related with the key length. Greatest common divisor of the distances of the repetations could give a clue about the key size [11]. Suppose that the following ciphertext is given: AUYAJAMĞÖSCMĠRĞYRRMVEIGIZCTSĠALNDIFDARBKMNAOULCRMYEĠGĠ ZTCZÖAġBKRBMASYLÜRĠÇUZEEBYYMÜZDYZġRRKVYAĠINOĞKRDAGRÜP ELAÖLCMEBÖRCNSSVĠVAYSGSCZOBOUZUNĠLUUERRÖDNġMSOEGÖNÖÖB CRYSDVRRBYYĠÖORĠZHÜRġSĠĠJÖYBÖMÜKMJZKNNEFÖYġEYNVLRġMVFIF DULĞYĞRUCKNEATNZIÖORĠZ Table 2.16 Distances Between Repating Characters Character CM IFD BYY ÖORĠZ Times 2 2 2 2 Distance 102 180 90 60 In the example, repeating characters CM, IFD, BYY and ÖORİZ appear two times in the ciphertext. Distances between the characters are illustrated in the Table 2.16. The greatest common divisor of the 60, 90, 102 and 180 is 6. Therefore, 6 and its dividers (2, 3) are candidates of the key length. Indeed, the key is specified as ANKARA and the plaintext is selected from the following text. 26 AĞLASAMSESĠMĠDUYARMISINIZMISRALARIMDADOKUNABĠLĠRMĠSĠNĠZG ÖZYAġLARIMAELLERĠNĠZLEBĠLMEZDĠMġARKILARINBUKADARGÜZELKE LĠMELERĠNSEKĠFAYETSĠZOLDUĞUNUBUDERDEDÜġMEDENÖNCEBĠRYERV ARBĠLĠYORUMHERġEYĠSÖYLEMEKMÜMKÜNEPEYCEYAKLAġMIġIMDUYU YORUMANLATAMIYORUM 3. Homophonic Cipher Homophonic cipher is developed as an alternative to substitution cipher to compose more resistant ciphertexts against to the frequency analysis attacks. Homophonic cipher could be thought as the extended version of substitution cipher [2]. In the classical substitution, each plaintext character is replaced with a corresponding symbol by means of one to one mapping. Homophonic substitution is similar to the classical substitution, except that the mapping is one to many. Each source character is mapped into a set of symbols referred to as homophones [12]. The number of substitutes is proportional to the frequency of the letter in the source language. It can be used to provide randomization. That makes the frequency occurrence of the ciphertext symbols more uniform [3]. The term of homophonic means to sound the same that is an indication of the fact that different symbols refer to same source character. An important attribute of the homophonic coding is that a symbol is picked at random from the set of homophones to represent the given source character in the one to many mapping. Thus, homophonic cipher is transforming a given non-uniformly distributed plaintext into a random uniformly distributed ciphertext [13]. Homophonic substitution is also a cryptographic technique that reduces the redundancy of a message [14]. Homophonic cipher replaces each plaintext letter with different symbols proportional to its frequency rate. Its main goal is to convert the plaintext into a sequence of completely random code symbols. The idea behind homophonic cipher is to balance out the symbol frequencies. The frequency distribution of the ciphertext is manipulated and smoothed. Symbols located in the ciphertext have relatively equal frequencies. Each symbol takes space of about one percent of ciphertext. 28 Suppose that the source language consists of the alphabet A = {a, b} with probabilities pa=3/4, pb=1/4 and the plaintext letters would be substituted according to substitution rule a→{00, 01, 10} and b→{11} as illustrated in Figure 3.1. The message is encrypted at random into one of its homophones with equal probabilities [15]. So, the homophonic cipher provides the encrypted text appearing random. Figure 3.1 Domain Set and Target Set of Homophonic Cipher Figure 3.2 Illustration of Frequency Distribution of a Homophonic Encrypted Text The article of Bir Tılsımı Olmalıdır Hayatın written by Çetin Altan is encrypted by homophonic cipher corresponding to Figure 3.3. The text consists of 3856 letters. Finally, the Figure 3.2 is retrieved. 29 Table 3.1 Percentage of Turkish Letter Frequencies and Expression Count in Homophonic Cipher [2] Letter Freq. A B C Ç D E F G Ğ H 11,92 2,844 0,963 1,156 4,706 8,912 0,461 1,253 1,125 1,212 Exp. Count 12 3 1 1 5 9 1 1 1 1 Letter Freq. I Ġ J K L M N O Ö P 5,114 8,6 0,034 4,683 5,922 3,752 7,484 2,476 0,777 0,886 Exp. Count 5 9 1 5 6 4 7 2 1 1 Letter Freq. R S ġ T U Ü V Y Z 6,722 3,014 1,78 3,314 3,235 1,854 0,959 3,336 1,5 Exp. Count 7 3 2 3 3 2 1 3 2 The unigram frequencies of the source language assess how many symbols the letter would be expressed within homophonic cipher. Each letter would be replaced by different symbols proportional to its frequency rate. Table 3.1 illustrates the expression count of Turkish unigrams. A B C Ç D E F G Ğ H I İ J K L M N O Ö P R S Ş T U Ü V Y Z 009 048 013 062 001 014 010 006 025 023 032 083 015 004 026 022 018 000 072 038 029 011 076 017 008 063 034 021 002 012 081 003 016 070 088 096 037 027 058 005 035 019 086 020 061 085 052 069 033 028 045 024 073 093 056 051 039 059 040 036 030 097 075 047 079 044 031 060 065 084 050 066 042 053 041 046 089 007 068 043 071 077 067 055 054 049 091 080 078 057 090 101 102 092 064 099 082 074 095 087 098 094 Figure 3.3 Substitution Table of Homophonic Cipher Figure 3.3 illustrates an instance of the substitution table of homophonic cipher. The top row represents unigrams of the Turkish, while the below row represents homophones of each unigram. 30 Suppose that the plaintext is FLORYA. The plaintext could be encrypted as one of the following ciphertexts: 010 084 000 035 075 067 010 026 005 042 052 098 010 043 000 029 021 053 Homophonic cipher is a type of monoalphabetic substitution cipher. A letter could be shown as several characters. However, a character could symbolize only one plaintext character. In polyalphabetic cipher, a letter could be represented by several characters and a character would symbolize several letters throughout encryption. In other perspective, the encryption alphabet remains constant throughout the encryption process [16]. exp sym[29]=12, 3, 1, 1, 5, 9, 1, 1, 1, 1, 5, 9, 1, 5, 6, 4, 7, 2, 1, 1, 7, 3, 2, 3, 3, 2, 1, 3, 2 // Expression count of each letter in Turkish Alphabet num_of_sym=102 // Sum of the expression count of Turkish letters keyspace=1 b=sym[0] for from i=0 to 29 keyspace=keyspace*Combination(num_of_sym, b) num_of_sym=num_of_sym-b b=sym[i+1] end for return keyspace function Combination (a, b) dividend=1, denominator=1 for from i=a downto b by -1 dividend = dividend * i end for for from i=b downto 1 by -1 denominator = denominator*b end for resp = dividend / denominator return resp Figure 3.4 Algorithm for Computing the Key Space of Homophonic Cipher for Turkish 31 The key space of homophonic cipher for Turkish is calculated in Figure 3.4. It is a number larger than 10119. Suppose that the attacker is able to check a possibility per microsecond, it would take time more than 10105 years to solve ciphertext in worst case4. Obviously, the encipherment rules out a brute force attack. It is understood that brute force attacks would not be a solution to break homophonic cipher if the key space values of well known algorithms are compared. The key space of the method is overwhelmingly larger than other common algorithms, even most of modern algorithms, except Blowfish. Table 3.2 Comparison of Key Space Values of Common Algorithms Algorithm Substitution Cipher Homophonic Cipher DES 3DES IDEA AES Camellia Twofish Serpent Blowfish Category Classical Classical Modern Modern Modern Modern Modern Modern Modern Modern Key Space 29! 10119 256 2112 2128 2256 2256 2256 2256 2448 Approximate Value 1030 10119 1016 1033 1038 1077 1077 1077 1077 10134 In order to compare the key space of common encryption algorithms, BigDecimal class of Java programming language is used. Thus, approximate key space values are retrieved illustrated in Table 3.2. 4 10 119 10 −6 60.60.24.365 > 10105 4. A New Approach On Attacking Homophonic Cipher Most frequent n-grams would not help to solve homophonic ciphers. Even if these ngrams are assumed to appear in ciphertext, they would almost be impossible to solve because of the high expression count. Homophonic cipher extends the block size of the ciphertext in patches. Moreover, the block size is variable. That makes the identifying the plaintext difficult. Well known statistical features of the source language become invalid because of the manipulation of the ciphetext. The 100 most common words of Turkish are already illustrated in the section 2. Expression count of the most common words could contribute to solve homophonic cipher. Table 4.1 illustrates the expression count of most common 100 words of Turkish. It was sorted with respect to the expression count from smallest to greatest. Nevertheless, making a decision of useful n-grams belonging to the source language plays pivotal role to solve homophonic ciphers. The unigrams of n-grams should have low frequencies to be detected easily in homophonic encrypted texts, whereas the ngram itself should have high frequency to be assumed to appear in the plaintext. In other words, high frequent n-grams should consist of low frequent unigrams [2]. 33 Table 4.1 Expression Count of Most Common 100 Words of Turkish within Homophonic Cipher Word O ġu Ve Bu Hiç Çok Gün Mı Yok Çocuk In Ya Mi Hem Onu Son De Ki Kız ġey Biz Da Ne Her En 4.1. Exp Count 2 6 9 9 9 10 14 20 30 30 35 36 36 36 42 42 45 45 50 54 54 60 63 63 63 Word Ġn Önce Göre Bey Gece Var Yıl Küçük Uzun Tek Çünkü Tam Öyle Ona Büyük Oldu Bir Ben Sen Bunu Doğru Türk Güzel Gibi Ġyi Exp Count 63 63 63 81 81 84 90 100 126 135 140 144 162 168 180 180 189 189 189 189 210 210 216 243 243 Word Ġse Nın Bütün Olur Ġlk Onun Ġki Nin Ġle Böyle ĠĢte Olduğu Ġçin Ama Daha Olan Diye Eski Aynı Bile Beni Yeni Yine Biri Bizim Exp Count 243 245 252 252 270 294 405 441 486 486 486 540 567 576 720 1008 1215 1215 1260 1458 1701 1701 1701 1701 1944 Word Dedi Vardı Fakat Hemen Değil Adam Bana ġimdi Sonra KarĢı BaĢka Biraz Ancak Artık Benim Nasıl Zaman Kadın Olduğunu Kendi Ġnsan Kadar Ġçinde Türkiye Olarak Exp Count 2025 2100 2160 2268 2430 2880 3024 3240 3528 4200 4320 4536 5040 6300 6804 7560 8064 10500 11340 14175 15876 25200 25515 51030 60480 High Frequent N-grams Consisting of Low Frequent Unigrams Turkish n-gram frequencies are initially explored in a large corpus of size 13.4 MB to obtain high frequent n-grams consisting of low frequent unigrams. Secondly, expression count of each line within homophonic cipher is computed. Thirdly, sorting was done with respect to the frequeny values by taking into account the first 300 records for bigrams, 1500 results for trigrams, 2500 results for tetragrams and pentagrams from the greatest to smallest and the rest of data was discarded. Finally, it was sorted with respect to the expression count from the smallest to greatest. N-gram frequencies indicate frequencies in 11.371.564. 34 Table 4.2 Frequencies and Expression Count of High Frequent Bigrams Consisting of Low Frequent Unigrams in 11M Ngr Fre Exp Ngr Fre GÖ GÜ ÇO ÖZ OĞ OC ÜÇ ÇÜ OP PO ĞÜ ÜĞ ĞU UĞ ÖY SÖ CU PT UP BÖ GU VU ÜZ Üġ ZÜ ġÜ Oġ ĞI IĞ ÇI CI IP PI DÖ KÖ FI HI YO SO ġT TÜ UZ Uġ YÜ BÜ ÜY ÜS TO BO OY 1 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 25203 20124 14880 12477 10648 7324 5469 5186 5075 4765 4516 4463 18753 16907 14744 10196 9701 5061 4701 4683 4374 3907 17636 13030 7025 6549 5093 37718 24106 18112 10287 10041 8685 6918 5975 4036 3786 61044 27989 26972 23620 15172 14641 14156 11348 11253 10146 9312 9184 8640 ÜT SÜ OT PL ĞL ÖL LG ġU ÇL ZU OS VL NC ÖR ÖN ĞR RG NG RÇ VR NÇ MÜ ÜM ġM OM VE BU GE CE ĞĠ GĠ ST ĠÇ ÇĠ EV TU SU ĠĞ HE UY EC EĞ TT HĠ ÇE UT CĠ YU ĠP VĠ 7852 7729 7353 7068 7029 6801 6372 6102 5262 5205 4774 4745 21614 17778 15278 7347 6210 5692 4136 3925 3916 11630 11563 9568 5460 49863 44624 40841 37156 36283 35787 31918 31429 27377 25568 22533 22168 21975 21182 20367 19480 19472 19323 18821 17736 13198 11387 11219 10803 10524 Exp Ngr Fre Exp Ngr Fre Exp Ngr Fre Exp 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 10 10 10 10 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 14 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 15 15 15 15 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 20 20 20 20 21 21 21 21 21 21 21 21 21 21 21 21 21 24 24 24 24 24 24 25 25 25 25 25 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 28 28 30 30 30 30 30 35 35 35 35 35 35 35 35 35 36 PE US EÇ FE EF EH ĠH FĠ EP ĠF SS ĠC PĠ Iġ OK KO DÜ IZ ġI ZI ÜK DO ġK KÜ ZD OD HA OL CA AH AP AĞ UM VA MU PA AC ġL ÇA ZL TM AV LÜ ÜL AÇ FA AF GA ĞA MS 10171 10074 8691 8636 7925 7137 6744 6520 6092 5528 4284 3936 3837 32622 27389 25764 25363 23091 22692 14561 13284 12350 11863 11279 8012 4565 56075 54389 33993 33171 27085 26617 26139 23616 23551 21525 18985 18154 17193 16237 13595 13357 13272 13181 12668 11877 11150 10575 6727 4790 LO SM OR ON ÜN ÜR RÜ NÜ Rġ RO DU TI SI KT IY YI KU UK YD IS KS SK BI UD Ġġ ĠZ YL UL LU ZE ġE TL ġĠ ZĠ Eġ EZ SL LT LS IM MI MD KM UN UR NU RU RT RS NS 4242 4030 70300 45874 43059 27057 13654 10971 5193 4899 52283 51386 50362 28964 27516 24913 23617 15139 13222 10610 7683 6723 5760 3968 38227 35238 32373 29371 27338 22563 19742 19492 17282 13862 13465 9758 7774 7738 4266 37245 36784 11337 7097 59182 38171 36391 28453 20288 15816 9885 NT YR YN RB NB TR Aġ AZ LM ZA ġA ML DI IK KI KK ID BĠ ĠY SĠ YE TE TĠ ET BE SE EY ES ĠS ĠT YĠ ĠB EB RM NM LI KL IL LD LK IN ND RD NI RI IR RK NK KR YA 8977 5622 5457 5141 4482 4288 46294 32755 32197 23774 22239 12350 62359 40865 31841 5957 4104 125824 57222 55239 55184 53881 50968 49927 45640 44712 44573 39387 37263 20510 19854 15851 5667 24077 13100 53814 43444 39898 37375 9755 132807 98136 71256 69437 66743 43789 29528 7820 3777 107833 35 Table 4.3 Frequencies and Expression Count of High Frequent Trigrams Consisting of Low Frequent Unigrams in 11M Ngr GÖZ ÇOC GÜV HOC GÖS GÖT ÜĞÜ ÖZÜ ÜÇÜ GÜZ ÜCÜ HOġ KÖP OCU OĞU TOP SÖZ ÖTÜ FÜS ÖLG BOĞ ġÖY GÖR ÖNC ÖĞR GÖN ÜġÜ ÜZÜ ĞÜM UĞU GEÇ HĠÇ SÖY CEĞ HEP GEC BÖY UCU ÖST UYG YGU ÇEV HÇE CEV SUÇ TUĞ HVE ÖPE EVG VGĠ Freq 5755 3833 1092 1017 2247 1143 4160 2872 2810 2803 1607 1540 1250 3910 3155 2979 2686 2203 1824 1038 997 982 12199 4016 2452 1184 5781 3851 964 15363 7184 7076 6645 4410 3254 3087 3016 2433 2287 1996 1940 1737 1719 1278 1217 1215 1090 1043 962 954 Exp 2 2 2 2 3 3 4 4 4 4 4 4 5 6 6 6 6 6 6 6 6 6 7 7 7 7 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 Ngr ÇOK DOĞ DÜĞ KÜÇ ÇÜK KOC KÖġ IZC HIZ ÜTÜ YÜZ CAĞ ÜYO ÜYÜ ÖZL GÜL ĞUM HAF ÖLÜ OĞL POL OTO HAV BOġ ÜġT OPL ġTÜ ÇAĞ MUH AHV ÜSÜ AHÇ CUM VAP LÜĞ GÜN ÖRÜ ÖNÜ OĞR ÇÜN ÜNC ĞÜN ORG FON POR GÜR RGÜ DUĞ TIĞ PTI Freq 9685 4452 2915 2321 1907 1906 1290 1094 1063 5629 5561 4672 4299 4106 3266 2851 2715 2525 2404 2295 2079 2061 2005 1984 1937 1739 1615 1244 1233 1213 1051 1029 1026 978 968 8955 4600 3537 3298 2195 2063 1837 1560 1444 1429 1253 1218 10666 4856 3326 Exp 10 10 10 10 10 10 10 10 10 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 14 14 14 14 14 14 14 14 14 14 14 14 15 15 15 Ngr CUK YIP PIY TIP KÖY KÖT MÜġ ÜMÜ ÜġM MÜZ ZÜM ÖYL UYO ÜST BÜT ġTU UZU UġT BÜY OYU OTU ÖZE STÜ ġEH LUĞ BOY UġU SUZ ĞLU ÜSU ĞĠġ GĠZ EFO ÜVE OST BÖL SYO GEZ ÖġE GUL UÇL VĠZ LUP CUL DÜġ ĞIM ÜDÜ DÜZ KOġ ÜZD Freq 3185 1715 1561 1414 1288 1010 2801 2380 1709 1447 959 13810 5892 5059 4849 3803 3780 3615 3577 3519 3383 2459 2345 2272 2258 2202 1990 1927 1873 1827 1520 1499 1372 1366 1336 1255 1195 1141 1115 1084 1080 1025 1023 987 6913 6633 2013 1357 1151 1104 Exp 15 15 15 15 15 15 16 16 16 16 16 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 20 20 20 20 20 20 Ngr ÜKÜ ĞUN ĞRU UNC RUP VUR ÖRT RGU YÖN GUN VRU MUġ OCA GAZ ÜLÜ HAZ FAZ VAġ ġAH UġM MUZ OĞA OMU ZCA ġAĞ ÜMS APO MÜS TÜM IĞI ÇIK DIĞ ICI KIP GĠB SEV USU GĠT UYU TĠĞ YEC TUT UTU VET ÇBĠ ĠÇB HĠS GET UST ĠHT Freq 1016 7798 2722 1857 1662 1636 1431 1275 1235 1102 975 6665 3344 2805 2684 2055 1790 1767 1684 1674 1672 1475 1287 1195 1139 1077 1062 1026 955 22842 11312 11092 1863 1833 11642 5934 5550 4392 4351 4325 3166 3059 2586 2387 2379 2369 2271 2144 1880 1641 Exp 20 21 21 21 21 21 21 21 21 21 21 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 25 25 25 25 25 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 Ngr ÇTĠ PSĠ HTĠ GĠY EPS HEY TĠH EÇT CES SUS ÇET ÖTE FET SEF TTU UTT GĠS TEC ÜRÜ ÜNÜ ġÜN ZÜN ÖRM ZOR RÜġ IYO ġTI IġT OKU LIĞ YOK YÜK SIZ OKT SOK PIL ÇIL ġIY ÖLD KOY CIL LIP OKS KUġ ÜKS DOS ĞLI TIġ ĞIN DÖN Freq 1610 1575 1488 1407 1338 1311 1305 1248 1232 1188 1150 1114 1101 1070 1069 1065 1015 1008 6600 6358 4498 3283 2011 1800 1040 15455 10166 8101 6541 6354 5827 4374 3789 2768 2715 2599 2183 2095 1930 1928 1705 1554 1389 1326 1317 1195 1169 935 12555 4583 Exp 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 27 28 28 28 28 28 28 28 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 35 35 36 Table 4.4 Frequencies and Expression Count of High Frequent Tetragrams Consisting of Low Frequent Unigrams in 11M Ngr GÖZÜ GÜCÜ ÇOCU GÖTÜ OCUĞ ÇOĞU GÖST CUĞU GÖZL FOTO OTOĞ SÖZÜ CUMH GÜÇL GÖRÜ ÖRGÜ PTIĞ KUVV ÜĞÜM ÖZÜM FÜSU GÜVE BUGÜ HUZU SOĞU BÖLG DÜĞÜ KÜÇÜ ÜÇÜK ÖZÜK VRUP YÜZÜ ÜġTÜ GÜLÜ HOCA LÜĞÜ HĠÇB UYGU GEÇT HEPS TUĞU SEVG HÇET BEHÇ YGUS GUSU UVVE VVET GÖRM ÜĞÜN Fre 1760 628 3833 1132 743 692 2246 699 2267 732 683 633 613 583 4073 909 1419 574 962 608 1797 1092 803 764 678 600 2719 2320 1876 595 609 2595 1607 1314 1017 887 2367 1915 1131 1119 1036 955 807 807 634 600 549 546 1985 1798 Exp 4 4 6 6 6 6 9 9 12 12 12 12 12 12 14 14 15 15 16 16 18 18 18 18 18 18 20 20 20 20 21 24 24 24 24 24 27 27 27 27 27 27 27 27 27 27 27 27 28 28 Ngr GÜNÜ ĞÜNÜ ÖRÜġ ÖZÜN ġÜNC ÖNÜġ OCUK KÖTÜ DOĞU PIYO ġTIĞ OĞUK ÜMÜZ GÖRD GÖND BÜTÜ BÜYÜ GÜZE UĞUM ÜSTÜ OĞLU TOPL GEÇM ĞUMU ġÖYL ÖYLÜ BAHÇ UHAF OLCU HAFT SÖZL UMHU HAYV BÖLÜ YOLC OLUP OTOB TOBÜ OBÜS DÜġÜ PMIġ OĞRU ÖTÜR RGÜT UGÜN ORGU TOĞR OĞUN ÖRÜY PORT Fre 1046 915 893 831 679 611 3087 1007 1002 989 915 595 602 2411 814 4768 3575 2774 2499 2298 1864 1673 1436 1222 982 953 943 935 794 720 677 613 608 593 574 568 559 548 548 4740 567 2540 1182 802 801 784 680 562 560 560 Exp 28 28 28 28 28 28 30 30 30 30 30 30 32 35 35 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 40 40 42 42 42 42 42 42 42 42 42 Ngr ÖBÜR DUĞU DUYG TTIĞ KÖPE ÖFKE ÖPEK KTUP VDĠĞ YDUĞ ÖLÜM MÜġT OĞAZ ġIĞI SÖYL UġTU BÖYL LUĞU BOYU ÖLGE ĞĠġT SUÇL VĠZY ULUĞ VGĠL YGUL ÜġÜN ÜZÜN ġÜNÜ ZÜNÜ CAĞI ÜYÜK ÖLDÜ ġIYO KAHV HIZL ÇAĞI YÜZD ZLIĞ ÇAKÇ UĞUN ĞUNU ÖNCE GÖRE ÖĞRE GERÇ GENÇ ÖREV PENC YUNC Fre 554 10666 1360 1069 899 843 737 580 569 527 1533 1046 635 533 6638 3451 3012 2046 1050 1034 814 767 750 634 628 575 4495 2446 1264 829 4623 2902 1851 1508 1141 953 827 732 544 535 7229 5384 3964 3399 2436 1926 1646 1101 1072 1005 Exp 42 45 45 45 45 45 45 45 45 45 48 48 48 50 54 54 54 54 54 54 54 54 54 54 54 54 56 56 56 56 60 60 60 60 60 60 60 60 60 60 63 63 63 63 63 63 63 63 63 63 Ngr ÇEVR URUP VURU UCUN HERH RUCU ÖRDÜ DOĞR ÇÜNK DÖNÜ RDÜĞ ÖNDÜ GÜND NIZC MUġT LÜYO ÇMĠġ UMUZ OMUT OCAS BOĞA YÜZL MÜST ÜLTÜ MUYO OLUġ UMUġ TIĞI ÇIKT KTIĞ PISI TIPK CISI DÜġM MÜDÜ ġMIġ CEĞĠ ECEĞ GECE TTĠĞ GEÇĠ GĠTT ÖSTE HĠSS GEÇE TUTU UTTU HTĠY EVGĠ EHÇE Fre 804 767 665 587 570 563 2573 2562 1786 1394 1261 1085 902 814 1731 1131 966 942 780 746 700 697 684 649 584 560 524 4806 1764 999 903 727 668 1061 970 879 4351 4324 3002 2865 2546 2324 2286 2062 1655 1144 963 956 954 813 Exp 63 63 63 63 63 63 70 70 70 70 70 70 70 70 72 72 72 72 72 72 72 72 72 72 72 72 72 75 75 75 75 75 75 80 80 80 81 81 81 81 81 81 81 81 81 81 81 81 81 81 37 Table 4.5 Frequencies and Expression Count of High Frequent Pentagrams Consisting of Low Frequent Unigrams in 11M Ngr ÇOCUĞ FOTOĞ OCUĞU GÖZÜK GÖRÜġ GÖZÜN ġOFÖR ÇOCUK CUMHU GÖTÜR ÖRGÜT GÖRÜY GÖVDE GÖLGE ÜĞÜNÜ ÜġÜNC GÖRMÜ ÖZÜNÜ HÜZÜN GÖREV GÖRDÜ ÖRDÜĞ ĞUMUZ OTOBÜ PTIĞI DÜĞÜM HÜKÜM GÖSTE UYGUS UVVET YGUSU OTOĞR ÖRÜYO ÖTÜRÜ GÖRÜL SOĞUK MÜġTÜ GÜLÜM ÖLÜMÜ GÖRÜN ÖRÜNC GÖRÜR KÜÇÜK GÖZLE UĞUMU TUHAF VĠZYO CEVAP ÖYLÜY HAFĠF Fre 748 683 666 593 892 472 394 3085 612 1125 802 540 381 433 914 679 446 441 390 1101 2408 981 568 548 1418 803 425 2246 633 543 539 679 540 484 414 581 807 750 702 1592 550 409 1876 1904 1167 866 748 681 644 605 Exp 6 12 18 20 28 28 28 30 36 42 42 42 45 54 56 56 56 56 56 63 70 70 72 72 75 80 80 81 81 81 81 84 84 84 84 90 96 96 96 98 98 98 100 108 108 108 108 108 108 108 Ngr SÖYLÜ YOLCU TOBÜS UġUYO ÜġÜNÜ ÜZÜNÜ ÖRMÜġ CAĞIZ LDÜĞÜ ÜLKÜC LKÜCÜ DÜġTÜ ÇIKIP FÜSUN GÜVEN OYUNC BUGÜN HUZUR SORGU SONUC DUYGU KUVVE YDUĞU UYDUĞ ÇÜNKÜ RDÜĞÜ DÜĞÜN DÖNÜġ NDÜĞÜ ÜĞÜND BOĞAZ OĞLUM PACAĞ ġTIĞI IġTIĞ CILIĞ BÖLGE ÖYLEC UYGUL ULUĞU UTSUZ SUÇLU VVETL UYUġT YUġTU YÜZÜN ÜNÜYO OĞRAF ġÜNÜY ÜRÜYO Fre 601 559 548 380 1264 384 374 443 421 401 401 385 457 1797 1027 896 800 764 465 381 1360 549 527 416 1786 1261 1080 610 517 387 635 396 379 910 480 382 600 579 575 547 500 443 437 414 413 1827 755 732 612 589 Exp 108 108 108 108 112 112 112 120 120 120 120 120 125 126 126 126 126 126 126 126 135 135 135 135 140 140 140 140 140 140 144 144 144 150 150 150 162 162 162 162 162 162 162 162 162 168 168 168 168 168 Ngr LÜĞÜN GÜRÜL RLÜĞÜ OLDUĞ BÜYÜK OCUKL DUĞUM APTIĞ UĞUNU ÖĞRET URUCU UYGUN TUĞUN TURUC ÜLMÜġ ĞIMIZ DOĞRU PIYOR GÜZEL MUġTU OLUYO YLÜYO OLCUL LUĞUM ÜġÜNM TTIĞI KÖPEK CAĞIM HĠÇBĠ GEÇTĠ HEPSĠ YECEĞ SEVGĠ BEHÇE EHÇET HEYEC STĠHB TUTTU ZDIĞI IġIĞI BÜTÜN ÜSTÜN GÖRME YORUZ ONUġU ġÜNCE YÜRÜY ÜYORU UMHUR AVRUP Fre 508 391 386 7706 2902 2023 1672 1405 5192 520 468 420 402 400 380 1273 2536 987 2753 1715 941 600 517 437 701 1068 737 1511 2364 1125 1114 998 954 807 807 770 749 532 499 469 4767 1587 1535 1468 744 679 676 651 613 609 Exp 168 168 168 180 180 180 180 180 189 189 189 189 189 189 192 200 210 210 216 216 216 216 216 216 224 225 225 240 243 243 243 243 243 243 243 243 243 243 250 250 252 252 252 252 252 252 252 252 252 252 Ngr VRUPA GÜNEġ OĞLUN ġUYOR ONUġT LDUĞU YOKTU CUKLU KLUĞU UKLUĞ ULDUĞ YOKSU OKUSU OKUYU DÜġÜN ÜġÜND ġÜNDÜ ÜZÜND ÖRDÜM ÜRÜDÜ ÜDÜRÜ OLMUġ ÜLÜMS TIĞIM IġIYO ÇAKÇI DOKTO IZLIĞ ZLIĞI HIZLI DUĞUN RDUĞU GÖNDE URDUĞ UĞUND NDUĞU UNDUĞ CEĞĠM POLĠS ġÖYLE GEÇMĠ BAHÇE SÖZLE SUZLU BULUġ GEÇME MUTSU USTAF ÜSTEġ SAHĠP Fre 609 583 511 488 468 8212 1241 753 648 571 506 487 409 374 4234 1202 1137 1066 634 516 398 1619 741 1259 731 529 515 449 445 403 5346 911 814 788 778 512 487 1469 1443 982 929 902 630 556 554 507 488 472 472 445 Exp 252 252 252 252 252 270 270 270 270 270 270 270 270 270 280 280 280 280 280 280 280 288 288 300 300 300 300 300 300 300 315 315 315 315 315 315 315 324 324 324 324 324 324 324 324 324 324 324 324 324 38 The bigram gö seems to be one of the most challenging n-grams. It consists of rare unigrams while it has a high frequency. If a bigram is seen more than one time in the ciphertext and its frequency is about the frequency of % 0.22 (100x25203/11M), it could be assumed to correspond to gö. Rougly, the bigram would be uncovered in the ciphertext of length 873 (2x11M/25203=872.9). In the literature, the bigram qu is the most common sample way of attacking homophonic cipher for English [16]. Frequency of the bigram qu is % 0.2. However, the bigram would be expressed by 3 symbols. Therefore, it seems that precious information is obtained for beginning to attack. The rest of the bigrams could contribute to solve ciphertext but their frequencies are too close. It seems better to turn back after trying to detect more symbols by using other ngrams. Table 4.3 contains useful n-grams to solve ciphertext. Though the values are too close to each other, the trigrams gör and uğu could be evaluated as distinctive because of the frequency values. Table 4.4 contains obtrusive values. The tetragram cumh would be expressed by 12 different symbols. However, detecting the tetragram would be easy. The beginning and ending letter of the tetragram would be replaced with only one symbol. Similarly, same rules are valid for the tetragrams ptığ and vrup. Moreover, the tetragrams çocu and görü have distinctive frequencies. One of the most challenging n-gram seems to be a member of pentagrams. The pentagram çocuğ would be expressed by 6 different symbols. More interestingly, three letters of the tetragram (ç, c, ğ) would be repeated permanently in the ciphertext because each letter would be replaced with only one symbol. Detecting the rest of the letters (o, u) would be easier if the other letters are solved. Similarly, the pentagram fotoğ is a useful n-gram. The first letter and the last letter of the pentagram have a frequency of %1. Furthermore, the pentagrams çocuk and gördü have a distinctive frequency. 39 Another point that should not be ignored is that both the pentagrams çocuğ and çocuk consist of the distinctive tetragram çocu. Distinctive n-grams exist as seen. It seems more meaningful to begin with looking for the bigram gö. Then, pentagrams and tetragrams should be attempted to detect. If pentagrams and tetragrams could be detected in the ciphertext, it provides significant advantage in the rest of the process. Even if these tetragrams and pentagrams do not appear in plaintext, distinctive bigrams and trigrams would most probably help to solve encrypted texts. Suppose that the following ciphertext is given: 045 044 025 007 076 088 096 092 013 098 081 053 048 097 006 085 071 030 085 077 004 099 075 046 041 046 065 098 062 068 032 002 062 005 013 061 025 008 079 000 025 041 097 098 004 073 084 052 057 037 056 074 066 095 018 054 011 016 062 099 022 080 085 002 006 033 080 073 066 012 056 078 038 020 032 040 050 032 076 079 000 019 030 049 094 080 041 092 018 028 090 042 095 066 060 071 006 072 069 084 014 077 060 056 094 069 033 077 094 081 061 093 051 096 036 078 030 032 029 009 017 098 056 070 026 031 035 036 009 046 050 099 058 088 022 056 060 005 050 008 069 036 054 051 056 014 013 074 004 081 008 001 098 071 057 081 088 062 088 027 036 000 040 097 045 090 052 055 013 046 056 076 007 039 045 095 028 095 077 011 000 035 061 079 047 023 012 078 013 087 028 067 030 063 042 004 054 052 016 041 016 028 097 006 063 091 085 101 079 000 025 008 022 021 070 051 003 072 059 063 027 063 005 043 041 008 025 097 068 033 062 065 089 069 034 074 096 087 001 031 066 034 067 040 075 053 069 032 075 078 021 009 018 070 030 084 087 035 073 091 057 019 007 075 067 036 057 020 062 093 051 016 029 007 018 091 055 036 057 062 050 044 101 043 064 029 054 059 018 024 079 055 036 055 062 090 026 046 013 074 056 037 016 029 095 018 094 096 031 084 049 094 042 032 101 070 018 096 072 086 016 011 007 101 001 064 058 028 054 037 055 006 057 062 027 064 075 074 059 011 000 102 061 051 047 029 049 087 028 047 076 026 092 050 047 059 032 091 071 055 079 074 018 083 028 061 006 085 091 096 032 069 089 022 002 016 075 071 016 038 028 078 096 098 059 089 101 041 005 025 008 039 052 089 051 001 072 018 085 027 063 000 049 039 053 019 032 098 023 039 064 030 037 055 022 055 023 022 055 017 017 046 071 075 031 051 049 012 080 013 082 019 005 058 035 098 028 060 102 045 057 003 085 071 021 098 052 078 004 073 069 032 039 073 071 006 014 051 027 090 076 005 026 022 087 019 032 028 024 071 045 016 071 054 069 007 019 014 034 060 101 062 030 044 101 039 061 030 049 061 026 008 096 097 010 008 004 051 078 029 089 059 073 018 006 072 096 084 046 077 099 018 057 045 005 025 042 097 008 062 008 042 022 097 086 017 008 095 026 056 096 074 002 011 000 028 033 026 031 079 009 060 029 057 051 044 077 041 014 059 023 094 034 082 037 092 077 036 005 025 097 003 008 025 008 018 045 098 069 016 052 091 046 038 085 086 40 063 050 074 036 054 059 079 054 052 024 066 090 076 009 059 020 092 076 073 071 045 098 023 085 036 035 044 034 006 046 042 074 041 044 013 098 045 045 024 019 060 066 001 014 056 093 065 009 026 005 029 093 010 064 077 051 054 081 054 080 003 078 088 029 014 052 014 030 067 086 073 058 022 089 076 030 031 096 023 016 059 085 002 003 087 023 094 099 011 030 009 018 081 061 051 061 101 017 087 091 069 054 022 082 017 097 069 098 091 020 073 026 031 081 090 035 007 056 095 050 084 014 077 083 091 079 016 059 019 005 021 097 020 043 098 091 022 094 041 089 025 031 041 072 091 057 027 049 055 040 001 095 006 098 002 064 030 014 037 024 077 083 071 023 057 038 019 095 028 087 048 032 009 049 099 041 055 075 003 007 048 074 018 041 014 066 007 069 045 057 022 095 084 084 093 021 014 030 030 064 038 044 052 082 027 054 036 094 010 098 066 089 071 006 087 069 014 017 074 041 044 066 047 075 029 032 049 027 092 036 089 071 001 067 066 036 000 058 080 092 065 088 056 072 076 044 011 054 071 003 016 030 092 086 028 053 086 026 031 025 031 075 037 092 075 009 069 031 075 005 080 079 097 050 052 087 069 031 084 047 040 070 027 089 099 019 030 078 071 048 061 026 008 101 056 097 076 053 065 051 098 077 003 047 071 056 097 076 094 056 049 009 077 009 052 012 058 019 031 021 033 071 081 099 080 007 065 093 022 049 046 040 088 075 026 016 102 061 023 008 091 008 091 056 078 101 009 034 093 062 046 011 093 066 007 006 055 035 006 074 010 051 055 027 099 086 034 016 006 046 080 006 024 010 043 014 075 064 066 021 012 002 098 077 037 009 077 023 057 058 063 002 011 012 025 041 070 077 074 010 088 068 023 082 049 088 017 019 098 025 045 031 010 033 023 102 083 013 016 043 012 043 036 009 025 079 070 081 061 102 023 012 058 010 064 037 044 096 011 053 025 079 032 035 057 010 054 013 057 034 047 017 019 094 025 041 032 023 009 084 079 097 059 017 098 071 014 042 011 087 025 079 073 064 036 098 020 050 094 023 039 097 017 011 033 025 001 073 023 093 068 050 024 020 010 064 035 061 001 097 101 011 012 025 045 073 028 060 042 056 014 066 030 093 071 001 016 025 088 086 050 044 021 055 066 067 058 073 030 037 078 040 031 038 094 102 056 051 053 035 070 050 046 075 003 047 066 049 092 080 073 017 099 052 047 030 080 005 026 033 077 073 026 000 096 087 071 020 009 084 067 029 070 050 085 069 014 043 046 042 088 005 056 008 026 049 082 029 089 000 020 057 026 043 064 080 060 007 049 016 062 016 076 099 020 084 088 045 009 043 037 053 029 045 098 056 007 011 047 018 078 017 062 031 051 094 035 089 028 087 025 037 053 077 094 021 091 089 068 044 071 020 017 055 041 005 025 027 061 086 000 043 082 066 004 097 076 053 065 051 067 035 032 048 095 080 048 054 040 099 018 016 002 044 052 091 057 038 093 018 003 000 025 079 097 025 008 052 073 051 037 092 040 003 012 030 063 035 056 060 052 016 058 093 101 059 063 010 008 019 008 093 068 088 081 090 071 060 056 060 021 085 069 075 060 102 027 095 085 062 039 099 051 075 005 058 041 097 060 019 030 098 091 028 061 037 043 008 037 047 077 070 101 101 063 010 061 036 061 045 092 023 064 018 085 002 060 062 006 072 062 037 016 029 037 055 046 029 088 075 060 038 036 054 049 093 066 050 024 050 060 086 017 088 068 073 011 089 056 043 070 028 046 071 003 014 018 090 002 083 101 062 005 013 097 068 084 061 025 097 058 008 058 079 082 068 031 036 073 068 084 031 036 031 021 003 073 062 092 039 026 032 013 094 079 082 072 052 051 074 028 061 043 006 097 102 037 061 079 092 072 075 037 014 081 067 025 049 098 035 048 098 086 031 079 087 072 075 043 046 028 046 052 005 025 037 061 019 093 071 055 050 067 084 094 080 073 079 098 072 021 049 074 020 085 040 056 093 075 046 041 064 079 057 025 083 086 054 056 068 097 086 082 056 084 092 029 045 098 058 096 073 002 036 053 023 41 090 081 095 079 046 000 051 078 066 067 093 051 046 049 057 035 083 091 005 040 020 082 004 081 007 077 010 005 017 000 025 040 092 010 031 062 044 096 007 084 024 048 083 084 019 064 034 016 081 063 075 063 056 048 093 029 044 065 035 053 058 041 067 075 082 059 011 032 030 089 026 092 081 054 026 036 055 000 065 089 002 026 082 080 000 004 053 045 089 018 043 067 035 034 044 000 053 059 071 024 051 044 080 007 071 052 016 020 088 086 017 083 040 045 054 025 099 062 005 013 061 068 026 009 080 056 046 058 020 026 093 081 088 040 081 099 035 083 068 090 039 079 014 071 075 005 065 011 097 101 043 061 025 008 066 061 002 067 021 062 053 025 070 007 049 044 030 005 011 037 094 076 027 092 019 089 101 001 087 091 062 070 056 047 013 082 065 062 033 049 065 098 071 030 032 084 098 080 031 045 008 077 003 097 080 027 087 021 078 036 093 021 087 019 024 030 062 090 056 082 079 102 005 049 012 080 070 059 089 091 006 085 013 063 075 055 017 014 042 022 007 048 097 006 085 101 096 031 002 032 022 069 044 021 101 046 038 095 091 001 005 025 061 050 006 085 091 085 062 005 013 097 068 051 053 077 032 027 098 043 067 021 070 068 005 051 078 081 093 026 027 044 062 053 081 092 011 089 052 084 009 006 057 062 014 101 028 088 080 072 027 085 040 034 046 008 022 061 017 074 030 030 093 025 060 022 017 064 065 006 072 040 063 101 050 064 069 072 079 063 037 079 046 062 005 013 008 068 084 012 102 089 039 073 058 048 078 048 078 037 087 040 089 091 045 009 018 061 017 098 066 022 078 039 012 049 098 040 031 081 007 102 006 085 018 020 005 042 008 091 043 067 035 070 050 089 058 020 000 080 061 091 037 009 042 089 079 047 076 067 021 014 030 048 057 091 003 024 091 093 002 054 101 081 093 102 052 012 069 032 011 031 058 053 056 009 069 087 040 078 040 053 036 020 026 078 029 084 094 080 011 082 086 061 048 007 002 083 027 028 063 075 085 056 001 044 001 014 021 024 003 016 048 033 004 058 016 026 044 080 036 087 062 050 078 084 009 027 032 086 003 044 022 044 019 054 066 084 044 029 093 019 017 046 102 090 039 002 024 052 071 064 038 028 087 019 032 058 056 072 075 001 074 088 037 096 005 096 097 026 085 062 085 059 013 063 036 032 071 073 010 030 053 075 004 024 018 072 025 042 024 017 022 044 018 088 066 099 066 083 036 020 016 025 090 052 051 074 048 083 035 005 068 061 051 030 072 035 044 059 083 058 041 055 079 093 002 099 045 088 002 060 088 066 013 088 075 088 027 006 063 002 074 049 026 060 068 030 074 048 088 042 083 091 013 095 052 083 039 047 079 032 050 073 036 000 077 053 029 011 009 018 073 002 062 016 017 088 071 087 049 020 098 101 032 071 056 032 002 032 021 073 027 041 088 021 055 048 060 080 062 005 013 097 004 076 054 099 040 095 000 056 008 045 097 025 061 034 044 048 012 086 073 066 079 053 045 053 056 070 042 022 032 069 031 056 097 029 045 074 043 024 011 093 048 008 037 008 066 045 008 025 008 007 062 060 058 065 005 022 085 091 054 011 017 038 042 005 038 078 006 092 101 045 067 019 089 075 047 038 020 070 025 073 095 045 045 054 012 019 089 075 049 087 038 005 026 054 011 068 078 080 047 096 000 084 097 091 078 006 072 020 085 029 085 049 050 063 076 017 063 005 030 098 042 095 023 037 074 077 003 016 078 058 096 009 080 012 003 087 038 047 029 043 012 050 014 101 020 000 045 082 075 001 070 050 061 062 082 025 087 082 030 026 047 027 070 076 034 064 023 014 050 024 059 028 087 011 032 101 004 072 052 016 068 000 086 039 097 086 020 008 039 013 047 059 031 050 069 024 052 091 016 038 054 022 028 097 006 085 071 079 053 023 088 028 094 069 055 071 102 063 052 047 043 033 040 032 018 003 047 038 005 026 083 019 006 072 042 085 040 034 024 028 090 029 049 007 065 020 057 006 044 002 079 083 025 090 027 054 069 006 085 059 037 055 102 045 055 023 074 039 044 071 010 092 42 080 096 074 079 074 035 019 093 034 007 037 038 005 043 083 019 049 074 040 093 003 046 096 063 062 085 068 084 063 096 026 044 029 007 066 045 064 072 013 085 021 049 055 056 000 080 096 097 017 008 084 094 066 062 000 013 097 068 026 067 080 003 095 076 062 093 021 074 006 072 017 085 077 050 044 056 037 016 096 005 042 004 008 017 097 051 098 066 062 005 013 097 056 043 067 035 048 074 065 062 099 075 044 034 044 102 022 044 056 049 055 056 005 102 056 097 017 008 084 087 018 062 000 013 061 065 037 094 077 004 085 062 085 056 043 063 056 049 046 040 083 058 079 014 065 005 042 096 061 030 061 043 047 066 072 069 046 043 043 095 004 051 046 044 035 068 055 065 062 005 013 061 068 049 067 080 048 063 075 063 041 063 065 043 016 077 083 101 003 014 041 055 006 057 101 024 084 026 054 004 084 055 068 005 029 096 097 020 061 013 008 000 049 022 047 068 099 036 017 044 040 037 057 035 096 031 069 062 000 013 097 096 049 094 080 032 005 042 030 047 004 028 088 029 068 044 071 030 081 093 080 099 065 054 027 060 091 003 044 059 075 005 056 011 097 091 000 026 053 029 092 004 052 044 030 060 086 027 083 086 065 070 077 011 067 043 065 014 011 095 027 062 000 013 097 068 043 033 029 089 067 101 066 064 037 057 035 005 053 058 058 016 043 044 102 060 018 021 046 020 088 076 020 093 042 003 083 025 060 062 005 013 008 096 051 009 040 052 055 020 027 099 076 063 062 039 099 049 075 005 018 071 085 010 008 036 008 059 052 082 077 032 011 070 091 003 087 058 010 053 002 049 082 011 031 003 087 056 073 002 034 057 065 012 041 031 091 053 075 092 065 026 067 080 089 048 082 096 032 022 051 089 000 037 087 018 049 033 035 082 052 094 096 026 012 042 073 048 087 056 031 022 036 073 069 000 049 098 059 049 078 029 028 093 042 041 094 023 094 096 054 052 070 051 070 059 000 091 023 009 069 099 102 098 101 089 071 041 094 036 054 021 078 011 098 037 006 063 059 045 055 022 065 054 039 028 060 026 054 102 066 067 019 032 049 005 051 098 013 078 096 082 050 012 000 006 063 091 003 014 021 054 101 055 096 054 050 081 099 037 095 040 066 014 068 092 001 053 040 056 070 069 062 000 013 061 025 061 045 000 025 087 013 033 056 038 082 002 094 035 082 056 076 094 027 031 101 073 090 038 049 016 062 024 096 055 101 037 064 029 023 064 102 023 009 026 003 055 028 093 051 095 052 005 040 037 087 077 041 070 077 075 009 023 075 092 096 055 027 087 043 090 091 090 036 017 009 018 028 097 084 097 059 011 024 050 017 049 014 040 054 063 011 030 085 059 064 003 074 076 060 095 035 051 064 102 021 092 002 039 032 076 088 043 065 060 036 020 087 101 028 008 026 086 012 090 035 099 000 043 041 008 025 008 066 008 081 088 069 087 066 019 076 083 095 042 060 059 045 057 093 019 020 067 058 081 061 084 052 005 068 017 061 079 060 034 009 018 074 003 046 028 099 052 094 017 032 018 079 087 001 087 093 019 030 087 101 028 008 037 061 091 019 074 027 030 084 055 035 054 075 005 068 020 008 040 052 078 023 021 033 004 024 050 053 051 060 071 102 063 075 067 006 088 028 054 081 007 080 075 098 069 041 070 076 083 083 077 099 059 088 101 081 046 011 030 016 004 098 029 089 000 019 027 087 059 059 099 023 082 020 079 094 033 023 027 046 017 102 047 011 088 050 090 018 030 000 080 008 018 097 052 003 061 028 054 042 068 016 058 020 081 060 040 083 056 093 039 093 101 041 014 071 053 102 030 009 096 012 026 087 018 048 008 056 014 030 043 055 040 069 016 052 018 074 038 046 003 064 012 052 071 031 006 085 066 001 000 025 022 008 076 000 043 094 091 051 087 035 009 045 092 003 005 025 097 039 075 032 037 045 072 018 063 027 026 044 040 090 068 097 020 049 061 005 051 036 061 058 059 097 017 097 056 049 098 042 081 083 042 021 098 018 009 045 005 025 097 050 006 085 066 037 014 080 054 081 054 080 021 087 101 033 43 If the n-gram based attacking model is performed on the ciphertext above, the following consequences are obtained: If the encrypted text is examined, the bigram 006 072 draws attention. It appears 7 times in text of length 3381. Moreover, the frequency of the bigram 006 072 (7x100/3381 = 0.2) is almost equal to the frequency of the bigram gö. Therefore, the bigram could most probably be the bigram gö. Another pattern that comes one step forward in the ciphertext is constituted by the following trigrams: 062 005 013 062 000 013 The pattern appears 14 times in the ciphertext. Moreover, the trigrams specified above would only continue with the characters 097, 061 and 008 in the ciphertext. The pattern could only symbolize the tetragram ÇOCU because of the expression count. The fact is that the tetragram çocu could only continue with the letters k and ğ. In the encrypted text, the tetragram çocu continues with the characters of 068, 004, 056, 065, 096, and 025. If the other trigram patterns that contain 025 are examined, the following bigrams are interesting patterns for an initial investigation: 008 025 008 061 025 061 008 025 061 061 025 008 It had already been detected that the characters 008 and 061 are used to encrypt the letter u. The pattern specified above would most probably be the trigram uğu. 44 Therefore, it could be said that the character 025 corresponds to the letter ğ and the characters 068, 004, 056, 065, 096 correspond to the letter k. K O 080 K U K O 042 K U K O 102 K U K O 029 K U Detection of the letters (g, ö, ç, o, c, u, ğ, k) reveals some patterns specified above. The n-gram KORKU could only correspond to the patterns. Therefore, the characters 080, 042, 102, 029 could correspond to the letter R. Thereafter, the following patterns are detected and solved respectively. O K U 045 U Ğ U: OKUDUĞU Ç O C U K 084 U Ğ U: ÇOCUKLUĞU Ç O C U K L 012 R: ÇOCUKLAR Ç O C U Ğ U D O Ğ 087 C 033 K: ÇOCUĞUDOĞACAK Ç O C U K 043 A R: ÇOCUKLAR Ç O C U K 049 067 R: ÇOCUKLAR Ç O C U K 026 A R: ÇOCUKLAR O K U L L 082 R: OKULLAR Ç O C U K L 009 R: ÇOCUKLAR K 063 Ç 085 K L 063 K: KÜÇÜKLÜK Ç O C U K L A 035: ÇOCUKLAR R Ü 002 G A R: RÜZGAR O L 041 U Ğ U: OLDUĞU Ç O C U K L 094 R: ÇOCUKLAR A K 031 L L A R: AKILLAR A Ğ L 098 R: AĞLAR K O R K U 020 U C U: KORKUTUCU Ü Ç Ü 059 C Ü: ÜÇÜNCÜ Ö Ğ R 024: ÖĞRE L A R D 092: LARDA 45 O L 053 R A K: OLARAK T O R U 091 L A R: TORUNLAR 001 O Ğ U 050 G Ü N Ü: DOĞUMGÜNÜ T O R U N 037 A R: TORUNLAR K Ü Ç Ü K L Ü K L 046: KÜÇÜKLÜKLE O L A N L 078 R: OLANLAR 028 U G Ü N: BUGÜN D A L L A R D A K 007: DALLARDAKĠ G Ö Ç L 016 R: GÖÇLER T O R U N L A R 070 M: TORUNLARIM Ç O C U K L A 077: ÇOCUKLAR Ç O C U K 051 A R: ÇOCUKLAR 088 L K O K U L: ĠLKOKUL G Ü N L 055 R D 055: GÜNLERDE K Ü Ç Ü K L Ü K L 044 R Ġ: KÜÇÜKLÜKLERĠ D E Ğ Ġ 076 Ġ K: DEĞĠġĠK E C 074 K L E R: ECEKLER 075 I L L A R C A: YILLARCA Ç E ġ 099 T L Ġ: ÇEġĠTLĠ B E 052 O Ğ L U: BEYOĞLU G Ü Z E L L 060 K: GÜZELLĠK G Ö 017 Ü R M E K L E: GÖTÜRMEKLE 30 Ü R K Ġ Y E D E: TÜRKĠYEDE 039 U T L U L U K: MUTLULUK K U ġ A K L A R 003 047 071 K U ġ A K L A R A: KUġAKLARDANKUġAKLARA Ġ 019 T A N B U L L U L 047 R: ĠSTANBULLULAR 048 Ü Y Ü D Ü K L E R 083: BÜYÜDÜKLERĠ Ġ 036 T A 101 B U L: ĠSTANBUL E 003 E B Ġ Y A T: EDEBĠYAT R Ü Y A G Ġ B 054: RÜYAGĠBĠ D O Ğ 022 U ġ O L A N: DOĞMUġOLAN A C A 081 A: ACABA S Ġ Y A S 057 T: SĠYASET B A ġ L A M A N 032 N: BAġLAMANIN D O Ğ U M Y 089 L D Ö 018 Ü 027 Ü: DOĞUMYILDÖNÜMÜ 46 095 L K K E Z: ĠLKKEZ 023 Ü S R E 034 G E R E D E: HÜSREVGEREDE B A B 032 A L Ġ: BABIALĠ D O Ğ 079 U Ğ U: DOĞDUĞU E R Ġ Y Ġ 038: ERĠYĠP S 093 N E M A: SĠNEMA A N N E L E R Ġ N Y E T Ġ 086 T Ġ 040 D Ġ Ğ Ġ: ANNELERĠNYETĠġTĠRDĠĞĠ T O R U N L A R I M 089 058 T O R U N L A R 089: TORUNLARIMINTORUNLARI B U L U 066 D U Ğ U: BULUNDUĞU K O M Ü N Ġ 011 T: KOMÜNĠST B Ü Y Ü D Ü K L E R Ġ N D 014: BÜYÜDÜKLERĠNDE R Ü Z G A R 073 N A: RÜZGARINA B 090 R Ġ N Ġ N: BĠRĠNĠN 069 E Y N E P: ZEYNEP 021 I L D Ö N Ü M Ü: YILDÖNÜMÜ S E Ç M E N L 064 R: SEÇMENLER K A L O R Ġ 010 064 R: KALORĠFER As it is seen, solving the encrypted text is possible on long enough ciphertexts. Hereby, all the characters of the encrypted text are solved and the following text is obtained: DEĞĠġĠKACABABUGÜNTÜRKĠYEDEKAÇKIZÇOCUĞUDOĞDUAKILYELKENĠ NĠSEÇĠMRÜZGARINAKAPTIRMIġDOSTLARDANBĠRĠNĠNGÖZLERĠKAZARAB UĠLKSATIRATAKILIRSAEMĠNĠMKĠOMUZSĠLKECEKBUDANEBĠÇĠMSORUDĠY ECEKġĠMDĠBĠRSORUDAHAACABATÜRKĠYEDEBUGÜNÜNDOĞUMYILDÖNÜ MÜOLDUĞUKAÇKIZVEKADINVARYAZIYAYANITLARINESĠYASETÇĠLERĠN NESEÇMENLERĠNNEDESEÇĠLECEKLERĠNAKILLARININKÖġESĠNDENBĠLEG EÇMEYENSORULARLABAġLAMANINNEDENĠBUGÜNKIZIMZEYNEPBAKANI NDOĞUMYILDÖNÜMÜOLMASIAHMETLEMEHMETTENYILLARCASONRABĠ RDEDÜNYAYAKIZIMINGELMĠġOLMASIBENDENĠZĠSEVĠNÇTENMUTLULUK UFUKLARININGÖKLERĠNEDOĞRUUÇURMUġTUĠLKKEZSOBALIDAĠRELERD ENHAVALARSOĞUDUĞUNDAZEYNEPÜġÜMESĠNDĠYENĠġANTAġINDAHÜS REVGEREDECADDESĠNDEKĠKALORĠFERLĠBĠRDAĠREYETAġINMIġTIKHENÜ ZDAHAĠSTANBULUNTANZĠMATUZANTILIBĠRĠKĠMLERĠNDENSOYUTLANM 47 ADIĞIDÖNEMLERDĠGAZETELERĠNHEPSĠBABIALĠDEYDĠBENDENĠZDEMĠLLĠ YETTEPEYAMĠSAFANINGAZETEDENAYRILMASINDANSONRAKĠKÖġESĠND ETAġBAġLIĞIYLAYAZIYORDUMYAZILARIMIĠSTANBULUNKUġAKLARDAN KUġAKLARAYANSIYANBĠRĠKĠMLERĠYLERUHUNUNKANAVĠÇESĠNĠGERGE FLEMĠġVEGERGEFLEYENYAZARLARHENÜZSAĞDIREFĠKHALĠTSAĞDIFAH RĠCELALSAĞDIBURHANFELEKSAĞDIREFĠCEVATSAĞDIHALDUNTANERSA ĞDIESATMAHMUTSAĞDIHĠKMETFERUDUNSAĞDIBĠRKENTĠNDEĞĠġMEYE NANITLARIPARKLARIMEYDANLARITĠYATROLARILOKANTALARIMÜZELE RĠOKULLARIOTELLERĠĠLEÇEġĠTLĠDALLARDAKĠSANATÇILARIBAĞLARAY NIKENTTEDOĞMUġOLANKUġAKLARIBĠRBĠRĠNEZEYNEPĠNDOĞDUĞUYILL ARDATÜRKĠYENĠNNÜFUSUĠKĠBĠNĠKĠYÜZYĠRMĠÜÇMĠLYONDUĠSTANBULL ULARINNÜFUSUDAHENÜZĠÇGÖÇLERLEERĠYĠPSĠLĠNMEMĠġTĠKISIKLIBEND ENĠZĠNÇOCUKLUĞUNUNDAKISIKLISIYDIÇAMLICADAÖYLEBULGURLUDA ÖYLEBAĞLARBAġIDAÖYLEBEYOĞLUSĠNEMALARIDAÖYLETÜRKĠYEDEDE ĞĠġĠKKUġAKLARDANKIZSAHĠBĠDEOLANAĠLELERĠNORTAKBĠRFOTOĞRAFI ÇEKĠLEBĠLSEVEBÜYÜKBĠREKRANDAYANSITILABĠLSEOKIZLAROKADINL ARVEOANNELERĠNYETĠġTĠRDĠĞĠÇOCUKLARKENTLĠBĠRBĠRĠKĠMDENYOKS UNLUĞUNUZAYÇAĞIĠLETOSLAġMASINDANÇIKACAKÇALKANTILARIDUR DURMAYASĠYASETÇĠKADROLARININGÜCÜYETERMĠBUGÜNKIZIMZEYNE PĠNDOĞUMGÜNÜÇOCUKLARIMALAYIKOLABĠLMEÇABASIYLAGEÇENBĠR ÖMÜRVEUMUTETTĠĞĠMTEKGÖRÜNMEZÖDÜLDEÇOCUKLARIMINBABALA RINDANUTANMAMALARIBĠRGÜNTORUNLARIMINTORUNLARIDAġAYETB ENDENĠZĠNBĠRYAZISINAKAZARARASTLARLARSAġUBĠZĠMBÜYÜKDEDEY EDEBAKNELERSAÇMALAMIġDEMESĠNLERĠSTERĠMZEYNEPBASINKÖYDEĠ LKOKULÜÇÜNCÜSINIFTAYKENÖĞRETMENĠNĠNĠSTEĞĠYLEBĠROKULTÖRE NĠNDEDĠZĠDĠZĠĠNCĠYĠMGÜZELLĠKTEBĠRĠNCĠYĠMADIMISORARSANIZÇETĠN ALTANINKIZIYIMDĠYEBĠRÇOCUKġĠĠRĠOKUDUĞUVEBAġINDADAKIRMIZIK URDELESĠBULUNDUĞUĠÇĠNKOMÜNĠSTPROPAGANDASIYAPTIĞIĠDDĠASIYL APOLĠSKARAKOLUNAGÖTÜRÜLMÜġTÜOTARĠHLERDEANKARADAPARLA MENTODAYDIMUÇAĞAATLAMIġVEHEMENBASINKÖYEKOġMUġTUMCANI MZEYNEPĠMBUGÜNDAHĠBAZENRÜYALARINDAPOLĠSGÖRÜRVEBĠRLĠKTE GEZDĠĞĠMĠZGÜNLERDEHEMENFARKEDERSĠVĠLPOLĠSLERĠDEKÜÇÜKLÜKL 48 ERĠNDEÖCÜYLEKORKUTULANÇOCUKLARDĠġÇĠYEGÖTÜRMEKLEKORKUT ULANÇOCUKLARBEKÇĠYEVERMEKLEKORKUTULANÇOCUKLARKÜÇÜKLÜ KLERĠNDEKORKUTULANÖZELLĠKLEERKEKÇOCUKLARBÜYÜDÜKLERĠND EDEGENELLĠKLEKORKUTUCUOLMAKĠSTERLERKIZÇOCUKLARIORTAKBĠR KENTBĠRĠKĠMĠNDENYOKSUNOLARAKYETĠġMĠġKIRSALKESĠMÇOCUKLARI ANNELEROANNELERĠNYETĠġTĠRDĠĞĠÇOCUKLARYETMĠġÜÇMĠLYONNÜFU SUNYARISINDANFAZLASIDAKIZVEKADINAYAKLARIBAKIMLIOLANLARA YAKLARIBAKIMSIZOLANLARBĠRDAHAKĠYILINONHAZĠRANINDASĠYASAL GÜNDEMKĠMBĠLĠRNASILOLACAKAMAOGÜNDEYĠNEKĠMBĠLĠRNEKADARK IZÇOCUĞUDOĞACAKPAZARAKġAMINIĠPLEÇEKENLERHERHALDEBĠLĠYOR LARDIRYAHYAKEMALĠNĠSTANBULUNSEMTLERĠÜSTÜNEDEġĠĠRLERYAZM IġĠLKĠSTANBULġAĠRĠOLDUĞUNUBĠZANSġĠĠRĠNDEĠSTANBULYOKTUDĠVAN EDEBĠYATINDADAĠSTANBULUNSEMTLERĠYOKTURYAHYAKEMALĠNRÜY AGĠBĠBĠRYAZDIġĠĠRĠNĠNBESTEKARIOSMANNĠHATDAAHMETRASĠMĠNTOR UNUYDUBĠRKENTBĠRĠKĠMĠNDENARTAKALANBUKETLERZEYNEPEDEAYN IGÜNDOĞMUġOLANLARADADOĞUMYILDÖNÜMLERĠKUTLUOLSUNNUTU KLARBĠRYANADOĞUMGÜNLERĠBĠRYANA Homophonic cipher comes one step forward in the classical encryption methods because it generates the ciphertexts consisting of variable block sizes. This makes well known attacking models invalid. Although, the encryption method contains vulnerabilities for Turkish, it could clearly be said that the method is stronger than most of classical methods. Moreover, long ciphertexts are needed to cryptoanalysis. If long enough and uniform distributed ciphertext is given, distinctive n-grams would most probably contribute to detect vast majority of the letters of the alphabet. All in all, the method still maintains its resistance today against to frequency analysis attacks on short ciphertexts. 5. Conclusion The key space of the substitution cipher is enormously larger than one of the most common modern encryption method DES. However, the block size of the encryption method remains stable and it is equal to the value of one. That would cause the defeat against to attacks based on statistical features of the source language. Nevertheless, the most of modern cryptosystems such as DES and AES are insprired by the substitution cipher. Moreover, the substitution cipher constitutes the base for the substitution-permutation networks. Herein, homophonic cipher is developed as an alternative to the substitution cipher method. Homophonic cipher extends the block size of the ciphertext in patches. Moreover, the block size is variable. This renders plaintext identification diffucult. Also, well known statistical features of the source language are invalid. In this work, a novel attacking model for Homophonic cipher in Turkish is developed while main concepts of cryptology are demonstrated. Herein, it is copied that Homophonic cipher has a key space wider than the most modern cryptosystems. All in all, without any hesitation it is figured out that the homophonic cipher still maintains its resistence today against to frequency analysis attacks on short ciphertexts. Long ciphertexts are needed to attack encrypted texts. Meanwhile, the corpus size of the related work presented by Dalkılıç [1] is overperformed by this study. Therefore, more correct and more consistent results are obtained. As an additional deliverable, the results obtained towards the solutions of language based cryptographic problems also contribute to the linguistic studies. References [1] Dalkıllıç, M.E., Dalkılıç, G., “On the Cryptographic Patterns and Frequencies in Turkish Language”, Advances in Information Systems, 144-153, (2002). [2] Serengil, S.I., Akin, M., “Attacking Turkish Texts Encrypted by Homophonic Cipher”, Proceedings of the 10th WSEAS International Conference on Electronics, Hardware, Wireless and Optical Communications, 123-126, (2011). [3] Hoffstein, J., Pipher, J., Silverman, J., An Introduction to Mathematical Cryptography, Springer, New York, (2000). [4] Menezes, A.J., Oorschot, P.C.V. et al, Handbook of Applied Cryptography, CRC, New York, (1997). [5] Delf, H., Knebl, H., Introduction to Cryptography, Springer, New York, (2001). [6] Stallings, W., Cryptography and Network Security Principles and Practice, Prentice Hall, New York, (2005). [7] Acharya, B., Rath, G. S. et al, “Novel Methods of Generating Self-Invertible Matrix for Hill Cipher Algorithm”, International Journal of Security, 1(1), 1421, (2007). [8] Eisenberg, M., “Hill Ciphers and Modular Linear Algebra”, Mimeographed Notes, University of Massachusetts, 1-19, (1999). [9] Acharya, B., Jena, D. et al, “Invertible, Involutory and Permutation Matrix Generation Methods for Hill Cipher System”, International Journal of Security, 410-414, (2007). [10] Dalkılıç, M. E., Gungor, C., “An Interactive Cryptanalysis Algorithm for the Vigenere Cipher”, Advances in Information Systems, 341-351, (2000). [11] Pommerening, K., “Kasiski’s Test: Couldn’t the Repetitions be by Accident”, Cryptologia, 30(4), 346-352, (2006). [12] Penzhorn, W., “A Fast Homophonic Coding Algorithm Based on Arithmetic Coding”, IEEE Transactions on Information Theory, 45(6), 2083-2091, (1999). 51 [13] Penzhorn, W.T., Els, W.C., “A Universal Homophonic Coding Algorithm Based on Arithmetic Coding”, Communications and Signal Processing, 45(6), 20832091, (1994). [14] Rocha, V., Massey, J., “On the Entropy Bound for Optimum Homophonic Substitution”, Information Theory, 93-94, (1997). [15] Ryabko, B., Fionov, A., “Efficient Homophonic Coding”, 45(6), 2083-2091, Information Theory, (1999). [16] Singh, S., The Code Book: The Secret History of Codes and Code-breaking, Anchor, New York, (1999). [17] Günther, C., Boveri, A. et al, “A Universal Algorithm for Homophonic Coding”, Advances in Cryptography, 405-414, (1988). [18] Jendal, H., Kuhn, Y., et al, “An Information-Theoretic Treatment of Homophonic Substitution”, Advances in Cryptology, 382-394, (1998). [19] Sgarro, A., “Equivocations for Homophonic Ciphers”, Advances in Cryptology, 51-61, (1985). [20] Hammer, C., “Second Order Homophonic Ciphers”, 12(1), Cryptologia, 11-20, (1988). [21] Hammer, C., “Higher Order Homophonic Ciphers”, 5(4), Cryptologia, 231-242, (1981). [22] King, J. C., Bahler, D. R., “A Framework for the Study of Homophonic Ciphers in Classical Encryption and Genetic Systems”, Cryptologia, 17(1), 45-54, (1993). [23] King, J. C., Bahler, D. R., “An Algorithmic Solution of Sequential Homophonic Ciphers”, Cryptologia, 17(2), 148-165, (1993). [24] Ravi, S., Knight, K., “Bayesian Inference for Zodiac and Other Homophonic Ciphers”, Human Language Technologies, (2011). Appendix In this work, the corpus of size 13.4 MB is used to obtain the Turkish n-gram frequencies. Also, the corpus consists of the sources specified below: 1. 120 articles of a columnist, Çetin Altan, from the Turkish daily newspaper Milliyet published between 20.01.2010 - 04.07.2010. (Available at www.milliyet.com.tr ) 2. 37 novels of 9 authors as listed in Table A.1. 53 Table A.1 List of Novels Used in the Corpus Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 Author Ahmet Altan Ahmet Altan Ahmet Altan Ahmet Altan Ahmet Altan Aziz Nesin Aziz Nesin Aziz Nesin Aziz Nesin Aziz Nesin Aziz Nesin Aziz Nesin Aziz Nesin Aziz Nesin Aziz Nesin Çetin Altan Gülse Birsel Gülse Birsel Gülse Birsel Orhan Kemal Orhan Kemal Orhan Pamuk Orhan Pamuk Orhan Pamuk Orhan Pamuk Orhan Pamuk Orhan Pamuk Orhan Pamuk Orhan Pamuk Orhan Pamuk Rıfat Ilgaz Soner Yalçın Soner Yalçın Soner Yalçın Soner Yalçın Yılmaz Erdoğan Yılmaz Erdoğan Novel Ġçimizde Bir Yer Kırar Göğsüne Bastırırken Aldatmak Kristal Denizaltı Sudaki Ġz Anıtı Dikilen Sinek Borçlu Olduklarımız Tatlı BetüĢ Rıfat Bey Neden KaĢınıyor Bay Düdük Memleketin Birinde Damda Deli Var Ġstanbul’un Halleri Sizin Memlekette EĢĢek Yokmu Gerçeğin Masalı Viski Yolculuk Nereye HemĢerim Hala Ciddiyim Gayet Ciddiyim Cemile Baba Evi Masumiyet Müzesi Babamın Bavulu Hatıralar ve ġehir Kar Benim Adım Kırmızı Yeni Hayat Kara Kitap Beyaz Kale Sessiz Ev Hababam Sınıfı Bay Pipo Reis Beco BinbaĢı Ersever’in Ġtirafları Hijyenik AĢklar Hüsünbaz SeviĢmeler Publishing Year 2004 2003 2002 2001 1985 1982 1976 1974 1965 1958 1958 1956 1974 2005 2004 2003 1952 1949 2008 2006 2003 2002 1998 1994 1990 1985 1983 1957 1999 1997 1996 1994 2003 2001 Biographical Sketch Serengil was born in Istanbul on November 24th, 1986. In 2003, he graduated from Kadikoy Intas Lisesi in Istanbul. He began his undergraduate studies in 2004 at Istanbul Commerce University Computer Engineering Department. In 2009, he received his BSc degree in Computer Engineering from Istanbul Commerce University. He enrolled in MSc studies in Galatasaray University Computer Engineering Department same year. He is currently pursuing his MSc degree. Presently, he is working as a Software Developer at Softtech, which is a subsidiary compay of Isbank Group since August 2010. Also, he is the co-author of the paper entitled “Attacking Turkish Texts encrypted by Homophonic Cipher” which was published in the Proceedings of the 10th WSEAS International Conference on Electronics, Hardware, Wireless and Optical Communications held at Cambridge, UK in February 20-22, 2011.
Benzer belgeler
n - CENG520
• Since the most frequent letters in English are
E, T, A, and O, we first consider the case he
above letters are images of these letters.
• For example, if E is mapped to S in the first
column, th...