New Post has been published on http://www.ismailkeles.info/elementary-cryptography-with-examples.html
Elementary Cryptography [With Examples]
Terminology, Background and Substitution Ciphers
Encryption is the process of encoding a message so that its meaning is not obvious; decryption is the reverse process, transforming an encrypted message back into its normal, original form. Alternatively, the terms encode and decode or encipher and decipher are used instead of encrypt and decrypt. That is, we say that we encode, encrypt, or encipher the original message to hide its meaning. Then, we decode, decrypt, or decipher it to reveal the original message. A system for encryption and decryption is called a cryptosystem.
The original form of a message is known as plaintext, and the encrypted form is called ciphertext. This relationship is shown in Figure 2-1. For convenience, we denote a plaintext message P as a sequence of individual characters P = <p1, p2, âŚ, pn>. Similarly, ciphertext is written as C = <c1, c2, âŚ, cm>. For instance, the plaintext message âI want cookiesâ can be denoted as the message string <I, ,w,a,n,t, , c,o,o,k,i,e,s>. It can be transformed into ciphertext <c1, c2, âŚ, c14>, and the encryption algorithm tells us how the transformation is done.
 We use this formal notation to describe the transformations between plaintext and ciphertext. For example, we write C = E(P) and P = D(C), where C represents the ciphertext, E is the encryption rule, P is the plaintext, and D is the decryption rule. What we seek is a cryptosystem for which P = D(E(P)). In other words, we want to be able to convert the message to protect it from an intruder, but we also want to be able to get the original message back so that the receiver can read it properly.
The cryptosystem involves a set of rules for how to encrypt the plaintext and how to decrypt the ciphertext. The encryption and decryption rules, called algorithms, often use a device called a key, denoted by K, so that the resulting ciphertext depends on the original plaintext message, the algorithm, and the key value. We write this dependence as C = E(K, P). Essentially, E is a set of encryption algorithms, and the key K selects one specific algorithm from the set. We see later in this chapter that a cryptosystem, such as the Caesar cipher, is keyless but that keyed encryptions are more difficult to break.
Sometimes the encryption and decryption keys are the same, so P = D(K, E(K,P)). This form is called symmetric encryption because D and E are mirror-image processes. At other times, encryption and decryption keys come in pairs. Then, a decryption key, KD, inverts the encryption of key KE so that P = D(KD, E(KE,P)). Encryption algorithms of this form are called asymmetric because converting C back to P involves a series of steps and a key that are different from the steps and key of E. The difference between symmetric and asymmetric encryption is shown in Figure 2-2.
 A key gives us flexibility in using an encryption scheme. We can create different encryptions of one plaintext message just by changing the key. Moreover, using a key provides additional security. If the encryption algorithm should fall into the interceptorâs hands, future messages can still be kept secret because the interceptor will not know the key value.
The history of encryption is fascinating; it is well documented in Kahnâs book [Kahn, D. The Codebreakers. Scribner, 1996.]. Encryption has been used for centuries to protect diplomatic and military communications, sometimes without full success. The word cryptography means hidden writing, and it refers to the practice of using encryption to conceal text. A cryptanalyst studies encryption and encrypted messages, hoping to find the hidden meanings.
We want to study ways of encrypting any computer material, whether it is written as ASCII characters, binary data, object code, or a control stream. However, to simplify the explanations, we begin with the encryption of messages written in the standard 26-letter English alphabet, A through Z.
Thus, the letter A is represented by a zero, B by a one, and so on. This representation allows us to consider performing arithmetic on the âlettersâ of a message. That is, we can perform addition and subtraction on letters by adding and subtracting the corresponding code numbers. Expressions such as A + 3 = D or K â 1 = J have their natural interpretation. Arithmetic is performed as if the alphabetic table were circular.[1] In other words, addition wraps around from one end of the table to the other so that Y + 3 = B. Thus, every result of an arithmetic operation is between 0 and 25.
[1]:Â This form of arithmetic is called modular arithmetic, written mod n, which means that any result greater than n is reduced by n as many times as necessary to bring it back into the range 0 result & < n. Another way to reduce a result is to use the remainder after dividing the number by n. For example, the value of 95 mod 26 is the remainder of 95/26, which is 17, while 95 â 26 â 26 â 26 = 17; alternatively, starting at position 0 (A) and counting ahead 95 positions (and returning to position 0 each time after passing position 25) also brings us to position 17.
Children sometimes devise âsecret codesâ that use a correspondence table with which to substitute a character or symbol for each character of the original message. This technique is called a mono alphabetic cipher or simple substitution. A substitution is an acceptable way of encrypting text. In this section, we study several kinds of substitution ciphers.The Caesar Cipher
The Caesar cipher has an important place in history. Julius Caesar is said to have been the first to use this scheme, in which each letter is translated to the letter a fixed number of places after it in the alphabet. Caesar used a shift of 3, so plaintext letter pi was enciphered as ciphertext letter ci by the rule ci = E(pi) = pi + 3
A full translation chart of the Caesar cipher is shown here.
Using this encryption, the message
T R E A T Y I M P O S S I B L E w u h d w b l p s r v v l e o h
Advantages and Disadvantages of the Caesar Cipher
Most ciphers, and especially the early ones, had to be easy to perform in the field. In particular, it was dangerous to have the cryptosystem algorithms written down for the soldiers or spies to follow. Any cipher that was so complicated that its algorithm had to be written out was at risk of being revealed if the interceptor caught a sender with the written instructions. Then, the interceptor could readily decode any ciphertext messages intercepted (until the encryption algorithm could be changed).
The Caesar cipher is quite simple. During Caesarâs lifetime, the simplicity did not dramatically compromise the safety of the encryption because anything written, even in plaintext, was rather well protected; few people knew how to read! The pattern pi + 3 was easy to memorize and implement. A sender in the field could write out a plaintext and a ciphertext alphabet, encode a message to be sent, and then destroy the paper containing the alphabets.
Let us take a closer look at the result of applying Caesarâs encryption technique to âTREATY IMPOSSIBLE.â If we did not know the plaintext and were trying to guess it, we would have many clues from the ciphertext. For example, the break between the two words is preserved in the ciphertext, and double letters are preserved: The SS is translated to vv. We might also notice that when a letter is repeated, it maps again to the same ciphertext as it did previously. So the letters T, I, and E always translate to w, l, and h. These clues make this cipher easy to break.
Suppose you are given the following ciphertext message, and you want to try to determine the original plaintext.
wklv phvvdjh lv qrw wrr kdug wr euhdn
The message has actually been enciphered with a 27-symbol alphabet: A through Z plus the âblankâ character or separator between words. As a start, assume that the coder was lazy and has allowed the blank to be translated to itself. If your assumption is true, it is an exceptional piece of information; knowing where the spaces are allows us to see which are the small words. English has relatively few small words, such as am, is, to, be, he, we, and, are, you, she, and so on. Therefore, one way to attack this problem and break the encryption is to substitute known short words at appropriate places in the ciphertext until you have something that seems to be meaningful. Once the small words fall into place, you can try substituting for matching characters at other places in the ciphertext.
Look again at the ciphertext you are decrypting. There is a strong clue in the repeated r of the word wrr. You might use this text to guess at three-letter words that you know. For instance, two very common three-letter words having the pattern xyy are see and too; other less common possibilities are add, odd, and off. (Of course, there are also obscure possibilities like woo or gee, but it makes more sense to try the common cases first.) Moreover, the combination wr appears in the ciphertext, too, so you can determine whether the first two letters of the three-letter word also form a two-letter word.
For instance, if wrr is SEE, wr would have to be SE, which is unlikely. However, if wrr is TOO, wr would be TO, which is quite reasonable. Substituting T for w and O for r, the message becomes
wklv phvvdjh lv qrw wrr kdug wr euhdn T--- ------- -- -OT TOO ---- TO -----
The OT could be cot, dot, got, hot, lot, not, pot, rot, or tot; a likely choice is not. Unfortunately, q = N does not give any more clues because q appears only once in this sample.
The word lv is also the end of the word wklv, which probably starts with T. Likely two-letter words that can also end a longer word include so, is, in, etc. However, so is unlikely because the form T-SO is not recognizable; IN is ruled out because of the previous assumption that q is N. A more promising alternative is to substitute IS for lv tHRoughout, and continue to analyze the message in that way.
By now, you might notice that the ciphertext letters uncovered are just three positions away from their plaintext counterparts. You (and any experienced cryptanalyst) might try that same pattern on all the unmatched ciphertext. The completion of this decryption is left as an exercise.
The cryptanalysis described here is ad hoc, using deduction based on guesses instead of solid principles. But you can take a more methodical approach, considering which letters commonly start words, which letters commonly end words, and which prefixes and suffixes are common. Cryptanalysts have compiled lists of common prefixes, common suffixes, and words having particular patterns. (For example, sleeps is a word that follows the pattern abccda.) In the next section, we look at a different analysis technique.
Complexity of Substitution Encryption and Decryption
An important issue in using any cryptosystem is the time it takes to turn plaintext into ciphertext, and vice versa. Especially in the field (when encryption is used by spies or decryption is attempted by soldiers), it is essential that the scrambling and unscrambling not deter the authorized parties from completing their missions. The timing is directly related to the complexity of the encryption algorithm. For example, encryption and decryption with substitution ciphers can be performed by direct lookup in a table illustrating the correspondence, like the ones shown in our examples. Transforming a single character can be done in a constant amount of time, so we express the complexity of the algorithm by saying that the time to encrypt a message of n characters is proportional to n. One way of thinking of this expression is that if one message is twice as long as another, it will take twice as long to encrypt.
The Cryptographerâs Dilemma
As with many analysis techniques, having very little ciphertext inhibits the effectiveness of a technique being used to break an encryption. A cryptanalyst works by finding patterns. Short messages give the cryptanalyst little to work with, so short messages are fairly secure with even simple encryption.
Substitutions highlight the cryptologistâs dilemma: An encryption algorithm must be regular for it to be algorithmic and for cryptographers to be able to remember it. Unfortunately, the regularity gives clues to the cryptanalyst.
There is no solution to this dilemma. In fact, cryptography and cryptanalysis at times seem to go together like a dog chasing its tail. First, the cryptographer invents a new encryption algorithm to protect a message. Then, the cryptanalyst studies the algorithm, finding its patterns and weaknesses. The cryptographer then sets out to try to secure messages by inventing a new algorithm, and then the cryptanalyst has a go at it.
The Vernam cipher is a type of one-time pad devised by Gilbert Vernam for AT&T. The Vernam cipher is immune to most cryptanalytic attacks. The basic encryption involves an arbitrarily long nonrepeating sequence of numbers that are combined with the plaintext. Vernamâs invention used an arbitrarily long punched paper tape that fed into a teletype machine. The tape contained random numbers that were combined with characters typed into the teletype. The sequence of random numbers had no repeats, and each tape was used only once. As long as the key tape does not repeat or is not reused, this type of cipher is immune to cryptanalytic attack because the available ciphertext does not display the pattern of the key. A model of this process is shown down.
To see how this method works, we perform a simple Vernam encryption. Assume that the alphabetic letters correspond to their counterparts in arithmetic notation mod 26. That is, the letters are represented with numbers 0 through 25. To use the Vernam cipher, we sum this numerical representation with a stream of random two-digit numbers. For instance, if the message is
the letters would first be converted to their numeric equivalents, as shown here.
Next, we generate random numbers to combine with the letter codes. Suppose the following series of random two-digit numbers is generated.
76 48 16 82 44 03 58 11 60 05 48 88
The encoded form of the message is the sum mod 26 of each coded letter with the corresponding random number. The result is then encoded in the usual base-26 alphabet representation.
In this example, the repeated random number 48 happened to fall at the places of repeated letters, accounting for the repeated ciphertext letter a; such a repetition is highly unlikely. The repeated letter t comes from different plaintext letters, a much moreÂ
likely occurrence. Duplicate ciphertext letters are generally unrelated when this encryption algorithm is used.