Introduction to Cryptography
The word cryptography comes from two Greek words meaning “secret writing”; it is the art and science of hiding information. The field is closely tied to mathematics and computer science, with applications in many areas such as computer security, electronic commerce, and telecommunications.
So cryptography is a subject that should be of interest to many people, especially because we now live in the Information Age, and our secrets can be transmitted in so many ways – email, cell phone, etc. – and all these channels need to be protected.
Secrecy and Encryption
In earlier times, cryptography referred mostly to encryption (the mechanism that converts readable plaintext into unreadable text, i.e. ciphertext) and decryption (the opposite process, i.e. converting ciphertext back to plaintext). In the past cryptography was concerned mainly with message confidentiality (encryption); nowadays it also covers the study and practice of authentication, digital signatures, integrity checking, key management, and so on.
Encryption mainly provides secrecy for a message being transmitted over a communication network. This is called confidentiality of the message. Only the parties who know the key can decipher the message.
Cryptology
Cryptanalysis is the breaking of codes. Cryptanalysis encompasses all of the techniques to recover the plaintext and/or key from the ciphertext.
The combined study of cryptography and cryptanalysis is known as cryptology, though most of the time the terms cryptography and cryptology are used interchangeably.
Encryption and Decryption
Encryption is the process of encoding a message so that its meaning is not obvious, i.e. converting information from one form into another, unreadable form using an algorithm called a cipher with the help of a secret value called a key. The text being converted is called plaintext and the converted text is called ciphertext. Decryption is the reverse process, transforming an encrypted message back into its normal, original form. The key is equally important in the decryption process.
Alternatively, the terms encode and decode or encipher and decipher are used instead of encrypting and decrypting. That is, we say that we encode, encrypt, or encipher the original message to hide its meaning. Then, we decode, decrypt, or decipher it to reveal the original message.
Encryption techniques have been in use for a very long time, as can be seen from the Caesar cipher, used by Julius Caesar to pass information to his soldiers. Encryption techniques have also been used extensively for military purposes, to conceal information from the enemy. Nowadays encryption is used to achieve confidentiality in many areas, such as communication, internet banking, and digital rights management.
Key
A key is a parameter or piece of information that determines the output of a cryptographic algorithm. In encryption, the key determines the transformation of plaintext into ciphertext and vice versa. Keys are also used in other cryptographic processes such as message authentication codes and digital signatures. Most cryptographic systems depend on the key, so keeping the key secret is very important and is one of the difficult problems in practice. Another important issue is the length of the key. Since the key is the sole entity that defines the strength of the security (normally the algorithm used is public), we need to choose a key long enough that an attacker would need a very long time to try all possibilities. To prevent the key from being guessed, it must be chosen at random.
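The effect of key length on exhaustive search can be illustrated with a small sketch. The function names and the rate of 1e9 guesses per second are assumptions made for illustration only; they do not come from the text, and real attack rates vary widely.

```python
import secrets

def random_key(num_bytes: int) -> bytes:
    """Return a key of num_bytes bytes from the OS's secure random source."""
    return secrets.token_bytes(num_bytes)

def keyspace_size(key_bits: int) -> int:
    """Number of distinct keys of the given length in bits."""
    return 2 ** key_bits

def years_to_exhaust(key_bits: int, guesses_per_second: float) -> float:
    """Worst-case time to try every key at an assumed guessing rate."""
    seconds = keyspace_size(key_bits) / guesses_per_second
    return seconds / (365 * 24 * 3600)

key = random_key(16)  # 16 bytes = 128 bits
print(len(key) * 8, "bit key:", key.hex())

# Hypothetical rate of 1e9 guesses per second, for comparison only.
for bits in (40, 56, 128):
    print(bits, "bits:", years_to_exhaust(bits, 1e9), "years")
```

Each additional key bit doubles the number of keys, so the search time grows exponentially with key length.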
Cipher
A cipher is an algorithm for performing encryption and decryption. The operation of a cipher depends on a special piece of information called the key. Without knowledge of the key, it should be difficult, if not nearly impossible, to decrypt the resulting ciphertext into readable plaintext. Many encryption techniques have been developed over the course of history; they can be broadly categorized in terms of the number of keys used and the way plaintext is converted to ciphertext.
Cryptosystem
A cryptosystem is a 5-tuple (quintuple) (E, D, M, K, C), where M is the set of plaintexts, K is the set of keys, C is the set of ciphertexts, E is the set of encryption functions e: M × K → C, and D is the set of decryption functions d: C × K → M.
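As a rough illustration of this definition, the sketch below models a toy cryptosystem whose plaintexts, keys, and ciphertexts are all the numbers 0 to 25, with a simple shift as the encryption function. The shift functions are chosen only for the example; the definition itself does not prescribe any particular e or d. The check confirms that decrypting with the same key always recovers the plaintext.

```python
from itertools import product

ALPHABET_SIZE = 26

M = range(ALPHABET_SIZE)  # plaintexts: letter numbers 0..25
K = range(ALPHABET_SIZE)  # keys: shift amounts 0..25
C = range(ALPHABET_SIZE)  # ciphertexts: letter numbers 0..25

def e(m: int, k: int) -> int:
    """Encryption function e: M x K -> C (a shift, for illustration)."""
    return (m + k) % ALPHABET_SIZE

def d(c: int, k: int) -> int:
    """Decryption function d: C x K -> M (the inverse shift)."""
    return (c - k) % ALPHABET_SIZE

def is_consistent() -> bool:
    """Check that e maps into C and that d(e(m, k), k) == m for every m, k."""
    return all(e(m, k) in C and d(e(m, k), k) == m for m, k in product(M, K))

print(is_consistent())  # True
```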
The type of operations used for transforming plaintext into ciphertext
All encryption algorithms are based on two general principles: substitution, in which each element in the plaintext (bit, letter, group of bits or letters) is mapped into another element, and transposition, in which elements in the plaintext are rearranged. The fundamental requirement is that no information be lost (that is, that all operations are reversible). Most systems, referred to as product systems, involve multiple stages of substitutions and transpositions.
The number of keys used
If both sender and receiver use the same key, the system is referred to as symmetric, single-key, secret-key, or conventional encryption. If the sender and receiver use different keys, the system is referred to as asymmetric, two-key, or public-key encryption.
The way in which the plaintext is processed
A block cipher processes the input one block of elements at a time, producing an output block for each input block. A stream cipher processes the input elements continuously, producing output one element at a time, as it goes along.
Classical Cryptosystem
Historical pen-and-paper ciphers are sometimes known as classical ciphers.
These are old cryptosystems that were used in the pre-computer age. They are too weak by today's standards and can be broken easily with a computer.
We still study these cryptosystems because they illustrate the basic concepts of cryptography.
Substitution Cipher
In substitution ciphers, the letters are systematically replaced by other letters or symbols.
The Caesar cipher is a simple example. With a key of 3, HELLO WORLD encodes to the ciphertext KHOORZRUOG. Here we number the letters of the English alphabet from 0 (A) to 25 (Z), i.e. A = 0, B = 1, ..., Z = 25. Each letter of the clear message is replaced by the letter whose number is obtained by adding the key (a number from 0 to 25) to the letter's number modulo 26. See the picture to visualize the Caesar cipher. Using modular arithmetic, encryption of a letter c by a shift k can be described mathematically as
E_k(c) = (c + k) mod 26
and decryption of a ciphertext letter c as
D_k(c) = (c − k) mod 26
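The shift described above can be sketched in a few lines of Python. The function names are illustrative, and the choice to drop spaces and other non-letters is an assumption made to match the HELLO WORLD example.

```python
def caesar_encrypt(plaintext: str, key: int) -> str:
    """Shift each letter A-Z forward by key positions, modulo 26.

    Non-letters (such as spaces) are dropped.
    """
    result = []
    for ch in plaintext.upper():
        if "A" <= ch <= "Z":
            number = ord(ch) - ord("A")          # A = 0, ..., Z = 25
            shifted = (number + key) % 26        # add the key modulo 26
            result.append(chr(shifted + ord("A")))
    return "".join(result)

def caesar_decrypt(ciphertext: str, key: int) -> str:
    """Undo the shift by moving each letter back by key positions."""
    return caesar_encrypt(ciphertext, -key)

print(caesar_encrypt("HELLO WORLD", 3))  # KHOORZRUOG
print(caesar_decrypt("KHOORZRUOG", 3))   # HELLOWORLD
```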
Attacking the Cipher
The Caesar cipher is quite easily broken, even with ciphertext only. One can attack the ciphertext by exhaustive search, trying all possible keys until the right one is found. Exhaustive search works well when the keyspace is small, and the Caesar cipher has only 26 possible keys. Another approach is statistical analysis, where we compare the letter frequencies of the ciphertext to a 1-gram model of English.
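A minimal sketch of the exhaustive search follows. It simply produces all 26 candidate plaintexts and leaves it to the reader to spot the one that looks like English; the function names are illustrative.

```python
def shift_letters(text: str, key: int) -> str:
    """Shift each letter A-Z by key positions modulo 26, dropping non-letters."""
    return "".join(
        chr((ord(ch) - ord("A") + key) % 26 + ord("A"))
        for ch in text.upper()
        if "A" <= ch <= "Z"
    )

def brute_force(ciphertext: str) -> dict:
    """Return the candidate plaintext for every possible key 0..25."""
    return {key: shift_letters(ciphertext, -key) for key in range(26)}

for key, candidate in brute_force("KHOORZRUOG").items():
    print(key, candidate)
```

Among the 26 lines printed, only the candidate for key 3 (HELLOWORLD) reads as English.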
Caesar’s Problem
The main problem with Caesar’s cipher is that the key is too short and can be found by exhaustive search. In addition, the statistical frequencies are not concealed well, i.e. they look too much like those of regular English letters. One solution is to increase the key length (for example, by using multiple letters in the key) so that cryptanalysis becomes harder.
Transposition Cipher
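In transposition ciphers, the letters are systematically rearranged so that their positions change, garbling the text.
Rail-Fence Cipher
In the rail-fence cipher, the plaintext is written diagonally down and up across a number of rows (rails). The cipher then reads off the rows one after another: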
WECRL TEERD SOEEF EAOCA IVDEN
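The following sketch shows one way a rail-fence transposition can be carried out. The plaintext WE ARE DISCOVERED FLEE AT ONCE and the choice of three rails are assumptions: they are not stated in the text, but together they reproduce the ciphertext shown above.

```python
def rail_fence_encrypt(plaintext: str, rails: int) -> str:
    """Write letters in a zigzag over the given number of rails, then read rail by rail."""
    letters = [ch for ch in plaintext.upper() if "A" <= ch <= "Z"]
    if rails < 2:
        return "".join(letters)
    rows = [[] for _ in range(rails)]
    row, step = 0, 1
    for ch in letters:
        rows[row].append(ch)
        if row == 0:
            step = 1            # bounce off the top rail
        elif row == rails - 1:
            step = -1           # bounce off the bottom rail
        row += step
    return "".join("".join(r) for r in rows)

def group(text: str, size: int = 5) -> str:
    """Split text into space-separated groups for readability."""
    return " ".join(text[i:i + size] for i in range(0, len(text), size))

# Assumed plaintext and rail count (not given in the text).
print(group(rail_fence_encrypt("WE ARE DISCOVERED FLEE AT ONCE", 3)))
# WECRL TEERD SOEEF EAOCA IVDEN
```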
Vigenère Cipher
Key:    VIGVIGVIGVIGVIGV
Plain:  THEBOYHASTHEBALL
Cipher: OPKWWECIYOPKWIRG
Here we write the key repeatedly above the plaintext and apply the Caesar cipher to each plaintext letter, where the shift for each letter is given by the key letter written just above it. This process is simplified by using a table called the tableau.
With the key letters along the top and the plaintext letters down the left, decryption is performed by finding the position of the ciphertext letter in the column that corresponds to the key letter, and then taking the label of the row in which it appears as the plaintext letter. For example, in column V (the key letter), the ciphertext letter O appears in row T, which is taken as the first plaintext letter. The second letter is decrypted by looking up P in column I of the table; it appears in row H, which is taken as the plaintext letter. This process continues until we have found the plaintext letters for all the ciphertext letters.
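The repeated-key procedure can be sketched without the table by treating each key letter as a Caesar shift (A = 0, ..., Z = 25). The function name and the decrypt flag are illustrative, and the sketch assumes a non-empty key consisting only of letters.

```python
from itertools import cycle

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter of text by the matching letter of the repeated key.

    Assumes key is a non-empty string of letters; non-letters in text are dropped.
    """
    sign = -1 if decrypt else 1
    key_shifts = cycle(ord(k) - ord("A") for k in key.upper())
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = sign * next(key_shifts)      # key letter gives the shift
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)

print(vigenere("THEBOYHASTHEBALL", "VIG"))                # OPKWWECIYOPKWIRG
print(vigenere("OPKWWECIYOPKWIRG", "VIG", decrypt=True))  # THEBOYHASTHEBALL
```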
One-Time Pad (simple XOR)
It is a variant of the Vigenère cipher that uses a random key at least as long as the message. Joseph Mauborgne proposed this concept: he suggested using a random key that is as long as the message, so that the key need not be repeated. In addition, the key is used to encrypt and decrypt a single message and is then discarded. Each new message requires a new key of the same length as the new message. The keys of a one-time pad must be truly random; otherwise we can attack the cipher by trying to regenerate the key. Approximations, such as using pseudorandom number generators to generate keys, are not random. Used properly, this approach produces random output that bears no statistical relationship to the plaintext. Because the ciphertext contains no information whatsoever about the plaintext, there is simply no way to break the code, which is why the one-time pad is provably unbreakable.
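The XOR form of the one-time pad can be sketched as follows. The function names are illustrative. Note the assumption in the sketch: `secrets.token_bytes` draws from the operating system's random source, which only stands in here for the truly random pad the scheme requires.

```python
import secrets

def generate_pad(length: int) -> bytes:
    """Return a fresh pad of the given length (stand-in for a truly random key)."""
    return secrets.token_bytes(length)

def xor_with_pad(data: bytes, pad: bytes) -> bytes:
    """XOR data with the pad; the same call encrypts and decrypts."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the message")
    return bytes(b ^ p for b, p in zip(data, pad))

message = b"THE BOY HAS THE BALL"
pad = generate_pad(len(message))        # one pad per message, never reused
ciphertext = xor_with_pad(message, pad)
recovered = xor_with_pad(ciphertext, pad)
print(recovered == message)  # True
```

Because XOR is its own inverse, applying the same pad a second time recovers the message; the pad should then be discarded.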
Playfair Cipher
The best-known multiple-letter encryption cipher is the Playfair, which treats digrams (pairs of letters) in the plaintext as single units and translates these units into ciphertext digrams.
The Playfair algorithm is based on the use of a 5 x 5 matrix of letters constructed using a keyword. If the keyword is MONARCHY, the matrix is (with I and J sharing one cell):
M   O   N   A   R
C   H   Y   B   D
E   F   G   I/J K
L   P   Q   S   T
U   V   W   X   Z
The matrix is constructed by filling in the letters of the keyword (minus duplicates) from left to right and from top to bottom, and then filling in the remainder of the matrix with the remaining letters in alphabetic order. Plaintext is encrypted two letters at a time, according to the following rules:
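The matrix construction described above can be sketched as follows; the sketch covers only building the matrix, not the encryption rules. Merging J into I so that 25 letters fit the 5 x 5 grid is the usual convention and is an assumption here, as is the function name.

```python
import string

def playfair_matrix(keyword: str) -> list:
    """Build the 5 x 5 Playfair matrix for a keyword.

    Keyword letters come first (duplicates removed), then the remaining
    letters in alphabetical order. J is merged into I (assumed convention).
    """
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if not ("A" <= ch <= "Z"):
            continue
        if ch == "J":
            ch = "I"
        if ch not in seen:
            seen.append(ch)
    return [seen[i:i + 5] for i in range(0, 25, 5)]

for row in playfair_matrix("MONARCHY"):
    print(" ".join(row))
```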
Hill Cipher
Another interesting multi-letter cipher is the Hill cipher, developed by the mathematician Lester Hill in 1929. The encryption algorithm takes m successive plaintext letters and substitutes for them m ciphertext letters. The substitution is determined by m linear equations in which each character is assigned a numerical value (a = 0, b = 1 ... z = 25).
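For m = 2, the linear equations amount to multiplying each pair of letter values by a 2 x 2 key matrix modulo 26. The sketch below uses a hypothetical key matrix chosen only for illustration (it is not from the text), together with its inverse modulo 26 for decryption; the function name is also illustrative.

```python
def hill_apply(matrix: list, text: str) -> str:
    """Multiply each block of m letter values by the m x m matrix, modulo 26."""
    m = len(matrix)
    numbers = [ord(ch) - ord("A") for ch in text.upper() if "A" <= ch <= "Z"]
    if len(numbers) % m != 0:
        raise ValueError("text length must be a multiple of the block size")
    out = []
    for i in range(0, len(numbers), m):
        block = numbers[i:i + m]
        for row in matrix:
            # One linear equation per ciphertext letter.
            out.append(sum(a * b for a, b in zip(row, block)) % 26)
    return "".join(chr(n + ord("A")) for n in out)

# Hypothetical 2 x 2 key and its inverse modulo 26 (illustration only).
KEY = [[3, 3], [2, 5]]
KEY_INVERSE = [[15, 17], [20, 9]]

print(hill_apply(KEY, "HELP"))          # HIAT
print(hill_apply(KEY_INVERSE, "HIAT"))  # HELP
```

Decryption works only when the key matrix is invertible modulo 26, which is why the key above was chosen with a determinant (9) that has no common factor with 26.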
References
Shahi, Tej. CSIT NEPAL. <http://www.csitnepal.com/elibrary/notes/>.