CSci 4554 Lab 1. Affine ciphers: encryption, decryption, and cryptanalysis.

Due Wednesday, Feb. 17 at 11:59pm (by e-mail)

35 points total (plus up to 6 extra credit points)

Description

This lab is done in groups of 2. Each group picks a secret key (a pair of numbers) for an affine cipher and encrypts an English text of their choice using this key. Then they post their encrypted text (and their group) on the wiki: http://wiki.umn.edu/view/UMMCSci4554Spring10/AffineCipherLab. After some encrypted texts are posted, each group attempts to break encryptions posted by other groups.

Background

Affine cipher is a substitution monoalphabetic cipher, i.e. it is a keyed algorithm for replacing each letter of the alphabet with a unique letter from the same alphabet. Let N be the size of the alphabet. The key consists of two numbers K1, K2 s.t. 0 < K1 < N, 0 <= K2 < N, and gcd(K1,N) = 1.

To encrypt a plaintext character p, compute: E(p) = (K1 * p + K2) mod N.

To decrypt a ciphertext character c, compute D(c) = ((c - K2) * K1') mod N, where K1' = K1^(-1) is the multiplicative inverse of K1 modulo N.

For more on affine cipher see section 7.3 of the textbook and http://en.wikipedia.org/wiki/Affine_cipher.

Affine cipher can be broken using a simplified frequency analysis approach. For a general frequency analysis see http://en.wikipedia.org/wiki/Frequency_analysis. However, affine cipher is much more specific than a general substitution cipher: it is enough to correctly guess two letters of the encryption to determine the key pair, and therefore to decrypt the cipher:

Suppose you know that a plaintext character p1 encrypts to c1 and p2 encrypts to c2. Then you have a linear equation system for K1 and K2:
c1 = (p1 * K1 + K2) mod N
c2 = (p2 * K1 + K2) mod N
Solving this equation (don't forget that all operations are mod N), you can get K1, K2. Note that you would need to compute a multiplicative inverse of (p1 - p2) modulo N; this would not work if p1 - p2 divides N. In this case just try a different pair p, c.

Tasks for the lab:

  1. Work in pairs.
  2. Write a program that encrypts an English text using an affine cipher. The program should ignore all punctuations and spaces and output the result in upper case letters regardless of the initial case. The key pair K1, K2 can be an input to the program or it can be set in the program itself. The programming language doesn't matter, as long as you submit well-documented source code and instructions for how to compile (if needed) and run it.
  3. Write a function to decrypt an encrypted text with a given K1, K2.
  4. Choose K1, K2 and English text of at least 1000 characters. Encrypt the text using the key pair.
  5. Decrypt your text using your decryption function to make sure that everything is working correctly.
  6. Save the plaintext and the key pair and keep all this information secret (you might want to reset file permissions accordingly). Post the encrypted text on the wiki: http://wiki.umn.edu/view/UMMCSci4554Spring10/AffineCipherLab (mark it with your name).
  7. Write a function to read a text and compute frequencies of letters. You can use the one you wrote for the problem set. You also might want to write a function that finds the most common two-letter combinations in a text.
  8. As other encrypted texts appear on the wiki, take any one of them and use the frequency analysis to try to guess character matchings. Once you have a hypothesis, take two letters that you are have good guesses for, write a linear equations system for K1, K2 used in that encryption, and solve it (see Background above). Note that not all such systems have a unique solutions: if p1 - p2 divides 26 then you need to try a different guessed pair.
  9. Once you solve the system of equations, check your guess by decrypting the text using K1, K2. If it decrypts to a valid English text, record your K1, K2, and the text. Otherwise try a different guess; keep trying until you get a valid English decryption. Write down all the pairs you tried, briefly explain your guesses (for instance, you guessed that the most common two-letter combination may decrypt to "th").
  10. You need to decrypt 3 different texts to get full credit. Every additional one that you decrypt gives you 3 points extra credit.

What to submit (by e-mail to me, CC your partner)

  1. All your program code (well-documented) and instructions for running it.
  2. Your own text (the original) and your K1, K2.
  3. All the decryptions that you have accomplished (decrypted texts and K1, K2) with brief explanations of what you tried, why, and what worked.

CSci 4554 course web site.