Crypto Basics

I see cryptography as the art of keeping secrets. In reality it encompasses so much more than that; it is central to pretty much any security problem out there. Things like verifying the identies of parties talking to each other is an example. Another is playing gambling-like games fairly where no player can be trusted. My interest is in ciphers; that is systems that use small secrets (keys) to protect larger secrets (information). Crypto can typically be divided into algorithms and protocols. Algorithms are procedures that take inputs, process it, and produce outputs. Usually this is the meat of a cryptosystem, the other side is the protocol used for the communication. A protocol is a system of rules by which two parties can accomplish a task (example: communicating securely among hostile eavesdroppers).

Symmetric Key Ciphers

One branch of this world is that of symmetric cryptography. This is a family of ciphers in which both parties share what is known as a secret key. As the name implies, a secret key, is a piece of information that only the two communicating parties know. It is understood that any attacker listening in on the comms does not possess this key. An important aspect of this key is that it is small and easy to store and share. It usually takes the form of a number of a specfied length in bits. What makes a cipher symmetric is that both parties are using the same key for everything. The information that is meant to be kept secret is called the plaintext. The goal of a cipher is make sure that an eavesdropper cannot read the plaintext of a transmission from Bob to Alice or vise-versa. If Bob wants to send a secret message to Alice, he should first encrypt the plaintext using a cipher that takes the secret key as an input. The output of this encryption is called ciphertext. An adversary can read this ciphertext all day long, but without the key, it will mean nothing. When Alice receives the ciphertext, she combines it with her copy of the secret key using the proper decryption algorithm. This reproduces the original plaintext that Bob had encrypted for her.

Asymmetric Key Encryption

The other major type of enciphering method is "Asymmetric"; also known as Public-key crypto. What defines this type of cipher is its use of different keys for encryption and decryption. These key pairs are mathematically generated and are related to each other in such a way that encrypting text with one (called the public key) requires decryption with the other key (called the private key). Here's how it works...Bob and Alice each generate their own key pairs (1 public and 1 private for each of them). They exchange their public keys in some secure manner, but must keep their respective private keys secret. It does not matter if the enemy knows the public keys, it is only important that Alice accurately receives Bob's public key and vise-versa. Once they have each other's public keys (saved in an Address Book perhaps), they can communicate securely. If Alice wants to send Bob a secret love letter, she must encrypt it with the asymmetric cipher using Bob's public key. She then transmits this ciphertext across the insecure medium where the evil hackers live. When Bob recieves the ciphertext, he decrypts it with his private key and reads the love letter. Bob then writes a letter back detailing why Alice's excessive paranoia is unattractive. He should then encrypt the letter with Alice's public key and send it. When Alice receives the encrypted response to her letter, she will hurry decrypting it with her private key...she will then begin searching for a stronger algorithm; convinced the NSA is tampering with them.

Digital Signatures

Public-key cryptography can also be used to digitally sign and verify data. A great analogy for this is traditional handwritten signatures on paper contracts. The purpose of a signature is to prove that the document was created by or agreed upon by the signer. It also gives the data a property called non-repudiation; the inability for the signer to claim forgery. The most straightforward method of signing a digital document is to encrypt it with the signer's private key. The resulting ciphertext is the signed document and could only have been created by someone knowing the signer's secret private-key. When another person wants to verify the signature (and decrypt the document), they need only to decrypt it using the claimed signer's public key. If the result is a legible document, the verifier can be sure that the document was signed. In practice, the algorithm for signing is the asymmetric cipher's decryption function and the method for verifying the signature is the cipher's encryption function.





Block and Stream Ciphers

Another manner by which we can categorize encryption algorithms is by how much plaintext they encrypt at a time. Stream ciphers process the data is very small increments like a stream. They operate on one bit or byte at a time and use a relatively simple operation to combine this plaintext stream with a pseudorandom keystream. The operation that combines these two streams is typically Exclusive-Or (XOR) which is represented by a circle with a cross in it. The keystream is generated by beginning with the key itself as a seed (not shown in the diagram above). Each time a bit of the stream is processed, the key generator churns some internal data around and produces new key bits. If the key generator is absolutely perfect and only reproducible with the key (or we simply use an infinitely-long key), then the cipher is perfectly secure and literally unbreakable. No such key generator has been created/discovered.

The other major cipher-type in this classification scheme is called a block cipher. In these algorithms, data is processed in chunks (64 bit block size for example). The same key is typically used for each chunk. It simplifies things if the key size is the same as the block size, but it doesn't need to be. A part of the cipher called the key expansion algorithm makes the key "fit" the block size. The details of what happens when the block combines with this expanded key varies between ciphers. The next crypto page for this site will almost definitely be about the elements of block ciphers.


Hashing Algorithms

The last popular form of cryptography that we will discuss is cryptographic hashes. A hash is a short piece of data (64 bits for example) that represents a much larger piece of data (like a file). Hash functions take the original data in and produce a hash of it; they cannot do this process in reverse. Because of the difference in the amount of input data and the size of the hash, many data sets will hash to the same value. The trick is that the hashing algorithm should make it very hard/impossible to find these hash collisions. Here is an example of a hashing application: Lets say that you are a game developer and want to ensure that the FBI is not modifying your games and adding spyware into them before setting them on the store shelves. You can inform your customers that the hash of the original game's .exe file is [2552569 for example] and convince them to run their store bought game through the same hash function. If the hash they calculate is also [2552569 for example] then they can be sure (if the hashing algorithm is strong) that the FBI did not tamper with the data. It would be computationally infeasible to modify the game in any targeted way and still have it produce the same hash.