Encode / Decode / Crack the Caesar cipher

Dav1d · Saturday, 19 December 2009, 20:27

Does this even belong here, or should it go under code snippets?

Three functions that deal with the Caesar cipher: one to encode, one to decode and one to crack.

Based on the "Ring class" by Leonidas

Code:

#!/usr/bin/python
# -*- coding: utf-8 -*-

from string import ascii_lowercase


char_order_de = ['e', 'n', 'i', 'r', 's', 'a', 't', 'd', 'h', 'u', 'l', 'g',
                 'o', 'c', 'm', 'b', 'f', 'w', 'k', 'z', 'p', 'v', 'j', 'y', 'x', 'q']

char_order_en = ['e', 't', 'a', 'o', 'i', 'n', 's', 'r', 'h', 'l', 'd', 'c',
                 'u', 'm', 'f', 'p', 'g', 'w', 'y', 'b', 'v', 'k', 'x', 'j', 'q', 'z']
langs = {'de' : char_order_de, 'en' : char_order_en}


class Ring(list): # from Leonidas
    def __getitem__(self, index):
        real_index = index % len(self)
        return list.__getitem__(self, real_index)

def encode_caesar(text, move_each):
    '''Encodes a text with the given chars to move'''
    ring = Ring(ascii_lowercase)
    lower_text = text.lower().replace('ü', 'ue').replace('ö', 'oe').replace('ä', 'ae')
    encoded_text = []
    
    for char in lower_text:
        if char in ascii_lowercase:
            new_index = ring.index(char) + move_each
            new_char = ring[new_index]
        else:
            new_char = char
        encoded_text.append(new_char)
    return ''.join(encoded_text)

def decode_caesar(text, moved):
    '''Decodes a text encoded with the Caesar cipher; the number of
    chars the text was moved by must be known'''
    return encode_caesar(text, 26-moved)

def crack_caesar(text, lang='en'):
    '''Decodes a text encoded with the Caesar cipher without knowing
    the number of moved chars.
    This frequency-based cracking doesn't always work; the longer the
    text is, the better the chance that it can be decoded'''
    # Fall back to the English order (a list, not the string 'en')
    lang = langs.get(lang, char_order_en)
    char_count = {}
    char_number = float(len(text))
    
    for char in text:
        if char in ascii_lowercase:
            if char in char_count:
                char_count[char] += 1
            else:
                char_count[char] = 1
   
    percents = {}
    for char in char_count:
        percent = int(char_count[char]) / char_number
        if percent in percents:
            percents[percent].append(char)
        else:
            percents[percent] = [char]
    
    y = [x for x in reversed(sorted(percents.iteritems()))]
    old_move = 1
    move = 1
    for i, _ in enumerate(y):
        percent, chars = _
        if not len(chars) == 1 or not old_move == move:
            break
        char = ''.join(chars)
        old_move = move
        move = ascii_lowercase.index(lang[i]) - ascii_lowercase.index(char)
       
    return encode_caesar(text, move)

if __name__ == '__main__':
    move_each = int(raw_input('How many chars to rotate?: '))
    text = raw_input('Text: ')
    lang = raw_input('Language en/de: ')
    
    encoded_text = encode_caesar(text, move_each)
    print 'Encoded Text:', encoded_text 
    print 'Decoded Text:', decode_caesar(encoded_text, move_each)
    print 'Cracked Text:', crack_caesar(encoded_text, lang)

Feedback is very welcome, have a look

//Edit: now with docstrings, probably awful
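
For comparison, the shifting itself can also be done without a ring class. The following is only a minimal sketch for Python 3 (the listing above is Python 2), and `shift_text` is a made-up name, not part of the code above. It builds a translation table with `str.maketrans` and skips the umlaut replacement. It also shows why `decode_caesar` can simply encode again with `26 - moved`:

```python
from string import ascii_lowercase


def shift_text(text, n):
    """Shift every letter a-z in `text` by `n` places (hypothetical helper).

    The text is lowercased first; all other characters are kept as they are.
    """
    n %= 26  # also makes negative shifts work
    table = str.maketrans(ascii_lowercase, ascii_lowercase[n:] + ascii_lowercase[:n])
    return text.lower().translate(table)


secret = shift_text("hello, world", 3)
print(secret)                      # khoor, zruog
print(shift_text(secret, 26 - 3))  # hello, world
print(shift_text(secret, -3))      # hello, world
```

Shifting by 3 and then by 26 - 3 adds up to 26, which is a full turn around the alphabet, so the text comes back unchanged.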

Dav1d · Sunday, 20 December 2009, 11:38

Improved again; the old version also has a bug in it

Code:

#!/usr/bin/python
# -*- coding: utf-8 -*-

from string import ascii_lowercase


char_order_de = ['e', 'n', 'i', 'r', 's', 'a', 't', 'd', 'h', 'u', 'l', 'g',
                 'o', 'c', 'm', 'b', 'f', 'w', 'k', 'z', 'p', 'v', 'j', 'y', 'x', 'q']

char_order_en = ['e', 't', 'a', 'o', 'i', 'n', 's', 'r', 'h', 'l', 'd', 'c',
                 'u', 'm', 'f', 'p', 'g', 'w', 'y', 'b', 'v', 'k', 'x', 'j', 'q', 'z']
langs = {'de' : char_order_de, 'en' : char_order_en}


class Ring(list): # from Leonidas
    def __getitem__(self, index):
        real_index = index % len(self)
        return list.__getitem__(self, real_index)

class CrackError(Exception):
    def __init__(self, val):
        self.val = val
    def __str__(self):
        return repr(self.val)

def get_max_count(iterable):
    store = {}
    for i in iterable:
        if i in store:
            store[i] += 1
        else:
            store[i] = 1
    highest = None
    x = 0
    for i in store:
        if store[i] > x:
            x = store[i]
            highest = i
    return highest

    

def encode_caesar(text, move_each):
    '''Encodes a text with the given chars to move'''
    ring = Ring(ascii_lowercase)
    lower_text = text.lower().replace('ü', 'ue').replace('ö', 'oe').replace('ä', 'ae')
    encoded_text = []
    
    for char in lower_text:
        if char in ascii_lowercase:
            new_index = ring.index(char) + move_each
            new_char = ring[new_index]
        else:
            new_char = char
        encoded_text.append(new_char)
    return ''.join(encoded_text)

def decode_caesar(text, moved):
    '''Decodes a text encoded with the Caesar cipher; the number of
    chars the text was moved by must be known'''
    return encode_caesar(text, 26-moved)

def crack_caesar(text, lang='en'):
    '''Decodes a text encoded with the Caesar cipher without knowing
    the number of moved chars.
    This frequency-based cracking doesn't always work; the longer the
    text is, the better the chance that it can be decoded'''
    # Fall back to the English order (a list, not the string 'en')
    lang = langs.get(lang, char_order_en)
    char_count = {}
    char_number = float(len(text))
    
    for char in text: # Counting how often a char occurs
        if char in ascii_lowercase:
            if char in char_count:
                char_count[char] += 1
            else:
                char_count[char] = 1
   
    percents = {}
    for char in char_count: 
        # Calculating the percent of each char
        percent = int(char_count[char]) / char_number
        # Adding to percents (dictionary), the percents are the keys
        if percent in percents:
            percents[percent].append(char)
        else:
            percents[percent] = [char]
    
    # Oh yeah ;)
    # Building a list of the (percent, chars) pairs of percents,
    # sorted by percent with the most frequent chars first
    y = sorted(percents.iteritems(), reverse=True)
    if not y: # No letters at all, nothing to analyse
        raise CrackError('Unable to crack, need a longer text')
    all_results = [] # Let's store all results
    first = y.pop(0) # Getting the first item
    if len(first[1]) == 1: # The first item must have just one char
        char = first[1][0]
        # % 26 so that e.g. -3 and 23 count as the same result
        move = (ascii_lowercase.index(lang[0]) - ascii_lowercase.index(char)) % 26
        all_results.append(move)
        # The most frequent char is already popped, so start counting at 1
        for i, (percent, chars) in enumerate(y, 1):
            if len(chars) != 1: # Several chars share this rank, stop here
                break
            char = chars[0]
            # Let's get our new result!
            move = (ascii_lowercase.index(lang[i]) - ascii_lowercase.index(char)) % 26
            all_results.append(move)
        return encode_caesar(text, get_max_count(all_results))
    else:
        raise CrackError('Unable to crack, need a longer text')

if __name__ == '__main__':
    move_each = int(raw_input('How many chars to rotate?: '))
    #text = raw_input('Text: ')
    #lang = raw_input('Language en/de: ')
    
    # Text copied from http://en.wikipedia.org/wiki/English_poetry
    lang = 'en' 
    long_text = '''The history of English poetry stretches from the middle of the 7th century to the present day.
    Over this period, English poets have written some of the most enduring poems in Western culture,
    and the language and its poetry have spread around the globe. Consequently, the term English poetry
    is unavoidably ambiguous. It can mean poetry written in England, or poetry written in the English language.

    The earliest surviving poetry from the area currently known as England was likely transmitted orally and
    then written down in versions that do not now survive; thus, dating the earliest poetry remains difficult
    and often controversial. The earliest surviving manuscripts date from the 10th century. Poetry written in
    Latin, Brythonic (a predecessor language of Welsh) and Old Irish survives which may date as early as the
    6th century. The earliest surviving poetry written in Anglo-Saxon, the most direct predecessor of modern
    English, may have been composed as early as the seventh century.

    With the growth of trade and the British Empire, the English language had been widely used outside England.
    In the twenty-first century, only a small percentage of the world's native English speakers live in England,
    and there is also a vast population of non-native speakers of English who are capable of writing poetry in
    the language. A number of major national poetries, including the American, Australian, New Zealand, Canadian
    and Indian poetry have emerged and developed. Since 1922, Irish poetry has also been increasingly viewed as
    a separate area of study.'''
        
    encoded_text = encode_caesar(long_text, move_each)
    print '\nEncoded Text:', encoded_text 
    print '\nDecoded Text:', decode_caesar(encoded_text, move_each)
    print '\nCracked Text:', crack_caesar(encoded_text, lang)
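
The voting idea from `crack_caesar` (compare the frequency rank of each letter with the expected order, collect one guess per rank, take the guess that occurs most often) might look roughly like this with `collections.Counter`. This is only a sketch for Python 3, not the code above: `rotate`, `guess_shift` and `EN_ORDER` are names invented here, and the function returns the shift that was used for encoding rather than the decoded text. Every guess is reduced modulo 26, so that for example -3 and 23 count as the same vote:

```python
from collections import Counter
from string import ascii_lowercase

# English letters from most to least frequent, same order as char_order_en
EN_ORDER = "etaoinsrhldcumfpgwybvkxjqz"


def rotate(text, n):
    """Shift the letters a-z of `text` by `n` places (hypothetical helper)."""
    return "".join(
        ascii_lowercase[(ascii_lowercase.index(c) + n) % 26] if c in ascii_lowercase else c
        for c in text.lower()
    )


def guess_shift(ciphertext, order=EN_ORDER):
    """Guess the shift used to encode `ciphertext` by majority vote."""
    counts = Counter(c for c in ciphertext.lower() if c in ascii_lowercase)
    ranked = counts.most_common()  # [(letter, count), ...], most frequent first
    votes = Counter()
    for rank, (letter, count) in enumerate(ranked):
        # Stop at the first tie, because the rank of a tied letter is ambiguous
        if rank + 1 < len(ranked) and ranked[rank + 1][1] == count:
            break
        # Distance from the expected plain letter to the observed cipher letter
        shift = (ascii_lowercase.index(letter) - ascii_lowercase.index(order[rank])) % 26
        votes[shift] += 1
    if not votes:
        raise ValueError("unable to guess the shift, need a longer text")
    return votes.most_common(1)[0][0]


plain = "meet me here between the seven green trees"
cipher = rotate(plain, 3)
shift = guess_shift(cipher)
print(shift)                   # 3
print(rotate(cipher, -shift))  # meet me here between the seven green trees
```

The sample sentence was picked so that "e" and "t" are clearly the two most frequent letters; on short or unusual texts the vote can easily go wrong, just as the docstring of `crack_caesar` warns.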

problembär · Monday, 21 December 2009, 01:16

pycrypto / yawPyCrypto is probably more secure.

lunar · Monday, 21 December 2009, 11:50

I don't think the OP was after strong encryption ...

Dav1d · Monday, 21 December 2009, 14:06

Lunar is absolutely right. My point was rather that beginners who read about this in tutorials etc. have something to look at when their own attempt doesn't work, and that there is a "crack" function for once
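
When a text is too short for counting letters, a beginner can fall back on plain brute force, since there are only 26 possible shifts. This is a different technique from the frequency analysis in `crack_caesar`, shown here only as a Python 3 sketch with made-up names (`undo_shift`, `all_candidates`): it prints every candidate and leaves it to the reader to pick the one that makes sense.

```python
from string import ascii_lowercase


def undo_shift(text, shift):
    """Move the letters a-z of `text` back by `shift` places (hypothetical helper)."""
    return "".join(
        ascii_lowercase[(ascii_lowercase.index(c) - shift) % 26] if c in ascii_lowercase else c
        for c in text.lower()
    )


def all_candidates(ciphertext):
    """Return all 26 possible decodings; the list index is the shift that was undone."""
    return [undo_shift(ciphertext, shift) for shift in range(26)]


for shift, candidate in enumerate(all_candidates("khoor")):
    print(shift, candidate)
```

In this run the line for shift 3 reads "hello".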