Finding anagrams

Tuesday, April 5, 2005, 21:25

A small script that searches a word file for anagrams of a word you enter. The path `/usr/share/dict/words` is hard-coded in the script, but any other file that contains one word per line works as well.

#!/usr/bin/env python2.4
"""A script for finding anagrams of a given word.

An anagram of a word is another word with exactly the same characters but in
different order.  For example *salesmen* and *lameness* both contain one 'a',
'l', 'm', and 'n' and two 'e' and 's'.

A simple way to test whether two words are anagrams of each other is to sort
both words' characters and compare the results:

    >>> def sort_word(word):
    ...     return ''.join(sorted(list(word)))
    >>> a, b = map(sort_word, ('salesmen', 'lameness'))
    >>> a, b
    ('aeelmnss', 'aeelmnss')
    >>> a == b
    True

To check a given word against a whole list of other words this script builds
a mapping from "normalized", i.e. "sorted words", to all matching words in the
list of words.  The script uses the ``dict`` word file usually found in
``/usr/share/dict/words`` under \*nix.  The file contains one English word per
line.
"""
from itertools import imap, ifilter

__author__ = "Marc 'BlackJack' Rintsch"
__version__ = '0.1'
__date__ = '$Date: 2005-04-05 18:43:49 +0200 (Tue, 05 Apr 2005) $'
__revision__ = '$Rev: 665 $'

WORDS = '/usr/share/dict/words'

def normalize(word):
    """Makes a key from `word` by stripping whitespace, converting to
    lowercase, and sorting the characters.

    >>> map(normalize, ('spam\\n', 'salesmen', '  lameness  '))
    ['amps', 'aeelmnss', 'aeelmnss']
    """
    return ''.join(sorted(list(word.strip().lower())))

def make_anagram_map(words):
    """Maps a "normalized" version of each word to a list of anagrams.
    >>> make_anagram_map(('spam\\n', 'salesmen', '  lameness  '))
    {'aeelmnss': ['salesmen', 'lameness']}

    :returns: a mapping with items that have more than one entry for a key.
    """
    anagram_map = dict()
    for word in imap(str.strip, words):
        anagram_map.setdefault(normalize(word), list()).append(word)
    return dict(ifilter(lambda i: len(i[1]) > 1, anagram_map.iteritems()))

def main():
    """Reads words from file and then asks the user for words to search
    anagrams for.
    """
    words_file = open(WORDS, 'r')
    anagram_map = make_anagram_map(words_file)
    words_file.close()
    while True:
        word = raw_input('Find anagrams of word (just enter to end): ')
        if not word:
            # An empty input ends the loop.
            break
        try:
            print anagram_map[normalize(word)]
        except KeyError:
            print 'No anagrams found for %r' % word

#     # Print all anagrams sorted by number of anagrams.
#     print '\n'.join(map(str, sorted(anagram_map.values(), key=len)))
#     print len(anagram_map)

if __name__ == '__main__':
    main()
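
The script above targets Python 2.4 (`imap`, `ifilter`, `raw_input`, and the `print` statement do not exist in Python 3). For anyone who wants to try the same idea on a current interpreter, here is a minimal Python 3 sketch. It is not the original script: `DEFAULT_WORDS`, `load_anagram_map`, the UTF-8 encoding, and the skipping of blank lines are assumptions made for illustration. The approach (sort the characters to get a key, group words by key, keep only groups with more than one word) is the one described in the docstrings above.

```python
from collections import defaultdict

# Same path the script above hard-codes; the name DEFAULT_WORDS is hypothetical.
DEFAULT_WORDS = '/usr/share/dict/words'


def normalize(word):
    """Make a key from `word`: strip whitespace, lowercase, sort characters."""
    return ''.join(sorted(word.strip().lower()))


def make_anagram_map(words):
    """Map each normalized key to its words, keeping only real anagram groups."""
    groups = defaultdict(list)
    for word in words:
        word = word.strip()
        if word:  # assumption: blank lines carry no word and are skipped
            groups[normalize(word)].append(word)
    # Drop keys that have only a single word, as the original script does.
    return {key: group for key, group in groups.items() if len(group) > 1}


def load_anagram_map(path=DEFAULT_WORDS):
    """Hypothetical helper: build the map from a one-word-per-line file."""
    # Assumption: the word file is UTF-8 encoded.
    with open(path, encoding='utf-8') as words_file:
        return make_anagram_map(words_file)


# Small in-memory demonstration that needs no word file.
sample = make_anagram_map(['spam\n', 'salesmen', '  lameness  '])
print(sample)  # {'aeelmnss': ['salesmen', 'lameness']}
```

To look up a word against a real file, something like `load_anagram_map('mywords.txt').get(normalize('lameness'), [])` should do, where `mywords.txt` stands for any file with one word per line.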