Word Frequency

This program takes a text document and a single word as input and outputs the following to stdout (sorted by frequency as a percentage):

The following command will clone the repo into the current directory:

git clone https://github.com/abannerjee/word_frequency.git

Alternatively, download the raw script file from here:

https://raw.githubusercontent.com/abannerjee/word_frequency/master/word_frequency.py

Run the script as follows:

python3.2 word_frequency.py -f <input_file> -w <single_word>
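
For illustration, the two flags in the command above could be parsed as in the sketch below. This is only an assumption about how such a command line might be handled, not the script's actual code; the function name `parse_args` and the attribute names `input_file` and `word` are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the -f and -w flags shown in the usage line (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Word frequency")
    # -f: path to the input text document
    parser.add_argument("-f", dest="input_file", required=True)
    # -w: the single word to look up
    parser.add_argument("-w", dest="word", required=True)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(args.input_file, args.word)
```

Both flags are marked as required here because the usage line shows both; whether the real script enforces that is not stated.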

A word is defined by the following properties:

Case insensitive (e.g. "the" and "The" are considered the same word)
Delimited by spaces
Certain punctuation marks are not considered (the following characters are ignored: '[:.,(){}!?;"]')
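
The three properties above can be sketched as a small tokenizer. This is a minimal sketch, not the script's actual implementation: it assumes the quoted character list is a regular-expression character class, and the names `IGNORED` and `tokenize` are hypothetical.

```python
import re

# Assumption: the ignored characters form a regex character class,
# exactly as quoted in the list above.
IGNORED = re.compile(r'[:.,(){}!?;"]')


def tokenize(text):
    """Return the words in text according to the properties listed above."""
    # Case insensitive: fold everything to lower case.
    lowered = text.lower()
    # Drop the ignored punctuation marks entirely.
    cleaned = IGNORED.sub("", lowered)
    # Split into words. Note that str.split() with no argument splits on
    # any whitespace, which is slightly broader than spaces alone.
    return cleaned.split()


if __name__ == "__main__":
    print(tokenize('The cat, the "hat"!'))
```

Because the punctuation is deleted rather than replaced by a space, characters on either side of a removed mark are joined into one word.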

Characters that are not filtered, such as "&", are counted as words.
Email addresses will not parse correctly (e.g. alex@host.com will be interpreted as alex@hostcom).
If two or more words or sets of words tie in frequency, the one that appears first in the document is listed first. The numbering is unaffected, so no two words or sets of words are both marked as #1.
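
The tie-breaking rule can be sketched as a sort on two keys: descending count, then position of first appearance. This is an illustrative sketch under stated assumptions rather than the script's own code: it ranks single words only (not sets of words), and the name `rank_by_frequency` is hypothetical.

```python
from collections import Counter


def rank_by_frequency(tokens):
    """Return (rank, word, percentage) tuples, most frequent first."""
    counts = Counter(tokens)
    # Record the index at which each word first appears in the document.
    first_seen = {}
    for index, token in enumerate(tokens):
        first_seen.setdefault(token, index)
    total = len(tokens)
    # Higher counts first; ties go to the word that appeared earlier.
    ordered = sorted(counts, key=lambda t: (-counts[t], first_seen[t]))
    # Ranks stay sequential, so tied words still get distinct numbers.
    return [
        (rank, token, 100.0 * counts[token] / total)
        for rank, token in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    for rank, word, pct in rank_by_frequency(["b", "a", "b", "a", "c"]):
        print("#%d %s %.1f%%" % (rank, word, pct))
```

In the sample input, "b" and "a" both occur twice, and "b" is listed as #1 because it appears first.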
