pyzor.digest

Handles digesting of messages.

class pyzor.digest.DataDigester(msg, spec=None)[source]

Bases: object

The major workhorse class.

atomic_num_lines = 4
digest
classmethod digest_payloads(msg)[source]
email_ptrn = re.compile('\\S+@\\S+')
handle_atomic(lines)[source]

Digest every line in lines.

handle_line(line)[source]
handle_pieced(lines, spec)[source]

Digest only the lines selected by spec.
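
The difference between the atomic and pieced paths can be illustrated with a rough sketch. It is not pyzor's code. It assumes that a message with at most atomic_num_lines lines is digested whole, and that spec is a sequence of (offset_percent, length) pairs. The spec format, the example_spec values, and the select_lines helper are all assumptions made for illustration.

```python
atomic_num_lines = 4  # value taken from the attribute list on this page

# ASSUMPTION: each spec entry is (offset as a percentage of the line count,
# number of consecutive lines to take). The values are illustrative only.
example_spec = [(20, 3), (60, 3)]


def select_lines(lines, spec=example_spec):
    """Hypothetical helper: return the lines that would be digested."""
    if len(lines) <= atomic_num_lines:
        # Atomic case: short messages contribute every line.
        return list(lines)
    selected = []
    for offset, length in spec:
        # Pieced case: take `length` lines starting at `offset` percent.
        start = offset * len(lines) // 100
        selected.extend(lines[start:start + length])
    return selected


if __name__ == "__main__":
    print(select_lines(["a", "b"]))
    print(select_lines(["line%d" % i for i in range(10)]))
```

Under these assumptions, a long message contributes only a few sampled windows of lines to the digest, while a very short one contributes all of its lines.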

longstr_ptrn = re.compile('\\S{10,}')
min_line_length = 8
classmethod normalize(s)[source]
static normalize_html_part(s)[source]
classmethod should_handle_line(s)[source]
unwanted_txt_repl = ''
url_ptrn = re.compile('[a-z]+:\\S+', re.IGNORECASE)
value
ws_ptrn = re.compile('\\s')
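
As an illustration of how the class-level patterns could work together, here is a minimal standalone sketch. The patterns and constants are copied from the attributes listed above. The helper names normalize_sketch and should_handle_sketch are hypothetical, and the order of substitutions is an assumption, not a statement about what normalize() actually does.

```python
import re

# Patterns and constants copied from the DataDigester attribute list.
email_ptrn = re.compile(r'\S+@\S+')
url_ptrn = re.compile(r'[a-z]+:\S+', re.IGNORECASE)
longstr_ptrn = re.compile(r'\S{10,}')
ws_ptrn = re.compile(r'\s')
unwanted_txt_repl = ''
min_line_length = 8


def normalize_sketch(s):
    """Hypothetical helper: drop long tokens, emails, URLs, then whitespace.

    ASSUMPTION: the substitution order is chosen for illustration.
    """
    s = longstr_ptrn.sub(unwanted_txt_repl, s)
    s = email_ptrn.sub(unwanted_txt_repl, s)
    s = url_ptrn.sub(unwanted_txt_repl, s)
    # Whitespace is removed last because the other patterns are
    # delimited by it.
    return ws_ptrn.sub('', s)


def should_handle_sketch(s):
    """Hypothetical helper: keep lines of at least min_line_length chars."""
    return len(s) >= min_line_length


if __name__ == "__main__":
    line = "see ftp://a.b or me@x.io for a free offer"
    norm = normalize_sketch(line)
    print(norm, should_handle_sketch(norm))
```

Stripping addresses, URLs, and long identifier-like tokens before hashing means that messages differing only in those details normalize to the same text.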
class pyzor.digest.HTMLStripper(collector)[source]

Bases: HTMLParser

Strip all tags from the HTML.

handle_data(data)[source]

Keep track of the data.

handle_endtag(tag)[source]
handle_starttag(tag, attrs)[source]
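
The collector-based approach can be sketched with the standard library's html.parser module. This is a simplified stand-in, not the HTMLStripper implementation. TagStripperSketch and strip_tags are hypothetical names, and it assumes the collector is a list that receives each text fragment.

```python
from html.parser import HTMLParser


class TagStripperSketch(HTMLParser):
    """Hypothetical stand-in: collect text nodes and ignore all tags."""

    def __init__(self, collector):
        super().__init__()
        # ASSUMPTION: collector is a list that receives text fragments.
        self.collector = collector

    def handle_data(self, data):
        # Keep only non-blank text between tags.
        data = data.strip()
        if data:
            self.collector.append(data)


def strip_tags(html):
    """Hypothetical helper: return the text content of an HTML string."""
    collected = []
    parser = TagStripperSketch(collected)
    parser.feed(html)
    parser.close()
    return " ".join(collected)


if __name__ == "__main__":
    print(strip_tags("<p>Hello <b>world</b></p>"))
```

Because start and end tags are never forwarded to the collector, only the text between them survives.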
class pyzor.digest.PrintingDataDigester(msg, spec=None)[source]

Bases: DataDigester

Extends DataDigester to print each piece of data as it is digested.

digest
handle_line(line)[source]
value
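
The override pattern suggested by the handle_line entry can be sketched with toy classes. ToyDigester and PrintingToyDigester are hypothetical names, not pyzor classes, and the use of hashlib.sha1 here is an assumption made only to have a concrete hash.

```python
import hashlib


class ToyDigester:
    """Hypothetical stand-in for a digester; not pyzor's implementation."""

    def __init__(self, lines):
        # ASSUMPTION: SHA-1 is used purely for illustration.
        self.digest = hashlib.sha1()
        for line in lines:
            self.handle_line(line)
        self.value = self.digest.hexdigest()

    def handle_line(self, line):
        # Feed one line of bytes into the running hash.
        self.digest.update(line)


class PrintingToyDigester(ToyDigester):
    """Prints each line before passing it on to the parent class."""

    def handle_line(self, line):
        print(line.decode("utf8"))
        super().handle_line(line)


if __name__ == "__main__":
    lines = [b"first line", b"second line"]
    print(PrintingToyDigester(lines).value)
```

Since the subclass only adds a print call before delegating to the parent, both toy classes produce the same value for the same input, which makes a printing variant useful for debugging.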