pyzor.digest
Handle digesting the messages.

class pyzor.digest.DataDigester(msg, spec=None)
   Bases: object

   The major workhorse class.

   atomic_num_lines = 4

   digest

   classmethod digest_payloads(msg)

   email_ptrn = <_sre.SRE_Pattern object>

   handle_atomic(lines)
      We digest everything.

   handle_line(line)

   handle_pieced(lines, spec)
      Digest stuff according to the spec.

   longstr_ptrn = <_sre.SRE_Pattern object>

   min_line_length = 8

   classmethod normalize(s)

   static normalize_html_part(s)

   classmethod should_handle_line(s)

   unwanted_txt_repl = ''

   url_ptrn = <_sre.SRE_Pattern object>

   value

   ws_ptrn = <_sre.SRE_Pattern object>
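
The DataDigester listing names the pieces of the digesting process (normalize, should_handle_line, handle_atomic, handle_pieced) without showing how they fit together. The following is a minimal standard-library sketch of one plausible flow, not pyzor's actual code. Only atomic_num_lines = 4 and min_line_length = 8 come from the listing. The regular expressions, the SHA-1 hash, the spec format of (offset percent, line count) pairs, and the names normalize and sketch_digest as written here are all assumptions made for illustration.

```python
import hashlib
import re

# Values taken from the class attributes in the listing.
MIN_LINE_LENGTH = 8
ATOMIC_NUM_LINES = 4

# Assumed patterns: the listing only shows compiled pattern objects,
# so these regexes are illustrative stand-ins.
LONGSTR_PTRN = re.compile(r"\S{10,}")
EMAIL_PTRN = re.compile(r"\S+@\S+")
URL_PTRN = re.compile(r"[a-z]+:\S+", re.IGNORECASE)
WS_PTRN = re.compile(r"\s")

# Assumed spec format: (offset in percent, number of lines) pairs.
SPEC = [(20, 3), (60, 3)]


def normalize(line):
    """Drop long tokens, e-mail addresses and URLs, then all whitespace."""
    for ptrn in (LONGSTR_PTRN, EMAIL_PTRN, URL_PTRN, WS_PTRN):
        line = ptrn.sub("", line)
    return line.strip()


def sketch_digest(text, spec=SPEC):
    """Return a hex digest computed from the normalized lines of ``text``."""
    normalized = [normalize(raw) for raw in text.splitlines()]
    lines = [
        line.encode("utf8")
        for line in normalized
        if len(line) >= MIN_LINE_LENGTH  # skip lines that are too short
    ]

    hasher = hashlib.sha1()
    if len(lines) <= ATOMIC_NUM_LINES:
        # Short message: digest every remaining line.
        for line in lines:
            hasher.update(line)
    else:
        # Longer message: digest only the slices named by the spec.
        for offset, length in spec:
            start = offset * len(lines) // 100
            for line in lines[start:start + length]:
                hasher.update(line)
    return hasher.hexdigest()
```

Under these assumptions, two messages that differ only in the amount of whitespace between words produce the same digest, and in a long message only the lines selected by the spec influence the result.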

class pyzor.digest.HTMLStripper(collector)
   Bases: HTMLParser.HTMLParser

   Strip all tags from the HTML.

   handle_data(data)
      Keep track of the data.

   handle_endtag(tag)

   handle_starttag(tag, attrs)
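
To illustrate the collector pattern that HTMLStripper's signature suggests, here is a small sketch built on Python 3's html.parser (the listing names the Python 2 module HTMLParser). The class name SketchStripper is hypothetical, and skipping the contents of script and style elements is an assumption of this sketch rather than something the listing states.

```python
from html.parser import HTMLParser


class SketchStripper(HTMLParser):
    """Append text nodes to ``collector``, ignoring the tags themselves."""

    def __init__(self, collector):
        super().__init__()
        self.collector = collector
        self.collect = True

    def handle_starttag(self, tag, attrs):
        # Assumption: text inside <script> and <style> is not wanted.
        if tag in ("script", "style"):
            self.collect = False

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.collect = True

    def handle_data(self, data):
        # Keep track of non-empty text while collection is enabled.
        data = data.strip()
        if data and self.collect:
            self.collector.append(data)


parts = []
stripper = SketchStripper(parts)
stripper.feed("<p>Hello <b>world</b></p><script>var x = 1;</script>")
stripper.close()
print(" ".join(parts))
```

The caller owns the list passed as collector, so the stripped text is available as soon as the parser has been fed and closed.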

class pyzor.digest.PrintingDataDigester(msg, spec=None)
   Bases: pyzor.digest.DataDigester

   Extends DataDigester: prints out what we’re digesting.

   handle_line(line)
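
PrintingDataDigester overrides only handle_line, which suggests a simple hook: print the line, then let the parent class digest it. The sketch below shows that override pattern with hypothetical stand-in classes (SketchDigester and PrintingSketchDigester). It does not use pyzor itself, and the constructor taking a list of byte strings and the SHA-1 hash are assumptions.

```python
import hashlib


class SketchDigester:
    """Hypothetical stand-in for a digester with a handle_line hook."""

    def __init__(self, lines):
        self.digest = hashlib.sha1()
        for line in lines:
            self.handle_line(line)
        self.value = self.digest.hexdigest()

    def handle_line(self, line):
        # Feed one already-normalized line (bytes) into the hash.
        self.digest.update(line)


class PrintingSketchDigester(SketchDigester):
    """Print each line before handing it to the parent for hashing."""

    def handle_line(self, line):
        print(line.decode("utf8"))
        super().handle_line(line)


quiet = SketchDigester([b"firstline", b"secondline"])
loud = PrintingSketchDigester([b"firstline", b"secondline"])
```

Because the subclass only adds output and delegates the hashing, both objects end up with the same value, which makes this kind of subclass useful for debugging what goes into a digest.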