hub / github.com/csev/py4e / BeautifulSoup

Class BeautifulSoup

code3/bs4/__init__.py:50–440

This class defines the basic interface called by the tree builders. The parser calls reset() and feed(markup). From its feed() implementation, the tree builder may call handle_starttag(name, attrs), handle_endtag(name), handle_data(data), and endData(containerClass=NavigableString).
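The event-driven design summarized above can be illustrated with a small, self-contained sketch. This is not how bs4 implements its tree: `Node` and `MiniSoup` are hypothetical names invented for this example, and the standard library's `html.parser.HTMLParser` stands in for the parser. The sketch only shows that start-tag, end-tag, and data events are enough to build a tree, and that an empty-element tag can be handled as a start event followed by an end event.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical tree node for this sketch; not a bs4 class."""

    def __init__(self, name, attrs=None, parent=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []  # child Node objects and text strings


class MiniSoup(HTMLParser):
    """Hypothetical toy builder that turns parser events into a tree."""

    ROOT_TAG_NAME = '[document]'

    def __init__(self, markup=""):
        super().__init__()
        self.root = Node(self.ROOT_TAG_NAME)
        self.current = self.root  # innermost open tag
        self.feed(markup)
        self.close()

    def handle_starttag(self, name, attrs):
        # Open a new node under the current one and descend into it.
        node = Node(name, attrs, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, name):
        # Walk up to the nearest open tag with this name, if there is one.
        node = self.current
        while node is not self.root and node.name != name:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_startendtag(self, name, attrs):
        # Empty-element tag: a start event immediately followed by an end event.
        self.handle_starttag(name, attrs)
        self.handle_endtag(name)

    def handle_data(self, data):
        # Text becomes a child of the innermost open tag.
        self.current.children.append(data)


soup = MiniSoup('<p class="x">Hello<br/>world</p>')
p = soup.root.children[0]
print(p.name, p.attrs, [c if isinstance(c, str) else c.name for c in p.children])
```

The example writes the empty element as `<br/>` so that the parser reports it as a self-closing tag. A bare `<br>` would only produce a start event in this toy version, which is one reason real tree builders need extra knowledge about void elements.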

Source from the content-addressed store, hash-verified

'You are trying to run the Python 2 version of Beautiful Soup under Python 3. This will not work.'!='You need to convert the code, either by installing it (`python setup.py install`) or by running 2to3 (`2to3 -w bs4`).'

class BeautifulSoup(Tag):
    """
    This class defines the basic interface called by the tree builders.

    These methods will be called by the parser:
      reset()
      feed(markup)

    The tree builder may call these methods from its feed() implementation:
      handle_starttag(name, attrs) # See note about return value
      handle_endtag(name)
      handle_data(data) # Appends to the current data node
      endData(containerClass=NavigableString) # Ends the current data node

    No matter how complicated the underlying parser is, you should be
    able to build a tree using 'start tag' events, 'end tag' events,
    'data' events, and "done with data" events.

    If you encounter an empty-element tag (aka a self-closing tag,
    like HTML's <br> tag), call handle_starttag and then
    handle_endtag.
    """
    ROOT_TAG_NAME = '[document]'

    # If the end-user gives no indication which tree builder they
    # want, look for one with these features.
    DEFAULT_BUILDER_FEATURES = ['html', 'fast']

    ASCII_SPACES = '\x20\x0a\x09\x0c\x0d'

    NO_PARSER_SPECIFIED_WARNING = "No parser was explicitly specified, so I'm using the best available %(markup_type)s parser for this system (\"%(parser)s\"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.\n\nTo get rid of this warning, change this:\n\n BeautifulSoup([your markup])\n\nto this:\n\n BeautifulSoup([your markup], \"%(parser)s\")\n"

    def __init__(self, markup="", features=None, builder=None,
                 parse_only=None, from_encoding=None, exclude_encodings=None,
                 **kwargs):
        """The Soup object is initialized as the 'root tag', and the
        provided markup (which can be a string or a file-like object)
        is fed into the underlying parser."""

        if 'convertEntities' in kwargs:
            warnings.warn(
                "BS4 does not respect the convertEntities argument to the "
                "BeautifulSoup constructor. Entities are always converted "
                "to Unicode characters.")

        if 'markupMassage' in kwargs:
            del kwargs['markupMassage']
            warnings.warn(
                "BS4 does not respect the markupMassage argument to the "
                "BeautifulSoup constructor. The tree builder is responsible "
                "for any necessary markup massage.")

        if 'smartQuotesTo' in kwargs:
            del kwargs['smartQuotesTo']
            warnings.warn(
                "BS4 does not respect the smartQuotesTo argument to the "
                "BeautifulSoup constructor. Smart quotes are always converted "
                "to Unicode characters.")
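The legacy-keyword handling at the start of the constructor follows a common pattern: detect an obsolete argument, drop it, and warn instead of failing. The sketch below isolates that pattern using only the standard library. `LEGACY_KWARGS` and `strip_legacy_kwargs` are hypothetical names that do not exist in bs4, and the sketch covers only the two keywords that the listing deletes (in the listing, the `convertEntities` branch warns without removing the key).

```python
import warnings

# Hypothetical table for this sketch: legacy keyword -> explanation for the caller.
LEGACY_KWARGS = {
    'markupMassage': "The tree builder is responsible "
                     "for any necessary markup massage.",
    'smartQuotesTo': "Smart quotes are always converted "
                     "to Unicode characters.",
}


def strip_legacy_kwargs(kwargs):
    """Remove known legacy keywords from kwargs in place, warning for each one."""
    for name, reason in LEGACY_KWARGS.items():
        if name in kwargs:
            del kwargs[name]
            warnings.warn(
                "BS4 does not respect the %s argument to the "
                "BeautifulSoup constructor. %s" % (name, reason))
    return kwargs


# Record warnings so the behavior can be inspected instead of printed to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    remaining = strip_legacy_kwargs({'smartQuotesTo': None, 'other': 1})

print(remaining)
print(caught[0].message)
```

Driving the checks from a table keeps each message next to its keyword, at the cost of being slightly less explicit than the separate `if` blocks in the listing.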

Calls

no outgoing calls