github.com/csev/py4e

Class BeautifulSoup

code3/bs4/__init__.py:50–440

This class defines the basic interface called by the tree builders. The parser calls reset() and feed(markup). From its feed() implementation, the tree builder may call handle_starttag(name, attrs), handle_endtag(name), handle_data(data), and endData(containerClass=NavigableString).
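The event-driven interface summarized above can be illustrated with a small standard-library sketch. This is not Beautiful Soup's implementation: `Node`, `TinyTree`, and `TinyBuilder` are hypothetical names invented for this example, and the tree only mimics the method names from the docstring (`reset`, `handle_starttag`, `handle_endtag`, `handle_data`, `endData`). The builder subclasses Python's `html.parser.HTMLParser` and forwards its events to the tree.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical tree node: a tag name, attributes, and children."""

    def __init__(self, name, attrs=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.children = []  # child Node objects and plain strings


class TinyTree:
    """Hypothetical stand-in for the soup object; not bs4's code."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Start over with an empty root and no pending text.
        self.root = Node("[document]")
        self.stack = [self.root]
        self.current_data = []

    def handle_starttag(self, name, attrs):
        self.endData()
        node = Node(name, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)
        return node

    def handle_endtag(self, name):
        self.endData()
        # Pop back to the nearest open tag with this name, if any.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].name == name:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Buffer text; it becomes a child only when endData() runs.
        self.current_data.append(data)

    def endData(self):
        if self.current_data:
            self.stack[-1].children.append("".join(self.current_data))
            self.current_data = []


class TinyBuilder(HTMLParser):
    """Hypothetical tree builder: forwards parser events to the tree."""

    def __init__(self, tree):
        super().__init__()
        self.tree = tree

    def handle_starttag(self, tag, attrs):
        self.tree.handle_starttag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Empty-element tag: a start tag immediately followed by an end tag.
        self.tree.handle_starttag(tag, attrs)
        self.tree.handle_endtag(tag)

    def handle_endtag(self, tag):
        self.tree.handle_endtag(tag)

    def handle_data(self, data):
        self.tree.handle_data(data)


tree = TinyTree()
builder = TinyBuilder(tree)
builder.feed('<p class="a">Hello<br/>world</p>')
builder.close()
tree.endData()  # flush any text still buffered at end of input
```

After this runs, `tree.root.children[0]` is the `p` node, whose children are the string "Hello", a `br` node, and the string "world". The self-closing `<br/>` is handled as the docstring recommends: a start-tag event followed at once by an end-tag event.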


'You are trying to run the Python 2 version of Beautiful Soup under Python 3. This will not work.'!='You need to convert the code, either by installing it (`python setup.py install`) or by running 2to3 (`2to3 -w bs4`).'

class BeautifulSoup(Tag):
    """
    This class defines the basic interface called by the tree builders.

    These methods will be called by the parser:
      reset()
      feed(markup)

    The tree builder may call these methods from its feed() implementation:
      handle_starttag(name, attrs) # See note about return value
      handle_endtag(name)
      handle_data(data) # Appends to the current data node
      endData(containerClass=NavigableString) # Ends the current data node

    No matter how complicated the underlying parser is, you should be
    able to build a tree using 'start tag' events, 'end tag' events,
    'data' events, and "done with data" events.

    If you encounter an empty-element tag (aka a self-closing tag,
    like HTML's <br> tag), call handle_starttag and then
    handle_endtag.
    """
    ROOT_TAG_NAME = '[document]'

    # If the end-user gives no indication which tree builder they
    # want, look for one with these features.
    DEFAULT_BUILDER_FEATURES = ['html', 'fast']

    ASCII_SPACES = '\x20\x0a\x09\x0c\x0d'

    NO_PARSER_SPECIFIED_WARNING = "No parser was explicitly specified, so I'm using the best available %(markup_type)s parser for this system (\"%(parser)s\"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.\n\nTo get rid of this warning, change this:\n\n BeautifulSoup([your markup])\n\nto this:\n\n BeautifulSoup([your markup], \"%(parser)s\")\n"

    def __init__(self, markup="", features=None, builder=None,
                 parse_only=None, from_encoding=None, exclude_encodings=None,
                 **kwargs):
        """The Soup object is initialized as the 'root tag', and the
        provided markup (which can be a string or a file-like object)
        is fed into the underlying parser."""

        if 'convertEntities' in kwargs:
            warnings.warn(
                "BS4 does not respect the convertEntities argument to the "
                "BeautifulSoup constructor. Entities are always converted "
                "to Unicode characters.")

        if 'markupMassage' in kwargs:
            del kwargs['markupMassage']
            warnings.warn(
                "BS4 does not respect the markupMassage argument to the "
                "BeautifulSoup constructor. The tree builder is responsible "
                "for any necessary markup massage.")

        if 'smartQuotesTo' in kwargs:
            del kwargs['smartQuotesTo']
            warnings.warn(
                "BS4 does not respect the smartQuotesTo argument to the "
                "BeautifulSoup constructor. Smart quotes are always converted "
                "to Unicode characters.")
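The constructor excerpt ends with a recurring pattern: detect a legacy keyword argument, discard it, and emit a warning instead of failing. The sketch below reproduces that pattern with the standard library only. `drop_legacy_kwargs` is a hypothetical helper written for this illustration and is not part of bs4. It covers only the two arguments that the excerpt deletes (`markupMassage` and `smartQuotesTo`); in the excerpt, the `convertEntities` branch warns without deleting the key.

```python
import warnings


def drop_legacy_kwargs(kwargs):
    """Hypothetical helper: warn about legacy keyword arguments and discard them."""
    legacy = {
        "markupMassage": "The tree builder is responsible "
                         "for any necessary markup massage.",
        "smartQuotesTo": "Smart quotes are always converted "
                         "to Unicode characters.",
    }
    for name, note in legacy.items():
        if name in kwargs:
            # Remove the unsupported argument, then tell the caller why.
            del kwargs[name]
            warnings.warn(
                "BS4 does not respect the %s argument to the "
                "BeautifulSoup constructor. %s" % (name, note))
    return kwargs


# Capture warnings so the behaviour can be inspected rather than printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    remaining = drop_legacy_kwargs({"smartQuotesTo": None, "other": 1})
```

Here `remaining` keeps only the `other` key, and `caught` holds a single warning that names `smartQuotesTo`. Warning rather than raising lets code written against the older API keep running while it is migrated.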

Callers 15

wikigrade.py · File · 0.90

urllinks.py · File · 0.90

urllink2.py · File · 0.90

urllink3.py · File · 0.90

soup · Method · 0.90

assertSoupEquals · Method · 0.90

test_formatter_processes_script_tag_for_xml_documents · Method · 0.90

diagnose · Function · 0.90

benchmark_parsers · Function · 0.90

test_tag_inherits_self_closing_rules_from_builder · Method · 0.90

test_formatter_skips_script_tag_for_html_documents · Method · 0.90

test_formatter_skips_style_tag_for_html_documents · Method · 0.90

Calls

no outgoing calls

Tested by 10

soup · Method · 0.72

assertSoupEquals · Method · 0.72

test_formatter_processes_script_tag_for_xml_documents · Method · 0.72

test_tag_inherits_self_closing_rules_from_builder · Method · 0.72

test_formatter_skips_script_tag_for_html_documents · Method · 0.72

test_formatter_skips_style_tag_for_html_documents · Method · 0.72

test_prettify_accepts_formatter · Method · 0.72

setUp · Method · 0.72

test_beautifulsoup_constructor_does_lookup · Method · 0.72

test_last_ditch_entity_replacement · Method · 0.72