hub / github.com/QData/TextAttack / AttackedText

Class AttackedText

textattack/shared/attacked_text.py:27–612

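A helper class that represents a string that can be attacked. Models that take multiple sentences as input separate them by ``SPLIT_TOKEN``. Attacks "see" the entire input, joined into one string, without the split token. ``AttackedText`` instances that were perturbed from other ``AttackedText`` objects contain a pointer to the previous text (``attack_attrs["previous_attacked_text"]``), so that the full chain of perturbations might be reconstructed by using this key to form a linked list.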
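
The two views of a multi-sentence input can be sketched without TextAttack installed. The snippet is a minimal illustration, not the library's implementation: the helper names `model_view` and `attack_view` are hypothetical, and joining segments with a newline in the attack view is an assumption (the description only says the inputs are joined into one string without the split token). The `"<SPLIT>"` value is the one the class defines as `SPLIT_TOKEN`.

```python
from collections import OrderedDict

SPLIT_TOKEN = "<SPLIT>"  # value of AttackedText.SPLIT_TOKEN in the class source


def model_view(text_input):
    """Hypothetical helper: segments separated by SPLIT_TOKEN, as a model would receive them."""
    return SPLIT_TOKEN.join(text_input.values())


def attack_view(text_input, joiner="\n"):
    """Hypothetical helper: one joined string with no split token.

    The newline joiner is an assumption made for this sketch.
    """
    return joiner.join(text_input.values())


pair = OrderedDict(
    [("premise", "A man is sleeping."), ("hypothesis", "A man is awake.")]
)
print(model_view(pair))
print(attack_view(pair))
```

Keeping the inputs in an `OrderedDict` preserves segment order, so both views list the premise before the hypothesis.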

Source from the content-addressed store, hash-verified

class AttackedText:
    """A helper class that represents a string that can be attacked.

    Models that take multiple sentences as input separate them by ``SPLIT_TOKEN``.
    Attacks "see" the entire input, joined into one string, without the split token.

    ``AttackedText`` instances that were perturbed from other ``AttackedText``
    objects contain a pointer to the previous text
    (``attack_attrs["previous_attacked_text"]``), so that the full chain of
    perturbations might be reconstructed by using this key to form a linked
    list.

    Args:
        text (string): The string that this AttackedText represents
        attack_attrs (dict): Dictionary of various attributes stored
            during the course of an attack.
    """

    SPLIT_TOKEN = "<SPLIT>"

    def __init__(self, text_input, attack_attrs=None):
        # Read in ``text_input`` as a string or OrderedDict.
        if isinstance(text_input, str):
            self._text_input = OrderedDict([("text", text_input)])
        elif isinstance(text_input, OrderedDict):
            self._text_input = text_input
        else:
            raise TypeError(
                f"Invalid text_input type {type(text_input)} (required str or OrderedDict)"
            )
        # Process input lazily.
        self._words = None
        self._words_per_input = None
        self._pos_tags = None
        self._ner_tags = None
        # Format text inputs.
        self._text_input = OrderedDict([(k, v) for k, v in self._text_input.items()])
        if attack_attrs is None:
            self.attack_attrs = dict()
        elif isinstance(attack_attrs, dict):
            self.attack_attrs = attack_attrs
        else:
            raise TypeError(f"Invalid type for attack_attrs: {type(attack_attrs)}")
        # Indices of words from the original text. Allows us to map
        # indices between original text and this text, and vice-versa.
        self.attack_attrs.setdefault("original_index_map", np.arange(self.num_words))
        # A list of all indices in this text that have been modified.
        self.attack_attrs.setdefault("modified_indices", set())

    def __eq__(self, other: AttackedText) -> bool:
        """Compares two AttackedText instances.

        Note: Does not compute true equality across attack attributes.
        We found this caused large performance issues with caching,
        and it's actually much faster (cache-wise) to just compare
        by the text, and this works for lots of use cases.
        """
        if not (self.text == other.text):
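
The linked-list idea from the docstring and the text-only comparison in `__eq__` can be illustrated with a small stand-in class. This is a sketch under stated assumptions: `MiniText`, `perturb`, and `perturbation_chain` are hypothetical names that are not part of TextAttack, and `__hash__` is added here only so instances can serve as cache keys. The listing above stops partway through `__eq__`, so nothing beyond the text comparison is modeled.

```python
class MiniText:
    """Hypothetical stand-in for AttackedText: a text plus attack attributes."""

    def __init__(self, text, attack_attrs=None):
        if not isinstance(text, str):
            raise TypeError(f"Invalid text type {type(text)} (required str)")
        self.text = text
        self.attack_attrs = {} if attack_attrs is None else attack_attrs

    def __eq__(self, other):
        # Compare by text only and ignore attack_attrs, which keeps
        # cache lookups cheap.
        if not isinstance(other, MiniText):
            return NotImplemented
        return self.text == other.text

    def __hash__(self):
        # Assumption for this sketch: hash by text so that equal texts
        # land in the same cache slot.
        return hash(self.text)

    def perturb(self, new_text):
        """Return a new instance that points back at this one."""
        return MiniText(new_text, {"previous_attacked_text": self})


def perturbation_chain(attacked):
    """Follow previous_attacked_text links back to the original text."""
    chain = []
    while attacked is not None:
        chain.append(attacked.text)
        attacked = attacked.attack_attrs.get("previous_attacked_text")
    return list(reversed(chain))  # oldest text first


original = MiniText("the movie was great")
step1 = original.perturb("the film was great")
step2 = step1.perturb("the film was terrific")
print(perturbation_chain(step2))
```

Because equality ignores `attack_attrs`, two instances with the same text but different perturbation histories compare equal, which is the trade-off the `__eq__` docstring describes.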

Callers 10

attack (Method) · 0.90

augment (Method) · 0.90

_get_transformations (Method) · 0.90

apply_perturbation (Method) · 0.90

test_perplexity (Function) · 0.90

generate_new_attacked_text (Method) · 0.85

Calls

no outgoing calls

Tested by 1

test_perplexity (Function) · 0.72