hub / github.com/ArtifexSoftware/pdf2docx / group_by_connectivity

Method group_by_connectivity

pdf2docx/common/Collection.py:131–167

Collect connected instances into the same group.

Args:
    dx (float): x-tolerance used to define connectivity
    dy (float): y-tolerance used to define connectivity

Returns:
    list: a list of grouped ``Collection`` instances.

(self, dx:float, dy:float)

Source from the content-addressed store, hash-verified

def group_by_connectivity(self, dx:float, dy:float):
    """Collect connected instances into same group.

    Args:
        dx (float): x-tolerances to define connectivity
        dy (float): y-tolerances to define connectivity

    Returns:
        list: a list of grouped ``Collection`` instances.

    .. note::
        * It's equal to a GRAPH traversing problem, which the critical point in
          building the adjacent list, especially a large number of vertex (paths).

        * Checking intersections between paths is actually a Rectangle-Intersection
          problem, studied already in many literatures.
    """
    # build the graph -> adjacent list:
    # the i-th item is a set of indexes, which connected to the i-th instance
    num = len(self._instances)
    index_groups = [set() for _ in range(num)] # type: list[set]

    # solve rectangle intersection problem
    i_rect_x, i = [], 0
    d_rect = (-dx, -dy, dx, dy)
    for rect in self._instances:
        points = [a+b for a,b in zip(rect.bbox, d_rect)] # consider tolerance
        i_rect_x.append((i,   points, points[0]))
        i_rect_x.append((i+1, points, points[2]))
        i += 2
    i_rect_x.sort(key=lambda item: item[-1])
    solve_rects_intersection(i_rect_x, 2*num, index_groups)

    # search graph -> grouped index of instance
    groups = graph_bfs(index_groups)
    groups = [self.__class__([self._instances[i] for i in group]) for group in groups]
    return groups
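The adjacency-building step in the listing above can be illustrated with a minimal, self-contained sketch. It assumes bounding boxes are plain `(x0, y0, x1, y1)` tuples and uses a brute-force pairwise check as a stand-in for `solve_rects_intersection`, whose implementation is not shown on this page. The helper names `expand`, `overlaps`, and `build_adjacency` are hypothetical, and treating boxes that merely touch as connected is an assumption, not something the source confirms.

```python
from typing import List, Set, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def expand(bbox: BBox, dx: float, dy: float) -> BBox:
    """Grow a bbox by the tolerances, mirroring the d_rect offset in the method."""
    x0, y0, x1, y1 = bbox
    return (x0 - dx, y0 - dy, x1 + dx, y1 + dy)


def overlaps(a: BBox, b: BBox) -> bool:
    """True if two boxes share area or an edge (assumption: touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def build_adjacency(bboxes: List[BBox], dx: float, dy: float) -> List[Set[int]]:
    """Brute-force O(n^2) stand-in: the i-th set holds indexes connected to box i."""
    grown = [expand(b, dx, dy) for b in bboxes]
    adjacency: List[Set[int]] = [set() for _ in bboxes]
    for i in range(len(grown)):
        for j in range(i + 1, len(grown)):
            if overlaps(grown[i], grown[j]):
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


# Two boxes separated by a 1-unit horizontal gap, plus one far away.
boxes = [(0, 0, 10, 10), (11, 0, 20, 10), (50, 50, 60, 60)]
adjacency = build_adjacency(boxes, dx=1.0, dy=1.0)
no_tolerance = build_adjacency(boxes, dx=0.0, dy=0.0)
```

Because every box is expanded on all sides, two boxes in this sketch become connected when the gap between them is up to roughly twice the tolerance. The pairwise loop is only reasonable for small inputs; the docstring's note about the adjacency list being the critical point is the reason the library delegates this step to a dedicated rectangle-intersection routine.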

Callers 2

lattice_tables · Method · 0.80

Calls 3

solve_rects_intersection · Function · 0.85
graph_bfs · Function · 0.85
append · Method · 0.45
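For orientation, the grouping step that `graph_bfs` performs on the adjacency list can be sketched as a breadth-first search for connected components. This is a hypothetical stand-in named `bfs_groups`, written under the assumption that the input is a list of index sets like `index_groups`; the library's actual `graph_bfs` may differ in ordering and return type.

```python
from collections import deque
from typing import List, Set


def bfs_groups(adjacency: List[Set[int]]) -> List[List[int]]:
    """Return connected components of an undirected graph given as index sets."""
    seen: Set[int] = set()
    groups: List[List[int]] = []
    for start in range(len(adjacency)):
        if start in seen:
            continue  # already assigned to an earlier group
        seen.add(start)
        queue = deque([start])
        group: List[int] = []
        while queue:
            node = queue.popleft()
            group.append(node)
            # sorted() only makes the visiting order deterministic
            for neighbour in sorted(adjacency[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(group)
    return groups


# Indexes 0-1-2 form a chain, 3 is isolated, 4-5 are a pair.
adjacency = [{1}, {0, 2}, {1}, set(), {5}, {4}]
groups = bfs_groups(adjacency)
```

Each resulting list of indexes corresponds to one group; in the method above, such index groups are mapped back to instances and wrapped in a new collection of the same class.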

Tested by

no test coverage detected