hub / github.com/PaddlePaddle/PaddleOCR / reset_data_lines

Method reset_data_lines

ppocr/data/simple_dataset.py:312–334

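Signals a new epoch to persistent workers by writing the epoch number to a shared-memory value (_shared_epoch). Workers lazily rebuild their _index_map on the next __getitem__ call, so this path needs no disk I/O and no dataloader reconstruction. When _all_lines is None, the method instead falls back to rebuilding data_lines through get_image_info_list.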
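The signalling pattern described above can be sketched in isolation. This is a minimal, single-process illustration and not PaddleOCR's implementation: the class `LazyEpochDataset`, the method `signal_epoch`, the helper `_rebuild_if_stale`, and the use of `random.Random(epoch)` to build the index map are all assumptions made for the example. Only the attribute names `_shared_epoch`, `_cached_epoch`, and `_index_map` are borrowed from the source.

```python
import multiprocessing as mp
import random


class LazyEpochDataset:
    """Hypothetical stand-in showing the shared-epoch / lazy-rebuild pattern."""

    def __init__(self, items):
        self._items = list(items)
        # Integer allocated in shared memory, so child processes that inherit
        # it can observe writes made by the parent. lock=False is enough here
        # because only one process ever writes to it.
        self._shared_epoch = mp.Value("i", 0, lock=False)
        self._cached_epoch = None  # epoch for which _index_map was built
        self._index_map = []

    def signal_epoch(self, epoch):
        # Main process: only the counter changes, nothing is reloaded.
        self._shared_epoch.value = int(epoch)

    def _rebuild_if_stale(self):
        epoch = self._shared_epoch.value
        if epoch != self._cached_epoch:
            order = list(range(len(self._items)))
            random.Random(epoch).shuffle(order)  # deterministic per epoch
            self._index_map = order
            self._cached_epoch = epoch

    def __getitem__(self, idx):
        # Worker side: the rebuild happens lazily, on first access after a signal.
        self._rebuild_if_stale()
        return self._items[self._index_map[idx]]

    def __len__(self):
        return len(self._items)
```

In this sketch, calling `signal_epoch(2)` leaves `_cached_epoch` untouched until the next `__getitem__`, which is the lazy behaviour the description refers to.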

(self, seed=None, epoch=None)

Source from the content-addressed store, hash-verified

    # ------------------------------------------------------------------ #

    def reset_data_lines(self, seed=None, epoch=None):
        """Signal new epoch to persistent workers via shared memory.

        Workers lazily rebuild their _index_map on next __getitem__ call.
        No disk I/O, no dataloader reconstruction.
        """
        self.seed = seed
        epoch_val = epoch if epoch is not None else (seed if seed is not None else 0)
        self._shared_epoch.value = int(epoch_val)

        if self._all_lines is not None:
            # Update main-process index_map (used by len() and batch_sampler)
            self._index_map = self._generate_index_map(seed)
            self._cached_epoch = int(epoch_val)
            self.data_idx_order_list = list(range(len(self._index_map)))
        else:
            # Fallback for non-ratio cases
            self.data_lines = self.get_image_info_list(
                self.label_file_list, self.ratio_list
            )
            self.data_idx_order_list = list(range(len(self.data_lines)))
            if self.mode == "train" and self.do_shuffle:
                self.shuffle_data_random()

    # ------------------------------------------------------------------ #
    # Data access
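The epoch value written to shared memory follows a simple precedence, restated here as a standalone function. `resolve_epoch` is a hypothetical helper name used only for illustration; the expression itself is copied from the method body.

```python
def resolve_epoch(seed=None, epoch=None):
    """Mirror the precedence used in the method: epoch, then seed, then 0."""
    epoch_val = epoch if epoch is not None else (seed if seed is not None else 0)
    return int(epoch_val)


print(resolve_epoch())                 # 0
print(resolve_epoch(seed=7))           # 7
print(resolve_epoch(seed=7, epoch=3))  # 3
print(resolve_epoch(seed=7, epoch=0))  # 0
```

Because the checks use `is not None`, an explicit `epoch=0` is honoured and does not fall through to the seed. Note that in the source the main-process index map is still generated from `seed`, not from the resolved epoch value.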

Callers 1

train · Function · 0.80

Calls 3

_generate_index_map · Method · 0.95
get_image_info_list · Method · 0.95
shuffle_data_random · Method · 0.95

Tested by

no test coverage detected