hub / github.com/tensorflow/datasets / load_dataset

Method load_dataset

tensorflow_datasets/core/load.py:302–383

Loads the named dataset from a dataset collection by calling `tfds.load`.

Args:
  dataset: `str`, the dataset name to load.
  split: which split(s) of the dataset to load. If `None`, will return all splits available for the dataset.
  loader_kwargs: `dict` (optional), keyword arguments to be passed to the `tfds.load` function.
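The fallback between per-call `loader_kwargs` and the loader-level defaults can be sketched in plain Python. This is only an illustration of the logic visible in the source listing, not TFDS code: `effective_loader_kwargs` and its parameter names are hypothetical, and unlike the listed source it copies the dict instead of editing it in place.

```python
import logging
from typing import Any, Optional


def effective_loader_kwargs(
    call_kwargs: Optional[dict[str, Any]],
    default_kwargs: Optional[dict[str, Any]],
) -> dict[str, Any]:
    """Hypothetical helper mirroring the kwargs fallback in `load_dataset`."""
    # `or` picks the first truthy value, so an empty per-call dict also
    # falls back to the loader-level defaults.
    chosen = call_kwargs or default_kwargs or {}
    # Assumption for this sketch: work on a copy so the caller's dict is
    # left untouched (the listed source modifies the chosen dict directly).
    kwargs = dict(chosen)
    # `with_info=True` would change the return type of `tfds.load`,
    # so it is forced back to False with a warning.
    if kwargs.get("with_info"):
        logging.warning("`with_info` cannot be True, setting it to False")
        kwargs["with_info"] = False
    return kwargs
```

Note that the two sources are not merged: when per-call kwargs are given and non-empty, the loader-level defaults are ignored entirely.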

(
    self,
    dataset: str,
    split: Optional[Tree[splits_lib.SplitArg]] = None,
    loader_kwargs: dict[str, Any] | None = None,
)
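How the `split` argument interacts with a `split` entry in `loader_kwargs` can be sketched as follows. This is a minimal stand-alone illustration under stated assumptions: `normalize_split` is a hypothetical name, and its `get_split` parameter is a stand-in for `dataset_reference.get_split`, defaulting to the identity because the real method's behavior is not shown in the source listing.

```python
from typing import Any, Callable, Optional, Union

SplitArg = Union[str, list[str]]


def normalize_split(
    split: Optional[SplitArg],
    loader_kwargs: dict[str, Any],
    get_split: Callable[[Any], Any] = lambda s: s,  # hypothetical stand-in
) -> dict[str, Any]:
    """Hypothetical helper mirroring the split handling in `load_dataset`."""
    kwargs = dict(loader_kwargs)  # copy; the listed source edits in place
    # A split passed as an argument always wins over one in the kwargs.
    if split:
        kwargs["split"] = get_split(split)
    # A single split name is wrapped in a list so that the loader is always
    # asked for a collection of splits rather than one bare split.
    if isinstance(kwargs.get("split"), str):
        kwargs["split"] = [kwargs["split"]]
    return kwargs
```

With `split=None` and no `split` key in the kwargs, nothing is added, which matches the documented "return all splits" behavior.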

Source from the content-addressed store, hash-verified

    self.loader_kwargs = loader_kwargs

  def load_dataset(
      self,
      dataset: str,
      split: Optional[Tree[splits_lib.SplitArg]] = None,
      loader_kwargs: dict[str, Any] | None = None,
  ) -> Mapping[str, tf.data.Dataset]:
    """Loads the named dataset from a dataset collection by calling `tfds.load`.

    Args:
      dataset: `str`, the dataset name to load.
      split: which split(s) of the dataset to load. If `None`, will return all
        splits available for the dataset.
      loader_kwargs: `dict` (optional), keyword arguments to be passed to the
        `tfds.load` function. Refer to `tfds.load` documentation for a
        comprehensive overview of the different loading options.

    Returns:
      A `dict` of {`str`: tf.data.Dataset} for the desired dataset.

    Raises:
      KeyError: if trying to load a dataset not included in the collection.
      RuntimeError: if `load` return type is not a `dict` or a `list`.
    """
    if not dataset:
      raise TypeError('You must specify a non-empty dataset to load.')

    loader_kwargs = loader_kwargs or self.loader_kwargs or {}

    # with_info must be False (or it will change the return type of `tfds.load`)
    if 'with_info' in loader_kwargs and loader_kwargs['with_info']:
      logging.warning('`with_info` cannot be True, setting it to False')
      loader_kwargs['with_info'] = False

    try:
      dataset_reference = self.datasets[dataset]
    except KeyError as e:
      raise KeyError(
          f'Dataset {dataset} is not included in this collection. '
          f'{self.collection.list_datasets(version=self.requested_version)}'
      ) from e

    # If `split` is defined both as argument and in `loader_kwargs`, always keep
    # the one defined as argument.
    if split:
      loader_kwargs['split'] = dataset_reference.get_split(split)
    # Make sure we always return a dict of dicts.
    if 'split' in loader_kwargs and isinstance(loader_kwargs['split'], str):
      loader_kwargs['split'] = [loader_kwargs['split']]

    # Add the data dir from the reference to loader_kwargs if it is defined and
    # not overridden in loader_kwargs.
    if (
        dataset_reference.data_dir is not None
        and 'data_dir' not in loader_kwargs
    ):
      loader_kwargs['data_dir'] = dataset_reference.data_dir

    load_output = load(dataset_reference.tfds_name(), **loader_kwargs)
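The last two steps before the call to `load` (the dataset lookup with a chained `KeyError`, and the `data_dir` default) can be sketched without TFDS. Both function names below are hypothetical, and the error message is simplified: the listed source appends the output of `self.collection.list_datasets(...)`, whereas this sketch just lists the known keys.

```python
from typing import Any, Mapping, Optional


def lookup_reference(datasets: Mapping[str, Any], name: str) -> Any:
    """Hypothetical helper: look up a dataset, re-raising with context."""
    try:
        return datasets[name]
    except KeyError as e:
        # `from e` keeps the original lookup failure as `__cause__`,
        # so the traceback shows both errors.
        raise KeyError(
            f"Dataset {name} is not included in this collection. "
            f"Available: {sorted(datasets)}"
        ) from e


def with_default_data_dir(
    loader_kwargs: dict[str, Any],
    reference_data_dir: Optional[str],
) -> dict[str, Any]:
    """Hypothetical helper: apply the reference's data dir as a default."""
    kwargs = dict(loader_kwargs)  # copy; the listed source edits in place
    # An explicit `data_dir` in the kwargs is never overridden.
    if reference_data_dir is not None and "data_dir" not in kwargs:
        kwargs["data_dir"] = reference_data_dir
    return kwargs
```

Chaining with `from e` is a deliberate choice in the source as well: the caller sees a readable message while the underlying lookup error stays attached for debugging.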

Callers 7

load_datasets · Method · 0.95

test_load_dataset · Function · 0.80

test_load_dataset_split · Function · 0.80

test_load_dataset_splits · Function · 0.80

test_load_dataset_runtime_error · Function · 0.80

test_load_dataset_key_error · Function · 0.80

test_load_dataset_with_kwargs · Function · 0.80

Calls 4

list_datasets · Method · 0.80

get_split · Method · 0.80

tfds_name · Method · 0.80

load · Function · 0.70

Tested by 6

test_load_dataset · Function · 0.64

test_load_dataset_split · Function · 0.64

test_load_dataset_splits · Function · 0.64

test_load_dataset_runtime_error · Function · 0.64

test_load_dataset_key_error · Function · 0.64

test_load_dataset_with_kwargs · Function · 0.64