hub / github.com/huggingface/diffusers / preprocess

Method preprocess

src/diffusers/image_processor.py:1154–1267

Preprocess the image input. Accepted formats are PIL images, NumPy arrays, or PyTorch tensors.

(
        self,
        rgb: torch.Tensor | PIL.Image.Image | np.ndarray,
        depth: torch.Tensor | PIL.Image.Image | np.ndarray,
        height: int | None = None,
        width: int | None = None,
        target_res: int | None = None,
    )
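The `rgb` and `depth` parameters accept either a single image or a list of images. In the method body, a single supported input is wrapped in a one-element list, and a list is accepted only if every item has a supported type. The sketch below mirrors that branch under stated assumptions: `as_batches` is a hypothetical helper, and `bytes`/`bytearray` are stand-in types so the example runs without PIL, NumPy, or PyTorch. It also builds the error message from type names, because `str.join` raises a `TypeError` when handed classes instead of strings, which is what `', '.join(supported_formats)` would do as written.

```python
# Stand-in types so the sketch runs without PIL, NumPy or PyTorch (assumption).
SUPPORTED_FORMATS = (bytes, bytearray)


def as_batches(rgb, depth, supported=SUPPORTED_FORMATS):
    """Return (rgb_list, depth_list), wrapping single inputs in a list.

    Hypothetical helper mirroring the single-vs-batch branch of preprocess.
    As in that branch, only rgb is validated; depth is passed through.
    """
    if isinstance(rgb, supported):
        return [rgb], [depth]
    if isinstance(rgb, list) and all(isinstance(i, supported) for i in rgb):
        return rgb, depth

    # Describe what was received, whether or not it is a list.
    if isinstance(rgb, list):
        found = [type(i).__name__ for i in rgb]
    else:
        found = type(rgb).__name__

    # Join type *names*: ", ".join(supported) would raise TypeError on classes.
    names = ", ".join(t.__name__ for t in supported)
    raise ValueError(f"Input is in incorrect format: {found}. Supported types: {names}")
```

Returning the pair instead of rebinding the arguments keeps the helper free of side effects; a caller would write `rgb, depth = as_batches(rgb, depth)`.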

Source from the content-addressed store, hash-verified

            raise Exception(f"This type {output_type} is not supported")

    def preprocess(
        self,
        rgb: torch.Tensor | PIL.Image.Image | np.ndarray,
        depth: torch.Tensor | PIL.Image.Image | np.ndarray,
        height: int | None = None,
        width: int | None = None,
        target_res: int | None = None,
    ) -> torch.Tensor:
        r"""
        Preprocess the image input. Accepted formats are PIL images, NumPy arrays, or PyTorch tensors.

        Args:
            rgb (`torch.Tensor | PIL.Image.Image | np.ndarray`):
                The RGB input image, which can be a single image or a batch.
            depth (`torch.Tensor | PIL.Image.Image | np.ndarray`):
                The depth input image, which can be a single image or a batch.
            height (`int | None`, optional, defaults to `None`):
                The desired height of the processed image. If `None`, defaults to the height of the input image.
            width (`int | None`, optional, defaults to `None`):
                The desired width of the processed image. If `None`, defaults to the width of the input image.
            target_res (`int | None`, optional, defaults to `None`):
                Target resolution for resizing the images. If specified, overrides height and width.

        Returns:
            `tuple[torch.Tensor, torch.Tensor]`:
                A tuple containing the processed RGB and depth images as PyTorch tensors.
        """
        supported_formats = (PIL.Image.Image, np.ndarray, torch.Tensor)

        # Expand the missing dimension for 3-dimensional pytorch tensor or numpy array that represents grayscale image
        if self.config.do_convert_grayscale and isinstance(rgb, (torch.Tensor, np.ndarray)) and rgb.ndim == 3:
            raise Exception("This is not yet supported")

        if isinstance(rgb, supported_formats):
            rgb = [rgb]
            depth = [depth]
        elif not (isinstance(rgb, list) and all(isinstance(i, supported_formats) for i in rgb)):
            raise ValueError(
                f"Input is in incorrect format: {[type(i) for i in rgb]}. Currently, we only support {', '.join(supported_formats)}"
            )

        if isinstance(rgb[0], PIL.Image.Image):
            if self.config.do_convert_rgb:
                raise Exception("This is not yet supported")
                # rgb = [self.convert_to_rgb(i) for i in rgb]
                # depth = [self.convert_to_depth(i) for i in depth] #TODO define convert_to_depth
            if self.config.do_resize or target_res:
                height, width = self.get_default_height_width(rgb[0], height, width) if not target_res else target_res
                rgb = [self.resize(i, height, width) for i in rgb]
                depth = [self.resize(i, height, width) for i in depth]
            rgb = self.pil_to_numpy(rgb)  # to np
            rgb = self.numpy_to_pt(rgb)  # to pt

            depth = self.depth_pil_to_numpy(depth)  # to np
            depth = self.numpy_to_pt(depth)  # to pt

        elif isinstance(rgb[0], np.ndarray):
            rgb = np.concatenate(rgb, axis=0) if rgb[0].ndim == 4 else np.stack(rgb, axis=0)
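In the PIL branch, `height` and `width` come from `get_default_height_width(...)` unless `target_res` is truthy, in which case `target_res` itself is unpacked into `height, width`. Read literally, a bare `int` (as the annotation suggests) cannot be unpacked that way, so the line appears to expect a two-item `(height, width)` value. The sketch below illustrates this under stated assumptions: `resolve_size` and `default_hw` are hypothetical names, and the fallback is a simplified stand-in for `get_default_height_width` based only on the docstring, so it omits any additional adjustment the real method may apply.

```python
def resolve_size(default_hw, height=None, width=None, target_res=None):
    """Pick the (height, width) used for resizing.

    default_hw stands in for the size of the first input image (assumption).
    """
    if target_res:
        # Mirrors `height, width = target_res`: requires a 2-item sequence,
        # so an int such as 512 raises TypeError here.
        height, width = target_res
        return height, width

    # Simplified stand-in for get_default_height_width (assumption):
    # any dimension left as None falls back to the input image's size.
    default_h, default_w = default_hw
    return (
        height if height is not None else default_h,
        width if width is not None else default_w,
    )


if __name__ == "__main__":
    print(resolve_size((480, 640)))
    print(resolve_size((480, 640), height=256))
    print(resolve_size((480, 640), height=256, width=256, target_res=(512, 768)))
```

When `target_res` is given, explicit `height` and `width` arguments are ignored, which matches the docstring's "overrides height and width".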

Callers

nothing calls this directly

Calls 7

depth_pil_to_numpy · Method · 0.95
pil_to_numpy · Method · 0.80
binarize · Method · 0.80
get_default_height_width · Method · 0.45
resize · Method · 0.45
numpy_to_pt · Method · 0.45
normalize · Method · 0.45

Tested by

no test coverage detected