hub / github.com/XPixelGroup/DiffBIR / process_anyres_image

Function process_anyres_image

llava/mm_utils.py:119–145

Process an image with variable resolutions. Args: image (PIL.Image.Image): The input image to be processed. processor: The image processor object. grid_pinpoints (str): A string representation of a list of possible resolutions. Returns: torch.Tensor: A tensor containing the processed image patches.

(image, processor, grid_pinpoints)
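The function body accepts `grid_pinpoints` either as a list or as the string form of one, and normalizes the string form with `ast.literal_eval`. A minimal sketch of that normalization follows; `parse_grid_pinpoints` is a hypothetical helper name and the resolution values are illustrative only.

```python
import ast


def parse_grid_pinpoints(grid_pinpoints):
    """Return a list of resolutions from a list or its string form.

    Hypothetical helper that mirrors the first branch of process_anyres_image.
    """
    if isinstance(grid_pinpoints, list):
        return grid_pinpoints
    # literal_eval only evaluates Python literals, so it is safer than eval().
    return ast.literal_eval(grid_pinpoints)


# Illustrative values, not taken from any real model config.
as_string = "[(336, 672), (672, 336), (672, 672)]"
as_list = [(336, 672), (672, 336), (672, 672)]

print(parse_grid_pinpoints(as_string) == parse_grid_pinpoints(as_list))  # True
```

Because `ast.literal_eval` rejects anything that is not a literal, a malformed or non-literal string raises an exception instead of executing code.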

Source from the content-addressed store, hash-verified

import ast

import torch


def process_anyres_image(image, processor, grid_pinpoints):
    """
    Process an image with variable resolutions.

    Args:
        image (PIL.Image.Image): The input image to be processed.
        processor: The image processor object.
        grid_pinpoints (str): A string representation of a list of possible resolutions.

    Returns:
        torch.Tensor: A tensor containing the processed image patches.
    """
    # grid_pinpoints may already be a list; otherwise parse its string form.
    if isinstance(grid_pinpoints, list):
        possible_resolutions = grid_pinpoints
    else:
        possible_resolutions = ast.literal_eval(grid_pinpoints)

    # Pick the target resolution for this image, then resize and pad to it.
    best_resolution = select_best_resolution(image.size, possible_resolutions)
    image_padded = resize_and_pad_image(image, best_resolution)

    # Split the padded image into patches sized to the processor's crop height.
    patches = divide_to_patches(image_padded, processor.crop_size['height'])

    # Also keep a square, resized copy of the whole image as the first entry.
    image_original_resize = image.resize((processor.size['shortest_edge'], processor.size['shortest_edge']))

    image_patches = [image_original_resize] + patches
    image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
                     for image_patch in image_patches]
    return torch.stack(image_patches, dim=0)
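The returned tensor stacks one resized copy of the whole image followed by the patches, so its first dimension is one plus the number of patches. The sketch below counts the patches for a given padded size. It assumes non-overlapping, row-major tiling; the body of `divide_to_patches` is not shown here and may differ. `patch_boxes` is a hypothetical name and the sizes are illustrative.

```python
def patch_boxes(width, height, patch_size):
    """Return (left, upper, right, lower) boxes tiling a width x height image.

    Assumption: non-overlapping, row-major tiles. The real divide_to_patches
    is not shown in this excerpt and may behave differently.
    """
    boxes = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            boxes.append((left, top, left + patch_size, top + patch_size))
    return boxes


# Illustrative numbers: a 672x672 padded image split into 336-pixel patches.
boxes = patch_boxes(672, 672, 336)
num_outputs = 1 + len(boxes)  # one resized overview image plus the tiles
print(len(boxes), num_outputs)  # 4 5
```

Under these assumptions a 672x672 padded image yields a stack of five entries: the overview image and four tiles.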

Callers 1

process_images · Function · 0.85

Calls 3

select_best_resolution · Function · 0.85
resize_and_pad_image · Function · 0.85
divide_to_patches · Function · 0.85
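The bodies of these three helpers are not shown on this page. To illustrate what "selecting a best resolution" could mean, here is one plausible rule, assumed purely for illustration: prefer the candidate that preserves the most of the original image's pixels after an aspect-preserving fit, and break ties by the least unused canvas area. `pick_resolution` is a hypothetical name, not the library's function.

```python
def pick_resolution(original_size, possible_resolutions):
    """Pick a (width, height) candidate for an image of original_size.

    Assumed rule (not confirmed by the source): maximize the effective
    pixel count after an aspect-preserving fit, then minimize wasted area.
    """
    orig_w, orig_h = original_size
    best, best_effective, least_waste = None, -1, float("inf")
    for width, height in possible_resolutions:
        # Largest scale at which the image still fits inside the candidate.
        scale = min(width / orig_w, height / orig_h)
        scaled_w, scaled_h = int(orig_w * scale), int(orig_h * scale)
        # Upscaling adds no information, so cap at the original pixel count.
        effective = min(scaled_w * scaled_h, orig_w * orig_h)
        waste = width * height - effective
        if effective > best_effective or (
            effective == best_effective and waste < least_waste
        ):
            best, best_effective, least_waste = (width, height), effective, waste
    return best


# Illustrative values: a wide 1344x672 image and three candidate canvases.
candidates = [(336, 672), (672, 336), (672, 672)]
print(pick_resolution((1344, 672), candidates))  # (672, 336)
```

In this example the 672x336 and 672x672 candidates preserve the same number of pixels, and the tie goes to 672x336 because it leaves no padding.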

Tested by

no test coverage detected