This function annotates an image with bounding boxes and labels. Parameters: image_source (np.ndarray): The source image to be annotated. boxes (torch.Tensor): A tensor containing bounding box coordinates in cxcywh format, normalized to [0, 1] (the body scales them by the image width and height). logits (torch.Tensor): A tensor containing confidence scores for each bounding box.
(image_source: np.ndarray, boxes: torch.Tensor, logits: torch.Tensor, phrases: List[str], text_scale: float,
text_padding=5, text_thickness=2, thickness=3)
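To make the coordinate handling concrete, here is a minimal plain-Python sketch of the scaling and format conversion the function performs. It assumes `boxes` arrive as normalized cxcywh values, which is what the multiplication by `[w, h, w, h]` in the body suggests. The helper names (`scale_to_pixels`, `cxcywh_to_xyxy`, `cxcywh_to_xywh`) are hypothetical stand-ins for the tensor multiplication and `torchvision.ops.box_convert`; they are not part of the original code.

```python
from typing import Tuple

# A single box as four floats; the meaning of each slot depends on the format.
Box = Tuple[float, float, float, float]


def scale_to_pixels(box: Box, width: int, height: int) -> Box:
    """Scale a normalized cxcywh box to pixel units (hypothetical helper)."""
    cx, cy, w, h = box
    return (cx * width, cy * height, w * width, h * height)


def cxcywh_to_xyxy(box: Box) -> Box:
    """Center/size -> top-left and bottom-right corners."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def cxcywh_to_xywh(box: Box) -> Box:
    """Center/size -> top-left corner plus width and height."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, w, h)


# A box centered in an 800x600 image, a quarter as wide and half as tall.
pixel_box = scale_to_pixels((0.5, 0.5, 0.25, 0.5), width=800, height=600)
print(pixel_box)                  # (400.0, 300.0, 200.0, 300.0)
print(cxcywh_to_xyxy(pixel_box))  # (300.0, 150.0, 500.0, 450.0)
print(cxcywh_to_xywh(pixel_box))  # (300.0, 150.0, 200.0, 300.0)
```

The xyxy form is what gets drawn on the image, while the xywh form is what the function returns alongside the annotated frame.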
# Imports this excerpt relies on. BoxAnnotator is the project's own annotator
# class and is assumed to be imported elsewhere in this file.
from typing import Dict, List, Tuple

import numpy as np
import supervision as sv
import torch
from torchvision.ops import box_convert


def annotate(image_source: np.ndarray, boxes: torch.Tensor, logits: torch.Tensor, phrases: List[str], text_scale: float,
             text_padding=5, text_thickness=2, thickness=3) -> Tuple[np.ndarray, Dict[str, np.ndarray]]:
    """
    Annotate an image with bounding boxes and labels.

    Parameters:
        image_source (np.ndarray): The source image to be annotated.
        boxes (torch.Tensor): Bounding box coordinates in cxcywh format, normalized to [0, 1]
            (they are scaled to pixels by the image width and height below).
        logits (torch.Tensor): Confidence scores for each bounding box (not used in the body).
        phrases (List[str]): A list of labels for each bounding box.
        text_scale (float): The scale of the text to be displayed.
            0.8 for mobile/web, 0.3 for desktop, 0.4 for mind2web.
        text_padding, text_thickness, thickness: Passed through to BoxAnnotator.

    Returns:
        Tuple[np.ndarray, Dict[str, np.ndarray]]: The annotated image, and a mapping from
            each phrase to its box in pixel xywh format.
    """
    h, w, _ = image_source.shape
    # Scale the normalized cxcywh boxes to pixel units.
    boxes = boxes * torch.Tensor([w, h, w, h])
    xyxy = box_convert(boxes=boxes, in_fmt="cxcywh", out_fmt="xyxy").numpy()
    xywh = box_convert(boxes=boxes, in_fmt="cxcywh", out_fmt="xywh").numpy()
    detections = sv.Detections(xyxy=xyxy)

    # The labels drawn on the image are the box indices ("0", "1", ...), not the phrases.
    labels = [str(idx) for idx in range(boxes.shape[0])]

    box_annotator = BoxAnnotator(text_scale=text_scale, text_padding=text_padding,
                                 text_thickness=text_thickness, thickness=thickness)
    annotated_frame = image_source.copy()
    annotated_frame = box_annotator.annotate(scene=annotated_frame, detections=detections, labels=labels, image_size=(w, h))

    # Map each phrase to its pixel xywh box.
    label_coordinates = {f"{phrase}": v for phrase, v in zip(phrases, xywh)}
    return annotated_frame, label_coordinates


def predict(model, image, caption, box_threshold, text_threshold):
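Returning to `annotate`: because `label_coordinates` is built by zipping `phrases` with the converted boxes, repeated phrases collapse onto a single key and only the last matching box survives. The sketch below reproduces that comprehension in plain Python, with lists standing in for the NumPy rows. `build_label_coordinates` and the sample phrases are hypothetical and only illustrate the behavior.

```python
from typing import Dict, List, Sequence


def build_label_coordinates(phrases: Sequence[object],
                            xywh: Sequence[List[float]]) -> Dict[str, List[float]]:
    """Same dict comprehension as in annotate, isolated for illustration."""
    return {f"{phrase}": box for phrase, box in zip(phrases, xywh)}


boxes = [
    [10.0, 20.0, 30.0, 40.0],
    [50.0, 60.0, 70.0, 80.0],
    [5.0, 5.0, 15.0, 15.0],
]

# Unique phrases: one entry per box.
unique = build_label_coordinates(["0", "1", "2"], boxes)
print(len(unique))  # 3

# Repeated phrases: the later box overwrites the earlier one.
duplicated = build_label_coordinates(["button", "icon", "button"], boxes)
print(len(duplicated))       # 2
print(duplicated["button"])  # [5.0, 5.0, 15.0, 15.0]
```

If phrases can repeat, keying the mapping by box index (the same strings that are drawn on the image) would avoid the collision. That is a design suggestion, not something the original code does.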
no test coverage detected