Hey, I have been trying to label my data with OWLv2. I have tried two different pieces of code for the same task: one without post-processing, which gives good results, and one with post-processing, since I want correct annotations on my original image, which is (704, 576). The preprocessing step automatically resizes the image to (960, 960). I have tried different values such as threshold=0.98 and nms_threshold=1.0.
I just want correct annotations at image size (704, 576).
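For reference, the automatic resize can be confirmed directly from the processor output; a minimal check with the same checkpoint as in the code below:

from PIL import Image
from transformers import Owlv2Processor

processor = Owlv2Processor.from_pretrained("google/owlv2-base-patch16-ensemble")
image = Image.open('./all_images/2024-08-19-114707271_1000032$1$0$21_1-00000.jpg')
print(image.size)          # (704, 576)
pixel_values = processor(images=image, return_tensors="pt").pixel_values
print(pixel_values.shape)  # torch.Size([1, 3, 960, 960])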
The code without post-processing (gives perfect bounding boxes at image size (960, 960)):
import numpy as np
import torch
from PIL import Image

# processor, model, query_embedding, image_path, get_preprocessed_image and
# sigmoid are defined earlier (as in the OWLv2 image-guided detection notebook)
target_image = Image.open(image_path)
target_pixel_values = processor(images=target_image, return_tensors="pt").pixel_values
unnormalized_target_image = get_preprocessed_image(target_pixel_values)

with torch.no_grad():
    feature_map = model.image_embedder(target_pixel_values)[0]
    b, h, w, d = feature_map.shape
    target_boxes = model.box_predictor(
        feature_map.reshape(b, h * w, d), feature_map=feature_map
    )
    target_class_predictions = model.class_predictor(
        feature_map.reshape(b, h * w, d),
        torch.tensor(query_embedding[None, None, ...]),  # [batch, queries, d]
    )[0]

target_boxes = np.array(target_boxes[0].detach())
target_logits = np.array(target_class_predictions[0].detach())

# keep the patch with the highest logit for the single query embedding
top_ind = np.argmax(target_logits[:, 0], axis=0)
score = sigmoid(target_logits[top_ind, 0])
top_boxes = target_boxes[top_ind]
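Based on my understanding that the preprocessing pads the image to a square before resizing to (960, 960), I think these boxes can be mapped back to the original (704, 576) image roughly like this (an untested sketch; the square padding is an assumption on my part):

# top_boxes is normalized (cx, cy, w, h) relative to the padded square input,
# so scaling by max(width, height) of the original image should undo both the
# resize to 960x960 and the padding (assumption: padding is added bottom/right)
orig_w, orig_h = target_image.size  # (704, 576)
side = max(orig_w, orig_h)          # 704: side of the padded square
cx, cy, w, h = top_boxes * side
x0, y0, x1, y1 = cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2
print([round(float(v), 2) for v in (x0, y0, x1, y1)])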
The code with post-processing (the results are inaccurate at both image sizes):
from PIL import Image
import torch
from transformers import Owlv2Processor, Owlv2ForObjectDetection

processor = Owlv2Processor.from_pretrained("google/owlv2-base-patch16-ensemble")
model = Owlv2ForObjectDetection.from_pretrained("google/owlv2-base-patch16-ensemble")

source_image = Image.open('./source_image.jpg')
target_image = Image.open('./all_images/2024-08-19-114707271_1000032$1$0$21_1-00000.jpg')

# threshold/nms_threshold belong to post-processing, not to the processor call
inputs = processor(images=target_image, query_images=source_image, return_tensors="pt")

with torch.no_grad():
    outputs = model.image_guided_detection(**inputs)

target_sizes = torch.Tensor([target_image.size[::-1]])  # (height, width)
results = processor.post_process_image_guided_detection(
    outputs=outputs, target_sizes=target_sizes, threshold=0.98, nms_threshold=1.0
)

boxes, scores = results[0]["boxes"], results[0]["scores"]  # single image, so index 0
for box, score in zip(boxes, scores):
    box = [round(i, 2) for i in box.tolist()]
    print(f"Detected similar object with confidence {round(score.item(), 3)} at location {box}")