Description
Hi!
I am trying to understand a discrepancy between the CoreML and PyTorch inference of our model. I see at least 10% differences for specific heads and test cases, and I need to find the reason.
This is a screenshot of the model's input in the Netron app:
The MUL and ADD are vectors. (The current CoreML does not support vectors for the bias operation, but I changed that; see issue #2619.)
The values for MUL and ADD are:
MUL: 0.01464911736547947, 0.015123673714697361, 0.015288766473531723
ADD: -1.8590586185455322, -1.7242575883865356, -1.5922027826309204
Below is the code I use to run inference in the development environment:
from pathlib import Path
from typing import List

import cv2
import torch
import torch.nn as nn


def test_on_single_image(multihead_model: nn.Module, image_path: Path) -> List[float]:
    """
    Default method to test the model on a single image.
    """
    img_bgr = cv2.imread(str(image_path))
    if img_bgr is None:
        raise RuntimeError(f"Could not read image: {image_path}")
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
    img_rgb = cv2.resize(img_rgb, (224, 224), interpolation=cv2.INTER_LINEAR)
    # Convert to tensor and add batch dimension
    img_tensor = (
        torch.from_numpy(img_rgb).permute(2, 0, 1).unsqueeze(0)
    )  # Shape: 1,3,H,W
    img_tensor = img_tensor.float() / 255.0
    # Per-channel normalisation: (x - MEAN) / STD
    for c in range(3):
        img_tensor[0, c, :, :] = (img_tensor[0, c, :, :] - MEAN[c]) / STD[c]
    outputs = multihead_model(img_tensor)
    predicted = outputs
    return predicted
The values for MEAN and STD are:
MEAN = [0.49767, 0.4471, 0.4084]
STD = [0.2677, 0.2593, 0.2565]
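For reference, the MUL and ADD vectors above appear to be the usual folding of the /255 scaling and the (x - MEAN) / STD normalisation into a per-channel scale and bias. A quick sanity check:

```python
MEAN = [0.49767, 0.4471, 0.4084]
STD = [0.2677, 0.2593, 0.2565]

# If the converter folds the /255 and the (x - mean) / std normalisation into
# the model's preprocessing, the per-channel constants should be:
#   scale = 1 / (255 * std)   and   bias = -mean / std
scale = [1.0 / (255.0 * s) for s in STD]    # ~ [0.014649, 0.015124, 0.015289]
bias = [-m / s for m, s in zip(MEAN, STD)]  # ~ [-1.85906, -1.72426, -1.59220]
print(scale)
print(bias)
```

These reproduce the MUL and ADD values shown in Netron, so the per-channel constants themselves look consistent with the PyTorch preprocessing.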
img_tensor[0, 0, 0, 0] = 1.2319, which matches colorImage__biased__ (1.2319052); however, the second value of colorImage__biased__ is 1.1586595, which differs from img_tensor[0, 0, 0, 1] = 1.0561.
I compared the ordering of colorImage__biased__ and img_tensor to check whether the discrepancy comes from row/channel ordering, but it does not. Values at other positions, e.g. img_tensor[0, 1, 0, 1] versus the corresponding colorImage__biased__ entry, also differ.
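For completeness, this is roughly how the Core ML side is run (a minimal sketch; `colorImage` is the input name I am assuming from the Netron graph, and the model path is a placeholder):

```python
import coremltools as ct
from PIL import Image

# Load the converted model (placeholder path).
mlmodel = ct.models.MLModel("multihead_model.mlpackage")

# Core ML image inputs take a PIL image; resize to the model's expected size.
img = Image.open("test.jpg").convert("RGB").resize((224, 224), Image.BILINEAR)

# "colorImage" is assumed from the Netron screenshot; adjust to the real input name.
outputs = mlmodel.predict({"colorImage": img})
print(outputs)
```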
The only thing I believe could differ is the interpolation method used on the CoreML side. OpenCV uses cv2.INTER_LINEAR, but I do not know what Apple's inference pipeline uses for resizing. I have tried different interpolation algorithms, but the results still differ.
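To check how much the resize step alone can contribute, I can compare OpenCV's bilinear resize against Pillow's bilinear resize (a different implementation, which the Apple-side preprocessing may be closer to). This is only a sketch with a placeholder image path:

```python
import cv2
import numpy as np
from PIL import Image

img_bgr = cv2.imread("test.jpg")  # placeholder path
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)

# OpenCV bilinear resize.
resized_cv = cv2.resize(img_rgb, (224, 224), interpolation=cv2.INTER_LINEAR)

# Pillow bilinear resize (different edge handling / filtering than OpenCV).
resized_pil = np.array(Image.fromarray(img_rgb).resize((224, 224), Image.BILINEAR))

diff = np.abs(resized_cv.astype(np.int16) - resized_pil.astype(np.int16))
print("max abs diff:", diff.max(), "mean abs diff:", diff.mean())
```

If these two already disagree by several gray levels at some pixels, a resize mismatch could plausibly account for per-pixel differences of the size described above.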
What are the possible reasons for this discrepancy?
