This repository has been archived by the owner on Feb 7, 2023. It is now read-only.

Error while converting op of type: Conv. Error message: provided number axes -1 not supported #509

Open
dkossnick-figma opened this issue Nov 19, 2019 · 10 comments
Labels
bug Unexpected behaviour that should be corrected (type)

Comments

@dkossnick-figma

dkossnick-figma commented Nov 19, 2019

🐞Describe the bug

While converting the pretrained PyTorch Progressive GAN model (https://pytorch.org/hub/facebookresearch_pytorch-gan-zoo_pgan/) to CoreML, I am hitting a fatal error in the ONNX-to-CoreML conversion step. See below for the trace and a simple end-to-end script (PyTorch → ONNX → CoreML) that reproduces the issue both locally and in a Google Colab notebook environment.

Trace

This is the output from my conversion script, with the final part being the CoreML crash.

WARNING:root:TensorFlow version 1.15.0 detected. Last version known to be fully compatible is 1.14.0 .
Using cache found in /Users/davidkosslyn/.cache/torch/hub/facebookresearch_pytorch_GAN_zoo_hub
Loading default model : celebaHQ-256
Average network found !
input shape: torch.Size([1, 512])
GNet(
  (scaleLayers): ModuleList(
    (0): ModuleList(
      (0): EqualizedConv2d(
        (module): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
      (1): EqualizedConv2d(
        (module): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
    )
    (1): ModuleList(
      (0): EqualizedConv2d(
        (module): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
      (1): EqualizedConv2d(
        (module): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
    )
    (2): ModuleList(
      (0): EqualizedConv2d(
        (module): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
      (1): EqualizedConv2d(
        (module): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
    )
    (3): ModuleList(
      (0): EqualizedConv2d(
        (module): Conv2d(512, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
      (1): EqualizedConv2d(
        (module): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
    )
    (4): ModuleList(
      (0): EqualizedConv2d(
        (module): Conv2d(256, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
      (1): EqualizedConv2d(
        (module): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
    )
    (5): ModuleList(
      (0): EqualizedConv2d(
        (module): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
      (1): EqualizedConv2d(
        (module): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      )
    )
  )
  (toRGBLayers): ModuleList(
    (0): EqualizedConv2d(
      (module): Conv2d(512, 3, kernel_size=(1, 1), stride=(1, 1))
    )
    (1): EqualizedConv2d(
      (module): Conv2d(512, 3, kernel_size=(1, 1), stride=(1, 1))
    )
    (2): EqualizedConv2d(
      (module): Conv2d(512, 3, kernel_size=(1, 1), stride=(1, 1))
    )
    (3): EqualizedConv2d(
      (module): Conv2d(512, 3, kernel_size=(1, 1), stride=(1, 1))
    )
    (4): EqualizedConv2d(
      (module): Conv2d(256, 3, kernel_size=(1, 1), stride=(1, 1))
    )
    (5): EqualizedConv2d(
      (module): Conv2d(128, 3, kernel_size=(1, 1), stride=(1, 1))
    )
    (6): EqualizedConv2d(
      (module): Conv2d(64, 3, kernel_size=(1, 1), stride=(1, 1))
    )
  )
  (formatLayer): EqualizedLinear(
    (module): Linear(in_features=512, out_features=8192, bias=True)
  )
  (groupScale0): ModuleList(
    (0): EqualizedConv2d(
      (module): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
  )
  (leakyRelu): LeakyReLU(negative_slope=0.2)
  (normalizationLayer): NormalizationLayer()
)
Converting model to ONNX...
Exported model progan.onnx has been tested with ONNXRuntime, and the result looks good!
Converting to CoreML...
1/278: Converting Node Type Pow
2/278: Converting Node Type ReduceMean
3/278: Converting Node Type Add
4/278: Converting Node Type Sqrt
5/278: Converting Node Type Div
6/278: Converting Node Type Mul
7/278: Converting Node Type Reshape
8/278: Converting Node Type Gemm
9/278: Converting Node Type Mul
10/278: Converting Node Type LeakyRelu
11/278: Converting Node Type Shape
12/278: Converting Node Type Gather
13/278: Converting Node Type Unsqueeze
14/278: Converting Node Type Concat
15/278: Converting Node Type Reshape
16/278: Converting Node Type Pow
17/278: Converting Node Type ReduceMean
18/278: Converting Node Type Add
19/278: Converting Node Type Sqrt
20/278: Converting Node Type Div
21/278: Converting Node Type Mul
22/278: Converting Node Type Conv
Traceback (most recent call last):
  File "repro.py", line 43, in <module>
    image_output_names=['output'])
  File "/Users/davidkosslyn/anaconda3/lib/python3.7/site-packages/onnx_coreml/converter.py", line 625, in convert
    _convert_node_nd(builder, node, graph, err)
  File "/Users/davidkosslyn/anaconda3/lib/python3.7/site-packages/onnx_coreml/_operators_nd.py", line 2370, in _convert_node_nd
    return converter_fn(builder, node, graph, err)
  File "/Users/davidkosslyn/anaconda3/lib/python3.7/site-packages/onnx_coreml/_operators_nd.py", line 524, in _convert_conv
    builder, node, graph, err)
  File "/Users/davidkosslyn/anaconda3/lib/python3.7/site-packages/onnx_coreml/_operators_nd.py", line 70, in _add_conv_like_op
    return err.unsupported_op_configuration(builder, node, graph, "provided number axes {} not supported".format(rank))
  File "/Users/davidkosslyn/anaconda3/lib/python3.7/site-packages/onnx_coreml/_error_utils.py", line 60, in unsupported_op_configuration
    self.rerun_suggestion)
TypeError: Error while converting op of type: Conv. Error message: provided number axes 1 not supported 
 Please try converting with higher target_ios.
You can also provide custom function/layer to convert the model.

To Reproduce

The script below pulls the pretrained model, converts it to ONNX, and then converts the ONNX model to CoreML. You can also run it directly in a Colab notebook that reproduces the issue: https://colab.research.google.com/drive/126k3OL3378IiNPO8NmFHuPq2BhwKb404#scrollTo=kE8VYX3n4-14.

import onnx
import onnxruntime
import torch.onnx
import numpy as np
from onnx_coreml import convert
import sys

print(f"Torch version: {torch.__version__}")
print(f"Onnx version: {onnx.__version__}")
print(f"onnxruntime version: {onnxruntime.__version__}")
print(f"Python version: {sys.version}")

NAME = "progan"
full_network = torch.hub.load('facebookresearch/pytorch_GAN_zoo:hub', 'PGAN', pretrained=True, useGPU=False)
input, _ = full_network.buildNoiseData(1)
print(f"input shape: {input.shape}")
model = full_network.netG
print(model)
model.eval()
output = model(input)

print("Converting model to ONNX...")
onnx_name = f"{NAME}.onnx"
torch.onnx.export(model,               # model being run
                  input,                         # model input (or a tuple for multiple inputs)
                  onnx_name,   # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=11,          # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names = ["input"],   # the model's input names
                  output_names = ["output"], # the model's output names
                  )
onnx_model = onnx.load(onnx_name)
onnx.checker.check_model(onnx_model)
ort_session = onnxruntime.InferenceSession(onnx_name)
def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()
# compute ONNX Runtime output prediction
ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(input)}
ort_outs = ort_session.run(None, ort_inputs)
# compare ONNX Runtime and PyTorch results
np.testing.assert_allclose(to_numpy(output), ort_outs[0], rtol=1e-03, atol=1e-04)
print(f"Exported model {onnx_name} has been tested with ONNXRuntime, and the result looks good!")

print("Converting to CoreML...")
mlmodel = convert(onnx_model,
                minimum_ios_deployment_target='13',
                image_output_names=['output'])
coreml_name = f"{NAME}.mlmodel"
mlmodel.save(coreml_name)
print(f"coreml model saved at {coreml_name}")

The ONNX model produced by the above script is available to download here, if that helps with reproducing: https://drive.google.com/file/d/1ILLDBo2xXdsgaaf-Q_jilKFgTehNOVst/view?usp=sharing.

Also attached is an exported environment.yml file for my conda environment, which reproduces the same issue locally.

Diagram (the network is too big to view in one frame): [attached image of the exported graph]

System environment:

  • coremltools version: 3.0
  • onnx-coreml version: 1.0
  • OS: MacOS
  • macOS version: 10.15.1
  • How you install python: anaconda3
  • python version: 3.7.3
@dkossnick-figma dkossnick-figma added the bug Unexpected behaviour that should be corrected (type) label Nov 19, 2019
@dkossnick-figma
Author

dkossnick-figma commented Dec 19, 2019

Just updated the Colab notebook to ONNX opset 11, with:
Torch version: 1.3.1
Onnx version: 1.6.0
onnxruntime version: 1.1.0
onnx_coreml version: 1.1.0

I was hoping this would help. It changed the error from axes 1 to axes -1, but the failure is still there.

@dkossnick-figma dkossnick-figma changed the title Error while converting op of type: Conv. Error message: provided number axes 1 not supported Error while converting op of type: Conv. Error message: provided number axes -1 not supported Dec 19, 2019
@yaroslavvb

Can reproduce

@DawerG
Collaborator

DawerG commented Dec 19, 2019

@kossnick Can you please try converting the model using the change in PR #524?

@dkossnick-figma
Author

The conv op in my own model is indeed converting correctly, thank you! 🙌 ❤️

I'm hitting other, similar axes=-1 issues elsewhere in our models (InstanceNorm2d in some scenarios and Slice in others). I've yet to find or make a simpler standalone repro for those. Any advice on how to tackle them, or things to dig into?
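
For what it's worth, onnx-coreml exposes a custom_conversion_functions argument on convert() (the same hook the error message hints at), which lets you take over conversion of an op type that trips the axes=-1 check. Below is a rough, untested sketch for InstanceNormalization; it assumes the custom function receives the same (builder, node, graph, err) arguments as the built-in converters shown in the traceback, and that the scale/bias initializers are reachable via node.input_tensors. Treat it as a starting point, not a verified fix.

import onnx
from onnx_coreml import convert

def convert_instancenorm(builder, node, graph, err):
    # gamma (scale) and beta (bias) arrive as initializer tensors on the ONNX node
    gamma = node.input_tensors[node.inputs[1]]
    beta = node.input_tensors[node.inputs[2]]
    builder.add_batchnorm(
        name=node.name,
        channels=gamma.shape[0],
        gamma=gamma,
        beta=beta,
        input_name=node.inputs[0],
        output_name=node.outputs[0],
        compute_mean_var=True,          # compute per-instance statistics at runtime
        instance_normalization=True,
        epsilon=node.attrs.get('epsilon', 1e-5))

onnx_model = onnx.load("progan.onnx")   # exported by the repro script above
mlmodel = convert(onnx_model,
                  minimum_ios_deployment_target='13',
                  image_output_names=['output'],
                  custom_conversion_functions={'InstanceNormalization': convert_instancenorm})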

@jeremycochoy

jeremycochoy commented Feb 1, 2020

@kossnick Can you please try converting the model using the change in PR #524?

I had exactly this problem with my "home made" model. Forcing the rank to 3 (if rank == -1: rank = 3) solved the issue for me (I have two Conv1D layers with rank-3 tensor shapes, and I know they were the ones causing this problem).

I tested your code and the export works perfectly; I think it is the easiest way to handle this.
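
For concreteness, this is roughly what that local workaround looks like inside _add_conv_like_op in onnx_coreml/_operators_nd.py (a sketch of an edit to the library source, not the upstream fix; the rank variable name comes from the traceback above):

# inside _add_conv_like_op, just before the "provided number axes {} not supported" check
if rank == -1:
    rank = 3  # hard-code the rank you know your Conv inputs have: 3 for Conv1D, 4 for Conv2D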

@DawerG: When do you think your PR will be merged?

@themez

themez commented Feb 6, 2020

I used the change in PR #524, and now I get a new error: Error while converting op of type: BatchNormalization. Error message: provided number axes -1 not supported. I guess it can be fixed the same way?

@dragen1860

Yes, Conv2D, BatchNormalization, and GlobalPooling all hit the rank=-1 error. All three conversion functions in the source need to be modified.

@gemfield

gemfield commented Apr 1, 2020

Related to c36bfef?

@jeremycochoy

That may be the source of the problem. But I don't know whether the previous implementation (rank = len(graph.shape_dict[node.inputs[0]])) would work with the current code base. Unfortunately I don't have the time to run that kind of test this week :/
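
If that older lookup still holds up, one hedged middle ground would be to use it only as a fallback when the new rank computation fails, e.g. inside _add_conv_like_op (untested sketch, reusing the shape_dict lookup quoted above):

if rank == -1:
    shape = graph.shape_dict.get(node.inputs[0])
    rank = len(shape) if shape else 3  # last resort: hard-code the known input rank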

@langleyd

langleyd commented May 25, 2020

Also saw this issue for Conv, ConvTranspose, and MaxPool. I worked around it by forcing the rank in the _add_conv_like_op function to the value I knew was expected in my "home made" model.
