Loading on Mac #3

Open

Mohinta2892 opened this issue Aug 1, 2024 · 4 comments

Mohinta2892 commented Aug 1, 2024

Hi Jordão,

Great work, thank you for this!

I am trying to load the plugin on a Mac M1; however, I am getting this error:

  Signal emitted at: /Users/sam/miniforge3/envs/sam2/lib/python3.10/site-packages/napari_segment_anything_2/_widget.py:74, in __init__
    >  self._model_type_widget.changed.emit(model_type)

  Callback error at: /Users/sam/miniforge3/envs/sam2/lib/python3.10/site-packages/torch/cuda/__init__.py:305, in _lazy_init
    >  raise AssertionError("Torch not compiled with CUDA enabled")

Do you think this could be fixed for Apple M1 machines, which now support MPS?
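
For reference, a quick check of which backends my torch build supports (a minimal sketch, assuming a standard arm64 PyTorch install):

import torch

# On an Apple Silicon build of PyTorch, CUDA support is not compiled in,
# so anything that lazily initializes CUDA raises the AssertionError above.
print(torch.cuda.is_available())          # False on an M1 build
print(torch.backends.mps.is_available())  # True on recent macOS with arm64 torch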

Best,
Samia

JoOkuma (Member) commented Aug 1, 2024

Hi @Mohinta2892, could you share the full error?

segment-anything-1 never got full MPS support, so performance wasn't great: half of the execution was running on the CPU.
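
If MPS support gets added here, the device selection would be something like this sketch (untested on Apple Silicon; right now the plugin ends up on the 'cuda' default):

import torch

def best_device() -> torch.device:
    # Prefer CUDA, then Apple's MPS backend, then fall back to CPU.
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")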

Mohinta2892 (Author) commented:

Error:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/psygnal/_signal.py:1196, in SignalInstance._run_emit_loop(self=<SignalInstance 'changed' on ComboBox(value='sam2_hiera_b+', annotation=None, name='')>, args=('sam2_hiera_b+',))
   1195     with Signal._emitting(self):
-> 1196         self._run_emit_loop_inner()
   1197 except RecursionError as e:

File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/psygnal/_signal.py:1225, in SignalInstance._run_emit_loop_immediate(self=<SignalInstance 'changed' on ComboBox(value='sam2_hiera_b+', annotation=None, name='')>)
   1224 for caller in self._slots:
-> 1225     caller.cb(args)
        args = ('sam2_hiera_b+',)
        caller = <WeakMethod on napari_segment_anything_2._widget.SAM2Widget._load_model>

File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/psygnal/_weak_callback.py:453, in WeakMethod.cb(self=<WeakMethod on napari_segment_anything_2._widget.SAM2Widget._load_model>, args=('sam2_hiera_b+',))
    452     args = args[: self._max_args]
--> 453 func(obj, *self._args, *args, **self._kwargs)
        obj = <Container ()>
        func = <function SAM2Widget._load_model at 0x3034245e0>
        args = ('sam2_hiera_b+',)

File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/napari_segment_anything_2/_widget.py:78, in SAM2Widget._load_model(self=<Container ()>, model_type='sam2_hiera_b+')
     77 def _load_model(self, model_type: str) -> None:
---> 78     self._predictor = build_sam2_video_predictor(
     79         model_type, get_weights_path(model_type)
     80     )
     81     self._predictor.fill_hole_area = 0

File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/napari_segment_anything_2/sam2.py:42, in build_sam2_video_predictor(config_file='sam2_hiera_b+', ckpt_path=PosixPath('/Users/sam/.cache/napari-segment-anything/sam2_hiera_base_plus.pt'), device='cuda', mode='eval', hydra_overrides_extra=['++model.sam_mask_decoder_extra_args.dynamic_multimask_via_stability=true', '++model.sam_mask_decoder_extra_args.dynamic_multimask_stability_delta=0.05', '++model.sam_mask_decoder_extra_args.dynamic_multimask_stability_thresh=0.98', '++model.binarize_mask_from_pts_for_mem_enc=true', '++model.fill_hole_area=8'], apply_postprocessing=True)
     41 _load_checkpoint(model, ckpt_path)
---> 42 model = model.to(device)
        model = SAM2VideoPredictor(...)  [full module repr elided]
        device = 'cuda'
     43 if mode == "eval":

File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/torch/nn/modules/module.py:1174, in Module.to(self=SAM2VideoPredictor(...), *args=('cuda',), **kwargs={})
   1172             raise
-> 1174 return self._apply(convert)

File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/torch/nn/modules/module.py:780, in Module._apply(self=SAM2VideoPredictor(...), fn=<function Module.to.<locals>.convert>, recurse=True)
    779     for module in self.children():
--> 780         module._apply(fn)

    [... recursing through Module._apply at line 780 for ImageEncoder, Hiera, and
    PatchEmbed; the repeated module reprs are elided ...]

File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/torch/nn/modules/module.py:805, in Module._apply(self=Conv2d(3, 112, kernel_size=(7, 7), stride=(4, 4), padding=(3, 3)), fn=<function Module.to.<locals>.convert>, recurse=True)
    804 with torch.no_grad():
--> 805     param_applied = fn(param)
        param = Parameter containing: tensor([...], requires_grad=True)  [values elided]

File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/torch/nn/modules/module.py:1160, in Module.to.<locals>.convert(t=Parameter containing: tensor([...]))
   1154         return t.to(
   1155             device,
   1156             dtype if t.is_floating_point() or t.is_complex() else None,
   1157             non_blocking,
   1158             memory_format=convert_to_format,
   1159         )
-> 1160     return t.to(
        t = Parameter containing: tensor([...])  [values elided]

[... remainder of the traceback truncated in the original post; the final error,
quoted in the first comment above, is the AssertionError("Torch not compiled with
CUDA enabled") raised at torch/cuda/__init__.py:305, in _lazy_init ...]
           -2.1311e-02,  4.1355e-02],
          [ 7.1102e-04, -2.1139e-03, -1.1483e-02,  ...,  1.8262e-02,
           -3.3053e-02,  9.3409e-03],
          [ 1.3651e-03,  2.0806e-03,  4.6978e-03,  ..., -4.8697e-02,
           -5.2937e-02,  4.6720e-02]],

         [[-1.0206e-03,  7.2948e-04,  1.5788e-03,  ...,  2.0427e-04,
            4.3039e-04,  2.1217e-04],
          [-3.4039e-04,  1.9338e-03,  5.1431e-03,  ..., -3.1936e-03,
            1.0594e-03, -2.9816e-04],
          [ 1.0015e-03,  4.0346e-03,  1.1202e-02,  ..., -5.8638e-03,
           -9.7761e-04, -8.9009e-03],
          ...,
          [ 4.4911e-04,  1.4189e-02,  2.8688e-02,  ..., -8.3475e-02,
           -3.4376e-02,  1.6903e-02],
          [-3.4167e-03,  5.4817e-03, -4.9090e-03,  ...,  2.9915e-02,
            1.0103e-02,  9.6365e-03],
          [-1.5054e-03,  4.3841e-04,  2.8718e-03,  ..., -4.3588e-02,
           -8.5247e-03,  3.6395e-02]],

         [[-9.7619e-04, -1.8248e-03,  4.6931e-04,  ..., -5.4075e-03,
           -6.9350e-03, -3.4669e-03],
          [ 5.0260e-04,  1.7070e-03,  5.9101e-03,  ..., -3.5104e-03,
            3.1204e-03, -3.7837e-03],
          [ 3.1944e-03,  3.8902e-03,  7.8035e-04,  ..., -1.3977e-02,
            9.0707e-03, -1.4353e-02],
          ...,
          [-5.3115e-03, -6.3651e-03, -1.5948e-02,  ..., -8.5217e-02,
           -1.0625e-02, -6.5526e-02],
          [-5.3361e-03,  1.5615e-03,  1.2021e-02,  ...,  6.5445e-02,
            7.2142e-02, -1.1216e-02],
          [-1.4429e-03, -1.4066e-02, -1.1074e-02,  ..., -4.4381e-04,
            4.7664e-02, -4.9982e-02]]],


        [[[ 1.0553e-02,  1.1322e-02,  1.4326e-02,  ...,  3.1443e-02,
            2.1137e-02,  5.1034e-03],
          [ 6.1091e-03,  2.8269e-03,  7.0679e-03,  ..., -8.7693e-04,
            1.7148e-03,  2.2283e-03],
          [-3.2161e-03, -1.1323e-02, -7.0335e-03,  ..., -2.5902e-02,
           -2.1403e-02, -3.5484e-02],
          ...,
          [ 8.5056e-03,  4.2081e-03, -2.4422e-02,  ..., -5.5212e-02,
           -1.8391e-02,  2.3243e-02],
          [ 4.2755e-03,  1.0028e-02, -3.3063e-03,  ..., -2.5601e-02,
           -2.9477e-02,  2.5839e-02],
          [-7.7893e-03,  4.0849e-03, -1.5313e-02,  ...,  4.4130e-02,
            3.9363e-02,  1.8704e-02]],

         [[-6.4656e-03, -2.4767e-03, -4.2632e-03,  ...,  9.7555e-03,
            5.2409e-03,  7.9285e-03],
          [ 1.1871e-03,  8.8314e-04,  1.0979e-03,  ..., -8.9366e-03,
           -1.0109e-02,  1.4175e-02],
          [-2.7630e-03, -1.4088e-03,  1.5211e-02,  ...,  6.8420e-03,
           -4.5563e-03, -8.8332e-03],
          ...,
          [ 1.3173e-03,  1.5659e-03, -5.2547e-03,  ..., -7.7192e-03,
           -5.9602e-03, -1.7976e-02],
          [-2.6107e-03, -8.8215e-04, -2.7452e-03,  ...,  1.1080e-02,
           -9.7003e-03,  1.2743e-02],
          [ 2.0264e-03,  8.5462e-03, -1.2183e-02,  ...,  2.3867e-02,
            5.6096e-03, -1.5550e-02]],

         [[-4.8149e-03, -2.6166e-03, -9.3493e-03,  ..., -1.0785e-02,
           -6.8974e-04,  1.4980e-02],
          [ 2.2272e-03,  1.0257e-03, -5.1031e-03,  ..., -2.4950e-02,
           -2.0353e-02,  1.3432e-02],
          [ 2.5414e-03,  5.4957e-03,  2.7631e-02,  ...,  2.6271e-02,
            3.2319e-02,  2.3456e-02],
          ...,
          [-3.8618e-03, -4.9577e-04,  1.3917e-02,  ...,  4.2394e-02,
            5.6822e-02, -5.8700e-02],
          [-6.5298e-03, -8.5593e-03,  7.3369e-03,  ...,  4.7999e-02,
            4.7604e-02, -2.6748e-02],
          [ 9.1871e-03,  1.0932e-02, -4.1369e-03,  ..., -3.2431e-02,
           -2.9809e-02, -8.8524e-02]]],


        ...,


        [[[-5.2062e-04, -7.8277e-04, -6.2180e-04,  ...,  5.0440e-03,
            3.9998e-03,  2.4069e-03],
          [-9.4123e-04, -3.0715e-03, -4.1036e-03,  ...,  1.4605e-03,
            1.4064e-03,  3.2020e-05],
          [-7.3997e-04, -3.8483e-03, -7.4699e-03,  ..., -1.9704e-03,
           -6.7454e-04, -3.3948e-03],
          ...,
          [ 1.6385e-03, -1.3034e-03, -3.6201e-03,  ..., -8.1645e-03,
           -7.6576e-03, -1.0924e-02],
          [-6.0097e-04, -2.8441e-03, -3.4643e-03,  ..., -1.1596e-02,
           -1.1167e-02, -1.3908e-02],
          [ 3.2926e-04, -3.0315e-03, -4.1733e-03,  ..., -1.8412e-02,
           -1.8002e-02, -2.1658e-02]],

         [[-1.6582e-03, -6.6081e-04, -2.5985e-03,  ...,  7.4939e-03,
            4.7412e-03,  5.2543e-03],
          [-1.1033e-03, -1.8762e-03, -4.6103e-03,  ...,  7.9282e-03,
            6.4565e-03,  7.7538e-03],
          [-3.6774e-03, -5.2892e-03, -9.7671e-03,  ...,  3.5431e-03,
            3.0037e-03,  1.8356e-03],
          ...,
          [ 3.8064e-03,  7.2477e-03,  6.1575e-03,  ...,  1.5032e-02,
            1.0977e-02,  1.3800e-02],
          [-2.3361e-05,  4.0556e-03,  2.9980e-03,  ...,  7.3346e-03,
            4.4900e-03,  7.0280e-03],
          [ 1.2572e-03,  5.4262e-03,  1.9734e-03,  ...,  6.7534e-03,
            3.3487e-03,  5.9427e-03]],

         [[ 7.6976e-05, -1.2420e-03, -3.5198e-03,  ...,  8.8423e-04,
            4.3991e-04,  4.3118e-04],
          [-5.1724e-04, -3.3540e-03, -5.6957e-03,  ..., -1.2521e-03,
           -9.3189e-04,  2.7817e-04],
          [-2.1721e-03, -5.2063e-03, -9.5170e-03,  ..., -3.1246e-03,
           -2.4646e-03, -2.6195e-03],
          ...,
          [ 1.1597e-03,  3.9777e-04, -1.0060e-03,  ..., -4.6246e-03,
           -6.7299e-03, -6.6358e-03],
          [ 1.7065e-04, -7.4358e-04, -8.9336e-04,  ..., -8.8462e-03,
           -9.5755e-03, -8.1581e-03],
          [-1.2848e-04, -1.3902e-03, -2.8701e-03,  ..., -1.2778e-02,
           -1.3116e-02, -1.2179e-02]]],


        [[[-1.2923e-03,  1.3309e-03, -3.4818e-03,  ...,  4.1971e-03,
            2.5150e-04, -1.9228e-03],
          [-1.0573e-03,  5.0249e-04, -8.5805e-03,  ...,  1.5557e-02,
            5.9760e-03, -2.5045e-03],
          [ 1.1173e-03,  3.5124e-03, -1.4894e-02,  ...,  7.6823e-03,
            6.4722e-03, -1.7537e-02],
          ...,
          [ 2.5662e-03,  1.4192e-02, -1.0665e-02,  ...,  3.0173e-02,
            5.2288e-02, -2.3079e-02],
          [ 4.0271e-03,  5.5014e-03, -1.6477e-02,  ..., -1.3157e-02,
            6.4199e-03, -5.5198e-02],
          [ 3.4949e-03,  1.9765e-02,  7.7195e-03,  ...,  4.0261e-02,
            9.0705e-03, -1.5054e-02]],

         [[ 9.5297e-05, -1.1442e-04,  8.5512e-06,  ...,  3.3927e-03,
           -1.6613e-03, -2.3388e-03],
          [-8.2742e-04, -1.4888e-03, -5.0752e-03,  ...,  1.1454e-02,
            1.4553e-03, -1.2022e-03],
          [ 3.6592e-03,  3.5641e-03, -1.3185e-02,  ...,  7.4435e-04,
            3.7396e-03, -1.7190e-02],
          ...,
          [ 5.2127e-03,  9.8486e-03, -1.8867e-02,  ...,  2.0007e-02,
            6.1541e-02,  3.6173e-03],
          [ 2.2794e-03,  2.1029e-05, -1.3950e-02,  ..., -9.6807e-03,
            2.5041e-03, -3.3219e-02],
          [ 5.1852e-03,  8.3245e-03,  2.6235e-03,  ...,  2.3206e-02,
           -1.5107e-02,  1.8085e-02]],

         [[ 6.1483e-04,  5.9193e-04,  5.6601e-03,  ...,  2.6714e-03,
           -1.9779e-03, -5.3654e-06],
          [-7.8543e-04, -6.7515e-04,  1.6466e-03,  ...,  6.6754e-03,
            1.4732e-03, -2.3259e-03],
          [ 4.8045e-03,  4.9154e-03, -2.9143e-03,  ..., -7.2897e-03,
            9.0034e-03, -6.2386e-03],
          ...,
          [-7.9784e-03, -6.1070e-03, -2.1397e-02,  ..., -1.5827e-02,
            2.4303e-02, -1.3620e-02],
          [-5.7709e-03, -1.6775e-03,  6.6735e-03,  ...,  1.5027e-02,
            2.1168e-02, -9.7590e-03],
          [ 3.0023e-03, -1.1755e-03,  8.1551e-03,  ...,  2.3961e-02,
           -3.5907e-03,  2.0575e-02]]],


        [[[-3.0456e-03, -5.6756e-03, -7.6827e-03,  ...,  2.9488e-03,
            1.7588e-03,  2.0029e-02],
          [ 8.2100e-04, -5.2458e-03, -3.6726e-03,  ..., -8.4385e-03,
           -3.0306e-02,  8.3624e-03],
          [ 3.1192e-03,  9.7630e-04,  1.4849e-02,  ...,  8.9288e-03,
           -5.7967e-03, -5.4297e-03],
          ...,
          [ 1.5802e-02, -2.6536e-03,  4.5022e-02,  ...,  4.2988e-02,
            7.2975e-03,  1.6225e-02],
          [-5.1712e-03, -2.5649e-02, -1.6816e-03,  ...,  4.8976e-02,
            3.0842e-02,  4.7770e-03],
          [ 2.0409e-03, -1.6429e-02,  2.3955e-03,  ..., -8.8538e-03,
           -1.6545e-02, -3.4540e-02]],

         [[ 6.0608e-04, -1.4312e-03, -3.3261e-03,  ...,  4.3023e-04,
            1.1340e-04,  1.2702e-02],
          [ 2.2731e-03, -2.1259e-03,  4.1843e-04,  ..., -1.5698e-02,
           -3.2651e-02, -6.0042e-03],
          [ 5.6862e-03,  5.1457e-03,  2.0555e-02,  ...,  7.7007e-03,
            1.0566e-02, -9.5118e-03],
          ...,
          [ 6.7461e-03, -1.4728e-02,  3.6909e-02,  ..., -4.9246e-03,
            9.9975e-03, -5.9974e-03],
          [-7.1357e-03, -2.8367e-02,  3.6196e-04,  ...,  1.5459e-02,
            3.2032e-02, -3.1037e-02],
          [-1.4704e-03, -5.1660e-03,  1.4092e-02,  ...,  9.5614e-03,
            3.5618e-02, -2.5627e-03]],

         [[ 4.4391e-04, -1.4724e-03, -2.7265e-03,  ...,  1.5673e-03,
            2.3488e-04,  5.4630e-03],
          [ 1.9466e-03, -3.6650e-03, -1.4482e-03,  ..., -1.1559e-02,
           -2.2967e-02,  2.5918e-03],
          [ 3.5293e-03,  3.3653e-03,  1.3656e-02,  ...,  6.0606e-03,
            1.3521e-02, -5.3457e-03],
          ...,
          [-1.8808e-03, -2.0722e-02,  2.9218e-03,  ..., -3.9708e-02,
           -1.2956e-03,  4.9875e-03],
          [-4.0036e-03, -1.1779e-02, -2.5937e-03,  ..., -1.4814e-02,
            1.7447e-02, -1.2174e-02],
          [-3.9600e-03, -3.8076e-03,  7.7254e-03,  ..., -1.4007e-02,
            1.1722e-02,  4.5623e-03]]]], requires_grad=True)
        device = device(type='cuda')
        dtype = None
        non_blocking = False
   1161         device,
   1162         dtype if t.is_floating_point() or t.is_complex() else None,
   1163         non_blocking,
   1164     )
   1165 except NotImplementedError as e:

File ~/miniforge3/envs/sam2/lib/python3.10/site-packages/torch/cuda/__init__.py:305, in _lazy_init()
    304 if not hasattr(torch._C, "_cuda_getDeviceCount"):
--> 305     raise AssertionError("Torch not compiled with CUDA enabled")
    306 if _cudart is None:

AssertionError: Torch not compiled with CUDA enabled

@bennm37

bennm37 commented Nov 7, 2024

Hi,
I had a similar issue. Adding

import os
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

before torch is imported in _widget.py seems to fix this.
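
For reference, a minimal sketch of how that placement might look at the top of _widget.py (the exact location is an assumption; the only requirement is that the variable is set before torch is imported for the first time anywhere in the process):

import os

# Let PyTorch fall back to CPU for ops not yet implemented on MPS.
# Must run before the first `import torch`, otherwise it has no effect.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch  # noqa: E402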

@horsto

horsto commented Nov 26, 2024

Thanks @bennm37 - that gets me one step further. The device is now correctly detected as MPS on my Mac, but then:

def _load_model(self, model_type: str) -> None:
--> 136     self._predictor = build_sam2_video_predictor(
        self = <Container ()>
        model_type = 'sam2_hiera_b+'
    137         model_type, get_weights_path(model_type)
    138     )
    139     self._predictor.fill_hole_area = 0

File ~/Documents/python/napari-segment-anything-2/src/napari_segment_anything_2/sam2.py:42, in build_sam2_video_predictor(config_file='sam2_hiera_b+', ckpt_path=PosixPath('/Users/horst/.cache/napari-segment-anything/sam2_hiera_base_plus.pt'), device='cuda', mode='eval', hydra_overrides_extra=['++model.sam_mask_decoder_extra_args.dynamic_multimask_via_stability=true', '++model.sam_mask_decoder_extra_args.dynamic_multimask_stability_delta=0.05', '++model.sam_mask_decoder_extra_args.dynamic_multimask_stability_thresh=0.98', '++model.binarize_mask_from_pts_for_mem_enc=true', '++model.fill_hole_area=8'], apply_postprocessing=True)
     41 _load_checkpoint(model, ckpt_path)
---> 42 model = model.to(device)
        model = SAM2VideoPredictor(

EmitLoopError: 

While emitting signal 'magicgui.widgets.ComboBox.changed', a AssertionError occurred in a callback:

  Signal emitted at: /Users/horst/Documents/python/napari-segment-anything-2/src/napari_segment_anything_2/_widget.py:129, in __init__
    >  self._model_type_widget.changed.emit(model_type)

  Callback error at: /Users/horst/miniconda3/envs/napari/lib/python3.10/site-packages/torch/cuda/__init__.py:310, in _lazy_init
    >  raise AssertionError("Torch not compiled with CUDA enabled")

Is there something obvious I could try?
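
Looking at the traceback again, build_sam2_video_predictor is being called with device='cuda' even though MPS was detected, so the model is still moved to a CUDA device at model.to(device). Maybe the fix is device selection along these lines (a sketch, assuming the device argument is forwarded to model.to() as the signature in the traceback suggests; the call below is hypothetical):

import torch

# Pick the best available backend instead of hardcoding "cuda".
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

# Hypothetical call, mirroring the signature shown in the traceback.
predictor = build_sam2_video_predictor(model_type, ckpt_path, device=device)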
