CUDA error: operation not supported #170

@Thurs-Lw

Hello, I encountered the following error while running Evo2:

Found complete file in repo: evo2_7b.pt
[08/07/25 08:58:08] INFO StripedHyena - INFO - Initializing model.py:616
StripedHyena with config:
{'model_name': 'shc-evo2-7b-8k-2T-v2',
'vocab_size': 512, 'hidden_size':
4096, 'num_filters': 4096,
'hcl_layer_idxs': [2, 6, 9, 13, 16,
20, 23, 27, 30], 'hcm_layer_idxs': [1,
5, 8, 12, 15, 19, 22, 26, 29],
'hcs_layer_idxs': [0, 4, 7, 11, 14,
18, 21, 25, 28], 'attn_layer_idxs':
[3, 10, 17, 24, 31],
'hcm_filter_length': 128,
'hcl_filter_groups': 4096,
'hcm_filter_groups': 256,
'hcs_filter_groups': 256,
'hcs_filter_length': 7, 'num_layers':
32, 'short_filter_length': 3,
'num_attention_heads': 32,
'short_filter_bias': False,
'mlp_init_method':
'torch.nn.init.zeros_',
'mlp_output_init_method':
'torch.nn.init.zeros_', 'eps': 1e-06,
'state_size': 16, 'rotary_emb_base':
100000000000,
'rotary_emb_scaling_factor': 128,
'use_interpolated_rotary_pos_emb':
True, 'make_vocab_size_divisible_by':
8, 'inner_size_multiple_of': 16,
'inner_mlp_size': 11264,
'log_intermediate_values': False,
'proj_groups': 1,
'hyena_filter_groups': 1,
'column_split_hyena': False,
'column_split': True, 'interleave':
True, 'evo2_style_activations': True,
'model_parallel_size': 1,
'pipe_parallel_size': 1,
'tie_embeddings': True,
'mha_out_proj_bias': True,
'hyena_out_proj_bias': True,
'hyena_flip_x1x2': False,
'qkv_proj_bias': False,
'use_fp8_input_projections': False,
'max_seqlen': 1048576,
'max_batch_size': 1, 'final_norm':
True, 'use_flash_attn': True,
'use_flash_rmsnorm': False,
'use_flash_depthwise': False,
'use_flashfft': False,
'use_laughing_hyena': False,
'inference_mode': True,
'tokenizer_type':
'CharLevelTokenizer', 'prefill_style':
'fft', 'mlp_activation': 'gelu',
'print_activations': False, 'Loader':
<class 'yaml.loader.FullLoader'>}
Traceback (most recent call last):
  File "/home-ssd/Users/evo2/evo2_offical_test/github_test_1.py", line 10, in <module>
    evo2_model = Evo2('evo2_7b')
                 ^^^^^^^^^^^^^^^
  File "/home-ssd/Users/miniconda3/envs/flattn274/lib/python3.12/site-packages/evo2/models.py", line 48, in __init__
    self.model = self.load_evo2_model(model_name, config_path)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home-ssd/Users/miniconda3/envs/flattn274/lib/python3.12/site-packages/evo2/models.py", line 255, in load_evo2_model
    model = StripedHyena(global_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home-ssd/Users/miniconda3/envs/flattn274/lib/python3.12/site-packages/vortex/model/model.py", line 619, in __init__
    self.embedding_layer = VocabParallelEmbedding(config)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home-ssd/Users/miniconda3/envs/flattn274/lib/python3.12/site-packages/vortex/model/layers.py", line 232, in __init__
    super().__init__(
  File "/home-ssd/Users/miniconda3/envs/flattn274/lib/python3.12/site-packages/torch/nn/modules/sparse.py", line 167, in __init__
    torch.empty((num_embeddings, embedding_dim), **factory_kwargs),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home-ssd/Users/miniconda3/envs/flattn274/lib/python3.12/site-packages/torch/utils/_device.py", line 104, in __torch_function__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: operation not supported
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

I have already set use_fp8_input_projections to False.
In addition, I have tried the following combinations:

  • CUDA 12.8 + Torch 2.7 + FlashAttention 2.8.0.post2
  • CUDA 12.8 + Torch 2.7 + FlashAttention 2.7.4.post1
  • CUDA 12.8 + Torch 2.6 + FlashAttention 2.8.0.post2
  • CUDA 12.4 + Torch 2.6 + FlashAttention 2.8.0.post2

The issue occurs in all of these setups.

My GPU is an NVIDIA A800.
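
Since the traceback ends in a plain torch.empty call during embedding construction, it may help to check whether a bare CUDA allocation fails outside Evo2 as well. The following is a minimal sketch for that check, not part of Evo2 or vortex: the function name collect_cuda_report and the report keys are hypothetical, and the tensor shape simply mirrors vocab_size and hidden_size from the config above.

```python
import platform


def collect_cuda_report():
    """Collect version info and try a plain CUDA allocation, independent of Evo2.

    Hypothetical helper for illustration; the keys in the returned dict are
    arbitrary names chosen for this sketch.
    """
    report = {"python": platform.python_version()}

    try:
        import torch
    except ImportError:
        # torch is not installed in this environment; nothing more to check.
        report["torch"] = None
        return report

    report["torch"] = torch.__version__
    # CUDA version this torch build was compiled against (None for CPU-only builds).
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if not report["cuda_available"]:
        return report

    report["device_name"] = torch.cuda.get_device_name(0)
    report["capability"] = torch.cuda.get_device_capability(0)

    try:
        # Same kind of call that fails in the traceback; the shape mirrors
        # vocab_size x hidden_size from the logged config (assumption: the
        # exact shape is not what triggers the error).
        torch.empty((512, 4096), device="cuda")
        torch.cuda.synchronize()
        report["plain_allocation"] = "ok"
    except RuntimeError as exc:
        report["plain_allocation"] = f"failed: {exc}"

    return report


if __name__ == "__main__":
    for key, value in collect_cuda_report().items():
        print(f"{key}: {value}")
```

If plain_allocation also fails here, the problem more likely sits in the driver, the GPU configuration, or the PyTorch build than in Evo2 itself.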

How can I resolve this? Thank you.
