gemma3_text support in NormalizedConfig #2393

@lmmx

Description

Feature request

I got the following error while trying to optimise an ONNX export of an embedding model whose backbone is gemma3_text:

KeyError: 'gemma3_text model type is not supported yet in NormalizedConfig. Only albert, bart, bert, big_bird, bigbird_pegasus, blenderbot, blenderbot-small, bloom, falcon, camembert, codegen, cvt, deberta, deberta-v2, deit, dinov2, distilbert, donut-swin, electra, encoder-decoder, gemma, gpt2, gpt_bigcode, gpt_neo, gpt_neox, gptj, imagegpt, internlm2, llama, longt5, marian, markuplm, mbart, mistral, mixtral, modernbert, mpnet, mpt, mt5, m2m_100, nystromformer, olmo, olmo2, opt, pegasus, pix2struct, phi, phi3, poolformer, regnet, resnet, roberta, segformer, speech_to_text, splinter, t5, trocr, vision-encoder-decoder, vit, whisper, xlm-roberta, yolos, qwen2, qwen3, qwen3_moe, smollm3, granite, clip are supported. If you want to support gemma3_text please propose a PR or open up an issue.'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/louis/lab/emb/tarka/.venv/bin/optimum-cli", line 10, in <module>
    sys.exit(main())
             ~~~~^^
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/commands/optimum_cli.py", line 219, in main
    service.run()
    ~~~~~~~~~~~^^
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/commands/export/onnx.py", line 264, in run
    main_export(
    ~~~~~~~~~~~^
        model_name_or_path=self.args.model,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<20 lines>...
        **input_shapes,
        ^^^^^^^^^^^^^^^
    )
    ^
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/exporters/onnx/__main__.py", line 399, in main_export
    onnx_export_from_model(
    ~~~~~~~~~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<18 lines>...
        **kwargs_shapes,
        ^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/exporters/onnx/convert.py", line 1096, in onnx_export_from_model
    optimizer = ORTOptimizer.from_pretrained(output, file_names=onnx_files_subpaths)
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/onnxruntime/optimization.py", line 126, in from_pretrained
    return cls(onnx_model_path, config=config, from_ortmodel=from_ortmodel)
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/onnxruntime/optimization.py", line 75, in __init__
    raise NotImplementedError(
    ...<2 lines>...
    )
NotImplementedError: Tried to use ORTOptimizer for the model type gemma3_text, but it is not available yet. Please open an issue or submit a PR at https://github.com/huggingface/optimum.

Motivation

The resulting embeddings are very fast on GPU but about 20x slower on CPU. This is presumably due to the missing graph optimisation, since exporting with -O2 or -O3 fails with the error above.

Your contribution

I can look into it, but the error message said to report it here first!

To reproduce:

# Try exporting with specific CPU optimization
uv pip install sentence-transformers optimum[onnxruntime-gpu]

optimum-cli export onnx \
  --model Tarka-AIR/Tarka-Embedding-150M-V1 \
  --task feature-extraction \
  --optimize O3 \
  tarka-150m-v1-onnx-o3/
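For what it's worth, both error messages point at the same kind of fix: a model-type lookup table that currently has no gemma3_text entry. Below is a minimal stdlib-only sketch of that registry pattern; every name in it is illustrative, not optimum's actual internals, and it assumes the gemma3_text text backbone exposes the same config attributes as gemma (which would need to be verified against the real config classes).

```python
# Sketch of the registry pattern behind the KeyError above: a dict maps
# model_type -> config class, and unregistered types raise an error.
# All names here are hypothetical stand-ins, not optimum's real API.

class NormalizedConfig:
    """Placeholder for a per-architecture attribute-name mapping."""

# The supported-model-types table (one entry shown).
_registry = {"gemma": NormalizedConfig}

def get_normalized_config(model_type: str):
    """Look up the config class, mirroring the error path in the traceback."""
    if model_type not in _registry:
        raise KeyError(f"{model_type} model type is not supported yet")
    return _registry[model_type]

# If gemma3_text really does share gemma's attribute layout, supporting it
# would amount to one new registry entry reusing the gemma config:
_registry["gemma3_text"] = _registry["gemma"]

print(get_normalized_config("gemma3_text").__name__)
```

In the real library there would likely be two such entries to add: one for NormalizedConfig (the first error) and one for the ORTOptimizer's model-type table (the second error).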
