RuntimeError converting Whisper model

### System Info

```shell
optimum==1.27.0
onnx==1.18.0
onnxruntime==1.22.1
transformers==4.53.3
python version 3.10.5
```

### Who can help?

Dear all,
following the example provided at https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model#exporting-a-model-using-past-keysvalues-in-the-decoder , I tried to adapt it to export `hidden_states` instead of past keys/values, as shown in the script below. However, both my version and the version provided in the tutorial fail with the same error: `RuntimeError: number of output names provided (5) exceeded number of outputs (1)`
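One possible reading of the numbers in that error message, as a rough sketch rather than a confirmed diagnosis: the custom encoder config in the script declares one extra output name per encoder layer on top of the base output. The helper `declared_encoder_outputs` is hypothetical (not part of optimum), and the sketch assumes that `openai/whisper-tiny` has 4 encoder layers and that the base encoder config declares a single `last_hidden_state` output.

```python
# Hypothetical helper, not an optimum API: lists the output names that a
# config like the one in this issue would declare for the encoder.
def declared_encoder_outputs(encoder_layers: int) -> list[str]:
    names = ["last_hidden_state"]  # assumed single base encoder output
    # One extra name per layer, mirroring the loop in the custom config.
    names += [f"encoder_hidden_states.{i}" for i in range(encoder_layers)]
    return names


# Assumption: openai/whisper-tiny has 4 encoder layers.
names = declared_encoder_outputs(encoder_layers=4)
traced_outputs = 1  # the output count reported in the error message

print(f"declared names: {len(names)}, traced outputs: {traced_outputs}")
```

If these assumptions hold, the 5 declared names match the "(5)" in the message, which would suggest the encoder is being traced without returning its hidden states. Separately, with `output_hidden_states=True` transformers models generally return one more hidden state than the number of layers (the embedding output plus one per layer), so the loop bound may need adjusting as well.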

### Information

- [x] The official example scripts
- [x] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction (minimal, reproducible, runnable)

```python
from typing import Dict

from optimum.exporters.onnx import main_export
from optimum.exporters.onnx.base import ConfigBehavior
from optimum.exporters.onnx.model_configs import WhisperOnnxConfig
from transformers import AutoConfig


class CustomWhisperOnnxConfig(WhisperOnnxConfig):
    @property
    def outputs(self) -> Dict[str, Dict[int, str]]:
        common_outputs = super().outputs

        if self._behavior is ConfigBehavior.ENCODER:
            for i in range(self._config.encoder_layers):
                common_outputs[f"encoder_hidden_states.{i}"] = {0: "batch_size"}
        elif self._behavior is ConfigBehavior.DECODER:
            for i in range(self._config.decoder_layers):
                common_outputs[f"decoder_hidden_states.{i}"] = {
                    0: "batch_size",
                    2: "decoder_sequence_length",
                    3: "d_model"
                }

        return common_outputs


model_id = "openai/whisper-tiny"
config = AutoConfig.from_pretrained(model_id)

custom_whisper_onnx_config = CustomWhisperOnnxConfig(
    config=config,
    task="automatic-speech-recognition"
)

encoder_config = custom_whisper_onnx_config.with_behavior("encoder")
decoder_config = custom_whisper_onnx_config.with_behavior("decoder", use_past=False)
decoder_with_past_config = custom_whisper_onnx_config.with_behavior("decoder", use_past=True)

custom_onnx_configs = {
    "encoder_model": encoder_config,
    "decoder_model": decoder_config,
    "decoder_with_past_model": decoder_with_past_config,
}

main_export(
    model_id,
    output="custom_whisper_onnx",
    model_kwargs={"output_hidden_states": True},
    custom_onnx_configs=custom_onnx_configs,
)
```
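A side observation on the decoder branch of the script, offered as a sketch under assumptions: hidden states are normally rank-3 tensors shaped `(batch_size, sequence_length, hidden_size)`, so a dynamic-axes mapping that names axis 3 may be out of range. The helper `invalid_axes` below is hypothetical and only checks axis indices against an assumed rank; it does not call optimum or ONNX.

```python
# Hypothetical helper: returns the axis indices that fall outside a tensor
# of the given rank.
def invalid_axes(dynamic_axes: dict[int, str], rank: int) -> list[int]:
    return sorted(axis for axis in dynamic_axes if not 0 <= axis < rank)


# Mapping copied from the decoder branch of the custom config.
decoder_axes = {0: "batch_size", 2: "decoder_sequence_length", 3: "d_model"}

# Assumption: decoder hidden states have rank 3 (batch, sequence, hidden).
bad = invalid_axes(decoder_axes, rank=3)
print("out-of-range axes:", bad)
```

Under that assumption, a mapping such as `{0: "batch_size", 1: "decoder_sequence_length"}` would be the more conventional choice, since the hidden size is fixed by the model configuration.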

### Expected behavior

I expected the export to succeed, so that I could extract the final decoder hidden state when calling `generate` on an `ORTModelForSpeechSeq2Seq` object.

RuntimeError converting Whisper model #2338