Whisper performance on CPU (through Optimum) is very slow: less than 1 token/s in decode for the medium and larger models. This is because it uses the functional `index_put`, which is very slow. We should use the KV cache update logic instead.
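To illustrate why this matters, here is a minimal framework-free sketch of the two update styles. It is not the Optimum or ExecuTorch code: the function names `functional_update` and `inplace_update` are hypothetical, and plain Python lists stand in for the cache tensors. The assumption is that a functional update produces a fresh copy of the cache on every decode step, while a KV cache update writes a single slot of a preallocated buffer.

```python
# Framework-free sketch; lists stand in for KV cache tensors.
# Function names are hypothetical, not part of any library.

def functional_update(cache, pos, value):
    """Out-of-place update: copies the whole cache, then sets one slot."""
    new_cache = list(cache)  # full copy on every decode step
    new_cache[pos] = value
    return new_cache


def inplace_update(cache, pos, value):
    """In-place update: writes one slot of a preallocated buffer."""
    cache[pos] = value
    return cache


max_len = 8  # assumed maximum sequence length for the sketch
functional_cache = [0.0] * max_len
inplace_cache = [0.0] * max_len
original_buffer = inplace_cache  # kept to show the buffer is never replaced

# Simulate four decode steps, each appending one value to the cache.
for pos in range(4):
    value = float(pos + 1)
    functional_cache = functional_update(functional_cache, pos, value)
    inplace_cache = inplace_update(inplace_cache, pos, value)
```

Both styles produce the same cache contents, but the functional version copies all `max_len` slots per generated token, whereas the in-place version touches only one. With a real cache, whose size grows with the number of layers, heads, and the maximum sequence length, that per-token copy could plausibly dominate decode time.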