What's Changed
Inference
- Fix input slots exhaustion in vLLM plugin by @dacorvo in #1028
- Agentic example by @tengomucho in #1030
- perf: move accuracy benchmark to vllm by @dacorvo in #1031
- Add support for Qwen3 embedding model by @dacorvo in #1023
- Update vllm version to 0.11.0 by @dacorvo in #1027
- feat: Add `encode` and `similarity` of Sentence transformers by @JingyaHuang in #1012
Training
- Metrics for training by @michaelbenayoun in #982
- Update `trl` version to the latest release `0.11.4` -> `0.24.0` by @michaelbenayoun in #1000
- Add cache features to the `NeuronTrainer` by @michaelbenayoun in #1026
Other
- Sync with transformers 4.57.1 by @michaelbenayoun in #1016
- ci(vllm): login to docker by @tengomucho in #1010
- Fix small typos by @tengomucho in #1021
- Bump optimum to 2.0 by @JingyaHuang in #1018
- Unpin protobuf version by @JingyaHuang in #1014
- Fixing link in error message by @jimburtoft in #1029
- fix(vllm): fix base_neuron_llm_config fixture by @tengomucho in #1032
Full Changelog: v0.4.1...v0.4.2