[Bug] model.fast_generate() with lora_request fails with TypeError #3677

@jqug

Description

I'm getting this error when calling model.fast_generate() and specifying a LoRA adapter path via the lora_request parameter:

File /venv/main/lib/python3.12/site-packages/unsloth_zoo/vllm_lora_worker_manager.py:147, in WorkerLoRAManager._load_adapter(self, lora_request)
    144         kwargs["embedding_modules"] = self.embedding_modules
    145         kwargs["embedding_padding_modules"] = self.embedding_padding_modules
--> 147     lora = load_method(**kwargs)
    149 except FileNotFoundError as e:
    150     # FileNotFoundError should be raised if both
    151     # - No adapter found to download from huggingface (or in
    152     #       offline mode)
    153     # - No local adapter files found at `lora_request.lora_path`
    154     # For NotFoundError
    155     raise ValueError(
    156         f"Loading lora {lora_request.lora_name} failed: No adapter "
    157         f"found for {lora_path}") from e

TypeError: LoRAModel.from_local_checkpoint() got an unexpected keyword argument 'lora_path'

This seems to break, e.g., the DeepSeek_R1_0528_Qwen3_(8B)_GRPO.ipynb notebook when run locally on an H100 with the latest unsloth and vllm. Here's a minimal repro:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = 'Qwen/Qwen3-0.6B', # can be anything
    max_seq_length = 512,
    load_in_4bit = False,
    fast_inference = True, 
    max_lora_rank = 8,
    gpu_memory_utilization = 0.6, 
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 8,
    target_modules = [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_alpha = 16
)

# Save the empty adapter as a dummy
model.save_lora('saved_lora_adapter')

# Fails with TypeError
outputs = model.fast_generate(
    ['dummy prompt'],
    lora_request=model.load_lora('saved_lora_adapter'),
)

I think the culprit is this change from 3 days ago. The code now does:

 kwargs["lora_path"] = lora_path
# [...]
lora = load_method(**kwargs)

But vllm's LoRAModel.from_local_checkpoint() (source) expects lora_dir, not lora_path.
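To illustrate the mismatch without pulling in vLLM, here is a minimal sketch using a stand-in function with the same first-parameter name as from_local_checkpoint() (lora_dir); the stand-in and the kwargs dict are illustrative, not the actual unsloth_zoo code. Renaming the key to what the callee expects would avoid the TypeError:

```python
# Stand-in for vllm's LoRAModel.from_local_checkpoint(), whose first
# parameter is named `lora_dir` (not `lora_path`).
def from_local_checkpoint(lora_dir, **extra):
    return f"loaded adapter from {lora_dir}"

# Mirrors what vllm_lora_worker_manager.py builds before calling load_method.
kwargs = {"lora_path": "saved_lora_adapter"}

try:
    from_local_checkpoint(**kwargs)  # reproduces the reported TypeError
except TypeError as e:
    print(e)  # ... got an unexpected keyword argument 'lora_path'

# Hypothetical one-line fix: pass the path under the name the callee expects.
kwargs["lora_dir"] = kwargs.pop("lora_path")
print(from_local_checkpoint(**kwargs))
```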

Otherwise, thanks unsloth team for the amazing work 🤩
