Labels: AutoML.NET, enhancement
Description
LoRA fine-tuning is an adapter-based technique for fine-tuning an LLM. It modifies the model architecture by adding small, learnable LoRA layers to the transformer blocks. During fine-tuning, only the LoRA weights are updated while the original LLM weights stay frozen, so it requires much less GPU memory than full fine-tuning. Based on this table, fine-tuning a 7B model in 16-bit precision needs about 16 GB of GPU memory, which fits on an RTX 3090, 4080, or 4090. An even wider range of GPUs can handle 3.8B models such as Phi-3.5-mini.
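To illustrate the idea, here is a minimal sketch of a LoRA adapter wrapping a linear projection, assuming TorchSharp as the tensor backend (which the Microsoft.ML.GenAI packages build on). The `LoraLinear` name, constructor parameters, and initialization choices below are illustrative assumptions, not part of the proposed API.

```csharp
// Minimal LoRA sketch (assumption: TorchSharp backend). Not the proposed API surface.
using TorchSharp;
using TorchSharp.Modules;
using static TorchSharp.torch;

public sealed class LoraLinear : nn.Module<Tensor, Tensor>
{
    private readonly Linear _base;   // frozen pretrained projection W
    private readonly Linear _loraA;  // trainable low-rank down-projection A (d_in -> r)
    private readonly Linear _loraB;  // trainable low-rank up-projection   B (r -> d_out)
    private readonly double _scaling;

    public LoraLinear(Linear baseLayer, int rank, double alpha) : base(nameof(LoraLinear))
    {
        _base = baseLayer;
        var outFeatures = baseLayer.weight!.shape[0];
        var inFeatures = baseLayer.weight!.shape[1];

        _loraA = nn.Linear(inFeatures, rank, false);
        _loraB = nn.Linear(rank, outFeatures, false);
        _scaling = alpha / rank;

        // B starts at zero so the adapter initially contributes nothing.
        nn.init.zeros_(_loraB.weight!);

        // Freeze the pretrained weights; only the LoRA matrices receive gradients.
        foreach (var p in _base.parameters())
            p.requires_grad_(false);

        RegisterComponents();
    }

    public override Tensor forward(Tensor input)
    {
        // y = W x + (alpha / r) * B(A(x))
        return _base.forward(input) + _loraB.forward(_loraA.forward(input)) * _scaling;
    }
}
```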
API design (wip)
Package: Microsoft.ML.GenAI.Lora
```csharp
interface ICausalLMLoraPipeline {} // pipeline for loading causal LM + lora layers

class LoraConfiguration {} // lora configuration
```
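To make the `LoraConfiguration` part of the proposal more concrete, the sketch below guesses at the usual LoRA hyperparameters it might carry (rank, alpha, dropout, target modules). Every member name and default here is an assumption, not a settled part of the Microsoft.ML.GenAI.Lora surface.

```csharp
// Hypothetical shape only: member names, types, and defaults are assumptions.
public class LoraConfiguration
{
    public int Rank { get; set; } = 8;          // r: rank of the low-rank update
    public double Alpha { get; set; } = 16;     // scaling numerator (effective scale = Alpha / Rank)
    public double Dropout { get; set; } = 0.0;  // dropout applied to the LoRA path

    // Which submodules to wrap with LoRA layers, e.g. the attention projections.
    public string[] TargetModules { get; set; } = { "q_proj", "k_proj", "v_proj", "o_proj" };
}
```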