Labels: AutoML.NET, enhancement
Description
LoRA fine-tuning is an adapter-based technique for fine-tuning an LLM. It modifies the model architecture by adding small, learnable LoRA layers to the transformer blocks. During fine-tuning, only the LoRA weights are updated while the original LLM weights stay frozen, so it requires much less GPU memory than full fine-tuning. Based on this table, fine-tuning a 7B model in 16-bit precision needs about 16 GB of GPU memory, which fits on an RTX 3090, 4080, or 4090. An even wider range of GPUs can handle 3.8B models such as Phi-3.5-mini.
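To illustrate the idea, here is a minimal sketch of a LoRA adapter wrapping a linear projection, assuming TorchSharp as the tensor backend (which the Microsoft.ML.GenAI packages build on). The `LoraLinear` name, constructor parameters, and initialization choices below are illustrative assumptions, not part of the proposed API.

```csharp
// Minimal LoRA sketch (assumption: TorchSharp backend). Not the proposed API surface.
using TorchSharp;
using TorchSharp.Modules;
using static TorchSharp.torch;

public sealed class LoraLinear : nn.Module<Tensor, Tensor>
{
    private readonly Linear _base;   // frozen pretrained projection W
    private readonly Linear _loraA;  // trainable low-rank down-projection A (d_in -> r)
    private readonly Linear _loraB;  // trainable low-rank up-projection   B (r -> d_out)
    private readonly double _scaling;

    public LoraLinear(Linear baseLayer, int rank, double alpha) : base(nameof(LoraLinear))
    {
        _base = baseLayer;
        var outFeatures = baseLayer.weight!.shape[0];
        var inFeatures = baseLayer.weight!.shape[1];

        _loraA = nn.Linear(inFeatures, rank, false);
        _loraB = nn.Linear(rank, outFeatures, false);
        _scaling = alpha / rank;

        // B starts at zero so the adapter initially contributes nothing.
        nn.init.zeros_(_loraB.weight!);

        // Freeze the pretrained weights; only the LoRA matrices receive gradients.
        foreach (var p in _base.parameters())
            p.requires_grad_(false);

        RegisterComponents();
    }

    public override Tensor forward(Tensor input)
    {
        // y = W x + (alpha / r) * B(A(x))
        return _base.forward(input) + _loraB.forward(_loraA.forward(input)) * _scaling;
    }
}
```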
API design (wip)
Package: Microsoft.ML.GenAI.Lora
```csharp
interface ICausalLMLoraPipeline {} // pipeline for loading causal LM + lora layers

class LoraConfiguration {} // lora configuration
```
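To make the `LoraConfiguration` part of the proposal more concrete, the sketch below guesses at the usual LoRA hyperparameters it might carry (rank, alpha, dropout, target modules). Every member name and default here is an assumption, not a settled part of the Microsoft.ML.GenAI.Lora surface.

```csharp
// Hypothetical shape only: member names, types, and defaults are assumptions.
public class LoraConfiguration
{
    public int Rank { get; set; } = 8;          // r: rank of the low-rank update
    public double Alpha { get; set; } = 16;     // scaling numerator (effective scale = Alpha / Rank)
    public double Dropout { get; set; } = 0.0;  // dropout applied to the LoRA path

    // Which submodules to wrap with LoRA layers, e.g. the attention projections.
    public string[] TargetModules { get; set; } = { "q_proj", "k_proj", "v_proj", "o_proj" };
}
```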