TR2023-150
LoDA: Low-Dimensional Adaptation of Large Language Models
- "LoDA: Low-Dimensional Adaptation of Large Language Models", Advances in Neural Information Processing Systems (NeurIPS) workshop, December 2023.
- @inproceedings{Liu2023dec,
- author = {Liu, Jing and Koike-Akino, Toshiaki and Wang, Pu and Brand, Matthew and Wang, Ye and Parsons, Kieran},
- title = {LoDA: Low-Dimensional Adaptation of Large Language Models},
- booktitle = {Advances in Neural Information Processing Systems (NeurIPS) workshop},
- year = 2023,
- month = dec,
- url = {https://www.merl.com/publications/TR2023-150}
- }
Abstract:
Parameter-Efficient Fine-Tuning (PEFT) has recently garnered significant attention due to the enormous size of Large Language Models (LLMs). Among various PEFT methods, Low-Rank Adaptation (LoRA) demonstrates performance comparable to full fine-tuning despite having significantly fewer trainable parameters. In this work, we first generalize LoRA from a low-rank linear adaptation/mapping to a low-dimensional, non-linear adaptation/mapping, called Low-Dimensional Adaptation (LoDA). We further propose LoDA+, which improves the expressiveness of the non-linear adaptation while using almost the same number of tunable parameters as LoRA. Both LoDA and LoDA+ include LoRA as a special case. To improve computational efficiency at inference, we further propose R-LoDA(+) and S-LoDA(+), which replace the pretrained weight matrix by its low-rank or sparse approximation, kept frozen during fine-tuning. Empirical evaluations on Natural Language Generation tasks show that LoDA(+) and some variants outperform LoRA as well as other baselines. We will release a package that facilitates the integration of LoDA(+) and their variants with PyTorch models.
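
To make the idea concrete, below is a minimal sketch of how a low-dimensional, non-linear adapter could wrap a frozen pretrained linear layer in PyTorch. The class and argument names (LoDALinear, rank, alpha, activation) are illustrative assumptions, not the interface of the authors' released package; choosing the identity activation recovers the low-rank linear update of LoRA.

# Illustrative sketch of a LoDA-style adapter in PyTorch.
# Names (LoDALinear, rank, alpha, activation) are hypothetical; this is not
# the authors' released implementation. With activation = nn.Identity(),
# the update reduces to a standard low-rank (LoRA-style) linear adaptation.
import torch
import torch.nn as nn


class LoDALinear(nn.Module):
    """Frozen pretrained linear layer plus a low-dimensional, non-linear adapter."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0,
                 activation: nn.Module = nn.Identity()):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # pretrained weights stay frozen
            p.requires_grad = False

        in_f, out_f = base.in_features, base.out_features
        self.down = nn.Linear(in_f, rank, bias=False)  # project to a low-dimensional space
        self.act = activation                          # non-linearity applied in that space
        self.up = nn.Linear(rank, out_f, bias=False)   # project back to the output space
        self.scale = alpha / rank

        nn.init.kaiming_uniform_(self.down.weight, a=5 ** 0.5)
        nn.init.zeros_(self.up.weight)                 # adapter starts as a no-op

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen pretrained mapping plus the trainable low-dimensional adaptation.
        return self.base(x) + self.scale * self.up(self.act(self.down(x)))


if __name__ == "__main__":
    layer = LoDALinear(nn.Linear(768, 768), rank=8, activation=nn.GELU())
    y = layer(torch.randn(2, 16, 768))
    print(y.shape)  # torch.Size([2, 16, 768])

In this sketch only the two small projection matrices (and the non-linearity, if parameterized) are trained, which is what keeps the number of tunable parameters close to that of LoRA.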