Material

Course: https://www.deeplearning.ai/short-courses/finetuning-large-language-models/

Why finetuning

什么是微调? 微调是将让模型具备特定回复风格和领域能力，例如chatgpt的问答回复，github copilot的代码能力等。 alt text

Prompt Engineering vs. Finetuning

alt text

Prompt Engineering

Pros

不需要训练数据
更少的预付成本
不需要掌握底层原理
可以使用RAG技术获得额外的数据 Cons
不能适配更多的数据
数据遗忘
幻觉
RAG召回的质量影响效果
Finetuning
Pros
需要尽可能多的数据
可以掌握最新产生的信息（新闻、前沿知识等）
调整之前测试出现的错误信息
如果能训练出小的模型最终的成本降低
同样能够使用RAG Cons
需要更高质量的数据
计算成本高
需要知道的底层逻辑很多，微调技术，数据处理等。

Benefits of finetuning your own LLM

Performance

降低幻觉
增强一致性（公司价值观，产品策略等）
减少不想要模型输出的信息 Privacy
作为预制软件，私有化
防止数据泄漏
不会侵权 Cost
降低每次的请求
增加透明性
更好的控制 Reliability
实时控制
更低的时延
节制（话题相关）

Examples

baseModel和chatModel(finetuned) 的区别，没有finetune前的模型回答的效果会比较奇怪，容易出现重复的话，而且对话的风格也不像人人对话一般。

Where finetuning fits in

Pretraining

Model at start
- zero knowledge about the world
- can not form English words
Next token prediction
Giant corpus of text data
Oftern scraped from the internet: “unlabeled”
Self-supervised learning
After Training
- learns language
- learns knowledge
  What is “data scraped from the internet”
Often not publicized how to pretrain
Open-source pretraining data: “The Pile”
Expensive & time-consuming to train
Limitations of pretrained base models
base 模型的训练数据：
```
Geography Homework
Name:
Q: What's the capital of India?
Q: What's the capital of Kenya?
Q: What's the capital of France?
```
输入：What is the capital of Mexico? 输出：What is the capital of Hungary?

base模型在没有进行数据微调的情况下，回答效果和预训练数据的质量和特征有很大关系。

Finetuning after pretraining

Corpus Texts –> Pretraining –> Base Model Corpus Texts –> Pretraining –> Base Model –> High quality data –> Finetuning –> Finetuned Model Finetuning 通常需要：

自监督的无标注数据
标注数据
不需要预训练时那么多数据
可以作为后续被使用（翻译，chatbots，codepilot等） Finetuning for generative tasks is not well-defined:
更新整个模型
训练目标：next token prediction
What is finetuning doing for you?
这里的base model，相当于一个具有很多知识的大脑，但是他并不知道该怎么输出这些信息，所以微调这个过程就是要让模型知道什么时候应该输出什么样的知识，这是一个Gain Knowledge通过Behavior Change让模型能够真正的被使用。
First time finetuning
1. Identify task(s) by prompt-engineering a large LLM
2. Fine tasks that you see an LLM doing ~ok at
3. Pick one tsk
4. Get ~1000 inputs and outputs for the task
1. Better that the ~OK from the LLM
  1. Finetune a small LLM on this data
    Examples
    这里的例子主要是构造训练数据，需要通过构造特定的输入，让模型能够学会我们想要的回答的格式。

Instruction finetuning

Instruction finetuning(reasoning, routing, copilot, caht, agents) is part of finetuning. data: FAQs, Customer support service, etc.

Training Process

Data Prep -> Training -> Evaluation -> Data Prep …

Examples

比较了base model和finetuning model的实际效果。

Data preparation

Garbage In -> Garbage out

Steps

Collect instruction-response pairs
Concatenate pairs
Tokenize: Pad, Truncate
Split into train/test

Example

整个流程跟训练其他的模型并没有什么很大区别。

Evaluation and iteration

Human Evaluation Test Suites Elo Rankings

LLM Benchmarks: Suite of Evaluation Methods

Common LLM benchmarks

ARC is a set of grade-school questions
HellaSwag is a test of common sense
MMLU is a multitask metric covering elementary math, US history, computer science, law and more
TruthfulQA measures a model’s prepensity to reproduce falsehoods commonly found online.

Consideration

Pratical approach to finetuning

Figure out your task
Collect data related to the task’s inputs/outputs
Generate data if you don’t have enough data
Finetune a small model(e.g. 400M-1B)
Vary the amount of data you give the model
Evaluate your LLM to know what’s going well vs. not
Collect more data to improve
Increase task complexity
Increase model size for

Model size x Compute

resource

Parameters Efficient finetuning

finetuning methods

Practice

跑通训练脚本：https://learn.deeplearning.ai/courses/finetuning-large-language-models/lesson/6/training-process

文档信息

本文作者：pnightowl
本文链接：https://pnightowlzy.github.io/2024/07/12/llm-finetuning/
版权声明：自由转载-非商用-非衍生-保持署名（创意共享3.0许可证）

交个朋友

Finetuning Large Language Models 理论介绍

Material

Why finetuning

Prompt Engineering vs. Finetuning

Prompt Engineering

Finetuning

Benefits of finetuning your own LLM

Examples

Where finetuning fits in

Pretraining

What is “data scraped from the internet”

Limitations of pretrained base models

Finetuning after pretraining

What is finetuning doing for you?

First time finetuning

Examples

Instruction finetuning

Training Process

Examples

Data preparation

Steps

Example

Evaluation and iteration

LLM Benchmarks: Suite of Evaluation Methods

Consideration

Model size x Compute

Parameters Efficient finetuning

Practice

文档信息

Search

Table of Contents