Generating instruction datasets using LLMs
Self-Instruct: prone to hallucination
Step 1: extract objective knowledge from the outputs of a small set of expert-written seed data
e.g., case law (external data) is pulled out of a lawyer's argument (output); see the sketch below
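A minimal sketch of Step 1, assuming the OpenAI chat API; the extraction prompt and output parsing are illustrative, not the paper's actual setup:

```python
# Step 1 sketch: extract objective knowledge from a seed output.
# Assumptions: OpenAI Python client; the prompt wording is hypothetical.
from openai import OpenAI

client = OpenAI()

EXTRACT_PROMPT = (
    "Extract the objective legal knowledge (e.g., cited case law, statutes) "
    "from the following expert-written answer. List each fact on its own line.\n\n"
    "Answer:\n{output}"
)

def extract_knowledge(seed_output: str) -> list[str]:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # the notes use GPT-3.5-turbo for Step 1
        messages=[{"role": "user",
                   "content": EXTRACT_PROMPT.format(output=seed_output)}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().splitlines()
```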
Step 2: generate new user instructions and inputs from the extracted external knowledge
analogous to how teachers create exam questions based on a textbook
i.e., it generates exam questions (instructions) and their context (inputs); sketch below
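A matching sketch of Step 2 under the same assumptions; the prompt and the expected reply format are hypothetical:

```python
# Step 2 sketch: turn an extracted knowledge snippet into a new
# (instruction, input) pair, like a teacher writing an exam question
# from a textbook. Prompt and output format are assumed, not the paper's.
from openai import OpenAI

client = OpenAI()

GENERATE_PROMPT = (
    "Using only the legal knowledge below, write one new user instruction "
    "(an exam-style question) and its input context.\n"
    "Format:\nInstruction: ...\nInput: ...\n\nKnowledge:\n{knowledge}"
)

def generate_instruction(knowledge: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4-1106-preview",  # the notes use GPT-4 Turbo for Steps 2 and 4
        messages=[{"role": "user",
                   "content": GENERATE_PROMPT.format(knowledge=knowledge)}],
    )
    return resp.choices[0].message.content
```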
Step 3: handcrafted system instructions
wrote 8 system instructions by hand
they differ so that outputs can be created in various styles, lengths, and formats
Step 4: feed the knowledge, system instruction, user instruction, and input together to the LLM to generate the output
8 outputs for each (user instruction, input) pair, one per system instruction; sketch below
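A sketch of Step 4; the system-instruction texts below are illustrative placeholders, not the paper's handcrafted originals:

```python
# Step 4 sketch: one output per system instruction, so each
# (instruction, input) pair yields 8 outputs.
from openai import OpenAI

client = OpenAI()

SYSTEM_INSTRUCTIONS = [
    "Answer concisely in one paragraph.",
    "Answer step by step as a numbered list.",
    # ... 6 more handcrafted variants (style, length, format)
]

def generate_outputs(knowledge: str, instruction: str, input_text: str) -> list[str]:
    outputs = []
    for system_instruction in SYSTEM_INSTRUCTIONS:
        resp = client.chat.completions.create(
            model="gpt-4-1106-preview",
            messages=[
                {"role": "system", "content": system_instruction},
                {"role": "user",
                 "content": f"Knowledge:\n{knowledge}\n\n"
                            f"Instruction: {instruction}\nInput: {input_text}"},
            ],
        )
        outputs.append(resp.choices[0].message.content)
    return outputs
```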
Similar to knowledge distillation
distills domain knowledge
the sLLM (small LLM) is trained to generate responses based on this indirectly learned knowledge
it also learns the 8 types of system instructions and their corresponding output forms
GPT-3.5-turbo in Step 1
gpt-4-1106-preview for Steps 2 and 4
compared the length distributions of the user instructions, inputs, and outputs (sketch below)
more even than Self-Instruct's
the varied system instructions worked well for diversity
extracting objective knowledge from outputs should keep the model from being limited to particular situations
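A small sketch of the kind of length comparison described above; the dataset format is assumed, and whitespace word counts stand in for tokens:

```python
# Length-distribution sketch: check how evenly instruction, input,
# and output lengths are spread across the dataset.
import statistics

dataset = [  # stand-in examples; the real dataset is assumed
    {"instruction": "Summarize the ruling.", "input": "Case text ...",
     "output": "The court held ..."},
]

for field in ("instruction", "input", "output"):
    lengths = [len(ex[field].split()) for ex in dataset]
    print(field, "mean:", statistics.mean(lengths),
          "stdev:", statistics.pstdev(lengths))
```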
human evaluation on 100 random samples
fine-tuned LLaMA-2-ko 7B on the SELF-EXPERTISE dataset
3 epochs, AdamW, lr 2e-5, batch size 1 per device, max sequence length 1024
hardware: A100 80GB (fine-tuning sketch below)
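A minimal fine-tuning sketch matching the hyperparameters above, assuming Hugging Face transformers, the beomi/llama-2-ko-7b checkpoint, and an already-tokenized train_dataset (all assumptions beyond the notes):

```python
# Fine-tuning sketch: LLaMA-2-ko 7B with the stated settings
# (3 epochs, AdamW, lr 2e-5, batch 1 per device, max len 1024).
# "beomi/llama-2-ko-7b" and train_dataset are assumptions.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "beomi/llama-2-ko-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

args = TrainingArguments(
    output_dir="self-expertise-7b",
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    optim="adamw_torch",  # AdamW
)

# train_dataset: a datasets.Dataset pre-tokenized to max_length=1024 (assumed)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```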
Foundation Models
Instruction-Tuned Models
GPT
Instruction-Tuned Models in the Legal Domain
GPT-4 Evaluation (LLM-as-judge sketch below)
Human Evaluation
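A sketch of a GPT-4-as-judge comparison of the kind listed above; the rubric prompt is an illustrative stand-in, not the paper's:

```python
# GPT-4 evaluation sketch: pairwise comparison of two model answers.
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = (
    "You are comparing two answers to the same legal question.\n"
    "Question: {q}\nAnswer A: {a}\nAnswer B: {b}\n"
    "Which answer is more accurate and helpful? Reply with A, B, or Tie."
)

def judge(q: str, a: str, b: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(q=q, a=a, b=b)}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()
```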
Excessive augmentation from the same seed dataset leads to overfitting on specific knowledge
Expanding the seed data and adding a general-domain dataset should help.
ability to follow instructions
legal domain knowledge
still prone to errors
automatically generates an instruction dataset in a specialized domain
can be extended to generate instruction datasets in other domains
Takeaway: extract knowledge and use it
Keep augmentation moderate, in proportion to the seed data