Contributions
(1) Proposes Self-Instruct: an instruction-tuning approach that keeps the use of human-labeled data to a minimum
(2) Demonstrates the effectiveness of Self-Instruct through instruction-tuning experiments
(3) Releases new instruction-tuning datasets: 52k instructions (the dataset built with (1)) and a set of manually written novel tasks (a new dataset for evaluation)
Composition of the instruction dataset
(1) instruction; a.k.a. task description
(2) input (optional)
(3) output
Instances are generated in this order: (1) generate the instruction, (2) decide whether it is a classification task, (3) generate the input and output
Task instructions are generated from a task pool of 175 tasks (each task contains 1 instruction and 1 instance)
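The three-step order above can be sketched as a small function. This is only an illustration: `generate_task` and its three callable parameters are hypothetical names, and the lambdas in the usage example are stand-ins for calls to a language model, not the paper's actual prompts.

```python
from typing import Callable


def generate_task(
    pool: list[dict],
    propose_instruction: Callable[[list[dict]], str],
    is_classification: Callable[[str], bool],
    make_instances: Callable[[str, bool], list[dict]],
) -> dict:
    """Run the three steps in order and return one task record."""
    # Step 1: propose a new instruction, conditioned on the task pool.
    instruction = propose_instruction(pool)
    # Step 2: decide whether the instruction describes a classification task.
    clf = is_classification(instruction)
    # Step 3: generate input/output instances for the instruction.
    instances = make_instances(instruction, clf)
    return {
        "instruction": instruction,
        "instances": instances,
        "is_classification": clf,
    }


# Usage with stub callables (assumptions standing in for model calls).
task = generate_task(
    pool=[{"instruction": "Translate the sentence into French."}],
    propose_instruction=lambda pool: "Classify the sentiment of the review.",
    is_classification=lambda text: text.lower().startswith("classify"),
    make_instances=lambda text, clf: [{"input": "Great movie!", "output": "positive"}],
)
```

Passing the steps in as callables keeps the ordering explicit while leaving open how each step is actually prompted.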
Shape of a seed task (samples in the final dataset have the same shape)
{"id": "seed_task_0",
"name": "breakfast_suggestion",
"instruction": "Is there anything I can eat for a breakfast that doesn't include eggs, yet includes protein, and has roughly 700-1000 calories?",
"instances": [{"input": "", "output": "Yes, you can have 1 oatmeal banana protein shake and 4 strips of bacon. The oatmeal banana protein shake may contain 1/2 cup oatmeal, 60 grams whey protein powder, 1/2 medium banana, 1tbsp flaxseed oil and 1/2 cup watter, totalling about 550 calories. The 4 strips of bacon contains about 200 calories."}],
"is_classification": false}
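A record of this shape can be flattened into prompt/completion pairs. The sketch below assumes the layout seen in the prompt/completion samples quoted at the end of these notes (an optional `Input:` field, a trailing `Output:`, and an `<|endoftext|>` suffix); `to_prompt_completion` is a hypothetical helper, and the record is a shortened stand-in for the seed task above.

```python
import json


def to_prompt_completion(task: dict) -> list[dict]:
    """Flatten one task record into prompt/completion pairs, one per instance."""
    pairs = []
    for instance in task["instances"]:
        prompt = task["instruction"]
        if instance["input"]:
            # The input field is optional, so it is added only when non-empty.
            prompt += "\n\nInput: " + instance["input"]
        prompt += "\n\nOutput:"
        completion = " " + instance["output"] + "<|endoftext|>"
        pairs.append({"prompt": prompt, "completion": completion})
    return pairs


# Shortened stand-in record (assumption), same fields as the seed task.
record = json.loads(
    '{"id": "seed_task_0", "name": "breakfast_suggestion", '
    '"instruction": "Suggest a breakfast.", '
    '"instances": [{"input": "", "output": "Oatmeal."}], '
    '"is_classification": false}'
)
pairs = to_prompt_completion(record)
```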
8 tasks from the task pool are given as in-context examples to generate a new instruction
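A minimal sketch of that step, assuming the pool is a plain list of instruction strings. The function name `build_instruction_prompt` and the prompt wording are assumptions for illustration, not the exact template used in the paper.

```python
import random


def build_instruction_prompt(pool: list[str], k: int = 8, seed=None) -> str:
    """Sample k instructions from the pool and lay them out as a numbered prompt."""
    rng = random.Random(seed)
    # Sampling without replacement, so the k examples are distinct pool entries.
    examples = rng.sample(pool, k)
    lines = ["Come up with a series of tasks:"]
    for i, instruction in enumerate(examples, start=1):
        lines.append(f"Task {i}: {instruction}")
    # The open slot the model is expected to complete with a new instruction.
    lines.append(f"Task {k + 1}:")
    return "\n".join(lines)


# Usage with a dummy pool of 175 placeholder instructions.
pool = [f"instruction {i}" for i in range(175)]
prompt = build_instruction_prompt(pool, k=8, seed=0)
```

Ending the prompt on an empty numbered slot lets the model's continuation be read back as the next instruction.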
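Examines the scale, diversity, and quality of the data built with the Self-Instruct method
If the instruction dataset could be improved, in which direction?
Experiment that progressively reduces the number of instructions in the Self-Instruct dataset
Performance improves as the number of instructions grows, but it plateaus after 16k (why is no point plotted at 16k?)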
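One way to set up such a size ablation is to draw nested subsets from a single shuffle, so each smaller set is contained in the larger ones. This is a sketch under assumptions: `nested_subsets` is a hypothetical helper and the sizes listed are illustrative, not the sizes used in the paper.

```python
import random


def nested_subsets(dataset: list, sizes: list[int], seed: int = 0) -> dict:
    """Return {size: subset}, where every smaller subset is a prefix of the larger ones."""
    rng = random.Random(seed)
    shuffled = list(dataset)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    # Sizes larger than the dataset are skipped rather than padded.
    return {n: shuffled[:n] for n in sorted(sizes) if n <= len(shuffled)}


# Usage: integer IDs stand in for the 52k generated instructions.
dataset = list(range(52000))
subsets = nested_subsets(dataset, sizes=[100, 800, 6400, 16000, 52000])
```

Using nested prefixes means a change in score between two sizes comes from the added data rather than from a different random draw.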
Limitations & Comment
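- Because the dataset is generated from GPT's knowledge, bias may be amplified
- (Table 2) Only 58% of the outputs were judged to be an appropriate response to the instruction and input; isn't that too low?
- It would help to have material showing how the data differs from the PromptSource and SuperNI datasets. It is unclear which differences make Self-Instruct a better choice than PromptSource or SuperNI
- Confirms that task diversity matters for instruction tuning (however, the importance of the number of instances is covered only by citing prior work)
- However, it also emphasizes that data quality matters
- When scheduling instruction tuning, it seems best to use Self-Instruct data for low-difficulty data and human-written data for high-difficulty data, to get both efficiency and effectiveness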
T0 dataset (PromptSource)
{"prompt": "Make a list of 10 ways to help students improve their study skills.\n\nOutput:",
"completion": " 1. Make a schedule for studying and stick to it.\n2. Study in the same place every time.\n3. Set goals for yourself.\n4. Take breaks when you need them.\n5. Don't cram before an exam.\n6. Get enough sleep.\n7. Eat healthy food.\n8. Exercise regularly.\n9. Find a study partner.\n10. Reward yourself after completing a task.<|endoftext|>"}
SuperNI dataset
{"prompt": "Generate an appropriate title for the given text. The generated title must be short and include the main topic of the text. The preferred titles are under fifteen words.\n\n
Input: Chorley Council has objected to plans for the Cuerden retail park near Bamber Bridge in Lancashire amid concerns it could threaten its \u00a317m town centre redevelopment. The authority is seeking the compensation from the developers and backers Lancashire County Council about the planned 65-hectare retail park. They said the \"money is not justified\". The developers said the site features restaurants and a hotel and would also create 4,500 jobs and could house offices and shops, as well as the Ikea store. But Chorley Council is concerned the plans could drive people away from its own new-look town centre, which will feature a cinema and new high street shops. The planning application will be dealt with by South Ribble Borough Council next month. Chorley Council has written a letter spelling out concerns, including fears about the impact of increased traffic. The council's deputy leader Peter Wilson said the council had been watching the plans \"very closely\" and wanted to \"protect the interests\" of Chorley. He added: \"While we want to see economic growth across Lancashire, we are concerned that the proposals don't properly address the impact that a development of that size could have on Chorley town centre and the traffic and highways surrounding the area. \"For those reasons, we cannot support the proposals as they currently stand.\" The council's objection letter demands \"a financial contribution of \u00a311,520,121.00 to mitigate the impact of the Cuerden development\". A spokesman for the Cuerden Strategic Site developers said: \"We are aware that Chorley Council has objected to the Cuerden application, however it is unclear from their letter what the justification is for doing so. \"Following a detailed and robust assessment, our own professional advisers have concluded that the potential retail impacts on neighbouring areas are acceptable. 
\"Therefore, Chorley Council's request for a significant sum of money by way of mitigation is not justified.\"\n\nOutput:",
"completion": " Chorley Council seeks \u00a311.5m compensation over Ikea retail park plan<|endoftext|>"}