๋ณด๋ฆ„๋‹ฌ ๐ŸŒ #2.2 Machine Learning - Analgesic Recommendation (ENG)

Full Moon (보름달) · November 26, 2020

1. Introduction

The easiest way to relieve menstrual pain is to take an analgesic, typically an over-the-counter drug. Although hundreds of painkillers are on the market, consumers habitually choose familiar ones for lack of information.

Of course, the basic effect is the same, since they are all analgesics. However, symptoms vary, and the effect differs depending on the ingredients of the painkiller. Taking the right painkiller helps the user relieve pain quickly. We aim to recommend an appropriate painkiller based on the symptoms the user enters on the calendar detail page.

2. Datasets


Since no public dataset was accessible, we built a synthetic dataset of 1521 cases after researching the relationship between symptoms and painkillers.

Analgesics fall into two broad groups: acetaminophen and nonsteroidal anti-inflammatory drugs (NSAIDs). The acetaminophen family can burden the liver, while ibuprofen, aspirin, and naproxen in the NSAID family can burden the stomach. Referring to 「알고 먹는 약 모르고 먹는 약」 (김정환) and the Mint Hospital Blog, we established rules for recommending painkillers that prioritize the user's liver and stomach conditions.

We assigned Pill Numbers based on back pain, heartburn, severe pain, menstrual irregularity, swelling, mild fever, headache, abdominal pain, convulsions, psychological symptoms, diarrhea, liver condition, and stomach condition. The symptoms stored in the database, which are entered through the mobile application and NUGU speakers, are in boolean format. Because the dataset is synthetic, the pill recommendation is stored directly as a Pill Number, with no feature engineering step.
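As a rough illustration of how boolean symptom flags could become model input, the sketch below maps one record to a 0/1 vector. The field names and the `toFeatureVector` helper are hypothetical, since the post does not show the actual database schema, and the sketch covers only the symptoms named in the paragraph above.

```typescript
// Hypothetical field names; the real schema is not shown in the post.
const SYMPTOM_KEYS = [
  "backPain", "heartburn", "severePain", "menstrualIrregularity", "swelling",
  "mildFever", "headache", "abdominalPain", "convulsions",
  "psychologicalSymptoms", "diarrhea", "liverCondition", "stomachCondition",
] as const;

type SymptomKey = (typeof SYMPTOM_KEYS)[number];
type SymptomRecord = Record<SymptomKey, boolean>;

// Map each boolean flag to 1 or 0, always in the same key order,
// so every record produces a vector with the same layout.
function toFeatureVector(record: SymptomRecord): number[] {
  return SYMPTOM_KEYS.map((key) => (record[key] ? 1 : 0));
}

// Made-up example record, for illustration only.
const sampleRecord: SymptomRecord = {
  backPain: true,
  heartburn: false,
  severePain: true,
  menstrualIrregularity: false,
  swelling: false,
  mildFever: false,
  headache: true,
  abdominalPain: false,
  convulsions: false,
  psychologicalSymptoms: false,
  diarrhea: false,
  liverCondition: false,
  stomachCondition: true,
};

const sampleFeatures = toFeatureVector(sampleRecord);
// sampleFeatures is [1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
```

Keeping the key order in a single constant means training and prediction cannot silently disagree about which position holds which symptom.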

| Pill Number | Pill Recommendation |
|---|---|
| 1 | 스피드펜 나노, 자이날, 이지엔 |
| 2 | 이지엔6 이브 |
| 3 | 그날엔 |
| 4 | 이즈펜 |
| 5 | 우먼스 타이레놀 |
| 6 | 이지엔6 이브, 이브큐 레이디 |
| 7 | 싸이베린 |
| 8 | 게보린, 펜잘큐, 사리돈 에이 |
| 9 | 탁센 |
| 10 | 부스코판 플러스 |
| 11 | 그날엔 Q |
| 12 | 자이날 |

3. Methodology

Multiclass Classification Algorithm
A classification method, such as softmax regression, that chooses one class out of three or more.
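To make the softmax step concrete, here is a minimal sketch of the function that turns raw class scores into probabilities. It is a generic illustration, not the project's code; the input scores are made up.

```typescript
// Numerically stable softmax: subtracting the largest score before
// exponentiating avoids overflow without changing the result.
function softmax(logits: number[]): number[] {
  const maxLogit = Math.max(...logits);
  const exps = logits.map((z) => Math.exp(z - maxLogit));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

// Three made-up class scores; the output sums to 1 and keeps
// the same ordering as the inputs.
const probs = softmax([2.0, 1.0, 0.1]);
```

The class with the highest probability is taken as the prediction.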

One-Hot Encoding
An encoding that sets exactly one element to 1 (True) and all the others to 0 (False). For example, with five classes, 0 is encoded as [1,0,0,0,0] and 4 as [0,0,0,0,1]. It is used so that various models can be applied, to improve accuracy, and to simplify computation in classification problems.
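The encoding described above can be sketched in a few lines. The `oneHot` helper here is a hypothetical illustration written for this post, not a library call.

```typescript
// Return a vector of length `depth` with a 1 at `index` and 0 elsewhere.
function oneHot(index: number, depth: number): number[] {
  if (!Number.isInteger(index) || index < 0 || index >= depth) {
    throw new RangeError(`index ${index} is outside [0, ${depth})`);
  }
  return Array.from({ length: depth }, (_, i) => (i === index ? 1 : 0));
}

const first = oneHot(0, 5); // [1, 0, 0, 0, 0]
const last = oneHot(4, 5);  // [0, 0, 0, 0, 1]
```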

Our machine learning task is to predict one of 12 Pill Numbers from 14 features, so we judged a multiclass classification algorithm to be appropriate. We also shifted the Pill Numbers from the range 1 to 12 to the range 0 to 11 and one-hot encoded them.
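The label shift and the reverse step at prediction time might look like the sketch below. The function names are hypothetical, and the probability vector is made up; only the 1-to-12 and 0-to-11 ranges come from the text.

```typescript
const NUM_CLASSES = 12;

// Pill Numbers run 1..12, class indices run 0..11.
function pillNumberToClassIndex(pillNumber: number): number {
  if (!Number.isInteger(pillNumber) || pillNumber < 1 || pillNumber > NUM_CLASSES) {
    throw new RangeError(`Pill Number ${pillNumber} is outside 1..${NUM_CLASSES}`);
  }
  return pillNumber - 1;
}

// Pick the class with the highest probability (argmax) and
// convert the index back to a Pill Number.
function predictPillNumber(probabilities: number[]): number {
  let best = 0;
  for (let i = 1; i < probabilities.length; i++) {
    if (probabilities[i] > probabilities[best]) {
      best = i;
    }
  }
  return best + 1;
}

// Made-up model output: most of the probability mass sits at index 4.
const fakeOutput = [0.01, 0.01, 0.02, 0.01, 0.9, 0.01, 0.01, 0.01, 0.005, 0.005, 0.005, 0.005];
const recommended = predictPillNumber(fakeOutput); // 5
```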

We used TensorFlow.js for the machine learning. We split the dataset 70:30 into a training set and a test set, then trained and tested the model.
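A 70:30 split can be sketched as follows. The post does not say whether the data was shuffled or how the boundary was rounded, so the seeded shuffle, the `Math.floor` rounding, and all function names are assumptions made for this illustration.

```typescript
// Small seeded pseudo-random generator so the shuffle is reproducible.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle on a copy of the input.
function shuffled<T>(items: readonly T[], random: () => number): T[] {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Shuffle, then cut at floor(n * trainRatio).
function trainTestSplit<T>(
  items: readonly T[],
  trainRatio: number,
  seed: number,
): { trainSet: T[]; testSet: T[] } {
  const mixed = shuffled(items, seededRandom(seed));
  const cut = Math.floor(mixed.length * trainRatio);
  return { trainSet: mixed.slice(0, cut), testSet: mixed.slice(cut) };
}

// Stand-in for the 1521 cases: just their row indices.
const caseIds = Array.from({ length: 1521 }, (_, i) => i);
const split = trainTestSplit(caseIds, 0.7, 42);
// split.trainSet.length is 1064 and split.testSet.length is 457
```

Shuffling before the cut keeps both sets representative when the rows are stored in some meaningful order, for example grouped by Pill Number.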

4. Evaluation & Analysis

Since our prior research ensured that only relevant features went into the dataset, we decided not to test which features affect the result. We trained on the assumption that all features are important.

The graph above shows the training performance of the multiclass classification model built with TensorFlow.js.

In TensorFlow, an epoch is one full pass over the training set, the batch size is the number of samples processed in one training step, and the number of iterations is the number of batches per epoch. Since our dataset is limited, one epoch is not enough. However, too many epochs can lead to overfitting. So we plotted the loss against the number of epochs and the batch size.
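The relationship between these three terms is simple arithmetic, sketched below with the figures reported in this post (1521 cases, a 70:30 split, batch size 32, 400 epochs). The training-set size of 1064 assumes the split boundary was rounded down, which the post does not state.

```typescript
// One iteration processes one batch; the last batch may be smaller.
function iterationsPerEpoch(numSamples: number, batchSize: number): number {
  return Math.ceil(numSamples / batchSize);
}

const trainSize = Math.floor(1521 * 0.7);            // 1064 (assumed rounding)
const perEpoch = iterationsPerEpoch(trainSize, 32);  // 34
const lastBatchSize = trainSize - (perEpoch - 1) * 32; // 8
const totalIterations = perEpoch * 400;              // 13600
```

Under these assumptions each epoch consists of 33 full batches plus one final batch of 8 samples.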

We achieved a loss of around 0.03 on both the training set and the test set, using 400 epochs and a batch size of 32.
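To give a feel for what a loss near 0.03 means, the sketch below computes categorical cross-entropy for a single example. The post does not name its loss function, so cross-entropy is an assumption here (it is the usual choice for softmax classifiers), and the prediction vector is made up.

```typescript
// Categorical cross-entropy for one example: -sum(target_i * ln(predicted_i)).
// With a one-hot target this reduces to -ln(probability of the true class).
function categoricalCrossEntropy(target: number[], predicted: number[]): number {
  const epsilon = 1e-7; // guards against ln(0)
  let loss = 0;
  for (let i = 0; i < target.length; i++) {
    loss -= target[i] * Math.log(Math.max(predicted[i], epsilon));
  }
  return loss;
}

// One-hot target for class index 2 out of 12 classes.
const target = Array.from({ length: 12 }, (_, i) => (i === 2 ? 1 : 0));
// Made-up prediction: 0.97 on the true class, the rest spread evenly.
const predicted = Array.from({ length: 12 }, (_, i) => (i === 2 ? 0.97 : 0.03 / 11));

const exampleLoss = categoricalCrossEntropy(target, predicted); // about 0.0305
```

Under this assumption, a per-example loss of about 0.03 corresponds to the model placing roughly 97% of its probability on the correct Pill Number.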

5. Conclusion

We achieved the results we aimed for. The Full Moon service is designed to recommend painkillers to users based on the information they enter. By taking the recommended painkiller, users can expect faster relief. If any information is missing, NUGU can obtain it through its slot-filling function.
One limitation is that the current dataset is artificially generated rather than collected from real users. We expect to improve the model as we gather real users' information and feedback.
