[Machine Learning] Identifying Cat Breeds (Data Preparation)

Ronie🌊 · February 2, 2021

Go to the Git repository for this project

Overview
Project flow
Data preparation

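Overview

Train a machine learning model on photos of each cat breed and build a web application that identifies a cat's breed.

The breeds are limited to six: Scottish Fold, Russian Blue, Munchkin, Siamese, Bengal, and Turkish Angora.
A further goal is to show a description or the characteristics of each breed.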

Project flow

Data preparation
- Collect photos of each cat breed with Python web crawling
Modeling
- Prepare the model API with Teachable Machine
  https://teachablemachine.withgoogle.com/train
Using the model
- Apply the model API with JavaScript and HTML

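Data preparation

Overview

Crawl Google Images search results and, to improve the accuracy of the crawled data, run cat face detection on them so that only photos showing exactly one cat are kept as training data.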

Required libraries

opencv
haarcascades/haarcascade_frontalcatface.xml
selenium
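
Before running the scripts it can help to confirm that these pieces are in place. The sketch below is only an illustration: `missing_requirements` is a hypothetical helper, and it assumes the libraries were installed from the `opencv-python` package (imported as `cv2`) and the `selenium` package, with the cascade file at the relative path listed above.

```python
import importlib.util
from pathlib import Path


def missing_requirements(modules=("cv2", "selenium"),
                         cascade_path="haarcascades/haarcascade_frontalcatface.xml"):
    """Return the module names and file paths that cannot be found."""
    # find_spec returns None when a top-level module is not installed
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    # The Haar cascade is a plain XML file that must exist on disk
    if not Path(cascade_path).is_file():
        missing.append(cascade_path)
    return missing
```

An empty list from `missing_requirements()` means both modules and the cascade file were found.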

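Setting the crawl search terms

The search terms are read from a JSON file.
Each breed is searched with both a Korean and an English term to diversify the results.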

jsonReader.py

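import json

def jsonReader():
    # Load the search terms from searchIndexs.json (UTF-8, because it contains Korean text)
    with open('./searchIndexs.json', 'r', encoding="utf-8") as f:
        json_data = json.load(f)
    # The terms are stored as a flat list under cats -> category
    result = json_data['cats']['category']
    print(result)
    return result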

searchIndexs.json

{
    "cats": {
        "category": [
            "스코티쉬 폴드",
            "scottish fold",
            "러시안 블루",
            "russian blue",
            "샴",
            "siamese",
            "뱅갈",
            "bengal",
            "터키쉬 앙고라",
            "turkish angora"
        ]
    }
}
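
The crawler in the next section saves each picture to ./img/<search term>/ and reports a FileNotFoundError when that folder is missing. One way to avoid this is to create one folder per search term up front. This is a minimal sketch, not the author's code: `load_search_terms` and `prepare_output_dirs` are hypothetical names, and the `./img` root is taken from the save path used by the crawler.

```python
import json
from pathlib import Path


def load_search_terms(json_path):
    """Read the flat list of search terms stored under cats -> category."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    return data["cats"]["category"]


def prepare_output_dirs(terms, root="./img"):
    """Create one folder per search term and return the folder paths."""
    dirs = []
    for term in terms:
        target = Path(root) / term
        # exist_ok=True makes the call safe to repeat between runs
        target.mkdir(parents=True, exist_ok=True)
        dirs.append(target)
    return dirs
```

Calling `prepare_output_dirs(load_search_terms("./searchIndexs.json"))` once before crawling should be enough.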

Web crawling

WebScrappingCat.py
- Requires ChromeDriver (chromedriver.exe) to be installed beforehand
  https://chromedriver.chromium.org/downloads

import time
import urllib.request

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Assumption: `options` is defined elsewhere in the original file; default Chrome options stand in here
options = webdriver.ChromeOptions()

def scrapping(key, num):
    """Search Google Images for `key` and save up to `num` results under ./img/<key>/."""
    driver = webdriver.Chrome(options=options)
    driver.get("https://www.google.co.kr/imghp?hl=ko&tab=wi&authuser=0&ogbl")
    elem = driver.find_element(By.NAME, "q")
    elem.send_keys(key)
    elem.send_keys(Keys.RETURN)

    SCROLL_PAUSE_TIME = 1
    # Get scroll height
    last_height = driver.execute_script("return document.body.scrollHeight")
    while True:
        # Scroll down to bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)
        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new was loaded: click the ".mye4qd" button if it is present, otherwise stop
            try:
                driver.find_element(By.CSS_SELECTOR, ".mye4qd").click()
            except Exception:
                break
        last_height = new_height

    # Send a browser-like User-Agent with every download
    opener = urllib.request.build_opener()
    opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1941.0 Safari/537.36')]
    urllib.request.install_opener(opener)

    images = driver.find_elements(By.CSS_SELECTOR, ".rg_i.Q4LuWd")
    count = 1
    images_num = 0
    for image in images:
        images_num += 1
        if images_num > num:
            break
        try:
            # Click the thumbnail to open the preview; wait first if it is not ready yet
            if not image.is_enabled():
                time.sleep(5)
            driver.execute_script("arguments[0].click();", image)
            time.sleep(2)
            try:
                # XPath of the preview image; it depends on the page layout at the time of writing
                imgUrl = driver.find_element(By.XPATH, '//*[@id="Sva75c"]/div/div/div[3]/div[2]/c-wiz/div[1]/div[1]/div/div[2]/a/img').get_attribute("src")
            except NoSuchElementException as e:
                print(e)
                continue
            savePath = "./img/" + key + "/" + str(count) + ".jpg"
            urllib.request.urlretrieve(imgUrl, savePath)
            print("[Saved] " + savePath)
            count = count + 1
        except FileNotFoundError as err:
            # Raised when the ./img/<key>/ folder does not exist
            print(err)
        except Exception as err:
            print("Unexpected error:", err)
    # quit() closes the browser and also ends the chromedriver process
    driver.quit()
    # Return value
    return key
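
The download step above installs a global opener so that `urlretrieve` sends a browser-like User-Agent. As a hedged alternative, the header can be attached to each request instead, which leaves global state alone and also allows a timeout. In this sketch `download_image`, `USER_AGENT`, and the 10-second timeout are illustrative assumptions, not part of the original script.

```python
import shutil
import urllib.request
from pathlib import Path

# Hypothetical header value; any browser-like string can be substituted
USER_AGENT = "Mozilla/5.0"


def download_image(img_url, dest_path, timeout=10):
    """Download `img_url` to `dest_path`, creating the parent folder if needed."""
    request = urllib.request.Request(img_url, headers={"User-Agent": USER_AGENT})
    dest = Path(dest_path)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # urlopen handles http(s) URLs and also data: URLs
    with urllib.request.urlopen(request, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Inside the loop this would stand in for the `urlretrieve` call, for example `download_image(imgUrl, "./img/" + key + "/" + str(count) + ".jpg")`.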

Cat face detection

catFaceRecognition.py

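import cv2
import numpy as np

# Cascade file from the required libraries list
face_cascade = cv2.CascadeClassifier('haarcascades/haarcascade_frontalcatface.xml')
# Assumption: the original values of SF, N and MS are not shown; these are placeholders to tune
SF = 1.1        # scaleFactor
N = 3           # minNeighbors
MS = (50, 50)   # minSize in pixels

def detectCatFace(imgPath):
    # Load the image (np.fromfile + imdecode also works for paths with non-ASCII characters)
    ff = np.fromfile(imgPath, np.uint8)
    img = cv2.imdecode(ff, cv2.IMREAD_COLOR)
    # Reject files that cannot be decoded as an image
    if img is None:
        return False
    # Convert to grayscale
    grayImg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Detect faces
    faces = face_cascade.detectMultiScale(grayImg, scaleFactor=SF, minNeighbors=N, minSize=MS)
    # Accept the image only when exactly one face is detected
    return len(faces) == 1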
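
To apply this check to a whole folder of crawled images, something like the following could be used. It is a sketch under assumptions: `filter_single_cat_images` and the `rejected` subfolder are hypothetical, and the detector is passed in as an argument so that `detectCatFace` (or any other function taking a path and returning True or False) can be plugged in.

```python
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def filter_single_cat_images(folder, detector, rejected_name="rejected"):
    """Move images that `detector` rejects into a subfolder.

    Returns two lists of file names: (kept, rejected).
    """
    folder = Path(folder)
    rejected_dir = folder / rejected_name
    kept, rejected = [], []
    # sorted() builds the full listing first, so moving files is safe while looping
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if detector(str(path)):
            kept.append(path.name)
        else:
            rejected_dir.mkdir(exist_ok=True)
            shutil.move(str(path), str(rejected_dir / path.name))
            rejected.append(path.name)
    return kept, rejected
```

For example, `filter_single_cat_images("./img/siamese", detectCatFace)` would leave only the accepted photos in the folder. Moving rejected files rather than deleting them keeps them available for a manual check.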
