The name of our service is “NUGaller”, a picture-finding service.
These days, with high-quality cameras on our phones and the popularity of SNS applications such as Instagram, people are taking more photos than ever before.
However, given the sheer number of photos we take, it is not easy to organise our photo albums. I am sure that most of you have had the experience of thinking, ‘Where is that photo I took the other day?’ and having to scroll through the endless photos in your album. How convenient would it be to have a service that can find the exact photo in these frustrating moments?
This is why we decided to design a service that allows you to find pictures in an easier and more convenient way. Picture Finder will find pictures according to your specific needs, such as
"show me the pictures of the moon in my phone"
"show me the pictures that I took today"
"show me the pictures with more than 5 people in them"
"show me the pictures I took in November"
Furthermore, we are going to make it possible for Picture Finder to recognize not only objects but also human faces, so that we can ask questions such as
"Find pictures that I took of him or her"
In conclusion, we are going to present a service that lets users manage their own albums through the NUGU AI Speaker.
To use the NUGU AI Speaker, a NUGU application is needed, so we developed our own application for users. The application shows the output of the service and acts as an intermediary between the user and the server. It plays three roles. First, it sends images to the server and receives them back using the OkHttp library. Second, it provides an interface to view and manage the pictures in an asynchronous processing environment. Third, it receives a notification from the server via the Firebase Cloud Messaging API when the server finds result images matching the user's request.
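On the wire, the image upload is an HTTP multipart/form-data POST, which is what OkHttp's MultipartBody assembles on the Android side. A minimal sketch of that body (the field name "photo" and the boundary string are illustrative, not the actual server contract):

```python
def build_multipart_body(field, filename, data, boundary):
    """Assemble a multipart/form-data body for a single image upload.

    Mirrors what OkHttp's MultipartBody builds; field/boundary names
    here are hypothetical.
    """
    parts = [
        f"--{boundary}".encode(),
        (f'Content-Disposition: form-data; '
         f'name="{field}"; filename="{filename}"').encode(),
        b"Content-Type: application/octet-stream",
        b"",                          # blank line separates headers from payload
        data,                         # raw image bytes
        f"--{boundary}--".encode(),   # closing boundary
    ]
    return b"\r\n".join(parts)

body = build_multipart_body("photo", "moon.jpg", b"\xff\xd8fakejpeg", "XyZ123")
```

The server reads the image bytes out of this body, while FCM handles the reverse direction: the server pushes a "results ready" message instead of the app polling.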
- Tool = Android studio
- Language = JAVA
- API = Firebase Cloud Messaging (FCM)
Build a server to connect the NUGU AI Speaker with our application. The server plays two roles. First, it executes the Google Vision API to analyze the photos. Second, it retrieves the entity string coming from the NUGU AI Speaker and derives the result by querying the database.
- Database = MariaDB
- Tool = Atom
- Language = Node.js
- API = Google Vision API
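The second role, matching an entity string against stored Vision API tags, reduces to a join between a photos table and a tags table. The schema below is our guess at what the MariaDB tables might look like, and sqlite3 stands in for MariaDB so the sketch is self-contained:

```python
import sqlite3

# Hypothetical schema: one row per photo, one row per (photo, tag) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (id INTEGER PRIMARY KEY, path TEXT);
    CREATE TABLE tags   (photo_id INTEGER, tag TEXT);
""")
conn.executemany("INSERT INTO photos VALUES (?, ?)",
                 [(1, "IMG_001.jpg"), (2, "IMG_002.jpg")])
conn.executemany("INSERT INTO tags VALUES (?, ?)",
                 [(1, "moon"), (1, "night"), (2, "cat")])

def find_by_tag(keyword):
    """Return photo paths whose stored Vision API tags match the entity string."""
    rows = conn.execute(
        "SELECT p.path FROM photos p JOIN tags t ON p.id = t.photo_id "
        "WHERE t.tag = ?", (keyword,))
    return [path for (path,) in rows]
```

In production the same parameterized query runs against MariaDB from Node.js; the structure is identical.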
- NUGU play builder
It supports interworking between the NUGU platform and our service (our application and server) based on NUGU's natural-language understanding. We can start our service through the NUGU AI Speaker, and it provides an appropriate answer to the user's requirements.
- Dataset of Vocabulary list
We are registering keywords for NUGU's intents, so we need data for learning and synonyms to use as tags. Therefore we will use the National Institute of Korean Language's vocabulary data for Korean learners. Because this data is provided as an Excel file, we will extract only the nouns, with their synonyms as keywords, and build databases using a Python data library. Our builders will register NUGU's intents and tags; these intents are invoked when users speak to NUGU speakers.
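A sketch of the noun-extraction step, assuming the Excel sheet has been exported to CSV with word / part-of-speech / synonym columns (the column names and sample rows are our invention, not the institute's actual headers):

```python
import csv
import io

# Tiny stand-in for the exported vocabulary file.
SAMPLE_CSV = """word,pos,synonyms
달,noun,moon;luna
먹다,verb,
고양이,noun,cat
"""

def extract_noun_tags(csv_text):
    """Keep only the nouns; map each keyword to its synonym tags."""
    tags = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["pos"] == "noun":
            # split the synonym field, dropping empties
            tags[row["word"]] = [s for s in row["synonyms"].split(";") if s]
    return tags

noun_tags = extract_noun_tags(SAMPLE_CSV)
```

The resulting keyword-to-synonyms mapping is what the builders register as NUGU intent vocabulary and tags.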
- Dataset for human-face recognition
Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative adversarial networks (GAN):
The dataset consists of 70,000 high-quality PNG images at 1024×1024 resolution and contains considerable variation in terms of age, ethnicity and image background. It also has good coverage of accessories such as eyeglasses, sunglasses, hats, etc. The images were crawled from Flickr, thus inheriting all the biases of that website, and automatically aligned and cropped using dlib. Only images under permissive licenses were collected. Various automatic filters were used to prune the set, and finally Amazon Mechanical Turk was used to remove the occasional statues, paintings, or photos of photos.
What we are going to do is send photos from the Android application to the backend server and save them in the database with their tags, so that when the user asks NUGU to find photos with certain tags, the backend server sends the matching photos to the application.
We are building an Android application to upload photos from the smartphone's local storage to the backend server. The server annotates the photos using the Vision API and sends them back to the application; finally, users can work with the labeled photos through NUGU speakers. For this task, we use Android Studio.
This action finds photos matching the keywords the user wants. When the user says a keyword to the NUGU speaker, the server searches the photos in the album that were analyzed and tagged with the Google Vision API. It then delivers the result via the application.
This action allows the user to create an album. The user can also put the photos found by the find_photo action into the album with the move_photos_to_album action, and check the album in the application. It is a multi-turn action, and its branch, create_receive_name, receives the name of the album.
Moves the photos found by the find_photo action to the album the user wants. It is a multi-turn action, and its branch, move_receive_name, receives the name of the desired album.
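The two album actions reduce to a small amount of server-side state: a name-to-photos mapping, with the album name supplied by the multi-turn branch. A minimal sketch (the class and method names are ours, not the actual server code):

```python
class AlbumManager:
    """In-memory stand-in for the server's album tables."""

    def __init__(self):
        self.albums = {}  # album name -> list of photo ids

    def create_album(self, name):
        # create_album action: the create_receive_name branch supplies `name`
        self.albums.setdefault(name, [])

    def move_photos(self, photo_ids, name):
        # move_photos_to_album action: the move_receive_name branch
        # supplies `name`; photos come from the last find_photo result
        self.create_album(name)
        self.albums[name].extend(photo_ids)
        return len(photo_ids)

mgr = AlbumManager()
mgr.create_album("My Selfies")
moved = mgr.move_photos([101, 102, 103], "My Selfies")
```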
Finds photos by their stored dates. The built-in entities BID_DT_CYEAR, BID_DT_YMONTH, BID_DT_MDAY, and BID_DT_DAY receive the date the user wants, and the action searches the photos stored on the server.
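The date entities combine into a simple filter over each photo's stored date; entities the user did not mention are left unconstrained. A sketch (the entity-to-argument mapping is our assumption):

```python
import datetime

def find_by_date(photos, year=None, month=None, day=None):
    """Filter (photo_id, stored_date) pairs by the supplied date entities.

    year/month/day correspond to BID_DT_CYEAR / BID_DT_YMONTH /
    BID_DT_MDAY; None means the user did not constrain that field.
    """
    matches = []
    for photo_id, stored in photos:
        if year is not None and stored.year != year:
            continue
        if month is not None and stored.month != month:
            continue
        if day is not None and stored.day != day:
            continue
        matches.append(photo_id)
    return matches

photos = [(1, datetime.date(2019, 11, 3)),
          (2, datetime.date(2019, 11, 21)),
          (3, datetime.date(2019, 12, 1))]
november = find_by_date(photos, year=2019, month=11)  # "pictures from November"
```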
Deletes photos found by find_photo.
<Simulation 1 - Photo search>
User: Aria, find moon pictures with NUGU Photo.
NUGU: 23 pictures were found.
OUTPUT: Displays the 23 found pictures on the phone.
<Simulation 2 - Photo search (date-based)>
User: Aria, show me the pictures I took today with NUGU Photo.
NUGU: 10 pictures were found.
OUTPUT: Displays the 10 found pictures on the phone.
<Simulation 3 - Deleting found photos>
User: Aria, find the pictures on my phone with this person's face with NUGU Photo.
NUGU: 35 pictures were found on your phone.
User: Delete all the found pictures.
NUGU: Shall I delete all 35 found pictures?
User: Yes.
NUGU: 35 pictures were deleted.
OUTPUT1: Displays the 35 pictures on the phone.
OUTPUT2: Deletes the 35 found pictures from the album.
<Simulation 4 - Photo search failure>
User: Aria, find cat pictures with NUGU Photo.
NUGU: No pictures were found.
<Simulation 5 – Album creation>
User: Aria, create a new album with NUGU Photo.
NUGU: What should the album be named?
User: Call it "My Selfies".
NUGU: The album has been created.
OUTPUT: Creates an album with that name.
<Simulation 6 – Album organization>
User: Aria, find my selfies with NUGU Photo.
NUGU: 12 pictures were found.
User: Move these pictures to my photo album.
NUGU: The pictures have been moved to that album.
OUTPUT1: Displays the 12 found pictures on the phone.
OUTPUT2: Moves the 12 found pictures to that album.
- Using the OkHttp library for HTTP communication with the server.
- Using a swipe layout through the ButterKnife library to show the refresh screen.
- Using Google's FCM (Firebase Cloud Messaging) service to receive transmission-completion notifications.
- Reference for resolving the HTTP access blocking issue in Android 9.
- Used to build a backend proxy server to communicate with NUGU.
- Reference for Node.js and MariaDB integration.
- Reference for implementing photo upload using the multer module in Node.js.
- Reference for Vision API initial setup and basic testing.
To analyze the accuracy of the service, we tested the recognition rate of NUGaller.
Four main functions of NUGaller were tested: find a picture, delete a picture, create an album, and move a picture.
Evaluation covered the recognition rate (in clean and noisy environments), the misperception rate in proportion to frequency, accuracy, and response speed.
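The recognition rate itself is just the fraction of utterances mapped to the intended action. A sketch of how it can be computed from logged trials (the trial data below is made up for illustration):

```python
def recognition_rate(trials):
    """trials: list of (expected_action, recognized_action) pairs."""
    if not trials:
        return 0.0
    correct = sum(1 for expected, got in trials if expected == got)
    return correct / len(trials)

# Illustrative log: one find_photo utterance misheard in a noisy environment.
noisy_trials = [("find_photo", "find_photo"),
                ("find_photo", "delete_photo"),
                ("create_album", "create_album"),
                ("move_photo", "move_photo")]
rate = recognition_rate(noisy_trials)  # 3 of 4 correct
```

Running the same computation over the clean-environment log and per-action subsets gives the clean/noisy and per-function figures.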
We developed NUGaller, a picture-finding and album-organizing service based on the NUGU AI Speaker. The high quality of the Google Vision API gave us better accuracy than we initially expected.
When we first started, we planned a service for families or groups, because most of NUGU's users are family or group units. We were going to make a public cloud album called 'NUGU Album' for groups using the NUGU AI Speaker, and put the NUGaller service inside it.
However, we scaled down our service because of limited development capabilities and time constraints. If we extend the service in the manner described above, we can probably expect to stimulate wider use of the NUGU speaker.