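I plan to encode the text data with a dense encoder and store the resulting dense embeddings as a field, so that the server DB is built from the 'title', 'text', and 'text_vector' fields.
How to create the dense embeddings and index them will be covered in an upcoming post.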
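To make the three-field layout concrete, here is a minimal sketch of what such an index mapping might look like. The index name `documents`, the vector size 768, and the choice of the `nori` analyzer for both text fields are assumptions for illustration, not settings from this post; the `nori` analyzer only becomes available after the plugin installation described later in this post.

```shell
# Assumption: Elasticsearch is reachable here; override ES_URL if not.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print a mapping with the three planned fields.
# $1: dimension of the dense embedding (depends on the encoder; 768 is only an example).
index_body() {
  cat <<EOF
{
  "mappings": {
    "properties": {
      "title": { "type": "text", "analyzer": "nori" },
      "text": { "type": "text", "analyzer": "nori" },
      "text_vector": { "type": "dense_vector", "dims": $1 }
    }
  }
}
EOF
}

# Create the index. $1: index name (hypothetical), $2: embedding dimension.
create_index() {
  curl -s -X PUT "$ES_URL/$1" \
    -H 'Content-Type: application/json' \
    -d "$(index_body "$2")"
}
```

Calling `create_index documents 768` would send the mapping to the server; `index_body 768` alone just prints the JSON so it can be checked first.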
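Before that, let's look at how to run elasticsearch and Kibana on a server with Docker and install nori, the Korean morphological analyzer plugin.
I used docker-compose because I may move to a multi-node setup later and because I am already familiar with it.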
docker-compose-es-single.yml
version: '3.7'
services:
  fastcampus-es:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.13.2
    container_name: es-singlenode
    environment:
      - node.name=single
      - cluster.name=standalone
      - discovery.type=single-node
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      - elastic
networks:
  elastic:
    driver: bridge
docker-compose-es-multi.yml
version: '3.7'
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.4
    container_name: es01
    environment:
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic
  es02:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.4
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data02:/usr/share/elasticsearch/data
    networks:
      - elastic
  es03:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.4
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data03:/usr/share/elasticsearch/data
    networks:
      - elastic
volumes:
  data01:
    driver: local
  data02:
    driver: local
  data03:
    driver: local
networks:
  elastic:
    driver: bridge
version: "3.7"
services:
  docker-kibana:
    image: docker.elastic.co/kibana/kibana:7.17.4
    container_name: docker-kibana
    environment:
      ELASTICSEARCH_HOSTS: '["http://host.docker.internal:9200"]'
    ports:
      - 5601:5601
    expose:
      - 5601
    restart: always
    network_mode: bridge
In this way, separate yml files can be written for single-node, multi-node, and Kibana.
I ran Docker with a single-node + Kibana setup, so I used the following yml.
docker-compose-es-kibana.yml
version: '3.7'
services:
  es:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.4
    container_name: es-single-node
    environment:
      - node.name=single
      - cluster.name=standalone
      - discovery.type=single-node
    volumes:
      - data:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - es-bridge
  kibana:
    image: docker.elastic.co/kibana/kibana:7.17.4
    container_name: kibana
    ports:
      - 5601:5601
    environment:
      - ELASTICSEARCH_HOSTS=["http://es:9200"]
    depends_on:
      - es
    networks:
      - es-bridge
volumes:
  data:
    driver: local
networks:
  es-bridge:
    driver: bridge
The nori plugin has to be installed on every node you use.
I am running a single node, so I installed it only once.
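For the three-node compose file shown earlier, "every node" would mean repeating the install inside es01, es02, and es03. A minimal sketch of that loop follows, assuming the containers are already running under those names; the `--batch` flag skips the interactive confirmation prompt, and the `DOCKER` variable is a hypothetical hook added here only so the commands can be previewed without running them.

```shell
# Assumption: the docker CLI is on PATH. Set DOCKER=echo to print the commands instead.
DOCKER="${DOCKER:-docker}"

# Install analysis-nori in each container passed as an argument.
# Stops at the first container where the install fails.
install_nori_all() {
  for node in "$@"; do
    $DOCKER exec "$node" bin/elasticsearch-plugin install --batch analysis-nori || return 1
  done
}
```

Running `install_nori_all es01 es02 es03` would install the plugin on all three nodes; each node still needs a restart afterwards before the plugin is loaded.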
docker-compose -f {path to yml file} up
In my case: docker-compose -f docker-compose-es-kibana.yml up
docker exec -it {container name} bash
In my case: docker exec -it es-single-node bash
bin/elasticsearch-plugin install analysis-nori
docker-compose -f {path to yml file} stop
docker-compose -f {path to yml file} up
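After the restart it is worth checking that the plugin was actually loaded. The sketch below assumes Elasticsearch is reachable at http://localhost:9200 without authentication, as in the compose files above; the function names are made up for this example. It lists the installed plugins and sends a sentence through the `nori_tokenizer` via the `_analyze` API.

```shell
# Assumption: port 9200 is published on the host as in the compose files above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the JSON body for the _analyze API.
# Note: $1 is inserted as-is, so it must not contain double quotes or backslashes.
nori_analyze_body() {
  printf '{"tokenizer":"nori_tokenizer","text":"%s"}' "$1"
}

# List installed plugins; analysis-nori should appear for every node.
list_plugins() {
  curl -s "$ES_URL/_cat/plugins?v"
}

# Tokenize a sentence with the nori tokenizer and print the raw JSON response.
analyze_with_nori() {
  curl -s -X POST "$ES_URL/_analyze" \
    -H 'Content-Type: application/json' \
    -d "$(nori_analyze_body "$1")"
}
```

For example, `list_plugins` followed by `analyze_with_nori "동해물과 백두산이"` should return a list of Korean tokens if the installation worked, and an error about an unknown tokenizer if it did not.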