Representing words by their context

YoungJoon Suh · September 27, 2022

Distributional semantics: A word's meaning is given by the words that frequently appear nearby.
"You shall know a word by the company it keeps." (J. R. Firth)
One of the most successful ideas of modern statistical NLP!
When a word w appears in a text, its context is the set of words that appear nearby (within a fixed-size window).
Use the many contexts of w to build up a representation of w.
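The context collection described above can be sketched as follows. This is a minimal illustration (not from the post): for each center word, it gathers the words within a fixed-size window on either side.

```python
# Sketch: collect the fixed-size context windows observed for each word.
def context_windows(tokens, window=2):
    contexts = {}
    for t, center in enumerate(tokens):
        left = tokens[max(0, t - window):t]    # up to `window` words before
        right = tokens[t + 1:t + 1 + window]   # up to `window` words after
        contexts.setdefault(center, []).append(left + right)
    return contexts

text = "the quick brown fox jumps over the lazy dog".split()
windows = context_windows(text, window=2)
print(windows["fox"])  # → [['quick', 'brown', 'jumps', 'over']]
```

Aggregating these windows over a large corpus is what gives each word its distributional representation.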

  1. Word2vec: Overview
    Word2vec is a framework for learning word vectors

Idea:
We have a large corpus ("body") of text.
Every word in a fixed vocabulary is represented by a vector.
Go through each position t in the text, which has a center word c and context ("outside") words o
Use the similarity of the word vectors for c and o to calculate the probability of o given c (or vice versa)
Keep adjusting the word vectors to maximize this probability
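The steps above can be sketched numerically. In skip-gram Word2vec, each word gets a "center" vector v and an "outside" vector u, and P(o | c) is a softmax over the dot products u_o · v_c across the vocabulary. The toy vocabulary and random vectors below are assumptions for illustration, not trained values.

```python
import numpy as np

# Toy setup: random (untrained) center vectors V and outside vectors U.
rng = np.random.default_rng(0)
vocab = ["the", "quick", "brown", "fox", "jumps"]
dim = 8
V = {w: rng.normal(size=dim) for w in vocab}  # center vectors v_w
U = {w: rng.normal(size=dim) for w in vocab}  # outside vectors u_w

def p_outside_given_center(o, c):
    """P(o | c) = softmax over u_w . v_c for all w in the vocabulary."""
    scores = np.array([U[w] @ V[c] for w in vocab])
    exp = np.exp(scores - scores.max())       # numerically stable softmax
    probs = exp / exp.sum()
    return probs[vocab.index(o)]

# The probabilities over all outside words sum to 1 for any center word.
total = sum(p_outside_given_center(w, "fox") for w in vocab)
print(round(total, 6))  # → 1.0
```

Training adjusts V and U by gradient ascent so that observed (center, outside) pairs from the corpus get high probability.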
