Representation Learning: A Review and New Perspectives in 2012

SHIN · August 23, 2023

Representation Learning: A Review and New Perspectives
Yoshua Bengio†, Aaron Courville, and Pascal Vincent†
Department of computer science and operations research, U. Montreal
† also, Canadian Institute for Advanced Research (CIFAR)

In short, this review examines why representation learning, and deep learning in particular, has been so successful.

1. Introduction

An AI must fundamentally understand the world around us, and we argue that this can only be achieved if it can learn to identify and disentangle the underlying explanatory factors hidden in the observed milieu of low-level sensory data. As a result, less human ingenuity is needed to extract features from raw data.

2. WHY SHOULD WE CARE ABOUT LEARNING REPRESENTATIONS?

Representation learning has had a great impact on speech recognition, in both academic and industrial labs. Microsoft released its speech software based on deep learning.
It has also had an impact on object recognition (state-of-the-art results on the MNIST digit classification problem with convolutional networks) and on NLP (word embeddings, which learn a distributed representation for each word).
Representation learning models have also shown strong results in multi-task learning, transfer learning, and domain adaptation, so the strengths of learned representations have been confirmed empirically.

3. WHAT MAKES A REPRESENTATION GOOD?

3.1. Priors for Representation Learning in AI

Smoothness

If $x \approx y$, then $f(x) \approx f(y)$.
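This prior underlies most of machine learning, but on its own it is not enough to get around the curse of dimensionality: the function has to be learned from the data, not recovered by smooth interpolation alone.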
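
To make the limits of the smoothness prior concrete, here is a minimal sketch of a predictor that relies on nothing else: a Gaussian-kernel smoother in plain Python. The function name `kernel_smoother`, the bandwidth, and the toy target `sin` are assumptions chosen for this illustration and do not come from the paper.

```python
import math

def kernel_smoother(x_query, xs, ys, bandwidth=0.3):
    """Predict f(x_query) as a Gaussian-weighted average of the training targets."""
    weights = [math.exp(-((x_query - x) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total

# Toy training set: samples of sin(x) on [0, 3] (an assumed target function).
train_x = [i * 0.25 for i in range(13)]
train_y = [math.sin(x) for x in train_x]

# Inside the sampled region, nearby inputs give nearby outputs.
near = kernel_smoother(1.6, train_x, train_y)

# Far from every training point, the smoother can only echo the closest targets.
far = kernel_smoother(6.0, train_x, train_y)

print(abs(near - math.sin(1.6)), abs(far - math.sin(6.0)))
```

The error is small where training points are dense and large where there are none, which is the sense in which smoothness alone only generalizes locally.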

Multiple explanatory factors

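The data is generated by several different underlying factors, and what is learned about one factor generalizes across many configurations of the others. The parameters can therefore cover a huge number of configurations, going beyond purely local generalization.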
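
A simple counting argument hints at why this matters. The sketch below compares a local (one-hot) code with a distributed binary code using the same number of units; the value of `n_features` is an arbitrary assumption, and the comparison is a simplification of the paper's argument about how many regions of input space a representation can distinguish.

```python
from itertools import product

n_features = 10  # assumed number of learned features (hypothetical)

# Local (one-hot) code: exactly one unit is active, so n units name n regions.
one_hot_codes = [
    tuple(int(i == j) for j in range(n_features)) for i in range(n_features)
]

# Distributed code: every unit can be on or off independently,
# so the same n units can name 2**n configurations.
distributed_codes = list(product((0, 1), repeat=n_features))

print(len(one_hot_codes), len(distributed_codes))  # prints "10 1024"
```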

A hierarchical organization of explanatory factors

More abstract concepts are defined in terms of less abstract ones. This leads to re-use of extracted features and to invariance to most local changes, so that more abstract concepts cover more varied phenomena.

Semi-supervised learning

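With inputs X and a target Y to predict, a subset of the factors that explain the distribution of X also explain much of Y given X. Hence representations that are useful for P(X) tend to be useful when learning P(Y | X), allowing sharing of statistical strength between the unsupervised and supervised learning tasks.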

Shared factors across tasks

With many Y's (targets) of interest, or many learning tasks in general, tasks (e.g., the corresponding P(Y | X, task)) are explained by factors that are shared with other tasks, allowing sharing of statistical strength across tasks.

Manifolds

Probability mass concentrates near regions that have a much smaller dimensionality than the original space where the data lives.
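
As a toy illustration of this idea, the sketch below generates 3-D points from a single latent coordinate plus small noise and measures how much of the total variance lies along one direction. Everything here (the data, the noise level, the use of power iteration) is an assumption made for the example, and it only covers the linear case, whereas the manifold prior in the paper also allows curved, nonlinear manifolds.

```python
import math
import random

rng = random.Random(0)

# Assumed toy data: 3-D points driven by ONE latent coordinate t plus small noise.
direction = [1 / math.sqrt(3)] * 3
points = []
for _ in range(500):
    t = rng.uniform(-1.0, 1.0)
    points.append([t * d + rng.gauss(0.0, 0.05) for d in direction])

# Center the data and build the 3x3 covariance matrix.
n = len(points)
mean = [sum(p[i] for p in points) / n for i in range(3)]
centered = [[p[i] - mean[i] for i in range(3)] for p in points]
cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(3)] for i in range(3)]

# Power iteration: estimate the leading eigenvector of the covariance matrix.
v = [1.0, 0.0, 0.0]
for _ in range(100):
    w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]
top_eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(3)) for i in range(3))

# Fraction of the total variance captured by a single direction.
explained = top_eigenvalue / sum(cov[i][i] for i in range(3))
print(explained)
```

A value of `explained` close to 1 means that, although the points live in three dimensions, almost all of the probability mass sits near a one-dimensional subspace.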

Natural clustering

Different values of categorical variables, such as object classes, are associated with separate manifolds. Local variations on a manifold tend to preserve the category of the data.

Temporal and spatial coherence

Different factors change at different temporal and spatial scales, and many categorical concepts of interest change slowly.

Sparsity

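For any given observation x, only a small fraction of the possible factors are relevant. In the representation, this shows up as features that are often zero, or as the fact that most of the extracted features are insensitive to small variations of x.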
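
One way to picture this is with rectified features that stay at zero unless their detector fires. The sketch below is only an illustration: the helper `sparse_features`, the weight matrix, and the threshold are hypothetical choices, not a method described in the paper.

```python
def sparse_features(x, weights, threshold=0.5):
    """Rectified features: h_k = max(0, w_k . x - threshold)."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) - threshold)
        for row in weights
    ]

# Hypothetical feature detectors (one per row) for a 3-D input.
weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.4, 0.4, 0.0],
    [-1.0, 0.0, 0.0],
]

x = [1.0, 0.1, 0.0]
x_perturbed = [1.02, 0.08, 0.01]  # a small variation of x

h = sparse_features(x, weights)
h_perturbed = sparse_features(x_perturbed, weights)

# A feature that is exactly zero before and after did not react to the change.
unchanged = [a == b == 0.0 for a, b in zip(h, h_perturbed)]
print(h, unchanged)
```

Only the first feature is active for this input, and the other four stay at zero under the perturbation, so the representation is both sparse and mostly insensitive to the small change.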

Simplicity of Factor Dependencies

In good high-level representations, the factors are related to each other through simple, typically linear dependencies.
