YouTube channel: WIRED
Computer Scientist Explains Machine Learning in 5 Levels of Difficulty | WIRED
Hilary Mason
Machine Learning
- when we teach computers to learn patterns from looking at examples in data,
- such that they can recognize those patterns
- and apply them to new things that they haven't seen before.
ML
- a way that we teach computers to learn things about the world
- by looking at patterns and looking at examples of things.
Shows pictures of a dog, a cat, a wolf, a jackal, and a person
→ "Is it a cat or a dog?"
→ Asks for the reason when the answer is something other than dog or cat
ML
- When we teach machines to make guesses about what things are
- based on looking at a lot of different examples.
tests in school
If I gave you ten million animal photos and asked you to sort them into dogs or cats, what would you do? Could you do it quickly? → No
ML
- (Teen) Humans being able to teach machines or robots how to learn themselves.
- (Mason) When we teach machines to learn from data,
- to build a model from that data or a representation of that,
- and then to make a prediction.
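The three steps in Mason's definition (learn from data, build a model, make a prediction) can be sketched with a very small example. This is only an illustration: the hours-studied and test-score numbers are made up, and a straight-line fit stands in for whatever model a real system would use.

```python
def fit_line(xs, ys):
    """Build a model: fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Make a prediction for an input the model has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: hours studied -> test score.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]

model = fit_line(hours, scores)      # learn from data, build a model
prediction = predict(model, 5.0)     # predict for a new example
print(model, prediction)
```

The model here is just two numbers (a slope and an intercept), which is the "representation" of the data that the prediction is made from.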
Spotify - recommendation system
What machines can understand
- The machine can understand whatever we tell it to understand.
- Things like the pitch or the pacing or the tone.
- Sometimes machines can figure out things about music or images or videos that we don't tell it to discover.
Facebook or Instagram use ML to target ads
Algorithms
- (Teen) A set of steps or a process carried out to complete something.
There are things that machines are really great at that humans are actually not great at.
People are really great with only one or two examples of learning something new
Machines are great at predicting based on what they've seen in the past.
An undergraduate who studies Math and Computer Science at New York University
Gmail program
Supervised Learning classic classification approach
- A person would need to think about those features and creatively come up with them
- in an approach we call the kitchen sink approach,
- which is to just try everything you can possibly think of and see what works.
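A rough sketch of the kitchen sink idea, using spam filtering as the example (the notes mention Gmail). Everything here is hypothetical: the emails, the labels, and the three candidate features are invented for illustration. Each feature is scored on the same labeled examples it was tried on, which keeps the sketch short; real work would check on held-out data.

```python
# Hypothetical labeled examples: (email text, is_spam).
EMAILS = [
    ("WIN a FREE prize now!!!", True),
    ("Free money, click here", True),
    ("Meeting moved to 3pm", False),
    ("Lunch tomorrow?", False),
    ("URGENT: claim your free gift", True),
    ("Here are the notes from class!", False),
]

# Candidate features a person came up with; each maps an email to True/False.
FEATURES = {
    "contains_free": lambda text: "free" in text.lower(),
    "has_exclamation": lambda text: "!" in text,
    "longer_than_20_chars": lambda text: len(text) > 20,
}


def accuracy(feature, examples):
    """Fraction of examples where 'feature is True' agrees with the spam label."""
    correct = sum(1 for text, label in examples if feature(text) == label)
    return correct / len(examples)


# Try every feature and see what works.
scores = {name: accuracy(feature, EMAILS) for name, feature in FEATURES.items()}
best_feature = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("best:", best_feature)
```

The labels are what make this supervised: every feature is judged by how well it matches answers a person already provided.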
Unsupervised Learning
- We don't have labeled data and we're trying to infer some structure out of the data
- What you're doing is projecting that data into a space and looking for things like clusters.
- And there's a bunch of really fun math about how you do that, how you think about distance
- and by distance, I mean that if we have two data points in space, how do we decide if they're similar or not?
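To make the distance question concrete, here is a minimal sketch with two common answers (Euclidean distance and cosine distance) followed by a tiny k-means loop that uses distance to find clusters. The points, the starting centers, and the choice of k-means are assumptions made for illustration; the video does not name a specific algorithm.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity: 0 when two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def k_means(points, centers, steps=5):
    """Repeatedly assign each point to its nearest center, then move the centers."""
    clusters = [[] for _ in centers]
    for _ in range(steps):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
            clusters[nearest].append(p)
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


p, q = (3.0, 4.0), (6.0, 8.0)
d_euclid = euclidean(p, q)        # far apart as positions
d_cos = cosine_distance(p, q)     # identical as directions

# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = k_means(points, centers=[(1.0, 1.0), (8.0, 8.0)])
print(d_euclid, d_cos, clusters)
```

The two distance functions disagree about `p` and `q`, which is the point of the question in the notes: whether two data points count as similar depends on how distance is defined.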
How do the algorithms themselves usually differ between unsupervised and supervised learning?
Reinforcement Learning
- You can think about it like a turn in a game
- and you can play, you know, millions and millions of trials
- so that you're able to develop a system
- that by experimenting with reinforcement learning
- can eventually learn to play these games pretty successfully.
- It also thrives in environments where you have a decision point, a palette of actions to choose from.
- It actually comes historically from trying to train a robot to navigate a room.
- If it bonks into this chair, it can't go forward anymore.
- If it keeps exploring, it'll eventually get to the goal.
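The robot-in-a-room picture can be sketched with tabular Q-learning on a small grid. This is one possible illustration, not the method described in the video: the 4x4 room, the chair position, the reward of 1 at the goal, and the learning parameters are all assumptions. The robot explores with random moves, and bumping into a wall or the chair leaves it where it was.

```python
import random

ROWS, COLS = 4, 4
START, GOAL = (0, 0), (3, 3)
CHAIR = (1, 1)  # hypothetical obstacle the robot cannot move into
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Move one cell; bumping into a wall or the chair leaves the robot in place."""
    dr, dc = ACTIONS[action]
    nxt = (state[0] + dr, state[1] + dc)
    if not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS) or nxt == CHAIR:
        nxt = state
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def train(episodes=1000, alpha=0.5, gamma=0.9, max_steps=200, seed=0):
    """Learn action values from many trials of random exploration."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0 for r in range(ROWS) for c in range(COLS) for a in ACTIONS}
    for _episode in range(episodes):
        state = START
        for _t in range(max_steps):
            action = rng.choice(list(ACTIONS))  # keep exploring
            nxt, reward = step(state, action)
            best_next = 0.0 if nxt == GOAL else max(q[(nxt, a)] for a in ACTIONS)
            # Q-learning update: nudge the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if state == GOAL:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START until GOAL or the step limit."""
    state, path = START, [START]
    while state != GOAL and len(path) <= max_steps:
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, _ = step(state, action)
        path.append(state)
    return path


q_table = train()
path = greedy_path(q_table)
print(path)
```

Each call to `step` is one "turn in a game": a decision point, a small set of actions, and a result the system can learn from.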
Deep Learning
- which is essentially using neural networks
- and very large amounts of data to eventually iterate on a network structure that can make predictions.
Is there a situation in which you'd want to use a deep learning algorithm over a reinforcement learning algorithm?
You could build a system that could actually be useless.
A graduate student in the first year of her PhD in Computer Science at Columbia University, studying natural language processing and machine learning
What have you been working on or interested in lately?
What are some of the techniques you're applying to look at that debate data?
In the last few years, we've seen a lot of changes and improvements in the capabilities of NLP systems. So is there anything in that you're particularly excited about exploring further?
Claudia Perlich
(Claudia) What types of biases in the data collection, and then also in usage?
(Mason) When you're collecting data from the real world and then building machine learning systems that automate decisions based on that data,
(Mason) And so, it's not just the provenance of that data, but it's, sort of, deeply understanding, "Why does it look the way it looks? Why was it collected this way? What are the limitations of it?"
(Mason) So things like actuarial science, operations research, where they actually are not using machine learning as much as you might think.
(Claudia) I am somewhat frustrated with a generation of students who have standard data sets that they never think about what the model needs to be used for.