AI-900 Study Notes - Text Analysis

Alpha, Orderly · April 27, 2025

Tokenization

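Break text into tokens.
Tokens can also be generated for partial words, or for combinations of words and punctuation.
Ex. "We choose to go to the moon" breaks down into these distinct tokens (the repeated "to" gets a single entry):

1. we
2. choose
3. to
4. go
5. the
6. moon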
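
A minimal sketch of the idea in Python, assuming the token list above comes from the phrase "We choose to go to the moon". The regex-based `tokenize` and the `build_vocabulary` helper are illustrative names of my own, not part of any Azure API; real tokenizers are considerably more sophisticated.

```python
import re

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_vocabulary(tokens: list[str]) -> dict[str, int]:
    """Give each distinct token an ID in order of first appearance, starting at 1."""
    vocab: dict[str, int] = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab) + 1
    return vocab

tokens = tokenize("We choose to go to the moon")
vocab = build_vocabulary(tokens)
token_ids = [vocab[t] for t in tokens]

print(vocab)      # six distinct tokens, matching the numbered list
print(token_ids)  # the repeated "to" reuses the same ID
```

The sentence has seven words but only six distinct tokens, which is why the numbered list stops at 6.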

Frequency Analysis

Count the number of occurrences of each token.
The most frequent tokens can often provide a clue as to the main subject of the text corpus.
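
A small sketch of frequency analysis using Python's `collections.Counter`. The sample corpus and the `stop_words` set are assumptions made up for illustration; filtering out common words is a typical extra step so that words like "to" do not dominate the counts.

```python
import re
from collections import Counter

def token_frequencies(text: str) -> Counter:
    """Count how often each lowercase word token occurs in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

# Hypothetical two-sentence corpus for illustration.
corpus = "We choose to go to the moon. We choose to go to the moon in this decade."
freq = token_frequencies(corpus)

# Assumed stop-word list: very common words that say little about the subject.
stop_words = {"we", "to", "the", "in", "this"}
content_freq = Counter({t: c for t, c in freq.items() if t not in stop_words})

print(freq["to"])                   # raw count of a common word
print(content_freq.most_common(3))  # the most frequent remaining tokens
```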

Machine learning for text classification

Train a machine learning model that classifies text based on a known set of categories.
Ex. Classifying a sentence as positive or negative
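
To make the positive/negative example concrete, here is a toy classifier in plain Python. It uses multinomial naive Bayes with add-one smoothing, which is my own choice for a short sketch and not necessarily the algorithm the course or any Azure service uses. The four training sentences are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayesTextClassifier:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter()            # label -> number of examples
        self.vocabulary = set()

    def train(self, examples):
        """Learn token counts per label from (text, label) pairs."""
        for text, label in examples:
            self.label_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocabulary.add(token)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # prior
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for token in tokenize(text):
                # Add-one smoothing so unseen tokens do not zero out the score.
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical labeled training data.
training_data = [
    ("The food was great and the service was excellent", "positive"),
    ("I loved the friendly staff", "positive"),
    ("The food was terrible and the service was slow", "negative"),
    ("I hated the rude staff", "negative"),
]

clf = NaiveBayesTextClassifier()
clf.train(training_data)
print(clf.predict("great friendly service"))
print(clf.predict("terrible rude service"))
```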

Semantic language model

Embeddings (encodings of language tokens as vectors)
Each embedding vector acts as a coordinate in a multi-dimensional space, so every token occupies its own position.
Semantically related tokens are grouped closer together.
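
A rough sketch of how "closeness" between embeddings can be measured, using cosine similarity. The three-dimensional vectors below are invented for illustration; real embeddings are learned by a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-D embeddings, hand-picked so related words point the same way.
embeddings = {
    "dog": [10, 3, 2],
    "cat": [10, 3, 1],
    "skateboard": [3, 3, 1],
}

dog_cat = cosine_similarity(embeddings["dog"], embeddings["cat"])
dog_skateboard = cosine_similarity(embeddings["dog"], embeddings["skateboard"])

print(f"dog vs cat:        {dog_cat:.3f}")
print(f"dog vs skateboard: {dog_skateboard:.3f}")
```

With these vectors, "dog" and "cat" score higher than "dog" and "skateboard", which is the grouping behavior described above.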

Language detection

Can return a confidence score of NaN (with the language reported as unknown) when the language is ambiguous or cannot be determined.
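
A hedged sketch of how client code might guard against the NaN case. The dictionary shape (`name`, `iso6391Name`, `confidenceScore`) is an assumption loosely modeled on a language detection result, and `resolve_language` is a hypothetical helper. The key point is that NaN never compares equal to anything, so `math.isnan` is needed rather than `==`.

```python
import math
from typing import Optional

def resolve_language(detection: dict) -> Optional[str]:
    """Return the detected language name, or None when the score is NaN."""
    score = detection.get("confidenceScore", math.nan)
    if math.isnan(score):  # NaN != NaN, so an equality check would not work
        return None
    return detection["name"]

# Hypothetical results: one clear, one ambiguous.
clear = {"name": "English", "iso6391Name": "en", "confidenceScore": 1.0}
ambiguous = {"name": "(Unknown)", "iso6391Name": "(Unknown)", "confidenceScore": math.nan}

for result in (clear, ambiguous):
    language = resolve_language(result)
    print(language if language is not None else "language could not be determined")
```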

Question answering

Create a custom question answering knowledge base

1. Define questions and answers

Using Language Studio's custom question answering, create a project with question-and-answer pairs.
These can be sourced from existing FAQs/web pages, entered manually, or a mix of both.
You can add alternative phrasings to a question (e.g., for "What is your head office location?", add "Where is your head office?") so the system recognizes different ways users might ask the same thing.

2. Test the Project

After creating your question pairs, save the project.
This trains a natural language model to match questions to answers, even if phrased differently.
Then, use the built-in test tool in Language Studio to submit questions and review the generated answers.
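
The define-then-test loop can be imitated with a toy knowledge base. This is only a stand-in: the real service trains a natural language model, whereas this sketch matches questions by simple word overlap (Jaccard similarity). The answers, the second question pair, and the `threshold` value are all hypothetical.

```python
import re

def tokens(text):
    """Set of lowercase word tokens in the text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def jaccard(a, b):
    """Share of tokens the two sets have in common."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

# Each entry holds a question, its alternative phrasings, and one answer.
knowledge_base = [
    {
        "questions": ["What is your head office location?", "Where is your head office?"],
        "answer": "Our head office is in Seattle.",  # hypothetical answer
    },
    {
        "questions": ["What are your opening hours?", "When are you open?"],
        "answer": "We are open 9am to 5pm on weekdays.",  # hypothetical answer
    },
]

def best_answer(query, kb, threshold=0.3):
    """Return the answer whose phrasing overlaps most with the query, if any."""
    query_tokens = tokens(query)
    best_score, best = 0.0, None
    for entry in kb:
        for phrasing in entry["questions"]:
            score = jaccard(query_tokens, tokens(phrasing))
            if score > best_score:
                best_score, best = score, entry["answer"]
    return best if best_score >= threshold else None

# "Testing the project": submit differently phrased questions and review answers.
for question in ["Where is the head office?", "When are you open today?", "How do I reset my password?"]:
    print(question, "->", best_answer(question, knowledge_base))
```

The last question has no overlap with any stored phrasing, so the sketch returns `None` instead of guessing.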

Conversational Language Understanding

Build a language model that interprets the meaning of phrases in a conversational setting.
Ex. "Turn the light off" -> interpret -> switch off the light in the home

HOW?

1. Utterance (spoken word)

Something a user might say that the application must interpret

- Switch the fan off
- Turn on the light

2. Entities

An item to which an utterance refers.

Ex

Switch the fan on. (entity: fan)
Turn on the light. (entity: light)

3. Intents

The purpose or goal expressed in an utterance (e.g., turning a device on or off)
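
The three concepts fit together as labeled training data. Below is a sketch of that structure with Python dataclasses. The intent names `TurnOn` and `TurnOff`, the entity category `Device`, and the class names are assumptions for illustration, not the schema Language Studio actually stores.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Entity:
    category: str  # kind of item, e.g. "Device" (assumed name)
    text: str      # the words in the utterance that refer to it

@dataclass
class LabeledUtterance:
    text: str                       # what the user might say
    intent: str                     # the goal the utterance expresses
    entities: list = field(default_factory=list)

training_utterances = [
    LabeledUtterance("Switch the fan off", "TurnOff", [Entity("Device", "fan")]),
    LabeledUtterance("Turn on the light", "TurnOn", [Entity("Device", "light")]),
]

def entities_are_valid(utterance: LabeledUtterance) -> bool:
    """Every labeled entity must actually appear in the utterance text."""
    return all(e.text in utterance.text for e in utterance.entities)

# Group sample utterances under the intent they illustrate.
by_intent = defaultdict(list)
for utterance in training_utterances:
    by_intent[utterance.intent].append(utterance.text)

print(dict(by_intent))
print(all(entities_are_valid(u) for u in training_utterances))
```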

CLU Authoring, Training, and Predicting

Authoring

Create an authoring resource to start building a CLU model.
Define intents, entities, and sample utterances.
Use prebuilt domains (with predefined intents and entities) or create your own.
You can create intents and entities in any order.
Authoring is easiest via Language Studio (a web-based interface).

Training

After defining intents, entities, and utterances, train the model.
Training teaches the model to match user input to the correct intent and entities.
Training and testing are iterative:
Test → Update → Retrain → Test again until performance is satisfactory.

Predicting

Once satisfied, publish your CLU application to a prediction resource.
Client apps connect to the prediction endpoint with an authentication key.
Predictions (intents and entities) are returned, allowing the app to act accordingly.
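
What "act accordingly" could look like on the client side, as a sketch. The prediction dictionary (`topIntent`, `entities` with `category` and `text`) is an assumed, simplified shape rather than the exact response format, and the network call to the prediction endpoint is deliberately left out. `act_on_prediction` and the device names are hypothetical.

```python
def act_on_prediction(prediction: dict, devices: dict) -> dict:
    """Switch devices on or off based on the predicted intent and entities."""
    intent = prediction["topIntent"]
    for entity in prediction["entities"]:
        if entity["category"] != "Device":
            continue  # ignore entities this app does not handle
        name = entity["text"]
        if intent == "TurnOn":
            devices[name] = True
        elif intent == "TurnOff":
            devices[name] = False
    return devices

# Current state of a hypothetical smart home: True means switched on.
devices = {"fan": True, "light": False}

# Assumed shape of a prediction for the utterance "Switch the fan off".
prediction = {
    "topIntent": "TurnOff",
    "entities": [{"category": "Device", "text": "fan"}],
}

print(act_on_prediction(prediction, devices))
```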

Azure AI Speech

Speech Recognition and Synthesis

Speech Recognition

Converts spoken words (live or recorded) into data, often as text.
Uses models:
- Acoustic model: Maps audio to phonemes.
- Language model: Maps phonemes to words using statistical prediction.
Applications:
- Closed captions
- Transcripts of calls/meetings
- Automated dictation
- Interpreting user input

Speech Synthesis

Converts text into spoken audio.
Requires:
- Text to speak
- Chosen voice characteristics
Process:
- Tokenizes text → assigns phonetic sounds → organizes into prosodic units → generates audio.
Applications:
- Spoken responses to users
- Voice menus
- Reading messages aloud
- Public announcements

Speech to Text

Use Azure AI Speech to transcribe audio (real-time or batch) into text.
Based on Microsoft's Universal Language Model, optimized for conversation and dictation.
Supports custom models (acoustics, language, pronunciation) if needed.

Real-time transcription:

Streams audio from microphone or file.
Returns transcribed text instantly for live scenarios.

Batch transcription:

Works with stored audio files (e.g., Azure storage with SAS URI).
Jobs are processed asynchronously, with no guaranteed start time.

Text to Speech

Converts text into audible speech via API.
Output can be played directly or saved as an audio file.

Speech synthesis voices:

Choose from multiple pre-defined or neural voices (natural intonation).
Supports multiple languages and regional accents.
Option to create and use custom voices.

Translation

Literal and Semantic Translation

Early machine translation used literal word-for-word translation, often causing meaning loss.
AI translation focuses on semantic understanding, considering grammar, context, formality, and colloquialisms for accurate results.

Text and Speech Translation

Text translation:

Translate documents, emails, websites, and social media content.

Speech translation:

Translate spoken language directly (speech-to-speech) or via text (speech-to-text).

Azure AI Translator

Integrates easily into apps, websites, and tools.
Uses Neural Machine Translation (NMT) for context-aware, accurate translations.

Language support:

Supports translation between 130+ languages.
Use ISO 639-1 codes (e.g., en, fr, zh) or extended culture codes (e.g., en-US, fr-CA).
Allows translating from one source language into multiple target languages at once.
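
A sketch of how one source language fans out to several targets in a single request. It only builds the URL and JSON body for the Translator v3 REST `translate` operation and does not send anything; authentication headers (subscription key and region) are omitted. `build_translate_request` is a hypothetical helper name, and the global endpoint shown is an assumption, so check the value for your own resource.

```python
import json
from urllib.parse import urlencode

# Assumed global endpoint; your resource may use a different one.
TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"

def build_translate_request(text: str, source: str, targets: list[str]):
    """Return (url, body) for translating text into every target language."""
    params = [("api-version", "3.0"), ("from", source)]
    params += [("to", target) for target in targets]  # one "to" per target language
    url = f"{TRANSLATOR_ENDPOINT}?{urlencode(params)}"
    body = json.dumps([{"Text": text}])
    return url, body

url, body = build_translate_request("Hello", "en", ["fr", "de"])
print(url)
print(body)
```

Repeating the `to` parameter is what lets a single call return translations for multiple target languages.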

Azure AI Speech

Translates spoken audio from streams (e.g., microphone, audio file) into text or audio.
Enables real-time closed captioning and two-way spoken conversation translation.

Language support:

Translates speech into 90+ languages.
The source language must use an extended culture code (e.g., es-US); target languages use two-letter codes (e.g., en, de).
