Zhengyuan Zhu

Conversational AI Terminology and Learning Map

Chatbot Basics

Advanced Terminology

Common NLP Terms

Word Level

Sentence Level

Document Level

Common Basic Algorithms:

Question Answering (QA)

Reinforcement Learning (RL)

Markov Decision Process (MDP)

POMDP

Image captioning

Phonology

Segmentation

Both Chinese and English have segmentation problems. English words are already separated by spaces, so processing is comparatively easy, but Chinese writing has no word delimiters, so the segmentation problem is more prominent. One common method is dictionary-based longest-string matching, which is said to solve about 85% of cases but struggles when a string can be split in more than one valid way. The other is the statistical machine learning approach, which is the current mainstream.
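As a rough illustration of dictionary-based longest matching, here is a minimal sketch of forward maximum matching. The function name `forward_max_match` and the toy dictionary are assumptions made for this example, not something taken from a particular library.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Greedy longest-match segmentation, scanning left to right."""
    if max_len is None:
        # Longest dictionary entry bounds the candidate window.
        max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when nothing matches.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
toy_dict = {"研究", "研究生", "生命", "的", "起源"}
print(forward_max_match("研究生命的起源", toy_dict))
# ['研究生', '命', '的', '起源']
```

The output shows the ambiguity problem: the greedy match takes 研究生 ("graduate student") first, so it misses the intended split 研究 / 生命 / 的 / 起源 ("study the origin of life").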

Part-of-Speech Tagging (Label)

Machine learning-based methods often need each word to be tagged with its part of speech. The tag represents a hidden state of the word, and the transitions between hidden states form the state transition sequence. For example: Suning.com/n invested/v in/u Inter Milan/n. Here n marks a noun and v marks a verb; n, v, and u are all tags.
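To make the idea of a tag sequence and its transitions concrete, the sketch below reads the `word/tag` notation from the example and counts tag-to-tag transitions. The helper names `parse_tagged` and `tag_transitions` are assumptions for this illustration. A statistical tagger would estimate transition probabilities from counts like these over a large corpus, which this sketch does not attempt.

```python
from collections import Counter


def parse_tagged(tokens):
    """Turn ['word/tag', ...] into [(word, tag), ...]."""
    pairs = []
    for token in tokens:
        # Split on the last slash so words containing '/' stay intact.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


def tag_transitions(pairs):
    """Count how often each tag is followed by each other tag."""
    tags = [tag for _, tag in pairs]
    return Counter(zip(tags, tags[1:]))


tokens = ["Suning.com/n", "invested/v", "in/u", "Inter Milan/n"]
pairs = parse_tagged(tokens)
print([tag for _, tag in pairs])   # ['n', 'v', 'u', 'n']
print(tag_transitions(pairs))
```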

Named Entity Recognition

Named entity recognition is essentially still a tagging problem, only with finer-grained tags. For example: Suning/cmp_s .com/cmp_e is/v B2C/n e-commerce/n. We tag the two parts of Suning.com as cmp_s and cmp_e, marking the start and the end of a company name. With this scheme, when the text contains a longer form such as Suning/Yun/Shang/.com, the whole span can still be identified as a single company name. A coarser scheme that gives every part the same tag, as in Suning/cmp .com/cmp, does not mark where the name begins and ends, so it may run into problems.
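The sketch below shows one way start and end tags could be decoded back into company names. The `cmp_s` and `cmp_e` tags come from the example above; the `cmp_m` tag for tokens in the middle of a name and the function name `extract_companies` are assumptions added for illustration.

```python
def extract_companies(tagged):
    """Collect spans that open with cmp_s and close with cmp_e."""
    entities = []
    current = None  # words of the span being built, or None
    for word, tag in tagged:
        if tag == "cmp_s":
            current = [word]
        elif current is not None and tag in ("cmp_m", "cmp_e"):
            current.append(word)
            if tag == "cmp_e":
                entities.append("".join(current))
                current = None
        else:
            # Any other tag interrupts an unfinished span.
            current = None
    return entities


short = [("Suning", "cmp_s"), (".com", "cmp_e"), ("is", "v"),
         ("B2C", "n"), ("e-commerce", "n")]
# cmp_m is a hypothetical "middle of company name" tag.
longer = [("Suning", "cmp_s"), ("Yun", "cmp_m"),
          ("Shang", "cmp_m"), (".com", "cmp_e")]

print(extract_companies(short))   # ['Suning.com']
print(extract_companies(longer))  # ['SuningYunShang.com']
```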

Syntax Parsing

Syntax parsing has often been implemented as a rule-based expert system. It can also be built with statistical methods, but early parsers were constructed from the knowledge of linguistics experts. The purpose of syntax parsing is to recover the dependency relationships between the components of a sentence, so the final result is usually a parse tree. Syntax parsing addresses a weakness of the traditional bag-of-words model, which ignores word order and context. For example, “Zhang San is Li Si’s leader” and “Li Si is Zhang San’s leader” are identical under a bag-of-words model, but syntax parsing can recover the head-dependent relationship in each sentence and so determine who leads whom.
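A small sketch of the Zhang San / Li Si example: the two sentences produce the same bag of words, while their dependency structures differ. The triples below are written by hand and the relation names `subject` and `possessor` are illustrative assumptions, not the output or label set of any specific parser.

```python
from collections import Counter


def bag_of_words(tokens):
    """Word counts only; order is discarded."""
    return Counter(tokens)


s1 = ["Zhang San", "is", "Li Si", "'s", "leader"]
s2 = ["Li Si", "is", "Zhang San", "'s", "leader"]

# Hand-written (head, relation, dependent) triples for each sentence.
deps1 = {("leader", "subject", "Zhang San"), ("leader", "possessor", "Li Si")}
deps2 = {("leader", "subject", "Li Si"), ("leader", "possessor", "Zhang San")}

print(bag_of_words(s1) == bag_of_words(s2))  # True
print(deps1 == deps2)                        # False
```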

Coreference Resolution (Anaphora Resolution)

Pronouns and other referring expressions appear very frequently in Chinese. Their role is to stand in for names of people, places, and other entities mentioned earlier in the text. For example: “Suning.com is located in Nanjing, and this company is currently in the top three of China’s B2C market.” Here Suning.com is actually mentioned twice, because “this company” refers to Suning.com; by the conventions of Chinese writing, the name itself is not repeated. Coreference resolution is the task of linking such expressions back to the entity they refer to.
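As a toy illustration only, the sketch below links a referring expression to the nearest earlier mention of the same entity type. The mention list, the hand-assigned `type` and `anaphoric` fields, and the function name `resolve_nearest` are all assumptions for this example; real coreference systems use far richer features than this heuristic.

```python
def resolve_nearest(mentions, anaphor_index):
    """Link an anaphor to the closest earlier non-anaphoric
    mention that has the same entity type."""
    anaphor = mentions[anaphor_index]
    for candidate in reversed(mentions[:anaphor_index]):
        if candidate["type"] == anaphor["type"] and not candidate["anaphoric"]:
            return candidate["text"]
    return None  # no compatible antecedent found


# Mentions in text order, with hand-assigned types.
mentions = [
    {"text": "Suning.com", "type": "company", "anaphoric": False},
    {"text": "Nanjing", "type": "place", "anaphoric": False},
    {"text": "this company", "type": "company", "anaphoric": True},
]

print(resolve_nearest(mentions, 2))  # Suning.com
```

Note that the type check is what skips over "Nanjing", which is the closer mention but not a company.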

AI Models

Deep Semantic Similarity Model (DSSM)

Triplet loss

Machine Reading Comprehension (MRC)

Knowledge Base-QA (KBQA)

Knowledge base completion (KBC)

Chatbot Domain Learning Map:

References and Citations

