When processing text, the first step is to identify the words it contains. This involves splitting the input sequence of characters into tokens and then normalizing those tokens into recognizable words.
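
The two steps can be sketched in a few lines of Python. This is only a minimal illustration: the abbreviation table, the punctuation set, and the function names `tokenize` and `normalize` are assumptions made for the example, and a real front end would also need to resolve ambiguous cases (for instance "St." as "street" or "saint").

```python
# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street"}

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Split on whitespace, then strip surrounding punctuation from each token.

    Known abbreviations keep their trailing full stop so that the
    normalization step can still recognize them.
    """
    tokens = []
    for raw in text.split():
        if raw.lower() in ABBREVIATIONS:
            tokens.append(raw)
            continue
        stripped = raw.strip(PUNCTUATION)
        if stripped:
            tokens.append(stripped)
    return tokens


def normalize(tokens):
    """Lowercase every token and expand the abbreviations we know about."""
    return [ABBREVIATIONS.get(t.lower(), t.lower()) for t in tokens]


words = normalize(tokenize("Dr. Smith lives on Elm St."))
print(words)
```

Keeping tokenization and normalization as separate functions mirrors the two-stage description above, so either stage can be replaced without touching the other.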

Handwritten rules
Speakers of a language know a great deal about it. One way to capture and use that knowledge is to write rules by hand.
Finite state transducer
Finite state transducers (FSTs) are versatile tools for transforming an input sequence into an output sequence. They are used in many applications, including converting non-standard words (NSWs) into natural language.
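
To make the idea concrete, here is a minimal sketch of a transducer that reads a digit string (one kind of NSW) and outputs the digits spelled out. The state names, the `transduce` helper, and the digit-by-digit reading are assumptions chosen for this example rather than anything prescribed by the course material; a real front end would use a much richer set of transducers.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def build_digit_fst():
    """Return a transition table: (state, input symbol) -> (next state, output).

    Two states are used so that a space is emitted before every word
    except the first one.
    """
    transitions = {}
    for digit, word in enumerate(DIGIT_WORDS):
        transitions[("start", str(digit))] = ("in_number", word)
        transitions[("in_number", str(digit))] = ("in_number", " " + word)
    return transitions


def transduce(transitions, start, finals, symbols):
    """Run the transducer over `symbols`.

    Returns the concatenated output, or None if the input is rejected
    (no matching transition, or the run ends in a non-final state).
    """
    state = start
    output = []
    for symbol in symbols:
        key = (state, symbol)
        if key not in transitions:
            return None
        state, emitted = transitions[key]
        output.append(emitted)
    return "".join(output) if state in finals else None


fst = build_digit_fst()
print(transduce(fst, "start", {"in_number"}, "42"))  # four two
```

Storing the machine as a plain transition table keeps the states, inputs, and outputs visible, which is the main point of the illustration.
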
Phonemes and allophones
This video introduces the phoneme, a fundamental unit of phonological analysis.

There are two levels of representation in phonology: the surface or allophonic level, which is close to the actual articulation and reflects the phonetic descriptions we've learned, and the underlying or phonemic level, which represents abstract categories based on our perceptual judgments of sound similarity. Both levels use symbols from the IPA. To tell them apart, we write surface forms in square brackets [ ], which can show varying levels of detail, and underlying forms between slashes / /, which only indicate abstract phonemic categories.
It can be challenging to discern the differences at the underlying level, especially in English.


In English, two surface forms can share a single underlying form, whereas in Mapudungun two surface forms correspond to two distinct underlying forms.


Phonologists often express the relationship between a phoneme and its allophones through rules of the general form A → B / C _ D. The arrow is read as "is realized as," and the slash as "in the environment of." The blank (the underscore) marks where the phoneme must appear for the rule to apply. To fully define a phoneme, we must first observe the surface forms and their contexts, then describe the patterns and look for generalizations based on features that these contexts share.
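
A rule of this kind can be applied mechanically, as the following minimal sketch suggests. The function `apply_rule`, the use of "#" as a word-boundary symbol, and the example rule (a simplified version of English aspiration, in which /p/ is realized as [pʰ] at the start of a word) are illustrative assumptions, not the exact example used in the video.

```python
def apply_rule(phonemes, target, realization, left=None, right=None):
    """Apply one rewrite rule: target -> realization / left _ right.

    `phonemes` is a list of underlying symbols. `left` and `right` are
    sets of symbols allowed on either side of the blank; None means
    "any context". "#" stands for a word boundary.
    """
    padded = ["#"] + list(phonemes) + ["#"]
    surface = []
    for i in range(1, len(padded) - 1):
        segment = padded[i]
        left_ok = left is None or padded[i - 1] in left
        right_ok = right is None or padded[i + 1] in right
        if segment == target and left_ok and right_ok:
            surface.append(realization)
        else:
            surface.append(segment)
    return surface


# /p/ -> [pʰ] / # _   (simplified: aspirated only word-initially)
print(apply_rule(["p", "ɪ", "n"], "p", "pʰ", left={"#"}))       # ['pʰ', 'ɪ', 'n']
print(apply_rule(["s", "p", "ɪ", "n"], "p", "pʰ", left={"#"}))  # ['s', 'p', 'ɪ', 'n']
```

The contexts are checked against the underlying form, so the output of one position never influences the decision at another.
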
Pronunciation
Choosing the phoneme inventory is a crucial design decision when developing a TTS or ASR system. The IPA is a useful reference, but a system does not have to follow it, which leaves room for other choices.
Prosody
In text-to-speech systems, prosody can be simplified to the task of predicting pauses, durations, and F0.
Decision tree
Decision trees are effective because they ask simple yes/no questions about the predictors, which makes them suitable for categorical predictors, continuous predictors, or a mix of both.
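
The sketch below shows how such a tree might be represented and queried. The class names, the feature names `punctuation` and `words_since_break`, and the threshold of 8 are hypothetical choices for this illustration; the tree is written by hand here, not learned from data.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class Leaf:
    prediction: Any


@dataclass
class Node:
    feature: str
    value: Any  # a category to match, or a numeric threshold
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]

    def ask(self, example):
        """Answer this node's yes/no question for one example."""
        observed = example[self.feature]
        if isinstance(self.value, (int, float)):
            return observed >= self.value  # continuous predictor
        return observed == self.value      # categorical predictor


def predict(tree, example):
    """Follow the answers from the root down to a leaf."""
    while isinstance(tree, Node):
        tree = tree.yes if tree.ask(example) else tree.no
    return tree.prediction


# Hand-built toy tree: should a pause follow this word?
tree = Node(
    feature="punctuation", value=",",
    yes=Leaf("pause"),
    no=Node(
        feature="words_since_break", value=8,
        yes=Leaf("pause"),
        no=Leaf("no pause"),
    ),
)

print(predict(tree, {"punctuation": ",", "words_since_break": 2}))  # pause
print(predict(tree, {"punctuation": "", "words_since_break": 3}))   # no pause
```

Both kinds of predictor reduce to the same yes/no interface, which is why the traversal code does not need to know which kind it is dealing with.
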
Learning decision trees
After defining the model, the next step is to develop an algorithm that estimates it from data. For decision trees, a straightforward greedy algorithm is used.
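
One way such a greedy procedure might look is sketched below: at each node it tries every available question, keeps the one that most reduces label entropy, splits the data, and recurses. The use of entropy as the impurity measure, the restriction to equality questions on categorical features, and the toy training data are all assumptions made for this illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy (in bits) of a non-empty list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_question(rows, labels, features):
    """Greedy step: pick the (feature, value) question with the lowest
    weighted entropy, or None if no question improves on not splitting."""
    best, best_score = None, entropy(labels)
    for feature in features:
        for value in sorted({row[feature] for row in rows}):
            yes = [l for r, l in zip(rows, labels) if r[feature] == value]
            no = [l for r, l in zip(rows, labels) if r[feature] != value]
            if not yes or not no:
                continue
            score = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (feature, value), score
    return best


def grow(rows, labels, features):
    """Recursively grow a tree; a leaf is simply the majority label."""
    question = best_question(rows, labels, features)
    if question is None:
        return Counter(labels).most_common(1)[0][0]
    feature, value = question
    yes = [i for i, r in enumerate(rows) if r[feature] == value]
    no = [i for i, r in enumerate(rows) if r[feature] != value]
    return {
        "question": question,
        "yes": grow([rows[i] for i in yes], [labels[i] for i in yes], features),
        "no": grow([rows[i] for i in no], [labels[i] for i in no], features),
    }


def classify(tree, row):
    """Walk the learned tree; internal nodes are dicts, leaves are labels."""
    while isinstance(tree, dict):
        feature, value = tree["question"]
        tree = tree["yes"] if row[feature] == value else tree["no"]
    return tree


# Toy, made-up training data: does a pause follow the word?
rows = [
    {"punct": ",", "pos": "noun"},
    {"punct": ".", "pos": "verb"},
    {"punct": "", "pos": "noun"},
    {"punct": "", "pos": "verb"},
]
labels = ["pause", "pause", "no pause", "no pause"]

learned = grow(rows, labels, ["punct", "pos"])
print(classify(learned, {"punct": ",", "pos": "verb"}))  # pause
```

Because each split is chosen only by its immediate gain, the procedure never revisits an earlier decision; that is what makes it greedy, and also why it is not guaranteed to find the best possible tree.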

Summary
Origin: Module 5 speech synthesis – phonemes and the front end
Translate + Edit: YangSier (Homepage)










