SP Modules Review Contents

杨丝儿

Published 2022-11-15 09:12:50

Module 1: Phonetics and Representations of Speech

Systems in Speech Production

Speech production involves three systems in the body: the respiratory system, the phonation system, and the articulation system (Figure 1.2).

The Respiratory System

The respiratory system supplies the air needed to initiate speech sounds (see Figure 1.3). It consists of parts of the body that allow us to breathe, including the lungs, the diaphragm, the muscles of the rib cage, and the abdominal muscles.

The Phonation System

The phonation system comprises the larynx and its internal structure.
- Formed by two major cartilages, the thyroid and the cricoid, the larynx (commonly known as the voice box; the front of the thyroid cartilage forms the Adam’s apple) sits on top of the trachea, or windpipe, a tube held open by rings of cartilage (Figure 1.4a).
- Inside the larynx are the vocal folds (Figure 1.4b, and History Box).
  - They are made up of layers of tissue attached to a pair of arytenoid cartilages at the back end, and to the thyroid cartilage at the front end.
  - Movements of the arytenoids either bring the vocal folds together (adduct) and close off the airflow from the lungs, or move them apart (abduct) to allow the upward flow of air without obstruction.
  - The space between the vocal folds is the glottis.

The Articulation System: The Vocal Tract

Airflow through the glottis is further modified inside the vocal tract, which consists of three main cavities: the pharyngeal cavity, the oral cavity, and the nasal cavity (Figure 1.6).
- The upper surface of the oral cavity contains relatively stationary or passive articulators, including the upper lip, the teeth, the alveolar ridge, the hard palate, the soft palate (also called the velum), and the uvula.
- The lower lip and the tongue are the main mobile or active articulators on the lower surface of the vocal tract.
The tongue is divided into several areas, each of which can be involved in speech production: the tip, blade, front, mid, body, back, and root.
Active articulators move toward passive articulators to form varying degrees of constriction, which shape the airflow before it leaves the vocal tract as distinct speech sounds.

Consonants

Voicing

Voicing occurs when the air from the lungs pushes the closed vocal folds apart, causing them to vibrate.
- Sounds produced with vocal fold vibration are voiced.
- In contrast, voiceless sounds are those made without vibration of the vocal folds.

Place of articulation

Manner

Stops, trills, fricatives, lateral fricatives, approximants.

Vowels

- Monophthong: A vowel produced with the same tongue position throughout the syllable, perceived as having a single quality, such as the <ee> vowel in English ‘bee’.
- Diphthong: A vowel produced with tongue movement from one vowel position to another within the same syllable, perceived as having two qualities, such as the <uy> vowel in English ‘buy’.
- Oral vowels: Vowels produced without velum lowering or nasalization.
- Nasal vowels: Vowels produced with airflow through both the nasal and the oral cavities to signal a meaning contrast with oral vowels, as in French <bon> ‘good’ vs. <beau> ‘beautiful’.

Module 2: Acoustics of Consonants and Vowels

Fundamental Frequency (F0): The lowest frequency component in a complex periodic sound.
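As a rough illustration of this definition, the sketch below builds a complex periodic sound from three sinusoids and picks out the lowest frequency component from its spectrum. The sample rate, component frequencies and amplitudes, the 10% threshold, and all function names are assumptions chosen for the example; the naive DFT is used only to keep the code dependency-free.

```python
import cmath
import math

def synthesize(freqs_hz, amps, sample_rate, n_samples):
    """Sum of sinusoids: a simple complex periodic sound."""
    return [
        sum(a * math.sin(2 * math.pi * f * n / sample_rate)
            for f, a in zip(freqs_hz, amps))
        for n in range(n_samples)
    ]

def dft_magnitudes(signal):
    """Naive DFT magnitude for bins 0 .. N/2 - 1 (fine for short signals)."""
    n = len(signal)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        mags.append(abs(acc) / n)
    return mags

def lowest_component_hz(signal, sample_rate, rel_threshold=0.1):
    """Frequency of the lowest non-DC bin above a relative threshold."""
    mags = dft_magnitudes(signal)
    floor = rel_threshold * max(mags)
    for k, m in enumerate(mags):
        if k > 0 and m >= floor:
            return k * sample_rate / len(signal)
    return None

# Assumed example values: components at 200, 400, and 600 Hz.
SAMPLE_RATE = 8000
signal = synthesize([200.0, 400.0, 600.0], [1.0, 0.5, 0.25], SAMPLE_RATE, 400)
f0_estimate = lowest_component_hz(signal, SAMPLE_RATE)
print(f0_estimate)
```

Picking the lowest spectral peak only works when the fundamental is actually present in the signal; it is a direct reading of the definition above, not a general pitch tracker.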

Resonance: The condition whereby vibrations of a physical body, such as an object or an air column, are amplified by an outside force whose frequency matches the body's natural frequency (sympathetic vibration).
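One way to see this numerically is to model the vibrating body as a driven, lightly damped harmonic oscillator, which is an assumption made here for illustration and not a model taken from the course notes. The steady-state amplitude of such an oscillator is largest when the driving frequency is close to its natural frequency. The 500 Hz natural frequency, the damping value, and the function name are all hypothetical.

```python
import math

def steady_state_amplitude(drive_hz, natural_hz, damping_hz, force_per_mass=1.0):
    """Steady-state amplitude of x'' + g x' + w0^2 x = (F/m) cos(w t)."""
    w = 2 * math.pi * drive_hz
    w0 = 2 * math.pi * natural_hz
    g = 2 * math.pi * damping_hz
    return force_per_mass / math.sqrt((w0 ** 2 - w ** 2) ** 2 + (g * w) ** 2)

# Assumed example values: natural frequency 500 Hz, light damping.
NATURAL_HZ = 500.0
DAMPING_HZ = 20.0

# Sweep the driving frequency and find where the response is largest.
drive_freqs = list(range(100, 1001, 10))
responses = [steady_state_amplitude(f, NATURAL_HZ, DAMPING_HZ) for f in drive_freqs]
peak_hz = drive_freqs[responses.index(max(responses))]
print(peak_hz)
```

With light damping the peak of the sweep sits at the grid point nearest the natural frequency, and the response falls off quickly as the driving frequency moves away from it.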

Module 3: Digital Speech Signals

pass

Module 4: Source-Filter Model

Q: Telephones typically do not transmit frequencies below 300 Hz. How does this affect the listener’s perception of pitch when the speaker has a fundamental frequency of 200 Hz?

A: The perceived pitch still corresponds to 200 Hz, the pitch of the original speech. As we saw in class, even if a high-pass filter removes the frequency range that includes F0, we can (somewhat surprisingly) still perceive the original pitch. This is because the harmonic structure (i.e., the multiples of F0) is still in the signal, and the brain can use it to reconstruct F0. (Module 4 lecture discussion; this is a stretch question!)
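The answer can be checked with a small simulation. The sketch below synthesizes harmonics of a 200 Hz fundamental, drops every component below 300 Hz (an idealized stand-in for the telephone's high-pass behaviour), and then estimates the period with a plain autocorrelation. Autocorrelation is used here only as a simple periodicity detector; it is an assumption of this example, not a claim about how the auditory system works. The sample rate, number of harmonics, search range, and function names are likewise hypothetical.

```python
import math

SAMPLE_RATE = 8000   # Hz, assumed for illustration
F0 = 200.0           # speaker's fundamental, from the question
CUTOFF = 300.0       # lower edge of the telephone band, from the question

def harmonic_complex(f0, n_harmonics, cutoff_hz, sample_rate, n_samples):
    """Sum equal-amplitude harmonics of f0, keeping only those >= cutoff_hz."""
    kept = [k * f0 for k in range(1, n_harmonics + 1) if k * f0 >= cutoff_hz]
    signal = [
        sum(math.sin(2 * math.pi * f * n / sample_rate) for f in kept)
        for n in range(n_samples)
    ]
    return signal, kept

def autocorr_pitch_hz(signal, sample_rate, min_hz=50.0, max_hz=500.0):
    """Return sample_rate / lag for the lag with the largest autocorrelation."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_value = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(signal[n] * signal[n + lag]
                    for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag

filtered, kept = harmonic_complex(F0, 10, CUTOFF, SAMPLE_RATE, 800)
estimate = autocorr_pitch_hz(filtered, SAMPLE_RATE)
print("lowest remaining component:", kept[0], "Hz")
print("estimated periodicity:", estimate, "Hz")
```

In this setup the lowest remaining component is 400 Hz, yet the waveform still repeats every 1/200 s, so the periodicity estimate lands on 200 Hz. That matches the point made in the answer: the spacing of the surviving harmonics still carries F0.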

Originally published on the author's personal site/blog on 2022-10-19.
