Prosody & Rhythm

The musicality of language that text-based models cannot capture: intonation, meter, stress patterns, and speech rhythm, the features that make spoken language fundamentally different from written language.

Prosody Theory

The Syntax-Phonology Interface
Selkirk, 2011 · Handbook of Phonological Theory (Wiley)
The Prosodic Hierarchy in Meter
Hayes, 1989 · Rhythm and meter formalisms
Intonation and Meaning
Ladd, 2012 · Oxford Handbook of Laboratory Phonology
Intonational Phonology
Ladd, 1996/2008 · Cambridge · Autosegmental-metrical theory, the basis of ToBI

Speech Rhythm

Rhythm Class Hypothesis
Ramus et al., 1999 · Stress-timed vs syllable-timed
Calibrating Rhythm: First Language and Second Language Studies
White & Mattys, 2007 · Journal of Phonetics
Timing in Speech: A Multi-level Process
Turk & Shattuck-Hufnagel, 2014

Computational Prosody

ToBI Automatic Annotation
Rosenberg, 2010 · Interspeech · AuToBI tool
Neural Speech Synthesis Survey
arXiv:2106.15561 · TTS and prosody modeling
Tacotron 2: Natural TTS Synthesis
Shen et al., arXiv:1712.05884 · Neural prosody generation
Style Tokens: Unsupervised Style Modeling for TTS
Wang et al., ICML 2018 · Global style tokens
FastSpeech: Fast and Controllable TTS
Ren et al., arXiv:1905.09263 · Duration control

Poetic Meter & Rhythm

Automatic Analysis of Poetic Rhythm
Estes & Hench, COLING 2016
On Stress and Linguistic Rhythm
Liberman & Prince, 1977 · The foundational paper

Prosody & Discourse

Prosody in Conversation
Couper-Kuhlen & Selting (eds.), 1996 · Cambridge
Prosodic Cues to Discourse Structure
Hirschberg & Pierrehumbert, 1986