Our Spanish implementation uses sentence-based, context-dependent accentuation, something that, to our knowledge, no other TTS system does
We propose Sentence-Aware Prosody Modelling (SAPM), which controls the accentuation (stress) of words that can appear in more than one context.
It is important to remember that Spanish has unstressed words, especially among possessive adjectives, demonstratives, prepositions, and articles. For example, we cannot say “lós juegos” instead of “los juegos”. Fortunately, almost all Spanish TTS engines take these unstressed words into account in a basic way. However, some words can appear in more than one context. This is the case of “bajo”, which has four contexts: the bass instrument (tocar el bájo), the adjective “low” (muy bájo, tan bájo), the verb form “I lower” or “I go down” (yo bájo, ya bájo), and the preposition “under” (bajo la luz del sol). In the first three cases, “bajo” is stressed on the “a”; in the last one it is unstressed, which is the default behavior. Another case is “sobre”, with two contexts: the noun “envelope” (el sóbre de papel) and the preposition “about” (hablar sobre ti).
Please note that we deliberately mark stress with acute accents in these examples, even where standard spelling has none, to better illustrate the goal of SAPM. The marked forms reflect how a native speaker would naturally read the text. Some other examples are: una tarde / cada úna, este chico / éste es mi amigo, como estos / yo cómo arroz.
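The idea behind the examples above can be sketched as a small context lookup. This is only an illustrative sketch, not the RHVoice implementation: the rule table `STRESS_RULES`, the function `mark_stress`, and the choice to look only at the preceding word are all assumptions made for this example, and the trigger words cover just the cases mentioned in the text.

```python
# Hypothetical rule table (an assumption for illustration, not RHVoice's rules):
# word -> (spelling with the stress marked, preceding words that trigger stress).
STRESS_RULES = {
    "bajo": ("bájo", {"el", "un", "muy", "tan", "yo", "ya"}),
    "sobre": ("sóbre", {"el", "un"}),
    "una": ("úna", {"cada"}),
    "como": ("cómo", {"yo"}),
}


def mark_stress(sentence: str) -> str:
    """Mark context-dependent stress with an acute accent.

    A word listed in STRESS_RULES is stressed only when the previous word
    is one of its triggers; otherwise it stays unstressed (the default).
    Tokenization is a plain whitespace split, so punctuation is not handled.
    """
    output = []
    previous = None
    for token in sentence.lower().split():
        rule = STRESS_RULES.get(token)
        if rule is not None and previous in rule[1]:
            output.append(rule[0])  # stressed reading
        else:
            output.append(token)    # unstressed or unambiguous word
        previous = token
    return " ".join(output)


if __name__ == "__main__":
    for text in ("tocar el bajo", "bajo la luz del sol", "hablar sobre ti"):
        print(mark_stress(text))
```

A real system would need more than the preceding word (for instance, “este chico” versus “éste es mi amigo” depends on the word that follows), which is why SAPM works at the sentence level rather than word by word.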
We successfully implemented SAPM in RHVoice using a sentence-based backend, and according to Cedillo (2025), “this enriches the form of interpretation and communication of the sentence in TTS while reading texts, something that no TTS system has and, as a consequence, leads the user to confusion”. While Cedillo likes this feature, we also want to hear other people’s opinions to make sure that SAPM is a good approach. To that end, we recently created a form in which you can evaluate audio clips and select the one you consider best.















