The Effects of Modulating Fundamental Frequency and Speech Rate on the Intelligibility, Communication Efficiency, and Perceived Naturalness of Synthetic Speech
Journal title: | American Journal of Speech-Language Pathology |
---|---|
Authors and corporate bodies: | Vojtech, Jennifer M.; Noordzij, Jacob P.; Cler, Gabriel J.; Stepp, Cara E. |
In: | American Journal of Speech-Language Pathology, 28, 2019, 2S, pp. 875-886 |
Format: | E-Article |
Language: | English |
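Published: | American Speech Language Hearing Association |
Subjects: | Speech and Hearing, Linguistics and Language, Developmental and Educational Psychology, Otorhinolaryngology |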
openURL | url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=The+Effects+of+Modulating+Fundamental+Frequency+and+Speech+Rate+on+the+Intelligibility%2C+Communication+Efficiency%2C+and+Perceived+Naturalness+of+Synthetic+Speech&rft.date=2019-07-15&genre=article&issn=1558-9110&volume=28&issue=2S&spage=875&epage=886&pages=875-886&jtitle=American+Journal+of+Speech-Language+Pathology&atitle=The+Effects+of+Modulating+Fundamental+Frequency+and+Speech+Rate+on+the+Intelligibility%2C+Communication+Efficiency%2C+and+Perceived+Naturalness+of+Synthetic+Speech&aulast=Stepp&aufirst=Cara+E.&rft_id=info%3Adoi%2F10.1044%2F2019_ajslp-msc18-18-0052&rft.language%5B0%5D=eng |
SOLR | |
_version_ | 1792343778314420226 |
author | Vojtech, Jennifer M.; Noordzij, Jacob P.; Cler, Gabriel J.; Stepp, Cara E. |
author_facet | Vojtech, Jennifer M.; Noordzij, Jacob P.; Cler, Gabriel J.; Stepp, Cara E. |
author_sort | vojtech, jennifer m. |
container_issue | 2S |
container_start_page | 875 |
container_title | American Journal of Speech-Language Pathology |
container_volume | 28 |
description | Purpose: This study investigated how modulating fundamental frequency (f0) and speech rate differentially impact the naturalness, intelligibility, and communication efficiency of synthetic speech. Method: Sixteen sentences of varying prosodic content were developed via a speech synthesizer. The f0 contour and speech rate of these sentences were altered to produce 4 stimulus sets: (a) normal rate with a fixed f0 level, (b) slow rate with a fixed f0 level, (c) normal rate with prosodically natural f0 variation, and (d) normal rate with prosodically unnatural f0 variation. Sixteen listeners provided orthographic transcriptions and judgments of naturalness for these stimuli. Results: Sentences with f0 variation were rated as more natural than those with a fixed f0 level. Conversely, sentences with a fixed f0 level demonstrated higher intelligibility than those with f0 variation. Speech rate did not affect the intelligibility of stimuli with a fixed f0 level. Communication efficiency was highest for sentences produced at a normal rate and a fixed f0 level. Conclusions: Sentence-level f0 variation increased naturalness ratings of synthesized speech, whether the variation was prosodically natural or not. However, these f0 variations reduced intelligibility. There is evidence of a trade-off in naturalness and intelligibility of synthesized speech, which may impact future speech synthesis designs. Supplemental Material: https://doi.org/10.23641/asha.8847833 |
doi_str_mv | 10.1044/2019_ajslp-msc18-18-0052 |
facet_avail | Online |
finc_class_facet | General, General and comparative linguistics and literary studies, Indo-European studies, non-European languages and literatures, Biology, Psychology, Medicine |
format | ElectronicArticle |
format_de105 | Article, E-Article |
format_de14 | Article, E-Article |
format_de15 | Article, E-Article |
format_de520 | Article, E-Article |
format_de540 | Article, E-Article |
format_dech1 | Article, E-Article |
format_ded117 | Article, E-Article |
format_degla1 | E-Article |
format_del152 | Book |
format_del189 | Article, E-Article |
format_dezi4 | Article |
format_dezwi2 | Article, E-Article |
format_finc | Article, E-Article |
format_nrw | Article, E-Article |
geogr_code | not assigned |
geogr_code_person | not assigned |
id | ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTA0NC8yMDE5X2Fqc2xwLW1zYzE4LTE4LTAwNTI |
imprint | American Speech Language Hearing Association, 2019 |
imprint_str_mv | American Speech Language Hearing Association, 2019 |
institution | DE-14, DE-Ch1, DE-L229, DE-D275, DE-Bn3, DE-Brt1, DE-D161, DE-Gla1, DE-Zi4, DE-15, DE-Pl11, DE-Rs1 |
issn | 1058-0360, 1558-9110 |
issn_str_mv | 1058-0360, 1558-9110 |
language | English |
last_indexed | 2024-03-01T16:57:03.358Z |
match_str | vojtech2019theeffectsofmodulatingfundamentalfrequencyandspeechrateontheintelligibilitycommunicationefficiencyandperceivednaturalnessofsyntheticspeech |
mega_collection | American Speech Language Hearing Association (CrossRef) |
physical | 875-886 |
publishDate | 2019 |
publishDateSort | 2019 |
publisher | American Speech Language Hearing Association |
record_format | ai |
recordtype | ai |
series | American Journal of Speech-Language Pathology |
source_id | 49 |
spelling | Vojtech, Jennifer M. Noordzij, Jacob P. Cler, Gabriel J. Stepp, Cara E. 1058-0360 1558-9110 American Speech Language Hearing Association Speech and Hearing Linguistics and Language Developmental and Educational Psychology Otorhinolaryngology http://dx.doi.org/10.1044/2019_ajslp-msc18-18-0052 Purpose: This study investigated how modulating fundamental frequency (f0) and speech rate differentially impact the naturalness, intelligibility, and communication efficiency of synthetic speech. Method: Sixteen sentences of varying prosodic content were developed via a speech synthesizer. The f0 contour and speech rate of these sentences were altered to produce 4 stimulus sets: (a) normal rate with a fixed f0 level, (b) slow rate with a fixed f0 level, (c) normal rate with prosodically natural f0 variation, and (d) normal rate with prosodically unnatural f0 variation. Sixteen listeners provided orthographic transcriptions and judgments of naturalness for these stimuli. Results: Sentences with f0 variation were rated as more natural than those with a fixed f0 level. Conversely, sentences with a fixed f0 level demonstrated higher intelligibility than those with f0 variation. Speech rate did not affect the intelligibility of stimuli with a fixed f0 level. Communication efficiency was highest for sentences produced at a normal rate and a fixed f0 level. Conclusions: Sentence-level f0 variation increased naturalness ratings of synthesized speech, whether the variation was prosodically natural or not. However, these f0 variations reduced intelligibility. There is evidence of a trade-off in naturalness and intelligibility of synthesized speech, which may impact future speech synthesis designs. Supplemental Material: https://doi.org/10.23641/asha.8847833 The Effects of Modulating Fundamental Frequency and Speech Rate on the Intelligibility, Communication Efficiency, and Perceived Naturalness of Synthetic Speech American Journal of Speech-Language Pathology |
spellingShingle | Vojtech, Jennifer M., Noordzij, Jacob P., Cler, Gabriel J., Stepp, Cara E., American Journal of Speech-Language Pathology, The Effects of Modulating Fundamental Frequency and Speech Rate on the Intelligibility, Communication Efficiency, and Perceived Naturalness of Synthetic Speech, Speech and Hearing, Linguistics and Language, Developmental and Educational Psychology, Otorhinolaryngology |
title | The Effects of Modulating Fundamental Frequency and Speech Rate on the Intelligibility, Communication Efficiency, and Perceived Naturalness of Synthetic Speech |
title_full | The Effects of Modulating Fundamental Frequency and Speech Rate on the Intelligibility, Communication Efficiency, and Perceived Naturalness of Synthetic Speech |
title_fullStr | The Effects of Modulating Fundamental Frequency and Speech Rate on the Intelligibility, Communication Efficiency, and Perceived Naturalness of Synthetic Speech |
title_full_unstemmed | The Effects of Modulating Fundamental Frequency and Speech Rate on the Intelligibility, Communication Efficiency, and Perceived Naturalness of Synthetic Speech |
title_short | The Effects of Modulating Fundamental Frequency and Speech Rate on the Intelligibility, Communication Efficiency, and Perceived Naturalness of Synthetic Speech |
title_sort | the effects of modulating fundamental frequency and speech rate on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech |
title_unstemmed | The Effects of Modulating Fundamental Frequency and Speech Rate on the Intelligibility, Communication Efficiency, and Perceived Naturalness of Synthetic Speech |
topic | Speech and Hearing, Linguistics and Language, Developmental and Educational Psychology, Otorhinolaryngology |
url | http://dx.doi.org/10.1044/2019_ajslp-msc18-18-0052 |