The stimuli were ambiguous idioms used figuratively and literally, and matched novel control phrases. The analysis of the articulatory durations showed a processing advantage for idioms over controls. Further, we found that figurative meanings were articulated somewhat faster than their literal counterparts.

The results suggest that the processing advantage for idioms over control phrases, previously reported in comprehension studies, is also present during their production. Unlike the comprehension idiom literature, however, the two idiom meanings might be processed differently during reading aloud. The study concludes with directions for future research, and a case is made for why this line of research is important for the field of applied linguistics.

Moreover, statistical linguistic features such as connective degree [14], punctuation confidence (PC) [28, 29, 30, 31], and quotation confidence (QC) [30, 31] have been proposed to avoid complex syntactic tree parsing and manual word chunking, both of which are impractical when constructing an unlimited-text MTTS.

This paper focuses on the second problem and extends and elaborates on our previous research pertaining to the PC [28, 29, 30, 31] and QC [30, 31] features. It provides a more substantial analysis and further modeling details to give readers insight into the proposed PC and QC features, the design of which is influenced by automatic Chinese punctuation generation [32] and the linguistic characteristics of the Chinese punctuation system [33].

PC measures the likelihood of inserting a major punctuation mark (MPM) at a word boundary, whereas QC measures the likelihood that a word string is quoted by Chinese quotation marks or brackets to emphasize its meaning. In [32], a maximum-entropy-based automatic Chinese punctuation generation method was proposed to insert 16 types of PMs into unpunctuated text by using word features and lexical-functional grammar features. The results in [32] indicated that the punctuation generation model could generate alternative or acceptable insertions, deletions, or substitutions of PMs.
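To make the idea of a punctuation-confidence score concrete, the sketch below estimates how often an MPM follows a given pair of adjacent POS tags in a tiny hand-made corpus. This count-based estimator is only an illustrative assumption: it is not the maximum-entropy model of [32] nor the model used for the proposed PC, and the `<MPM>` token, the POS tags, and the function names are hypothetical.

```python
from collections import Counter

MPM = "<MPM>"  # hypothetical placeholder token standing for any major punctuation mark


def train_pc(sentences):
    """Count, for each (left POS, right POS) word boundary, how often an MPM was inserted.

    Each sentence is a list of POS tags in which MPM tokens mark punctuated boundaries.
    """
    seen = Counter()      # how often each boundary context occurs
    with_mpm = Counter()  # how often that context carries an MPM
    for tags in sentences:
        words = []
        mpm_after = set()  # indices of words that are followed by an MPM
        for tag in tags:
            if tag == MPM:
                if words:
                    mpm_after.add(len(words) - 1)
            else:
                words.append(tag)
        for i in range(len(words) - 1):
            ctx = (words[i], words[i + 1])
            seen[ctx] += 1
            if i in mpm_after:
                with_mpm[ctx] += 1
    return seen, with_mpm


def punctuation_confidence(ctx, seen, with_mpm, alpha=1.0):
    """Add-alpha smoothed relative frequency of an MPM at the boundary `ctx`."""
    return (with_mpm[ctx] + alpha) / (seen[ctx] + 2 * alpha)


# Toy corpus (made-up data): N = noun, V = verb.
corpus = [
    ["N", "V", MPM, "N", "V"],
    ["N", "V", "N"],
    ["N", "V", MPM, "N"],
]
seen, with_mpm = train_pc(corpus)
pc_vn = punctuation_confidence(("V", "N"), seen, with_mpm)  # (2 + 1) / (3 + 2) = 0.6
```

A boundary that was never observed receives a neutral score of 0.5 under this smoothing, which is one simple way to avoid overconfident predictions on unseen contexts.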

A successful outcome was also obtained in a punctuation experiment involving human readers, reported by Tseng [33], in which different native Mandarin Chinese speakers were found to adopt alternative punctuation strategies. These observations suggest that Chinese PMs serve as a loose reference to the syntactic structure and semantic domain.

Therefore, native Chinese writers can freely use PMs to delimit written Chinese into various linguistic elements, such as sentences, phrases, and clauses, to express the meaning of the text clearly. Consequently, an automatic punctuation generation model that predicts MPMs and is trained on a large text corpus can learn the punctuation strategies of various contributors and thereby provide useful cues for predicting both prosodic breaks [28, 31] and prosodic-acoustic features [29, 30, 31]. Word strings enclosed by brackets or quotes have essential or unique meanings in sentences.

In cases 3 and 4, the quoted word strings, which are called quoted phrases in this paper, may form, from small to large linguistic units, newly derived words, compound words, base phrases, word chunks, syntactic phrases, or sentences. These linguistic units are usually larger than common words, carry more complex meanings than a single word or even new meanings, and may act as higher-level syntactic units than the POSs of individual words.

Because a quoted phrase carries richer linguistic information than individual words, it plays a crucial role in human language understanding during the reading of a text. Moreover, it is generally agreed that a speaker can generate good prosody if they understand the meaning of a text. Thus, adding quotations to plain Chinese text and then regarding the added brackets as linguistic features may enable a system to generate prosody that sounds natural.

Note that in written Chinese, the use of quotations by adding brackets depends on the writing style or habit of the text contributor. Chinese input texts may thus already contain some brackets for the four functions indicated previously. However, the remaining unquoted words may also be emphasized and be regarded as larger syntactic units if they share similar contextual POSs or word structures with the quoted phrases.

For Chinese texts containing no quotations, if quotations can be labeled with brackets automatically by a machine when the word and POS information are given, then the features associated with the labeled brackets could provide richer linguistic information and thus enhance the performance of prosodic-acoustic feature prediction.

The PC can be regarded as a statistical linguistic feature measuring the likelihood of correctly inserting an MPM into a text. It is reasonable to assume that word junctures at which MPMs are more likely to be inserted are also junctures at which pause breaks are more likely. We could, therefore, expect that the use of PC in prosody generation would improve the performance of prosodic-acoustic feature generation. The CRF-based quotation generation model predicts the structure of a quoted word string (hereafter referred to as the quoted phrase, or QP) from the bracket-removed word or POS sequences and calculates the associated confidence, which is referred to as the QC.
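The notion of predicting quoted phrases from bracket-removed sequences can be illustrated with a small data-preparation sketch: brackets are stripped from a tokenized sentence, and their positions are kept as per-word labels that a sequence labeler could be trained to recover. The B/I/O labeling scheme, the bracket characters, and the function name are assumptions made for illustration; the source does not specify how the CRF targets are encoded.

```python
def strip_brackets(tokens, left="「", right="」"):
    """Remove bracket tokens and return (words, labels).

    Labels follow an assumed B/I/O scheme:
      B = first word of a quoted phrase, I = later word inside it, O = outside.
    Nested brackets are not handled in this sketch.
    """
    words, labels = [], []
    inside = False       # currently between a left and a right bracket
    at_start = False     # next word is the first word of a quoted phrase
    for tok in tokens:
        if tok == left:
            inside, at_start = True, True
        elif tok == right:
            inside = False
        else:
            words.append(tok)
            if inside:
                labels.append("B" if at_start else "I")
                at_start = False
            else:
                labels.append("O")
    return words, labels


# Made-up example sentence, already segmented into words.
tokens = ["他", "说", "「", "人工", "智慧", "」", "很", "重要"]
words, labels = strip_brackets(tokens)
# words  -> ["他", "说", "人工", "智慧", "很", "重要"]
# labels -> ["O", "O", "B", "I", "O", "O"]
```

Under this encoding, a model that outputs a probability for each label at each word would also yield a confidence for every candidate quoted phrase, which is the role the QC plays in the text.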

The QC can also be considered a statistical linguistic feature, one that measures the likelihood that a word string is quoted using left and right brackets. Because the words in brackets together form a unit of meaning, it is reasonable to assume that fewer prosodic breaks are inserted within quoted text and that quoted text may be emphasized using some variation in prosodic-acoustic features. Therefore, we inferred that the use of QC may assist in prosody generation.

To evaluate the usefulness of the proposed PC and QC in Mandarin prosody generation, prosodic-acoustic feature prediction experiments were conducted and assessed with corresponding objective and subjective tests. The experimental database used was a Mandarin speech corpus, the Treebank speech corpus, which contains utterances with 56, syllables uttered by a professional female announcer. The corpus is further divided into three parts: a training set of utterances with 41, syllables, a development set of 75 utterances with 10, syllables, and a test set of 44 utterances with syllables.

For the prosodic-acoustic feature prediction, the proposed linguistic features were combined with conventional linguistic features and used as the input to directly predict four prosodic-acoustic features: the syllable log-F0 contour, syllable duration, syllable energy level, and intersyllable pause duration. Objective tests were evaluated using the root-mean-square error (RMSE). Subjective tests were then conducted on utterances synthesized from the predicted prosodic-acoustic features.
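For reference, the RMSE used in the objective tests can be computed as in the minimal sketch below. The function name and the sample values are made up for illustration and are not taken from the experiments.

```python
import math


def rmse(predicted, target):
    """Root-mean-square error between two equal-length sequences of numbers."""
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up syllable durations in milliseconds: predictions vs. reference values.
predicted_ms = [180.0, 210.0, 150.0]
reference_ms = [170.0, 220.0, 150.0]
error_ms = rmse(predicted_ms, reference_ms)  # sqrt((100 + 100 + 0) / 3)
```

Because the error is expressed in the same unit as the feature itself, one RMSE value would be reported per feature (log-F0, duration, energy level, pause duration) rather than a single pooled figure.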

Several advantages of the approach were discovered. First, the PC and QC were conveniently determined from word or POS sequence features, which can be obtained robustly with current word segmentation and POS-tagging technologies, without complicated statistical syntactic parsing. This advantage makes the proposed approach suitable for practical online unlimited TTS. Second, because the CRF-based punctuation generation models were trained by using a large text corpus, the models could learn alternative punctuation strategies from numerous paragraphs by various writers to generate more reliable PCs and QCs.

Third, compared with the size of an available text corpus for constructing a statistical syntactic parser, the size of the corpus used to train the CRF-based punctuation generator was considerably larger.

Therefore, we infer that the obtained PC and QC are more robust than the syntactic features derived from an automatic syntactic parser.

Therefore, the relationship between Chinese PMs and Mandarin prosodic structure is analyzed in this section. The following subsections present the analyses that provided the motivation and rationale for using the proposed PC and QC features. The prosody labeling system for determining the prosodic structures of utterances is introduced in Section 2. The relationship between the labeled prosodic break types and PM types is discussed in Section 2.

Section 2. The relationships between the MPMs manually inserted by native Mandarin speakers and the associated prosodic break types are analyzed, thus providing evidence for the proposed PC.

Hierarchical prosodic model of Mandarin speech used in this study [42].

B4 is defined as a major break with a long pause and an apparent F0 reset across adjacent syllables. B3 is a major break with a medium pause and medium F0 reset. B0 and B1 are nonbreaks, representing a tightly coupled syllable juncture and a normal syllable boundary within a PW, respectively, and have no identifiable pauses between SYLs. Moreover, B2 is a minor break with three variants: an F0 reset, a short pause, and preboundary syllable duration lengthening.

Distribution (pdf) of pause durations (ms) for the seven break types. The average pause duration (ms) for each of the prosodic break types is displayed in parentheses.
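The per-break-type average pause duration mentioned in the caption above can be obtained by grouping pause measurements by their break label. The sketch below shows one way to do this; the function name and the sample measurements are made up and do not reproduce the corpus statistics.

```python
from collections import defaultdict


def mean_pause_by_break(samples):
    """Average pause duration (ms) per break type.

    `samples` is an iterable of (break_label, pause_ms) pairs.
    """
    totals = defaultdict(float)  # summed pause duration per break label
    counts = defaultdict(int)    # number of observations per break label
    for label, pause_ms in samples:
        totals[label] += pause_ms
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Made-up measurements, not taken from the corpus.
samples = [("B4", 500.0), ("B4", 700.0), ("B3", 200.0), ("B1", 0.0)]
averages = mean_pause_by_break(samples)
# averages -> {"B4": 600.0, "B3": 200.0, "B1": 0.0}
```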

Co-occurrence matrix of four target break types and three syllable juncture types.

In the texts of the training dataset, the Treebank speech corpus, no word strings were quoted in Chinese brackets. Thus, we could not directly analyze the relationship between Chinese brackets and labeled break types. In this study, we directly analyzed the characteristics of the brackets and their associated quoted phrases from the ASBC text corpus, as presented in Section 2. Therefore, we assumed that automatic punctuation generation models that predict MPMs and are trained using a large text corpus can learn strategies for inserting MPMs from texts by various contributors to provide informative cues for prosodic-acoustic feature prediction.
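A co-occurrence matrix of break types and syllable juncture types, like the one referred to above, can be tabulated by counting labeled pairs. In the sketch below, the row and column labels are hypothetical stand-ins, because the exact four target break types and three juncture types are not listed in this passage, and the observations are made up.

```python
from collections import Counter


def cooccurrence_matrix(pairs, row_labels, col_labels):
    """Count how often each (row label, column label) pair occurs.

    Returns a list of rows, one per row label, in the given label order.
    """
    counts = Counter(pairs)
    return [[counts[(r, c)] for c in col_labels] for r in row_labels]


# Hypothetical label sets chosen for illustration only.
break_types = ["B1", "B2", "B3", "B4"]
juncture_types = ["intra-word", "inter-word", "PM"]

# Made-up (break type, juncture type) observations.
observations = [
    ("B4", "PM"), ("B4", "PM"), ("B3", "PM"),
    ("B2", "inter-word"), ("B1", "intra-word"),
]
matrix = cooccurrence_matrix(observations, break_types, juncture_types)
# matrix -> [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 2]]
```

Dividing each column by its total would turn the counts into the proportion of each juncture type realized as each break type, which is the kind of evidence the analysis relies on.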