鄭秋豫博士 Dr. Chiu-yu Tseng

Current Position

Research Fellow, Institute of Linguistics, Academia Sinica (1990-)

Convener, O-COCOSDA (Oriental chapter, International Committee for Co-ordination and Standardization of Speech Databases and Assessment Techniques) (2006-)

President, SIG-CSLP (Special Interest Group - Chinese Spoken Language Processing), ISCA (International Speech Communication Association) (2007-)

Degree

Ph.D. in Linguistics, Brown University, Providence, RI, USA (1981)

Experience

Associate Research Fellow, Institute of History & Philology, Academia Sinica, Taipei, Taiwan (1982-1990)


Research Fellow, Institute of History & Philology, Academia Sinica, Taipei, Taiwan (1990-1997)

Research Fellow, Preparatory Office of the Institute of Linguistics, Academia Sinica, Taipei, Taiwan (1997-2003)


Research Fellow, Institute of Linguistics, Academia Sinica, Taipei, Taiwan (2003-present)

Work

 

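She has worked in experimental phonetics for more than twenty years. Her work includes:
1. Theoretical research in phonetics
2. Interdisciplinary research in phonetics and speech technology
3. Construction of spoken prosody corpora and development of a working platform (http://myet.com/COSPRO)
4. Service to cross-strait and international academic communities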

 

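Research summary:
1. Theoretical research in phonetics:
Her most important research dates from the past ten years, when her focus moved from the lexical tones and sentence intonation of Mandarin to the prosodic structure of fluent speech. The shift came because phonetic research in the 1990s began to attend to speech prosody, yet most studies still took the intonation unit (IU) as the unit of analysis and emphasized the intonation of single sentences and its variations. Starting from Yuen Ren Chao's account of the relation between tone and intonation as small waves riding on large waves, she is nearly the only scholar today who considers the fundamental relation between the higher level information involved in spoken discourse and the planning of speech production (higher level information, planning of speech production and discourse prosody). She has studied discourse, designed speech databases, collected large amounts of speech data, and proposed a hypothesis of multi-phrase speech paragraphs. Through quantitative analysis and a series of findings presented from 2003 onward, she has shown that the higher level information of spoken discourse is systematically manifested in the prosody of fluent speech. She argues further that the prosodic structure of fluent speech reflects higher level paragraph and discourse prosody information and is not a simple concatenation of phrase intonations. In 2005, using evidence from rhythm, loudness, and pauses and boundary breaks, she proposed a complete account of fluent speech prosody, the hierarchical multi-phrase Prosodic Phrase Group (PG), showed how it corresponds to the prosodic features of speech paragraphs, and presented a mathematical model. In 2006 she added F0 evidence and extended the hierarchical prosody framework to discourse. This work has opened the new area of fluent speech prosody within theoretical phonetics.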

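2. Interdisciplinary research in phonetics and speech technology:
She was also the first phonetician in Taiwan to undertake interdisciplinary research and has worked alongside Chinese information technology research for more than twenty years. In collaboration with Prof. Lin-shan Lee (李琳山) of the Department of Electrical Engineering at National Taiwan University, Prof. Hsiao-Chuan Wang (王小川) of the Department of Electrical Engineering at National Tsing Hua University, and Prof. Sin-Horng Chen (陳信宏) of the Department of Computer Science and Information Engineering at National Chiao Tung University, she has taken an active part in research on Mandarin speech synthesis and speech recognition, made long-term contributions to Mandarin speech technology, and helped train outstanding young speech technology researchers in Taiwan. The mathematical model of fluent speech prosody she proposed can use speech synthesis as a tool to generate and manipulate every acoustic dimension of fluent speech prosody. She currently takes part in two large interdisciplinary projects funded by the National Science Council (NSC). One is the large-scale NSC international collaboration project NGSST (Next Generation Speech Science and Technologies, http://sovideo.iis.sinica.edu.tw/NeGSST/Index.htm), led by Prof. Lin-shan Lee, in which she is the only linguist. The other is the NSC integrated project NGASR (Next Generation Automatic Speech Recognition, http://diana.ee.nthu.edu.tw), led by Prof. Hsiao-Chuan Wang.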

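3. Construction of spoken prosody corpora and development of a working platform:
Beginning in the late 1990s, she pioneered a research-question-driven approach to corpus phonetics, a new methodology that moves beyond the limits of traditional phonetics, which favored observation over quantity of data. She has introduced new practices in the design of speech databases, the characteristics of the data collected, the development of annotation systems, and the achievement of consistency in annotation. In addition, by studying higher level structure in speech prosody, she has led phonetics out of its long-standing predicament of attending only to phonetic detail and very small units while being unable to handle large units, so that results often missed the forest for the trees. Corpus-based research necessarily requires large-scale quantitative processing, so she has also raised experimental phonetics from inspecting waveforms and applying off-the-shelf statistical packages to adjusting quantitative methods to fit a theoretical framework. To date she has collected nine fluent speech prosody databases totaling 10.58GB, which were released in January 2006 together with their annotations and working platform, providing concrete resources for phonetic research and sharing the know-how of platform construction. (See the Sinica Continuous Speech Corpora at http://www.myet.com/COSPRO.)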

In 2006 she was invited to give keynote speeches at two international conferences, a further mark of recognition from the international research community. The first was TAL 2006 (The Second International Symposium on Tonal Aspects of Languages, http://tal2006.free.fr), April 27-29, 2006, La Rochelle, France, where her talk was titled "Higher Level Information and Discourse Prosody". The second was The 3rd International Conference on Speech Prosody 2006 (http://www.ias.et.tu-dresden.de/sp2006/), May 2-5, 2006, Dresden, Germany, where her talk was titled "Fluent Speech Prosody and Discourse Organization: Evidence of Top-down Governing and Implications to Speech Technology". Of the two, the invitation from Speech Prosody 2006 carries the greater significance: she has studied speech prosody for more than ten years using exclusively Mandarin data, and the conference is devoted to that very research topic.

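4. Service to cross-strait and international academic communities:
Since 1998 she has taken an active part in cross-strait academic exchange. In 2005 she was invited to serve for one year as an adjunct research fellow at the Center for Chinese Linguistics, Peking University. In October 2005 she was invited to lecture for two weeks at the Center for Chinese Linguistics, Peking University, and for two weeks at the Institute of Linguistics, Chinese Academy of Social Sciences. While in Beijing she gave seven public lectures. She also gave one talk at Nankai University in Tianjin, attended two conferences and presented two papers, and on her way back gave one talk each at City University of Hong Kong and The Chinese University of Hong Kong. Over her month-long visit to mainland China she gave eleven talks in all.
In service to academic communities at home and abroad: in Taiwan, she is a founding and life member of the Association for Computational Linguistics and Chinese Language Processing (ACLCLP) and of the Linguistic Society of Taiwan. Across the strait, she is the only scholar from Taiwan invited to serve as a committee member of the Chinese LDC (Linguistic Data Consortium), under the Chinese Information Processing Society of China's committee for Chinese language resource construction and management. Internationally, she has been a member of the ASA (The Acoustical Society of America) since 1978 and is a founding and life member of the IACL (The International Association of Chinese Linguistics). She is also a member of ISCA (International Speech Communication Association), where she serves as vice chair of the special interest group CSLP (Chinese Spoken Language Processing). In addition, since 2006 she has served as convener of O-COCOSDA, a consortium of 11 countries and regions in Asia for sharing and assessing speech database resources (Oriental COCOSDA, abbreviated O-COCOSDA, the Oriental chapter of the International Committee for Co-ordination and Standardization of Speech Databases and Assessment Techniques, http://www.cocosda.org). Since its founding ten years ago it has played an important academic role in Asia. The convener is responsible for coordinating communication among the participating countries and for ensuring that the annual O-COCOSDA international conference rotates among different regions.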

 

The major differences that set my research apart from that of my predecessors and peers in the field of phonetics are the following:
1. Research Problems and Perspectives
I have been studying the prosody of fluent continuous Mandarin Chinese speech from a macro, top-down perspective, taking units larger than phrase or sentence intonation into consideration. This perspective brought out the major feature of fluent speech prosody, namely the systematic cross-phrase prosodic association that constitutes prosodic context, as opposed to patterns of individual phrase intonation examined in isolation and treated as intonation variations. Based on the quantitative evidence obtained, I was able to postulate a hierarchical prosody framework that specifies how spoken discourse is formed in layers. My research also brought out cross-phrase prosodic association from each layer of the prosodic hierarchy in every acoustic parameter. Consequently, I was able to construct the Hierarchical Phrase Grouping Model (HPG) and multiple-phrase prosody templates in F0 trajectory patterns, syllable duration patterns, intensity distribution patterns, and boundary properties in relation to boundary breaks (Tseng et al., 2004b, 2005a, 2006a). I have further obtained evidence that these templates are in fact default base forms, deep structure in the linguistic sense, that apply across prosodic styles and formats (Tseng et al., 2007 and forthcoming). The perspective also allows me to examine boundary information, both in the speech signal and in post-boundary silent pauses, in relation to discourse information, and to ask what significance boundary information bears in both speech planning and speech processing (Tseng et al., forthcoming).
2. Quantity of Speech Samples/Data Used
I have collected 10.58GB of speech data since the late 1990s, consisting mostly of read speech: text passages with various features, read by various speakers, chosen to bring out the acoustic properties of continuous fluent speech prosody while removing factors related to spontaneous speech. In the process, I have developed annotation systems and a toolkit for acoustic analysis and manipulation (Tseng et al., 1999, 2005b). The result is COSPRO (Sinica Mandarin Continuous Speech Prosody and Toolkit, http://www.myet.com/cospro; 7.9GB of the 10.58GB is annotated), now available for a fee to the research community and the public. I believe the corpora are useful for both research on and teaching of Mandarin (Tseng et al., 2003, 2005b).
3. Research Methodology: Corpus Phonetics Developed to Investigate Acoustic Properties of Continuous Fluent Speech
I have chosen to deal with realistic research problems of continuous speech in large chunks, for example speech paragraphs that run to over 180 syllables (or 70 seconds; COSPRO 01), and developed experimental methods by integrating engineering- and speech-technology-oriented techniques into acoustic phonetic investigation. This moved phonetics from studying limited samples from a few speakers to studying multiple speakers and large amounts of data (large by the standards of traditional phonetics, though perhaps modest to the speech technology community), and toward a methodology more rigorous than observation and description. The now standard annotation and quantitative analyses of corpus linguistics were painstakingly adopted before a full-fledged methodology was available. Along the way I had to adjust and develop consistent methods for examining the acoustic phonetic properties of larger domains and units. Those who choose not to agree with my perspectives, arguments, and/or framework would not and could not refute my data. The corpus phonetics approach I developed and used has thus made replication possible and phonetics a more responsible science.
The Fujisaki Model (Fujisaki, 1984) was adopted to analyze F0 trajectories, and my group and I have since made it both analytical and predictive (Tseng et al., 2006b, 2007), thus expanding and strengthening it. I have also developed methods to analyze the speech rhythm and loudness of prosodic units (Tseng et al., 2004a, 2005c), and have adopted and adapted linear regression analysis to account for layered and cumulative contributions from each prosodic layer of the HPG hierarchy. The statistical method also made it possible to show the existence of higher level discourse information in the speech signal and to explain the interaction among prosodic units and layers, making it clear why surface intonation variations are not random at all but are constrained and defined by higher level information (Tseng et al., 2004b, 2005a, 2006a, 2006e).
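For readers unfamiliar with the Fujisaki Model mentioned above, the following is a minimal sketch of its standard generative form: log F0 is a baseline value plus phrase components (impulse responses) and accent components (step responses). It illustrates the general model only, not the analytical and predictive extensions cited above. The function names, the command values, and the constants alpha, beta, and gamma are illustrative assumptions.

```python
import math

def phrase_response(t, alpha=2.0):
    """Phrase control response: alpha^2 * t * exp(-alpha * t) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)

def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control response: min(1 - (1 + beta*t) * exp(-beta*t), gamma) for t >= 0, else 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def fujisaki_f0(times, fb, phrase_cmds, accent_cmds, alpha=2.0, beta=20.0, gamma=0.9):
    """Return an F0 contour (Hz) sampled at `times` (seconds).

    fb          -- baseline frequency in Hz
    phrase_cmds -- list of (onset_time, amplitude) impulses
    accent_cmds -- list of (onset_time, offset_time, amplitude) steps
    All command values are hypothetical inputs chosen by the caller.
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        for t0, ap in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0, alpha)
        for t1, t2, aa in accent_cmds:
            log_f0 += aa * (accent_response(t - t1, beta, gamma)
                            - accent_response(t - t2, beta, gamma))
        contour.append(math.exp(log_f0))
    return contour

# Example: one phrase command at 0 s and one accent command from 0.3 s to 0.6 s.
times = [i / 100.0 for i in range(200)]
f0 = fujisaki_f0(times, fb=120.0, phrase_cmds=[(0.0, 0.5)], accent_cmds=[(0.3, 0.6, 0.4)])
```

With no commands the contour stays flat at the baseline, so any rise or fall is attributable to a specific phrase or accent command. This is what makes the model convenient for separating phrase-level from local contributions to F0.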
By studying Mandarin speech prosody in relation to higher level discourse information, I have also moved acoustic phonetic studies of Mandarin Chinese to phenomena other than tones and (phrase/sentence) intonation. The quantitative evidence I have obtained shows that additional discourse information is also present in the speech signal, and that such higher level information can be accounted for statistically by establishing a prosody hierarchy above the sentence. From the viewpoint of linguistic research, the HPG framework has made it possible to study phonetic information above the phrase and the sentence; the evidence obtained explains how fluent speech prosody is generated; and together these have made it possible to derive abstract linguistic knowledge from surface speech variations within and between speakers. As a result, surface variations are shown to be not random but systematic and predictable. I believe I have brought out more linguistic knowledge about phonetic facts through fluent speech prosody and extensive studies of speech corpora, and have thus expanded phonetics from description to explanation and generalization and considerably narrowed the gap between linguistic knowledge and speech facts. I have shown how quantities of speech data can be used to extract abstract linguistic knowledge in a concrete sense.
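The idea of layered, cumulative contributions from a prosody hierarchy can be illustrated with a small sketch. It assumes a residual-passing scheme: the lowest layer is fitted first, and whatever it leaves unexplained is passed up to the next layer. The text does not spell out the exact procedure, so this is one plausible reading and not the published method. Each layer is fitted by least squares on a single categorical factor, which reduces to per-label means. The layer names and labels are hypothetical.

```python
from collections import defaultdict

def group_mean_fit(labels, values):
    """Least-squares fit of values on one categorical factor (the per-label mean)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lab, v in zip(labels, values):
        sums[lab] += v
        counts[lab] += 1
    means = {lab: sums[lab] / counts[lab] for lab in sums}
    return [means[lab] for lab in labels]

def layered_contributions(values, layers):
    """Fit layers in order, passing residuals upward.

    values -- one acoustic measurement per unit (e.g. a duration); must not all be equal
    layers -- list of (layer_name, labels), ordered from lowest to highest layer
    Returns (report, residual), where report is a list of
    (layer_name, cumulative fraction of variance explained).
    """
    mean = sum(values) / len(values)
    total_ss = sum((v - mean) ** 2 for v in values)
    residual = [v - mean for v in values]
    report = []
    for name, labels in layers:
        fitted = group_mean_fit(labels, residual)
        residual = [r - f for r, f in zip(residual, fitted)]
        remaining_ss = sum(r * r for r in residual)
        report.append((name, 1.0 - remaining_ss / total_ss))
    return report, residual

# Hypothetical toy data: four units with labels at two layers.
values = [10.0, 12.0, 20.0, 22.0]
layers = [("layer_1", ["x", "x", "y", "y"]),
          ("layer_2", ["p", "q", "p", "q"])]
report, residual = layered_contributions(values, layers)
```

The cumulative fraction never decreases from one layer to the next, so the gain at each step can be read as the additional contribution of that layer beyond the layers below it.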
4. Interdisciplinary Approach Implemented
Adopting a more technology-oriented approach has allowed me to conduct phonetic investigation (of fluent speech prosody) with clearer goals in mind. A mathematical model was constructed on the basis of the research results and is ready to be tested in speech synthesis and recognition (Tseng et al., 2004c, 2004d, 2004e, 2005a, 2006b, 2006c).

 

 
