ASA S3 50-2013 American National Standard Method for Evaluation of the Intelligibility of Text-to-Speech Synthesis Systems (Includes Access to Additional Content).pdf

ANSI/ASA S3.50-2013

AMERICAN NATIONAL STANDARD

Method for Evaluation of the Intelligibility of Text-to-Speech Synthesis Systems

Secretariat: Acoustical Society of America

Approved on May 6, 2013 by: American National Standards Institute, Inc.

Abstract

This Standard is to be used for testing the speech intelligibility of text-to-speech systems, providing a measure of human listeners' recovery of words that correspond to the intended phonemic content of speech created by the system. Listeners are tasked to record the words or sentences they hear. Scoring may be either at the word or segment level. A normalized edit distance of the response from the intended message is the measure of the system's speech intelligibility. This Standard specifies methods for selecting test material, which may depend on the purpose and constraints of the test. The Standard also specifies methods for selecting and training the listeners; for designing, controlling, and reporting the test conditions; and for analyzing and reporting the test results. The Standard also provides background material, important for designing the test. Informative software is provided to assist the user in creating stimuli and scoring the test results. Use of the software is not mandatory.

AMERICAN NATIONAL STANDARDS ON ACOUSTICS

The Acoustical Society of America (ASA) provides the Secretariat for Accredited Standards Committees S1 on Acoustics, S2 on Mechanical Vibration and Shock, S3 on Bioacoustics, S3/SC 1 on Animal Bioacoustics, and S12 on Noise. These committees have wide representation from the technical community (manufacturers, consumers, trade associations, organizations with a general interest, and government representatives). The Standards are published by the Acoustical Society of America as American National Standards after approval by their respective Standards Committees and the American National Standards Institute (ANSI).

These standards are developed and published as a public service to provide standards useful to the public, industry, and consumers, and to Federal, State, and local governments.

Each of the Accredited Standards Committees (operating in accordance with procedures approved by ANSI) is responsible for developing, voting upon, and maintaining or revising its own Standards. The ASA Standards Secretariat administers Committee organization and activity and provides liaison between the Accredited Standards Committees and ANSI. After the Standards have been produced and adopted by the Accredited Standards Committees, and approved as American National Standards by ANSI, the ASA Standards Secretariat arranges for their publication and distribution.

An American National Standard implies a consensus of those substantially concerned with its scope and provisions. Consensus is established when, in the judgment of the ANSI Board of Standards Review, substantial agreement has been reached by directly and materially affected interests. Substantial agreement means much more than a simple majority, but not necessarily unanimity. Consensus requires that all views and objections be considered and that a concerted effort be made towards their resolution.

The use of an American National Standard is completely voluntary. Their existence does not in any respect preclude anyone, whether he or she has approved the Standards or not, from manufacturing, marketing, purchasing, or using products, processes, or procedures not conforming to the Standards.

NOTICE: This American National Standard may be revised or withdrawn at any time. The procedures of the American National Standards Institute require that action be taken periodically to reaffirm, revise, or withdraw this Standard.

Acoustical Society of America
ASA Secretariat
35 Pinelawn Road, Suite 114E
Melville, New York 11747-3177
Telephone: 1 (631) 390-0215
Fax: 1 (631) 390-0217
E-mail: asastds@aip.org

© 2013 by Acoustical Society of America. This Standard may not be reproduced in whole or in part in any form for sale, promotion, or any commercial purpose, or any purpose not falling within the provisions of the U.S. Copyright Act of 1976, without prior written permission of the publisher. For permission, address a request to the Standards Secretariat of the Acoustical Society of America.

© 2013 Acoustical Society of America. All rights reserved.

Contents

1 Scope
2 Normative references
3 Terms and definitions
4 Description of a text-to-speech synthesis system
5 General guidance for experimental design and testing
6 Requirements (Methods)
  6.1 TTS system description and specification
  6.2 Listeners
  6.3 Selection and design of test materials
  6.4 Intelligibility test procedures
  6.5 Measurements and analysis of results
Annex A (informative) Rationale for the recommendations concerning intelligibility test materials
  A.1 Introduction
  A.2 Acoustic cues to linguistic units vary from context to context
  A.3 Systems vary in the algorithms they use and the types of errors they produce
  A.4 Conclusion
Annex B (normative) Methodological considerations for stimuli and responses: Considerations for test material containing names and nonsense words
  B.1 Stimuli preparation
  B.2 Response scoring
Annex C (informative) Example software to create stimuli and score results in conformity with the method described in ANSI/ASA S3.50-2013
  C.1 Disclaimer
  C.2 Example software
Bibliography

Figures

Figure 1 Block diagram of a typical TTS system. This Standard primarily evaluates processing below the dotted line.
Figure A.1 Spectrograms of (a) Miss Peak, (b) Miss Beak, and (c) misspeak

Tables

Table A.1 Sample responses for one listener to fake ill
Table A.2 Sample responses for one listener to dock, cat, dock, bird
Table A.3 Sample responses for one listener to Jupiter eyebrows
Table C.1 An example grammar for the susgen program showing sentence frames with Part of Speech (POS) tags, and the total number of syllables in the non-variable content of each frame.
Table C.2 Example lexicon entries. Each row specifies a word, the POS tag to which that word can be assigned within grammar frames, and a syllable count.

Foreword

This Foreword is for information only, and is not a part of the American National Standard ANSI/ASA S3.50-2013 American National Standard Method for Evaluation of the Intelligibility of Text-to-Speech Synthesis Systems. As such, this Foreword may contain material that has not been subjected to public review or a consensus process. In addition, it does not contain requirements necessary for conformance to the Standard.

This Standard comprises a part of a group of definitions, standards, and specifications for use in bioacoustics. It was developed and approved by Accredited Standards Committee S3, Bioacoustics, under its approved operating procedures. Those procedures have been accredited by the American National Standards Institute (ANSI). The Scope of Accredited Standards Committee S3 is as follows:

Standards, specifications, methods of measurement and test, and terminology in the fields of psychological and physiological acoustics, including aspects of general acoustics, shock and vibration, which pertain to biological safety, tolerance and comfort.

The software provided with this American National Standard is entirely informative and is provided for the convenience of the user. Use of the provided software is not required for conformance with the Standard.

The Acoustical Society of America (ASA) and the owners of the copyright to the software provided with this American National Standard make no other representation or warranty or condition of any kind, whether express or implied (either in fact or by operation of law) with respect to any part of the product, including, without limitation, with respect to the sufficiency, accuracy or utilization of, or any information or opinion contained or reflected in, any of the product. ASA and the owners expressly disclaim all warranties or conditions of merchantability or fitness for a particular purpose. No officer, director, employee, member, agent, representative, or publisher of the copyright holder is authorized to make any modification, extension, or addition to this limited warranty.

At the time this Standard was submitted to Accredited Standards Committee S3, Bioacoustics, for approval, the membership was as follows:

C.J. Struck, Chair
G.J. Frye, Vice-Chair
S.B. Blaeser, Secretary

Acoustical Society of America: C.J. Struck; M.D. Burkard (Alt.)
American Academy of Audiology: C. Schweitzer; T. Ricketts (Alt.)
American Academy of Otolaryngology, Head and Neck Surgery, Inc.: R.A. Dobie; L.A. Michael (Alt.)
American Industrial Hygiene Association: T.K. Madison; D. Driscoll (Alt.)
American Speech-Hearing-Language Association: L.A. Wilber; N. DiSarno (Alt.)
Beltone/GN Resound: S. Petrovic

Council for Accreditation in Occupational Hearing Conservation: L.D. Hager
ETS-Lindgren Acoustic Systems: S. Dunlap; D. Winker (Alt.)
Etymotic Research, Inc.: M.C. Killian; J.K. Stewart (Alt.)
Food and Drug Administration: S-C Peng
Frye Electronics, Inc.: G.J. Frye; K.E. Frye (Alt.)
G.R.A.S. Sound and Vibration: J. Soendergaard; B. Schustrich (Alt.)
Hearing Industries Association: VACANT; C.M. Rogin (Alt.)
National Electrical Manufacturers Association, Signaling Protection and Communication Section (3SB): J. McNamara; R. Reiswig (Alt.)
National Hearing Conservation Association: G.L. Poling
National Institute for Occupational Safety and Health: M. Stephenson; W.J. Murphy (Alt.)
National Institute of Standards and Technology: V. Nedzelnitsky; R. Wagner (Alt.)
National Park Service: M. McKenna; K. Fristrup (Alt.)
Natus Medical, Inc.: Y. Hekimoglu; P. Becke (Alt.)
Ocean Conservation Research: M. Stocker
Starkey Laboratories: D.A. Preves; T.H. Burns (Alt.)
U.S. Army Aeromedical Research Lab: W. Ahroon
U.S. Army CERL: D. Delaney; M.J. White (Alt.)
U.S. Army Human Research [...]

[...] FAX: 631-390-0217; E-mail: asastds@aip.org.

AMERICAN NATIONAL STANDARD ANSI/ASA S3.50-2013

American National Standard
Method for Evaluation of the Intelligibility of Text-to-Speech Synthesis Systems

1 Scope

This American National Standard specifies an experimental method for evaluation of the intelligibility of synthetic speech, in English, generated by text-to-speech (TTS) synthesis systems. It is intended to be used by developers of applications that incorporate TTS technology, such as e-mail and SMS readers, talking kiosks, e-learning systems, navigation systems, automated messaging services, screen readers for people who are blind, and assistive devices for people who have difficulty speaking.

Although this Standard is targeted toward English, many of the recommendations and requirements concerning experimental design, listener selection and training, test materials and procedures, and measurement and analysis of results are sufficiently general to be valid for evaluating the intelligibility of synthetic speech in languages other than English.

This Standard describes methodology that is applicable both for comparisons of different TTS systems, and for comparisons of different versions of the same TTS system.

2 Normative references

The following referenced documents are indispensable for the application of this Standard. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ANSI/ASA S3.2-2009, American National Standard Method for Measuring the Intelligibility of Speech over Communication Systems

3 Terms and definitions

For the purposes of this Standard, the terms and definitions given below apply:

3.1 speech synthesis. The generation of speech output from data input, which may include plain text; marked-up text; or parametric input, such as acoustic properties or articulatory configurations.

3.2 text-to-speech (TTS) synthesis. The generation of speech output from plain text or marked-up text.

3.3 intelligibility. That property which allows a human listener to identify words that correspond to the intended phonemic units of speech.

3.4 closed-response test. Evaluation in which participants, for each trial in the test, make a selection from a subset (termed a “closed set”) of possible responses. This procedure is exemplified by the familiar “multiple-choice” test format.

3.5 open-response test. Evaluation in which participants' responses are not constrained to a subset of response alternatives, but instead are open to the full range of possible responses.

3.6 text pre-processing. The application-specific handling of text applied before input to a TTS system (e.g., re-ordering of words in telephone listings; adjustments for non-standard pronunciations of drug names, acronyms, abbreviations), which may be accomplished via a mark-up language, application program, or other means that is not performed by the TTS system.

3.7 text normalization. The expansion of acronyms, abbreviations, and non-alphabetic text to word-level text by the TTS system (e.g., 1024 as “ten twenty-four” or “one thousand twenty-four”; Dr. as “Doctor” or “Drive”; AAA as “triple A”).

3.8 mark-up language. Annotations that augment or alter the speech generated from text, e.g., Speech Synthesis Mark-up Language (SSML) for pronunciation, intonation, emphasis, voice selection, and speaking rate.

3.9 phonemes. The minimal units of speech that make a difference in meaning (e.g., buy and pie differ only in their initial phoneme). English has about 40 phonemes.

3.10 features. Shorthand labels used to describe and classify linguistic units. Distinctive features are phonological labels, based on phonetic descriptions of speech sounds, which can be used to categorize phonemes into different classes. For example, /b/ and /m/ have the feature voiced to indicate that the vocal folds characteristically vibrate during production, while /p/ has the feature voiceless to indicate that there is an absence of vocal-fold vibration. Similarly, there are morphosyntactic features such as singular and plural, and semantic features such as female and male. Features can be unary, binary, or n-ary, depending on theory.

3.11 semantic predictability. The way some words in a sentence can be predicted from the meaning of other words in the sentence (e.g., the word “knife” in “You slice bread with a knife.”).

3.12 semantically anomalous. Violating semantic restrictions on word use (e.g., “Accidents spoke triangles”) while having superficially acceptable syntactic structure (e.g., noun-verb-noun).

3.13 semantically unpredictable sentences (SUSs)
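
NOTE (illustrative, not part of the Standard): The Abstract states that a normalized edit distance of the listener's response from the intended message is the measure of intelligibility. The Python sketch below shows one way such a word-level score could be computed. The function names, the lower-casing and whitespace tokenization, and the choice to normalize by the length of the longer word sequence are all assumptions made for illustration; this excerpt does not give the Standard's own normalization, and the informative software supplied with the Standard may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Insertions, deletions, and substitutions each cost 1.
    """
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty response
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(intended, response):
    """Word-level edit distance scaled to the range 0..1.

    ASSUMPTION: the distance is divided by the length of the longer
    word sequence; 0.0 means the response matches the intended message.
    """
    ref = intended.lower().split()
    hyp = response.lower().split()
    longest = max(len(ref), len(hyp))
    if longest == 0:
        return 0.0
    return edit_distance(ref, hyp) / longest


if __name__ == "__main__":
    # One substituted word out of three in a semantically anomalous sentence.
    print(normalized_edit_distance("Accidents spoke triangles",
                                   "accidents broke triangles"))
```

Because edit_distance accepts any sequence, the same function could be applied to phoneme lists for segment-level scoring, again under the assumptions stated above.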
