English Speech Recognition Annotation Standard
Automatic speech recognition (ASR) converts speech into text. The task is to transcribe the speech in each audio clip word for word, without dropping anything.
1. Log in to the 小核眾測 crowd-testing site at https://zc.bytedance.com/ and click [More tasks].
2. Search for ASR, click that queue, then click [Start task].
3. Annotation process:
4. Audio type criteria:
1) speech: clearly audible human speech. If several people speak in the video, transcribe all of them. If several voices overlap during part of the audio and the overlap is clearly audible, cut the overlapping part out. (Rap without a very strong rhythm may also be annotated.)
2) non-speech: music, singing, animal calls, and sounds of nature.
3) discard: languages other than English, inaudible speech, and noise.
5. Text writing standards:
1) Do not add punctuation; separate words with spaces.
2) Capitalize the first letter of every word in proper nouns, personal names, movie titles, and book titles. Write abbreviations entirely in capitals. Everything else is lowercase, including the first word of a sentence.
3) Do not write Arabic numerals; spell numbers out. For example, 59 is written fifty-nine.
4) A word that is only half pronounced may be left out.
5) Normally, transcribe what is pronounced in the audio. If the speaker mispronounces a word, write the correct word.
6) Write email addresses and URLs in their normal form, for example www.yahoo.com
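The mechanical parts of these rules (no punctuation, no Arabic numerals, addresses left in normal form) can be checked automatically before submitting. The following is only a rough sketch: the function name check_transcript and both patterns are illustrative and are not part of the annotation tool. It also assumes that apostrophes in contractions and hyphens in spelled-out numbers are allowed, which the rules above do not state explicitly. Capitalization is not checked, because it depends on knowing which words are proper nouns.

```python
import re

# Ordinary word: letters, optionally joined by an apostrophe or hyphen
# (assumption: "don't" and "fifty-nine" are acceptable spellings).
WORD_RE = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")

# Rough patterns for e-mail addresses and URLs, which keep their normal form.
ADDRESS_RE = re.compile(
    r"[\w.+-]+@[\w-]+(?:\.[\w-]+)*\.[A-Za-z]{2,}"               # e-mail
    r"|(?:https?://)?[\w-]+(?:\.[\w-]+)*\.[A-Za-z]{2,}(?:/\S*)?"  # URL
)


def check_transcript(text):
    """Return a list of rule violations found in one transcript line."""
    problems = []
    for token in text.split():
        if ADDRESS_RE.fullmatch(token):
            continue  # addresses and URLs are written as-is
        if any(ch.isdigit() for ch in token):
            problems.append(f"{token!r}: spell the number out in words")
        elif not WORD_RE.fullmatch(token):
            problems.append(f"{token!r}: remove punctuation")
    return problems


if __name__ == "__main__":
    for line in ("i paid fifty-nine dollars at www.yahoo.com",
                 "I paid 59 dollars."):
        print(line, "->", check_transcript(line) or "ok")
```

A check like this only catches formatting slips; whether the text matches the audio still has to be verified by listening.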
6. Clipping
1) When to clip: cut out inaudible speech, noise, silence, overlapping voices, and similar material at the beginning or end of a sentence.
2) How to clip: click [Clip start] and [Clip end] (or press the corresponding shortcut keys) to select the interval, then click [Confirm clip] (or press shortcut a or 5). The audio within the interval then plays automatically, which indicates that the clip is complete.
3) Clipping tip: drag the small red dots to adjust the interval. Click the waveform at the top to display the red dots.
4) Note: after clipping, check that the audio and the text match.
5) Clip only within the original interval. For example, if the original audio plays from 3 s to 8 s, clip only within 3-8 s; do not extend the interval to 1-8 s.
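The rule in item 5) amounts to a simple containment check on the two intervals. The sketch below expresses it in code; the function name validate_clip and the (start, end) tuple representation in seconds are assumptions made for illustration, not something the annotation tool exposes.

```python
def validate_clip(original, clip):
    """Return True if the clip interval lies inside the original interval.

    Both arguments are (start, end) tuples in seconds. The clip must be
    non-empty and must not extend past either end of the original.
    """
    orig_start, orig_end = original
    clip_start, clip_end = clip
    return orig_start <= clip_start < clip_end <= orig_end


if __name__ == "__main__":
    print(validate_clip((3, 8), (4, 7)))  # inside the original interval
    print(validate_clip((3, 8), (1, 8)))  # starts before the original
```

With the example from item 5), a clip of 4-7 s inside an original of 3-8 s is accepted, while 1-8 s is rejected because it starts before the original interval.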
7. Shortcut keys
1) Space: submit
2) 1: start
3) 2: pause
4) 5: replay the clipped interval
5) q: discard
6) w: non-speech
7) s: clip start
8) e: clip end
9) a: confirm clip
10) Shift+Alt: switch text
8. Tips
1) Use the shortcut keys as much as possible.
2) Get the general meaning of the video first, then annotate.
3) Annotate by sense group.
4) Keep the text of speech from frequently occurring videos so that it can be pasted and reused directly.