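1. Main content of the graduation project (thesis):
With the spread of smartphone voice assistants and smart speakers, speech recognition applications have become part of everyday life.
Although speech recognition research has made great progress in recent years in challenging scenarios such as far-field recognition and recognition in noisy environments, performance in crowded, multi-speaker environments is still unsatisfactory.
One effective solution is to introduce a speech separation algorithm at the front end of the speech recognizer, which separates the mixed speech of multiple speakers into one output per speaker.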
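A separation front end produces one output per speaker, but the order of those outputs is arbitrary, so comparing them with reference signals requires deciding which output belongs to which speaker. The sketch below illustrates one common way to handle this, a permutation-invariant error in the spirit of the permutation invariant training paper by Yu et al. listed in the references. It is only an illustration under assumptions: the task statement does not say which separation method the project uses, and the function names `mse` and `pit_mse` and the toy signals are hypothetical.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sample sequences."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_mse(estimates, references):
    """Return (lowest mean MSE, best permutation).

    best_perm[i] is the index of the reference matched to estimates[i].
    All orderings are tried by brute force, so this is only practical
    for a small number of speakers.
    """
    if len(estimates) != len(references):
        raise ValueError("need one estimate per reference")
    n = len(estimates)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # Average the per-speaker error under this output-to-speaker assignment.
        loss = sum(mse(estimates[i], references[j]) for i, j in enumerate(perm)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy example (made-up signals): the separator returned the two speakers
# in swapped order relative to the references.
references = [[1.0, 0.0, -1.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
estimates = [[0.5, 0.5, 0.5, 0.5], [1.0, 0.0, -1.0, 0.0]]
loss, perm = pit_mse(estimates, references)
print(loss, perm)
```

In this toy case the lowest error is obtained by matching the first output to the second reference and vice versa, which is the kind of assignment a multi-speaker recognizer would need before scoring each separated stream.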
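2. Main tasks and requirements of the graduation project (thesis)
- Read the relevant Chinese and international literature.
Understand the main advances in the speech field, including speech recognition, speech separation, and speaker (voiceprint) recognition, and become familiar with the fundamentals of speech processing.
- Collect and process speech data.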
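3. Plan and schedule for completing the graduation project (thesis)
(1) 2020/1/13–2020/2/28: finalize the topic, review the literature, translate foreign-language literature, and write the proposal report;
(2) 2020/3/1–2020/4/30: system architecture, program design and development, system testing and refinement;
(3) 2020/5/1–2020/5/25: write and revise the thesis;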
4. Main references
- Yu D, Kolbæk M, Tan Z H, et al. Permutation invariant training of deep models for speaker-independent multi-talker speech separation[C]//2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2017: 241-245.
- Chen Z, Luo Y, Mesgarani N. Deep attractor network for single-microphone speaker separation[C]//2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2017: 246-250.
- Hershey J R, Chen Z, Le Roux J, et al. Deep clustering: Discriminative embeddings for segmentation and separation[C]//2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016: 31-35.
- Shi J, Xu J, Liu G, et al. Listen, think and listen again: capturing top-down auditory attention for speaker-independent speech separation[C]//Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), 2018: 4353-4360.
- Ephrat A, Mosseri I, Lang O, et al. Looking to listen at the cocktail party: a speaker-independent audio-visual model for speech separation[J]. ACM Transactions on Graphics (TOG), 2018, 37(4): 112.
- Wang Q, Muckenhirn H, Wilson K, et al. VoiceFilter: Targeted voice separation by speaker-conditioned spectrogram masking[C]//Proc. Interspeech 2019.
- Wan L, Wang Q, Papir A, et al. Generalized end-to-end loss for speaker verification[C]//2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018: 4879-4883.
- Li C, Ma X, Jiang B, et al. Deep speaker: an end-to-end neural speaker embedding system[J]. arXiv preprint arXiv:1705.02304, 2017.
- Wang Q, Downey C, Wan L, et al. Speaker diarization with LSTM[C]//2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018: 5239-5243.
- Panayotov V, Chen G, Povey D, et al. Librispeech: an ASR corpus based on public domain audio books[C]//2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015: 5206-5210.