1. Content and Requirements of the Graduation Project (Thesis)
Sign language (SL) is the primary language of deaf people and is usually collected or broadcast as video. It is often considered the most grammatically structured form of gestural communication. This makes SL recognition an ideal research field for developing methods that address problems such as human motion analysis, human-computer interaction (HCI), and user interface design, and it has received great attention in multimedia and computer vision. Continuous SL recognition, which this project studies, is concerned with learning unsegmented gestures in long video streams and is better suited to processing continuous gestural videos in real-world systems. Its training also does not require expensive annotation of the temporal boundaries of each gesture. Recognizing SL requires the simultaneous analysis and integration of gestural movements and appearance features across disparate body parts, and therefore likely calls for a multimodal approach.
Faster R-CNN is a popular object detection method, and our proposed method adopts it for gesture detection and tracking. We pre-train a Faster R-CNN on the VOC2012 person-layout dataset with two output units representing hands versus background. Subsequently, we randomly select 400 frames from our proposed CSL dataset and manually annotate the hand locations for fine-tuning. After that, each video is processed frame by frame for gesture detection.
The Faster R-CNN based detection can fail when the handshape varies greatly or the hand is occluded by clothes. To localize gestures in these frames, tracking with kernelized correlation filters (KCF) is used. The KCF model represents target regions as multi-scale compressive feature vectors and scores the proposal regions with a Bayes classifier. Because the Bayes classifier parameters are updated from the detected targets in each frame, the KCF model is robust to large appearance variations. Specifically, a KCF model is initialized from the successful object detections whenever the Faster R-CNN detection fails.
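The detection-with-tracking fallback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `localize_hands`, `detect`, `make_tracker`, and the toy `ShiftTracker` are hypothetical names introduced here. In a real system `detect` would wrap the fine-tuned Faster R-CNN and `make_tracker` would return a KCF tracker. Seeding the tracker from the most recent successful detection is also an assumption.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def localize_hands(
    frames: Sequence,
    detect: Callable[[object], Optional[Box]],
    make_tracker: Callable[[], object],
) -> List[Tuple[Optional[Box], str]]:
    """Return one (box, source) pair per frame.

    source is "detector", "tracker", or "none" (nothing could be localized).
    `detect` and `make_tracker` are hypothetical hooks, not a library API.
    """
    results: List[Tuple[Optional[Box], str]] = []
    last_frame, last_box = None, None  # most recent successful detection
    tracker = None

    for frame in frames:
        box = detect(frame)
        if box is not None:
            # Detection succeeded: remember it and drop any active tracker.
            last_frame, last_box = frame, box
            tracker = None
            results.append((box, "detector"))
            continue

        # Detection failed: start a tracker from the last good detection,
        # or keep using the tracker started on an earlier failed frame.
        if tracker is None and last_box is not None:
            tracker = make_tracker()
            tracker.init(last_frame, last_box)

        tracked = tracker.update(frame) if tracker is not None else None
        results.append((tracked, "tracker" if tracked is not None else "none"))

    return results


class ShiftTracker:
    """Toy stand-in for a real tracker: shifts the box 1 pixel right per frame."""

    def init(self, frame, box: Box) -> None:
        self.box = box

    def update(self, frame) -> Optional[Box]:
        x, y, w, h = self.box
        self.box = (x + 1, y, w, h)
        return self.box


def toy_detect(frame: int) -> Optional[Box]:
    """Toy detector that 'fails' on frames 2 and 3 (e.g. an occluded hand)."""
    return None if frame in (2, 3) else (10 * frame, 5, 20, 20)


demo_results = localize_hands([0, 1, 2, 3, 4], toy_detect, ShiftTracker)
```

In this toy run, frames 2 and 3 are filled in by the tracker, and frame 4 returns to the detector output once detection succeeds again.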
2. References
[1] Abeida, H.; Zhang, Q.; Li, J.; and Merabtine, N. 2013. Iterative sparse asymptotic minimum variance based approaches for array processing. Signal Processing, IEEE Transactions on 61(4):933-944.
[2] Biba, M., and Xhafa, F. 2011. Learning Structure and Schemas from Documents, volume 375. Springer.
[3] Bin, Y.; Yang, Y.; Shen, F.; Xu, X.; and Shen, H. T. 2016. Bidirectional long-short term memory for video description. In Proceedings of the ACM on Multimedia Conference, 436-440.
[4] Cai, X.; Zhou, W.; Wu, L.; Luo, J.; and Li, H. 2016. Effective active skeleton representation for low latency human action recognition. IEEE Transactions on Multimedia 18(2):141-154.
[5] Cui, R.; Liu, H.; and Zhang, C. 2017. Recurrent convolutional neural networks for continuous sign language recognition by staged optimization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 7361-7369.
[6] Dan Guo, Wengang Zhou, H. L., and Wang, M. 2017. Online early late fusion based on adaptive HMM for sign language recognition. In ACM Transactions on Multimedia Computing Communications and Applications.
[7] Dawod, A. Y.; Nordin, M. J.; and Abdullah, J. 2016. Gesture segmentation: automatic continuous sign language technique based on adaptive contrast stretching approach. Middle-East Journal of Scientific Research 24(2):347-352.