Li G Y, Li C G, Wang W J, et al. Research on multi-feature human pose model recognition based on one-shot learning[J]. Opto-Electron Eng, 2021, 48(2): 200099. doi: 10.12086/oee.2021.200099
Citation: Li G Y, Li C G, Wang W J, et al. Research on multi-feature human pose model recognition based on one-shot learning[J]. Opto-Electron Eng, 2021, 48(2): 200099. doi: 10.12086/oee.2021.200099

Research on multi-feature human pose model recognition based on one-shot learning

    Fund Project: Youth Fund for Science and Technology Research in Colleges and Universities of Hebei Province (2011139) and Natural Science Foundation of Hebei Province (F2012203111)
More Information
  • With the development of human-computer interaction, virtual reality, and other related fields, human posture recognition has become a hot research topic. Since the human body belongs to a non-rigid model and has time-varying characteristics, the accuracy and robustness of recognition are not ideal. Based on the KinectV2 somatosensory camera to collect skeletal information, this paper proposes a one-shot learning model matching method based on human body angle and distance characteristics. First, feature extraction is performed on the collected bone information, and the joint point vector angle and joint point displacement are calculated and a threshold is set. Secondly, the pose to be measured is matched with the template pose, and the recognition is successful if the threshold limit is met. Experimental results show that the method can detect and recognize human poses within the defined threshold in real-time, which improves the accuracy and robustness of recognition.
  • 加载中
  • [1] Henry P, Krainin M, Herbst E, et al. RGB-D mapping: using Kinect-style depth cameras for dense 3D modeling of indoor environments[J]. Int J Rob Res, 2012, 31(5): 647-663. doi: 10.1177/0278364911434148

    CrossRef Google Scholar

    [2] 黄国范, 李亚. 人体动作姿态识别综述[J]. 电脑知识与技术, 2013, 12(1): 133-135.

    Google Scholar

    Huang G F, Li Y. A survey of human action and pose recognition[J]. Comput Knowl Technol, 2013, 12(1): 133-135.

    Google Scholar

    [3] Shotton J, Sharp T, Kipman A, et al. Real-time human pose recognition in parts from single depth images[J]. Commun ACM, 2013, 56(1): 116-124. doi: 10.1145/2398356.2398381

    CrossRef Google Scholar

    [4] 严利民, 杜斌, 郭强, 等. 基于局部扫描法对倾斜指势的识别[J]. 光电工程, 2016, 43(12): 147-153. doi: 10.3969/j.issn.1003-501X.2016.12.023

    CrossRef Google Scholar

    Yan L M, Du B, Guo Q, et al. Recognize the tilt fingertips by partial scan algorithm[J]. Opto-Electron Eng, 2016, 43(12): 147-153. doi: 10.3969/j.issn.1003-501X.2016.12.023

    CrossRef Google Scholar

    [5] 李红波, 丁林建, 吴渝, 等. 基于Kinect骨骼数据的静态三维手势识别[J]. 计算机应用与软件, 2015, 14(9): 161-165. doi: 10.3969/j.issn.1000-386x.2015.09.039

    CrossRef Google Scholar

    Li H P, Ding L J, Wu Y, et al. Static three-dimensional gesture recognition based on Kinect skeleton data[J]. Comput Appl Softw, 2015, 14(9): 161-165. doi: 10.3969/j.issn.1000-386x.2015.09.039

    CrossRef Google Scholar

    [6] Cao Z, Simon T, Wei S E, et al. Realtime multi-person 2D pose estimation using part affinity fields[C]//Conference on Computer Vision and Pattern Recognition(CVPR), 2017: 7291-7299.

    Google Scholar

    [7] Pfitscher M, Welfer D, Evaristo José Do Nascimento, et al. Article Users Activity Gesture Recognition on Kinect Sensor Using Convolutional Neural Networks and FastDTW for Controlling Movements of a Mobile Robot[J]. Inteligencia Artificial Revista Iberoamericana de Inteligencia Artificial, 2019, 22(63): 121-134.

    Google Scholar

    [8] 赵海勇. 基于视频流的运动人体行为识别研究[D]. 西安: 西安电子科技大学, 2011.

    Google Scholar

    Zhao H Y. Research of human action recognition based on video stream[D]. Xi'an: Xidian University, 2011.

    Google Scholar

    [9] Xu W Y, Wu M Q, Zhao M, et al. Human action recognition using multilevel depth motion maps[J]. IEEE Access, 2019, 7: 41811-41822. doi: 10.1109/ACCESS.2019.2907720

    CrossRef Google Scholar

    [10] Vinyals O, Blundell C, Lillicrap T, et al. Matching networks for one shot learning[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems, 2016: 3637-3645.

    Google Scholar

    [11] 朱大勇, 郭星, 吴建国. 基于Kinect三维骨骼节点的动作识别方法[J]. 计算机工程与应用, 2018, 54(20): 152-158. doi: 10.3778/j.issn.1002-8331.1706-0285

    CrossRef Google Scholar

    Zhu D Y, Guo X, Wu J G. Action recognition method using Kinect 3D skeleton data[J]. Comput Eng Appl, 2018, 54(20): 152-158. doi: 10.3778/j.issn.1002-8331.1706-0285

    CrossRef Google Scholar

    [12] 蔡兴泉, 涂宇欣, 余雨婕, 等. 基于少量关键序列帧的人体姿态识别方法[J]. 图学学报, 2019, 40(3): 532-538.

    Google Scholar

    Cai X Q, Tu Y X, Yu Y J, et al. Human posture recognition method based on few key frames sequence[J]. J Graph, 2019, 40(3): 532-538.

    Google Scholar

    [13] 钱银中, 沈一帆. 姿态特征与深度特征在图像动作识别中的混合应用[J]. 自动化学报, 2019, 45(3): 626-636.

    Google Scholar

    Qian Y Z, Shen Y F. Hybrid of pose feature and depth feature for action recognition in static image[J]. Acta Autom Sin, 2019, 45(3): 626-636.

    Google Scholar

    [14] Monir S, Rubya S, Ferdous H S. Rotation and scale invariant posture recognition using Microsoft Kinect skeletal tracking feature[C]//2012 12th International Conference on Intelligent Systems Design and Applications (ISDA), 2012: 829-842.

    Google Scholar

    [15] 郭同欢, 陈姚节, 林玲. 基于姿态角的双Kinect数据融合技术及应用[J]. 科学技术与工程, 2019, 19(29): 172-178. doi: 10.3969/j.issn.1671-1815.2019.29.028

    CrossRef Google Scholar

    Guo T H, Chen Y J, Lin L. Gesture recognition based on the gesture angle of dual Kinect[J]. Sci Technol Eng, 2019, 19(29): 172-178. doi: 10.3969/j.issn.1671-1815.2019.29.028

    CrossRef Google Scholar

    [16] Wang P, Liu L Q, Shen C H, et al. Multi-attention network for one shot learning[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 2721-2729.

    Google Scholar

    [17] Yang Y, Saleemi I, Shah M. Discovering motion primitives for unsupervised grouping and one-shot learning of human actions, gestures, and expressions[J]. IEEE Trans Pattern Anal Mach Intell, 2013, 35(7): 1635-1648. doi: 10.1109/TPAMI.2012.253

    CrossRef Google Scholar

    [18] Sundermeyer M, Alkhouli T, Wuebker J, et al. Translation modeling with bidirectional recurrent neural networks[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014: 14-25.

    Google Scholar

    [19] Cheng X L, He M Y, Duan W J. Machine vision based physical fitness measurement with human posture recognition and skeletal data smoothing[C]//2017 International Conference on Orange Technologies (ICOT), 2017: 323-342.

    Google Scholar

    [20] Li Q M, Lin W X, Li J. Human activity recognition using dynamic representation and matching of skeleton feature sequences from RGB-D images[J]. Signal Process Image Commun, 2018, 68: 265-272. doi: 10.1016/j.image.2018.06.013

    CrossRef Google Scholar

    [21] Zhang Z Q, Liu Y N, Li A, et al. A novel method for user-defined human posture recognition using Kinect[C]//2014 7th International Congress on Image and Signal Processing, 2014: 724-739.

    Google Scholar

    [22] Sagayam K M, Hemanth D J. Hand posture and gesture recognition techniques for virtual reality applications: a survey[J]. Virtual Real, 2017, 21(2): 91-107. doi: 10.1007/s10055-016-0301-0

    CrossRef Google Scholar

    [23] Stephenson R M, Chai R, Eager D. Isometric finger pose recognition with sparse channel SpatioTemporal EMG imaging[J]. Annu Int Conf IEEE Eng Med Biol Soc, 2018, 2018: 5232-5235.

    Google Scholar

    [24] Vishwakarma D K, Singh T. A visual cognizance based multi-resolution descriptor for human action recognition using key pose[J]. AEU-Int J Electron Commun, 2019, 107: 157-169. doi: 10.1016/j.aeue.2019.05.023

    CrossRef Google Scholar

    [25] Liu X X, Feng X Y, Pan S J, et al. Skeleton tracking based on Kinect camera and the application in virtual reality system[C]//Proceedings of the 4th International Conference on Virtual Reality, 2018: 21-25.

    Google Scholar

    [26] Bulbul M F, Islam S, Ail H. 3D human action analysis and recognition through GLAC descriptor on 2D motion and static posture images[J]. Multimed Tools Appl, 2019, 78(15): 21085-21111. doi: 10.1007/s11042-019-7365-2

    CrossRef Google Scholar

    [27] Agahian S, Negin F, Köse C. An efficient human action recognition framework with pose-based spatiotemporal features[J]. Eng Sci Technol, 2020, 23(1): 196-203.

    Google Scholar

  • Overview: With the development of human-computer interaction, virtual reality, and other related fields, human pose recognition has become a hot research topic. Since the human body belongs to a non-rigid model and has time-varying characteristics, the accuracy and robustness of recognition are not ideal. Based on the KinectV2 somatosensory camera to collect skeletal information, this paper proposes a one-shot learning model matching method based on human body angle and distance characteristics. KinectV2 somatosensory camera can capture color images, depth images, and skeletal images. This article combines color images and skeletal images to extract and recognize human poses, thereby making the recognition more accurate, real-time, and robust. Through the feature extraction of the obtained bone information, the feature vector is constructed using the vector in mathematical thought, that is, the endpoint of the joint point minus the joint point of the starting point. The joint angle in this article uses the elbow joint as the fulcrum, shoulder joint, and the hand joint is constructed as the end-point. In this way, multiple types of poses can be combined, and the attitude sample library can be expanded. The joint point vector angle and the joint point displacement are calculated, and the threshold is set. The angle features and distance features are selected to make the feature points more prominent. The posture is easy to identify in complex situations. The final posture to be measured is matched with the template posture. The idea of matching calculation is to calculate the real-time posture characteristics, and compare the calculation results with the set type data and thresholds successfully. The significance of the model matching method using one-shot learning small sample learning is that small sample learning does not require a large number of samples and only a small number of samples for training. The advantages of a small number of samples are as follow: 1) The pose features can be accurately located during training; 2) Irrelevant data is discarded; 3) The calculation amount is reduced; 4) The recognition speed and accuracy are improved in the attitude matching process; 5) The anti-interference ability is stronger. The design of this paper is applied to option selection. Gesture recognition replaces the physical device and makes a defined gesture to complete the selection of options. Experimental results show that the method can detect and recognize human poses within the defined threshold in real-time, which improves the accuracy and robustness of recognition.

  • 加载中
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

Figures(9)

Tables(2)

Article Metrics

Article views(7039) PDF downloads(1459) Cited by(0)

Access History
Article Contents

Catalog

    /

    DownLoad:  Full-Size Img  PowerPoint