Selecting active frames for action recognition with 3D convolutional network
VNU-UET Repository: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3123 (deposited 2018-10-29)

Convolutional Neural Networks, especially 3-Dimensional Convolutional Neural Networks (3DCNNs), have recently been widely applied to human action recognition (HAR) in videos. In this paper, we
use a multi-stream framework that combines
separate networks, each taking a different kind of input
generated from a single video dataset. To achieve
high accuracy, we first propose a method to extract
active frames (called Selected Active Frames,
SAF) from a video to build datasets for 3DCNNs in
the video classification problem. Second, we deploy a new
approach called Vote fusion, which serves as an
effective fusion method for ensembling multi-stream
networks. From the various datasets generated from
videos, we extract frames by our method and feed
into 3DCNNs for feature extraction, then we carry out
training and then fuse the results of softmax layers
of these streams. We evaluate the proposed methods
on solving action recognition problem. These method
are carried on three well-known datasets (HMFB51,
UCF101, and KTH). The results are also compared to
the state-of-the-art results to illustrate the efficiency
and effectiveness in our approachTieu Binh Hoangbinhhoangtieu@gmail.comThi Chau Machaumt@vnu.edu.vnSugimoto AkihiroThe Duy Buiduybt@vnu.edu.vn
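The Vote fusion step described above (fusing the softmax outputs of several streams) can be sketched as a majority vote over each stream's predicted class. The abstract does not specify the exact voting rule, so the tie-breaking by summed softmax probability below is an assumption for illustration only:

```python
import numpy as np

def vote_fusion(stream_softmax):
    """Majority vote over per-stream softmax predictions.

    stream_softmax: list of (num_classes,) probability vectors,
    one per network stream. The tie-break rule (summed softmax
    mass over the tied classes) is a hypothetical choice, not
    taken from the paper.
    """
    probs = np.stack(stream_softmax)            # (num_streams, num_classes)
    votes = np.argmax(probs, axis=1)            # each stream's predicted class
    counts = np.bincount(votes, minlength=probs.shape[1])
    best = np.flatnonzero(counts == counts.max())
    if len(best) == 1:
        return int(best[0])
    # break ties by the total probability each tied class received
    return int(best[np.argmax(probs[:, best].sum(axis=0))])
```

For example, with three streams voting over three classes, two streams predicting class 0 outvote one stream predicting class 1.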