DIGILIB FISIPOL UGM - YOGYAKARTA INDONESIA :: Learning, Passion, Knowledge, Empathy, Social Value and Digital Access::ASNet: Auto-Augmented Siamese Neural Network for Action Recognition::-Technology Center for Human Social Computing-

dc.contributor.author	Yujia Zhang
dc.contributor.author	Lai-Man Po
dc.contributor.author	Jingjing Xiong
dc.contributor.author	Yasar Abbas Ur REHMAN
dc.contributor.author	Kwok-Wai Cheung
dc.contributor.other	Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
dc.contributor.other	Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
dc.contributor.other	Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China
dc.contributor.other	TCL Corporate Research Co. Limited, Hong Kong, China
dc.contributor.other	School of Communication, The Hang Seng University of Hong Kong, Hong Kong, China
dc.date.accessioned	2025-10-09T04:58:48Z
dc.date.available	2025-10-09T04:58:48Z
dc.date.issued	01-07-2021
dc.identifier.uri	https://www.mdpi.com/1424-8220/21/14/4720
dc.identifier.uri	http://digilib.fisipol.ugm.ac.id/repo/handle/15717717/40860
dc.description.abstract	Human action recognition methods in videos based on deep convolutional neural networks usually use random cropping or its variants for data augmentation. However, this traditional data augmentation approach may generate many non-informative samples (video patches covering only a small part of the foreground or only the background) that are not related to a specific action. These samples can be regarded as noisy samples with incorrect labels, which reduce the overall action recognition performance. In this paper, we attempt to mitigate the impact of noisy samples by proposing an Auto-augmented Siamese Neural Network (ASNet). In this framework, we propose backpropagating salient patches and randomly cropped samples in the same iteration to perform gradient compensation, alleviating the adverse gradient effects of non-informative samples. Salient patches are samples containing critical information for human action recognition. The generation of salient patches is formulated as a Markov decision process, and a reinforcement learning agent called SPA (Salient Patch Agent) is introduced to extract patches in a weakly supervised manner without extra labels. Extensive experiments were conducted on two well-known datasets, UCF-101 and HMDB-51, to verify the effectiveness of the proposed SPA and ASNet.
dc.language.iso	EN
dc.publisher	MDPI AG
dc.subject.lcc	Chemical technology
dc.title	ASNet: Auto-Augmented Siamese Neural Network for Action Recognition
dc.type	Article
dc.description.keywords	action recognition
dc.description.keywords	3D-CNN
dc.description.keywords	deep reinforcement learning
dc.description.keywords	data augmentation
dc.description.doi	10.3390/s21144720
dc.title.journal	Sensors
dc.identifier.e-issn	1424-8220
dc.identifier.oai	oai:doaj.org/journal:b25f04f1d387428caba56ce4a563a710
dc.journal.info	Volume 21, Issue 14

This item appears in the following Collection(s)

doaj
