Qingdao University of Science and Technology Homepage Platform Management System, 渠连恩, Chinese Homepage: Human Action Recognition Based on 3D Convolution and Multi-Attention Transformer

Publications


Human Action Recognition Based on 3D Convolution and Multi-Attention Transformer

Published: 2025-07-14

Keywords: NETWORK
Abstract: To address the limitations of traditional two-stream networks, such as inadequate spatiotemporal information fusion, limited feature diversity, and insufficient accuracy, we propose an improved two-stream network for human action recognition based on the fusion of a multi-scale attention Transformer and a 3D convolutional (C3D) network. In the temporal stream, the traditional 2D convolutional network is replaced with a C3D network to capture temporal dynamics together with spatial features. In the spatial stream, a multi-scale convolutional Transformer encoder is introduced to extract features. Using the multi-scale attention mechanism, the model captures and enhances features at several scales, which are then adaptively fused with a weighted strategy to improve the feature representation. Furthermore, extensive experiments on feature fusion methods identify the optimal fusion strategy for the two-stream network. Experimental results on the benchmark datasets UCF101 and HMDB51 demonstrate that the proposed model achieves superior performance in action recognition tasks.
Volume: 15
Issue: 5
Translated work: No
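
The abstract mentions two fusion steps without giving formulas: an adaptive weighted fusion of multi-scale features, and a fusion of the spatial and temporal streams. The sketch below is only one plausible reading, not the authors' implementation. It assumes the scale weights are softmax-normalized scalars and that the streams are combined by late fusion of class scores; the names `fuse_scales`, `fuse_streams`, `scale_logits`, and `alpha` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scales(features, scale_logits):
    """Weighted sum of per-scale feature vectors.

    Assumption: one scalar logit per scale, normalized with softmax so the
    weights are positive and sum to 1. In a trained model these logits
    would be learned parameters.
    """
    if len(features) != len(scale_logits):
        raise ValueError("need exactly one logit per scale")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all scales must share one feature dimension")
    weights = softmax(scale_logits)
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(dim)
    ]


def fuse_streams(spatial_scores, temporal_scores, alpha=0.5):
    """Late fusion: convex combination of the two streams' class scores.

    Assumption: this is one candidate strategy; the abstract does not say
    which strategy the experiments selected.
    """
    if len(spatial_scores) != len(temporal_scores):
        raise ValueError("streams must score the same set of classes")
    return [
        alpha * s + (1.0 - alpha) * t
        for s, t in zip(spatial_scores, temporal_scores)
    ]
```

With equal logits, `fuse_scales` reduces to a plain average over scales, and raising one logit shifts the fused feature toward that scale. Setting `alpha` to 1 or 0 in `fuse_streams` recovers the spatial or temporal stream alone.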


