Fractional-Order Improved OU Noise for Reinforcement Learning Environments with Inertia

doi:10.19907/j.0490-6756.2023.022001

Issue: 2023, Vol. 60, No. 2, Article 022001

Affiliation: College of Computer Science, Sichuan University
CLC number: TP39
Funding: Sichuan Science and Technology Program (2022YFQ0047)

An improved Ornstein-Uhlenbeck exploration noise based on fractional order calculus for reinforcement learning environments with momentum

Abstract:

This paper extends the integer-order Ornstein-Uhlenbeck (OU) noise model used in the deep deterministic policy gradient (DDPG) algorithm to a fractional-order OU noise model, in which each noise sample depends not only on the noise of the previous step but also on the noise generated over the previous K steps. DDPG and twin delayed deep deterministic policy gradient (TD3) agents using fractional-order OU noise were compared with the original DDPG and TD3 algorithms in Gym environments with inertia. Compared with the original OU noise, the fractional-order OU noise oscillates over a wider range, and the algorithms that use it explore better and converge faster in inertial environments.
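The abstract does not give the discretization used in the paper, so the following is only a minimal sketch of how a noise process with K-step memory could be built. It assumes a truncated Grünwald-Letnikov approximation of the fractional derivative; the class name `FractionalOUNoise`, the helper `gl_coefficients`, the default parameter values, and the `dt ** (alpha - 0.5)` scaling of the random term are all illustrative assumptions, chosen so that `alpha = 1` reduces to the usual Euler-discretized OU update.

```python
import numpy as np


def gl_coefficients(alpha: float, k: int) -> np.ndarray:
    """Grünwald-Letnikov weights c_0..c_k, where c_j = (-1)^j * binom(alpha, j)."""
    c = np.empty(k + 1)
    c[0] = 1.0
    for j in range(1, k + 1):
        # Standard recurrence for the signed generalized binomial coefficients.
        c[j] = c[j - 1] * (1.0 - (1.0 + alpha) / j)
    return c


class FractionalOUNoise:
    """Hypothetical fractional-order OU noise with a memory of k past samples.

    Assumption: this truncated Grünwald-Letnikov scheme is an illustration,
    not necessarily the formulation used in the paper.
    """

    def __init__(self, size, alpha=0.8, k=10, mu=0.0, theta=0.15,
                 sigma=0.2, dt=1e-2, seed=None):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.size, self.alpha, self.k = size, alpha, k
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.coeffs = gl_coefficients(alpha, k)
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Rows hold past deviations from mu; row 0 is the most recent one.
        self.history = np.zeros((self.k, self.size))

    def sample(self):
        y_prev = self.history[0]
        # Mean-reverting drift acting on the deviation from mu.
        drift = -self.theta * y_prev * self.dt ** self.alpha
        # Random term; the exponent is an assumption that recovers
        # sigma * sqrt(dt) when alpha == 1.
        diffusion = (self.sigma * self.dt ** (self.alpha - 0.5)
                     * self.rng.standard_normal(self.size))
        # Memory term: weighted sum over the previous k deviations.
        memory = -(self.coeffs[1:, None] * self.history).sum(axis=0)
        y_new = drift + diffusion + memory
        # Shift the history by one step and store the newest deviation.
        self.history = np.roll(self.history, 1, axis=0)
        self.history[0] = y_new
        return self.mu + y_new
```

With `alpha = 1` the weights beyond the first lag vanish, so only the previous sample contributes and the update matches ordinary OU noise. For a fractional `alpha`, all k stored samples enter the update, which is the K-step dependence the abstract describes. In a DDPG or TD3 loop, the value returned by `sample()` would be added to the actor's action before clipping.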

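Cite this article

Citation format: Wang Tao (王涛), Zhang Weihua (张卫华), Pu Yifei (蒲亦非). An improved Ornstein-Uhlenbeck exploration noise based on fractional order calculus for reinforcement learning environments with momentum [J]. Journal of Sichuan University (Natural Science Edition), 2023, 60: 022001.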

History

Received: 2022-03-26
Last revised: 2022-06-12
Accepted: 2022-06-14
Published online: 2023-03-29
