Malware detection based on semantic API dependency graph

doi:10.3969/j.issn.04906756.2020.03.012

2020, Vol. 57, No. 3, pp. 488-494

CLC number: TP391.1
Funding: National Key Research and Development Program (2017YFB0802900)



Abstract:

Traditional dynamic malware analysis relies mostly on sequence mining and graph matching. Sequence mining is susceptible to system call injection, while graph matching is limited by the complexity of subgraph matching. Moreover, these methods do not account for anti-detection behavior in samples, such as anti-virtual-machine checks, so their detection performance keeps degrading. This paper designs a physical-machine dynamic analysis method based on the program's semantic API dependency graph. API call sequences are extracted in a sandbox running on a real machine, so the extraction is not affected by anti-virtual-machine detection. Feature construction is based on the asymptotic equipartition property (AEP), a concept widely used in information theory. AEP is used to extract semantically rich API sequences; program behavior is then defined by the typical paths of the dependency graph of key API sequences, and the relevance of a path is defined by the average logarithmic branching factor of typical paths. The average logarithmic branching factor and histogram binning are used to construct the feature space. Finally, random forest, an ensemble learning algorithm, is used to classify malware. Experimental results show that the proposed method classifies malware effectively, with an accuracy of 97.1%.
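The feature construction summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact definitions, so the sketch assumes that the dependency graph is approximated by consecutive-call transitions, that the average logarithmic branching factor of a path is the mean of log2(out-degree) over its nodes, and that the histogram uses equal-width bins. All function names and the example traces are hypothetical.

```python
import math
from collections import defaultdict


def build_transition_graph(sequences):
    """Map each API to the set of APIs observed immediately after it.

    Assumption: consecutive calls stand in for the paper's dependency edges.
    """
    graph = defaultdict(set)
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            graph[src].add(dst)
    return graph


def avg_log_branching(path, graph):
    """Mean of log2(out-degree) over the path nodes that have successors."""
    logs = [math.log2(len(graph[node])) for node in path if graph.get(node)]
    return sum(logs) / len(logs) if logs else 0.0


def histogram_features(values, n_bins, lo, hi):
    """Normalized counts of values in n_bins equal-width bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # clamp values >= hi
        counts[max(idx, 0)] += 1                      # clamp values < lo
    total = len(values)
    return [c / total for c in counts] if total else [0.0] * n_bins


# Hypothetical API traces for one sample (illustrative only).
traces = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["CreateFileW", "ReadFile", "CloseHandle"],
    ["CreateFileW", "WriteFile", "WriteFile", "CloseHandle"],
]
graph = build_transition_graph(traces)
values = [avg_log_branching(t, graph) for t in traces]
features = histogram_features(values, n_bins=4, lo=0.0, hi=2.0)
print(values)
print(features)
```

Each sample then yields a fixed-length vector regardless of how many paths it has, which is what makes the result usable as input to a classifier such as a random forest (for example, scikit-learn's `RandomForestClassifier`).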


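Cite this article: ZHAO Cuirong (赵翠镕), FANG Yong (方勇), LIU Liang (刘亮), ZHANG Lei (张磊). Malware detection based on semantic API dependency graph [J]. Journal of Sichuan University (Natural Science Edition), 2020, 57: 488.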


History

Received: 2019-11-27
Last revised: 2019-12-31
Accepted: 2020-01-18
Published online: 2020-05-26
