A malware variant clustering method based on fuzzy hash

CLC Number: TP309.7

    Abstract:

    Internet security companies collect tens of millions of new malware variants each year, and VirusShare, an online malware repository, stores more than 27 million unlabeled malware samples. Clustering malware variants according to their behavior patterns not only makes new attacks easier to detect, but also helps analysts track malware trends in time and take corresponding preventive measures. This paper therefore proposes a malware variant clustering method that uses dynamic analysis to extract malware features (import and export function names, strings, system resource records, and system calls), converts these features into fuzzy hashes, and then clusters the malware samples with the DPC clustering algorithm. The number of clusters, precision, recall, F-score, and entropy serve as external criteria, and intra-cluster cohesion and inter-cluster separation serve as internal criteria. The experimental results show that, compared with the labels from Symantec and ESET-NOD32, the F-score obtained by the proposed method is higher by 11.632% and 2.41%, respectively, and the number of clusters it produces is the closest to that of the manually labeled data.
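
    The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes that "DPC" refers to density peaks clustering, and it substitutes Python's `difflib.SequenceMatcher` for a real fuzzy-hash comparison, so the distance function, the cutoff `d_c`, the function names, and the sample strings are all illustrative assumptions rather than the paper's implementation.

```python
from difflib import SequenceMatcher


def distance(a, b):
    # Stand-in for a fuzzy-hash comparison (assumption, not the paper's hash):
    # 0.0 means identical strings, 1.0 means nothing in common.
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def density_peaks_cluster(items, d_c, n_clusters):
    """Hypothetical helper: cluster items with a basic density-peaks scheme."""
    n = len(items)
    dist = [[distance(items[i], items[j]) for j in range(n)] for i in range(n)]

    # Local density rho: number of other items closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit items from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta: distance to the nearest item that ranks higher in density.
    # parent: that nearest higher-density item.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # densest item gets the largest distance
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers: the densest item plus the items with the largest rho * delta.
    top = order[0]
    ranked = sorted((i for i in range(n) if i != top),
                    key=lambda i: -(rho[i] * delta[i]))
    centers = [top] + ranked[:n_clusters - 1]

    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Every remaining item inherits the label of its higher-density parent.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Illustrative feature strings for two made-up families (not real samples).
samples = [
    "CreateFileA;WriteFile;RegSetValueExA",
    "CreateFileA;WriteFile;RegSetValueExW",
    "CreateFileW;WriteFile;RegSetValueExA",
    "socket;connect;send;recv;closesocket",
    "socket;connect;send;recv;shutdown",
    "socket;bind;send;recv;closesocket",
]
labels = density_peaks_cluster(samples, d_c=0.3, n_clusters=2)
print(labels)
```

    In this sketch the number of clusters is passed in explicitly; a common alternative in density peaks clustering is to pick centers by thresholding the density and distance values, which is one way a method could arrive at its own cluster count.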

Get Citation

Cite this article as: XIAO Jinqi, WANG Jun-Feng. A malware variant clustering method based on fuzzy hash [J]. J Sichuan Univ: Nat Sci Ed, 2018, 55: 469.

History
  • Received: August 11, 2017
  • Revised: November 27, 2017
  • Accepted: December 05, 2017
  • Online: June 06, 2018