Abstract: Existing Trojan detection methods based on anomalous network-flow behavior suffer from unrepresentative features and from poor detection performance caused by information redundancy. This paper proposes a feature selection method to address these problems. First, to obtain a sufficiently rich feature set, features are derived from the relevant attributes by analyzing the communication behavior of Trojans. Then, to measure both the importance of each feature and the correlation between features, an improved evaluation coefficient of feature importance and a joint evaluation coefficient based on correlation entropy are proposed, and a feature selection algorithm based on sequential backward selection is designed to obtain a feature subset of adaptive size. In each iteration, the evaluation coefficient of every feature is calculated, and selection is performed by sorting the features by this coefficient. To verify its validity, the proposed method was compared with the fast correlation-based filter (FCBF) and information gain (IG) methods using naive Bayes and SVM classifiers. Compared with FCBF, the recall rate of the two classifiers increased by 3.76% and 1.64%, and the F1 score increased by 1.04 and 0.99, respectively. Compared with IG, the recall rate increased by 6.46% and 4.96%, and the F1 score increased by 3.56 and 3.18, respectively. These comparative experiments show that the proposed feature selection algorithm effectively selects features of each attribute from Trojan traffic, and that the optimized feature subset overcomes the influence of correlation between features. The method thus reduces the feature dimension while improving Trojan traffic detection.
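
As an illustration only, the general shape of an entropy-based sequential backward selection can be sketched as follows. The abstract does not define the two evaluation coefficients, so this sketch substitutes a hypothetical score: symmetric uncertainty with the class label (relevance) minus the mean symmetric uncertainty with the other retained features (redundancy). The function names, the score, and the `threshold` stopping rule are all assumptions for illustration and are not the coefficients proposed in the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(xs, ys):
    """Normalised mutual information in [0, 1]: 2 * I(X; Y) / (H(X) + H(Y))."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(xs, ys)))
    return 2.0 * mutual_info / (hx + hy)


def score(name, subset, columns, labels):
    """Hypothetical stand-in score: relevance minus mean redundancy."""
    relevance = symmetric_uncertainty(columns[name], labels)
    others = [g for g in subset if g != name]
    if not others:
        return relevance
    redundancy = sum(
        symmetric_uncertainty(columns[name], columns[g]) for g in others
    ) / len(others)
    return relevance - redundancy


def backward_select(columns, labels, threshold):
    """Start from all features and repeatedly drop the lowest-scoring one.

    Scores are recomputed on every iteration. The loop stops when the
    worst remaining feature scores at least `threshold` (an assumed
    stopping rule) or when a single feature is left, so the size of the
    returned subset is not fixed in advance.
    """
    subset = list(columns)
    while len(subset) > 1:
        ranked = sorted(subset, key=lambda f: score(f, subset, columns, labels))
        worst = ranked[0]
        if score(worst, subset, columns, labels) >= threshold:
            break
        subset.remove(worst)
    return subset


# Toy data: "a" and "b" are identical copies of the label, "c" is unrelated.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 0, 0, 1, 1, 1, 1],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
}
selected = backward_select(columns, labels, threshold=0.25)
print(selected)
```

On this toy data the sketch discards the unrelated column and one of the two duplicated columns, which mirrors the stated goal of removing both unrepresentative and redundant features. Real flow features are typically continuous and would need to be discretized before entropy estimates of this kind apply.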