eprintid: 3763
rev_number: 8
eprint_status: archive
userid: 274
dir: disk0/00/00/37/63
datestamp: 2019-12-09 09:16:51
lastmod: 2019-12-09 09:16:51
status_changed: 2019-12-09 09:16:51
type: conference_item
metadata_visibility: show
creators_name: Tran, Nghi Phu
creators_name: Le, Huy Hoang
creators_name: Nguyen, Ngoc Toan
creators_name: Nguyen, Dai Tho
creators_name: Nguyen, Ngoc Binh
creators_id: tnphvan@gmail.com
creators_id: hoangle.hvan@gmail.com
creators_id: ngoctoan.hvan@gmail.com
creators_id: nguyendaitho@vnu.edu.vn
creators_id: nn_binh@kcg.edu
corp_creators: VNU University of Engineering and Technology
corp_creators: People’s Security Academy
corp_creators: Kyoto College of Graduate Studies for Informatics
title: CFDVex: A Novel Feature Extraction Method for Detecting Cross-Architecture IoT Malware
ispublished: pub
subjects: IT
divisions: fac_fit
abstract: The widespread adoption of Internet of Things (IoT) devices built on different architectures has given rise to multi-architecture malware designed for mass compromise. Cross-architecture malware detection plays an important role in detecting malware early on devices that use new or uncommon architectures. Prior knowledge of malware detection on traditional architectures can be reused for the same task on new and uncommon ones. Based on CFD and the Vex intermediate representation, we propose a feature selection method for detecting cross-architecture malware, called CFDVex. Experimental evaluation of the proposed approach on our large IoT dataset achieved good results for cross-architecture malware detection: with an SVM model trained only on Intel 80386 samples, our method detected IoT malware in MIPS samples with 95.72% accuracy and a 2.81% false positive rate.
date: 2019-12
date_type: published
official_url: https://soict.org/
full_text_status: public
pres_type: paper
pagerange: 248-254
event_title: 10th International Symposium on Information and Communication Technology (SoICT 2019)
event_location: Ha Noi - Ha Long
event_dates: December 4 – 6, 2019
event_type: conference
refereed: TRUE
referencetext: [1] Kaspersky IoT Lab Report. New IoT malware grew three fold in H1 2018. [Online]. Available: https://www.kaspersky.com/about/press-releases/2018_new-iotmalware-grew-three-fold-in-h1-2018. [Accessed: 02-Sep-2019].
[2] Yin Minn Pa Pa, Shogo Suzuki, Katsunari Yoshioka, Tsutomu Matsumoto,
Takahiro Kasama, and Christian Rossow. IoTPOT: Analysing the Rise of IoT Compromises. In Proceedings of the 9th USENIX Conference on Offensive Technologies,
9–19. WOOT’15. Berkeley, CA, USA: USENIX Association, 2015.
[3] Alhanahnah, Mohannad, Qicheng Lin, Qiben Yan, Ning Zhang, and Zhenxiang
Chen. Efficient Signature Generation for Classifying Cross-Architecture IoT Malware.
2018 IEEE Conference on Communications and Network Security (CNS), 1–9,
2018.
[4] N. Idika, A.P. Mathur. A Survey of Malware Detection Techniques. Technical Report,
Purdue University, 2007.
[5] Evanson Mwangi Karanja, Shedden Masupe, Jeffrey Mandu. Internet of Things
Malware: A Survey. IJCSES, vol. 8, No.3, 2017.
[6] Xuxian Jiang, Xinyuan Wang, Dongyan Xu. Stealthy malware detection and
monitoring through VMM-based out-of-the-box semantic view reconstruction. ACM
Transactions on Information and System Security (TISSEC), Volume 13 Issue 2,
February 2010.
[7] Shahid Alam, R. Nigel Horspool, and Issa Traore. MAIL: Malware Analysis Intermediate Language: A Step Towards Automating and Optimizing Malware Detection.
In Proceedings of the 6th International Conference on Security of Information
and Networks, 233–240. SIN ’13. New York, NY, USA: ACM, 2013.
[8] Ralf Huuck. IoT: The internet of threats and static program analysis defense. Embedded World 2015, Exhibition & Conferences, pp. 493–495.
[9] Rafiqul Islam, Ronghua Tian and Lynn Batten. Classification of Malware Based
on String and Function Feature Selection. Second Cybercrime and Trustworthy
Computing Workshop, 2010.
[10] Huy Trung Nguyen, Quoc Dung Ngo, and Van Hoang Le. IoT Botnet Detection
Approach Based on PSI Graph and DGCNN Classifier. In 2018 IEEE International
Conference on Information Communication and Signal Processing (ICICSP),
118–122, 2018.
[11] Igor Santos, Felix Brezo, Xabier Ugarte-Pedrero and Pablo Garcia Bringas. Opcode Sequences as Representation of Executables for Data-Mining-Based Unknown
Malware Detection. Information Sciences, Data Mining for Information Security,
231 (May 10, 2013): 64–82.
[12] Yuxin Ding, Wei Dai, Shengli Yan and Yumei Zhang. Control Flow-Based Opcode
Behavior Analysis for Malware Detection. Computers & Security 44 (July 1, 2014):
65–74.
[13] Soomin Kim, Markus Faerevaag, Minkyu Jung, SeungIl Jung, DongYeop Oh,
JongHyup Lee, and Sang Kil Cha. Testing Intermediate Representations for Binary
Analysis. In Proceedings of the 32nd IEEE/ACM International Conference on
Automated Software Engineering, 353–364. ASE 2017. Piscataway, NJ, USA: IEEE
Press, 2017.
[14] Alexander Sepp, Bogdan Mihaila, and Axel Simon. Precise Static Analysis of
Binaries by Extracting Relational Information. In 18th Working Conference on
Reverse Engineering, 357–366. Limerick, Ireland: IEEE, 2011.
[15] N. Nethercote and J. Seward. Valgrind: a framework for heavyweight dynamic
binary instrumentation. SIGPLAN Not., 42(6):89-100, June 2007.
[16] Intermediate Representation in Angr. [Online]. Available: https://docs.angr.io/advancedtopics/ir
[17] D. Song, D. Brumley, H. Yin, J. Caballero, I. Jager, M. G. Kang, Z. Liang, J. Newsome,
P. Poosankam, and P. Saxena. Bitblaze: A new approach to computer security via
binary analysis. In ICISS ’08, pages 1-25, Berlin, Heidelberg, 2008. Springer-Verlag.
[18] H. Yin and D. Song. Privacy-Breaching Behavior Analysis. In Automatic Malware
Analysis. Springer Briefs in Computer Science, pages 27-42. Springer New York,
2013.
[19] Yan Shoshitaishvili, Ruoyu Wang, Christopher Salls, Nick Stephens, Mario Polino,
Audrey Dutcher, John Grosen, Siji Feng, Christophe Hauser, Christopher Kruegel
and Giovanni Vigna. State of The Art of War: Offensive Techniques in Binary
Analysis, IEEE Symposium on Security and Privacy (SP), 2016.
[20] Frequently Asked Questions. [Online]. Available: https://docs.angr.io/introductoryerrata/faq. [Accessed: 16-Jun-2019].
[21] Daniel Bilar. Opcodes as Predictor for Malware. International Journal of Electronic
Security and Digital Forensics 1, no. 2 (2007): 156.
[22] Robert Moskovitch, Clint Feher, Nir Tzachar, Eugene Berger, Marina Gitelman,
Shlomi Dolev and Yuval Elovici. Unknown Malcode Detection Using OPCODE
Representation. In Intelligence and Security Informatics, 204–215. Lecture Notes
in Computer Science. Springer Berlin Heidelberg, 2008.
[23] Igor Santos, Felix Brezo, Javier Nieves, Yoseba K. Penya, Borja Sanz, Carlos Laorden, and Pablo Garcia Bringas. Idea: Opcode-Sequence-Based Malware Detection.
In Engineering Secure Software and Systems, Second International Symposium,
ESSoS 2010, Pisa, Italy, (pp.35-43), 2010.
[24] Tran Nghi Phu, Nguyen Ngoc Toan, Le Hoang, Nguyen Dai Tho, Nguyen Ngoc
Binh. C500-CFG: A Novel Algorithm to Extract Control Flow-based Features for
IoT Malware Detection. 19th International Symposium on Communications and
Information Technologies (ISCIT), 2019, Hochiminh, Vietnam.
[25] Shunichi Amari, Si Wu. Improving support vector machine classifiers by modifying
kernel functions. Neural Netw 1999;12:783-789.
[26] Andrei Costin, Jonas Zaddach, Aurélien Francillon and Davide Balzarotti. A large-scale analysis of the security of embedded firmwares. In Proceedings of the 23rd
USENIX Security Symposium, 2014, pp.95-110.
[27] Yin Minn Pa Pa, Shogo Suzuki, Katsunari Yoshioka, Tsutomu Matsumoto,
Takahiro Kasama, and Christian Rossow. IoTPOT: A Novel Honeypot for Revealing
Current IoT Threats. Journal of Information Processing 24, no. 3 (2016): 522–533.
[28] Detux. [Online]. Available: https://github.com/detuxsandbox/detux
[29] David Brash. Recent Additions to the ARMv7-A Architecture. In 2010 IEEE International Conference on Computer Design, 2010.
[30] Vex IR Document. https://github.com/angr/vex/blob/master/pub/libvex_ir.h
[31] Hiroshi Ogura, Hiromi Amano and Masato Kondo. Feature Selection with a Measure of Deviations from Poisson in Text Categorization. Expert Systems with Applications 36, no. 3, Part 2 (April 1, 2009): 6826–6832.
[32] Y. Yang and J. O. Pedersen, A comparative study on feature selection in text categorization. Proceedings of the 14th International Conference on Machine Learning
(ICML ’97), p. 412-420, 1997.
[33] VirusShare. [Online]. Available: https://virusshare.com/
[34] VirusTotal. [Online]. Available: https://virustotal.com/
[35] Tran Nghi Phu, Nguyen Ngoc Binh, Ngo Quoc Dung, and Le Van Hoang. Towards Malware Detection in Routers with C500-Toolkit. In 2017 5th International
Conference on Information and Communication Technology (ICoIC7), 1–5, 2017.
[36] Christopher Kruegel and Yan Shoshitaishvili. Using static binary analysis to find
vulnerabilities and backdoors in firmware. In: Black Hat USA, 2015.
citation:   Tran, Nghi Phu and Le, Huy Hoang and Nguyen, Ngoc Toan and Nguyen, Dai Tho and Nguyen, Ngoc Binh  (2019) CFDVex: A Novel Feature Extraction Method for Detecting Cross-Architecture IoT Malware.  In: 10th International Symposium on Information and Communication Technology (SoICT 2019), December 4 – 6, 2019, Ha Noi - Ha Long.     
document_url: https://eprints.uet.vnu.edu.vn/eprints/id/eprint/3763/1/p248-tran-nghi-phu.pdf