Demirkıran, Ferhat

Name Variants
Demirkıran, Ferhat
F.,Demirkıran
F. Demirkıran
Ferhat, Demirkıran
Demirkiran, Ferhat
F.,Demirkiran
F. Demirkiran
Ferhat, Demirkiran
Demirkiran,F.
Demirkiran, F.
Job Title
Araş. Gör.
Email Address
Ferhat.demırkıran@khas.edu.tr
Scholarly Output: 6
Articles: 1
Citation Count: 0
Supervised Theses: 1

Scholarly Output Search Results

  • Article
    Citation Count: 16
    An ensemble of pre-trained transformer models for imbalanced multiclass malware classification
    (Elsevier Advanced Technology, 2022) Demirkiran, Ferhat; Cayir, Aykut; Unal, Ugur; Dag, Hasan
    Classification of malware families is crucial for a comprehensive understanding of how they can infect devices, computers, or systems. Hence, malware identification enables security researchers and incident responders to take precautions against malware and accelerate mitigation. API call sequences made by malware are widely utilized features by machine and deep learning models for malware classification as these sequences represent the behavior of malware. However, traditional machine and deep learning models remain incapable of capturing sequence relationships among API calls. Unlike traditional machine and deep learning models, the transformer-based models process the sequences in whole and learn relationships among API calls due to multi-head attention mechanisms and positional embeddings. Our experiments demonstrate that the Transformer model with one transformer block layer surpasses the performance of the widely used base architecture, LSTM. Moreover, BERT or CANINE, the pre-trained transformer models, outperform in classifying highly imbalanced malware families according to evaluation metrics: F1-score and AUC score. Furthermore, our proposed bagging-based random transformer forest (RTF) model, an ensemble of BERT or CANINE, reaches state-of-the-art evaluation scores on three out of four datasets; specifically, it achieves a state-of-the-art F1-score of 0.6149 on one of the commonly used benchmark datasets. © 2022 Elsevier Ltd. All rights reserved.
  • Master Thesis
    Citation Count: 16
    An Ensemble of Pre-Trained Transformer Models for Imbalanced Multiclass Malware Classification
    (Kadir Has Üniversitesi, 2022) Demirkıran, Ferhat; Dağ, Hasan
    Classification of malware families is crucial for a comprehensive understanding of how they can infect devices, computers, or systems. Hence, malware identification enables security researchers and incident responders to take precautions against malware and accelerate mitigation. API call sequences made by malware are widely utilized features by machine and deep learning models for malware classification as these sequences represent the behavior of malware. However, traditional machine and deep learning models remain incapable of capturing sequence relationships among API calls. Unlike traditional machine and deep learning models, the transformer-based models process the sequences in whole and learn relationships among API calls due to multi-head attention mechanisms and positional embeddings. Our experiments demonstrate that the transformer model with one transformer block layer surpasses the performance of the widely used base architecture, LSTM. Moreover, BERT or CANINE, the pre-trained transformer models, outperform in classifying highly imbalanced malware families according to evaluation metrics: F1-score and AUC score. Furthermore, our proposed bagging-based random transformer forest (RTF) model, an ensemble of BERT or CANINE, reaches state-of-the-art evaluation scores on three out of four datasets; specifically, it achieves a state-of-the-art F1-score of 0.6149 on one of the commonly used benchmark datasets.
  • Conference Object
    Citation Count: 3
    Website Category Classification Using Fine-Tuned BERT Language Model
    (Institute of Electrical and Electronics Engineers Inc., 2020) Demirkıran, Ferhat; Çayır, Aykut; Ünal, Uğur; Dağ, Hasan
    Content on the World Wide Web is expanding every second, providing web users with rich content. However, this may do web users more harm than good because some of that content is harmful or misleading. Harmful content can be text, audio, video, or images involving violence, adult material, or other harmful information. Young people in particular may readily be affected psychologically by such information. To protect youth from harmful content, various web filtering techniques are used, such as keyword filtering, Uniform Resource Locator (URL) based filtering, intelligent analysis, and semantic analysis. We propose an algorithm that can classify websites, which may contain adult content, with 67.81% accuracy (BERT) among 32 unique categories. We also show that a BERT model gives higher accuracy than both the Sequential and Functional API models when used for text classification.
  • Conference Object
    Citation Count: 0
    Fine-Tuning Wav2vec2 for Classification of Turkish Broadcast News and Advertisement Jingles
    (Institute of Electrical and Electronics Engineers Inc., 2023) Demirkiran, F.; Oner, O.; Komecoglu, Y.; Guven, R.; Komecoglu, B.B.
    The accurate classification of news and commercial jingles is essential for the automated generation of broadcast flow. Currently, in press companies, editors manually label the start and end times of news and advertisements, which costs both money and time. Although extracting fingerprints of news and commercial jingles has been used to detect jingles on a per-channel basis and automatically classify news and commercial music, this approach falls short when it comes to classifying new jingles produced by channels. In this study, we created a new dataset by extracting segments of commercial and news jingles from TV channels in Turkey. We analyzed which interval length, in seconds, is most effective for classifying news or commercials, achieving an accuracy of 98.18%. This research can potentially save press companies costs and time by automating the classification process. © 2023 IEEE.
  • Conference Object
    Citation Count: 0
    Benchmark Static API Call Datasets for Malware Family Classification
    (Institute of Electrical and Electronics Engineers Inc., 2022) Gencaydin, B.; Kahya, C.N.; Demirkiran, F.; Duzgun, B.; Cayir, A.; Dag, H.
    Nowadays, malware and malware incidents are increasing daily, even with various antivirus systems and malware detection or classification methodologies. Machine learning techniques have been the main focus of the security experts to detect malware and determine their families. Many static, dynamic, and hybrid techniques have been presented for that purpose. In this study, the static analysis technique has been applied to malware samples to extract API calls, which is one of the most used features in machine/deep learning models as it represents the behavior of malware samples. Since the rapid increase and continuous evolution of malware affect the detection capacity of antivirus scanners, recent and updated datasets of malicious software became necessary to overcome this drawback. This paper introduces two new datasets: One with 14,616 samples obtained and compiled from VirusShare and one with 9,795 samples from VirusSample. In addition, benchmark results based on static API calls of malware samples are presented using several machine and deep learning models on these datasets. We believe that these two datasets and benchmark results enable researchers to test and validate their methods and approaches in this field. © 2022 IEEE.
  • Conference Object
    Citation Count: 0
    Garbage In, Garbage Out: a Case Study on Defective Product Prediction in Manufacturing
    (Institute of Electrical and Electronics Engineers Inc., 2023) Colhak, F.; Ucar, B.E.; Saygut, I.; Duzgun, B.; Demirkiran, F.; Dag, H.
    Despite their potential business value and investments, data science projects often fail owing to a lack of preparedness, implementation challenges, and poor data quality. This study aimed to develop a machine learning model for predicting defective products in the dyeing process within the manufacturing domain. However, inadequate importance given to data by the involved factory, insufficient data quality, and the lack of the necessary technical infrastructure for data science projects have hindered attaining desired results. This study emphasizes to academic researchers and industry experts the significance of data quality and technical infrastructure, highlights how these deficiencies can impact the success of a data science project, and provides several recommendations. © 2023 IEEE.