Browsing by Author "Dağ, Hasan"
Now showing 1 - 20 of 66
Master Thesis
Ağ sızma tespit sistemleri için tablosal ve metin temelli özniteliklerden birlikte öğrenmeye dayalı yeni bir mimari [A novel architecture based on joint learning from tabular and text-based features for network intrusion detection systems] (2023)
Dağ, Hasan; Dağ, Hasan
Network Intrusion Detection Systems (NIDS) play a critical role in protecting the security and integrity of computer networks. These systems are designed to detect and respond to anomalous activities that may indicate malicious or unauthorized access. In today's digital environment, characterized by constantly evolving cyber threats, the need for robust NIDS solutions has never been more urgent. Deploying effective NIDS can be challenging, particularly when it comes to accurately identifying network anomalies amid increasingly sophisticated and hard-to-detect cyber threats. The motivation for our research stems from the recognition that, although NIDS research has made significant strides, there remains a substantial need for more effective and accurate methods of detecting network anomalies. Features commonly used in intrusion detection studies include network logs, and some studies have explored text-based features such as payload information. However, traditional machine learning and deep learning models can fall short in learning jointly from tabular and text-based features. Here, we present a novel approach that integrates both tabular and text-based features to improve NIDS performance. Our research aims to address the current limitations of NIDS and to contribute to the development of more reliable and efficient network security solutions by offering more effective and accurate methods for detecting network anomalies. Our internal experiments show that the deep learning approach using tabular features yields favorable results, whereas the pre-trained transformer approach using text-based features does not perform adequately.
Nevertheless, our proposed approach, which integrates both feature types by combining the deep learning and pre-trained transformer approaches, achieves superior performance. These findings indicate that integrating both feature types in this way can significantly improve the accuracy of network anomaly detection. Moreover, our proposed approach outperforms state-of-the-art methods in terms of accuracy, F1-score, and recall on widely used NIDS datasets such as ISCX-IDS2012, UNSW-NB15, and CIC-IDS2017, demonstrating its effectiveness in network anomaly detection with F1-scores of 99.80%, 92.37%, and 99.69%, respectively.

Book Part Citation Count: 3
Alternative Credit Scoring and Classification Employing Machine Learning Techniques on a Big Data Platform (Institute of Electrical and Electronics Engineers Inc., 2019)
Dağ, Hasan; Kiyakoğlu, Burhan Yasin; Rezaeinazhad, Arash Mohammadian; Korkmaz, Halil Ergun; Dağ, Hasan
With the bloom of financial technology and innovations aiming to deliver a high standard of financial services, banks and credit service companies, along with other financial institutions, use the most recent technologies available in a variety of ways, from addressing information asymmetry and matching the needs of borrowers and lenders to facilitating transactions using payment services. In the long list of FinTechs, one of the most attractive platforms is Peer-to-Peer (P2P) lending, which aims to bring investors and borrowers together directly, leaving out traditional intermediaries such as banks. The main purpose of a financial institution as an intermediary is to control risk, and P2P lending platforms innovate by using new ways of risk assessment.
In the era of Big Data, using diverse sources of information, such as customers' spending behavior, social media behavior, and geographic information, along with traditional methods for credit scoring proves to offer new insights for proper and more accurate credit scoring. In this study, we investigate machine learning techniques on big data platforms and analyze credit scoring methods. We conclude that in an HDFS (Hadoop Distributed File System) environment, Logistic Regression performs better than Decision Tree and Random Forest for credit scoring and classification, considering performance metrics such as accuracy, precision, and recall, and the overall run time of the algorithms. Logistic Regression also runs faster in a single-node HDFS configuration than in a non-HDFS configuration.

Article Citation Count: 4
AnomalyAdapters: Parameter-Efficient Multi-Anomaly Task Detection (IEEE-Inst Electrical Electronics Engineers Inc, 2022)
Dağ, Hasan; Dag, Hasan
The emergence of technological innovations brings sophisticated threats. Cyberattacks are increasing day by day in step with these innovations and call for rapid solutions for defense mechanisms. These attacks may hinder enterprise operations or, more importantly, interrupt critical infrastructure systems that are essential to the safety, security, and well-being of a society. Anomaly detection, as a protection step, is significant for ensuring system security. Logs, which are universally accepted sources, are utilized in system health monitoring and intrusion detection systems. Recent developments in Natural Language Processing (NLP) studies show that contextual information decreases the false-positive yield in detecting anomalous behaviors. Transformers and their adaptations to various language understanding tasks exemplify the enhanced ability to extract this information. Deep-network-based anomaly detection solutions generally use feature-based transfer learning methods.
This type of learning produces a new set of weights for each log type, which is infeasible and redundant given the variety of log sources. Also, a vague representation of model decisions prevents learning from threat data and improving model capability. In this paper, we propose AnomalyAdapters (AAs), an extensible multi-anomaly task detection model. It uses a pretrained transformer variant to encode log sequences and utilizes adapters to learn log structure and anomaly types. The adapter-based approach collects contextual information, eliminates information loss in learning, and learns anomaly detection tasks from different log sources without overuse of parameters. Lastly, our work elucidates the decision-making process of the proposed model on different log datasets to emphasize extraction of threat data via explainability experiments.

Conference Object Citation Count: 0
Applications of Eigenvalue Counting and Inclusion Theorems in Model Order Reduction (Springer-Verlag Berlin, 2010)
Dağ, Hasan; Dağ, Hasan
We suggest a simple and efficient iterative method, based on both the Gerschgorin eigenvalue inclusion theorem and deflation methods, to compute a Reduced Order Model (ROM) that greatly lowers the order of a given state-space system. This method is especially efficient for symmetric state-space systems, but it works for other cases with some modifications.

Master Thesis
Audio detection using machine learning & transfer learning models (Kadir Has Üniversitesi, 2021)
Dağ, Hasan; Dağ, Hasan
In this paper, using the ESC-50 and ESC-10 datasets of environmental sounds, machine learning algorithms and feature extraction methods are used to improve recognition performance. K-NN, SVM, and Random Forest are used to compare recognition results. Different feature extraction methods from the literature are used to get more meaningful attributes from these datasets and obtain a higher accuracy rate.
This approach shows that the SVM algorithm achieves notably good accuracy scores. The best accuracy scores obtained by classic machine learning algorithms are 42.15% for ESC-50 and 77.7% for ESC-10. In addition, experiments were carried out with a pre-trained ResNet neural network as a backbone, which achieves strong results compared with the machine learning models. In this study, a higher accuracy rate is achieved than with the baseline machine learning algorithms in the literature, and transfer learning with pre-trained ResNet backbones is used to reach some state-of-the-art results. The accuracy scores are 68.95% for ESC-50 and 87.25% for ESC-10.

Master Thesis
Bellek tabanli verı platformların karşılaştırması [Comparison of in-memory data platforms] (Kadir Has Üniversitesi, 2016)
Dağ, Hasan; Dağ, Hasan

Article Citation Count: 2
Branch outage simulation based contingency screening by gravitational search algorithm (Praise Worthy Prize Srl, 2012)
Ceylan, Oğuzhan; Dağ, Hasan; Dağ, Hasan
Power systems contingency analysis is an important issue for electric power system operators. This paper performs branch outage simulation based contingency screening using a bounded network approach. The local constrained optimization problem representing the branch outage phenomenon is solved by the gravitational search algorithm. The proposed method is applied to the IEEE 14, 30, 57, and 118 bus test systems, and its performance in capturing violations is evaluated. In addition, false alarms and the computational accuracy of the proposed method are analyzed using scatter diagrams. Finally, the proposed gravitational search based contingency screening is compared with full AC load flow solutions in terms of computational speed. Copyright (C) 2012 Praise Worthy Prize S.r.l. - All rights reserved.

Conference Object Citation Count: 15
Branch outage solution using particle swarm optimization (2008)
Ceylan, Oğuzhan; Dağ, Hasan; Dağ, Hasan
For post-outage MW line flow and voltage magnitude calculations, most approaches use linear methods because of their simplicity. Especially for reactive power flow calculations, this can lead to high errors. In this paper, we use a minimization method that minimizes the errors resulting from the linear system model implementation. We solve the optimization problem using particle swarm optimization. We give some outage examples using IEEE 14 bus, IEEE 30 bus, and IEEE 57 bus data and compare the results with full AC load flow calculations. © 2008 Australasian Universities Power Engineering Conference (AUPEC'08).

Conference Object Citation Count: 0
Comparison of Cost-Free Computational Tools for Teaching Physics (IEEE, 2010)
Dağ, Hasan; Dağ, Hasan
It is widely accepted that it is quite difficult to engage today's students, from high school to university, both in educational activities in class and in "teaching" them physics, due to their prejudices about the complexity of physics. The difficulty in holding students' attention in class for a long time also contributes to less effective teaching during learning activities. Research shows that students learn little from traditional lectures. According to constructivist learning theories, visual aids and hands-on activities play a major role in learning physics. In addition to laboratory work, there are many computational tools for teaching physics, which help teachers and students in constructing a conceptual framework. With this in mind, this paper compares freeware and open source computational tools for teaching physics.

Conference Object Citation Count: 16
Comparison of feature selection algorithms for medical data (IEEE, 2012)
Dağ, Hasan; Sayın, Kamran Emre; Yenidoğan, Işıl; Albayrak, Songül Varli; Acar, Can
Data mining application areas widen day by day.
Among those areas, the medical field has been receiving considerable attention. However, working with very large data sets with many attributes is hard. Experts in this field rely heavily on advanced statistical analysis. The use of data mining techniques is fairly new. This paper compares three feature selection algorithms on medical data sets and comments on the importance of discretization of attributes. © 2012 IEEE.

Book Part Citation Count: 12
Comparison of post outage bus voltage magnitudes estimated by harmony search and differential evolution methods (2009)
Ceylan, Oğuzhan; Dağ, Hasan; Dağ, Hasan
Contingency studies are indispensable tools of both power system planning and operational studies. Real-time implementation of operational problems necessitates high-speed computational methods with reasonable accuracy. On the other hand, the accuracy of the results and the speed of calculation depend on branch outage modeling as well as the solution algorithm used. This paper presents a comparison of post-outage bus voltage magnitudes calculated by two meta-heuristic approaches, namely differential evolution (DE) and harmony search (HS). The methods are tested on IEEE 14, IEEE 30, IEEE 57, and IEEE 118 bus test systems, and the results are compared in terms of both accuracy and calculation speed.

Review Citation Count: 5
Deepfake detection using deep learning methods: A systematic and comprehensive review (Wiley Periodicals, inc, 2024)
Dağ, Hasan; Navimipour, Nima Jafari; Dag, Hasan; Unal, Mehmet
Deep Learning (DL) has been effectively utilized in various complicated challenges in healthcare, industry, and academia for various purposes, including thyroid diagnosis, lung nodule recognition, computer vision, large data analytics, and human-level control. Nevertheless, developments in digital technology have been used to produce software that poses a threat to democracy, national security, and confidentiality.
Deepfake is one of those DL-powered applications that has lately surfaced. Deepfake systems can create fake images, videos, and sounds, primarily by replacing scenes or images, that humans cannot tell apart from real ones. Various technologies have put the capacity to alter synthetic speech, images, or video at our fingertips. Furthermore, video and image frauds are now so convincing that it is hard to distinguish between false and authentic content with the naked eye. This might result in various issues, ranging from deceiving public opinion to using doctored evidence in court. For these reasons, it is critical to have technologies that can assist us in discerning reality. This study gives a complete assessment of the literature on deepfake detection strategies using DL-based algorithms. We categorize deepfake detection methods in this work based on their applications, which include video detection, image detection, audio detection, and hybrid multimedia detection. The objective of this paper is to give the reader a better knowledge of (1) how deepfakes are generated and identified, (2) the latest developments and breakthroughs in this realm, (3) weaknesses of existing security methods, and (4) areas requiring more investigation and consideration. The results suggest that the Convolutional Neural Network (CNN) methodology is the most often employed DL method in publications. According to the review, the majority of the articles are on the subject of video deepfake detection. The majority of the articles focused on enhancing only one parameter, with accuracy receiving the most attention.
This article is categorized under: Technologies > Machine Learning; Algorithmic Development > Multimedia; Application Areas > Science and Technology

Article Citation Count: 0
Distributed Memory Parallel Transient Stability Analysis on a PC Cluster with Ethernet (Praise Worthy Prize Srl, 2010)
Dağ, Hasan; Flueck, Alexander J.; Dağ, Hasan
On-line transient stability analysis is a necessity for real-time power system control and security. Parallel processing is a natural technology for achieving real-time solution performance. This paper presents a parallel-in-space algorithm based on a multi-level partitioning scheme in a distributed memory cluster environment. The main aim of the research is to decrease the wall-clock time of transient stability analysis of large-scale power systems by leveraging open source software and commodity off-the-shelf hardware of a Linux PC cluster. The proposed solution algorithm focuses on speeding up the transient stability simulation by partitioning, via METIS, the linearized update solution process of the Very Dishonest Newton Method for solving the differential-algebraic equation system. Results are presented for two power systems: 1) 3493 buses, 844 generators, 6689 branches and 2) 7935 buses, 2135 generators, 13624 branches. The simulations were run on a small Linux cluster with a 100 Mbit/s Ethernet interconnect, which is cheaper than any specially constructed parallel computer. By tuning vertex weights, the performance of the partition strategy can be improved relative to the no-weight case. The proposed method can easily be adapted by commercial packages and used in various parallel environments, including multicore architectures with non-uniform memory access. Copyright (C) 2010 Praise Worthy Prize S.r.l. - All rights reserved.

Conference Object Citation Count: 5
Double branch outage modeling and its solution using differential evolution method (2011)
Dağ, Hasan; Ceylan, Oğuzhan; Dağ, Hasan
Power system operators need to check system security through contingency analysis, which requires repeated power flow solutions. AC power flow is computationally slow even for a moderately sized system. Thus, fast and accurate outage models and approximate solutions have been developed. This paper adapts a single branch outage model to a double branch outage one. The final constrained optimization problem resulting from the modeling is then solved using the differential evolution method. Simulation results for IEEE 30 and 118 bus test systems are presented and compared to those of full AC load flow in terms of solution accuracy. © 2011 IEEE.

Article Citation Count: 2
Double branch outage modeling and simulation: Bounded network approach (Elsevier Science, 2015)
Dağ, Hasan; Ceylan, Oğuzhan; Dağ, Hasan
Energy management system operators perform regular outage simulations in order to ensure secure operation of power systems. AC power flow based outage simulations are not preferred because of insufficient computational speed. Hence, several outage models and computational methods providing acceptable accuracy have been developed. On the other hand, double branch outages are critical rare events that can result in cascading outages and system collapse. This paper presents a double branch outage model and a formulation of the phenomenon as a constrained optimization problem. The optimization problem is then solved using the differential evolution method and the particle swarm optimization algorithm. The proposed algorithm is applied to IEEE test systems. The computational accuracies of the differential evolution based and particle swarm optimization based solutions are discussed for the IEEE 30 Bus Test System and IEEE 118 Bus Test System applications.
IEEE 14 Bus Test System, IEEE 30 Bus Test System, IEEE 57 Bus Test System, IEEE 118 Bus Test System, and IEEE 300 Bus Test System simulation results are compared to AC load flows in terms of computational speed. Finally, the performance of the proposed method is analyzed for different outage configurations. (C) 2015 Elsevier Ltd. All rights reserved.

Article Citation Count: 0
AN EFFECTIVE ROCOMMENDER MODEL FOR E-COMMERCE PLATFORMS (2017)
Dağ, Hasan; Dağ, Hasan
Because of sparsity problems in databases, fake user accounts can easily influence recommendation algorithms, especially for products that have not been rated by enough users. These accounts may belong to product owners who want to raise the rating of their own product, or to malicious individuals who want to defame a particular product or company. Considering that the database density of many companies is below 1%, the effect this has on e-commerce environments can be imagined. In this study, to overcome the negative effects that fake accounts create in e-commerce environments, a recommendation model is built by analyzing the relationships among users and identifying users who influence other users and are considered genuinely trustworthy.
In this way, a recommender system will be built that, based on the opinions of trusted users, improves the quality of the Recommender Systems (RS) that make recommendations to users in e-commerce environments.

Conference Object Citation Count: 0
Enhancing Malware Classification: A Comparative Study of Feature Selection Models with Parameter Optimization (Institute of Electrical and Electronics Engineers Inc., 2024)
Dağ, Hasan; Dag, H.
This study assesses the impact of seven feature selection algorithms (Minimum Redundancy Maximum Relevance (MRMR), Mutual Information (MI), Chi-Square (Chi), Leave One Feature Out (LOFO), Feature Relevance-based Unsupervised Feature Selection (FRUFS), A General Framework for Auto-Weighted Feature Selection via Global Redundancy Minimization (AGRM), and BoostARoota) across two malware datasets (Microsoft and API call sequences) using three machine learning models (Extreme Gradient Boosting (Xgboost), Random Forest, and Histogram-Based Gradient Boosting (Hist Gradient Boosting)). The analysis reveals that no feature selection algorithm uniformly outperforms the others, as their effectiveness varies based on the dataset and model characteristics. Specifically, BoostARoota demonstrated significant compatibility with the Microsoft dataset, especially after parameter optimization, whereas its performance varied with the API call sequences dataset, suggesting the need for customized parameter selection. This study highlights the necessity of tailored feature selection approaches and parameter adjustments to optimize machine learning model performance across different datasets. © 2024 IEEE.

Master Thesis Citation Count: 16
An ensemble of pre-trained transformer models for imbalanced multiclass malware classification (Kadir Has Üniversitesi, 2022)
Dağ, Hasan; Demirkıran, Ferhat; Dağ, Hasan
Classification of malware families is crucial for a comprehensive understanding of how they can infect devices, computers, or systems.
Hence, malware identification enables security researchers and incident responders to take precautions against malware and accelerate mitigation. API call sequences made by malware are widely utilized features by machine and deep learning models for malware classification as these sequences represent the behavior of malware. However, traditional machine and deep learning models remain incapable of capturing sequence relationships among API calls. Unlike traditional machine and deep learning models, transformer-based models process the sequences as a whole and learn relationships among API calls thanks to multi-head attention mechanisms and positional embeddings. Our experiments demonstrate that the transformer model with one transformer block layer surpasses the performance of the widely used base architecture, LSTM. Moreover, the pre-trained transformer models, BERT or CANINE, perform better in classifying highly imbalanced malware families according to the evaluation metrics F1-score and AUC score. Furthermore, our proposed bagging-based random transformer forest (RTF) model, an ensemble of BERT or CANINE, reaches state-of-the-art evaluation scores on three out of four datasets; specifically, it achieves a state-of-the-art F1-score of 0.6149 on one of the commonly used benchmark datasets.

Article Citation Count: 16
An ensemble of pre-trained transformer models for imbalanced multiclass malware classification (Elsevier Advanced Technology, 2022)
Dağ, Hasan; Demirkıran, Ferhat; Unal, Gur; Dag, Hasan
Classification of malware families is crucial for a comprehensive understanding of how they can infect devices, computers, or systems. Hence, malware identification enables security researchers and incident responders to take precautions against malware and accelerate mitigation. API call sequences made by malware are widely utilized features by machine and deep learning models for malware classification as these sequences represent the behavior of malware.
However, traditional machine and deep learning models remain incapable of capturing sequence relationships among API calls. Unlike traditional machine and deep learning models, transformer-based models process the sequences as a whole and learn relationships among API calls thanks to multi-head attention mechanisms and positional embeddings. Our experiments demonstrate that the Transformer model with one transformer block layer surpasses the performance of the widely used base architecture, LSTM. Moreover, the pre-trained transformer models, BERT or CANINE, perform better in classifying highly imbalanced malware families according to the evaluation metrics F1-score and AUC score. Furthermore, our proposed bagging-based random transformer forest (RTF) model, an ensemble of BERT or CANINE, reaches state-of-the-art evaluation scores on three out of four datasets; specifically, it achieves a state-of-the-art F1-score of 0.6149 on one of the commonly used benchmark datasets. (C) 2022 Elsevier Ltd. All rights reserved.

Conference Object Citation Count: 26
Feature Extraction Based on Deep Learning for Some Traditional Machine Learning Methods (Institute of Electrical and Electronics Engineers Inc., 2018)
Dağ, Hasan; Yenidoğan, Işıl; Dağ, Hasan
Deep learning is a subfield of machine learning, and deep neural architectures can extract high-level features automatically without handcrafted feature engineering, unlike traditional machine learning algorithms. In this paper, we propose a method that combines the feature extraction layers of a convolutional neural network with traditional machine learning algorithms such as support vector machines, gradient boosting machines, and random forests. All of the proposed hybrid models and the above-mentioned machine learning algorithms are trained on three different datasets: MNIST, Fashion-MNIST, and CIFAR-10.
Results show that the proposed hybrid models are more successful than traditional models when trained from raw pixel values. In this study, we empower traditional machine learning algorithms for classification using the feature extraction ability of deep neural network architectures, an approach inspired by transfer learning methodology.