Browsing by Author "Dağ, Hasan"

Now showing 1 - 20 of 59

Ağ Sızma Tespit Sistemleri için Tablosal ve Metin Temelli Özniteliklerden Birlikte Öğrenmeye Dayalı Yeni Bir Mimari
(2023) Düzgün, Berkant; Dağ, Hasan
Network Intrusion Detection Systems (NIDS) play a critical role in protecting the security and integrity of computer networks. These systems are designed to detect and respond to anomalous activity that may indicate malicious or unauthorized access. In today's digital environment, characterized by constantly evolving cyber threats, the need for robust NIDS solutions has never been more urgent. Deploying effective NIDS can be difficult, particularly because network anomalies are hard to identify accurately amid increasingly sophisticated and hard-to-detect cyber threats. Our research is motivated by the recognition that, although NIDS research has made significant strides, there remains a substantial need for more effective and accurate methods of detecting network anomalies. The features commonly used in NIDS studies come from network logs, and some studies have explored text-based features such as payload information. However, traditional machine learning and deep learning models can fall short when learning jointly from tabular and text-based features. Here we present a new approach that integrates both tabular and text-based features to improve NIDS performance. Our research aims to address the current limitations of NIDS and, by offering more effective and accurate methods for detecting network anomalies, to contribute to more reliable and efficient network security solutions. Our internal experiments showed that a deep learning approach using tabular features yielded favorable results, whereas a pre-trained transformer approach using text-based features did not perform adequately. Nevertheless, our proposed approach, which integrates both feature types by combining deep learning with a pre-trained transformer, achieves superior performance. These findings indicate that integrating both feature types in this way can significantly improve the accuracy of network anomaly detection. Moreover, the proposed approach outperforms state-of-the-art methods in accuracy, F1-score, and recall on widely used NIDS datasets such as ISCX-IDS2012, UNSW-NB15, and CIC-IDS2017, achieving F1-scores of 99.80%, 92.37%, and 99.69%, respectively, which demonstrates its effectiveness in detecting network anomalies.
Citation - WoS: 6
Citation - Scopus: 7
Alternative Credit Scoring and Classification Employing Machine Learning Techniques on a Big Data Platform
(Institute of Electrical and Electronics Engineers Inc., 2019) Hindistan, Yavuz Selim; Kiyakoğlu, Burhan Yasin; Rezaeinazhad, Arash Mohammadian; Korkmaz, Halil Ergun; Dağ, Hasan
With the bloom of financial technology and innovations aiming to deliver a high standard of financial services, banks and credit service companies, along with other financial institutions, use the most recent technologies available in a variety of ways, from addressing information asymmetry and matching the needs of borrowers and lenders to facilitating transactions through payment services. In the long list of FinTechs, one of the most attractive platforms is Peer-to-Peer (P2P) lending, which aims to bring investors and borrowers together while leaving out traditional intermediaries such as banks. The main purpose of a financial institution as an intermediary is to control risk, and P2P lending platforms innovate and use new ways of assessing risk. In the era of Big Data, combining diverse sources of information, such as customers' spending behavior, social media behavior, and geographic information, with traditional credit scoring methods yields new insights for more accurate credit scoring. In this study, we investigate machine learning techniques on big data platforms and analyze credit scoring methods. We conclude that in an HDFS (Hadoop Distributed File System) environment, Logistic Regression performs better than Decision Tree and Random Forest for credit scoring and classification, considering performance metrics such as accuracy, precision, and recall, as well as the overall run time of the algorithms. Logistic Regression also performs better in terms of run time in a single-node HDFS configuration than in a non-HDFS configuration.
Applications of Eigenvalue Counting and Inclusion Theorems in Model Order Reduction
(Springer-Verlag Berlin, 2010) Yetkin, E. Fatih; Dağ, Hasan
We suggest a simple and efficient iterative method, based on both the Gerschgorin eigenvalue inclusion theorem and deflation methods, to compute a Reduced Order Model (ROM) that greatly lowers the order of a given state-space system. The method is especially efficient for symmetric state-space systems, but it also works for other cases with some modifications.
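As a rough illustration of the Gerschgorin eigenvalue inclusion theorem named in the abstract above, the sketch below computes the disc (center and radius) for each row of a small real symmetric matrix and the interval that must contain every eigenvalue. It is not the authors' reduction method: the function names and the example matrix `A` are hypothetical, chosen only to show how the inclusion regions are obtained.

```python
def gershgorin_discs(matrix):
    """Return one (center, radius) pair per row of a square matrix.

    The center is the diagonal entry; the radius is the sum of the
    absolute values of the off-diagonal entries in that row.
    """
    discs = []
    for i, row in enumerate(matrix):
        center = row[i]
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        discs.append((center, radius))
    return discs


def eigenvalue_interval(matrix):
    """Interval [lo, hi] containing every eigenvalue of a real symmetric matrix."""
    discs = gershgorin_discs(matrix)
    lo = min(c - r for c, r in discs)
    hi = max(c + r for c, r in discs)
    return lo, hi


# Hypothetical symmetric example matrix (not taken from the paper).
A = [
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 0.5],
    [0.0, 0.5, -2.0],
]

discs = gershgorin_discs(A)
interval = eigenvalue_interval(A)
print(discs)
print(interval)
```

In this example the third disc, covering [-2.5, -1.5], is disjoint from the other two, so by the theorem it contains exactly one eigenvalue. Counting arguments of this kind are what make the discs useful for deciding how many eigenvalues fall in a region of interest.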
Applying Machine Learning Algorithms in Sales Prediction
(Kadir Has Üniversitesi, 2019) Sekban, Judi; Dağ, Hasan
Machine learning has become a topic of intensive study across many industries, and companies are, fortunately, learning more by the day about the various machine learning approaches that can solve their problems. However, getting the best and most efficient results from different machine learning models requires a good understanding of how the models are applied and of the nature of the data. This thesis investigates how well different machine learning algorithms perform on a specific prediction task. To this end, it presents and applies four different algorithms, a stacking ensemble technique, and a specific feature selection approach to improve the model. Different configurations are applied and their results are compared against one another. All of these steps are carried out after the necessary data preprocessing and feature engineering have been completed.
Audio detection using machine learning & transfer learning models
(Kadir Has Üniversitesi, 2021) Acar, Mesut; Dağ, Hasan
In this paper, machine learning algorithms and feature extraction methods are applied to the ESC-50 and ESC-10 environmental sound datasets to improve recognition performance. K-NN, SVM, and Random Forest are used to compare recognition results. Different feature extraction methods from the literature are used to obtain more meaningful attributes from these datasets and a higher accuracy rate. The results show that the SVM algorithm achieves notably good accuracy scores. The best accuracy scores obtained by classic machine learning algorithms are 42.15% for ESC-50 and 77.7% for ESC-10. In addition, experiments were carried out with a pre-trained ResNet neural network as a backbone, which achieves successful results compared with the machine learning models. In this study, a higher accuracy rate is achieved than with the baseline machine learning algorithms in the literature, and transfer learning with pre-trained ResNet backbones is used to reach some state-of-the-art results. The accuracy scores are 68.95% for ESC-50 and 87.25% for ESC-10.
Bellek Tabanli Verı Platformların Karşılaştırması
(Kadir Has Üniversitesi, 2016) Akbari, Amirmahdi; Dağ, Hasan
A comparison of in-memory data platforms.
Citation - WoS: 2
Citation - Scopus: 6
Branch Outage Simulation Based Contingency Screening by Gravitational Search Algorithm
(Praise Worthy Prize Srl, 2012) Ceylan, Oğuzhan; Özdemir, Aydoğan; Dağ, Hasan
Power system contingency analysis is an important issue for electric power system operators. This paper performs branch outage simulation based contingency screening using a bounded network approach. The local constrained optimization problem representing the branch outage phenomenon is solved by the gravitational search algorithm. The proposed method is applied to the IEEE 14, 30, 57, and 118 bus test systems, and its performance in capturing violations is evaluated. In addition, false alarms and the computational accuracy of the proposed method are analyzed using scatter diagrams. Finally, the proposed gravitational search based contingency screening is compared with full AC load flow solutions in terms of computational speed. Copyright (C) 2012 Praise Worthy Prize S.r.l. - All rights reserved.
Citation - Scopus: 15
Branch outage solution using particle swarm optimization
(2008) Ceylan, Oğuzhan; Ozdemir, Aydogan; Dağ, Hasan
For post-outage MW line flow and voltage magnitude calculations, most methods use linear models because of their simplicity. For reactive power flow calculations in particular, this can lead to large errors. In this paper we use a minimization method that minimizes the errors resulting from the linear system model implementation. We solve the optimization problem using particle swarm optimization. We give some outage examples using IEEE 14 bus, IEEE 30 bus, and IEEE 57 bus data and compare the results with full AC load flow calculations. © 2008 Australasian Universities Power Engineering Conference (AUPEC'08).
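For readers unfamiliar with particle swarm optimization, the solver named in the abstract above, here is a minimal generic sketch. It does not reproduce the paper's branch outage formulation: the quadratic `toy_error` objective stands in for the error function being minimized, and the function name `pso` and all parameter values (inertia `w`, acceleration coefficients `c1` and `c2`, swarm size, iteration count) are illustrative assumptions.

```python
import random


def pso(objective, dim, bounds, n_particles=20, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(x):
    """Hypothetical stand-in objective with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best, best_val = pso(toy_error, dim=2, bounds=(-10.0, 10.0))
print(best, best_val)
```

The fixed seed only makes the run repeatable. In a constrained setting such as the one the paper describes, the objective would additionally have to account for the constraints, for example through penalty terms, which this sketch omits.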
Bulut Ortamlarında Güvenli Uygulama Dağıtımının Sağlanması
(2025) Bostancı, Hakan; Dağ, Hasan
Cloud computing has become an integral part of modern IT infrastructure thanks to benefits such as scalability, cost savings, and flexibility. Organizations are increasingly turning to cloud solutions to improve operational efficiency, support remote work, and build more flexible business models. However, this growing dependence also brings significant security risks for secure application delivery. As adoption of and reliance on cloud computing grow, ensuring secure application delivery has become a critical challenge for organizations. The dynamic nature of cloud environments and evolving cyber threats make strong security measures essential for protecting sensitive data and preserving system integrity. This thesis examines the core security problems encountered in cloud-based application delivery and presents a comprehensive security approach for reducing risk. The study adopts a multi-layered security approach that integrates application security, API security, network security, data encryption, identity and access management, and real-time threat detection mechanisms. The improvements proposed in this approach are demonstrated in practice. A cloud-based e-commerce application deployed on Microsoft Azure served as the test environment for implementing and evaluating the security controls. To assess system resilience, security testing methodologies such as penetration tests, vulnerability assessments, and DDoS and attack simulations were applied. The findings show that conventional cloud deployments contain significant vulnerabilities and that proactive security strategies are necessary. After advanced security measures were applied, including hardening of the application architecture, network security, a web application firewall, DDoS protection, and authentication and authorization management, the system was observed to provide stronger protection against cyber threats. The results contribute to the field of cloud security by offering actionable guidance for organizations seeking to secure their cloud-based applications effectively.
Çelik Sektöründe Enerji Tüketimi Tahmini ile Daha İyi Enerji Verimliliğine Doğru
(2024) Koca, Aslı; Dağ, Hasan
Forecasting electricity consumption as accurately as possible is essential for cost optimization, operational efficiency, competitiveness, contract negotiations, and meeting the global goals of sustainable development in manufacturing. Taking a holistic approach, this study focuses on identifying the most suitable forecasting algorithm for electricity consumption at a steel company and the areas where it can be applied most effectively. Random Forest, Gradient Boosted Trees, Generalized Linear Models, Decision Trees, and a Deep Neural Network were used because they suit the problem and are widely used regression algorithms for forecasting. The performance of the forecasting models is evaluated by the standard deviation of the residuals (RMSE) and the proportion of variance explained (R-squared). The study shows that the Random Forest model outperforms the Gradient Boosted Trees, Generalized Linear Models, Decision Trees, and Deep Neural Network models. The results will be useful in several areas. First, during contract negotiations, they will provide a competitive advantage when purchasing electricity on the day-ahead market. Second, at the production planning stage, they will allow the coils with the highest electricity consumption to be scheduled for production at the hours of lowest demand, when prices are most favorable. Finally, when sales orders are prioritized, they will allow available capacity to be used for orders with lower energy consumption or higher profit margins.
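The two evaluation metrics named in the abstract above, RMSE and R-squared, can be sketched as follows. The consumption figures are made-up illustrative numbers, not data from the study, and the function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly consumption values (arbitrary units) and model forecasts.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]

error = rmse(actual, predicted)
fit = r_squared(actual, predicted)
print(error, fit)
```

A lower RMSE and an R-squared closer to 1 both indicate a better forecast, which is the sense in which one regression model can be said to outperform another.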
Citation - Scopus: 2
Comparison of Cost-Free Computational Tools for Teaching Physics
(IEEE, 2010) Er, Neslihan Fatma; Dağ, Hasan
It is widely accepted that it is quite difficult to engage today's students, from high schools to university, both in educational activities in class and "teaching" them physics due to their prejudices about the complexity of physics. The difficulty in capturing students' attention in class for a long time also plays a role in less effective teaching during learning activities. Research shows that students learn little from traditional lectures. According to constructivist learning theories, visual aids and hands-on activities play a major role in learning physics. In addition to laboratory work there are many computational tools for teaching physics, which help teachers and students in constructing a conceptual framework. With this in mind, this paper compares freeware and open source computational tools for teaching physics.
Citation - Scopus: 17
Comparison of Feature Selection Algorithms for Medical Data
(IEEE, 2012) Dağ, Hasan; Sayın, Kamran Emre; Yenidoğan, Işıl; Albayrak, Songül Varli; Acar, Can
The application areas of data mining widen day by day. Among them, the medical field has been receiving considerable attention. However, working with very large data sets that have many attributes is hard. Experts in this field rely heavily on advanced statistical analysis, and the use of data mining techniques is fairly new. This paper compares three feature selection algorithms on medical data sets and comments on the importance of discretizing attributes. © 2012 IEEE.
Citation - Scopus: 12
Comparison of Post Outage Bus Voltage Magnitudes Estimated by Harmony Search and Differential Evolution Methods
(2009) Ceylan, Oğuzhan; Özdemir, Aydoğan; Dağ, Hasan
Contingency studies are indispensable tools in both power system planning and operational studies. Real-time implementation of operational problems requires high-speed computational methods that still deliver reasonable accuracy. On the other hand, the accuracy of the results and the speed of calculation depend on the branch outage model as well as the solution algorithm used. This paper presents a comparison of post-outage bus voltage magnitudes calculated by two meta-heuristic approaches, namely the differential evolution (DE) and harmony search (HS) methods. The methods are tested on the IEEE 14, IEEE 30, IEEE 57, and IEEE 118 bus test systems, and the results are compared in terms of both accuracy and calculation speed.
Citation - Scopus: 1
Distributed Memory Parallel Transient Stability Analysis on a Pc Cluster With Ethernet
(Praise Worthy Prize Srl, 2010) Soykan, Gürkan; Flueck, Alexander J.; Dağ, Hasan
On-line transient stability analysis is a necessity for real-time power system control and security. Parallel processing is a natural technology for achieving real-time solution performance. This paper presents a parallel-in-space algorithm based on a multi-level partitioning scheme in a distributed memory cluster environment. The main aim of the research is to decrease the wall-clock time of transient stability analysis of large-scale power systems by leveraging open source software and the commodity off-the-shelf hardware of a Linux PC cluster. The proposed solution algorithm focuses on speeding up the transient stability simulation by using METIS to partition the linearized update solution process of the Very Dishonest Newton Method for solving the differential-algebraic equation system. Results are presented for two power systems: 1) 3493 buses, 844 generators, and 6689 branches, and 2) 7935 buses, 2135 generators, and 13624 branches. The simulations were run on a small Linux cluster with a 100 Mbit/s Ethernet interconnect, which is cheaper than any specially constructed parallel computer. By tuning vertex weights, the performance of the partition strategy can be improved relative to the no-weight case. The proposed method can easily be adapted by commercial packages and used in various parallel environments, including multicore architectures with non-uniform memory access. Copyright (C) 2010 Praise Worthy Prize S.r.l. - All rights reserved.
Citation - Scopus: 5
Double Branch Outage Modeling and Its Solution Using Differential Evolution Method
(2011) Ceylan, Oğuzhan; Ozdemir, Aydogan; Dağ, Hasan
Power system operators need to check system security through contingency analysis, which requires repeated power flow solutions. AC power flow is computationally slow even for a moderately sized system. Thus, fast and accurate outage models and approximate solutions have been developed. This paper adapts a single branch outage model to the double branch outage case. The constrained optimization problem resulting from the modeling is then solved using the differential evolution method. Simulation results for the IEEE 30 and 118 bus test systems are presented and compared with those of full AC load flow in terms of solution accuracy. © 2011 IEEE.
Citation - WoS: 2
Citation - Scopus: 2
Double Branch Outage Modeling and Simulation: Bounded Network Approach
(Elsevier Science, 2015) Ceylan, Oğuzhan; Özdemir, Aydoğan; Dağ, Hasan
Energy management system operators perform regular outage simulations in order to ensure secure operation of power systems. AC power flow based outage simulations are not preferred because of insufficient computational speed. Hence, several outage models and computational methods providing acceptable accuracy have been developed. On the other hand, double branch outages are critical rare events that can result in cascading outages and system collapse. This paper presents a double branch outage model and formulates the phenomenon as a constrained optimization problem. The optimization problem is then solved using the differential evolution method and the particle swarm optimization algorithm. The proposed algorithm is applied to IEEE test systems. The computational accuracies of the differential evolution based and particle swarm optimization based solutions are discussed for the IEEE 30 Bus Test System and IEEE 118 Bus Test System applications. Simulation results for the IEEE 14, 30, 57, 118, and 300 Bus Test Systems are compared with AC load flows in terms of computational speed. Finally, the performance of the proposed method is analyzed for different outage configurations. (C) 2015 Elsevier Ltd. All rights reserved.
An Effective Recommender Model for E-Commerce Platforms
(2017) Işık, Muhittin; Dağ, Hasan
Because of sparsity problems in databases, fake user accounts can easily influence recommendation algorithms, especially for products that have not been rated by enough users. These accounts may belong to product owners who want to boost the rating of their own product, or to malicious people who want to discredit a product or company. Considering that the database density of many companies is below 1%, the effect this has on e-commerce environments can be readily imagined. In this study, to overcome the negative effects of fake accounts in e-commerce environments, a recommendation model is built by analyzing the relationships among users and identifying those who influence other users and are considered genuinely trustworthy. In this way, drawing on the opinions of trustworthy users, a recommender system will be built that improves the quality of the Recommender Systems (RS) that make recommendations to users in e-commerce environments.
Citation - WoS: 41
Citation - Scopus: 59
An Ensemble of Pre-Trained Transformer Models for Imbalanced Multiclass Malware Classification
(Kadir Has Üniversitesi, 2022) Demirkıran, Ferhat; Dağ, Hasan
Classification of malware families is crucial for a comprehensive understanding of how they can infect devices, computers, or systems. Hence, malware identification enables security researchers and incident responders to take precautions against malware and accelerate mitigation. API call sequences made by malware are features widely used by machine and deep learning models for malware classification, as these sequences represent the behavior of malware. However, traditional machine and deep learning models remain incapable of capturing sequence relationships among API calls. Unlike traditional machine and deep learning models, transformer-based models process the sequences as a whole and learn relationships among API calls thanks to multi-head attention mechanisms and positional embeddings. Our experiments demonstrate that the transformer model with one transformer block layer surpasses the performance of the widely used base architecture, LSTM. Moreover, BERT or CANINE, the pre-trained transformer models, outperform in classifying highly imbalanced malware families according to the evaluation metrics F1-score and AUC score. Furthermore, our proposed bagging-based random transformer forest (RTF) model, an ensemble of BERT or CANINE, reaches state-of-the-art evaluation scores on three out of four datasets; specifically, it achieves a state-of-the-art F1-score of 0.6149 on one of the commonly used benchmark datasets.
Citation - WoS: 39
Citation - Scopus: 55
Feature Extraction Based on Deep Learning for Some Traditional Machine Learning Methods
(Institute of Electrical and Electronics Engineers Inc., 2018) Çayır, Aykut; Yenidoğan, Işıl; Dağ, Hasan
Deep learning is a subfield of machine learning, and deep neural architectures can extract high-level features automatically without handcrafted feature engineering, unlike traditional machine learning algorithms. In this paper, we propose a method that combines the feature extraction layers of a convolutional neural network with traditional machine learning algorithms such as support vector machines, gradient boosting machines, and random forests. All of the proposed hybrid models and the above-mentioned machine learning algorithms are trained on three different datasets: MNIST, Fashion-MNIST, and CIFAR-10. Results show that the proposed hybrid models are more successful than traditional models trained on raw pixel values. In this study, we empower traditional machine learning algorithms for classification using the feature extraction ability of deep neural network architectures, and we are inspired by the transfer learning methodology in doing so.
Feature Selection and Discretization for Improving Classification Performance on Cac Data Set
(Kadir Has Üniversitesi, 2013) Sayın, Kamran Emre; Dağ, Hasan
The use of data mining in the health sector has increased considerably in this decade because of the need for efficient treatment. From cutting medical expenses to acting as a decision support system for patient diagnosis, data mining is now a strong companion in the health sector. The dataset used in this thesis belongs to Dr. Nurhan Seyahi. Dr. Seyahi and his colleagues studied coronary artery calcification (CAC) in 178 patients who had recently undergone renal transplantation, using conventional statistical methods. Using the power of data mining, this thesis shows the importance of feature selection and discretization, combined with classification methods, for building a decision support system for patient diagnosis on the CAC dataset. Just by looking at seven important attributes (age, time of transplantation, diabetes mellitus, phosphorus, Rose angina test, donor type, and patient history), doctors can decide whether a patient has coronary artery calcification with approximately 70% accuracy. After discretization, this accuracy increases to approximately 75% for some algorithms, making the approach a strong decision support system for doctors working in this area.
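To make the discretization step mentioned in the abstract above more concrete, the sketch below bins a numeric attribute into equal-width intervals. The abstract does not say which discretization method the thesis uses, so equal-width binning is an assumption here, and the `ages` values and the function name are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values identical: everything falls in the first bin.
        return [0] * len(values)
    # The maximum value would land on the upper edge, so cap it at the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical patient ages (not taken from the CAC dataset).
ages = [23, 35, 41, 52, 58, 64, 70]
age_bins = equal_width_bins(ages, 3)
print(age_bins)
```

Replacing a continuous attribute such as age with a small number of categories like these is what allows classifiers that prefer nominal inputs to use it directly.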