Residual LSTM Neural Network for Time Dependent Consecutive Pitch String Recognition from Spectrograms: A Study on Turkish Classical Music Makams

Date

2023

Publisher

Springer

Green Open Access

No

Publicly Funded

No
Impulse
Top 10%
Influence
Average
Popularity
Top 10%

Abstract

Turkish classical music, rich in cultural significance, is organized around makams: melodic configurations defined by characteristic sequences of pitches and intervals. Identifying the makam of a given piece is difficult even for experienced musicians, which motivates automated and accurate classification techniques. In response, we introduce a residual LSTM neural network that classifies makams from spectrogram-based inputs by exploiting the distinct sequential pitch patterns found in audio segments. The model combines the spatial feature extraction of two-dimensional convolutional layers with the temporal modeling of one-dimensional convolutional and LSTM layers, all embedded within a residual framework. This integrated design supports detailed temporal analysis of shifting frequencies in logarithmically scaled spectrograms and recognizes consecutive pitch patterns within segments. Using stratified cross-validation on a dataset of 1154 pieces spanning 15 makams, the model achieved an accuracy of 95.60% on a subset of 9 makams and 89.09% on all 15 makams. It maintained consistent precision even when distinguishing makam pairs known for their closely related pitch sequences. We also benchmarked the model against established methods from the current literature to provide a comparative assessment of the proposed workflow.
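
The stratified cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This record does not state the number of folds, the splitting tool, or the makam labels used, so `n_folds=5`, the helper names `stratified_folds` and `train_test_splits`, and the placeholder labels `makam_a`, `makam_b`, `makam_c` are all assumptions made for illustration; this is not the authors' code.

```python
from collections import defaultdict
import random


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign item indices to folds so that each fold keeps roughly the
    same class proportions as the full label list (hypothetical helper)."""
    rng = random.Random(seed)

    # Group item indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin across folds. The running offset
        # keeps leftovers of small classes from piling up in fold 0.
        for i, idx in enumerate(indices):
            folds[(offset + i) % n_folds].append(idx)
        offset += len(indices)
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists, holding out one fold at a time."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield sorted(train), sorted(test)


# Toy label list with placeholder class names (assumed, not from the paper).
labels = ["makam_a"] * 10 + ["makam_b"] * 5 + ["makam_c"] * 5
folds = stratified_folds(labels, n_folds=5)
for train, test in train_test_splits(folds):
    print(len(train), len(test), sorted(labels[i] for i in test))
```

In this toy case every held-out fold contains two `makam_a` items and one each of the other two classes, mirroring the 2:1:1 ratio of the full list. In practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used instead of a hand-written splitter.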

Description

Mirza, Fuat Kaan/0000-0002-7664-0632; Baykas, Tuncer/0000-0001-9535-2102; Pekcan, Onder/0000-0002-0082-8209

Keywords

Musical information retrieval, Pitch sequence recognition, Modal music, Spectrogram, Residual LSTM neural network

Scopus Q

Q1

Source

Multimedia Tools and Applications

Volume

83

Start Page

41243

End Page

41271
PlumX Metrics
Citations

Scopus: 14

Captures

Mendeley Readers: 13

SCOPUS™ Citations

14

checked on Feb 06, 2026

Web of Science™ Citations

11

checked on Feb 06, 2026

Page Views

22

checked on Feb 06, 2026

OpenAlex FWCI
3.22121994
