Simple But Effective GRU Variants
Yigit, G.; Amasyali, M.F.
2021 International Conference on INnovations in Intelligent SysTems and Applications (INISTA 2021), 25-27 August 2021, pp. 172-175
Sponsors: Kocaeli University; Kocaeli University Technopark
ISBN: 9781665436038
DOI: https://doi.org/10.1109/INISTA52262.2021.9548535
Handle: https://hdl.handle.net/20.500.12469/4942
Scopus ID: 2-s2.0-85116609087
Type: Conference Object
Language: English
Access: closed access (info:eu-repo/semantics/closedAccess)
Date available: 2023-10-19

Abstract: The Recurrent Neural Network (RNN) is a widely used deep learning architecture for sequence learning problems. However, RNNs are known to suffer from exploding and vanishing gradient problems, which prevent the early layers of the network from learning gradient information. Gated Recurrent Unit (GRU) networks are a particular kind of recurrent network that mitigates these shortcomings. In this study, we propose two variants of the standard GRU with simple but effective modifications. Following an empirical approach, we examine the influence of the current-input and recurrent components of the gates by assigning them different coefficients. Interestingly, these minor and simple changes to the standard GRU yield notable improvements. We comparatively evaluate the standard GRU and the two proposed variants on four tasks: (1) sentiment classification on the IMDB movie review dataset, (2) language modeling on the Penn TreeBank (PTB) dataset, (3) a sequence-to-sequence addition problem, and (4) question answering on Facebook's bAbI tasks dataset. The evaluation results indicate that the two proposed GRU variants consistently outperform the standard GRU. © 2021 IEEE.

Keywords: Gated recurrent units; Recurrent neural networks; Seq2seq; Classification (of information); Modeling languages; Multilayer neural networks; Network layers; Gradient information; Learning architectures; Learning problems; Sequence learning; Vanishing gradient
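The abstract describes weighting the current-input and recurrent contributions of the GRU gates with different coefficients. Below is a minimal NumPy sketch of that idea; the coefficient names (alpha, beta) and their placement in the gate pre-activations are assumptions chosen for illustration, since the abstract does not give the exact form of the two proposed variants. Setting alpha = beta = 1.0 recovers the standard GRU.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x_t, h_prev, params, alpha=1.0, beta=1.0):
    """One GRU step. x_t: (input_dim,), h_prev: (hidden_dim,).

    alpha scales the current-input term and beta scales the recurrent term
    inside the gate pre-activations (assumed form of the coefficient variants).
    """
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    # Update and reset gates with scaled input-driven and recurrent parts.
    z = sigmoid(alpha * (Wz @ x_t) + beta * (Uz @ h_prev) + bz)
    r = sigmoid(alpha * (Wr @ x_t) + beta * (Ur @ h_prev) + br)
    # Candidate state and interpolation, as in the standard GRU.
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev) + bh)
    return (1.0 - z) * h_prev + z * h_tilde

def init_params(input_dim, hidden_dim, rng=np.random.default_rng(0)):
    def mat(rows, cols):
        return rng.normal(0.0, 0.1, size=(rows, cols))
    return (mat(hidden_dim, input_dim), mat(hidden_dim, hidden_dim), np.zeros(hidden_dim),
            mat(hidden_dim, input_dim), mat(hidden_dim, hidden_dim), np.zeros(hidden_dim),
            mat(hidden_dim, input_dim), mat(hidden_dim, hidden_dim), np.zeros(hidden_dim))

# Example: run a short sequence through the standard cell and a scaled variant.
params = init_params(input_dim=8, hidden_dim=16)
h_std = np.zeros(16)
h_var = np.zeros(16)
for x_t in np.random.default_rng(1).normal(size=(5, 8)):
    h_std = gru_cell(x_t, h_std, params)                        # standard GRU
    h_var = gru_cell(x_t, h_var, params, alpha=2.0, beta=0.5)   # scaled-gate variant

The scalar coefficients add no trainable parameters, which matches the abstract's framing of the modifications as minor and simple changes to the standard GRU; the actual coefficient values and which gate terms they multiply in the paper's two variants would need to be taken from the full text.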