Attention-based CNN-BiLSTM for Dialect Identification on Javanese Text

Ahmad Fathan Hidayatullah; Siwi Cahyaningtyas; Rheza Daffa Pamungkas

doi:10.22219/kinetik.v5i4.1121

Issue

Vol. 5, No. 4, November 2020

Issue Published : Nov 30, 2020

Ahmad Fathan Hidayatullah

Universitas Islam Indonesia

https://orcid.org/0000-0002-3755-2648

Siwi Cahyaningtyas

Universitas Islam Indonesia

Rheza Daffa Pamungkas

Universitas Islam Indonesia

Corresponding Author(s) : Ahmad Fathan Hidayatullah

fathan@uii.ac.id

Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, Vol. 5, No. 4, November 2020
Article Published : Nov 22, 2020

Abstract

This study proposes a hybrid deep learning model called attention-based CNN-BiLSTM (ACBiL) for dialect identification on Javanese text. The ACBiL model comprises an input layer, a convolution layer, a max pooling layer, a batch normalization layer, a bidirectional LSTM layer, an attention layer, a fully connected layer, and a softmax layer. In the attention layer, we applied a hierarchical attention network with word-level and sentence-level attention to capture how important each part of the content is. For comparison, we also experimented with several classical machine learning and deep learning approaches. Among the classical machine learning models, Linear Regression with unigram features achieved the best performance, with an average accuracy of 0.9647. The deep learning models significantly outperformed the classical machine learning models. Among the deep learning methods, the ACBiL architecture achieved the best performance, with an accuracy of 0.9944.
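The attention step mentioned in the abstract can be illustrated with a minimal sketch of additive attention pooling, in the style of hierarchical attention networks. This is not the authors' implementation: the function names (`softmax`, `attention_pool`), the toy dimensions, and all numeric values are hypothetical and chosen only for illustration. In the ACBiL model the input vectors would presumably be the bidirectional LSTM outputs, and the same pooling would be applied once at the word level and once at the sentence level.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, weight, bias, context):
    """Additive attention pooling (illustrative, hypothetical helper).

    hidden_states: list of T vectors, each of length d
    weight:        k x d matrix given as a list of rows
    bias:          vector of length k
    context:       learned context vector of length k
    Returns (pooled vector of length d, attention weights of length T).
    """
    scores = []
    for h in hidden_states:
        # u_t = tanh(W h_t + b)
        u = [
            math.tanh(sum(w * x for w, x in zip(row, h)) + b)
            for row, b in zip(weight, bias)
        ]
        # score_t = u_t . context
        scores.append(sum(ui * ci for ui, ci in zip(u, context)))

    # alpha_t = softmax over time steps
    alphas = softmax(scores)

    # pooled = sum_t alpha_t * h_t
    dim = len(hidden_states[0])
    pooled = [
        sum(a * h[j] for a, h in zip(alphas, hidden_states))
        for j in range(dim)
    ]
    return pooled, alphas


# Toy inputs (assumed values, not taken from the paper).
hidden_states = [[0.1, 0.2], [0.9, -0.4], [0.3, 0.3]]
weight = [[0.5, -0.2], [0.1, 0.8]]
bias = [0.0, 0.1]
context = [1.0, -1.0]

pooled, alphas = attention_pool(hidden_states, weight, bias, context)
print("attention weights:", alphas)
print("pooled vector:", pooled)
```

The attention weights always sum to 1, so the pooled vector is a weighted average of the input vectors. With an all-zero context vector every position scores the same, and the pooling reduces to a plain mean.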

Keywords

Attention Mechanism; CNN; BiLSTM; Dialect Identification; Deep Learning

Hidayatullah, A. F., Cahyaningtyas, S., & Pamungkas, R. D. (2020). Attention-based CNN-BiLSTM for Dialect Identification on Javanese Text. Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, 5(4), 317-324. https://doi.org/10.22219/kinetik.v5i4.1121
