Explainable Bengali Multiclass News Classification

Abstract

Automatic classification of news articles is crucial in an era of information overload, as it helps readers access relevant information in a timely manner. Although text classification is not a new area of study, there is still room for advancement for the Bengali language. Bengali is a complex language, and most of the datasets available online are imbalanced in their class label distribution. To improve classification performance and make it robust to imbalanced data, we propose a model built on a pre-trained BERT architecture. We use a publicly available dataset of Bengali news articles with nine classes and achieve 92% accuracy. Beyond classification, explaining the model and its results is necessary for trustworthy Artificial Intelligence. With this motivation, we use Integrated Gradients, an explainable AI technique, to explain the output of our model. We show which words in a news article influence the model's choice of a particular class.
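
As a rough illustration of the Integrated Gradients idea mentioned in the abstract, the sketch below attributes a toy logistic score to its input features. It is not the authors' implementation: the weights `w`, the functions `score` and `score_grad`, and the all-zeros baseline are hypothetical stand-ins for a BERT classifier and its token embeddings. Each attribution is the input-minus-baseline difference scaled by the gradient averaged along the straight path from baseline to input.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Midpoint Riemann approximation of Integrated Gradients."""
    # Interpolation coefficients at the midpoints of `steps` equal intervals in [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        # Gradient of the score at a point on the straight path baseline -> x.
        total += grad_fn(baseline + a * (x - baseline))
    avg_grad = total / steps
    return (x - baseline) * avg_grad

# Hypothetical toy model: a logistic score over four input features.
w = np.array([2.0, -1.0, 0.0, 0.5])

def score(x):
    return 1.0 / (1.0 + np.exp(-(w @ x)))

def score_grad(x):
    # Analytic gradient of the logistic score with respect to x.
    s = score(x)
    return s * (1.0 - s) * w

x = np.array([1.0, 1.0, 1.0, 1.0])
baseline = np.zeros(4)  # assumed "no information" reference input
attributions = integrated_gradients(score_grad, x, baseline, steps=200)
print(attributions)
```

In this toy setting the attributions sum approximately to `score(x) - score(baseline)`, which is the completeness property of Integrated Gradients, and the feature with zero weight receives zero attribution. For a text classifier, the same per-feature values would typically be aggregated per token to indicate which words pushed the model toward a class.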

Publication
26th International Conference on Computer and Information Technology, 2023, Bangladesh