21st AIAI 2025, 26 - 29 June 2025, Limassol, Cyprus

Fine-Tuning BERT for Robust Sentiment Classification of IMDb Reviews

Papadimitriou Orestis, Al-Hussaeni Khalil, Karamitsos Ioannis, Maragoudakis Manolis

Abstract:

  Online movie reviews provide a rich source of unstructured text data for sentiment classification, offering insights into user opinions. Accurate sentiment analysis is essential for stakeholders in the entertainment industry, including producers, marketers, and streaming platforms, to inform data-driven decisions. This study explores the effectiveness of Transformer-based models, specifically BERT (Bidirectional Encoder Representations from Transformers), in sentiment classification using the IMDb movie reviews dataset. By fine-tuning a pre-trained BERT model on a large-scale corpus of labeled IMDb reviews, this research captures contextual dependencies and semantic relationships within user-generated content. The preprocessing pipeline includes tokenization via the WordPiece tokenizer, padding and truncation to standardize input lengths, and attention masks to handle variable-length sequences. The fine-tuning process optimizes BERT’s deep contextual embeddings for binary sentiment classification, distinguishing between positive and negative reviews. The model is evaluated against traditional machine learning baselines, including Naive Bayes, Support Vector Machines (SVM), and Long Short-Term Memory networks (LSTM), demonstrating a classification accuracy of 92.13%. This study underscores the significance of fine-tuning large-scale language models for domain-specific applications and provides insights into their computational efficiency and real-world applicability.  
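The padding, truncation, and attention-mask steps described in the abstract can be sketched in plain Python, independent of any particular tokenizer library. The function name, pad token id, and maximum length below are illustrative, not drawn from the paper's implementation:

```python
def pad_and_mask(token_ids, max_len, pad_id=0):
    """Truncate or pad a token-id sequence to max_len and build its attention mask."""
    # Truncate sequences longer than max_len
    ids = token_ids[:max_len]
    # Attention mask: 1 for real tokens, 0 for padding positions
    mask = [1] * len(ids) + [0] * (max_len - len(ids))
    # Right-pad with the pad token id to reach a fixed length
    ids = ids + [pad_id] * (max_len - len(ids))
    return ids, mask

# Toy batch of WordPiece-style token ids (values are illustrative)
batch = [[101, 2023, 3185, 102], [101, 2307, 102]]
encoded = [pad_and_mask(seq, max_len=6) for seq in batch]
```

In practice a library tokenizer (e.g. the WordPiece tokenizer mentioned in the abstract) produces the token ids and can perform these steps internally; the sketch only shows why every input in a batch ends up with the same length and a matching mask.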

*** Title, author list, and abstract as submitted during camera-ready version delivery. Minor changes made during processing by Springer may not be reflected here.