# RECLASS: Multi-Task Deep Learning for App Review Classification

**COMP6013 | Oxford Brookes University | 2025-26**

---

## Project Overview

RECLASS is a multi-task learning system which uses a shared BERT encoder with task-specific classification heads.

| Task | Output | Classes |
|------|--------|---------|
| Bug Report Detection | Binary | Yes / No |
| Feature Request Detection | Binary | Yes / No |
| Aspect Classification | Multi-class | Driver, App, Pricing, Service, Payment, General |
| Aspect Sentiment | Multi-class | Positive, Neutral, Negative |

## Dataset

- **Source**: [Uber Customer Reviews (Kaggle)](https://www.kaggle.com/datasets/khushipitroda/ola-vs-uber-play-store-reviews)
- **Original size**: 1,069,616 reviews
- **Cleaned size**: 495,036 reviews (after removing short/duplicate reviews)
- **Annotation target**: 5,000 manually labelled reviews

## Repository Structure

```
6013/
README.md
requirements.txt
    multitag/
        data/
            uber_reviews.csv           # Raw dataset
            uber_reviews_cleaned.csv   # Preprocessed reviews
            uber_reviews_sampled.csv   # Stratified sample for annotation
            uber_reviews_tagged.csv    # Annotated reviews (in progress)
        notebooks/
            datasets_reviews.ipynb     # Initial data exploration
            preprocessing_uber.ipynb   # Preprocessing analysis
            uber_cleaned.ipynb         # Cleaned data verification
        src/
            preprocess.py              # Text cleaning and filtering pipeline
            sampler.py                 # Stratified sampling strategies
            multitag.py                # GUI annotation tool
            train.py                   # Model training (in progress)
            infer.py                   # Inference pipeline (in progress)
```

## Current Progress

- Manual annotation of 5,000 reviews
- BERT baseline implementation
- Multi-task model architecture
- Training and evaluation
- Comparative analysis (MTL vs single-task)
- Final report and presentation

## Installation

```
# Clone repository
...
# Create conda environment
...
# Install dependencies
...requirements.txt
```

## Usage
## References
## Licenses

---

*Last updated: January 2025*