Detecting Mental Distress: A Comprehensive Analysis of Online Discourses Via ML and NLP
- This thesis explores and examines the effectiveness and efficacy of traditional machine learning (ML), advanced neural networks (NN) and state-of-the-art deep learning (DL) models for identifying mental distress indicators from the social media discourses based on Reddit and Twitter as they are immensely used by teenagers. Different NLP vectorization techniques like TF-IDF, Word2Vec, GloVe, and BERT embeddings are employed with ML models such as Decision Tree (DT), Random Forest (RF), Logistic Regression (LR) and Support Vector Machine (SVM) followed by NN models such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) to methodically analyse their impact as feature representation of models. DL models such as BERT, DistilBERT, MentalRoBERTa and MentalBERT are end-to-end fine tuned for classification task. This thesis also compares different text preprocessing techniques such as tokenization, stopword removal and lemmatization to assess their impact on model performance. Systematic experiments with different configuration of vectorization and preprocessing techniques in accordance with different model types and categories have been implemented to find the most effective configurations and to gauge the strengths, limitations, and capability to detect and interpret the mental distress indicators from the text. The results analysis reveals that MentalBERT DL model significantly outperformed all other model types and categories due to its specific pretraining on mental data as well as rigorous end-to-end fine tuning gave it an edge for detecting nuanced linguistic mental distress indicators from the complex contextual textual corpus. This insights from the results acknowledges the ML and NLP technologies high potential for developing complex AI systems for its intervention in the domain of mental health analysis. This thesis lays the foundation and directs the future work demonstrating the need for collaborative approach of different domain experts as well as to explore next generational large language models to develop robust and clinically approved mental health AI systems.