In this project, we demonstrate the process of building a text classification model using the BERT (Bidirectional Encoder Representations from Transformers) architecture. Our goal is to classify texts into two categories: AI-generated and human-written. Text classification is a fundamental task in natural language processing (NLP), with a wide range of applications across many domains. With the rise of AI-generated text, classification serves essential purposes in authorship identification, for example in journalism, book authorship, and school and university settings, where each context stands to gain distinct benefits and insights.
With the rise of sophisticated AI language models, distinguishing between text written by humans and AI has become increasingly challenging. This notebook addresses this challenge by implementing a machine learning pipeline that leverages BERT for text classification.
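As a rough sketch of the classification setup described above, the snippet below builds a BERT sequence classifier with two output labels (0 = human-written, 1 = AI-generated, a labeling convention assumed here) using the Hugging Face `transformers` library. To keep the example self-contained, it constructs a small randomly initialised `BertConfig` rather than downloading pretrained weights; in the actual pipeline one would typically call `BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)` and fine-tune on labeled data.

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny, randomly initialised config so the sketch runs without network access.
# The real notebook pipeline would load pretrained BERT weights instead.
config = BertConfig(
    vocab_size=30522,        # size of the standard BERT WordPiece vocabulary
    hidden_size=64,          # shrunk from 768 purely for illustration
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
    num_labels=2,            # two classes: human-written vs AI-generated
)
model = BertForSequenceClassification(config)
model.eval()

# Stand-in batch of token ids, as a tokenizer would produce from raw text.
input_ids = torch.randint(0, config.vocab_size, (4, 16))
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits

# One score per class for each of the 4 example texts: shape (4, 2).
print(logits.shape)
```

The classifier head maps BERT's pooled `[CLS]` representation to two logits, which a softmax turns into class probabilities during inference.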