This repository contains the final project for the Aprendizagem Automática (Machine Learning) course.
The data is provided in two pickle files: one for training (rateBeer75Ktrain.p) with 75,000 beer reviews, and one for testing (rateBeer25Ktest.p) with 25,000 beer reviews. The reviews were extracted from the RateBeer website and were collected over a period of ten years. Each file contains a dictionary whose values are themselves dictionaries, each holding the information for one review of a given beer.
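As a rough illustration of the layout described above, the sketch below loads one of the pickle files and inspects a single review without assuming any field names. The helper names `load_reviews` and `peek` are hypothetical and not part of the project code, and the file is assumed to sit in the working directory. If the files were pickled under Python 2, passing `encoding="latin1"` to `pickle.load` may be necessary. Only unpickle files from a source you trust.

```python
import pickle
from pathlib import Path


def load_reviews(path):
    """Load one pickle file and return its top-level dictionary of reviews."""
    with open(path, "rb") as fh:
        data = pickle.load(fh)
    # The description above says the top-level object is a dictionary.
    if not isinstance(data, dict):
        raise TypeError(f"expected a dict, got {type(data).__name__}")
    return data


def peek(reviews, n=1):
    """Return the first n (key, review) pairs; no field names are assumed."""
    return list(reviews.items())[:n]


if __name__ == "__main__":
    # Assumption: the training file is in the current working directory.
    train_path = Path("rateBeer75Ktrain.p")
    if train_path.exists():
        train = load_reviews(train_path)
        print("number of reviews:", len(train))
        for key, review in peek(train):
            # Print the field names actually present in one review.
            print(key, sorted(review.keys()))
```

Listing the keys of one review first is a cheap way to confirm which fields are available before writing any cleaning or modelling code.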
Below are the teacher's notes on the project:
Good explanation of all the concepts in AA, good explanation of the text cleaning process, and a correct optimisation process. Use of an innovative process in cross-validation. Use of all the metrics in the binary problem and understanding of the problem in the multiclass problem. They carried out three extra tasks: clustering, linear regression and PCA. However, in the multiclass problem they used the outputs from the regression. Very good work that showed dedication and effort.
Project's Grade: 18/20
Below are the results:
Binary Classification - 75K train

Multiclass Classification - 75K train


