Sentiment Classification
[ Baseline ] Sentiment Classification
A getting-started notebook that uses a Random Forest to classify facial sentiment from face embeddings.
Starter Code for Sentiment Classification
In this baseline, we will train an sklearn model to perform multi-class classification of sentiment from face embeddings.
Downloading Dataset¶
Installing puzzle datasets via aicrowd-cli
!pip install aicrowd-cli
# Make sure to re-run the code below whenever you restart the Colab notebook
%load_ext aicrowd.magic
# Log in to your AIcrowd account. Make sure you have accepted the puzzle rules before logging in!
%aicrowd login
# Creating a new data directory and downloading the dataset
!rm -rf data
!mkdir data
%aicrowd ds dl -c sentiment-classification -o data
Importing Libraries¶
In this baseline, we will be using sklearn's RandomForestClassifier to classify the sentiment of face embeddings.
import pandas as pd
import os
import numpy as np
from ast import literal_eval
import random
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, accuracy_score
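# Seed Python's and NumPy's global RNGs; scikit-learn draws from NumPy's
# global RNG when no random_state is passed to the estimator
random.seed(42)
np.random.seed(42)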
Reading Dataset¶
As mentioned in the challenge readme, three sets are provided: train, validation, and test.
# Reading the CSV files
train = pd.read_csv("data/train.csv")
val = pd.read_csv("data/val.csv")
submission = pd.read_csv("data/sample_submission.csv")
train
# Getting the features and labels from each set
X = [literal_eval(embedding) for embedding in train['embeddings'].values]
y = train['label'].values
X_val = [literal_eval(embedding) for embedding in val['embeddings'].values]
y_val = val['label'].values
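The parsing step above can be checked on a small example. The sketch below assumes, as the code above suggests, that each embedding is stored in the CSV as a stringified list. The toy values, the label names, and the names `toy`, `X_toy`, and `y_toy` are made up for illustration and are not taken from the dataset.

```python
from ast import literal_eval

import numpy as np
import pandas as pd

# Toy stand-in for train.csv (hypothetical values and label names)
toy = pd.DataFrame({
    "embeddings": ["[0.1, 0.2, 0.3]", "[0.4, 0.5, 0.6]"],
    "label": ["positive", "negative"],
})

# literal_eval safely parses each string into a Python list of floats;
# stacking the lists into one array makes the shape easy to inspect
X_toy = np.array([literal_eval(e) for e in toy["embeddings"].values], dtype=np.float32)
y_toy = toy["label"].values

print(X_toy.shape)  # (number of samples, embedding dimension)
```

Checking the shape this way on the real data is a quick way to confirm that every row parsed to the same embedding length before training.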
Training the model¶
Here, we will be training our model using the training set.
model = RandomForestClassifier()
model
model.fit(X, y)
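Random forests are randomized, so two training runs can produce slightly different models. Passing `random_state` to the estimator is one explicit way to make a run repeatable. The sketch below uses synthetic data, and the values `n_estimators=50` and `random_state=42` are arbitrary choices for illustration, not tuned settings for this puzzle.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the embeddings: 60 samples, 8 features, 3 classes
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 8))
y_demo = rng.integers(0, 3, size=60)

# Two forests built with the same random_state on the same data
model_a = RandomForestClassifier(n_estimators=50, random_state=42).fit(X_demo, y_demo)
model_b = RandomForestClassifier(n_estimators=50, random_state=42).fit(X_demo, y_demo)

# Identical seeds and data should give identical class probabilities
same = np.array_equal(model_a.predict_proba(X_demo), model_b.predict_proba(X_demo))
print(same)
```

A fixed seed makes it easier to tell whether a change in score comes from a change you made or from run-to-run variation.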
Testing the Model¶
Here, we will evaluate our model on the validation set.
y_pred = model.predict(X_val)
print(f"F1 Score : {f1_score(y_val, y_pred, average='weighted')}")
print(f"Accuracy Score : {accuracy_score(y_val, y_pred)}")
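A single weighted F1 or accuracy number hides which classes the model confuses. As a minimal sketch, scikit-learn's `confusion_matrix` and `classification_report` give a per-class view. The label names and the toy predictions below are hypothetical and only serve to show the output layout.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical true and predicted labels (not taken from the dataset)
y_true_demo = ["negative", "neutral", "positive", "positive", "neutral", "negative"]
y_pred_demo = ["negative", "positive", "positive", "positive", "neutral", "neutral"]

# Fixing the label order keeps rows and columns of the matrix readable
labels = ["negative", "neutral", "positive"]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=labels)
print(cm)

# Per-class precision, recall, and F1
print(classification_report(y_true_demo, y_pred_demo, labels=labels, zero_division=0))
```

On the real data, the same two calls can be run with `y_val` and `y_pred` in place of the toy lists.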
Generating the Predictions¶
Generating predictions from the test data to make a submission in the puzzle.
submission_embeddings = [literal_eval(embedding) for embedding in submission['embeddings'].values]
predictions = model.predict(submission_embeddings)
predictions.shape
submission['label'] = predictions
submission
Saving the Predictions¶
# Saving the predictions
!rm -rf assets
!mkdir assets
submission.to_csv(os.path.join("assets", "submission.csv"))
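Note that `to_csv` writes the DataFrame index as an extra unnamed column by default. Whether the puzzle's evaluator accepts or ignores that column is not stated here, so the sketch below only shows the difference between the two forms on a toy frame. The file names and values are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for the submission frame (hypothetical values)
demo = pd.DataFrame({
    "embeddings": ["[0.1, 0.2]", "[0.3, 0.4]"],
    "label": ["positive", "negative"],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    with_index = os.path.join(tmp_dir, "with_index.csv")
    without_index = os.path.join(tmp_dir, "without_index.csv")

    demo.to_csv(with_index)                   # default: index written as first column
    demo.to_csv(without_index, index=False)   # index omitted

    # Read both files back to compare the resulting columns
    cols_with_index = list(pd.read_csv(with_index).columns)
    cols_without_index = list(pd.read_csv(without_index).columns)

print(cols_with_index)
print(cols_without_index)
```

If a submission is rejected for having unexpected columns, passing `index=False` is one thing to try.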
Submitting our Predictions¶
%aicrowd notebook submit -c sentiment-classification -a assets --no-verify
Congratulations on making your first submission in the puzzle 🎉. Let's continue the journey by improving the baseline and making more submissions! Don't be shy about asking questions in the discussion forum or on the AIcrowd Discord server about any errors you get or doubts you have about any part of this notebook. AIcrew will be happy to help you :)
Have a cool new idea that you want to see in the next blitz? Let us know!