Data Purchasing Challenge 2022
A sneak peek at the image samples from the Round 2 dataset.
This notebook helps you visualise images from different classes and from combinations of classes.
Take a quick look at the image samples for the different class labels, in particular the 'stray_particle' and 'discoloration' classes newly introduced in Round 2 of this challenge.
We use the deepml Python library to visualize these images quickly.
In [ ]:
!pip install deepml
In [1]:
import pandas as pd
import deepml
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib as mpl
#mpl.rcParams['text.color'] = 'white'
In [2]:
train_df = pd.read_csv("data-purchasing-challenge-2022-starter-kit/data/training/labels.csv")
train_df.info()
In [3]:
train_df.head()
Out[3]:
Create an additional class called 'no_defect' for image samples containing no damage.
In [4]:
train_df['no_defect'] = (~train_df.iloc[:, 1:].any(axis=1)).astype(int)
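To see what the expression above does, here is a minimal sketch on a toy frame. The frame `toy_df`, its file names, and its label values are made up for illustration; only the layout (a filename column followed by 0/1 label columns) is assumed to match labels.csv.

```python
import pandas as pd

# Toy stand-in for labels.csv; names and values are illustrative only.
toy_df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png"],
    "scratch_small": [1, 0, 0],
    "discoloration": [0, 0, 1],
})

# Same expression as the cell above: any(axis=1) is True when a row has
# at least one defect, so its negation marks the defect-free rows.
toy_df["no_defect"] = (~toy_df.iloc[:, 1:].any(axis=1)).astype(int)
print(toy_df)
```

Only the middle row has no defect flag set, so it is the only one that receives no_defect = 1. Note that the expression relies on the first column being the only non-label column.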
In [5]:
classes = train_df.columns[1:].tolist()
classes
Out[5]:
Since this is a multi-label classification challenge (one image can carry several defect labels), let's create a joined class label and look at its distribution.
In [6]:
# Join the names of all classes set to 1 in a row, e.g. "scratch_small discoloration"
train_df['joined_label'] = train_df[classes].apply(
    lambda row: " ".join(c for c in classes if row[c]), axis=1)
train_df.head()
Out[6]:
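As a rough illustration of the joined label, the same lambda can be run on a toy frame. The frame `toy_df`, the list `toy_classes`, and the label values are hypothetical and chosen only to show the three possible cases: one defect, no defect, and a combination.

```python
import pandas as pd

# Toy stand-in for train_df after 'no_defect' has been added (illustrative values).
toy_df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png"],
    "scratch_small": [1, 0, 1],
    "discoloration": [0, 0, 1],
    "no_defect": [0, 1, 0],
})
toy_classes = toy_df.columns[1:].tolist()

# Concatenate, in column order, the names of all classes flagged in each row.
toy_df["joined_label"] = toy_df[toy_classes].apply(
    lambda row: " ".join(c for c in toy_classes if row[c]), axis=1)
print(toy_df["joined_label"].tolist())
```

Because the names are joined in column order, the same combination of defects always yields the same string, which is what makes value_counts() on this column meaningful.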
In [7]:
train_df['joined_label'].value_counts()
Out[7]:
In [8]:
plt.figure(figsize=(10,15))
sns.countplot(y='joined_label', data=train_df)
Out[8]:
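The count plot shows how often each combination occurs. A complementary view, sketched here on hypothetical toy data, is the number of images per individual class, obtained by summing each 0/1 label column. The frame `toy_df` and its values are assumptions for illustration.

```python
import pandas as pd

# Toy 0/1 label matrix (illustrative values only).
toy_df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png", "d.png"],
    "scratch_small": [1, 0, 1, 1],
    "discoloration": [0, 0, 1, 0],
    "no_defect": [0, 1, 0, 0],
})
toy_classes = toy_df.columns[1:].tolist()

# Summing a 0/1 column counts the images that carry that label.
per_class_counts = toy_df[toy_classes].sum().sort_values(ascending=False)
print(per_class_counts)
```

On the real data the equivalent would be train_df[classes].sum(). An image with several defects is counted once per class, so these totals can add up to more than the number of images.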
In [9]:
from deepml.visualize import show_images_from_dataframe
Random samples from the training CSV file
In [10]:
train_image_dir = "data-purchasing-challenge-2022-starter-kit/data/training/images"
show_images_from_dataframe(train_df, img_dir=train_image_dir, image_file_name_column='filename',
                           label_column='joined_label', samples=10, cols=2, figsize=(10, 30))
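To focus on one of the newly introduced classes rather than a random mix, one option is to filter the frame before plotting. The sketch below does this for 'discoloration' on a toy frame. `toy_df`, `discoloration_df`, and the values are hypothetical, and the plotting call is left out so the snippet runs without the image files.

```python
import pandas as pd

# Toy stand-in for train_df (illustrative values only).
toy_df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png", "d.png"],
    "discoloration": [0, 1, 1, 0],
    "joined_label": ["no_defect", "discoloration",
                     "scratch_small discoloration", "scratch_small"],
})

# Keep only the rows where the class of interest is flagged.
discoloration_df = toy_df[toy_df["discoloration"] == 1].reset_index(drop=True)
print(discoloration_df)
```

On the real data, the filtered frame could presumably be passed to show_images_from_dataframe in place of train_df, with the same arguments as in the cell above. If the filtered frame has fewer rows than the requested samples, the samples value may need to be lowered.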