Data Purchasing Challenge 2022
A sneak peek at the image samples from the Round 2 dataset.
This notebook helps you visualise images from the different class labels and their combinations, in particular 'stray_particle' and 'discoloration', the two new classes introduced in Round 2 of this challenge.
We use the deepml Python library to visualise these images quickly.
!pip install deepml
import pandas as pd
import deepml
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib as mpl
#mpl.rcParams['text.color'] = 'white'
train_df = pd.read_csv("data-purchasing-challenge-2022-starter-kit/data/training/labels.csv")
train_df.info()
train_df.head()
Create an additional class called 'no_defect' for image samples containing no damage.
train_df['no_defect'] = (~train_df.iloc[:, 1:].any(axis=1)).astype(int)
classes = train_df.columns[1:].tolist()
classes
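To see what the 'no_defect' flag does, here is a minimal sketch on a made-up three-row frame. The filenames and label values are invented for illustration, and the layout (filename first, then binary label columns) is assumed to match labels.csv.

```python
import pandas as pd

# Toy stand-in for labels.csv: filenames and values are invented for illustration.
toy_df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png"],
    "scratch_small": [1, 0, 0],
    "dent_small": [0, 0, 1],
})

# A row is 'no_defect' when none of its label columns
# (every column after 'filename') is set.
toy_df["no_defect"] = (~toy_df.iloc[:, 1:].any(axis=1)).astype(int)

print(toy_df)
```

Only the middle row, which has no label set, ends up with no_defect equal to 1.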
Since this is a multi-label classification challenge (one image can carry several labels at once), let's create a joined class label and look at its distribution.
train_df['joined_label'] = train_df[classes].apply(lambda row: " ".join([c for c in classes if row[c]]),
axis=1)
train_df.head()
train_df['joined_label'].value_counts()
plt.figure(figsize=(10,15))
sns.countplot(y='joined_label', data=train_df)
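The joined label above can be hard to picture, so here is a small sketch on invented data. It also computes per-class counts, which complement the joined distribution because a single image contributes to every class it carries. The frame and the names toy_df and toy_classes are assumptions made for this example only.

```python
import pandas as pd

# Toy frame with invented rows; columns mimic the filename + binary label layout.
toy_df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png", "d.png"],
    "scratch_small": [1, 1, 0, 1],
    "dent_small": [0, 1, 0, 1],
    "no_defect": [0, 0, 1, 0],
})
toy_classes = toy_df.columns[1:].tolist()

# Join the names of all labels set on a row, in column order.
toy_df["joined_label"] = toy_df[toy_classes].apply(
    lambda row: " ".join(c for c in toy_classes if row[c]), axis=1
)

# How often each label combination occurs.
label_counts = toy_df["joined_label"].value_counts()

# How often each individual class occurs, regardless of combination.
per_class_counts = toy_df[toy_classes].sum()

print(label_counts)
print(per_class_counts)
```

Note that the joined string depends on the column order of the frame, so any later string comparison has to list the labels in that same order.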
from deepml.visualize import show_images_from_dataframe
Random samples from training csv file
train_image_dir = "data-purchasing-challenge-2022-starter-kit/data/training/images"
show_images_from_dataframe(train_df, img_dir = train_image_dir, image_file_name_column='filename',
label_column='joined_label', samples=10, cols=2, figsize=(10, 30))
from deepml.visualize import show_images_from_folder
Image samples showing only large scratches (scratch_large)
show_images_from_folder(train_image_dir, images= train_df[train_df['joined_label'] == 'scratch_large']['filename'].tolist())
Image samples showing only small scratches (scratch_small)
show_images_from_folder(train_image_dir, images= train_df[train_df['joined_label'] == 'scratch_small']['filename'].tolist(),
figsize=(15,20))
Image samples showing only small dents (dent_small)
show_images_from_folder(train_image_dir, images=train_df[train_df['joined_label'] == 'dent_small']['filename'].tolist()[:12],
figsize=(15, 20))
Watch out for noisy labels in the dataset. For example, the image file j1NNKMd2ho.png may not contain any damage.
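One way to act on such observations is to keep a list of suspect files and set them aside for manual review. The sketch below is an assumption about workflow, not part of the starter kit: only j1NNKMd2ho.png comes from the note above, and the second filename and the frame itself are invented.

```python
import pandas as pd

# Hypothetical review list; only j1NNKMd2ho.png comes from the note above.
suspect_files = {"j1NNKMd2ho.png"}

# Toy frame: 'other.png' is an invented filename.
toy_df = pd.DataFrame({
    "filename": ["j1NNKMd2ho.png", "other.png"],
    "joined_label": ["dent_small", "dent_small"],
})

# Flag suspects instead of deleting them, so they can be re-checked later.
toy_df["suspect"] = toy_df["filename"].isin(suspect_files)

# Rows that remain after setting the suspects aside.
reviewed_df = toy_df[~toy_df["suspect"]]
print(reviewed_df)
```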
Image samples showing only large dents (dent_large)
show_images_from_folder(train_image_dir, images= train_df[train_df['joined_label'] == 'dent_large']['filename'].tolist(),
figsize=(15, 10))
Image samples showing only discoloration (discoloration)
show_images_from_folder(train_image_dir, images= train_df[train_df['joined_label'] == 'discoloration']['filename'].tolist(), figsize=(15, 20))
Only one sample shows discoloration as its sole damage type.
Image samples showing only stray particles (stray_particle)
show_images_from_folder(train_image_dir, images= train_df[train_df['joined_label'] == 'stray_particle']['filename'].tolist()[:12],
figsize=(15, 20))
Image samples showing no damage (no_defect)
show_images_from_folder(train_image_dir, images= train_df[train_df['joined_label'] == 'no_defect']['filename'].tolist()[:12],
figsize=(15, 20))
Similarly, we can look at image samples containing different combinations of class labels.
Image samples showing all damages (scratch_small, scratch_large, dent_small, dent_large, stray_particle, discoloration)
show_images_from_folder(train_image_dir, images= train_df[train_df['joined_label'] == 'scratch_small scratch_large dent_small dent_large stray_particle discoloration']['filename'].tolist()[:12],
figsize=(15, 20))
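Matching on the joined string only works if the labels are typed in the same order as the CSV columns. As an alternative, here is a sketch of a helper that selects rows by the set of labels, independent of order. The function name rows_with_exact_labels and the toy frame are hypothetical and are not part of deepml or the starter kit.

```python
import pandas as pd

def rows_with_exact_labels(df, label_columns, wanted):
    """Return rows where exactly the `wanted` label columns are set."""
    wanted = set(wanted)
    unknown = wanted - set(label_columns)
    if unknown:
        raise ValueError(f"unknown label columns: {sorted(unknown)}")
    # Boolean view of the label columns.
    flags = df[label_columns].astype(bool)
    # Expected on/off pattern, indexed by column name.
    target = pd.Series({c: c in wanted for c in label_columns})
    # Keep rows whose pattern matches the target in every label column.
    mask = flags.eq(target, axis="columns").all(axis=1)
    return df[mask]

# Toy frame with invented rows.
toy_df = pd.DataFrame({
    "filename": ["a.png", "b.png", "c.png"],
    "dent_small": [1, 1, 0],
    "scratch_small": [1, 0, 1],
})

both = rows_with_exact_labels(
    toy_df, ["dent_small", "scratch_small"], ["scratch_small", "dent_small"]
)
print(both["filename"].tolist())
```

The resulting filename list can be passed to show_images_from_folder through its images argument, in the same way as in the cells above.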
Comments
Interesting! It’s cool to see that there are actually noisy labels as was shared on Discourse.