
Food Recognition Challenge

[Baseline] Detectron2 starter kit for food recognition 🍕

A beginner-friendly notebook to kick-start your instance segmentation skills with detectron2

jyotish


This dataset and notebook correspond to the Food Recognition Challenge being held on AIcrowd.

Join the community!
Chat on Discord

🍕 Food Recognition Challenge: Detectron2 starter kit

This notebook aims to build a model for food detection and segmentation using detectron2.

How to use this notebook? 📝

  1. Copy the notebook. This is a shared template and any edits you make here will not be saved. You should copy it into your own Drive folder. To do this, click the "File" menu (top-left), then "Save a Copy in Drive". You can edit your copy however you like.
  2. Make a submission. Run all the code in the notebook to get a feel for how the notebook and the submission process work.
  3. Try tweaking the parameters. If you are new to the problem, a great way to start is to tweak the configuration flags, train your model, and submit again.
  4. Dive into the code. When you submit via this notebook, we create a repository on gitlab.aicrowd.com. You can check the code we generated based on this notebook and make the changes you want directly there!

Set up the notebook 🛠

In [ ]:
!bash <(curl -sL https://gitlab.aicrowd.com/jyotish/food-recognition-challenge-detectron2-baseline/raw/master/utils/setup-colab.sh)
AIcrowd installer starting...
Setting up the environment for you!
⚙️ Installing PyTorch...
⚙️ Installing COCO API...
  Running command git clone -q https://github.com/cocodataset/cocoapi.git /tmp/pip-req-build-n74pv0xj
⚙️ Installing detectron...
🗄 Preparing the dataset for training...
🗄 Preparing the validation dataset...
All set! 🎉🍻

Configure static variables 📎

In [ ]:
class Paths:
  DATASET_DIR = "dataset"
  TRAIN_DATA_DIR = f"{DATASET_DIR}/train"
  TRAIN_IMAGES_DIR = f"{TRAIN_DATA_DIR}/images"
  TRAIN_ANNOTATIONS = f"{TRAIN_DATA_DIR}/annotations.json"
  VAL_DATA_DIR = f"{DATASET_DIR}/val"
  VAL_ANNOTATIONS = f"{VAL_DATA_DIR}/annotations.json"
  VAL_IMAGES_DIR = f"{VAL_DATA_DIR}/images"


class DatasetLabels:
  TRAIN = "dataset_train"
  VAL = "dataset_val"

Packages 🗃

Import all the packages you need to define your model.

In [ ]:
import os
from multiprocessing import Pool
import json

from tqdm.notebook import tqdm
from pycocotools.coco import COCO
import numpy as np
import cv2

from detectron2.data.datasets import register_coco_instances
import detectron2
from detectron2.utils.logger import setup_logger
from detectron2.utils.visualizer import Visualizer
from detectron2.utils.visualizer import ColorMode
from detectron2.data import MetadataCatalog
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader
from detectron2.engine import DefaultPredictor
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
from detectron2 import model_zoo
from detectron2.utils.events import get_event_storage
from detectron2.engine import HookBase

Loading the data 📲

In [ ]:
with open(Paths.TRAIN_ANNOTATIONS) as fp:
  annotations = json.load(fp)

Helper functions to clean the dataset

First, we will check whether the image dimensions stored in the annotations match the actual images. These helper functions will let us do that.

In [ ]:
image_dir = ""


def validate_annotation(annotation):
  """Check the image dimensions and fix them if needed
  """
  filepath = os.path.join(image_dir, annotation.get("file_name"))
  if not os.path.exists(filepath):
    print("Skipping", filepath)
    return annotation
  img = cv2.imread(filepath)
  if img.shape[0] != annotation.get("height") or img.shape[1] != annotation.get("width"):
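    # The stored dimensions don't match the actual image; assume height and width were swapped and correct them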
    annotation["height"], annotation["width"] = annotation["width"], annotation["height"]
  return annotation


def clean_annotations(annotation_images):
  """Read the image dimensions and fix them in parallel
  """
  annotated_images = []

  with Pool() as p:
    total_images = len(annotation_images)

    with tqdm(total=total_images) as progress_bar:
      for annotation in p.imap(validate_annotation, annotation_images):
        annotated_images.append(annotation)
        progress_bar.update(1)

  return annotated_images

Clean the training data 🧹

In [ ]:
image_dir = Paths.TRAIN_IMAGES_DIR
annotations["images"] = clean_annotations(annotations.get("images"))

with open(Paths.TRAIN_ANNOTATIONS, "w") as fp:
  json.dump(annotations, fp)

Clean the validation data 🧹

In [ ]:
image_dir = Paths.VAL_IMAGES_DIR

with open(Paths.VAL_ANNOTATIONS) as fp:
  validation_annotations = json.load(fp)

validation_annotations["images"] = clean_annotations(validation_annotations.get("images"))

with open(Paths.VAL_ANNOTATIONS, "w") as fp:
  json.dump(validation_annotations, fp)

Initialize detectron2

In [ ]:
_ = setup_logger()

register_coco_instances(DatasetLabels.TRAIN, {}, Paths.TRAIN_ANNOTATIONS, Paths.TRAIN_IMAGES_DIR)
register_coco_instances(DatasetLabels.VAL, {}, Paths.VAL_ANNOTATIONS, Paths.VAL_IMAGES_DIR)
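
As an optional sanity check (not part of the original baseline), you can load the registered training set once and confirm that the images and class list were picked up. This is a minimal sketch, assuming the registration above succeeded:

from detectron2.data import DatasetCatalog

# Loading the dataset dicts parses the COCO json and fills in the dataset metadata
train_dicts = DatasetCatalog.get(DatasetLabels.TRAIN)
print("Training images:", len(train_dicts))
print("Number of classes:", len(MetadataCatalog.get(DatasetLabels.TRAIN).thing_classes))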

Build your Model 🏭

We will use Mask R-CNN to generate the segmentation masks for the food items 🌯

Configure detectron2

Detectron2 offers a variety of instance segmentation models. We will use the model zoo's Mask R-CNN with a ResNet-50 FPN backbone. If you want to try other models, you can find them in the detectron2 model zoo.

In [ ]:
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))

cfg.DATASETS.TRAIN = (DatasetLabels.TRAIN,)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 273  # Number of output classes

cfg.OUTPUT_DIR = "outputs"
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

Load the pre-trained weights

In [ ]:
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 2

Hyperparameters

In [ ]:
cfg.SOLVER.BASE_LR = 0.00025  # Learning rate
cfg.SOLVER.MAX_ITER = 20000  # Max iterations
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128  # Number of RoIs per image used to train the RoI heads
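
If you want to experiment further, detectron2 exposes a few more solver options on the same config object. The values below are only an illustrative sketch, not tuned for this dataset:

cfg.SOLVER.STEPS = (12000, 16000)    # iterations at which the learning rate is decayed
cfg.SOLVER.GAMMA = 0.1               # learning rate decay factor applied at each step
cfg.SOLVER.CHECKPOINT_PERIOD = 5000  # save a checkpoint every N iterations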

Train the model 🚂

We will set up TensorBoard to monitor the performance of the model while it is training.

Setting up TensorBoard

In [ ]:
%load_ext tensorboard
%tensorboard --logdir outputs

Train the Model

In [ ]:
trainer = DefaultTrainer(cfg) 
trainer.resume_or_load(resume=False)
trainer.train()

Evaluating the model 🧪

We will check the performance of our model on the validation dataset.

In [ ]:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5   # set the testing threshold for this model
cfg.DATASETS.TEST = (DatasetLabels.VAL, )
predictor = DefaultPredictor(cfg)

Generate predictions on validation data

In [ ]:
evaluator = COCOEvaluator(DatasetLabels.VAL, cfg, False, output_dir=cfg.OUTPUT_DIR)
data_loader = build_detection_test_loader(cfg, DatasetLabels.VAL)
results = inference_on_dataset(predictor.model, data_loader, evaluator)

Visualizing the results 👓

Numbers are good, but visualizations are better!

In [ ]:
metadata = MetadataCatalog.get(DatasetLabels.VAL)

# Load the validation annotations if they are not already loaded
if not validation_annotations:
  with open(Paths.VAL_ANNOTATIONS) as json_file:
      validation_annotations = json.load(json_file)

Check the predictions

Note: If you cannot see segmentation masks on the images, it generally means that the model didn't predict a mask for that image. You can verify this by running:

predictions = predictor(img)
print(predictions)
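
If you want a more targeted check, here is a minimal sketch building on the snippet above (it assumes predictions came from the DefaultPredictor built earlier):

instances = predictions["instances"].to("cpu")
# If no instances survive the score threshold, nothing will be drawn on the image
print("Detected instances:", len(instances))
print(instances.pred_classes)
print(instances.scores)
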
In [ ]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams['figure.dpi'] = 180


# Visualize some random images
for i in range(8):
  image_filename = np.random.choice(validation_annotations.get("images")).get("file_name")
  image_filename = os.path.join(Paths.VAL_IMAGES_DIR, image_filename)

  img = cv2.imread(image_filename)
  predictions = predictor(img)

  v = Visualizer(img[:, :, ::-1],
    metadata=metadata, 
    scale=0.5, 
    # instance_mode=ColorMode.IMAGE_BW
  )
  annotated_image = v.draw_instance_predictions(predictions["instances"].to("cpu"))

  plt.subplot(2, 4, i+1)
  plt.axis('off')
  plt.imshow(annotated_image.get_image())

A note on class ID mappings

Here is what a category object looks like:

{
  "id": 2578,
  "name": "water",
  "name_readable": "Water",
  "supercategory": "food"
}

Detectron2 usually maps the category IDs to contiguous numbers. For example, consider the following categories,

[
  {
    "id": 2578,
    "name": "water",
    "name_readable": "Water",
    "supercategory": "food"
  },
  {
    "id": 1157,
    "name": "pear",
    "name_readable": "Pear",
    "supercategory": "food"
  },
  {
    "id": 2022,
    "name": "egg",
    "name_readable": "Egg",
    "supercategory": "food"
  }
]

Detectron2 internally maps these categories to something like:

{
  0: 2578, # detectron_id: actual_class_id
  1: 1157,
  2: 2022
}

So, when your model detects water, the prediction class ID that your model returns will be 0, not 2578. Make sure to map these detectron2 IDs back to their original category IDs for your submission to be scored properly.

Here's how you can get this mapping.

In [ ]:
coco_api = COCO(Paths.TRAIN_ANNOTATIONS)

category_ids = sorted(coco_api.getCatIds())
categories = coco_api.loadCats(category_ids)

class_to_category = { int(class_id): int(category_id) for class_id, category_id in enumerate(category_ids) }

with open("class_to_category.json", "w") as fp:
  json.dump(class_to_category, fp)
loading annotations into memory...
Done (t=2.93s)
creating index...
index created!
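
Once you have this mapping, applying it to a model's output is straightforward. The sketch below is illustrative only (it assumes the predictor from above and a hypothetical image_filename from the validation set), not part of the official submission pipeline:

img = cv2.imread(image_filename)  # image_filename is a hypothetical example path
instances = predictor(img)["instances"].to("cpu")

# Map detectron2's contiguous class IDs back to the original category IDs
predicted_category_ids = [class_to_category[int(c)] for c in instances.pred_classes]
print(predicted_category_ids)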

Ready? Submit to AIcrowd 🚀

Now you can submit the trained model to AIcrowd!

Submission configuration ⚙️

In [ ]:
aicrowd_submission = {
    "author": "<your name>",
    "username": "<your aicrowd username>",
    "description": "initial submission with detectron",
    "debug": False,
    "model_path": "outputs/model_final.pth",
    "model_type": "model_zoo",
    "model_config_file": "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml",
    "detectron_model_config": {
      "ROI_HEADS": {
        "SCORE_THRESH_TEST": 0.5,
        "NUM_CLASSES": 273
      }
    }
}

aicrowd_submission["description"] = aicrowd_submission["description"].replace(" ", "-")
with open("aicrowd.json", "w") as fp:
  json.dump(aicrowd_submission, fp)

Submit to AIcrowd

Note: We will create an SSH key on your Google Drive. This key will be used to identify you on gitlab.aicrowd.com.

In [ ]:
!bash <(curl -sL https://gitlab.aicrowd.com/jyotish/food-recognition-challenge-detectron2-baseline/raw/master/utils/submit-colab.sh)
