rohit.vision

Object Detection

R-CNN family, YOLO variants, DETR, anchor-free detectors, and open-vocabulary detection

1. CenterNet
   Center-based keypoint detection using corners as proposals
2. CornerNet
   Anchor-free detection using top-left and bottom-right corner heatmaps
3. DETR
   End-to-end object detection with transformers using bipartite matching
4. DINO & Grounding DINO (WIP)
   DETR with Improved deNoising anchOr boxes and open-set grounding
5. ExtremeNet
   Object detection via extreme point prediction
6. InternImage (WIP)
   Large-scale vision foundation model with deformable convolutions
7. OverFeat
   Sliding window + bbox regression + classification
8. OWLv2 (WIP)
   Open-World Localization with vision-language models
9. R-CNN Family
   R-CNN, SPPNet, Fast R-CNN, Faster R-CNN: the evolution of region-based detectors
10. RetinaNet
    Focal loss for dense object detection, addressing class imbalance
11. Selective Search (WIP)
    Region proposal algorithm for object detection
12. SSD (WIP)
    Single Shot MultiBox Detector
13. Swin Transformer (WIP)
    Hierarchical vision transformer with shifted windows
14. YOLO-World
    Open-vocabulary YOLO with vision-language modeling and RepVL-PAN
15. YOLO
    You Only Look Once: single-shot grid-based object detection
16. YOLOE (WIP)
    Efficient YOLO variant

© 2026 Rohit Kumar. rohit.vision