AI Engineer @ Armada AI

Bridging Computer Vision & LLMs in Production

I've trained CNNs to see and transformers to reason. Now I build systems where both work together - from diffusion pipelines shipping photorealistic product imagery at Avataar AI, to agentic AI pipelines at Armada AI. IISc Bangalore alumnus, GATE AIR 221.

GATE AIR 221 · 2+ Years ML · IISc M.Tech AI
01 About
Rohit Kumar - AI Engineer, IISc Bangalore M.Tech alumnus

From Signals to Neural Networks

My path to AI started in electrical engineering - clearing GATE (AIR 221) and BARC, then choosing research over a government job. That decision led me to IISc, where I published on continual learning (WACV 2025) and discovered my calling: building systems where vision and language work together.

Today, I ship AI that matters: I built diffusion pipelines at Avataar AI, now build agentic systems at Armada AI, and serve as an invited reviewer for NeurIPS, CVPR & ECCV 2026. The boundary between what machines see and what they understand is blurring. I build at that edge.

Currently reading: Reinforcement Learning - exploring how agents learn to act optimally through interaction.

02 Work Experience

AI Engineer

Armada AI
Jun 2025 - Present
Trivandrum, India
  • Architecting multi-agent RAG system with LangGraph orchestrating planner, router, retrieval, evaluation, and generation stages
  • Built hybrid search combining dense embeddings, BM25 sparse vectors, and Reciprocal Rank Fusion over Qdrant with Jina reranking
  • Developed ingestion pipeline: Playwright web scraping, Docling PDF extraction, and structure-aware chunking into Qdrant
  • Shipping production stack with FastAPI, Chainlit UI, Langfuse observability, Docker Compose, and PostgreSQL
LangGraph · RAG · Qdrant · FastAPI · Hybrid Search · Playwright · Docker · Langfuse

Research Engineer

Avataar AI
Jul 2024 - Apr 2025
Bangalore, India
  • Built end-to-end lifestyle image generation pipeline using Flux Model and ControlNets
  • Modified diffusion sampling for improved object reconstruction with intrinsic decomposition
  • Developed classification systems using CLIP, BLIP-2, and Qwen2.5 for low-data scenarios
  • Enhanced segmentation accuracy with BiRefNet and SAM + YOLO-World integration
Flux · ControlNet · Diffusion · CLIP · SAM · YOLO

Teaching Assistant

IISc Bangalore - Signal Processing
Jan 2024 - Apr 2024
Bangalore, India
  • Integrated continual learning frameworks (L2P, DualPrompt) to mitigate catastrophic forgetting
  • Built self-supervised models using MoCo and SimCLR for visual representation learning
  • Developed adaptive prompt-based learning with dynamic token expansion
L2P · DualPrompt · MoCo · SimCLR · PyTorch

Teaching Assistant

IISc Bangalore - Digital Image Processing
Aug 2023 - Dec 2023
Bangalore, India
  • Developed DFT-based frequency domain filtering for image denoising and enhancement
  • Implemented SIFT and Normalized Cut for feature detection and segmentation
  • Optimized deep learning models using EfficientNet-B0 with custom classifiers
DFT · SIFT · EfficientNet · OpenCV · NumPy
03 Projects
TACLE overview diagram: comparison of CIL, SS-CIL, and EFSS-CIL learning settings

TACLE

Exemplar-free semi-supervised class incremental learning. SOTA on CIFAR-10, CIFAR-100 & ImageNet-100 without storing any previous examples. Published at WACV 2025.

PyTorch · Continual Learning · Semi-supervised · WACV 2025
Agentic RAG pipeline: multi-agent LangGraph system with planner, router, retriever, evaluator, and generator stages

Agentic RAG System

Multi-agent RAG pipeline with LangGraph orchestrating planner, retriever, evaluator & generator stages. Hybrid search via dense + BM25 + RRF over Qdrant with Jina reranking.

LangGraph · Qdrant · Hybrid Search · FastAPI · Docker
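
For readers unfamiliar with Reciprocal Rank Fusion (RRF), the fusion step in the hybrid search above can be illustrated with a minimal Python sketch. This is not the production code: the function name, the document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions chosen for illustration. Each document scores `1 / (k + rank)` in every ranked list it appears in, and the scores are summed across lists.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (assumed value; 60 is a common default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense retriever and a BM25 retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) rise to the top, and a reranker can then reorder the fused candidates.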
Lifestyle image generation pipeline: Input, Segment (SAM+YOLO), Condition (ControlNet), Denoise (Flux), Output

Lifestyle Image Generation

End-to-end product image generation pipeline using Flux + ControlNets. Modified diffusion sampling with intrinsic decomposition for photorealistic object reconstruction.

Flux · ControlNet · Diffusion · SAM · CLIP
Virtual Try-On pipeline: 3-stage garment transfer using Florence2 and IDM-VTON

Virtual Try-On

Deep learning Virtual Try-On pipeline with 3-stage framework combining Florence2 and IDM-VTON models for automated garment transfer.

Florence2 · IDM-VTON · FLUX · Diffusion
Cricket shot classification using CLIP embeddings and VideoMAE fine-tuning

Cricket-Shot Predictor

LSTM-based video classification using CLIP embeddings. Fine-tuned VideoMAE achieving 66% accuracy on cricket shots. Live demo on HuggingFace Spaces.

LSTM · CLIP · VideoMAE · HuggingFace
04 Publications
AttriStory: Fine-Grained Attribute Realization for Visual Storytelling with Diffusion Models

Rohit Kumar, Aditya Rauniyar, Hari Chandana Kuchibhotla

CVPR 2026 Workshop on Generative AI for Storytelling (AISTORY)

TACLE: Task and Class-aware Exemplar-free Semi-supervised Class Incremental Learning

Jayateja Kalla*, Rohit Kumar*, Soma Biswas

WACV 2025 · Cited by 1

05 Skills

AI/ML

LLMs · RAG · Agentic AI · Computer Vision · Diffusion Models · NLP · Hybrid Search

Frameworks & Tools

Python · PyTorch · LangGraph · LangChain · HuggingFace · FastAPI · Chainlit · Playwright

Infrastructure

Docker · Helm · AWS · PostgreSQL · Qdrant · Langfuse
06 Education

M.Tech in Artificial Intelligence

Indian Institute of Science (IISc), Bangalore

2022 - 2024 · CGPA: 8.0/10.0

Pattern Recognition & Neural Networks, Computer Vision, Digital Image Processing, Deep Learning for NLP, LLMs for Practical NLP, Stochastic Models, Optimization

B.Tech in Electrical Engineering

Bhagalpur College of Engineering, Bhagalpur

2018 - 2021 · CGPA: 8.75/10.0

Diploma in Electrical Engineering

Government Polytechnic Muzaffarpur, Muzaffarpur

2015 - 2018 · 77.73%

Achievements & Certifications

AIR 221 · GATE EE 2022 · Score: 803 | Marks: 73/100
AIR 227 · GATE IN 2022 · Score: 670 | Marks: 67.33/100
Rank 1 · BCECE LE 2018 · State Lateral Entry Exam
Agents Course · HuggingFace · AI Agents Development
miniCON AI Infra · Marktechpost · AI Infrastructure
OpenCV Bootcamp · OpenCV University · Computer Vision
07 Academic Service
Invited Reviewer · NeurIPS 2026 · Conference on Neural Information Processing Systems
Invited Reviewer · CVPR 2026 · IEEE/CVF Conference on Computer Vision and Pattern Recognition
Invited Reviewer · ECCV 2026 · European Conference on Computer Vision
Organizing Committee · EE Summer School 2023 · IISc Bangalore
08 Contact

Let's Build Something Together

Open to collaboration on AI/ML projects, research opportunities, or just a chat about generative models and agentic systems.

contact@rohit.vision