AI Engineer @ Armada AI

Bridging Computer Vision & LLMs in Production

I've trained CNNs to see and transformers to reason. Now I build systems where both work together, from diffusion pipelines that ship photorealistic product imagery at Avataar AI to agentic AI pipelines at Armada AI. IISc Bangalore alumnus, GATE AIR 221.

01 About

From Signals to Neural Networks

My path to AI started in electrical engineering - clearing GATE (AIR 221) and BARC, then choosing research over a government job. That decision led me to IISc, where I published on continual learning (WACV 2025) and discovered my calling: building systems where vision and language work together.

Today, I ship AI that matters, from diffusion pipelines at Avataar AI to agentic systems at Armada AI. I also serve as an invited reviewer for NeurIPS, CVPR & ECCV 2026. The boundary between what machines see and what they understand is blurring. I build at that edge.

Currently reading: Reinforcement Learning, exploring how agents learn to act optimally through interaction.

02 Work Experience

AI Engineer

Armada AI

Jun 2025 - Present

Trivandrum, India

Architecting multi-agent RAG system with LangGraph orchestrating planner, router, retrieval, evaluation, and generation stages
Built hybrid search combining dense embeddings, BM25 sparse vectors, and Reciprocal Rank Fusion over Qdrant with Jina reranking
Developed ingestion pipeline: Playwright web scraping, Docling PDF extraction, and structure-aware chunking into Qdrant
Shipping production stack with FastAPI, Chainlit UI, Langfuse observability, Docker Compose, and PostgreSQL

LangGraph RAG Qdrant FastAPI Hybrid Search Playwright Docker Langfuse
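
The staged flow described above (planner, router, retrieval, evaluation, generation) can be sketched in plain Python. This is only an illustration under stated assumptions, not the production system: it does not use the LangGraph API, and the toy corpus, the keyword rules, and every function name here are hypothetical stand-ins for LLM calls and a vector store.

```python
from typing import Dict

# Toy in-memory corpus standing in for a vector store (hypothetical data).
CORPUS = {
    "docs": [
        "Qdrant stores dense and sparse vectors.",
        "Langfuse records traces for each request.",
    ],
    "web": ["Reciprocal Rank Fusion merges ranked lists."],
}


def planner(state: Dict) -> Dict:
    # Reduce the question to search terms; a real planner would call an LLM.
    words = state["question"].split()
    state["terms"] = [w.strip("?.,").lower() for w in words if len(w) > 3]
    return state


def router(state: Dict) -> Dict:
    # Pick a source with a simple keyword rule (assumption for illustration).
    state["source"] = "web" if "fusion" in state["terms"] else "docs"
    return state


def retriever(state: Dict) -> Dict:
    # Keep documents from the chosen source that mention any search term.
    docs = CORPUS[state["source"]]
    state["context"] = [
        d for d in docs if any(t in d.lower() for t in state["terms"])
    ]
    return state


def evaluator(state: Dict) -> Dict:
    # Accept the retrieval only if some context was found.
    state["ok"] = bool(state["context"])
    return state


def generator(state: Dict) -> Dict:
    # Compose the answer from the context, or report that none was found.
    if state["ok"]:
        state["answer"] = " ".join(state["context"])
    else:
        state["answer"] = "No supporting context found."
    return state


def run_pipeline(question: str, max_retries: int = 1) -> Dict:
    state = router(planner({"question": question}))
    for _ in range(max_retries + 1):
        state = evaluator(retriever(state))
        if state["ok"]:
            break
        # Evaluation failed: fall back to the other source and retry.
        state["source"] = "web" if state["source"] == "docs" else "docs"
    return generator(state)


result = run_pipeline("How does Qdrant store vectors?")
print(result["answer"])
```

The evaluator sits between retrieval and generation, so a weak retrieval triggers a retry against another source instead of producing an unsupported answer.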

Research Engineer

Avataar AI

Jul 2024 - Apr 2025

Bangalore, India

Built end-to-end lifestyle image generation pipeline using Flux Model and ControlNets
Modified diffusion sampling for improved object reconstruction with intrinsic decomposition
Developed classification systems using CLIP, BLIP2, and Qwen2.5 for low-data scenarios
Enhanced segmentation accuracy with BiRefNet and SAM + YOLO-world integration

Flux ControlNet Diffusion CLIP SAM YOLO

Teaching Assistant

IISc Bangalore - Signal Processing

Jan 2024 - Apr 2024

Bangalore, India

Integrated continual learning frameworks (L2P, DualPrompt) to mitigate catastrophic forgetting
Built self-supervised models using MoCo and SimCLR for visual representation learning
Developed adaptive prompt-based learning with dynamic token expansion

L2P DualPrompt MoCo SimCLR PyTorch

Teaching Assistant

IISc Bangalore - Digital Image Processing

Aug 2023 - Dec 2023

Bangalore, India

Developed DFT-based frequency domain filtering for image denoising and enhancement
Implemented SIFT and Normalized Cut for feature detection and segmentation
Optimized deep learning models using EfficientNet-B0 with custom classifiers

DFT SIFT EfficientNet OpenCV NumPy

03 Projects

TACLE

Exemplar-free semi-supervised class incremental learning. SOTA on CIFAR-10, CIFAR-100 & ImageNet-100 without storing any previous examples. Published at WACV 2025.

PyTorch Continual Learning Semi-supervised WACV 2025

Agentic RAG System

Armada AI

Multi-agent RAG pipeline with LangGraph orchestrating planner, router, retriever, evaluator & generator stages. Hybrid search via dense + BM25 + RRF over Qdrant with Jina reranking.

LangGraph Qdrant Hybrid Search FastAPI Docker
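
As a rough illustration of the RRF step in the hybrid search above, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. The document ids are made up, and `k=60` is the commonly used default constant, assumed here rather than taken from the actual system; the real pipeline runs over Qdrant and applies Jina reranking afterwards.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a dense retriever and a BM25 retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only ranks, dense similarity scores and BM25 scores never need to be normalized onto a common scale before fusion.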

Lifestyle Image Generation

Avataar AI

End-to-end product image generation pipeline using Flux + ControlNets. Modified diffusion sampling with intrinsic decomposition for photorealistic object reconstruction.

Flux ControlNet Diffusion SAM CLIP

Virtual Try-On

Three-stage deep learning virtual try-on pipeline combining Florence2 and IDM-VTON models for automated garment transfer.

Florence2 IDM-VTON FLUX Diffusion

Cricket-Shot Predictor

LSTM-based video classification using CLIP embeddings. Fine-tuned VideoMAE achieving 66% accuracy on cricket shots. Live demo on HuggingFace Spaces.

LSTM CLIP VideoMAE HuggingFace

04 Publications

New

AttriStory: Fine-Grained Attribute Realization for Visual Storytelling with Diffusion Models

Rohit Kumar, Manogna Sreenivas, Soma Biswas

CVPR 2026 Workshop on Generative AI for Storytelling (AISTORY)

Paper (Coming Soon)

TACLE: Task and Class-aware Exemplar-free Semi-supervised Class Incremental Learning

Jayateja Kalla*, Rohit Kumar*, Soma Biswas

WACV 2025 · Cited by 1

05 Skills

AI/ML

LLMs RAG Agentic AI Computer Vision Diffusion Models NLP Hybrid Search

Frameworks & Tools

Python PyTorch LangGraph LangChain HuggingFace FastAPI Chainlit Playwright

Infrastructure

Docker Helm AWS PostgreSQL Qdrant Langfuse

06 Education

M.Tech in Artificial Intelligence

Indian Institute of Science (IISc), Bangalore

2022 - 2024 · CGPA: 8.0/10.0

Pattern Recognition & Neural Networks, Computer Vision, Digital Image Processing, Deep Learning for NLP, LLMs for Practical NLP, Stochastic Models, Optimization

B.Tech in Electrical Engineering

Bhagalpur College of Engineering, Bhagalpur

2018 - 2021 · CGPA: 8.75/10.0

Diploma in Electrical Engineering

Government Polytechnic Muzaffarpur, Muzaffarpur

2015 - 2018 · 77.73%

Achievements & Certifications

AIR 221

GATE EE 2022

Score: 803 | Marks: 73/100

AIR 227

GATE IN 2022

Score: 670 | Marks: 67.33/100

Rank 1

BCECE LE 2018

State Lateral Entry Exam

Agents Course

HuggingFace

AI Agents Development

miniCON AI Infra

Marktechpost

AI Infrastructure

OpenCV Bootcamp

OpenCV University

Computer Vision

07 Academic Service

Invited Reviewer, NeurIPS 2026 (Conference on Neural Information Processing Systems)

Invited Reviewer, CVPR 2026 (IEEE/CVF Conference on Computer Vision and Pattern Recognition)

Invited Reviewer, ECCV 2026 (European Conference on Computer Vision)

Organizing Committee, EE Summer School 2023 (IISc Bangalore)

08 Contact

Let's Build Something Together

Looking to collaborate on AI/ML projects, explore research opportunities, or just chat about generative models and agentic systems? Get in touch.

contact@rohit.vision