|
Nisarg Anish Shah
I am a fourth-year PhD student in the Department of Electrical and Computer Engineering at Johns Hopkins University, where I am a member of the VIU Lab advised by Dr. Vishal Patel.
During my PhD, I have been fortunate to intern at AWS, working on Agentic AI with Pengkai Zhu, Ankan Bansal, and Srikar Appalaraju, and at Netflix Research, working on Video LLMs with Amir Ziai, Chaitanya Ekanadham, and Ben Klein.
Before my PhD, I was a Research Engineer at AI Foundation, where I worked with Gaurav Bharaj on efficient generative models. I completed my undergraduate studies in Electrical Engineering at the Indian Institute of Technology Jodhpur, advised by Dr. Anil Kumar Tiwari.
Email /
CV /
Google Scholar /
Github /
X (Twitter) /
LinkedIn
|
|
|
Research Interests
My research lies at the intersection of computer vision and natural language processing, where I focus on building advanced open-world and multi-modal foundation models. My long-term goal is to develop vision-based agents capable of advanced reasoning and planning, ultimately pushing the boundaries of what foundation models can achieve.
|
News
- [July 2025] - Honored to receive a travel award for MIDL 2025!
- [June 2025] - One paper accepted to MICCAI 2025.
- [May 2025] - Started an exciting research internship at Amazon AWS focusing on Agentic AI!
- [January 2025] - One paper accepted to MIDL 2025.
- [June 2024] - Started a research internship at Netflix Research working on Video LLMs!
- [February 2024] - One paper accepted to CVPR 2024.
- [June 2023] - One paper accepted to MICCAI 2023.
|
Selected Publications
Representative papers are highlighted. * denotes equal contribution.
|
|
Cinéaste: A Fine-grained Contextual Movie Question Answering Benchmark
Nisarg A Shah, Amir Ziai, Chaitanya Ekanadham, Vishal M Patel
Under Review, 2025
[Paper]
|
|
StepAL: Step-aware Active Learning for Cataract Surgical Videos
Nisarg A Shah, Bardia Bonab, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel
International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2025
[Paper]
|
|
A Vision Foundation Model for Cataract Surgery Using Joint-Embedding Predictive Architecture
Nisarg A Shah, Mingze Xia, Subhasri Vijay, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel
Medical Imaging with Deep Learning (MIDL), 2025
[Paper]
|
|
LQMFormer: Language-aware Query Mask Transformer for Referring Image Segmentation
Nisarg A Shah, Vibashan VS, Vishal M Patel
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper]
[Code]
[Project Page]
|
|
GLSFormer: Gated Long, Short Sequence Transformer for Step Recognition in Surgical Videos
Nisarg A Shah, Shameema Sikder, Swaroop Vedula, Vishal M Patel
International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2023
[Paper]
|
|
Towards Device Efficient Conditional Image Generation
Nisarg A Shah, Gaurav Bharaj
British Machine Vision Conference (BMVC), 2022
[Paper]
|
|
How Far Can I Go? : A Self-Supervised Approach for Deterministic Video Depth Forecasting
Sauradip Nag*, Nisarg A Shah*, Anran Qi*, Raghavendra Ramachandra
NeurIPS Workshop on Machine Learning for Autonomous Driving, 2021
[Paper]
|
|
DSRN: an Efficient Deep Network for Image Relighting
Sourya Dipta Das*, Nisarg A Shah*, Saikat Dutta, Himanshu Kumar
IEEE International Conference on Image Processing (ICIP), 2021
[Paper]
|
|
Stacked Deep Multi-Scale Hierarchical Network for Fast Bokeh Effect Rendering
Saikat Dutta, Sourya Dipta Das, Nisarg A Shah
CVPR Workshop on Mobile AI, 2021
[Paper]
|
Academic Services
I frequently serve as a reviewer for conferences in Computer Vision and Medical Imaging. Recent venues include:
- [August 2025] - Serving as a reviewer for WACV 2026
- [May 2025] - Serving as a reviewer for ACMMM 2025
- [March 2025] - Serving as a reviewer for MICCAI 2025
- [December 2024] - Serving as a reviewer for CVPR 2025
|
|