forbiddenscholar/GeoNLI

๐ŸŒ GeoSpatial NLI

A Natural Language Interface for Multimodal Satellite Image Understanding

“Is a picture really worth a thousand words?”


📌 Overview

GeoSpatial NLI is an end-to-end vision–language system that enables non-expert users to analyze satellite imagery using natural language queries.

Given a single satellite image, the system can:

  • ๐Ÿ“ Generate detailed captions
  • โ“ Answer natural language questions (VQA)
  • ๐Ÿ“ Localize objects via oriented bounding boxes (OBB grounding)

The pipeline is designed to work across RGB, SAR, IR, and False Color Composite (FCC) imagery and supports high-resolution inputs up to 2k×2k, operating robustly across 0.5–10 m/pixel spatial scales.
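
To make the OBB grounding output and the spatial-scale range above more concrete, here is a minimal Python sketch of how an oriented bounding box could be represented and converted to ground units. It is not taken from this repository: the `OrientedBox` class, its centre/size/angle parameterization, and the `ground_size_m` helper are hypothetical names chosen for illustration, and the actual output format of the system may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class OrientedBox:
    """Hypothetical OBB: centre (cx, cy) and size (w, h) in pixels,
    rotated by angle_deg about the centre (an assumed parameterization)."""
    cx: float
    cy: float
    w: float
    h: float
    angle_deg: float

    def corners(self):
        """Return the four corner points in pixel coordinates."""
        theta = math.radians(self.angle_deg)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Corner offsets from the centre before rotation.
        offsets = [
            (-self.w / 2, -self.h / 2),
            (self.w / 2, -self.h / 2),
            (self.w / 2, self.h / 2),
            (-self.w / 2, self.h / 2),
        ]
        # Rotate each offset, then translate it to the box centre.
        return [
            (self.cx + dx * cos_t - dy * sin_t,
             self.cy + dx * sin_t + dy * cos_t)
            for dx, dy in offsets
        ]


def ground_size_m(box, gsd_m_per_px):
    """Convert the box side lengths from pixels to metres using the
    ground sampling distance (metres per pixel)."""
    return box.w * gsd_m_per_px, box.h * gsd_m_per_px


if __name__ == "__main__":
    box = OrientedBox(cx=100, cy=100, w=40, h=20, angle_deg=90)
    print(box.corners())
    # The same 40 x 20 pixel box at the two ends of the 0.5–10 m/pixel range.
    print(ground_size_m(box, 0.5))
    print(ground_size_m(box, 10.0))
```

Under these assumptions, the same pixel-space box covers very different ground extents at 0.5 m/pixel and at 10 m/pixel, which is one reason scale-robust inference matters for overhead imagery.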


🧠 Key Contributions

  • Unified natural language interface for satellite imagery
  • Multi-modal handling of RGB, SAR, IR, and FCC images
  • Scale-robust inference across diverse spatial resolutions
  • Oriented object grounding suitable for overhead viewpoints
  • SAR grounding without SAR captions, using detector + LLM reasoning
  • Fully deployable web-based system

Acknowledgements

We thank the authors of SARATR-X, VRSBench, Qwen-VL, Moondream, and SAM for open-sourcing their work, which made this project possible.

About

This repo contains our solution to 🚀 ISRO's GeoNLI problem statement from the InterIIT-Tech Meet 14.0.
