NOTES:

  • This work extends Lang2LTL: a modular system that leverages language models to ground concepts, or referents, in natural language commands to a formal logic known as linear temporal logic (LTL).
    • Lang2LTL version 1 (Liu and Yang et al. 2023) could only ground commands with temporal constraints.
    • Lang2LTL version 2 (i.e., this paper) adds the capability of grounding spatiotemporal commands, which require reasoning about spatial relations between referents while also respecting temporal ordering constraints. For example, "go to the bookstore near the bank, then visit the park" requires resolving the spatial relation "near" before the visit order can be encoded in LTL.
  • Our system combines the text and image modalities to perform language grounding (a toy sketch of this modular pipeline follows the list).
  • We evaluate the system in both simulation and real-robot experiments (see videos on our website).
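
A minimal, self-contained sketch of the modular grounding idea above. All function names and the rule-based stand-ins are illustrative assumptions, not the actual Lang2LTL-2 API; the real system uses large language and vision-language models for the extraction, grounding, and translation steps:

```python
# Toy sketch of a lift-then-translate grounding pipeline (names are hypothetical).
import re

# Toy "spatial grounding": maps referent phrases (which the real system resolves
# with a vision-language model over environment images) to landmark symbols.
TOY_SPATIAL_GROUNDING = {
    "the bookstore near the bank": "a",
    "the park": "b",
}


def lift_command(command: str, grounding: dict[str, str]) -> str:
    """Replace grounded referent phrases with propositional symbols so the
    translator sees a domain-independent ("lifted") command."""
    lifted = command
    for phrase, symbol in grounding.items():
        lifted = lifted.replace(phrase, symbol)
    return lifted


def translate_to_ltl(lifted: str) -> str:
    """Toy translation of 'go to X, then visit Y' into the sequenced-visit
    LTL formula F(X & F(Y)); the real system would use an LLM here."""
    match = re.fullmatch(r"go to (\w+), then visit (\w+)", lifted)
    if not match:
        raise ValueError(f"unsupported command pattern: {lifted!r}")
    first, second = match.groups()
    return f"F({first} & F({second}))"


command = "go to the bookstore near the bank, then visit the park"
lifted = lift_command(command, TOY_SPATIAL_GROUNDING)
print(lifted)                    # go to a, then visit b
print(translate_to_ltl(lifted))  # F(a & F(b))
```

Lifting before translation is what makes the pipeline modular: the translator only ever sees abstract symbols, so the same translation step works across environments once the spatial grounding module maps referents to that environment's landmarks.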

Citation:

J. X. Liu, A. Shah, G. Konidaris, S. Tellex, and D. Paulius (2024). “Lang2LTL-2: Grounding Spatiotemporal Navigation Commands Using Large Language and Vision-Language Models”. In: 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).