Bootstrapping Object-level Planning with Large Language Models
TL;DR – This paper formalizes the concept of object-level planning and discusses how this level of planning naturally integrates with large language models (LLMs).
TL;DR – Building on prior work (Lang2LTL - CoRL 2023), this paper introduces a modular system that enables robots to follow natural language commands with spatiotemporal referring expressions. This system leverages multi-modal foundation models as well as the formal language LTL (linear temporal logic).
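To make the language-to-LTL step concrete, here is a minimal sketch of how a command with a referring expression could be grounded to an LTL formula (the function names, toy grounding table, and translation rule below are illustrative placeholders, not the actual Lang2LTL-2 pipeline):

```python
# Hypothetical sketch of a modular "language -> LTL" pipeline.
# The grounding table and translation rule are illustrative only, not the Lang2LTL-2 API.

def ground_referring_expressions(command: str, landmark_map: dict) -> tuple[str, dict]:
    """Replace referring expressions with placeholder symbols (a, b, ...)."""
    lifted, bindings = command, {}
    for i, (phrase, landmark) in enumerate(landmark_map.items()):
        if phrase in lifted:
            symbol = chr(ord("a") + i)
            lifted = lifted.replace(phrase, symbol)
            bindings[symbol] = landmark
    return lifted, bindings

def lifted_command_to_ltl(lifted: str) -> str:
    """Toy 'translation' step; a real system would use a learned model here."""
    if lifted.startswith("go to a after visiting b"):
        return "F (b & F a)"        # eventually b, and after that eventually a
    return "F a"                    # default: eventually reach a

command = "go to the red couch after visiting the kitchen sink"
landmarks = {"the red couch": "couch_3", "the kitchen sink": "sink_1"}

lifted, bindings = ground_referring_expressions(command, landmarks)
formula = lifted_command_to_ltl(lifted)
for symbol, landmark in bindings.items():
    formula = formula.replace(symbol, landmark)
print(formula)   # e.g., "F (sink_1 & F couch_3)"
```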
TL;DR – In this paper, we introduce CAPE: an approach to correct errors encountered during robot plan execution. We exploit the ability of large language models to generate high-level plans and to reason about causes of errors.
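As a rough sketch of the re-prompting idea (the prompt wording and the `llm` and `execute` callables are placeholders, not CAPE's actual interface), the key step is feeding the execution error back to the LLM along with the plan so far and asking for a corrective action:

```python
# Illustrative sketch of corrective re-prompting after a failed action.
# The prompt text and callables are placeholders, not CAPE's actual interface.

def corrective_prompt(goal: str, plan_so_far: list[str], failed_action: str, error: str) -> str:
    return (
        f"Task: {goal}\n"
        f"Executed so far: {', '.join(plan_so_far) or 'nothing'}\n"
        f"The action '{failed_action}' failed because: {error}\n"
        "Suggest an action that resolves this error before retrying."
    )

def execute_with_recovery(goal, plan, execute, llm, max_retries=3):
    """execute(action) -> (success: bool, error: str); llm(prompt) -> str."""
    done = []
    for action in plan:
        success, error = execute(action)
        retries = 0
        while not success and retries < max_retries:
            fix = llm(corrective_prompt(goal, done, action, error))
            execute(fix)                       # attempt the corrective action
            success, error = execute(action)   # then retry the original action
            retries += 1
        if not success:
            raise RuntimeError(f"could not recover from: {error}")
        done.append(action)
    return done
```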
TL;DR – This paper introduces a deep learning-based method for learning about the effects of verbs – more specifically, looking at initiation and termination conditions, as in Markov Decision Processes (MDPs).
TL;DR – In this paper, we introduce the idea of connecting FOONs to robotic task and motion planning. We automatically transform a FOON graph, which exists at the object level (i.e., it is a representation that uses meaningful labels or expressions close to human language), into task planning specifications written in PDDL (which is not a very intuitive way to communicate about tasks).
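As a simplified sketch of how such a translation could look (the functional-unit fields and predicate names below are illustrative assumptions, not our exact compiler), each functional unit's input object states become preconditions and its output object states become effects of a PDDL operator:

```python
# Simplified sketch: turning one FOON functional unit into a PDDL-style action.
# The functional-unit fields and predicate naming are illustrative assumptions.

def unit_to_pddl(unit: dict) -> str:
    pre = " ".join(f"(is-{state} {obj})" for obj, state in unit["inputs"])
    eff = " ".join(f"(is-{state} {obj})" for obj, state in unit["outputs"])
    del_eff = " ".join(f"(not (is-{state} {obj}))" for obj, state in unit["inputs"])
    return (
        f"(:action {unit['motion']}\n"
        f"  :precondition (and {pre})\n"
        f"  :effect (and {eff} {del_eff}))"
    )

slicing = {
    "motion": "slice",
    "inputs": [("tomato", "whole"), ("knife", "clean")],
    "outputs": [("tomato", "sliced"), ("knife", "dirty")],
}
print(unit_to_pddl(slicing))
```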
TL;DR – This workshop paper (specifically, a blue-sky submission) highlights the importance of object-level planning and representation as an additional layer on top of task and motion planning. I present several benefits of object-level planning for long-term use in robotics.
TL;DR – In this paper, we introduce the idea of connecting FOONs to robotic task and motion planning. We automatically transform a FOON graph, which exists at the object level (i.e., it is a representation that uses meaningful labels or expressions close to human language), into task planning specifications written in PDDL (which is not a very intuitive way to communicate about tasks).
TL;DR – This was a collaboration with Clemson University’s Yunyi Jia and Yi Chen, who were interested in using FOONs for representing assembly tasks. They successfully adapted a FOON for robotic assembly execution.
TL;DR – In this paper, we attempt to execute task plan sequences extracted from FOONs. However, these sequences may contain actions that are not executable by a robot. Therefore, a human is introduced into the planning and execution loop, and the robot and human assistant work together to solve the task.
TL;DR – This work uses the features from the motion taxonomy to improve action recognition on egocentric videos from the EPIC-KITCHENS dataset. This is done by integrating motion code detection for action sequences.
TL;DR – In this work, we showed how motion codes (which can be constructed using the motion taxonomy proposed in our RSS 2020 paper) can be used to improve action recognition with deep neural networks.
TL;DR – In this work, we introduce new changes to the features of the motion taxonomy and show how action verbs encoded as motion codes better capture differences between them than conventional word embeddings (such as word2vec).
TL;DR – This paper introduces the motion taxonomy, a collection of robot-relevant features that are better suited for verb or action embedding than conventional word embeddings. Motion codes are constructed per verb using the taxonomy. In this work, we show that motion codes assigned to verbs are closely related to one another, as supported by force and trajectory data.
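To give a flavour of the idea, here is a toy example of motion codes (the features and code assignments are made up for illustration and are not the actual taxonomy from the paper): mechanically similar verbs end up with nearby binary codes, which simple distances such as Hamming distance can capture:

```python
# Toy illustration of motion codes: each verb gets a binary string over a handful
# of robot-relevant features. The features and codes here are made-up examples,
# not the actual taxonomy from the paper.

FEATURES = ["in_contact", "rigid_engagement", "prismatic_trajectory", "revolute_trajectory"]
# bit i of a code corresponds to FEATURES[i]

motion_codes = {
    "cut":   "1110",
    "slice": "1110",
    "pour":  "0001",
    "stir":  "1001",
}

def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))

# Verbs that are mechanically similar end up with nearby codes:
print(hamming(motion_codes["cut"], motion_codes["slice"]))  # 0 -> near-identical motions
print(hamming(motion_codes["cut"], motion_codes["pour"]))   # 4 -> very different motions
```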
TL;DR – This was my first survey paper that covers knowledge representations for service robotics. Although it is dated, it covers an extensive list of approaches used to represent knowledge for several robot sub-tasks.
TL;DR – This work leverages functional object-oriented networks and deep learning for video understanding. In addition, within the deep network framework, we jointly recognize object and action types, which can then be used to construct new FOON structures.
TL;DR – In this paper, we explore methods in natural language processing (NLP) – specifically semantic similarity – for expanding or generalizing knowledge contained in a FOON. This alleviates the need for demonstrating and annotating graphs by other means.
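A toy sketch of that expansion step is below (the similarity table stands in for the paper's actual NLP similarity measures, such as WordNet- or embedding-based scores): a functional unit known for one object can be copied for a sufficiently similar object.

```python
# Toy sketch of expanding FOON knowledge via semantic similarity.
# The similarity table below is a stand-in for WordNet/embedding-based measures.

similarity = {
    ("cucumber", "zucchini"): 0.9,
    ("cucumber", "milk"): 0.1,
}

def sim(a: str, b: str) -> float:
    return 1.0 if a == b else similarity.get((a, b), similarity.get((b, a), 0.0))

def expand_unit(unit: dict, known_object: str, new_object: str, threshold: float = 0.8):
    """Copy a functional unit for a new object if it is similar enough to a known one."""
    if sim(known_object, new_object) < threshold:
        return None
    swap = lambda pairs: [(new_object if o == known_object else o, s) for o, s in pairs]
    return {**unit, "inputs": swap(unit["inputs"]), "outputs": swap(unit["outputs"])}

slicing = {"motion": "slice",
           "inputs": [("cucumber", "whole")],
           "outputs": [("cucumber", "sliced")]}
print(expand_unit(slicing, "cucumber", "zucchini"))  # new unit for zucchini
print(expand_unit(slicing, "cucumber", "milk"))      # None: not similar enough
```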
TL;DR – This was the very first paper on FOON: the functional object-oriented network. Here, we introduced what FOONs are and how they can be used for task planning. They are advantageous for their flexibility and human interpretability.