RoboVQA

Introduction

RoboVQA is a dataset developed by the anonymous authors for robot vision and question-answering tasks. It contains 2,000 episodes of a Franka Emika Panda robot interacting with objects, recorded as RGB images, depth data, and robot joint states. The dataset supports research on open-vocabulary language understanding and real-time robot control, with a focus on integrating language and vision for task execution. It is accompanied by evaluation scripts and pre-trained models, enabling comparisons across different human-robot interaction methods. The dataset's license is not explicitly stated; it is intended primarily for academic use.

Homepage

Visit the dataset homepage

Task Description

A robot or a human performs arbitrary long-horizon requests from a user anywhere within 3 entire office buildings.

Dataset Details

| Field | Value |
| --- | --- |
| Action Space | EEF Position |
| Control Frequency | 10 Hz |
| Depth Cams | 1 |
| Gripper | Default |
| Has Camera Calibration | True |
| Has Proprioception | True |
| Has Suboptimal | False |
| Language Annotations | Natural |
| RGB Cams | 1 |
| Robot Morphology | 3 embodiments: single-armed robot, single-armed human, single-armed human using grasping tools |
| Scene Type | Table top, kitchen (also toy kitchen), other household environments, hallways, anything within 3 entire office buildings |
| Wrist Cams | 0 |
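As a rough illustration of how the fields above fit together, here is a minimal Python sketch of a single episode timestep. The field names and shapes are assumptions drawn from the card (1 RGB camera, 1 depth camera, proprioception, EEF-position actions, natural-language annotations, 10 Hz control), not the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import List

CONTROL_FREQUENCY_HZ = 10  # from the dataset card

@dataclass
class Step:
    """One timestep of a hypothetical episode; names are illustrative only."""
    rgb: List[int]                    # flattened pixels from the single RGB camera
    depth: List[float]                # flattened values from the single depth camera
    proprioception: List[float]       # joint states (Has Proprioception: True)
    action_eef_position: List[float]  # EEF-position action, e.g. (x, y, z)
    language_instruction: str         # natural-language annotation

def episode_duration_seconds(num_steps: int) -> float:
    """Wall-clock duration of an episode at the card's 10 Hz control rate."""
    return num_steps / CONTROL_FREQUENCY_HZ

# A 150-step episode at 10 Hz spans 15 seconds of interaction.
print(episode_duration_seconds(150))
```

The 10 Hz control frequency means step count and wall-clock time are interchangeable up to a factor of ten, which is useful when budgeting long-horizon tasks like those described above.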