Abstract: |
Embodied Question Answering (EQA) is a relatively new research direction that bridges the gap between the intelligence of commonsense reasoning systems and reasoning over the actionable capabilities of mobile robotic platforms. Mobile robotic platforms are usually located in arbitrary physical environments, which have to be dynamically explored and taken into account to deliver a correct response to users’ requests. Users’ requests mostly relate to foreseeable physical objects, their properties, and their positional relations to other objects in a scene. The challenge is to create an intelligent system that successfully maps a query expressed in natural language to the set of reasoning steps and physical actions required to deliver a correct answer to the user. In this paper we present an approach called Situational Question Answering (SQA), which requires the embodied agent to reason about all available context-relevant information. The approach relies on reasoning over an explicit knowledge graph complemented by inference mechanisms with transparent, human-understandable explanations. In particular, we combine a set of facts with basic knowledge about the world, a situational memory, commonsense understanding, and reasoning capabilities that go beyond dedicated object knowledge. On top of this, we propose a Semantics Abstraction Layer (SAL) that acts as an intermediate level between knowledge and natural language. The SAL is designed so that reasoning functions can be executed hierarchically to resolve complex queries. To demonstrate the flexibility of the SAL, we define a set of questions that require a basic understanding of time, space, and actions, including related objects and locations. As an outlook, we present a roadmap for extending the question set for incrementally growing systems. |