If we want robots to cooperate with humans “naturally” in high-risk environments (e.g. a clinic), they need to develop some form of higher-level understanding of their surroundings. Equipping robots with an understanding of objects and their purpose can help them maneuver more safely, for example by keeping a larger distance to a (moving) human than to a static wall.
It can also convey important information about how to handle an object, such as keeping the opening of a cup of tea facing upwards to prevent spilling the liquid. Research in cognitive science indicates that humans and animals use some form of world model to incorporate previously gathered knowledge into scene understanding and action planning. Such a so-called semantic world model therefore acts as a link between perception and action, filtering the input stream and enriching the data from our biological senses or a robot’s sensors with background information. The challenge of my research is to develop an efficient way of representing such information, and to fuse multiple input sources and/or various sensor types for object recognition in order to allow for a more detailed entity representation.
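To make this idea more concrete, the following minimal sketch illustrates how entries in such a semantic world model could be represented: data objects that combine perceptual estimates (a pose) with background knowledge (dynamic vs. static, handling constraints). All names in this sketch (SemanticEntity, is_dynamic, safety_margin_m, keep_upright) are illustrative assumptions, not part of any existing system.

```python
from dataclasses import dataclass, field

@dataclass
class SemanticEntity:
    """One entry in a hypothetical semantic world model."""
    label: str                     # e.g. "human", "wall", "cup of tea"
    position: tuple                # pose estimate from perception (x, y, z)
    is_dynamic: bool = False       # moving entities warrant larger margins
    safety_margin_m: float = 0.3   # baseline distance the planner should keep
    keep_upright: bool = False     # handling constraint, e.g. for filled cups
    notes: dict = field(default_factory=dict)  # further background knowledge

def required_clearance(entity: SemanticEntity) -> float:
    """Clearance the motion planner should respect for an entity.
    Dynamic entities (e.g. humans) get a larger margin than static obstacles."""
    return entity.safety_margin_m * (3.0 if entity.is_dynamic else 1.0)

# Example: a moving human demands more clearance than a static wall,
# and the cup carries a handling constraint for manipulation planning.
human = SemanticEntity("human", (1.0, 2.0, 0.0), is_dynamic=True, safety_margin_m=0.5)
wall = SemanticEntity("wall", (0.0, 3.0, 0.0))
cup = SemanticEntity("cup of tea", (0.8, 0.2, 0.9), keep_upright=True)

print(required_clearance(human))  # 1.5
print(required_clearance(wall))   # 0.3
```

In this sketch, perception would instantiate and update such entities, while planning queries their semantic attributes; the open question addressed by my research is how to represent and fuse this information efficiently across multiple sensors.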