Faculty Candidate Seminar
Recovering a Functional and Three Dimensional Understanding of Images
What does it mean to understand an image? One common answer in computer vision has been that understanding means naming things: this part of the image corresponds to a refrigerator and that to a person, for instance. While important, the ability to name is not enough: humans can effortlessly reason about the rich 3D world that images depict, how this world functions, and how one can interact with it. For example, just looking at an image, we know what surfaces we could put a cup on, what would happen if we tugged on all the handles in the image, and what parts of the image could be picked up and moved. A computer, on the other hand, understands none of this. My research aims to address this gap by giving computers the ability to understand these 3D and functional (or interactive) properties.
In this talk, I will discuss my efforts towards building this understanding. In particular, I will show work addressing what 3D representations we should infer from images, how we can learn them, and how to reconcile our prior knowledge with data-driven techniques. I will also discuss how to scalably gather data of humans interacting with the world and how to learn from this data.
David Fouhey is a postdoctoral fellow at the University of California, Berkeley. His research interests include computer vision and machine learning, with a particular focus on scene understanding. He received a Ph.D. in robotics in 2016 from Carnegie Mellon University, where he was supported by NSF and NDSEG fellowships. He has spent time at the University of Oxford's Visual Geometry Group and at Microsoft Research.