
Preeti Ramaraj (4th year PhD student)
Diversity across different axes

Despite being a bibliophile all my life, I realized fairly recently that I had read books by fewer than 10 women authors and almost none by authors of color growing up. I began following Roxane Gay on Goodreads and decided to do Book Riot's "Read Harder" challenge, both of which introduced me to books I would never have heard of or picked up on my own. That led me down a whole new path of experiences, a diversity of accounts, that enriched my reading experience more than I could have imagined. Here are three books from my newfound list of authors.

Homegoing by Yaa Gyasi begins with two sisters who were born in Africa centuries ago and separated at birth. One sister marries a white man and settles in Africa, whereas the other is forced to come to America through slavery. The book traces the stories of multiple generations of both sisters, who never end up meeting each other. Each chapter offers a snippet of one person's life, alternating between the two families generation by generation. You don't just end up reading their individual stories; you also get a reflection of the time and society in which they lived. You encounter a lot of painful moments in black history, but that is not all – the fact that so many generations are spoken about means that you also see their normalcies. No matter what you do, read this book. It is poetic, it is grand, and it is an experience that will impact you.

Alok Vaid-Menon is a gender non-conforming activist and performance artist. I came across them when a friend of mine sent me a photo they had posted. My first reaction was one of surprise, not knowing how to process their atypical presentation. But the more photos I saw, the more I normalized it. It made me realize that for all my ideas of acceptance, my understanding of gender had not even begun. Beyond the Gender Binary is a tiny book (64 pages) that describes the pitfalls of the gender binary, its historical origins, and why it is high time we go beyond it. Two arguments in this book really hit me hard. First, the gender binary asks us to put 7.5 billion people into one of two categories. Second, the gender binary exaggerates the differences between the two genders and minimizes the differences within them (and there are many). Alok Vaid-Menon describes how the gender binary is harmful for everyone — it forces you to conform to established behaviors starting very early, or be told and shown in myriad ways that you do not belong.
In the end, all this book asks for is space for everyone to live and let live, to not kill the creativity and differences that we naturally bring to the table as people. However, we are nowhere close to that simple-sounding goal. The book reminds you that people outside the gender binary (non-binary, trans, gender non-conforming people) have always been around, and yet the gender binary erases them, does not allow them to exist, let alone thrive in this world. I would highly recommend listening to this interview with them to supplement what you read in this book. 

El Deafo by Cece Bell
This is one of the most adorable books I've read in a while. The author tells this comic-book story based on her own childhood, during which she grew up deaf. Yes, this book helps dispel, or rather shows, the stereotypes that people who are deaf are subject to. But honestly, it's mostly a hilarious story of the author as a child with a huge imagination who has a good time with her friends. It's a book that I would buy for a kid, but it's also a book that I would recommend to adults, because people often forget that people with disabilities do have what one might consider "a normal life." And as much as it is important to understand the challenges that people go through because of their circumstances, that is not all that defines them. I absolutely loved this book; it is super short and fun, and this child's school antics will truly make your day.

Charlie Welch (4th year PhD student)

There are two books by Peter Wohlleben that I've recently enjoyed: The Hidden Life of Trees and The Inner Life of Animals. Both offered new perspectives on the experiences of the living things we share our planet with, along with fascinating facts: for instance, trees can recognize which insects are eating their leaves by the chemical makeup of their saliva, and, to protect themselves, they will release the pheromones of an insect that preys on their attacker. Some criticisms I've heard of these books concern over-anthropomorphizing and a lack of scientific clarity in the coverage of some of the referenced studies. The foreword of The Inner Life of Animals actually described Wohlleben's references to published articles in a way that made me think I wouldn't enjoy the book. However, I found the studies and anecdotes presented in both books captivating. It does feel strange to anthropomorphize the plants and animals at times, but I think this is a necessary device by which Wohlleben offers perspective. It makes the reader think about how we feel, think, and interact with the world in similar ways, and ask ourselves where the boundaries of those similarities are. Each plant and animal has a way of perceiving the world that may be very different from our own.

How we see the differences is partly based on our perception. I didn't know before reading The Hidden Life of Trees that trees have such a complex network of roots and fungi, one that allows them to communicate about where resources are and whether predators are coming; if other trees need help, they will send each other resources through their root systems. Another factor in how we see the differences stems from the optimization driven by capitalist motivations. Among the many daily injustices done to animals throughout the world, Wohlleben points out, for example, that many people don't know how smart pigs are and how they seem to feel many of the feelings we do. If people did know, corporations would not be able to keep pigs in poor conditions and castrate them without anesthetic, which is expensive to give them. Among the animals I underestimated were ravens, who form lifelong relationships with friends and family; each raven has a distinct name that others will remember for years even if they don't see each other. It's facts like these that make the reader wonder what else we don't yet understand. There is still much to learn about the plants and animals around us.

Allie Lahnala (2nd year PhD student) recommends:

From my 2020 reads, I recommend "How to Do Nothing: Resisting the Attention Economy" (2019) by Jenny Odell, "Geek Heresy: Rescuing Social Change from the Cult of Technology" (2015) by Kentaro Toyama (a professor in the University of Michigan's School of Information), and "Race After Technology: Abolitionist Tools for the New Jim Code" (2019) by Ruha Benjamin.

While the books and their authors' voices are quite different, the core themes connect and enhance each other in thought-provoking ways. Each book in some way attempts to combat the framing that productivity means creating something new (e.g., novel technology meant to correct social injustices), and instead offers ways to consider reparative, regenerative, and nurturing actions (e.g., social support systems and policy changes) as productive.

My year of books began with “How to Do Nothing,” which elicited both existential amusement and a healthy form of existential crisis. Jenny Odell proposes a seemingly paradoxical practice of doing nothing as an active process, one that challenges our notions of productivity and emphasizes stopping to seek out “the effects of racial, environmental, and economic injustice” to elicit real change (pg. 22). My favorite chapter, “The Anatomy of Refusal”, relates tales of quirky philosophers and performance artists whose minor refusals of societal norms invoked disruptive perplexity from others; it also recounts histories of social activism during the civil rights and labor movements that required collective and sustained concentration. Through these stories, the author paints a picture of what it means to “pay attention” at an individual and collective level in such a way that allows for the mobilization of movements (pg. 81). What resonated with me most about this chapter were the histories of activism powered by university students and faculty, such as the 1960 Greensboro sit-in in which participating students were “under the care of black colleges, not at the mercy of white employers” (pg. 83), which demonstrate the positions that academic institutions can afford to be in when it comes to taking activist risks.

Similarly, in "Geek Heresy," Kentaro Toyama discusses mentorship as a more productive form of activism than novel packaged technical interventions, which are often ill-suited to the problems they intend to address but are nonetheless idealized by technologists. Mentorship, he argues, is often neglected by policymakers and donor organizations, though it "works well as an overarching framework that avoids the problems of top-down authority, benevolent paternalism, or pretended equality (pg. 240)." The core themes of "Geek Heresy" emphasize nurturing people and social institutions as a means toward fighting inequality rather than developing technical fixes. He argues that fixes such as the development of low-cost versions of expensive technologies and their subsequent distribution to impoverished communities only address a symptom of the real issue, and can actually amplify the problem they are intended to solve. Such an instance demonstrates the "Law of Amplification," which Toyama, in an article in The Atlantic, describes as technology's primary effect of amplifying human forces. By this effect, "technology – even when it's equally distributed – isn't a bridge, but a jack. It widens existing disparities (pg. 49)."

Aligned with the "Law of Amplification," Ruha Benjamin argues in "Race After Technology" that "tech fixes often hide, speed up, and even deepen discrimination, while appearing to be neutral or benevolent when compared to the racism of a previous era (pg. 7)." Benjamin's propositions reflect concrete racial manifestations of such problematic attempts to intervene in social injustices with new technology solutions. She introduces the theory of the New Jim Code as "the employment of new technologies that reflect and reproduce existing inequities but that are promoted and perceived as more objective or progressive than the discriminatory systems of a previous era (pg. 5)." Benjamin discusses numerous instances of racial fixes that, in ignoring underlying issues, either unproductively miss the point or actually turn malignant. Her solutions involve an abolitionist toolkit of researched strategies for resisting the New Jim Code, strategies that scrutinize how technology is developed and deployed and how we interpret data (pg. 192). Odell's aforementioned active practice of "doing nothing" and Toyama's advocacy against packaged technical interventions that lack long-term social commitments to the marginalized communities for which they are intended would both nurture the concept of an abolitionist toolkit.

Read these books for further insights into and context for the themes above. You will encounter actionable suggestions from each author: from attention exercises as ubiquitously available as the bird watching discussed by Odell, to social activism through mentorship as proposed by Toyama, to supporting specific researchers and organizations that create tools for the abolitionist toolkit outlined by Benjamin. Also, check out the recent study Critical Race Theory for HCI, with Toyama as senior author, his Ph.D. student Ihudiya Ogbonnaya-Ogburu as first author, and co-authors Angela D. R. Smith and Alexandra To. The work received a Best Paper award at the 2020 CHI Conference on Human Factors in Computing Systems.

Oana Ignat (3rd year PhD student) recommends:

"Invisible Women: Data Bias in a World Designed for Men" by Caroline Criado Perez

This book is full of research facts on the gender data gap and its effects on women's lives. Thanks to this book, I have learned how to recognize unconscious bias and how widespread it is in our society. Gender bias concerns not only the pay gap; it is also present in some unexpected areas like snow plowing, designing car safety tests, recognizing the symptoms of a heart attack, or prescribing the correct medication.
Women represent more than 50% of the world's population, and yet they are invisible to a great range of products and services. This is not due to bad intent, but rather to ignorance: when the most important decisions have been made by men, other perspectives and opinions were not taken into account, which led to a world designed by men for men. This is a classic example of how not having diverse teams reinforces inequality. The solution proposed by the author is to rethink the way we design things: to collect more data, study that data, and ask women what they need. "Invisible Women" should be read by everyone, especially those interested in creating policies.

“American Like Me: Reflections on Life Between Cultures” by America Ferrera

"American Like Me" is a collection of very diverse stories centered around the lives of immigrant families in the US. As an international student myself, I can empathize with the feeling of being trapped between cultures and, occasionally, not knowing where I fit. Reading these stories left me feeling more empathetic and inspired from learning so much about other cultures and customs. I was really impressed by the variety of authors, who are actors, singers, activists, politicians, and more (see the book cover); plus, they all come from a variety of cultural backgrounds. The differences in the structure of the essays and the stories presented reinforce the message about the value of diversity, and by extension, of immigration.

Other books that I would recommend, especially for students who want to improve their productivity and well-being: 
“Why We Sleep: Unlocking the Power of Sleep and Dreams” by Matthew Walker 
“Spark: The Revolutionary New Science of Exercise and the Brain” by John J. Ratey

David Fouhey was interviewed by Ralph Anzarouth on CVPR Daily. Permission to republish was exceptionally granted by RSIP Vision. June 2020

David Fouhey

David Fouhey is an Assistant Professor at the University of Michigan in the Computer Science and Engineering department.

RA: David, it is a pleasure to speak to you again. The last time you were featured in our magazine was in one of Angjoo Kanazawa’s photographs of the ‘Berkeley Crowd’. A lot has changed since then. What is your life like now?

DF: It's a lot busier. I have many wonderful students now. For two of them, it's their first CVPR, so I'm looking forward to hanging out at the posters with them virtually and answering all the questions. I'm also looking forward to sitting up and drinking coffee with one of my students. It's important that we ensure everyone can come to the posters and see stuff and that we can talk to everybody. Life has certainly changed a lot since the last time I was featured, and it's changed even more since the first time, which was four years ago!

RA: How many people are in your lab now?

DF: There are currently eight graduate students and then I have a large number of undergraduates for the summer, which is exciting. Some of them are working remotely. There’s a lot of stuff going on, but it’s really wonderful to work with students.

RA: Did you always want to continue in academia and keep teaching?

DF: Yes, I have really enjoyed teaching, both in the classroom and getting students excited about computer vision and machine learning. I think it’s important not to hoard knowledge in your head. You have to get it out there. It’s really important for people to learn as much as possible and to teach people and welcome them into the field. Machine learning is very exciting now but there are lots of ways that it can go wrong. As people who have been around a while and seen that, I think it’s important for us to teach the next generation. We don’t want to keep on making the same mistakes.

RA: In what way is being an assistant professor different from what you expected?

DF: I have to do many more things than I realized! Lots of very different things. The topic can totally change from one meeting to the next, from talking about the next iteration of a course to speaking about someone's results. I switch around a lot, which is exciting, because I see lots of new fun stuff.


RA: Do you find that you learn things from your students?

DF: What’s great is that students often have new and fresh ideas. What’s wonderful about computer vision is that we really, as a field, don’t know what’s going on most of the time. It’s very easy for someone to get started and to think of something totally new that you’ve never thought of before in that way. That’s why it’s wonderful to work with a collection of students from all sorts of different backgrounds. It keeps you on your toes and you get to learn all these new perspectives on things. It’s great. And they also help you keep up with reading arXiv!

RA: Do you ever feel overwhelmed by it all?

DF: Do you have moments where you think, "Get me out of here!" and want to be a software engineer in a start-up instead? I mean, definitely. In academia, like in grad school, you often do have these moments where nothing works, where your paper gets rejected, then your paper gets rejected again, and it's really hard at times, especially when you first start out. You go into this field where the default response is often no. I think it's very important as a field that, especially as we're growing, we treat people with respect and actively try to be inclusive of new people. It's hard enough for my students when their papers get rejected, but at least they have someone who can say, "I'll fix this." When people are just getting started and don't have mentors floating around in their life, it can be tough. This is a problem that exists when a field grows really quickly, but in the long run the growth is really exciting.

RA: Now that you see the world through the eyes of a teacher, are there things that you see that really aren’t working, and you think we should fix to make the community work better? Funnily enough, last year, I interviewed Andrew Fitzgibbon from Microsoft and I asked him a similar question, and he told me: “Someday we’re going to have to figure out how to do these conferences without everybody travelling to the same place.” Last year, it sounded impossible, but what a difference a year makes!


DF: I really appreciate you asking this question. One of the things that has really changed since I started in computer vision is that back then you looked for the example where your system worked and you were really excited, but it was a total fantasy. Like, "Maybe one day my system will work." Now, we have systems that do stuff. One thing I try to teach, and I want to teach better, is that if you deploy these systems in the real world and you're not careful, they can have real consequences. There are all these stories that float around the community from ages ago about data bias, like the entertaining one about a tank classifier that gets 100 per cent accuracy because, with pictures of Soviet tanks taken at night and US tanks taken during the day, it simply learns to determine whether a picture was taken at night or during the day. But now there are real, serious issues when people deploy things. There's this great paper from Joy Buolamwini and Timnit Gebru on Gender Shades and it has had real downstream impacts. It's something that as a community we have to start thinking about, because we know how a lot of these systems work and we need to make sure they're not misused. We need to make sure that there aren't bad outcomes and consequences. There's an excitement about stuff working, but then this stuff can have really serious impacts and it's important that as a community we talk about algorithmic bias and address it.

RA: Do you think the community will hear your call? How do you see things changing in this area?

DF: There are many other systemic issues and there's a lot of reading that everyone can do. A lot of the issues that you spot in these articles are things that you'll talk about, but for simpler things you might say, "If I trained a classifier to detect giraffes, maybe it will only pick up on some other correlation." We talk about these things as academic examples, and it's kind of interesting when it happens on MS COCO, but when it happens in the real world, we abstract away the concept that data and algorithms can have bias and forget about it. I think these are really hard problems and we have to find solutions. I don't have solutions, but I think we have to talk about it, be aware of it, and listen to people who have been talking about it for quite some time.

RA: Thinking back to the Berkeley Crowd, what do you miss the most from that time and those people?

DF: I miss ditching work and going off for a hike with my lab mates and taking long extended meals where you discuss anything and everything. Those are times that you should treasure in graduate school because you don’t get as many of them after.

RA: I think every one of our readers can relate to that.


DF: One thing that I love about this community is that you see the same people and you’ve known them over many years. I met Angjoo at ECCV 2012. I was not part of the Berkeley Crowd for a while, but I knew them, I would see them at conferences, we’d hang out, we’d talk, we’d catch up. Now, they’re friends for life, and I’m sure in 20 or 30 years from now we’re still going to be in contact. You make these amazing friends over this really long period of time. It’s great. When you start going to CVPR, you don’t expect it. Then you go again and again and again.

RA: That is a really nice message for people attending their first CVPR. Everyone can build their own Berkeley Crowd.

DF: Yes, they’re friends you don’t realize you have yet.

RA: Do you have a funny story from those days that you could share with our readers?


DF: I remember when Alyosha Efros would take us on a hike, he'd say, "It'll be an hour," and it'd always be like four hours! We would do things like ride a miniature train that he would take us on, and somehow we'd always end up eating gelato. He had this uncanny ability to find gelato! These hikes would be outrageously long, and his estimates would be wildly inaccurate, but they were so much fun. I'd come home totally sunburnt but very happy! My message to people is to make sure you take the time to do stuff like this because it's really important.

RA: By having a career in academia, is that your way of not abandoning that world completely?

DF: Yes, I get to talk to people about all sorts of research problems all the time. I can work on all sorts of things. I'm in heaven! I'm trying to do lots of different projects at the same time and it's so much fun getting to have that experience with my students. An advisor-advisee relationship is not the same as you and your office mate, but there are similarities. You sit in the office and say, "What problems should we be solving?" Or, "Did you see this new thing on YouTube? How can we use that for computer vision?" It's wonderful.

RA: Computer vision technology is evolving so fast. Where do you see things going next?

DF: People are really interested in 3D now, which is great. I got interested in 3D when it really didn’t work. Some of my old results are just horribly embarrassingly bad! It’s exciting. Because of deep nets now there’s stuff that you just couldn’t imagine. Justin Johnson is also at Michigan and he’s interested in 3D, so we have two students who we co-advise and it’s a lot of fun.

by Yiwei Yang 

This article contains a description of an AI project which was awarded the “Best Poster Award” by the public at the University of Michigan AI Symposium 2019.

Machine learning techniques, especially deep learning, have been widely applied to solve a variety of problems ranging from classifying email spam to detecting toxic language. However, deep learning often requires a massive amount of labeled training data, which is very costly and sometimes infeasible to obtain. In low-resource settings (e.g., when labeled data is scarce, or when the training data only represents a subclass of the testing data), machine learning models tend not to generalize well. For example, reviewing legal contracts is a tedious task for lawyers. To help facilitate the process, machine learning methods can be used to extract documents relevant to certain clauses. However, the company (e.g., IBM) that produces the model can only obtain a large number of its own contracts, whereas the contracts of others (e.g., Google, Apple) are hard to obtain, causing the trained model to overfit and generalize poorly on contracts pertinent to other companies.

On the other hand, human experts are able to extrapolate rules from data. Using their domain knowledge, they can create rules that generalize to real-world data. For example, while ML models may notice that sentences in the past tense are correlated with the communication clause, and thus use past tense as a core feature for classification, a human would easily recognize that the verbs of the sentences are the true reason why the sentences should be classified this way, thereby creating a rule of "if the sentence has verb X, then the sentence is related to communication." However, coming up with rules is very difficult. Human experts often need to manually explore massive datasets, which can take months.

Our goal is to combine human and machine intelligence to create models that generalize to real-world data, even when training data is lacking. The core idea is to first apply an existing deep learning technique to learn first-order-logic rules, and then leverage domain experts to select a trusted set of rules that generalize. By applying the rule learning method this way, the rules serve as an intermediate layer that bridges the explainability gap between humans and the neural network. Such a human-machine collaboration makes use of the machine's ability to mine potentially interesting patterns from large-scale datasets, and the human's ability to recognize patterns that generalize. We present the learned rules in HEIDL (Human-in-the-loop linguistic Expressions with Deep Learning), a system that facilitates the exploration of rules and the integration of domain knowledge.

Figure 1: Overview of our human-machine collaboration approach

The learned rules and the features of HEIDL are illustrated below.
What do the rules look like?

Each rule is a conjunction of predicates. Each predicate comes from a shallow semantic representation of a sentence in the training data, generated by NLP techniques such as semantic role labeling and syntactic parsing; it captures "who is doing what to whom, when, where, and how" in the sentence. For example, a predicate can be tense is future, or verb X is in dictionary Y, so a rule can simply be tense is future and verb X is in dictionary Y. Each rule can be viewed as a binary classifier: a sentence is classified as true for a label if it satisfies all predicates of the rule.
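To make the rule format concrete, here is a minimal Python sketch (not the HEIDL implementation itself) of a rule acting as a binary classifier over one sentence: the rule fires only if every one of its predicates holds. The predicate names are invented for illustration.

```python
# Minimal sketch, not the HEIDL implementation: a rule as a conjunction of
# predicates, evaluated against the predicates extracted for one sentence.
# Predicate names (e.g. "tense_is_future") are hypothetical examples.

from typing import Set

Rule = Set[str]  # a rule is a conjunction: all of its predicates must hold


def sentence_satisfies_rule(sentence_predicates: Set[str], rule: Rule) -> bool:
    """A sentence is classified as true for the label iff it satisfies
    every predicate of the rule."""
    return rule.issubset(sentence_predicates)


# Predicates assumed to be produced by upstream NLP steps such as
# semantic role labeling and syntactic parsing (invented for the example).
sentence_predicates = {"tense_is_future", "verb_in_dict_communication", "has_agent"}
rule = {"tense_is_future", "verb_in_dict_communication"}

print(sentence_satisfies_rule(sentence_predicates, rule))  # True
```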

What are the features of HEIDL?
– HEIDL allows expert users to rank and filter rules by precision, recall, F1, and predicates
– After evaluating a rule, users can approve or disapprove it (the final goal is to approve a set of rules that aligns with the users' domain knowledge; a sentence is true for a label if it satisfies any rule in the set)
– The combined performance (precision, recall, F1 score) of all approved rules is updated each time a rule gets approved, helping users keep track of overall progress (a sketch of this rule-set evaluation follows the list)
– Users can see the effect on overall performance by hovering over a rule
– Users can modify rules by adding or dropping predicates, and examine the effects
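As a rough illustration of how an approved rule set behaves (a sentence is labeled true if any approved rule fires, as described above), the sketch below applies a disjunction of rules and computes the kind of combined precision, recall, and F1 that HEIDL tracks. The rules and data are made up; this is not the tool's actual code.

```python
# Illustrative sketch only: applying an approved rule set as a disjunction and
# computing its combined precision/recall/F1 on labeled sentences.
# Rules, predicates, and data below are invented for the example.

from typing import List, Set, Tuple


def classify(sentence_predicates: Set[str], approved_rules: List[Set[str]]) -> bool:
    # True if the sentence satisfies all predicates of at least one approved rule.
    return any(rule.issubset(sentence_predicates) for rule in approved_rules)


def combined_metrics(dataset: List[Tuple[Set[str], bool]],
                     approved_rules: List[Set[str]]):
    """dataset: list of (predicate_set, gold_label) pairs."""
    tp = fp = fn = 0
    for predicates, gold in dataset:
        pred = classify(predicates, approved_rules)
        if pred and gold:
            tp += 1
        elif pred and not gold:
            fp += 1
        elif not pred and gold:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example: two approved rules, three labeled sentences.
rules = [{"tense_is_future", "verb_in_dict_communication"}, {"has_notice_period"}]
data = [({"tense_is_future", "verb_in_dict_communication"}, True),
        ({"has_notice_period", "tense_is_past"}, True),
        ({"tense_is_past"}, False)]
print(combined_metrics(data, rules))  # (1.0, 1.0, 1.0)
```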

Figure 2: HEIDL: the user interface that allows experts to quickly explore, evaluate, and select rules

We evaluated the effectiveness of the hybrid approach on the task of classifying sentences in legal documents into various clauses (e.g., communication, termination). We recruited 4 NLP engineers as domain experts. The training data consists of sentences extracted from IBM procurement contracts, and the testing data consists of sentences extracted from non-IBM procurement contracts. We compared this approach to a state-of-the-art machine learning model – a bi-directional LSTM trained on top of GloVe embeddings – and demonstrated that the co-created rule-based model in HEIDL outperformed the bi-LSTM model.

Our work suggests that instilling human knowledge into machine learning models can help improve overall performance. Further, this work exemplifies how humans and machines can collaborate to augment each other and solve problems that cannot be solved by either alone.

The full paper can be found here: https://arxiv.org/abs/1907.11184

The author of the paper is a student researcher at the University of Michigan. This work was done in collaboration with IBM Research as a summer internship project.




by Michigan AI’s Prof. Benjamin Kuipers & Prof. Rada Mihalcea


Prof. Benjamin Kuipers recommends:

I am currently reading a sequence of three books by Michael Tomasello that I think say something important about the contribution of different kinds of cooperation and collaboration to human success, individually and as a species. The books are:
A Natural History of Human Thinking (2014)
A Natural History of Human Morality (2016)
Becoming Human: A Theory of Ontogeny (2019)

All of these books focus on the evolution of cooperation, which he argues is responsible for the dominance of the human species on our planet. He uses experimental evidence about the cognitive capabilities of great apes as a proxy for the capabilities of the last common ancestor shared by humans and great apes about six million years ago. He observes that great apes have sophisticated knowledge of physical causality, including tool use, and even of intentionality: that is, the beliefs, goals, and plans that they and other individuals may have in a given situation. However, this knowledge is "individual intentionality," used for individual competitive advantage in pursuing the agent's own goals.

He argues that about 400,000 years ago, early humans began to evolve the capabilities for “joint intentionality”, the ability to pursue shared goals with another agent. This has a number of implications, including cognitive development of the ability to infer how the partner sees the world, the ability to communicate information that the partner needs, and the need for the partner to trust and believe what the agent is attempting to communicate.
With the emergence of modern humans about 150,000 years ago, this progresses to “collective intentionality”, involving shared goals with a larger population of other agents, leading to the development of a shared culture of beliefs, goals, and norms. As this culture develops, individuals acquire its structure, not as learned knowledge about the beliefs of other individual agents, but learned from infancy and childhood as “the way things are”. Norms progress from ways to maintain a collaboration with another specific individual to “how things should be done” to participate successfully in the society.

This evolutionary picture has implications for the nature of human thinking, human morality, and human child development (“ontogeny”).

These books by Michael Tomasello fill in some important gaps in my understanding of how ethics contributes to the survival and thriving of human society, discussed in:
Non-Zero: The Logic of Human Destiny, by Robert Wright (2000)
The Better Angels of Our Nature: Why Violence Has Declined, by Steven Pinker (2011)
Enlightenment Now: The Case for Reason, Science, Humanism, and Progress, by Steven Pinker (2018)



Prof. Rada Mihalcea recommends:


If I were to pick only three books from among the science books I read over the past year, I would choose the ones that I quoted the most in my conversations with others.
The books are:

Life 3.0 – Being Human in the Age of Artificial Intelligence, by Max Tegmark (2017)
As I was reading this book, I first liked it, then disliked it, then liked it again, and ended up loving it. I am not even sure if I love the book itself, or the thoughts that it provoked, but it probably doesn't matter. The book is dense with interesting aspects of life in the presence of an advanced AI. The ideas that I found the most intriguing: (i) It's not a matter of if, but a matter of when, our planet will end; technology is our only hope to transcend our own condition. (ii) The right question to ask is not "what will the future of AI look like" but "what would we like the future of AI to look like." (iii) Consciousness (defined as the sensing of experiences) can happen with small entities (e.g., humans), but it's much harder with large entities (e.g., the universe).
Teaser: The book also includes an AI-based scheme to get rich using Amazon Mechanical Turk (don’t put it in practice!)

Factfulness – Ten Reasons We’re Wrong About the World – and Why Things Are Better Than You Think, by Hans Rosling, Ola Rosling, Anna Rosling (2018)
Let me start by admitting that I failed the survey at the beginning of this book miserably. It asks questions about the state of the world, such as "What is the life expectancy of the world today?" or "How many people in the world have some access to electricity?" As it turns out, most people will fail this survey: according to the Roslings, our view of the world is largely outdated, as the statistics that we generally use as reference correspond to the state of the world in 1960. And yes, almost 60 years have passed since! The book is a wonderful account of the current state of the world – as seen by a physician, a statistician, and a designer – and also a positive message that the world is in a much better state than we often think it is.
Bonus:
Anna Rosling's TED Talk
Dollar Street: an interactive website to see how people really live 

What If?: Serious Scientific Answers to Absurd Hypothetical Questions, by Randall Munroe (2014)
"Delightful" is probably the right word for this book; or rather, "scientifically delightful." This book is a collection (that you can read in any order!) of questions asked on the popular XKCD website, along with solid scientific answers. The absurdity of many of the questions makes the book amusing – e.g., "How much computing power could we achieve if the entire world population started doing calculations?" (Just imagine the whole world stopping what they are doing to do calculations instead! It turns out that even back in 1994 a desktop computer exceeded the combined computing power of humanity.) Or "When, if ever, will the bandwidth of the Internet surpass that of FedEx?" (Believe it or not, FedEx throughput is currently a hundred times that of the Internet.) At the same time, the clearly explained science in the answers makes the book a rich learning resource covering a wide variety of disciplines – biology, geology, computing, and more.

Bonus for Ann Arbor locals: Randall Munroe will be in town on September 6, hosted by Literati

Want even more recommendations? The other two contenders for my top science books over the past year were:
The Immortal Life of Henrietta Lacks, by Rebecca Skloot (2010)
Why We Sleep: The New Science of Sleep and Dreams, by Matthew Walker (2017)

by IHPI

This is an interview with Dr. Wiens, who is affiliated with the Michigan AI Lab and is also a member of the Institute for Healthcare Policy & Innovation.



Jenna Wiens, Ph.D., is an assistant professor of electrical engineering and computer science.
Her research focuses on developing the computational methods needed to help organize, process, and transform patient data into actionable knowledge.

What are you thinking about?

I’m interested in developing computational methods to transform health data into knowledge with actionable clinical applications. I lead the Machine Learning for Data-Driven Decisions (MLD3) research group, composed of researchers in computer science. Working closely with clinicians, we aim to augment clinical care. We bring the methods and expertise in machine learning, big data, and artificial intelligence (AI), and the clinicians bring the data and the healthcare problems in need of attention. Working together, we can come up with solutions, with the ultimate goal of changing clinical practice and improving patient outcomes.

We’re working on many different areas of health – everything from inpatient hospitalization to outpatient wearables. The opportunities for AI in health are vast, and from a technical perspective, the number of challenges is seemingly unlimited.

Why is this interesting to you?

Machine learning is the study of methods for automatically detecting patterns in data. These approaches shine when working with massive and complex datasets that electronic health records represent. Medical settings collect an immense amount of data through patient encounters – everything from what medications a patient’s on to what procedures they’ve undergone, to their location in a hospital, and who is looking after them.

We try to leverage these sources of data by working backwards in collaboration with clinicians to identify a problem for which the data are relevant, and then develop machine-learning and data-mining techniques to try to solve the problem.

We’re also interested in the technical challenges that come with working with longitudinal data – implicit in following patients over time – and work to develop new methods to solve those challenges.

What are the practical implications for healthcare?

A lot of our work focuses on developing methods for predicting adverse outcomes, particularly during a hospitalization. For example, we’ve developed machine learning models that can predict a patient’s risk of developing Clostridium difficile (C. diff) infection much earlier than existing methods, and can be tailored to accommodate different patient populations, different EHR systems and other institution-specific factors.

We’re also collaborating with Michael Sjoding and others in the Michigan Integrated Center for Health Analytics and Medical Prediction (MiCHAMP) on developing and improving a model that can predict the onset of Acute Respiratory Distress Syndrome (ARDS) throughout a hospitalization better than the best clinical techniques currently used, by using routinely collected electronic health record (EHR) data.

We’re also developing techniques for modeling patient trajectories to better understand the progression of various diseases, including Alzheimer’s, type 1 diabetes and cystic fibrosis.

One of the big challenges is developing tools that are not only accurate, but also interpretable and robust. That gets into questions of implementation science and translation. You can integrate these models into the EHR to automatically compute a patient's risk for a particular outcome. But then the question becomes, who's the right person to show those data to? Is it the physician? Is it the nurses? Is it the hospital's infection prevention and control team? Is it the antimicrobial stewardship team? That's where it gets more application-specific, and also where the clinicians can help determine what's most actionable in the model, how to refine the model, and how to best utilize it.

We've seen a tremendous uptick in opportunities to address various issues in healthcare, thanks to recent advances in our ability to collect and store health data, clinical data in particular. By working together in interdisciplinary teams, we can begin translating this work back to the bedside and improving patient outcomes. That's the fundamental goal.

by Karthik Desingh

Scene from the movie “Robot and Frank”, in which Frank’s son gets Frank a robot companion to take care of him.

Personal robots are appearing more and more in the media spotlight, including in movies. Owning a personal robot, your very own maid or butler that has it all, is everyone's dream. But exactly how advanced are they these days? In my Ph.D. program, my colleagues and I are exploring ways to enable a robot to perform tasks that involve sequentially manipulating objects and working in complex indoor environments. But first, let's get to know some of the challenges.

Robots cannot really see

"Robot and Frank" is a movie set in the near future, where a robot butler helps old Frank by cleaning, cooking, and keeping him busy with a hobby that interests him. The robot in this movie, referred to as Frank's robot in this article, is capable of doing all the activities one would like a butler to do in a typical household environment. However, in reality our robots cannot see the world like humans do. We have had tremendous success in making robots function efficiently where the environment is structured, such as autonomous warehouse navigation (Figure 1(a)) or industrial manufacturing (Figure 1(b)). The imposed structure compensates for the robots' inability to perceive their environments. However, our household environments are highly unstructured and inherently complex, with a variety of objects, interactions, relations, and associations to indoor locations. Given a task in such unstructured environments, the robot butler should know what objects are involved, where they are, and how to grasp and move them around to accomplish the task. Hence, robotic perception in unstructured environments is challenging and largely unsolved. In order to achieve the goal of having a robot butler that is capable of cooking a meal at Frank's place, the problem of perception in complex unstructured indoor environments must be addressed.

Perception for everything

Autonomous personal robots are expected to: a) navigate gracefully in indoor environments, avoiding people and moving with ease up and down stairs; b) interact with humans beyond question answering, intimately serving people with disabilities through physical interaction; and c) grasp and manipulate everyday objects to accomplish household tasks, for example, fetching milk cartons and cereal boxes out of a fridge and a shelf, respectively, and performing pouring actions to accomplish the breakfast preparation task. In addition to the abilities mentioned above, there are many others that have to be put together before the robot can help Frank. However, all of these aspects require the robot to see and perceive the world like humans do in order to make any decision. There are levels of perceptual information (the milk carton is in the fridge inside the kitchen) inferred from sensor observations that need to be combined to enable the robot to make high-level decisions about a task (breakfast preparation). In this article, we discuss some of the problems in the domain of goal-driven tasks that involve continual perception to inform sequential manipulation actions, and research efforts towards addressing them.

Perception for manipulation

Every task an autonomous agent is given can be associated with a goal representation. This representation defines the conditions the agent has to satisfy to accomplish the goal. For example, in breakfast preparation, the desired goal is to have a bowl on the dining table, containing milk and cereal. In order to get to this state, the robot has to sense and perceive the current state of the world: in other words, to know where the bowl, milk carton, and cereal box are, and where the objects that contain them, such as the fridge and cabinets, are. This has to be followed by actions such as grasping (grab the milk carton) and manipulating (pour the milk into the bowl) these objects in order to accomplish the task.

One thing to note is that household tasks such as meal preparation demand advancements in both hardware (grippers and manipulators) and software (AI algorithms). Hardware robotic platforms are capable of performing complex manipulation actions, as shown in Figure 2. Combining these advancements in hardware platforms with AI algorithms can enable robots to perform goal-directed task executions (Figure 2(a)). However, in order to bring these platforms to home environments, which are often cluttered as shown in Figure 2(b), the ability to perceive under ambiguous observations has to be improved.

Additionally, perceiving the world is extremely challenging, especially when the robot needs to act and change the world in order to complete a task. Noisy sensor data and widely varying environmental conditions with a wide range of object categories make robot perception a real bottleneck in bringing personal robots to the home.

The clutter challenge

Human environments are often cluttered, making them challenging to perceive using existing robot sensors. Clutter can be defined as the ambiguity created in observations due to environmental occlusions, which makes it harder to identify objects in the world. A typical kitchen scene captured using an RGB camera results in an observation like the one shown in Figure 3(a). As one can notice, objects such as the coffee maker are not directly visible to the camera from this particular viewpoint. Similarly, the kitchen sink captured using an RGB camera, shown in Figure 3(b), exhibits extreme clutter with a wide range of objects in arbitrary poses. Speaking of the variety of objects, some objects have functional parts, such as the fridge, oven, and coffee maker shown in Figure 3(c). These scenarios are very common and lead to incomplete sensor observations of the world. Perceiving and reasoning under such ambiguous and incomplete information in the service of a household task is challenging. To emphasize this, take a second look at Figure 3(b), and consider a task where the robot needs to pick up all the objects from the sink and put them in a dishwasher. At a minimum, this task requires the robot to decide which object it needs to pick up next. Hence, it should perceive what objects are in the sink and how they are physically supporting each other to know which object is the target of the next pick-up action. This task is challenging in and of itself, and becomes much harder under incomplete sensor observations due to clutter.

Figure 3: Cluttered scenes: a) a cluttered kitchen scene, b) a sink scene, c) a kitchen scene with objects that have doors and functional parts (fridge, cabinets, and oven). Image credit: Google image search.

Levels of perception

A robot that is built with various components needs to possess proprioception (sense of relative position of one’s own body parts) along with an estimate of its relative position in an environment. For example, the robot should be able to estimate that “it is near the kitchen sink” in a familiar home environment. In addition to this notion of self localization in an environment, the robot should be able to perceive the state of the world and abstract the perceived information to perform tasks. For example, the robot must perceive where the dining table, plates, and cutlery are in order to set the table for dinner.

A robot can understand the world at different levels. A high-level understanding can represent the world as a collection of semantic place locations (living room, bedroom, and dining room) that lets the robot navigate to a desired location (the dining table in the dining room) (Figure 4). A low-level abstraction can represent it as a collection of objects (Figure 5) with their positions and orientations, suitable for grasping the objects to complete a task.

Figure 4: Understanding a large scale scene as a collection of objects and semantic locations [8]
Figure 5: Understanding a scene as a collection of objects with their positions and orientations [7]

Scene understanding at these levels of abstraction becomes extremely challenging when the sensor observations are noisy and capture only partial information about the environment at every instant.

Our advancements

Through our research, my colleagues and I are developing efficient ways to overcome these challenges posed by complex indoor environments. For example, if the task is to set the table for dinner, then the robot can associate a goal state with conditions such as: the table is clean, one plate is on the table in front of each chair, a fork is on the left of each plate, and a spoon is on the right of each plate. Achieving this goal state means that the robot accomplishes the task. The initial state provided to the robot is arbitrary and can have a messy table, or plates and cutlery in shelves. This arbitrary initial state has to be perceived and abstracted appropriately into symbols and axioms (the table is not clean, the plates are in shelves, the forks are in the drawer, and the spoons are in the drainer), so that the robot can plan actions that lead to the associated goal state. However, perceiving this arbitrary state to enable task planning to pick and place objects is non-trivial. As a step towards solving this problem, Zeng et al. [1] developed an approach that lets the robot perceive a desired goal state (a tray with 3 objects in a specific configuration) shown by the user (see Figure 6). At a later point, when the robot is provided with an arbitrary scene (a tray and 5 objects in some configuration), it perceives the current state and plans a sequence of actions to realize the desired goal state.

Figure 6: Robot perceiving the initial state and planning actions to realize a goal state shown by a user [1]. Video of this work is available here.
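To give a feel for the idea of goal conditions expressed as symbols, here is a toy Python sketch (not the axiomatic scene representation used in [1, 3]) that encodes a goal state as a set of symbolic conditions and checks which of them a perceived world state has yet to satisfy; the predicate and object names are invented for illustration.

```python
# Toy illustration only: a goal state as symbolic conditions checked against a
# perceived world state. Predicate and object names are hypothetical.

# Perceived state: predicate -> set of argument tuples that currently hold.
perceived_state = {
    "on": {("plate_1", "shelf"), ("fork_1", "drawer")},
    "clean": set(),  # the table has not been perceived as clean
}

# Goal conditions for "set the table": every tuple must hold in the final state.
goal_conditions = {
    ("clean", ("table",)),
    ("on", ("plate_1", "table")),
    ("left_of", ("fork_1", "plate_1")),
}


def unsatisfied(goal, state):
    """Return the goal conditions not yet satisfied by the perceived state;
    these are what a task planner would still need to achieve."""
    return {(pred, args) for pred, args in goal
            if args not in state.get(pred, set())}


for condition in sorted(unsatisfied(goal_conditions, perceived_state)):
    print("still to achieve:", condition)
```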

To perform these types of goal-driven tasks under the inherent clutter of indoor environments, we enable the robot to perceive the world by developing algorithms that consider all possible hypotheses (a generative approach [2]) to produce the most likely world state that explains the robot's sensor observation [3, 4]. These hypotheses consist of plausible objects and their poses in the real world that are likely to produce the observed sensor data.

In order to extend these inference methods beyond tabletop scenes (for example, where the robot has to navigate to the kitchen and open the fridge to fetch a milk carton in order to complete a task), we are developing algorithms that scale to a large number and variety of objects with functional parts (cabinets and other articulated objects – Figure 7) [5, 6].

Figure 7: Three scenes where a cabinet with a (blue) frame and three drawers (yellow, cyan, pink) is estimated in point cloud observations that are incomplete due to self- and environmental occlusions. From left to right: original scene, point cloud observation from a depth camera, estimated pose viewed from two different viewpoints. Video of this work is available here.

Furthermore, contextual reasoning (a milk carton is most likely to be found in the fridge) enables the robot to make intelligent decisions when searching for objects in human environments. We explore ways the robot can perform semantic mapping of such information while simultaneously identifying and locating objects [7] (see Figure 8). Later in the mapping process, the acquired knowledge is used to resolve ambiguity when perceiving or searching for objects.

Figure 8: Robot semantically maps a student lounge in four different visits. Each column shows an RGB snapshot of the environment, together with the corresponding semantic map composed of the detected and localized objects. We model contextual relations between objects and temporal consistency of object locations. Video link: https://www.youtube.com/watch?v=W-6ViSlrrZg

These research efforts address some of the critical challenges discussed earlier in the article, and act as a bridge between classical AI systems and physical robots. By solving these problems, we are advancing towards the goal of having a personal robot companion that can serve Frank and make him a healthy breakfast.

Bio

Karthik Desingh is a Ph.D. candidate in Computer Science and Engineering at the University of Michigan. He works in the Laboratory for PROGRESS, advised by Prof. Jenkins, and is closely associated with the Robotics Institute and Michigan AI. Desingh earned his B.E. in Electronics and Communication Engineering at Osmania University, India (2008), and M.S. degrees in Computer Science at IIIT-Hyderabad (2013) and Brown University (2015). His research interests lie primarily in perception under uncertainty for goal-driven mobile manipulation tasks, and more broadly in solving problems in robotics and computer vision using probabilistic graphical models.

References

  1. Zhen Zeng, Zheming Zhou, Zhiqiang Sui, Odest Chadwicke Jenkins, Semantic Robot Programming for Goal-Directed Manipulation in Cluttered Scenes, ICRA 2018
  2. Generative Model, https://en.wikipedia.org/wiki/Generative_model
  3. Zhiqiang Sui, Lingzhu Xiang, Odest Chadwicke Jenkins, Karthik Desingh, Goal-directed Robot Manipulation through Axiomatic Scene Estimation, IJRR 2017
  4. Karthik Desingh, Odest Chadwicke Jenkins, Lionel Reveret, Zhiqiang Sui, Physically Plausible Scene Estimation for Manipulation in Clutter, Humanoids 2016
  5. Karthik Desingh, Anthony Opipari, Odest Chadwicke Jenkins, Pull Message Passing for Nonparametric Belief Propagation, arXiv, July 2018
  6. Karthik Desingh, Shiyang Lu, Anthony Opipari, Odest Chadwicke Jenkins, Factored Pose Estimation of Articulated Objects using Efficient Nonparametric Belief Propagation, ICRA 2019 (to appear)
  7. Zhen Zeng, Yunwen Zhou, Odest Chadwicke Jenkins, Karthik Desingh, Semantic Mapping with Simultaneous Object Detection and Localization, IROS 2018
  8. Karthik Desingh, Anthony Opipari, Odest Chadwicke Jenkins, Analysis of Goal-directed Manipulation in Clutter using Scene Graph Belief Propagation, ICRA 2018 Workshop
  9. Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Jürgen Sturm, Matthias Nießner, ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans, CVPR 2018
  10. Rodney Brooks, “Steps Towards Super Intelligence,” Blog post consisting of four parts.

by Gabe Cherry 

This is an interview with Professor Emily Mower Provost that was first published by The Michigan Engineer News Center.

Emily Mower Provost, Associate Professor of Electrical Engineering and Computer Science, speaks at the Ada Lovelace Opera: A Celebration of Women in Computing event. Photo: Joseph Xu

Using machine learning to decode the unpredictable world of human emotion might seem like an unusual choice. But in the ambiguity of human expression, U-M computer science and engineering associate professor Emily Mower Provost has discovered a rich trove of data waiting to be analyzed.

Mower Provost uses machine learning to help measure emotion, mood, and other aspects of human behavior; for example, she has developed a smartphone app that analyzes the speech of patients with bipolar disorder to track their mood, with the ultimate goal of helping them more effectively manage their health.

How do you quantify something as ambiguous as emotion in a field where, traditionally, ambiguity is the enemy?
As machine learning researchers, we generally expect data examples to be accompanied by clear and unambiguous labels. The task is then to discover new ways to associate the patterns in the data with their labels.

But in emotion modeling, the whole idea of “unambiguity” is invalid. Ambiguity is a natural component of how we express ourselves and contributes to the richness of our interactions. Emotion modeling research asks how we can quantify the ambiguity in human expression and use that information to do something useful.

Can you give an example?
Sure – consider the example of one person talking loudly to his/her friend. An outside observer might perceive that the interaction is positive. Someone else may note the volume of the interaction and, instead, perceive that the interaction is negative. Who is right? To some extent they both are.

There's useful data not just in the different interpretations, but in the degree of variation between the interpretations. We have been researching methods to predict how groups of people, rather than a single person, would perceive an emotional display. We can turn these group perceptions into multi-dimensional descriptions called "emotion profiles," which enable us to express ambiguity mathematically.
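One simple way to picture an emotion profile, offered here as a minimal sketch rather than the exact formulation used in this research, is as a normalized distribution over emotion categories computed from several observers' judgments of the same display; the category set and annotations below are invented.

```python
# Minimal sketch, not the authors' exact formulation: turning several
# observers' categorical judgments of one utterance into an "emotion profile",
# a distribution over categories that keeps the ambiguity instead of
# collapsing it to a single label.
from collections import Counter

CATEGORIES = ["angry", "happy", "neutral", "sad"]  # assumed category set


def emotion_profile(annotations):
    """annotations: list of category labels given by different observers."""
    counts = Counter(annotations)
    total = len(annotations)
    return {c: counts[c] / total for c in CATEGORIES}


# A loud interaction that observers perceive differently (invented data).
print(emotion_profile(["happy", "happy", "angry", "neutral"]))
# {'angry': 0.25, 'happy': 0.5, 'neutral': 0.25, 'sad': 0.0}
```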

What sorts of adjustments did you have to make to transition your work to the medical field?
One big difference is that, in machine learning, we’re often very comfortable speaking about data labels—a description of the information that is present in a given data source. For example, we might say that a given interaction is “positive” or that a given image contains a cat.

In the medical community, the notion of a label may not be valid. For example, an individual may be diagnosed with bipolar disorder, but the label of “bipolar disorder” does not describe who that person is or even what that person may be doing or thinking at any given time.

We, as engineers, must change the way we think about modeling these types of behavior, asking what we can quantify and then investigating how we can use these measures to understand more about the individuals with whom we work.

How do you think machine learning will change the medical landscape in the years ahead?
Machine learning is in such an exciting place right now. For so long, we’ve worked to demonstrate what machine learning tools can do: that we can automatically predict aspects of health and, critically, that this is a worthwhile thing to do.

Now we get to ask how we can help: how we can help reduce burdens on individual patients, on caregivers, on the healthcare system. We get to ask how we can learn more about health and wellness by stepping outside of laboratory settings and measuring people in their own environments.

by Michigan AI Lab 

The Artificial Intelligence Laboratory was founded in 1988 as an outgrowth of the Computer Vision Research Lab (CVRL), after faculty working on other aspects of AI had joined Michigan.

Ramesh Jain was its first Director, and the initial faculty members included Lynn Conway, Ed Durfee, John Holland, Keki Irani, Ramesh Jain, Steve Kaplan, Dave Kieras, John Laird, Steve Lytinen, Brian Schunck, Elliot Soloway, and Terry Weymouth. Jain continued as Director while the scope of the lab broadened to encompass areas such as cognitive architectures, natural language processing, intelligent tutoring systems, genetic algorithms, logical reasoning, and multiagent planning. In 1992, Jain left and John Laird took over as Lab Director, setting the stage for the AI Lab to flourish as a broad AI community.

The AI Lab originally shared the Advanced Technology Laboratory (ATL) building (now occupied by Biomedical Engineering) with the Advanced Computer Architecture Laboratory (ACAL), which subsequently moved to the EECS Building. ATL was originally the Printing Services building, and its large open basement area, which once held printing machinery, was a perfect home for various (large!) robotic systems, including a massive PUMA arm and many mobile robots. Faculty in other departments who worked on robotic systems shared the facilities.

From its inception, the AI Lab has taken a broad, integrated view of AI. For example, a collaboration between faculty and students involved in vision, planning, and robotics led to a team from Michigan winning the very first AAAI Mobile Robot Competition, in 1992. For the rest of the 1990s, members of the AI Lab continued work on mobile robotics, including as part of DARPA-sponsored efforts to develop unmanned ground vehicles.

The AI Lab continued to grow from 1990 into the early 2000s, joined by new faculty members including Michael Wellman, Dan Koditschek, Martha Pollack, and Satinder Singh, as well as by faculty from other EECS research areas (Bill Rounds, Bill Birmingham, and Greg Wakefield) whose research interests shifted toward AI. Faculty continued to collaborate among themselves and with other units, including cross-campus projects on digital libraries and computer music.

In 2005, the AI Lab joined the rest of Computer Science and Engineering in the new Beyster Building, where it has grown to nearly 20 faculty and research staff.

The lab was directed by:
Ramesh Jain (1988-92), John Laird (1992-99),
Edmund Durfee (1999-01), Michael Wellman (2001-05),
John Laird (2005-06), Satinder Singh Baveja (2006-17),
Rada Mihalcea (2017-)

by Jule Schatz 

What is the first thing you think of when you see the word “mouse”?
What about when you see “dairy”?
Most likely you all gave pretty different answers for the two word prompts above.
Now what about “cake”? What word comes to mind?
I can guess that most of you thought of “cheese”.

How come I was able to guess? This is a psychological effect called priming. The connection you made for “cake” was influenced by the words that you had previously seen, “mouse” and “dairy”. Because you were thinking about “mouse” and “dairy”, your mind was already thinking about “cheese”. So, when I asked you to connect something with “cake”, that’s where your mind went.

Word associations can be a fun way to learn about how people think, and they have also inspired popular games. One example is Codenames, which involves coming up with word associations and trying to guess what associations other people will make.

Another game involving word associations is the Remote Associates Test (the RAT). In this game, a player is presented with three words and has to come up with a fourth word that relates to all three. Here is one example I think we can all get correct after our earlier priming!
Cake, cottage, Swiss...
The answer is cheese! Here are some more for you to try.
Man, glue, star
Dust, cereal, fish
Way, board, sleep
See the answers at the bottom of the article.

These can be fun brain teasers, but they also bring up questions of how humans solve the RAT. Did you find one of these examples to be harder than the others? Why was that? What parts of the brain are involved? Does priming relate to how people perform on the RAT?

One way to approach these questions is with computer science: build a model of the human brain on a computer and see what settings and instructions make it behave most like a human.

Soar is a cognitive architecture that can be used to model how the human brain performs tasks such as the RAT. Soar allows us to test the hypothesis that priming helps humans solve the RAT: you can create one Soar program that uses priming and another that doesn’t, and run both on RAT problems. By comparing their results to human performance, we gain information about how humans solve the RAT.
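To make the comparison concrete, here is a tiny sketch in plain Python (not Soar, and not the actual model from the paper) of how a priming boost could sit on top of a simple word-association table. The table, the scores, and the boost value are all invented for illustration.

```python
# Toy sketch of the priming idea (plain Python, not Soar, and not the model
# from the paper). Candidate answers are scored by how strongly they are
# associated with all three cue words; priming adds a small boost to
# candidates linked to recently seen words. All numbers are invented.
ASSOCIATIONS = {
    "cheese": {"cake": 0.9, "cottage": 0.8, "swiss": 0.9, "mouse": 0.7, "dairy": 0.9},
    "knife":  {"swiss": 0.6, "cake": 0.3},
    "garden": {"cottage": 0.4},
}

def solve_rat(cues, primed=(), boost=0.2):
    """Return (best_candidate, score) for a three-cue RAT problem."""
    scores = {}
    for candidate, assoc in ASSOCIATIONS.items():
        score = sum(assoc.get(cue, 0.0) for cue in cues)
        # Priming: candidates connected to recently activated words get a boost.
        if any(p in assoc for p in primed):
            score += boost
        scores[candidate] = score
    best = max(scores, key=scores.get)
    return best, scores[best]

print(solve_rat(["cake", "cottage", "swiss"]))
print(solve_rat(["cake", "cottage", "swiss"], primed=["mouse", "dairy"]))
# "cheese" wins in both runs, but the primed run reaches it with a higher
# score -- a crude stand-in for answering faster or more reliably.
```

A real Soar model is far richer than a lookup table, but the side-by-side comparison of a primed run and an unprimed run mirrors the experimental setup described above.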

From running such experiments in Soar, we found that priming does result in more humanlike performance on RAT problems. This is evidence that humans rely on priming when solving RAT problems, and it is a concrete example of how we can use computer science to gain insights into the human brain.

If you want more information on my research using Soar to solve RAT problems, check out the paper.
You can also visit the GitHub repository where all the code for the project lives!

More information on priming: https://en.wikipedia.org/wiki/Priming_(psychology)

More information on Soar: https://soar.eecs.umich.edu/

More information on General AI and cognitive architectures: http://www.cogsys.org/pdf/paper-1-2.pdf

RAT Answers: super, bowl, walk