Ben Taskar
Computer and Information Science, UPENN
From Co-occurrence to Correspondence
While supervised learning methods for classification and structured
prediction are very effective in many domains, they require detailed
and precise labeling of large amounts of data. Weakly or ambiguously
labeled data present major challenges as well as opportunities. For
example, to build a machine translation system, we typically have
large amounts of translated sentences to learn from, but without
word- or phrase-level correspondence. Copious images and videos on the
web or your hard drive are typically labeled with captions of who and what
is in the picture, but not where and when. The challenges are both
theoretical and algorithmic: under what assumptions can we guarantee
effective and efficient learning of precise correspondence from pure
co-occurrence? I will describe our ongoing work on weakly supervised
learning approaches for machine translation and parsing of images,
videos and text.
Bio:
Ben Taskar received his bachelor's and doctoral degrees in Computer
Science from Stanford University. After a postdoc at the University of
California at Berkeley, he joined the faculty at the University of
Pennsylvania Computer and Information Science Department in 2007,
where he currently co-directs PRiML: Penn Research in Machine Learning.
His research interests include machine learning, natural language
processing and computer vision. His work on structured prediction has
received best paper awards at the NIPS and EMNLP conferences.