Event: MS/UW Symposium in Computational Linguistics

25th MS/UW Symposium in Computational Linguistics
Time: 3:30-5pm, October 28, 2011
Location: Microsoft, Building 99, lecture hall 1919
Please see the end of the message for a note about parking.

Announcing the 25th Symposium in Computational Linguistics sponsored by the UW
Departments of Linguistics, Electrical
Engineering, and Computer Science, Microsoft Research, and UW alumni at

Come take advantage of this opportunity to connect with the computational
linguistics community at Microsoft and the
University of Washington. This is a regular opportunity for computational
linguists at the University of Washington and at
Microsoft to discuss topics in the field and to connect in a friendly informal
atmosphere. We will have five short talks (see
below), followed by informal mingling.

Yoav Artzi: Towards Predicting Responses in Twitter

Twitter’s open and public network allows us to directly observe how messages are
reaching and influencing users by following
responses. Twitter provides two forms of response: replies and retweets.
Responses thus serve both as a measure of
distribution and as a way to increase it. Understanding this dynamic for
prediction would be valuable information for any
content generator. In this work, we describe methods to predict if a given
tweet will elicit a response from the social
network once it’s posted. To accomplish this task we exploit features derived
from various sources of signal such as the
language used in the tweet, the social network and the user’s history. We use
these features and leverage historical data to
automatically train prediction models from a stream of real tweets collected
over a two-week period. We empirically show
that our models are capable of generating accurate predictions over a subset of
the tweet population.

Brian Hutchinson: Tensor Deep Stacking Networks for Phonetic Classification and

We introduce a novel deep architecture, the Tensor Deep Stacking Network
(T-DSN), in which multiple blocks are stacked on top
of one another and where a bilinear mapping from hidden representations to the
output in each block is used to incorporate
higher-order statistics of the input features. Using an efficient and scalable
parallel learning algorithm, we train a T-DSN
to classify standard three-state monophones in the TIMIT database. The T-DSN
outperforms an alternative pretrained Deep
Neural Network (DNN) architecture in frame-level classification (both state and
phone) and in the cross-entropy measure. For
continuous phonetic recognition, T-DSN performs equivalently to a DNN, without
the need for a hard-to-scale fine-tuning step.

Aniruddh Nath: Generalizing Natural Language Instructions

Reinforcement learning has been successfully applied to the problem of mapping
natural language instructions to actions with
little or no supervision (Branavan et al., 2009). However, this mapping cannot
be used to solve new problem instances unless
we also receive a set of instructions for the new instance. We present an
algorithm for generalizing the knowledge gained
while learning to map instructions to actions, allowing us to solve new problem
instances with no additional knowledge. The
algorithm is a form of imitation learning using Counting-MLNs (C-MLNs), a novel
statistical relational representation that
can reason about the number of objects that satisfy a formula. We present an
algorithm for learning C-MLNs, and apply it to
the problem of generalizing instructions for the Crossblock puzzle game. We
also investigate the use of C-MLNs for standard
relational reinforcement learning, without the use of natural language

(Joint work with Matthew Richardson.)

Julie Medero: NLP in Patient-Directed Medical Displays

When patients are in the Emergency Department (ED), extensive electronic
records are kept about their visit. The
patient’s complaints and background are recorded, along with doctors’ notes and
records of every test that is ordered and
every medication that is administered. Currently, though, that information is
not available to the patient, or to the
patient’s family and friends who are with them in the hospital. In this talk, I
will summarize work done this summer with the NLP and CUE groups at MSR on ways
that NLP technologies can be used to make electronic medical records accessible
to patients in the context of a mobile, patient-friendly display. In
particular, we look at normalization and splitting of the Chief Complaint
records that are entered by triage nurses when patients arrive at the ED, and
at extracting “patient-friendly” explanations of lab tests and medications from
consumer health websites.

Amittai Axelrod: Topic Modeling for Statistical Machine Translation

Unsupervised topic models can be used effectively in language modeling and
information retrieval to tailor performance on
broad corpora by determining clusters of related data. We combine such a topic
model based on Latent Dirichlet Allocation
with our recent work on corpus sub-selection to improve machine translation
system results on a variety of TED talks.

Parking: Due to other events at Microsoft Friday afternoon, parking may be
busier than usual. Please allow some extra time. You can park anywhere in the
garage as long as you register the car with the receptionist (if the visitor
spots are full, you can take any other spot in the garage). You may also email
Michael Gamon (mgamon at microsoft dot com) with “symposium parking” in the
subject line and with your vehicle and license plate information in the body to
be pre-registered for parking, which saves additional time.
