This blog is maintained by the Robot Perception and Learning lab at CSIE, NTU, Taiwan. Our scientific interests are driven by the desire to build intelligent robots and computers that can serve people more efficiently than equivalent manned systems in a wide variety of dynamic and unstructured environments.
Wednesday, November 09, 2005
CMU LTI talk: Natural Language Processing in Bioinformatics: Uncovering Semantic Relations
Speaker: Barbara Rosario, University of California, Berkeley
TITLE: Natural Language Processing in Bioinformatics: Uncovering Semantic Relations
ABSTRACT: Current-generation search engines provide a glimpse of the kinds of activities that can be catalyzed by intelligent processing of large-scale document corpora. Further progress in this area will require the tools of statistical natural language processing, including tools for automatic extraction of propositional information from text. This presentation will explore several lines of research on one of the core problems that arise in this domain: the identification of semantic relations between constituents in sentences. First, I will discuss the problem of identifying the relationship between the words of a two-word noun compound (to characterize, for example, the treatment-for-disease relationship between the words of "migraine treatment" versus the method-of-treatment relationship between the words of "aerosol treatment"). Second, I will describe my work in the area of Information Extraction, in particular the problem of identifying semantic entities such as "treatment" and "disease" in biomedical text. Finally, I will present my recent work on the problem of predicting protein-protein interactions from biological text. A major impediment to such work is the acquisition of appropriately labeled training data; for my experiments I have identified a database that serves as a proxy for training data. In each of these cases I will describe the statistical machine learning methods, both generative and discriminative, used to tackle these tasks.