SIS Research Area - Data Management & Analytics
Research Theme
Central Concerns and Questions
We are interested in developing general learning algorithms that enable text mining tools to adapt to new domains (e.g. from news articles to online blogs), and in applying these general domain adaptation techniques to specific text mining problems such as text categorisation, named entity recognition and relation extraction.
Emerging Ideas and Initiatives
Supervised text mining tools require sufficient annotated data for training, but most annotated corpora currently come from the news domain, which prevents existing tools from being applied successfully to other domains such as emails and blogs. An important research question is therefore how to adapt text mining tools to new domains by developing new learning algorithms and exploiting domain knowledge. Our current focus is on adapting relation extraction methods to new genres and new domains of text by exploiting the commonality between different types of relations.
Selected Publications
[1] Jing Jiang. Multi-task Transfer Learning for Weakly-Supervised Relation Extraction. The Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the Fourth International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP), August 2009, 1012-1020.
[2] Jing Gao, Wei Fan, Jing Jiang and Jiawei Han. Knowledge Transfer via Multiple Model Local Structure Mapping. ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 2008, 283-291.
[3] Jing Jiang and ChengXiang Zhai. A Two-stage Approach to Domain Adaptation for Statistical Classifiers. ACM Conference on Information and Knowledge Management (CIKM), November 2007, 401-410.
[4] Jing Jiang and ChengXiang Zhai. Instance Weighting for Domain Adaptation in NLP. Annual Meeting of the Association for Computational Linguistics (ACL), June 2007, 264-271.
[5] Jing Jiang and ChengXiang Zhai. Exploiting Domain Structure for Named Entity Recognition. Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), June 2006, 74-81.
Projects, Presentations and Posters
- Jing Jiang, Exploiting Domain Structure for Named Entity Recognition (presentation)
Collaborators
- Chieu Hai Leong (Singapore DSO National Laboratories)