- CoQA: A Conversational Question Answering Challenge
- GraphParser: An Ungrounded and Grounded Semantic Parser
- English Compound Noun Compositionality Dataset
- Hindi POS Tagger
- Hindi Dependency Parser
- Hindi WordNet in Python
- Kannada POS Tagger
- Telugu POS Tagger
- Indonesian and Malay Tools
CoQA: A Conversational Question Answering Challenge
CoQA is a large-scale dataset for building Conversational Question Answering systems. The goal of the CoQA challenge is to measure the ability of machines to understand a text passage and answer a series of interconnected questions that appear in a conversation.
GraphParser: An Ungrounded and Grounded Semantic Parser
GraphParser is a semantic parser that converts natural-language sentences and questions into predicate-argument graphs, which can in turn be converted to logical queries and executed against the Freebase knowledge graph. Please read more about it in our paper Large-scale Semantic Parsing without Question-Answer Pairs.
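To make the graph-to-query step concrete, here is a minimal toy sketch in Python. It is not GraphParser's actual representation or API: the `Edge` class, the `graph_to_query` function, and the `ex:` predicate names are hypothetical, chosen only for illustration, and the output is a SPARQL-like string rather than a query validated against Freebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One predicate-argument edge, read as predicate(subject, obj)."""
    predicate: str
    subject: str
    obj: str


def graph_to_query(edges, target):
    """Render a list of edges as a SPARQL-like query for the variable `target`."""
    # Each edge becomes one triple pattern inside the WHERE clause.
    patterns = [f"  {e.subject} {e.predicate} {e.obj} ." for e in edges]
    return "SELECT " + target + " WHERE {\n" + "\n".join(patterns) + "\n}"


# Hypothetical graph for the question "Who directed Titanic?".
# The predicate names are made up and are not Freebase identifiers.
edges = [
    Edge("ex:directed_by", "?film", "?x"),
    Edge("ex:name", "?film", '"Titanic"'),
]

query = graph_to_query(edges, "?x")
print(query)
```

The point of the sketch is only that each edge in the graph maps to one pattern in the query, with the question word becoming the selected variable.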
Compound Noun Compositionality Dataset
This is the compositionality dataset described in Reddy, McCarthy and Manandhar (2011, IJCNLP).
Alternate download link from Diana McCarthy
POS Taggers, Corpora, Lemmatizers, Morph Analyzers for Indian Languages
Most of these tools were developed using the methods described in Reddy and Sharoff (2011, CLIA @ IJCNLP). Some of the taggers are built using cross-lingual resources and some using monolingual resources. Please read the README of each tool for additional information.
This work is supported by Sketch Engine and the Intellitext project.
If you need resources for any other Indian languages, please contact me.
Kannada Tools
Download v2.0
Sample Output of the tagger
For the complete corpus described in the paper, please contact me.
Alternate download link from Serge Sharoff
Telugu Tools
Download v3.0
Sample Output of the tagger
Project Page: https://bitbucket.org/sivareddyg/telugu-part-of-speech-tagger
Hindi Tools
Download v3.0
Sample Output of the tagger
Project Page: https://bitbucket.org/sivareddyg/hindi-part-of-speech-tagger
Indonesian and Malay Morphological Analyzer, Part-of-Speech (POS) Tagger, and Machine Translation System
With support from Sketch Engine, I have made a few contributions to the Apertium Indonesian-Malay language pair. All the tools can be downloaded from http://sourceforge.net/projects/apertium/files/apertium-id-ms/
Hindi WordNet in Python
Download v1.4
Other versions
Demo Program
Project Page: https://bitbucket.org/sivareddyg/python-hindi-wordnet
Hindi Dependency Parser
Download v2.0
Sample Output
Project Page: https://bitbucket.org/sivareddyg/hindi-dependency-parser