Framework for Question-Answering in Sanskrit through Automated Construction of Knowledge Graphs

Abstract

Sanskrit (Saṃskṛta) has one of the largest and most varied bodies of literature in the world. Extracting knowledge from it, however, is challenging for several reasons, including the complexity of the language and the paucity of standard natural language processing tools. In this paper, we target the problem of building knowledge graphs for particular types of relationships from Saṃskṛta texts. We build a natural language question-answering system in Saṃskṛta that uses the knowledge graph to answer factoid questions. We design a framework for the overall system and implement two separate instances of the system on human relationships from Mahābhārata and Rāmāyaṇa, and one instance on synonymous relationships from Bhāvaprakāśa Nighaṇṭu, a technical text from Āyurveda. We show that about 50% of the factoid questions can be answered correctly by the system. More importantly, we analyse the shortcomings of the system in detail for each step, and discuss possible ways forward.
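As a rough illustration of the idea of answering factoid questions from a knowledge graph, the sketch below stores relationships as (subject, relation, object) triples and looks up an answer by subject and relation. It is not the paper's implementation: the class `KnowledgeGraph`, its methods, and the relation labels are hypothetical names chosen for this example, and the three facts are well-known kinship relations from the epics rather than data extracted by the system.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy triple store (hypothetical; not the system described in the paper)."""

    def __init__(self):
        # Maps (subject, relation) to the set of objects linked to it.
        self._index = defaultdict(set)

    def add(self, subject, relation, obj):
        """Record one triple, e.g. ("Arjuna", "father", "Pāṇḍu")."""
        self._index[(subject, relation)].add(obj)

    def answer(self, subject, relation):
        """Return all known objects for a factoid query, sorted for stability."""
        return sorted(self._index.get((subject, relation), set()))


# Illustrative facts only; assumed for the example, not taken from the paper's data.
kg = KnowledgeGraph()
kg.add("Arjuna", "father", "Pāṇḍu")
kg.add("Arjuna", "mother", "Kuntī")
kg.add("Rāma", "father", "Daśaratha")

# "Who is the father of Arjuna?" reduces to a (subject, relation) lookup.
print(kg.answer("Arjuna", "father"))
```

In a full pipeline, the hard parts are the steps this sketch skips: extracting triples from Saṃskṛta text and mapping a natural language question to the subject and relation used in the lookup.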

Publication
Proceedings of the 6th International Sanskrit Computational Linguistics Symposium
Hrishikesh Terdalkar
Postdoctoral Researcher

My research lies in the intersection of Computational Linguistics, Natural Language Processing, and Graph Databases with a particular emphasis on low-resource languages such as Sanskrit and other Indian languages. I am committed to pioneering NLP innovations that have a real-world impact. I enjoy building user-friendly GUIs and CLIs for various applications. My interests also include Artificial Intelligence, Databases, Human-Computer Interaction, Information Retrieval, and Data Mining.