Chandojñānam: A Sanskrit Meter Identification and Utilization System

Hrishikesh Terdalkar, Arnab Bhattacharya

January 2023

Abstract

We present Chandojñānam, a web-based Sanskrit meter (Chanda) identification and utilization system. In addition to the core functionality of identifying meters, it offers a friendly user interface that displays the scansion, a graphical representation of the metrical pattern. The system supports identification of meters from uploaded images by using optical character recognition (OCR) engines in the backend. It can also process entire text files at once. The text can be processed in two modes: as a list of individual lines, or as a collection of verses. When a line or a verse does not correspond exactly to a known meter, Chandojñānam can find fuzzy (i.e., approximate and close) matches based on sequence matching. This opens up the possibility of meter-based correction of erroneous digital corpora. The system is available for use at https://sanskrit.iitk.ac.in/jnanasangraha/chanda/, and the source code, in the form of a Python library, is available at https://github.com/hrishikeshrt/chanda/.
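The idea of fuzzy matching on metrical patterns can be illustrated with a minimal sketch. It is not the Chandojñānam implementation or its API: it assumes a line has already been scanned into a string of laghu (L) and guru (G) marks, and it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for the sequence matching described above. The names `KNOWN_METERS` and `fuzzy_match` are hypothetical, and the four-entry meter table is only illustrative.

```python
from difflib import SequenceMatcher

# Illustrative table (an assumption for this sketch, not the system's data):
# meter name -> laghu (L) / guru (G) pattern of a single pada.
KNOWN_METERS = {
    "Indravajra": "GGLGGLLGLGG",
    "Upendravajra": "LGLGGLLGLGG",
    "Bhujangaprayata": "LGGLGGLGGLGG",
    "Vasantatilaka": "GGLGLLLGLLGLGG",
}


def fuzzy_match(pattern, meters=KNOWN_METERS, top_k=2):
    """Return up to top_k (meter name, similarity) pairs, best match first.

    Similarity is SequenceMatcher.ratio(), a value in [0, 1] where 1.0
    means the scanned pattern equals the reference pattern exactly.
    """
    scored = []
    for name, reference in meters.items():
        matcher = SequenceMatcher(None, pattern, reference, autojunk=False)
        scored.append((matcher.ratio(), name))
    # Highest similarity first; ties are broken alphabetically by name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, round(score, 3)) for score, name in scored[:top_k]]


if __name__ == "__main__":
    # An exact Indravajra pattern.
    print(fuzzy_match("GGLGGLLGLGG"))
    # A Vasantatilaka pattern with its final syllable missing,
    # as might happen in an erroneous digital text.
    print(fuzzy_match("GGLGLLLGLLGLG"))
```

In this sketch, a pattern with one missing syllable still ranks its intended meter first, which is the kind of signal that could flag a line in a corpus as a candidate for correction.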

Type

Conference paper

Publication

The 18th World Sanskrit Conference, January 2023

Hrishikesh Terdalkar

Assistant Professor

My research lies at the intersection of Computational Linguistics, Natural Language Processing, and Knowledge Graphs with a particular emphasis on low-resource languages such as Sanskrit and other Indian languages. My recent work has focused on building datasets, models, benchmarks, and evaluation frameworks grounded in linguistic structure. My interests also include Artificial Intelligence, Information Retrieval, Human-Computer Interaction, and Data Mining.