San Jose Professor Makes Indian Classical Music Accessible Through Artificial Intelligence

India West Staff Reporter

SAN JOSE, CA – Music has a profound impact on human life. Many, like Dr. Vishnu S. Pendyala, a faculty member in the Department of Applied Data Science at San Jose State University, aspire to learn music, especially Indian classical music, but are held back by a learning gap, mainly due to the expensive process of acquiring musical competence, he says. It takes years of rigorous practice under the tutelage of an established musician to perfect vocal music. Since childhood, Pendyala has made multiple attempts to learn Carnatic vocal music, but to no avail.

Many years later, advances in technology have helped him direct his research in deep learning, an active area of Artificial Intelligence (AI), towards his passion for music. He and his students began experimenting with music using cutting-edge inventions.

Audio signals that please the ear and create a melody generally conform to a structural framework. Audio is one of the four types of data that deep learning works best on, the other three being image, video, and text. Pendyala and his student teams experimented with audio signals, particularly with regard to the latent melodic frameworks within them.

His undergraduate students, Rohan Surana and Aakash Varshney, worked with him to use a deep learning framework called CycleGAN to convert Indian classical melodic frameworks in the South Indian Carnatic style to those of the North Indian Hindustani style and vice versa. The work was published in the Springer Proceedings of the Second International Conference on Advances in Computer Engineering and Communication Systems earlier this year.

The system they developed takes as input unpaired samples of both styles of music, Hindustani and Carnatic, and learns how to convert a new sample from one style to the other. CycleGAN is a type of generative adversarial network, abbreviated as GAN. GANs are the same technology used to generate realistic-looking fictional images.
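One idea that lets CycleGAN learn from unpaired samples is cycle consistency: converting a sample to the other style and back again should roughly return the original. The minimal sketch below illustrates only that round-trip penalty, on toy lists of numbers rather than audio. The functions `to_hindustani`, `to_carnatic`, and `imperfect_to_carnatic` are hypothetical stand-ins for trained networks and are not the system described in this article.

```python
from typing import Callable, List

Vector = List[float]


def cycle_consistency_loss(sample: Vector,
                           forward: Callable[[Vector], Vector],
                           backward: Callable[[Vector], Vector]) -> float:
    """Mean absolute error between a sample and its round-trip reconstruction."""
    reconstructed = backward(forward(sample))
    return sum(abs(a - b) for a, b in zip(sample, reconstructed)) / len(sample)


# Hypothetical toy "converters"; real ones would be neural networks.
def to_hindustani(x: Vector) -> Vector:
    return [2.0 * v + 1.0 for v in x]


def to_carnatic(x: Vector) -> Vector:
    # Exact inverse of to_hindustani, so the round trip is lossless.
    return [(v - 1.0) / 2.0 for v in x]


def imperfect_to_carnatic(x: Vector) -> Vector:
    # Deliberately wrong inverse, so the round trip drifts from the original.
    return [v / 2.0 for v in x]


carnatic_sample = [0.0, 0.5, 1.0]
perfect = cycle_consistency_loss(carnatic_sample, to_hindustani, to_carnatic)
imperfect = cycle_consistency_loss(carnatic_sample, to_hindustani,
                                   imperfect_to_carnatic)
print(perfect, imperfect)
```

The lossless round trip scores zero, while the mismatched pair of converters is penalized. During training, a penalty of this kind pushes the two converters to stay faithful to the input while changing its style.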

Generative models have recently become very popular for a wide variety of tasks, and the field continues to evolve. A GAN has two components: a generator and a discriminator. In the context of fictional image generation, the generator is the software that creates the fictional images, improving them by taking feedback from its adversary, the discriminator. The generator behaves like a child or a pupil, and the discriminator works like the parent or teacher who makes the child learn.
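The feedback between the two components can be put in numbers. The sketch below assumes the commonly used binary cross-entropy form of the GAN objective, with the non-saturating variant for the generator. The scores are made-up discriminator outputs (the probability that a sample is real), not results from the work described here.

```python
import math
from typing import List


def discriminator_loss(real_scores: List[float],
                       fake_scores: List[float]) -> float:
    """Low when real samples score near 1 and generated samples near 0."""
    real_term = sum(-math.log(s) for s in real_scores) / len(real_scores)
    fake_term = sum(-math.log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores: List[float]) -> float:
    """Low when the discriminator scores generated samples near 1."""
    return sum(-math.log(s) for s in fake_scores) / len(fake_scores)


# Hypothetical discriminator outputs at two points in training.
real_scores = [0.9, 0.8]
early_fake_scores = [0.1, 0.2]  # generated samples are easily caught
later_fake_scores = [0.6, 0.7]  # generated samples fool the discriminator more

early_g = generator_loss(early_fake_scores)
later_g = generator_loss(later_fake_scores)
early_d = discriminator_loss(real_scores, early_fake_scores)
later_d = discriminator_loss(real_scores, later_fake_scores)
print(early_g > later_g, later_d > early_d)
```

As the generated samples become more convincing, the generator's loss falls and the discriminator's loss rises, which is the adversarial tug-of-war that the pupil-and-teacher analogy describes.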

A more useful application of deep learning in music that Pendyala wanted to experiment with was an inexpensive music tutor that could help the world learn Indian classical music. His graduate students, Nupur Yadav, Chetan Kulkarni and Lokesh Vadlamudi, worked with him to develop a deep learning system that recognizes melodic frameworks in an amateur's vocal rendering and plays the perfected snippet for the same melodic framework, so that amateurs can improve their rendering by listening to it.

The system was deployed using software technologies such as containerization, orchestration, and cloud computing as a proof of concept for possible mass access. The system comes close to being a music tutor, and the article describing it is titled accordingly. The article was published in a Scopus-indexed Elsevier journal on Systems and Software Computing accessible at
