Abstract

We present PyMAiVAR, a toolbox for generating image representations of audio data, including wave plots, spectral centroids, spectral roll-offs, Mel Frequency Cepstral Coefficients (MFCC), MFCC feature scaling, and chromagrams. These audio-image representations are intended to support human action recognition by making fuller use of the information carried in audio data. The package is implemented in Python and can be used across different operating systems.
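As a rough illustration of two of the representations named in the abstract, the sketch below computes a spectral centroid and a spectral roll-off for one frame of a synthetic tone using only the Python standard library. It does not show PyMAiVAR's own API, which this record does not describe. The function names, the 0.85 roll-off fraction, and the test signal are all assumptions chosen for the example.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    return mags


def spectral_centroid(mags, sample_rate, n):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def spectral_rolloff(mags, sample_rate, n, fraction=0.85):
    """Lowest bin frequency at which the cumulative magnitude reaches
    `fraction` of the total (0.85 is an assumed, illustrative value)."""
    threshold = fraction * sum(mags)
    running = 0.0
    for k, m in enumerate(mags):
        running += m
        if running >= threshold:
            return k * sample_rate / n
    return (len(mags) - 1) * sample_rate / n


# Hypothetical test signal: a 1000 Hz sine sampled at 8000 Hz.
# With 256 samples the tone falls exactly on DFT bin 32.
sample_rate = 8000
n = 256
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n)]

mags = magnitude_spectrum(frame)
centroid = spectral_centroid(mags, sample_rate, n)
rolloff = spectral_rolloff(mags, sample_rate, n)
print(f"centroid ~ {centroid:.1f} Hz, roll-off = {rolloff:.1f} Hz")
```

For a pure tone, both values sit at the tone's frequency. Computing them frame by frame over a recording and plotting the results against time gives the kind of image representation the abstract refers to.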

RAS ID

58302

Document Type

Journal Article

Date of Publication

9-1-2023

Volume

17

Funding Information

Edith Cowan University / Australia and Higher Education Commission (HEC) of Pakistan / Australian Government

School

School of Science / School of Engineering

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.

Publisher

Elsevier

Comments

Shaikh, M. B., Chai, D., Islam, S. M. S., & Akhtar, N. (2023). PyMAiVAR: An open-source Python suit for audio-image representation in human action recognition. Software Impacts, 17, article 100544. https://doi.org/10.1016/j.simpa.2023.100544

Link to publisher version (DOI)

10.1016/j.simpa.2023.100544