https://libeldoc.bsuir.by/handle/123456789/33888
Title: | Audio/Speech Coding Based on the Perceptual Sparse Representation of the Signal with DAE Neural Network Quantizer and Near-End Listening Enhancement |
Authors: | Herasimovich, V. Petrovsky, Al. A. Avramov, V. V. Petrovsky, A. |
Keywords: | publications of scientists;audio/speech coding;wavelet packet;matching pursuit;psychoacoustics;neural networks;deep autoencoder;listening enhancement |
Issue Date: | 2018 |
Publisher: | Springer |
Citation: | Audio/Speech Coding Based on the Perceptual Sparse Representation of the Signal with DAE Neural Network Quantizer and Near-End Listening Enhancement / V. Herasimovich et al. // Multimedia and Network Information Systems – MISSI 2018. Advances in Intelligent Systems and Computing. – 2018. – Vol. 833. – P. 109–119. – DOI: 10.1007/978-3-319-98678-4_13. |
Abstract: | The article presents a universal sound coding framework. The encoding algorithm works at the junction of the transform and parametric approaches. The input signal passes through a decorrelation transform, a wavelet packet decomposition (WPD) that is tuned to the perceptual structure of the analyzed signal by psychoacoustic modelling. The parameterization stage is the matching pursuit (MP) algorithm with WPD-based dictionaries. The selected parameters are then quantized and coded for transmission to the decoder. A quantization algorithm based on artificial neural networks with a deep autoencoder (DAE) architecture is presented. The decoding part of the coder has a listening enhancement function. Since the decoder input consists of parameters that are distributed across the subbands, it is only necessary to decompose the noise signal with the corresponding filterbank and estimate the subband gain factor from this information. Results of the conducted research, such as the objective difference grade and a performance demonstration, are shown. |
URI: | https://libeldoc.bsuir.by/handle/123456789/33888 |
Appears in Collections: | Publications in Foreign Editions |
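
The abstract names matching pursuit (MP) as the parameterization stage. As a rough illustration of the greedy selection MP performs, here is a minimal sketch in plain Python. It is not the authors' implementation: the paper uses WPD-based dictionaries tuned by a psychoacoustic model, whereas this toy assumes a fixed four-sample orthonormal Haar basis. The names `HAAR_ATOMS`, `dot`, and `matching_pursuit` are hypothetical and chosen only for this example.

```python
import math

# Assumption: a toy dictionary of four unit-norm Haar atoms of length 4.
# The paper's dictionaries are WPD-based and perceptually tuned; this is
# only a stand-in to show the selection loop.
_S = 1.0 / math.sqrt(2.0)
HAAR_ATOMS = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [_S, -_S, 0.0, 0.0],
    [0.0, 0.0, _S, -_S],
]


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy matching pursuit over unit-norm atoms.

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    """
    residual = list(signal)
    selected = []
    for _ in range(n_atoms):
        # Correlate the current residual with every atom.
        correlations = [dot(residual, atom) for atom in dictionary]
        # Pick the atom that explains the most residual energy.
        best = max(range(len(dictionary)), key=lambda k: abs(correlations[k]))
        coeff = correlations[best]
        if abs(coeff) < 1e-12:
            break  # nothing left to explain
        selected.append((best, coeff))
        # Subtract the chosen atom's contribution from the residual.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return selected, residual


if __name__ == "__main__":
    params, res = matching_pursuit([4.0, 2.0, 1.0, 1.0], HAAR_ATOMS, 2)
    print(params, res)
```

In a coder of the kind the abstract describes, the (index, coefficient) pairs would be the parameters passed on to quantization; the number of atoms kept trades bitrate against residual energy.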
File | Description | Size | Format |
---|---|---|---|
Herasimovich_Audio.pdf | | 84.81 kB | Adobe PDF |