Please use this identifier to cite or link to this item: https://libeldoc.bsuir.by/handle/123456789/33888
Title: Audio/Speech Coding Based on the Perceptual Sparse Representation of the Signal with DAE Neural Network Quantizer and Near-End Listening Enhancement
Authors: Herasimovich, V.
Petrovsky, Al. A.
Avramov, V. V.
Petrovsky, A.
Keywords: scientists' publications;audio/speech coding;wavelet packet;matching pursuit;psychoacoustics;neural networks;deep autoencoder;listening enhancement
Issue Date: 2018
Publisher: Springer
Citation: Audio/Speech Coding Based on the Perceptual Sparse Representation of the Signal with DAE Neural Network Quantizer and Near-End Listening Enhancement / V. Herasimovich [et al.] // Multimedia and Network Information Systems – MISSI 2018. Advances in Intelligent Systems and Computing. – 2018. – Vol. 833. – P. 109–119. – DOI: 10.1007/978-3-319-98678-4_13.
Abstract: The article presents a universal sound coding framework. The encoding algorithm works at the junction of the transform and parametric approaches. The input signal passes through a decorrelation transform, wavelet packet decomposition (WPD), which is tuned to the perceptual structure of the analyzed signal by psychoacoustic modelling. The parameterization stage is the matching pursuit (MP) algorithm with WPD-based dictionaries. The selected parameters are then quantized and coded for transmission to the decoder. A quantization algorithm based on artificial neural networks with a deep autoencoder (DAE) architecture is presented. The decoding part of the coder has a listening enhancement function. Since the decoder input consists of parameters that are distributed across the subbands, it is only necessary to decompose the noise signal with the corresponding filterbank and estimate the subband gain factor from this information. Results of the conducted research, such as the objective difference grade and a performance demonstration, are shown.
URI: https://libeldoc.bsuir.by/handle/123456789/33888
Appears in Collections: Publications in Foreign Editions

Files in This Item:
Herasimovich_Audio.pdf (84.81 kB, Adobe PDF)
