Neuromorphic Audio Coding: Ultra-Low-Power Reconstruction via Bipolar Spiking Neural Networks and Vector Quantization
Continuous audio processing in always-on edge devices is constrained by strict energy budgets, making conventional synchronous codecs inefficient for battery-powered deployment. We introduce AudioSpike, a hybrid neuromorphic audio coding architecture that explicitly decouples transient temporal dynamics from stationary spectral structure. Temporal information is encoded through bipolar Leaky Integrate-and-Fire (LIF) ON/OFF spike populations, while spectral texture is compactly represented using sub-band vector quantization (VQ). Phase reconstruction is initialized from a spike-derived temporal trace and refined via Griffin–Lim, reducing blind phase artifacts while preserving structural coherence. The system is validated in two complementary settings: (i) a native C++ Android implementation providing an end-to-end encode/decode pipeline, and (ii) hardware-in-the-loop execution on SynSense Xylo Audio 3 for in-silicon energy characterization. Experimental results demonstrate energy reductions of up to three orders of magnitude compared to conventional software execution, while maintaining structurally consistent waveform reconstruction suitable for continuous edge monitoring under severe power constraints. These findings support event-driven neuromorphic coding as a viable paradigm for ultra-low-power always-on audio processing in Edge AI systems.
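To make the bipolar ON/OFF encoding idea concrete, the following is a minimal illustrative sketch, not the AudioSpike implementation. It assumes that each LIF neuron is driven by the rectified sample-to-sample change of the input (positive changes feed the ON neuron, negative changes feed the OFF neuron), that the membrane decays by a fixed factor per step, and that a spike resets the membrane by subtracting the threshold. The function name `bipolar_lif_encode` and the parameters `threshold` and `leak` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def bipolar_lif_encode(
    signal: List[float],
    threshold: float = 0.5,
    leak: float = 0.9,
) -> Tuple[List[int], List[int]]:
    """Encode a signal into ON/OFF spike trains using two leaky integrators.

    Assumption (not taken from the paper): the ON neuron integrates positive
    sample-to-sample changes and the OFF neuron integrates negative ones.
    """
    v_on = 0.0   # membrane potential of the ON neuron
    v_off = 0.0  # membrane potential of the OFF neuron
    prev = 0.0   # previous input sample, used to compute the change
    on_spikes: List[int] = []
    off_spikes: List[int] = []

    for x in signal:
        delta = x - prev
        prev = x

        # Leak, then integrate the rectified change on each polarity.
        v_on = leak * v_on + max(delta, 0.0)
        v_off = leak * v_off + max(-delta, 0.0)

        # Fire and reset by subtraction when the threshold is reached.
        if v_on >= threshold:
            on_spikes.append(1)
            v_on -= threshold
        else:
            on_spikes.append(0)

        if v_off >= threshold:
            off_spikes.append(1)
            v_off -= threshold
        else:
            off_spikes.append(0)

    return on_spikes, off_spikes


if __name__ == "__main__":
    # A short pulse: one rising edge followed by one falling edge.
    on, off = bipolar_lif_encode([0.0, 1.0, 1.0, 0.0, 0.0])
    print(on)   # [0, 1, 0, 0, 0]
    print(off)  # [0, 0, 0, 1, 0]
```

In this toy example the rising edge produces a single ON spike and the falling edge a single OFF spike, while the flat segments produce no events. This is the property that makes event-driven encoding attractive for always-on operation: activity, and therefore energy, scales with signal change rather than with the sample rate.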
