BS-PLCNet: Band-split Packet Loss Concealment Network with
Multi-task Learning Framework and Multi-discriminators

Zihan Zhang1,2, Jiayao Sun1, Xianjun Xia2, Chuanzeng Huang2, Yijian Xiao2, Lei Xie1
1Audio, Speech and Language Processing Group (ASLP@NPU), Northwestern Polytechnical University, Xi'an, China
2ByteDance, China

0. Contents

  1. Abstract
  2. Samples of 2024 PLC Challenge blind test set


1. Abstract

Packet loss is a common and unavoidable problem in voice over internet protocol (VoIP) systems. To address it, we propose a band-split packet loss concealment network (BS-PLCNet). Specifically, we split the full-band signal into a wide-band part (0-8 kHz) and a high-band part (8-24 kHz). The wide-band signal is processed by a gated convolutional recurrent network (GCRN), while the high-band signal is processed by a simple GRU network. To ensure high speech quality and automatic speech recognition (ASR) compatibility, we use a multi-task learning (MTL) framework that includes fundamental frequency (f0) prediction and linguistic awareness, together with multiple discriminators. The proposed approach tied for 1st place in the ICASSP 2024 PLC Challenge.
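To make the band split concrete, the following is a minimal sketch of one way to separate a frame into the two bands in the frequency domain. It is not the authors' implementation: the abstract does not say how the split is performed. The 48 kHz sampling rate is inferred from the 24 kHz upper band edge, and the frame size `N_FFT` and the helper names `split_bands` and `merge_bands` are hypothetical choices made for illustration.

```python
import numpy as np

SAMPLE_RATE = 48_000  # assumption: implied by the 24 kHz upper band edge
N_FFT = 960           # hypothetical frame size (20 ms at 48 kHz, 50 Hz per bin)
SPLIT_HZ = 8_000      # wide-band / high-band boundary from the abstract


def split_bands(frame: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split one time-domain frame into wide-band and high-band spectra."""
    spectrum = np.fft.rfft(frame, n=N_FFT)       # N_FFT // 2 + 1 bins covering 0-24 kHz
    split_bin = SPLIT_HZ * N_FFT // SAMPLE_RATE  # index of the 8 kHz bin
    return spectrum[:split_bin], spectrum[split_bin:]


def merge_bands(wide: np.ndarray, high: np.ndarray) -> np.ndarray:
    """Concatenate the two band spectra and return the time-domain frame."""
    return np.fft.irfft(np.concatenate([wide, high]), n=N_FFT)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(N_FFT)
    wide, high = split_bands(frame)
    # In a band-split model, each spectrum would go to its own sub-network
    # (GCRN for the wide band, GRU for the high band) before merging.
    restored = merge_bands(wide, high)
    print(wide.shape, high.shape, np.allclose(restored, frame))
```

With these assumed settings the wide band holds the bins below 8 kHz and the high band holds the bins from 8 kHz up to 24 kHz, so merging the untouched bands reproduces the input frame.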



2. Samples of 2024 PLC Challenge blind test set

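Models      Sample 1   Sample 2   Sample 3   Sample 4
Lossy       [audio]    [audio]    [audio]    [audio]
BS-PLCNet   [audio]    [audio]    [audio]    [audio]

Models      Sample 5   Sample 6   Sample 7   Sample 8
Lossy       [audio]    [audio]    [audio]    [audio]
BS-PLCNet   [audio]    [audio]    [audio]    [audio]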