Improved Chinese Sentence Segmentation with Reinforcement Learning

We propose a technique for learning to split Chinese text into segments that can be translated into English independently, so as to maximise overall translation quality. We cast segmentation as a sequential decision problem and use reinforcement learning to learn a policy that decides, at each candidate split point, whether to split and translate the preceding segment or to continue to the next candidate. This addresses, in a principled way, a long-standing challenge in Chinese–English machine translation: Chinese orthography does not unambiguously mark English-like sentence boundaries. Our RL-based segmentation model improves the baseline BLEU score on the WMT2020 Chinese–English test set by 0.25 points overall, and by nearly 3 points on source segments that contain more than 60 words.
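
To make the sequential decision formulation concrete, the following is a minimal sketch of the split-or-continue loop, not the model described above. Everything named here is an assumption made for illustration: the set of candidate split marks, the `policy` signature, and the toy length-based policy that stands in for a learned one.

```python
from typing import Callable, List, Sequence

# Hypothetical set of candidate split marks; the abstract does not say
# which punctuation marks are treated as candidate split points.
CANDIDATE_MARKS = {"，", "。", "；", "！", "？"}

# A policy sees the segment built so far and the remaining tokens, and
# returns True to split here or False to continue (assumed interface).
Policy = Callable[[Sequence[str], Sequence[str]], bool]


def candidate_points(tokens: Sequence[str]) -> List[int]:
    """Return indices i such that a split may be placed after tokens[i]."""
    return [i for i, tok in enumerate(tokens) if tok in CANDIDATE_MARKS]


def segment(tokens: Sequence[str], policy: Policy) -> List[List[str]]:
    """Walk the candidate split points left to right, asking the policy at
    each one whether to close the current segment or keep extending it."""
    segments: List[List[str]] = []
    start = 0
    for i in candidate_points(tokens):
        current = list(tokens[start:i + 1])
        remaining = list(tokens[i + 1:])
        # No decision is needed at the end of the text: nothing follows.
        if remaining and policy(current, remaining):
            segments.append(current)
            start = i + 1
    if start < len(tokens):
        segments.append(list(tokens[start:]))
    return segments


def min_length_policy(min_len: int) -> Policy:
    """Toy stand-in for a learned policy: split once the current segment
    holds at least min_len tokens."""
    def policy(current: Sequence[str], remaining: Sequence[str]) -> bool:
        return len(current) >= min_len
    return policy


if __name__ == "__main__":
    toks = ["我", "来", "了", "，", "他", "走", "了", "，", "我们", "吃饭", "。"]
    for seg in segment(toks, min_length_policy(5)):
        print("".join(seg))
```

In a reinforcement learning setting, the toy policy would be replaced by a learned one, and the reward would presumably be derived from the translation quality of the resulting segments, for example a BLEU-based score; that reward design is an assumption here, not a detail given in the text.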