Awesome Simultaneous Translation

Eric Lina · September 6, 2022

This repository collects the toolkits, common datasets, and paper list related to research on Simultaneous Translation. It is continuously updated.

We would be honored if this repository helps or serves as a reference for your research. 😊

Toolkits

  • Fairseq: a sequence modeling toolkit covering machine translation, speech translation, and simultaneous translation (both text-to-text and speech-to-text).
  • SimulEval: a general evaluation framework for simultaneous translation on text and speech.

Datasets

  • Conventional text-to-text translation datasets:
    • IWSLT15 English-Vietnamese: 133K sentence pairs. [Link]
    • WMT15 German-English: 4.5M sentence pairs. [Link]
    • WMT14 English-French: 36.3M sentence pairs. [Link]
  • Conventional speech-to-text translation datasets:
    • MuST-C: multilingual speech-to-text translation corpus with 8 language pairs. [Link]
  • Simultaneous interpretation datasets:
    • BSTC Chinese-English: 68 hours. [Link]
    • NAIST-SIC English-Japanese: 22 hours. [Link]

Tutorials & Talks

PACLIC 2016: The Challenge of Simultaneous Speech Translation. Anoop Sarkar. [Link]

EMNLP 2020: Simultaneous Translation. Liang Huang, Colin Cherry, Mingbo Ma, Naveen Arivazhagan, and Zhongjun He. [Link]

AMTA 2020: Simultaneous Speech Translation in Google Translate. Jeff Pitman. [Link]

Paper List

This is a paper list of Simultaneous Translation, organized by publication year.

We also maintain a paper list organized by category. Refer to Here.

2002 2006 2007 2009 2010 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022

<a id="2002">2002 </a>

  • Translation Unit Concerning Timing of Simultaneous Translation. LREC 2002. [PDF]

<a id="2006">2006 </a>

  • Simultaneous English-Japanese Spoken Language Translation Based on Incremental Dependency Parsing and Transfer. ACL 2006. [PDF]

<a id="2007">2007 </a>

  • Simultaneous translation of lectures and speeches. Mach Translat 2007. [PDF]

<a id="2009">2009 </a>

  • End-to-End Evaluation in Simultaneous Translation. EACL 2009. [PDF]

<a id="2010">2010 </a>

  • Stream-based Translation Models for Statistical Machine Translation. NAACL 2010. [PDF]
  • Construction of Chunk-Aligned Bilingual Lecture Corpus for Simultaneous Machine Translation. LREC 2010. [PDF]

<a id="2012">2012 </a>

  • Real-time Incremental Speech-to-Speech Translation of Dialogs. NAACL 2012. [PDF]

<a id="2013">2013 </a>

  • Incremental Segmentation and Decoding Strategies for Simultaneous Translation. IJCNLP 2013. [PDF]

<a id="2014">2014 </a>

  • Optimizing Segmentation Strategies for Simultaneous Speech Translation. ACL 2014. [PDF]
  • Collection of a Simultaneous Translation Corpus for Comparative Analysis. LREC 2014. [PDF]
  • Don’t Until the Final Verb Wait: Reinforcement Learning for Simultaneous Machine Translation. EMNLP 2014. [PDF]
  • Towards Simultaneous Interpreting: the Timing of Incremental Machine Translation and Speech Synthesis. IWSLT 2014. [PDF]
  • Segmentation Strategies for Streaming Speech Translation. NAACL 2014. [PDF]

<a id="2015">2015 </a>

  • Automated Simultaneous Interpretation: Hints of a Cognitive Framework for Machine Translation. HyTra 2015. [PDF]
  • Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic Constituents. ACL 2015. [PDF]
  • Syntax-based Rewriting for Simultaneous Machine Translation. EMNLP 2015. [PDF]

<a id="2016">2016 </a>

  • An Efficient and Effective Online Sentence Segmenter for Simultaneous Interpretation. WAT 2016. [PDF]
  • Interpretese vs. Translationese: The Uniqueness of Human Strategies in Simultaneous Interpretation. NAACL 2016. [PDF] [Code]
  • Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons. LREC 2016. [PDF]
  • A Prototype Automatic Simultaneous Interpretation System. COLING 2016. [PDF]
  • Simultaneous Machine Translation using Deep Reinforcement Learning. ICML 2016. [PDF]
  • Can Neural Machine Translation do Simultaneous Translation? Arxiv 2016. [PDF]

<a id="2017">2017 </a>

  • Online and Linear-Time Attention by Enforcing Monotonic Alignments. ICML 2017. [PDF] [Code]
  • Learning to Translate in Real-time with Neural Machine Translation. EACL 2017. [PDF] [Code]

<a id="2018">2018 </a>

  • Simultaneous Translation using Optimized Segmentation. AMTA 2018. [PDF]
  • Automatic Estimation of Simultaneous Interpreter Performance. ACL 2018. [PDF] [Code]
  • Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation. NAACL 2018. [PDF] [Code]
  • Statistical Analysis of Missing Translation in Simultaneous Interpretation Using A Large-scale Bilingual Speech Corpus. LREC 2018. [PDF]
  • Prediction Improves Simultaneous Neural Machine Translation. EMNLP 2018. [PDF] [Code]
  • KIT Lecture Translator: Multilingual Speech Translation with One-Shot Learning. COLING 2018. [PDF]
  • Monotonic Chunkwise Attention. ICLR 2018. [PDF] [Code]

<a id="2019">2019 </a>

  • Monotonic Infinite Lookback Attention for Simultaneous Machine Translation. ACL 2019. [PDF]
  • STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework. ACL 2019. [PDF]
  • Simultaneous Translation with Flexible Policy via Restricted Imitation Learning. ACL 2019. [PDF]
  • Lost in Interpretation: Predicting Untranslated Terminology in Simultaneous Interpretation. NAACL 2019. [PDF] [Code]
  • Simpler and Faster Learning of Adaptive Policies for Simultaneous Translation. EMNLP 2019. [PDF]
  • Speculative Beam Search for Simultaneous Translation. EMNLP 2019. [PDF]
  • Thinking Slow about Latency Evaluation for Simultaneous Machine Translation. Arxiv 2019. [PDF]
  • DuTongChuan: Context-aware Translation Model for Simultaneous Interpreting. Arxiv 2019. [PDF]
  • Simultaneous Neural Machine Translation using Connectionist Temporal Classification. Arxiv 2019. [PDF]

<a id="2020">2020 </a>

  • Towards Multimodal Simultaneous Neural Machine Translation. WMT 2020. [PDF] [Code]
  • Opportunistic Decoding with Timely Correction for Simultaneous Translation. ACL 2020. [PDF]
  • Simultaneous Translation Policies: From Fixed to Adaptive. ACL 2020. [PDF]
  • SimulSpeech: End-to-End Simultaneous Speech to Text Translation. ACL 2020. [PDF]
  • Monotonic Multihead Attention. ICLR 2020. [PDF] [Code]
  • Learning Adaptive Segmentation Policy for Simultaneous Translation. EMNLP 2020. [PDF]
  • Simultaneous Machine Translation with Visual Context. EMNLP 2020. [PDF] [Code]
  • Direct Segmentation Models for Streaming Speech Translation. EMNLP 2020. [PDF] [Code]
  • SIMULEVAL: An Evaluation Toolkit for Simultaneous Translation. EMNLP 2020. [PDF] [Code]
  • Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework. EMNLP 2020 findings. [PDF]
  • Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training. EMNLP 2020 findings. [PDF]
  • A General Framework for Adaptation of Neural Machine Translation to Simultaneous Translation. AACL 2020. [PDF]
  • SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation. AACL 2020. [PDF] [Code]
  • Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation. ICASSP 2020. [PDF]
  • Efficient Wait-k Models for Simultaneous Machine Translation. InterSpeech 2020. [PDF] [Code]
  • Presenting Simultaneous Translation in Limited Space. Arxiv 2020. [PDF]
  • Simultaneous Speech-to-Speech Translation System with Neural Incremental ASR, MT, and TTS. Arxiv 2020. [PDF]
  • Low Latency ASR for Simultaneous Speech Translation. Arxiv 2020. [PDF]

<a id="2021">2021 </a>

  • Monotonic Simultaneous Translation with Chunk-wise Reordering and Refinement. WMT 2021. [PDF]
  • Simultaneous Neural Machine Translation with Constituent Label Prediction. WMT 2021. [PDF]
  • Future-Guided Incremental Transformer for Simultaneous Translation. AAAI 2021. [PDF]
  • Studying The Impact Of Document-level Context On Simultaneous Neural Machine Translation. Machine Translation 2021. [PDF]
  • Beyond Sentence-Level End-to-End Speech Translation: Context Helps. ACL 2021. [PDF] [Code]
  • RealTranS: End-to-End Simultaneous Speech Translation with Convolutional Weighted-Shrinking Transformer. ACL 2021 findings. [PDF]
  • Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR. ACL 2021 findings. [PDF]
  • Multilingual Simultaneous Neural Machine Translation. ACL 2021 findings. [PDF]
  • Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy. EMNLP 2021. [PDF] [Code]
  • Cross Attention Augmented Transducer Networks for Simultaneous Translation. EMNLP 2021. [PDF] [Code]
  • Translation-based Supervision for Policy Generation in Simultaneous Neural Machine Translation. EMNLP 2021. [PDF] [Code]
  • Improving Simultaneous Translation by Incorporating Pseudo-References with Fewer Reorderings. EMNLP 2021. [PDF]
  • A Generative Framework for Simultaneous Machine Translation. EMNLP 2021. [PDF]
  • It Is Not As Good As You Think! Evaluating Simultaneous Machine Translation on Interpretation Data. EMNLP 2021. [PDF] [Code]
  • Stream-level Latency Evaluation for Simultaneous Machine Translation. EMNLP 2021 findings. [PDF] [Code]
  • MiSS: An Assistant for Multi-Style Simultaneous Translation. EMNLP 2021 Demo. [PDF]
  • Learning Coupled Policies for Simultaneous Machine Translation using Imitation Learning. EACL 2021. [PDF] [Code]
  • Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation. EACL 2021. [PDF] [Code]
  • An Empirical Study Of End-To-End Simultaneous Speech Translation Decoding Strategies. ICASSP 2021. [PDF]
  • Streaming Simultaneous Speech Translation With Augmented Memory Transformer. ICASSP 2021. [PDF]
  • Impact of Encoding and Segmentation Strategies on End-to-End Simultaneous Speech Translation. Interspeech 2021. [PDF]
  • Visualization: the missing factor in Simultaneous Speech Translation. CLIC-it 2021. [PDF]
  • UniST: Unified End-to-end Model for Streaming and Non-streaming Speech Translation. Arxiv 2021. [PDF]
  • Faster Re-translation Using Non-Autoregressive Model For Simultaneous Neural Machine Translation. Arxiv 2021. [PDF]
  • Learning to Use Future Information in Simultaneous Translation. Arxiv 2021. [PDF]
  • Simultaneous Multi-Pivot Neural Machine Translation. Arxiv 2021. [PDF]
  • Full-Sentence Models Perform Better in Simultaneous Translation Using the Information Enhanced Decoding Strategy. Arxiv 2021. [PDF]
  • Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems. Arxiv 2021. [PDF]
  • Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention. Arxiv 2021. [PDF]

<a id="2022">2022 </a>

  • Modeling Dual Read/Write Paths for Simultaneous Machine Translation. ACL 2022. [PDF] [Code]
  • Reducing Position Bias in Simultaneous Machine Translation with Length-Aware Framework. ACL 2022. [PDF]
  • From Simultaneous to Streaming Machine Translation by Leveraging Streaming History. ACL 2022. [PDF]
  • Learning When to Translate for Streaming Speech. ACL 2022. [PDF] [Code]
  • Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation. ACL 2022. [PDF]
  • Gaussian Multi-head Attention for Simultaneous Machine Translation. ACL 2022 findings. [PDF] [Code]
  • Language Model Augmented Monotonic Attention for Simultaneous Translation. NAACL 2022. [PDF]

Workshops

IWSLT 2020

  • ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020. [PDF]
  • Start-Before-End and End-to-End: Neural Speech Translation by AppTek and RWTH Aachen University. [PDF]
  • KIT’s IWSLT 2020 SLT Translation System. [PDF]
  • End-to-End Simultaneous Translation System for IWSLT2020 Using Modality Agnostic Meta-Learning. [PDF]
  • ELITR Non-Native Speech Translation at IWSLT 2020. [PDF]
  • Re-translation versus Streaming for Simultaneous Translation. [PDF]
  • Towards Stream Translation: Adaptive Computation Time for Simultaneous Machine Translation. [PDF]
  • Neural Simultaneous Speech Translation Using Alignment-Based Chunking. [PDF]

AutoSimTrans 2020

  • Dynamic Sentence Boundary Detection for Simultaneous Translation. [PDF]

ASLTRW 2021

  • Operating a Complex SLT System with Speakers and Human Interpreters. [PDF]
  • Simultaneous Speech Translation for Live Subtitling: from Delay to Display. [PDF]

IWSLT 2021

  • The USTC-NELSLIP Systems for Simultaneous Speech Translation Task at IWSLT 2021. [PDF]
  • NAIST English-to-Japanese Simultaneous Translation System for IWSLT 2021 Simultaneous Text-to-text Task. [PDF]
  • The University of Edinburgh’s Submission to the IWSLT21 Simultaneous Translation Task. [PDF]
  • Without Further Ado: Direct and Simultaneous Speech Translation by AppTek in 2021. [PDF]
  • The Volctrans Neural Speech Translation System for IWSLT 2021. [PDF]
  • Large-Scale English-Japanese Simultaneous Interpretation Corpus: Construction and Analyses with Sentence-Aligned Data. [PDF]
  • Towards the evaluation of automatic simultaneous speech translation from a communicative perspective. [PDF]
  • Tag Assisted Neural Machine Translation of Film Subtitles. [PDF]

AutoSimTrans 2021

  • ICT’s System for AutoSimTrans 2021: Robust Char-Level Simultaneous Translation. [PDF]
  • BIT’s system for AutoSimulTrans2021. [PDF]
  • XMU’s Simultaneous Translation System at NAACL 2021. [PDF]
  • System Description on Automatic Simultaneous Translation Workshop. [PDF]
  • BSTC: A Large-Scale Chinese-English Speech Translation Dataset. [PDF]

IWSLT 2022

  • Simultaneous Neural Machine Translation with Prefix Alignment. [PDF]
  • Anticipation-Free Training for Simultaneous Machine Translation. [PDF]
  • The AISP-SJTU Simultaneous Translation System for IWSLT 2022. [PDF]
  • The Xiaomi Text-to-Text Simultaneous Speech Translation System for IWSLT 2022. [PDF]
  • The HW-TSC’s Simultaneous Speech Translation System for IWSLT 2022 Evaluation. [PDF]
  • MLLP-VRAIN UPV systems for the IWSLT 2022 Simultaneous Speech Translation and Speech-to-Speech Translation tasks. [PDF]
  • CUNI-KIT System for Simultaneous Speech Translation Task at IWSLT 2022. [PDF]
  • NAIST Simultaneous Speech-to-Text Translation System for IWSLT 2022. [PDF]

Contact

If this repository is helpful to you, please ⭐️ it!

If you have any suggestions, or if some related papers are not included, feel free to contact me at zhangshaolei20z@ict.ac.cn.

Shaolei Zhang

Ph.D. Student Candidate (2020 ~ 2025)

Key Lab. of Intelligent Information Processing

Institute of Computing Technology, Chinese Academy of Sciences
