Abstract
As mentioned earlier, dual learning has been studied and applied in many applications, including machine translation, image translation, speech processing, text summarization, and code generation and commenting. In this chapter, we focus on machine translation, the first application in which dual learning was studied and also one of the applications it fits best. We introduce several representative works based on the dual reconstruction principle for semi-supervised and unsupervised neural machine translation.
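The dual reconstruction principle underlying these works (He et al., 2016) can be illustrated with a minimal sketch: a source sentence is translated forward by one model, translated back by the dual model, and the agreement between the original and the reconstruction provides a feedback signal that requires no parallel data. The word-level dictionary "translators" below are hypothetical stand-ins for real neural translation models, used only to make the loop concrete.

```python
# Toy sketch of the dual reconstruction loop: x -> y_hat -> x_hat.
# FWD/BWD are hypothetical word-level translation tables standing in
# for a forward and a backward NMT model.
FWD = {"hello": "bonjour", "world": "monde"}   # en -> fr (assumed)
BWD = {v: k for k, v in FWD.items()}           # fr -> en

def translate(sentence, table):
    """Translate word by word, passing unknown words through unchanged."""
    return " ".join(table.get(w, w) for w in sentence.split())

def reconstruction_reward(x, fwd, bwd):
    """Fraction of source words recovered after a round trip."""
    y_hat = translate(x, fwd)       # (noisy) forward translation
    x_hat = translate(y_hat, bwd)   # reconstruction of the source
    src, rec = x.split(), x_hat.split()
    return sum(a == b for a, b in zip(src, rec)) / len(src)

# A perfect round trip yields reward 1.0; in actual training this
# signal is fed back to both models, e.g. via policy gradient.
print(reconstruction_reward("hello world", FWD, BWD))
```

In the real semi-supervised and unsupervised systems discussed in this chapter, the reward additionally incorporates a language-model score of the intermediate translation, and both directions are updated jointly.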
Notes
- 1. We focus on text translation in this chapter.
- 2.
- 3. 1+ million bilingual sentence pairs were used to pre-train the two models in [15].
- 4.
- 5. Since neither translation model is perfect, we can only get a noisy translation for a sentence.
- 6. The public implementation can be found at https://github.com/artetxem/vecmap.
References
Artetxe, M., Labaka, G., & Agirre, E. (2017). Learning bilingual word embeddings with (almost) no bilingual data. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (pp. 451–462).
Artetxe, M., Labaka, G., Agirre, E., & Cho, K. (2018). Unsupervised neural machine translation. In 6th International Conference on Learning Representations.
Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In 3rd International Conference on Learning Representations, ICLR 2015.
Brown, P. F., Cocke, J., Pietra, S. A. D., Pietra, V. J. D., Jelinek, F., Lafferty, J., et al. (1990). A statistical approach to machine translation. Computational Linguistics, 16(2), 79–85.
Cao, R., Zhu, S., Liu, C., Li, J., & Yu, K. (2019). Semantic parsing with dual learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 51–64).
Cao, R., Zhu, S., Yang, C., Liu, C., Ma, R., Zhao, Y., et al. (2020). Unsupervised dual paraphrasing for two-stage semantic parsing. Preprint. arXiv:2005.13485.
Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., et al. (2014). Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1724–1734).
Conneau, A., Lample, G., Ranzato, M., Denoyer, L., & Jégou, H. (2017). Word translation without parallel data. Preprint. arXiv:1710.04087.
Dietterich, T. G. (2002). Ensemble learning. In The Handbook of Brain Theory and Neural Networks (2nd ed., pp. 110–125). MIT Press.
Edunov, S., Ott, M., Auli, M., & Grangier, D. (2018). Understanding back-translation at scale. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 489–500).
Gehring, J., Auli, M., Grangier, D., Yarats, D., & Dauphin, Y. N. (2017). Convolutional sequence to sequence learning. In Proceedings of the 34th International Conference on Machine Learning (Vol. 70, pp. 1243–1252). JMLR.org.
Gulcehre, C., Firat, O., Xu, K., Cho, K., Barrault, L., Lin, H.-C., et al. (2015). On using monolingual corpora in neural machine translation. Preprint. arXiv:1503.03535.
Gulcehre, C., Firat, O., Xu, K., Cho, K., & Bengio, Y. (2017). On integrating a language model into neural machine translation. Computer Speech & Language, 45, 137–148.
Hassan Awadalla, H., Aue, A., Chen, C., Chowdhary, V., Clark, J., Federmann, C., et al. (2018). Achieving human parity on automatic Chinese to English news translation. Preprint. arXiv:1803.05567.
He, D., Xia, Y., Qin, T., Wang, L., Yu, N., Liu, T.-Y., et al. (2016). Dual learning for machine translation. In Advances in Neural Information Processing Systems (pp. 820–828).
Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. Preprint. arXiv:1503.02531.
Jia, R., & Liang, P. (2016). Data recombination for neural semantic parsing. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 12–22).
Kim, Y., & Rush, A. M. (2016). Sequence-level knowledge distillation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp. 1317–1327).
Koehn, P. (2009). Statistical machine translation. New York: Cambridge University Press.
Lample, G., Conneau, A., Denoyer, L., & Ranzato, M. (2018). Unsupervised machine translation using monolingual corpora only. In 6th International Conference on Learning Representations, ICLR 2018.
Lample, G., Ott, M., Conneau, A., Denoyer, L., & Ranzato, M. (2018). Phrase-based & neural unsupervised machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31–November 4, 2018 (pp. 5039–5049).
Luo, F., Li, P., Yang, P., Zhou, J., Tan, Y., Chang, B., et al. (2019). Towards fine-grained text sentiment transfer. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 2013–2022).
Luo, F., Li, P., Zhou, J., Yang, P., Chang, B., Sun, X., et al. (2019). A dual reinforcement learning framework for unsupervised text style transfer. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (pp. 5116–5122). AAAI Press.
Meng, C., Ren, P., Chen, Z., Sun, W., Ren, Z., Tu, Z., et al. (2020). DukeNet: A dual knowledge interaction network for knowledge-grounded conversation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1151–1160).
Mikolov, T., Karafiát, M., Burget, L., Černockỳ, J., & Khudanpur, S. (2010). Recurrent neural network based language model. In Eleventh Annual Conference of the International Speech Communication Association.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems (pp. 3111–3119).
Nirenburg, S. (1989). Knowledge-based machine translation. Machine Translation, 4(1), 5–24.
Nirenburg, S., Carbonell, J., Tomita, M., & Goodman, K. (1994). Machine translation: A knowledge-based approach. San Mateo, CA: Morgan Kaufmann Publishers Inc.
Ranzato, M., Chopra, S., Auli, M., & Zaremba, W. (2015) Sequence level training with recurrent neural networks. Preprint. arXiv:1511.06732.
Rush, A. M., Chopra, S., & Weston, J. (2015). A neural attention model for abstractive sentence summarization. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 379–389).
Sennrich, R., Haddow, B., & Birch, A. (2016). Improving neural machine translation models with monolingual data. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 86–96).
Sestorain, L., Ciaramita, M., Buck, C., & Hofmann, T. (2018). Zero-shot dual machine translation. Preprint. arXiv:1805.10338.
Shen, L., & Feng, Y. (2020). CDL: Curriculum dual learning for emotion-controllable response generation. Preprint. arXiv:2005.00329.
Su, S.-Y., Huang, C.-W., & Chen, Y.-N. (2020). Towards unsupervised language understanding and generation by joint dual learning. In ACL 2020: 58th Annual Meeting of the Association for Computational Linguistics (pp. 671–680).
Sundermeyer, M., Schlüter, R., & Ney, H. (2012). LSTM neural networks for language modeling. In Thirteenth Annual Conference of the International Speech Communication Association.
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems (pp. 3104–3112).
Sutton, R. S., McAllester, D. A., Singh, S. P., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems, (pp. 1057–1063).
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., et al. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (pp. 5998–6008).
Wang, Y., Xia, Y., He, T., Tian, F., Qin, T., Zhai, C. X., et al. (2019). Multi-agent dual learning. In 7th International Conference on Learning Representations, ICLR 2019.
Yang, M., Zhao, Z., Zhao, W., Chen, X., Zhu, J., Zhou, L., et al. (2017). Personalized response generation via domain adaptation. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1021–1024).
Zelle, J. M., & Mooney, R. J. (1996). Learning to parse database queries using inductive logic programming. In Proceedings of the National Conference on Artificial Intelligence (pp. 1050–1055).
Zhang, S., & Bansal, M. (2019). Addressing semantic drift in question generation for semi-supervised question answering. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 2495–2509).
Zhou, Z.-H. (2012). Ensemble methods: Foundations and algorithms. New York: CRC Press.
Zhu, S., Cao, R., & Yu, K. (2020). Dual learning for semi-supervised natural language understanding. IEEE Transactions on Audio, Speech, and Language Processing, 28, 1936–1947.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Qin, T. (2020). Dual Learning for Machine Translation and Beyond. In: Dual Learning. Springer, Singapore. https://doi.org/10.1007/978-981-15-8884-6_4
DOI: https://doi.org/10.1007/978-981-15-8884-6_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8883-9
Online ISBN: 978-981-15-8884-6