Linguistic Individuality Transformation for Spoken Language

Mizukami, Masahiro; Neubig, Graham; Sakti, Sakriani; Toda, Tomoki; Nakamura, Satoshi

doi:10.1007/978-3-319-19291-8_13

Abstract

In text and speech, various features express the individuality of the writer or speaker. In this paper, we take a step towards dialogue systems that consider this individuality by proposing a method for transforming individuality using a technique inspired by statistical machine translation (SMT). However, a parallel corpus with identical semantic content but different individuality is difficult to find, which precludes the use of standard SMT techniques. We therefore focus on methods for creating a translation model (TM) using techniques from the paraphrasing literature, and a language model (LM) by combining small amounts of individuality-rich data with larger amounts of background text. We perform automatic and manual evaluations comparing the effectiveness of three types of TM construction techniques, and find that the proposed system using a method focusing on a limited set of function words is most effective and can transform individuality to a degree that is both noticeable and identifiable.
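
As a rough illustration of the LM idea in the abstract (combining a small amount of individuality-rich data with a larger amount of background text), the following Python sketch linearly interpolates two add-one-smoothed unigram models. The abstract does not say how the two sources are combined; linear interpolation, the unigram simplification, the toy sentences, and the names unigram_probs, interpolate, and weight are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def unigram_probs(sentences, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(word for sent in sentences for word in sent.split())
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(p_speaker, p_background, weight):
    """Linear interpolation: weight * speaker + (1 - weight) * background."""
    return {
        w: weight * p_speaker[w] + (1 - weight) * p_background[w]
        for w in p_speaker
    }


# Toy data (hypothetical): a small speaker-specific set and a larger background set.
speaker_data = ["well I reckon so", "I reckon not"]
background_data = ["I think so", "I think not", "well I suppose so", "so it is"]
vocab = sorted({w for s in speaker_data + background_data for w in s.split()})

p_speaker = unigram_probs(speaker_data, vocab)
p_background = unigram_probs(background_data, vocab)

# weight is an assumed mixing coefficient; in practice it would be tuned on held-out data.
p_mixed = interpolate(p_speaker, p_background, weight=0.5)
```

In this toy setting the mixed model assigns the speaker-specific word "reckon" more probability than the background model alone, while the background data still covers words the speaker data never contains.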

Notes

1. This Japanese paraphrase model will be made available upon acceptance of the paper.
2. http://www.eijiro.jp.
3. Verbosity is one component of individuality, so setting λ_wp to a different value for each source/target speaker pair is more appropriate, but we leave this to future work.

Author information

Authors and Affiliations

NAIST, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan
Masahiro Mizukami, Graham Neubig, Sakriani Sakti, Tomoki Toda & Satoshi Nakamura

Corresponding author

Correspondence to Masahiro Mizukami.

Editor information

Editors and Affiliations

Department of Computer Science and Engin, Pohang University of Science & Tech, Namgu, Pohang, Korea (Republic of)
G.G. Lee
School of Information and Communications, Gwangju Institute of Science and Tech, Buk-gu, Gwangju, Korea (Republic of)
H.K. Kim
Microsoft Corporation, Redmond, Washington, USA
M. Jeong
Dept of Computer Science and Engineering, Sogang University, Mapo-gu, Seoul, Korea (Republic of)
J.-H. Kim

About this chapter

Cite this chapter

Mizukami, M., Neubig, G., Sakti, S., Toda, T., Nakamura, S. (2015). Linguistic Individuality Transformation for Spoken Language. In: Lee, G., Kim, H., Jeong, M., Kim, JH. (eds) Natural Language Dialog Systems and Intelligent Assistants. Springer, Cham. https://doi.org/10.1007/978-3-319-19291-8_13

DOI: https://doi.org/10.1007/978-3-319-19291-8_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19290-1
Online ISBN: 978-3-319-19291-8
eBook Packages: Computer Science, Computer Science (R0)