
Abstract

In text and speech, various features express the individuality of the writer or speaker. In this paper, we take a step towards dialogue systems that account for this individuality by proposing a method for transforming individuality using a technique inspired by statistical machine translation (SMT). However, a parallel corpus with identical semantic content but different individuality is difficult to find, which precludes the use of standard SMT techniques. We therefore focus on methods for creating a translation model (TM) using techniques from the paraphrasing literature, and a language model (LM) by combining small amounts of individuality-rich data with larger amounts of background text. We perform automatic and manual evaluations comparing the effectiveness of three types of TM construction techniques, and find that the proposed system using a method that focuses on a limited set of function words is most effective and can transform individuality to a degree that is both noticeable and identifiable.
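As a rough illustration of building an LM from a small amount of individuality-rich data plus a larger background corpus, the sketch below mixes two unigram models by linear interpolation. The abstract does not say how the two data sources are combined, so the interpolation scheme, the add-alpha smoothing, the toy sentences, and the names `unigram_lm`, `interpolate`, and `weight` are all assumptions made for this example.

```python
from collections import Counter


def unigram_lm(sentences, vocab, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def interpolate(p_target, p_background, weight):
    """Linear interpolation: weight * target + (1 - weight) * background."""
    return {w: weight * p_target[w] + (1 - weight) * p_background[w]
            for w in p_target}


# Toy data (hypothetical): a few utterances from the target speaker
# and a somewhat larger background corpus.
target_text = ["well i reckon so", "i reckon not"]
background_text = ["i think so", "i think it is fine", "so it is"]

vocab = sorted({w for s in target_text + background_text for w in s.split()})
p_target = unigram_lm(target_text, vocab)
p_background = unigram_lm(background_text, vocab)
p_mixed = interpolate(p_target, p_background, weight=0.5)
```

In this toy setup the mixed model gives the speaker-specific word "reckon" more probability than the background model alone, while still covering background words such as "think" that the target data never contains.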

Notes

  1. This Japanese paraphrase model will be made available upon acceptance of the paper.

  2. http://www.eijiro.jp.

  3. Verbosity is one component of individuality, so setting λ_wp to a different value for each source/target speaker pair is more appropriate, but we leave this to future work.
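To make the role of the weight in note 3 concrete, the sketch below scores two candidate outputs with a generic log-linear model of the kind used in phrase-based SMT, where a word-penalty term scales with output length. This is an assumption-laden illustration rather than the chapter's actual model: the feature set, the sign convention (a larger `lambda_wp` favours longer output), the candidate probabilities, and the names `loglinear_score` and `best` are all hypothetical.

```python
import math


def loglinear_score(log_p_tm, log_p_lm, num_words,
                    lambda_tm=1.0, lambda_lm=1.0, lambda_wp=0.0):
    """Weighted sum of TM and LM log-probabilities plus a length term."""
    return lambda_tm * log_p_tm + lambda_lm * log_p_lm + lambda_wp * num_words


# Hypothetical candidates: (log P_tm, log P_lm, number of words).
candidates = {
    "short": (math.log(0.20), math.log(0.10), 3),
    "long": (math.log(0.15), math.log(0.05), 6),
}


def best(lambda_wp):
    """Return the name of the highest-scoring candidate for a given weight."""
    return max(candidates,
               key=lambda name: loglinear_score(*candidates[name],
                                                lambda_wp=lambda_wp))
```

With these made-up numbers, `best(0.0)` selects the shorter candidate and `best(0.5)` selects the longer one, which is the sense in which a per-speaker-pair weight could capture differences in verbosity.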

Author information

Corresponding author

Correspondence to Masahiro Mizukami.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Mizukami, M., Neubig, G., Sakti, S., Toda, T., Nakamura, S. (2015). Linguistic Individuality Transformation for Spoken Language. In: Lee, G., Kim, H., Jeong, M., Kim, JH. (eds) Natural Language Dialog Systems and Intelligent Assistants. Springer, Cham. https://doi.org/10.1007/978-3-319-19291-8_13

  • DOI: https://doi.org/10.1007/978-3-319-19291-8_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19290-1

  • Online ISBN: 978-3-319-19291-8

  • eBook Packages: Computer Science (R0)
