About me

My name is Yusuke Matsubara. I am a PhD candidate working at the Social ICT Research Center at The University of Tokyo in Tokyo, Japan.

I have worked on statistical modeling of natural language and its applications to support document authoring. Recent projects include semantic parsing of Japanese clinical texts and quantitative analysis of Wikipedia contributors. My research interests also extend to other areas of computational linguistics.

In my spare time, I enjoy playing the piano, learning about languages (both natural and artificial), and contributing to the free software and free content movements. If you are interested in those activities, see also my other online presence.


  • Matsubara, Yusuke and Jun'ichi Tsujii. Large-vocabulary lexical choice with rich context features. International Journal of Computational Linguistics and Applications, vol. 2, no. 1-2, pp. 9-24. 2011.
  • Matsubara, Yusuke, Jun Ogata and Masataka Goto. Improvements on Podcast Speech Recognition: Language modeling with Web Keywords maintained by Mass Knowledge (Original title: ポッドキャスト音声認識の性能向上手法: 集合知によって更新されるWebキーワードを活用した言語モデリング). IPSJ SIG Technical Report on Spoken Language Processing, 2008-SLP-71-6, 2008(46), pp. 39-44, Information Processing Society of Japan, May 2008. (in Japanese).


  • Growthring - a Scala implementation of k-repeating substrings for efficiently identifying less frequent substrings. Appears at PACLIC 2014.


Please feel free to e-mail me at:
yusuke (at) matsubara.name
