Yusuke Matsubara
I'm Yusuke Matsubara.
I have worked on statistical modeling of natural language and its applications to document authoring support. Recent projects include mining less frequent patterns in text, semantic parsing of Japanese clinical texts, and quantitative analysis of Wikipedia contributors. My research interests extend to computational linguistics at large.
In my spare time, I enjoy contributing to the free software and free content movements. If you are interested in those activities, see also my other online presence.
Publications
(refereed)
Matsubara, Yusuke and Koiti Hasida. "K-repeating Substrings: a String-Algorithmic Approach to Privacy-Preserving Publishing of Textual Data". [slides] Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing (PACLIC 28). 2014.
Matsubara, Yusuke and Jun'ichi Tsujii. "Large-vocabulary lexical choice with rich context features". International Journal of Computational Linguistics and Applications, vol. 2, no. 1-2, pp. 9--24. 2011.
Wailok Tam, Koiti Hasida, Yusuke Matsubara, Eiji Aramaki, Mai Miyabe, Motoyuki Takaai, Hirosi Uozaki and Yo Sato. "Proper and Efficient Treatment of Anaphora and Long-Distance Dependency in Context-Free Grammar". Proceedings of the First Workshop on Natural Language Processing for Medical and Healthcare Fields. 2013.
(not refereed)
Matsubara, Yusuke, Mizuki Morita and Koiti Hasida. "BARY at the NTCIR-11 MedNLP-2 Task for Complaints and Diagnosis Recognition". [slides] Proceedings of the NII Testbeds and Community for Information access Research (NTCIR-11). 2014.
Wai Lok Tam, 松原勇介, 橋田浩一, 鷹合基行, 荒牧英治, 宇於崎弘. "Linking a Grammar to an Ontology". 言語処理学会第19回年次大会(NLP2013). 2013.
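松原勇介, 宮尾祐介, 辻井潤一. "Context-Sensitive Lexical Choice from Large-Vocabulary Synonym Sets" (Original title: "大語彙の同義語集合からの文脈に応じた語彙選択"). 言語処理学会第16回年次大会(NLP2010). 2010.
岡野原大輔, 松原勇介, 辻井潤一. "Applying Hierarchical Tree Language Models to Speech Recognition" (Original title: "階層木言語モデルの音声認識への適用"). 日本音響学会2009年春季研究発表会. 2009.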
Matsubara, Yusuke, Jun Ogata and Masataka Goto. "Improvements on Podcast Speech Recognition: Language modeling with Web Keywords maintained by Mass Knowledge" (Original title: "ポッドキャスト音声認識の性能向上手法: 集合知によって更新されるWebキーワードを活用した言語モデリング"). 音声言語情報処理研究会 研究報告 2008-SLP-71-6. 2008(46). pp. 39--44. 情報処理学会. May 2008. (in Japanese)
松原勇介, 宮尾祐介, 辻井潤一. "N-gram Language Models with Overlapping Features" (Original title: "重複する素性を持つNグラム言語モデル"). 言語処理学会第14回年次大会(NLP2008). 2008.
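松原勇介, 秋葉友良, 辻井潤一. "Word Segmentation of Spoken Japanese Based on the Minimum Description Length Principle" (Original title: "最小記述長原理に基づいた日本語話し言葉の単語分割"). 言語処理学会第13回年次大会(NLP2007). 2007.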
Software
Growthring - a Scala implementation of k-repeating substrings for efficiently identifying less frequent substrings.
Contact
Please feel free to e-mail me at:
yusuke (at) matsubara.name