Abstract: |
Research on emotion estimation from text mostly relies on machine learning methods. Because machine learning
requires a large amount of example data, how to acquire high-quality training data has been discussed as
one of its major problems. Existing language resources include emotion corpora; however, they cannot be
used when the target language is different. Manually constructing a bilingual corpus is also costly. We
propose a method for converting training data into a different language using an existing Japanese-English
parallel emotion corpus. Using a bilingual dictionary, translation candidates are extracted for every word of
each sentence in the corpus. The extracted translation candidates are then narrowed down to a set
of words that contribute strongly to emotion estimation, and this set of words is used as training data. In
an evaluation experiment using the training data created by the proposed method, the accuracy of
emotion estimation increased to as much as 66.7% with Naive Bayes.
1 INTRODUCTION
Recently, there have been many studies on emotion
estimation from text in the field of sentiment
analysis or opinion mining (Ren, 2009), (Ren and
Quan, 2015), (Ren and Wu, 2013), (Quan and Ren,
2010), (Quan and Ren, 2014), (Ren and Matsumoto,
2015), and many of them adopt machine learning
methods that use words as features. When the type
of sentence targeted for emotion estimation differs
from the type of sentence prepared as training data,
the distribution of emotion words differs as well,
much as terminology differs across domains in the
problem of domain adaptation for document classification.
This causes accuracy to fluctuate. On
the other hand, when words are used as features for
emotion estimation, sentence structure does not
have to be considered. As a result, the method is easy
to apply to other languages. As long as a
large corpus annotated with an emotion tag
on each sentence is available, emotion can be estimated
easily with a machine learning method. Moreover,
because machine learning methods do not require
rules to be defined manually, the cost of applying
them to other languages is reduced.
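To make the word-as-feature idea concrete, the following is a minimal sketch of a bag-of-words Naive Bayes emotion classifier. It is an illustration only, not the implementation used in this work: the function names, the Laplace smoothing, the whitespace tokenization, the emotion labels, and the four-sentence toy corpus are all assumptions introduced here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count label and word frequencies from (tokens, emotion_label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(tokens, model):
    """Return the label with the highest log-posterior under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        # Log prior of the emotion label.
        score = math.log(label_counts[label] / total)
        # Laplace (add-one) smoothing; an assumption of this sketch.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus; real training data would be an emotion-tagged corpus.
TOY_CORPUS = [
    ("i am so happy today".split(), "joy"),
    ("what a wonderful gift".split(), "joy"),
    ("i feel sad and lonely".split(), "sorrow"),
    ("this news makes me cry".split(), "sorrow"),
]

model = train_naive_bayes(TOY_CORPUS)
print(classify("happy wonderful".split(), model))
print(classify("cry sad".split(), model))
```

Because each sentence is reduced to word counts, word order and syntax play no role in the sketch, which is why only the tokenizer and the training corpus would need to change for another language.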
However, just like the problem in the domain, depending
on the |