1 Introduction
As one of the fastest-growing segments of the global entertainment market (Szalai, 2007), video games have come to represent a wide spectrum of values in terms of the expectations and design decisions of the groups involved in their development and publication. Despite a rapidly expanding market, changing demographics, and growing global economic importance, regional differences in game design have yet to be fully explored. The underlying premise of this research is that the game components, genres, and themes described in a review of a video game product should reveal the world region in which the game was developed. This paper explores how effectively multiple text representations and classifiers can predict the development region of a video game from its reviews.
The video game industry has matured to the extent that it has attained a size comparable to those of the music and movie industries (Otobe, 2007). Research by the Entertainment Software Association (ESA), the trade association for the computer and video game industry in the United States, has found that sales of video games have more than tripled in the last 12 years (ESA, 2008). In addition to economic growth, the ESA also reports that the average age, gender ratio, and size of the user base are changing. Between 2007 and 2008, the ESA reported that the average age of gamers had increased from 33 to 35 years, the percentage of female gamers had increased from 38% to 40%, and males under the age of 17 represented only 18% of the overall market. Additionally, the share of American households owning home video game consoles had increased from 33% to 38%, and the share of households that played either computer or video games had increased to 65%. Heightened competition for this expanding demographic, together with the need to stay up to date with new technologies, has raised development costs and dramatically increased the level of risk relative to profitability (BBC News, 2007).
This may in turn result in a decline in risk taking as manufacturers specialize and tailor their products for specific target markets.
The selection of hardware platform does, however, influence video game design in two ways, in parallel with other developer-specific issues such as regional market expectations. First, different hardware design constraints apply to the types of features available, and second, individual developers have preferences for particular platforms (Otobe, 2007). These differences are, however, mitigated to some degree by the homogenization of features such as online connectivity and 3D rendering across hardware platforms. Given these broad changes in the industry, how significant the differences in video game feature sets are between geographical regions is an empirical question, which I address in this paper.
Despite the rapid expansion and heightened competition of the video game industry, little has been done to formalize video game design classification beyond simple genre groupings. Although reviewers and review organizations appear to share an expanding vocabulary of terms, the lack of a formal vocabulary with a fine degree of granularity makes comparing regions problematic. By employing text mining techniques to automatically extract the features selected by reviewers, this project seeks to capture the descriptions important to modeling the relationship between game design and geographic region. Specifically, this project will determine the effectiveness of support vector machine classification systems in predicting the development region of a video game based on the text used in a collection of reviews. Review data were extracted from two sources of electronic video game reviews and combined with a listing of developers and publishers.
Further, the paper compares the performance of five text representations of the reviews, including a sliding window of terms, noun phrases, noun phrases and verbs, and individual terms with stopword removal informed by natural language processing. Using the Oracle Data Mining tool, binary and multi-class classifiers were constructed for each of the alternative text representations. Lastly, the effectiveness of region as a classifier will be compared against a temporal grouping of reviews by release date over the same noun- and verb-based representations.
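To make two of these representations concrete, the following minimal Python sketch builds a sliding window of terms and a bag of individual terms with stopword removal for a single review. It is an illustration only and not the Oracle Data Mining pipeline used in the study: the tokenizer, the window size of two, the stopword list, and the sample review sentence are all assumptions introduced here, and the noun phrase representations are omitted because they require a part-of-speech tagger.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration only; the NLP-informed
# list used in the study is not reproduced here.
STOPWORDS = {"a", "an", "and", "the", "of", "is", "in", "to", "with"}


def tokenize(text):
    """Lowercase a review and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_features(text):
    """Count individual terms after removing stopwords."""
    return Counter(t for t in tokenize(text) if t not in STOPWORDS)


def sliding_window_features(text, size=2):
    """Count every run of `size` consecutive tokens (a sliding window of terms)."""
    tokens = tokenize(text)
    windows = (" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1))
    return Counter(windows)


# Invented sample sentence, not taken from the review corpus.
review = "The turn-based combat system is a staple of the genre."
print(term_features(review))
print(sliding_window_features(review, size=2))
```

Each function returns a sparse term-count mapping for one review; in a setup like this, such counts would be the input features to a classifier such as a support vector machine, with the development region as the label.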
...