Re-exploring O. Henry’s Short Stories: A Corpus-Based Pilot Study (1)

(Whole-issue priority) Online publication date: 2009-08-17

Abstract: This article applies a corpus-based method to a stylistic analysis of O. Henry’s short story collection The Four Million. It shows that the overall statistics computed by corpus software provide a more concrete descriptive basis for widely accepted literary interpretations of his stories. With regard to story settings and general themes, the collocation and frequency information of recurrent sequences reveals linguistic features that earlier critics do not seem to have noticed.

Key words: corpus; O. Henry; setting; theme

1. Introduction

O. Henry has been called the American Guy de Maupassant. Both authors wrote stories with twist endings, but O. Henry’s are much more playful and optimistic. Previous studies of O. Henry’s short stories agree that his works are marked by surprise endings, the use of coincidence or chance to create humor, ingenious and exquisite plotting, smile-in-tears irony and so forth. Despite this detailed literary discussion, little work has been done to describe his linguistic style, and none has offered quantitative data as evidence. Given how well established the description of his style is, it may seem unlikely that a corpus-based method can find anything original. The stylistic analysis in the present paper nevertheless aims to illustrate the value of an empirical corpus method in exploring literary style. On the one hand, statistical data help to confirm the canonical view of O. Henry’s short stories; on the other hand, they relate stylistic description to concrete linguistic features of his works.

2. Data and Methodology

This paper investigates the linguistic style of O. Henry’s works empirically, applying both quantitative and qualitative methods. The study uses two corpora. The first is O. Henry’s The Four Million, a collection of short stories published in 1906 and set in New York City in the early years of the 20th century. The machine-readable version available on the internet (http://www.literaturepage.com/read/thefourmillion.html) is used to build a small working corpus for investigation. The second is the Brown corpus, which serves as a reference corpus.

The concordance software used in this study is WordSmith Tools, which can carry out detailed frequency analyses of concordance items and extract collocational information. With this software, words with significant keyness in The Four Million are identified first, and then concordance lines containing a keyword and its collocates are extracted. The corpus data are then processed statistically.
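The keyword step described above can be sketched in Python. The paper does not say which keyness statistic was selected in WordSmith Tools, so this sketch assumes log-likelihood, one common choice for comparing a study corpus with a reference corpus. The function names `log_likelihood` and `keywords` are illustrative and are not part of WordSmith Tools.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness of one word.

    a: frequency in the study corpus, b: frequency in the reference corpus,
    c: size of the study corpus,      d: size of the reference corpus.
    """
    # Expected frequencies if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study_tokens, reference_tokens, top_n=10):
    """Return the top_n positive keywords as (word, keyness) pairs."""
    study = Counter(study_tokens)
    ref = Counter(reference_tokens)
    c, d = sum(study.values()), sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        # Keep only words that are relatively MORE frequent in the study corpus.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    # Highest keyness first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]
```

In use, `study_tokens` would be the tokenised text of The Four Million and `reference_tokens` the tokenised Brown corpus. A word with the same relative frequency in both corpora scores zero.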

3. Overall Statistics

Overall statistics are an essential starting point for a systematic corpus-based textual analysis. WordSmith Tools is used to compute the overall statistics of the two corpora, which are compared in Table 3.1.

Table 3.1 Comparison of Overall Statistics between the Two Corpora

| text file | Overall of Mini | Overall of Brown |
|---|---|---|
| tokens (running words) in text | 52,770 | 1,390,505 |
| types (distinct words) | 8,251 | 47,146 |
| type/token ratio (TTR) | 16 | 4 |
| standardised TTR | 46.97 | 39.07 |
| standardised TTR basis | 1,000.00 | 1,000.00 |
| mean sentence length (in words) | 15 | 23 |
| mean word length (in characters) | 4 | 5 |
| word length std.dev. | 2.24 | 2.52 |

In terms of the number of tokens, the corpus for the present study is very small compared with Brown. However, the standardised type/token ratio (TTR) of the working corpus (46.97) is higher than that of Brown (39.07). The higher the ratio, the greater the lexical variation it reflects. The result is not surprising, since O. Henry was noted for the diversity of expression in his stories. The mean sentence length shows that O. Henry wrote shorter sentences (15 words on average) than those found in general written texts (23 words in Brown). In each of his short stories, several incidents and clues are condensed into merely 750 words or so. Shorter sentences deliver new information quickly, so readers need less time to follow the development of the dramatic plot.

Fundamentally a product of his time, O. Henry’s work provides one of the best English examples of catching the entire flavor of an age. Whether roaming the cattle-lands of Texas, exploring the art of the ‘gentle grafter’, or investigating the tensions of class and wealth in turn-of-the-century New York, O. Henry had an inimitable hand for isolating some element of society and describing it with an incredible economy and grace of language.
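The type/token measures in Table 3.1 can be illustrated with a short Python sketch. It assumes that the standardised TTR is the mean of the TTRs of consecutive chunks of a fixed size (the "basis", 1,000 words in the table), with any incomplete final chunk ignored. The simple regular-expression tokenizer is also an assumption and will not match WordSmith's tokenisation exactly, so the figures it gives may differ slightly from those in the table.

```python
import re


def tokenize(text):
    """Lower-case word tokens; internal apostrophes are kept (o'clock)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def ttr(tokens):
    """Type/token ratio as a percentage."""
    return 100.0 * len(set(tokens)) / len(tokens) if tokens else 0.0


def standardised_ttr(tokens, basis=1000):
    """Mean TTR over consecutive chunks of `basis` tokens.

    An incomplete final chunk is dropped. Returns None when the text is
    shorter than one full chunk.
    """
    chunks = [tokens[i:i + basis]
              for i in range(0, len(tokens) - basis + 1, basis)]
    if not chunks:
        return None
    return sum(ttr(chunk) for chunk in chunks) / len(chunks)
```

The plain TTR falls as a text gets longer, because frequent words keep recurring. That is why the raw ratios of 16 and 4 in Table 3.1 are not directly comparable across corpora of such different sizes, while the standardised values are. For the working corpus, 8,251 types over 52,770 tokens gives about 15.6 per cent, which the table rounds to 16.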

4. Keyword Analysis for Settings

Setting is an indispensable constituent of storytelling. It includes several closely related aspects: the time at which the event or action takes place, the place where it takes place, and the social environment of the characters, that is, the manners, customs and moral values that govern their society. In the present study, information on the first two aspects can be obtained by sorting the keywords of the texts, as Table 4.1 shows.

Table 4.1 Keywords Denoting Places and Time

| Category | Keywords |
|---|---|
| Indoor places | room, restaurant, Bogle’s, Cypher’s, store, flat, door, window, dresser, table, skylight, floor, desk |
| Outdoor places | bench, street, corner, sidewalk, Broadway, avenue, park |
| Denoting time | evening, night, o’clock |

It is striking that two of the three time keywords, ‘evening’ and ‘night’, explicitly refer to the time at or after sunset. More interestingly, almost all the collocates of ‘o’clock’ indicate times from late afternoon to late at night. The following concordance lines provide evidence.

 1. s were few. The time was barely 10 o'clock at night, but chilly gusts of wind
 2. ey Donovan's paper-box girl. At 10 o'clock the jolly round face of "Big Mike"
 3. heatres are stupid, anyway." At 11 o'clock that night somebody tapped lightly
 4. ant to overshadow her friend. At 4 o'clock on the afternoon of the third day M
 5.  upper windows lighted? Well, at 6 o'clock I stood in that house with the youn
 6. a strike-breaker's motor car. At 6 o'clock the waiter brought her dinner and c
 7. e to thinking. One evening about 6 o'clock my mistress ordered him to get busy
 8. lar and eighty- seven cents?" At 7 o'clock the coffee was made and the frying-
 9. ging mistress. It was most times 7 o'clock when he returned in the evening. At
10.   coddled, praised and kissed at 7 o'clock. Art is an engaging mistress. It wa
11.  We were married last evening at 8 o'clock in the Little Church Around the Cor
12. ew samples this morning. It's 9.45 o'clock, and not a single picture hat or pi
13. g happiness to your son." At eight o'clock the next evening Aunt Ellen took a
14.  by we moved out, for 'twas eleven o'clock, and stands a bit upon the sidewalk
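Concordance lines of this kind, with the node word in the centre and a fixed span of characters on either side, can be produced with a few lines of Python. This is a minimal sketch and not WordSmith's implementation. The function name `kwic` and the default span of 35 characters are assumptions chosen to resemble the display above.

```python
import re


def kwic(text, node, width=35):
    """Return key-word-in-context lines for every occurrence of `node`.

    Each line holds up to `width` characters of left context (right-aligned
    so that the node words line up), the node itself, and up to `width`
    characters of right context.
    """
    # Collapse line breaks and repeated spaces so that context spans
    # are measured on running text.
    flat = " ".join(text.split())
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        lines.append(f"{left:>{width}}{match.group(0)}{right}")
    return lines


sample = ("The time was barely 10 o'clock at night, "
          "but chilly gusts of wind blew.")
for line in kwic(sample, "o'clock", width=10):
    print(line)
```

Because the span is measured in characters, words at the edges of a line are cut off, as in the lines above. Sorting the returned lines by the word to the left of the node would group the clock times together.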