University code: 10036
Chapter One
INTRODUCTION
1.1 Need for the Study
Phraseology has long been recognized as a discipline in its own right in the former Soviet Union; in the West, however, interdisciplinary studies of it have become popular only fairly recently. Phraseological research is closely associated with lexicography, language learning and corpus linguistics. As the building blocks of discourse, phraseological structures (PSs) help to shape meaning in context and to establish the distinctiveness of a register. Whether in spoken or written registers, speakers and writers depend heavily on these prefabricated structures, which convey information or stance in a clear and convenient way. Thus, a good command of PSs helps to make academic writing idiomatic and professional. Since the research article is the master narrative of our time, studying it helps learners to develop native-like proficiency in academic writing.
The theory-driven paradigm has long been the major approach to phraseological studies, with its focus on sequences that are structurally rigid and semantically opaque. Frozen and psychologically salient, these idioms and formulaic expressions make up only a small proportion of a text. With the help of computer technology, the corpus-driven approach to phraseology has been gaining ground thanks to its practical and empirical advantages. It breaks the limits of grammaticality and covers units that cut across grammatical structures. Within phraseological study, much attention has been paid to written academic discourse (Adel & Erman, 2012; Cortes, 2004; Chen & Baker, 2012; Glaser, 1998; Howarth, 1996; Howarth, 1998; Hyland, 2008a; Liu, 2012; Peacock, 2012), the spoken register (Herbel-Eisenmann, Wagener & Cortes, 2010; Wei, 2007), and comparisons between spoken and written registers (Biber, Conrad & Cortes, 2004; Biber & Barbieri, 2007; Conrad & Biber, 2004; Moon, 1998), among which the analysis of written academic discourse takes the lion’s share. Investigations into PSs in written academic discourse aim to distinguish academic vocabulary from that of spoken English and general English. Integral to academic discourse, PSs also provide an important means of distinguishing written academic discourse by discipline. Cortes (2004) and Hyland (2008a) demonstrated that even within written academic discourse, disciplines differ from one another structurally and functionally, but few such studies have been carried out in detail. Gavioli (2005) argues that the analysis of smaller specialized corpora with restricted text types and topics can help learners of ESP to improve their awareness of key lexical, grammatical or textual issues.
Altogether, phraseology plays a major part in the production of idiomatic and native-like language. However, attempts to analyze it with a corpus-driven approach have begun only recently. The recognition of register variation has facilitated studies of phraseology in written academic articles and in the spoken register, but discipline-specific characteristics are also noteworthy. As Montgomery (1996) says, the research article is the master narrative of our time, and our investigation into research discourse is thus justified. For these reasons, a detailed study of a corpus of business English academic discourse can help us better understand the idiomatic use of PSs in a specialized discipline.
1.2 Structure of Thesis
This thesis consists of five chapters: (1) introduction; (2) literature review; (3) research methods; (4) results and discussion; (5) conclusions.
Chapter 1 is a brief introduction to the whole thesis, presenting the need for the study and the structure of the thesis.
Chapter 2 reviews the different rubrics and definitions of phraseological structure, the theory-driven and corpus-driven approaches to analyzing it, and previous research on academic discourse.
Chapter 3 presents the research questions, the Business English Research Discourse Corpus, the software WordSmith Tools 5.0 and the data analysis procedures adopted in the thesis.
Chapter 4 discusses the frequency distribution of PSs, and then the structural and functional features of the most representative PSs in the corpus.
Chapter 5 summarizes the whole thesis and suggests the implications and limitations of the study.
Chapter Two
LITERATURE REVIEW
2.1 Phraseology under Different Rubrics
Because studies differ in approach, criteria and purpose, there is no precise or widely accepted definition of phraseology. Phraseology, the umbrella term, has been studied under different rubrics, some referring to similar units with fine distinctions and others to items that differ greatly. From the perspective of language acquisition and processing, phraseology includes linguistic prefabrications, prefabs, chunks and stereotyping by Bolinger; pre-assembled lexical phrases by Nattinger; lexicalized sentence stems and form-meaning pairings by Pawley and Syder (Wei, 2002, pp.27-33); and prefabricated patterns by Granger (1998). All these concepts support the view that such PSs are stored in, processed in and retrieved from the mental lexicon as wholes and “are used as conventional expressions to facilitate fluent production and rapid comprehension” (Howarth, 1996, p. IX).
In the theory-driven paradigm, phraseology has been studied under the names of phraseological unit by Cheuisheva (1964); set combination by Zgusta (1971); set combination phraseme by Melcuk (1988); phraseological unit by Glaser (1988); and word-combination by Cowie (1988) and Howarth (1996). These terms from theoretical models prioritize, respectively, the features of idiomaticity, semantic transparency, commutability, syntactic structure and discourse function.
In the corpus-based paradigm, phraseology is explored through a frequency-driven approach. Altenberg (1998) adopts the name “recurrent word combinations” and defines them as “any continuous string of words occurring more than once in identical form” (1998, p.101); Herbel-Eisenmann et al. state that “a specific grouping of words, in the same order, must repeat frequently to be considered a bundle” and take lexical bundles (LBs) to be “groups of three or more words that frequently recur, as multi-word groups, in a particular register” (2012, pp.25-32). Other terms include fixed expressions and idioms (Moon, 1998), LBs (Biber et al., 2004; Chen & Baker, 2012; Hyland, 2008a), multi-word sequences (Biber et al., 2004) and academic clusters (Hyland, 2008b). All these multi-word units have to reach a cut-off frequency before they are counted as significant PSs. However, the frequency cut-off point is “somewhat arbitrary” (Biber & Barbieri, 2007, p.267) and depends to a large extent on the authors’ judgment. Frequency of occurrence, text dispersion and sequence length work together to identify the most frequent LBs in a corpus. Altenberg focuses on word-combinations of at least three words occurring at least ten times in the corpus (1998). Hyland relies on a more conservative cut-off point of 20 times per million words and includes only those sequences which occur in at least 10% of the texts in the corpus (2008a). Biber later investigates LBs consisting of four-word sequences occurring more than 40 times per million words across at least five different texts in university teaching and textbooks (2004). Cortes analyzes LBs composed of four-word combinations occurring at least 20 times across five or more texts (2004). Multi-word units of at least four words are commonly investigated because two-word combinations are usually “mere repetitions or fragments of larger structures, e.g. the the, in a” (Altenberg, 1998, p.102) and three-word combinations are included in four-word combinations (Cortes, 2004, p.401). The text-dispersion threshold is set according to the number of texts in the corpus under study. In view of these identification and counting criteria, this thesis examines PSs that meet the following conditions: any continuous sequence in identical form consisting of at least four words and occurring over 20 times per million words in at least 10% of the texts.
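The identification criteria above can be illustrated with a minimal Python sketch. It is not the procedure used in this thesis, which relies on WordSmith Tools; the function names, the simple whitespace tokenizer and the decision to treat both thresholds as "at least" are assumptions made for illustration only. The sketch counts sequences of one fixed length and lets sequences run across sentence boundaries, which a fuller implementation would probably want to prevent.

```python
from collections import Counter, defaultdict
import math


def tokenize(text):
    """Lowercase, split on whitespace and strip surrounding punctuation
    (a deliberately simple, assumed tokenizer)."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()[]")
        if word:
            tokens.append(word)
    return tokens


def extract_sequences(texts, n=4, per_million=20, min_text_share=0.10):
    """Return {sequence: (frequency, number_of_texts)} for the n-word
    sequences that meet both the frequency and the dispersion criteria."""
    freq = Counter()            # total occurrences of each sequence
    spread = defaultdict(set)   # ids of the texts each sequence occurs in
    total_tokens = 0
    for text_id, text in enumerate(texts):
        tokens = tokenize(text)
        total_tokens += len(tokens)
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            freq[gram] += 1
            spread[gram].add(text_id)
    # Convert the normalised rate and the share of texts into raw cut-offs.
    min_freq = math.ceil(per_million * total_tokens / 1_000_000)
    min_texts = math.ceil(min_text_share * len(texts))
    return {
        gram: (count, len(spread[gram]))
        for gram, count in freq.items()
        if count >= min_freq and len(spread[gram]) >= min_texts
    }
```

Called with a list of text strings, the function returns only the sequences that pass both cut-offs, together with their raw frequency and the number of texts in which they occur.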
2.2 Theory-driven Approach to Phraseology
In the field of phraseology, there are two dominant research paradigms: the theory-driven approach and the corpus-driven approach. The theory-driven approach was first advocated in Russia, where classical Russian theory, “the most pervasive influence at work in current phraseological studies” (Cowie, 1998, p.2), was put forward, and it was later extended in the West. This approach focuses mainly on the identification and categorization of various PSs. The practice of classifying PSs formally and functionally was pioneered by Palmer and Hornby, two founding fathers of EFL lexicography (Cowie, 1998), and has been discussed at length by later scholars. Another productive strand of the theory-driven approach is the work of J.R. Firth, extended by neo-Firthians such as Alexander, Cowie, Glaser, Howarth, Halliday, Hudson, Nattinger and DeCarrico, and Zgusta, whose work reflects greater depth and precision in the identification and categorization of PSs.
Nattinger and DeCarrico are concerned with the classification of lexical phrases (LPs) and their implications for language teaching. In their theory, LPs are regarded as form/function composites which lie between the poles of lexicon and syntax, occur more frequently and have more idiomatically determined meaning than language that is put together each time (1992). Their study of LPs helps to facilitate a “synthesis of structural and functional perspectives” and to accelerate the “interface of lexicon and syntax in language use” (Leech, 1994, p.162). LPs are distinguished from collocations in that they have been assigned pragmatic functions, whereas collocations have not. Further, LPs are categorized by formal and functional characteristics. Formally, four criteria (length and grammatical status, canonical versus non-canonical shape, variable versus fixed, and continuous versus discontinuous) are applied to classify LPs into polywords, institutionalized expressions, phrasal constraints and sentence builders. Functionally, LPs can realize the functions of social interaction, necessary topics and discourse devices. However, Nattinger and DeCarrico give no detail on their methodology for finding and classifying LPs. Other functions of LPs, which they do not include in their functional classification, are also important.
Alexander presents a similar but more detailed identification and classification of “fixed expressions” (an overarching name for phraseology). Three criteria are applied in identifying fixed expressions: idiomaticity, pragmatic properties and socio-cultural markedness (Alexander, 1992, p.35). The notion of “fixed” runs through Alexander’s model, but it is never explicitly explained. Zgusta takes “restrictedness” as the primary criterion, “syntactic function” as the secondary criterion and “semantic transparency” as the third criterion for classifying combinations of words. Having distinguished between “set” and “free” combinations, Zgusta further classifies set combinations in terms of syntactic function. Multi-word lexical units and set/idiomatic expressions can work as single lexical units, while set groups, including proverbs and quotations, cannot. Semantic transparency is used to distinguish multi-word lexical units from set/idiomatic expressions, the former being semantically transparent and the latter semantically opaque. Cowie takes linguistic function as the primary criterion for distinguishing between word combinations. Word combinations are first divided into “functional expressions” and “composites” (1998, p.11). The secondary criterion of semantic specialization and structural stability is then used to divide “functional expressions” and “composites” further into “non-idiomatic” and “idiomatic” ones.
Howarth amalgamates several widely accepted classification models and presents the following one (1998, pp.27-28):
Word combinations
    functional expressions
        non-idiomatic
        idiomatic
    composite units
        grammatical composites
            non-idiomatic
            idiomatic
        lexical composites
            non-idiomatic
            idiomatic
Figure 2: Howarth’s model
“Functional expressions” and “composite units”, divided according to syntactic function, correspond to Zgusta’s second-level classification of “multi-word lexical units” and “set groups” and to Cowie’s “functional expressions” and “composites”. “Composite units” in Zgusta’s model can be further divided into grammatical and lexical composites based on the word class of their constituents, with “grammatical composites” consisting of one open-class and one closed-class word and “lexical composites” of two open-class words. The prominent characteristic of this model lies in its division of grammatical and lexical composites into “non-idiomatic” and “idiomatic”. Instead of a simple two-way split, this model regards idiomaticity as a continuum with no clear-cut boundaries. In terms of the criteria of semantic specification and commutability, combinations are divided into free collocations, restricted collocations, figurative idioms and pure idioms. The sub-categories differ only in degree rather than by absolute delimitation.
Categorization is difficult and complex in phraseology because of the “bristling array of variables” (syntactic, pragmatic, stylistic, and semantic) “which the material is constantly throwing up” (Cowie, 1998, p.210). Another problem facing researchers is the cline of restrictedness, that is, the boundary between semantic transparency and structural variation on the one hand and semantic opaqueness and structural rigidity on the other. Therefore, theoretical researchers have devoted themselves to establishing an explicit gradation of restrictedness and presenting an elaborate framework of phraseological categorization. The approaches discussed above differ in the terminology they use for essentially the same phenomena and in the priority they give to the criteria applied in classifying PSs.
2.3 Corpus-driven Approach to Phraseology
In contrast to the theory-driven approach, the corpus-driven approach has the following advantages: it is not confined to a particular theoretical model; PSs are selected in terms of frequency and probability; and the research is based on naturally occurring data and covers a wide range of linguistic forms such as collocations, fixed and semi-fixed expressions, idioms and fragmented phrases (Wei, 2007, p.280). Representatives of corpus-driven research include Sinclair, Altenberg, Biber, Cortes and Moon, to mention just a few.
Sinclair is considered to be the founder of the corpus-driven approach. He posits two models for interpreting the meaning of language texts. One is the open-choice principle, which treats a language text as “a series of slots which have to be filled from a lexicon which satisfies local restraints” (Sinclair, 1991, p.109). This principle is consistent with grammaticality and is useful for interpreting meaning according to generative rules. The other model is the idiom principle, which turns lexical study into phraseological study. It is advanced to account for word occurrences that cannot be captured by the open-choice principle. In some cases, several choices are grammatically available to writers or speakers, but one particular choice is preferred and more idiomatic. Collocation is a case in point. The idiom principle, at its simplest, can be seen as the simultaneous choice of two or more words and reflects the “economy of effort” (ibid). In describing collocations, Sinclair gives primacy to naturally occurring texts rather than to individual intuition. He and Renouf (1991) investigate collocational frameworks, which consist of a discontinuous sequence of two grammatical words with an intervening word between them, such as a/an+?+of and too+?+to. Each collocational framework is selective of its collocates, which share similar meanings or grammatical classes. As evidenced by the high type-token ratio, collocational frameworks are statistically important in language use. Therefore, the study calls for different ways of presenting and explaining language patterning.
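The idea of a collocational framework such as a/an+?+of can be sketched in a few lines of Python. This is only an illustration of the concept, not Renouf and Sinclair's procedure; the function names and the token-list input are assumptions. The sketch collects the words that fill the middle slot of a framework and computes the type-token ratio of those fillers.

```python
from collections import Counter


def framework_fillers(tokens, left_words, right_words):
    """Count the words filling the slot in the pattern left + ? + right."""
    fillers = Counter()
    for i in range(len(tokens) - 2):
        if tokens[i] in left_words and tokens[i + 2] in right_words:
            fillers[tokens[i + 1]] += 1
    return fillers


def type_token_ratio(fillers):
    """Distinct fillers divided by total occurrences of the framework."""
    total = sum(fillers.values())
    return len(fillers) / total if total else 0.0
```

For example, `framework_fillers(tokens, {"a", "an"}, {"of"})` gathers the fillers of a/an+?+of from a list of lowercase tokens, and `type_token_ratio` then indicates how varied those fillers are.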
Moon has contributed to the analysis of large-scale written and/or spoken corpora. Based on naturally occurring texts and the database of fixed expressions and idioms (FEIs) that she constructs (1998), she compares the frequency of FEIs in various categories, among different genres and across several corpora. A broad definition of FEIs is presented in accordance with three criteria: institutionalization, which is decided by frequency in the corpus; lexico-grammatical fixedness, which depends on the degree of formal rigidity; and non-compositionality, which means that the whole meaning of an expression cannot be obtained by combining the usual meanings of its words. Moon’s treatment of non-compositionality differs from that of others in that it does not refer to semantic opaqueness but means in some cases that “a component item has a meaning not found in other collocations or contexts” (ibid, p.21). Therefore, the FEIs included in her study are distinct from those of other researchers. Moon classifies FEIs into anomalous collocations, formulae and metaphors, each group having finer subdivisions. She also emphasizes the need to examine the text functions of FEIs and divides their discourse functions into five categories based on a Hallidayan (1994) model. She argues that “these functions are common to lexis in general and not peculiar to FEIs” (Deignan, 2000, p.1242). In her study, an expression may have two or more discourse functions at the same time.
Altenberg’s work is regarded as a milestone of the corpus-driven paradigm. Using a frequency approach, he examines recurrent word-combinations of at least three words occurring at least ten times in the London-Lund Corpus of Spoken English. Altenberg finds that recurrent word-combinations in speech are fairly short but pervasive at various discourse levels (1998). In most cases they are structurally flexible and semantically transparent. The example he gives is I see, which appears more often in variant forms such as oh I see and yes I see. In terms of grammatical completeness, Altenberg distinguishes three broad types of structure: full clauses, clause constituents and incomplete phrases, each serving its own discursive or pragmatic function. The full clause is the linguistic entity of the highest level, with a relatively complete structure consisting of subject and verb. Clause constituents are not grammatically well-formed but consist of two or more clauses. Incomplete phrases include phrasal fragments, discontinuous sequences and other phrases. This categorization provides a model for later research on phraseological features in the spoken register, such as Wei’s (2007) phraseological study of Chinese English learners’ speech. The functions of word-combinations are also analyzed under each structural category. Certain recurrent word-combinations are conventionalized through repeated use to convey “speech acts and discourse strategies” (ibid, p.121). In Altenberg’s view, a word-combination is seldom completely frozen but provides a preferred and convenient way to express oneself.
Biber is another scholar who attends to the spoken register after Altenberg (1999, 2002, 2004, 2007). Because LBs are grammatically incomplete and semantically transparent, they are easily neglected. Biber argues that academic discourse has been investigated with a focus on lexico-grammatical and genre characteristics, whereas relatively few studies describe the linguistic features of the spoken register. To fill this gap, Biber employs a frequency-driven approach to compare LBs in a wide range of university spoken and written registers. It is found that different registers rely on different sets of LBs. Conversation depends on more types and numbers of LBs, which reflect the “communicative purpose” with a focus on “interactions and conveying personal thoughts and attitudes and the concern for politeness and not imposing on others” (Conrad & Biber, 2004, p.67). Academic prose relies on fewer LBs and is more concerned to “convey precise information over interpersonal considerations of a face-to-face interaction” (ibid, p.69). In universities, there exists a wide range of spoken and written registers, including classroom teaching, office hours, study groups, course management, institutional writing and so on. The study of these corpora suggests that both mode (bundles are common in spoken and written registers) and communicative purpose (bundles are common in all student management registers) influence the numbers and functions of LBs. In addition, Biber provides a model for categorizing the structure and function of LBs, which is discussed in Chapter 4.
2.4 Previous Research on Academic Discourse
Over the last 20 years, studies on academic discourse have been principally carried out through three approaches: genre analysis, multi-dimensional (MD) analysis and Hallidayan systemic functional grammar (Jiang & Zhao, 2006, p.1). Genre analysis, developed by Swales, employs the “move structure” method to segment a text according to its communicative purposes. Swales’ research focuses on genres in academic settings, particularly on the introduction section of an academic article, for which he devises the Create a Research Space (CARS) model (Swales, 1990, p.141). The CARS model is effective in identifying and establishing shared basic moves for the introduction section of a research article.
In contrast to the macroscopic perspective of genre analysis, MD analysis, initiated by Biber, concentrates on linguistic features, the microscopic aspect of a register. In MD analysis, linguistic features selected in a corpus are grouped into different dimensions. Through statistical computation, each dimension is shown to have its own characteristic linguistic features and occurrence patterns, indicating the functions realized by registers. Researchers have been more concerned with the common features of a given register, but interest in disciplinary variation has begun to emerge. It is found that even within the same register, say, written academic articles, each discipline has its own characteristics besides sharing the features of the register. Cortes compares the structure and function of the most frequent LBs in published history and biology journals and finds that LBs in history consist only of noun phrases and prepositional phrases, while LBs in biology have a wide range of structural types (2004). Functionally, LBs in the two corpora serve similar functions, with slight differences in epistemic-impersonal stance markers. Hyland supports Cortes’ conclusions by investigating lexical bundles in four disciplines: biology, electrical engineering, applied linguistics and business studies (2008a). His study demonstrates that disciplinary differences exist in the distribution frequency, principal structures and functions of LBs. It is recognized that academic English differs considerably from general English, but each discipline within academic English also has its own features. Both approaches mentioned above are now corpus-based. With the help of electronic corpora and advanced software, move structures and linguistic features can be tagged and analyzed effectively and accurately.
Like MD analysis, systemic functional grammar also takes a microscopic perspective on academic articles. It pays attention to theme patterns and linguistic features, with the aim of exploring discourse characteristics through microscopic features. To date, considerable efforts have been made to describe the rhetorical structure and linguistic features of written academic registers, especially of written research articles in science and medicine, such as research on hedging devices and special classes of verbs used in research articles, the complex noun phrase structures typical of scientific prose, imperatives, personal pronouns, existential there, and politeness markers (Biber, 2002). Corpus-based study of the linguistic features of PSs has received considerable attention only recently, and most such work investigates students’ academic writing (Adel & Erman, 2012; Cortes, 2004; Liu, 2012) and published academic articles (Chen & Baker, 2010; Hyland, 2008a; Hyland, 2008b). As research advances, attention is shifting from the PSs of published research articles in general to disciplinary variation, and thus the present study of business English research discourse is needed.
Chapter Three
RESEARCH METHODS
3.1 Research Questions
1. How are the PSs distributed in Business English Research Discourse?
2. How are the most representative PSs categorized structurally?
3. What functions do the most representative PSs realize?
3.2 Corpus for the Thesis
The Business English Research Discourse (BERD) Corpus used in this study is a collection of published research discourses on business administration. To make the findings representative, a wide range of recognized journals and a large number of discourses are included in the corpus. Each discourse is composed of the title, author description, abstract, keywords, main body and references. As Hyland (2008b, p.47) points out, the research article is “not only the principal site of disciplinary knowledge-making but…the master narrative of our time.” Thus, business English research discourses can represent the characteristics of this discipline. Table 1 presents the basic information on the corpus. The BERD corpus contains 611 texts from 99 journals, with 3,662,150 running words in total.
Table 1: Basic information of the BERD corpus
Journals | 99
Texts | 611
Tokens (running words) | 3,662,150
Types | 53,382
Type/token ratio | 1.51
3.3 Software
WordSmith Tools 5.0, developed by Mike Scott, is “an integrated suite of programs for looking at how words behave in texts” (Scott, 2010, p.2). It provides three major functions: Concordance, KeyWords and WordList. In this thesis, WordList is used for quantitative analysis and Concordance for qualitative analysis. The WordList function can present basic information about a corpus, e.g. tokens (running words) in text, tokens, types and type/token ratio, and “let you see a list of all the words or word-clusters in a text, set out in alphabetical or frequency order” (ibid). In addition, the parameters in WordList Clusters help to locate specific clusters more quickly and precisely. “Cluster size” limits the length of sequences, “min. frequency” excludes clusters occurring by chance, and “Joining” joins smaller clusters into a larger one, for example, joining a bearing of, a bearing of forty and a bearing of forty five into a bearing of forty five degrees. Therefore, this thesis uses WordList to produce a list of sequences of at least four words occurring at least 20 times per million words in at least 10% of all texts. The advantage of Concordance is that it provides the context of a PS and so helps to pinpoint its discourse function. The function of a PS can only be defined precisely when it is examined in a concordance line with its context. For instance, it is hard to judge whether at the end of refers to time or text until it is examined in the whole sentence or paragraph.
3.4 Data Analysis
To investigate the first question, “How are the PSs distributed in Business English Research Discourse?”, this thesis narrows down the operational definition of phraseology given in Chapter 2.2.1 to “any continuous sequence in identical form consisting of at least four words occurring at least 20 times per million words in at least 10% of the texts”. First, the “clusters” function in WordSmith 5.0 is used, with the “cluster size” set to four or more words and the “minimum frequency” set to 73 (the corpus has 3,643,596 running words, and representative PSs should occur at least 20 times per million words). After the raw list of PSs is obtained, text dispersion is considered. Since representative PSs should occur in at least 10% of the texts and the corpus has 611 texts, each type of PS should occur in at least 62 texts. After PSs occurring in fewer than 62 texts are deleted from the raw list, smaller structures are joined into the largest one so as to avoid results inflated by overlap. These steps yield a list of representative PSs. Based on the list, this thesis analyzes how heavily Business English Research Discourse depends on PSs.
The structural patterns of PSs are analyzed according to Biber’s (1999) model, which categorizes LBs structurally in academic prose. On the basis of this model, PSs can be categorized into NP-based, PP-based and VP-based structures, each with its subcategories. Since Biber presents an elaborate classification with examples under each structural pattern, the PSs in this thesis can be clearly categorized. The types and numbers in each category are then calculated for the purpose of comparison.
Functionally, PSs are categorized in terms of Biber’s (2004, 2007) model. As the function of a PS can only be analyzed in context, a qualitative analysis is carried out using Concordance to pinpoint it. A PS may realize more than one function in the corpus. In some cases, a PS realizes two functions across categories or across different subcategories within a category. In this thesis, each PS is classified according to its more frequent function. When two functions of a PS are equally frequent, it is grouped into the multi-function subcategory. Each functional category is then discussed with representative examples.
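The two cut-off values can be checked with a short calculation. The sketch below is illustrative only; the function names are hypothetical, and rounding up to the next whole number is an assumption that matches the "at least" wording of the operational definition.

```python
import math


def raw_frequency_cutoff(per_million, running_words):
    """Smallest raw frequency that reaches the normalised rate."""
    return math.ceil(per_million * running_words / 1_000_000)


def text_cutoff(percent, n_texts):
    """Smallest number of texts covering the given percentage of the corpus."""
    return math.ceil(percent * n_texts / 100)


# 20 per million words in 3,643,596 running words: 72.87, rounded up to 73
print(raw_frequency_cutoff(20, 3_643_596))
# 10% of 611 texts: 61.1, rounded up to 62
print(text_cutoff(10, 611))
```

Rounding up rather than to the nearest integer keeps every retained sequence at or above the stated rate.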
Chapter Four
RESULTS AND DISCUSSION
4.1 Distribution of PSs in the BERD Corpus
When PSs are counted, three problems should be noted. First is the distinction between types and tokens of PSs. A corpus possibly contains a wide range of PSs, but each PS occurs in low frequency. Therefore, calculation depending on the criteria of types or tokens may lead to different research results, especially when comparing corpora on different scales. Token distributions are linear while type distributions are not. With the corpus scaling up, types probably increase slower than tokens of PSs, which should be bore in mind when the comparison are made later in this paper. In order to make the statistics clear, both types and tokens are presented in Table 2 and clarified in the later sections. Secondly, since the corpus used in the research is discipline bounded and the whole sections of a research article are included in the corpus, sequences occur as a result of frequently cited journals (e.g., Academy of management journal, Total quality management TQM) and publishers (e.g., Harvard business school press) in bibliography are irrelevant to this research and should be deleted after refinement. Thirdly, longer PSs appear repeatedly as fragmented PSs. For instance, phraseological sequence of the purpose of this paper is to often occur in the forms of the purpose of this paper, the purpose of this paper is, purpose of this paper, purpose of this paper is, purpose of this paper is to. In the face of overlapping PSs, shorter PSs are incorporated into the largest one and regarded as one type of PS. This practice helps to guard against inflated results. Having dealt with the above matters, types and tokens PSs obviously reduce to a large extent, revealing a more reliable picture to depict.
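The third step, folding shorter overlapping sequences into the longest one, can be sketched as follows. This is a simplified illustration under stated assumptions, not the joining procedure of WordSmith Tools: the function names are hypothetical, and the sketch ignores frequency, so it would also absorb a shorter sequence that occurs independently far more often than the longer one.

```python
def contains(longer, shorter):
    """True if the word list `shorter` occurs contiguously inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def merge_overlaps(sequences):
    """Keep only the sequences that are not part of a longer listed sequence."""
    unique = list(dict.fromkeys(sequences))  # drop duplicates, keep order
    split = sorted((s.split() for s in unique), key=len, reverse=True)
    kept = []
    for words in split:
        # Longest sequences are processed first, so any container is already kept.
        if not any(len(k) > len(words) and contains(k, words) for k in kept):
            kept.append(words)
    return [" ".join(words) for words in kept]
```

Applied to the six variants listed above, the function keeps only the purpose of this paper is to, so the variants are counted as one type.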
As shown in Table 2, both the types and the tokens of PSs decrease considerably after the raw statistics are refined, largely because the reference lists of the articles contain recurring journal names.
Table 2: Numbers of phraseological structures before and after refinements
| Corpus | No. of PSs (types), before refinements | No. of PSs (tokens), before refinements | No. of PSs (types), after refinements | No. of PSs (tokens), after refinements | Total cases | % of total words |
|---|---|---|---|---|---|---|
| BERD | 81 | 51954 | 65 | 39542 | 9801 | 1.1 |
With respect to length, the 65 types of PSs in the BERD corpus include only three five-word PSs and two six-word PSs, the rest being four-word PSs (see the appendix). Since these longer structures are relatively rare and all contain four-word structures that occur independently in the corpus, they are negligible when the frequency of four-word PSs is calculated. Among the 65 types, even the most frequent structure, on the other hand, occurs only 447 times, which demonstrates that business English research discourse uses a relatively narrow range of PSs with low frequency. In the BERD corpus, four-word PSs occur on average slightly fewer than 2657 times per million words and account for about 1.1% of the total running words. Despite a higher cut-off frequency of 40 times per million words, four-word bundles occur over 5000 times per million words and make up 2% of the total words in academic prose in Conrad and Biber's study (2004). Adopting the same criteria, Hyland finds that PSs constitute 3.1% of the total words in academic articles from four disciplines (2008b). In comparison with these similar studies, PSs in research discourse on business management constitute only 1.1% of total words, so this discourse depends less on PSs. The gap in overall PS frequency between the BERD corpus and previous studies appears to arise from disciplinary and genre variation. Conrad and Biber's academic corpus, which displays a higher overall frequency of PSs, consists of research articles and book extracts from various disciplines (2004). LBs are found to be more than twice as frequent in electrical engineering as in biology, and fine distinctions exist among the sub-corpora for the various disciplines (ibid). By comparison, the BERD corpus includes only business English research discourse. Hyland's study, in turn, is a case in point for genre variation (2008b).
The corpus in Hyland's research is made up of three genres: published research articles, master's theses and doctoral dissertations (ibid). Even though his corpus is much smaller, it displays a wider variety and a higher overall frequency of PSs than the present BERD corpus. This suggests that the expert genre depends less on PSs. It is claimed that writers who have developed proficient writing skills have richer resources to draw on and rely less on pre-constructed PSs. In contrast, student genres are more phrasal than published articles, and students are more inclined to use prefabricated structures in voicing their arguments (ibid, p.50).
Business English research discourse resembles academic prose in general in its phraseological features, but fine distinctions exist: it depends less on PSs. "Hard" knowledge tends to be data-oriented, and language is used to connect data presented in visual form with arguments. Prefabricated structures are therefore often chosen to make the reasoning clear and easy to follow. "Soft" knowledge, on the other hand, is more likely to interpret and to convince readers through semantic cohesion than through the formal cohesion realized by PSs. It is arguably more stance-oriented and needs rhetoric to persuade readers. Being more proficient writers than students, expert writers have the rhetorical skill to refine their articles and develop an individual rhetorical style.
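The two normalized measures used in this section, occurrences per million words and the share of running words covered by PSs, follow from simple arithmetic. The sketch below is an illustration only: the corpus size `CORPUS_WORDS` is a hypothetical value chosen to be roughly consistent with the reported figures (it is not stated in this section), and every PS is treated as a four-word sequence, in line with the remark that longer structures are negligible.

```python
def per_million(occurrences, corpus_words):
    """Normalized frequency: occurrences per million running words."""
    return occurrences / corpus_words * 1_000_000


def coverage_percent(occurrences, sequence_length, corpus_words):
    """Percentage of running words that fall inside the counted sequences."""
    return occurrences * sequence_length / corpus_words * 100


TOTAL_CASES = 9801        # total cases after refinement (Table 2)
CORPUS_WORDS = 3_690_000  # hypothetical corpus size, assumed for illustration

print(round(per_million(TOTAL_CASES, CORPUS_WORDS)))             # 2656
print(round(coverage_percent(TOTAL_CASES, 4, CORPUS_WORDS), 1))  # 1.1
```

Normalizing in this way is what allows corpora of different sizes to be compared, although differences in cut-off frequency and sequence length still limit the comparison.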
4.2 Structural Patterns of Phraseological Structures
PSs are identified on the basis of frequency and thus do not necessarily represent well-structured units. According to a previous study, fewer than 5% of the PSs in academic prose are structurally complete (Biber, 2004). In the present study, no PS is a grammatically complete structure. Although they are structurally fragmented, PSs still reveal many features of language use. Based on Biber's categorization of LBs (1999, pp. 1014-1024), this study groups PSs into three broad categories, each with its own subsets, according to their structural correlates. The first category begins with a noun phrase, followed by an of-phrase fragment or another post-modifier fragment. The second category starts with a preposition, followed by an embedded of-phrase fragment or another phrase fragment. The third category includes complex patterns with a central verb. These patterns are anticipatory it + verb phrase/adjective phrase, passive verb + prepositional phrase fragment, copula be + noun phrase/adjective phrase, (verb phrase +) that-clause fragment, (verb/adjective +) to-clause fragment, pronoun/noun phrase + be (+...) and adverbial clause fragment. A final category contains other expressions that do not belong to any of the three major categories. The structural patterns in BERD are shown in Table 3 and Figure 1.
Table 3: Structural patterns of PS in the BERD corpus
| Category | Pattern | Example | % of all types | % of all cases |
|---|---|---|---|---|
| NP-based | noun phrase + of-phrase fragment | per cent of the (309), the purpose of this (194), the nature of the (174), one of the most (164), the development of a (164), the result of the (157), (the) purpose of this paper (is) (to) (157), a wide range of (148), the development of the (144), the role of the (140), the end of the (124), the context of the (120), the quality of the (110), the use of the (99), the performance of the (93), a high level of (91), the importance of the (89), the implementation of the (88), the success of the (86), an integral part of (83), a large number of (82), the rest of the (76) | 33.85 | 29.51 |
| | noun phrase + other post-modifier fragment | the extent to which (252), the way in which (106), the fact that the (97), the need for a (83) | 6.15 | 5.49 |
| PP-based | prepositional phrase with embedded of-phrase fragment | in the context of (357), on the basis of (250), as a result of (233), in the case of (216), in terms of the (194), in the form of (193), of this paper is (to) (262), at the end of (137), in the field of (137), to the development of (120), in the development of (113), in the area of (112), in the process of (105), at the university of (99), as part of the (80) | 23.08 | 26.61 |
| | other prepositional phrase | on the other hand (447), at the same time (324), on the one hand (122), in addition to the (111), in order to achieve (106), in relation to the (103), in a way that (91) | 10.77 | 13.30 |
| VP-based | 1) anticipatory it + adjective phrase | it is important to (212), it is possible to (96), it is necessary to (94) | 4.62 | 4.10 |
| | 2) passive verb + prepositional phrase fragment | shown in table 1 (79), are shown in table (87), is based on the (143), can be contacted at (201), can be used to (190) | 7.69 | 7.14 |
| | 3) copula be + noun phrase/adjective phrase | is one of the (144) | 1.54 | 1.47 |
| | 4) (verb phrase +) that-clause fragment | it is clear that (86), that there is a (124) | 3.08 | 2.14 |
| | 5) (verb/adjective +) to-clause fragment | to be able to (157) | 1.54 | |
| | 6) pronoun/noun phrase + be (+...) | there is a need (114), this paper is to (176) | 3.08 | 2.96 |
| | 7) adverbial clause fragment | as shown in figure (139), as shown in table (100) | 3.08 | 2.44 |
| Others | | as well as the (317) | 1.54 | 3.23 |


It is clear that NP-based and PP-based structures, which cover more than 70% of the PSs, dominate the patterns in BERD. This corresponds to the view that academic prose is rich in noun and prepositional phrases (Hyland, 2008b; Biber et al., 1999; Conrad & Biber, 2004). In addition, NP-based and PP-based structures with an embedded of-phrase fragment account for a large proportion. In business English research discourse, NP-based structures often end with articles, prepositions and complementizers such as which, in which and that. This practice reflects the "cautious limitations of academic prose" (Hyland, 2008b, p.48). The post-modification restricts the noun phrase and attempts to identify its referent precisely. In the BERD corpus, the noun phrase with of-phrase fragment, the most frequent phraseological pattern, covers a wide range of meanings in discourse. Such PSs identify quantity (e.g., per cent of the, a large number of, one of the most), mark ends (e.g., the purpose of this, the result of the), highlight quality (e.g., the nature of the, a high level of, the quality of the) and present procedures (e.g., the use of the, the performance of the, the implementation of the). This pattern contains a productive framework, the+noun+of, which accounts for 73% of the types in the pattern. Table 4 shows that the collocates in the the+noun+of framework are semantically transparent and structurally flexible, but habitual use makes the particular structure available to writers as a chunk.
Table 4: The collocates in the framework the+noun+of
| Freq. | Noun | Freq. | Noun | Freq. | Noun | Freq. | Noun |
|---|---|---|---|---|---|---|---|
| 351 | purpose | 140 | role | 99 | use | 86 | success |
| 208 | development | 124 | end | 93 | performance | 76 | rest |
| 174 | nature | 120 | context | 89 | importance | | |
| 157 | result | 110 | quality | 88 | implementation | | |
Compared with Hyland's (2008a) examination of LBs in four disciplines, collocates such as development, context, performance, implementation and success are found to be vocabulary specific to Business English that fills the noun slot in the the+noun+of framework. Being highly productive, this framework provides insight into the discontinuous structures investigated by Renouf and Sinclair (1991). The pattern of noun phrase with other post-modifier fragment has fewer types than the pattern with an embedded of-phrase fragment. However, parts of relative clauses, such as the extent to which, the way in which and the fact that the, are consistent across disciplines in academic prose (Chen & Baker, 2010). The PS the need for a, by contrast, is specific to the BERD corpus. It is often followed by nouns such as approach, system, understanding and perspective.
PP-based PSs form the second largest category in the BERD corpus, but include the three most frequent PSs. The prepositional phrase with embedded of-phrase fragment indicates the conditions of arguments (e.g., in the context of, in the case of), limits the scope of a topic (e.g., in the field of, in the area of), marks a course of events (e.g., in the development of, in the process of) and reflects logical relations (e.g., as a result of, on the basis of, in terms of the). Almost half of the types in this pattern are built on the framework in+the+noun+of.
Table 5: The collocates in the framework in+the+noun+of
| Freq. | Noun | Freq. | Noun |
|---|---|---|---|
| 357 | context | 113 | development |
| 216 | case | 112 | area |
| 193 | form | 105 | process |
| 137 | field | | |
As shown in Table 5, the nouns used in this framework supply the background against which arguments are put forward, and the of-phrase fragment in the framework again reflects the cautious limits of academic discourse:
1) Any re-evaluation of the role of CS in the undergraduate business curriculum needs to be placed in the context of radical change in UK higher education.
2) Even in the case of BM evolution, however, the process is not risk-free.
3) This theory constitutes a core issue in "pseudo" disputes today, a state of affairs attesting to the status of "the serious thing" that is the firm, and translating the existence of an US intellectualism in the field of management sciences.
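How the noun slot of a framework such as the+noun+of can be tallied from running text is sketched below in Python. This is a simplified illustration rather than the retrieval procedure used in the thesis: the pattern `FRAME` and the function `slot_fillers` are hypothetical names, the regular expression accepts any single word in the slot without checking that it is a noun, and it also matches the longer framework in+the+noun+of.

```python
import re
from collections import Counter

# "the" + one word + "of"; the middle word is captured as the slot filler.
FRAME = re.compile(r"\bthe\s+(\w+)\s+of\b", re.IGNORECASE)


def slot_fillers(text):
    """Count the words that fill the slot of the the+X+of framework."""
    return Counter(match.group(1).lower() for match in FRAME.finditer(text))


# Examples 1) and 2) from the text, used here as a toy corpus.
sample = (
    "Any re-evaluation of the role of CS in the undergraduate business "
    "curriculum needs to be placed in the context of radical change in UK "
    "higher education. Even in the case of BM evolution, however, the "
    "process is not risk-free."
)
print(slot_fillers(sample))
```

On this toy input the slot is filled once each by role, context and case, while the process is not counted because it is not followed by of.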
The other prepositional phrase pattern has fewer types but contains the PS with the highest frequency. This pattern marks the logical relations between clauses, sentences and paragraphs: on the other hand indicates contrast, at the same time marks juxtaposition, in addition to the signals a progressive relationship, and in order to achieve emphasizes purpose. These PSs reveal how business English research discourse organizes its arguments. The PS on the other hand, the most frequently used PS across disciplines in academic prose (Hyland, 2008a), occurs either at the beginning of a sentence or within it. It signals a counter-argument and contrasts the facts, perspectives or points on either side of the phrase:
4) Most of the managers answered not sure on the impact on the loss profit. This is shown by strongly disagree (24 percent), disagree (65 percent), and agree (12 percent)). On the other hand, most of the managers disagree that they have to develop the risk assessment and contingency plan program (strongly disagree (47 percent) and disagree (53 percent)).
5) The TQM protagonists assume that existing health care practices and systems are principally right but improvements are needed. The BPR supporters, on the other hand, assume that health care systems and practices are flawed and need replacing.
Among VP-based PSs, the passive verb with a prepositional phrase fragment is the most prominent structure. It contains a prepositional phrase indicating a locative relation (e.g., shown in table, are shown in table). Extraposed structures such as it is important to and it is necessary to, on the other hand, show the characteristics of authorial disguise. Because the personal subject is removed, the arguments appear more credible and are more readily accepted by readers. However, as a soft-knowledge field, business English in this study uses far fewer extraposed and passive structures than the hard-knowledge fields (ibid). Research discourse in the BERD corpus is also data-oriented. Structures such as shown in table, are shown in table, as shown in figure and as shown in table link the information in a table or figure with the arguments so as to provide a convincing basis for the writer's viewpoints. In agreement with Biber's finding (1999), the pronoun/noun phrase with be fragment, which is used to initiate a sentence, has existential there or the demonstrative pronoun this as its subject and the copula be as its verb.
4.3 Functions of Phraseological Structures in BERD
Since PSs are form/function composites, they have not only lexico-grammatical features, but also "customary pragmatic and/or discourse functions, used and recognized by the speaker of a language within certain contexts" (Chen & Baker, 2012, p.30). Biber, who draws special attention to LBs in spoken and written registers, provides a framework for the functional taxonomy of LBs (Biber et al., 2004; Conrad & Biber, 2004; Biber & Barbieri, 2007). In his research, LBs are classified into stance expressions, discourse organizers and referential expressions according to the discourse functions they realize. Stance expressions are of two types: epistemic stance and attitudinal/modality stance. Epistemic stance is concerned with the certainty, uncertainty, probability or possibility of knowledge status. Attitudinal/modality stance bundles express the writer's "attitudes towards the actions or events described in the following proposition" (Conrad & Biber, 2004). They show fine distinctions and are further categorized into desire bundles, obligation/directive bundles, intention/prediction bundles and ability bundles. Stance expressions can also be classified according to whether or not they attribute the stance to a person. Personal bundles attribute stance to writers or readers, while impersonal bundles do not directly attribute stance to any individual.
Discourse organizers act as topic signals, displaying the relationships between prior and coming discourse. Topic introduction/focus bundles tell the reader that a new topic is being discussed or will be brought into focus. Topic elaboration/clarification bundles develop the prior topic through detailed explanation or added information.
Referential expressions refer to "physical or abstract entities, or to the textual context" (ibid, p.67). They are categorized into four types. Identification/focus bundles are used to mark an entity as especially important (e.g. those of you who) or to emphasize the main point (e.g. that's one of the). Imprecision bundles point to imprecise entities. Specification of attributes bundles describe quantities and tangible and intangible features. Time/place/text reference bundles refer directly to those aspects or are multi-functional.
In the analysis of functional distribution, each PS in the BERD corpus is examined in its concordance lines so that its specific context can be considered. On the basis of Biber's functional taxonomy, PSs in the BERD corpus are grouped into categories according to their most common discourse functions. Some PSs fulfill two functions. The results are shown in Table 6 and Figure 2.
Table 6: Functional classification of PSs in the BERD corpus
| Functional classification | Examples | % of all cases |
|---|---|---|
| 1. Stance expressions | | 11.35% |
| A. Epistemic stance | | |
| Impersonal | the fact that the, it is clear that, it is possible to | 2.85% |
| B. Attitudinal/modality stance | | |
| B1) desire | | |
| Impersonal | in order to achieve | 1.08% |
| B2) obligation/directive | | |
| Impersonal | it is important to, the importance of the, there is a need, it is necessary to, the need for a | 3.88% |
| B4) ability | | |
| Personal | to be able to | 1.60% |
| Impersonal | can be used to | 1.94% |
| 2. Discourse organizers | | 21.86% |
| A. Topic introduction/focus | the purpose of this, this paper is to, of this paper is (to), (the) purpose of this paper (is) (to), that there is a | 9.31% |
| B. Topic elaboration/clarification | on the other hand, as well as the, on the one hand, in addition to the, as a result of | 12.55% |
| 3. Referential expressions | | 60.79% |
| A. Identification/focus | one of the most, is one of the, an integral part of | 3.99% |
| B. Specification | | |
| B1) Quantity specification | per cent of the, a wide range of, a large number of, the rest of the, as part of the | 7.09% |
| B2) Tangible framing attributes | in the form of | 1.12% |
| B3) Intangible framing attributes | in the context of, on the basis of, in the case of, in terms of the, the nature of the, the results of the, the context of the, the way in which, in the process of, in relation to the, in a way that, as part of the, a high level of, is based on the, the extent to which, in the field of, in the area of, the role of the, the development of a, the development of the, in the development of, the success of the, the quality of the | 37.48% |
| C. Time/place/text reference | | |
| C1) Time reference | at the same time | 3.31% |
| C2) Place reference | at the university of | 1.01% |
| C3) Text deixis | as shown in figure, as shown in table, are shown in table, shown in table 1 | 4.13% |
| C4) Multi-functional | at the end of, the end of the | 2.66% |
| 4. Specific function | the implementation of the, the performance of the, the use of the | 2.74% |
| Others | can be contacted at, to the development of | 3.28% |
It is clear that referential expressions are far more frequent than all the other categories combined. PSs in the BERD corpus are used principally to make direct reference. They refer to the topic, quantity, tangible or intangible attributes, time, place and data in the text. Discourse organizers form the second largest category. PSs expressing stance are somewhat fewer than discourse organizers. Specific-function PSs and other expressions that do not belong to the above categories are relatively rare.


Stance expressions. Business English research discourse uses few PSs to express stance, most of which convey epistemic stance or obligation/directive. The three PSs expressing epistemic stance are the fact that the, it is possible to and it is clear that. They characteristically appeal to shared knowledge and employ an impersonal pronoun or noun. Viewed as a whole, stance expressions use an impersonal pronoun in order to avoid direct reference to the writer when commenting on the certainty of an argument and to obscure the source of epistemic judgments (Hyland, 2010). On the other hand, Hyland argues that explicit appeals to collective understandings are used more often in soft papers (ibid). By presupposing sharedness, writers may "smuggle contested ideas into their argument" (ibid, p.184). In example 6, the fact that students are overloaded as a result of time pressure is neither common sense nor explicitly presented. This fact is smuggled in to support the intensive nature of the learning experience that the writer puts forward. In example 7, the argument introduced by the PS it is clear that is an inference by the writer, who attempts to stress the importance of individuals, rather than an obvious fact. With the aid of PSs expressing epistemic stance, the writer tries to convince readers by presenting a personal argument as an accredited fact. Besides certainty, epistemic expressions also work as hedges that avoid absolute assurance when a claim is made, as with the PS it is possible to in example 8:
6) A measure of the intensity of the learning experience is given by the fact that the course workload is nominated by more than 80 per cent of the students as the “worst” aspect of the project. As one student said: The hours required to fulfill obligations and expectations exceed normal units and as a result priorities are constantly being reviewed, moved to accommodate "live" delays, outside priorities, etc. until there is no more time.
7) In the ST50 firms, the systems that emerged achieved coherence not because this was planned for and managed, but because the specific practices used were defined by and consistent with the cultures and values of the businesses' owners and subsequently with those of the businesses… In this sense, it is clear that the processes operating in the ST50 businesses do not simply reflect the agency of the entrepreneurs concerned. Certainly, individuals have been instrumental in establishing and developing these businesses, but to see their actions as being fundamentally causally significant, as some of the literature has assumed in the past, can hardly be valid.
8) “Narrative” modes seem unable to achieve the status of “good research”, even if one does accept that it is possible to derive acceptable theory from narrative representations.
Stance expressions are also frequently used to express obligation or directives. PSs such as it is important to, the importance of the, there is a need, the need for a and it is necessary to stress the necessity and importance of readers' cognitive acts. Hyland (ibid, p.185) considers that "cognitive acts guide readers through a line of reasoning, or get them to understand a point in a certain way". Examples 9 and 10 show the cognitive pattern by which the writer seeks to raise the readers' awareness. This pattern begins with reasons or facts, and proceeds to the conclusion that awareness is indispensable for fulfilling a task.
9) As Hitchens et al. (2004) point out, the problem is acute among SMEs because they are additionally handicapped by lack of information and resources to invest in green management. Thus, it is important to understand why and how SMEs adopt green management into business organizations.
10) Accordingly, all of the dualism-situations may be troublesome or cause a dilemma in the implementation of TQM. Therefore, there is a need to maintain internal harmony between management and employee expectations and perceptions.
Discourse organizers. It should be noted that PSs with a topic introduction/focus function (e.g., the purpose of this, this paper is to, of this paper is (to), (the) purpose of this paper (is) (to)), a subcategory of discourse organizers, also reflect the writer's intention. That is to say, in some cases these PSs may belong to both stance expressions and discourse organizers. In example 11, this paper is to, occurring in the abstract, introduces the topic of franchising. At the same time, it indicates that the following part is intended to elaborate on franchising. In example 12, of this paper is to identifies the topic, the application of the model, and at the same time signals the following action of reporting on it. PSs with dual functions are found to belong mainly to topic introduction/focus and intention/prediction at the same time. In contrast to topic-introducing PSs, elaboration/clarification PSs such as as well as the and in addition to the are used to add new information to the prior statement, while on the one hand and on the other hand show the two sides of a coin. Thus, PSs of topic elaboration/clarification serve to express the additive relationship between arguments.
11) The purpose of this paper is to highlight the business growth opportunities available from franchising in the UK and abroad.
12) The objective of this paper is to report on the implementation and evaluation of the suggested model as it was applied to retail micro-enterprises in Limerick City.
Referential expressions. Identification/focus expressions are used to draw attention to the element the writer wishes to emphasize. They identify and focus on an entity (as in example 13), an abstract object (as in examples 14 and 16) or a factor (as in example 15). PSs realizing this function have a structure in common: they all contain indefinite articles with an embedded of-phrase fragment.
13) Movies are considered as one of the most dynamic industries in many countries/cultures due to the emergence of various innovations and technologies, such as, digital production and exhibition, and newer retail formats, such as multiplexes or megaplexes (sci-tech-today.com, 2005; Jardin, 2005).
14) The development of a cultural dimensions typology is one of the major frameworks for understanding culture (Hofstede, 1980; Hofstede and Bond, 1984).
15) The mixture of structured and unstructured aspects of real-life processes is certainly one of the most important reasons for the rapidly growing interdependency between BI and portal technologies.
16) According to firms, communication is an integral part of the marketing effort.
Specifying attributes is the most common function of PSs in academic prose. Such PSs specify amounts (e.g., per cent of the, a wide range of, a large number of), describe tangible features (e.g., in the form of) and, most importantly, refer to intangible attributes. PSs of intangible framing attributes make direct reference to the scope of an argument (as in examples 17 and 18). These PSs pinpoint the context in which the argument is put forward. PSs with this function are mostly built on the PP-based structure in the + noun + of-phrase fragment (for instance, in the context of, in the case of, in the field of, in the area of). The correlation between discourse function and structural pattern is prominent in PSs that specify attributes.
17) Our objective is to evaluate certain of these proposals in the context of sales organizations in the UK, replicating and extending an empirical design used in earlier studies in the USA (Cravens et al., 1993; Oliver and Anderson, 1994) and in Australia (Cravens et al., 1992; Grant and Cravens, 1996).
18) All panelists had to be experts in the field of knowledge management.
PSs with a relative clause specify the manner in which things operate (as in examples 19 and 20). Moreover, the NP-based structure the + noun + of-phrase fragment, acting like an adjective, describes the quality and state of the following noun. The abstract features include the role, the nature, the result and the course of development of the object (e.g., the nature of the, the results of the, the role of the, the development of a, the development of the, the success of the). In addition, PSs of intangible framing attributes mark the logical relation between prior and coming clauses; for instance, on the basis of, is based on the, in terms of the and as a result of reflect the grounds on which an inference is made.
19) It examines the way in which leadership and management styles interact with the position, power and contribution of women in organizations, and contribute to gender stereotyping.
20) Local tactical activity is encouraged to help the business to learn, but at the same time the company ensures that such activity is carried out in a way that does not harm the global brand and strategy.
With regard to time/place/text reference, text deixis is relatively common in the BERD corpus compared with previous studies. In general, PSs such as as shown in figure, as shown in table, are shown in table and shown in table 1 are preferred in hard papers with a data orientation. However, soft papers in the BERD corpus are also found to rely on data to support the writer's argument. In this sense, business English research discourse shares with hard science its reliance on data.
Specific function. There is a group of PSs that denote an act in progress. The PSs the implementation of the, the performance of the and the use of the express the act imposed on an object. The implementation fragment is followed by nouns such as model, standard, strategy and system. The performance fragment modifies the acts of people, organizations and systems. The use fragment is flexible and describes various entities. Like the PSs referring to quality and state, these bundles share the framework the+noun+of. This suggests that PSs built on the the+noun+of framework have the function of specifying quality, state and an act in progress.
Chapter Five
CONCLUSIONS
5.1 Summary and Implications
The results of the analysis suggest that the PS is an indispensable building block of discourse because it provides a set of cohesive devices that link complex sentential components structurally and functionally. PSs are identified solely by the criterion of frequency, and thus they are not well-formed units in the view of traditional linguistics. Despite its grammatical incompleteness, a PS is still interpretable in its structure and function. It is structurally flexible and semantically transparent. It is structurally analyzable in that it is organized in an orderly way, with "frame" and "slot" as its two major parts and the "frame" acting as "a discourse anchor for the new information in the slot" (Biber et al. 2004).
Business English research discourse in the BERD corpus covers a relatively narrow range of PSs, which is characteristic of academic prose. Expert discourse, in particular, depends less on prefabricated structures so that it can organize arguments flexibly. NP-based and PP-based patterns are the common structural categories, in which the prevalent of-phrase fragment, functioning as the frame that modifies the coming new information, reflects the cautious limits of academic discourse. Discontinuous sequences are widespread in the BERD corpus. The framework the+noun+of in NP-based structures and the framework in+the+noun+of in PP-based structures are prominent and have their own functional correlates. The framework the+noun+of is associated with the function of specifying state, quality and dynamic process. The framework in+the+noun+of, on the other hand, is related to contextual limits. In addition, extraposed structures are often used to disguise authorial presence, so that the personal source is concealed when the validity of a claim is judged and directives are made.
PSs in the BERD corpus are used, to a large extent, to make direct reference, particularly to specify intangible framing attributes. Business English research discourse in the BERD corpus is data-based in that text deixis, often a characteristic of hard papers, is frequently used to refer to tables or figures in the text. Discourse organizers, the second largest functional category, either attend to the purpose of the paper or mark the logical relationship between arguments. They prepare readers for the coming argument and organize the text cohesively. In addition, PSs express impersonal epistemic stance with certainty or possibility. By resorting to shared knowledge, the author grounds a claim on an accredited fact rather than on an opinion, and the evidence supporting the claim appears undeniable and indisputable.
This thesis reflects the phraseological characteristics of expert discourse in business English. As a form/function composite, a PS provides a means of bridging two clauses and connecting two semantic units so as to shape the discourse idiomatically. Second or foreign language learners need to attend to PSs, which are easily overlooked but play a basic role in building discourse.
5.2 Limitations of the Thesis
This thesis has several limitations.
Firstly, because shorter structures are contained in longer ones and the data need to be kept within a manageable scope, this thesis is restricted to PSs of at least four words; two- and three-word PSs are excluded. However, shorter PSs play an important role in structuring discourse, and some previous studies have been devoted to them.
Secondly, the BERD corpus used in this thesis consists largely of discourse on business administration, which represents only a part of business English rather than the whole picture. Focusing on one professional field has its advantages, but it also means that the corpus shows only the tip of the iceberg. As specialized business English corpora develop, it will become possible to investigate the phraseological features of business English in a more comprehensive and representative way.
Thirdly, it is hard to compare the results of this thesis with those of previous studies. Comparing business English research discourse with academic discourse in other disciplines in terms of phraseological features would be valuable, but differences in the length of the sequences studied and in cut-off frequency make the results incomparable. Genre and register distinctions have been much investigated, whereas disciplinary analysis is still at an initial stage. Thus, a common standard that allows studies to be connected is urgently needed.
References
Alexander, R.J. (1992). Fixed expressions, idioms and phraseology in recent English learner’s dictionaries. Euralex ’92 Proceedings, 35-42.
Altenberg, B. (1998). On phraseology of spoken English: The evidence of recurrent word-combinations. In A.P. Cowie (Ed.), Phraseology: Theory, analysis, and application. Oxford: Clarendon Press. pp. 101-122.
Adel, A. & Erman, B. (2012). Recurrent word combinations in academic writing by native and non-native speakers of English: A lexical bundles approach. English for Specific Purposes, 31, 81-92.
Biber, D., Johansson, S., Leech, G., & Conrad, S. (1999). Longman grammar of spoken and written English. London: Longman.
Biber, D., Conrad, S., Reppen, R., Byrd, P. & Helt, M. (2002). Speaking and writing in the university: A multidimensional comparison. TESOL Quarterly, 36(1), 9-48.
Biber, D., Conrad, S., & Cortes, V. (2004). If you look at…: Lexical bundles in university teaching and textbooks. Applied Linguistics, 25(3), 371-404.
Biber, D. & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for Specific Purposes, 26, 263-286.
Cowie, A.P. (1998). Phraseology: Theory, analysis, and application. Oxford: Clarendon Press.
Cortes, V. (2004). Lexical bundles in published and student disciplinary writing: Examples from history and biology. English for Specific Purposes, 23, 379-423.
Conrad, S. & Biber, D. (2004). The frequency and use of lexical bundles in conversation and academic prose. Lexicographica, 20, 56-71.
Chen, Y. H. & Baker, P. (2010). Lexical bundles in L1 and L2 academic writing. Language Learning & Technology, 14, 30-49.
Deignan, A. (2000). Book review. Journal of Pragmatics, 32, 1241-1246.
Glaser, R. (1998). The stylistic potential of phraseological units in the light of genre analysis. In A.P. Cowie (Ed.), Phraseology: Theory, analysis, and application. Oxford: Clarendon Press. pp. 123-144.
Granger, S. (1998). Prefabricated patterns in advanced EFL writing: Collocations and formulae. In A.P. Cowie (Ed.), Phraseology: Theory, analysis, and application. Oxford: Clarendon Press. pp. 145-160.
Gavioli, L. (2005). Exploring corpora for ESP learning. Amsterdam: John Benjamins.
Howarth, P. (1998). Phraseology and second language proficiency. Applied Linguistics, 19(1), 24-44.
Hyland, K. (2008a). As can be seen: Lexical bundles and disciplinary variation. English for Specific Purposes, 27, 4-28.
Hyland, K. (2008b). Academic clusters: Text patterning in published and postgraduate writing. International Journal of Applied Linguistics, 18, 41-62.
Hyland, K. (2010). Stance and engagement: A model of interaction in academic discourse. Discourse Studies, 7, 173-192.
Herbel-Eisenmann, B., Wagner, D., & Cortes, V. (2010). Lexical bundle analysis in mathematics classroom discourse: The significance of stance. Educational Studies in Mathematics, 75, 23-42.
Howarth, P.A. (1996). Phraseology in English academic writing. Tübingen: Max Niemeyer Verlag.
Leech, D. (1994). Lexical Phrases and Language Teaching by James R. Nattinger and Jeanette S. DeCarrico. Issues in Applied Linguistics, 1, 160-165.
Liu, D. (2012). The most frequently-used multi-word constructions in academic written English: A multi-corpus study. English for Specific Purposes, 31, 25-35.
Montgomery, S. (1996). The Scientific Voice. New York: Guilford Press.
Moon, R. (1998). Frequencies and forms of phrasal lexemes in English. In A.P. Cowie (Ed.), Phraseology: Theory, analysis, and application. Oxford: Clarendon Press. pp. 77-100.
Moon, R. (1998). Fixed expressions and idioms in English. Oxford: Clarendon Press.
Nattinger, J.R. & DeCarrico, J.S. (1992). Lexical phrases and language teaching. Oxford: Oxford University Press.
Peacock, M. (2012). High-frequency collocations of nouns in research articles across eight disciplines. Ibérica, 23, 29-46.
Renouf, A. & Sinclair, J. (1991). Collocational frameworks in English. In K. Aijmer & B. Altenberg (Eds.), English corpus linguistics: Studies in honour of Jan Svartvik. London: Longman. pp. 128-143.
Scott, M. (2010). WordSmith Tools version 5.0. Liverpool: Lexical Analysis Software.
Sinclair, J. (1991). Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Swales, J. M. (1996). Genre analysis: English in academic and research settings. Cambridge: Cambridge University Press.
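Wei, N. X. (2007). A study of the phraseological features of Chinese students’ spoken English: Evidence from word chunks in the COLSEC corpus [中国学生英语口语的短语学特征研究: COLSEC语料库的词块证据分析]. Modern Foreign Languages (Quarterly) [现代外语], 30(3), 280-291.
Gui, S. C. (2009). A corpus-based register analysis of English linguistics [基于语料库的英语语言学语体分析]. Beijing: Foreign Language Teaching and Research Press.
Wei, N. X. (2002). The definition of collocation [词语搭配的界定]. Shanghai: Shanghai Jiao Tong University Press.
Jiang, Y. J. & Zhao, G. (2006). Linguistic research on academic discourse: Division of schools and integration of methods [学术语篇的语言学研究: 流派分野和方法整合]. Foreign Languages Research [外语研究], (6), 1-5.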