Morpho-syntactic Variation in World Englishes – The Corpus-Based Approach
While the core grammar of standard and standardising World Englishes (WEs) is shared, it is not a monolithic entity, but one that shows variability. Moreover, language-internal factors of variation interact with factors like text type (news vs. academic writing) or mode (speech vs. writing). From the late 1970s, WEs research has worked towards a systematic description of this diverse range of Englishes and their variable grammars. From around 1995 onwards, with the advent of computer corpora specifically compiled for the purpose of comparing (standard/acrolectal) varieties of English in their spoken and written form and across a common set of text categories, research into WEs has increasingly been corpus-based.
The lecture will draw on material from the International Corpus of English (ICE, see Greenbaum 1996) and the Global Web-based English (GloWbE) corpus (see Davies & Fuchs 2015). The ICE components are standard one-million-word reference corpora of first- and second-language varieties of English sampling both spoken and written material. GloWbE, a web-based text collection which runs to almost 2 billion words of text from six ENL, thirteen ESL and one ESD varieties of English, is a less carefully compiled but useful complementary source of data for more infrequent morphosyntactic phenomena. The majority of the material in GloWbE (60%) is taken from informal blogs, the remainder from a cross-section of other text types found on the internet.
Drawing on case studies of morphological and syntactic variation in WEs, the lecture will illustrate how corpus-based research proceeds, from the definition of the envelope of variation via data retrieval and analysis to statistical modeling of the variable grammar(s).
Englishes in an age of reimagining: Perspectives from urban, online, multicultural Asia
In what is viewed as the second phase of postmodernity, in particular as defined by digitality, we are afforded challenges and opportunities for reimagining once-imagined English communities. I identify two phenomena that I suggest need to be appreciated for their significance in the language contact dynamics of this era and their impact on the evolution and conceptualisation of Englishes in multicultural contexts.
I consider peripheral or minority communities who have shifted from their ancestral vernacular to the colonial language they accessed, English – and I use as illustration the erstwhile Baba-Malay-speaking Peranakans of Singapore. With such communities, while their emergent variety, considered a lesser-known variety of English, is usually described as displaying both acrolectal as well as vernacular features, its characteristic as a mixed code is often mentioned but not afforded sufficient scholarly attention as a fundamental dimension of the Englishes of such contact communities.
I also consider communities who, even with English as a post/colonial language, are essentially other-language-dominant – here I use Cantonese-dominant Hongkongers as a case in point. The challenge in such contexts is that, for a new variety of English to genuinely emerge as a variety, it has to be used widely and spontaneously in a society, for internally driven norms to emerge. In such contexts, it is computer-mediated communication (CMC) which serves as a vital platform and catalyst. CMC of multilingual communities favours the use of English, promotes significantly more code mixing with and calquing into English compared to spoken discourse, and prompts subsequent spread to other domains. Diasporic web-based communities of practice use their contact language variety more than in traditional writing or spoken face-to-face interaction. Such a platform and its practices support the evolution and positioning of multilingual English varieties.
What were traditionally language communities on the margins of study in the linguistics of English – those engaging in mixed language practices, developing multicultural New Englishes – are in a time of reimagining, in practice and in scholarship, and need to be valued for what they can reveal about language contact dynamics, diversity, evolution, and authenticity in the study of Englishes.
Do dialects really change or speakers simply vary? Challenges for sociolinguistic fieldwork and interpretation
Tapping into and assessing the sociolinguistic validity of spoken language has been a major concern for our discipline ever since the 1960s. Beginning with Labov’s early (1972) work and the quest for the vernacular, it was recognized that ‘there are no single-style speakers.’ As a consequence, the ever-varying sociolinguistic repertoire of individuals represents a somewhat slippery ground for developing and testing models of variation and change, as it juxtaposes individual (context/projected person/topic-based) and community-wide variation, which is the axiom for language change (cf. Rickford & Price's 2013 discussion of what they refer to as “stylistic chameleons”).
The main focus of the paper will be on context-related usage shifts, or better: the varying usage of sociolinguistic variables in different interview contexts and its importance for interpretation and theoretical assessment. I will present and discuss quantitative evidence that non-mobile older rural males (the so-called NORMs, well-known from traditional dialectology and especially singled out in fieldwork for major projects such as the Linguistic Atlas of England) vary and shift according to external interview parameters, so that even highly isolated elderly speakers are context-sensitive and vary in their vernacular usage under different recording conditions. A conglomerate of criteria related to the interviewer, topics discussed, and place of interview thus entails considerable individual variability, which raises important issues both for methodology and interpretation.
Edgar W. Schneider
World Englishes as a scholarly discipline: Evolution and current challenges
This lecture surveys and comments on the growth of the scholarly discipline of World Englishes and its main developmental strands with respect to theories and models, methodological approaches, and driving concepts.
The 1970s saw an early and marginalized awareness of the fact that new varieties of English were emerging in former colonies and beyond – in Strang’s (1970) A/B/C-classification of countries (later translated into the notions of ENL/ESL/EFL and into Kachru’s Three Circles), and in isolated publications on distinctive properties of select Asian and African varieties. In the early 1980s the new discipline as such was born – with some collective volumes, two journals founded, and activities broadening, including ultimately the foundation of the International Association for World Englishes (IAWE). Very broadly, the growth of the discipline in those days followed two strands. Kachru’s school, with the “Three Circles” model as its core and a broad orientation towards linguistic, political and cultural issues, has come to be hugely influential. In Europe, a more empirically descriptive and sociohistorically and sociolinguistically oriented approach was inspired by scholars such as Manfred Görlach and Tom McArthur.
Synchronic model-making gave way to a more diachronic orientation, mainly manifested in the “Dynamic Model” of the evolution of Postcolonial Englishes by Edgar Schneider in the early 2000s, and later expansions and adaptations (like the “Extra- and Intra-Territorial Forces” model or the concept of “Transnational Attraction”). In addition, applied approaches, focused on teaching strategies, and critical thinking on the global role of English have constantly been visible.
Methodological advances moved from haphazard to more systematically collected observations and studies of “distinctive features” and have come to include systematic fieldwork-based studies and, most importantly, the compilation of partly balanced, partly huge electronic corpora (ICE, GloWbE). The quantitative approach inherent in the latter inspired a substantial increase in the statistical sophistication of analytical procedures. A fusion of corpus methodology and diachronic thinking has recently resulted in systematic activities to compile diachronic corpora representing different World Englishes.
Recent trends, occasionally labeled a “post-postcolonial approach”, include an increased focus on what happens to English in EFL contexts (and a blurring of the once firm distinctions between ENL, ESL and EFL), and a tendency to dismiss national perspectives and to highlight transnational diffusion, informal lingua franca interactions and a “grassroots growth” of English in some contexts.
All these steps and developments are briefly summarized, critically discussed and put in perspective.
Analysing World Englishes in Praat (Alexander Kautzsch)
The realization of vowels in World Englishes is highly variable. Words like goat may have a monophthong in speakers from India or Scotland; Canadians may exhibit Canadian raising in words like mouth or price; New Zealanders are said to have a very close vowel in words like web or dress, while L1 Afrikaans speakers from Namibia will have a schwa in words like bit. In most cases, an auditory analysis will not be objective enough to describe the differences across or the peculiarities of varieties properly. Instead it might be reasonable to resort to an acoustic analysis of vowels using, for example, PRAAT (www.praat.org), one of the most popular phonetics software packages available.
The focus in this hands-on workshop, which is designed as a beginner’s guide for participants without any previous knowledge, will be on how to measure the quantity and quality of vowels and on how to create vowel plots. We will start with a brief introduction to acoustic phonetics in general. Then we will investigate the basic functions of PRAAT (opening sound files, manual annotation of sound files, manual and semi-automatic measuring of vowel duration and formant frequencies, etc.) and see how to collect and manage vowel data. Finally, we will turn to creating vowel plots, which can be done in a variety of ways. Here, we will first take a “quick and dirty” approach using the online tool NORM (http://lingtools.uoregon.edu/norm/index.php) and then also go into the details of more sophisticated options like PRAAT and the statistics package R.
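Before formant values from different speakers are plotted together, they are often normalized to reduce differences that stem from vocal tract size. The following is a minimal sketch of one widely used procedure, Lobanov's z-score normalization, written in Python rather than in PRAAT or R. The speaker labels, lexical sets and formant values are invented for illustration, the function name lobanov is hypothetical, and the use of the population standard deviation is an assumption of this sketch.

```python
from statistics import mean, pstdev

# Hypothetical formant measurements in Hz: (speaker, lexical set, F1, F2).
# All values are invented for illustration only.
measurements = [
    ("spk1", "FLEECE", 300.0, 2300.0),
    ("spk1", "TRAP", 700.0, 1700.0),
    ("spk1", "GOOSE", 320.0, 900.0),
    ("spk2", "FLEECE", 350.0, 2700.0),
    ("spk2", "TRAP", 850.0, 2000.0),
    ("spk2", "GOOSE", 380.0, 1100.0),
]


def lobanov(rows):
    """Return the rows with F1 and F2 replaced by per-speaker z-scores."""
    # Collect each speaker's F1 and F2 values.
    by_speaker = {}
    for speaker, _, f1, f2 in rows:
        values = by_speaker.setdefault(speaker, {"f1": [], "f2": []})
        values["f1"].append(f1)
        values["f2"].append(f2)

    # Per-speaker mean and (population) standard deviation for each formant.
    stats = {
        spk: (mean(v["f1"]), pstdev(v["f1"]), mean(v["f2"]), pstdev(v["f2"]))
        for spk, v in by_speaker.items()
    }

    # z = (value - speaker mean) / speaker standard deviation
    out = []
    for speaker, vowel, f1, f2 in rows:
        m1, s1, m2, s2 = stats[speaker]
        out.append((speaker, vowel, (f1 - m1) / s1, (f2 - m2) / s2))
    return out


normalized = lobanov(measurements)
for speaker, vowel, z1, z2 in normalized:
    print(f"{speaker}\t{vowel}\t{z1:+.2f}\t{z2:+.2f}")
```

The normalized values can then be plotted with F2 on a reversed horizontal axis and F1 on a reversed vertical axis, which is the conventional orientation of a vowel plot.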
The workshop will take place in a computer room where participants can use desktop PCs (Windows 10); alternatively, it might be more convenient to bring your own notebooks. Software required: PRAAT (www.praat.org), spreadsheet software (like MS Excel or LibreOffice Calc), the statistics package R (https://cran.r-project.org/), and a text editor (preferably Notepad++, https://notepad-plus-plus.org/).
Collecting and analysing diachronic data in World Englishes (Thorsten Brato)
The study of diachronic data to uncover processes of the linguistic evolution of World Englishes is still a relatively small, yet growing field of research. The aim of this workshop is to provide participants with a toolkit to collect and analyse historical and diachronic data of New Englishes. In a first step I will provide an overview of the current state of research in the field. On the level of written language, a major approach has been the compilation and analysis of historical and diachronic corpora of World Englishes, but collections of letters have also provided a fruitful source for analyses. Much less attention has been given to spoken data so far, as such data is even scarcer and more difficult to analyse.
We will then turn to the more practical questions of how to collect and analyse historical postcolonial English data. Based on the experiences of compiling the “Historical Corpus of English in Ghana” (HiCE Ghana) I will try to guide participants through the theoretical and methodological considerations that need to go into the compilation of an ‘old’ corpus of a ‘new’ English and suggest some guidelines, resources and tools I found useful in the creation of HiCE Ghana. In the third part, the focus will be on the collection and analysis of spoken data. What kind of resources are available? How do I get hold of them? Which steps must be taken to digitise and analyse old speech data? For the final part, participants are encouraged to bring their own data and ideas for the study of historical and diachronic data which we can discuss in the group. Participants should bring their own laptops and head-/earphones.
Collecting field data in World Englishes (Sarah Buschfeld)
Initially as a result of British imperialism and currently driven by forces of globalization, English has undergone unprecedented spread and change around the globe. With virtually unlimited ways of communication on both the professional and the private level for which English functions as the lingua franca, the cultural, economic, and political hegemony of the United States, and the excellent reputation of English as the language of personal advancement and better job opportunities, this is more valid today than ever before. The spread of English has long moved beyond the former colonies of the British Empire and can today be observed in many different countries without a colonial background and in many different functions, i.e. as a first, second, and foreign language, but also in "emergent contexts" (Schneider 2014: 18-26), e.g. as grassroots forms, mixed codes, hybrid varieties or cyber Englishes. Due to these fast-changing linguistic realities, young researchers investigating these forms are – more than ever – required to collect their own, context-specific data.
This two-day workshop introduces up-to-date methods for data collection in the field of World Englishes. We will look into both established and traditional approaches to investigating synchronic variation which have proven valuable in earlier studies, with a special focus on designing and conducting interviews and questionnaires, but also consider methodologies originating in other linguistic subdisciplines (e.g. collecting acceptability judgment and other kinds of data following the psycholinguistic paradigm, i.e. by means of psycholinguistic experiments) and how they can be utilized within the World Englishes paradigm. By means of these methods, we will consider the collection of data for both grammatical and acoustic analyses and briefly touch on using web-based data, which presents researchers with yet another challenge as this kind of data often represents a communicative level to be located between speech and writing. For all these methods, we will look into questions of how to collect and analyze such data from a theoretical perspective and based on my own empirical experience. The latter aspect will also include an experience-based overview of the dos and don'ts of data collection and some advice on how to best overcome common data collection obstacles. In general, this workshop is designed to consider the needs and questions of the participants, i.e. participants are encouraged to state their concerns and problems / ask questions prior to the workshop, and bring their own data and ideas, if possible.
Corpus data: from compilation to statistical modeling (Sandra Götz / Tobias Bernaisch)
The aim of this two-day workshop is to give participants a hands-on introduction to the theoretical and practical aspects of corpus linguistics, from corpus compilation and annotation through data extraction to statistical analysis and visualization. In the first session, we will start off by introducing participants to the principles of corpus compilation and different ways of data annotation. Here, we will particularly focus on the notions of representativeness, authenticity, balance and ethics in corpus compilation and introduce different types of software for transcribing collected data. We will also show participants different ways of adding structural mark-up (i.e. information on the text/speech that was transcribed, such as self-corrections, pausing, laughter, text units, quotations, etc.) and Part-of-Speech tags (i.e. adding word-class information to each word in the corpus) to the data and mention the possibilities of parsing. During sessions two and three, we will work with corpora and explore different ways of analysing them by introducing standard corpus-linguistic functions, such as wordlists, concordances and n-grams, available in different corpus analysis tools such as WordSmith Tools (Scott 2016) or AntConc (Anthony 2016). We will also show participants how to design a spreadsheet meaningfully to prepare it for statistical analysis. In the fourth session, we will round off this workshop by introducing participants to statistical analyses and data visualization in the statistical software package R. By the end of this workshop, participants are expected to have gained knowledge of corpus compilation as well as hands-on experience in data analysis and modelling, so that they should be prepared to do their own corpus analyses.
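To make the three standard functions named above concrete, here is a minimal Python sketch of a wordlist, an n-gram count and a keyword-in-context (KWIC) concordance. It is not how WordSmith Tools or AntConc are implemented; the function names, the simple tokenizer and the sample sentence are assumptions made purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def ngrams(tokens, n):
    """Return all contiguous sequences of n tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def concordance(tokens, node, width=3):
    """Return (left context, node, right context) for every hit of node."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text standing in for a corpus file.
sample = ("She has been working here since morning. "
          "He has been waiting since noon.")

tokens = tokenize(sample)
wordlist = Counter(tokens)            # frequency of each word form
bigrams = Counter(ngrams(tokens, 2))  # frequency of each two-word sequence
kwic = concordance(tokens, "since")   # all hits of the node word in context

print(wordlist.most_common(3))
print(bigrams.most_common(1))
for left, node, right in kwic:
    print(f"{left:>20}  [{node}]  {right}")
```

Each concordance line can be copied into one spreadsheet row, with further columns added for the variables to be coded before statistical analysis.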
Transcribing and annotating data in ELAN (Naomi Nagy)
ELAN (https://tla.mpi.nl/tools/tla-tools/elan/, Wittenburg et al. 2006), a cross-platform freeware application, is well-established as a valuable tool for language documentation. ELAN is frequently used for transcription and multi-tier mark-up illustrating levels of linguistic structure as well as translations and glosses. Prior to the implementation of ELAN, it was common for sociolinguists to use multiple software applications, and consequently multiple formats, along the route from recording participants to conducting statistical analyses of the data. Nagy & Meyerhoff (2015) present a method which allows for transcription, extracting, coding, preparation for statistical analysis, calculation of some basic frequency statistics, and creation of a concordance all within one program.
After providing an overview of ELAN’s utility, we will focus on extracting (marking) and coding tokens of linguistic variables for quantitative analysis in the variationist sociolinguistic framework. This seamless connection between recording, transcript and coding of dependent and independent variables improves consistency, efficiency, utility, reliability and the accountability of our coding to the original recording. I will illustrate a range of benefits and include step-by-step instructions accompanied by downloadable sample files to illustrate each step of the process (http://projects.chass.utoronto.ca/ngn/zip/Celeste_for_ELAN.zip).
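To illustrate what basic frequency statistics over coded tokens can look like, the following Python sketch cross-tabulates a dependent variable against one independent variable. It does not reproduce ELAN's own procedure or export format: the tab-delimited layout, the column names (speaker, variant, following_segment), the coded values and the helper crosstab are all hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical tab-delimited table of coded tokens.
# Column names and values are invented for illustration.
export = """speaker\tvariant\tfollowing_segment
S01\tdeleted\tconsonant
S01\tretained\tvowel
S01\tdeleted\tconsonant
S02\tretained\tvowel
S02\tdeleted\tvowel
S02\tretained\tconsonant
"""


def crosstab(rows, factor, dependent):
    """Count the values of the dependent variable per level of the factor."""
    table = defaultdict(Counter)
    for row in rows:
        table[row[factor]][row[dependent]] += 1
    return table


rows = list(csv.DictReader(io.StringIO(export), delimiter="\t"))
table = crosstab(rows, "following_segment", "variant")

# Print counts and proportions of each variant within each factor level.
for level, counts in sorted(table.items()):
    total = sum(counts.values())
    for variant, n in sorted(counts.items()):
        print(f"{level}\t{variant}\t{n}/{total}\t{n / total:.0%}")
```

A table of this kind is typically the first check on coded data before any multivariate analysis is attempted.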
Opening Talk (Naomi Nagy)