APS

Observation

The littlest linguists: new research on language development.

  • Bilingualism
  • Developmental Psychology
  • Language Development

How do children learn language, and how is language related to other cognitive and social skills? For decades, the specialized field of developmental psycholinguistics has studied how children acquire language (or multiple languages), taking into account biological, neurological, and social factors that influence linguistic development and, in turn, can play a role in how children learn and socialize. Here’s a look at recent research (2020–2021) on language development published in Psychological Science.

Preverbal Infants Discover Statistical Word Patterns at Similar Rates as Adults: Evidence From Neural Entrainment

Dawoon Choi, Laura J. Batterink, Alexis K. Black, Ken A. Paller, and Janet F. Werker (2020)

One of the first challenges faced by infants during language acquisition is identifying word boundaries in continuous speech. This neurological research suggests that even preverbal infants can learn statistical patterns in language, indicating that they may have the ability to segment words within continuous speech.

Using electroencephalogram measures to track infants’ ability to segment words, Choi and colleagues found that 6-month-olds’ neural processing increasingly synchronized with the newly learned words embedded in speech over the learning period in one session in the laboratory. Specifically, patterns of electrical activity in their brains increasingly aligned with sensory regularities associated with word boundaries. This synchronization was comparable to that seen among adults and predicted future ability to discriminate words.

These findings indicate that infants and adults may follow similar learning trajectories when tracking probabilities in speech, with both groups showing a logarithmic (rather than linear) increase in the synchronization of neural processing with frequent words. Moreover, speech segmentation appears to use neural mechanisms that emerge early in life and are maintained throughout adulthood.

Parents Fine-Tune Their Speech to Children’s Vocabulary Knowledge

Ashley Leung, Alexandra Tunkel, and Daniel Yurovsky (2021)

Children can acquire language rapidly, possibly because their caregivers use language in ways that support such development. Specifically, caregivers’ language is often fine-tuned to children’s current linguistic knowledge and vocabulary, providing an optimal level of complexity to support language learning. In their new research, Leung and colleagues add to the body of knowledge involving how caregivers foster children’s language acquisition.

The researchers asked individual parents to play a game with their child (age 2–2.5 years) in which they guided their child to select a target animal from a set. Without prompting, the parents provided more informative references for animals they thought their children did not know. For example, if a parent thought their child did not know the word “leopard,” they might use adjectives (“the spotted, yellow leopard”) or comparisons (“the one like a cat”). This indicates that parents adjust their references to account for their children’s language knowledge and vocabulary—not in a simplifying way but in a way that could increase the children’s vocabulary. Parents also appeared to learn about their children’s knowledge throughout the game and to adjust their references accordingly.

Infant and Adult Brains Are Coupled to the Dynamics of Natural Communication

Elise A. Piazza, Liat Hasenfratz, Uri Hasson, and Casey Lew-Williams (2020)

This research tracked real-time brain activation during infant–adult interactions, providing an innovative measure of social interaction at an early age. When communicating with infants, adults appear to be sensitive to subtle cues that can modify their brain responses and behaviors to improve alignment with, and maximize information transfer to, the infants.

Piazza and colleagues used functional near-infrared spectroscopy—a noninvasive measure of blood oxygenation resulting from neural activity that is minimally affected by movements and thus allows participants to freely interact and move—to measure the brain activation of infants (9–15 months old) and adults while they communicated and played with each other. An adult experimenter either engaged directly with an infant by playing with toys, singing nursery rhymes, and reading a story or performed those same tasks while turned away from the child and toward another adult in the room.

Results indicated that when the adult interacted with the child (but not with the other adult), the activations of many prefrontal cortex (PFC) channels and some parietal channels were intercorrelated, indicating neural coupling of the adult’s and child’s brains. Both infant and adult PFC activation preceded moments of mutual gaze and increased before the infant smiled, with the infant’s PFC response preceding the adult’s. Infant PFC activity also preceded an increase in the pitch variability of the adult’s speech, although no changes occurred in the adult’s PFC, indicating that the adult’s speech influenced the infant but probably did not influence neural coupling between the child and the adult.

Theory-of-Mind Development in Young Deaf Children With Early Hearing Provisions

Chi-Lin Yu, Christopher M. Stanzione, Henry M. Wellman, and Amy R. Lederberg (2020)

Language and communication are important for social and cognitive development. Although deaf and hard-of-hearing (DHH) children born to deaf parents can communicate with their caregivers using sign language, most DHH children are born to hearing parents who do not have experience with sign language. These children may have difficulty with early communication and experience developmental delays. For instance, the development of theory of mind—the understanding of others’ mental states—is usually delayed in DHH children born to hearing parents.

Yu and colleagues studied how providing DHH children with hearing devices early in life (before 2 years of age) might enrich their early communication experiences and benefit their language development, supporting the typical development of other capabilities—in particular, theory of mind. The researchers show that 3- to 6-year-old DHH children who began using cochlear implants or hearing aids earlier had more advanced language abilities, leading to better theory-of-mind growth, than children who started using hearing provisions later. These findings highlight the relationships among hearing, language, and theory of mind.

The Bilingual Advantage in Children’s Executive Functioning Is Not Related to Language Status: A Meta-Analytic Review

Cassandra J. Lowe, Isu Cho, Samantha F. Goldsmith, and J. Bruce Morton (2021)

A common idea is that bilingual children, who grow up speaking two languages fluently, perform better than monolingual children in diverse executive-functioning domains (e.g., attention, working memory, decision making). This meta-analysis calls that idea into question.

Lowe and colleagues synthesized data from studies that compared the performance of monolingual and bilingual participants between the ages of 3 and 17 years in executive-functioning domains (1,194 effect sizes). They found only a small effect of bilingualism on participants’ executive functioning, which was largely explained by factors such as publication bias. After accounting for these factors, bilingualism had no distinguishable effect. The results of this large meta-analysis thus suggest that bilingual and monolingual children tend to perform at the same level in executive-functioning tasks. Bilingualism does not appear to boost performance in executive functions that serve learning, thinking, reasoning, or problem solving.

  • Review Article
  • Open access
  • Published: 08 March 2023

Changing perceptions of language in sociolinguistics

Jiayu Wang, Guangyu Jin & Wenhua Li

Humanities and Social Sciences Communications volume 10, Article number: 91 (2023)

  • Language and linguistics

This paper traces the changing perceptions of language in sociolinguistics. These perceptions of language are reviewed in terms of language in its verbal forms, and language in vis-à-vis as a multimodal construct. In reviewing these changing perceptions, this paper examines different concepts or approaches in sociolinguistics. By reviewing these trends of thought and their applications, this article intends to shed light on ontological issues such as what constitutes language, and where its place is in multimodal practices in sociolinguistics. Expanding the ontology of language from verbal resources toward various multimodal constructs has enabled sociolinguists to pursue meaning-making, indexicalities and social variations in their most authentic state. Language in a multimodal construct entails boundaries and distinctions between various modes, while language as a multimodal construct sees language itself as multimodal; it focuses on the social constructs, social meaning and language as a force in social change rather than the combination or orchestration of various modes in communication. Language as a multimodal construct has become the dominant trend in contemporary sociolinguistic studies.

Introduction

This article will review a range of sociolinguistic concepts and their applications in multimodal studies, in relation to how language has been conceptualized in sociolinguistics. While there are reviews of specific areas of research in sociolinguistics, including prosody and sociolinguistic variation (Holliday, 2021), language and masculinities (Lawson, 2020), and language change across the lifespan (Sankoff, 2018), few review works have set out to address the most fundamental ontological questions in sociolinguistic studies: what is language, and what constitutes it? How do sociolinguists perceive language in relation to other semiotic resources that are part and parcel of social meaning-making and social interaction? Relevant discussions are scattered, mostly in passing, in the introductory sections of various sociolinguistic works, such as Blommaert (1999), García and Li (2014) and Makoni and Pennycook (2005). However, no review article has systematically dealt with the changing perceptions of language in sociolinguistic studies.

These issues are worth pursuing because, although sociolinguistics studies language, no review has examined what actually constitutes language, especially in relation to a wider range of semiotic resources. The review is all the more imperative because, in an increasingly globalized and high-tech world, linguistic practices are complicated by the super-diversity of ethnic fluidity, communications technologies, and globalized cross-cultural art.

Centring on the ontological perception of language in sociolinguistics, this article consists of five sections. After the “Introduction” section, the next section will review traditional (socio)linguistic perceptions of language as written or spoken signs or symbols that people use to communicate or interact with each other. The next section will review representative sociolinguistic approaches that place language in multimodal settings which involve the relationship between language and other semiotic resources. They are categorized as the conceptualizations of “language in multimodal construct” and “language as multimodal construct”. These conceptualizations share the common feature that language is not researched merely in terms of written and spoken signs and symbols, but it is probed (1) in relation to its multimodal contexts and (re)contextualization (regarding language in multimodal construct), (2) in terms of its own materiality and spatiality, and linguistic representations of multimodality, for instance, social (inter)action and “smellscapes” (Pennycook and Otsuji, 2015a ) which are in turn conflated with linguistic features (regarding language as multimodal construct). The penultimate section and the last section will present a critical reflection and a conclusion of the review, respectively.

Language as written and spoken signs and symbols

What constitutes language(s)? Saussure ( 1916 ) distinguishes between langue and parole. The former refers to the abstract, systematic rules and conventions of the signifying system, while the latter represents language in daily use. Chomsky ( 1965 ) refers to them as competence (corresponding to langue) and performance (corresponding to parole). Chomsky ( 1965 ) assumes that performance is bound up with “grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of this language in actual performance” (Chomsky, 1965 , pp. 3–4). He advocates that the agenda of linguistics should be the study of competence of “an ideal speaker-listener, in a completely homogeneous speech-community, who knows its (the speech community’s) language perfectly” (in brackets original). His conception of the ideal language rules out the “imperfections” arising from the influences of social or pragmatic dimensions in real language use. This can be seen as the conception of language as innate human competence. By contrast, constructionists have argued that language cannot be separated from the societal and social domain; social reality is constructed through languages (Berger and Luckmann, 1966 ), and linguistics should take social dimensions into account, as shown by Systemic Functional Linguistics developed by Halliday. These approaches to language studies, nevertheless, do not pay much attention to the ontological issues of language or linguistics concerning what constitutes language, whether languages can be separated from each other, and whether there are different conceptions of language(s).

Sociolinguistics, taking as its departure an interdisciplinary attempt to be the sociology regarding linguistic issues or linguistics regarding sociological issues, faces the ambivalent positioning of whether it should be sociologically oriented (that is, more explanatory) or linguistically oriented (that is, more descriptive) (Cameron, 1990 ). Also, there are contentions regarding whether more attention should be paid to epistemically linguistic minutiae (as in conversation analysis or CA), or to the macro-social interpretation of ideology not necessarily dependent on the evident orientation of the participants (as in critical discourse analysis, or CDA), as debated in Blommaert ( 2005 ) and Schegloff ( 1992 , 1998a / 1998b , 1999 ). As such, more sociolinguists than linguists in other disciplines are concerned with the ontology of language regarding its nature and its relation with broader social structures. In other words, such concerns can, firstly, justify the identity of sociolinguistics being either a branch of sociology, or linguistics, or even more broadly, anthropology. They can also delineate the contour of the macro vis-à-vis micro research subjects: are languages seen as separate systems, or inseparable but relatively fixed systems or an integrated construction in relation to their social dimensions of power, ideology and hegemony?

Such ontological concerns are important, because different approaches to research may be engendered accordingly. For instance, variational sociolinguistics is concerned with the linguistic differences within a language (standard language vis-à-vis its variations in dialects) and examines how these differences are linked to social aspects of linguistic practices, such as gender and social status. These differences within a certain category of language may be placed in the changing situations of various language communities or areas (e.g., Labov, 1963 , 1966 ), or in contextualized pragmatic situations (Agha, 2003 ; Eckert, 2008 ). Assumptions of separable or separate languages may be well-encapsulated in the works regarding language ideology and linguistic differentiation, such as the studies by Kroskrity ( 1998 ), Irvine and Gal ( 2000 ), as well as considerable other works on bilingualism or multilingualism. These works treat language as belonging to different standard systems (e.g., English, French, German, and so on) and can be pursued by “enumerating” these categories. In other words, these standard language systems are seen as having clear boundaries between them, and language can be researched by attributing different linguistic resources to (one of) these systems. The stance of the inseparability of language problematizes the enumeration of languages, by discrediting their explanatory potential in linguistic practices. In pedagogical contexts, transnational students are found using language features beyond the boundaries of language systems (Creese and Blackledge, 2010 ; Lewis et al., 2012 ). In the context of youth or urban culture, there are loosely fixed assumptions between language and ethnicity (Maher, 2005 ; Woolard, 1999 ). In some globalized contexts, new communications technologies as well as globalization itself are changing the traditional power structure in linguistic practices (Jacquemet, 2005 ; Jørgensen, 2008 ; Jørgensen et al., 2011 ). 
Furthermore, Makoni and Pennycook (2005), by advocating the disinvention of languages, problematize the process of “historical amnesia” (Makoni and Pennycook, 2005, p. 149) of bi- and multilingualism, and their tradition of enumerating languages, which reduces sociolinguistics to at best a “pluralization of monolingualism” (Makoni and Pennycook, 2005, p. 148). However, this does not mean that languages cannot be probed as standard categories. The disinvention stance is more intricate: on the one hand, it problematizes the separation of languages, as language is characterized by fluidity in multi-ethnic settings; on the other hand, it assumes the fixity of the relationship between a given (standard) language and its corresponding identity, ethnicity, and other societal factors (Otsuji and Pennycook, 2010). Fluidity and fixity, however, are not binary attributes that exclude each other; they coexist and mutually influence each other in real-life linguistic practices. By the same token, Blackledge and Creese (2010) and Martin-Jones et al. (2012) also hold a dynamic view on language and identity: while language functions as “heritage” (see Blackledge and Creese, 2010, pp. 164–180) and supports the positioning or maintenance of national identity, this bond frequently loosens, as it is always contested, resisted and “disinvented” (Makoni and Pennycook, 2005). Table 1 illustrates three kinds of sociolinguistic conceptualizations of language.

The above discussion briefly delineates how contemporary sociolinguistic studies attempt to capture the complex ways in which the notion of language is construed, resisted or reinvented in and through practices. Most of these approaches are based on the traditional assumption of language as written signs and symbols in its verbal forms. Other forms of resources are generally seen as contexts where these verbal signs and symbols take place. They are contextual facets that contribute to the ideological and sociological corollary of language use, but they are not seen as ontological components in linguistics. Later developments, which integrate multimodal studies into sociolinguistics, show differing stances regarding the ontology of language, as shown in the next section.

Language in vis-à-vis as multimodal construct

Jewitt ( 2013 , p. 141) defines multimodality as “an inter-disciplinary approach that understands communication and representation to be more than about language”. This should be seen as a definition oriented toward social semiotics, in which different semiotic resources are seen as various modes of representation or communication through semiosis. For a sociolinguistic version of the definition, we prefer to interpret it as language in vis-à-vis as a multimodal construct. By using the word “construct”, we would like to point out that multimodality or multimodal conventions enter into sociolinguistic studies because they are socially constructed; that is, sociolinguists research these multimodal dimensions because they are semiotic resources and practices which are constructed by social subjects with power, manipulation and ideology. They are not neutral resources by which people communicate information or by which the process of meaning-making, or semiosis, is realized. Instead, they are a social construct that constitutes the type of Foucauldian knowledge in which sociological power and ideology lie at the core. In this sense, the notions, frameworks, and approaches that we discuss as follows are socially critical in nature and are predominantly related to socially constructed ideologies such as hegemony, power, and identity. As Makoni and Pennycook ( 2005 ) note, languages are “invented” by the dominant (colonial) groups through classification and naming in history; they are not neutral practices and they are constructed and invested with ideologies, power and inequality. Sociolinguistics thus needs a historically critical perspective. In fact, since its birth, sociolinguistics has been a discipline focusing on language use in relation to socially critical issues, such as gender, race, class and politics. 
This focus can date back as early as Labov’s ( 1963 , 1966 ) ethnographical research on variations of English on the island of Martha’s Vineyard, Massachusetts and in New York City. The sound change or phonetic features are studied in relation to ethnicity, social stratification and class. Agha ( 2003 ) and Eckert ( 2008 ) also probe the phonetic features or regional change of variations in relation to ethnicity and social and economic status.

In fact, the above-mentioned concerns of sociolinguistics are also consistent with CDA (see Wang and Jin, 2022; Wang and Yang, 2022), especially multimodal critical discourse analysis (MCDA), which also contributes to the research trend in terms of language in multimodality. Kress and van Leeuwen (1996) postulate a visual grammar based on systemic functional grammar. Machin (2016), Machin and Mayr (2012) and other scholars have also applied MCDA to various types of discourse. Semiotic resources other than language are analysed to reveal the social construct of power, ideology, and inequality in relation to verbal resources (Wang, 2014, 2016a, 2016b). Language in the multimodal construct in sociolinguistics is quite similar to the social semiotic and critical discourse approach to multimodality: language is seen as one type of resource, amongst other non-language resources (visual, aural, embodied, and spatial) in the meaning-making process. The difference is that sociolinguistic approaches toward language in multimodality focus much more on social interaction, power and ideology, and their research frequently includes ethnographical data and observations. Language as a multimodal construct, by contrast, sees language as a more integral part of multimodal resources, and vice versa; less distinct boundaries are seen as existing between languages and non-languages. These two trends of conceptions are discussed below.

Language in multimodal construct

To place language studies in the multimodal construct is not a new practice in sociolinguistics. Agha ( 2003 , p. 29) analyses the Bainbridge cartoon, treating accent not as “object of metasemiotic scrutiny”, but as an integral element in “the social perils of improper demeanour in many sign modalities” such as dress, posture, gait and gesture. His discussion demonstrates how language studies can be embedded in a larger multimodal scope. Language is contextualized by its peripheral multimodal paralinguistic sign systems. In Eckert ( 2008 , p. 25), the process of “bricolage” (Hebdige, 1984 ), in which “individual resources can be interpreted and combined with other resources to construct a more complex meaningful entity”, is linked to the style and language variations which reflect social meaning. She gives examples of how the clothing of students at Palo Alto High School affords them certain types of styles to convey social meaning. Eckert ( 2001 ), Coupland ( 2003 , 2007 ) and other scholars’ research represent the “third-wave” sociolinguistic studies, which see the use of variation in terms of personal and social styles (Eckert, 2012 ). Language and other semiotic resources constitute a stylistic complex that makes social meaning and constructs social styles and identities together. Goodwin ( 2007 ) extensively encompasses multimodal interaction in the examination of participation, stance and affect in a “homework” interaction between a father and his daughter, where gaze, gesture, and the spatial environment are taken into account. Goodwin’s research is partly premised on Bourdieu’s ( 1991 , pp. 81–89) associating bodily hexis with habitus , which is also a notion that is multimodal in itself. The deployment of different bodily modes in different contexts of participation (such as homework, archaeology, and surgery) depends on conventions of various social practices or their respective habitus .

Research regarding language in multimodal construct shares some common ground with the social semiotic approach towards multimodality. First, in communication, there are different modes of resources or semiotic types that convey social meaning and embed ideology. Second, these resources consist of language and “non-language”: the former being written or spoken signs and symbols that social actors use to communicate, and the latter being the visual, aural, or embodied resources in which language is situated. Third, meaning-making is done through the orchestration of these resources.

In contrast to social semiotic approaches, with an anthropology-oriented concern, language in the multimodal construct as a sociological and sociolinguistic approach usually bases itself on ethnographical observations of social interaction. Language is seen as a component in social interactional discourse; other semiotic modes or resources are also important resources through which language use is contextualized. To be more specific, language in multimodal construct shows concerns with language as one type of semiotic resource that is placed in multimodal contexts in the following aspects:

First, meaning-making through other resources is seen as “add-ons” to that of language. In other words, language indexes social meaning and ideology in collaboration with other types of resources. An example is Agha’s ( 2003 ) analysis of the Bainbridge cartoon in which clothes, demeanour, and even body shape work in collaboration with accent in conveying register and social status. Second, language as one type of social meaning-making resource can be conceptualized in relation to the meaning-making process of other resources. For example, the process of “bricolage” is probed in relation to variations with their indexed styles and social categorization in terms of “gender and adolescence” (Eckert, 2008 , p. 458). This concept is used to offer clues regarding how “the differential use of variables constituted distinct styles associated with different communities of practice” (Eckert, 2008 , p. 458). Third, language is one of the communicative modes in social interactional discourse. It does not necessarily take the central role, because other types of resources, such as gestures, gaze, and the environment where these actions take place, jointly constitute the social meaning-making process. This can be best encapsulated in Goodwin’s ( 2007 ) analysis of the “homework” interaction between a father and his daughter. In this quite mundane interactional discourse, the father uses different embodied actions to negotiate different moral and affective stances through the “homework interaction” with his daughter. Conversation as a linguistic resource plays a role in the interaction, while embodied actions are key factors in affecting these stances.

Language as a multimodal construct

A slightly different approach to studies of language in multimodal contexts is to view it as a multimodal construct: either in the way that language is considered as autonomously constituting the semiotic texture (e.g., in the art form of the “text art” where text is also seen as picture) or in the way that some traditionally assumed extra-linguistic modes are considered as special forms or dimensions of language. This trend of research includes recent studies on language in space, social interactional multimodal discourse analysis, and new concepts or conceptualizations of language in society, as discussed below.

Language in space: semiotic landscape, place semiotics, and discourse geography

Jaworski and Thurlow ( 2010 ) review the notion of spatialization , that is, the semiotics and discursivity of space (Jaworski and Thurlow, 2010 ), and the extension of the notion of the linguistic landscape. By so doing, they frame the concept of semiotic landscape as encapsulating how written discourse interacts with other multimodal discursive resources with blurring boundaries in between.

In their opinion, space is “not only physically but also socially constructed, which necessarily shifts absolutist notions of space towards more communicative or discursive conceptualizations” (Jaworski and Thurlow, 2010 , p. 7). Sociological research on space thus is more oriented toward spatialization, “the different processes by which space comes to be represented, organized and experienced” (Jaworski and Thurlow, 2010 , p. 6). This spatialization—as represented discursively—is intrinsically multimodal:

Echoing the sentiments of Kress and van Leeuwen quoted at the start of this chapter, Markus and Cameron argue that ‘[b]uildings themselves are not representations’ (p. 15), but ways of organizing space for their users; in other words, the way buildings are used and the way people using them relate to one another, is largely dependent on the spoken, written and pictorial texts about these buildings… Architecture and language (spoken and written) may then form an even more complex, multi-layered landscape (or cityscape) combining built environment, writing, images, as well as other semiotic modes, such as speech, music, photography, and movement…(Jaworski and Thurlow, 2010 , pp. 19–20)

The “spatial turn” (Jaworski and Thurlow, 2010 , p. 6) in sociolinguistics thus adds the analytical dimensions of multimodal resources to the traditional concept of the linguistic landscape. Written language itself does convey social meaning and ideologies, while it is situated in materiality (the materials it is written on) and spatiality (the places where it appears). The concept of the semiotic landscape blurs the traditional boundary between language and non-language.

Unlike social semiotic approaches to multimodality, researchers of the semiotic landscape pay predominant attention to the “metalinguistic or metadiscursive nature of ideologies” (Jaworski and Thurlow, 2010, p. 11). In Kallen’s words, the concept of the semiotic landscape starts from the assumption that “signage is indexical of more than the ostensive message of the sign” (Kallen, 2010, p. 41); signage indexes ideologies that are embedded in, or indicated by, different types of space or spatiality: city centres, tourist places, districts and so on. Less interest is invested in the process of semiosis, that is, in how different modes of signs are orchestrated to communicate information, which is one of the primary endeavours of social semiotics (Li and Wang, 2022; Wang, 2014, 2019; Wang and Li, 2022). As such, in ethnographic studies or data analysis, language, materiality, and spatiality are usually seen as interwoven, with no distinct boundaries between them; or at least, boundary-marking is not the primary concern of semiotic landscape research.

In the same vein, Scollon and Scollon ( 2003 , p. 2) coin the term “geosemiotics” (or “place semiotics”) which is “the study of the social meaning of signs and discourses and of our actions in the material world”. Their research objects are signs in public places. The conceptual framework of “geosemiotics” sees language as a multimodal construct in terms of the following aspects. First, verbal language is analysed by using social semiotic approaches to visuals. Code preference (regarding which language is seen as “primary” language) shown on signs or buildings is analysed by using Kress and van Leeuwen’s ( 1996 , p. 208) conception of compositional meaning indexed by different positions in pictures. Second, language is seen as multimodal itself. Language on signs or buildings is analysed in terms of the multimodal inscription (see Scollon and Scollon, 2003 , pp. 129–142) that includes fonts, letter form, material quality, layering and state changes. Third, the emplacement (referring to meaning-making through positioning signs in different places) in geosemiotics, similar to Jaworski and Thurlow’s ( 2010 ) approach towards the semiotic landscape, is predominantly concerned with spatiality and metalinguistic or metadiscursive ideology, rather than the interaction and orchestration of different modes (language vis-à-vis non-language) in semiosis.

Similar to the concepts of semiotic landscape and place semiotics, Gu (2009, 2012) postulates the framework of four-borne discourse and discourse geography. Based on Blommaert’s (2005, p. 2) view of discourse as “language-in-action”, Gu analyses the language and activities in social actors’ trajectories of time and space in the land-borne situated discourse (LBSD): a type of discourse categorized by Gu (2009) according to the different types of spatiality that serve as carriers of discourses and as the places where they occur. In Gu’s (2012) conceptualizations, language and discourse are metaphorically spatialized: language is seen in terms of the place where it occurs. Multimodality is evaluated based on space (Gu, 2009). Though it is arguable to what extent language is seen as a conflation of modes or semiotic attributes in Gu (2009), his work draws only an ambivalent boundary between language and the “non-language”. Also, in “spatializing” language as discourse geography, his framework represents language and discourse through a PLACE or SPACE metaphor that is itself multimodal. In addition, it analyses the translation between different modes, for instance, the “modalization” of written language into visuals and sounds; visuals are also seen as forms of “modalized” language and vice versa. As such, Gu (2009) also represents the “spatial turn” of sociolinguistics, which can be seen as the research trend that regards language as a multimodal construct.

In general, the trend to spatialize language and discourse (or the “spatial turn”), with concepts or frameworks such as semiotic landscape, place semiotics, and discourse geography, treats language as a multimodal construct in the following two aspects. First, it focuses on metalinguistic or metadiscursive ideologies that are embedded in different modes of signs or symbols; also, Gu’s research metaphorically theorizes social interaction through multimodality. Second, it posits that language itself is multimodal or modalizable in meaning-making. Written language has its multimodal dimensions, such as the facets of its inscription, including fonts, letterform, material quality, layering and state changes (Scollon and Scollon, 2003). Different forms of language are multimodal in terms of spatiality: they can be naturally multimodal and aural-visual, for instance in televised discourse; written language can also be “modalized” (Gu, 2009, p. 11) into visuals. Overall, language is considered either as signs in the spatialized system or as actions in trajectories of activities. It is an integral part of the multimodal construct, where other modes (visual, gesture, action, and so on) are not peripheral or auxiliary; frequently they also belong to linguistic resources, as with the visual resources in text art.

Multimodal studies from the social interactional perspective

There are sociolinguistic approaches towards multimodality that combine social interactional sociolinguistics (Goffman, 1959 , 1963 , 1974 ), social semiotic approach towards multimodality (Kress and van Leeuwen, 1996 ), and intercultural communication (Wertsch, 1998 ). We summarize these approaches as multimodal studies from the social interactional perspective, which include mediated discourse analysis (Scollon and Scollon, 2003 ) and multimodal interaction analysis (Norris, 2004 ); the latter grew out of the former.

Multimodal studies from the social interactional perspective focus on people’s daily actions and interactions, and the environment and technologies with(in) which they take place. This trend of research sees discourse as (embedded in) social interaction and sets out to investigate social action through multimodal resources used in daily interaction, such as gestures, postures, and language (see Jones and Norris, 2005). In Norris’s (2004) framework for multimodal interaction analysis, the units of analysis form a layered and hierarchical system of actions, including lower-level actions, such as an utterance of spoken language, a gesture, or a posture, and higher-level actions, which consist of chains of lower-level actions. Norris (2004) also coins the term “modal density” to refer to the complexity of modes a social actor uses to produce higher-level actions.

The focus on hierarchical levels of actions and the concept of “modal density” invite reflection on the question of what constitutes mode and language. Language in multimodal interaction analysis is seen as a type of lower-level action amongst other embodied resources that are at interactants’ disposal. These embodied resources are seen as different modes such as gesture, gaze, and proxemics. Arguably, however, gestures and gazes in Norris (2004) are also seen as forms of language in interaction. Furthermore, regarding the mode of spoken language, Norris (2004), like her other works, methodologically treats it as a multimodal construct where pitch and intonation are visualized through various fonts in the wave-shaped annotation, along with the policeman’s gestures, as shown in Fig. 1.

Fig. 1 The policeman’s spoken language is treated as a multimodal construct where the pitches and intonation are visualized through various fonts in the wave-shaped annotation, along with his gestures.

Multimodal studies from the social interactional perspective, similar to other sociolinguistic approaches to multimodality, target the meta-modal or metadiscursive facets of ideology. This is done through a bottom-up approach, that is, by examining general social categories such as power, dominance and ideology on the basis of people’s daily (inter)action. This trend of research focuses on basic units of action in people’s daily interaction; its conception of mode and language is oriented toward seeing language as multimodal, and its methodological treatment of language shows the same orientation. Multimodal studies from the social interactional perspective are intended to reveal the ideology and power embedded in language as action. Overall, they perceive language as a multimodal construct in social (inter)action.

Metrolingualism, heteroglossia, polylanguaging and multimodality

In the second section of the paper, we mentioned the works on some similar notions such as metrolingualism and polylanguaging. In this section, we will review the latest application of the notion of metrolingualism in multimodal analysis and discuss why other related notions or approaches also encapsulate the conceptualization regarding language as a multimodal construct.

Metrolingualism is a concept postulated by Otsuji and Pennycook ( 2010 ) originally referring to “creative linguistic conditions across space and borders of culture, history and politics, as a way to move beyond current terms such as multilingualism and multiculturalism” (Otsuji and Pennycook, 2010 , p. 244). Their later works (Pennycook and Otsuji, 2014 , 2015a , 2015b ) develop the concept and reformulate it as a broader notion encompassing the everyday language use in the city and linguistic landscapes in urban settings.

In Pennycook and Otsuji (2014, 2015b), metrolingualism involves the practice of “metrolingual multitasking” (Pennycook and Otsuji, 2015b, p. 15), in which “linguistic resources, everyday tasks and social space are intertwined” (Pennycook and Otsuji, 2015b, p. 15). Metrolingualism is thus concerned not only with the mixed use of linguistic resources (from different languages), but also with how language use is embedded in broader multimodal practices, such as the (embodied) actions accompanying or included in the metrolingual process, the (changing) spaces or places where these actions and language use occur, and the objects in the environment. Pennycook and Otsuji (2015b) include an olfactory mode in their analysis of metrolingual practices in cities. Smell is represented through linguistic or pictorial signs in the city and suburbs to constitute “smellscapes” in relation to social activities, ethnicity, gender and race. Metrolingual smellscapes are represented through the conflation of written and visual signs and symbols (e.g., street signs), social activities (e.g., buying and selling, and riding a bus), objects (e.g., spices), and places or spaces (e.g., suburban markets, coffee shops, buses and trains). The conventional distinction between language and the non-language is less important, or not at issue here, as smells have to be represented through language or visuals, and resources other than languages are also conceptualized as metrolingual.

Language in Pennycook and Otsuji’s (2014, 2015a, 2015b) conception of metrolingualism is, in this regard, seen as being integrated into different types of activities and actions; it is also spatialized in the sense that metrolingual practice is seen as involving the organization of space, the relationship between “locution and location” (Pennycook and Otsuji, 2015b, p. 84), and the (historical) layers of cities (Pennycook and Otsuji, 2015b, p. 140). This spatialization is intrinsically multimodal, as discussed in earlier sections.

In relation to metrolingualism, Jaworski (2014) briefly reviews the history of arts and writing, from which he chooses the art form of “text art” as his research subject. Referring to the notion of metrolingualism, he sees these art forms as “metrolingual art”, where language interacts with other modes or is seen as part of the visual mode. He suggests that it would be useful to “extend the range of semiotic features amenable to metrolingual usage to include whole multimodal resources” (Jaworski, 2014, p. 151). The multimodal representations in text art are realized by the mixing, meshing and queering of linguistic features, as well as by their relation to a “melange of styles, genres, content, and materiality” (Jaworski, 2014, p. 151). In this regard, the multimodal affordances (Kress, 2010; Jewitt, 2009) realized by materiality (e.g., the paper, cloth, or walls on which the language is written), media (e.g., soundtrack, video, moving images, etc.), and styles (e.g., fonts, letterform, layering such as add-ons or decorations) are an integral part of metrolingualism. Subsequently, he postulates that it would be useful to align the concept of heteroglossia with metrolingualism, so as “to extend the idea of metrolingualism beyond ‘hybrid and multilingual’ speaker practices (Otsuji and Pennycook, 2010, p. 244) and move towards a more ‘generic’ view of metrolingualism as a form of heteroglossia” (Jaworski, 2014, p. 152). In this way, he relates the subject position taken by the producers of text art to their social orientation or alignment as regards power, domination, hegemony, and ideology in a broader social realm. This is also in line with Bailey’s discussion of heteroglossia: “(a) heteroglossia can encompass socially meaningful forms in both bilingual and monolingual talk; (b) it can account for the multiple meanings and readings of forms that are possible, depending on one’s subject position, and (c) it can connect historical power hierarchies to the meanings and valences of particular forms in the here-and-now” (Bailey, 2007, pp. 266–267; also quoted in Jaworski, 2014, p. 153). Overall, Jaworski (2014) shows how metrolingualism and heteroglossia can be used to analyse features of language and their place in the multimodal construct. He also discusses how other notions similar to metrolingualism may bear a relationship with multimodality in that they stress “the importance of linguistic features (rather than discrete languages) as resources for speakers to achieve their communicative aims” (Jaworski, 2014, p. 138).

Apart from the concepts of metrolingualism and heteroglossia, Jaworski (2014) touches upon the relationship between polylanguaging and multimodality, but he does not elaborate on it. Jørgensen (2008) demonstrates how polylanguaging is concerned with the use of language features in language practice among adolescents in superdiverse societies. Some of these language features “would be difficult to categorize in any given language” (Jørgensen et al., 2011, p. 25); that is, they do not belong to any standard language system (e.g., English, Chinese, German). In addition, emoticons are frequently used in communication via social networking software. If some of these language features do not belong to any given language, it is difficult to say whether they can be seen as language at all. The attention to features of language hence blurs the boundary between language and other semiotic resources. Of course, these features can be seen as a type of linguistic (lexical, morphemic or phonemic) unit which still belongs to language, but they are frequently used in multimodal meaning-making. Below we use Jørgensen et al.’s (2011, p. 26) example (Fig. 2) to illustrate this.

Fig. 2 The “majority boy” makes use of resources from the minority’s language (the word “shark”).

Jørgensen et al.’s analysis of this example focuses on the “majority boy” using the word “shark”, which is a loan word from Arabic. As a majority member, he is using the minority’s language, to which he is not entitled. Judging by the interaction, it can be seen that “both interlocutors are aware of the norm and react accordingly” (Jørgensen et al., 2011, p. 25). Accordingly, they note that one feature of polylanguaging is “the use of resources associated with different ‘languages’ even when the speaker knows very little of these” (Jørgensen et al., 2011, p. 25).

What also needs attention, but is not discussed by Jørgensen et al. (2011), is the interlocutors’ creative use of these features in polylanguaging: the word “shark” is written as a prolonged “shaarkkk”, which alters both its phonetic and its visual effect. The creative configuration of the language feature “shark” functions to draw other interlocutors’ attention toward the polylanguaging practice. The emoticon “:D” following it serves to demonstrate that the speaker knows that he is using language features in violation of the “normal” rules; that is, he is using minority language features to which he is not entitled. The repeated words “cough, cough”, followed by the emoticon “:D”, also demonstrate this.

Polylanguaging, as formulated by Jørgensen et al. (2011), departs from the multilingualism tradition of enumerating languages and focuses instead on language features that may not belong to any given language. In this sense, emoticons or creative configurations of words can also be seen as language features, namely features that are creatively used by a virtual community of (young) netizens in communication. These features are multimodal in the following aspects. First, they visualize the polylanguaging practice by creating new forms of words, for instance, the prolonged word “shaarkkk”. This creation is itself a process of polylanguaging, in the sense that it uses the features of common language, or language in people’s daily life (that is, non-cyber language), to create a new cyber-language that is used by members of a virtual community. Second, these language features utilize the multimodal resources of embodiment in polylanguaging. For example, emoticons use different letters or punctuation marks (as language features from people’s daily written language) to represent different facial expressions and emotions. The repetition of the words “cough, cough”, as “a reference to a cliché way of expressing doubt or scepticism” (Jørgensen et al., 2011, p. 27), also takes on an embodied stance. It shows that the interlocutors are aware that the majority boy is using the minority’s language, to which he is not entitled. Hence, this embodied stance indexes the polylanguaging practice. To summarize, polylanguaging entails seeing language as a multimodal construct, as interlocutors creatively adapt language features in daily communication (face-to-face or written communication not involving the internet) or utilize embodied language features when polylanguaging in online communication.

Discussion and a critical reflection

In the sections “Language as written and spoken signs and symbols” and “Language in vis-à-vis as multimodal construct” above, we delineated the ontological perceptions of language in sociolinguistics, including language as spoken and written signs and symbols, and language in, vis-à-vis as, a multimodal construct. In teasing out the various approaches, we find that language in sociolinguistics has undergone several stages of development. Language as spoken and written signs and symbols has been pursued in variational sociolinguistics, in bi- and multilingualism, and in the latest theoretical and conceptual trends of research that do not see languages as separate and separable systems or codes. Language in sociolinguistics, however, has been predominantly placed in nuanced and complicated relationships with other semiotic resources. Research regarding language in multimodal constructs sees language and non-language resources as different modes, or types of resources. These different modes have boundaries, and efforts are made to see how the modes combine with each other in meaning-making; language itself is a distinctive type of mode, interdependent with but different from other modes. Research regarding language as a multimodal construct sees language itself as multimodal. In the spatial approaches, language is spatialized (that is, probed in relation to the various spatialities and materialities in which it appears); in the social interactional approach to multimodality, it is embodied and seen as embedded in a layered and hierarchical system of modes (including gesture, posture, and intonation) in social interaction; in the latest concepts built on languaging, language is regarded as “inventions” (Makoni and Pennycook, 2005), as cross- and trans-cultural practice, instead of separable and enumerable codes or systems. Language is entangled and integrated with objects (for instance, signage and the materiality in which it appears) and involved in multitasking with embodied resources (gesturing, talking, and simultaneously doing other things).

Expanding the ontology of language from verbal resources toward various multimodal constructs has enabled sociolinguists to pursue meaning-making, indexicalities and social variations in their most authentic state. Language itself is multimodal, though it cannot be denied that language and other modes do have boundaries and distinctions (albeit not always). Whenever a language is spoken, stresses, intonations, and paralinguistic resources are all integrated into it. Focusing on language per se has generated fruitful outcomes in sociolinguistic studies, but placing language among multi-semiotic resources has innovated the field and has become the dominant trend in contemporary sociolinguistics. Research on language both in and as multimodal constructs has captured the complex ways in which language interacts with multiple subjects, materiality, objects and spatiality. But the latest research in sociolinguistics increasingly sees language itself as an intricate multimodal construct, as encapsulated by various new concepts and theories including translanguaging, metrolingualism, and polylanguaging, in the contexts of globalization, migration, multi-ethnicity, and new communication technologies. Language is not only seen as separable codes and systems spoken or written by different groups of people, but also entails a wider range of communicative repertoires including embodied meaning-making, objects and the environment where the written or spoken signs are placed. It may hence be speculated that sociolinguistics will be increasingly less concerned with the boundaries between language and non-language resources, and will focus more on social constructs, social meaning, and language as a force in social change. The enumerating and separating way of studying language and multimodality (that is, delineating inter-semiotic boundaries and focusing on how modes of communication are combined in meaning-making) has generated various outcomes, especially in the field of grammar-oriented social semiotic research and MCDA. However, contemporary sociolinguistic studies have immensely expanded their scope toward a wider range of areas beyond the discursive, grammatical, and communicative. The three research paradigms regarding language as a multimodal construct reviewed in “Language as a multimodal construct” have proved themselves feasible approaches to language in social interaction, geosemiotics, and language use in ethnographic and multi-ethnic settings. The ontology of language in sociolinguistics, in this regard, may be perceived in terms of the sociological and societal facets of the multimodal construct, rather than language placed in a multitude of semiotic types or the verbal resources per se. A critical reflection on the ontology of language is one of the prerequisites of innovation in contemporary linguistics, which is also the objective of this comprehensive review.

As can be seen from the above discussion, there are several versions of the perception of language in sociolinguistics. First, perceptions of language as a written or verbal system are moving, or have moved, from the enumerating traditions of bi- or multilingualism towards seeing language as an inseparable entity with fixity and fluidity. In other words, new approaches in sociolinguistics come to see languages as comprising different features, repertoires, or resources, rather than different or discrete standard languages such as English, French, German and so on. The negotiation, construction, or attribution of ethnicity, identity, power and ideologies through language has also taken on a more dynamic and diverse look. Second, there is sociolinguistic research that places language within the multimodal construct. Language is seen as being contextualized by other multimodal semiotics that are seen as “non-language”. However, more research comes to see language as a multimodal construct; that is, language, be it written or spoken, is multimodal in itself as it comprises multimodal elements such as type, font, materiality, intonation, embodied representations and so on. It is also activated (seen as actions or activities) or spatialized in different approaches such as mediated discourse analysis, multimodal interaction analysis, geosemiotics, semiotic landscape, and metrolingualism, discussed earlier. Third, these changing perceptions of language in sociolinguistics result from researchers’ innovative efforts to view language from different perspectives. More importantly, they arise from the fact that language itself is also changing as society changes. As mentioned in the beginning, the world has been increasingly globalized and communications technologies have fundamentally changed the ways people interact with each other. Linguistic practices are complicated by the super-diversity of ethnic fluidity (e.g., the diversity of ethnic groups and the ever-present changes in ethnic structure), communications technologies, and globalized cross-cultural art.

In sum, it can be argued that contemporary sociolinguistics has become increasingly concerned with languaging (trans-, poly-, metro-, pluri- and so on), rather than with languages as a type of (static and fixed) verbal resource with demarcated boundaries separating them from other multimodal resources. Language is multimodal; it is embedded in or represents social activities, places or spaces, objects, and smells. Language in society belongs to and constitutes the “semiotic assemblage” (Pennycook, 2017) that can be better analysed holistically so as to reach an understanding of “how different trajectories of people, semiotic resources and objects meet at particular moments and places” (Pennycook, 2017, p. 269). At a fundamental level of sociolinguistic ontology, this trend of research reflects the changing ways in which sociolinguists come to understand what language is and how it should be understood as part of a more general range of semiotic practices.

References

Agha A (2003) The social life of cultural value. Language Commun 23(3–4):231–273

Berger P, Luckmann T (1966) The social construction of reality: a treatise in the sociology of knowledge. Doubleday, New York

Blackledge A, Creese A (2010) Multilingualism: a critical perspective. Continuum, London

Blommaert J (Ed.) (1999) Language ideological debates, vol. 2. Walter de Gruyter, Berlin

Blommaert J (2005) Discourse: a critical introduction. Cambridge University Press, Cambridge

Bourdieu P (1991) Language and symbolic power [Thompson JB (ed and introd)] (trans: Raymond G, Adamson M). Polity Press/Blackwell, Cambridge

Bailey B (2007) Heteroglossia and boundaries. In: Heller M (Ed.) Bilingualism: a social approach. Palgrave Macmillan, New York, pp. 257–274

Cameron D (1990) Demythologizing sociolinguistics: why language does not reflect society. In: Joseph J, Taylor T (eds) Ideologies of language. Routledge, London, pp. 79–93

Chomsky N (1965) Aspects of the theory of syntax. MIT Press, Cambridge, Massachusetts

Coupland N (2003) Sociolinguistic authenticities. J Sociolinguist 7(3):417–431

Coupland N (2007) Style: language variation and identity. Cambridge University Press, Cambridge

Creese A, Blackledge A (2010) Translanguaging in the bilingual classroom: a pedagogy for learning and teaching? Mod Language J 94:103–115

Eckert P, Rickford JR (Eds.) (2001) Style and sociolinguistic variation. Cambridge University Press, Cambridge

Eckert P (2008) Variation and the indexical field. J Sociolinguist 12(4):453–476

Eckert P (2012) Three waves of variation study: the emergence of meaning in the study of sociolinguistic variation. Annu Rev Anthropol 41(1):87–100

García O, Li W (2014) Translanguaging: language, bilingualism and education. Palgrave Macmillan, London

Goffman E (1959) The presentation of self in everyday life. Doubleday, New York, NY

Goffman E (1963) Behavior in public places. Free Press, New York, NY

Goffman E (1974) Frame analysis. Harper & Row, New York, NY

Goodwin C (2007) Participation, stance, and affect in the organization of activities. Discourse Soc 18(1):53–73

Gu Y (2009) Four-borne discourses: towards language as a multi-dimensional city of history. In: Li W, Cook V (eds.) Linguistics in the real world. Continuum, London, pp. 98–121

Gu Y (2012) Discourse geography. In: Gee JP, Handford M (eds.) The Routledge handbook of discourse analysis. Routledge, London, pp. 541–557

Hebdige D (1984) Framing the youth ‘problem’: the construction of troublesome adolescence. In: Garms-Homolová V, Hoerning EM, Schaeffer D (eds.) Intergenerational relationships. C. J. Hogrefe, Lewiston, NY, pp. 184–195

Holliday N (2021) Prosody and sociolinguistic variation in American Englishes. Annu Rev Linguist 7:55–68

Irvine JT, Gal S (2000) Language ideology and linguistic differentiation. In: Kroskrity PV (ed.) Regimes of language: ideologies, polities, and identities. School of American Research Press, Santa Fe, pp. 35–84

Jaworski A (2014) Metrolingual art: multilingualism and heteroglossia. Int J Biling 18(2):134–158

Jaworski A, Thurlow C (eds.) (2010) Semiotic landscapes: language, image, space. Continuum, New York

Jewitt C (2009) Different approaches to multimodality. In: Jewitt C (ed) The Routledge handbook of multimodal analysis. Routledge, Abingdon, pp. 28–39

Jewitt C (2013) Multimodality and digital technologies in the classroom. In: de Saint-Georges I, Weber J (eds) Multilingualism and multimodality: current challenges for educational studies. Sense Publishing, Boston, pp. 141–152

Jørgensen JN (2008) Poly-lingual languaging around and among children and adolescents. Int J Multiling 5(3):161–176

Jørgensen JN, Karrebæk MS, Madsen LM, Møller JS (2011) Polylanguaging in superdiversity. Diversities 13(2):23–37

Jacquemet M (2005) Transidiomatic practices: language and power in the age of globalization. Language Commun 25:257–277

Jones R, Norris S (2005) Discourse as action/discourse in action. In: Norris S, Jones R (eds) Discourse in action: introducing mediated discourse analysis. Routledge, London, pp. 1–3

Kallen J (2010) Changing landscapes: language, space and policy in the Dublin linguistic landscape. In: Jaworski A, Thurlow C (eds) Semiotic landscapes: language, image, space. Continuum, New York, pp. 41–58

Kress GR (2010) Multimodality: a social semiotic approach to contemporary communication. Routledge, London

Kress GR, van Leeuwen T (1996) Reading Images: the grammar of graphic design. Routledge, London

Kroskrity PV (1998) Arizona Tewa Kiva speech as a manifestation of linguistic ideology. In: Schieffelin BB, Woolard KA, Kroskrity P (eds) Language ideologies: practice and theory. Oxford University Press, New York, pp. 103–122

Labov W (1963) The social motivation of a sound change. Word 19(3):273–309

Labov W (1966) Hypercorrection by the lower middle class as a factor in linguistic change. Sociolinguistics 1966:84–113

Lawson R (2020) Language and masculinities: history, development, and future. Annu Rev Linguist 6(1):409–434

Lewis WG, Jones B, Baker C (2012) Translanguaging: origins and development from school to street and beyond. Educ Res Eval 18(7):641–654

Li W, Wang J (2022) Chronotopic identities in contemporary Chinese poetry calligraphy. Poznan Stud Contemp Linguist 58(4):861–884

Machin D (2016) The need for a social and affordance-driven multimodal critical discourse studies. Discourse Soc 27(3):322–334

Machin D, Mayr A (2012) How to do critical discourse analysis: a multimodal introduction. Sage, London

Maher J (2005) Metroethnicity, language, and the principle of Cool. Int J Sociol Language 11:83–102

Makoni S, Pennycook A (2005) Disinventing and (re)constituting languages. Crit Inq Language Stud 2(3):137–156

Martin-Jones M, Blackledge A, Creese A (eds) (2012) The Routledge handbook of multilingualism. Routledge, London

Norris S (2004) Analyzing multimodal interaction: a methodological framework. Routledge, London

Otsuji E, Pennycook A (2010) Metrolingualism: fixity, fluidity and language in flux. Int J Multiling 7:240–254

Pennycook A (2017) Translanguaging and semiotic assemblages. Int J Multiling 14(3):1–14

Pennycook A, Otsuji E (2014) Metrolingual multitasking and spatial repertoires: ‘Pizza mo two minutes coming’. J Socioling 18(2):161–184

Pennycook A, Otsuji E (2015a) Making scents of the landscape. Linguist Landsc 1(3):191–212

Pennycook A, Otsuji E (2015b) Metrolingualism. Language in the city. Routledge, New York

Sankoff G (2018) Language change across the lifespan. Annu Rev Linguist 4:297–316

Schegloff EA (1992) In another context. In: Duranti A, Goodwin C (eds) Rethinking context: language as an interactive phenomenon. Cambridge University Press, Cambridge, pp. 191–227

Schegloff EA (1998a) Positioning and interpretative repertoires: conversation analysis and poststructuralism in dialogue: reply to Wetherell. Discourse Soc 9(3):413–416

Schegloff EA (1998b) Reply to Wetherell. Discourse Soc 9(3):457–60

Schegloff EA (1999) ‘Schegloff’s texts’ as ‘Billig’s data’: a critical reply. Discourse Soc 10(4):558–572

Scollon R, Scollon S (2003) Discourses in place: language in the material world. Routledge, New York

Saussure F (1916) Course in general linguistics. Duckworth, London

Wang J (2014) Criticising images: critical discourse analysis of visual semiosis in picture news. Crit Arts 28(2):264–286

Wang J (2016a) Multimodal narratives in SIA’s “Singapore Girl” TV advertisements—from branding with femininity to branding with provenance and authenticity? Soc Semiot 26(2):208–225

Article   MathSciNet   Google Scholar  

Wang J (2016b) A new political and communication agenda for political discourse analysis: critical reflections on critical discourse analysis and political discourse analysis. Int J Commun 10:19

ADS   Google Scholar  

Wang J (2019) Stereotyping in representing the “Chinese Dream” in news reports by CNN and BBC. Semiotica 2019(226):29–48

Wang J, Jin G (2022) Critical discourse analysis in China: history and new developments. In: Aronoff M, Chen Y, Cutler C (eds) Oxford Research Encyclopedia of Linguistics. Oxford University Press. https://doi.org/10.1093/acrefore/9780199384655.013.909

Wang J, Li W (2022) Situating affect in Chinese mediated soundscapes of suona. Soc Semiot. https://doi.org/10.1080/10350330.2022.2139171

Wang J, Yang M (2022) Interpersonal-function topoi in Chinese central government’s work report (2020) as epidemic (counter-) crisis discourse. J Language Politics. https://doi.org/10.1075/jlp.22022.wan

Wertsch JV (1998) Voices of the mind: a sociocultural approach to mediated action. Harvard University Press, Cambridge, MA

Woolard K (1999) Simultaneity and bivalency as strategies in bilingualism. J Linguist Anthropol 8(1):3–29

Download references

Acknowledgements

Our thanks are extended to Dr. William Dezheng Feng for his constructive advice on the earlier drafts of the paper. This work is supported by the National Social Science Foundation of China (Project No. 18CYY050); the Foreign Language Education Foundation of China (Project No. ZGWYJYJJ11A030); and the Self-Determined Research Funds of CCNU from MOE for basic research and operation (Project No. CCNU20TD008).

Author information

Authors and affiliations

Central China Normal University, Wuhan, China

Jiayu Wang, Guangyu Jin & Wenhua Li

Inner Mongolia Agricultural University, Hohhot, China

Guangyu Jin


Contributions

All three authors contributed to the conception and design of the study. JW mainly participated in drafting the work. GJ revised it critically for important intellectual content. WL participated in major intellectual contributions to the Chinese versions of the paper (unpublished); her ideas and points are integrated into the final version of this paper. All three authors are corresponding authors responsible for the final approval of the version to be published.

Corresponding authors

Correspondence to Jiayu Wang, Guangyu Jin or Wenhua Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Informed consent

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Wang, J., Jin, G. & Li, W. Changing perceptions of language in sociolinguistics. Humanit Soc Sci Commun 10, 91 (2023). https://doi.org/10.1057/s41599-023-01574-5


Received: 12 September 2022

Accepted: 20 February 2023

Published: 08 March 2023

DOI: https://doi.org/10.1057/s41599-023-01574-5


Low-Resource Indic Languages Translation Using Multilingual Approaches

  • Conference paper
  • First Online: 02 December 2023

Candy Lalrempuii & Badal Soni

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1087)

Included in the following conference series:

  • International Conference on Computer Vision, High-Performance Computing, Smart Devices, and Networks


Machine translation is effective when a substantial parallel corpus is available. In a multilingual country like India, with diverse linguistic origins and scripts, the vast majority of languages lack the resources needed to produce high-quality translation models. Multilingual neural machine translation (MNMT) has the advantage of being scalable across multiple languages and improving low-resource languages via knowledge transfer. In this work, we investigate MNMT for low-resource Indic languages: Hindi, Bengali, Assamese, Manipuri, and Mizo. With the recent success of massively multilingual pre-trained models for low-resource languages, we explore the effectiveness of two multilingual pre-trained transformers, mBART and mT5, on several Indic languages. We fine-tune the pre-trained models in both one-to-many and many-to-one settings, and compare their performance with multiway multilingual translation models trained from scratch in the same two settings.
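
As a rough illustration of how the one-to-many and many-to-one settings differ at the data-preparation stage, the sketch below builds training pairs for each direction. It is not the authors' preprocessing: the `<2xx>` target-language tag follows the convention used in Google's multilingual NMT system, while the function names, the language codes and the placeholder sentences are assumptions made for this example. mBART and mT5 each have their own way of signalling the target language.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (source text, target text)


def one_to_many(english: List[str], targets: Dict[str, List[str]]) -> List[Pair]:
    """English -> several languages.

    A single model must be told which language to produce, so each
    source sentence is prefixed with a target-language tag (assumed format).
    """
    pairs: List[Pair] = []
    for lang, sentences in targets.items():
        for src, tgt in zip(english, sentences):
            pairs.append((f"<2{lang}> {src}", tgt))
    return pairs


def many_to_one(sources: Dict[str, List[str]], english: List[str]) -> List[Pair]:
    """Several languages -> English.

    The target language is fixed, so no tag is added in this sketch.
    """
    pairs: List[Pair] = []
    for _lang, sentences in sources.items():
        for src, tgt in zip(sentences, english):
            pairs.append((src, tgt))
    return pairs


# Placeholder data only: these strings stand in for real parallel sentences.
english = ["sentence one", "sentence two"]
indic = {
    "hi": ["hi-1", "hi-2"],     # Hindi (hypothetical placeholders)
    "lus": ["lus-1", "lus-2"],  # Mizo (hypothetical placeholders)
}

o2m = one_to_many(english, indic)
m2o = many_to_one(indic, english)
```

Pooling all language pairs into one training set in this way is what lets a low-resource pair share parameters with better-resourced ones.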


https://data.statmt.org/pmindia/v1/.

https://www.timesofmizoram.com/.

https://zalen.in/.

http://www.statmt.org/moses/.

https://github.com/anoopkunchukuttan/indic_nlp_library.

https://simpletransformers.ai/.


Author information

Authors and affiliations

National Institute of Technology, Silchar, Assam, India

Candy Lalrempuii & Badal Soni


Corresponding author

Correspondence to Candy Lalrempuii .

Editor information

Editors and affiliations

Delhi Technological University, Delhi, India

Ruchika Malhotra

Jawaharlal Nehru Technological University Kakinada, Kakinada, Andhra Pradesh, India

L. Sumalatha

Universiti Teknikal Malaysia Melaka, Melaka, Malaysia

S. M. Warusia Yassin

National Institute of Technology Silchar, Silchar, Assam, India

Ripon Patgiri

Naresh Babu Muppalaneni


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Lalrempuii, C., Soni, B. (2024). Low-Resource Indic Languages Translation Using Multilingual Approaches. In: Malhotra, R., Sumalatha, L., Yassin, S.M.W., Patgiri, R., Muppalaneni, N.B. (eds) High Performance Computing, Smart Devices and Networks. CHSN 2022. Lecture Notes in Electrical Engineering, vol 1087. Springer, Singapore. https://doi.org/10.1007/978-981-99-6690-5_27


DOI: https://doi.org/10.1007/978-981-99-6690-5_27

Published: 02 December 2023

Publisher Name: Springer, Singapore

Print ISBN: 978-981-99-6689-9

Online ISBN: 978-981-99-6690-5

eBook Packages: Computer Science, Computer Science (R0)


Automated Social Science: Language Models as Scientist and Subjects

We present an approach for automatically generating and testing, in silico, social scientific hypotheses. This automation is made possible by recent advances in large language models (LLM), but the key feature of the approach is the use of structural causal models. Structural causal models provide a language to state hypotheses, a blueprint for constructing LLM-based agents, an experimental design, and a plan for data analysis. The fitted structural causal model becomes an object available for prediction or the planning of follow-on experiments. We demonstrate the approach with several scenarios: a negotiation, a bail hearing, a job interview, and an auction. In each case, causal relationships are both proposed and tested by the system, finding evidence for some and not others. We provide evidence that the insights from these simulations of social interactions are not available to the LLM purely through direct elicitation. When given its proposed structural causal model for each scenario, the LLM is good at predicting the signs of estimated effects, but it cannot reliably predict the magnitudes of those estimates. In the auction experiment, the in silico simulation results closely match the predictions of auction theory, but elicited predictions of the clearing prices from the LLM are inaccurate. However, the LLM's predictions are dramatically improved if the model can condition on the fitted structural causal model. In short, the LLM knows more than it can (immediately) tell.
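
To make the idea of stating a hypothesis as a structural causal model and then fitting it more concrete, here is a minimal sketch. It is not the authors' system and involves no LLM agents: the simulated outcome is a plain random draw standing in for an agent interaction, and the variable names (`budget`, `price`), the linear form and all numeric values are assumptions chosen for illustration.

```python
import random
from typing import List, Tuple


def simulate(n: int, effect: float, seed: int = 0) -> List[Tuple[float, float]]:
    """Generate data from a one-edge linear SCM: budget -> price.

    `budget` is randomized by the experimenter (exogenous), and `price`
    is a linear function of it plus Gaussian noise.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        budget = rng.uniform(0.0, 10.0)
        noise = rng.gauss(0.0, 1.0)
        price = 2.0 + effect * budget + noise
        rows.append((budget, price))
    return rows


def ols_slope(rows: List[Tuple[float, float]]) -> float:
    """Estimate the path coefficient with ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x, _ in rows)
    return cov / var


TRUE_EFFECT = 0.5  # hypothetical effect used to generate the data
rows = simulate(2000, TRUE_EFFECT)
estimate = ols_slope(rows)
print(f"estimated effect of budget on price: {estimate:.3f}")
```

Because the cause is randomized, the fitted slope can be read as the causal effect along that edge, and the fitted model can then be used for prediction.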

This research was made possible by a generous grant from Dropbox Inc. Thanks to Jordan Ellenberg, Benjamin Lira Luttges, David Holtz, Bruce Sacerdote, Paul Röttger, Mohammed Alsobay, Ray Duch, Matt Schwartz, David Autor, and Dean Eckles for their helpful feedback. Author's contact information, code, and data are currently or will be available at http://www.benjaminmanning.io/. Both Benjamin S. Manning and Kehang Zhu contributed equally to this work. John J. Horton is a co-founder of a company, Expected Parrot Inc., using generative AI models for market research. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.



Title: Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset for training, a scaled-up version of the one used for phi-2, composed of heavily filtered web data and synthetic data. The model is also further aligned for robustness, safety, and chat format. We also provide some initial parameter-scaling results with 7B and 14B models trained for 4.8T tokens, called phi-3-small and phi-3-medium, both significantly more capable than phi-3-mini (e.g., respectively 75% and 78% on MMLU, and 8.7 and 8.9 on MT-bench).
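
A back-of-the-envelope calculation shows why a 3.8 billion parameter model can plausibly fit on a phone. The sketch below estimates the memory needed for the weights alone at different numeric precisions. The bit widths are assumptions chosen for illustration (the abstract does not state a deployment precision), and the estimate ignores activations, the key-value cache and runtime overhead.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Memory for model weights only, in GiB (2**30 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


PHI3_MINI_PARAMS = 3.8e9  # parameter count from the abstract

# Hypothetical precisions: 16-bit floats, 8-bit and 4-bit quantization.
footprints = {bits: weight_memory_gib(PHI3_MINI_PARAMS, bits) for bits in (16, 8, 4)}

for bits, gib in footprints.items():
    print(f"{bits:>2}-bit weights: {gib:.2f} GiB")
```

Under these assumptions the weights drop from roughly 7 GiB at 16 bits to under 2 GiB at 4 bits, which is within the memory of many current phones.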


Introducing Phi-3: Redefining what’s possible with SLMs

By Misha Bilenko, Corporate Vice President, Microsoft GenAI

Posted on April 23, 2024

We are excited to introduce Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks. This release expands the selection of high-quality models for customers, offering more practical choices as they compose and build generative AI applications.

Starting today, Phi-3-mini, a 3.8B language model, is available on Microsoft Azure AI Studio, Hugging Face, and Ollama.

  • Phi-3-mini is available in two context-length variants—4K and 128K tokens. It is the first model in its class to support a context window of up to 128K tokens, with little impact on quality.
  • It is instruction-tuned, meaning that it’s trained to follow different types of instructions reflecting how people normally communicate. This ensures the model is ready to use out-of-the-box.
  • It is available on Azure AI to take advantage of the deploy-eval-finetune toolchain, and is available on Ollama for developers to run locally on their laptops.
  • It has been optimized for ONNX Runtime with support for Windows DirectML along with cross-platform support across graphics processing unit (GPU), CPU, and even mobile hardware.
  • It is also available as an NVIDIA NIM microservice with a standard API that can be deployed anywhere, and it has been optimized for NVIDIA GPUs.

In the coming weeks, additional models will be added to the Phi-3 family to offer customers even more flexibility across the quality-cost curve. Phi-3-small (7B) and Phi-3-medium (14B) will be available in the Azure AI model catalog and other model gardens shortly.

Microsoft continues to offer the best models across the quality-cost curve and today’s Phi-3 release expands the selection of models with state-of-the-art small models.


Groundbreaking performance at a small size

Phi-3 models significantly outperform language models of the same and larger sizes on key benchmarks (see benchmark numbers below, higher is better). Phi-3-mini does better than models twice its size, and Phi-3-small and Phi-3-medium outperform much larger models, including GPT-3.5T.  

All reported numbers are produced with the same pipeline to ensure that the numbers are comparable. As a result, these numbers may differ from other published numbers due to slight differences in the evaluation methodology. More details on benchmarks are provided in our technical paper.

Note: Phi-3 models do not perform as well on factual knowledge benchmarks (such as TriviaQA) as the smaller model size results in less capacity to retain facts.  

[Figure: Phi-3 benchmark comparison]

Safety-first model design

Responsible AI principles

Phi-3 models were developed in accordance with the Microsoft Responsible AI Standard, which is a company-wide set of requirements based on the following six principles: accountability, transparency, fairness, reliability and safety, privacy and security, and inclusiveness. Phi-3 models underwent rigorous safety measurement and evaluation, red-teaming, sensitive use review, and adherence to security guidance to help ensure that these models are responsibly developed, tested, and deployed in alignment with Microsoft’s standards and best practices.

Building on our prior work with Phi models (“Textbooks Are All You Need”), Phi-3 models are also trained using high-quality data. They were further improved with extensive safety post-training, including reinforcement learning from human feedback (RLHF), automated testing and evaluations across dozens of harm categories, and manual red-teaming. Our approach to safety training and evaluations is detailed in our technical paper, and we outline recommended uses and limitations in the model cards. See the model card collection.

Unlocking new capabilities

Microsoft’s experience shipping copilots and enabling customers to transform their businesses with generative AI using Azure AI has highlighted the growing need for different-size models across the quality-cost curve for different tasks. Small language models, like Phi-3, are especially great for: 

  • Resource-constrained environments, including on-device and offline inference scenarios.
  • Latency-bound scenarios where fast response times are critical.
  • Cost-constrained use cases, particularly those with simpler tasks.

For more on small language models, see our Microsoft Source Blog .

Thanks to their smaller size, Phi-3 models can be used in compute-limited inference environments. Phi-3-mini, in particular, can be used on-device, especially when further optimized with ONNX Runtime for cross-platform availability. The smaller size of Phi-3 models also makes fine-tuning or customization easier and more affordable. In addition, their lower computational needs make them a lower cost option with much better latency. The longer context window enables taking in and reasoning over large text content—documents, web pages, code, and more. Phi-3-mini demonstrates strong reasoning and logic capabilities, making it a good candidate for analytical tasks. 

Customers are already building solutions with Phi-3. One example where Phi-3 is already demonstrating value is in agriculture, where internet might not be readily accessible. Powerful small models like Phi-3 along with Microsoft copilot templates are available to farmers at the point of need and provide the additional benefit of running at reduced cost, making AI technologies even more accessible.  

ITC, a leading business conglomerate based in India, is leveraging Phi-3 as part of their continued collaboration with Microsoft on the copilot for Krishi Mitra, a farmer-facing app that reaches over a million farmers.

“Our goal with the Krishi Mitra copilot is to improve efficiency while maintaining the accuracy of a large language model. We are excited to partner with Microsoft on using fine-tuned versions of Phi-3 to meet both our goals: efficiency and accuracy!”
Saif Naik, Head of Technology, ITCMAARS

Originating in Microsoft Research, Phi models have been broadly used, with Phi-2 downloaded over 2 million times. The Phi series of models has achieved remarkable performance with strategic data curation and innovative scaling. The series progressed from Phi-1, a model used for Python coding, to Phi-1.5, which enhanced reasoning and understanding, and then to Phi-2, a 2.7 billion-parameter model outperforming those up to 25 times its size in language comprehension. [1] Each iteration has leveraged high-quality training data and knowledge transfer techniques to challenge conventional scaling laws.

Get started today

To experience Phi-3 for yourself, start by trying the model on Azure AI Playground. You can also find the model on the Hugging Chat playground. Start building with and customizing Phi-3 for your scenarios using Azure AI Studio. Join us to learn more about Phi-3 during a special live stream of the AI Show.

[1] Microsoft Research Blog, Phi-2: The surprising power of small language models, December 12, 2023.


Microsoft Research Blog

Microsoft at ASPLOS 2024: Advancing hardware and software for high-scale, secure, and efficient modern applications

Published April 29, 2024

By Rodrigo Fonseca, Sr Principal Research Manager, and Madan Musuvathi, Partner Research Manager


Modern computer systems and applications, with unprecedented scale, complexity, and security needs, require careful co-design and co-evolution of hardware and software. The ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) is the main forum where researchers bridge the gap between architecture, programming languages, and operating systems to advance the state of the art.

ASPLOS 2024 is taking place in San Diego between April 27 and May 1, and Microsoft researchers and collaborators have a strong presence, with members of our team taking on key roles in organizing the event. This includes participation in the program and external review committees and leadership as the program co-chair.

We are pleased to share that eight papers from Microsoft researchers and their collaborators have been accepted to the conference, spanning a broad spectrum of topics. In the field of AI and deep learning, subjects include power and frequency management for GPUs and LLMs, the use of processing-in-memory for deep learning, and instrumentation frameworks. Regarding infrastructure, topics include memory safety with CHERI, I/O prefetching in modern storage, and smart oversubscription of burstable virtual machines. This post highlights some of this work.

Paper highlights

Characterizing Power Management Opportunities for LLMs in the Cloud

The rising popularity of LLMs and generative AI has led to an unprecedented demand for GPUs. However, the availability of power is a key limiting factor in expanding a GPU fleet. This paper characterizes the power usage in LLM clusters, examines the power consumption patterns across multiple LLMs, and identifies the differences between inference and training power consumption patterns. This investigation reveals that the average and peak power consumption in inference clusters is not very high, and that there is substantial headroom for power oversubscription. Consequently, the authors propose POLCA: a framework for power oversubscription that is robust, reliable, and readily deployable for GPU clusters. It can deploy 30% more servers in the same GPU clusters for inference tasks, with minimal performance degradation.
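The budgeting arithmetic behind power oversubscription can be sketched in a few lines. This is only an illustration and not POLCA itself: the function name `extra_servers`, the `safety_margin` parameter, and the wattage figures are hypothetical. The sketch assumes a cluster whose power budget was sized at each server's rated power, and asks how many more servers would fit if planning used the observed peak draw instead.

```python
def extra_servers(n_provisioned, rated_watts, observed_peak_watts, safety_margin=0.0):
    """Return how many servers could be added under the same power budget.

    All names and numbers are illustrative assumptions, not values from the paper.
    """
    if not 0 < observed_peak_watts <= rated_watts:
        raise ValueError("observed peak must be positive and at most the rated power")
    if safety_margin < 0:
        raise ValueError("safety_margin must be non-negative")
    # The budget is whatever was reserved when planning at rated power.
    budget_watts = n_provisioned * rated_watts
    # Plan each server at its observed peak plus a safety margin.
    planned_watts = observed_peak_watts * (1.0 + safety_margin)
    return max(0, int(budget_watts // planned_watts) - n_provisioned)


if __name__ == "__main__":
    # Hypothetical cluster: 100 servers rated at 1000 W, observed peak of 750 W.
    print(extra_servers(100, 1000.0, 750.0))
```

A practical system would also need a way to cap power when aggregate draw approaches the budget; that part is outside the scope of this sketch.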

PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization

PIM-DL is the first deep learning framework specifically designed for off-the-shelf processing-in-memory (PIM) systems, capable of offloading most computations in neural networks. Its goal is to surmount the computational limitations of PIM hardware by replacing traditional compute-heavy matrix multiplication operations with lookup tables (LUTs). This lets neural networks operate efficiently on PIM architectures while significantly reducing the need for complex arithmetic operations. PIM-DL demonstrates significant speed improvements, achieving up to ~37x faster performance than traditional GEMM-based systems and showing competitive speedups against CPUs and GPUs.
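One common way to trade multiplications for table lookups is a product-quantization scheme, sketched below. Whether PIM-DL uses exactly this scheme is an assumption; the summary above only states that LUTs replace matrix multiplication. The names `build_tables` and `lut_matvec`, the centroids, and the weight values are hypothetical. Each input vector is split into sub-vectors, each sub-vector is snapped to its nearest centroid, and the precomputed centroid-times-weight products are summed.

```python
def build_tables(centroids, weights, sub_dim):
    """Precompute tables[s][k][j] = centroid k of sub-space s dotted with
    the matching rows of column j of the weight matrix (offline step)."""
    n_out = len(weights[0])
    tables = []
    for s, sub_centroids in enumerate(centroids):
        rows = weights[s * sub_dim:(s + 1) * sub_dim]
        tables.append([
            [sum(c[i] * rows[i][j] for i in range(sub_dim)) for j in range(n_out)]
            for c in sub_centroids
        ])
    return tables


def lut_matvec(x, centroids, tables, sub_dim):
    """Approximate x @ W using nearest-centroid search plus table lookups."""
    n_out = len(tables[0][0])
    out = [0.0] * n_out
    for s, sub_centroids in enumerate(centroids):
        sub = x[s * sub_dim:(s + 1) * sub_dim]
        # Pick the centroid with the smallest squared distance to this sub-vector.
        k = min(
            range(len(sub_centroids)),
            key=lambda idx: sum((a - b) ** 2 for a, b in zip(sub, sub_centroids[idx])),
        )
        # Accumulate the precomputed partial result; no multiplies with W here.
        for j in range(n_out):
            out[j] += tables[s][k][j]
    return out


if __name__ == "__main__":
    # Hypothetical 4x2 weight matrix and two centroids per 2-wide sub-space.
    W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    C = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [2.0, 0.0]]]
    T = build_tables(C, W, sub_dim=2)
    print(lut_matvec([0.1, 0.9, 1.9, 0.2], C, T, sub_dim=2))
```

The result is exact only when every sub-vector coincides with a centroid; otherwise accuracy depends on how well the centroids cover the input distribution.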

Cornucopia Reloaded: Load Barriers for CHERI Heap Temporal Safety

Memory safety bugs have persistently plagued software for over 50 years and underpin some 70% of common vulnerabilities and exposures (CVEs) every year. The CHERI capability architecture is an emerging technology (especially through Arm’s Morello and Microsoft’s CHERIoT platforms) for spatial memory safety and software compartmentalization. In this paper, the authors demonstrate the viability of object-granularity heap temporal safety built atop CHERI with considerably lower overheads than prior work.

AUDIBLE: A Convolution-Based Resource Allocator for Oversubscribing Burstable Virtual Machines

Burstable virtual machines (BVMs) are a type of virtual machine in the cloud that allows temporary increases in resource allocation. This paper shows how to oversubscribe BVMs. It first studies the characteristics of BVMs on Microsoft Azure and explains why traditional approaches based on using a fixed oversubscription ratio or based on the Central Limit Theorem do not work well for BVMs: they lead to either low utilization or high server capacity violation rates. Based on the lessons learned from the workload study, the authors developed a new approach, called AUDIBLE, using a nonparametric statistical model. This makes the approach lightweight and workload independent. This study shows that AUDIBLE achieves high system utilization while enforcing stringent requirements on server capacity violations.
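The general idea of a convolution-based, nonparametric capacity check can be sketched as follows. This is not the AUDIBLE algorithm: it assumes that VM demands are independent and that each VM's usage is given as an empirical histogram over whole resource units, and the function names and example numbers are hypothetical. Convolving the per-VM histograms yields the distribution of total demand, from which the probability of exceeding server capacity follows directly.

```python
def convolve(p, q):
    """Distribution of the sum of two independent discrete demands."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def aggregate_pmf(pmfs):
    """Convolve all per-VM histograms; index u holds P(total demand == u)."""
    total = [1.0]  # zero VMs means zero demand with probability 1
    for pmf in pmfs:
        total = convolve(total, pmf)
    return total


def violation_probability(pmfs, capacity):
    """Probability that total demand exceeds `capacity` resource units."""
    if capacity < 0:
        raise ValueError("capacity must be non-negative")
    return sum(aggregate_pmf(pmfs)[capacity + 1:])


if __name__ == "__main__":
    # Hypothetical burstable VM: idle (0 units) or bursting (2 units), equally likely.
    vm = [0.5, 0.0, 0.5]
    print(violation_probability([vm, vm], capacity=3))
```

An allocator built on this check could keep placing VMs on a server while the violation probability stays below a chosen threshold, which is how such a model avoids committing to a single fixed oversubscription ratio.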

Complete list of accepted publications by Microsoft researchers

Amanda: Unified Instrumentation Framework for Deep Neural Networks
Yue Guan, Yuxian Qiu, and Jingwen Leng; Fan Yang, Microsoft Research; Shuo Yu, Shanghai Jiao Tong University; Yunxin Liu, Tsinghua University; Yu Feng and Yuhao Zhu, University of Rochester; Lidong Zhou, Microsoft Research; Yun Liang, Peking University; Chen Zhang, Chao Li, and Minyi Guo, Shanghai Jiao Tong University

AUDIBLE: A Convolution-Based Resource Allocator for Oversubscribing Burstable Virtual Machines
Seyedali Jokar Jandaghi and Kaveh Mahdaviani, University of Toronto; Amirhossein Mirhosseini, University of Michigan; Sameh Elnikety, Microsoft Research; Cristiana Amza and Bianca Schroeder, University of Toronto

Characterizing Power Management Opportunities for LLMs in the Cloud
Pratyush Patel, Microsoft Azure and University of Washington; Esha Choukse, Chaojie Zhang, and Íñigo Goiri, Azure Research; Brijesh Warrier, Nithish Mahalingam, and Ricardo Bianchini, Microsoft Azure Research

Cornucopia Reloaded: Load Barriers for CHERI Heap Temporal Safety
Nathaniel Wesley Filardo, University of Cambridge and Microsoft Research; Brett F. Gutstein, Jonathan Woodruff, Jessica Clarke, and Peter Rugg, University of Cambridge; Brooks Davis, SRI International; Mark Johnston, University of Cambridge; Robert Norton, Microsoft Research; David Chisnall, SCI Semiconductor; Simon W. Moore, University of Cambridge; Peter G. Neumann, SRI International; Robert N. M. Watson, University of Cambridge

CrossPrefetch: Accelerating I/O Prefetching for Modern Storage
Shaleen Garg and Jian Zhang, Rutgers University; Rekha Pitchumani, Samsung; Manish Parashar, University of Utah; Bing Xie, Microsoft; Sudarsun Kannan, Rutgers University

Kimbap: A Node-Property Map System for Distributed Graph Analytics
Hochan Lee, University of Texas at Austin; Roshan Dathathri, Microsoft Research; Keshav Pingali, University of Texas at Austin

PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization
Cong Li and Zhe Zhou, Peking University; Yang Wang, Microsoft Research; Fan Yang, Nankai University; Ting Cao and Mao Yang, Microsoft Research; Yun Liang and Guangyu Sun, Peking University

Predict; Don’t React for Enabling Efficient Fine-Grain DVFS in GPUs
Srikant Bharadwaj, Microsoft Research; Shomit Das, Qualcomm; Kaushik Mazumdar and Bradford M. Beckmann, AMD; Stephen Kosonocky, Uhnder

Conference organizers from Microsoft

Program Co-Chair

Madan Musuvathi

Submission Chairs

Jubi Taneja
Olli Saarikivi

Program Committee

Abhinav Jangda
Aditya Kanade
Ashish Panwar
Jacob Nelson
Jay Lorch
Jilong Xue
Paolo Costa
Rodrigo Fonseca
Shan Lu
Suman Nath
Tim Harris

External Review Committee

Career opportunities

Microsoft welcomes talented individuals across various roles at Microsoft Research, Azure Research, and other departments. We are always pushing the boundaries of computer systems to improve the scale, efficiency, and security of all our offerings. You can review our open research-related positions here.

Related publications

Kimbap: A Node-Property Map System for Distributed Graph Analytics
Predict; Don’t React for Enabling Efficient Fine-Grain DVFS in GPUs
Amanda: Unified Instrumentation Framework for Deep Neural Networks
CrossPrefetch: Accelerating I/O Prefetching for Modern Storage

Meet the authors

Rodrigo Fonseca
Sr Principal Research Manager

Madan Musuvathi
Partner Research Manager

COMMENTS

  1. Global predictors of language endangerment and the future of linguistic

    As with global biodiversity, the world's language diversity is under threat. Of the approximately 7,000 documented languages, nearly half are considered endangered 1,2,3,4,5,6,7,8. In comparison ...

  2. The Psychology of Communication: The Interplay Between Language and

    Just as language shapes our thoughts and perceptions of the world, so too does one's culture. For the purpose of the current work, culture can be defined as the learned and shared systems of beliefs, values, preferences, and social norms that are spread by shared activities (Arshad & Chung, 2022; Bezin & Moizeau, 2017). Over the past 50 years, the Journal of Cross-Cultural Psychology (JCCP ...

  3. The Littlest Linguists: New Research on Language Development

    Specifically, caregivers' language is often fine-tuned to children's current linguistic knowledge and vocabulary, providing an optimal level of complexity to support language learning. In their new research, Leung and colleagues add to the body of knowledge involving how caregivers foster children's language acquisition.

  4. Changing perceptions of language in sociolinguistics

    Abstract. This paper traces the changing perceptions of language in sociolinguistics. These perceptions of language are reviewed in terms of language in its verbal forms, and language in vis-à ...

  5. Research on learning and teaching of languages other than English in

    Language education research papers have recently been giving increased attention to language learners, a reflection of the advances made in learner-centred language teaching (e.g. Ushioda, 2020). It therefore comes as no surprise that research on language learners occupies a central position in the studies we reviewed. The six articles we ...

  6. Languages

    Languages is an international, multidisciplinary, peer-reviewed, open access journal on interdisciplinary studies of languages published monthly online by MDPI. The European Society for Transcultural and Interdisciplinary Dialogue (ESTIDIA) is affiliated with Languages and its members receive discounts on the article processing charges. Open Access — free for readers, with article processing ...

  7. Research on Language and Social Interaction

    The journal as we now know it started when Stuart Sigman took over Papers in Linguistics on the death of Tony Vanek in 1987. He renamed it Research on Language in Social Interaction, putting out a double issue on multi-channel codes. Thereafter Robert Sanders took on the responsibilities of Editor (with Sigman staying on as Associate Editor ...

  8. Language and Literature: Sage Journals

    Language and Literature is an invaluable international peer-reviewed journal that covers the latest research in stylistics, defined as the study of style in literary and non-literary language. We publish theoretical, empirical and experimental research that aims to make a contribution to our understanding of style and its effects on readers.

  9. Second Language Research: Sage Journals

    Second Language Research is an international peer-reviewed, quarterly journal, publishing original theory-driven research concerned with second language acquisition and second language performance. This includes both experimental studies and contributions aimed at exploring conceptual issues. In addition to providing a forum for investigators in the field of non-native language learning...

  10. (PDF) LANGUAGE AND DIALECT: CRITERIA AND HISTORICAL EVIDENCE

    This paper will propose better criteria towards differentiation of language and dialect basing the argument on the empirical evidence of the history of linguistics. Content may be subject to ...

  11. (PDF) EXPLORING THE IMPACT OF CULTURE ON LANGUAGE ...

    The research aimed at Developing the Language Repertoire of Non-Native Arabic Novice Learners by Using Web Based Semantic Fields in Light of the European Framework of Reference for Language ...

  12. Language: Its Origin and Ongoing Evolution

    With the present paper, we sought to use research findings to illustrate the following thesis: the evolution of language follows the principles of human evolution. We argued that language does not exist for its own sake, it is one of a multitude of skills that developed to achieve a shared communicative goal, and all its features are reflective ...

  13. Research priorities in the field of multilingualism and language

    This paper aims at identifying and explaining research priorities in the field of multilingualism and language education in a cross-national perspective. It draws on data from a survey with 298 expert participants in five European countries (Germany, Italy, the Netherlands, Portugal, Spain) who ranked pre-identified research topics in relation ...

  14. Learning a Foreign Language: A Review on Recent Findings About Its

    Similarly, research performed by Bak et al. on a short 1-week Scottish Gaelic course on attentional functions among 67 adults aged between 18 years and 78 years reveals that even a short period of intensive language learning can modulate attentional functions and that all age groups can benefit from this effect. The results showed that at the ...

  15. Research on Language and Learning: implications for Language Teaching

    Abstract. Taking into account several limitations of communicative language teaching (CLT), this paper calls for the need to consider research on language use and learning through communication as ...

  16. Strategies for overcoming language barriers in research

    This paper seeks to describe best practices for conducting cross-language research with individuals who have a language barrier. Discussion paper. Research methods papers addressing cross-language research issues published between 2000-2017. Rigorous ...

  17. A comprehensive survey on Indian regional language processing

    There are many research works and applications like (1) Chatbot (2) Text-to-speech conversion (3) Language Identification (4) Hands-free computing (5) Spell-check (6) Summarizing-electronic medical records (7) Sentiment Analysis and so on, developed to handle these natural languages for real time needs. In this paper, various methods used to ...

  18. Low-Resource Indic Languages Translation Using Multilingual ...

    Multilingual neural machine translation (MNMT) has the advantage of being scalable across multiple languages and improving low-resource languages via knowledge transfer. In this work, we investigate MNMT for low-resource Indic languages—Hindi, Bengali, Assamese, Manipuri, and Mizo. With the recent success of massively multilingual pre-trained ...

  19. Language Teaching Research: Sage Journals

    Language Teaching Research is a peer-reviewed journal that publishes research within the area of second or foreign language teaching. Although articles are written in English, the journal welcomes studies dealing with the teaching of languages other … | View full journal description. This journal is a member of the Committee on Publication ...

  20. (PDF) Linguistic research in the Philippines: Trends, prospects

    2.2 Survey of Studies in Linguistics in the Last Ten Years (2000-2009) The Philippine Journal of Linguistics focuses on studies in descriptive, comparative, historical, and areal linguistics ...

  21. Automated Social Science: Language Models as Scientist and Subjects

    Founded in 1920, the NBER is a private, non-profit, non-partisan organization dedicated to conducting economic research and to disseminating research findings among academics, public policy makers, and ... Working Papers; Automated Social Science: Language… Automated Social Science: Language Models as Scientist and Subjects ...

  22. Harnessing large language models for coding, teaching and inclusion to

    Methods in Ecology and Evolution is an open access journal publishing papers across a wide range of subdisciplines, disseminating new methods in ecology and evolution. Abstract Large language models (LLMs) are a type of artificial intelligence (AI) that can perform various natural language processing tasks. ... Much current research is focussed ...

  23. [2404.14219] Phi-3 Technical Report: A Highly Capable Language Model

    We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our ...

  24. fairseq/examples/mms/README.md at main

    The Massively Multilingual Speech (MMS) project expands speech technology from about 100 languages to over 1,000 by building a single multilingual speech recognition model supporting over 1,100 languages (more than 10 times as many as before), language identification models able to identify over 4,000 languages (40 times more than before), pretrained models supporting over 1,400 languages, and ...

  25. (PDF) NATURAL LANGUAGE PROCESSING: TRANSFORMING HOW ...

    Natural Language Processing (NLP) stands as a pivotal advancement in the field of artificial intelligence, revolutionizing the way machines comprehend and interact with human language. This paper ...

  26. Introducing Phi-3: Redefining what's possible with SLMs

    Small language models, like Phi-3, are especially great for: Resource constrained environments including on-device and offline inference scenarios. Latency bound scenarios where fast response times are critical. Cost constrained use cases, particularly those with simpler tasks. For more on small language models, see our Microsoft Source Blog.

  27. Microsoft at ASPLOS 2024: Advancing hardware and software for high

    Modern computer systems and applications, with unprecedented scale, complexity, and security needs, require careful co-design and co-evolution of hardware and software. The ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (opens in new ...

  28. (PDF) English Language Teaching: Challenges and ...

    The current paper displays the failure of traditional methods used in the English language classrooms which are purely teacher-centred and do not allow the students to be competent in English.