Dialect (tree taxon) | Type | Subtype | Merger state (bimoraic nouns) | Data source |
---|---|---|---|---|
Tokyo | Tokyo | Churin | | |
Hiroshima | Tokyo | Churin | | |
Totsukawa | Tokyo | Nairin | | |
Hirosaki | Tokyo | Gairin | | |
Oita | Tokyo | Gairin | | |
Kyoto | Keihan | Central | | |
Tarui | Keihan | Tarui (C type) | | |
Kan’onji | Keihan | Sanuki | | |
Ibukijima | Keihan | Ibukijima | | |
Bumoki | Keihan | Bumoki | | |
Myogisho | Keihan | Myogisho | | |
Miyakonojo | N-kei | 1-kei | | |
Kagoshima | N-kei | 2-kei | | |
Nagasaki | N-kei | 2-kei | | |
Gonza’s materials | N-kei | 2-kei | | |
Phylogenetic relationship of pitch-accent systems hypothesized in previous studies. (a) Kindaichi’s hypothesis ( Kindaichi 1964 , 1966a , 1967 , 1975a , b , 1978 ). (b) Tokugawa’s hypothesis ( Tokugawa 1962 ). Some modern dialects are assigned to internal nodes instead of leaves because the two linguists both posit, unlike standard comparative linguistics, that one pitch-accent is derived from another, instead of hypothesizing a common ancestral pitch-accent. Also, note that both panels assign 14 / 235 (called B type) as the merger state of Tarui, instead of 14 / 23 / 5 (called C type) used in our dataset.
Although Kindaichi’s hypothesis is famous and deemed prevailing, it has been exposed to various criticisms from different perspectives. First, Hattori (1985) points out that the hypothesis does not follow the standard comparative methods, as it claimed that Churin type had evolved from Keihan (central) type, instead of reconstructing an ancient pitch-accent system which is ancestral to both Churin and central types. Tokugawa (1962) posited two lineages with the merger state 1/2/3/45, each including both Churin and Gairin type dialects ( Fig. 2b ), which sharply contradicts Kindaichi’s hypothesis. Moreover, Uwano (2006) even challenged the assumption that the common ancestor of the mainland pitch-accent systems had the same number of accentual classes that are seen in Ruiju Myogisho. Uwano hypothesized that the common ancestor had a more complicated pitch-accent system and that more accentual classes, which are no longer found in the modern dialects or written records, were present. So far, no consensus has been established as to the phylogenetic tree of Japanese pitch-accent systems.
Nevertheless, the merger of accentual classes is an indispensable cue in reconstructing the phonetic phylogeny of Japanese dialects. An advantage of analyzing accentual classes is that we can infer the phylogenetic relationship of different pitch-accent systems in a way that is not subject to borrowing. While the accent pattern for individual words may be borrowed among dialects, substituting the representative accent pattern of a class means all their component words simultaneously change their pronunciation, which is not likely to happen by borrowing.
However, previous arguments about the phylogeny of Japanese pitch-accent systems based on the merger state have been made with little quantitative scrutiny. In other words, while linguists have been able to propose plausible hypotheses, there has been little statistical evaluation of their relative credence. Perhaps that is a reason why the conventional studies of phylogenetic reconstruction have not reached a consensus. In hope of improving the situation, this study presents a statistically grounded method to infer the phylogenetic tree of the pitch-accent systems of modern Japanese dialects based on the principle of the accentual class merger.
Statistical methods for the inference of phylogenetic trees were originally developed in the field of evolutionary biology. These methods are based on a dataset of the variation of morphological characters or nucleotide sequences in present-day species (i.e. terminal taxa) and take into account how the characters or sequences have evolved on the tree branches. Subsequently, linguists applied the same approach to infer phylogenetic trees of languages under the premise that linguistic traits are inherited from generation to generation, like the genetic traits transmitted from parents to offspring. The statistical models have been applied to infer the phylogenetic relationship within a language family (Bouckaert et al., 2012; Currie et al., 2013; Gray et al., 2009; Koile et al., 2022; Lee and Hasegawa 2011; Saitou and Jinam 2017), replacement rates among cognate groups (Pagel and Meade 2017; Pagel et al., 2007), and unknown locations of ancestral languages (Bouckaert et al., 2012). On the technical side, alongside algorithms for phylogenetic analysis such as maximum parsimony, neighbor joining (Saitou and Nei 1987), and maximum likelihood analysis, a growing body of literature employs Bayesian statistics for the reconstruction of linguistic phylogeny (Hoffmann et al., 2021). One of the advantages of Bayesian phylogenetic analysis is the ease with which the model can be extended; previous studies have presented models that take into account various phenomena such as spatial diffusion (Lemey et al., 2008, 2010; Takahashi and Ihara 2023) and borrowing (Neureiter et al., 2022).
In this article, to infer the evolutionary history of pitch-accent systems in modern Japanese dialects, we will perform a Bayesian phylogenetic analysis on data of accent patterns assigned to accentual classes. In performing Bayesian phylogenetic analysis, we compute the probability of observing the known data given a phylogenetic tree (likelihood), based on a substitution model representing how the linguistic features change. A major obstacle to be overcome is that most of the substitution models previously proposed for linguistic phylogeny were designed for lexical data (reviewed in Hoffmann et al., 2021 ), and that there is no standard method for the phylogenetic analysis of pitch accent. In particular, although the merger of accentual classes appears to be the key to resolve the phylogeny of Japanese pitch-accent systems, to our knowledge, no previous models have taken this process into account. Most previous studies in linguistic phylogeny convert features of languages into binary characters, namely the presence/absence of cognates or grammatical features, and represent the transition among the feature states as a Markov model. Obviously, if we should extract binary features of pitch-accent systems and assume they evolve independently, the model would miss the evolutionary signal of accentual classes.
We will thus develop a novel model, in which pitch-accent systems evolve on tree branches via the merger of accentual classes and the substitution of accent patterns assigned to each class. The model we will use in this study considers two phenomena for statistical inference: geographical (spatial) diffusion and mutation. On the one hand, to represent the diffusion, our model will be based on the general Bayesian framework described in Takahashi and Ihara (2023), which treats the evolutionary dynamics in a network through a discrete-time model. This framework computes the tree prior based on a weighted adjacency matrix, representing the rate of cultural transmission among multiple populations. On the other hand, to represent the mutation of pitch-accent systems, we will model the merger of accentual classes and the replacement of the accent patterns of each class.
We use the Bayesian framework described in Takahashi and Ihara (2023) , which regards the space as a network of n discrete nodes that are connected with each other by weighted edges representing the spatial transmission between them.
To apply this framework to the geography of mainland Japan, we assign lattice sites every 0.25 degrees of longitude and every 0.167 degrees of latitude, which gives a grid network consisting of square cells with approximately 20 km for each side. Note that we exclude from our analysis the Ryukyuan language, whose pitch-accent systems display no clear correspondence with the accentual classes of the mainland dialects, and Hokkaido, which has experienced an extensive immigration from the mainland toward the end of the 19th century. The grid is composed of 691 square cells denoted by P 1 ⋯ P n (i.e. n = 691 ).
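As a rough check on the stated cell size, the lattice spacing can be converted to kilometres. The sketch below assumes a spherical Earth with a radius of 6371 km and a reference latitude of 35° N; both values are illustrative assumptions, not figures taken from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a spherical Earth

def cell_size_km(dlon_deg, dlat_deg, lat_deg):
    """Return the (east-west, north-south) extent in km of one lattice cell."""
    km_per_deg = math.pi * EARTH_RADIUS_KM / 180.0  # length of one degree of latitude
    north_south = dlat_deg * km_per_deg
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = dlon_deg * km_per_deg * math.cos(math.radians(lat_deg))
    return east_west, north_south

# 35 N is an assumed reference latitude for mainland Japan.
ew, ns = cell_size_km(0.25, 0.167, 35.0)
print(f"east-west: {ew:.1f} km, north-south: {ns:.1f} km")
```

Under these assumptions both sides come out close to 20 km, with the east-west side somewhat longer at lower latitudes and shorter at higher ones.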
As we will see in later subsections, the transmission of pitch-accent systems in space is modeled with a function of the geographical distance and population size. To assign a population estimate of the Heian period to each of the 691 cells, we use the population size of each province (ryoseikoku), the ancient administrative unit established in the Asuka period (592–710), estimated by Hattori (1959) as the source data. Since a province occupied a far wider area than a 20 km × 20 km cell, we calibrate the estimates based on the variation of the present-day population size, surveyed at the level of 1 km × 1 km squares in 1995 (e-Stat. Portal Site of Official Statistics of Japan 2016), to improve the resolution. See electronic supplementary material for further details about the calibration of population sizes.
2.2.1. Locations of taxon dialects and studied word categories
We investigate the phylogenetic relationship of the pitch-accent systems of 15 dialects spoken at 12 out of 691 cells in mainland Japan ( Figure 3 ) based on the accent patterns for the following six word categories: monomoraic nouns (denoted as C 1 ), bimoraic nouns ( C 2 ), bimoraic verbs ( C 3 ), trimoraic godan verbs ( C 4 ), trimoraic ichidan verbs ( C 5 ), and trimoraic adjectives ( C 6 ). Note that ‘godan’ and ‘ichidan’ refer to conjugation types of verbs. Thus, in our framework, each dialect is characterized by a pitch-accent system that specifies the mappings between accent patterns (i.e. sounds) and accentual classes (i.e. words) for each of the six word categories. Let l ( i ) denote the number of accentual classes of C i ( 1 ≤ i ≤ 6 ) . Previous studies have identified accentual classes for these word categories, which gives l ( 1 ) = 3 , l ( 2 ) = 5 (i.e. the five classes of bimoraic nouns mentioned in section 1), l ( 3 ) = 2 , l ( 4 ) = 3 , l ( 5 ) = 2 , and l ( 6 ) = 2 .
Locations of the modern dialects analyzed in this study. This figure was created based on the base map in Geospatial Information Authority of Japan (2006) .
We collect from the existing literature the data on the accent patterns of different accentual classes for each word category in each dialect. Although most of the words belonging to the same accentual class have the same accent pattern, the literature indicates that some words do not have the representative accent pattern of the class, probably because their accent patterns have either mutated or been borrowed individually. Ignoring such exceptions, we regard the accent pattern assigned to the majority of a class’s component words as the accent pattern of the class. Also, some literature already picked up a representative accent pattern for each class, in which case we simply used the accent pattern. The accent patterns are coded using five characters H, L, M, F, and R, which respectively represent high, low, middle, falling, and rising pitches assigned to each mora. For nouns, since some accent patterns are distinguished by the presence/absence of the pitch drop placed right after the word, we code the accent pattern of words as they are pronounced with a monomoraic postpositional particle, such as ‘ga’ or ‘mo’. From these, the merger state of each accent system for each word category is determined. Table 2 summarizes the conventional types and subtypes proposed by dialectologists, the merger state for bimoraic nouns, and the data sources for the 15 dialects. The whole dataset is shown in appendix (see Supplementary Table SA1 and SA2 ).
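The two coding steps described above (taking the majority pattern of a class, then reading off the merger state from the class-level patterns) can be sketched as follows. The function names and all example data are hypothetical and are not taken from the dataset.

```python
from collections import Counter

def representative_pattern(word_patterns):
    """Accent pattern carried by the largest number of words in one class."""
    return Counter(word_patterns).most_common(1)[0][0]

def surface_merger_state(class_patterns):
    """Group 1-based class indexes that share a pattern, e.g. '1/23/45'."""
    blocks = {}
    for index, pattern in enumerate(class_patterns, start=1):
        blocks.setdefault(pattern, []).append(str(index))
    return "/".join("".join(block) for block in sorted(blocks.values()))

# Hypothetical word-level data for one class: one exceptional word is ignored.
words = ["LHH", "LHH", "LHL", "LHH"]
majority = representative_pattern(words)

# Hypothetical class-level patterns for the five classes of bimoraic nouns.
state = surface_merger_state(["LHH", "LHL", "LHL", "HLL", "HLL"])
print(majority, state)
```

Classes that share a surface accent pattern fall into the same block, which is how the slash notation for merger states is read.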
Note that three out of the fifteen dialects are ancestral dialects reconstructed from historical documents: Ruiju Myogisho (hereafter Myogisho), Bumoki, and Gonza’s materials. First, Myogisho and Bumoki are historical documents which are considered to reflect the pitch-accent system of Kyoto in the Heian and Muromachi periods, respectively (Kindaichi 1975a). On the other hand, Gonza’s materials, which show the pitch-accent system of the old Kagoshima dialect (Kibe 1997), are bilingual manuscripts written by Gonza, a Japanese castaway who arrived in Russia from Kagoshima in the early 18th century. The geographical locations of both Bumoki and Myogisho are assumed to be identical to that of the modern Kyoto dialect, whereas the geographical location of Gonza’s materials is the same as that of the modern Kagoshima dialect.
In Hirosaki and Kagoshima, the unit of the accent is not mora but syllable, that is, each syllable is assigned a pitch marked by H, L, and so on. In order to make the number of accent units consistent among every tree taxon dialect, we coded the accent patterns of words whose mora breaks concur with syllable breaks. Exceptionally, the mora break cannot concur with the syllable break for adjectives because the common ending -i does not constitute a syllable, so the number of syllables is always one less than the number of morae. In order to analyze the accent pattern of three accent units, we coded the accent pattern of the quadrimoraic (trisyllabic) adjectives instead of trimoraic (bisyllabic) adjectives for Hirosaki and Kagoshima. For Hirosaki, we used the accent pattern of yasashii (kind) for the first class and kibishii (strict) for the second class, as they were picked up as typical accent patterns ( Uwano 1990 ). For Kagoshima, we assigned the accent pattern of kanashii (sad) and munashii (futile) for the first class and the accent pattern of kuwashii (detailed), shitashii (close), suzushii (cool), tadashii (correct), and hitoshii (equal) for the second class (Hirayama).
As Ibukijima is a small island off the coast of Kan’onji city, the two taxa Ibukijima and Kan’onji are assigned to the same cell due to the relatively coarse resolution (i.e. around 20 km × 20 km) of the grid. However, the Bayesian framework of Takahashi and Ihara (2023) cannot be applied to the case where multiple taxa share the same network node. As a proxy for the actual location of the island, we assigned the pitch-accent system of the Ibukijima dialect to a cell in Shikoku, which is about 33 km away from the island.
In Kyoto, we did not consider the automatic word lengthening of monomoraic nouns.
In Hirosaki, the merger state of bimoraic nouns is 12 / 3 / 45 for nouns whose second mora has a close vowel and 12 / 345 otherwise. We coded the accent patterns of the former, because it is thought to conserve old characteristics in view of the distinction of accentual classes. Also, words in the Hirosaki dialect, which distinguishes accent patterns by the position of pitch-rise, have different accent patterns when ending and continuing a sentence, the former of which was coded in this study.
In Tarui, the pitch-accent system has relatively recently undergone a merger, and the merger state is either 14 / 23 / 5 (called C type) or 14 / 235 (B type) depending on regions and literature. We assigned the accent patterns of the C type.
Accent data on Nagasaki were placed on the Shimabara Peninsula of Nagasaki Prefecture instead of Nagasaki city, because one of our data sources (Kibe 2000) mainly discusses the pitch-accent system of Shimabara. Nevertheless, the pitch-accent system of Nagasaki city is basically the same as that of Shimabara (Hirayama 1951).
In order to distinguish observed variables from the latent variables mentioned later, the accent patterns and merger state that are observed in the dataset are referred to as surface accent pattern and surface merger state , respectively. The surface accent pattern for the j th accentual class of the word category C i is denoted by D i j ( 1 ≤ i ≤ 6 , 1 ≤ j ≤ l ( i ) ), and we define the vector D i = ( D i 1 ⋯ D i l ( i ) ) . Also, the surface merger state of the word category C i is denoted by M i .
To perform a Bayesian Markov Chain Monte Carlo (MCMC) simulation, we develop a generative model that assigns a pitch-accent system to every one of the 691 cells in the grid. In this model, the pitch-accent system of a dialect is specified by the set of accent patterns for all accentual classes of the six word categories. With regard to the i th word category C i , every cell has two latent variables N i and b i .
First, N i of a given cell represents its extended merger state for word category C i . Unlike the surface merger state M i , which is represented using slash marks, the extended merger state N i contains information about which of the classes in a merged block has overridden the accent pattern of the other classes in the block. The extended merger state is represented as a set of strings, such as { ′ 1 ′ ,′ 234 ′ ,′ 5 ′ } , where each string represents the indexes of classes belonging to each merged block. Importantly, the first character in each string represents the index of the accentual class which overrode the other classes. The extended merger state { ′ 1 ′ , ′ 234 ′ , ′ 5 ′ } not only signifies that classes 2, 3, and 4 are merged into a single block but also that the accent pattern of class 2 overrode those of classes 3 and 4 whereas the former accent patterns of classes 3 and 4 were discarded. For each merged block, the class which overrode the others (i.e. the class indexed by the first character of every string) is referred to as the leading class . The extended merger states are distinguished both by the way they merge classes and by the identity of leading classes, so { ′ 1 ′ , ′ 234 ′ , ′ 5 ′ } and { ′ 1 ′ , ′ 324 ′ , ′ 5 ′ } are distinct representations, although they both correspond to the merger state 1 / 234 / 5 . On the other hand, { ′ 1 ′ , ′ 234 ′ , ′ 5 ′ } and { ′ 1 ′ , ′ 243 ′ , ′ 5 ′ } represent the same extended merger state because the leading class of each merged block is identical. The extended merger state is a latent variable because we do not know which class overrode the other classes in the past.
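The identity rules for extended merger states can be made concrete with a small sketch. The representation below (a set of strings whose first character is the leading class) follows the text; the helper names `canonical` and `surface_state` are hypothetical.

```python
def canonical(extended_state):
    """Canonical form: a frozenset of (leading class, set of all members) pairs."""
    return frozenset((block[0], frozenset(block)) for block in extended_state)

def surface_state(extended_state):
    """Drop the leading-class information, e.g. {'1', '324', '5'} -> '1/234/5'."""
    return "/".join(sorted("".join(sorted(block)) for block in extended_state))

a = {"1", "234", "5"}
b = {"1", "324", "5"}  # same blocks as a, but class 3 leads the merged block
c = {"1", "243", "5"}  # same blocks and same leading class as a

print(canonical(a) == canonical(b))  # distinct extended merger states
print(canonical(a) == canonical(c))  # the same extended merger state
print(surface_state(a), surface_state(b))
```

Only the first character of each string and the set of members matter, so the order of the non-leading classes is irrelevant.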
Second, b i = ( b i 1 ⋯ b i l ( i ) ) is a vector representing the latent accent patterns for word category C i , each of whose elements b i j takes one of the accent patterns empirically observed for some accentual class of C i ( 1 ≤ j ≤ l ( i ) ). For example, there are seven distinct accent patterns documented for monomoraic nouns C 1 ( l ( 1 ) = 3 ), namely HL, LH, HM, HH, LL, LM, and MH, constituting the set of values that b 11 , b 12 , and b 13 can take.
One might find it odd to have different variables for the merger state and the accent patterns as these two are mutually dependent in the empirical data. Indeed, b i is a latent variable and may differ from the surface accent patterns D i used for computing the likelihood in the MCMC algorithm. As we will see later, this model setting enables us to reduce the number of possible values for each parameter and to efficiently compute the tree likelihood. Hence, in the present analysis, we define a pitch-accent system by the set of parameters { ( N 1 , b 1 ) , ( N 2 , b 2 ) , ⋯ , ( N 6 , b 6 ) } .
Considering a discrete-time model, each of the square cells P 1 ⋯ P n is assumed to have one pitch-accent system at a given timestep. The evolution of pitch-accent systems is driven by two events that happen at every timestep: transmission (diffusion) and mutation, the details of which will be described in the following two subsections.
At the beginning of each timestep, every cell inherits the pitch-accent system from one of the n cells. Note that the whole pitch-accent system { ( N 1 , b 1 ) , ( N 2 , b 2 ) , ⋯ , ( N 6 , b 6 ) } is transmitted from the chosen cell. We do not allow a cell to copy some of the variables from one cell and the remaining variables from another, which ensures that the evolutionary history of the pitch-accent system is represented as a phylogenetic tree rather than a phylogenetic network. We will compute the tree prior based on the transmission (diffusion) of the lineages in space.
To represent the spatially structured interaction among human populations, we will set the following assumptions. The probability that cell P i inherits the pitch-accent system from cell P j is denoted by a i j and is modeled as follows:
where π j represents the population size of P j , and d i j represents the geographic distance between the two cells, measured in great circle distance. K i is a normalizing factor given by
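The displayed forms of equations (2.1) and (2.2) do not appear in this text. One form consistent with the description (a weight given by the population size π j of the source cell, a Gaussian kernel of standard deviation σ in the distance d i j , and row normalization by K i ) would be the following. The exact functional form, in particular how π j enters, is an assumption and should be checked against Takahashi and Ihara (2023).

```latex
a_{ij} = \frac{\pi_j}{K_i} \exp\!\left( -\frac{d_{ij}^{2}}{2\sigma^{2}} \right),
\qquad
K_i = \sum_{k=1}^{n} \pi_k \exp\!\left( -\frac{d_{ik}^{2}}{2\sigma^{2}} \right)
```

With this normalization, the probabilities a i j sum to one over j for every cell P i .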
The factor exp ( ⋅ ) in equation (2.1) is a Gaussian interaction kernel, so cells tend to learn the pitch-accent system from nearby populations (see Burridge 2017 for a similar model). Hence, phylogenetic trees tend to score a high prior probability if geographically close taxa form clusters. It should also be noted that cells with a large population density exert a large influence on other cells. The parameter σ is the standard deviation of the Gaussian kernel and determines how far the pitch-accent system diffuses among cells; a large value of σ allows transmission among distant cells. Because we found that the MCMC algorithm does not converge unless the parameter σ is fixed, we assume σ = 70 km in this study; the result with a different value of σ is included in the electronic supplementary material (see subsection 2.9).
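A minimal sketch of how such transmission probabilities could be computed is given below. It assumes a kernel of the form π j · exp(−d i j ² / 2σ²) normalized over j, uses the haversine formula for the great-circle distance, and runs on three hypothetical cells with made-up population sizes.

```python
import math

def great_circle_km(p, q, radius_km=6371.0):
    """Haversine distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))

def transmission_matrix(coords, pop, sigma_km=70.0):
    """a[i][j]: probability that cell i copies cell j (assumed functional form)."""
    a = []
    for ci in coords:
        weights = [pj * math.exp(-great_circle_km(ci, cj) ** 2 / (2 * sigma_km ** 2))
                   for cj, pj in zip(coords, pop)]
        k = sum(weights)  # normalizing factor K_i
        a.append([w / k for w in weights])
    return a

# Three hypothetical cells (lat, lon) with made-up population sizes.
coords = [(35.0, 135.75), (34.67, 135.5), (35.67, 139.75)]
pop = [100.0, 50.0, 20.0]
A = transmission_matrix(coords, pop)
for row in A:
    print([round(x, 4) for x in row])
```

Each row sums to one, and a nearby, populous cell receives far more weight than a distant one when σ is 70 km.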
Following the method in Takahashi and Ihara (2023) , we store the information about where the cells inherited the pitch-accent system over the past τ timesteps, where τ represents the maximum possible height of the tree root in the unit of timestep. Let G denote a matrix of dimension τ × n , whose element at t th row and i th column represents the index of the cell from which the cell P i inherited the pitch-accent system t − 1 timesteps ago. Based on the matrix G , we can cut out a phylogenetic tree T , by tracing the lineages from leaf nodes (see Fig. 4 ). As the prior probability of G can be computed by using equation (2.1 ), we can obtain a tree prior reflecting the spatial interaction pattern of the cells.
(a) Coalescent process based on matrix G . Circles represent network nodes representing the space ( n = 5 , τ = 5 in this example). The lineages starting from the five taxa A, B, C, D, and E are retraced. (b) Resulting phylogenetic tree T . Tree branches are labeled with their lengths in the unit of timestep. For convenience, our model allows a branch with zero length, in order that the output of the coalescent process is always a binary tree. Hence, two taxa C and E, where the former is the descendant of the latter, are regarded as sister taxa.
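The lineage tracing that turns G into a tree can be sketched as follows. This toy version assumes that all taxa are sampled at the present and uses 0-based indexes; the matrix is hypothetical and much smaller than the one used in the analysis.

```python
def trace_lineage(G, cell):
    """Cells occupied by a lineage at depths 0, 1, ..., tau (0 = present)."""
    path = [cell]
    for row in G:  # G[r][i] is the cell that cell i copied from r timesteps ago
        cell = row[cell]
        path.append(cell)
    return path

def coalescence_depth(G, cell_a, cell_b):
    """Timesteps back to the first shared cell, or None if none within tau."""
    paths = zip(trace_lineage(G, cell_a), trace_lineage(G, cell_b))
    for depth, (x, y) in enumerate(paths):
        if x == y:
            return depth
    return None

# Hypothetical history over tau = 3 timesteps on n = 4 cells.
G = [
    [0, 0, 2, 3],
    [1, 1, 3, 3],
    [1, 1, 1, 1],
]
print(trace_lineage(G, 0))
print(coalescence_depth(G, 0, 1), coalescence_depth(G, 2, 3), coalescence_depth(G, 0, 2))
```

Pairwise coalescence depths of this kind determine the branching order and branch lengths of the tree T cut out of G.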
After all cells inherit a pitch-accent system at each timestep, they may modify the inherited accent via a mutation event. First, we assume that the evolution of ( N i , b i ) is independent of ( N j , b j ) if i ≠ j . That is to say, mutations on the merger states and accent patterns occur independently among different word categories. We thus describe our model of mutation focusing on a single word category C i and consider the mutation of the two variables N i and b i , which also mutate independently of each other.
As for the mutation of the extended merger state N i , we assume that every pair of merged blocks is merged with the same probability q , resulting in a larger block containing all the accentual classes included in the two blocks. When two blocks merge, the strings representing the two blocks are concatenated in a random order, meaning that the leading class of one block overrides the classes of the other block. For example, if N i = { ′ 1 ′ , ′ 23 ′ , ′ 45 ′ } , it may mutate into either { ′ 123 ′ , ′ 45 ′ } , { ′ 231 ′ , ′ 45 ′ } , { ′ 145 ′ , ′ 23 ′ } , { ′ 451 ′ , ′ 23 ′ } , { ′ 1 ′ , ′ 2345 ′ } , or { ′ 1 ′ , ′ 4523 ′ } with probability q / 2 for each of them (see Fig. 5a for another example). Letting u i denote the number of merged blocks in N i , as the merger is possible for the $\binom{u_i}{2} = \frac{1}{2} u_i (u_i - 1)$ pairs, the total probability that N i mutates is given by $\frac{1}{2} u_i (u_i - 1) q$.
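The enumeration in the example above can be reproduced with a short sketch. The function names are hypothetical; blocks are represented as strings whose first character is the leading class, as in the text.

```python
from itertools import combinations

def merge_successors(state):
    """All extended merger states reachable by one merger (probability q/2 each)."""
    blocks = sorted(state)
    successors = []
    for x, y in combinations(blocks, 2):
        rest = [b for b in blocks if b not in (x, y)]
        successors.append(frozenset(rest + [x + y]))  # leading class of x overrides y
        successors.append(frozenset(rest + [y + x]))  # leading class of y overrides x
    return successors

def total_mutation_probability(state, q):
    """Number of block pairs, u(u-1)/2, times the per-pair probability q."""
    u = len(state)
    return u * (u - 1) / 2 * q

succ = merge_successors({"1", "23", "45"})
for s in succ:
    print(sorted(s))
print(total_mutation_probability({"1", "23", "45"}, q=0.01))
```

With three blocks there are three pairs and two concatenation orders per pair, giving the six successors listed in the text.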
(a) Mutation of the extended merger state for the case with three accentual classes. The rectangles show extended merger states and the arrows indicate merger of two accentual classes. (b) Replacement of the latent accent pattern for bimoraic words with five possible accent patterns as an example. The rectangles show accent patterns and the arrows indicate transitions among different accent patterns. (c) Relation between the latent and surface accent patterns in the case of three accentual classes. In this example, from the latent accent patterns shown in the rectangles on the left, the surface accent patterns shown in the rectangles on the right are generated. For panels (a) and (b), self-loops representing the absence of mutation are omitted.
On the other hand, we assume that the latent accent pattern b i = ( b i 1 ⋯ b i l ( i ) ) mutates in an elementwise manner. The latent accent pattern b i j may mutate into any other accent pattern empirically documented for the word category C i with equal probability denoted by α (see Fig. 5b). This is the discrete-time version of the Mk-model (Lewis 2001), which represents the evolution of a trait with k possible states.
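For a symmetric chain of this kind, the transition probabilities over several timesteps have a closed form. The sketch below assumes that α is the per-timestep probability of moving to each specific other pattern, so that a pattern is retained with probability 1 − (k − 1)α; this reading of α is an assumption.

```python
def mk_transition(k, alpha, steps):
    """(P_same, P_diff) after `steps` timesteps for a k-state symmetric chain.

    Assumes alpha is the one-step probability of each specific change.
    """
    decay = (1.0 - k * alpha) ** steps  # second eigenvalue of the one-step matrix
    p_same = 1.0 / k + (1.0 - 1.0 / k) * decay
    p_diff = (1.0 - decay) / k
    return p_same, p_diff

# Seven patterns are documented for monomoraic nouns; alpha is a made-up value.
print(mk_transition(7, 0.01, 1))
print(mk_transition(7, 0.01, 40))
```

After one step the function returns 1 − (k − 1)α and α, and for long branches both probabilities approach 1/k.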
Focusing again on one word category C i , we describe how the surface accent pattern D i is generated from the latent accent pattern b i and the merger state N i in each cell. For each merged block, the latent accent pattern of the leading class is observed as the surface accent pattern in every class of the block (see Figure 5c ). For example, consider the bimoraic noun C 2 with five accentual classes. If a cell has the extended merger state N 2 = { ′ 1 ′ , ′ 23 ′ , ′ 54 ′ } and the latent accent patterns b 2 = ( H H M , H L L , H H L , L H L , L L H ) , the surface accent pattern will be ( H H M , H L L , H L L , L L H , L L H ) .
Under these model assumptions, we can properly incorporate two features of the principle of accentual class merger. First, the merged classes mutate simultaneously. For example, considering a word category with three accentual classes, it takes a single mutation event, rather than two, for the transition of surface accent patterns from ( L H H , H L L , H L L ) to ( L H H , L H L , L H L ) to occur if the relevant merger state is 1/23. Second, as a corollary, we always observe the same accent pattern for merged accentual classes.
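The generation of surface accent patterns, and the fact that merged classes change together, can be checked with a short sketch. The function name is hypothetical; the first example reuses the values given in the text, and the latent pattern of class 3 in the second example is an arbitrary placeholder.

```python
def surface_patterns(extended_state, latent):
    """Copy the latent pattern of each leading class to every class in its block."""
    surface = [None] * len(latent)
    for block in extended_state:
        leader = int(block[0])
        for member in block:
            surface[int(member) - 1] = latent[leader - 1]
    return tuple(surface)

# Example from the text: bimoraic nouns with N2 = {'1', '23', '54'}.
N2 = {"1", "23", "54"}
b2 = ("HHM", "HLL", "HHL", "LHL", "LLH")
print(surface_patterns(N2, b2))

# One mutation of the leading class (class 2) changes classes 2 and 3 together.
N = {"1", "23"}
before = surface_patterns(N, ("LHH", "HLL", "HHH"))  # latent pattern of class 3 is hidden
after = surface_patterns(N, ("LHH", "LHL", "HHH"))
print(before, "->", after)
```

The latent pattern of a non-leading class never reaches the surface, which is why it can be treated as missing data at the leaves.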
The probabilistic dependency of the model parameters is depicted in Fig. 6 . The observed data with regard to the word category C i is given by Y i = D i ∩ M i . Here we define Y = { Y 1 , ⋯ , Y 6 } , D = { D 1 , ⋯ , D 6 } , B = { b 1 , ⋯ , b 6 } , N = { N 1 , ⋯ , N 6 } , and M = { M 1 , ⋯ , M 6 } . Note that the symbols B , N , M , D , Y in the following expressions and discussion represent the variables assigned to every tree taxon rather than those in a single pitch-accent system, but the indexes of taxa are omitted for notational simplicity. The joint posterior distribution is given by
Graphical representation of the probabilistic dependencies between parameters concerning the word category C i . In particular, functional dependency is shown by bold arrows, that is, G , N i and { N i , b i } functionally determine T , M i and D i , respectively. The filled and open circles represent observed and latent parameters, respectively. Here, symbols N i , b i , M i and D i represent variables assigned at taxa (tree leaves).
where B A = { b 1 A , ⋯ , b 6 A } denotes the latent accent pattern at the tree root. The latent accent patterns B are integrated out in the above equation. We assume that the prior probability of B A is given independently for each word category, which gives $P(B^A) = \prod_{i=1}^{6} P(b_i^A)$. Note also that the latent accent pattern always concurs with the surface accent pattern at the tree root since all the accentual classes are unmerged. The latent variables N and B assigned at tree taxa depend only on the tree T that is derived from G , so we have P ( N | G , q ) = P ( N | T , q ) and P ( D | G , α , B A , N ) = P ( D | T , α , B A , N ) . Hence,
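Collecting the priors and conditional probabilities named in this section, the factorization can be written as follows. This is a reconstruction from the surrounding definitions under the stated independence assumptions, not a verbatim copy of the displayed expression.

```latex
P(G, q, \alpha, B^{A}, N \mid Y) \;\propto\;
P(q)\, P(\alpha)\, P(G)\, P(B^{A})
\prod_{i=1}^{6} P(N_i \mid T, q)\, P(M_i \mid N_i)\, P(D_i \mid T, \alpha, b_i^{A}, N_i)
```

Here T is the tree determined by G , and each factor inside the product concerns a single word category C i .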
The MCMC algorithm allows us to draw a sample from the joint posterior distribution using expression (2.3). Focusing on the right-hand side, we use uniform distributions as the two priors P ( q ) and P ( α ) . The prior P ( G ) can be computed as the product of the values a i j using equations (2.1) and (2.2) (see Takahashi and Ihara 2023). As for the latent accent patterns at the tree root b i A = ( b i 1 A ⋯ b i l ( i ) A ) , we assume that the prior probability P ( b i A ) is uniform over all b i A in which every class has a different latent accent pattern. For example, each of the latent accent patterns of monomoraic nouns b 11 , b 12 , and b 13 can take seven empirically documented values, so b 1 A = ( b 11 A , b 12 A , b 13 A ) may take 7 × 6 × 5 = 210 possible values, each of which is given the prior probability 1 / 210 . The factor P ( N i | T , q ) is computed by Felsenstein’s pruning algorithm (Felsenstein 1973, 1981). The conditional probability P ( M i | N i ) is one if and only if the combination of classes constituting the merged blocks is the same in M i and N i ; otherwise, P ( M i | N i ) = 0 . Thus, in running the MCMC algorithm, we only have to explore the values of N i such that P ( M i | N i ) = 1 holds true.
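Two of the factors above are simple enough to spell out in code: the uniform root prior over assignments with all-distinct patterns, and the indicator P ( M i | N i ). The sketch uses the seven patterns listed for monomoraic nouns; the function name and the slash-separated string format for merger states are assumptions made for illustration.

```python
from itertools import permutations

# The seven accent patterns documented for monomoraic nouns (C1).
PATTERNS_C1 = ["HL", "LH", "HM", "HH", "LL", "LM", "MH"]

# Ordered triples of distinct patterns, one per accentual class.
root_states = list(permutations(PATTERNS_C1, 3))
root_prior = 1.0 / len(root_states)

def p_m_given_n(surface_state, extended_state):
    """1.0 if the extended state has the same blocks as the surface state, else 0.0."""
    surface_blocks = {frozenset(block) for block in surface_state.split("/")}
    extended_blocks = {frozenset(block) for block in extended_state}
    return 1.0 if surface_blocks == extended_blocks else 0.0

print(len(root_states), root_prior)
print(p_m_given_n("1/234/5", {"1", "324", "5"}))
print(p_m_given_n("1/234/5", {"12", "34", "5"}))
```

Because the indicator ignores which class leads each block, several extended merger states are compatible with one observed merger state, and the sampler explores only those.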
Finally, the conditional probability P ( D i | T , α , b i A , N i ) on the right-hand side of expression (2.3) can be computed by performing Felsenstein’s pruning algorithm (Felsenstein 1973, 1981) in a nonstandard way. As discussed above, when our MCMC algorithm explores the parameter space of N i , it is guaranteed that the observed surface accent patterns D i j are identical for all j ( 1 ≤ j ≤ l ( i ) ) whose classes belong to the same merged block in N i . Hence, focusing on one leaf node (i.e. taxon), the observed accent patterns D i are generated by the model if and only if b i j = D i j holds true for every j such that the j th class is a leading class in N i at the focal leaf. Focusing on the j th class of the word category C i , we define a function L i j ( p , v ) for a possible latent accent pattern p and a tree node v . L i j ( p , v ) represents the probability that b i j = D i j holds true at all the leaves that are v itself or descendants of v and at which the j th class is a leading class in N i , given that the node v has the latent accent pattern p . We have
To compute this, we recursively compute L i j ( p , v ) from leaves to the root. If v is a leaf, we initialize L i j ( p , v ) by
If v is an internal node or the root, whose child nodes are denoted by s 1 and s 2 , we can compute L i j ( p , v ) by
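The three displayed equations referred to in this passage do not appear in this text. A reconstruction consistent with the definition of L i j ( p , v ) is sketched below; the symbols v_root, t_s (length of the branch leading to child s, in timesteps) and P_t(p → p′) (the t-step transition probability of the latent accent pattern) are notation introduced here for illustration.

```latex
% Likelihood as a product over accentual classes
P(D_i \mid T, \alpha, b_i^{A}, N_i) = \prod_{j=1}^{l(i)} L_{ij}\!\left(b_{ij}^{A}, v_{\mathrm{root}}\right)

% Initialization at a leaf v
L_{ij}(p, v) =
\begin{cases}
0 & \text{if class } j \text{ is a leading class in } N_i \text{ at } v \text{ and } p \neq D_{ij}, \\
1 & \text{otherwise}
\end{cases}

% Recursion at an internal node v with children s_1 and s_2
L_{ij}(p, v) = \prod_{s \in \{s_1, s_2\}} \sum_{p'} P_{t_s}(p \to p')\, L_{ij}(p', s)
```

A leaf at which class j is not a leading class places no constraint on p , which is the sense in which the state is missing at that taxon.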
This is practically the pruning algorithm with missing state values at some of the taxa.
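A minimal sketch of such a recursion for one accentual class is given below. It assumes a symmetric k-state chain in which α is the per-timestep probability of each specific change, represents the tree as nested dictionaries, and marks leaves where the class is not a leading class with `None`; the tree, patterns, and parameter values are hypothetical.

```python
def mk_prob(k, alpha, steps, same):
    """Transition probability over `steps` timesteps in a symmetric k-state chain."""
    decay = (1.0 - k * alpha) ** steps
    return 1.0 / k + (1.0 - 1.0 / k) * decay if same else (1.0 - decay) / k

def partial_likelihood(node, patterns, alpha):
    """Return {p: L(p, node)} for one accentual class.

    A leaf is {"obs": pattern or None}; None means the class is not a leading
    class at that leaf, so its latent pattern is unconstrained (missing data).
    An internal node is {"children": [(child, branch_length), ...]}.
    """
    if "children" not in node:
        obs = node["obs"]
        return {p: 1.0 if obs is None or p == obs else 0.0 for p in patterns}
    k = len(patterns)
    result = {p: 1.0 for p in patterns}
    for child, length in node["children"]:
        child_l = partial_likelihood(child, patterns, alpha)
        for p in patterns:
            result[p] *= sum(mk_prob(k, alpha, length, p == c) * child_l[c]
                             for c in patterns)
    return result

# Hypothetical three-taxon tree; branch lengths are in timesteps.
patterns = ["HL", "LH", "HH"]
tree = {"children": [
    ({"obs": "HL"}, 4),
    ({"children": [({"obs": "HL"}, 2), ({"obs": None}, 2)]}, 2),
]}
L_root = partial_likelihood(tree, patterns, alpha=0.01)
print(L_root)
```

Evaluating the result at the root pattern and multiplying over classes would give the likelihood for one word category under these assumptions.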
MCMC is conducted with three independent chains. For each chain, we run MCMC for $4 \times 10^{6}$ iterations, and the first $10^{6}$ iterations are discarded as burn-in. We draw a sample every $10^{3}$ iterations, which gives a sample size of 9000 in total. After sampling from the joint posterior distribution (2.3), we readily obtain the posterior distributions of model parameters and the latent accent pattern at the root (i.e. P ( q | Y ) , P ( α | Y ) , and P ( B A | Y ) ). The posterior distribution of the phylogenetic tree P ( T | Y ) can be obtained by tracing the lineages based on the sampled values of G (Fig. 4). Moreover, as matrix G contains information as to where the lineages existed at each timestep in the past, we can obtain the posterior distribution of the geographical location of the tree root and the most recent common ancestor (MRCA) of a subset of the tree taxa. We thus infer the location of the root and the MRCA of the three types of pitch-accent systems: Tokyo type, Keihan type, and N-kei type.
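The total sample size follows directly from the run settings, as the arithmetic below shows. The variable names are illustrative.

```python
chains = 3
iterations = 4 * 10**6   # iterations per chain
burn_in = 10**6          # iterations discarded per chain
thinning = 10**3         # one sample is kept every 1000 iterations

samples_per_chain = (iterations - burn_in) // thinning
total_samples = chains * samples_per_chain
print(samples_per_chain, total_samples)
```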
To assess convergence, we conducted two additional runs of MCMC with the same model configuration with the exception of the initial value of G (i.e. different starting tree), and obtained confirmatory results (data not shown).
We establish the correspondence between the tree length, originally given in the unit of timestep, and the real unit of time (year) taking into consideration the time at which the ancestral pitch-accent systems existed. Considering the survey dates or publication dates of the data sources for the accent patterns of modern dialects, we set the observation year of the tree taxa (i.e. terminal nodes) to be 1950 AD. The dates of Ruiju Myogisho, Bumoki, and Gonza’s materials (old Kagoshima) are respectively assumed to be 850, 450, and 225 years before the present, which means 1100, 1500, and 1725 AD in the calendar year. The prior probability of the date of the common ancestor (i.e. root node) in years before present follows the uniform distribution U ( 850 , 1500 ) , ranging from 450 to 1100 AD. The upper limit (oldest limit) is set to 450 AD considering the divergence time of the Ryukyuan and mainland Japanese languages which is debatable but hypothesized to be around the third to seventh century in recent research ( Pellard 2016 ). In Ryukyuan dialects, accentual classes that are not seen in mainland dialects are present, suggesting that the common ancestor of the mainland pitch-accent systems is more recent than the divergence of Ryukyuan and mainland Japanese.
Also, to relate the timestep of our discrete-time model to a real unit of time, we assume that one timestep represents 25 years, roughly one human generation. We also ran MCMC with the assumption that one timestep corresponds to 10 years, but the resulting phylogeny stayed mostly unchanged (see subsection 2.9. and online supplementary material ).
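The date conventions above can be checked with a short sketch. It assumes only what the text states (present = 1950 AD, one timestep = 25 years); the helper names are hypothetical.

```python
OBSERVATION_YEAR = 1950   # year assigned to the modern taxa (from the text)
YEARS_PER_TIMESTEP = 25   # one timestep in the main analysis (from the text)


def bp_to_calendar_year(years_bp: float) -> float:
    """Convert years before present (present = 1950 AD) to a calendar year AD."""
    return OBSERVATION_YEAR - years_bp


def bp_to_timesteps(years_bp: float) -> float:
    """Convert years before present to model timesteps."""
    return years_bp / YEARS_PER_TIMESTEP


historical_taxa = {"Ruiju Myogisho": 850, "Bumoki": 450, "Gonza's materials": 225}
for name, years_bp in historical_taxa.items():
    print(name, bp_to_calendar_year(years_bp), bp_to_timesteps(years_bp))

# Bounds of the uniform prior U(850, 1500) on the root age, in calendar years.
print(bp_to_calendar_year(1500), bp_to_calendar_year(850))
```

Under the 25-year timestep, the three historical taxa sit 34, 18, and 9 timesteps before the modern taxa.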
We have data on six different word categories $C_1, \dots, C_6$, which differ in the number of morae, part of speech, and conjugation. While we assume that the evolution of accent patterns occurs independently among the word categories, this may not be the case in reality. For example, our dataset shows that the accent patterns for the first accentual classes of bimoraic nouns and verbs are the same for most of the modern dialects, suggesting the possibility that a single mutation may have affected accent patterns in more than one word category. Since our model would require double mutations for this sort of correlated change, it may bias the inferred phylogenetic tree. Hence, we perform an additional Bayesian inference using a subset of the data, specifically the data on nouns (i.e. $C_1$ and $C_2$).
We also ran MCMC while varying the value of σ, the prior on the time to the most recent common ancestor (MRCA), the number of years to which one timestep of the model corresponds, and the influence of population size on the transmission of pitch-accent systems.
The sensitivity analysis showed that the resulting phylogenetic trees were mostly similar, particularly in terms of the tree topology, across different assumptions. We thus only present the main result in this section, and the results of the sensitivity analysis (subsection 2.9.) are included in the electronic supplementary material .
Fig. 7 shows the maximum clade credibility (MCC) tree derived from the sample of posterior trees. The MCC trees in this article were generated with the TreeAnnotator application distributed with BEAST v2.7.6 (Bouckaert et al., 2014) and visualized with FigTree v1.4.4. Note again that the date of the tree taxa is 1950. The figure indicates that the phylogenetic tree consists of three clades, which concurs with the conventional classification of pitch-accent systems: Tokyo type, Keihan type, and N-kei type. It is thus suggested that the Tokyo-type accents, distributed on both the west and east sides of the archipelago, share a common ancestor. As for the phylogenetic relationship of the three clades, the Keihan type and N-kei type are most plausibly sister groups, although this clade is far from decisive in view of its posterior probability. The 95% credibility interval of the date of the common ancestor of all the modern dialects ranges from AD 450 to 825, corresponding to the mid-Kofun to the early Heian period. Similarly, the 95% credibility intervals of the time to the most recent common ancestor of the Tokyo type, Keihan type (including Ruiju Myogisho), and N-kei type are AD 1000–1550, 825–1100, and 1050–1700, respectively. However, it must be noted that the posterior distribution of the date of the common ancestor is heavily dependent on the prior distribution (see electronic supplementary material).
Maximum clade credibility (MCC) tree generated from the posterior sample of phylogenetic trees. The horizontal axis represents the time before present in years, but note that the modern pitch-accent systems are assumed to be dated to 1950. The branches are labeled with posterior probabilities representing the proportion of posterior trees supporting each clade. The bars covering the root and internal nodes represent the 95% credibility interval of divergence time. Taxa and clades are colored according to the conventional classification of pitch-accent systems (see Table 2): Keihan type (Bumoki, Tarui, Ibukijima, Kyoto, Kan’onji and Myogisho), N-kei type (Kagoshima, Gonza’s materials, Nagasaki and Miyakonojo), and Tokyo type (Hirosaki, Oita, Tokyo, Hiroshima and Totsukawa).
We first focus on the phylogenetic relationship among the Tokyo-type pitch-accent systems, particularly on Churin (Tokyo and Hiroshima) and Gairin (Hirosaki and Oita) subtypes, which respectively display the merger states 1 / 23 / 45 and 12 / 3 / 45 with respect to the bimoraic nouns. As the clade of Tokyo and Oita is strongly supported in the MCC tree, our result suggests that both Gairin and Churin subtypes are either paraphyletic or polyphyletic groups which do not share an immediate common ancestor. It is also suggested that there existed a pitch-accent system with the merger state of 1 / 2 / 3 / 45 .
Here we focus on the phylogenetic relationship among the Keihan-type accents. The MCC tree ( Fig. 7 ) suggests that the pitch-accent system of Ibukijima, which has the merger state 1 / 2 / 3 / 4 / 5 , is the sister taxon of the Kyoto accent, indicating that the common ancestor of the Ibukijima and the modern Kyoto accents had the merger state 1 / 2 / 3 / 4 / 5 . Therefore, although the pitch-accent system of Muromachi-period Kyoto, recorded in Bumoki, and the modern Kyoto dialect have the merger state 1 / 23 / 4 / 5 in common, the MCC tree indicates that the merger of classes 2 and 3 happened independently on the lineages of these two taxa.
Considering the phylogenetic relationship among Tokyo, Keihan, and N-kei-types, the Keihan + N-kei clade appeared in 66.6% of the sampled trees ( Fig. 7 ). In contrast, the Tokyo + N-kei clade and Tokyo + Keihan clade were supported by 21.5% and 8.3% of the sampled trees respectively, although these clades do not appear in the MCC tree.
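The clade percentages quoted above are proportions of sampled trees that contain a given clade. A minimal sketch of that calculation follows; it assumes each sampled tree has already been reduced to the set of clades it contains (each clade a set of taxon labels). The four toy trees are invented for illustration and are not the study's posterior sample.

```python
from typing import FrozenSet, Iterable, List, Set

Clade = FrozenSet[str]


def clade_support(trees: Iterable[Set[Clade]], clade: Iterable[str]) -> float:
    """Return the fraction of sampled trees that contain `clade`."""
    target = frozenset(clade)
    tree_list: List[Set[Clade]] = list(trees)
    hits = sum(1 for tree in tree_list if target in tree)
    return hits / len(tree_list)


# Hypothetical toy sample: each tree is summarized by its single
# two-type clade above the three accent types.
toy_trees = [
    {frozenset({"Keihan", "N-kei"})},
    {frozenset({"Keihan", "N-kei"})},
    {frozenset({"Tokyo", "N-kei"})},
    {frozenset({"Tokyo", "Keihan"})},
]

print(clade_support(toy_trees, {"Keihan", "N-kei"}))
print(clade_support(toy_trees, {"Tokyo", "N-kei"}))
```

Applied to the 9000 posterior trees, the same count would yield the support values printed on the branches of the MCC tree.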
Fig. 8 plots the location of the most recent common ancestor (MRCA) of all fifteen taxa (Fig. 8a), the Keihan type (including Ruiju Myogisho) (Fig. 8b), the Tokyo type (Fig. 8c), and the N-kei type (Fig. 8d) in each of the 9000 trees sampled by MCMC. Hence, the values displayed in this figure are proportional to the posterior probability that the MRCA is located in each cell. Not surprisingly, the MRCA tends to appear in cells with a large population size, because equation (2.1) indicates that the pitch-accent system is more likely to be inherited from such cells. The figure suggests that the common ancestor of the pitch-accent systems of all the taxa was most likely located around contemporary Kyoto, Osaka, Nara, or Kobe, although other parts of the Kinki region and the eastern part of the Chugoku and Shikoku regions are also possible locations of the linguistic homeland (Fig. 8a). The MRCA of the Keihan-type accent is also likely to have been located in these regions (Fig. 8b). The highest posterior probability is scored by Shijo, Kyoto, probably because this cell has one of the largest population sizes in the Kinki area and because the accent pattern of Ruiju Myogisho, which is closest to the root, was assigned to this cell. Indeed, this area was a political and cultural center from the Heian period. As for the Tokyo type, the locations of the MRCA sampled by MCMC are spread relatively widely, but we observe the highest peak around the Kinki region and the second highest peak around Tokyo (Fig. 8c). On the other hand, the Tohoku and Kyushu regions are unlikely to be the homeland of the Tokyo-type accent, although the Tokyo type is currently observed in parts of these regions. Finally, the MRCA of the N-kei type was most likely located in the northern or central part of Kyushu (Fig. 8d).
Geographical distribution of the locations of the MRCA for every sampled phylogenetic tree. The figure shows the number of sampled trees whose MRCA occupies each cell (logarithmic scale). (a) MRCA of all the taxon dialects. (b) MRCA of the dialects with Keihan-type accent. (c) MRCA of the dialects with Tokyo-type accent. (d) MRCA of the dialects with N-kei type accent.
Table 3 shows the posterior probability of the accent patterns at the tree root (i.e. the common ancestor of all the modern dialects), computed from the sample from the distribution $P(B_A \mid Y)$. Since the surface accent pattern always concurs with the latent accent pattern at the root, $P(B_A \mid Y)$ is regarded as the posterior probability of the surface accent pattern. Not surprisingly, the table shows that, for all accentual classes, the accent pattern of Ruiju Myogisho scores the highest posterior probability as the accent pattern at the root.
Posterior probability of accent patterns for every accentual class at the tree root. Three accent patterns which scored the three highest posterior probabilities are shown for each class. Numbers represent posterior probabilities. For comparison, the accent patterns of Ruiju Myogisho ( Kindaichi 1975a ) are shown.
Word category | Class | Rank 1 | Rank 2 | Rank 3 | Myogisho |
---|---|---|---|---|---|
Monomoraic noun | 1 | HH (0.69) | MH (0.13) | LM (0.07) | HH |
 | 2 | HL (0.83) | LL (0.05) | LM (0.05) | HL |
 | 3 | LH (0.86) | HL (0.06) | LL (0.02) | LH |
Bimoraic noun | 1 | HHH (0.61) | MHH (0.08) | LLL (0.07) | HHH |
 | 2 | HLH (0.54) | LLM (0.10) | HLL (0.10) | HLH |
 | 3 | LLH (0.72) | LHL (0.09) | HHM (0.04) | LLH |
 | 4 | LHH (0.60) | HLL (0.14) | HHL (0.04) | LHH |
 | 5 | LHL (0.55) | HLL (0.17) | HHL (0.05) | LHL |
Bimoraic verb | 1 | HH (0.62) | HL (0.15) | LH (0.09) | HH |
 | 2 | LH (0.73) | HL (0.21) | MH (0.02) | LH |
Trimoraic godan verb | 1 | HHH (0.59) | LHL (0.08) | MHH (0.08) | HHH |
 | 2 | LLH (0.74) | LHL (0.12) | HLL (0.03) | LLH |
 | 3 | LHH (0.61) | LHL (0.18) | MHH (0.04) | LHH |
Trimoraic ichidan verb | 1 | HHH (0.53) | LHH (0.12) | LHL (0.09) | HHH |
 | 2 | LLH (0.70) | LHL (0.20) | HLL (0.02) | LLH |
Trimoraic adjective | 1 | HHL (0.64) | LHH (0.09) | LHL (0.07) | HHL |
 | 2 | LLH (0.71) | LHL (0.17) | HLL (0.03) | LLH |
We also obtained the posterior distributions of the two model parameters q and α, which represent the rates of accentual class merger and of substitution between accent patterns, respectively (Fig. 9). For q, the 95% credibility interval ranges from $2.25 \times 10^{-4}$ to $4.86 \times 10^{-4}$ per annum. Its posterior median is $3.37 \times 10^{-4}$, at which the merger of a given pair of unmerged blocks takes place once every 2967 years on average. Since a pitch-accent system with the merger state 1 / 2 / 3 / 4 / 5 experiences a merger event with probability $\binom{5}{2} q = 10q$, the expected time to the first merger event under the posterior median is 296.7 years. At this pace, the merger state 1 / 2 / 3 / 4 / 5 stays unchanged for over 1000 years with probability 0.034. Intuitively, this small value is in line with the fact that a modern pitch-accent system with 1 / 2 / 3 / 4 / 5 is found only in the dialect of a small island, Ibukijima. On the other hand, the 95% credibility interval of α ranges from $6.97 \times 10^{-5}$ to $1.39 \times 10^{-4}$ per annum, with a posterior median of $9.89 \times 10^{-5}$. For example, as monomoraic nouns have seven possible accent patterns, the probability with which the latent accent pattern of an accentual class changes is given by $6\alpha$. Based on the posterior median of α, the latent accent pattern of a given accentual class of monomoraic nouns changes once every 1685 years on average.
Posterior distributions of mutation rates per annum. (a) Posterior distribution of q , the rate at which accentual classes merge. (b) Posterior distribution of α , the rate at which the latent accent pattern is replaced with another specific accent pattern.
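The waiting times and the survival probability quoted for the posterior medians of q and α can be reproduced with a few lines of arithmetic. The sketch below treats the per-annum rates as independent yearly event probabilities, which is a simplifying assumption made here for illustration; the variable names are hypothetical.

```python
import math

q_median = 3.37e-4      # posterior median of q per annum (from the text)
alpha_median = 9.89e-5  # posterior median of alpha per annum (from the text)

# Mean waiting time for one given pair of unmerged blocks to merge.
pair_wait = 1 / q_median

# With five unmerged classes there are C(5, 2) = 10 candidate pairs.
n_pairs = math.comb(5, 2)
first_merger_wait = 1 / (n_pairs * q_median)

# Probability that no merger occurs in 1000 consecutive years.
p_unchanged = (1 - n_pairs * q_median) ** 1000

# Monomoraic nouns: 7 possible patterns, so 6 alternatives to change into.
pattern_change_wait = 1 / (6 * alpha_median)

print(round(pair_wait))             # 2967
print(round(first_merger_wait, 1))  # 296.7
print(round(p_unchanged, 3))        # 0.034
print(round(pattern_change_wait))   # 1685
```

A continuous-time approximation, exp(-10 q × 1000), gives the same survival probability to three decimal places.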
In this paper, we developed a mutation model representing the transition of pitch-accent systems driven by accentual class merger and integrated the model into the Bayesian framework for phylogenetic reconstruction with spatial diffusion (Takahashi and Ihara 2023). On the basis of documented data on accent patterns in multiple modern dialects, we inferred the phylogenetic tree of their pitch-accent systems. The resulting phylogenetic tree (Fig. 7) supports the clades of Tokyo type, Keihan type, and N-kei type (in Kyushu), which is also true for all the conditions of the sensitivity analysis (see electronic supplementary material). Although we included the diffusion of pitch-accent systems in the model to generate the tree prior, the resulting tree still supports the monophyly of the Tokyo-type accents, which are distributed far apart in the west and east.
The combination of the resulting phylogenetic tree ( Fig. 7 ) and the posterior distributions of the geographical location of the MRCAs ( Fig. 8 ) suggests the following scenario of the evolutionary history of pitch-accent systems. First, the common ancestor of the modern pitch-accent systems dates back to the mid-Kofun to early Heian period and was located in the contemporary Kinki region or its perimeter. The pitch-accent system then split into three branches, which are respectively ancestral to the modern Keihan type, Tokyo type, and N-kei type. The Keihan-type branch stayed around the Kinki region until the Heian period and subsequently split into branches that were inherited by historical documents and modern dialects. On the other hand, the lineage of the N-kei-type of the Kyushu region moved from Kinki to northern or central Kyushu, subsequently splitting into lineages to individual modern dialects from AD 1050 to 1700. Also, the Tokyo-type branch most plausibly stayed in Kinki and started splitting around mid-Heian to late Muromachi period, diffusing both eastward and westward from the center of the archipelago. Since Tokyo and Oita have relatively recently diverged in our result, the diffusion of lineages to Tokyo and Oita is expected to have occurred recently. The second most plausible scenario is that, after splitting from the common ancestor of the mainland pitch-accent system, the Tokyo-type branch moved from Kinki to Tokyo, after which it split and diffused to the north (Hirosaki) and to the west (Hiroshima, Oita). Again, the diffusion from Tokyo to Oita is suggested to have happened recently, although the cause of this jumpy transmission is unknown. The latter scenario indicates that the Tokyo-type accent in the west (Hiroshima, Oita) has diffused like a round trip (eastward and then westward) since the common ancestor of the mainland dialects. 
However, the sensitivity analyses showed the divergence time heavily depended on the prior of the root age (see electronic supplementary material ), so the discussion above depends on the assumption that the MRCA of the mainland pitch-accent does not date back before 450 AD.
Our results are in sharp contradiction to the conventional hypotheses proposed by linguists. First, although the Gairin type, a subtype of the Tokyo-type accent seen in Oita and Hirosaki, was posited by Kindaichi to be phylogenetically far from the Churin-type accents (Fig. 2a), Fig. 7 suggests that the Tokyo-type accents form a monophyletic group. Thus, our results indicate that the common ancestor of the Tokyo-type accents had the merger state 1 / 2 / 3 / 45 with respect to the bimoraic nouns, which has not been observed in any modern dialect to the best of our knowledge. For this reason, our result does not support the prevailing hypothesis that the Churin-type accents (Tokyo and Hiroshima) derived from the pitch-accent system recorded in Bumoki (Fig. 2a) (Kindaichi 1942, 1975a). One hypothesis that is partially congruent with our result was proposed by Tokugawa (1962), who posits that the Gairin and Churin types have a common ancestor with the merger state 1 / 2 / 3 / 45, although he argues, unlike us, against the monophyly of the Tokyo-type accents (Fig. 2b).
Besides the phylogeny of the pitch-accent systems, we compare our result with proposed evolutionary history of Japonic languages/dialects inferred from lexical features. First, Igarashi (2021) ’s analysis suggested a phylogenetic tree of Japonic languages in which two geographically continuous groups of dialects, the ‘Macro-Eastern Japanese branch’ and the ‘Southern Japanese branch’, respectively form clusters. While the former branch extends in the east part of Japanese mainland, including our taxon dialects Hirosaki, Tokyo, and Tarui, the latter consists of Ryukyuan and dialects in Kyushu, including Nagasaki, Oita, Miyakonojo and Kagoshima. However, our results do not support the monophyly of either of the groups, because the Tokyo-type accent, whose geographical distribution overlaps with both of the branches proposed by Igarashi, forms a cluster. Our results also contradict the phylogenetic tree of Lee and Hasegawa (2011) , inferred through Bayesian phylogenetic analysis. In their result, mainland dialects tend to form geographically continuous clades (east-west division), and the split between Tokyo type and Kyoto type (center-periphery division) was not observed. In view of the discrepancies between our results and previous studies, the pitch-accent system and lexicons in mainland dialects may have different histories.
The phylogenetic relationship within the clade of the Keihan-type accent is also quite different from the conventional hypotheses, in that the pitch-accent system of the Ibukijima dialect is suggested to share an immediate ancestor with that of the Kyoto dialect. Since Ibukijima is known for its complex pitch-accent system, in which all the accentual classes of bimoraic nouns are unmerged (i.e. 1 / 2 / 3 / 4 / 5), our result indicates that the ancestor of the Kyoto dialect had the merger state 1 / 2 / 3 / 4 / 5 until recently. This result may seem inconsistent with the fact that Kyoto had the pitch-accent system 1 / 23 / 4 / 5 in the late Muromachi period (Kindaichi 1975a). However, as dialects can diffuse in space, it is not impossible that the ancestor of the pitch-accent system of the modern Kyoto dialect was located elsewhere in the Muromachi period. Our results suggest that the pitch-accent system of modern Kyoto is derived not from that of Bumoki but from an undocumented pitch-accent system with the merger state 1 / 2 / 3 / 4 / 5 which survived beyond the Muromachi period. For the Kyoto region, our result can be interpreted as indicating that the two lineages leading to the modern Kyoto dialect and to Bumoki independently underwent the merger of classes 2 and 3.
As for the methodology used in this study, we developed a mutation model which represents accentual class merger and which can be integrated into the framework of phylogenetic analysis with practical algorithmic efficiency. The novelty of this method lies in its explicit representation of merger, unlike the models with binary features that are often employed in Bayesian phylogenetics. The method is somewhat difficult to interpret in that we assume a latent accent pattern whose mutation follows a Markov process. However, this model setting enables efficient computation of the tree likelihood through Felsenstein’s tree-pruning algorithm (Felsenstein 1973, 1981) by reducing the number of possible states of each variable. On the technical side, although the pruning algorithm requires computing powers of the mutation-rate matrix, our model setting reduces the computational complexity by assuming that the variables $N_i$ and $b_{ij}$ follow independent Markov processes. Since there are only 196 possible extended merger states for the five accentual classes of the bimoraic nouns, the number of possible values for $N_i$ is relatively small, which significantly reduces the computation time for matrix multiplication. If we used a model where the surface accent patterns $D_i$ followed a Markov process, there would be thousands or tens of thousands of possible states, rendering the algorithm impractically slow.
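The figure of 196 extended merger states can be reproduced by brute-force enumeration under one reading of the term. The sketch below assumes that an extended merger state is a partition of the accentual classes into merged blocks together with one designated class per block (for instance, the class whose accent pattern the block inherits). This reading is an assumption made here, not a definition taken from this section; it is adopted because it yields exactly 196 for five classes. The function names are hypothetical.

```python
from math import prod
from typing import Iterator, List


def set_partitions(items: List[int]) -> Iterator[List[List[int]]]:
    """Yield every partition of `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a new block of its own.
        yield [[first]] + partition


def count_extended_states(n_classes: int) -> int:
    """Count partitions weighted by the choices of one designated class per block."""
    classes = list(range(1, n_classes + 1))
    return sum(prod(len(block) for block in partition)
               for partition in set_partitions(classes))


for n in (5, 6, 7):
    print(n, count_extended_states(n))
```

Under this assumption, six and seven classes give 1057 and 6322 states respectively, which illustrates how quickly the state space grows with the number of accentual classes.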
Our model may be applied to other features of languages beyond Japanese pitch accents or cultural traits in general, to infer the phylogenetic tree from merger phenomena. The merger is not limited to pitch accent but commonly occurs in the sound system of any language, replacing a sound with another existent sound. Merger in phonology is also a common event where two phonemes lose distinction. The manual comparative method has traditionally often relied on the merger phenomenon in sound system, but few statistical models were built to treat this class of dataset for phylogenetic analysis. For instance, starting from the vowel system of Proto-Japonic as the common ancestor, the pattern of vowel merger differs between the lineage leading to Old Japanese and Proto-Ryukyuan, which has been used to argue the phylogeny and classification of Japonic languages ( Igarashi 2021 ). Although the evolution of sound systems is driven not only by mergers but also by splits in reality, our model can still describe the nature of language evolution, which tends to follow simplifying rather than complicating processes. If relevant data are available, our model may pave the way to statistically infer linguistic phylogeny from a dataset describing the sound system of languages.
We consider the limitations of this study with respect to the dataset. We collected the accent patterns from multiple research sources because no publicly available database exhaustively records the accent patterns of Japanese dialects. Conflicts in the recorded accent patterns are thus inevitable, because not every author follows the same criteria in judging the accent patterns recorded in their fieldwork. It is possible that our result was biased by the data sources which we selected. We hope that a database of Japanese pitch-accent systems, which records their features based on consistent criteria, will be available in the future. Moreover, we omitted trimoraic nouns from our analysis, both for computational tractability and because of words with exceptional accent patterns. As the number of possible extended merger states soars with the number of accentual classes, heavily slowing down the computation of the likelihood, it was difficult to include trimoraic nouns, which have six (or seven) classes. In addition, a non-negligible number of trimoraic nouns within an accentual class have different accent patterns, which made it difficult to assign a representative accent pattern to each class. Given that a previous study reconstructed the phylogenetic tree based on trimoraic nouns (Hirako 2017), our analysis may have missed the phylogenetic signal that trimoraic nouns offer. Nevertheless, the resulting phylogenetic tree is not likely to change drastically even if we include trimoraic nouns, because the accent patterns of trimoraic nouns are not independent of those of bimoraic nouns in many dialects. If two dialects have similar accent patterns for the classes of bimoraic nouns, they tend to also have similar accent patterns for trimoraic nouns.
In a different vein, we did not include the dialects of Ryukyuan, because their accent patterns do not completely correspond to the accentual classes of Japanese ( Matsumori 1998 ), as the common ancestor of Japanese and Ryukyuan dates back earlier than the Heian period. It is known that Proto-Ryukyuan had at least three distinct accent patterns, and that the classes 4 and 5 of bimoraic nouns are split into subclasses 4a and 4b, and 5a and 5b, respectively, while 4a and 5a, and 4b and 5b are merged. Accentless regions epitomized by Fukushima and Miyazaki were also excluded from our study because it is not certain whether these pitch-accent systems, which do not have distinct accent patterns given to each word, were formed through accentual class merger. This limitation is inevitable since our method is based on the assumption that all tree taxa are descendants of a pitch-accent system with distinct accentual classes seen in Ruiju Myogisho. Thus, elucidating the phylogenetic relationship of Ryukyuan dialects and accentless regions would require a different model. Nevertheless, the phylogenetic analysis including Ryukyuan languages could potentially be done by setting the merger state of the tree root to 1 / 2 / 3 / 4 a / 4 b / 5 a / 5 b and is a promising extension of our study. However, challenges concerning data curation and increasing computation time are expected.
Other limitations include the assumptions regarding the mutation of accent patterns. First, we employed the Mk model ( Lewis 2001 ), where every accent pattern may mutate into every other accent pattern with the same probability, which may be an oversimplified assumption. This assumption may have affected the inference of the accent patterns at the tree root and may also have overrated the divergence time of dialects with accent patterns which can easily mutate into each other. A possible solution to this problem would be to assign multiple different mutation rates to different pairs of accent patterns, but our relatively small dataset (i.e. accent patterns of as few as seventeen classes) seems insufficient for the inference of many different model parameters for mutation rates. One possible direction for future research is to pre-classify accent patterns into a few groups, so that the mutation within a group is more likely than mutation between groups. In this way, we may reflect the variation in the mutation rates by introducing two model parameters representing replacement rate within and between groups of accent patterns. However, judging which accent patterns are likely to be mutually replaced would require experimental work or expertise in phonology.
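The equal-rates assumption discussed above has a convenient consequence: powers of the one-step transition matrix have a closed form, so multi-step transition probabilities need no matrix multiplication. The sketch below is a generic discrete-time Mk illustration, not the authors' implementation. It assumes a one-step matrix with probability `alpha` of moving to each specific other pattern, and the values of `K`, `ALPHA`, and `T` are hypothetical.

```python
from typing import List, Tuple


def mk_step_matrix(k: int, alpha: float) -> List[List[float]]:
    """One-step matrix: `alpha` off the diagonal, 1 - (k - 1) * alpha on it."""
    return [[1 - (k - 1) * alpha if i == j else alpha for j in range(k)]
            for i in range(k)]


def mat_mul(a: List[List[float]], b: List[List[float]]) -> List[List[float]]:
    """Multiply two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def mk_power_bruteforce(k: int, alpha: float, t: int) -> List[List[float]]:
    """t-step transition matrix obtained by repeated multiplication."""
    result = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    step = mk_step_matrix(k, alpha)
    for _ in range(t):
        result = mat_mul(result, step)
    return result


def mk_power_closed_form(k: int, alpha: float, t: int) -> Tuple[float, float]:
    """Return (P[same pattern], P[one specific other pattern]) after t steps."""
    decay = (1 - k * alpha) ** t
    p_same = 1 / k + (k - 1) / k * decay
    p_diff = (1 - decay) / k
    return p_same, p_diff


K, ALPHA, T = 7, 0.0025, 34  # hypothetical: 7 patterns, 34 timesteps
brute = mk_power_bruteforce(K, ALPHA, T)
p_same, p_diff = mk_power_closed_form(K, ALPHA, T)
print(abs(brute[0][0] - p_same) < 1e-12, abs(brute[0][1] - p_diff) < 1e-12)
```

Introducing separate within-group and between-group rates, as suggested above, would break this simple closed form, although the matrix would still have a block structure that could be exploited.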
The lack of phonemic analysis is also a limitation of our merger model. In general, every dialect has a small number of features that distinguish accent patterns, such as the position of the pitch drop in the Tokyo dialect, or two tonal registers seen in 2-kei-type accents in Kyushu. The evolution of such distinguishing features is often related to the non-tonal contrast in other parts of the sound system, which is not modeled in this study. Loss of such distinguishing features results in a large-scale merger event that concerns the pitch accent of multiple word categories, but our model assumes that the accent for each word category (i.e. part of speech, number of morae, conjugation) evolves independently. For example, the Kagoshima dialect has two tonal registers, and either the last or second-to-last syllable is pronounced with a high pitch regardless of the number of morae and part of speech. On the other hand, in the Miyakonojo dialect, every word is assigned with the same tonal register: the last mora is pronounced with a high pitch. It has been posited that the 1-kei type pitch-accent system of Miyakonojo was formed due to the loss of one of the tonal registers seen in the Kagoshima dialect. Thus, the difference between the two pitch-accent systems can simply be explained by a single mutation event (loss of a tonal register), but our model regards this as multiple mutation events that happened independently for each word category. Nevertheless, we performed a sensitivity analysis with a reduced dataset, in the attempt to avoid this bias, so this limitation is somewhat mitigated.
In our model, we included geographical information in computing the prior probability of the phylogenetic tree by modeling the rate of dialect transmission as a function of geographical distance and population density. Although we used the great-circle distance (the shortest distance along the surface of a spherical Earth) as the measure of geographical distance, previous research has shown that the travel distance or travel time between locations better explains the variation in linguistic distance (Jeszenszky et al., 2019; Szmrecsanyi 2012). The presence of sea routes may also have affected the diffusion of dialects. Another limitation concerning geographical information is the calibration of population sizes in the Heian period. Although we used demographic data from 1995 for calibration, it must be noted that the Japanese population distribution changed after the development of the alluvial plains. We did not include these factors in order to keep the model simple, but future studies may consider such geographical factors.
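For reference, the great-circle distance mentioned above can be computed with the haversine formula. The sketch below is a generic implementation, not the authors' code; the Earth radius of 6371 km and the approximate coordinates for Kyoto and Tokyo are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate, illustrative coordinates (latitude, longitude in degrees).
kyoto = (35.01, 135.77)
tokyo = (35.68, 139.69)
print(round(great_circle_km(*kyoto, *tokyo)), "km")
```

Replacing this function with a travel-time or travel-distance lookup would be one way to explore the alternative distance measures discussed above.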
Unlike lexical features which are subject to borrowing between dialects, the pitch accent gives a signal for the evolutionary process described in the tree structure. Moreover, the accentual class merger gives evidence that the pitch-accent systems have split from a shared ancestor, which is quite compatible with tree-thinking. Analyzing the phylogeny of pitch accent is a promising way to shed light on the evolutionary history of the modern dialects.
We thank Peter Ranacher and Nico Neureiter for valuable discussions and feedback. We also appreciate the valuable comments from four anonymous reviewers (including one secondary reviewer). This research was funded by JSPS KAKENHI, Grant nos. 17H06381, 18J00484 and 24K09627, Meiji Institute for Advanced Study of Mathematical Sciences (MIMS) Joint Research Project, and the Swiss NSF Sinergia Project No. CRSII5_183578.
Conflict of interest statement . We declare no conflicts of interest.
The supplementary document is available on the journal’s website. Data on the accent patterns of the tree taxa are shown in the Appendix. Other data and code associated with this paper are available at https://zenodo.org/records/11154180 .
Bouckaert, R., et al. (2014). ‘BEAST 2: A Software Platform for Bayesian Evolutionary Analysis’, PLoS Computational Biology, 10(4): e1003537. https://doi.org/10.1371/journal.pcbi.1003537
Bouckaert, R., et al. (2012). ‘Mapping the Origins and Expansion of the Indo-European Language Family’, Science, 337(6097): 957–960. https://doi.org/10.1126/science.1219669
Burridge, J. (2017). ‘Spatial Evolution of Human Dialects’, Physical Review X, 7(3): 031008.
Currie, T. E., Meade, A., Guillon, M., and Mace, R. (2013). ‘Cultural Phylogeography of the Bantu languages of sub-Saharan Africa’, Proceedings Biological Sciences, 280(1762): 20130695. https://doi.org/10.1098/rspb.2013.0695
Database of Global Administrative Areas. 2015. GADM Database (www.gadm.org), version 2.8. https://gadm.org/data.html (Downloaded on 7/3/2017).
e-Stat. Portal Site of Official Statistics of Japan. 2016. https://www.e-stat.go.jp/gis/statmap-search?page=1&type=1&toukeiCode=00200521&toukeiYear=1995&aggregateUnit=S&serveyId=S002005111995&statsId=T000751
Felsenstein, J. (1973). ‘Maximum Likelihood and Minimum-Steps Methods for Estimating Evolutionary Trees from Data on Discrete Characters’, Systematic Biology, 22(3): 240–249. https://doi.org/10.1093/sysbio/22.3.240
Felsenstein, J. (1981). ‘Evolutionary Trees From DNA Sequences: A Maximum Likelihood Approach’, Journal of Molecular Evolution, 17(6): 368–376. https://doi.org/10.1007/BF01734359
Geospatial Information Authority of Japan. 2006. Global Map Japan Version 1.1 Raster Data. https://www.gsi.go.jp/kankyochiri/gm_japan_e.html (Downloaded on 4/11/2014).
Gray, R., Drummond, A. J., and Greenhill, S. J. (2009). ‘Language Phylogenies Reveal Expansion Pulses and Pauses in Pacific Settlement’, Science, 323: 479–483.
Hattori S. 1959 (reprint 1999). Nihongo no Keitō, pp. 130. Japan: Iwanami bunko.
Hattori S. 1985 (reprint 2018 ). ‘ Nihongo Shohōgen no Akusento no Kenkyū to Hikaku Hōhō’, in Z. Uwano (ed) Nihon sogo no saiken , pp. 597 – 610 . Japan : Iwanami Shoin .
Hirako , T. ( 2017 ). ‘ On the Historical Position of the Gairin Type Accent’, Journal of Asian and African Studies , 94 : 259 – 276 .
Hirayama T. ( 1951 ) Kyūshū Hōgen On-chō no Kenkyū: Kyōtsū-go Keihan-go tono Hikaku Kōsatsu . Japan : Gakkaino shishin sha .
Hirayama T. ( 1957 ) Nihongo On-chō no Kenkyū . Japan : Meiji shoin .
Hirayama T. ed. ( 1960 ) Zenkoku akusento jiten . Japan : Tokyo-dō Shuppan .
Hirayama T. ( 1969 ) Satsunan Shotō no Sōgō teki Kenkyū . Japan : Meiji Shoin .
Hirayama , T. ( 1979 ). Gengotō Nara-ken Totsukawa hōgen no Seikaku . Gengo Kenkyu , 76 , 29 – 73 . https://doi.org/10.11435/gengo1939.1979.76_29
Hoffmann , K. , Bouckaert , R. , Greenhill , S. J. , and Kühnert , D. ( 2021 ). ‘ Bayesian Phylogenetic Analysis of Linguistic Data Using BEAST’ , Journal of Language Evolution , 6 ( 2 ), 119 – 135 . https://doi.org/10.1093/jole/lzab005
Huisman , J. L. A. , Majid , A. , and van Hout , R. ( 2019 ). ‘ The Geographical Configuration of a Language Area Influences Linguistic Diversity’, PLoS One , 14 ( 6 ), e0217363 . https://doi.org/10.1371/journal.pone.0217363
Igarashi Y. ( 2021 ) ‘ Bunki-gaku-teki shuhō ni motozuita Nichiryū shogo no keitō bunrui no kokoromi’, in Y. Hayashi , T. Kinuhata , and N. Kibe (eds) Fīrudo to bunken kara miru Nichiryū shogo no keitō to rekishi , pp. 17 – 51 . Japan : Kaitakusha .
Jeszenszky , P. , Hikosaka , Y. , Imamura , S. , and Yano , K. ( 2019 ). ‘ Japanese Lexical Variation Explained by Spatial Contact Patterns’, ISPRS International Journal of Geo-Information , 8 ( 9 ): 400 . https://doi.org/10.3390/ijgi8090400
Kibe , N. ( 1997 ). ‘ 18-Seiki Satsuma no hyōryū-min Gonza no Akusento ni Tsuite: Joshi no Akusento to Gonza Akusento no Ichizuke’ , Kokugogaku , 191 : 84 – 97 .
Kibe N. ( 2000 ) Seinanbu Kyūshū 2-kei akusento no Kenkyū . Japan : Bunsei Shuppan .
Kindaichi H. ( 1942 ) (reprint 2005) ‘ Bumoki no Kenkyu Zokuchō. Nihongo no Akusento, Nihon Hōgen Gakkai’, in Kindaichi Haruhiko Chosaku Shū 9 , pp. 9 – 38 . Japan : Tamakawa Daigaku Shuppanbu .
Kindaichi H. ( 1964 ) (reprint 1977). ‘ Watashi no hōgen kukaku’, in Nihon-go hōgen no kenkyū , pp. 54 – 80 . Japan : Tokyodō .
Kindaichi H. ( 1966b ) (reprint 2005) ‘ Sanuki akusento hen’i seiritsu kou’, in H. Kindaichi (ed) Kindaichi Haruhiko Chosaku shū 7 , pp. 531 – 568 . Japan : Tamakawa Daigaku Shuppanbu .
Kindaichi H. , 1966a (reprint 2005). ‘ Tsushima Iki no Akusento no Chii’, in H. Kindaichi (ed) Kindaichi Haruhiko Chosaku Shū 7 , pp. 347 – 373 . Japan : Tamakawa Daigaku Shuppanbu .
Kindaichi , H. ( 1967 ). ‘ Tōgoku hōgen no rekishi o kangaeru’ , Kokugo-gaku , 69 : 40 – 50 .
Kindaichi H. ( 1975b ) (reprint 2005). ‘ On’in henka kara akusento no henka he’, in H. Kindaichi (ed) Kindaichi Haruhiko Chosaku shū 7 , pp. 630 – 657 . Japan : Tamakawa Daigaku Shuppanbu .
Kindaichi H. , ( 1975a ) (reprint 2005). ‘ Tōzai ryo akusento no chigai ga dekiru made’, in H. Kindaichi (ed) Kindaichi Haruhiko Chosaku Shū 7 , pp. 374 – 414 . Japan : Tamakawa Daigaku Shuppanbu .
Kindaichi , H. ( 1978 ). ‘ Aichi ken Akusento no Keifu’, Kokugo Gaku Ronshū’, . Kasama-Shoin , 1 : 1 – 19 .
Kindaichi H. ed ( 2001 ). A concise Tone Dictionary of the Japanese Language [Meikai Nihongo Akusento Jiten] . Japan : Sanseidō .
Koile , E. , et al. . ( 2022 ). ‘ Phylogeographic Analysis of the Bantu Language Expansion Supports A Rainforest Route’, Proceedings of the National Academy of Sciences of the United States of America , 119 ( 32 ): e2112853119 . https://doi.org/10.1073/pnas.2112853119
Lee , S. , and Hasegawa , T. ( 2011 ) ‘ Bayesian Phylogenetic Analysis Supports an Agricultural Origin of Japonic Languages’, Proceedings of the Royal Society B , 278 ( 1725 ): 3662 – 3669 . https://doi.org/10.1098/rspb.2011.0518
Lemey , P. , Rambaut , A. , Drummond , A. J. , and Suchard , M. A. ( 2008 ). ‘ Bayesian Phylogeography Finds Its Roots,’ PLoS Computational Biology , 5 ( 9 ): e1000520 . https://doi.org/10.1371/journal.pcbi.1000520
Lemey , P. , Rambaut , A. , Welch , J. J. , and Suchard , M. A. ( 2010 ). ‘ Phylogeography Takes a Relaxed Random Walk in Continuous Space and Time’ , Molecular Biology and Evolution , 27 ( 8 ): 1877 – 1885 . https://doi.org/10.1093/molbev/msq067
Lewis , P. O. ( 2001 ). ‘ A Likelihood Approach to Estimating Phylogeny from Discrete Morphological Character Data’, Systematic Biology , 50 ( 6 ): 913 – 925 . https://doi.org/10.1080/106351501753462876
List J. M. , Shijulal N. S. , Martin W. , and Geisler H. ( 2014 ) ‘ Using Phylogenetic Networks to Model Chinese Dialect History’, Language Dynamics and Change , 4 : 222 – 252 . ( https://doi.org/10.1163/22105832-00402008 )
Long , D. , Isono , E. , and Tsukahara , Y. ( 2008 ). ‘ Ogasawara Shotō no Ōbei-kei Tōmin ni Mirareru Go-akusento no Kata Oyobi Sono Sedaisa’ , Ogasawara Kenkyū Nenpō , 31 , 31 – 40 .
Mase Y. ( 1994 ) Hiroshima-shi hōgen akusento jiten . Japan : Nakano Shuppan Kikaku .
Matsukura , K. ( 2014 ). ‘ Distribution of Accent Systems in Awara City. Fukui Pref ’, Tokyo University Linguistic Papers , 35 : 141 – 154 . https://doi.org/10.15083/00027471
Matsukura , K. , and Nitta , T. ( 2016 ). ‘ Comparison of the Three-pattern Accent Systems in Fukui Prefecture’, Journal of the Phonetic Society of Japan , 20 ( 3 ): 81 – 94 . https://doi.org/10.24467/onseikenkyu.20.3_81
Matsumori , A. ( 1998 ). ‘ Ryūkyū Akusento no Rekishi Teki Keisei Katei: Ruibetsu Gorui Nihakugo no Tokuina Gōryū no Shikata O Tegakari ni’, Gengo Kenkyū , 114 : 85 – 114 .
Matsumori , A. , Nitta , T. , Kibe , N. , and Nakai , Y. ( 2012 ) Nihongo Akusento Nyūmon . Japan : Sanseido .
Ministry of Land, Infrastructure, Transport and Tourism ( 2020 ) Digital national land information (Administrative Area Data), ver. 2.2 , https://nlftp.mlit.go.jp/ksj/gml/datalist/KsjTmplt-N03-v2_3.html (downloaded in March 2022).
Nerbonne , J. ( 2010 ). ‘ Measuring the Diffusion of Linguistic Change’, Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences , 365 ( 1559 ): 3821 – 3828 . https://doi.org/10.1098/rstb.2010.0048
Neureiter , N. , et al. . ( 2022 ). ‘ Detecting Contact in Language Trees: A Bayesian Phylogenetic Model with Horizontal Transfer’ , Humanities and Social Sciences Communications , 9 ( 1 ): 205 .
Nitta , T. ( 2012 ). ‘ Accent of the Kokonogi Dialect in Echizen Town, Fukui Prefecture’ , Journal of the Phonetic Society of Japan , 16 ( 1 ): 63 – 79 . https://doi.org/10.24467/onseikenkyu.16.1_63
Okumura M. ed. ( 1976 ) Gifu-ken hōgen no kenkyū . Japan : Taishushobō .
Pagel , M. , Atkinson , Q. D. , and Meade , A. ( 2007 ). ‘ Frequency of Word-Use Predicts Rates of Lexical Evolution throughout Indo-European History’, Nature , 449 ( 7163 ): 717 – 720 . https://doi.org/10.1038/nature06176
Pagel , M. , and Meade , A. ( 2017 ). ‘ The Deep History of the Number Words’, Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences , 373 ( 1740 ): 20160517 . https://doi.org/10.1098/rstb.2016.0517
Pellard T. ( 2016 ) ‘ Nichiryū Sogo no Bunki Nendai’, in Y. Takubo , J. Whitman , and T. Hirako (eds) Ryūkyū shogo to Kodai Nihongo: Nichiryū sogo no saiken ni mukete , pp. 99 – 124 . Japan : Kuroshio Shuppan .
Romano , N. , Ranacher , P. , Bachmann , S. , and Joost , S. ( 2022 ). ‘ Linguistic Traits as Heritable Units? Spatial Bayesian Clustering Reveals Swiss German Dialect Regions’, Journal of Linguistic Geography , 10 ( 1 ): 11 – 22 . https://doi.org/10.1017/jlg.2021.12
Saitou , N. , and Jinam , T. A. ( 2017 ). ‘ Language Diversity of the Japanese Archipelago and its Relationship with Human DNA Diversity’ , Man in India , 95 ( 4 ): 205 – 228 .
Saitou , N. , and Nei , M. ( 1987 ). ‘ The Neighbor-Joining Method: A New Method For Reconstructing Phylogenetic Trees’ , Molecular Biology and Evolution , 4 ( 4 ): 406 – 425 . https://doi.org/10.1093/oxfordjournals.molbev.a040454
Sato , R. ( 1983 ). ‘ Fukui-Shi Oyobi Sono Shūhen Chiiki no Akusento-Chōsahō to Kata no Kubetsu no Arawarekata tono Kanren o Chūshin ni’, Kokugogaku Kenkyū , 23 : 1 – 19 .
Sato R. , ( 1988 ) ‘ The Accent System of Fukui City and Its Suburbs—With Special Reference to the Survey Methods, Age and Individual Differences’, in: National Institute for Japanese Language , (ed.) Hōgen Kenkyū hō no Tankyū , pp. 123 – 219 . Japan : Shūei Shuppan .
Sato , Y. , Sogabe , Y. , and Mazuka , R. ( 2010 ). ‘ Development of hemispheric specialization for lexical pitch-accent in Japanese infants’ , Journal of Cognitive Neuroscience , 22 ( 11 ): 2503 – 2513 . https://doi.org/10.1162/jocn.2009.21377
Shibata T. (1942) (reprint 1950 ) ‘ Ibigawa Jōryū no Akusento’, in T. Shibata (ed) Moji to Kotoba , pp. 231 – 266 . Japan : Toue Shoin .
Shibatani M. , ed ( 1990 ) (8th ed: 2005). The Languages of Japan. Cambridge Language Survey . UK : Cambridge University Press .
Szmrecsanyi B. ( 2012 ) ‘ Geography is Overrated’, in: Hansen S. , Schwarz C. , Stoeckle P. , and Streck T. , (eds) Dialectological and Folk Dialectological Concepts of Space—Current Methods and Perspectives in Sociolinguistic Research on Dialect Change , pp. 215 – 231 . Berlin, Germany : De Gruyter .
Takahashi , T. , and Ihara , Y. ( 2023 ). ‘ Spatial Evolution of Human Cultures Inferred Through Bayesian Phylogenetic Analysis’, Journal of the Royal Society Interface , 20 ( 198 ): 20220543 . https://doi.org/10.1098/rsif.2022.0543
Tokugawa , M. ( 1962 ). ‘Nihongo sho-hōgen akusento no keifu’ shiron: ‘rui no tōgō’ to ‘chiri-teki bumpu’ kara miru . Gakushuin Daigaku Kokugo Kokubungaku Kaishi , 6 : 1 – 19 .
Uwano , Z. ( 1977 ) ‘ Nihongo no akusento’, in S. Ohno and T. Shibata (eds) Iwanami kōza Nihongo 5 – On’in , pp. 281 – 322 . Japan : Iwanami Shoten .
Uwano , Z. ( 1985a ). ‘ The Accent System of Ibukijima Dialect’, Transactions of the Japan Academy , 40 ( 2 ): 75 – 179 . https://doi.org/10.2183/tja1948.40.75
Uwano , Z. ( 1985b ). ‘ Genealogical Relationships and the Geographical Distribution of the Accents in Mainland Japan’, Transactions of the Japan Academy , 40 ( 3 ): 215 – 250 . https://doi.org/10.2183/tja1948.40.215
Uwano , Z. ( 1987 ). ‘ Genealogical Relationships and the Geographical Distribution of the Accents in Mainland Japan’ , Transactions of the Japan Academy , 42 ( 1 ): 15 – 70 . https://doi.org/10.2183/tja1948.42.15
Uwano , Z. ( 1990 ). ‘ Accentual System of the Adjective in the Aomori Dialect ’, Asia & African linguistics , 19 : 45 – 81 .
Uwano , Z. ( 2006 ). ‘ Nihongo Akusento no Saiken’ , Gengo Kenkyū , 130 : 1 – 42 .
Uwano , Z. ( 2019 ). ‘ Accent Data of Verbs in the Northern Tōhoku Dialects: Part 1 ’, NINJAL Research Papers , 17 : 101 – 130 . https://doi.org/10.15084/00002226
Yamaguchi Y. ( 1984 ). ‘ Fukui-shi Kougai no Ni-kei Akusento’, Hōgen Kenkyū Nempō , 27 : 207 – 229 .
Yamaguchi Y. ( 2003 ) ‘ Akusento Taikei ga Shashō shita mono – Jun Ni-kei Akusento Shizuoka-ken Maisaka-machi Hōgen no Rei’, in Nihongo Tokyo Akusento no Seiritsu , pp. 326 – 349 . Japan : Minatono hito .
Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide
Blood Cancer Journal volume 14, Article number: 118 (2024)
Dear Editor,
Approximately 25–35% of adult patients with acute myeloid leukemia (AML) carry NPM1 mutations, which generally indicate a favorable outcome in the absence of FLT3-ITD mutation [1]. NPM1 mutations are absent in clonal hematopoiesis and have been considered AML-initiating lesions [2]. Research on the co-mutation characteristics of NPM1-mutated patients has concentrated on FLT3-ITD, which several large retrospective clinical studies suggest has a negative prognostic impact in NPM1-mutated patients [3, 4]. Besides FLT3-ITD, and although controversy remains, other high-frequency co-mutations such as DNMT3A, IDH1, IDH2, FLT3-TKD, NRAS, and WT1 have also been reported to affect the prognosis of NPM1-mutated patients [3, 5, 6, 7, 8, 9]. Identifying specific co-mutation combinations beyond FLT3-ITD is therefore essential for precise risk stratification and treatment optimization in NPM1-mutated AML. Because allogeneic hematopoietic stem cell transplantation (allo-HSCT) is generally considered to improve the long-term outcome of most adverse-risk and suitable intermediate-risk AML patients, it is imperative to revisit the co-mutation profiles of NPM1-mutated AML patients to determine which population may benefit from allo-HSCT.
In this study, we retrospectively analyzed newly diagnosed adult AML patients with NPM1 mutations (acute promyelocytic leukemia excluded) who were diagnosed at our center from October 2018 to December 2022, focusing on the therapeutic and prognostic significance of co-mutation characteristics. Patients who received at least one complete course of induction therapy were included in the outcome analysis. Table S1 provides details of induction chemotherapy. We evaluated efficacy after two induction cycles, unless patients achieved CR/CRi after only one induction cycle or discontinued treatment. Response was evaluated according to the NCCN guidelines for AML (version 3.2023) and categorized as CR/CRi or non-CR/CRi (including partial response [PR] and no response) [10]. Overall survival (OS) was defined as the time from treatment initiation until death from any cause. Event-free survival (EFS) was defined as the time from treatment initiation to induction failure, relapse, or death, whichever came first. Disease-free survival (DFS) was defined as the time from disease remission to relapse or death, whichever came first. The study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of the First Affiliated Hospital of Zhejiang University College of Medicine (Hangzhou, China; Ethics Approval Number: IIT20240304A). All statistical analyses were performed using GraphPad Prism 7.0 (GraphPad Software, CA, USA) and SPSS 23.0 (SPSS Inc., Chicago, IL).
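The three endpoint definitions above can be made concrete with a short sketch. This is only an illustration in Python, not the authors' workflow (they report using GraphPad Prism and SPSS). The `Patient` record, its field names, and the average month length are assumptions introduced here. Each function returns a duration in months together with a flag that is `True` when an event was observed and `False` when the patient was censored at last follow-up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

DAYS_PER_MONTH = 30.4375  # assumption: average month length used for conversion


@dataclass
class Patient:
    # Hypothetical record layout, for illustration only
    treatment_start: date
    last_follow_up: date
    remission: Optional[date] = None          # date CR/CRi was documented
    induction_failure: Optional[date] = None
    relapse: Optional[date] = None
    death: Optional[date] = None


def _months(start: date, end: date) -> float:
    return (end - start).days / DAYS_PER_MONTH


def _first(*dates: Optional[date]) -> Optional[date]:
    """Earliest of the dates that are present, or None if none occurred."""
    present = [d for d in dates if d is not None]
    return min(present) if present else None


def overall_survival(p: Patient) -> Tuple[float, bool]:
    """Treatment initiation to death from any cause."""
    end = p.death if p.death is not None else p.last_follow_up
    return _months(p.treatment_start, end), p.death is not None


def event_free_survival(p: Patient) -> Tuple[float, bool]:
    """Treatment initiation to induction failure, relapse, or death."""
    event = _first(p.induction_failure, p.relapse, p.death)
    end = event if event is not None else p.last_follow_up
    return _months(p.treatment_start, end), event is not None


def disease_free_survival(p: Patient) -> Optional[Tuple[float, bool]]:
    """Remission to relapse or death; undefined without remission."""
    if p.remission is None:
        return None
    event = _first(p.relapse, p.death)
    end = event if event is not None else p.last_follow_up
    return _months(p.remission, end), event is not None
```

Returning the event flag alongside the duration keeps censored patients in the analysis, which is what a Kaplan-Meier style estimate requires.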
One hundred ninety-two newly diagnosed NPM1-mutated AML patients identified through next-generation sequencing (NGS) were analyzed (Tables S2–S4). Twenty NPM1 mutants were identified, most of which were located in exon 12 and manifested as a 4-base-pair duplication/insertion. Seven non-exon 12 mutants were located in exons 5, 8, 9, and 11 (Fig. 1A and Table S5). A total of 56 co-mutated genes were detected in the cohort (Fig. 1B). Co-mutated genes with a detection rate of ≥10% included FLT3 (56.77%), DNMT3A (48.44%), TET2 (29.69%), IDH2 (23.96%), IDH1 (14.58%), PTPN11 (11.46%), and NRAS (11.46%). By functional classification, co-mutated genes related to epigenetics and signal transduction were the most common (Table S6).
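As a rough illustration of how a per-gene detection rate and the ≥10% cut-off could be computed, the sketch below counts each gene once per patient and divides by cohort size. The function name, the input layout (one collection of mutated genes per patient), and the toy cohort in the usage note are hypothetical and are not the study data.

```python
from collections import Counter
from typing import Dict, Iterable, List


def detection_rates(patient_genes: List[Iterable[str]],
                    threshold: float = 0.10) -> Dict[str, float]:
    """Fraction of patients carrying each co-mutated gene.

    Only genes whose rate is at or above `threshold` are returned,
    ordered from most to least frequent.
    """
    n = len(patient_genes)
    if n == 0:
        return {}
    # set(...) ensures a gene is counted once per patient even if listed twice
    counts = Counter(g for genes in patient_genes for g in set(genes))
    rates = {gene: count / n for gene, count in counts.items()}
    ordered = sorted(rates.items(), key=lambda item: -item[1])
    return {gene: rate for gene, rate in ordered if rate >= threshold}
```

For a made-up cohort of 20 patients in which 10 carry FLT3, 4 of those also carry DNMT3A, and 1 other patient carries TET2, the function keeps FLT3 and DNMT3A and drops TET2, whose rate falls below the threshold.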
Fig. 1. A Protein domain structure and location of amino acids affected by mutations in NPM1. Several nuclear import and export signals of NPM1 assist its nucleocytoplasmic shuttling and cytological localization. The conserved N-terminal domain of NPM1 contains a leucine-rich nuclear export signal (NES). The middle domain contains two nuclear localization signals (NLS) that drive NPM1 to move from the cytoplasm to the nucleus. The C-terminus contains a nucleolar localization signal (NoLS), in which two highly conserved tryptophan residues (W288 and W290) are responsible for the correct folding of the helix to stabilize the hydrophobic core of NoLS. Most of the insertion mutations in exon 12 led to the loss of the original NoLS signal and generated a new NES signal, leading to aberrant cytoplasmic dislocation of NPM1 protein. B Co-mutation distribution map of NPM1-mutated AML patients.
One hundred seventy-eight patients (92.71%) received at least one complete course of intensive induction chemotherapy and underwent efficacy assessment, of whom 133 (74.72%) achieved CR/CRi within two courses of induction chemotherapy. The median follow-up of the 178 patients was 26.23 months (95% confidence interval [CI], 23.31–29.16). The median OS and DFS were not reached, and the median EFS was 15.03 months (95% CI, 8.25–21.82). The 3-year expected OS, EFS, and DFS were 51.5%, 40.3%, and 53.7%, respectively.
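A median that is "not reached" means the estimated survival curve never falls to 50% within the follow-up period. The sketch below shows this with a minimal Kaplan-Meier estimator written from the textbook product-limit formula. It is an assumption that the reported medians come from this kind of estimate, since the letter names only the software used. The function names and the toy inputs in the usage note are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival) at each distinct event time.

    events[i] is True if the event was observed at times[i] and False if
    the observation was censored there.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events at time t
        leaving = 0    # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += int(data[i][1])
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which survival is at or below 0.5, or None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

With follow-up times `[5, 10, 15, 20]` and a single observed event at 5, the curve drops only to 0.75, so `median_survival` returns `None`, the situation the text reports as a median that was not reached.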
Regardless of the cut-off value for variant allele frequency (VAF), there was no significant difference in OS, EFS, or DFS between the NPM1 low-VAF and high-VAF groups (Fig. S1). We then focused on the impact of co-mutations on response and outcome in AML patients with NPM1 mutations. Among the 178 NPM1-mutated patients included in the follow-up, patients with either FLT3-ITD or DNMT3A mutations showed significantly lower CR/CRi rates and worse prognostic trends than the corresponding wild-type groups (FLT3-ITD: CR/CRi rate, 63.41% vs. 84.38%, p = 0.001; median OS, 14.3 months vs. not reached [NR], p < 0.001; median EFS, 7.3 months vs. NR, p < 0.001; median DFS, 21.6 months vs. NR, p = 0.044; DNMT3A: CR/CRi rate, 67.44% vs. 81.53%, p = 0.013; median OS, 15.3 months vs. NR, p < 0.001; median EFS, 11.6 months vs. 27.7 months, p = 0.031; median DFS, p = 0.337) (Table S7 and Fig. S2). We further divided patients into four subgroups according to FLT3-ITD and DNMT3A mutation status. NPM1/FLT3-ITD/DNMT3A triple mutants showed the poorest OS and EFS trends among the four groups (Fig. 2A, B). Among patients with DNMT3A mutations, those with FLT3-ITD mutations had significantly worse OS than FLT3-ITD wild-type patients (p = 0.003), and a similar result was found among DNMT3A wild-type patients (p = 0.002). Likewise, among patients with FLT3-ITD mutations, those with DNMT3A mutations had significantly worse OS than DNMT3A wild-type patients (p = 0.045), with a similar result among FLT3-ITD wild-type patients (p = 0.020) (Fig. 2A).
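The four-subgroup split by FLT3-ITD and DNMT3A status can be sketched as a small classification step. The function name, the subgroup labels, and the boolean inputs are hypothetical choices made for illustration. The letter does not describe how the grouping was implemented.

```python
from collections import Counter
from typing import Iterable, Tuple


def comutation_subgroup(flt3_itd: bool, dnmt3a: bool) -> str:
    """Assign an NPM1-mutated patient to one of four co-mutation subgroups."""
    if flt3_itd and dnmt3a:
        return "FLT3-ITD mut / DNMT3A mut"   # the NPM1/FLT3-ITD/DNMT3A triple mutants
    if flt3_itd:
        return "FLT3-ITD mut / DNMT3A wt"
    if dnmt3a:
        return "FLT3-ITD wt / DNMT3A mut"
    return "FLT3-ITD wt / DNMT3A wt"


def subgroup_sizes(statuses: Iterable[Tuple[bool, bool]]) -> Counter:
    """Count patients per subgroup from (flt3_itd, dnmt3a) status pairs."""
    return Counter(comutation_subgroup(f, d) for f, d in statuses)
```

Each subgroup can then be passed separately to a survival estimator so that the four curves are compared on the same endpoints.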
Fig. 2. A OS and B EFS of NPM1-mutated AML patients with different combination patterns of FLT3-ITD and DNMT3A mutations. C OS, D EFS, and E DFS of NPM1-mutated AML patients with IDH1/2 mutation. F OS, G EFS, and H DFS of NPM1-mutated AML patients with PTPN11-PTP mutation. I OS and J EFS of NPM1-mutated, FLT3-ITD-mutated AML patients with IDH mutations. K OS and L EFS of NPM1-mutated, DNMT3A-mutated AML patients with PTPN11 mutations. M OS, N EFS, and O DFS by allo-HSCT status in NPM1-mutated AML patients harboring both FLT3-ITD and DNMT3A mutations.
Patients with IDH1/2 co-mutations had significantly better OS, EFS, and DFS than the wild-type group (median OS, NR vs. 18.6 months, p < 0.001; median EFS, NR vs. 10.2 months, p = 0.003; median DFS, NR vs. 18.3 months, p = 0.012) (Figs. 2C–E and S3). Although patients with PTPN11 co-mutations showed a trend toward improved outcome compared with PTPN11 wild-type patients, the difference was not significant (Fig. S4). PTPN11 mutations have been reported to cluster mainly in the N-terminal Src homology region 2 (N-SH2) and phosphatase (PTP) domains. Since mutations in both domains are involved in attenuating the autoinhibition of SHP2, the protein encoded by PTPN11 [11], we further investigated whether mutations in different domains of PTPN11 led to comparable outcomes. The OS and EFS of patients with PTPN11-PTP domain mutations were significantly better than those of PTPN11 wild-type patients (median OS, NR vs. 26.0 months, p = 0.014; median EFS, NR vs. 13.5 months, p = 0.016). A similar trend was found for DFS, whereas patients with PTPN11-N-SH2 domain mutations showed no significant improvement in outcome (Figs. 2F–H and S4). In addition, Fig. S5 shows the prognostic impact of the other co-mutated genes with a detection rate of ≥10% in the follow-up patients, including TET2, FLT3-TKD, NRAS, and WT1; none of these trends was significant.
We next considered the presence of IDH or PTPN11 mutations in NPM1-mutated patients who also carried FLT3-ITD or DNMT3A mutations, to explore the prognostic impact of specific co-mutation interaction patterns. IDH mutations significantly improved OS and showed a trend toward improved EFS in patients with NPM1/FLT3-ITD dual mutations (median OS, 30.8 vs. 12.8 months, p = 0.015; median EFS, 22.6 vs. 6.1 months, p = 0.099), but had no significant impact on the outcome of patients with NPM1/DNMT3A mutations (Figs. 2I, J and S6). Similarly, PTPN11 mutations significantly improved OS and EFS in patients with NPM1/DNMT3A dual mutations (median OS, NR vs. 14.6 months, p = 0.026; median EFS, NR vs. 10.2 months, p = 0.033), but had no significant impact on the outcome of patients with NPM1/FLT3-ITD mutations (Figs. 2K, L and S6).
Previous research has generally acknowledged that allo-HSCT is beneficial for FLT3-ITD-mutated AML patients without NPM1 mutations. To identify the subgroup of NPM1-mutated AML patients likely to benefit from allo-HSCT, we examined the prognosis of patients who underwent allo-HSCT during post-remission after achieving CR/CRi within two courses of induction. A total of 32 patients received allo-HSCT, and another four patients relapsed and received salvage HSCT during post-remission. For patients with NPM1 mutation overall, allo-HSCT or salvage HSCT did not significantly improve the outcome compared with non-transplanted patients (Fig. S7). For patients with NPM1 mutations combined with either FLT3-ITD or DNMT3A mutation, allo-HSCT showed a trend toward improved outcome, but the difference was not significant. When we focused on patients with NPM1/FLT3-ITD/DNMT3A triple mutations, who are characterized by poor prognosis, allo-HSCT significantly improved the OS, EFS, and DFS of this subgroup (median OS, NR vs. 14.0 months, p = 0.037; median EFS, NR vs. 9.1 months, p = 0.014; median DFS, NR vs. 7.4 months, p = 0.012) (Fig. 2M–O). In contrast, for NPM1-mutated patients with wild-type FLT3-ITD and DNMT3A, allo-HSCT did not improve outcome (Fig. S7).
Our results indicate that in NPM1-mutated AML, co-mutations of IDH1/2 and of the PTPN11-PTP domain correlate with favorable prognosis, whereas FLT3-ITD and DNMT3A co-mutations indicate poor prognosis. Notably, the presence of NPM1/FLT3-ITD/DNMT3A triple mutations is associated with exceptionally adverse OS and EFS trends. Several studies have reported that NPM1/FLT3-ITD/DNMT3A, the most common triple mutation pattern in NPM1-mutated patients, defines an AML subgroup with extremely poor prognosis [7, 12], which aligns with our findings. Further, our results on specific co-mutation combinations indicate that IDH and PTPN11 co-mutations ameliorate the adverse prognosis of patients with NPM1/FLT3-ITD and NPM1/DNMT3A dual mutations, respectively; two subsets with improved prognosis can thus be separated from the original adverse-prognosis subset of NPM1-mutated AML. In addition, for patients with NPM1/FLT3-ITD dual mutations, allo-HSCT after first remission has been shown to significantly improve both OS and DFS compared with continued chemotherapy alone [13, 14]. However, another large cohort study on pediatric AML reported opposite results [15]. Our research sought to identify the population most likely to benefit from allo-HSCT. The findings underscore the therapeutic potential of allo-HSCT during post-remission, particularly for AML patients with NPM1/FLT3-ITD/DNMT3A triple mutations.
In summary, these findings underscore the importance of co-mutation analysis in NPM1-mutated AML for risk stratification and therapeutic decision-making, and suggest that allo-HSCT may be a recommended strategy for NPM1-mutated patients with specific adverse co-mutation profiles. Nevertheless, further research is needed to confirm these findings and to explore how these co-mutations interact to diversify the outcome of NPM1-mutated AML patients.
The data are not publicly available owing to ethical considerations and privacy restrictions, but can be requested from the corresponding author.
1. Grimwade D, Ivey A, Huntly BJ. Molecular landscape of acute myeloid leukemia in younger adults and its clinical relevance. Blood. 2016;127:29–41. https://doi.org/10.1182/blood-2015-07-604496.
2. McKerrell T, Park N, Moreno T, Grove CS, Ponstingl H, Stephens J, et al. Leukemia-associated somatic mutations drive distinct patterns of age-related clonal hemopoiesis. Cell Rep. 2015;10:1239–45. https://doi.org/10.1016/j.celrep.2015.02.005.
3. Papaemmanuil E, Gerstung M, Bullinger L, Gaidzik VI, Paschka P, Roberts ND, et al. Genomic classification and prognosis in acute myeloid leukemia. N Engl J Med. 2016;374:2209–21. https://doi.org/10.1056/NEJMoa1516192.
4. Boddu PC, Kadia TM, Garcia-Manero G, Cortes J, Alfayez M, Borthakur G, et al. Validation of the 2017 European LeukemiaNet classification for acute myeloid leukemia with NPM1 and FLT3-internal tandem duplication genotypes. Cancer. 2019;125:1091–100. https://doi.org/10.1002/cncr.31885.
5. Gaidzik VI, Weber D, Paschka P, Kaumanns A, Krieger S, Corbacioglu A, et al. DNMT3A mutant transcript levels persist in remission and do not predict outcome in patients with acute myeloid leukemia. Leukemia. 2018;32:30–7. https://doi.org/10.1038/leu.2017.200.
6. Boddu P, Kantarjian H, Borthakur G, Kadia T, Daver N, Pierce S, et al. Co-occurrence of FLT3-TKD and NPM1 mutations defines a highly favorable prognostic AML group. Blood Adv. 2017;1:1546–50. https://doi.org/10.1182/bloodadvances.2017009019.
7. Bezerra MF, Lima AS, Pique-Borras MR, Silveira DR, Coelho-Silva JL, Pereira-Martins DA, et al. Co-occurrence of DNMT3A, NPM1, FLT3 mutations identifies a subset of acute myeloid leukemia with adverse prognosis. Blood. 2020;135:870–5. https://doi.org/10.1182/blood.2019003339.
8. Eisfeld AK, Kohlschmidt J, Mims A, Nicolet D, Walker CJ, Blachly JS, et al. Additional gene mutations may refine the 2017 European LeukemiaNet classification in adult patients with de novo acute myeloid leukemia aged <60 years. Leukemia. 2020;34:3215–27. https://doi.org/10.1038/s41375-020-0872-3.
9. Patel JP, Gonen M, Figueroa ME, Fernandez H, Sun Z, Racevskis J, et al. Prognostic relevance of integrated genetic profiling in acute myeloid leukemia. N Engl J Med. 2012;366:1079–89. https://doi.org/10.1056/NEJMoa1112304.
10. Benson AB, Venook AP, Al-Hawary MM, Arain MA, Chen YJ, Ciombor KK, et al. Colon Cancer, Version 2.2021, NCCN clinical practice guidelines in oncology. J Natl Compr Canc Netw. 2021;19:329–59. https://doi.org/10.6004/jnccn.2021.0012.
11. Alfayez M, Issa GC, Patel KP, Wang F, Wang X, Short NJ, et al. The Clinical impact of PTPN11 mutations in adults with acute myeloid leukemia. Leukemia. 2021;35:691–700. https://doi.org/10.1038/s41375-020-0920-z.
12. Heiblig M, Duployez N, Marceau A, Lebon D, Goursaud L, Plantier I, et al. The impact of DNMT3A status on NPM1 MRD predictive value and survival in elderly AML patients treated intensively. Cancers. 2021;13. https://doi.org/10.3390/cancers13092156.
13. Pratcorona M, Brunet S, Nomdedeu J, Ribera JM, Tormo M, Duarte R, et al. Favorable outcome of patients with acute myeloid leukemia harboring a low-allelic burden FLT3-ITD mutation and concomitant NPM1 mutation: relevance to post-remission therapy. Blood. 2013;121:2734–8. https://doi.org/10.1182/blood-2012-06-431122.
14. Sakaguchi M, Yamaguchi H, Najima Y, Usuki K, Ueki T, Oh I, et al. Prognostic impact of low allelic ratio FLT3-ITD and NPM1 mutation in acute myeloid leukemia. Blood Adv. 2018;2:2744–54. https://doi.org/10.1182/bloodadvances.2018020305.
15. Xu LH, Fang JP, Liu YC, Jones AI, Chai L. Nucleophosmin mutations confer an independent favorable prognostic impact in 869 pediatric patients with acute myeloid leukemia. Blood Cancer J. 2020;10:1. https://doi.org/10.1038/s41408-019-0268-7.
This work was supported in part by the National Natural Science Foundation of China (82370162); the Natural Science Foundation of Zhejiang Province, China (LY23H080005); the Key R&D Program of Zhejiang (2024C03162); and the Fundamental Research Funds for the Central Universities (226-2022-00003).
These authors contributed equally: Yiyi Yao, Yile Zhou, Nanfang Zhuo.
Department of Hematology, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310003, Zhejiang, PR China
Yiyi Yao, Yile Zhou, Nanfang Zhuo, Wanzhuo Xie, Haitao Meng, Yinjun Lou, Liping Mao, Hongyan Tong, Jiejing Qian, Min Yang, Wenjuan Yu, De Zhou, Jie Jin & Huafeng Wang
Zhejiang Provincial Key Laboratory of Hematopoietic Malignancy, Zhejiang University, Hangzhou, 310000, Zhejiang, PR China
Zhejiang Provincial Clinical Research Center for Hematological disorders, Hangzhou, 310000, Zhejiang, PR China
Hongyan Tong, Jie Jin & Huafeng Wang
Zhejiang University Cancer Center, Hangzhou, 310000, Zhejiang, PR China
YY and HW designed the study, collected and analyzed the data, and wrote the first draft of the manuscript. YZ, NZ, WX, HM, YL, LM, HT, JQ, MY, WY, and DZ collected and analyzed the data, and reviewed the manuscript. JJ and HW read and reviewed the manuscript. HW accessed and verified the data, and provided administrative support. All authors had full access to all the data in the study and had final responsibility for the decision to submit for publication.
Correspondence to Huafeng Wang.
Competing interests.
The authors declare no competing interests.
This study was approved by local ethics committees and was conducted in accordance with the Declaration of Helsinki. All patients signed written informed consent.
All patients signed informed consent and also consented to the publication of these data.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary materials, rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .
Cite this article.
Yao, Y., Zhou, Y., Zhuo, N. et al. Co-mutation landscape and its prognostic impact on newly diagnosed adult patients with NPM1-mutated de novo acute myeloid leukemia. Blood Cancer J. 14, 118 (2024). https://doi.org/10.1038/s41408-024-01103-w
Received: 18 April 2024
Revised: 08 July 2024
Accepted: 11 July 2024
Published: 22 July 2024
DOI: https://doi.org/10.1038/s41408-024-01103-w
Mutation Research: Genetic Toxicology and Environmental Mutagenesis (MRGTEM) publishes papers advancing knowledge in the field of genetic toxicology. Papers are welcomed in the following areas: ... The evaluation of contrasting or opposing viewpoints is welcomed as long as the presentation is in accordance with the journal's aims, scope, and ...
A section of Mutation Research. Mutation Research (MR) provides a platform for publishing all aspects of DNA mutations and epimutations, from basic evolutionary aspects to translational applications in genetic and epigenetic diagnostics and therapy. Mutations are defined as all possible alterations in DNA sequence and sequence organization, from point mutations to genome structural variation ...
About the journal. The subject areas of Mutation Research - Reviews in Mutation Research (MRR) encompass the entire spectrum of the science of mutation research and its applications, with particular emphasis on the relationship between mutation and disease. Thus, this section will cover: Advances in human genome …. View full aims & scope ...
Read the latest Research articles in Mutation from Nature Reviews Genetics. ... Journal Club | 19 October 2022. The mutation rate as an evolving trait ... Mutation is the source of genetic ...
Mutation articles from across Nature Portfolio. ... Research 16 Jul 2024 Journal of Human Genetics. P: 1-7 ... Research Highlights 11 Oct 2023 Nature Reviews Genetics.
The mutational landscape of normal human endometrial epithelium. Whole-genome sequencing of normal human endometrial glands shows that most are clonal cell populations and frequently carry cancer ...
Mutation Research is a peer-reviewed scientific journal that publishes research papers in the area of mutation research, focusing on the fundamental mechanisms underlying the phenotypic and genotypic expression of genetic damage. There are currently three sections; two previous sections are now continued as DNA Repair.
Scope. Mutation Research (MR) provides a platform for publishing all aspects of DNA mutations and epimutations, from basic evolutionary aspects to translational applications in genetic and epigenetic diagnostics and therapy. Mutations are defined as all possible alterations in DNA sequence and sequence organization, from point mutations to ...
Abstract. Mutation is the engine of evolution in that it generates the genetic variation on which the evolutionary process depends. To understand the evolutionary process we must therefore characterize the rates and patterns of mutation. Starting with the seminal Luria and Delbruck fluctuation experiments in 1943, studies utilizing a variety of ...
Advances in DNA sequencing have enabled the identification of human germline and somatic mutations at a genome-wide scale. These studies have confirmed, refined, and extended our understanding on the origins, mechanistic basis, and empirical characteristics of human mutations, including both replicative and nonreplicative errors, heterogeneity in the rates and spectrum of mutations within ...
Human Mutation provides a unique forum for the exchange of ideas, methods, and applications of interest to molecular, human, and medical geneticists in academic, industrial, and clinical research settings worldwide.
Incorporating Mutation Research Letters, Mutation Research/Environmental Mutagenesis and Related Subjects and Mutation Research/Genetic Toxicology ; 2024 — Volumes 893-898
Scope. Mutation Research - Genetic Toxicology and Environmental Mutagenesis (MRGTEM) publishes papers advancing knowledge in the field of genetic toxicology. Papers are welcomed in the following areas: New developments in genotoxicity testing of chemical agents (e.g. improvements in methodology of assay systems and interpretation of results ...
Mutations drive evolution and were assumed to occur by chance: constantly, gradually, roughly uniformly in genomes, and without regard to environmental inputs, but this view is being revised by discoveries of molecular mechanisms of mutation in bacteria, now translated across the tree of life. These mechanisms reveal a picture of highly regulated mutagenesis, up-regulated temporally by stress ...
The subject areas of Mutation Research - Reviews in Mutation Research (MRR) encompass the entire spectrum of the science of mutation research and its applications, with particular emphasis on the relationship between mutation and disease. Thus, this section will cover: Advances in human genome research (including evolving technologies for mutation detection and functional genomics) with ...
Mutation Research - Fundamental and Molecular Mechanisms of Mutagenesis. Supports open access. 4.9 CiteScore. 1.5 Impact Factor.
The mutation risk score distributions in the cohort calculated by each model are shown in Figure 1. The majority of the risk scores were below 10%, consistent with the actual mutation carrier rate of 6.0%. The median risk scores by the PREMM 5, MMRPro, MMRPredict, and Myriad models were 3.4%, 0.45%, 4.0%, and 7.2%, respectively. Notably, most ...
However, research in the past decade has shown that a substantial portion of mutation rate variation has a scale of dozens of kilobases, is DNA strand dependent, and is correlated with gene ...
One possible direction for future research is to pre-classify accent patterns into a few groups, so that the mutation within a group is more likely than mutation between groups. In this way, we may reflect the variation in the mutation rates by introducing two model parameters representing replacement rate within and between groups of accent ...
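The two-parameter idea described above can be illustrated with a small substitution rate matrix in which accent patterns in the same group replace one another at one rate and patterns in different groups at another. This is only a minimal sketch under stated assumptions, not the authors' model: the function name `build_rate_matrix`, the group labels, and the numeric rates are all hypothetical.

```python
def build_rate_matrix(groups, rate_within, rate_between):
    """Build a continuous-time rate matrix over accent patterns.

    groups       -- list of group labels, one per accent pattern (hypothetical)
    rate_within  -- replacement rate between patterns sharing a group
    rate_between -- replacement rate between patterns in different groups
    """
    n = len(groups)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                same_group = groups[i] == groups[j]
                q[i][j] = rate_within if same_group else rate_between
        # The diagonal is still 0.0 here, so the row sum covers only the
        # off-diagonal rates; negating it makes every row sum to zero.
        q[i][i] = -sum(q[i])
    return q


# Hypothetical example: three accent patterns, the first two pre-classified
# into group "A" and the third into group "B".
example = build_rate_matrix(["A", "A", "B"], rate_within=2.0, rate_between=0.5)
```

Choosing `rate_within` larger than `rate_between` encodes the assumption that replacement inside a group is more likely than replacement across groups; in an actual analysis both values would be estimated from data rather than fixed by hand.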
The journal Mutation Research was founded in 1964 by Frits H. Sobels. Over the years, adapting to the evolving field, the journal has been divided into several sections and has seen a number of title changes, generating a complex publication history.
PURPOSE The autosomal dominant cancer predisposition disorders hereditary breast and ovarian cancer (HBOC) and Lynch syndrome (LS) are genetic conditions for which early identification and intervention have a positive effect on the individual and public health. The goals of this study were to determine whether germline genetic screening using exome sequencing could be used to efficiently ...
The National Institutes of Health has awarded a $2.4 million grant to an Oklahoma Medical Research Foundation scientist whose lab identifies disease-causing genetic mutations. ... that will further expedite research on individual genetic mutations. Qin's discovery was recently published in the journal Nature Communications. Filed Under: News ...
Research on co-mutation characteristics of NPM1-mutated patients concentrated on FLT3-ITD, ... Blood Cancer Journal (Blood Cancer J.) ISSN 2044-5385 (online) ...
Read the latest articles of Mutation Research/Mutation Research Genomics at ScienceDirect.com, Elsevier's leading platform of peer-reviewed scholarly literature.
Sep. 7, 2020 — Recent cancer studies have shown that genomic mutations leading to cancer can occur years, or even decades, before a patient is diagnosed. Researchers have developed a statistical ...