Modality, presentation, domain and training effects in statistical learning

Krisztina Sára Lukics

1 Department of Cognitive Science, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary

2 MTA-BME Momentum Language Acquisition Research Group, Eötvös Loránd Research Network (ELKH), Budapest, Hungary

Ágnes Lukács


Abstract

While several studies suggest that the nature and properties of the input have significant effects on statistical learning, these effects have rarely been investigated systematically. In order to understand how input characteristics and their interactions impact statistical learning, we explored the effects of modality (auditory vs. visual), presentation type (serial vs. simultaneous), domain (linguistic vs. non-linguistic), and training type (random, starting small, starting big) on artificial grammar learning in young adults (N = 360). With serial presentation of stimuli, learning was more effective in the auditory than in the visual modality. However, with simultaneous presentation of visual and serial presentation of auditory stimuli, the modality effect was not present. We found a significant domain effect as well: a linguistic advantage over nonlinguistic material, which was driven by the domain effect in the auditory modality. Overall, the auditory linguistic condition had an advantage over other modality-domain types. Training types did not have any overall effect on learning; starting big enhanced performance only in the case of serial visual presentation. These results show that input characteristics such as modality, presentation type, domain and training type influence statistical learning, and suggest that their effects are also dependent on the specific stimuli and structure to be learned.

Introduction

Our surroundings are full of structured patterns and regularities. In order to operate efficiently in this complex environment, an organism has to be equipped with abilities to find, learn, and utilize these environmental structures and regularities. Statistical learning is a powerful mechanism for extracting and encoding structure from environmental stimuli 1. This form of learning is ubiquitous in human cognition: studies have shown that it is present in the auditory, visual, and tactile modalities and across the linguistic and nonlinguistic domains 2–12, and that it also operates in multimodal, visuomotor tasks 13, 14.

Statistical learning supports many skills in our everyday life. For instance, language, consisting of complex patterns and regularities on multiple levels, has been suggested to rely on it 15–25. While the contribution of statistical learning might be most frequently highlighted in language, several results have shown that this mechanism is not limited to it: it has an important role in domains such as music acquisition 26, event processing 27, or acquiring complex visual stimuli like scenes or faces 28, suggesting a broad and varied role for statistical learning in human cognition. As the human cognitive system faces great diversity in learning materials, differences in the properties of the input may impose different constraints on statistical learning in each area 1, 29. Input constraints are especially important in statistical learning because this form of learning is model-free and input-driven compared to other forms of learning like reinforcement learning or declarative learning 29. To understand how this fundamental mechanism operates in different areas of cognition, we aim to uncover how input characteristics and their interactions affect learning.

While the variability of areas in which statistical learning is present may suggest generality, direct comparisons of learning the same structure with stimuli from different domains and modalities indicate the presence of modality- and domain-specific constraints. (In the present paper, we use domain to refer to the content of representations, more specifically, to denote the linguistic-nonlinguistic distinction in our tasks.) These effects have mostly been demonstrated in artificial language learning tasks, where a few novel items are organized into sequences based on simple patterns. After being exposed to a set of grammatical sequences, humans are able to distinguish grammatical from ungrammatical sequences. These studies have shown that the efficiency of statistical learning of serially presented (i.e., one stimulus presented after the other) nonlinguistic auditory patterns exceeds the extraction of serial nonlinguistic visual patterns, which in turn is better than the learning of serial nonlinguistic tactile patterns 3. In general, statistical learning is assumed to have modality- or even stimulus-specific characteristics 1, 29.

Importantly, these modality effects are likely to result from differences in the parameters of optimal presentation. While sequential information in the auditory modality is only available through serial presentation of stimuli, for visual sequences, serial and simultaneous presentation are both feasible. Simultaneous presentation, where the items of a sequence are presented together at the same time, seems to be optimal for statistical learning of nonlinguistic visual sequences 30, 31. When visual information is presented simultaneously, performance is similar to that in nonlinguistic auditory learning 30, 31. Presentation rates also affect learning differently in different modalities, and slower rates seem to facilitate visual statistical learning: when serial linguistic visual stimuli are presented at a slower rate than serial linguistic auditory stimuli, learning performance is equivalent across the two modalities 69.

While modality differences in statistical learning have been demonstrated in several studies, tests of domain effects, i.e., direct comparisons of linguistic versus nonlinguistic materials, are hard to find. One notable exception is Saffran 31, who explored both domain (linguistic versus nonlinguistic) and modality (auditory versus visual) effects in an artificial grammar learning task and found no overall advantage of sequence learning in the linguistic over the nonlinguistic domain (or in the auditory over the visual modality) with serial presentation of sequences. However, the focus of that study was on contrasting two types of grammars (grammars with predictive and non-predictive dependencies) within each condition, rather than on directly comparing performance across domains and modalities. Although it was not the primary focus of their study, Hoch, Tyler and Tillmann 70 directly compared statistical learning in the linguistic and nonlinguistic domains, observing significantly higher levels of learning in the linguistic than in the nonlinguistic domain.

Besides constraints imposed by modality, presentation type and domain, different arrangements of stimuli during training (training type) also influence statistical learning. The starting small hypothesis assumes that incremental presentation of stimuli of different length (and complexity) enhances statistical learning in humans and neural networks 32. In another formulation, the less is more hypothesis 33, 34 posits that cognitive limitations, like reduced working memory capacity, help the learning of complex patterns and systems. Research on human learners and less is more / starting small is methodologically diverse and has yielded conflicting results 35–38. On the one hand, contrary to the predictions of the less is more hypothesis, several studies found that the acquisition of grammar structures in artificial grammar learning tasks is more effective in adults than in children 39. On the other hand, simulating reduced working memory capacity in adults seems to facilitate learning in some 40, but not in other studies 41. A starting big arrangement of stimuli, which begins with the longer strings of the grammar, has also been argued and demonstrated to result in superior performance by allowing larger chunks to be learned first and parsed later 42. However, it may also lead to false hypotheses about grammar structure 32, 43, or prevent generalization of rules 40. To summarize, starting small and starting big training types lead to more efficient learning in some cases, but further research is needed to identify the conditions under which they boost learning.

Although the effects of input characteristics like modality, presentation, domain and training type have been examined before, previous research has only investigated these effects on statistical learning separately, calling for further studies with direct comparisons. Furthermore, many studies used different statistical learning designs, with differences in patterns, stimuli and presentation arrangement. Our aims in this study were to examine modality, presentation, domain and training type effects using Saffran’s 31 predictive grammar in order to extend the results of the original study (a) by systematically investigating all combinations of the examined input effects and their interactions (e.g. by also including visual linguistic conditions), and (b) by directly comparing learning performance across conditions. We compared the efficiency of statistical learning in the visual (in both serial and simultaneous presentation types) and auditory modalities and across the linguistic versus nonlinguistic domains. We also wanted to test how training type, namely starting small and starting big, influences learning across these conditions, as training effects have not been examined with finite-state, category-based grammars. Figure 1 summarizes the design of the experimental conditions in the study.

Figure 1. The design of the study. We systematically investigated the effect of four factors: modality (auditory vs. visual), presentation type (serial vs. simultaneous), domain (linguistic vs. nonlinguistic), and training type (random vs. starting small vs. starting big), yielding 18 conditions altogether. With 20 participants in each group, 360 participants took part in the study.

Our hypotheses were the following:

  • Based on previous findings, we expected an advantage of learning in the auditory modality over the visual modality with serial presentation. We also hypothesized that when presentation is optimized for modality, this advantage disappears, and performance in the serial auditory and the simultaneous visual tasks would be at similar levels.
  • In the present study, we aimed to directly compare the acquisition of statistical patterns in the linguistic and nonlinguistic domains. Based on the results of Saffran 31, we expected that the linguistic versus nonlinguistic status of stimuli would not have an effect on learning efficiency.
  • Training effects, starting small and starting big, have not been examined with finite-state, category-based grammars. We hypothesized that starting small would facilitate learning compared to presenting training sequences in a random order, as starting small enables the generation of simple and flexible hypotheses about the rule 32, 43. In contrast, starting big would mainly facilitate the learning of specific item relations; as a result, we expected it to yield lower learning performance than random training due to less effective hypothesis generation 42.

These hypotheses translate into the following formulations in our experimental design, motivating three sets of analyses:

  • With serial presentation of stimuli, we expected an advantage of learning in the auditory modality in comparison to the visual modality. We hypothesized that there would be no domain effect, that is, the linguistic conditions would not differ from the nonlinguistic conditions. We also expected that starting small training would lead to higher, and starting big to lower, performance than presenting training sequences of different lengths in a random order.
  • In the visual conditions, we expected an advantage of simultaneous over serial presentation. Here, we also expected no domain effect, and an advantage of starting small and a disadvantage of starting big relative to random training.
  • With presentation optimized for modality, we expected performance in the serial auditory and the simultaneous visual tasks to be at similar levels. Here, we also hypothesized no domain effect, and an advantage of starting small and a disadvantage of starting big relative to random training.

Methods

Participants

360 young adults participated in the study. Most of them were university students who were recruited through facultative cognitive psychology courses at the Budapest University of Technology and Economics, and received course credit for their participation. The rest of the participants were volunteers recruited via convenience sampling. Inclusion criteria were normal or corrected-to-normal hearing and vision, and Hungarian as a native language. Participants were asked to report any neurological and psychiatric conditions (none were reported in our sample). Mean age was 22.5 years (SD = 3.9, minimum = 18.1, maximum = 55.8), and 255 females and 105 males participated in the study. Age information was missing in the case of two participants. All participants were tested with their informed consent, in accordance with the principles set out in the Declaration of Helsinki and the stipulations of the local institutional review board (United Ethical Review Committee for Research in Psychology, ethical approval number: EPKEB-2018/87).

Stimuli

Throughout the conditions, stimuli varied by modality (auditory versus visual) and domain (nonlinguistic versus linguistic). For all conditions, our aim was to design diverse stimulus sets in which individual stimuli are easily discriminable from each other. For the auditory nonlinguistic conditions, we divided a frequency range that was conveniently perceivable through our laboratory headphones (220–831 Hz) into 15 equal sections following steps of the musical scale to obtain 16 tones. As a result, we obtained intervals larger than standard semitones, and almost as large as standard whole tones (220 Hz, 240 Hz, 263 Hz, 287 Hz, 314 Hz, 343 Hz, 374 Hz, 409 Hz, 447 Hz, 488 Hz, 534 Hz, 583 Hz, 637 Hz, 696 Hz, 761 Hz, 831 Hz). Each tone was 470 ms long. For the auditory linguistic conditions, we used Hungarian CVC nonwords compiled from diverse Hungarian phonemes to promote discriminability (bif, dők, dup, gal, hep, kav, lam, lor, mib, neb, péf, rász, rud, szig, tez, sot). Note that some of the nonwords are four characters long, as they include phonemes with a digraph (two-character grapheme) equivalent (‘sz’). Nonwords were recorded from a Hungarian female speaker, and the average length of syllables was 470 ms. In the visual nonlinguistic conditions, 16 meaningless symbols were used that were rich in detail and easily distinguishable from each other. In the visual linguistic conditions, the same syllables were used as in the auditory linguistic conditions. Syllables were presented visually on a white screen in black font. Individual items in each stimulus type were assigned to categories (as illustrated in Fig. 2), and the rules of the artificial grammar were defined over these categories.
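The 16 tone frequencies correspond to dividing the 220–831 Hz range into 15 equal steps on a logarithmic (musical) scale, i.e., keeping a constant frequency ratio between neighboring tones. A minimal sketch reproducing the published values (our reconstruction, not the authors' code):

```python
# 15 equal log-scale steps between 220 Hz and 831 Hz give 16 tones with a
# constant ratio between neighbors (~1.093 per step: wider than a semitone,
# ~1.0595, and almost as wide as a whole tone, ~1.1225).
low, high, steps = 220.0, 831.0, 15
ratio = (high / low) ** (1 / steps)
tones = [round(low * ratio ** i) for i in range(steps + 1)]
print(tones)
# -> [220, 240, 263, 287, 314, 343, 374, 409, 447, 488, 534, 583, 637, 696, 761, 831]
```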

Figure 2. Stimulus sets in different conditions by modality and domain. Stimuli in each condition are classified into categories (A, C, D, F, and G). The rules of the grammar are defined on these categories.

With the help of the grammar (given in Fig. 3, taken from Saffran 31) and the condition-specific categorized vocabularies, we generated 58 grammatical sentences of three to five items and 32 phrases of two to three items for the learning phases (90 sequences altogether). Phrases were parts of grammatical sentences. We also generated 24 pairs of grammatical and ungrammatical sequences (9 four-item and 15 five-item sequences) for the test phases in all conditions. Grammatical sentences followed the rules of the grammar, while the ungrammatical ones included a violation of one of the grammatical rules: (1) sentences must contain an AP phrase, (2) D words follow A words, while G words follow C words, (3) sentences must contain an F word, (4) CP phrases must precede F words. As a result, there were four violation types, one for each rule: (1) sentences starting with a BP phrase instead of an AP phrase, (2) sentences where D and G words were interchanged, so that G words followed A words and D words followed C words, (3) sentences where F words were exchanged for G words, (4) sentences where CP phrases or parts of the CP phrases were missing before F words. Each violation type was represented by six ungrammatical strings. Items were randomly distributed across sentences within each category. The full set of training and test sequences, together with their statistical properties for the linguistic conditions, is included as supplementary material online. Sequences of the nonlinguistic conditions were parallel to those of the linguistic conditions, that is, each syllable corresponded to a pure tone and a symbol, respectively. The modality and domain variants of conditions differed only in their stimulus set.

Figure 3. Rules of the artificial grammar (from Saffran 31). Letters A, C, D, F and G refer to categories, which each include a set of items (tones, syllables, or symbols in the different conditions). Items in each category were randomly distributed in sentences. A sentence consists of an “AP” phrase, a “BP” phrase, and an optional “CP” phrase. An “AP” phrase is made of an “A” category item and an optional “D” category item. A “CP” phrase consists of a “C” and an optional “G” item. A “BP” phrase is made of a “CP” phrase and an optional “F” item.
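Based on these phrase-structure rules, a sentence generator can be sketched as follows. This is a minimal illustration, not the authors' implementation: the vocabularies are placeholders, the 0.5 probabilities for optional constituents are our assumption, and we treat the F item as obligatory, following rule (3) above ("sentences must contain an F word"); sentences are filtered to the three-to-five-item lengths used in the study.

```python
import random

# Placeholder vocabularies; in the experiment, categories A, C, D, F, G each
# contained condition-specific syllables, tones, or symbols (see Fig. 2).
VOCAB = {cat: [f"{cat}{i}" for i in (1, 2, 3)] for cat in "ACDFG"}

def pick(cat):
    return random.choice(VOCAB[cat])

def ap():   # AP -> A (+ D)
    return [pick("A")] + ([pick("D")] if random.random() < 0.5 else [])

def cp():   # CP -> C (+ G)
    return [pick("C")] + ([pick("G")] if random.random() < 0.5 else [])

def bp():   # BP -> CP + F (F treated as obligatory per rule 3)
    return cp() + [pick("F")]

def sentence(min_len=3, max_len=5):   # S -> AP + BP (+ CP)
    while True:
        s = ap() + bp() + (cp() if random.random() < 0.5 else [])
        if min_len <= len(s) <= max_len:
            return s

print(" ".join(sentence()))   # e.g. "A2 C1 G3 F2"
```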

Procedure

Participants were tested in a silent room in groups of two or three. Testing was administered using E-Prime 2.0 Professional. A session took approximately 15 min and consisted of a training phase and a test phase in all conditions.

In the auditory conditions, items were presented with no pauses between them. In the visual conditions, we applied two presentation types: in the serial conditions, one item was presented at a time at the center of the screen for 800 ms, followed by the next item with no pause, while in the simultaneous conditions, all items of a sentence were presented together on the screen at the same time. (Pilot data from our lab on a simpler segmentation task showed no learning effect in visual statistical learning when stimulus timing was matched to that of acoustic statistical learning and set to 470 ms. This was one of the reasons for using a longer presentation time: we wanted to avoid floor effects in a more complex task in the visual modality. Choosing longer presentation times was also motivated by earlier studies showing that longer presentation times in visual statistical learning indeed promote learning 69, 76, 77. Since visual presentation rates vary between 400 and 1200 ms in the literature, we decided on a mid-range 800 ms, which was considerably longer than what we used in our pilot studies.) Presentation time was adjusted to sequence length (the number of items times 800 ms). During the training phase, participants were instructed simply to attend to the presented sequences.

In all combinations of modality, presentation and domain, we examined the effects of three different training types. All conditions presented the same set of sequences; small and big were not defined in absolute terms, but refer to the relative length of sentences within the same training set. In the random conditions, sequences of different lengths were presented in a random order; the starting small conditions involved incremental presentation of sentences ordered by length, starting with the shortest sequences; the starting big conditions were the reverse of the starting small conditions, starting with the longest sequences and gradually proceeding towards the shortest ones. It is important to point out that the shortest strings were not full sentences of the grammar, but structural units (phrases) of the language.
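The three training arrangements differ only in the ordering of the same 90 training sequences. A minimal sketch (our illustration; `sequences` is assumed to be a list of item lists):

```python
import random

def order_training(sequences, training_type):
    # Shuffle first so that equal-length sequences remain in random order
    # after the stable sort below.
    seqs = random.sample(sequences, len(sequences))
    if training_type == "random":
        return seqs
    seqs.sort(key=len)                 # starting small: shortest sequences first
    if training_type == "starting_small":
        return seqs
    if training_type == "starting_big":
        return seqs[::-1]              # starting big: longest sequences first
    raise ValueError(f"unknown training type: {training_type}")
```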

In the two-alternative forced choice (2AFC) test phase, participants were told that the sequences presented before were in an unknown language; in each of the 24 trials, they were then presented with a pair consisting of a grammatical sentence and a sentence containing a violation. The grammatical-ungrammatical order within the sequence pair was counterbalanced across trials. The order of the trials was random, but sentence pairs were preset. Participants were instructed to choose the sentence that was more similar to the sentences of the unknown language in the training phase and to indicate it by pressing the corresponding key (‘1’ for the first sentence, ‘2’ for the second). The two sentences followed each other with 2000 ms pauses. Higher-than-chance scores (choosing the grammatical member of the pair significantly more than 50% of the time) were taken as evidence of learning.
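Learning in a group was thus established by testing mean 2AFC accuracy against the 0.5 chance level with a one-sample t-test (the test reported in Table 1 below). A sketch with made-up scores for one hypothetical group of 20 participants:

```python
import numpy as np
from scipy import stats

# Illustrative data: number of correct choices out of 24 test trials
# for 20 hypothetical participants, converted to proportions.
correct = np.array([14, 16, 13, 17, 15, 12, 18, 16, 14, 15,
                    17, 13, 16, 15, 14, 18, 12, 16, 15, 17])
acc = correct / 24

t, p = stats.ttest_1samp(acc, popmean=0.5)   # two-sided test against chance
print(f"mean = {acc.mean():.2f}, t({len(acc) - 1}) = {t:.2f}, p = {p:.4f}")
```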

Results

Data were analyzed and visualized using IBM SPSS Statistics 20, JASP version 0.15.0.0 78, and the R package ggplot2, version 3.3.5 44. Descriptive statistics of accuracies in the 2AFC task are displayed in Table 1.

Table 1. Descriptive statistics of groups in different modality, presentation, domain and training conditions.

Modality | Presentation type | Domain | Training type | Mean accuracy (SD) | Age in years (SD) | Females/males
--- | --- | --- | --- | --- | --- | ---
Auditory | Serial | Nonlinguistic | Random | 0.58 (0.11)** | 21.16 (2.24) | 13/7
Auditory | Serial | Nonlinguistic | Starting small | 0.58 (0.09)*** | 20.66 (1.37) | 13/7
Auditory | Serial | Nonlinguistic | Starting big | 0.56 (0.11)* | 20.42 (1.47) | 19/1
Auditory | Serial | Linguistic | Random | 0.68 (0.10)*** | 22.13 (2.30) | 15/5
Auditory | Serial | Linguistic | Starting small | 0.75 (0.13)*** | 21.92 (1.90) | 17/3
Auditory | Serial | Linguistic | Starting big | 0.71 (0.12)*** | 21.91 (2.92) | 15/5
Visual | Serial | Nonlinguistic | Random | 0.59 (0.13)** | 26.45 (7.35) | 12/8
Visual | Serial | Nonlinguistic | Starting small | 0.53 (0.13) | 23.69 (2.45) | 12/8
Visual | Serial | Nonlinguistic | Starting big | 0.65 (0.16)*** | 21.68 (2.53) | 16/4
Visual | Serial | Linguistic | Random | 0.58 (0.13)* | 24.82 (7.74) | 12/8
Visual | Serial | Linguistic | Starting small | 0.51 (0.12) | 24.46 (2.21) | 13/7
Visual | Serial | Linguistic | Starting big | 0.59 (0.17)* | 22.38 (1.91) | 15/5
Visual | Simultaneous | Nonlinguistic | Random | 0.68 (0.16)*** | 20.91 (1.19) | 13/7
Visual | Simultaneous | Nonlinguistic | Starting small | 0.70 (0.11)*** | 21.28 (0.95) | 16/4
Visual | Simultaneous | Nonlinguistic | Starting big | 0.62 (0.14)*** | 20.59 (1.07) | 18/2
Visual | Simultaneous | Linguistic | Random | 0.66 (0.13)*** | 20.81 (2.19) | 16/4
Visual | Simultaneous | Linguistic | Starting small | 0.65 (0.13)*** | 22.96 (3.95) | 11/9
Visual | Simultaneous | Linguistic | Starting big | 0.62 (0.14)** | 26.04 (5.90) | 9/11

Descriptive statistics of 2AFC accuracy for groups in the different Modality, Presentation Type, Domain and Training Type conditions. Differences from chance level (0.5) were calculated with one-sample t-tests; *: p < 0.05, **: p < 0.01, ***: p < 0.001. Mean age and the number of females and males are also displayed for each group.

Post-hoc calculations of power estimates are included in Appendix 1. Detailed analyses of performance as a function of statistical regularities and violation types, as well as of trial-level performance, were also conducted. These analyses showed that (1) participants performed above chance level with all violation types, (2) there were significant differences between performance on different violation types, (3) higher accuracy on a violation type tended to co-occur with higher discriminability between the grammatical and the ungrammatical sequence based on statistical features, and (4) statistical feature differences between grammatical and ungrammatical test items influenced accuracy on the word level, but not on the category level. These additional analyses are included in the supplementary materials.

First, we analyzed results for all conditions with serial presentation of stimuli. We conducted a three-way ANOVA to test the effects of Modality, Domain and Training Type. The effect of Modality was significant, F(1,228) = 19.12, p < 0.001, ηp² = 0.08, BF₁₀ = 5.557e+6: performance in the auditory modality was higher than in the visual modality. The effect of Domain was also significant, F(1,228) = 11.83, p = 0.001, ηp² = 0.05, BF₁₀ = 186,759.88, showing that participants performed better in the linguistic than in the nonlinguistic conditions. The effect of Training Type was not significant, F(2,228) = 1.51, p = 0.223, ηp² = 0.01, BF₁₀ = 0.914. The Modality*Domain interaction was significant, F(1,228) = 26.83, p < 0.001, ηp² = 0.11, BF₁₀ = 54,417.28 (Fig. 4). Post hoc analyses showed that in the auditory modality, performance in the linguistic domain was significantly higher than in the nonlinguistic domain, t(118) = −6.95, p < 0.001, r = 0.54, BF₁₀ = 3.761e+7. In the visual modality, the difference between the two domains was not significant, t(118) = 1.08, p = 0.284, r = 0.10, BF₁₀ = 0.33. In the nonlinguistic domain, Modality did not affect performance, t(118) = 0.57, p = 0.570, r = 0.05, BF₁₀ = 0.23, while in the linguistic domain, the efficiency of auditory and visual learning differed significantly, with higher scores in the auditory linguistic than in the visual linguistic condition, t(118) = 6.52, p < 0.001, r = 0.51, BF₁₀ = 4.910e+6.
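In outline, this three-way between-subjects ANOVA can be reproduced as follows (a sketch assuming a long-format data frame with one accuracy score per participant; the file and column names are ours, not from the study):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical file: one row per participant in the serial conditions,
# with columns 'accuracy', 'modality', 'domain', 'training'.
df = pd.read_csv("serial_conditions.csv")

model = ols("accuracy ~ C(modality) * C(domain) * C(training)", data=df).fit()
table = sm.stats.anova_lm(model, typ=2)    # Type II sums of squares
# Partial eta squared per effect: SS_effect / (SS_effect + SS_residual)
# (the value computed for the Residual row itself is not meaningful).
table["eta_p2"] = table["sum_sq"] / (table["sum_sq"] + table.loc["Residual", "sum_sq"])
print(table)
```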

Figure 4. 2AFC accuracy by Modality and Domain with serial presentation. Dots represent the mean accuracies of individual participants in a given group (a small amount of jitter was added to increase visibility). Lines in boxes represent group medians, box lengths illustrate the group interquartile range, and whiskers show minimum and maximum values. Outliers are participant data outside 1.5 times the interquartile range beyond the first and third quartiles. *: p < 0.05, **: p < 0.01, ***: p < 0.001. Performance in the auditory linguistic condition was significantly better than performance in the auditory nonlinguistic condition and in the visual linguistic condition (regardless of Training Type).

The Modality*Training Type interaction was also significant, F(2,228) = 6.06, p = 0.003, ηp² = 0.05, BF₁₀ = 4.59 (Fig. 5). Post hoc tests showed that in the auditory modality, Training Type had no significant effect, F(2,117) = 0.99, p = 0.374, ηp² = 0.02, BF₁₀ = 0.18. However, in the visual modality, the effect of Training Type was significant, F(2,117) = 5.32, p = 0.006, ηp² = 0.08, BF₁₀ = 6.07. Tukey pairwise comparisons showed that only the starting small and starting big groups differed (with higher performance in the starting big than in the starting small condition), p = 0.005, BF₁₀ = 13.47; the random and starting small, and the random and starting big groups did not differ from each other, p = 0.110, BF₁₀ = 1.96 and p = 0.459, BF₁₀ = 0.41, respectively. For the latter two comparisons, the Bayes factors did not provide evidence for equal performance across training types. Further analyses showed that learning in the auditory modality was more efficient than in the visual modality in the starting small condition, t(78) = 5.06, p < 0.001, r = 0.50, BF₁₀ = 5497.80. The two modalities did not differ in the random and the starting big groups, t(78) = 1.74, p = 0.085, r = 0.19, BF₁₀ = 0.86, and t(78) = 0.50, p = 0.621, r = 0.06, BF₁₀ = 0.26, respectively; however, for the former comparison, the Bayesian analysis did not show evidence for equal performance in the two modalities. No other interactions were significant.

Figure 5. 2AFC accuracy in the Modality*Training Type interaction in the case of serial presentation. Dots represent the mean accuracies of individual participants in a given group (a small amount of jitter was added to increase visibility). Lines in boxes represent group medians, box lengths illustrate the group interquartile range, and whiskers show minimum and maximum values. Outliers are participant data outside 1.5 times the interquartile range beyond the first and third quartiles. *: p < 0.05, **: p < 0.01, ***: p < 0.001. Performance was significantly lower in the visual starting small condition than in the auditory starting small condition and in the visual starting big condition (regardless of Domain).

We performed a second three-way ANOVA to test the effects of Presentation Type, Domain and Training Type on accuracy in the visual conditions. The effect of Presentation Type was significant, F(1,228) = 20.38, p < 0.001, ηp² = 0.08, BF₁₀ = 1092.94; performance was higher with simultaneous than with serial presentation. The main effects of Domain and Training Type were not significant, F(1,228) = 2.11, p = 0.148, ηp² = 0.01, BF₁₀ = 0.18, and F(2,228) = 0.10, p = 0.371, ηp² = 0.01, BF₁₀ = 0.56, respectively. The Presentation Type*Training Type interaction was significant, F(2,228) = 6.10, p = 0.003, ηp² = 0.05, BF₁₀ = 2.86 (Fig. 6). Post hoc analyses showed that in the case of serial presentation, the effect of Training Type was significant, F(2,117) = 5.32, p = 0.006, ηp² = 0.08, BF₁₀ = 6.07; Tukey pairwise comparisons showed that starting small training was less efficient than starting big training, p = 0.005, BF₁₀ = 13.47, but learning with random and starting small, and with random and starting big training did not differ, p = 0.110, BF₁₀ = 1.96 and p = 0.459, BF₁₀ = 0.41, respectively. (Note that this analysis is the same as the post hoc analysis of Training Type in the visual modality in the previous ANOVA.) The effect of Training Type was not significant in the case of simultaneous presentation, F(2,117) = 1.75, p = 0.178, ηp² = 0.03, BF₁₀ = 0.33. When analyzing the effect of Presentation Type within each Training Type, we found that with random and starting small training, simultaneous presentation resulted in higher performance than serial presentation, t(78) = −2.77, p = 0.007, r = 0.30, BF₁₀ = 5.99, and t(78) = −5.58, p < 0.001, r = 0.53, BF₁₀ = 37,631.56, respectively. With starting big training, there was no difference between presentation types, t(78) = −0.06, p = 0.951, r = 0.01, BF₁₀ = 0.23. No other interactions were significant.

Figure 6. 2AFC accuracy in the Presentation*Training Type interaction in the case of visual stimuli. Dots represent the mean accuracies of individual participants in a given group (a small amount of jitter was added to increase visibility). Lines in boxes represent group medians, box lengths illustrate the group interquartile range, and whiskers show minimum and maximum values. Outliers are participant data outside 1.5 times the interquartile range beyond the first and third quartiles. *: p < 0.05, **: p < 0.01, ***: p < 0.001. Serial starting small performance was lower than serial starting big and simultaneous starting small performance; and simultaneous random performance was higher than serial random performance (regardless of Domain).

We performed a third three-way ANOVA to test the effects of Modality, Domain, and Training Type with the optimal presentation type for each modality, i.e., serial presentation for auditory stimuli and simultaneous presentation for visual stimuli. With presentation type fitted to modality, the effect of Modality was not significant, F(1,228) = 0.34, p = 0.560, ηp² < 0.01, BF₁₀ = 1335.81. On the other hand, the effect of Domain was significant, F(1,228) = 13.34, p < 0.001, ηp² = 0.06, BF₁₀ = 46,289.46, showing that participants performed better in the linguistic than in the nonlinguistic conditions. The effect of Training Type was not significant, F(2,228) = 2.30, p = 0.102, ηp² = 0.02, BF₁₀ = 0.17. The Modality*Domain interaction was significant, F(1,228) = 26.24, p < 0.001, ηp² = 0.10, BF₁₀ = 6922.99; this interaction is illustrated in Fig. 7. Post hoc analyses showed that in the auditory modality, performance in the nonlinguistic domain was significantly lower than in the linguistic domain, t(118) = 6.95, p < 0.001, r = 0.49, BF₁₀ = 3.761e+7. In the visual modality, the difference between the two domains was not significant, t(118) = 0.95, p = 0.346, r = 0.09, BF₁₀ = 0.291. In the nonlinguistic domain, performance in the visual modality was higher than in the auditory modality, t(118) = 4.03, p < 0.001, r = 0.35, BF₁₀ = 216.78, while in the linguistic domain we observed the opposite pattern, with significantly higher performance in the auditory than in the visual modality, t(118) = 3.21, p = 0.002, r = 0.28, BF₁₀ = 17.97. No other interactions were significant.

Figure 7. 2AFC accuracy by Modality and Domain with the optimal presentation type for each modality (serial presentation for auditory stimuli, simultaneous presentation for visual stimuli). Dots represent the mean accuracies of individual participants in a given group (a small amount of jitter was added to increase visibility). Lines in boxes represent group medians, box lengths illustrate the group interquartile range, and whiskers show minimum and maximum values. Outliers are participant data outside 1.5 times the interquartile range beyond the first and third quartiles. *: p < 0.05, **: p < 0.01, ***: p < 0.001. Performance in the auditory linguistic condition was higher than in the auditory nonlinguistic condition and the visual linguistic condition; and performance in the visual nonlinguistic condition was superior to the auditory nonlinguistic condition (regardless of Training Type).

As pointed out by one of the reviewers of the original manuscript, in the auditory nonlinguistic conditions, the use of musical tones may give rise to musical features like contours (the ascending and descending pattern between tones) and intervals (the relative pitch change between tones) 75. Taking this into account, a possible explanation for lower performance in the auditory nonlinguistic condition is that the statistical patterns of these emergent musical features might conflict with the statistical information imposed by the grammar alone. To check this possibility, we compared the main effect of Domain and the Modality*Domain interaction in test trials where grammatical and musical patterns converged (based on the statistics of the learning phase) with test trials where grammatical and musical statistics diverged in supporting a choice between the grammatical and the ungrammatical item. We found that the main effect of Domain and the Modality*Domain interaction were more prominent in test trials where emergent musical patterns and grammatical patterns did not converge. A thorough description of the recoding process and the analysis is provided in Appendix 2.
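Contour and interval features of the tone sequences can be derived directly from the frequency values; a minimal sketch of such a recoding (our illustration of the general idea, not the authors' exact procedure from Appendix 2):

```python
import math

def contour(freqs):
    """Ascending/descending pattern between successive tones."""
    return ["up" if b > a else "down" if b < a else "same"
            for a, b in zip(freqs, freqs[1:])]

def intervals(freqs):
    """Relative pitch change between successive tones, in semitones."""
    return [round(12 * math.log2(b / a), 2) for a, b in zip(freqs, freqs[1:])]

seq = [220, 263, 240, 314]     # an illustrative four-tone sequence
print(contour(seq))            # ['up', 'down', 'up']
print(intervals(seq))          # [3.09, -1.58, 4.66]
```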

Since age was not entirely balanced between groups, we checked whether it affected learning, to control for potential biases. Age had a very weak but significant negative relationship with performance, r(356) = −0.11, p = 0.039; however, the Bayesian analysis did not show evidence for this relationship, BF₁₀ = 0.64. To examine potential effects on the results of the ANOVA analyses, we performed three further ANCOVAs with Age as a covariate. When testing the effects of Modality, Domain and Training Type in the case of serial presentation, Age had no significant effect on performance, F(1,226) = 0.07, p = 0.790, ηp² < 0.01, BF₁₀ = 0.17. Similarly, when testing the effects of Presentation Type, Domain and Training Type in the case of visual stimuli, Age had no significant effect, F(1,226) = 0.17, p = 0.677, ηp² = 0.01, BF₁₀ = 0.16. When testing the effects of Modality, Domain, and Training Type in the case of the optimal presentation types, Age was a significant covariate, F(1,225) = 6.02, p = 0.015, ηp² = 0.03, BF₁₀ = 8.01, but including it as a covariate did not change the pattern of findings: the effects of Domain and Modality*Domain remained significant.

Summary and discussion

Statistical learning operates across many different areas of cognition and on different stimuli, but the effects of modality, presentation, domain and training type, together with their interactions, have not been examined systematically in the statistical learning literature. To fill this gap, we investigated the effects of these factors in an artificial grammar task. When stimuli were presented serially, learning was more effective in the auditory than in the visual modality. This modality effect was particularly pronounced in the linguistic domain. With simultaneous presentation of visual stimuli, the auditory advantage over the visual modality disappeared. A significant domain effect showed that learning linguistic patterns results in higher performance than learning nonlinguistic patterns. However, the linguistic advantage over the nonlinguistic material was only present in the auditory modality. The auditory linguistic condition had an overall advantage over the other modality-domain types. Training type did not have any general effect on the acquisition of the grammar, but starting big enhanced performance in the case of serial visual presentation relative to starting small training, and starting small training with serial visual materials resulted in lower performance than starting small training with simultaneous visual materials. The results and their implications are discussed in more detail and in the context of earlier findings in the following sections.

Effects of modality and presentation

We expected an auditory advantage in statistical learning with serial presentation of stimuli. This assumption was supported by our results: the grammar was easier to learn in the auditory than in the visual modality, in line with previous results by Conway and Christiansen 3, who found the same pattern with a simpler grammar. These observations suggest that regardless of the complexity and structure of the pattern to be learned, when stimuli are presented serially, learning is more effective in the auditory than in the visual modality. Such modality effects in statistical learning tasks might reflect differences in general information processing mechanisms across sensory modalities. Supporting this notion, Conway and Christiansen 30 demonstrated that the well-known primacy and recency effects in serial recall 45, 46 are also present in statistical learning. Moreover, the advantage of auditory over visual presentation, demonstrated in previous studies and in the current one, has also been described outside the field of statistical learning, for instance, in memory for the order of words in a word list recall task 47.

However, with presentation type optimized, i.e., when items of the sequence are presented serially in the auditory and simultaneously in the visual modality, the auditory advantage disappeared and learning was equally efficient in the two modalities, in concert with previous results of Conway and Christiansen 30. The findings of Saffran 31 also provide indirect support for this claim; however, she only found an advantage of simultaneous over serial presentation for visual stimuli with the same predictive grammar we used (but not with the nonpredictive grammar). As she discusses, it is unclear whether this pattern was due to an advantage of visual simultaneous learning for the predictive grammar or to a disadvantage for the non-predictive grammar. Taken together, (1) our results support the advantage of auditory over visual statistical learning with serial presentation; (2) simultaneous presentation seems to benefit visual statistical learning of sequences over serial presentation; and (3) when presentation is optimized for modality, there is no difference between modalities in learning efficiency.

The advantage of simultaneous over serial visual presentation raises the possibility that modality effects might be specific to, or at least interact with, structure type. In statistical learning, modality effects are generally investigated with sequential structures (with some exceptions 48). However, while auditory perception and processing seem suited to temporal information, which is inherently sequential, vision is better suited to processing spatial than temporal information, and spatial information can be both sequential and nonsequential (as concluded by Freides 49 and Conway and Christiansen 3, but see also other studies 48, 50–52). Testing modality effects can be challenging with nonsequential structures, although not impossible 48, due to the sequential organization of most types of auditory information. As this modality effect might be limited to sequential processing, further studies should target nonsequential structures to broaden our knowledge about modality effects in statistical learning and other domains of cognition. To conclude, the present study (1) confirms modality effects observed in earlier studies and extends them to predictive dependencies and a category-based grammar, and (2) shows that these modality effects can be structure-dependent.

Domain effects

Based on previous findings 31, we expected no advantage of learning the grammar with linguistic over nonlinguistic stimuli (although see 70 for results showing a linguistic advantage with a different design). This assumption was only partially supported by our findings. In the case of serial presentation, performance was higher in the linguistic than in the nonlinguistic conditions. We observed a similar domain effect in the analysis including serial auditory and simultaneous visual learning (i.e., the optimal presentation for each modality). One possible explanation for a linguistic advantage is that the grammar was explicitly created to mimic predictive dependencies and word categories common in human languages. In the original design, Saffran 31 argued that learning constraints should be tailored to the stimuli for effective learning; thus, different constraints might be advantageous for learning linguistic and nonlinguistic stimuli, as different chunking and grouping mechanisms might operate in these domains (e.g., different constraints for linguistic versus musical structures in the auditory modality, and for symbol sequences versus complex real-life visual scenes in the visual domain). This type of structure, with predictive dependencies and word categories characteristic of language, might therefore be optimal for learning linguistic materials. A further potential explanation of the linguistic advantage is that participants, although not instructed to do so, might also apply explicit memorization strategies to linguistic materials (e.g., rehearsal of sequences) that are less available for other types of stimuli.

However, the presence of the domain effect only in the auditory conditions draws attention to the potential influence of stimulus-specific factors beyond general effects in statistical learning. In the auditory nonlinguistic conditions, the use of musical tones may give rise to musical features like contours (the ascending and descending pattern between tones) and intervals (the relative pitch change between tones) 75, which might support or conflict with grammatical information. Indeed, the linguistic advantage observed in the auditory modality was challenged by further analyses suggesting that lower performance in the nonlinguistic condition might have been caused by conflicting grammatical and musical patterns. Therefore, also in line with the absence of a linguistic advantage in the visual modality, our results do not support general domain effects in statistical learning: the efficiency of learning may depend on more stimulus-specific features. Stimulus- and task-specific learning effects are not surprising, since statistical information is not the only cue to finding structure in environmental stimuli. In cases of contradicting cues, other sources of information may override it (see e.g. prosody over statistics: Johnson and Seidl 73; familiar units over statistical cues: Poulin and colleagues 74), although in other cases, learners may rely on statistical features over other information types (statistical cues over similarity: Tillmann and McAdams 79).

To summarize, we found an advantage of statistical learning in the auditory linguistic condition compared to all other conditions, including visual linguistic learning. In addition, performance in the auditory nonlinguistic condition was weaker than in the other conditions. These results show that the effectiveness of statistical learning may be influenced by the domain of learning (e.g. linguistic versus nonlinguistic). However, in our study this domain effect was confounded with other emergent patterns in the stimuli: musical patterns (contours and intervals) in tone sequences were in conflict with statistical patterns defined by the grammar, making learning in the auditory nonlinguistic conditions more difficult than learning sequences of syllables. Further studies are needed to clarify the nature of such effects, to control for them, and to examine their interaction with domain and modality. These results suggest that instead of global domain effects, stimulus-specific effects shape statistical learning, and these may also depend on task type, design and features of the learning material.

Training effects

To examine the influence of input characteristics on statistical learning, we also explored training effects across different modalities and domains. We hypothesized that starting small would facilitate the acquisition of the category-based grammar by enabling the generation of simple and flexible hypotheses about the underlying rules. In contrast, we expected starting big to yield lower learning performance due to less effective hypothesis generation. However, we only found an effect of training with serial presentation in the visual modality: here, regardless of stimulus domain (i.e. both in the linguistic and the nonlinguistic conditions), starting small training had an adverse effect on performance, while starting big training facilitated learning. This pattern of results suggests that the manner of stimulus presentation can affect statistical learning in important and perhaps modality- and domain-dependent ways. The visual processing system seems to be optimized for spatial rather than temporal processing 49, and starting big presentation might compensate for the limited availability of information under serial presentation.

The above pattern of results contrasts with earlier findings on the starting small effect in visual statistical learning, which showed enhanced acquisition of structure with starting small training and simultaneous presentation in the visual modality, both in the linguistic 53 and the nonlinguistic domain 35. These contradictory findings may be explained by differences in the grammars: previous studies applied recursive grammars in which the structure was based on the non-adjacent combination of item pairs. Thus, the initial acquisition of adjacent pairs of these legal combinations is essential, and it becomes increasingly difficult when the pairs are embedded in longer sequences: the complexity (the number of different sequences the grammar can generate) of recursive grammar sentences increases exponentially as a function of length 53. Starting small training targets this problem by presenting only the two adjacent items of a pair at the beginning. However, the grammar that we used is different in structure: here, complexity does not increase with sentence length as much as in the case of recursive grammars. Poletiek and colleagues 53 argued that the key to the starting small effect is the presentation of less complex, and not necessarily shorter, sequences. As a result, the acquisition of this type of grammar may not profit as much from starting small training. However, the statistical properties of ‘small’ phrases were not controlled for, and post-hoc analyses of these regularities do not show systematic differences. Shorter sequences with less complex statistical regularities than the longer ones might yield larger benefits in starting small: this would be a design worth implementing in a future study.

A further reason for the absence of the starting small effect might be that shorter sequences induce explicit rule search strategies, which decrease the efficiency of learning complex statistical patterns 71, 72. It is also possible that the starting small conditions of our study did not provide sufficient information at the beginning of training for beneficial effects to emerge. Given the variability of items within phrases, the training might have been too short for participants to acquire these basic units; longer training with ‘small’ phrases might have resulted in stronger or more explicit representations, which might then have served better as building blocks in later parts of the training with more complex material. To summarize, as training effects might depend significantly on grammar or structure type, further studies are needed to determine their scope. A larger sample size would also benefit further exploration of training effects, as post-hoc comparisons in the Modality*Training Type interaction were not sufficiently powered to show unequivocally either the presence or the absence of a difference.

Considerations about pattern and stimuli characteristics

Statistical learning is an umbrella term covering the acquisition of several types of patterns and systems, for instance, segmenting words from a speech stream 9, 11, 70, learning regularities in real-world scenes 27 or spatial locations 13, 14, acquiring visual patterns and faces 6, 7, 28, 54, or learning musical systems 87, 88. Even in the case of learning sequential information in artificial grammar learning tasks, the structure to be acquired is highly variable: phrasal 31, finite-state 55, center-embedded or right-branching recursive 35, 38, and non-adjacent dependency grammars 56 are all applied. The literature on the effects of modality 3, presentation 30, 31, domain 31, 70 and training 25–32, 32–39, 41, 42, and more broadly, input properties e.g., 79, 83–86, also relies on results from statistical learning studies working with a large variety of structure types. Although we used a category-based artificial grammar consisting of predictive dependencies in the present study, we aimed to explore domain, modality and training effects on statistical learning in general. Our results extend and confirm previous findings from different tasks and stimulus sets on modality, domain, presentation and training effects in statistical learning. At the same time, contradictory findings from tasks with different statistical structure types (e.g., while Saffran 31 found no linguistic advantage in an artificial grammar task, Hoch, Tyler and Tillmann 70 found that learning linguistic materials was more successful than learning nonlinguistic materials in a segmentation paradigm) draw attention to a possible interaction between input characteristics and structure type, which should be addressed by future studies.

Besides structure type, stimulus type is also an underexamined, yet significant factor in statistical learning 57. Linguistic stimuli can take many forms in different modalities, and different constraints may apply when learning from speech streams versus written texts versus gesture sequences. In the nonlinguistic domain, various types of musical and environmental sounds can be used as auditory stimuli, while for the visual modality, applied stimuli range from colorful squares through spatial locations to complex symbols, all potentially organized by different statistical constraints. The constraints for optimal acquisition might be specific not only to modality and/or domain, but to stimulus type as well. Previous results also suggest that learning efficiency for different stimulus types interacts with age: Raviv and Arnon 67 and Shufaniya and Arnon 68 found different developmental trajectories during childhood for different stimulus types in statistical learning. Further studies should explore such specificity in statistical learning: investigating modality, domain and training effects with a diverse set of structure and stimulus types at different ages is an important future direction.

Methodological and psychometric limitations

One of the limitations of our study is a general methodological problem that many statistical learning studies face: we only measured learning offline, that is, after the learning phase. This post hoc measurement is problematic in multiple respects. First, we cannot gain information about the process and dynamics of learning. Second, as a consequence, we measure knowledge only at retrieval, which is a different process from encoding (for a discussion of implications for statistical learning, see 58). This is especially important when modality- and domain-specific effects are in focus, as their encoding and retrieval processes might differ 59, 60. Third, the typically applied offline forced-choice tests recruit cognitive abilities distinct from statistical learning, for instance, decision-making and working memory processes 61, 62. Individual variation in these abilities might also make the measurement of statistical learning noisy and unreliable. A potential solution to these pitfalls is to rely on online measurements: for instance, measuring reaction times to one or more predictable items during training allows us to infer changes in the efficiency of processing and predicting items in the pattern, which can then be applied as a measure of statistical learning 13, 14, 62–65.

There are also psychometric aspects to be considered in future testing. Offline forced choice tasks often apply a relatively low number of trials. However, in a task type where group performance is typically only slightly above chance level, above-chance performance is difficult to distinguish from chance-level performance at the individual level 66. In our case, with a mean score of 0.62 on the 24 trials of the two-alternative forced choice task, there is an 8% chance that an individual performed above this level merely by accident, based on the binomial distribution. This is even more likely in conditions where mean performance was lower. (However, increasing the number of test trials, and thus participants’ exposure to ungrammatical sequences, may weaken or alter acquired statistical representations. This effect could be minimized by including ungrammatical trials without any systematic statistical biases, or controlled for by applying statistical methods that include trial order as a random factor.) Including trials of systematically varying difficulty would also make for a better-targeted method, as participants with different levels of knowledge could be tested more accurately. Thus, increasing the number and variability of trials would make results less noisy and more reliable, resulting in a better statistical learning task.
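The 8% figure follows from the binomial tail at chance level: with 24 trials and a guessing probability of 0.5, scoring strictly above the reported mean of 0.62 (i.e., at least 16 of 24 correct) has a probability of about 0.08. A quick check (our reconstruction; the exact threshold convention is our assumption):

```python
from scipy.stats import binom

n = 24                           # number of 2AFC test trials
# P(X >= 16) under guessing; sf(k) gives P(X > k), so sf(15) = P(X >= 16).
p_tail = binom.sf(15, n, 0.5)
print(f"{p_tail:.3f}")           # -> 0.076, i.e. roughly 8%
```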

Finally, it is also a limitation to be addressed by future studies that we did not collect any information on participants’ musical training background. In the serial nonlinguistic condition, tone sequences created short melodies, which participants with musical training might have found easier to process. Since more general beneficial effects of musical training have been reported for memory and learning 80–82, controlling for effects of musical training on performance would be relevant not just for the statistical learning of tone sequences, but for other modalities and domains as well.

Conclusions

The present study demonstrates that the efficiency of acquiring statistical structure can differ considerably depending on the specific modality, domain, and presentation type. Most importantly, our findings show an advantage for the sequential learning of auditory linguistic stimuli over other modalities and domains. Moreover, when grammar-based and musical features were matched in the nonlinguistic auditory condition, performance reached levels similar to the linguistic auditory condition. This indicates the presence of constraints in statistical learning: serial presentation of this type of sequential structure, with predictive dependencies and abstract categories, might be optimal for learning auditory stimuli, while other stimulus types might profit more from other structural varieties. Our results also suggest that the optimal, in this case simultaneous, presentation type can boost learning performance in the visual modality. However, we found no general training effect in the present study, which indicates that training effects may also depend on the specific structure to be acquired. Our findings show that learning is constrained by modality and presentation type together with the specific stimulus characteristics of the input, and they call for broadening the scope of research by testing input effects on statistical learning with a wider range of structure and stimulus types.

Supplementary Information

Acknowledgements

This work was supported by the Momentum Research Grant of the Hungarian Academy of Sciences (Momentum 96233 'Profiling learning mechanisms and learners: individual differences from impairments to excellence in statistical learning and in language acquisition', PI: Ágnes Lukács). The authors are grateful to Dezső Németh for his useful comments on the manuscript.

Author contributions

K.S.L.: data collection; data analysis; writing; A.L.: conceptualization; methodology; writing.

Open access funding provided by Budapest University of Technology and Economics.

Data availability

Competing interests

The authors declare no competing interests.

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The online version contains supplementary material available at 10.1038/s41598-022-24951-7.


Open access | Published: 06 June 2024

Effects of multimodal explanations for autonomous driving on driving performance, cognitive load, expertise, confidence, and trust

  • Robert Kaufman 1 ,
  • Jean Costa 2 &
  • Everlyne Kimani 2  

Scientific Reports volume 14, Article number: 13061 (2024)

Subjects: Computer science, Human behaviour

Advances in autonomous driving provide an opportunity for AI-assisted driving instruction that directly addresses the critical need for human driving improvement. How should an AI instructor convey information to promote learning? In a pre-post experiment (n = 41), we tested the impact of an AI Coach’s explanatory communications modeled after performance driving expert instructions. Participants were divided into four (4) groups to assess two (2) dimensions of the AI coach’s explanations: information type (‘what’ and ‘why’-type explanations) and presentation modality (auditory and visual). We compare how different explanatory techniques impact driving performance, cognitive load, confidence, expertise, and trust via observational learning. Through interviews, we delineate participants’ learning processes. Results show AI coaching can effectively teach performance driving skills to novices. We find that the type and modality of information influence performance outcomes. Differences in how successfully participants learned are attributed to how information directs attention, mitigates uncertainty, and influences the overload experienced by participants. Results suggest that efficient, modality-appropriate explanations should be opted for when designing effective HMI communications that can instruct without overwhelming. Further, results support the need to align communications with human learning and cognitive processes. We provide eight design implications for future autonomous vehicle HMI and AI coach design.


Introduction

Recent years have seen vast improvements in autonomous driving technology, with on-road vehicle testing currently being conducted in major cities around the world. The proposed benefits of autonomous vehicles (AVs) include increases in driving safety and efficiency while reducing driving infractions, traffic, and passenger stress 1 . Though AVs may have a bright future, real-world deployment is hindered by a number of technological, infrastructural, and human-interaction roadblocks that are likely to take decades to solve. Meanwhile, the National Highway Traffic Safety Administration (NHTSA) estimates that in the United States alone, there are over 6 million car crashes each year, resulting in over 2.5 million injuries, 40,000 deaths, and $340 billion in damages 2 , 3 . Of these, it is estimated that 94% of vehicle crashes are due to human error 4 . Therefore, there is a large and pressing demand for technologies that may improve human driving ability. In this study, we seek to test a novel use of autonomous vehicle technology: can AVs teach humans to be better drivers?

We propose that augmenting driver learning by observing an AI driving coach may help address the glaring need for human driving improvement. To test this concept, we conducted a mixed-methods driving simulator experiment in which participants observed one of four AI coaches providing different types of explanatory instructions. We evaluate how the AI coaching sessions impact learning outcomes such as participant-driving performance, cognitive load, confidence, expertise, and trust.

We leverage the domain of performance or “race track” driving to test study objectives. Performance driving is more challenging than, but closely related to, everyday driving, and it allows driving skills and knowledge to be built and tested objectively. The goal of performance driving is to drive as fast as possible on a given track, maximizing the vehicle’s limits while minimizing the distance needed to travel 5 . Many performance driving skills directly translate to real-world driving contexts—such as improvements to vehicle handling and situational awareness 6 , 7 —and thus, it is an appropriate proxy for studying driving skill learning. Testing the potential of an AI driving coach in the context of performance driving has several major benefits over everyday driving. First, performance driving has specifically defined parameters for success, so we can objectively measure the effectiveness of our AI coach in a controlled environment. Next, it is a driving task many people are unfamiliar with, which allows us to test the potential of an AI driving coach on true novices and maintain a consistent knowledge and skill baseline across our study sample. Lastly, by testing our AI coach on an extreme and challenging driving task, we hope to derive insights that generalize to even the most intense real-world driving situations.

Though novel in the realm of autonomous driving, learning from AI, particularly by means of AI explanations, is not a new concept 8 . Learning from AI is a rapidly expanding area of interest, particularly given the proliferation of AI-based large language model tools such as ChatGPT 9 , 10 . In domains such as medicine, explainable AI-based systems—for example, image classification systems for radiology—have been shown to help physicians learn new techniques to identify pathologies 11 , 12 , 13 . These systems present justifications in the form of ’explanations’ which provide a rationale for a system’s decisions. Specific cases where human-interpretable AI explanations of decisions are produced have been singled out as especially helpful for learning and improvement 14 , 15 , 16 , particularly in cases where AI explanations are based on the explanations of human experts 17 , 18 . AI has the ability to advance educational techniques in many other domains, from second language learning 19 to programming 20 . Measuring the impact of interacting with AI systems on learning outcomes can be a challenge, and we can differentiate between explicitly testing learning via knowledge tests and operationalizing learning more functionally via outcomes like task performance 21 . For the present study, we measure learning in both ways. The main learning outcomes assess changes in driving performance before and after exposure to the AI coach; secondarily, we assess knowledge via quiz. In this way, we are able to measure the learning impact of the AI Coach directly from multiple perspectives.

In the context of autonomous driving, we propose that observing an AI driving coach may be an effective way to transfer knowledge and teach critical driving skills. There have been a number of studies on human-machine interfaces (HMIs) for AVs, which often take the form of Heads-Up-Displays (HUDs) aimed at building transparency, accountability, and trust 22 , 23 . Though highly relevant, these studies have largely focused on non-learning areas of the driver experience, and none to our knowledge have tested the impact of HMI elements on driver learning specifically. For example, in the large corpus of work on human-AV trust formation, Morra et al. found that showing more informative displays increased participant willingness to test a real AV 24 , Ruijten et al. found that displays mimicking human explanations increase trust 25 , and Koo et al. found that including why-type information in displays improves trust 26 . Other HMI studies have emphasized the complexity of trust formation, proposing holistic and multi-factor approaches to designing trustworthy interfaces 27 , 28 , 29 . Beyond trust, Schartmüller et al. assessed display modality differences in driver comprehension and multi-tasking, finding that certain interfaces could reduce workload better than others 30 . HMI experiments aimed at building situational awareness with AVs have shown improved awareness with display elements that allow a person to comprehend why or why not (contrastively) a vehicle is taking an action 31 , 32 , 33 . Though these prior studies support vehicle HMIs as a promising avenue for coaching applications, how they may apply in instructional settings is currently unexplored. The work presented here seeks to address the knowledge gap between HMI design research and learning communities focused on optimal coaching techniques.

We focus on the role an AI Coach can play in improving a person’s driving ability—using explanatory communications modeled after the instructions of human driving experts. As the most common in-car method for introductory performance driving instruction is to begin with observational learning (i.e., a passenger observing an expert driver), our explanatory communications are framed within this observational learning context. Using a state-of-the-art, full-motion driving simulator, we explore both whether an AI coach’s instructions can improve a novice’s driving performance more than observation alone and, if so, which explanatory methods may be best.

Inspired by explanations given by expert driving instructors, we explore the role that HMIs may play in performance driving instruction. To design an effective HMI, it is important to determine both the type of informational content that should be conveyed as well as the presentation modality of the information. This is especially important in the context of HMIs for safety-critical and cognitively demanding tasks like driving 34 . Studies have shown mixed results on the impact of modality of information presentation. Some studies claim visual techniques should be preferred 30 , 35 , while others suggest auditory feedback as a better strategy 36 , 37 . Impact of the type and content of information presented also shows mixed results in terms of driver performance and preference, including differences between ‘what is happening’ or ‘why it is happening’ information 26 , 32 . These explanation attributes have not been explored in the domain of AI coaching for driving instruction.

To address these gaps in knowledge, this study uses a mixed-methods approach to assess study outcomes. Participants were randomized into one of four AI coaching groups differing in the type and presentation modality of the information they presented. Before and after observation, measures were taken to compare changes in participant driving performance, cognitive load, confidence, expertise, and trust. Each of these factors has been highlighted as important for the development and adoption of HMIs and explainable AI systems more generally. Interviews were conducted to contextualize findings within the larger learning context, including building a more general understanding of concerns with AVs and how these may be mitigated. This broader view of AVs can help illuminate additional roadblocks that must be addressed for the successful future adoption of AI driving coaches.

Our research questions are as follows:

What are the pre-post impacts of observing an AI performance driving coach compared to a pure observation (no explanation) control, including the impacts of explanation information type and presentation modality?

What is the process of a novice learning performance driving, and how can an AI Driving Coach facilitate this learning?

What concerns do novices have in general about the deployment of AV technology, and how can these further inform future AV HMI design?

Results from this study support the premise of AI coaching as a promising method for driving instruction and lead to important considerations for future HMI design.

Specifically, we contribute:

A novel assessment of the impact of observing an AI Coach on performance-driving ability, cognitive load, confidence, expertise, and trust.

Insight into the impact of information type (‘what’ and ‘what + why’) and information presentation modality (visual and auditory) on the process of learning performance driving.

Eight design insights to inform the creation of future human-centered HMIs for driving and AI driving coaches.

To address our research questions, participants were divided into one of four experimental conditions differing in the information type and information presentation modality given by the AI coach. For information type, we test two layers of information: (1) ‘What’ information provides descriptors of where the vehicle should drive; (2) ‘Why’ information explains the rationale behind why that position and movement is optimal, and is meant to build conceptual understanding. Some participants received just ‘what’ information, while others received both ‘what’ and ‘why’ information in combination. With this manipulation, we seek to build upon prior work on the impact of information level on performance, preference, and trust 26 . For information modality, we manipulated whether ‘what’ explanations were presented auditorily or in the form of a visual racing line projected on the track. These two additional conditions aim to clarify the efficacy of visual and auditory information at conveying information about a vehicle’s behavior.

We evaluate changes in driving performance from before observing the AI coaching condition to after observation. One crucial concept of learning performance driving focuses on vehicle positioning, which is a fundamental skill for novices. In performance driving, the optimal position follows a path on the track called the ‘racing line’, which minimizes time cost around corners and allows the driver to move as quickly as possible 38 , 39 . As the AI coach’s ‘what’ instructions primarily focused on the racing line, the main outcome of our study is how well participants positioned themselves while driving. Other performance measures include lap time, speed, and acceleration; these were addressed only in explanations that included a ‘why’ component.

Additional secondary outcomes were addressed via interview and questionnaire. These include impacts on trust, knowledge and expertise, driving confidence, and feedback related to how helpful and effective the AI coach was at facilitating the participants’ learning process.

A large corpus of prior work suggests that a major drawback of mid-drive communication is the potential for information overload and high cognitive demand 34 . We hypothesized that observational learning would provide the opportunity to transfer knowledge to a novice while avoiding cognitive overload, as a participant can pay attention to explanations without task switching between learning and driving decision-making. To further explore this phenomenon, we measure cognitive load for each of our study conditions.

Participants

A total of 41 participants were recruited for the study and completed all study procedures. Participants were all novices in the performance driving domain but were otherwise licensed to drive with at least 5 years of everyday driving experience. This ensured that participants had the baseline motor skills necessary to begin performance driving instruction.

Performance driving simulation

All driving-related tasks took place in a state-of-the-art, highly immersive, full-motion driving simulator to ensure realism. This included a 270-degree wrap-around screen, a full vehicle cabin with a closely calibrated steering wheel and pedals, sound effects, wind effects, and cabin movement mimicking the feeling of a vehicle moving on the track (Fig.  1 ). The simulator has a mounted tablet so that participants can answer survey questions between drives without leaving the simulation, ensuring real-time responses and preserving immersiveness. The specific racing context chosen was a highly accurate simulation of the professional driving course Thunderhill Raceway in Willows, California, USA.

During the AI coaching observation sessions, the vehicle drove autonomously with no user input. The specific model used to control the vehicle was a reinforcement learning agent trained using the DSAC algorithm 40 , with observations and rewards tuned for Thunderhill. The agent’s policy was optimized for the physical features and limitations of the vehicle and the geometry of the track 41 . This model was also used to compute the “ideal” racing line for calculating performance.

Figure 1. Full-motion driving simulator.

AI coach explanations

During the AI coach observation sessions, explanatory instructions were provided to all participants except the control group. Auditory explanations were presented via an in-cabin speaker. These were produced using Amazon Polly Text-to-Speech 42 . Visual explanations were projected directly onto the track in the simulated environment.

We modeled the AI coach’s explanations on instructions given by four real performance driving experts and instructors, all of whom had real-world experience driving and instructing on the real-world Thunderhill Raceway. First, two expert performance drivers were ethnographically observed giving real-world driving instruction at Thunderhill Raceway via several hours of video. Next, this was paired with think-aloud and interview sessions with two additional expert performance drivers as they drove the same simulated Thunderhill track used in the experiment. This qualitative assessment was used to ground and justify the specific explanations given by the AI coach to ensure accuracy. Descriptions given by the experts also helped scaffold the learning procedure, which is described in detail later in this paper.

We found that the most common in-car method for introductory performance driving lessons began with observational learning, specifically with an instructor driving the performance vehicle on the track while providing instruction as the novice observes and listens. This instruction is typically auditory; however, it may also include visual explanation via pointing. Thus, our explanatory communications are framed within an observational learning context, and explanations are motivated by these explanatory modalities. In safety-critical domains like driving, observational learning may be a safe and effective method to transfer knowledge to a novice. This study was designed to assess whether observing an AI coach drive, and hearing or seeing it explain its behavior, is an effective way to improve a novice’s driving performance, among other outcomes.

Dimension 1: Information type

First, we focus on the type of information presented to a driver: ‘what’ information provides descriptors of where the vehicle should drive, specifically following the ideal racing line; ‘why’ information explains why that position and movement is optimal. These are inspired by the explanations examined in experimental work on explanations of driving behavior by Koo et al. 26 ; discussed in frameworks by Wang et al. 15 and Lim et al. 43 ; and explored in cognitive science and philosophy 44 . In our experiment, the ‘what’ explanations are designed to help the participant know where to go on the course in order to follow the racing line, such as “to the left edge of the track.” The ‘why’ explanations are designed to help the participant develop a more generalized understanding of performance-driving concepts, techniques, and decision-making. These include ways to optimize speed, acceleration, and lap time. For example, the vehicle moves “to the left edge of the track … to decrease the traction needed to get around the curve.” We separate participants into ‘what’-only and ‘what + why’ conditions and compare these to a no-explanation control group to assess the effect of information type.

Dimension 2: Information modality

We are further interested in information presentation modality. To assess the impact of multimodal explanations, two different ‘what + why’ conditions were tested. One received visual ‘what’ explanations, while the other received only auditory cues. The visual ‘what’ explanation shows a projection of the racing line on the track (Fig.  2 ). Auditory ‘what’ explanations were designed to convey, in verbal form, the same information given by the visual racing line. Auditory ‘why’ explanation content remained consistent across conditions, as the instructions conveyed were deemed too complex for simple visualization.

Figure 2. The green racing line projected on the track is an example of a visual ‘what’ explanation seen by Group 4.

Participants were randomly assigned to one of four experimental conditions (Table  1 ).
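The four conditions were: Group 1, observation only (no explanations); Group 2, auditory ‘what’ only; Group 3, auditory ‘what’ + auditory ‘why’; and Group 4, visual ‘what’ + auditory ‘why’.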

Driving performance

Racing line distance: How far participants drove from the ideal racing line, measured in meters. It is calculated as the absolute value of the difference between a participant’s line position and the ideal line position determined by the RL model, averaged across the track. Lower racing line distance means a participant was closer to the ideal.

Maximum speed: The highest speed achieved, in miles per hour. Higher speed is indicative of better performance.

Average acceleration: The mean change in velocity when accelerating forward, measured in meters per second squared (m/s²). Higher acceleration is indicative of better performance.

Lap time: The time taken to complete one lap, in seconds. Lower time is indicative of better performance.
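As a concrete illustration, these four measures could be computed from lap telemetry roughly as follows (a sketch of ours; the data frame 'lap' and its column names are assumptions, not the study’s code):

```r
# Driving performance measures from one lap of simulator telemetry.
# Assumed columns: pos_m and ideal_pos_m (lateral positions in meters),
# speed_mph, accel_ms2 (acceleration in m/s^2), and time_s.
racing_line_distance <- mean(abs(lap$pos_m - lap$ideal_pos_m))  # lower = better
max_speed            <- max(lap$speed_mph)                      # higher = better
avg_acceleration     <- mean(lap$accel_ms2[lap$accel_ms2 > 0])  # forward only
lap_time             <- max(lap$time_s) - min(lap$time_s)       # lower = better
```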

Trust in AV driving coach: The average of four 5-point Likert scale (strongly disagree–strongly agree) questions similar to “I would trust the advice given to me by an AI driving coach.”

General trust in AVs: The average of four 5-point Likert scale (strongly disagree–strongly agree) questions similar to “I would trust an autonomous vehicle to drive me around safely.”

Self-reported performance driving expertise: The average of three 5-point Likert scale (strongly disagree–strongly agree) questions similar to “I understand the concepts behind performance driving.”

Racing line knowledge: The number of correct responses given on four true/false questions and one diagram question designed by the research team to test how well participants understood the racing line concept. It was given after all driving tasks concluded.

Confidence and cognitive load

Performance driving confidence: The rating, from 0–10, given in response to the question, “If you were asked to performance drive in real life (without assistance), how confident would you feel?”

Cognitive load: Measured using the widely accepted NASA Task Load Index (TLX) 45 .
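For reference, a minimal sketch of unweighted (“raw”) NASA-TLX scoring; the 0–20 response scale and column names are assumptions consistent with the group means reported below, not details confirmed by the study:

```r
# Raw (unweighted) NASA-TLX: the mean of the six subscale ratings.
tlx_items <- c("mental", "physical", "temporal",
               "performance", "effort", "frustration")
d$tlx <- rowMeans(d[, tlx_items])  # assumes each item was rated 0-20
```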

The study was approved for human subjects research by WCG IRB (external) and informed consent was collected from all study participants before procedures began. All study procedures were performed in accordance with relevant guidelines and regulations. Figure 3 shows the study procedure.

Figure 3. Study timeline. The study duration was approximately 1 hour in total and included a questionnaire, participant driving, AI coach observation, and interview.

The study was framed in the context of learning performance driving. Before and after exposure, participants drove two laps. First, a practice lap to familiarize themselves with the track, controls, and vehicle capabilities. Next, a performance lap, where participants were told to drive as fast as possible without going off track or losing control of the vehicle, as if they were going for their best lap time. These instructions align with the standard instructions for racing used in past studies 5 .

During the observation, participants observed the AI coach drive four laps around the same track in the driving simulator. During the first and last laps, the RL agent drove at full speed; these laps did not contain any explanations from the AI coach and remained consistent across all groups. The second and third laps occurred at a slower speed and contained explanations from the AI coach. The same explanations were repeated twice to promote uptake: in all, about 8 min of explicit instruction was given to participants. Explanations varied by the condition to which the participant was assigned. Participants in Group 1 did not receive any verbal or visual instruction during these laps but still observed the vehicle drive the course.

Questionnaires in the simulator were given for measures of confidence and cognitive load. Questionnaires outside of the simulator tracked trust and expertise, and gathered feedback on participants’ overall experience. A 10–15 min semi-structured interview was conducted at the end, during which participants gave feedback on the AI coach’s instructions, described their own learning process, and discussed trust and opinions on autonomous vehicles and coaches (see Supplementary Information for more details).

Analysis methods

To assess score differences from before to after observation, we used linear mixed effects (LME) models via the R package ‘lme4’. LME models yield similar results to mixed-model ANOVAs; however, they allow greater flexibility for pre-post experiments while using random effects to reduce the probability of a Type 1 error 46 . To test the impact of the explanations received by different groups, models were built with fixed effects for Timepoint (pre-post) and Group (1, 2, 3, or 4), with random effects for Subject ID added to control for individual differences. This tests whether the pre-post change experienced by each experimental group differs from the pre-post change of Group 1, the pure-observation control. In some cases, we are also interested in the impact of the observation sessions themselves, regardless of group assignment, which can give insight into the general impact of observing any version of the AI coach. In these cases, the fixed effect for Group is removed, allowing us to test pre-post differences with all groups combined. To avoid skewing results, outliers were removed from the racing line analysis when they represented extreme values related to a participant’s vehicle moving far off the track; other racing measures remain valid in these cases.
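A minimal sketch of the two model specifications described above (our illustration; the data frame and variable names are assumptions, although lme4 is the package named in the text):

```r
library(lme4)
library(lmerTest)  # assumed here to obtain the t- and p-values reported

d$timepoint <- factor(d$timepoint, levels = c("pre", "post"))
d$group     <- relevel(factor(d$group), ref = "1")  # Group 1 = control

# Does each group's pre-post change differ from the control's?
m_group <- lmer(score ~ timepoint * group + (1 | subject_id), data = d)
summary(m_group)  # timepoint:group terms give each group's change vs. Group 1

# All-groups-combined pre-post change (Group fixed effect removed):
m_all <- lmer(score ~ timepoint + (1 | subject_id), data = d)
```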

Impact of AI coaching information type and modality on driving performance

A summary of driving performance results is included in Table 2 and Fig.  4 .

Distance to the ideal racing line is our primary measure of AI coaching success, as vehicle positioning was the primary focus of the training. This measure is unlikely to improve from practice alone due to its lack of intuitiveness for a novice driver. Looking between groups, we find significant differences in racing line distance. Specifically, differences are found between the groups receiving explicit instruction (2, 3, 4) and the control group (1), which did not receive explicit instruction. All groups with explicit instruction saw favorable pre-post change (i.e., moved closer to the ideal racing line), while the control group got worse from pre to post. For Group 2 (β = −0.3, t(35) = −2.3, p < 0.05) and Group 4 (β = −0.3, t(35) = −2.0, p < 0.05), differences were significant compared to Group 1 (Fig.  4 ). These results imply that explicit instruction on the racing line was helpful for learning beyond simple observation, and that both information content type and information modality play a role in how effective an AI coach will be.

Among the other driving performance measures, Group 2 saw the most differences compared to Group 1. Group 2 improved less on lap time (β = 10.3, t(37) = 1.8, p = 0.08), max speed (β = −9.0, t(37) = −3.4, p < 0.01), and acceleration (β = −1.6, t(37) = −2.7, p < 0.01) compared to the pre-post improvement of the control (Fig.  4 ). Groups 3 and 4 did not show any significant pre-post differences compared to the control on these measures.

Combining across all groups allows us to assess pre-post changes that stem from simply observing any version of the AI coach. When participants are lumped together, results show improvement from pre to post-exposure for several aspects of driving performance, including faster lap times (β = −13.8, t(40) = −6.8, p < 0.001), greater average acceleration (β = 1.3, t(40) = 5.9, p < 0.001), and greater max speed (β = 5.7, t(40) = 5.5, p < 0.001). Interestingly, we did not find an all-groups-combined benefit of the AI coach on a participant’s average distance to the ideal racing line, further emphasizing the importance of the specific explanation received. These results as a whole demonstrate a clear positive trend in the impact of observing the AI driving coach on these aspects of driving performance. They also introduce nuance in the impact of different types and modalities of explanations on performance results.

Figure 4. Change (Δ) from pre- to post-observation for driving performance measures, by group.

Impact of AI coaching information type and modality on AV trust, self-perceived confidence, and expertise

As a whole, impressions of the AI coach were very positive across all groups: 90% of participants agreed the AI coach was helpful, 93% agreed it helped them understand performance driving better, and 85% agreed it helped improve their driving.

Observing the AI driving coach was subjectively impactful for all groups with regard to trust, confidence, and self-reported expertise. We found all-groups-combined pre-post increases in trust in the driving coach (β = 1.5, t(40) = 5.6, p < 0.001) and in autonomous vehicles in general (β = 0.9, t(40) = 3.3, p < 0.01). Participants reported higher performance driving expertise (β = 1.0, t(40) = 3.7, p < 0.001) and confidence in their performance driving skills (β = 1.4, t(40) = 6.6, p < 0.001). We did not find significant differences in these measures between groups, however. See Table 2 and Fig.  5 for a summary.

Scores on the 5-question racing line knowledge quiz—which was only given post-observation—showed that Group 4 trended towards scoring higher than Groups 1 (p = 0.06) and 2 (p = 0.10), but these differences did not reach statistical significance. They were assessed using ANOVA with a Tukey HSD post-hoc test.
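A sketch of this comparison (variable names are our assumptions, not the study's code):

```r
# Between-group comparison of post-observation quiz scores with a
# Tukey HSD post-hoc test; assumes 'quiz_data' with a numeric
# quiz_score column and group as a factor.
aov_fit <- aov(quiz_score ~ group, data = quiz_data)
TukeyHSD(aov_fit)
```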

We assessed whether any of these measures—trust, confidence, or expertise—predicted a participant’s pre-post change in racing line distance. This was done by categorizing participants into groups based on score percentile. We found that participants in the top 33% of trust in the coach (β = 0.4, t(36) = 3.1, p < 0.01) and of trust in AVs in general (β = 0.3, t(35) = 2.2, p < 0.05) moved closer to the racing line compared to those in the bottom 33% of each trust measure, respectively. Confidence, self-reported expertise, and expertise measured via the knowledge quiz did not appear to impact change in racing line distance.
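One plausible implementation of this percentile-split analysis (our sketch, continuing the model code above; variable names and the exact specification are assumptions):

```r
# Split participants into tertiles on trust, then test whether tertile
# predicts pre-post change in racing line distance.
subj$trust_tertile <- cut(subj$trust_in_coach,
                          breaks = quantile(subj$trust_in_coach,
                                            probs = c(0, 1/3, 2/3, 1)),
                          labels = c("low", "mid", "high"),
                          include.lowest = TRUE)
d2 <- merge(d, subj[, c("subject_id", "trust_tertile")], by = "subject_id")
m_trust <- lmer(racing_line_dist ~ timepoint * trust_tertile +
                  (1 | subject_id), data = d2)
```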

Figure 5. Pre-post scores for self-report measures, all groups combined.

Impact of AI coaching information type and modality on cognitive load

We are interested in understanding the impact of each type of AI coach on cognitive load. Looking specifically at cognitive load directly after the observation sessions, we find that Group 3 had the highest cognitive load (mean = 14.2, se = 5.7), followed by Group 2 (mean = 14.0, se = 3.2), Group 1 (mean = 13.1, se = 7.0), and Group 4 (mean = 11.9, se = 5.4). Though the groups were not statistically significantly different from one another via ANOVA, these scores follow the same trend reported in the interview phase and thus may be meaningful for explaining differences observed between groups.

We investigated whether cognitive load before or after observation impacted pre-post change in racing line distance. We found that individuals in the lowest 33% of cognitive load pre-observation improved less with respect to the racing line compared to those in the highest 33% (β = 0.29, t(37) = 2.5, p < 0.05). We also found that those in the middle 33% of cognitive load post-observation got closer to the racing line compared to those in the highest 33% (β = −0.28, t(36) = −2.5, p < 0.05). These findings imply that cognitive load impacts a participant’s ability to learn and implement racing line knowledge.

Interview insights

Qualitative analysis of the post-observation interview revealed insights into the learning processes of participants, how the HMI helped or hindered these processes, and brought a new understanding of participant dispositions towards trusting autonomous vehicles. Key themes emerged from the interviews that may inform future HMI design for AV Coaches.

Information content, presentation, and modality

All groups wanted explanations of additional types. Nearly all participants in Group 1—who received no explanations—desired explicit ‘what’- and ‘why’-type information such as that given to the other groups. Though many noticed patterns, participants experienced a large amount of uncertainty around what the AI Coach was trying to teach them. One participant explicitly stated, “at first … I didn’t get what the [AI coach] was teaching me.” Participants in Group 2—who received auditory ‘what’ information only—wanted a further rationale for the ‘what’ instructions they received (i.e., they wanted ‘why’ information). One participant commented that they “didn’t know why the vehicle [moved in a particular way], so [they] couldn’t generalize.” Participants in Groups 3 and 4 expressed a desire for additional visual explanations to help them with the timing and magnitude of accelerating and braking inputs. Finally, a few members of Group 4 suggested that real-time or post-drive feedback would also be helpful for their improvement.

Too much auditory information caused a feeling of being overwhelmed and a desire for more efficiency. Participants in Group 3—who received auditory ’what’ and auditory ‘why’ information—chiefly expressed a desire for more efficiency in the information presented to them. This was the result of Group 3’s tendency to be overwhelmed with the amount of information presented. “Substantial”, “overwhelming”, and “a lot” were all used to describe the amount of information presented. In essence, participants still wanted comprehensive instruction covering a wide range of racing topics, but they wanted this instruction to be easier to consume and delivered in a less burdensome way. The rationale for this request was that it was too difficult to attend to and comprehend all of the auditory ‘why’ information while also receiving auditory ’what’ information (though the two never directly overlapped).

No explanation and auditory ‘what’ explanations introduced uncertainty. In addition to uncertainty about what the coach was trying to teach them, Group 1 participants were unsure of exactly how to position themselves on the track or why they should do so. “It was hard to know what the important parts are [with observation alone],” noted another participant. This introduced additional challenges to the learning process. Participants in Groups 2 and 3 expressed difficulty with the preciseness of auditory ‘what’ instructions, as they had to internally map auditory visuospatial cues such as “move to the left edge” to precise visual positions on the track with inherent uncertainty. These participants wanted higher specificity in the position information they received in order to reduce the amount of guesswork.

Visual ’what’ information was preferred or desired near-universally. Many participants in Groups 2 and 3 explicitly requested a visual racing line. For example, one participant expressed a desire for a “visual line showing … details such as how much to go left, etc.” This was primarily as a means to overcome the uncertainty of auditory ‘what’ explanations. Group 4—who received visual ’what’ and auditory ’why’ information—appeared far more satisfied with their explanations than the rest. They expressed that the visual projection of the racing line was helpful, and in contrast to Group 3, they did not report any issues with overwhelm nor a desire for more efficiency. One participant commented, “the path on the track was ... very helpful. [It] helped me feel more comfortable and confident”.

Trust in autonomous vehicles

For nearly all 41 participants, a lack of trust in autonomous vehicles was a key issue hindering their willingness to adopt AV technology. Aligning with prior research on AV trust, the most common trust concern for our participants was over the AV’s ability to perform safely and reliably 47 . There was variance between participants: while some only expressed concerns about extreme or rare circumstances, others felt less confident in AVs for any circumstance involving the potential for unexpected occurrences (such as animals), pedestrians, or other drivers. This implies that trust in AVs may be contextually dependent and subject to individual differences. Other concerns included a lack of trust in the companies building the AVs, concerns over AVs running under-tested beta software, legal liability, losing the fun of driving, and AVs taking jobs from humans.

To help alleviate many of these concerns, participants reported that trust could be built through repeated, positive experiences with AVs. Many also suggested that AI explainability would help them feel more comfortable: specifically, validation that an AI “perceives” what is around it, knowledge of how an AV will behave in specific driving situations, and understanding of the rationale for its decisions. Participants wanted to know when an AV can be relied upon and to maintain the ability to regain control from an AV if they are feeling uncertain or uncomfortable.

This study aimed to assess the viability of an AI driving coach for performance driving instruction as well as provide insight into how HMIs for AI driving coaching can be improved in the future. Using a mixed-methods approach (n = 41), we find that an AI driving coach may be a successful method to improve driving performance. Important considerations must be made to the type and modality of information presented. Our results shed light on how to design effective human-machine interfaces (HMIs) for coaching and driving interactions more generally. We discuss key findings and their implications for future HMI design in these contexts.

AI Coaching is a viable method for performance-driving instruction. Both quantitative and qualitative results of this study support the viability of an AI coach for performance-driving instruction. The primary area of focus for our AI coach was to provide instruction on racing line positioning. For racing line distance, we observed that Group 1 (control) got worse pre-post, while Groups 2 (auditory ‘what’ only), 3 (auditory ‘what’ and auditory ‘why’) and 4 (visual ‘what’ and auditory ‘why’) improved. The benefits of explicit instruction on racing line distance are unsurprising, as positioning on the racing line may be counterintuitive for novices. For instance, the racing line often follows the edge of the track, whereas everyday driving generally requires sticking to the middle of a lane. We also found significant overall pre-post-observation differences in several areas of performance driving, including faster lap times, max speeds, and acceleration rates before and after observation. Survey and interview data support AI coaching as an instructional method: most participants found the coach helpful, and overall boosts were found in participant expertise, trust, and confidence.

Providing nuance to our findings, Groups 2 and 4 were the most effective at improving racing line distance; however, Group 2 saw reductions in other areas—such as speed and acceleration—while Group 4 did not. This suggests that the specific instructions chosen greatly impact an AI coach’s effectiveness. During the interview, each group expressed concerns and desires related to the way the type and modality of information impacted their learning. While these differed across groups, concerns and desires were generally shared by members of the same group. The fact that we see significant differences for Groups 2 and 4 but not for Group 3 further underscores the importance of carefully selecting the information conveyed by an AI coach.

We attribute group differences to how information directed attention, mitigated uncertainty, and influenced overload experienced by participants. These, in turn, affected how successfully participants could go about the learning processes.

The type of information received by participants played a role in directing attention. This aligns with prior work in explainable AI suggesting that a feature’s presence (or absence) is meaningful and thus relevant for directing focus 16 , 44 . In our study, ‘what’ information aimed to teach participants how to adhere to the racing line. Group 1—who did not receive this information—suffered in this regard. Group 2 only received ‘what’ information, and thus we suspect that they focused their attention more on adhering to the racing line and less on optimizing speed, acceleration, and lap time (topics discussed in the ‘why’ explanations). Thus, Group 2 saw less improvement in these other areas. By contrast, Groups 3 and 4 received ‘what’ and ‘why’ information, and thus focused their attention on several areas of performance driving simultaneously. Consequently, Group 4 also got closer to the racing line compared to the control group without sacrificing other measures as Group 2 did. We would have expected Group 3 to also have gotten closer to the racing line, as they received the same types of information as Group 4. We suspect, however, that Group 3’s improvement was hindered by two consequences of information modality: uncertainty and information overload.

The uncertainty of auditory and visual ‘what’ explanations impacted the ease of processing. For individuals in Groups 2 and 3, the uncertainty of auditory ‘what’ information presented to them was a major point of concern. To integrate ‘what’ information effectively, participants needed to transform auditory visuospatial cues such as “move to the left edge” into precise positions on the track. This extra step of transformation, which involves both creating a spatial representation for the linguistic cue and translating it into visual working memory 48 , was not required with the visually presented racing line. Auditory ‘what’ cues were included in this study in alignment with Wickens’ Multiple Resource Theory, which suggests that auditory information should be more efficiently incorporated during visually heavy tasks, such as driving, due to their different channels of processing 49 . In our case, however, we believe that presenting information auditorily made ‘what’ information more difficult to integrate due to the extra step of processing. For example, the instruction “move to the left edge of the track” lacks details specifying exactly where the participant should be aiming, and requires them to visually map the auditory spatial cue to a visual position on the track in front of them. This is supported by the modality appropriateness hypothesis of multisensory integration, which suggests that the effectiveness of information integration is dependent on the context of the task 50 . For a visual context of finding the racing line, visual cues were more efficient. Though we had expected the lack of explanatory preciseness for auditory explanations to be supplemented by observing the movements of the vehicle itself, participants expressed that the uncertainty of the instruction created too much “guesswork”. As a result, our results support that visuospatial ‘what’-type information should be presented visually—such as via a racing line projection—as opposed to auditorily.

The modality and type of information affected the cognitive burden and overwhelm experienced by participants. Decades’ worth of prior research shows that increases in epistemic uncertainty or ambiguity can increase the cognitive burden of information processing 51 , 52 . Integrating these theories into our observations suggests that Groups 2 and 3 both experienced higher cognitive burdens processing the more uncertain auditory ‘what’ information than Group 4 did processing the more precise visual ‘what’ information. For Group 2, who had no other information to pay attention to, this increase in demand may have been frustrating but did not impact their ability to stay on the racing line. For Group 3, we believe the increase in cognitive demand required to deal with uncertainty, combined with the increase in demand from receiving additional auditory ‘why’ information, was sufficient to overload them and prevent them from improving. In other words, there is a combinatorial effect of the burden of dealing with uncertainty and the burden of adding additional information to learn. Indeed, though not significantly different from the other groups, participants in Group 3 had the highest cognitive load and complained the most about being overwhelmed in their interviews. Group 4, who received visual ‘what’ and auditory ‘why’ information, had the lowest cognitive load of all groups. This aligns with prior work suggesting multimodal interfaces may promote more efficient processing of multiple sources of information 53 , provided that each is integrated in modality-appropriate ways. Our theory is further supported by the Yerkes-Dodson law, whereby too much cognitive burden is detrimental to performance 54 . We also found that Group 4’s knowledge of the racing line trended towards being significantly higher than both Group 1’s and Group 2’s, while Group 3 showed no difference from these non-‘why’ groups. This implies that the visual ‘what’ explanations and their lower cognitive burden may have helped Group 4 take in auditory ‘why’ information compared to Group 3. Taken as a whole, these findings suggest that information modality and type influenced the amount of cognitive load and the sensation of overwhelm for participants in each group, which affected performance and preference in turn. They also suggest that—in this particular instance—the uncertainty of information impacted the burden more than the sheer amount of information.

The implication of these results is that there is a nuanced relationship between information type and information modality for AI coaching HMI design. HMIs designed for AI coaching should aim to find a balance between comprehensive coverage of relevant topics and communication efficiency. This can help learners gain sufficient knowledge, prioritize attention, and avoid being overwhelmed. Context-based modality-appropriateness is essential for transferring information efficiently.

Designing HMIs to support the learning process

The results of this study have clear implications for the future design of AI coaching and autonomous vehicle HMIs more generally. We briefly summarize eight design considerations.

Designing for the learning process. It is important to design for the specific learning process of the learners being taught. Combining participant descriptions with the insights provided by expert performance driving instructors allows us to delineate the process by which participants learned to drive in the driving simulator (Table 3 ). Prior work in explainable AI suggests that AI explanations that align with the learning or reasoning processes of their users may be more effective than those that are not 15 , 55 . Our study supports these past findings. By directing attention and supporting stages in the learning process differently—such as presenting information with more or less uncertainty or ambiguity—we found differences in the performance, knowledge, and preferences of our participants. Some participants expressed a desire for additional support with stages related to learning to brake and accelerate. The implication is that an AI coach, or HMI more generally, needs to be tuned specifically for the task at hand and that careful consideration should be placed on the process stages of learning that task. We delineated the learning processes of novices seeking to learn performance driving. Though not explored in this study, this presents the unique opportunity to study the effect of specific HMI designs on their effect of learning specific learning stages in future work.

Directing attention. We find that, based on the type of information presented, attention may be directed in different ways. Omitting certain information, such as ‘why’ information in our study, may have caused participants to overly focus on the racing line to the detriment of other aspects of performance driving. The implication is that the specific information an AI coach presents should be mindfully chosen to ensure attention is being directed appropriately. In many cases, the temporal ordering and prioritization of attention can be specifically designed for using techniques like scaffolding, organizing information hierarchically, and employing progressive disclosure so that less prominent details are deprioritized until they are needed.

Balancing thoroughness with efficiency. Carefully balancing information thoroughness with efficiency of presentation is crucial for useful HMIs. Increasing the amount of information conveyed to a novice may help transfer sufficient information; however, our evidence suggests that such efforts will be futile if the information is not presented efficiently. As a result, careful selection of the type and amount of information presented is an important consideration for ease of processing. Avoiding overcomplexity and opting for easy-to-process information can help reduce the possibility of cognitive overload.

Modality-appropriateness and minimizing uncertainty. Our findings have clear implications on the importance of presenting modality-appropriate information that maximizes ease of processing. For driving instruction and HMIs for driving tasks, visual explanations are ideal for visual aspects of performance, such as showing where on the road to drive. Auditory may be more appropriate for information that is complex or not directly tied to a visual task, such as ‘why’-type explanations. In both cases, information needs to be efficiently presented with emphasis on the precise details necessary for task execution. Presenting thorough, modality-appropriate information can minimize epistemic uncertainty and ambiguity.

Trust as a barrier. Trust was a major concern for our participants. Participants who trusted the AI coach more showed better performance on non-intuitive aspects of performance driving like following the racing line. As such, the effectiveness of an AI coach may depend on how trustworthy it is. The implications are clear and supported by a plethora of past research: without trust, HMIs will fail. According to our participants, trust can be built through repeated positive exposure, helpfulness, and explainability validating the AV’s ability and reliability. Though coaching is a novel application distinct from the explainable AI systems studied in prior work, similar principles for building trustworthy AI systems may apply. For the case of learning specifically, explanations that align with the learning and reasoning processes of the learners themselves may help build trust, as these give the learner the agency to cross-examine the information they are receiving in context 15 , 55 .

Personalizing interactions. Participants differed in their performance, preferences, and trust levels. Just as it is in human-to-human instruction, AI coaching will work best if personalized to the individual needs of the learner 56 . Attributes for personalization suggested from the study presented here may include ability, expertise, confidence, trust, preference, and cognitive load. For ability and expertise, an AI coach can add or withhold details and alter the complexity of information offered at a given time, among other techniques. For preference, individuals may want the type or modality of information offered to be changed based on their preferred learning style 57 . For trust and confidence, the types of supporting information given to a participant can be varied to address their specific trust concerns. These may include explanations of how the AI system will behave, how it was trained, or even reassuring comments on its ability. Designing for cognitive load may prove a larger challenge, but if an AV can detect when a participant is feeling overwhelmed through biometrics or self-report 58 , it can modulate its communication style and behavior in commensurate ways.

Contextual flexibility. Particularly in discussions around trust, it was clear that certain contexts may require more or different information to mitigate concern than others. In this way, it is clear that pragmatics matter. This aligns with prior work on the impact of contextual factors on explainable AI usefulness 59 . For vehicle HMI design, this means that explanations may need to be altered based on the perceived riskiness or complexity of the driving scenario. In other cases, a vehicle may even give up control to a passenger, or vice-versa. For AI coaching, instructional explanations should be grounded in the context within which they should be applied, giving the learner a broader understanding of when certain lessons should be applied and when they should not be.

Observation versus interactivity. Our study highlighted the potential usefulness of observational AI coaching. We did not compare our observational methods to more interactive methods, such as question-answering 60 . Some participants requested real-time feedback and a means to interact with the information they received. These may be potentially viable future directions to explore in the context of AI coaching.

Limitations and future work

This study was not without limitations. While the limitations below are expected for a study designed as a first step towards AI-driven vehicle coaching, the study still serves to confirm the viability of AI coaching for driving and to motivate future work.

While between-group comparisons are the central focus of this research, all-groups-combined results allow us to assess the potential role of AI driving coaches in driving instruction more generally. Both the group and combined comparisons, as well as qualitative results from interviews and surveys, support the viability of AI coaching as an instructional method. Combined results should be approached with caution, however, as the current study design cannot separate pre-post observation changes from practice effects. Though the pre-post changes are far greater in magnitude than we would expect from practice alone, formally testing the impact of practice against that of training is left for future study.
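
One way such a test could be formalized in future work is a mixed-effects model with a practice-only control group. The sketch below assumes a long-format table with hypothetical column names; it is not the analysis run in this study:

import pandas as pd
import statsmodels.formula.api as smf

# One row per participant per phase ("pre"/"post"); "group" would include a
# practice-only control so that training effects can be separated from
# practice effects. All column names here are hypothetical.
df = pd.read_csv("driving_scores.csv")

# Random intercept per participant; the phase:group interaction term tests
# whether pre-post improvement differs between coached and control groups.
model = smf.mixedlm("lap_time ~ phase * group", df, groups=df["participant"])
print(model.fit().summary())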

We note that during real-world instruction, novices may ask the instructor questions. Though the present study does not capture this interaction, the explanations included were designed to cover the most common questions our expert drivers receive. Interactivity and other non-observational methods would be excellent directions for future research.

Compared to real-world instruction, our AI coaching sessions were quite short. We would expect larger differences and greater insight after longer AI coaching sessions or a series of sessions.

Finally, though sufficient to reach statistical significance and identify several trends in the data, this study had a relatively small sample size. With a larger sample, we would be able to evaluate interactions between variables such as cognitive load and trust directly on the pre-post differences in driving performance observed across the groups. We would also expect clearer impacts of AI coaching to emerge with a larger study sample.

In this study, we tested a novel use of AV technology: whether AVs can teach humans to be better drivers. Results from the pre-post observation study reported here support the conclusion that observing an AI driving coach is a promising method for teaching novices performance driving. Breaking participants into groups allowed us to determine how information type (‘what’ and ‘why’) and information modality (auditory and visual) influenced outcomes. We saw differences in how information directed attention, mitigated uncertainty, and influenced the overload participants experienced, and these differences affected how successfully participants were able to learn. The results suggest that explanations for AI driving coaching and vehicle HMI design should balance comprehensive coverage of relevant topics (such as how to follow the racing line and drive at the limits of speed) against information complexity. When designed properly, explanations can direct attention to appropriate details efficiently, supporting the learning process while avoiding learner overwhelm. Context-based, modality-appropriate explanations should be preferred, especially when they mitigate information uncertainty. We conclude that communications must be designed to align with the needs and learning processes of the learner, and we present eight design considerations to inform future HMI design that should generalize to driving in many different contexts, including everyday driving. These include specific suggestions for how to direct attention and choose the modality of an explanation, as well as broader implications for the need for personalized, trustworthy, context-based HMIs.

Data availability

The datasets generated and analyzed during the current study are available from the corresponding author upon reasonable request.

References

1. Fagnant, D. J. & Kockelman, K. Preparing a nation for autonomous vehicles: Opportunities, barriers and policy recommendations. Transport. Res. Part A Policy Pract. 77, 167–181 (2015).
2. National Highway Traffic Safety Administration. Traffic Safety Facts: 2021 data (2021). https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/813473.pdf
3. Blincoe, L. et al. The economic and societal impact of motor vehicle crashes, 2019. Technical Report (2022).
4. Singh, S. Critical reasons for crashes investigated in the national motor vehicle crash causation survey. Technical Report (2015).
5. Braghin, F., Cheli, F., Melzi, S. & Sabbioni, E. Race driver model. Comput. Struct. 86, 1503–1516 (2008).
6. Van Leeuwen, P. M., De Groot, S., Happee, R. & De Winter, J. C. Differences between racing and non-racing drivers: A simulator study using eye-tracking. PLoS ONE 12, e0186871 (2017).
7. McKerral, A. & Pammer, K. Identifying objective behavioural measures of expert driver situation awareness. Accid. Anal. Prev. 163, 106465 (2021).
8. Carbonell, J. R. AI in CAI: An artificial-intelligence approach to computer-assisted instruction. IEEE Trans. Man Mach. Syst. 11, 190–202 (1970).
9. Baidoo-Anu, D. & Ansah, L. O. Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning. J. AI 7, 52–62 (2023).
10. Mozer, M. C., Wiseheart, M. & Novikoff, T. P. Artificial intelligence to support human instruction. Proc. Natl. Acad. Sci. 116, 3953–3955 (2019).
11. Irvin, J. et al. CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison. Proc. AAAI Conf. Artif. Intell. 33, 590–597 (2019).
12. Hosny, A., Parmar, C., Quackenbush, J., Schwartz, L. H. & Aerts, H. J. Artificial intelligence in radiology. Nat. Rev. Cancer 18, 500–510 (2018).
13. Duong, M. T. et al. Artificial intelligence for precision education in radiology. Br. J. Radiol. 92, 20190389 (2019).
14. Holzinger, A., Langs, G., Denk, H., Zatloukal, K. & Müller, H. Causability and explainability of artificial intelligence in medicine. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 9, e1312 (2019).
15. Wang, D., Yang, Q., Abdul, A. & Lim, B. Y. Designing theory-driven user-centric explainable AI. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems 1–15 (2019).
16. Soltani, S., Kaufman, R. A. & Pazzani, M. J. User-centric enhancements to explainable AI algorithms for image classification. In Proceedings of the Annual Meeting of the Cognitive Science Society, vol. 44 (2022).
17. Pazzani, M., Soltani, S., Kaufman, R., Qian, S. & Hsiao, A. Expert-informed, user-centric explanations for machine learning. Proc. AAAI Conf. Artif. Intell. 36, 12280–12286 (2022).
18. Kaufman, R. A. & Kirsh, D. Cognitive differences in human and AI explanation. In Proceedings of the Annual Meeting of the Cognitive Science Society, vol. 44 (2022).
19. Ruan, S. et al. EnglishBot: An AI-powered conversational system for second language learning. In 26th International Conference on Intelligent User Interfaces 434–444 (2021).
20. Becker, B. A. et al. Programming is hard - or at least it used to be: Educational opportunities and challenges of AI code generation. In Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1, 500–506 (2023).
21. Zheng, L., Niu, J., Zhong, L. & Gyasi, J. F. The effectiveness of artificial intelligence on learning achievement and learning perception: A meta-analysis. Interact. Learn. Environ. 31, 5650–5664 (2023).
22. Currano, R., Park, S. Y., Moore, D. J., Lyons, K. & Sirkin, D. Little road driving HUD: Heads-up display complexity influences drivers' perceptions of automated vehicles. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems 1–15 (2021).
23. Omeiza, D., Webb, H., Jirotka, M. & Kunze, L. Explanations in autonomous driving: A survey. IEEE Trans. Intell. Transp. Syst. 23, 10142–10162 (2021).
24. Morra, L., Lamberti, F., Pratticó, F. G., La Rosa, S. & Montuschi, P. Building trust in autonomous vehicles: Role of virtual reality driving simulators in HMI design. IEEE Trans. Veh. Technol. 68, 9438–9450 (2019).
25. Ruijten, P. A., Terken, J. M. & Chandramouli, S. N. Enhancing trust in autonomous vehicles through intelligent user interfaces that mimic human behavior. Multimodal Technol. Interact. 2, 62 (2018).
26. Koo, J. et al. Why did my car just do that? Explaining semi-autonomous driving actions to improve driver understanding, trust, and performance. Int. J. Interact. Des. Manuf. 9, 269–275 (2015).
27. Kaufman, R., Kirsh, D. & Weibel, N. Developing situational awareness for joint action with autonomous vehicles (2024). arXiv preprint arXiv:2404.11800.
28. Ekman, F., Johansson, M. & Sochor, J. Creating appropriate trust in automated vehicle systems: A framework for HMI design. IEEE Trans. Hum. Mach. Syst. 48, 95–101 (2017).
29. Frison, A.-K. et al. In UX we trust: Investigation of aesthetics and usability of driver-vehicle interfaces and their impact on the perception of automated driving. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems 1–13 (2019).
30. Schartmüller, C., Weigl, K., Wintersberger, P., Riener, A. & Steinhauser, M. Text comprehension: Heads-up versus auditory displays: Implications for a productive work environment in SAE level 3 automated vehicles. In Proceedings of the 11th International Conference on Automotive User Interfaces and Interactive Vehicular Applications 342–354 (2019).
31. Wiegand, G., Schmidmaier, M., Weber, T., Liu, Y. & Hussmann, H. I drive - you trust: Explaining driving behavior of autonomous cars. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems 1–6 (2019).
32. Omeiza, D., Kollnig, K., Webb, H., Jirotka, M. & Kunze, L. Why not explain? Effects of explanations on human perceptions of autonomous driving. In 2021 IEEE International Conference on Advanced Robotics and Its Social Impacts (ARSO) 194–199 (IEEE, 2021).
33. Omeiza, D., Webb, H., Jirotka, M. & Kunze, L. Towards accountability: Providing intelligible explanations in autonomous driving. In 2021 IEEE Intelligent Vehicles Symposium (IV) 231–237 (IEEE, 2021).
34. Liu, M. & Qi, B. Design study on the effect of intelligent vehicles interaction mode on drivers' cognitive load. In International Conference on Human-Computer Interaction 42–57 (Springer, 2023).
35. Chang, C.-C., Sodnik, J. & Boyle, L. N. Don't speak and drive: Cognitive workload of in-vehicle speech interactions. In Adjunct Proceedings of the 8th International Conference on Automotive User Interfaces and Interactive Vehicular Applications 99–104 (2016).
36. Jeon, M., Davison, B. K., Nees, M. A., Wilson, J. & Walker, B. N. Enhanced auditory menu cues improve dual task performance and are preferred with in-vehicle technologies. In Proceedings of the 1st International Conference on Automotive User Interfaces and Interactive Vehicular Applications 91–98 (2009).
37. Löcken, A. et al. Towards adaptive ambient in-vehicle displays and interactions: Insights and design guidelines from the 2015 AutomotiveUI dedicated workshop. In Automotive User Interfaces: Creating Interactive Experiences in the Car 325–348 (2017).
38. Xiong, Y. et al. Racing line optimization. Ph.D. thesis, Massachusetts Institute of Technology (2010).
39. Brayshaw, D. & Harrison, M. A quasi steady state approach to race car lap simulation in order to understand the effects of racing line and centre of gravity location. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 219, 725–739 (2005).
40. Ma, X., Xia, L., Zhou, Z., Yang, J. & Zhao, Q. DSAC: Distributional soft actor critic for risk-sensitive reinforcement learning (2020). arXiv preprint arXiv:2004.14547.
41. Chen, L., Subosits, S. M. J. D. J. & Tylkin, P. Learn thy enemy: Online, task-aware opponent modeling in autonomous racing. In Machine Learning for Autonomous Driving Symposium (ML4AD) (2023).
42. AWS Amazon Polly (2023). https://aws.amazon.com/polly/
43. Lim, B. Y., Yang, Q., Abdul, A. M. & Wang, D. Why these explanations? Selecting intelligibility types for explanation goals. In IUI Workshops (2019).
44. Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artif. Intell. 267, 1–38 (2019).
45. Hart, S. G. NASA-Task Load Index (NASA-TLX); 20 years later. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 50, 904–908 (Sage Publications, 2006).
46. Boisgontier, M. P. & Cheval, B. The ANOVA to mixed model transition. Neurosci. Biobehav. Rev. 68, 1004–1005 (2016).
47. Choi, J. K. & Ji, Y. G. Investigating the importance of trust on adopting an autonomous vehicle. Int. J. Hum. Comput. Interact. 31, 692–702 (2015).
48. Magliano, J. P., Larson, A. M., Higgs, K. & Loschky, L. C. The relative roles of visuospatial and linguistic working memory systems in generating inferences during visual narrative comprehension. Mem. Cognit. 44, 207–219 (2016).
49. Wickens, C. D. Processing resources and attention. In Multiple Task Performance 3–34 (CRC Press, 2020).
50. Welch, R. B. & Warren, D. H. Immediate perceptual response to intersensory discrepancy. Psychol. Bull. 88, 638 (1980).
51. Kirschner, P. A. Cognitive load theory: Implications of cognitive load theory on the design of learning (2002).
52. Enke, B. & Graeber, T. Cognitive uncertainty. Q. J. Econ. 138, 2021–2067 (2023).
53. Turk, M. Multimodal interaction: A review. Pattern Recogn. Lett. 36, 189–195 (2014).
54. Yerkes, R. M. et al. The relation of strength of stimulus to rapidity of habit-formation. J. Comp. Neurol. Psychol. 6, 459–482 (1908).
55. Kaufman, R. & Kirsh, D. Explainable AI and visual reasoning: Insights from radiology (2023). arXiv preprint arXiv:2304.03318.
56. Wintersberger, P., Frison, A.-K., Riener, A. & Boyle, L. N. Towards a personalized trust model for highly automated driving. In Mensch und Computer 2016 - Workshopband (2016).
57. Pashler, H., McDaniel, M., Rohrer, D. & Bjork, R. Learning styles: Concepts and evidence. Psychol. Sci. Public Interest 9, 105–119 (2008).
58. Radhakrishnan, V. et al. Physiological indicators of driver workload during car-following scenarios and takeovers in highly automated driving. Transport. Res. F Traffic Psychol. Behav. 87, 149–163 (2022).
59. Liao, Q. V., Zhang, Y., Luss, R., Doshi-Velez, F. & Dhurandhar, A. Connecting algorithmic research and usage contexts: A perspective of contextualized evaluation for explainable AI. Proc. AAAI Conf. Hum. Comput. Crowdsour. 10, 147–159 (2022).
60. Liao, Q. V., Gruen, D. & Miller, S. Questioning the AI: Informing design practices for explainable AI user experiences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems 1–15 (2020).


Acknowledgements

The authors would like to express gratitude to the Human Interactive Driving and Human-Centered AI groups at Toyota Research Institute. A special thanks to Nadir Weibel, Allison Morgan, Andrew Best, Paul Tylkin, Tiffany Chen, Hiro Yasuda, Hanh Nguyen, and Steven Goldine for their individual contributions, feedback, and support.

Author information

Authors and Affiliations

University of California - San Diego, La Jolla, CA, 92093, USA

Robert Kaufman

Toyota Research Institute, Los Altos, 94022, USA

Jean Costa & Everlyne Kimani


Contributions

R.K., E.K., and J.C. conceived the experiment, R.K. conducted the experiment, and R.K. analyzed the results. All authors reviewed the manuscript.

Corresponding author

Correspondence to Robert Kaufman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Kaufman, R., Costa, J. & Kimani, E. Effects of multimodal explanations for autonomous driving on driving performance, cognitive load, expertise, confidence, and trust. Sci Rep 14, 13061 (2024). https://doi.org/10.1038/s41598-024-62052-9

Received: 18 January 2024

Accepted: 09 May 2024

Published: 06 June 2024

DOI: https://doi.org/10.1038/s41598-024-62052-9




Presentation modality and indirect performance information: effects on ratings, reactions, and memory

Affiliation: Department of Psychology, University of Calgary, Alberta, Canada. [email protected]

PMID: 12395818. DOI: 10.1037/0021-9010.87.5.940

The authors investigated whether raters integrate indirect (second-hand) information from an employee's co-worker with their direct observations when completing performance evaluations. Performance levels of direct and indirect information and presentation modality (auditory vs. textual) were manipulated (N = 220). Results showed that indirect information was perceived to be of highest utility when the performance levels of the direct and indirect information were consistent. Confidence in performance ratings was lowest when the indirect source delivered negative performance feedback that was contrary to the rater's own positive observations. Indirect information was only reflected in the performance ratings when direct observations were positive. There was a significant 3-way interaction between performance level of the direct information, performance level of the indirect information, and presentation modality on memory for performance incidents.


Long-term retention of information about presentation modality by children and adults

Published: January 1985. Memory & Cognition, Volume 13, pages 21–28 (1985).

Elyse Brauch Lehman, James W. Mikesell & Suzanne C. Doherty

A mixed-modality continuous recognition task followed by a final recognition test after 0 h, 4 h, 1 day, or 7 days was administered to third- and fourth-grade children and adults. Subjects gave recognition responses and reported presentation modalities. Forgetting rates for both words and input mode were invariant with age. The decay functions for presentation modality were affected, however, by the initial input mode, with modality identification declining more rapidly for words heard first than for words seen first. Information about whether a word was seen or heard remained in memory for at least 4 h. The results demonstrate that long-term-memory representations contain a great deal of information about input mode and suggest that the theoretical distinction between automatic and effortful processing may be a useful one.



Author information

James W. Mikesell. Present address: Henderson Human Development Building, Pennsylvania State University, University Park, PA 16870.

Suzanne C. Doherty. Present address: Tri-County Youth Services Bureau, Charlotte Hall, MD 20622.

Authors and Affiliations

Department of Psychology, George Mason University, 4400 University Drive, Fairfax, VA 22030

Elyse Brauch Lehman, James W. Mikesell & Suzanne C. Doherty


About this article

Lehman, E.B., Mikesell, J.W. & Doherty, S.C. Long-term retention of information about presentation modality by children and adults. Memory & Cognition 13, 21–28 (1985). https://doi.org/10.3758/BF03198439

Received: 02 July 1984

Accepted: 27 December 1984

Issue Date: January 1985

DOI: https://doi.org/10.3758/BF03198439


Keywords:

  • Word Recognition
  • Retention Interval
  • Delay Group
  • Modality Identification

bioRxiv

The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling


Biological language model performance depends heavily on pretraining data quality, diversity, and size. While metagenomic datasets feature enormous biological diversity, their utilization as pretraining data has been limited due to challenges in data accessibility, quality filtering and deduplication. Here, we present the Open MetaGenomic (OMG) corpus, a genomic pretraining dataset totalling 3.1T base pairs and 3.3B protein coding sequences, obtained by combining the two largest metagenomic dataset repositories (JGI's IMG and EMBL's MGnify). We first document the composition of the dataset and describe the quality filtering steps taken to remove poor quality data. We make the OMG corpus available as a mixed-modality genomic sequence dataset that represents multi-gene encoding genomic sequences with translated amino acids for protein coding sequences, and nucleic acids for intergenic sequences. We train the first mixed-modality genomic language model (gLM2) that leverages genomic context information to learn robust functional representations and coevolutionary signals in protein-protein interfaces. Furthermore, we show that deduplication in embedding space can be used to balance the corpus, demonstrating improved performance on downstream tasks. The OMG dataset is publicly hosted on the Hugging Face Hub at https://huggingface.co/datasets/tattabio/OMG and gLM2 is available at https://huggingface.co/tattabio/gLM2_650M.
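
As a usage note (a minimal sketch; only the dataset path comes from the abstract above, and the record schema is left unspecified because it is not documented here), the corpus can be streamed from the Hugging Face Hub with the datasets library:

from itertools import islice

from datasets import load_dataset

# Stream OMG from the Hub rather than downloading the full corpus locally.
omg = load_dataset("tattabio/OMG", split="train", streaming=True)

for record in islice(omg, 3):
    # Each record mixes amino-acid spans (protein coding sequences) and
    # nucleic-acid spans (intergenic sequences); exact field names are not
    # documented here, so we just inspect the keys.
    print(sorted(record.keys()))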

Competing Interest Statement

The authors have declared no competing interest.




Control PowerPoint presentations with your Samsung smart watch

Controlling the presentation with Galaxy Watch6

Control the slides


That's right, you can use your watch to transition between slides instead of your computer. From the watch, navigate to your apps, and tap PPT Controller . Tap Computer , and then tap Connect . Allow your watch to be shown to other Bluetooth devices so you can connect it to your PC. Next, navigate to the Bluetooth settings on your computer and add the watch as a Bluetooth device.

Now that you are connected, you can control your slides. Open a PowerPoint presentation on your computer, and then from the watch, tap the SLIDESHOW play icon. Control the slides by tapping the right arrow. If you need to go back a slide, swipe up from the bottom of your watch screen, and then tap the left arrow .

You can also control your mouse from the watch by tapping Touchpad . This feature is not just limited to PowerPoint - you can literally use your watch as a mouse.

Set presentation alerts


If you're not great with timing while you're presenting, you can set Wrap-up alerts or Interval alerts, so you know when to move things along.

On the watch, navigate to and tap PPT Controller . Tap the Settings icon, and then select Wrap-up alert or Interval alerts .

  • Wrap-up alert : Create an alert for the end time of the presentation.
  • Interval alerts : Set alerts in 5-minute increments.


UN team in Bangladesh to discuss modalities of human rights probe


Writing by Sudipto Ganguly; Editing by Shounak Dasgupta


Chemical Society Reviews

Advances in small-molecule fluorescent probes for the study of apoptosis.

Corresponding authors' affiliations:

a Institute of Pharmaceutical Biotechnology, School of Biology and Food Engineering, Suzhou University, Suzhou 234000, P. R. China

b State Key Laboratory of Pharmaceutical Biotechnology, Nanjing University, Nanjing, P. R. China

c School of Pharmacy, Anhui Province Key Laboratory of Major Autoimmune Diseases, Anhui Institute of Innovative Drugs, Anhui Medical University, Hefei 230032, P. R. China

Apoptosis, as type I cell death, is an active death process strictly controlled by multiple genes, and plays a significant role in regulating various activities. Mounting research indicates that the unique modality of cell apoptosis is directly or indirectly related to different diseases including cancer, autoimmune diseases, viral diseases, neurodegenerative diseases, etc. However, the underlying mechanisms of cell apoptosis are complicated and not fully clarified yet, possibly due to the lack of effective chemical tools for the nondestructive and real-time visualization of apoptosis in complex biological systems. In the past 15 years, various small-molecule fluorescent probes (SMFPs) for imaging apoptosis in vitro and in vivo have attracted broad interest in related disease diagnostics and therapeutics. In this review, we aim to highlight the recent developments of SMFPs based on enzyme activity, plasma membranes, reactive oxygen species, reactive sulfur species, microenvironments and others during cell apoptosis. In particular, we generalize the mechanisms commonly used to design SMFPs for studying apoptosis. In addition, we discuss the limitations of reported probes, and emphasize the potential challenges and prospects in the future. We believe that this review will provide a comprehensive summary and challenging direction for the development of SMFPs in apoptosis related fields.

Graphical abstract: Advances in small-molecule fluorescent probes for the study of apoptosis


Y. Ye, J. Pan, H. Wang, X. Zhang, H. Zhu and X. Liu, Chem. Soc. Rev., 2024, Advance Article, DOI: 10.1039/D4CS00502C



