Case Study – Methods, Examples and Guide

Case Study Research

A case study is a research method that involves an in-depth examination and analysis of a particular phenomenon or case, such as an individual, organization, community, event, or situation.

It is a qualitative research approach that aims to provide a detailed and comprehensive understanding of the case being studied. Case studies typically involve multiple sources of data, including interviews, observations, documents, and artifacts, which are analyzed using various techniques, such as content analysis, thematic analysis, and grounded theory. The findings of a case study are often used to develop theories, inform policy or practice, or generate new research questions.

Types of Case Study

The main types of case study are as follows:

Single-Case Study

A single-case study is an in-depth analysis of a single case. This type of case study is useful when the researcher wants to understand a specific phenomenon in detail.

For example, a researcher might conduct a single-case study on a particular individual to understand their experiences with a specific health condition, or on a specific organization to explore its management practices. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as content analysis or thematic analysis. The findings of a single-case study are often used to generate new research questions, develop theories, or inform policy or practice.

Multiple-Case Study

A multiple-case study involves the analysis of several cases that are similar in nature. This type of case study is useful when the researcher wants to identify similarities and differences between the cases.

For example, a researcher might conduct a multiple-case study on several companies to explore the factors that contribute to their success or failure. The researcher collects data from each case, compares and contrasts the findings, and analyzes the data using techniques such as comparative analysis or pattern-matching. The findings of a multiple-case study can be used to develop theories, inform policy or practice, or generate new research questions.
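To make the pattern-matching idea concrete, here is a minimal sketch in Python. The case names, the predicted pattern, and the observed factors are all invented for illustration; in a real multiple-case study both would come from coded data.

```python
# Illustrative cross-case pattern matching (invented data).
# A predicted pattern of success factors is compared with the factors
# actually observed in each case; matches and gaps guide theory-building.

predicted_success_pattern = {"strong_leadership", "staff_buy_in", "stable_funding"}

observed_factors = {
    "Company A (succeeded)": {"strong_leadership", "staff_buy_in", "stable_funding"},
    "Company B (succeeded)": {"strong_leadership", "staff_buy_in"},
    "Company C (failed)": {"stable_funding"},
}

for case, factors in observed_factors.items():
    matched = sorted(predicted_success_pattern & factors)
    missing = sorted(predicted_success_pattern - factors)
    print(f"{case}: matched {matched}; missing {missing}")
```

If the successful cases consistently match the predicted pattern while the failed cases do not, the pattern gains support; systematic mismatches suggest the theory needs revision.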

Exploratory Case Study

An exploratory case study is used to explore a new or understudied phenomenon. This type of case study is useful when the researcher wants to generate hypotheses or theories about the phenomenon.

For example, a researcher might conduct an exploratory case study on a new technology to understand its potential impact on society. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as grounded theory or content analysis. The findings of an exploratory case study can be used to generate new research questions, develop theories, or inform policy or practice.

Descriptive Case Study

A descriptive case study is used to describe a particular phenomenon in detail. This type of case study is useful when the researcher wants to provide a comprehensive account of the phenomenon.

For example, a researcher might conduct a descriptive case study on a particular community to understand its social and economic characteristics. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as content analysis or thematic analysis. The findings of a descriptive case study can be used to inform policy or practice or generate new research questions.

Instrumental Case Study

An instrumental case study is used to understand a particular phenomenon that is instrumental in achieving a particular goal. This type of case study is useful when the researcher wants to understand the role of the phenomenon in achieving the goal.

For example, a researcher might conduct an instrumental case study on a particular policy to understand its impact on achieving a particular goal, such as reducing poverty. The researcher collects data from multiple sources, such as interviews, observations, and documents, and analyzes the data using techniques such as content analysis or thematic analysis. The findings of an instrumental case study can be used to inform policy or practice or generate new research questions.

Case Study Data Collection Methods

Here are some common data collection methods for case studies:

Interviews

Interviews involve asking questions of individuals who have knowledge or experience relevant to the case study. Interviews can be structured (where the same questions are asked of all participants) or unstructured (where the interviewer follows up on responses with further questions). Interviews can be conducted in person, over the phone, or through video conferencing.

Observations

Observations involve watching and recording the behavior and activities of individuals or groups relevant to the case study. Observations can be participant (where the researcher actively participates in the activities) or non-participant (where the researcher observes from a distance). Observations can be recorded using notes, audio or video recordings, or photographs.

Documents

Documents can be used as a source of information for case studies. Documents can include reports, memos, emails, letters, and other written materials related to the case study. Documents can be collected from the case study participants or from public sources.

Surveys

Surveys involve asking a set of questions to a sample of individuals relevant to the case study. Surveys can be administered in person, over the phone, through mail or email, or online. Surveys can be used to gather information on attitudes, opinions, or behaviors related to the case study.

Artifacts

Artifacts are physical objects relevant to the case study. Artifacts can include tools, equipment, products, or other objects that provide insights into the case study phenomenon.

How to Conduct Case Study Research

Conducting case study research involves several steps that help ensure the quality and rigor of the study:

  • Define the research questions: The first step is to define the research questions. These should be specific, measurable, and relevant to the case study phenomenon under investigation.
  • Select the case: The next step is to select the case or cases to be studied. The case should be relevant to the research questions and should provide rich and diverse data that can be used to answer the research questions.
  • Collect data: Data can be collected using various methods, such as interviews, observations, documents, surveys, and artifacts. The data collection method should be selected based on the research questions and the nature of the case study phenomenon.
  • Analyze the data: The data collected from the case study should be analyzed using techniques such as content analysis, thematic analysis, or grounded theory (see the sketch after this list). The analysis should be guided by the research questions and should aim to provide insights and conclusions relevant to them.
  • Draw conclusions: The conclusions drawn from the case study should be based on the data analysis and should be relevant to the research questions. The conclusions should be supported by evidence and should be clearly stated.
  • Validate the findings: The findings of the case study should be validated by reviewing the data and the analysis with participants or other experts in the field. This helps to ensure the validity and reliability of the findings.
  • Write the report: The final step is to write the report of the case study research. The report should provide a clear description of the case study phenomenon, the research questions, the data collection methods, the data analysis, the findings, and the conclusions. The report should be written in a clear and concise manner and should follow the guidelines for academic writing.
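As a minimal illustration of the analysis step above, the Python sketch below counts how often each code from a hypothetical codebook appears in a few invented interview excerpts. Real content or thematic analysis is interpretive and iterative, not mere keyword matching, so a sketch like this would only serve as a first descriptive pass over the data.

```python
from collections import Counter

# First-pass descriptive code counts for interview excerpts.
# The codebook and excerpts are invented for illustration.

codebook = {
    "access_barriers": ["waitlist", "cost", "transport"],
    "social_support": ["family", "friends", "community"],
}

excerpts = [
    "The waitlist was long and the cost was too high for us.",
    "My family drove me in; without that transport I would have given up.",
    "Friends from the community checked on me every week.",
]

counts = Counter()
for text in excerpts:
    lowered = text.lower()
    for code, keywords in codebook.items():
        counts[code] += sum(lowered.count(keyword) for keyword in keywords)

for code, n in counts.most_common():
    print(f"{code}: {n} keyword hits")
```

Counts like these can point the researcher toward segments worth closer reading, but the interpretive work of refining codes and themes remains manual.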

Examples of Case Study

Here are some examples of case study research:

  • The Hawthorne Studies: Conducted between 1924 and 1932, the Hawthorne Studies were a series of case studies conducted by Elton Mayo and his colleagues to examine the impact of the work environment on employee productivity. The studies were conducted at the Hawthorne Works plant of the Western Electric Company in Cicero, Illinois, and included interviews, observations, and experiments.
  • The Stanford Prison Experiment: Conducted in 1971, the Stanford Prison Experiment was a case study conducted by Philip Zimbardo to examine the psychological effects of power and authority. The study involved simulating a prison environment and assigning participants to the role of guards or prisoners. The study was controversial due to the ethical issues it raised.
  • The Challenger Disaster: The Challenger Disaster was a case study conducted to examine the causes of the Space Shuttle Challenger explosion in 1986. The study included interviews, observations, and analysis of data to identify the technical, organizational, and cultural factors that contributed to the disaster.
  • The Enron Scandal: The Enron Scandal was a case study conducted to examine the causes of the Enron Corporation’s bankruptcy in 2001. The study included interviews, analysis of financial data, and review of documents to identify the accounting practices, corporate culture, and ethical issues that led to the company’s downfall.
  • The Fukushima Nuclear Disaster: The Fukushima Nuclear Disaster was a case study conducted to examine the causes of the nuclear accident that occurred at the Fukushima Daiichi Nuclear Power Plant in Japan in 2011. The study included interviews, analysis of data, and review of documents to identify the technical, organizational, and cultural factors that contributed to the disaster.

Application of Case Study

Case studies have a wide range of applications across various fields and industries. Here are some examples:

Business and Management

Case studies are widely used in business and management to examine real-life situations and develop problem-solving skills. Case studies can help students and professionals to develop a deep understanding of business concepts, theories, and best practices.

Healthcare

Case studies are used in healthcare to examine patient care, treatment options, and outcomes. Case studies can help healthcare professionals to develop critical thinking skills, diagnose complex medical conditions, and develop effective treatment plans.

Education

Case studies are used in education to examine teaching and learning practices. Case studies can help educators to develop effective teaching strategies, evaluate student progress, and identify areas for improvement.

Social Sciences

Case studies are widely used in social sciences to examine human behavior, social phenomena, and cultural practices. Case studies can help researchers to develop theories, test hypotheses, and gain insights into complex social issues.

Law and Ethics

Case studies are used in law and ethics to examine legal and ethical dilemmas. Case studies can help lawyers, policymakers, and ethics professionals to develop critical thinking skills, analyze complex cases, and make informed decisions.

Purpose of Case Study

The purpose of a case study is to provide a detailed analysis of a specific phenomenon, issue, or problem in its real-life context. A case study is a qualitative research method that involves the in-depth exploration and analysis of a particular case, which can be an individual, group, organization, event, or community.

The primary purpose of a case study is to generate a comprehensive and nuanced understanding of the case, including its history, context, and dynamics. Case studies can help researchers to identify and examine the underlying factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and detailed understanding of the case, which can inform future research, practice, or policy.

Case studies can also serve other purposes, including:

  • Illustrating a theory or concept: Case studies can be used to illustrate and explain theoretical concepts and frameworks, providing concrete examples of how they can be applied in real-life situations.
  • Developing hypotheses: Case studies can help to generate hypotheses about the causal relationships between different factors and outcomes, which can be tested through further research.
  • Providing insight into complex issues: Case studies can provide insights into complex and multifaceted issues, which may be difficult to understand through other research methods.
  • Informing practice or policy: Case studies can be used to inform practice or policy by identifying best practices, lessons learned, or areas for improvement.

Advantages of Case Study Research

There are several advantages of case study research, including:

  • In-depth exploration: Case study research allows for a detailed exploration and analysis of a specific phenomenon, issue, or problem in its real-life context. This can provide a comprehensive understanding of the case and its dynamics, which may not be possible through other research methods.
  • Rich data: Case study research can generate rich and detailed data, including qualitative data such as interviews, observations, and documents. This can provide a nuanced understanding of the case and its complexity.
  • Holistic perspective: Case study research allows for a holistic perspective of the case, taking into account the various factors, processes, and mechanisms that contribute to the case and its outcomes. This can help to develop a more accurate and comprehensive understanding of the case.
  • Theory development: Case study research can help to develop and refine theories and concepts by providing empirical evidence and concrete examples of how they can be applied in real-life situations.
  • Practical application: Case study research can inform practice or policy by identifying best practices, lessons learned, or areas for improvement.
  • Contextualization: Case study research takes into account the specific context in which the case is situated, which can help to understand how the case is influenced by the social, cultural, and historical factors of its environment.

Limitations of Case Study Research

There are several limitations of case study research, including:

  • Limited generalizability: Case studies are typically focused on a single case or a small number of cases, which limits the generalizability of the findings. The unique characteristics of the case may not be applicable to other contexts or populations, which may limit the external validity of the research.
  • Biased sampling: Case studies may rely on purposive or convenience sampling, which can introduce bias into the sample selection process. This may limit the representativeness of the sample and the generalizability of the findings.
  • Subjectivity: Case studies rely on the interpretation of the researcher, which can introduce subjectivity into the analysis. The researcher’s own biases, assumptions, and perspectives may influence the findings, which may limit the objectivity of the research.
  • Limited control: Case studies are typically conducted in naturalistic settings, which limits the control that the researcher has over the environment and the variables being studied. This may limit the ability to establish causal relationships between variables.
  • Time-consuming: Case studies can be time-consuming to conduct, as they typically involve a detailed exploration and analysis of a specific case. This may limit the feasibility of conducting multiple case studies or conducting case studies in a timely manner.
  • Resource-intensive: Case studies may require significant resources, including time, funding, and expertise. This may limit the ability of researchers to conduct case studies in resource-constrained settings.


Chapter 10. Introduction to Data Collection Techniques

Introduction

Now that we have discussed various aspects of qualitative research, we can begin to collect data. This chapter serves as a bridge between the first half and second half of this textbook (and perhaps your course) by introducing techniques of data collection. You’ve already been introduced to some of this because qualitative research is often characterized by the form of data collection; for example, an ethnographic study is one that employs primarily observational data collection for the purpose of documenting and presenting a particular culture or ethnos. Thus, some of this chapter will operate as a review of material already covered, but we will be approaching it from the data-collection side rather than the tradition-of-inquiry side we explored in chapters 2 and 4.

Revisiting Approaches

There are four primary techniques of data collection used in qualitative research: interviews, focus groups, observations, and document review. [1] There are other available techniques, such as visual analysis (e.g., photo elicitation) and biography (e.g., autoethnography) that are sometimes used independently or supplementarily to one of the main forms. Not to confuse you unduly, but these various data collection techniques are employed differently by different qualitative research traditions so that sometimes the technique and the tradition become inextricably entwined. This is largely the case with observations and ethnography. The ethnographic tradition is fundamentally based on observational techniques. At the same time, traditions other than ethnography also employ observational techniques, so it is worthwhile thinking of “tradition” and “technique” separately (see figure 10.1).

Figure 10.1. Data Collection Techniques

Each of these data collection techniques will be the subject of its own chapter in the second half of this textbook. This chapter serves as an orienting overview and as the bridge between the conceptual/design portion of qualitative research and the actual practice of conducting qualitative research.

Overview of the Four Primary Approaches

Interviews are at the heart of qualitative research. Returning to epistemological foundations, it is during the interview that the researcher truly opens herself to hearing what others have to say, encouraging her interview subjects to reflect deeply on the meanings and values they hold. Interviews are used in almost every qualitative tradition but are particularly salient in phenomenological studies, studies seeking to understand the meaning of people’s lived experiences.

Focus groups can be seen as a type of interview, one in which a group of persons (ideally between five and twelve) is asked a series of questions focused on a particular topic or subject. They are sometimes used as the primary form of data collection, especially outside academic research. For example, businesses often employ focus groups to determine if a particular product is likely to sell. Among qualitative researchers, focus groups are often used in conjunction with another primary data collection technique as a form of “triangulation,” or a way of increasing the reliability of the study by getting at the object of study from multiple directions. [2] Some traditions, such as feminist approaches, also see the focus group as an important “consciousness-raising” tool.

If interviews are at the heart of qualitative research, observations are its lifeblood. Researchers who are more interested in the practices and behaviors of people than what they think or who are trying to understand the parameters of an organizational culture rely on observations as their primary form of data collection. The notes they make “in the field” (either during observations or afterward) form the “data” that will be analyzed. Ethnographers, those seeking to describe a particular ethnos, or culture, believe that observations are more reliable guides to that culture than what people have to say about it. Observations are thus the primary form of data collection for ethnographers, albeit often supplemented with in-depth interviews.

Some would say that these three—interviews, focus groups, and observations—are really the foundational techniques of data collection. They are far and away the three techniques most frequently used separately, in conjunction with one another, and even sometimes in mixed methods qualitative/quantitative studies. Document review, either as a form of content analysis or separately, however, is an important addition to the qualitative researcher’s toolkit and should not be overlooked (figure 10.1). Although it is rare for a qualitative researcher to make document review their primary or sole form of data collection, including documents in the research design can help expand the reach and the reliability of a study. Document review can take many forms, from historical and archival research, in which the researcher pieces together a narrative of the past by finding and analyzing a variety of “documents” and records (including photographs and physical artifacts), to analyses of contemporary media content, as in the case of compiling and coding blog posts or other online commentaries, and content analysis that identifies and describes communicative aspects of media or documents.

In addition to these four major techniques, there are a host of emerging and incidental data collection techniques, from photo elicitation or photo voice, in which respondents are asked to comment upon a photograph or image (particularly useful as a supplement to interviews when the respondents are hesitant or unable to answer direct questions), to autoethnographies, in which the researcher uses his own position and life to increase our understanding about a phenomenon and its historical and social context.

Taken together, these techniques provide a wide range of practices and tools with which to discover the world. They are particularly suited to addressing the questions that qualitative researchers ask—questions about how things happen and why people act the way they do, given particular social contexts and shared meanings about the world (chapter 4).

Triangulation and Mixed Methods

Because the researcher plays such a large and nonneutral role in qualitative research, one that requires constant reflexivity and awareness (chapter 6), there is a constant need to reassure her audience that the results she finds are reliable. Quantitative researchers can point to any number of measures of statistical significance to reassure their audiences, but qualitative researchers do not have math to hide behind. And she will also want to reassure herself that what she is hearing in her interviews or observing in the field is a true reflection of what is going on (or as “true” as possible, given the problem that the world is as large and varied as the elephant; see chapter 3). For those reasons, it is common for researchers to employ more than one data collection technique or to include multiple and comparative populations, settings, and samples in the research design (chapter 2). A single set of interviews or initial comparison of focus groups might be conceived as a “pilot study” from which to launch the actual study. Undergraduate students working on a research project might be advised to think about their projects in this way as well. You are simply not going to have enough time or resources as an undergraduate to construct and complete a successful qualitative research project, but you may be able to tackle a pilot study. Graduate students also need to think about the amount of time and resources they have for completing a full study. Master’s-level students, or students who have one year or less in which to complete a program, should probably consider their study as an initial exploratory pilot. PhD candidates might have the time and resources to devote to the type of triangulated, multifaceted research design called for by the research question.

We call the use of multiple qualitative methods of data collection and the inclusion of multiple and comparative populations and settings “triangulation.” Using different data collection methods allows us to check the consistency of our findings. For example, a study of the vaccine hesitant might include a set of interviews with vaccine-hesitant people, a focus group drawn from the same population, and a content analysis of online comments about a vaccine mandate. By employing all three methods, we can be more confident of our interpretations from the interviews alone (especially if we are hearing the same thing throughout; if we are not, then this is a good sign that we need to push a little further to find out what is really going on). [3] Methodological triangulation is an important tool for increasing the reliability of our findings and the overall success of our research.
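A minimal sketch of that consistency check, in Python with invented theme labels for the vaccine-hesitancy example: themes that surface in every data source support the interpretation, while source-specific themes mark where to push further.

```python
# Methodological triangulation as a consistency check (invented themes).

themes_by_source = {
    "interviews": {"distrust_of_institutions", "safety_concerns", "peer_influence"},
    "focus_group": {"distrust_of_institutions", "safety_concerns"},
    "online_comments": {"distrust_of_institutions", "misinformation_exposure"},
}

sources = list(themes_by_source.values())
convergent = set.intersection(*sources)  # appears in every source
divergent = set.union(*sources) - convergent  # appears in only some sources

print("Convergent themes:", sorted(convergent))
print("Themes to probe further:", sorted(divergent))
```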

Methodological triangulation should not be confused with mixed methods techniques, which refer instead to the combining of qualitative and quantitative research methods. Mixed methods studies can increase reliability, but that is not their primary purpose. Mixed methods address multiple research questions, both the “how many” and “why” kind, or the causal and explanatory kind. Mixed methods will be discussed in more detail in chapter 15.

Let us return to the three examples of qualitative research described in chapter 1: Cory Abramson’s study of aging (The End Game), Jennifer Pierce’s study of lawyers and discrimination (Racing for Innocence), and my own study of liberal arts college students (Amplified Advantage). Each of these studies uses triangulation.

Abramson’s book is primarily based on three years of observations in four distinct neighborhoods. He chose the neighborhoods in such a way as to maximize his ability to make comparisons: two were primarily middle class and two were primarily poor; further, within each set, one was predominantly White, while the other was either racially diverse or primarily African American. In each neighborhood, he was present in senior centers, doctors’ offices, public transportation, and other public spots where the elderly congregated. [4] The observations are the core of the book, and they are richly written and described in very moving passages. But it wasn’t enough for him to watch the seniors. He also engaged with them in casual conversation. That, too, is part of fieldwork. He sometimes even helped them make it to the doctor’s office or get around town. Going beyond these interactions, he also interviewed sixty seniors, an equal number from each of the four neighborhoods. It was in the interviews that he could ask more detailed questions about their lives, what they thought about aging, what it meant to them to be considered old, and what their hopes and frustrations were. He could see that those living in the poor neighborhoods had a more difficult time accessing care and resources than those living in the more affluent neighborhoods, but he couldn’t know how the seniors understood these difficulties without interviewing them. Both forms of data collection supported each other and helped make the study richer and more insightful. Interviews alone would have failed to demonstrate the very real differences he observed (and that some seniors would not even have known about). This is the value of methodological triangulation.

Pierce’s book relies on two separate forms of data collection—interviews with lawyers at a firm that has experienced a history of racial discrimination and content analyses of news stories and popular films that screened during the same years of the alleged racial discrimination. I’ve used this book when teaching methods and have often found students struggle with understanding why these two forms of data collection were used. I think this is because we don’t teach students to appreciate or recognize “popular films” as a legitimate form of data. But what Pierce does is interesting and insightful in the best tradition of qualitative research. Here is a description of the content analyses from a review of her book:

In the chapter on the news media, Professor Pierce uses content analysis to argue that the media not only helped shape the meaning of affirmative action, but also helped create white males as a class of victims. The overall narrative that emerged from these media accounts was one of white male innocence and victimization. She also maintains that this narrative was used to support “neoconservative and neoliberal political agendas” (p. 21). The focus of these articles tended to be that affirmative action hurt white working-class and middle-class men particularly during the recession in the 1980s (despite statistical evidence that people of color were hurt far more than white males by the recession). In these stories fairness and innocence were seen in purely individual terms. Although there were stories that supported affirmative action and developed a broader understanding of fairness, the total number of stories slanted against affirmative action from 1990 to 1999. During that time period negative stories always outnumbered those supporting the policy, usually by a ratio of 3:1 or 3:2. Headlines, the presentation of polling data, and an emphasis in stories on racial division, Pierce argues, reinforced the story of white male victimization. Interestingly, the news media did very few stories on gender and affirmative action. The chapter on the film industry from 1989 to 1999 reinforces Pierce’s argument and adds another layer to her interpretation of affirmative action during this time period. She sampled almost 60 Hollywood films with receipts ranging from four million to 184 million dollars. In this chapter she argues that the dominant theme of these films was racial progress and the redemption of white Americans from past racism. These movies usually portrayed white, elite, and male experiences. People of color were background figures who supported the protagonist and “anointed” him as a savior (p. 45). Over the course of the film the protagonists move from “innocence to consciousness” concerning racism. The antagonists in these films most often were racist working-class white men. A Time to Kill, Mississippi Burning, Amistad, Ghosts of Mississippi, The Long Walk Home, To Kill a Mockingbird, and Dances with Wolves receive particular analysis in this chapter, and her examination of them leads Pierce to conclude that they infused a myth of racial progress into America’s cultural memory. White experiences of race are the focus and contemporary forms of racism are underplayed or omitted. Further, these films stereotype both working-class and elite white males, and underscore the neoliberal emphasis on individualism. (Hrezo 2012)

With that context in place, Pierce then turned to interviews with attorneys. She finds that White male attorneys often misremembered facts about the period in which the law firm was accused of racial discrimination and that they often portrayed their firms as having made substantial racial progress. This was in contrast to many of the lawyers of color and female lawyers who remembered the history differently and who saw continuing examples of racial (and gender) discrimination at the law firm. In most of the interviews, people talked about individuals, not structure (and these are attorneys, who really should know better!). By including both content analyses and interviews in her study, Pierce is better able to situate the attorney narratives and explain the larger context for the shared meanings of individual innocence and racial progress. Had this been a study only of films during this period, we would not know how actual people who lived during this period understood the decisions they made; had we had only the interviews, we would have missed the historical context and seen a lot of these interviewees as, well, not very nice people at all. Together, we have a study that is original, inventive, and insightful.

My own study of how class background affects the experiences and outcomes of students at small liberal arts colleges relies on mixed methods and triangulation. At the core of the book is an original survey of college students across the US. From analyses of this survey, I can present findings on “how many” questions and descriptive statistics comparing students of different social class backgrounds. For example, I know and can demonstrate that working-class college students are less likely to go to graduate school after college than upper-class college students are. I can even give you some estimates of the class gap. But what I can’t tell you from the survey is exactly why this is so or how it came to be so . For that, I employ interviews, focus groups, document reviews, and observations. Basically, I threw the kitchen sink at the “problem” of class reproduction and higher education (i.e., Does college reduce class inequalities or make them worse?). A review of historical documents provides a picture of the place of the small liberal arts college in the broader social and historical context. Who had access to these colleges and for what purpose have always been in contest, with some groups attempting to exclude others from opportunities for advancement. What it means to choose a small liberal arts college in the early twenty-first century is thus different for those whose parents are college professors, for those whose parents have a great deal of money, and for those who are the first in their family to attend college. I was able to get at these different understandings through interviews and focus groups and to further delineate the culture of these colleges by careful observation (and my own participation in them, as both former student and current professor). Putting together individual meanings, student dispositions, organizational culture, and historical context allowed me to present a story of how exactly colleges can both help advance first-generation, low-income, working-class college students and simultaneously amplify the preexisting advantages of their peers. Mixed methods addressed multiple research questions, while triangulation allowed for this deeper, more complex story to emerge.

In the next few chapters, we will explore each of the primary data collection techniques in much more detail. As we do so, think about how these techniques may be productively joined for more reliable and deeper studies of the social world.

Advanced Reading: Triangulation

Denzin (1978) identified four basic types of triangulation: data, investigator, theory, and methodological. Properly speaking, if we use the Denzin typology, the use of multiple methods of data collection and analysis to strengthen one’s study is really a form of methodological triangulation. It may be helpful to understand how this differs from the other types.

Data triangulation occurs when the researcher uses a variety of sources in a single study. Perhaps they are interviewing multiple samples of college students. Obviously, this overlaps with sample selection (see chapter 5). It is helpful for the researcher to understand that these multiple data sources add strength and reliability to the study. After all, it is not just “these students here” but also “those students over there” that are experiencing this phenomenon in a particular way.

Investigator triangulation occurs when different researchers or evaluators are part of the research team. Intercoder reliability is a form of investigator triangulation (or at least a way of leveraging the power of multiple researchers to raise the reliability of the study).
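Intercoder reliability is commonly summarized with a chance-corrected agreement statistic such as Cohen’s kappa, defined as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The Python sketch below computes it for two coders’ invented decisions on ten excerpts.

```python
from collections import Counter

# Cohen's kappa for two coders (invented coding decisions).
# kappa = (p_o - p_e) / (1 - p_e)

coder_a = ["barrier", "support", "barrier", "barrier", "support",
           "barrier", "support", "support", "barrier", "barrier"]
coder_b = ["barrier", "support", "support", "barrier", "support",
           "barrier", "barrier", "support", "barrier", "barrier"]

n = len(coder_a)
p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n  # observed agreement

freq_a, freq_b = Counter(coder_a), Counter(coder_b)
labels = set(coder_a) | set(coder_b)
p_e = sum((freq_a[label] / n) * (freq_b[label] / n) for label in labels)

kappa = (p_o - p_e) / (1 - p_e)
print(f"observed agreement {p_o:.2f}, chance agreement {p_e:.2f}, kappa {kappa:.2f}")
```

A kappa near 1 indicates strong agreement beyond chance; low values usually mean the codebook definitions need refinement before coding continues.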

Theory triangulation is the use of multiple perspectives to interpret a single set of data, as in the case of competing theoretical paradigms (e.g., a human capital approach vs. a Bourdieusian multiple capital approach).

Methodological triangulation , as explained in this chapter, is the use of multiple methods to study a single phenomenon, issue, or problem.

Further Readings

Carter, Nancy, Denise Bryant-Lukosius, Alba DiCenso, Jennifer Blythe, and Alan J. Neville. 2014. “The Use of Triangulation in Qualitative Research.” Oncology Nursing Forum 41(5):545–547. Discusses the four types of triangulation identified by Denzin, with an example of the use of focus groups and in-depth individual interviews.

Mathison, Sandra. 1988. “Why Triangulate?” Educational Researcher 17(2):13–17. Presents three particular ways of assessing validity through the use of triangulated data collection: convergence, inconsistency, and contradiction.

Tracy, Sarah J. 2010. “Qualitative Quality: Eight ‘Big-Tent’ Criteria for Excellent Qualitative Research.” Qualitative Inquiry 16(10):837–851. Focuses on triangulation as a criterion for conducting valid qualitative research.

  • Marshall and Rossman (2016) state this slightly differently. They list four primary methods for gathering information: (1) participating in the setting, (2) observing directly, (3) interviewing in depth, and (4) analyzing documents and material culture (141). An astute reader will note that I have collapsed participation into observation and that I have distinguished focus groups from interviews. I suspect that this distinction marks me as more of an interview-based researcher, while Marshall and Rossman prioritize ethnographic approaches. The main point of this footnote is to show you, the reader, that there is no single agreed-upon number of approaches to collecting qualitative data.
  • See “Advanced Reading: Triangulation” at the end of this chapter.
  • We can also think about triangulating the sources, as when we include comparison groups in our sample (e.g., if we include those receiving vaccines, we might find out a bit more about where the real differences lie between them and the vaccine hesitant); triangulating the analysts (building a research team so that your interpretations can be checked against those of others on the team); and even triangulating the theoretical perspective (as when we “try on,” say, different conceptualizations of social capital in our analyses).

Introduction to Qualitative Research Methods Copyright © 2023 by Allison Hurst is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, except where otherwise noted.

Continuing to enhance the quality of case study methodology in health services research

Shannon L. Sibbald

1 Faculty of Health Sciences, Western University, London, Ontario, Canada.

2 Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.

3 The Schulich Interfaculty Program in Public Health, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.

Stefan Paciocco

Meghan Fournie, Rachelle Van Asseldonk, Tiffany Scurr

Case study methodology has grown in popularity within Health Services Research (HSR). However, its use and merit as a methodology are frequently criticized due to its flexible approach and inconsistent application. Nevertheless, case study methodology is well suited to HSR because it can track and examine complex relationships, contexts, and systems as they evolve. Applied appropriately, it can help generate information on how multiple forms of knowledge come together to inform decision-making within healthcare contexts. In this article, we aim to demystify case study methodology by outlining its philosophical underpinnings and three foundational approaches. We provide literature-based guidance to decision-makers, policy-makers, and health leaders on how to engage in and critically appraise case study design. We advocate that researchers work in collaboration with health leaders to detail their research process with an aim of strengthening the validity and integrity of case study for its continued and advanced use in HSR.

Introduction

The popularity of case study research methodology in Health Services Research (HSR) has grown over the past 40 years. 1 This may be attributed to a shift towards the use of implementation research and a newfound appreciation of contextual factors affecting the uptake of evidence-based interventions within diverse settings. 2 Incorporating context-specific information on the delivery and implementation of programs can increase the likelihood of success. 3 , 4 Case study methodology is particularly well suited for implementation research in health services because it can provide insight into the nuances of diverse contexts. 5 , 6 In 1999, Yin 7 published a paper on how to enhance the quality of case study in HSR, which was foundational for the emergence of case study in this field. Yin 7 maintains case study is an appropriate methodology in HSR because health systems are constantly evolving, and the multiple affiliations and diverse motivations are difficult to track and understand with traditional linear methodologies.

Despite its increased popularity, there is debate whether a case study is a methodology (ie, a principle or process that guides research) or a method (ie, a tool to answer research questions). Some criticize case study for its high level of flexibility, perceiving it as less rigorous, and maintain that it generates inadequate results. 8 Others have noted issues with quality and consistency in how case studies are conducted and reported. 9 Reporting is often varied and inconsistent, using a mix of approaches such as case reports, case findings, and/or case study. Authors sometimes use incongruent methods of data collection and analysis or use the case study as a default when other methodologies do not fit. 9 , 10 Despite these criticisms, case study methodology is becoming more common as a viable approach for HSR. 11 An abundance of articles and textbooks are available to guide researchers through case study research, including field-specific resources for business, 12 , 13 nursing, 14 and family medicine. 15 However, there remains confusion and a lack of clarity on the key tenets of case study methodology.

Several common philosophical underpinnings have contributed to the development of case study research 1 which has led to different approaches to planning, data collection, and analysis. This presents challenges in assessing quality and rigour for researchers conducting case studies and stakeholders reading results.

This article discusses the various approaches and philosophical underpinnings to case study methodology. Our goal is to explain it in a way that provides guidance for decision-makers, policy-makers, and health leaders on how to understand, critically appraise, and engage in case study research and design, as such guidance is largely absent in the literature. This article is by no means exhaustive or authoritative. Instead, we aim to provide guidance and encourage dialogue around case study methodology, facilitating critical thinking around the variety of approaches and ways quality and rigour can be bolstered for its use within HSR.

Purpose of case study methodology

Case study methodology is often used to develop an in-depth, holistic understanding of a specific phenomenon within a specified context. 11 It focuses on studying one or multiple cases over time and uses an in-depth analysis of multiple information sources. 16 , 17 It is ideal for situations including, but not limited to, exploring under-researched and real-life phenomena, 18 especially when the contexts are complex and the researcher has little control over the phenomena. 19 , 20 Case studies can be useful when researchers want to understand how interventions are implemented in different contexts, and how context shapes the phenomenon of interest.

In addition to demonstrating coherency with the type of questions case study is suited to answer, there are four key tenets to case study methodologies: (1) be transparent in the paradigmatic and theoretical perspectives influencing study design; (2) clearly define the case and phenomenon of interest; (3) clearly define and justify the type of case study design; and (4) use multiple data collection sources and analysis methods to present the findings in ways that are consistent with the methodology and the study’s paradigmatic base. 9 , 16 The goal is to appropriately match the methods to empirical questions and issues and not to universally advocate any single approach for all problems. 21

Approaches to case study methodology

Three authors propose distinct foundational approaches to case study methodology positioned within different paradigms: Yin, 19 , 22 Stake, 5 , 23 and Merriam 24 , 25 ( Table 1 ). Yin is strongly post-positivist whereas Stake and Merriam are grounded in a constructivist paradigm. Researchers should locate their research within a paradigm that explains the philosophies guiding their research 26 and adhere to the underlying paradigmatic assumptions and key tenets of the appropriate author’s methodology. This will enhance the consistency and coherency of the methods and findings. However, researchers often do not report their paradigmatic position, nor do they adhere to one approach. 9 Although deliberately blending methodologies may be defensible and methodologically appropriate, more often it is done in an ad hoc and haphazard way, without consideration for limitations.

Table 1. Cross-analysis of three case study approaches, adapted from Yazan (2015).

The post-positivist paradigm postulates that there is one reality that can be objectively described and understood by “bracketing” oneself from the research to remove prejudice or bias. 27 Yin focuses on general explanation and prediction, emphasizing the formulation of propositions, akin to hypothesis testing. This approach is best suited for structured and objective data collection 9 , 11 and is often used for mixed-method studies.

Constructivism assumes that the phenomenon of interest is constructed and influenced by local contexts, including the interaction between researchers, individuals, and their environment. 27 It acknowledges multiple interpretations of reality 24 constructed within the context by the researcher and participants which are unlikely to be replicated, should either change. 5 , 20 Stake and Merriam’s constructivist approaches emphasize a story-like rendering of a problem and an iterative process of constructing the case study. 7 This stance values researcher reflexivity and transparency, 28 acknowledging how researchers’ experiences and disciplinary lenses influence their assumptions and beliefs about the nature of the phenomenon and development of the findings.

Defining a case

A key tenet of case study methodology often underemphasized in the literature is the importance of defining the case and phenomenon. Researchers should clearly describe the case with sufficient detail to allow readers to fully understand the setting and context and determine applicability. Trying to answer a question that is too broad often leads to an unclear definition of the case and phenomenon. 20 Cases should therefore be bound by time and place to ensure rigour and feasibility. 6

Yin 22 defines a case as “a contemporary phenomenon within its real-life context,” (p13) which may contain a single unit of analysis, including individuals, programs, corporations, or clinics 29 (holistic), or be broken into sub-units of analysis, such as projects, meetings, roles, or locations within the case (embedded). 30 Merriam 24 and Stake 5 similarly define a case as a single unit studied within a bounded system. Stake 5 , 23 suggests bounding cases by contexts and experiences where the phenomenon of interest can be a program, process, or experience. However, the line between the case and phenomenon can become muddy. For guidance, Stake 5 , 23 describes the case as the noun or entity and the phenomenon of interest as the verb, functioning, or activity of the case.

Designing the case study approach

Yin’s approach to a case study is rooted in a formal proposition or theory which guides the case and is used to test the outcome. 1 Stake 5 advocates for a flexible design and explicitly states that data collection and analysis may commence at any point. Merriam’s 24 approach blends both Yin’s and Stake’s, allowing the necessary flexibility in data collection and analysis to meet the needs of the study.

Yin 30 proposed three types of case study approaches—descriptive, explanatory, and exploratory. Each can be designed around single or multiple cases, creating six basic case study methodologies. Descriptive studies provide a rich description of the phenomenon within its context, which can be helpful in developing theories. To test a theory or determine cause and effect relationships, researchers can use an explanatory design. An exploratory model is typically used in the pilot-test phase to develop propositions (eg, Sibbald et al. 31 used this approach to explore interprofessional network complexity). Despite having distinct characteristics, the boundaries between case study types are flexible, with significant overlap. 30 Each has five key components: (1) research question; (2) propositions; (3) unit(s) of analysis; (4) the logic linking the data to the propositions; and (5) criteria for interpreting the findings.

Contrary to Yin, Stake 5 believes the research process cannot be planned in its entirety because research evolves as it is performed. Consequently, researchers can adjust the design of their methods even after data collection has begun. Stake 5 classifies case studies into three categories: intrinsic, instrumental, and collective/multiple. Intrinsic case studies focus on gaining a better understanding of the case. These are often undertaken when the researcher has an interest in a specific case. Instrumental case study is used when the case itself is not of the utmost importance, and the issue or phenomenon (ie, the research question) being explored becomes the focus instead (eg, Paciocco 32 used an instrumental case study to evaluate the implementation of a chronic disease management program). 5 Collective designs are rooted in an instrumental case study and include multiple cases to gain an in-depth understanding of the complexity and particularity of a phenomenon across diverse contexts. 5 , 23 In collective designs, studying similarities and differences between the cases allows the phenomenon to be understood more intimately (for examples of this in the field, see van Zelm et al. 33 and Burrows et al. 34 In addition, Sibbald et al. 35 present an example where a cross-case analysis method is used to compare instrumental cases).

Merriam’s approach is flexible (similar to Stake) as well as stepwise and linear (similar to Yin). She advocates for conducting a literature review before designing the study to better understand the theoretical underpinnings. 24 , 25 Unlike Stake or Yin, Merriam proposes a step-by-step guide for researchers to design a case study. These steps include performing a literature review, creating a theoretical framework, identifying the problem, creating and refining the research question(s), and selecting a study sample that fits the question(s). 24 , 25 , 36

Data collection and analysis

Using multiple data collection methods is a key characteristic of all case study methodology; it enhances the credibility of the findings by allowing different facets and views of the phenomenon to be explored. 23 Common methods include interviews, focus groups, observation, and document analysis. 5 , 37 By seeking patterns within and across data sources, a thick description of the case can be generated to support a greater understanding and interpretation of the whole phenomenon. 5 , 17 , 20 , 23 This technique is called triangulation and is used to explore cases with greater accuracy. 5 Although Stake 5 maintains case study is most often used in qualitative research, Yin 17 supports a mix of both quantitative and qualitative methods to triangulate data. This deliberate convergence of data sources (or mixed methods) allows researchers to find greater depth in their analysis and develop converging lines of inquiry. For example, case studies evaluating interventions commonly use qualitative interviews to describe the implementation process, barriers, and facilitators paired with a quantitative survey of comparative outcomes and effectiveness. 33 , 38 , 39
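As a hedged sketch of such convergence, the Python snippet below places interview-derived barrier counts beside a quantitative outcome for three invented clinic sites, so the analyst can ask whether the qualitative and quantitative evidence point the same way and examine divergent sites more closely.

```python
# Converging qualitative and quantitative evidence per site (invented data).

sites = {
    "Clinic A": {"barriers_coded": 2, "uptake_rate": 0.81},
    "Clinic B": {"barriers_coded": 6, "uptake_rate": 0.42},
    "Clinic C": {"barriers_coded": 3, "uptake_rate": 0.74},
}

# Sort by outcome so mismatches (few barriers but poor uptake, or vice
# versa) stand out for follow-up analysis.
for name, d in sorted(sites.items(), key=lambda kv: kv[1]["uptake_rate"], reverse=True):
    print(f"{name}: {d['barriers_coded']} barriers coded, uptake {d['uptake_rate']:.0%}")
```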

Yin 30 describes analysis as dependent on the chosen approach, whether it be (1) deductive and rely on theoretical propositions; (2) inductive and analyze data from the “ground up”; (3) organized to create a case description; or (4) used to examine plausible rival explanations. According to Yin’s 40 approach to descriptive case studies, carefully considering theory development is an important part of study design. “Theory” refers to field-relevant propositions, commonly agreed upon assumptions, or fully developed theories. 40 Stake 5 advocates for using the researcher’s intuition and impression to guide analysis through a categorical aggregation and direct interpretation. Merriam 24 uses six different methods to guide the “process of making meaning” (p178) : (1) ethnographic analysis; (2) narrative analysis; (3) phenomenological analysis; (4) constant comparative method; (5) content analysis; and (6) analytic induction.

Drawing upon a theoretical or conceptual framework to inform analysis improves the quality of case study and avoids the risk of description without meaning. 18 Using Stake’s 5 approach, researchers rely on protocols and previous knowledge to help make sense of new ideas; theory can guide the research and assist researchers in understanding how new information fits into existing knowledge.

Practical applications of case study research

Columbia University has recently demonstrated how case studies can help train future health leaders. 41 Case studies encompass components of systems thinking—considering connections and interactions between components of a system, alongside the implications and consequences of those relationships—to equip health leaders with tools to tackle global health issues. 41 Greenwood 42 evaluated Indigenous peoples’ relationship with the healthcare system in British Columbia and used a case study to challenge and educate health leaders across the country to enhance culturally sensitive health service environments.

An important but often omitted step in case study research is an assessment of quality and rigour. We recommend using a framework or set of criteria to assess the rigour of the qualitative research. Suitable resources include Caelli et al., 43 Houghton et al., 44 Ravenek and Rudman, 45 and Tracy. 46

New directions in case study

Although “pragmatic” case studies (ie, utilizing practical and applicable methods) have existed within psychotherapy for some time, 47 , 48 only recently has the applicability of pragmatism as an underlying paradigmatic perspective been considered in HSR. 49 This is marked by the uptake of pragmatism in Randomized Controlled Trials, recognizing that “gold standard” testing conditions do not reflect the reality of clinical settings 50 , 51 nor do a handful of epistemologically guided methodologies suit every research inquiry.

Pragmatism positions the research question as the basis for methodological choices, rather than a theory or epistemology, allowing researchers to pursue the most practical approach to understanding a problem or discovering an actionable solution. 52 Mixed methods are commonly used to create a deeper understanding of the case through converging qualitative and quantitative data. 52 Pragmatic case study is suited to HSR because its flexibility throughout the research process accommodates complexity, ever-changing systems, and disruptions to research plans. 49 , 50 Much like case study, pragmatism has been criticized for its flexibility and use when other approaches are seemingly ill-fit. 53 , 54 Similarly, authors argue that this results from a lack of investigation and proper application rather than a reflection of validity, legitimizing the need for more exploration and conversation among researchers and practitioners. 55

Although occasionally misunderstood as a less rigorous research methodology, 8 case study research is highly flexible and allows for contextual nuances. 5 , 6 Its use is valuable when the researcher desires a thorough understanding of a phenomenon or case bound by context. 11 If needed, multiple similar cases can be studied simultaneously, or one case within another. 16 , 17 There are currently three main approaches to case study, 5 , 17 , 24 each with its own definitions of a case, ontological and epistemological paradigms, methodologies, and data collection and analysis procedures. 37

Individuals’ experiences within health systems are influenced heavily by contextual factors, participant experience, and intricate relationships between different organizations and actors. 55 Case study research is well suited for HSR because it can track and examine these complex relationships and systems as they evolve over time. 6 , 7 It is important that researchers and health leaders using this methodology understand its key tenets and how to conduct a proper case study. Although there are many examples of case study in action, they are often under-reported and, when reported, not rigorously conducted. 9 Thus, decision-makers and health leaders should use these examples with caution. The proper reporting of case studies is necessary to bolster their credibility in HSR literature and provide readers sufficient information to critically assess the methodology. We also call on health leaders who frequently use case studies 56 – 58 to report them in the primary research literature.

The purpose of this article is to advocate for the continued and advanced use of case study in HSR and to provide literature-based guidance for decision-makers, policy-makers, and health leaders on how to engage in, read, and interpret findings from case study research. As health systems progress and evolve, the application of case study research will continue to increase as researchers and health leaders aim to capture the inherent complexities, nuances, and contextual factors. 7


Case Study Research Method in Psychology

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


Case studies are in-depth investigations of a person, group, event, or community. Typically, data is gathered from various sources using several methods (e.g., observations & interviews).

The case study research method originated in clinical medicine (the case history, i.e., the patient’s personal history). In psychology, case studies are often confined to the study of a particular individual.

The information is mainly biographical and relates to events in the individual’s past (i.e., retrospective), as well as to significant events that are currently occurring in his or her everyday life.

Strictly speaking, the case study is not itself a single research method; rather, researchers select the methods of data collection and analysis that will generate material suitable for case studies.

Freud (1909a, 1909b) conducted very detailed investigations into the private lives of his patients in an attempt to both understand and help them overcome their illnesses.

This highlights an ethical issue of competence: only someone qualified to diagnose and treat a person, i.e., a psychologist, therapist, or psychiatrist with a professional qualification, should conduct a formal case study relating to atypical (i.e., abnormal) behavior or atypical development.


 Famous Case Studies

  • Anna O – One of the most famous case studies, documenting psychoanalyst Josef Breuer’s treatment of “Anna O” (real name Bertha Pappenheim) for hysteria in the late 1800s using early psychoanalytic theory.
  • Little Hans – A child psychoanalysis case study published by Sigmund Freud in 1909 analyzing his five-year-old patient Herbert Graf’s house phobia as related to the Oedipus complex.
  • Bruce/Brenda – Gender identity case of the boy (Bruce) whose botched circumcision led psychologist John Money to advise gender reassignment and raise him as a girl (Brenda) in the 1960s.
  • Genie Wiley – Linguistics/psychological development case of the victim of extreme isolation abuse who was studied in 1970s California for effects of early language deprivation on acquiring speech later in life.
  • Phineas Gage – One of the most famous neuropsychology case studies, analyzing the personality changes in railroad worker Phineas Gage after an 1848 brain injury in which a tamping iron pierced his skull.

Clinical Case Studies

  • Studying the effectiveness of psychotherapy approaches with an individual patient
  • Assessing and treating mental illnesses like depression, anxiety disorders, PTSD
  • Neuropsychological cases investigating brain injuries or disorders

Child Psychology Case Studies

  • Studying psychological development from birth through adolescence
  • Cases of learning disabilities, autism spectrum disorders, ADHD
  • Effects of trauma, abuse, deprivation on development

Types of Case Studies

  • Explanatory case studies : Used to explore causation in order to find underlying principles. Helpful for doing qualitative analysis to explain presumed causal links.
  • Exploratory case studies : Used to explore situations where an intervention being evaluated has no clear set of outcomes. It helps define questions and hypotheses for future research.
  • Descriptive case studies : Describe an intervention or phenomenon and the real-life context in which it occurred. It is helpful for illustrating certain topics within an evaluation.
  • Multiple-case studies : Used to explore differences between cases and replicate findings across cases. Helpful for comparing and contrasting specific cases.
  • Intrinsic : Used to gain a better understanding of a particular case. Helpful for capturing the complexity of a single case.
  • Collective : Used to explore a general phenomenon using multiple case studies. Helpful for jointly studying a group of cases in order to inquire into the phenomenon.

Where Do You Find Data for a Case Study?

There are several places to find data for a case study. The key is to gather data from multiple sources to get a complete picture of the case and corroborate facts or findings through triangulation of evidence. Most of this information is likely qualitative (i.e., verbal description rather than measurement), but the psychologist might also collect numerical data.

1. Primary sources

  • Interviews – Interviewing key people related to the case to get their perspectives and insights. The interview is an extremely effective procedure for obtaining information about an individual; it may be used to collect comments from the person’s friends, parents, employer, workmates, and others who have a good knowledge of the person, as well as to obtain facts from the person themselves.
  • Observations – Observing behaviors, interactions, processes, etc., related to the case as they unfold in real-time.
  • Documents & Records – Reviewing private documents, diaries, public records, correspondence, meeting minutes, etc., relevant to the case.

2. Secondary sources

  • News/Media – News coverage of events related to the case study.
  • Academic articles – Journal articles, dissertations etc. that discuss the case.
  • Government reports – Official data and records related to the case context.
  • Books/films – Books, documentaries or films discussing the case.

3. Archival records

Searching historical archives, museum collections and databases to find relevant documents, visual/audio records related to the case history and context.

Public archives such as newspapers, organizational records, and photographic collections could all include potentially relevant pieces of information to shed light on attitudes, cultural perspectives, common practices, and historical contexts related to psychology.

4. Organizational records

Organizational records offer the advantage of often having large datasets collected over time that can reveal or confirm psychological insights.

Of course, privacy and ethical concerns regarding confidential data must be navigated carefully.

However, with proper protocols, organizational records can provide invaluable context and empirical depth to qualitative case studies exploring the intersection of psychology and organizations.

  • Organizational/industrial psychology research : Organizational records like employee surveys, turnover/retention data, policies, incident reports etc. may provide insight into topics like job satisfaction, workplace culture and dynamics, leadership issues, employee behaviors etc.
  • Clinical psychology : Therapists/hospitals may grant access to anonymized medical records to study aspects like assessments, diagnoses, treatment plans etc. This could shed light on clinical practices.
  • School psychology : Studies could utilize anonymized student records like test scores, grades, disciplinary issues, and counseling referrals to study child development, learning barriers, effectiveness of support programs, and more.

How do I Write a Case Study in Psychology?

Follow specified case study guidelines provided by a journal or your psychology tutor. General components of clinical case studies include: background, symptoms, assessments, diagnosis, treatment, and outcomes. Interpreting the information means the researcher decides what to include or leave out. A good case study should always clarify which information is the factual description and which is an inference or the researcher’s opinion.

1. Introduction

  • Provide background on the case context and why it is of interest, presenting background information like demographics, relevant history, and presenting problem.
  • Compare briefly to similar published cases if applicable. Clearly state the focus/importance of the case.

2. Case Presentation

  • Describe the presenting problem in detail, including symptoms, duration, and impact on daily life.
  • Include client demographics like age and gender, information about social relationships, and mental health history.
  • Describe all physical, emotional, and/or sensory symptoms reported by the client.
  • Use patient quotes to describe the initial complaint verbatim. Follow with full-sentence summaries of relevant history details gathered, including key components that led to a working diagnosis.
  • Summarize clinical exam results, namely orthopedic/neurological tests, imaging, lab tests, etc. Note actual results rather than subjective conclusions. Provide images if clearly reproducible/anonymized.
  • Clearly state the working diagnosis or clinical impression before transitioning to management.

3. Management and Outcome

  • Indicate the total duration of care and number of treatments given over what timeframe. Use specific names/descriptions for any therapies/interventions applied.
  • Present the results of the intervention, including any quantitative or qualitative data collected.
  • For outcomes, utilize visual analog scales for pain, medication usage logs, etc., if possible. Include patient self-reports of improvement/worsening of symptoms. Note the reason for discharge/end of care.

4. Discussion

  • Analyze the case, exploring contributing factors, limitations of the study, and connections to existing research.
  • Analyze the effectiveness of the intervention, considering factors like participant adherence, limitations of the study, and potential alternative explanations for the results.
  • Identify any questions raised in the case analysis and relate insights to established theories and current research if applicable. Avoid definitive claims about physiological explanations.
  • Offer clinical implications, and suggest future research directions.

5. Additional Items

  • Acknowledge only those who provided writing support; do not include patient acknowledgments.
  • References should directly support any key claims or quotes included.
  • Use tables/figures/images only if substantially informative. Include permissions and legends/explanatory notes.
Strengths

  • Provides detailed (rich, qualitative) information.
  • Provides insight for further research.
  • Permits investigation of otherwise impractical (or unethical) situations.

Case studies allow a researcher to investigate a topic in far more detail than might be possible if they were trying to deal with a large number of research participants (nomothetic approach) with the aim of ‘averaging’.

Because of their in-depth, multi-sided approach, case studies often shed light on aspects of human thinking and behavior that would be unethical or impractical to study in other ways.

Research that only looks into the measurable aspects of human behavior is not likely to give us insights into the subjective dimension of experience, which is important to psychoanalytic and humanistic psychologists.

Case studies are often used in exploratory research. They can help us generate new ideas (that might be tested by other methods). They are an important way of illustrating theories and can help show how different aspects of a person’s life are related to each other.

The method is, therefore, important for psychologists who adopt a holistic point of view (i.e., humanistic psychologists ).

Limitations

  • Lacking scientific rigor and providing little basis for generalization of results to the wider population.
  • Researchers’ own subjective feelings may influence the case study (researcher bias).
  • Difficult to replicate.
  • Time-consuming and expensive.
  • The sheer volume of data, together with time restrictions, can limit the depth of analysis possible within the available resources.

Because a case study deals with only one person/event/group, we can never be sure if the case study investigated is representative of the wider body of “similar” instances. This means the conclusions drawn from a particular case may not be transferable to other settings.

Because case studies are based on the analysis of qualitative (i.e., descriptive) data, a lot depends on the psychologist’s interpretation of the information she has acquired.

This means that there is a lot of scope for observer bias, and it could be that the subjective opinions of the psychologist intrude in the assessment of what the data means.

For example, Freud has been criticized for producing case studies in which the information was sometimes distorted to fit particular behavioral theories (e.g., Little Hans ).

This is also true of Money’s interpretation of the Bruce/Brenda case study (Diamond, 1997), in which he ignored evidence that went against his theory.

Breuer, J., & Freud, S. (1895). Studies on hysteria. Standard Edition 2: London.

Curtiss, S. (1981). Genie: The case of a modern wild child.

Diamond, M., & Sigmundson, K. (1997). Sex reassignment at birth: Long-term review and clinical implications. Archives of Pediatrics & Adolescent Medicine, 151(3), 298-304.

Freud, S. (1909a). Analysis of a phobia of a five-year-old boy. In The Pelican Freud Library (1977), Vol. 8, Case Histories 1, pp. 169-306.

Freud, S. (1909b). Bemerkungen über einen Fall von Zwangsneurose (Der “Rattenmann”). Jb. psychoanal. psychopathol. Forsch., I, pp. 357-421; GW, VII, pp. 379-463; Notes upon a case of obsessional neurosis, SE, 10: 151-318.

Harlow, J. M. (1848). Passage of an iron rod through the head. Boston Medical and Surgical Journal, 39, 389-393.

Harlow, J. M. (1868). Recovery from the passage of an iron bar through the head. Publications of the Massachusetts Medical Society, 2(3), 327-347.

Money, J., & Ehrhardt, A. A. (1972). Man & Woman, Boy & Girl: The differentiation and dimorphism of gender identity from conception to maturity. Baltimore, Maryland: Johns Hopkins University Press.

Money, J., & Tucker, P. (1975). Sexual signatures: On being a man or a woman.

Further Information

  • Case Study Approach
  • Case Study Method
  • Enhancing the Quality of Case Studies in Health Services Research
  • “We do things together” A case study of “couplehood” in dementia
  • Using mixed methods for evaluating an integrative approach to cancer care: a case study

Print Friendly, PDF & Email

The Case Study: Methods of Data Collection

  • First Online: 06 September 2017


  • Farideh Delavari Edalat
  • M. Reza Abdi

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 258)


This chapter concerns the methodology choice that shaped the process and outcomes of the book. It identifies a case study built on data collected through semi-structured interviews, establishing the knowledge required for the conceptual framework of adaptive water management (AWM).




Data Analytics Case Study Guide 2024

by Sam McKay, CFA | Data Analytics


Data analytics case studies reveal how businesses harness data for informed decisions and growth.

For aspiring data professionals, mastering the case study process will enhance your skills and increase your career prospects.


So, how do you approach a case study?

Use these steps to process a data analytics case study:

Understand the Problem: Grasp the core problem or question addressed in the case study.

Collect Relevant Data: Gather data from diverse sources, ensuring accuracy and completeness.

Apply Analytical Techniques: Use appropriate methods aligned with the problem statement.

Visualize Insights: Utilize visual aids to showcase patterns and key findings.

Derive Actionable Insights: Focus on deriving meaningful actions from the analysis.

This article will give you detailed steps to navigate a case study effectively and understand how it works in real-world situations.

By the end of the article, you will be better equipped to approach a data analytics case study, strengthening your analytical prowess and practical application skills.

Let’s dive in!


What is a Data Analytics Case Study?

A data analytics case study is a real or hypothetical scenario where analytics techniques are applied to solve a specific problem or explore a particular question.

It’s a practical approach that uses data analytics methods, assisting in deciphering data for meaningful insights. This structured method helps individuals or organizations make sense of data effectively.

Additionally, it’s a way to learn by doing, where there’s no single right or wrong answer in how you analyze the data.

So, what are the components of a case study?

Key Components of a Data Analytics Case Study


A data analytics case study comprises essential elements that structure the analytical journey:

Problem Context: A case study begins with a defined problem or question. It provides the context for the data analysis , setting the stage for exploration and investigation.

Data Collection and Sources: It involves gathering relevant data from various sources , ensuring data accuracy, completeness, and relevance to the problem at hand.

Analysis Techniques: Case studies employ different analytical methods, such as statistical analysis, machine learning algorithms, or visualization tools, to derive meaningful conclusions from the collected data.

Insights and Recommendations: The ultimate goal is to extract actionable insights from the analyzed data, offering recommendations or solutions that address the initial problem or question.

Now that you have a better understanding of what a data analytics case study is, let’s talk about why we need and use them.

Why Case Studies are Integral to Data Analytics


Case studies serve as invaluable tools in the realm of data analytics, offering multifaceted benefits that bolster an analyst’s proficiency and impact:

Real-Life Insights and Skill Enhancement: Examining case studies provides practical, real-life examples that expand knowledge and refine skills. These examples offer insights into diverse scenarios, aiding in a data analyst’s growth and expertise development.

Validation and Refinement of Analyses: Case studies demonstrate the effectiveness of data-driven decisions across industries, providing validation for analytical approaches and showing how organizations benefit from data analytics, which in turn helps refine one’s own methodologies.

Showcasing Data Impact on Business Outcomes: These studies show how data analytics directly affects business results, like increasing revenue, reducing costs, or delivering other measurable advantages. Understanding these impacts helps articulate the value of data analytics to stakeholders and decision-makers.

Learning from Successes and Failures: By exploring a case study, analysts glean insights from others’ successes and failures, acquiring new strategies and best practices. This learning experience facilitates professional growth and the adoption of innovative approaches within their own data analytics work.

Including case studies in a data analyst’s toolkit helps gain more knowledge, improve skills, and understand how data analytics affects different industries.

Using these real-life examples boosts confidence and success, guiding analysts to make better and more impactful decisions in their organizations.

But not all case studies are the same.

Let’s talk about the different types.

Types of Data Analytics Case Studies


Data analytics encompasses various approaches tailored to different analytical goals:

Exploratory Case Study: These involve delving into new datasets to uncover hidden patterns and relationships, often without a predefined hypothesis. They aim to gain insights and generate hypotheses for further investigation.

Predictive Case Study: These utilize historical data to forecast future trends, behaviors, or outcomes. By applying predictive models, they help anticipate potential scenarios or developments.

Diagnostic Case Study: This type focuses on understanding the root causes or reasons behind specific events or trends observed in the data. It digs deep into the data to provide explanations for occurrences.

Prescriptive Case Study: This case study goes beyond analytics; it provides actionable recommendations or strategies derived from the analyzed data. They guide decision-making processes by suggesting optimal courses of action based on insights gained.

Each type has a specific role in using data to find important insights, helping in decision-making, and solving problems in various situations.

Regardless of the type of case study you encounter, here are some steps to help you process them.

Roadmap to Handling a Data Analysis Case Study


Embarking on a data analytics case study requires a systematic approach, step-by-step, to derive valuable insights effectively.

Here are the steps to help you through the process:

Step 1: Understanding the Case Study Context: Immerse yourself in the intricacies of the case study. Delve into the industry context, understanding its nuances, challenges, and opportunities.


Identify the central problem or question the study aims to address. Clarify the objectives and expected outcomes, ensuring a clear understanding before diving into data analytics.

Step 2: Data Collection and Validation: Gather data from diverse sources relevant to the case study. Prioritize accuracy, completeness, and reliability during data collection. Conduct thorough validation processes to rectify inconsistencies, ensuring high-quality and trustworthy data for subsequent analysis.


Step 3: Problem Definition and Scope: Define the problem statement precisely. Articulate the objectives and limitations that shape the scope of your analysis. Identify influential variables and constraints, providing a focused framework to guide your exploration.

Step 4: Exploratory Data Analysis (EDA): Leverage exploratory techniques to gain initial insights. Visualize data distributions, patterns, and correlations, fostering a deeper understanding of the dataset. These explorations serve as a foundation for more nuanced analysis.
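
To make this step concrete, here is a minimal EDA sketch in Python with pandas and matplotlib. It is illustrative only: the file sales.csv and its columns (region, revenue, ad_spend) are hypothetical placeholders, not data from any particular case study.

    # Minimal EDA sketch; "sales.csv" and its columns are hypothetical.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("sales.csv")

    print(df.head())        # eyeball the raw records
    print(df.describe())    # summary statistics for numeric columns
    print(df.isna().sum())  # count missing values per column

    # Distribution of a key metric, plus a simple correlation check
    df["revenue"].hist(bins=30)
    plt.xlabel("revenue")
    plt.ylabel("count")
    plt.show()

    print(df[["revenue", "ad_spend"]].corr())  # pairwise correlations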

Step 5: Data Preprocessing and Transformation: Cleanse and preprocess the data to eliminate noise, handle missing values, and ensure consistency. Transform data formats or scales as required, preparing the dataset for further analysis.
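
Continuing with the same hypothetical sales.csv, a matching preprocessing sketch might look like the following; the imputation and scaling choices are examples, not prescriptions.

    # Cleaning/transformation sketch for the hypothetical dataset above.
    import pandas as pd

    df = pd.read_csv("sales.csv")

    df = df.drop_duplicates()  # remove exact duplicate rows

    # Impute missing revenue values with the median (one common choice)
    df["revenue"] = df["revenue"].fillna(df["revenue"].median())

    # Normalize a text category so "East " and "east" collapse together
    df["region"] = df["region"].str.strip().str.lower()

    # Min-max scale ad spend to [0, 1] so features are comparable
    lo, hi = df["ad_spend"].min(), df["ad_spend"].max()
    df["ad_spend_scaled"] = (df["ad_spend"] - lo) / (hi - lo)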


Step 6: Data Modeling and Method Selection: Select analytical models aligning with the case study’s problem, employing statistical techniques, machine learning algorithms, or tailored predictive models.

In this phase, it’s important to develop data modeling skills: representing a complex system with well-organized data makes the underlying business problem easier to analyze and solve.

Understand key data modeling concepts, utilize essential tools like SQL for database interaction, and practice building models from real-world scenarios.

Furthermore, strengthen data cleaning skills for accurate datasets, and stay updated with industry trends to ensure relevance.
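
As one hedged illustration of method selection, and of the evaluation loop described in Step 7 below, the following scikit-learn sketch compares two candidate models by cross-validation; the predictor and target columns are again hypothetical.

    # Model comparison sketch; columns are hypothetical placeholders.
    import pandas as pd
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import LinearRegression
    from sklearn.ensemble import RandomForestRegressor

    df = pd.read_csv("sales.csv")
    X = df[["ad_spend"]]  # predictor(s)
    y = df["revenue"]     # target

    # 5-fold cross-validation; higher mean R^2 is better
    for name, model in [("linear", LinearRegression()),
                        ("forest", RandomForestRegressor(random_state=0))]:
        scores = cross_val_score(model, X, y, cv=5, scoring="r2")
        print(f"{name}: mean R^2 = {scores.mean():.3f}")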


Step 7: Model Evaluation and Refinement: Evaluate the performance of applied models rigorously. Iterate and refine models to enhance accuracy and reliability, ensuring alignment with the objectives and expected outcomes.

Step 8: Deriving Insights and Recommendations: Extract actionable insights from the analyzed data. Develop well-structured recommendations or solutions based on the insights uncovered, addressing the core problem or question effectively.

Step 9: Communicating Results Effectively: Present findings, insights, and recommendations clearly and concisely. Utilize visualizations and storytelling techniques to convey complex information compellingly, ensuring comprehension by stakeholders.


Step 10: Reflection and Iteration: Reflect on the entire analysis process and outcomes. Identify potential improvements and lessons learned. Embrace an iterative approach, refining methodologies for continuous enhancement and future analyses.

This step-by-step roadmap provides a structured framework for thorough and effective handling of a data analytics case study.

Now, after the analysis itself comes a crucial step: presenting the case study.

Presenting Your Data Analytics Case Study


Presenting a data analytics case study is a vital part of the process. When presenting your case study, clarity and organization are paramount.

To achieve this, follow these key steps:

Structuring Your Case Study: Start by outlining relevant and accurate main points. Ensure these points align with the problem addressed and the methodologies used in your analysis.

Crafting a Narrative with Data: Start with a brief overview of the issue, then walk through your method step by step, covering data collection, cleaning, statistical analysis, and any advanced modeling.

Visual Representation for Clarity: Utilize various visual aids—tables, graphs, and charts—to illustrate patterns, trends, and insights. Ensure these visuals are easy to comprehend and seamlessly support your narrative.
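
For instance, a presentation-ready chart that highlights the key finding can be produced with a few lines of matplotlib; the quarterly figures below are invented purely for illustration.

    # Presentation chart sketch; the data are made up.
    import matplotlib.pyplot as plt

    quarters = ["Q1", "Q2", "Q3", "Q4"]
    revenue = [1.2, 1.5, 1.4, 2.1]  # hypothetical revenue, in $M

    fig, ax = plt.subplots()
    bars = ax.bar(quarters, revenue, color="lightgray")
    bars[-1].set_color("steelblue")  # draw the eye to the key finding
    ax.annotate("Campaign launch", xy=(3, 2.1), xytext=(1, 2.0),
                arrowprops=dict(arrowstyle="->"))
    ax.set_ylabel("Revenue ($M)")
    ax.set_title("Revenue grew 75% after the Q4 campaign")
    plt.show()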


Highlighting Key Information: Use bullet points to emphasize essential information, maintaining clarity and allowing the audience to grasp key takeaways effortlessly. Bold key terms or phrases to draw attention and reinforce important points.

Addressing Audience Queries: Anticipate and be ready to answer audience questions regarding methods, assumptions, and results. Demonstrating a profound understanding of your analysis instills confidence in your work.

Integrity and Confidence in Delivery: Maintain a neutral tone and avoid exaggerated claims about findings. Present your case study with integrity, clarity, and confidence to ensure the audience appreciates and comprehends the significance of your work.


By organizing your presentation well, telling a clear story through your analysis, and using visuals wisely, you can effectively share your data analytics case study.

This method helps people understand better, stay engaged, and draw valuable conclusions from your work.

We hope by now, you are feeling very confident processing a case study. But with any process, there are challenges you may encounter.


Key Challenges in Data Analytics Case Studies


A data analytics case study can present various hurdles that necessitate strategic approaches for successful navigation:

Challenge 1: Data Quality and Consistency

Challenge: Inconsistent or poor-quality data can impede analysis, leading to erroneous insights and flawed conclusions.

Solution: Implement rigorous data validation processes, ensuring accuracy, completeness, and reliability. Employ data cleansing techniques to rectify inconsistencies and enhance overall data quality.

Challenge 2: Complexity and Scale of Data

Challenge: Managing vast volumes of data with diverse formats and complexities poses analytical challenges.

Solution: Utilize scalable data processing frameworks and tools capable of handling diverse data types. Implement efficient data storage and retrieval systems to manage large-scale datasets effectively.

Challenge 3: Interpretation and Contextual Understanding

Challenge: Interpreting data without contextual understanding or domain expertise can lead to misinterpretations.

Solution: Collaborate with domain experts to contextualize data and derive relevant insights. Invest in understanding the nuances of the industry or domain under analysis to ensure accurate interpretations.


Challenge 4: Privacy and Ethical Concerns

Challenge: Balancing data access for analysis while respecting privacy and ethical boundaries poses a challenge.

Solution: Implement robust data governance frameworks that prioritize data privacy and ethical considerations. Ensure compliance with regulatory standards and ethical guidelines throughout the analysis process.

Challenge 5: Resource Limitations and Time Constraints

Challenge: Limited resources and time constraints hinder comprehensive analysis and exhaustive data exploration.

Solution: Prioritize key objectives and allocate resources efficiently. Employ agile methodologies to iteratively analyze and derive insights, focusing on the most impactful aspects within the given timeframe.

Recognizing these challenges is key; it helps data analysts adopt proactive strategies to mitigate obstacles. This enhances the effectiveness and reliability of insights derived from a data analytics case study.

Now, let’s talk about the best software tools you should use when working with case studies.

Top 5 Software Tools for Case Studies


In the realm of case studies within data analytics, leveraging the right software tools is essential.

Here are some top-notch options:

Tableau: Renowned for its data visualization prowess, Tableau transforms raw data into interactive, visually compelling representations, ideal for presenting insights within a case study.

Python and R Libraries: These flexible programming languages provide many tools for handling data, doing statistics, and working with machine learning, meeting various needs in case studies.

Microsoft Excel: A staple tool for data analytics, Excel provides a user-friendly interface for basic analytics, making it useful for initial data exploration in a case study.

SQL Databases: Structured Query Language (SQL) databases assist in managing and querying large datasets, essential for organizing case study data effectively (see the sketch after this list).

Statistical Software (e.g., SPSS, SAS): Specialized statistical software enables in-depth statistical analysis, aiding in deriving precise insights from case study data.
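
To show how two of these tools combine in practice, here is a small sketch that queries an SQLite database with Python’s built-in sqlite3 module and loads the result into pandas for further analysis. The database file, table, and columns are hypothetical.

    # SQL + Python sketch; "case_study.db" and its schema are hypothetical.
    import sqlite3
    import pandas as pd

    conn = sqlite3.connect("case_study.db")
    query = """
        SELECT region, SUM(revenue) AS total_revenue
        FROM sales
        GROUP BY region
        ORDER BY total_revenue DESC
    """
    summary = pd.read_sql_query(query, conn)  # results as a DataFrame
    print(summary)
    conn.close()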

Choosing the best mix of these tools, tailored to each case study’s needs, greatly boosts analytical abilities and results in data analytics.

Final Thoughts

Case studies in data analytics are helpful guides. They give real-world insights, improve skills, and show how data-driven decisions work.

Using case studies helps analysts learn, be creative, and make essential decisions confidently in their data work.


Frequently Asked Questions

What are the key steps to analyzing a data analytics case study?

When analyzing a case study, you should follow these steps:

Clarify the problem : Ensure you thoroughly understand the problem statement and the scope of the analysis.

Make assumptions : Define your assumptions to establish a feasible framework for analyzing the case.

Gather context : Acquire relevant information and context to support your analysis.

Analyze the data : Perform calculations, create visualizations, and conduct statistical analysis on the data.

Provide insights : Draw conclusions and develop actionable insights based on your analysis.

How can you effectively interpret results during a data scientist case study job interview?

During your next data science interview, interpret case study results succinctly and clearly. Utilize visual aids and numerical data to bolster your explanations, ensuring comprehension.

Frame the results in an audience-friendly manner, emphasizing relevance. Concentrate on deriving insights and actionable steps from the outcomes.

How do you showcase your data analyst skills in a project?

To demonstrate your skills effectively, consider these essential steps. Begin by selecting a problem that allows you to exhibit your capacity to handle real-world challenges through analysis.

Methodically document each phase, encompassing data cleaning, visualization, statistical analysis, and the interpretation of findings.

Utilize descriptive analysis techniques and effectively communicate your insights using clear visual aids and straightforward language. Ensure your project code is well-structured, with detailed comments and documentation, showcasing your proficiency in handling data in an organized manner.

Lastly, emphasize your expertise in SQL queries, programming languages, and various analytics tools throughout the project. These steps collectively highlight your competence and proficiency as a skilled data analyst, demonstrating your capabilities within the project.

Can you provide an example of a successful data analytics project using key metrics?

A prime illustration is utilizing analytics in healthcare to forecast hospital readmissions. Analysts leverage electronic health records, patient demographics, and clinical data to identify high-risk individuals.

Implementing preventive measures based on these key metrics helps curtail readmission rates, enhancing patient outcomes and cutting healthcare expenses.

This demonstrates how data analytics, driven by metrics, effectively tackles real-world challenges, yielding impactful solutions.

Why would a company invest in data analytics?

Companies invest in data analytics to gain valuable insights, enabling informed decision-making and strategic planning. This investment helps optimize operations, understand customer behavior, and stay competitive in their industry.

Ultimately, leveraging data analytics empowers companies to make smarter, data-driven choices, leading to enhanced efficiency, innovation, and growth.



Data Collection Methods | Step-by-Step Guide & Examples

Published on 4 May 2022 by Pritha Bhandari .

Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental, or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem .

While methods and aims may differ between fields, the overall process of data collection remains largely the same. Before you begin collecting data, you need to consider:

  • The  aim of the research
  • The type of data that you will collect
  • The methods and procedures you will use to collect, store, and process the data

To collect high-quality data that is relevant to your purposes, follow these four steps.

Table of contents

  • Step 1: Define the aim of your research
  • Step 2: Choose your data collection method
  • Step 3: Plan your data collection procedures
  • Step 4: Collect the data
  • Frequently asked questions about data collection

Step 1: Define the aim of your research

Before you start the process of data collection, you need to identify exactly what you want to achieve. You can start by writing a problem statement: what is the practical or scientific issue that you want to address, and why does it matter?

Next, formulate one or more research questions that precisely define what you want to find out. Depending on your research questions, you might need to collect quantitative or qualitative data :

  • Quantitative data is expressed in numbers and graphs and is analysed through statistical methods .
  • Qualitative data is expressed in words and analysed through interpretations and categorisations.

If your aim is to test a hypothesis , measure something precisely, or gain large-scale statistical insights, collect quantitative data. If your aim is to explore ideas, understand experiences, or gain detailed insights into a specific context, collect qualitative data.

If you have several aims, you can use a mixed methods approach that collects both types of data.

Example:

  • Your first aim is to assess whether there are significant differences in perceptions of managers across different departments and office locations.
  • Your second aim is to gather meaningful feedback from employees to explore new ideas for how managers can improve.


Step 2: Choose your data collection method

Based on the data you want to collect, decide which method is best suited for your research.

  • Experimental research is primarily a quantitative method.
  • Interviews , focus groups , and ethnographies are qualitative methods.
  • Surveys , observations, archival research, and secondary data collection can be quantitative or qualitative methods.

Carefully consider what method you will use to gather data that helps you directly answer your research questions.

Step 3: Plan your data collection procedures

When you know which method(s) you are using, you need to plan exactly how you will implement them. What procedures will you follow to make accurate observations or measurements of the variables you are interested in?

For instance, if you’re conducting surveys or interviews, decide what form the questions will take; if you’re conducting an experiment, make decisions about your experimental design .

Operationalisation

Sometimes your variables can be measured directly: for example, you can collect data on the average age of employees simply by asking for dates of birth. However, often you’ll be interested in collecting data on more abstract concepts or variables that can’t be directly observed.

Operationalisation means turning abstract conceptual ideas into measurable observations. When planning how you will collect data, you need to translate the conceptual definition of what you want to study into the operational definition of what you will actually measure.

Example:

  • You ask managers to rate their own leadership skills on 5-point scales assessing the ability to delegate, decisiveness, and dependability.
  • You ask their direct employees to provide anonymous feedback on the managers regarding the same topics.

You may need to develop a sampling plan to obtain data systematically. This involves defining a population , the group you want to draw conclusions about, and a sample, the group you will actually collect data from.

Your sampling method will determine how you recruit participants or obtain measurements for your study. To decide on a sampling method you will need to consider factors like the required sample size, accessibility of the sample, and time frame of the data collection.
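
As a minimal sketch of what a sampling plan can look like in practice (not part of the original guide), the following Python snippet draws a simple random sample from a hypothetical list of employee IDs.

    # Simple random sampling sketch; "employee_ids.txt" is a hypothetical
    # file with one ID per line and at least 100 entries.
    import random

    with open("employee_ids.txt") as f:
        population = [line.strip() for line in f]

    random.seed(42)                            # reproducible recruitment list
    sample = random.sample(population, k=100)  # simple random sample, n = 100
    print(sample[:5])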

Standardising procedures

If multiple researchers are involved, write a detailed manual to standardise data collection procedures in your study.

This means laying out specific step-by-step instructions so that everyone in your research team collects data in a consistent way – for example, by conducting experiments under the same conditions and using objective criteria to record and categorise observations.

This helps ensure the reliability of your data, and you can also use it to replicate the study in the future.

Creating a data management plan

Before beginning data collection, you should also decide how you will organise and store your data.

  • If you are collecting data from people, you will likely need to anonymise and safeguard the data to prevent leaks of sensitive information (e.g. names or identity numbers).
  • If you are collecting data via interviews or pencil-and-paper formats, you will need to perform transcriptions or data entry in systematic ways to minimise distortion.
  • You can prevent loss of data by having an organisation system that is routinely backed up.

Step 4: Collect the data

Finally, you can implement your chosen methods to measure or observe the variables you are interested in.

For example, in the employee survey described above, the closed-ended questions ask participants to rate their manager’s leadership skills on scales from 1 to 5. The data produced is numerical and can be statistically analysed for averages and patterns.

To ensure that high-quality data is recorded in a systematic way, here are some best practices:

  • Record all relevant information as and when you obtain data. For example, note down whether or how lab equipment is recalibrated during an experimental study.
  • Double-check manual data entry for errors.
  • If you collect quantitative data, you can assess the reliability and validity to get an indication of your data quality (see the sketch after this list for one common reliability check).
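
One common reliability statistic for a multi-item scale is Cronbach’s alpha. The sketch below computes it with NumPy on a small invented ratings matrix; rows are respondents and columns are the three 5-point leadership items from the earlier example.

    # Cronbach's alpha sketch; the ratings are invented for illustration.
    import numpy as np

    ratings = np.array([
        [4, 5, 4],
        [3, 3, 4],
        [5, 4, 5],
        [2, 3, 2],
        [4, 4, 5],
    ])

    k = ratings.shape[1]                         # number of items
    item_vars = ratings.var(axis=0, ddof=1)      # variance of each item
    total_var = ratings.sum(axis=1).var(ddof=1)  # variance of summed scores

    alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
    print(f"Cronbach's alpha = {alpha:.2f}")     # values near 0.8+ are often
                                                 # considered acceptable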

Frequently asked questions about data collection

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organisations.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g., understanding the needs of your consumers or user testing your website).
  • You can control and standardise the process for high reliability and validity (e.g., choosing appropriate measurements and sampling methods ).

However, there are also some drawbacks: data collection can be time-consuming, labour-intensive, and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to test a hypothesis by systematically collecting and analysing data, while qualitative methods allow you to explore ideas and experiences in depth.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the  consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity   refers to the  accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research , you also have to consider the internal and external validity of your experiment.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

Operationalisation means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioural avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalise the variables that you want to measure.


Bhandari, P. (2022, May 04). Data Collection Methods | Step-by-Step Guide & Examples. Scribbr. Retrieved 22 April 2024, from https://www.scribbr.co.uk/research-methods/data-collection-guide/



Published on 24.4.2024 in Vol 26 (2024)

The Costs of Anonymization: Case Study Using Clinical Data



Open access | Published: 23 April 2024

Prediction and optimization method for welding quality of components in ship construction

Jinfeng Liu, Yifa Cheng, Xuwen Jing, Xiaojun Liu & Yu Chen

Scientific Reports volume 14, Article number: 9353 (2024)


The welding process, as one of the crucial industrial technologies in ship construction, accounts for approximately 70% of the workload and approximately 40% of the total cost. Existing welding quality prediction methods rest on hypothetical premises and subjective factors, and cannot meet the dynamic control requirements of intelligent welding for processing quality. Aiming at the problems of inefficient quality prediction and the poor timeliness and unpredictability of quality control in the ship assembly-welding process, a data- and model-driven welding quality prediction method is proposed. Firstly, the influence factors of welding quality are analyzed and the correlation mechanism between process parameters and quality is determined. According to the analysis results, a stable and reliable data collection architecture is established. The elements of welding process monitoring are also determined based on a feature dimensionality reduction method. To improve the accuracy of welding quality prediction, the prediction model is constructed by fusing adaptive simulated annealing, particle swarm optimization, and back-propagation neural network algorithms. Finally, the effectiveness of the prediction method is verified through 74 sets of plate welding experiments; the prediction accuracy reaches over 90%.


Introduction

The shipbuilding industry is a comprehensive national high-end equipment manufacturing industry that supports the shipping industry, marine development, and national defense construction. It plays a critical role in guaranteeing national defense strength and economic development 1 , 2 . With the continuous development of a new generation of information technology based on big data, the internet of things (IoT), 5G, cloud computing, artificial intelligence, and digital twins, intelligent construction is becoming the dominant advanced mode in the shipbuilding industry. At the same time, welding quality control is regarded as a significant part of shipbuilding, and the related innovation research under intelligent welding urgently needs to be carried out. As welding processing gradually becomes more flexible and complicated, the welding quality of each workstation ultimately determines the majority of the product quality through propagation, accumulation, and interaction.

The welding process is one of the vital industrial technologies in ship segment construction 3 , 4 . However, in the welding of ship components, locally uneven heating and locally uncoordinated plastic strain of metal materials are likely to produce large residual stresses 5 , 6 . These reduce the static load capacity and fatigue strength of the ship components, which in turn affects the load capacity, dimensional accuracy, and assembly accuracy of the structure. In most shipbuilding enterprises, however, quality management usually involves issuing quality plans, post-sequence production inspections, and quality statistical reports, all of which are static quality control. Existing welding quality prediction methods rest on hypothetical premises and subjective factors and cannot meet the dynamic control requirements of intelligent welding for processing quality. These methods often encounter problems such as inefficient quality inspection, untimely quality feedback, and untimely quality control 7 . Moreover, the post-welding correction process delays the ship construction cycle and increases production costs.

The inadequacy of traditional welding quality control technology imposes functional and technical limitations in practical applications 8 , 9 . Firstly, current welding process design relies on production experience and empirical calculation formulas 10 , which makes it difficult to meet the design requirement of minimizing residual stress in the forming of structural parts. Secondly, effective data pre-processing methods for handling complex production conditions and massive amounts of welding measurement data are lacking. Currently, welding quality prediction methods for ship components are inadequate: for example, it is difficult to balance prediction accuracy against computational efficiency, or to combine actual measured welding data to drive data analysis services.

This work aims to provide a solution to the inefficiency of welding quality control during ship construction, delaying the production cycle and increasing production costs. The proposed method has the following advantages.

A data-acquisition framework for the welding process parameters of ship unit-assembly welding is constructed, and a stable and reliable data acquisition method is proposed.

Based on the feature selection method, the features influencing welding quality are quantitatively analyzed, leading to the construction of an optimal subset of process features for welding quality prediction.

Fusing adaptive simulated annealing (SA), particle swarm optimization (PSO), and a back-propagation neural network (BPNN), a welding quality prediction model is established for welding quality control and decision making.
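
To give a feel for the third contribution, the sketch below trains a tiny neural network with vanilla PSO on synthetic data. It is only a generic illustration of the PSO-over-network-weights idea, not the authors’ APB model: it omits the adaptive SA component and uses toy data in place of measured welding parameters.

    # Generic PSO-trained network sketch; NOT the paper's APB model.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, (64, 3))  # stand-in "process parameters"
    y = np.sin(X).sum(axis=1)        # synthetic "quality" target

    H = 5                            # hidden neurons
    dim = 3 * H + H + H + 1          # W1, b1, W2, b2 flattened

    def mse(w):
        W1 = w[:3 * H].reshape(3, H)
        b1 = w[3 * H:4 * H]
        W2 = w[4 * H:5 * H]
        b2 = w[-1]
        pred = np.tanh(X @ W1 + b1) @ W2 + b2  # one hidden layer
        return np.mean((pred - y) ** 2)

    # Vanilla PSO over the flattened weight vector
    n, iters = 30, 200
    pos = rng.uniform(-1, 1, (n, dim))
    vel = np.zeros((n, dim))
    pbest, pbest_f = pos.copy(), np.array([mse(p) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()

    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([mse(p) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()

    print(f"best training MSE: {pbest_f.min():.4f}")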

The remainder of this paper is organized as follows. “ Related works ” section presents the related research on the influence factors and prediction methods of welding quality. The data acquisition and processing framework is explained in “ Acquisition and pre-processing of welding process data ” section. In “ Construction of the welding quality prediction model ” section, fusing adaptive SA, PSO, and BPNN (APB), a welding quality prediction model is established. To verify the proposed method, a case study of ship unit-assembly welding is illustrated in “ Case study ” section. The conclusion and future work are presented in “ Conclusion and future works ” section.

Related works

Method for selecting welding quality features

For the huge amount of data generated at the production site, knowledge can be mined through suitable processing methods to assist production 11, 12. Feature selection is an important and widely used technique in this field. Its purpose is to select a small subset of features from the original dataset based on certain evaluation criteria, which usually yields better performance, such as higher classification accuracy, lower computational cost, and better model interpretability. As a practical method, feature selection has been widely used in many fields 13, 14, 15.

Depending on how the evaluation is performed, feature selection methods can be distinguished as filter models, wrapper models, or hybrid models. Filter models evaluate and select a subset of features based on the general characteristics of the data without involving any learning model. On the other hand, the wrapper model uses a learning algorithm set in advance and uses its performance as an evaluation criterion. Compared to filter models, wrapper models are more accurate but computationally more expensive. Hybrid models use different evaluation criteria at different stages and combine the advantages of the first two methods.

Two versions of an ant colony optimization-based feature selection algorithm were proposed by Warren et al. 16, which effectively improve weld defect detection accuracy and weld defect type classification accuracy. An enhanced feature selection method combining the Relief-F algorithm with a convolutional neural network (CNN) was proposed by Jiang et al. 17 to improve the recognition accuracy of welding defect identification in the manufacturing of large equipment. A hybrid Fisher-based filter and wrapper-based feature selection algorithm was proposed by Zhang et al. 18, which reduces the 41 feature parameters for weld defect monitoring during tungsten arc welding of aluminum alloy to 19; the computational effort is reduced and the modeling accuracy is improved. Abdel-Basset et al. 19 proposed a combination of a two-phase mutation and the grey wolf optimization algorithm to solve the wrapper-based feature selection problem, which balances accuracy and efficiency in the classification task while maintaining and improving classification accuracy. Le et al. 20 introduced a stochastic privacy-preserving machine learning algorithm, in which the Relief-F algorithm is used for feature selection and random forest is utilized for privacy-preserving classification; the algorithm is prevented from overfitting and achieves higher classification accuracy.

In general, a huge amount of measured welding data is generated during the welding of actual ship components because of the complex production conditions. Feature selection may suffer from problems such as high computational cost, easily falling into local optima, and premature convergence. To determine the essential factors influencing the welding process, it is necessary to use a suitable feature selection method that facilitates reasonable parsimony in obtaining the best set of input data features. This maximizes accuracy and computational efficiency while reducing the computational complexity of the prediction model.

Welding quality prediction method

As new-generation information technology becomes popular in ship construction, process data can be collected from the manufacturing site. These data contain a non-linear mapping relationship between welding process parameters and quality, so welding data monitoring, welding quality prediction, and optimization decisions can be effectively implemented. Therefore, welding quality prediction based on machine learning algorithms has received wide attention from academia and industry.

Pal et al. predicted welding quality by processing the current and voltage signals of the welding process 21, 22. Taking the process parameters and statistical parameters of the arc signal as input variables, BPNN and radial basis function network models were adopted to predict welding quality. A fatigue strength prediction method for ultra-high-strength steel butt-welded joints was proposed by Nykänen 23. A reinforcement-penetration collaborative prediction network model based on a deep residual network was designed by Lu et al. 24 to quantitatively predict reinforcement and penetration depth. A nugget quality prediction method for resistance spot welding of aluminum alloy based on structure-borne acoustic emission signals was proposed by Luo et al. 25.

As related theories such as machine learning and neural networks mature, welding quality prediction is increasingly implemented by scholars using these techniques.

Artificial neural networks (ANN): In automatic gas metal arc welding, response surface methodology and ANN models were adopted by Shim et al. 26 to predict the best welding parameters for a given weld bead geometry. Lei et al. 27 used a genetic algorithm to optimize the initial weights and biases of a neural network and proposed a multi-information fusion neural network to predict the geometric characteristics of the weld by combining the welding parameters and the morphological characteristics of the molten pool. Chaki et al. 28 proposed an integrated prediction model of an ANN and the non-dominated sorting genetic algorithm to predict and optimize quality characteristics during pulsed Nd:YAG laser cutting of aluminum alloys. An improved regression network was adopted by Wang et al. 29 to predict future molten pool images. CNNs were used by Hartl et al. 30 to analyze process data in friction stir welding and predict the resulting quality of the weld surface. To predict the penetration of fillet welds, a penetration quality prediction method for asymmetrical fillet root welding based on an optimized BPNN was proposed by Chang et al. 31. A CNN-based back-bead prediction model was proposed by Jin et al. 32, in which image data of welding current changes are acquired and the CNN model realizes weld shape prediction for gas metal arc welding. Hu et al. 33 established an ANN optimized by a pigeon-inspired algorithm to optimize the welding process parameters of ultrasonic-static shoulder-assisted friction stir welding (U-SSFSW), which led to a significant improvement in the tensile strength of the joints. Cruz et al. 34 presented a procedure for yielding a near-optimal ensemble of CNNs through an efficient search strategy based on an evolutionary algorithm, which is able to weigh the predictive accuracy of forecasting models against computational costs under actual production conditions on the shop floor.

Support vector machines (SVM): SVMs using the radial kernel, boosting, and random forest techniques were adopted by Pereda et al. 35 to achieve direct quality prediction in the resistance spot welding process. To improve the prediction of welding quality during high-power disk laser welding, an SVM model was adopted by Wang et al. 36 to predict welding quality from the metal vapor plume. By collecting the real-time torque signal of the friction stir welding process, Das et al. 37 used an SVM regression model to predict the ultimate tensile strength of the welded joint. A model of laser welding quality prediction based on different input parameters was established by Petković 38. Yu et al. 39 proposed a real-time prediction method for welding penetration mode and depth based on two-dimensional visual characteristics of the weld pool.

Other prediction models: A neuro-fuzzy model for the prediction and classification of defects in the fused zone was built by Casalino et al. 40. Using laser beam welding process parameters as input variables, neural networks and C-means fuzzy clustering algorithms are used to classify and predict the welding defects of Ti6Al4V alloy. Rout et al. 41 proposed a hybrid method based on fuzzy regression and particle swarm optimization to predict and optimize weld quality in terms of both mechanical properties and weld geometry. Kim et al. 42 proposed a semantic resistance spot welding weldability prediction framework, which constructs a shareable weldability knowledge database based on regression rules; a decision tree algorithm and regression tree are used to extract decision rules, and the nugget width of the case was successfully predicted. AbuShanab et al. 43 proposed a random vector functional link prediction model optimized by the Hunger Games search algorithm to link joint properties with welding variables, introducing a new prediction model for friction stir welding of dissimilar polymer materials.

Scholars have proposed various feasible schemes for welding quality prediction. However, defects remain: the prediction algorithms often lack generalization performance and rely on a large number of assumptions and subjective factors, so they cannot meet the dynamic control requirements of intelligent welding for processing quality. Moreover, most prediction models can only predict before or after the work and cannot adapt to dynamic changes in the on-site welding environment. Therefore, the key to improving welding quality is to provide accurate and timely prediction results.

Acquisition and pre-processing of welding process data

The proposed welding quality prediction framework is shown in Fig. 1 (a clearer version is shown in Supplementary Figure S1). Firstly, the critical controllable quality indicators in ship unit-assembly welding are determined, and the influencing factors are analyzed. Secondly, based on the IoT system, a data acquisition system for real-time monitoring and prediction of the welding quality of ship unit-assembly welding is established, achieving the collection and transmission of welding data. Then, a feature selection method is created to optimally select the key features of the welding quality data. Next, by fusing adaptive simulated annealing, particle swarm optimization, and the back-propagation neural network, a welding quality prediction model is established for welding quality control and decision making. Finally, welding experiments on ship unit-assembly welding are used as an example to verify the critical technologies in this paper.

Figure 1. The framework of welding quality prediction method.

Analyze the factors affecting welding quality

The welding quality problems of ship components involve six significant factors: human factors, welding equipment, materials, welding process, measurement system, and production environment. Residual stresses caused by instability during the welding of ship components are inextricably linked to the welding method and process parameters used. However, the essential factors are mainly determined by the thermal welding process and the constraint conditions of the weldment during welding. The influencing factors of welding quality in the thermal welding process are mainly reflected in the type of welding heat source and its power density \(W\), the effective thermal efficiency \(P\) and linear energy \(Q\) of the welding process, the transfer method of the heat energy (such as heat conduction, convection, etc.), and the welding temperature field. The determinants of the welding temperature field include the nature of the heat source and the welding parameters (such as welding current, arc voltage, gas flow, inductance, welding speed, heat input, etc.). When the arc voltage and welding current increase, the heat energy input to the weld and the melting amount of the welding wire increase directly, which affects the width and penetration of the weld. When the welding speed is too low, the heat concentration and the width of the molten pool increase, resulting in defects such as burn-through and dimensional deformation. The constraint conditions refer to the restraint type and restraint degree of the welded structure, mainly determined by factors such as the structure of the welded sheet, the position of the weld, the welding direction and sequence, the shrinkage of other parts during cooling, and the tightness of the clamping.

Take the CO2 gas-shielded welding process of the ship component as an example. Welding parameters determine the energy input to the weld and to a large extent affect the formation of the weld. Important process parameters that determine the welding quality of thin-plate ship structures include arc voltage, welding current, welding speed, inductance, and gas flow. For example, when the welding current is too large, the weld width, penetration, and reinforcement that determine the dimensions of the weld increase, and welding defects are likely to occur during the welding process; at the same time, the angular deformation and bending deflection of the welded sheet also increase. When the gas flow rate is too high, instability and disturbance of the melt pool and arc can occur, resulting in turbulence and spatter in the melt pool.

Obtain the welding process data

The collection and transmission of process parameters during the welding process is an important basis for supporting the real-time prediction of welding quality. Therefore, a welding data acquisition and processing framework for ship components based on the IoT system is proposed, which is divided into three levels: data perception, data transmission and preprocessing, and application services, as shown in Fig. 2.

Figure 2. A welding process data acquisition and processing framework.

During the execution of the welding process, the data sensing layer is mainly responsible for collecting various multi-source heterogeneous data in real time and accurately, providing a stable original data source for the data integration and analysis phase. The sensing data types mainly include welding process parameters, operating status information of the welding equipment, and welding quality indicators. Data can be collected through interface and protocol conversion or by connecting external intelligent sensing devices. For example, for some non-digital smart devices, data can be collected via the analog signals of electrical circuits: data such as current, voltage, and welding speed are collected from the welding equipment by installing current, voltage, and speed sensors, and a data acquisition board, such as the PCL-818L, is used for analog-to-digital conversion, fusion, and data transmission. For most digital intelligent devices, data collection can use various communication interfaces or serial ports, PLC networks, and other methods to collect and summarize the execution parameters and operating status of the equipment. Then, through the corresponding communication protocol, such as OPC-UA or MQTT, read and write operations on the data are realized among the application, the server, and the PLC.
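As an illustration of this data path, the following minimal Python sketch publishes one cycle of welding process parameters over MQTT; the broker address, topic, and field names are hypothetical, and the third-party paho-mqtt client library is assumed:

```python
import json
import time

import paho.mqtt.client as mqtt  # third-party MQTT client library

# Hypothetical broker address and topic; a real deployment would follow the
# shop-floor network configuration and a payload schema agreed with the PLC side.
BROKER_HOST = "192.168.0.10"
TOPIC = "workshop/welding/station01/process"

client = mqtt.Client()
client.connect(BROKER_HOST, 1883)

# Illustrative per-cycle welding process parameters (values are placeholders).
sample = {
    "timestamp": time.time(),
    "current_A": 220.5,      # welding current
    "voltage_V": 24.8,       # arc voltage
    "speed_mm_s": 7.2,       # welding speed
    "gas_flow_L_min": 18.0,  # shielding gas flow
}
client.publish(TOPIC, json.dumps(sample), qos=1)
client.disconnect()
```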

The data transmission layer is mainly responsible for transmitting the multi-source heterogeneous welding data collected on-site, achieving interconnectivity between underlying devices, application services, and multiple databases. As the new generation of communication technology matures, there are many options for industrial-level information communication, such as 5G, Zigbee, industrial WiFi networks, and industrial Ethernet. According to actual needs and complementary advantages, a combined communication scheme can be formed to meet the requirements for transmission and anti-interference ability, communication speed, and stability. Taking the application scenario and the system functional requirements of this study into account, a combination of wired and wireless communication technologies is chosen to achieve efficient deployment of the communication network, with fast and stable transmission of real-time welding data and portable networking.

The diversity of equipment in shipbuilding workshops and the heterogeneity of application systems give the data multi-source and heterogeneous characteristics. Therefore, data integration shields the differences in data types and structures to realize unified storage, management, and data analysis. The key technologies of data integration include data storage and management as well as data preprocessing; among them, data storage management is the basis for maximizing data value and for data preprocessing. Standard database technologies include SQL databases, such as MySQL and Oracle, and NoSQL databases, such as Redis and HBase. The specific deployment can mix these according to actual needs and application scenarios to achieve complementary advantages and maximize benefits.

Data feature selection

Data feature selection is the premise for ensuring the quality of data analysis and mining. It not only ensures the quality and uniform format of the perceived data set, but also effectively avoids feature jumble and the curse of dimensionality in the process of data analysis. The welding data collected on-site will inevitably exhibit missing values, non-standard formats, and large volume, requiring data filtering, data recovery, and data conversion to improve data quality and unify data formats.

The Relief-F algorithm was obtained by I. Kononenko 44 by extending the Relief algorithm. It is a feature weighting algorithm: it assigns different weights to features based on the correlation between each feature and the category, and features whose weights are less than a certain threshold are removed to obtain an optimized feature set. For multi-classification problems, suppose that the single-label training data set \(D=\{\left({x}_{1},{y}_{1}\right),\left({x}_{2},{y}_{2}\right),\dots ,({x}_{n},{y}_{n})\}\) can be divided into \(|c|\) categories. For an example \({X}_{i}\) belonging to class \({K}_{j}\) \(({K}_{j}\in \{\mathrm{1,2},\dots ,|c|\})\), Relief-F finds the nearest neighbor examples in the sample set of class \({K}_{j}\) and of each other class. Suppose that the near-hit examples of \({X}_{i}\) are \({X}_{i,l,nh}\) \((l=\mathrm{1,2},\dots ,\left|c\right|;l\ne {K}_{j})\) and the near-miss examples of \({X}_{i}\) are \({X}_{i,l,nm}\). Then, an iterative calculation formula is used to update the feature weight \(W(A)\) of the attribute feature A. According to the input data set \(D\), the number of sampling times is set to \(m\), the threshold of the feature weight to \(\tau\), and the number of nearest neighbor samples to \(k\); the corresponding calculation is described as follows:

The feature weight \(W(A)\) of each attribute is initialized to 0, and the feature weight set \(T\) of the sample data set \(D\) is an empty set.

Start the iterative calculation and randomly select an example \({X}_{i}\) from the sample data set \(D\).

From the sample set of the same class as \({X}_{i}\), find the \(k\) near-hit examples \({X}_{i,l,nh}\), denoted as \({H}_{i}(c)(i=\mathrm{1,2},\dots ,k,c=class({X}_{i}))\). From the sample sets of classes different from \({X}_{i}\), find the \(k\) near-miss examples \({X}_{i,l,nm}\), denoted as \({M}_{i}(\widehat{c})(\widehat{c}\ne class({X}_{i}))\).

Update the feature weights \(W(A)\) and \(T\); the calculation formulas are as follows:
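In the standard Relief-F form, consistent with the symbols defined below, the weight update for a sampled example \(X\) is:

$$W(A) \leftarrow W(A) - \sum_{i=1}^{k}\frac{diff\left(A,X,{H}_{i}\right)}{mk} + \sum_{\widehat{c}\ne class(X)}\left[\frac{P(\widehat{c})}{1-P\left(class(X)\right)}\sum_{i=1}^{k}\frac{diff\left(A,X,{M}_{i}(\widehat{c})\right)}{mk}\right]$$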

where \(diff\left(A,{X}_{1},{X}_{2}\right)\) represents the distance between samples \({X}_{1}\) and \({X}_{2}\) on the feature \(A\), \(class\left({X}_{i}\right)\) represents the class label of sample \({X}_{i}\), and \(P\left(c\right)\) represents the prior probability of the result label c.

According to the weight calculation results of each attribute, the feature set of the initial input data is filtered reasonably. Specifically, a threshold \(\tau\) needs to be specified, and its value should conform to Chebyshev's inequality \(0<\tau \ll 1/\sqrt{\alpha m}\), where \(\alpha\) is the probability of accepting irrelevant features and \(m\) is the number of welding data samples.
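As an illustration, a minimal numpy sketch of this weight computation follows; the function and variable names are ours, and features are rescaled to [0, 1] so that the \(diff\) distances are comparable across attributes:

```python
import numpy as np

def relief_f(X, y, m=80, k=5, rng=None):
    """Minimal Relief-F sketch: estimate one weight per feature from the
    contrast between near-hit and near-miss distances."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    Xs = (X - X.min(axis=0)) / np.where(span == 0, 1, span)  # scale to [0, 1]
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / n
    W = np.zeros(d)
    for _ in range(m):                       # m random sampling rounds
        i = rng.integers(n)
        xi, ci = Xs[i], y[i]
        for c, pc in zip(classes, priors):
            idx = np.flatnonzero((y == c) & (np.arange(n) != i))
            if idx.size == 0:
                continue
            dist = np.abs(Xs[idx] - xi).sum(axis=1)
            near = Xs[idx[np.argsort(dist)[:k]]]         # k nearest of class c
            diff = np.abs(near - xi).mean(axis=0)        # mean per-feature diff
            if c == ci:
                W -= diff / m                # near hits decrease the weight
            else:                            # near misses increase it,
                W += (pc / (1 - priors[classes == ci][0])) * diff / m
    return W

# Features whose weight falls below the threshold tau would then be discarded.
```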

Construction of the welding quality prediction model

The welding quality prediction model based on APB

The BPNN is the most successful learning algorithm for training multi-layer feedforward neural networks. The principle of the iterative computation of the BPNN can be expressed mathematically 45. Assume a sample dataset \(D=\left\{\left({x}_{1}{,y}_{1}\right),\left({x}_{2}{,y}_{2}\right),\dots ,\left({x}_{n}{,y}_{n}\right)\right\},{x}_{i}\in {R}^{m},{y}_{i}\in {R}^{z}\), where each input sample vector includes m feature attributes and the output is a z-dimensional real-valued vector. m input nodes, q hidden-layer nodes, and z output nodes form a classical error BPNN structure; the three-layer multilayer feedforward network structure is taken as an example. The threshold of the h-th node in the hidden layer is \({\gamma }_{h}\), and the threshold of the j-th node in the output layer is \({\theta }_{j}\). The connection weight between the i-th input node and the h-th hidden node is denoted as \({v}_{ih}\), and the connection weight between the h-th hidden node and the j-th output node is denoted as \({\omega }_{hj}\). Let k be the number of training iterations of the network model.


The input vector \({O}_{h}\) of each node in the hidden layer can be calculated from the threshold \({\gamma }_{h}\) and the connection weights \({v}_{ih}\) between the input layer and the hidden layer. The output vector \({S}_{h}\) of each hidden node is then generated by passing \({O}_{h}\) through the activation function \(L(x)\):
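Under the common convention that the threshold is subtracted from the weighted input sum (an assumption here), these quantities take the form:

$$O_h = \sum_{i=1}^{m} v_{ih}\,x_i - \gamma_h, \qquad S_h = L(O_h), \quad h = 1, 2, \dots, q$$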

Then, using the output vectors of the hidden layer, the connection weights \({\omega }_{hj}\), and the threshold \({\theta }_{j}\), the input vector \({\beta }_{j}\) of each node in the output layer can be calculated. Passing the input vector \({\beta }_{j}\) through the activation function \(p(x)\) yields the output response \({T}_{j}\) of each node in the output layer:
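Under the same convention:

$$\beta_j = \sum_{h=1}^{q} \omega_{hj}\,S_h - \theta_j, \qquad T_j = p(\beta_j), \quad j = 1, 2, \dots, z$$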

For a training sample \(\left({x}_{i}{,y}_{i}\right)\), the output vector of the error back-propagation neural network is \({T}_{j}\). The mean square error \({E}_{i}\) between the actual output value \({T}_{j}\) and the expected output value \({y}_{i}\) of the training sample \(\left({x}_{i}{,y}_{i}\right)\) can then be calculated as:
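In the standard form (the factor 1/2 is a common convention that simplifies the gradient):

$$E_i = \frac{1}{2}\sum_{j=1}^{z}\left(T_j - y_j\right)^2$$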

The BP neural network is an iterative learning algorithm. Based on the gradient descent strategy, in each round of iteration, the update formula for any parameter \(\delta\) is:
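In gradient-descent form:

$$\delta \leftarrow \delta + \Delta\delta, \qquad \Delta\delta = -\eta\,\frac{\partial E_i}{\partial \delta}$$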

The learning rate is \(\eta\), and the formula is derived here for the increment \(\Delta {v}_{ih}\) of the connection weight between the input and hidden layers. Note that \({v}_{ih}\) first affects the input and output vectors of the h-th hidden node, then the input and output vectors of the j-th output node, and finally \({E}_{i}\). That is:
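By the chain rule over this dependency path (summing over the output nodes):

$$\frac{\partial E_i}{\partial v_{ih}} = \sum_{j=1}^{z}\frac{\partial E_i}{\partial T_j}\cdot\frac{\partial T_j}{\partial \beta_j}\cdot\frac{\partial \beta_j}{\partial S_h}\cdot\frac{\partial S_h}{\partial O_h}\cdot\frac{\partial O_h}{\partial v_{ih}}$$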

It is assumed that a typical sigmoid function is used for the nodes of both the hidden and output layers, which satisfies the characteristic relationship:
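That is, for the sigmoid:

$$f(x) = \frac{1}{1+e^{-x}}, \qquad f'(x) = f(x)\left(1-f(x)\right)$$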

Substituting into Eq. (9), the update equation for \(\Delta {v}_{ih}\) can be solved. Similarly, the update formulas for \(\Delta {\omega }_{hj}\), \(\Delta {\theta }_{j}\), and \(\Delta {\gamma }_{h}\) can be obtained. That is:
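In the standard sigmoid-BPNN derivation, consistent with the conventions assumed above, the resulting updates are:

$$\Delta\omega_{hj} = \eta\,g_j S_h, \quad \Delta\theta_j = -\eta\,g_j, \quad \Delta v_{ih} = \eta\,e_h x_i, \quad \Delta\gamma_h = -\eta\,e_h$$

$$g_j = T_j\left(1-T_j\right)\left(y_j - T_j\right), \qquad e_h = S_h\left(1-S_h\right)\sum_{j=1}^{z}\omega_{hj}\,g_j$$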

The BPNN model can realize arbitrarily complex mappings of multidimensional nonlinear functions, but it easily falls into local optima. Particle swarm optimization is a global random search algorithm based on swarm intelligence. It has good global search performance and universality for solving the global optimal solution under multiple objective functions and constraints, and it can improve the convergence accuracy and prediction performance of the BPNN. Therefore, by fusing adaptive simulated annealing, particle swarm optimization, and the back-propagation neural network, a welding quality prediction algorithm is created. The algorithm flow is shown in Fig. 3.

Figure 3. The APB algorithm flow.

During the iterative optimization of the algorithm, each particle updates its position by tracking its own individual extremum and the global extremum of the population. The movement of a particle is composed of three parts, which reflect the tendency to maintain the previous velocity, the approach to the best historical position, and group cooperation and information sharing. The update formulas for particle velocity and position are as follows:
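In the standard PSO formulation, consistent with the symbols defined below, the velocity and position updates are:

$$v_i(k+1) = w\,v_i(k) + c_1 r_1\left[P_{best,i}(k) - x_i(k)\right] + c_2 r_2\left[G_{best}(k) - x_i(k)\right]$$

$$x_i(k+1) = x_i(k) + v_i(k+1)$$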

where the critical parameters of each part are: \(w\) is the inertia weight coefficient; \({c}_{1}\) and \({c}_{2}\) are the self-cognitive factor and the social cognitive factor, respectively; \({v}_{i}(k)\) and \({x}_{i}\left(k\right)\) respectively represent the velocity and position of particle \(i\) at the k-th iteration; \({r}_{1}\) and \({r}_{2}\) are uniform random numbers in the range \([\mathrm{0,1}]\); and \({P}_{best,i}\left(k\right)\) and \({G}_{best}(k)\) represent the individual optimal solution of particle \(i\) and the global optimal solution at the k-th iteration.

\(w\), \({c}_{1}\), and \({c}_{2}\) are essential parameters controlling the iteration of the particle swarm optimization (PSO) algorithm. \(w\) governs the inertia of the particle flight and the strength of the algorithm's search ability, while \({c}_{1}\) and \({c}_{2}\) directly bias the particle's motion toward the individual or the group optimum. Therefore, to make PSO adaptive, this study dynamically adjusts \(w\), \({c}_{1}\), and \({c}_{2}\) to control the local and global search strategies and the collaborative sharing ability of the algorithm during the iterative calculation. A nonlinear control strategy based on a negative (decreasing) hyperbolic tangent curve is adopted to control the change of \(w\), and the values of \({c}_{1}\) and \({c}_{2}\) vary with the iteration number \(k\) of PSO. The update formulas of the related parameters are as follows:
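One plausible realization of such schedules, a decreasing tanh curve for \(w\) with linearly traded-off cognitive factors, is sketched below; the exact constants are an assumption, not taken from the source:

$$w(k) = \frac{w_{max}+w_{min}}{2} + \frac{w_{max}-w_{min}}{2}\tanh\left(4 - \frac{8k}{k_{max}}\right)$$

$$c_1(k) = c_{1max} - \left(c_{1max}-c_{1min}\right)\frac{k}{k_{max}}, \qquad c_2(k) = c_{2min} + \left(c_{2max}-c_{2min}\right)\frac{k}{k_{max}}$$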

where \({w}_{max}\) and \({w}_{min}\) are the maximum and minimum values of the inertia weight coefficient. \(k\) is the current number of iterations. \({k}_{max}\) is the maximum number of iterations. \({c}_{1max}\) , \({c}_{1min}\) are the maximum and minimum values of the self-cognitive factor. \({c}_{2max}\) , \({c}_{2min}\) are the maximum and minimum values of the social cognitive factor.

In addition, to improve the search dispersion of the PSO algorithm and avoid convergence to local minima, SA is applied to the cyclic solution process of the PSO algorithm. The SA algorithm is an adaptive iterative heuristic probabilistic search algorithm. It has strong robustness, global convergence, computational parallelism, and adaptability, and is suitable for solving nonlinear problems as well as optimization problems with different types of design variables. The specific process of the algorithm is as follows (a minimal code sketch of the Metropolis acceptance step is given after the list):

1. Select the welding quality influencing factors with strong correlation as the input feature set and the corresponding welding quality data as the output attribute set to establish the training and validation data sets of the algorithm;

2. Preliminarily construct the BPNN prediction model for welding quality prediction;

3. Set the fitness function to the mean square error to evaluate predictive performance. The flying particles are the weight and threshold parameter matrices of the neural network nodes. Initialize the particle population size N and the maximum evolution number M, set the search space dimension and velocity range of the particle swarm, and randomly initialize the positions and velocities of all particles in the population;

4. Calculate the fitness values of all initial particles in the population, compare the individual optimal position \({{\text{P}}}_{best}\) of each particle with the optimal position \({{\text{G}}}_{best}\) of the population, and set the initial temperature \(T\left(0\right)\) of the simulated annealing algorithm according to formula (22);

5. Update the positions and velocities of the particles by adaptively adjusting \(w\), \({c}_{1}\), and \({c}_{2}\) according to formulas (18), (19), and (20), perform an iterative optimization, and update the global optimum of the population;

6. Set \(T=T(0)\) and the initial solution \({S}_{1}\) as the global optimal solution, and determine the number of iterations at each temperature T, denoted as the chain length L of the Metropolis algorithm;

7. Add a stochastic perturbation to the solution \({S}_{1}\) of the current round of iteration and generate a new solution \({S}_{2}\);

8. Calculate the increment \(df=f({S}_{2})-f({S}_{1})\) of the new solution \({S}_{2}\), where \(f(x)\) is the fitness function;

9. If \(df<0\), \({S}_{2}\) is accepted as the current solution of this round of iteration, so \({{S}_{1}=S}_{2}\). If \(df>0\), the acceptance probability \({\text{exp}}(-df/T)\) of \({S}_{2}\) is calculated: a random number rand uniformly distributed in the interval (0,1) is generated, and when \({\text{exp}}(-df/T)>rand\), \({S}_{2}\) is accepted as the new solution of the iteration; otherwise, the current solution \({S}_{1}\) is retained;

10. If the prediction error of the current solution \({S}_{1}\) reaches the accuracy requirement, or the number of iterations reaches the maximum number of iterations M, the algorithm terminates and outputs the current global optimal solution. Otherwise, the algorithm decays the current temperature T according to the set attenuation function and returns to step 5 for cyclic iteration until the condition is met;

11. Output the current optimal particle, i.e., the optimal threshold and weight vectors, fit the validation sample set, calculate the prediction error, and return to step 5 if the accuracy conditions are not met.
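The following minimal Python sketch shows the Metropolis acceptance step (steps 7 to 9) under the case-study settings; `perturb` and `mse_fitness` are hypothetical placeholders for the stochastic perturbation and the BPNN mean-square-error evaluation of a particle's weight/threshold vector:

```python
import math
import random

def metropolis_accept(f_new, f_old, T):
    """Metropolis criterion for minimization: always accept an improvement;
    accept a worse solution with probability exp(-df/T), which shrinks
    as the temperature cools."""
    df = f_new - f_old
    if df < 0:
        return True
    return random.random() < math.exp(-df / T)

def anneal_step(s1, T, perturb, mse_fitness, mu=0.9):
    # One annealing step: perturb the current solution, apply the Metropolis
    # criterion, then decay the temperature (T(0) = 1e4, mu = 0.9 in the case study).
    s2 = perturb(s1)
    if metropolis_accept(mse_fitness(s2), mse_fitness(s1), T):
        s1 = s2
    return s1, mu * T
```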

After each iteration, the algorithm simulates the decay process of the initial temperature \(T\left(0\right)\). The algorithm can then not only accept the optimal solution, but also accept an inferior solution with a certain probability \({P}_{T}\), which improves the ability of PSO to jump out of local optima during the iterative optimization process. The update formulas of the related parameters are as follows:
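A standard Metropolis acceptance rule with exponential cooling, consistent with the symbols defined below, takes the form:

$$P_{T(k)}(i)=\begin{cases}1, & f\left(X_{i+1}^{T(k)}\right) < f\left(X_{i}^{T(k)}\right)\\ \exp\left(-\dfrac{f\left(X_{i+1}^{T(k)}\right)-f\left(X_{i}^{T(k)}\right)}{T(k)}\right), & \text{otherwise,}\end{cases}\qquad T(k+1)=\mu\,T(k)$$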

where \({X}_{i+1}^{T(k)}\) represents the individual solution at the current temperature \(T\left(k\right)\), \({P}_{T\left(k\right)}(i)\) is the acceptance probability that the new solution \({X}_{i+1}^{T(k)}\) replaces the historical solution \({X}_{i}^{T(k)}\), \(T\left(k\right)\) represents the current temperature of the k-th annealing, and \(\mu\) represents the cooling coefficient.

To evaluate the prediction accuracy of the improved algorithm model, the coefficient of determination (R²), the mean absolute percentage error (MAPE), and the root mean square error (RMSE) are selected as error metrics in this study. The reference formulas are:
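In their standard forms, consistent with the symbol definitions below, these metrics are:

$$R^2 = 1-\frac{\sum_{i=1}^{n}\left(y_i-\widehat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i-\overline{y}\right)^2}$$

$$\mathrm{MAPE}=\frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y^{(i)}-h\left(X^{(i)}\right)}{y^{(i)}}\right|,\qquad \mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)}-h\left(X^{(i)}\right)\right)^2}$$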

where n is the sample size in the dataset; \({y}_{i}\) is the actual observation of the i-th sample instance; \({\widehat{y}}_{i}\) is the fitted prediction of the i-th sample instance; \(\overline{y }\) is the average observation over the n sample instances; \({y}^{(i)}\) is the actual value of the i-th instance; and \(h({X}^{(i)})\) is the predicted value of the i-th instance.

The R² indicates how well the regression model fits the relationship between the sample independent variables and the dependent variable. The MAPE and RMSE reflect the degree of deviation between the predicted and actual values.
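Equivalently, a compact numpy sketch of the three metrics (function names are ours):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute R^2, MAPE (%), and RMSE for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1 - ss_res / ss_tot
    mape = 100 * np.mean(np.abs((y_true - y_pred) / y_true))
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return r2, mape, rmse
```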

Process parameter optimization

The genetic algorithm was first proposed by John Holland 46 according to the evolution of biological populations in nature. It obtains the optimal solution by simulating the natural evolution of a biological population. It can handle complex nonlinear combinatorial optimization problems and has good global optimization ability, so the genetic algorithm is widely used in optimization problems in many engineering fields. Based on the welding quality prediction model built above, the genetic algorithm is introduced to optimize the welding process parameters and obtain the optimal combination of process parameters.

The specific idea is to encode the welding current, arc voltage, welding speed, wire elongation, welding gas flow, and inductance as real-valued genes composing a chromosome, so that a chromosome represents a combination of welding process parameters. An initial population is generated; new populations are then produced through selection, crossover, and mutation; individuals are evaluated by the fitness function based on the prediction model established above; and the genetic algorithm finally iterates to the optimal combination of process parameters. The specific algorithm flow is shown in Fig. 4, and a minimal sketch follows the figure.

Figure 4. The optimization process of welding process parameters.

Case study

To demonstrate the feasibility of the method proposed in this paper, welding experiments on ship unit components are conducted in cooperation with a large shipyard. The proposed method is verified to accurately predict the welding quality of ship unit-assembly welding.

Based on the industrial IoT framework for data acquisition and processing of ship component welding, the welding data collection method is validated, as shown in Fig. 5 (a clearer version is shown in Supplementary Figure S2). The ship plate used in the investigation is general-strength hull structural steel Q235B. Its size is 300 mm × 150 mm × 5 mm, and center surfacing welding of the ship component is chosen as the welding process. In the welding experiment of the ship component structure, the digital welding machine selected is Panasonic's all-digital Metal Inert Gas welding machine, model YD-350GS5 of the GP5 series, which has a built-in IoT module and an analog communication interface. The automatic welding robot is a Panasonic TAWERS welding robot, which can realize very low-spatter welding of ship components. To collect the key welding process parameters, intelligent sensors and precision measuring instruments are used in the experiment. The wire-threading sensor for CO2 welding monitors the elongation of the welding wire during welding, and a TH2810 inductance measuring instrument measures the inductance during welding. In addition, the mass flow controller of the shielding gas measures the welding gas flow during the welding process (a more complete description is given in Supplementary Table S1).

Figure 5. The welding data acquisition system of the ship component structure.

The equipment used for welding data transmission includes communication interface equipment and a serial port network module. For the digital welding machines and welding robots, the PLC provides analog input modules that can receive standard voltage or current signals converted by transmitters. After calculation and analysis by the PLC, the analyzed data can be displayed on the human–machine interface (HMI) at the welding site through the communication interface device and communication protocol. Each intelligent sensor configured in the experiment has its own communication interface, such as RS232 or RS485, so the serial port network module can establish connections and perform data protocol conversion with each sensor. Wi-Fi transmission and Ethernet can be set up through a radio-frequency (RF) antenna and a WAN/LAN conversion component to support welding data read and write operations between the welding site and the upper computer. In this case, a MySQL database is used to store, manage, and share the welding data.

Residual stresses are measured by the blind hole method on the finished welded steel plate; the value of the residual stress reflects the quality of the weld. The blind hole method is based on attaching a strain gauge to the surface of the workpiece to be measured. A hole is then drilled into the workpiece, causing stress relaxation around the hole and generating a new stress field distribution. The released strain is collected by the strain gauge, and the original residual stress and strain of the workpiece can be deduced based on the principles of elasticity.

The measurement equipment consists of a stepper-adjustable drilling machine, a three-phase strain gauge, a TST3822E static strain test analyzer, and software. The diameter of the blind hole is 2 mm, and the depth is 3 mm. The measured stress is the average value of the stress distribution over the depth of the blind hole. According to the direction of action, residual welding stresses are divided into longitudinal residual stresses parallel to the weld axis and transverse residual stresses perpendicular to the weld. In this experiment, a three-phase right-angle strain gauge is chosen; that is, the layout angles of the strain gauges are 0°, 45°, and 90°, measuring the longitudinal strain, principal strain, and transverse strain, respectively. Since the distribution of longitudinal residual stresses is more regular than that of transverse residual stresses, only the strains in the 0° direction are considered in this experiment. The strain changes along the weld direction, and the analysis software then yields the longitudinal residual stress. Figure 6 shows the operation site and the strain gauge attachment positions of the blind hole experiment. The residual stresses of each plate are collected through the TST3822E static strain test analyzer and computer software.

Figure 6. Blind hole method to collect residual stress.

Preprocessing the welding process data

According to the correlation between the collected welding data and weld formation quality, the Relief-F algorithm, implemented in MATLAB, assigns different influence weights to each data feature. Data features whose weights are less than the threshold, such as data types irrelevant or only weakly related to weld formation quality, are excluded. The collected data include welding current, arc voltage, welding speed, welding gun angle, steel plate thickness, welding gas flow, welding wire diameter, inductance value, and welding wire elongation. The Relief-F algorithm requires the number of neighbors and the number of sampling times to be set. Based on the experimental sample data, this case selects the number of neighbors \(k=3, 5, 7, 8, 9, 10, 15, 18, 25\) and sets the number of samplings \(m\) to 80. The calculation results are shown in Fig. 7. The average of the calculation results of each group is used as the final weight of each data feature, and the results are shown in Table 1.

Figure 7. The final weight of each data feature.

Among the features of the collected welding data, the influence weights of arc voltage and welding current on weld formation quality are the largest, at 0.254 and 0.232, respectively. The main reason is that when the arc voltage and welding current increase, the heat energy input to the weld seam and the melting amount of the welding wire increase directly, thereby increasing the width, penetration, and reinforcement of the weld seam. The feature with the next largest influence is the welding speed, with a corresponding influence weight of 0.173. When the welding speed is too high, the cooling rate of the weld seam is too fast, which reduces metal deposition and affects the quality of weld formation; on the contrary, when it is too low, the heat concentration and the width of the molten pool increase, resulting in burn-through and other welding defects. In addition, in CO2 gas-shielded welding, the welding gas flow rate is a key parameter affecting weld formation quality, with a calculated influence weight of 0.171. When the gas flow is too large, it causes instability and disturbance of the molten pool and arc, resulting in turbulence and splashing in the molten pool; on the contrary, too small a flow directly reduces the protective effect of the gas and affects the quality of weld formation. The inductance value affects the penetration of the weld, and its weight is calculated to be 0.16. The welding wire elongation directly affects the protective effect of the gas, and its weight is calculated to be 0.144. The welding gun angle, steel plate thickness, and welding wire diameter also have a certain influence on weld formation quality, with influence weights of 0.13, 0.08, and 0.05, respectively.

In this verification case, the data sample size for CO2 gas-shielded welding of the ship component structure is 350, and \(\alpha\) is 0.145. Accordingly, \(1/\sqrt{\alpha m}=1/\sqrt{0.145\times 350}\approx 0.140\), so the weight threshold range for the influence weight of weld formation quality in the CO2 gas-shielded welding of the ship component structure is \(0<\tau \le 0.14\). Combined with the calculation results of the data feature weights, the influencing factors whose influence weights are greater than the threshold are taken as the main process parameters in this experiment: arc voltage, welding current, welding speed, welding gas flow, inductance value, and welding wire elongation. These are used as the key input variables for constructing the welding quality prediction model.

Predict the welding quality

Using MATLAB as the verification platform, this case uses the APB algorithm model to predict the welding quality of the ship component. 300 sets of welding data are selected to train the algorithm model (complete data in Supplementary Table S2), and 74 sets are selected for verification; the verification data set is shown in Table 2 (complete data in Supplementary Table S3). This case takes the weld forming coefficient as the target variable and selects six variables as the key welding quality influencing factors according to the feature selection results in the “Preprocessing the welding process data” section: welding current, arc voltage, welding speed, welding wire elongation, inductance value, and welding gas flow.

After many experiments with the above welding data, the upper limit \({w}_{max}\) is set to 0.9 and the lower limit \({w}_{min}\) to 0.4. The maximum number of PSO iterations \({k}_{max}\) is set to 1000. For the self-cognitive and social cognitive factors, \({c}_{1max}\), \({c}_{1min}\), \({c}_{2max}\), and \({c}_{2min}\) are set to 2.5, 1.25, 2.5, and 1.25, respectively. With these settings, the global search capability and convergence speed of the APB algorithm are balanced and better results are achieved. The Metropolis criterion of the SA algorithm is introduced into the iterative calculation of the PSO algorithm; in the case verification, the initial temperature (\(T\left(0\right)={10}^{4}\)) is attenuated by the cooling coefficient (\(\mu =0.9\)). The 24 sets of welding data in Table 2 are substituted into the trained APB algorithm model to predict and verify the weld forming coefficient. The actual output value of each verification sample is compared with the expected value, and the relative error is calculated, as shown in Table 3 (complete data in Supplementary Table S4). In this case, the maximum and minimum relative prediction errors of the APB (SA-PSO-BPNN) algorithm model on the validation sample data set are 8.764% and 5.364%, respectively. In general, the error of the proposed prediction algorithm is relatively small and satisfies the accuracy requirements for predicting the welding quality of ship components.

In addition, the improvements and advantages of the proposed APB algorithm model are further examined. Using the same welding data set as above, the BPNN, the BPNN optimized by the particle swarm optimization algorithm (PSO-BP), and the APB algorithm are selected to predict the residual welding stress. Some data comparison results are shown in Fig. 8.

Figure 8. The predictive outputs and comparison result of different algorithms.

The calculation results of the algorithm evaluation indicators R², MAPE, and RMSE are shown in Table 4. In comparison, the prediction accuracy of the PSO-BP algorithm on the welding data samples is higher than that of the BPNN, and the prediction accuracy of the APB algorithm is significantly improved again compared with PSO-BP.

Optimize the welding process parameters

Several workpieces with high welding residual stress are found in the experiment. The quality of these workpieces does not meet the requirements, resulting in scrap, which brings unnecessary economic loss to the enterprise. To reduce this loss and improve efficiency, the unqualified combinations of welding process parameters are optimized using the global optimization ability of the genetic algorithm.

The relevant parameters of the genetic algorithm are as follows: the maximum number of generations, population size, crossover probability, and mutation probability are 100, 50, 0.7, and 0.01, respectively. The proposed prediction model is used as the objective function; the smaller the residual stress value, the higher the fitness. The experiment is carried out again to optimize the defective combinations of process parameters in real time, and the residual stress of the optimized products is re-measured. The results, shown in Table 5, demonstrate that the optimized combination of process parameters yields products with lower residual stress, improving quality and reducing economic losses. It can provide a reference for the real-time improvement of the welding process in enterprises.

Conclusion and future works

To meet the requirements of real-time monitoring and accurate prediction of ship unit-assembly welding quality, an IoT-based welding data acquisition framework is first established, and stable and reliable data are obtained. The welding process monitoring elements are determined based on feature dimensionality reduction: according to the Relief-F algorithm, the crucial feature data are selected from the historical dataset. Secondly, the correlation between process parameters and welding quality is established, and the prediction model of ship unit-assembly welding is constructed by fusing adaptive simulated annealing, particle swarm optimization, and the back-propagation neural network. The genetic algorithm is selected to optimize the welding parameters. Finally, the experimental welding of a ship component is used as an example to verify the effectiveness of the proposed critical techniques.

The experimental results show that the proposed APB prediction model predicts the welding characteristics more effectively than the traditional methods, with a prediction accuracy of more than 91.236%: the coefficient of determination (R²) is increased from 0.659 to 0.952, the mean absolute percentage error (MAPE) is reduced from 39.83% to 1.77%, and the root mean square error (RMSE) is reduced from 0.4933 to 0.0709, showing higher prediction accuracy. This proves that the technique can be used for online monitoring and accurate prediction of the welding quality of ship components. It realizes real-time collection and efficient transmission of big welding data, including welding process parameters, operating status information of the welding equipment, and welding quality indicators. In addition, with the support of new-generation information technologies such as the IoT and big data, the dynamic quality data in the welding process can be tracked in real time and fully explored to realize online monitoring and accurate prediction of welding quality. With the application and development of automated welding equipment, more welding quality data and influencing factors are obtained, and with the continuous updating and mining of welding data, a more accurate welding quality prediction model needs to be established.

The proposed method can be applied well to the dynamic control of processing quality in ship unit-assembly welding. However, the implementation of the technology is limited by the diversity and complexity of the ship section assembly-welding process, so more effort and innovation should be devoted to addressing these limitations. It is necessary to improve the perception and management of real-time data in the IoT system, so as to promote the deep fusion of physical and virtual workshops and establish a more reliable virtual model and multi-dimensional welding simulation. Meanwhile, with the support of a more complete real-time database and welding quality mapping mechanism, the ship welding quality analysis capability can be continuously enhanced, and the processing quality prediction method can be further improved and innovated.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Stanic, V., Hadjina, M., Fafandjel, N. & Matulja, T. Toward shipbuilding 4.0—An Industry 4.0 changing the face of the shipbuilding industry. Brodogradnja 69 , 111–128. https://doi.org/10.21278/brod69307 (2018).


Ang, J., Goh, C., Saldivar, A. & Li, Y. Energy-efficient through-life smart design, manufacturing and operation of ships in an Industry 4.0 environment. Energies 10 , 610. https://doi.org/10.3390/en10050610 (2017).

Remes, H. & Fricke, W. Influencing factors on fatigue strength of welded thin plates based on structural stress assessment. Weld. World 6 , 915–923. https://doi.org/10.1007/s40194-014-0170-7 (2014).


Remes, H. et al. Factors affecting the fatigue strength of thin-plates in large structures. Int. J. Fatigue 101 , 397–407. https://doi.org/10.1016/j.ijfatigue.2016.11.019 (2017).

Li, L., Liu, D., Ren, S., Zhou, H. & Zhou, J. Prediction of welding deformation and residual stress of a thin plate by improved support vector regression. Scanning 2021 , 1–10. https://doi.org/10.1155/2021/8892128 (2021).

Fricke, W. et al. Fatigue strength of laser-welded thin-plate ship structures based on nominal and structural hot-spot stress approach. Ships Offshore Struct. 10 , 39–44. https://doi.org/10.1080/17445302.2013.850208 (2015).

Li, L., Liu, D., Liu, J., Zhou, H. & Zhou, J. Quality prediction and control of assembly and welding process for ship group product based on digital twin. Scanning 2020 , 1–13. https://doi.org/10.1155/2020/3758730 (2020).

Franciosa, P., Sokolov, M., Sinha, S., Sun, T. & Ceglarek, D. Deep learning enhanced digital twin for closed-loop in-process quality improvement. CIRP Ann. 69 , 369–372. https://doi.org/10.1016/j.cirp.2020.04.110 (2020).

Febriani, R. A., Park, H.-S. & Lee, C.-M. An approach for designing a platform of smart welding station system. Int. J. Adv. Manuf. Technol. 106 , 3437–3450. https://doi.org/10.1007/s00170-019-04808-6 (2020).

Liu, J. et al. Digital twin-enabled machining process modeling. Adv. Eng. Inf. 54 , 101737. https://doi.org/10.1016/j.aei.2022.101737 (2022).

Liu, J. et al. A digital twin-driven approach towards traceability and dynamic control for processing quality. Adv. Eng. Inf. 50 , 101395. https://doi.org/10.1016/j.aei.2021.101395 (2021).

Chen, J., Wang, T., Gao, X. & Wei, L. Real-time monitoring of high-power disk laser welding based on support vector machine. Comput. Ind. 94 , 75–81. https://doi.org/10.1016/j.compind.2017.10.003 (2018).

Rauber, T. W., De Assis Boldt, F. & Varejao, F. M. Heterogeneous feature models and feature selection applied to bearing fault diagnosis. IEEE Trans. Ind. Electron. 62 , 637–646. https://doi.org/10.1109/TIE.2014.2327589 (2015).

Bahmanyar, A. R. & Karami, A. Power system voltage stability monitoring using artificial neural networks with a reduced set of inputs. Int. J. Electr. Power Energy Syst. 58 , 246–256. https://doi.org/10.1016/j.ijepes.2014.01.019 (2014).

Rostami, M., Berahmand, K., Nasiri, E. & Forouzandeh, S. Review of swarm intelligence-based feature selection methods. Eng. Appl. Artif. Intell. 100 , 104210. https://doi.org/10.1016/j.engappai.2021.104210 (2021).

Liao, T. W. Improving the accuracy of computer-aided radiographic weld inspection by feature selection. NDT E Int. 42 , 229–239. https://doi.org/10.1016/j.ndteint.2008.11.002 (2009).

Jiang, H. et al. Convolution neural network model with improved pooling strategy and feature selection for weld defect recognition. Weld. World 65 , 731–744. https://doi.org/10.1007/s40194-020-01027-6 (2021).

Zhang, Z. et al. Multisensor-based real-time quality monitoring by means of feature extraction, selection and modeling for Al alloy in arc welding. Mech. Syst. Signal Process. 60–61 , 151–165. https://doi.org/10.1016/j.ymssp.2014.12.021 (2015).


Abdel-Basset, M., El-Shahat, D., El-henawy, I., de Albuquerque, V. H. C. & Mirjalili, S. A new fusion of grey wolf optimizer algorithm with a two-phase mutation for feature selection. Expert Syst. Appl. 139 , 112824. https://doi.org/10.1016/j.eswa.2019.112824 (2020).

Le, T. T. et al. Differential privacy-based evaporative cooling feature selection and classification with relief-F and random forests. Bioinformatics 33 , 2906–2913. https://doi.org/10.1093/bioinformatics/btx298 (2017).


Pal, S., Pal, S. K. & Samantaray, A. K. Neurowavelet packet analysis based on current signature for weld joint strength prediction in pulsed metal inert gas welding process. Sci. Technol. Weld. Join. 13, 638–645. https://doi.org/10.1179/174329308X299986 (2008).

Pal, S., Pal, S. K. & Samantaray, A. K. Prediction of the quality of pulsed metal inert gas welding using statistical parameters of arc signals in artificial neural network. Int. J. Comput. Integr. Manuf. 23, 453–465. https://doi.org/10.1080/09511921003667698 (2010).

Nykänen, T., Björk, T. & Laitinen, R. Fatigue strength prediction of ultra high strength steel butt-welded joints. Fatigue Fract. Eng. Mat. Struct. 36, 469–482. https://doi.org/10.1111/ffe.12015 (2013).

Lu, J., Shi, Y., Bai, L., Zhao, Z. & Han, J. Collaborative and quantitative prediction for reinforcement and penetration depth of weld bead based on molten pool image and deep residual network. IEEE Access 8, 126138–126148. https://doi.org/10.1109/ACCESS.2020.3007815 (2020).

Luo, Y., Li, J. L. & Wu, W. Nugget quality prediction of resistance spot welding on aluminium alloy based on structureborne acoustic emission signals. Sci. Technol. Weld. Join. 18, 301–306. https://doi.org/10.1179/1362171812Y.0000000102 (2013).

Shim, J.-Y., Zhang, J.-W., Yoon, H.-Y., Kang, B.-Y. & Kim, I.-S. Prediction model for bead reinforcement area in automatic gas metal arc welding. Adv. Mech. Eng. 10, 168781401878149. https://doi.org/10.1177/1687814018781492 (2018).

Lei, Z., Shen, J., Wang, Q. & Chen, Y. Real-time weld geometry prediction based on multi-information using neural network optimized by PCA and GA during thin-plate laser welding. J. Manuf. Process. 43, 207–217. https://doi.org/10.1016/j.jmapro.2019.05.013 (2019).

Chaki, S., Bathe, R. N., Ghosal, S. & Padmanabham, G. Multi-objective optimisation of pulsed Nd:YAG laser cutting process using integrated ANN–NSGAII model. J. Intell. Manuf. 29, 175–190. https://doi.org/10.1007/s10845-015-1100-2 (2018).

Wang, Y. et al. Weld reinforcement analysis based on long-term prediction of molten pool image in additive manufacturing. IEEE Access 8, 69908–69918. https://doi.org/10.1109/ACCESS.2020.2986130 (2020).

Hartl, R., Praehofer, B. & Zaeh, M. Prediction of the surface quality of friction stir welds by the analysis of process data using artificial neural networks. Proc. Inst. Mech. Eng. Part L J. Mater. Des. Appl. 234, 732–751. https://doi.org/10.1177/1464420719899685 (2020).

Chang, Y., Yue, J., Guo, R., Liu, W. & Li, L. Penetration quality prediction of asymmetrical fillet root welding based on optimized BP neural network. J. Manuf. Process. 50, 247–254. https://doi.org/10.1016/j.jmapro.2019.12.022 (2020).

Jin, C., Shin, S., Yu, J. & Rhee, S. Prediction model for back-bead monitoring during gas metal arc welding using supervised deep learning. IEEE Access 8, 224044–224058. https://doi.org/10.1109/ACCESS.2020.3041274 (2020).

Hu, W. et al. Improving the mechanical property of dissimilar Al/Mg hybrid friction stir welding joint by PIO-ANN. J. Mater. Sci. Technol. 53, 41–52. https://doi.org/10.1016/j.jmst.2020.01.069 (2020).

Cruz, Y. J. et al. Ensemble of convolutional neural networks based on an evolutionary algorithm applied to an industrial welding process. Comput. Ind. 133, 103530. https://doi.org/10.1016/j.compind.2021.103530 (2021).

Pereda, M., Santos, J. I., Martín, Ó. & Galán, J. M. Direct quality prediction in resistance spot welding process: Sensitivity, specificity and predictive accuracy comparative analysis. Sci. Technol. Weld. Join. 20, 679–685. https://doi.org/10.1179/1362171815Y.0000000052 (2015).

Wang, T., Chen, J., Gao, X. & Li, W. Quality monitoring for laser welding based on high-speed photography and support vector machine. Appl. Sci. 7, 299. https://doi.org/10.3390/app7030299 (2017).

Das, B., Pal, S. & Bag, S. Torque based defect detection and weld quality modelling in friction stir welding process. J. Manuf. Process. 27, 8–17. https://doi.org/10.1016/j.jmapro.2017.03.012 (2017).

Petković, D. Prediction of laser welding quality by computational intelligence approaches. Optik 140, 597–600. https://doi.org/10.1016/j.ijleo.2017.04.088 (2017).

Yu, R., Han, J., Zhao, Z. & Bai, L. Real-time prediction of welding penetration mode and depth based on visual characteristics of weld pool in GMAW process. IEEE Access 8, 81564–81573. https://doi.org/10.1109/ACCESS.2020.2990902 (2020).

Casalino, G., Campanelli, S. L. & Memola Capece Minutolo, F. Neuro-fuzzy model for the prediction and classification of the fused zone levels of imperfections in Ti6Al4V alloy butt weld. Adv. Mater. Sci. Eng. 2013, 1–7. https://doi.org/10.1155/2013/952690 (2013).

Rout, A., Bbvl, D., Biswal, B. B. & Mahanta, G. B. A fuzzy-regression-PSO based hybrid method for selecting welding conditions in robotic gas metal arc welding. Assem. Autom. 40, 601–612. https://doi.org/10.1108/AA-12-2019-0223 (2020).

Kim, K.-Y. & Ahmed, F. Semantic weldability prediction with RSW quality dataset and knowledge construction. Adv. Eng. Inf. 38, 41–53. https://doi.org/10.1016/j.aei.2018.05.006 (2018).

AbuShanab, W. S., AbdElaziz, M., Ghandourah, E. I., Moustafa, E. B. & Elsheikh, A. H. A new fine-tuned random vector functional link model using Hunger games search optimizer for modeling friction stir welding process of polymeric materials. J. Mater. Res. Technol. 14, 1482–1493. https://doi.org/10.1016/j.jmrt.2021.07.031 (2021).

Kennedy, J. Particle swarm optimization. In Encyclopedia of Machine Learning (eds Sammut, C. & Webb, G. I.) 760–766. https://doi.org/10.1007/978-0-387-30164-8_630 (2011).

Sun, C. et al. Prediction method of concentricity and perpendicularity of aero engine multistage rotors based on PSO-BP neural network. IEEE Access 7, 132271–132278. https://doi.org/10.1109/ACCESS.2019.2941118 (2019).

Holland, J. H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence (The MIT Press, New York, 1992). https://doi.org/10.7551/mitpress/1090.001.0001.

Funding

This work was supported by the National Natural Science Foundation of China under Grants 52075229 and 52371324, in part by the Provincial Natural Science Foundation of China under Grant KYCX20_3121, and by the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant SJCX22_1923. It was also sponsored by the Jiangsu Qinglan Project.

Author information

Authors and Affiliations

Jiangsu University of Science and Technology, Zhenjiang, 212100, Jiangsu, China

Jinfeng Liu, Yifa Cheng, Xuwen Jing & Yu Chen

Southeast University, Nanjing, 211189, China

Xiaojun Liu

Contributions

All authors contributed to the study conception and design. The first draft of the manuscript was written by J.L.; manuscript review and editing were performed by Y.C., X.J., X.L. and Y.C. All authors commented on previous versions of the manuscript. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Jinfeng Liu or Xuwen Jing.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, J., Cheng, Y., Jing, X. et al. Prediction and optimization method for welding quality of components in ship construction. Sci Rep 14, 9353 (2024). https://doi.org/10.1038/s41598-024-59490-w

Received: 10 January 2024

Accepted: 11 April 2024

Published: 23 April 2024

DOI: https://doi.org/10.1038/s41598-024-59490-w


Keywords

  • Quality prediction
  • Components welding
  • Welding quality



COMMENTS

  1. (PDF) Collecting data through case studies

    The case study is a data collection method in which in-depth descriptive information about specific entities, or cases, is collected, organized, interpreted, and presented in a narrative format ...

  2. Case Study

    The data collection method should be selected based on the research questions and the nature of the case study phenomenon. Analyze the data: The data collected from the case study should be analyzed using various techniques, such as content analysis, thematic analysis, or grounded theory. The analysis should be guided by the research questions ...
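To make that analysis step concrete, here is a minimal Python sketch of a crude content-analysis pass over interview transcripts. It is not any particular study's pipeline: the coding frame, keywords, and excerpts are all invented for illustration.

```python
from collections import Counter
import re

# Hypothetical coding frame: each code maps to keywords a researcher
# might associate with it (invented for demonstration).
CODING_FRAME = {
    "management_practices": ["policy", "procedure", "supervision"],
    "communication": ["meeting", "feedback", "email"],
}

def code_transcript(text: str) -> Counter:
    """Count how often each code's keywords appear in one transcript."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for code, keywords in CODING_FRAME.items():
        counts[code] = sum(tokens.count(k) for k in keywords)
    return counts

# Two short, invented interview excerpts.
transcripts = [
    "The new policy changed our weekly meeting and the feedback loop.",
    "Supervision improved once the procedure was written down.",
]

totals = Counter()
for t in transcripts:
    totals.update(code_transcript(t))
print(totals)  # Counter({'management_practices': 3, 'communication': 2})
```

Real content analysis is far more interpretive than a keyword count; a tally like this is only a starting point for the kind of thematic work the sources above describe.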

  3. Case Study Methodology of Qualitative Research: Key Attributes and

    A case study protocol should have the following constituent elements: (a) an overview of the entire study including its objectives, (b) a detailed description of field procedures including the techniques of data collection to be employed, and how one plans to move ahead and operate in the field, and (c) clearly and sharply developed questions ...

  4. Case Study Methods and Examples

    The purpose of case study research is twofold: (1) to provide descriptive information and (2) to suggest theoretical relevance. Rich description enables an in-depth or sharpened understanding of the case. It is unique given one characteristic: case studies draw from more than one data source. Case studies are inherently multimodal or mixed ...

  5. Chapter 10. Introduction to Data Collection Techniques

    Case Study; Feminist Approaches; Mixed Methods; often used as a supplementary technique: Single or comparative focused discussions with 5-12 persons: ... Data Collection Techniques. Each of these data collection techniques will be the subject of its own chapter in the second half of this textbook. This chapter serves as an orienting overview ...

  6. What Is a Case Study?

    A case study is a detailed study of a specific subject, such as a person, group, place, event, organization, or phenomenon. Case studies are commonly used in social, educational, clinical, and business research. A case study research design usually involves qualitative methods, but quantitative methods are sometimes also used.

  7. Planning Qualitative Research: Design and Decision Making for New

    When conducting a case study, researchers use a variety of data collection procedures. Merriam and Tisdell (2015) ... While in this paper, we mention various data collection methods (techniques), it is essential to remember that in addition to collecting data, researchers must ensure rigor from the design to the evaluation of the research, i.e ...

  8. Case Study Method: A Step-by-Step Guide for Business Researchers

    Case study reporting is as important as empirical material collection and interpretation. The quality of a case study depends not only on the empirical material collection and analysis but also on its reporting (Denzin & Lincoln, 1998). A sound report structure, along with "story-like" writing, is crucial to case study reporting.

  9. PDF A (VERY) BRIEF REFRESHER ON THE CASE STUDY METHOD

    the case study method favors the collection of data in natural settings, compared with relying on "derived" data (Bromley, 1986, p. 23)—for example, responses to a researcher's instruments in an experiment or responses to questionnaires in a survey. For instance, education audiences may want to know about the following:

  10. Qualitative Study Design and Data Collection

    5. Describe the processes of qualitative data collection for observing, interviewing, focus groups, and naturally occurring data. Given a study description, identify the processes employed in that study. 6. Explain why sometimes it is best to use a combination of qualitative strategies for data gathering.

  11. (PDF) Data Collection for Qualitative Research

    Data Collection for Qualitative Research. January 2020. DOI: 10.1017/9781108762427.011. In book: Research Methods in Business Studies (pp.95-128) Authors: Pervez N. Ghauri.

  12. Collecting data through case studies

    The article describes the decisions that need to be made in planning case study research and then presents examples of how case studies can be used in several performance technology applications. The advantages and disadvantages of case studies as a data collection method are discussed and guidelines for their use are given. Volume 46, Issue 7.

  13. Continuing to enhance the quality of case study methodology in health

    Using multiple data collection methods is a key characteristic of all case study methodology; it enhances the credibility of the findings by allowing different facets and views of the phenomenon to be explored. 23 Common methods include interviews, focus groups, observation, and document analysis. 5,37 By seeking patterns within and across data ...
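As a toy illustration of seeking patterns across data sources, the sketch below treats a theme as triangulated when two or more independent sources support it. The source names, themes, and the two-source threshold are all invented for demonstration.

```python
from collections import Counter

# Hypothetical coded themes per data source (invented for illustration).
themes_by_source = {
    "interviews": {"staff shortages", "unclear roles", "strong teamwork"},
    "observations": {"strong teamwork", "interrupted workflows"},
    "documents": {"staff shortages", "strong teamwork"},
}

# Count how many sources support each theme; two or more counts as triangulated.
support = Counter(t for themes in themes_by_source.values() for t in themes)
triangulated = sorted(t for t, n in support.items() if n >= 2)
print(triangulated)  # ['staff shortages', 'strong teamwork']
```

In practice, triangulation is a judgment about converging evidence, not a counting rule; the threshold here is purely illustrative.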

  14. Data Collection

    Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem. While methods and aims may differ between fields, the overall process of ...

  15. Collecting data through case studies

    The article describes the decisions that need to be made in planning case study research and then presents examples of how case studies can be used in several performance technology applications. The advantages and disadvantages of case studies as a data collection method are discussed and guidelines for their use are given.

  16. Case Study Research Method in Psychology

    The case study is not a research method, but researchers select methods of data collection and analysis that will generate material suitable for case studies. Freud (1909a, 1909b) conducted very detailed investigations into the private lives of his patients in an attempt to both understand and help them overcome their illnesses.

  17. Data Collection Technique

    Data collection techniques include interviews, observations (direct and participant), questionnaires, and relevant documents (Yin, 2014). ... In case study research, the data collected are usually qualitative (words, meanings, views) but can also be quantitative (descriptive numbers, tables). Qualitative data analysis may be used in theory ...

  18. Data Collection Methods and Tools for Research; A Step-by-Step Guide to

    Before selecting a data collection method, the type of data that is required for the study should be determined (Kabir, 2016). This section aims to provide a summary of possible data types to go through the different data collection methods and sources of data based on these categories. First, however, we need to understand what data actually is.

  19. The Case Study: Methods of Data Collection

    The case study involved TPWW Company and its consumers; semi-structured interviews were selected to collect the primary data from water industry professionals and members of the public in Greater Tehran. Table 6.2 illustrates the linkages between the research objectives and the data collection methods.

  20. (PDF) Data Collection Methods and Tools for Research; A Step-by-Step

    One of the main stages in a research study is data collection, which enables the researcher to find answers to research questions. Data collection is the process of collecting data aiming to gain ...

  21. Data Analytics Case Study Guide 2024

    A data analytics case study comprises essential elements that structure the analytical journey: Problem Context: A case study begins with a defined problem or question. It provides the context for the data analysis, setting the stage for exploration and investigation. Data Collection and Sources: It involves gathering relevant data from various sources, ensuring data accuracy, completeness ...
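Read as a checklist, those elements can be captured in a simple record type. This is a minimal sketch; the class and field names are invented, not from the guide being excerpted.

```python
from dataclasses import dataclass, field

@dataclass
class AnalyticsCaseStudy:
    """Skeleton of the elements a data-analytics case study might track."""
    problem_context: str                  # the defined problem or question
    data_sources: list[str]               # where the data comes from
    collection_notes: str = ""            # accuracy/completeness checks
    findings: list[str] = field(default_factory=list)

# Invented example values, for illustration only.
study = AnalyticsCaseStudy(
    problem_context="Why did weekly sales dip in Q3?",
    data_sources=["point-of-sale exports", "store-manager interviews"],
    collection_notes="Cross-checked POS totals against finance reports.",
)
print(study.problem_context)
```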

  22. Guidelines for data collection on energy performance of higher

    The study proposed a system which tackles the problem of data collection in university buildings. The proposed database covered data on the building's shape, data on the hourly and quarter-hourly consumption of power and gas, operational information, and the effect of weather to assist universities in creating nearly zero-energy buildings.

  23. Leading Quality and Safety on the Frontline

    This study is a part of a single embedded case study, conducted in accordance with Yin's description of case study research. A case study is a method that investigates a contemporary phenomenon in-depth in a real-world context, and the approach is suitable when seeking to understand a complex social phenomenon.

  24. Case Study Methodology of Qualitative Research: Key Attributes and

    a phenomenon in its real-life context. In case study research, multiple methods of data collection are used, as it involves an in-depth study of a phenomenon. It must be noted, as highlighted by Yin (2009), that a case study is not a method of data collection; rather, it is a research strategy or design to study a social unit.

  25. Data Collection Methods

    Table of contents. Step 1: Define the aim of your research. Step 2: Choose your data collection method. Step 3: Plan your data collection procedures. Step 4: Collect the data. Frequently asked questions about data collection.

  26. The economic commitment of climate change

    The inclusion of further climate variables (Extended Data Fig. 5) and a sufficient number of lags to more adequately capture the extent of impact persistence (Extended Data Figs. 1 and 2) are the ...

  27. The Costs of Anonymization: Case Study Using Clinical Data

    Objective: The goal of this study is to contribute to a better understanding of anonymization in the real world by comprehensively evaluating the privacy-utility trade-off of differently anonymized data using data and scientific results from the German Chronic Kidney Disease (GCKD) study. Methods: The GCKD data set extracted for this study ...

  28. Machines

    Artificial neural networks (ANNs) provide supervised learning via input pattern assessment and effective resource management, thereby improving energy efficiency and predicting environmental fluctuations. The advanced technique of ANNs forecasts diesel engine emissions by collecting measurements during trial sessions. This study included experimental sessions to establish technical and ...

  29. Microorganisms

    Legionella pneumophila can cause a large panel of symptoms besides the classic pneumonia presentation. Here we present a case of fatal nosocomial cellulitis in an immunocompromised patient followed, a year later, by a second case of Legionnaires' disease in the same ward. While the first case was easily assumed as nosocomial based on the date of symptom onset, the second case required clear ...

  30. Prediction and optimization method for welding quality of ...

    Method for selecting welding quality features. For the huge amount of data generated at the production site, knowledge can be mined through suitable processing methods to assist production 11,12
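As a loose illustration of that kind of feature screening (not the authors' actual method), the sketch below ranks synthetic welding-signal features by mutual information with a binary quality label using scikit-learn. All signals, thresholds, and names are invented.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 200

# Synthetic process signals (invented): current, voltage, speed, noise.
current = rng.normal(220, 15, n)   # welding current (A)
voltage = rng.normal(24, 2, n)     # arc voltage (V)
speed   = rng.normal(6, 1, n)      # travel speed (mm/s)
noise   = rng.normal(0, 1, n)      # irrelevant channel

# Toy quality label: driven mostly by current and voltage.
quality = ((current > 215) & (voltage > 23)).astype(int)

X = np.column_stack([current, voltage, speed, noise])
names = ["current", "voltage", "speed", "noise"]

# Rank features by mutual information with the quality label.
mi = mutual_info_classif(X, quality, random_state=0)
for name, score in sorted(zip(names, mi), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```

In the studies cited above, far more sophisticated selectors (relief-F, swarm-based methods, grey wolf optimizers) play this screening role.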