Deep Learning Research Proposal

The term deep learning refers to the study and analysis of deep features hidden in data using intelligent deep learning models. Recently, it has become the most important research paradigm for advanced automated decision-making systems. Deep learning is derived from machine learning technologies that learn based on hierarchical concepts. Hence, it is well suited to performing complex and lengthy mathematical computations.

This page walks you through the innovations of deep learning research proposals, along with major challenges, techniques, limitations, tools, and more!!!

One of the most important characteristics of deep learning is its multi-layered approach. It enables the machine to construct and run algorithms across different layers for deep analysis. Further, it works on the principle of artificial neural networks, which function much like the human brain; deep learning takes inspiration from the human brain to make machines automatically understand a situation and make smart decisions accordingly. Here, we have given you some of the important real-time applications of deep learning.

Deep Learning Project Ideas

  • Natural Language Processing
  • Pattern detection in Human Face
  • Image Recognition and Object Detection
  • Driverless UAV Control Systems
  • Prediction of Weather Condition Variation
  • Machine Translation for Autonomous Cars
  • Medical Disorder Diagnosis and Treatment
  • Traffic and Speed Control in Motorized Systems
  • Voice Assistance for Dense Areas Navigation
  • Altitude Control System for UAV and Satellites

Now, let us look at the workflow of deep learning models. Here, we have given you the steps involved in a deep learning model, which helps you understand the general procedure of deep learning model execution. Similarly, we precisely guide you in every step of your proposed deep learning model. Further, the steps may vary based on the requirements of the handpicked deep learning project idea. In any case, a deep learning model is intended to grasp the deep features of data by processing them through neural networks; the machine then learns and understands sudden scenarios for controlling systems.

Top 10 Interesting Deep Learning Research Proposals

Process Flow of Deep Learning

  • Step 1 – Load the dataset as input
  • Step 2 – Extraction of features
  • Step 3 – Process add-on layers for more abstract features
  • Step 4 – Perform feature mapping
  • Step 5 – Display the output
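For concreteness, the sketch below maps these five steps onto a small PyTorch training loop. It is only an illustration: the synthetic dataset, layer widths, learning rate, and number of epochs are placeholder choices, not values prescribed by the workflow above.

```python
# Minimal sketch of the five-step flow above (PyTorch assumed installed);
# dataset, layer sizes, and training settings are illustrative placeholders.
import torch
from torch import nn

# Step 1 - load the dataset as input (synthetic data stands in for a real set)
X = torch.randn(256, 20)                 # 256 samples, 20 raw features
y = torch.randint(0, 2, (256,))          # binary labels

# Steps 2-3 - extract features and add layers for more abstract features
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),        # first-level feature extraction
    nn.Linear(64, 32), nn.ReLU(),        # deeper, more abstract features
    nn.Linear(32, 2),                    # Step 4 - map features to output classes
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    optimizer.zero_grad()
    logits = model(X)
    loss = loss_fn(logits, y)
    loss.backward()
    optimizer.step()

# Step 5 - display the output
print("final loss:", loss.item())
print("predicted classes:", logits.argmax(dim=1)[:10])
```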

Although deep learning learns features automatically and more efficiently than conventional methods, it has some technical constraints. Here, we have specified only a few constraints to make you aware of current research. Beyond these primary constraints, we have also handpicked a number of other constraints. To know about other exciting research limitations in deep learning, approach us. We will help you understand more from the top research areas.

Deep Learning Limitations

  • Test Data Variation – When the test data differ from the training data, the employed deep learning technique may fail. Further, it also may not work efficiently even in a controlled environment.
  • Huge Dataset Requirement – Deep learning models work efficiently on large-scale datasets rather than on limited data.

Our research team is highly proficient in handling different deep learning technologies. To present you with up-to-date information, we constantly upgrade our research knowledge across all advanced developments. So, we are good not only at handpicking research challenges but also skilled at developing novel solutions. For your information, here we have given you some of the most common data handling issues with appropriate solutions.

What are the data handling techniques?

  • Factor-based modeling
    • Variables signify a linear combination of factors plus error terms
    • Depends on the assumed presence of different unobserved (latent) variables
    • Identifies the correlations between the existing observed variables
  • Low-variance filter
    • If the data in a column has fixed values, then it has “0” variance.
    • Further, these kinds of variables are not considered against the target variable
  • Random forest feature selection
    • If there are issues with outliers, noisy variables, or missing values, then effective feature selection will help you get rid of them.
    • So, we can employ the random forest method
    • Remove the unwanted features from the model, one at a time, and check the error rate
    • Repeat the same process until the maximum tolerable error rate is reached
    • At last, define the minimum feature set
  • High-correlation filter
    • If there are dependent values among data columns, they may carry redundant information due to similarities.
    • So, we can filter the largely correlated columns based on coefficients of correlation
    • Add columns back one at a time for high performance
    • This enhances the entire model efficiency
  • Low-dimensional embedding
    • Addresses the possibility that data points are associated with a high-dimensional space
    • Select a low-dimensional embedding to generate the related distribution
  • Missing-value ratio
    • Identify the missing-value columns and remove them by a threshold
  • Variable transformation
    • The present variable set is converted to a new variable set
    • Each new variable is a linear combination of the original variables
  • Multi-dimensional scaling (MDS)
    • Determine the location of each point by the pair-wise distances among all points, represented in a matrix
    • Further, use standard multi-dimensional scaling (MDS) to determine low-dimensional point locations
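As a rough illustration of several of the listed steps (missing-value removal, low-variance and high-correlation filtering, and random-forest-based feature ranking), the following pandas/scikit-learn sketch can be used; the synthetic data, column names, and thresholds are invented for the example.

```python
# Illustrative sketch of a few of the data-handling steps listed above;
# column names and thresholds are invented, not prescribed values.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 6)),
                  columns=[f"f{i}" for i in range(6)])
df["f5"] = df["f0"] * 0.99 + rng.normal(scale=0.01, size=200)   # highly correlated column
df.loc[rng.choice(200, 120, replace=False), "f4"] = np.nan      # mostly missing column
y = (df["f0"] + df["f1"] > 0).astype(int)                       # toy target variable

# Drop columns whose missing-value ratio exceeds a threshold
df = df.loc[:, df.isna().mean() < 0.5]

# Drop zero-variance columns
df = df.loc[:, df.var() > 0.0]

# Drop one column of each highly correlated pair (|r| > 0.95)
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
df = df.drop(columns=[c for c in upper.columns if (upper[c] > 0.95).any()])

# Rank the remaining features with a random forest
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(df, y)
for name, score in sorted(zip(df.columns, forest.feature_importances_),
                          key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```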

In addition, we have also given you the widely utilized deep learning models in current research. Here, we have classified the models into two major classes: discriminative models and generative models. Further, we have also specified the deep learning process with suitable techniques. If there is a complex situation, then we design new algorithms based on the project’s needs. On the whole, we find apt solutions for any sort of problem through our smart approach.

Deep Learning Models

  • CNN and NLP (Hybrid)
  • Domain-specific
  • Image conversion
  • Meta-Learning

Furthermore, our developers would like to share the globally suggested deep learning software and tools. In truth, we have thorough practice with all these developing technologies. So, we are ready to provide fine-tuned guidance on deep learning libraries, modules, packages, toolboxes, etc. to ease your development process. Additionally, we will also suggest the best-fitting software/tool for your project. We assure you that our suggested software/tool will make the implementation of your deep learning project techniques simpler and more reliable.

Deep Learning Software and Tools

  • Caffe & Caffe2
  • Deep Learning 4j
  • Microsoft Cognitive Toolkit

So far, we have discussed important research updates in deep learning. Now, let us see the importance of handpicking a good research topic for an impressive deep learning research proposal. In the research topic, we have to outline the research by mentioning the research problem and efficient solutions. Also, it is necessary to check the future scope of research for that particular topic.

A topic without a future research direction is not worth researching!!!

For more clarity, here we have given you a few significant tips to select a good deep learning research topic.

How to write a research paper on deep learning?

  • Check whether your selected research problem is inspiring to overcome but not too complex to solve
  • Check whether your selected problem not only inspires you but also creates interest among readers and followers
  • Check whether your proposed research makes a contribution to social development
  • Check whether your selected research problem is unique

From the above list, you can get an idea of what exactly a good research topic is. Now, let us see how a good research topic is identified.

  • To recognize the best research topic, first undertake in-depth research on recent deep learning studies by referring to the latest reputed journal papers.
  • Then, perform a review of the collected papers to detect the current research limitations, which aspects have not been addressed yet, which problems are not solved effectively, which solutions need to be improved, which techniques are followed in recent research, etc.
  • This literature review process needs more time and effort to grasp knowledge of the research demands among scholars.
  • If you are new to this field, then it is suggested to take the advice of field experts, who can recommend good and resourceful research papers.
  • Majorly, the drawbacks of the existing research are proposed as a problem in order to provide suitable research solutions.
  • Usually, it is better to work on resource-rich research areas than on areas that have limited references.
  • When you find the desired research idea, immediately check the originality of the idea. Make sure that no one has already proved your research idea.
  • It is better to find this out at the initial stage itself so that you can choose another one.
  • For that, the search keywords are most important, because someone may have already conducted the same research under a different name. So, concentrate on choosing keywords for the literature study.

How to describe your research topic?

One common error beginners face in research topic selection is a misunderstanding: some researchers think topic selection means just the title of the project. But it is not like that; you have to convey detailed information about your research work in a short and crisp topic. In other words, the research topic needs to act as an outline for your research work.

For instance, “deep learning for disease detection” is not a topic with clear information. In it, you should mention details such as the type of deep learning technique, the type of image and its processing, the type of body part, the symptoms, etc.

The modified research topic for “deep learning for disease detection” is “COVID-19 detection using an automated deep learning algorithm”.

For your awareness, here we have given you some key points that you need to focus on while framing research topics. To clearly define your research topic, we recommend writing some text explaining:

  • Research title
  • Previous research constraints
  • Importance of the problem that overcomes in proposed research
  • Reasons the research problem is challenging
  • Outline of problem-solving possibility

To conclude, let us see the different research perspectives on deep learning among the research community. In the following, we have presented the most in-demand research topics in deep learning, such as image denoising, moving object detection, and event recognition. In addition to this list, we also have a repository of recent deep learning research proposal topics and machine learning thesis topics. So, communicate with us to know the advanced research ideas of deep learning.

Research Topics in Deep Learning

  • Continuous Network Monitoring and Pipeline Representation in Temporal Segment Networks
  • Dynamic Image Networks and Semantic Image Networks
  • Advance Non-uniform denoising verification based on FFDNet and DnCNN
  • Efficient image denoising based on ResNets and CNNs
  • Accurate object recognition in deep architecture using ResNeXts, Inception Nets and  Squeeze and Excitation Networks
  • Improved object detection using Faster R-CNN, YOLO, Fast R-CNN, and Mask-RCNN

Novel Deep Learning Research Proposal Implementation

Overall, we are ready to support you in all significant and new research areas of deep learning. We guarantee that we will provide you with a novel deep learning research proposal in your area of interest, along with writing support. Further, we also offer code development, paper writing, paper publication, and thesis writing services. So, create a bond with us to build a strong foundation for your research career in the deep learning field.

Related Pages

Services we offer:

  • Mathematical proof
  • Pseudo code
  • Conference Paper
  • Research Proposal
  • System Design
  • Literature Survey
  • Data Collection
  • Thesis Writing
  • Data Analysis
  • Rough Draft
  • Paper Collection
  • Code and Programs
  • Paper Writing
  • Course Work


Deep learning: systematic review, models, challenges, and research directions

  • Open access
  • Published: 07 September 2023
  • Volume 35, pages 23103–23124 (2023)


Tala Talaei Khoei (ORCID: orcid.org/0000-0002-7630-9034), Hadjar Ould Slimane, and Naima Kaabouch


Abstract

The current development in deep learning is witnessing an exponential transition into automation applications. This automation transition can provide a promising framework for higher performance and lower complexity. This ongoing transition undergoes several rapid changes, resulting in the processing of the data by several studies, while it may lead to time-consuming and costly models. Thus, to address these challenges, several studies have been conducted to investigate deep learning techniques; however, they mostly focused on specific learning approaches, such as supervised deep learning. In addition, these studies did not comprehensively investigate other deep learning techniques, such as deep unsupervised and deep reinforcement learning techniques. Moreover, the majority of these studies neglect to discuss some main methodologies in deep learning, such as transfer learning, federated learning, and online learning. Therefore, motivated by the limitations of the existing studies, this study summarizes the deep learning techniques into supervised, unsupervised, reinforcement, and hybrid learning-based models. In addition, a brief description of each category and its models is provided. Some of the critical topics in deep learning, namely, transfer, federated, and online learning models, are explored and discussed in detail. Finally, challenges and future directions are outlined to provide wider outlooks for future researchers.


1 Introduction

The main concept of artificial neural networks (ANN) was proposed and introduced as a mathematical model of an artificial neuron in 1943 [ 1 , 2 , 3 ]. In 2006, the concept of deep learning (DL) was proposed as an ANN model with several layers, which has significant learning capacity. In recent years, DL models have seen tremendous progress in addressing and solving challenges, such as anomaly detection, object detection, disease diagnosis, semantic segmentation, social network analysis, and video recommendations [ 4 , 5 , 6 , 7 ].

Several studies have been conducted to discuss and investigate the importance of the DL models in different applications, as illustrated in Table 1 . For instance, the authors of [ 8 ] reviewed supervised, unsupervised, and reinforcement DL-based models. In [ 9 ], the authors outlined DL-based models, platforms, applications, and future directions. Another survey [ 10 ] provided a comprehensive review of the existing models in the literature in different applications, such as natural processing, social network analysis, and audio. In this study, the authors provided a recent advancement in DL applications and elaborated on some of the existing challenges faced by these applications. In [ 11 ], the authors highlighted different DL-based models, such as deep neural networks, convolutional neural networks, recurrent neural networks, and auto-encoders. They also covered their frameworks, benchmarks, and software development requirements. In [ 12 ], the authors discussed the main concepts of deep learning and neural networks. They also provided several applications of DL in a variety of areas.

Other studies covered particular challenges of DL models. For instance, the authors of [ 13 ] explored the importance of class imbalanced dataset on the performance of the DL models as well as the strengths and weaknesses of the methods proposed in the literature for solving class imbalanced data. Another study [ 14 ] explored the challenges that DL faces in the case of data mining, big data, and information processing due to huge volume of data, velocity, and variety. In [ 15 ], the authors analyzed the complexity of DL-based models and provided a review of the existing studies on this topic. In [ 16 ], the authors focused on the activation functions of DL. They introduced these functions as a strategy in DL to transfer nonlinearly separable input into the more linearly separable data by applying a hierarchy of layers, whereas they provided the most common activation functions and their characteristics.

In [ 17 ], the authors outlined the applications of DL in cybersecurity. They provided a comprehensive literature review of DL models in this field and discussed different types of DL models, such as convolutional neural networks, auto-encoders, and generative adversarial networks. They also covered the applications of different attack categories, such as malware, spam, insider threats, network intrusions, false data injection, and malicious in DL. In another study [ 18 ], the authors focused on detecting tiny objects using DL. They analyzed the performance of different DL in detecting these objects. In [ 19 ], the authors reviewed DL models in the building and construction industry-based applications while they discussed several important key factors of using DL models in manufacturing and construction, such as progress monitoring and automation systems. Another study [ 20 ] focused on using different strategies in the domain of artificial intelligence (AI), including DL in smart grids. In such a study, the authors introduced the main AI applications in smart grids while exploring different DL models in depth. In [ 7 ], the authors discussed the current progress of DL in medical areas and gave clear definitions of DL models and their theoretical concepts and architectures. In [ 21 ], the authors analyzed the DL applications in biology, medicine, and engineering domains. They also provided an overview of this field of study and major DL applications and illustrated the main characteristics of several frameworks, including molecular shuttles.

Despite the existing surveys in the field of DL focusing on a comprehensive overview of these techniques in different domains, the increasing amount of these applications and the existing limitations in the current studies motivated us to investigate this topic in depth. In general, the recent studies in the literature mostly discussed specific learning strategies, such as supervised models, while they did not cover different learning strategies and compare them with each other. In addition, the majority of the existing surveys excluded new strategies, such as online learning or federated learning, from their studies. Moreover, these surveys mostly explored specific applications in DL, such as the Internet of Things, smart grid, or constructions; however, this field of study requires formulation and generalization in different applications. In fact, limited information, discussions, and investigations in this domain may prevent any development and progress in DL-based applications. To fill these gaps, this paper provides a comprehensive survey on four types of DL models, namely, supervised, unsupervised, reinforcement, and hybrid learning. It also provides the major DL models in each category and describes the main learning strategies, such as online, transfer, and federated learning. Finally, a detailed discussion of future directions and challenges is provided to support future studies. In short, the main contributions of this paper are as follows:

  • Classifications and in-depth descriptions of supervised, unsupervised, reinforcement, and hybrid models,
  • Description and discussion of learning strategies, such as online, federated, and transfer learning,
  • Comparison of different classes of learning strategies, their advantages, and disadvantages,
  • Current challenges and future directions in the domain of deep learning.

The remainder of this paper is organized as follows: Sect.  2 provides descriptions of the supervised, unsupervised, reinforcement, and hybrid learning models, along with a brief description of the models in each category. Section  3 highlights the main learning approaches that are used in deep learning. Section  4 discusses the challenges and future directions in the field of deep learning. The conclusion is summarized in Sect.  5 .

2 Categories of deep learning models

DL models can be classified into four categories, namely, deep supervised, unsupervised, reinforcement learning, and hybrid models. Figure  1 depicts the main categories of DL along with examples of models in each category. In the following, short descriptions of these categories are provided. In addition, Table 2 provides the most common techniques in every category.

Figure 1: Schematic review of the models in deep learning

2.1 Deep supervised learning

Deep supervised learning-based models are one of the main categories of deep learning models that use a labeled training dataset to be trained. These models measure the accuracy through a function, loss function, and adjust the weights till the error has been minimized sufficiently. Among the supervised deep learning category, three important models are identified, namely, deep neural networks, convolutional neural networks, and recurrent neural network-based models, as illustrated in Fig.  2 . Artificial neural networks (ANN), known as neural networks or neural nets, are one of the computing systems, which are inspired by biological neural networks. ANN models are a collection of connected nodes (artificial neurons) that model the neurons in a biological brain. One of the simple ANN models is known as a deep neural network (DNN) [ 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 ]. DNN models consist of a hierarchical architecture with input, output, and hidden layers, each of which has a nonlinear information processing unit, as illustrated in Fig.  2 A. DNN, using the architecture of neural networks, consists of functions with higher complexity when the number of layers and units in a layer is increased. Some known instances of DNN models, as highlighted in Table 2 , are multi-layer perceptron, shallow neural network, operational neural network, self-operational neural network, and iterative residual blocks neural network.
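A minimal sketch of such a supervised DNN is shown below, assuming PyTorch; the layer widths, optimizer, and synthetic labeled batch are illustrative choices only.

```python
# Minimal sketch of a deep supervised DNN: labeled data, a hierarchy of hidden
# layers, and weights adjusted by minimizing a loss function.
import torch
from torch import nn

class SimpleDNN(nn.Module):
    def __init__(self, n_inputs: int, n_classes: int):
        super().__init__()
        self.hidden = nn.Sequential(             # hidden layers with nonlinearities
            nn.Linear(n_inputs, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
        )
        self.output = nn.Linear(64, n_classes)   # output layer

    def forward(self, x):
        return self.output(self.hidden(x))

model = SimpleDNN(n_inputs=32, n_classes=3)
loss_fn = nn.CrossEntropyLoss()                  # the loss function the text refers to
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x, y = torch.randn(64, 32), torch.randint(0, 3, (64,))  # labeled mini-batch
loss = loss_fn(model(x), y)
loss.backward()                                  # gradients of the loss
optimizer.step()                                 # adjust weights to reduce the error
```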

Figure 2: Inner architecture of deep supervised models

The second type of deep supervised models is convolutional neural networks (CNN), known as one of the important DL models that are used to capture the semantic correlations of underlying spatial features among slice-wise representations by convolution operations in multi-dimensional data [ 25 ]. A simple architecture of CNN-based models is shown in Fig.  2 B. In these models, the feature mapping has k filters that are partitioned spatially into several channels. In addition, the pooling function can shrink the width and height of the feature map, while the convolutional layer can apply a filter to an input to generate a feature map that can summarize the identified features as input. The convolutional layers are followed by one or more fully connected layers connected to all the neurons of the previous layer. CNN usually analyzes the hidden patterns using pooling layers for scaling functions, sharing the weights for reducing memories, and filtering the semantic correlation captured by convolutional operations. Therefore, CNN architecture provides a strong potential for spatial features. However, CNN models suffer from an inability to capture particular features. Some known examples of this network are presented in Table 2 [ 7 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 ].
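The following sketch shows the convolution, pooling, and fully connected structure just described, again assuming PyTorch; the channel counts and the 28x28 single-channel input are assumptions made for illustration.

```python
# Sketch of a small CNN: convolution + pooling layers followed by a fully
# connected head. Channel counts and the 28x28 input size are assumptions.
import torch
from torch import nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # filters producing feature maps
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling shrinks width and height
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # fully connected classification head
)

images = torch.randn(8, 1, 28, 28)               # batch of single-channel images
print(cnn(images).shape)                         # torch.Size([8, 10])
```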

The other type of supervised DL is recurrent neural network (RNN) models, which are designed for sequential time-series data where the output is returned to the input, as shown in Fig.  2 C [ 27 ]. RNN-based models are widely used to memorize the previous inputs and handle the sequential data and existing inputs [ 42 ]. In RNN models, the recursive process has hidden layers with loops that indicate effective information about the previous states. In traditional neural networks, the given inputs and outputs are totally independent of one another, whereas the recurrent layers of RNN have a memory that remembers the whole data about what is exactly calculated [ 48 ]. In fact, in RNN, similar parameters for every input are applied to construct the neural network and estimate the outputs. The critical principle of RNN-based models is to model time collection samples; hence, specific patterns can be estimated to be dependent on previous ones [ 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 ]. Table 2 provides the instances of RNN-based models as simple recurrent neural network, long short-term memory, gated recurrent unit neural network, bidirectional gated recurrent unit neural network, bidirectional long short-term memory, and residual gated recurrent neural network [ 64 , 65 , 66 ]. Table  3 shows the advantages and disadvantages of supervised DL models.
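A compact sketch of an RNN-style sequence classifier, here using an LSTM whose final hidden state summarizes the previous time steps, is given below; the sequence length, feature size, and hidden size are arbitrary example values.

```python
# Sketch of an RNN for sequential data: an LSTM whose hidden state carries
# information about previous time steps. Sizes are illustrative only.
import torch
from torch import nn

class SequenceClassifier(nn.Module):
    def __init__(self, n_features: int, hidden_size: int, n_classes: int):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, x):                  # x: (batch, time, features)
        outputs, (h_n, c_n) = self.rnn(x)  # h_n: last hidden state (memory of the sequence)
        return self.head(h_n[-1])

model = SequenceClassifier(n_features=6, hidden_size=32, n_classes=2)
x = torch.randn(4, 50, 6)                  # 4 sequences of 50 time steps
print(model(x).shape)                      # torch.Size([4, 2])
```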

2.2 Deep unsupervised learning

Deep unsupervised models have gained significant interest as a mainstream of viable deep learning models. These models are widely used to generate systems that can be trained with few numbers of unlabeled samples [ 24 ]. The models can be classified into auto-encoders , restricted Boltzmann machine, deep belief neural networks, and generative adversarial networks. An auto-encoder (AE) is a type of auto-associative feed-forward neural network that can learn effective representations from the given input in an unsupervised manner [ 29 ]. Figure  3 A provides a basic architecture of AE. As it can be seen, there are three elements in AE, encoder, latent space, and decoder. Initially, the corresponding input passes through the encoder. The encoder is mostly a fully connected ANN that is able to generate the code. In contrast, the decoder generates the outputs using the codes and has an architecture similar to ANN. The aim of having an encoder and decoder is to present an identical output with the given input. It is notable that the dimensionality of the input and output has to be similar. Additionally, real-world data usually suffer from redundancy and high dimensionality, resulting in lower computational efficiency and hindering the modeling of the representation. Thus, a latent space can address this issue by representing compressed data and learning the features of the data, and facilitating data representations to find patterns. As shown in Table 2 , AE consists of several known models, namely, stacked, variational, and convolutional AEs [ 30 , 43 ]. The advantages and disadvantages of these models are presented in Table 4 . 
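The encoder / latent-space / decoder structure can be sketched as follows (PyTorch assumed); the 784-dimensional input and 32-dimensional latent space are example sizes, and the loss simply penalizes the difference between the input and its reconstruction.

```python
# Sketch of an auto-encoder: encoder, latent space, and decoder, trained so
# that the output matches the (unlabeled) input. Dimensions are assumptions.
import torch
from torch import nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

autoencoder = nn.Sequential(encoder, decoder)
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

x = torch.rand(16, 784)                           # unlabeled inputs
reconstruction = autoencoder(x)
loss = nn.functional.mse_loss(reconstruction, x)  # output should match the input
loss.backward()
optimizer.step()

latent = encoder(x)                               # compressed representation
print(latent.shape)                               # torch.Size([16, 32])
```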

Figure 3: Inner architecture of deep unsupervised models

The restricted Boltzmann machine (RBM) model, known as Gibbs distribution, is a network of neurons that are connected to each other, as shown in Fig.  3 B. In RBM, the network consists of two layers, namely, the input or visible layer and the hidden layer. There is no output layer in RBM, while the Boltzmann machines are random and generative neural networks that can solve combinatorial problems. Some common RBM are presented in Table 2 as shallow restricted Boltzmann machines and convolutional restricted Boltzmann machines. The deep belief network (DBN) is another unsupervised deep neural network that performs in a similar way as the deep feed-forward neural network with inputs and multiple computational layers, known as hidden layers, as illustrated in Fig.  3 C. In DBN, two main phases are necessary: the pre-training and fine-tuning phases. The pre-training phase consists of several hidden layers, whereas the fine-tuning phase uses only a feed-forward neural network to train and classify the data. In addition, DBN has multiple layers with values, while there is a relation between layers but not with the values [ 31 ]. Table 2 reviews some of the known DBN models, namely, shallow deep belief neural networks and conditional deep belief neural networks [ 44 , 45 ].

The generative adversarial network (GAN) is another type of unsupervised deep learning model that uses a generator network (GN) and discriminator network (DN) to generate synthetic data that follow a similar distribution to the original data, as presented in Fig.  3 D. In this context, the GN mimics the distribution of the given data using noise vectors to exhaust the DN's ability to classify between fake and real samples. The DN is trained to differentiate the fake samples produced by the GN from the original samples. In general, the GN learns to create plausible data, whereas the DN can learn to identify the generator’s fake data from the real ones. Additionally, the discriminator can penalize the generator for generating implausible data [ 32 , 54 ]. The known types of GAN are presented in Table 2 as generative adversarial networks, signal augmented self-taught learning, and Wasserstein generative adversarial networks. As a result of this discussion, Table 4 provides the main advantages and disadvantages of the unsupervised DL categories [ 56 ].
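A single GAN training step, following the generator/discriminator roles described above, might look like the sketch below; the network sizes, the 64-dimensional noise vector, and the learning rates are illustrative, and data loading is omitted.

```python
# Sketch of one GAN training step: the discriminator (D) learns to separate
# real from generated samples, and the generator (G) learns to fool it.
import torch
from torch import nn

G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.rand(32, 784) * 2 - 1            # stand-in for a batch of real samples
noise = torch.randn(32, 64)                   # noise vectors fed to the generator

# Discriminator step: real -> 1, fake -> 0
fake = G(noise).detach()
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to make the discriminator output 1 for fake samples
g_loss = bce(D(G(noise)), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```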

2.3 Deep reinforcement learning

Reinforcement learning (RL) is the science of making decisions with learning the optimal behavior in an environment to achieve maximum reward. The optimal behavior is achieved through interactions with the environment. In RL, an agent can make decisions, monitor the results, and adjust its technique to provide optimal policy [ 75 , 76 ]. In particular, RL is applied to assist an agent in learning the optimal policy when the agent has no information about the surrounding environments. Initially, the agent monitors the current state, takes action, and receives its reward with its new state. In this context, the immediate reward and new state can adjust the agent's policy; This process is repeated till the agent’s policy is getting close to the optimal policy. To be precise, RL does not need any detailed mathematical model for the system to guarantee optimal control [ 77 ]; however, the agent considers the target system as the environment and optimizes the control policy by communicating with it. The agent performs specific steps. During every step, the agent selects an action based on its existing policy, and the environment feeds back a reward and goes to the next state [ 78 , 79 , 80 ]. This process is learned by the agent to adjust its policy by referencing the relationships during the state, action, and rewards. The RL agent also can determine an optimal policy related to the maximum cumulative reward. In addition, an RL agent can be modeled as Markov decision process (MDP) [ 78 ]. In MDP, when the states and action spaces are finite, the process is known as finite. As it is clear, the RL learning approach may take a huge amount of time to achieve the best policy and discover the knowledge of a whole system; hence, RL is inappropriate for large-scale networks [ 81 ].

In the past few years, deep reinforcement learning (DRL) was proposed as an advanced model of RL in which DL is applied as an effective tool to enhance the learning rate for RL models. The achieved experiences are stored during the real-time learning process, whereas the generated data for training and validating neural networks are applied [ 82 ]. In this context, the trained neural network has to be used to assist the agent in making optimal decisions in real-time scenarios. DRL overcomes the main shortcomings of RL, such as long processing time to achieve optimal policy, thus opening a new horizon to embrace the DRL [ 83 ]. In general, as shown in Fig.  4 , DRL uses the deep neural networks’ characteristics to train the learning process, resulting in increasing the speed and improving the algorithms’ performance. In DRL, within the environment or agent interactions, the deep neural networks keep the internal policy of the agent, which indicates the next action according to the current state of the environment.

Figure 4: Inner architecture of deep reinforcement learning

DRL can be divided into three methods, value-based, policy-based, and model-based methods. Value-based DRL mainly represents and finds the value functions and their optimal ones. In such methods, the agent learns the state or state-action value and behaves based on the best action in the state. One necessary step of these methods is to explore the environment. Some known instances of value-based DRL are deep Q-learning, double deep Q-learning, and duel deep Q-learning [ 83 , 84 , 85 ]. On the contrary, policy-based DRL finds an optimal policy, stochastic or deterministic, to better convergence on high-dimensional or continuous action space. These methods are mainly optimization techniques in which the maximum policy of function can be found. Some examples of policy-based DRL are deep deterministic policy gradient and asynchronous advantage actor critic [ 86 ]. The third category of DRL, model-based methods, aims at learning the functionality of the environment and its dynamics from its previous observations, while these methods attempt a solution using the specific model. For these methods, in the case of having a model, they find the best policy to be efficient, while the process may fail when the state space is huge. In model-based DRL, the model is often updated, and the process is replanned. Instances of model-based DRL are imagination-augmented agents, model-based priors for model-free, and model-based value expansion. Table 5 illustrates the important advantages and disadvantages of these categories [ 87 , 88 , 89 ].
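As an example of the value-based family, the sketch below shows one deep Q-learning update toward the Bellman target; the state and action sizes, the discount factor, and the single hand-made transition are assumptions, and practical ingredients such as a replay buffer and a separate target network are omitted for brevity.

```python
# Sketch of a value-based DRL update (deep Q-learning): a neural network
# approximates Q(s, a) and is trained toward the Bellman target.
import torch
from torch import nn

n_states, n_actions, gamma = 8, 4, 0.99
q_net = nn.Sequential(nn.Linear(n_states, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# One transition (s, a, r, s') sampled from interaction with the environment
s = torch.randn(1, n_states)
a = torch.tensor([2])
r = torch.tensor([1.0])
s_next = torch.randn(1, n_states)

with torch.no_grad():                          # Bellman target: r + gamma * max_a' Q(s', a')
    target = r + gamma * q_net(s_next).max(dim=1).values

q_sa = q_net(s).gather(1, a.view(1, 1)).squeeze(1)   # Q(s, a) for the taken action
loss = nn.functional.mse_loss(q_sa, target)
loss.backward()
optimizer.step()
```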

2.4 Hybrid deep learning

Deep learning models have weaknesses and strengths in terms of hyperparameter tuning settings and data explorations [ 45 ]. Therefore, the highlighted weakness of these models can hinder them from being strong techniques in different applications. Every DL model also has characteristics that make it efficient for specific applications; hence, to overcome these shortcomings, hybrid DL models have been proposed based on individual DL models to tackle the shortcomings of specific applications [ 79 , 80 , 81 , 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 ]. Figure  5 indicates the popular hybrid DL models that are used in the literature. It is observed that convolutional neural networks and recurrent neural networks are widely used in existing studies and have high applicability and potentiality compared to other developed DL models.

Figure 5: Review of popular hybrid models

2.5 Evaluation metrics

In any classification task, metrics are required to evaluate DL models. It is worth mentioning that various metrics can be used in different fields of study; the metrics used in medical analysis are mostly different from those in other domains, such as cybersecurity or computer vision. For this reason, we provide short descriptions and the mathematical equations of the most common metrics in different domains, as follows:

Accuracy: It is mainly used in classification problems to indicate the correct predictions made by a DL model. This metric is calculated, as shown in Eq. ( 1 ), where \({T}_{\mathrm{P}}\) is the true positive, \({T}_{\mathrm{N}}\) is the true negative, \({F}_{\mathrm{P}}\) is the false positive, and \({F}_{\mathrm{N}}\) is the false negative.

\(\mathrm{Accuracy}=\frac{{T}_{\mathrm{P}}+{T}_{\mathrm{N}}}{{T}_{\mathrm{P}}+{T}_{\mathrm{N}}+{F}_{\mathrm{P}}+{F}_{\mathrm{N}}}\)  (1)

Precision: It refers to the number of the true positives divided by the total number of the positive predictions, including true positives and false positives. This metric can be measured as follows:

\(\mathrm{Precision}=\frac{{T}_{\mathrm{P}}}{{T}_{\mathrm{P}}+{F}_{\mathrm{P}}}\)  (2)

Recall (detection rate): It measures the ratio of the positive samples that are classified correctly to the total number of positive samples. This metric, as measured in Eq. ( 3 ), can indicate the model’s ability to classify positive samples among other samples.

\(\mathrm{Recall}=\frac{{T}_{\mathrm{P}}}{{T}_{\mathrm{P}}+{F}_{\mathrm{N}}}\)  (3)

F1-Score: It is calculated from the precision and recall of the test, where the precision is defined in Eq. ( 2 ) and the recall is presented in Eq. ( 3 ). This metric is calculated as shown in Eq. ( 4 ):

\(\mathrm{F1}=\frac{2\times \mathrm{Precision}\times \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}\)  (4)

Area under the receiver operating characteristics curve (AUC): AUC is one of the important metrics in classification problems. The receiver operating characteristic (ROC) helps to visualize the tradeoff between sensitivity and specificity in DL models. The ROC curve is a plot of the true-positive rate (TPR) against the false-positive rate (FPR). A good DL model has an AUC value near 1. This metric is measured, as shown in Eq. ( 5 ), where x is the varying threshold parameter.

\(\mathrm{AUC}={\int }_{0}^{1}\mathrm{TPR}\left(x\right)\,\mathrm{d}\left(\mathrm{FPR}\left(x\right)\right)\)  (5)

False Alarm Rate: This metric is also known as the false-positive rate, which is the probability that a false alarm will be raised, i.e., that a positive result will be given when the true value is negative. This metric can be measured as shown in Eq. ( 6 ):

\(\mathrm{FAR}=\frac{{F}_{\mathrm{P}}}{{F}_{\mathrm{P}}+{T}_{\mathrm{N}}}\)  (6)

Misdetection Rate: It is a metric that shows the percentage of misclassified samples. This metric can be defined as the percentage of the positive samples that are not detected. It is measured as shown in Eq. ( 7 ):

\(\mathrm{MDR}=\frac{{F}_{\mathrm{N}}}{{F}_{\mathrm{N}}+{T}_{\mathrm{P}}}\)  (7)
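The sketch below computes the metrics of Eqs. (1)-(4), (6), and (7) directly from confusion-matrix counts (AUC is omitted because it requires score-based thresholds); the example counts are made up.

```python
# Sketch computing the classification metrics above from confusion-matrix
# counts; the example counts are invented.
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                    # detection rate
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "false_alarm_rate": fp / (fp + tn),    # false-positive rate
        "misdetection_rate": fn / (fn + tp),   # missed positive samples
    }

print(classification_metrics(tp=80, tn=90, fp=10, fn=20))
```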

3 Learning classification in deep learning models

Learning strategies, as shown in Fig.  6 , include online learning, transfer learning, and federated learning. In this section, these learning strategies are discussed in brief.

Figure 6: Review of learning classification in deep learning models

3.1 Online learning

Conventional machine learning models mostly employ batch learning methods, in which a collection of training data is provided in advance to the model. This learning method requires the whole training dataset to be made accessible ahead of training, which leads to high memory usage and poor scalability. On the other hand, online learning is a machine learning category where data are processed in sequential order, and the model is updated accordingly [ 90 ]. The purpose of online learning is to maximize the accuracy of the prediction model using the ground truth of previous predictions [ 91 ]. Unlike batch or offline machine learning approaches, which require the complete training dataset to be available to be trained on [ 92 ], online learning models use a sequential stream of data to update their parameters after each data instance. Online learning is mainly optimal when the entire dataset is unavailable or the environment is dynamically changing [ 92 , 93 , 94 , 95 , 96 ]. On the other hand, batch learning is easier to maintain and less complex; it requires all the data to be available to be trained on and does not update its model. Table 6 shows the advantages and disadvantages of batch learning and online learning.

An online model aims to learn a hypothesis \({\mathcal{H}}:X \to Y\) Where \(X\) is the input space, and \(Y\) is the output space. At each time step \(t\) , a new data instance \({\varvec{x}}_{{\varvec{t}}} \in X\) is received, and an output or prediction \(\hat{y}_{t}\) is generated using the mapping function \({\mathcal{H}}\left( {x_{t} ,w_{t} } \right) = \hat{y}_{t}\) , where \({{\varvec{w}}}_{{\varvec{t}}}\) is the weights’ vector of the online model at the time step \(t\) . The true class label \({y}_{t}\) is then utilized to calculate the loss and update the weights of the model \({\varvec{w}}_{{{\varvec{t}} + 1}}\) , which is illustrated in Fig.  7 [ 97 ].
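A bare-bones version of this loop, with a logistic model standing in for the hypothesis H(x_t, w_t) and a synthetic data stream, is sketched below; the learning rate and the hidden "true" weight vector are arbitrary choices for illustration.

```python
# Sketch of the online update loop: each instance x_t produces a prediction,
# then the revealed label y_t is used to update the weights w_{t+1}.
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(5)                               # w_t: weights of the online model
lr = 0.1

for t in range(1000):
    x_t = rng.normal(size=5)                  # new data instance from the stream
    y_t = float(x_t @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) > 0)  # revealed true label
    p = 1 / (1 + np.exp(-w @ x_t))            # H(x_t, w_t): probability estimate
    y_hat = float(p > 0.5)                    # prediction for this round
    w -= lr * (p - y_t) * x_t                 # update from the logistic-loss gradient

print("learned weights:", np.round(w, 2))
```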

Figure 7: Online machine learning process

The number of mistakes committed by the online model across T time steps is defined as \({M}_{T}\) for \(\hat{y}_{t} \ne y_{t}\) [ 55 ]. The goal of an online learning model is to minimize the total loss of the online model performance compared to the best model in hindsight, which is defined as [ 35 ]

\({\mathrm{Regret}}_{T}=\sum_{t=1}^{T}\ell \left({\mathcal{H}}\left({x}_{t},{w}_{t}\right),{y}_{t}\right)-\underset{w}{\mathrm{min}}\sum_{t=1}^{T}\ell \left({\mathcal{H}}\left({x}_{t},w\right),{y}_{t}\right)\)  (8)

where the first term is the sum of the loss function at time step t, and the second term is the loss function of the best model after seeing all the instances [ 98 , 99 ]. While training the online model, different approaches can be adopted regarding data that the model has already trained on; full memory, in which the model preserves all training data instances; partial memory, where the model retains only some of the training data instances; and no memory, in which it remembers none. Two main techniques are utilized to remove training data instances: passive forgetting and active forgetting [ 107 , 108 , 109 ]

Passive forgetting only considers the amount of time that has passed since the training data instances were received by the model, which implies that the significance of data diminishes over time.

Active forgetting , on the other hand, requires additional information from the utilized training data in order to determine which objects to remove. The density-based forgetting and error-based forgetting are two active forgetting techniques.

Online learning techniques can be classified into three categories: online learning with full feedback, online learning with partial feedback, and online learning with no feedback. Online learning with full feedback is when all training data instances \(x\) have a corresponding true label \(y\) which is always disclosed to the model at the end of each online learning round. Online learning with partial feedback is when only partial feedback information is received that shows if the prediction is correct or not, rather than the corresponding true label explicitly. In this category, the online learning model is required to make online updates by seeking to maintain a balance between the exploitation of revealed knowledge and the exploration of unknown information with the environment [ 2 ]. On the other hand, online learning with no feedback is when only the training data are fed to the model without the ground truth or feedback. This category includes online clustering and dimension reduction [ 99 , 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111 ].

3.2 Deep transfer learning

Training deep learning models from scratch needs extensive computational and memory resources and large amounts of labeled datasets. However, for some types of scenarios, huge, annotated datasets are not always available. Additionally, developing such datasets requires a great deal of time and is a costly operation. Transfer learning (TL) has been proposed as an alternative for training deep learning models [ 112 ]. In TL, the obtained knowledge from another domain can be easily transferred to target another classification problem. TL saves computing resources and increases efficiency in training new deep learning models. TL can also help train deep learning models on available annotated datasets before validating them on unlabeled data [ 113 , 114 ]. Figure  8 illustrates a simple visualization of the deep transfer learning, which can transfer valuable knowledge by further using the learning ability of neural networks.

Figure 8: Visualization of deep transfer learning

In this survey, the deep transfer learning techniques are classified based on the generalization viewpoints between deep learning models and domains into four categories, namely, instance, feature representation, model parameter, and relational knowledge-based techniques. In the following, we briefly discuss these categories with their categorizations, as illustrated in Fig.  9 .

Figure 9: Categories of deep transfer learning

3.2.1 Instance-based

Instance-based TL techniques are performed based on the selected instance or on selecting different weights for instances. In such techniques, the TL aims at training a more accurate model under a transfer scenario, in which the difference between a source and a target comes from the different marginal probability distributions or conditional probability distributions [ 62 ]. Instance-based TL presents the labeled samples that are only limited to training a classification model in the target domain. This technique can directly merge the source data into the target data, resulting in decreasing the target model performance and a negative transfer during training [ 109 , 110 , 111 ]. The main goal of instance-based TL is to single out the instances in the source domains. Such a process can have a positive impact on the training of the models in the target domain as well as augmenting the target data through particular weighting techniques. In this context, a viable solution is to learn the weights of the source domains' instances automatically in an objective function. The objective function is given by:

where \({W}_{i}\) is the weighting coefficient of the given source instance, \({C}^{s}\) represents the risks function of the selected source instance, and \({\vartheta }^{*}\) is the second risk function related to the target task or the parameter regularization.

The weighting coefficient of the given source instance can be computed as the ratio of the marginal probability distribution between source and target domains. Instance-based TL can be categorized into two subcategories, weight estimation and heuristic re-weighting-based techniques [ 63 ]. A weight estimation method can focus on scenarios in which there are limited labeled instances in the target domain, converting the instance transfer problem into the weight estimation problem using kernel embedding techniques. In contrast, a heuristic re-weighting technique is more effective for developing deep TL tasks that have labeled instances and are available in the target domains [ 64 ]. This technique aims at detecting negative source instances by applying instance re-weighting approaches in a heuristic manner. One of the known instance re-weighting approaches is the transfer adaptive boosting algorithm, in which the weights of the source and target instances are updated via several iterations [ 116 ].

3.2.2 Feature representation-based

Feature representation-based TL models can share or learn a common feature representation between a target and a source domain. This category uses models with the ability to transfer knowledge by learning similar representations at the feature space level. Its main aim is to learn the mapping function as a bridge to transfer raw data in source and target domains from various feature spaces to a latent feature space [ 109 ]. From a general perspective, feature representation-based TL covers two transfer styles, with or without adapting to the target domain [ 110 ]. Techniques without adapting to the target domain can extract representations as inputs for the target models; however, the techniques with adapting to the target domain can extract feature representations across various domains via domain adaptation techniques [ 112 ]. In general, techniques with adapting to the target domain are hard to implement, and their assumptions are weakly justified in most cases. On the contrary, techniques without adapting to the target domain are easy to implement, and their assumptions can be strong in different scenarios [ 111 ].

One important challenge in feature representation TL with domain adaptation is estimating representation invariance between source and target domains. There are three techniques to build representation invariance: discrepancy-based, adversarial-based, and reconstruction-based. Discrepancy-based techniques can improve the ability to learn transferable representations and decrease the discrepancy, based on distance metrics, between a given source and target, while adversarial-based techniques are inspired by GANs and provide the neural network with the ability to learn domain-invariant representations. In reconstruction-based techniques, auto-encoder neural networks with specific task classifiers are combined to optimize the encoder architecture, which takes domain-specific representations and shares an encoder that learns representations between different domains [ 113 ].

3.2.3 Model parameter-based

Model parameter-based TL can share the neural network architecture and parameters between target and source domains. This category conveys the assumptions that the source and target domains share in common. In such a technique, transferable knowledge is embedded into the pre-trained source model. This pre-trained source model has a particular architecture with some parameters in the target model [ 99 ]. The aim of this process is to use a section of the pre-trained model in the source domain, which can improve the learning process in the target domain. These techniques are performed based on the assumption that labeled instances in the target domain are available during the training of the target model [ 99 , 100 , 101 , 102 , 103 ]. Model parameter-based TL is divided into two categories, sequential and joint training. In sequential training, the target deep model can be established by pretraining a model on an auxiliary domain. However, joint training focuses on developing the source and target tasks at the same time. There are two methods to perform joint training [ 104 ]. The first method is hard parameter sharing, which shares the hidden layers directly while maintaining the task-specific layers independently [ 99 , 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 ]. The second method is soft parameter sharing, which changes the weight coefficient of the source and target tasks and adds regularization to the risk function. Table 7 shows the advantages and disadvantages of the three categories, instance-based, feature representation-based, and model parameter-based.
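The sketch below illustrates the sequential, parameter-sharing style of transfer described above: a stand-in "pre-trained" source network is reused, its shared layers are frozen, and only a new target head is trained. The layer sizes and class counts are invented, and a randomly initialized network takes the place of a genuinely pre-trained source model.

```python
# Sketch of model parameter-based transfer: freeze the shared source layers
# and train only a new task-specific head on the target data.
import torch
from torch import nn

# Stand-in for a model pre-trained on the source domain
source_model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 10),             # source task head (10 source classes)
)

# Transfer: keep the shared feature layers, replace the task-specific head
shared_layers = source_model[:-1]
for p in shared_layers.parameters():
    p.requires_grad = False        # hard-shared, frozen parameters

target_model = nn.Sequential(shared_layers, nn.Linear(64, 3))  # 3 target classes

# Only the new head's parameters are updated on the target data
optimizer = torch.optim.Adam(
    (p for p in target_model.parameters() if p.requires_grad), lr=1e-3)
x, y = torch.randn(16, 32), torch.randint(0, 3, (16,))
loss = nn.CrossEntropyLoss()(target_model(x), y)
loss.backward()
optimizer.step()
```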

3.3 Deep federated learning

In traditional centralized DL, the collected data have to be stored on local devices, such as personal computers [ 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81 , 82 , 83 , 84 , 85 , 86 , 87 ]. In general, traditional centralized DL can store the user data on the central server and apply it for training and testing purposes, as illustrated in Fig.  10 A, while this process may deal with several shortcomings, such as high computational power, low security, and privacy. In such models, the efficiency and accuracy of the models heavily depend on the computational power and training process of the given data on a centralized server. As a result, centralized DL models not only provide low privacy and high risks of data leakage but also indicate the high demands on storage and computing capacities of the several machines which train the models in parallel. Therefore, federated learning (FL) was proposed as an emerging technology to address such challenges [ 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 ].

Figure 10: Centralized and federated learning process flow

FL provides solutions to keep the users’ privacy by decentralizing data from the corresponding central server to devices and enabling artificial intelligence (AI) methods to discipline the data. Figure  10 B summarizes the main process in an FL model. In particular, coping with the unavailability of sufficient data, the high computational power requirement, and the limited level of privacy of local data are three major benefits of FL AI over centralized AI [ 115 , 116 , 117 , 118 , 119 ]. For this purpose, FL models aim at training a global model which can be trained on data distributed on several devices while they can protect the data. In this context, FL finds an optimal global model, known as \(\theta\) , that can minimize the aggregated local loss function, \({f}_{k}\) ( \({\theta }^{k}\) ), as shown in Eq. ( 10 ).

where X denotes the data feature, y is the data label, \({n}_{k}\) is the local data size, C is the ratio in which the local clients do not participate in every round of the models’ updates, l is the loss function, k is the client index, and \(\sum_{k=1}^{C*k}{n}_{k}\) shows the total number of sample pairs. FL can be classified based on the characteristics of the data distribution among the clients into two types, namely, vertical and horizontal FL models, as discussed in the following:

3.3.1 Horizontal federated learning

Horizontal FL, also known as homogeneous FL, covers the cases in which the given training data of the participating clients share a similar feature space, while the corresponding data have different sample spaces [ 76 ]. For example, Client one and Client two may have several data rows with similar features, whereas each row holds specific data for a unique client. A typical common algorithm, namely, federated averaging (FedAvg), is usually used as a horizontal FL algorithm. FedAvg is one of the most efficient algorithms for distributed training across multiple clients. In such an algorithm, clients keep the data local for protecting their privacy, while central parameters are applied to communicate between different clients [ 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81 , 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 , 91 , 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 , 120 , 121 , 122 ].
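A minimal sketch of the FedAvg aggregation step is given below, assuming PyTorch: the server averages client parameters weighted by local dataset sizes. The client models, data sizes, and the omission of local training loops and communication are simplifications for illustration.

```python
# Sketch of the FedAvg aggregation step: the server averages client model
# parameters weighted by local data sizes. Local training is omitted.
import torch
from torch import nn

def make_model():
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

clients = [make_model() for _ in range(3)]    # stand-ins for locally trained client models
n_k = [120, 300, 80]                          # local dataset sizes
n_total = sum(n_k)

global_model = make_model()
global_state = global_model.state_dict()
for key in global_state:
    global_state[key] = sum(
        (n_k[i] / n_total) * clients[i].state_dict()[key] for i in range(3)
    )
global_model.load_state_dict(global_state)    # updated global model sent back to clients
```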

In addition, horizontal FL provides efficient solutions to avoid leaking private local data. This can happen since the global and local model parameters are only permitted to communicate between the servers and clients, whereas all the given training data are stored on the client devices without being accessed by any other parties [ 14 , 119 , 120 , 121 , 122 , 123 , 124 , 125 , 126 , 127 , 128 , 129 , 130 , 131 , 132 , 133 ]. Despite such advantages, constant downloading and uploading in horizontal FL may consume huge amounts of communication resources. In deep learning models, the situation gets worse due to the need for huge amounts of computation and memory resources. To address such issues, several studies have been performed to reduce the computational cost of horizontal FL models [ 134 ]. These studies proposed methods to reduce communication costs using multi-objective evolutionary algorithms, model quantization, and sub-sampling techniques. In these studies, although no private data can be accessed directly by any third party, the uploaded model parameters or gradients may still leak the data of every client [ 135 ].

3.3.2 Vertical federated learning

Vertical FL, also known as heterogeneous FL, is a type of FL in which users’ training data share the same sample space while they have multiple different feature spaces. For example, Client one and Client two may have similar data samples with different feature spaces; all clients have their own local data, and it is mostly assumed that one client keeps all the data classes. Such clients with data labels are known as guest parties or active parties, and clients without labels are known as host parties [ 136 ]. In particular, in vertical FL, the common data between unrelated domains are mainly applied to train global DL models [ 137 ]. In this context, participants may use intermediate third-party resources to apply encryption logic that guarantees the data statistics are kept private. Although it is not necessary to use third parties in this process, studies have demonstrated that vertical FL models with third parties using encryption techniques provide more acceptable results [ 14 , 89 , 90 , 91 , 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 , 120 , 121 , 122 , 123 , 124 , 125 , 126 , 127 , 128 , 129 , 130 , 131 , 132 , 133 , 134 , 135 , 136 , 137 , 138 ].

In contrast with horizontal FL, training parametric models in vertical FL has two benefits. Firstly, trained models in vertical FL have a similar performance to centralized models. As a matter of fact, the computed loss function in vertical FL is the same as the loss function in centralized models. Secondly, vertical FL often consumes fewer communication resources compared to horizontal FL [ 138 ]. Vertical FL consumes more communication resources than horizontal FL only when the data size is huge. In vertical FL, privacy preservation is the main challenge. For this purpose, several studies have been conducted to investigate privacy preservation in vertical FL, using identity resolution schemes, protocols, and vertical decision learning schemes. Although these approaches improve the vertical FL models, there are still some main differences between horizontal and vertical FL [ 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 , 120 , 121 , 122 , 123 , 124 , 125 , 126 , 127 , 128 , 129 , 130 , 131 , 132 , 133 , 134 , 135 , 136 , 137 , 138 , 139 , 140 , 141 , 142 , 143 ].

Horizontal FL includes a server that aggregates the global model. In contrast, vertical FL has no central server and no single global model [ 14 , 122 , 123 , 124 , 125 , 126 , 127 , 128 , 129 , 130 ]. As a result, the outputs of the local models are aggregated by the guest client to build a proper loss function. Another difference lies in what is exchanged: in horizontal FL, model parameters or gradients are communicated between the server and the clients, whereas in vertical FL, the local model parameters depend on each client's local feature space and the guest client receives model outputs from the connected host clients [ 143 ]. In this process, the intermediate gradient values are sent back to update the local models [ 105 ]. Ultimately, the server and the clients communicate with one another once per communication round in horizontal FL, whereas the guest and host clients have to send and receive data several times within a communication round in vertical FL [ 14 , 106 , 107 , 108 , 109 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 , 120 , 121 , 122 , 123 , 124 , 125 , 126 , 127 , 128 ]. Table 8 summarizes the main advantages and disadvantages of vertical and horizontal FL and compares these FL categories with central learning.
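The guest-host exchange described above can be sketched with a split linear model. The example below is purely illustrative and omits the encryption step: the host sends only its partial predictions to the guest, the guest (which holds the labels) computes the residual and returns this intermediate gradient signal, and each party updates its own parameters without exposing raw features or labels.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X_guest = rng.normal(size=(n, 2))   # guest's feature space (the guest also holds labels y)
X_host = rng.normal(size=(n, 3))    # host's feature space for the same samples
y = X_guest @ np.array([1.0, -2.0]) + X_host @ np.array([0.5, 0.0, 1.5])

w_guest = np.zeros(2)
w_host = np.zeros(3)
lr = 0.05

for step in range(200):
    # host computes its partial prediction and sends only this vector to the guest
    partial_host = X_host @ w_host
    # guest combines the partial outputs and computes the residual using its labels
    residual = (X_guest @ w_guest + partial_host) - y
    # guest updates its own parameters locally
    w_guest -= lr * X_guest.T @ residual / n
    # guest sends the intermediate gradient signal (residual) back to the host,
    # which updates its parameters without ever seeing the labels or guest features
    w_host -= lr * X_host.T @ residual / n

print(w_guest, w_host)
```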

6 Challenges and future directions

Deep learning models, while powerful and versatile, face several significant challenges. Addressing these challenges requires a multidisciplinary approach involving data collection and preprocessing techniques, algorithmic enhancements, fairness-aware model training, interpretability methods, safe learning, models robust to adversarial attacks, and collaboration with domain experts and affected communities to push the boundaries of deep learning and realize its full potential. A brief description of each of these challenges is given below.

6.1 Data availability and quality

Deep learning models require large amounts of labeled training data to learn effectively. However, obtaining sufficient high-quality labeled data can be expensive, time-consuming, or challenging, particularly in specialized domains or when dealing with sensitive data such as cybersecurity data. Although there are several approaches, such as data augmentation, for generating additional data, it can still be cumbersome to produce enough training data to satisfy the requirements of DL models. In addition, a small dataset may lead to overfitting, where DL models perform well on the training data but fail to generalize to unseen data. Balancing model complexity and regularization techniques to avoid overfitting while achieving good generalization is a challenge in deep learning. In addition, exploring techniques to improve data efficiency, such as few-shot learning, active learning, or semi-supervised learning, remains an active area of research.
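Where labeled data are scarce, simple label-preserving transformations can stretch a small dataset before more advanced techniques are considered. The sketch below is a minimal NumPy illustration on hypothetical image arrays; practical pipelines would rely on richer, domain-appropriate augmentations.

```python
import numpy as np

def augment(image, rng):
    """Generate a few label-preserving variants of a single image array of shape (H, W)."""
    variants = [image]
    variants.append(np.fliplr(image))                      # horizontal flip
    variants.append(np.rot90(image))                       # 90-degree rotation
    noisy = image + rng.normal(scale=0.05, size=image.shape)
    variants.append(np.clip(noisy, 0.0, 1.0))              # small additive Gaussian noise
    return variants

rng = np.random.default_rng(42)
small_dataset = [rng.random((28, 28)) for _ in range(10)]  # stand-in for 10 labeled images
augmented = [v for img in small_dataset for v in augment(img, rng)]
print(len(small_dataset), "->", len(augmented))            # 10 -> 40
```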

6.2 Ethics and fairness

The challenge of ethics and fairness in deep learning underscores the critical need to address biases, discrimination, and social implications embedded within these models. Deep learning systems learn patterns from vast and potentially biased datasets, which can perpetuate and amplify societal prejudices, leading to unfair or unjust outcomes. The ethical dilemma lies in the potential for these models to unintentionally marginalize certain groups or reinforce systemic disparities. As deep learning is increasingly integrated into decision-making processes across domains such as hiring, lending, and criminal justice, ensuring fairness and transparency becomes paramount. Striving for ethical deep learning involves not only detecting and mitigating biases but also establishing guidelines and standards that prioritize equitable treatment, encompassing a multidisciplinary effort to foster responsible AI innovation for the betterment of society.

6.3 Interpretability and explainability

Interpretability and explainability of deep learning pose significant challenges in understanding the inner workings of complex models. As deep neural networks become more intricate, with numerous layers and parameters, their decision-making processes often resemble “black boxes,” making it difficult to discern how and why specific predictions are made. This lack of transparency hinders the trust and adoption of these models, especially in high-stakes applications like health care and finance. Striking a balance between model performance and comprehensibility is crucial to ensure that stakeholders, including researchers, regulators, and end-users, can gain meaningful insights into the model's reasoning, enabling informed decisions and accountability while navigating the intricate landscape of modern deep learning.

6.4 Robustness to adversarial attacks

Deep learning models are susceptible to adversarial attacks, a concerning vulnerability that highlights the fragility of their decision boundaries. Adversarial attacks involve making small, carefully crafted perturbations to input data, often imperceptible to humans, which can lead to misclassification or erroneous outputs from the model. These attacks exploit the model's sensitivity to subtle changes in its input space, revealing a lack of robustness in real-world scenarios. Adversarial attacks not only challenge the reliability of deep learning systems in critical applications such as autonomous vehicles and security systems but also underscore the need for developing advanced defense mechanisms and more resilient models that can withstand these intentional manipulations. Therefore, developing robust models that can withstand such attacks and maintaining the security of models and data are of high importance.
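The fast gradient sign method (FGSM) is one of the simplest examples of such a perturbation. The sketch below uses PyTorch with a toy, untrained classifier and random data, so the model and inputs are purely illustrative; the point is only to show how a small, sign-based step along the input gradient of the loss produces an adversarial input.

```python
import torch
import torch.nn as nn

# toy classifier standing in for a trained model
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 20, requires_grad=True)   # a batch of inputs
y = torch.randint(0, 2, (8,))                # their true labels

# forward/backward pass to obtain the gradient of the loss w.r.t. the input
loss = loss_fn(model(x), y)
loss.backward()

epsilon = 0.1
x_adv = (x + epsilon * x.grad.sign()).detach()   # FGSM perturbation of the input

with torch.no_grad():
    clean_pred = model(x).argmax(dim=1)
    adv_pred = model(x_adv).argmax(dim=1)
# report how many predictions flipped under the perturbation
print("predictions changed on", (clean_pred != adv_pred).sum().item(), "of 8 inputs")
```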

6.5 Catastrophic forgetting

Catastrophic forgetting, or catastrophic interference, is a phenomenon that can occur in online deep learning, where a model forgets or loses previously learned information when it learns new information. This can lead to a degradation in performance on tasks that were previously well-learned as the model adjusts to new data. This catastrophic forgetting is particularly problematic because deep neural networks often have a large number of parameters and complex representations. When a neural network is trained on new data, the optimization process may adjust the weights and connections in a way that erases the knowledge the network had about previous tasks. Therefore, there is a need for models that address this phenomenon.
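The effect is easy to reproduce in a controlled setting. The sketch below is a minimal, self-contained PyTorch example with two synthetic, deliberately conflicting tasks (all data and names are hypothetical): a small network is trained on task A, then on task B, and its accuracy on task A typically collapses toward chance.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift):
    """Hypothetical binary task: points around +shift are class 1, around -shift class 0."""
    x = torch.cat([torch.randn(200, 2) + shift, torch.randn(200, 2) - shift])
    y = torch.cat([torch.ones(200, dtype=torch.long), torch.zeros(200, dtype=torch.long)])
    return x, y

def train(model, x, y, steps=300):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
task_a = make_task(torch.tensor([3.0, 0.0]))
task_b = make_task(torch.tensor([0.0, 3.0]))   # a different decision boundary

train(model, *task_a)
acc_a_before = accuracy(model, *task_a)
train(model, *task_b)                          # sequential training on the new task
acc_a_after = accuracy(model, *task_a)
print(f"task A accuracy: {acc_a_before:.2f} before vs {acc_a_after:.2f} after learning task B")
```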

6.6 Safe learning

Safe deep learning models are designed and trained with a focus on ensuring safety, reliability, and robustness. These models are built to minimize risks associated with uncertainty, hazards, errors, and other potential failures that can arise in the deployment and operation of artificial intelligence systems. DL models deployed in ground or aerial robots without safety and risk considerations can lead to unsafe outcomes, serious damage, and even casualties. The relevant safety properties include estimating risks, dealing with uncertainty in data, and detecting abnormal system behaviors and unforeseen events to ensure safety and avoid catastrophic failures and hazards. Research in this area is still at a very early stage.

6.7 Transfer learning and adaptation

Transfer learning and adaptation present complex challenges in the realm of deep learning. While pretraining models on large datasets can capture valuable features and representations, effectively transferring this knowledge to new tasks or domains requires overcoming hurdles related to differences in data distributions, semantic gaps, and contextual variations. Adapting pre-trained models to specific target tasks demands careful fine-tuning, domain adaptation, or designing novel architectures that can accommodate varying input modalities and semantics. The challenge lies in striking a balance between leveraging the knowledge gained from pretraining and tailoring the model to extract meaningful insights from the new data, ensuring that the transferred representations are both relevant and accurate. Successfully addressing the intricacies of transfer learning and adaptation in deep learning holds the key to unlocking the full potential of AI across diverse applications and domains.
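A common concrete form of this is freezing a pretrained backbone and fine-tuning only a new task-specific head. The PyTorch/torchvision sketch below illustrates the pattern; it uses weights=None and a random batch so that it runs offline, whereas real transfer learning would load pretrained weights and feed actual target-domain data.

```python
import torch
import torch.nn as nn
from torchvision import models

# in real transfer learning the backbone would be loaded with pretrained weights,
# e.g. models.resnet18(weights="IMAGENET1K_V1"); weights=None keeps this sketch offline
backbone = models.resnet18(weights=None)

# freeze every backbone parameter so only the new head is adapted
for param in backbone.parameters():
    param.requires_grad = False

# replace the final classification layer with a head for the new target task
num_target_classes = 5   # hypothetical number of classes in the target domain
backbone.fc = nn.Linear(backbone.fc.in_features, num_target_classes)

# only the head's parameters are passed to the optimizer
optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# one illustrative fine-tuning step on a random batch standing in for target data
images = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, num_target_classes, (4,))
loss = loss_fn(backbone(images), labels)
loss.backward()
optimizer.step()
print(loss.item())
```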

7 Conclusions

In recent years, deep learning has emerged as a prominent data-driven approach across diverse fields. Its significance lies in its capacity to reshape entire industries and tackle complex problems that were once challenging or insurmountable. While numerous surveys have been published on deep learning, its models, and applications, a notable proportion of these surveys has predominantly focused on supervised techniques and their potential use cases. In contrast, there has been a relative lack of emphasis on deep unsupervised and deep reinforcement learning methods. Motivated by these gaps, this survey offers a comprehensive exploration of key learning paradigms, encompassing supervised, unsupervised, reinforcement, and hybrid learning, while also describing prominent models within each category. Furthermore, it delves into cutting-edge facets of deep learning, including transfer learning, online learning, and federated learning. The survey finishes by outlining critical challenges and charting prospective pathways, thereby illuminating forthcoming research trends across diverse domains.

Data availability

Not applicable.

Tan C, Sun F, Kong T, Zhang W, Yang C, Liu C (2018) A survey on deep transfer learning. In: International conference on artificial neural networks, Springer, Berlin; p 270–279

Tang B, Chen Z, Hefferman G, Pei S, Wei T, He H, Yang Q (2017) Incorporating intelligence in fog computing for big data analysis in smart cities. IEEE Trans Ind Informatics 13:2140–2150


Khoei TT, Aissou G, Al Shamaileh K, Devabhaktuni VK, Kaabouch N (2023) Supervised deep learning models for detecting GPS spoofing attacks on unmanned aerial vehicles. In: 2023 IEEE international conference on electro information technology (eIT), Romeoville, IL, USA, pp 340–346. https://doi.org/10.1109/eIT57321.2023.10187274


Nguyen TT, Nguyen QVH, Nguyen DT, Nguyen DT, Huynh-The T, Nahavandi S, Nguyen TT, Pham QV, Nguyen CM (2022) Deep learning for deepfakes creation and detection: a survey. Comput Vis Image Underst 223:103525

Dong S, Wang P, Abbas K (2021) A survey on deep learning and its applications. Comput Scie Rev 40:100379


Ni J, Young T, Pandelea V, Xue F, Cambria E (2022) Recent advances in deep learning based dialogue systems: a systematic survey. Artif Intell Rev 56:1–101

Piccialli F, Di Somma V, Giampaolo F, Cuomo S, Fortino G (2021) A survey on deep learning in medicine: why, how and when? Inf Fus 66:111–137

Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117

Hatcher WG, Yu W (2018) A survey of deep learning: platforms, applications and emerging research trends. IEEE Access 6:24411–24432. https://doi.org/10.1109/ACCESS.2018.2830661

Pouyanfar S, Sadiq S, Yan Y, Tian H, Tao Y, Reyes MP, Shyu ML, Chen SC, Iyengar SS (2018) A survey on deep learning: algorithms, techniques, and applications. ACM Comput Surv (CSUR) 51(5):1–36

Alom MZ et al (2019) A state-of-the-art survey on deep learning theory and architectures. Electronics 8(3):292. https://doi.org/10.3390/electronics8030292

Dargan S, Kumar M, Ayyagari MR, Kumar G (2020) A survey of deep learning and its applications: a new paradigm to machine learning. Arch of Computat Methods Eng 27(4):1071–1092


Johnson JM, Khoshgoftaar TM (2019) Survey on deep learning with class imbalance. J Big Data 6(1):1–54

Zhang Q, Yang LT, Chen Z, Li P (2018) A survey on deep learning for big data. Inf Fus 42:146–157

Hu X, Chu L, Pei J, Liu W, Bian J (2021) Model complexity of deep learning: a survey. Knowl Inf Syst 63(10):2585–2619

Dubey SR, Singh SK, Chaudhuri BB (2022) Activation functions in deep learning: a comprehensive survey and benchmark. Neurocomputing 503:92–108

Berman D, Buczak A, Chavis J, Corbett C (2019) A survey of deep learning methods for cyber security. Information 10(4):122. https://doi.org/10.3390/info10040122

Tong K, Wu Y (2022) Deep learning-based detection from the perspective of small or tiny objects: a survey. Image Vis Comput 123:104471

Baduge SK, Thilakarathna S, Perera JS, Arashpour M, Sharafi P, Teodosio B, Shringi A, Mendis P (2022) Artificial intelligence and smart vision for building and construction 4.0: machine and deep learning methods and applications. Autom Constr 141:104440

Omitaomu OA, Niu H (2021) Artificial intelligence techniques in smart grid: a survey. Smart Cities 4(2):548–568. https://doi.org/10.3390/smartcities4020029

Akay A, Hess H (2019) Deep learning: current and emerging applications in medicine and technology. IEEE J Biomed Health Inform 23(3):906–920. https://doi.org/10.1109/JBHI.2019.2894713

Liu W, Wang Z, Liu X, Zeng N, Liu Y, Alsaadi FE (2017) A survey of deep neural network architectures and their applications. Neurocomputing 234:11–26

Srinidhi CL, Ciga O, Martel AL (2021) Deep neural network models for computational histopathology: a survey. Med Image Anal 67:101813

Kattenborn T, Leitloff J, Schiefer F, Hinz S (2021) Review on convolutional neural networks (CNN) in vegetation remote sensing. ISPRS J Photogramm Remote Sens 173:24–49

Tugrul B, Elfatimi E, Eryigit R (2022) Convolutional neural networks in detection of plant leaf diseases: a review. Agriculture 12(8):1192

Yadav SP, Zaidi S, Mishra A, Yadav V (2022) Survey on machine learning in speech emotion recognition and vision systems using a recurrent neural network (RNN). Arch Computat Methods Eng 29(3):1753–1770

Mai HT, Lieu QX, Kang J, Lee J (2022) A novel deep unsupervised learning-based framework for optimization of truss structures. Eng Comput 39:1–24

Jiang H, Peng M, Zhong Y, Xie H, Hao Z, Lin J, Ma X, Hu X (2022) A survey on deep learning-based change detection from high-resolution remote sensing images. Remote Sens 14(7):1552

Mousavi SM, Beroza GC (2022) Deep-learning seismology. Science 377(6607):eabm4470

Song X, Li J, Cai T, Yang S, Yang T, Liu C (2022) A survey on deep learning based knowledge tracing. Knowl-Based Syst 258:110036

Wang J, Biljecki F (2022) Unsupervised machine learning in urban studies: a systematic review of applications. Cities 129:103925

Li Y (2022) Research and application of deep learning in image recognition. In: 2022 IEEE 2nd international conference on power, electronics and computer applications (ICPECA), p 994–999

Borowiec ML, Dikow RB, Frandsen PB, McKeeken A, Valentini G, White AE (2022) Deep learning as a tool for ecology and evolution. Methods Ecol Evol 13(8):1640–1660

Wang X et al (2022) Deep reinforcement learning: a survey. IEEE Trans on Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2022.3207346

Pateria S, Subagdja B, Tan AH, Quek C (2021) Hierarchical reinforcement learning: A comprehensive survey. ACM Comput Surv (CSUR) 54(5):1–35

Amroune M (2019) Machine learning techniques applied to on-line voltage stability assessment: a review. Arch Comput Methods Eng 28:273–287

Liu S, Shi R, Huang Y, Li X, Li Z, Wang L, Mao D, Liu L, Liao S, Zhang M et al (2021) A data-driven and data-based framework for online voltage stability assessment using partial mutual information and iterated random forest. Energies 14:715

Ahmad A, Saraswat D, El Gamal A (2023) A survey on using deep learning techniques for plant disease diagnosis and recommendations for development of appropriate tools. Smart Agric Technol 3:100083

Khan A, Khan SH, Saif M, Batool A, Sohail A, Waleed Khan M (2023) A survey of deep learning techniques for the analysis of COVID-19 and their usability for detecting omicron. J Exp Theor Artif Intell. https://doi.org/10.1080/0952813X.2023.2165724

Wang C, Gong L, Wang A, Li X, Hung PCK, Xuehai Z (2017) SOLAR: services-oriented deep learning architectures. IEEE Trans Services Comput 14(1):262–273

Moshayedi AJ, Roy AS, Kolahdooz A, Shuxin Y (2022) Deep learning application pros and cons over algorithm deep learning application pros and cons over algorithm. EAI Endorsed Trans AI Robotics 1(1):1–13

Huang L, Luo R, Liu X, Hao X (2022) Spectral imaging with deep learning. Light: Sci Appl 11(1):61

Bhangale KB, Kothandaraman M (2022) Survey of deep learning paradigms for speech processing. Wireless Pers Commun 125(2):1913–1949

Khojaste-Sarakhsi M, Haghighi SS, Ghomi SF, Marchiori E (2022) Deep learning for Alzheimer’s disease diagnosis: a survey. Artif Intell Med 130:102332

Fu G, Jin Y, Sun S, Yuan Z, Butler D (2022) The role of deep learning in urban water management: a critical review. Water Res 223:118973

Kim L-W (2018) DeepX: deep learning accelerator for restricted Boltzmann machine artificial neural networks. IEEE Trans Neural Netw Learn Syst 29(5):1441–1453

Wang C, Gong L, Yu Q, Li X, Xie Y, Zhou X (2017) DLAU: a scalable deep learning accelerator unit on FPGA. IEEE Trans Comput-Aided Design Integr Circuits Syst 36(3):513–517

Dundar A, Jin J, Martini B, Culurciello E (2017) Embedded streaming deep neural networks accelerator with applications. IEEE Trans Neural Netw Learn Syst 28(7):1572–1583

De Mauro A, Greco M, Grimaldi M, Nobili G (2016) Beyond data scientists: a review of big data skills and job families. In: Proceedings of IFKAD, p 1844–1857

Lin S-B (2019) Generalization and expressivity for deep nets. IEEE Trans Neural Netw Learn Syst 30(5):1392–1406

Gopinath M, Sethuraman SC (2023) A comprehensive survey on deep learning based malware detection techniques. Comp Sci Rev 47:100529


Khalifa NE, Loey M, Mirjalili S (2022) A comprehensive survey of recent trends in deep learning for digital images augmentation. Artif Intell Rev 55:1–27

Peng S, Cao L, Zhou Y, Ouyang Z, Yang A, Li X, Jia W, Yu S (2022) A survey on deep learning for textual emotion analysis in social networks. Digital Commun Netw 8(5):745–762

Tao X, Gong X, Zhang X, Yan S, Adak C (2022) Deep learning for unsupervised anomaly localization in industrial images: a survey. IEEE Trans Instrum Meas 71:1–21. https://doi.org/10.1109/TIM.2022.3196436

Sharifani K, Amini M (2023) Machine learning and deep learning: a review of methods and applications. World Inf Technol Eng J 10(07):3897–3904

Li Q, Peng H, Li J, Xia C, Yang R, Sun L, Yu PS, He L (2022) A survey on text classification: from traditional to deep learning. ACM Trans Intell Syst Technol (TIST) 13(2):1–41

Zhou Z, Xiang Y, Hao Xu, Yi Z, Shi Di, Wang Z (2021) A novel transfer learning-based intelligent nonintrusive load-monitoring with limited measurements. IEEE Trans Instrum Meas 70:1–8

Akram MW, Li G, Jin Y, Chen X, Zhu C, Ahmad A (2020) Automatic detection of photovoltaic module defects in infrared images with isolated and develop-model transfer deep learning. Sol Energy 198:175–186

Karimipour H, Dehghantanha A, Parizi RM, Choo K-KR, Leung H (2019) A deep and scalable unsupervised machine learning system for cyber-attack detection in large-scale smart grids. IEEE Access 7:80778–80788

Moonesar IA, Dass R (2021) Artificial intelligence in health policy—a global perspective. Global J Comput Sci Technol 1:1–7

Mo Y, Wu Y, Yang X, Liu F, Liao Y (2022) Review the state-of-the-art technologies of semantic segmentation based on deep learning. Neurocomputing 493:626–646

Subramanian N, Elharrouss O, Al-Maadeed S, Chowdhury M (2022) A review of deep learning-based detection methods for COVID-19. Comput Biol Med 143:105233

Tsuneki M (2022) Deep learning models in medical image analysis. J Oral Biosci 64(3):312–320

Pan X, Lin X, Cao D, Zeng X, Yu PS, He L, Nussinov R, Cheng F (2022) Deep learning for drug repurposing: Methods, databases, and applications. Wiley Interdiscip Rev: Computat Mol Sci 12(4):e1597

Novakovsky G, Dexter N, Libbrecht MW, Wasserman WW, Mostafavi S (2023) Obtaining genetics insights from deep learning via explainable artificial intelligence. Nat Rev Genet 24(2):125–137

Fan Y, Tao B, Zheng Y, Jang S-S (2020) A data-driven soft sensor based on multilayer perceptron neural network with a double LASSO approach. IEEE Trans Instrum Meas 69(7):3972–3979

Menghani G (2023) Efficient deep learning: a survey on making deep learning models smaller, faster, and better. ACM Comput Surv 55(12):1–37

Mehrish A, Majumder N, Bharadwaj R, Mihalcea R, Poria S (2023) A review of deep learning techniques for speech processing. Inf Fus 99:101869

Mohammed A, Kora R (2023) A comprehensive review on ensemble deep learning: opportunities and challenges. J King Saud Univ-Comput Inf Sci 35:757–774

Alzubaidi L, Bai J, Al-Sabaawi A, Santamaría J, Albahri AS, Al-dabbagh BSN, Fadhel MA, Manoufali M, Zhang J, Al-Timemy AH, Duan Y (2023) A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications. J Big Data 10(1):46

Katsogiannis-Meimarakis G, Koutrika G (2023) A survey on deep learning approaches for text-to-SQL. The VLDB J. https://doi.org/10.1007/s00778-022-00776-8

Soori M, Arezoo B, Dastres R (2023) Artificial intelligence, machine learning and deep learning in advanced robotics a review. Cognitive Robotics 3:57–70

Mijwil M, Salem IE, Ismaeel MM (2023) The significance of machine learning and deep learning techniques in cybersecurity: a comprehensive review. Iraqi J Comput Sci Math 4(1):87–101

de Oliveira RA, Bollen MH (2023) Deep learning for power quality. Electr Power Syst Res 214:108887

Yin L, Gao Qi, Zhao L, Zhang B, Wang T, Li S, Liu H (2020) A review of machine learning for new generation smart dispatch in power systems. Eng Appl Artif Intell 88:103372

Luong NC et al. (2019) Applications of deep reinforcement learning in communications and networking: a survey. In: IEEE communications surveys & tutorials, vol 21, no 4, p 3133–3174, https://doi.org/10.1109/COMST.2019.2916583

Kiran BR et al (2022) Deep reinforcement learning for autonomous driving: a survey. IEEE Trans Intell Transp Syst 23(6):4909–4926. https://doi.org/10.1109/TITS.2021.3054625

Arulkumaran K, Deisenroth MP, Brundage M, Bharath AA (2017) Deep Reinforcement Learning: A Brief Survey. IEEE Signal Process Mag 34(6):26–38. https://doi.org/10.1109/MSP.2017.2743240

Levine S, Kumar A, Tucker G, Fu J (2020) Offline reinforcement learning: Tutorial, review, and perspectives on open problems. arXiv preprint arXiv:2005.01643

Vinuesa R, Azizpour H, Leite I, Balaam M, Dignum V, Domisch S, Felländer A, Langhans SD, Tegmark M, Nerini FF (2020) The role of artificial intelligence in achieving the sustainable development goals. Nature Commun. https://doi.org/10.1038/s41467-019-14108-y

Khoei TT, Kaabouch N (2023) A capsule Q-learning based reinforcement model for intrusion detection system on smart grid. In: 2023 IEEE international conference on electro information technology (eIT), Romeoville, IL, USA, pp 333–339. https://doi.org/10.1109/eIT57321.2023.10187374

Hoi SC, Sahoo D, Lu J, Zhao P (2021) Online learning: a comprehensive survey. Neurocomputing 459:249–289

Celard P, Iglesias EL, Sorribes-Fdez JM, Romero R, Vieira AS, Borrajo L (2023) A survey on deep learning applied to medical images: from simple artificial neural networks to generative models. Neural Comput Appl 35(3):2291–2323

Mohammad-Rahimi H, Rokhshad R, Bencharit S, Krois J, Schwendicke F (2023) Deep learning: a primer for dentists and dental researchers. J Dent 130:104430

Liu Z, Tong L, Chen L, Jiang Z, Zhou F, Zhang Q, Zhang X, Jin Y, Zhou H (2023) Deep learning based brain tumor segmentation: a survey. Complex Intell Syst 9(1):1001–1026

Zheng Y, Xu Z, Xiao A (2023) Deep learning in economics: a systematic and critical review. Artif Intell Rev 4:1–43

Jia T, Kapelan Z, de Vries R, Vriend P, Peereboom EC, Okkerman I, Taormina R (2023) Deep learning for detecting macroplastic litter in water bodies: a review. Water Res 231:119632

Newbury R, Gu M, Chumbley L, Mousavian A, Eppner C, Leitner J, Bohg J, Morales A, Asfour T, Kragic D, Fox D (2023) Deep learning approaches to grasp synthesis: a review. IEEE Trans Robotics. https://doi.org/10.1109/TRO.2023.3280597

Shafay M, Ahmad RW, Salah K, Yaqoob I, Jayaraman R, Omar M (2023) Blockchain for deep learning: review and open challenges. Clust Comput 26(1):197–221

Benczúr AA, Kocsis L, Pálovics R (2018) Online machine learning in big data streams. arXiv preprint arXiv:1802.05872

Shalev-Shwartz S (2011) Online learning and online convex optimization. Found Trends® Mach Learn 4(2):107–194

Millán Giraldo M, Sánchez Garreta JS (2008) A comparative study of simple online learning strategies for streaming data. WSEAS Trans Circuits Syst 7(10):900–910

Pinto G, Wang Z, Roy A, Hong T, Capozzoli A (2022) Transfer learning for smart buildings: a critical review of algorithms, applications, and future perspectives. Adv Appl Energy 5:100084

Sayed AN, Himeur Y, Bensaali F (2022) Deep and transfer learning for building occupancy detection: a review and comparative analysis. Eng Appl Artif Intell 115:105254

Li C, Zhang S, Qin Y, Estupinan E (2020) A systematic review of deep transfer learning for machinery fault diagnosis. Neurocomputing 407:121–135

Li W, Huang R, Li J, Liao Y, Chen Z, He G, Yan R, Gryllias K (2022) A perspective survey on deep transfer learning for fault diagnosis in industrial scenarios: theories, applications and challenges. Mech Syst Signal Process 167:108487

Wan Z, Yang R, Huang M, Zeng N, Liu X (2021) A review on transfer learning in EEG signal analysis. Neurocomputing 421:1–14

Tan C, Sun F, Kong T (2018) A survey on deep transfer learning. In: Proceedings of international conference on artificial neural networks, p 270–279

Qian F, Gao W, Yang Y, Yu D et al (2020) Potential analysis of the transfer learning model in short and medium-term forecasting of building HVAC energy consumption. Energy 193:116724

Weber M, Doblander C, Mandl P (2020) Towards the detection of building occupancy with synthetic environmental data. arXiv preprint arXiv:2010.04209

Zhu H, Xu J, Liu S, Jin Y (2021) Federated learning on non-IID data: a survey. Neurocomputing 465:371–390

Ouadrhiri AE, Abdelhadi A (2022) Differential privacy for deep and federated learning: a survey. IEEE Access 10:22359–22380. https://doi.org/10.1109/ACCESS.2022.3151670

Zhang C, Xie Y, Bai H, Yu B, Li W, Gao Y (2021) A survey on federated learning. Knowl-Based Syst 216:106775

Banabilah S, Aloqaily M, Alsayed E, Malik N, Jararweh Y (2022) Federated learning review: fundamentals, enabling technologies, and future applications. Inf Process Manag 59(6):103061

Mothukuri V, Parizi RM, Pouriyeh S, Huang Y, Dehghantanha A, Srivastava G (2021) A survey on security and privacy of federated learning. Futur Gener Comput Syst 115:619–640

McMahan HB, Moore E, Ramage D, Hampson S, Arcas BA (2017) Communication-efficient learning of deep networks from decentralized data. In: Proceedings of the 20th international conference on artificial intelligence and statistics, AISTATS

Hardy S, Henecka W, Ivey-Law H, Nock R, Patrini G, Smith G, Thorne B (2017) Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. arXiv preprint arXiv:1711.10677

Chen T, He T, Benesty M, Khotilovich V, Tang Y, Cho H et al (2015) Xgboost: extreme gradient boosting. R Package Vers 1:4–2

Heng K, Fan T, Jin Y, Liu Y, Chen T, Yang Q (2019) Secureboost: a lossless federated learning framework. arXiv preprint arXiv:1901.08755

Konečný J, McMahan HB, Yu FX, Richtárik P, Suresh AT, Bacon D (2016) Federated learning: strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492

Hamedani L, Liu R, Atat J, Wu Y (2017) Reservoir computing meets smart grids: attack detection using delayed feedback networks. IEEE Trans Industr Inf 14(2):734–743

Yuan X, Xie L, Abouelenien M (2018) A regularized ensemble framework of deep learning for cancer detection from multi-class, imbalanced training data. Pattern Recogn 77:160–172

Xiao B, Xiong J, Shi Y (2016) Novel applications of deep learning hidden features for adaptive testing. In: Proceedings of the 21st Asia and South Pacifc design automation conference, p 743–748

Zhong SH, Li Y, Le B (2015) Query oriented unsupervised multi document summarization via deep learning. Expert Syst Appl 42:1–10

Vincent P et al (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11:3371–3408

Alom MZ et al. (2017) Object recognition using cellular simultaneous recurrent networks and convolutional neural network. In: Neural networks (IJCNN), international joint conference on IEEE

Quang W, Stokes JW (2016) MtNet: a multi-task neural network for dynamic malware classification. In: Proceedings of the international conference on detection of intrusions and malware, and vulnerability assessment, Donostia-San Sebastián, Spain, 7–8 July, p 399–418

Kamilaris A, Prenafeta-Boldú FX (2018) Deep learning in agriculture: a survey. Comput Electron Agric 147:70–90

Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, Van Der Laak JA, Van Ginneken B, Sánchez CI (2017) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88

Gheisari M, Ebrahimzadeh F, Rahimi M, Moazzamigodarzi M, Liu Y, Dutta Pramanik PK, Heravi MA, Mehbodniya A, Ghaderzadeh M, Feylizadeh MR, Kosari S (2023) Deep learning: applications, architectures, models, tools, and frameworks: a comprehensive survey. CAAI Trans Intell Technol. https://doi.org/10.1049/cit2.12180

Pichler M, Hartig F (2023) Machine learning and deep learning—a review for ecologists. Methods Ecol Evolut 14(4):994–1016

Wang N, Chen T, Liu S, Wang R, Karimi HR, Lin Y (2023) Deep learning-based visual detection of marine organisms: a survey. Neurocomputing 532:1–32

Lee M (2023) The geometry of feature space in deep learning models: a holistic perspective and comprehensive review. Mathematics 11(10):2375

Xu M, Yoon S, Fuentes A, Park DS (2023) A comprehensive survey of image augmentation techniques for deep learning. Pattern Recogn 137:109347

Minaee S, Abdolrashidi A, Su H, Bennamoun M, Zhang D (2023) Biometrics recognition using deep learning: a survey. Artif Intell Rev 56:1–49

Xiang H, Zou Q, Nawaz MA, Huang X, Zhang F, Yu H (2023) Deep learning for image inpainting: a survey. Pattern Recogn 134:109046

Chakraborty S, Mali K (2022) An overview of biomedical image analysis from the deep learning perspective. Research anthology on improving medical imaging techniques for analysis and intervention. IGI Global, Hershey, pp 43–59

Lestari NI, Hussain W, Merigo JM, Bekhit M (2023) A survey of trendy financial sector applications of machine and deep learning. In: Application of big data, blockchain, and internet of things for education informatization: second EAI international conference, BigIoT-EDU 2022, virtual event, July 29–31, 2022, proceedings, part III, Springer Nature, Cham, p 619–633

Chaddad A, Peng J, Xu J, Bouridane A (2023) Survey of explainable AI techniques in healthcare. Sensors 23(2):634

Grumiaux PA, Kitić S, Girin L, Guérin A (2022) A survey of sound source localization with deep learning methods. J Acoust Soc Am 152(1):107–151

Zaidi SSA, Ansari MS, Aslam A, Kanwal N, Asghar M, Lee B (2022) A survey of modern deep learning based object detection models. Digital Signal Process 126:103514

Dong J, Zhao M, Liu Y, Su Y, Zeng X (2022) Deep learning in retrosynthesis planning: datasets, models and tools. Brief Bioinf 23(1):391

Zhan ZH, Li JY, Zhang J (2022) Evolutionary deep learning: a survey. Neurocomputing 483:42–58

Matsubara Y, Levorato M, Restuccia F (2022) Split computing and early exiting for deep learning applications: survey and research challenges. ACM Comput Surv 55(5):1–30

Zhang B, Rong Y, Yong R, Qin D, Li M, Zou G, Pan J (2022) Deep learning for air pollutant concentration prediction: a review. Atmos Environ 290:119347

Yu X, Zhou Q, Wang S, Zhang YD (2022) A systematic survey of deep learning in breast cancer. Int J Intell Syst 37(1):152–216

Behrad F, Abadeh MS (2022) An overview of deep learning methods for multimodal medical data mining. Expert Syst Appl 200:117006

Mittal S, Srivastava S, Jayanth JP (2022) A survey of deep learning techniques for underwater image classification. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2022.3143887

Tercan H, Meisen T (2022) Machine learning and deep learning based predictive quality in manufacturing: a systematic review. J Intell Manuf 33(7):1879–1905

Stefanini M, Cornia M, Baraldi L, Cascianelli S, Fiameni G, Cucchiara R (2022) From show to tell: a survey on deep learning-based image captioning. IEEE Trans Pattern Anal Mach Intell 45(1):539–559

Caldas S, Konečný J, McMahan HB, Talwalkar A (2018) Expanding the reach of federated learning by reducing client resource requirements. arXiv preprint arXiv:1812.07210

Chen Y, Sun X, Jin Y (2019) Communication-efficient federated deep learning with layerwise asynchronous model update and temporally weighted aggregation. IEEE Trans Neural Netw Learn Syst 31:4229–4238

Zhu H, Jin Y (2019) Multi-objective evolutionary federated learning. IEEE Trans Neural Netw Learn Syst 31:1310–1322


Acknowledgements

The authors acknowledge the support of the National Science Foundation (NSF), Award Number 2006674.

Author information

Authors and Affiliations

School of Electrical Engineering and Computer Science, University of North Dakota, Grand Forks, ND, 58202, USA

Tala Talaei Khoei, Hadjar Ould Slimane & Naima Kaabouch


Corresponding author

Correspondence to Tala Talaei Khoei.

Ethics declarations

Conflict of Interest

The authors declare no conflicts of interest relevant to this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Talaei Khoei, T., Ould Slimane, H. & Kaabouch, N. Deep learning: systematic review, models, challenges, and research directions. Neural Comput & Applic 35, 23103–23124 (2023). https://doi.org/10.1007/s00521-023-08957-4


Received: 31 May 2023
Accepted: 15 August 2023
Published: 07 September 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s00521-023-08957-4


Keywords: Artificial intelligence; Neural networks; Deep learning; Supervised learning; Unsupervised learning; Reinforcement learning; Online learning; Federated learning; Transfer learning

J Med Internet Res, v.24(4), 2022 Apr


Understanding the Research Landscape of Deep Learning in Biomedical Science: Scientometric Analysis

Donghun Kim, Woojin Jung, Yongjun Zhu

1 Department of Library and Information Science, Sungkyunkwan University, Seoul, Republic of Korea

2 Department of Library and Information Science, Yonsei University, Seoul, Republic of Korea

Advances in biomedical research using deep learning techniques have generated a large volume of related literature. However, there is a lack of scientometric studies that provide a bird’s-eye view of them. This absence has led to a partial and fragmented understanding of the field and its progress.

This study aimed to gain a quantitative and qualitative understanding of the scientific domain by analyzing diverse bibliographic entities that represent the research landscape from multiple perspectives and levels of granularity.

We searched and retrieved 978 deep learning studies in biomedicine from the PubMed database. A scientometric analysis was performed by analyzing the metadata, content of influential works, and cited references.

In the process, we identified the current leading fields, major research topics and techniques, knowledge diffusion, and research collaboration. There was a predominant focus on applying deep learning, especially convolutional neural networks, to radiology and medical imaging, whereas a few studies focused on protein or genome analysis. Radiology and medical imaging also appeared to be the most significant knowledge sources and an important field in knowledge diffusion, followed by computer science and electrical engineering. A coauthorship analysis revealed various collaborations among engineering-oriented and biomedicine-oriented clusters of disciplines.

Conclusions

This study investigated the landscape of deep learning research in biomedicine and confirmed its interdisciplinary nature. Although it has been successful, we believe that there is a need for diverse applications in certain areas to further boost the contributions of deep learning in addressing biomedical research problems. We expect the results of this study to help researchers and communities better align their present and future work.

Introduction

Deep learning is a class of machine learning techniques based on neural networks with multiple processing layers that learn representations of data [ 1 , 2 ]. Stemming from shallow neural networks, many deep learning architectures, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have been developed for various purposes [ 3 ]. The exponentially growing amount of data in many fields and recent advances in graphics processing units have further expedited research progress in the field. Deep learning has been actively applied to tasks, such as natural language processing (NLP), speech recognition, and computer vision, in various domains [ 1 ] and has shown promising results in diverse areas of biomedicine, including radiology [ 4 ], neurology [ 2 ], cardiology [ 5 ], cancer detection and diagnosis [ 6 , 7 ], radiotherapy [ 8 ], and genomics and structural biology [ 9 - 11 ]. Medical image analysis is a field that has actively used deep learning. For example, successful applications have been made in diagnosis [ 12 ], lesion classification or detection [ 13 , 14 ], organ and other substructure localization or segmentation [ 15 , 16 ], and image registration [ 17 , 18 ]. In addition, deep learning has also made an impact on predicting protein structures [ 19 , 20 ] and genomic sequencing [ 21 - 23 ] for biomarker development and drug design.

Despite the increasing number of published biomedical studies on deep learning techniques and applications, there has been a lack of scientometric studies that both qualitatively and quantitatively explore, analyze, and summarize the relevant studies to provide a bird’s-eye view of them. Previous studies have mostly provided qualitative reviews [ 2 , 9 , 10 ], and the few available bibliometric analyses were limited in their scope in that the researchers focused on a subarea such as public health [ 24 ] or a particular journal [ 25 ]. The absence of a coherent lens through which we can examine the field from multiple perspectives and levels of granularity leads to a partial and fragmented understanding of the field and its progress. To fill this gap, the aim of this study is to perform a scientometric analysis of metadata, content, and citations to investigate current leading fields, research topics, and techniques, as well as research collaboration and knowledge diffusion in deep learning research in biomedicine. Specifically, we intend to examine (1) biomedical journals that had frequently published deep learning studies and their coverage of research areas, (2) diseases and other biomedical entities that have been frequently studied with deep learning and their relationships, (3) major deep learning architectures in biomedicine and their specific applications, (4) research collaborations among disciplines and organizations, and (5) knowledge diffusion among different areas of study.

Data were collected from PubMed, a citation and abstract database that includes biomedical literature from MEDLINE and other life science journals indexed with Medical Subject Heading (MeSH) terms [ 26 ]. MeSH is a hierarchically structured biomedical terminology with descriptors organized into 16 categories, with subcategories [ 27 ]. In this study, deep learning [MeSH Major Topic] was used as the query to search and download deep learning studies from PubMed. Limiting a MeSH term as a major topic increases the precision of retrieval so that only studies that are highly relevant to the topic are found [ 28 ]. As of January 1, 2020, a total of 978 PubMed records with publication years ranging from 2016 to 2020 have been retrieved using the National Center for Biotechnology Information Entrez application programming interface. Entrez is a data retrieval system that can be programmatically accessed through its Biopython module to search and export records from the National Center for Biotechnology Information’s databases, including PubMed [ 26 , 29 ]. The metadata of the collected bibliographic records included the PubMed identifier or PubMed ID, publication year, journal title and its electronic ISSN, MeSH descriptor terms, and author affiliations. We also downloaded the citation counts and references of each bibliographic record and considered data sources other than PubMed as well. We collected citation counts of the downloaded bibliographic records from Google Scholar (last updated on February 8, 2020) and the subject categories of their publishing journals from the Web of Science (WoS) Core Collection database using the electronic ISSN.
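For readers who wish to reproduce the retrieval step, a minimal sketch using Biopython's Entrez module is shown below; the contact email and retmax value are placeholders, and additional paging would be needed for larger result sets.

```python
from Bio import Entrez

Entrez.email = "your.name@example.org"   # placeholder; NCBI requires a contact email

# search PubMed for records indexed with deep learning as a MeSH major topic
search = Entrez.read(Entrez.esearch(
    db="pubmed",
    term="deep learning[MeSH Major Topic]",
    retmax=1000,
))
pmids = search["IdList"]
print(len(pmids), "records found")

# fetch the matching records; MEDLINE text includes MeSH terms and author affiliations
handle = Entrez.efetch(db="pubmed", id=",".join(pmids), rettype="medline", retmode="text")
records = handle.read()
handle.close()
```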

Detailed Methods

Metadata Analysis

Journals are an important unit of analysis in scientometrics and have been used to understand specific research areas and disciplines [ 30 ]. In this study, biomedical journals that published deep learning studies were grouped using the WoS Core Collection subject categories and analyzed to identify widely studied research areas and disciplines.

Disease-related MeSH terms were analyzed to identify major diseases that have been studied using deep learning. We mapped descriptors to their corresponding numbers in MeSH Tree Structures to identify higher level concepts for descriptors that were too specific and ensured that all the descriptors had the same level of specificity. Ultimately, all descriptors were mapped to 6-digit tree numbers (C00.000), and terms with >1 tree number were separately counted for all the categories they belonged to. In addition, we visualized the co-occurrence network of major MeSH descriptors using VOSviewer (version 1.6.15) [ 31 , 32 ] and its clustering technique [ 33 ] to understand the relationships among the biomedical entities, as well as the clusters they form together.
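The mapping of descriptors to 6-digit tree numbers amounts to truncating each MeSH tree number to its first two levels and counting a descriptor once per category it belongs to. The sketch below illustrates this with a small, hypothetical descriptor-to-tree-number lookup; the real lookup comes from the NLM's MeSH Tree Structures.

```python
from collections import Counter

# hypothetical lookup from MeSH descriptors to their tree numbers;
# real values come from the MeSH Tree Structures released by the NLM
descriptor_tree_numbers = {
    "Digestive System Neoplasms": ["C04.588.274", "C06.301"],
    "Lung Neoplasms": ["C04.588.894.797.520", "C08.381.540"],
    "Alzheimer Disease": ["C10.228.140.380.100"],
}

def to_six_digit(tree_number):
    """Truncate a tree number such as C04.588.274 to its first two levels (C04.588)."""
    return ".".join(tree_number.split(".")[:2])

category_counts = Counter()
for descriptor, numbers in descriptor_tree_numbers.items():
    # a descriptor with more than one tree number is counted once per category
    for six_digit in {to_six_digit(n) for n in numbers}:
        category_counts[six_digit] += 1

print(category_counts.most_common())
```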

Author Affiliations

We analyzed author affiliations to understand the major organizations and academic disciplines that were active in deep learning research. The affiliations of 4908 authors extracted from PubMed records were recorded in various formats and manually standardized. We manually reviewed the affiliations to extract organizations, universities, schools, colleges, and departments. For authors with multiple affiliations, we selected the first one listed, which is usually the primary. We also analyzed coauthorships to investigate research collaboration among organizations and disciplines. All the organizations were grouped into one of the following categories: universities, hospitals, companies, or research institutes and government agencies to understand research collaboration among different sectors. We classified medical schools under hospitals as they are normally affiliated with each other. In the category of research institutes or government agencies, we included nonprofit private organizations or foundations and research centers that do not belong to a university, hospital, or company. We extracted academic disciplines from the department section or the school or college section when department information was unavailable. As the extracted disciplines were not coherent with multiple levels and combinations, data were first cleaned with OpenRefine (originally developed by Metaweb then Google), an interactive data transformation tool for profiling and cleaning messy data [ 34 ], and then manually grouped based on WoS categories and MeSH Tree Structures according to the following rules. We treated interdisciplinary fields and fields with high occurrence as separate disciplines from their broader fields and aggregated multiple fields that frequently co-occurred under a single department name into a single discipline after reviewing their disciplinary similarities.

Content Analysis

We identified influential studies by examining their citation counts in PubMed and Google Scholar. Citation counts from Google Scholar were considered in addition to PubMed as Google Scholar’s substantial citation data encompasses WoS and Scopus citations [ 35 ]. After sorting the articles in descending order of citations, the 2 sources showed a Spearman rank correlation coefficient of 0.883. From the PubMed top 150 list (ie, citation count >7) and Google Scholar top 150 list (ie, citation count >36), we selected the top 109 articles. Among these, we selected the sources that met the criteria for applying or developing deep learning models as the subjects of analysis to understand the major deep learning architectures in biomedicine and their applications. Specifically, we analyzed the research topics of the studies, the data and architectures used for those purposes, and how the black box problem was addressed.
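The agreement between the two citation sources can be quantified with SciPy's Spearman rank correlation, as in the minimal sketch below; the citation counts shown are hypothetical and stand in for the per-article counts collected from PubMed and Google Scholar.

```python
from scipy.stats import spearmanr

# hypothetical per-article citation counts from the two sources, in the same article order
pubmed_citations = [52, 31, 30, 22, 17, 15, 12, 9, 8, 7]
scholar_citations = [310, 190, 220, 140, 96, 80, 77, 41, 45, 36]

rho, p_value = spearmanr(pubmed_citations, scholar_citations)
print(f"Spearman rho = {rho:.3f} (p = {p_value:.4f})")
```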

Cited Reference Analysis

We collected the references from downloaded articles that had PubMed IDs. Citations represent the diffusion of knowledge from cited to citing publications; therefore, analyzing the highly cited references in deep learning studies in biomedicine allows for the investigation of disciplines and studies that have greatly influenced the field. Toward this end, we visualized networks of knowledge diffusion among WoS subjects using Gephi (v0.9.2) [ 36 ] and examined metrics such as modularity, PageRank score, and weighted outdegree using modularity for community detection [ 37 ]. PageRank indicates the importance of a node by measuring the quantity and quality of its incoming edges [ 38 ], and weighted outdegree measures the number of outgoing edges of a node. We also reviewed the contents of the 10 most highly cited influential works.
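The same metrics can also be computed programmatically. The sketch below uses NetworkX instead of Gephi, on a small, hypothetical directed graph whose weighted edges stand in for citation flows between Web of Science subjects, to obtain PageRank scores and weighted out-degrees.

```python
import networkx as nx

# toy directed graph: an edge (cited_subject -> citing_subject) with a weight equal
# to the number of citations flowing between the two Web of Science subjects
G = nx.DiGraph()
edges = [
    ("Radiology & Medical Imaging", "Computer Science", 40),
    ("Radiology & Medical Imaging", "Engineering, Biomedical", 25),
    ("Computer Science", "Radiology & Medical Imaging", 60),
    ("Electrical Engineering", "Computer Science", 30),
]
G.add_weighted_edges_from(edges)

pagerank = nx.pagerank(G, weight="weight")          # node importance via incoming edges
out_degree = dict(G.out_degree(weight="weight"))    # volume of outgoing knowledge flow

for subject in G.nodes:
    print(f"{subject}: PageRank={pagerank[subject]:.3f}, weighted outdegree={out_degree.get(subject, 0)}")
```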

On the basis of the data set, 315 biomedical journals have published deep learning studies, and Table 1 lists the top 10 journals selected based on publication size. Different WoS categories and MeSH terms are separated using semicolons.

Table 1. Top 10 journals with the highest record counts.

| Journal title | Web of Science category | National Library of Medicine catalog Medical Subject Heading term | Publisher | Record count, n |
| Bioinformatics | Biochemical Research Methods; Mathematical and Computational Biology; Biotechnology and Applied Microbiology | Computational Biology | BMC | 38 |
|  | Multidisciplinary Sciences | Natural Science Disciplines | Nature Research | 37 |
|  | Neurosciences; Computer Science, Artificial Intelligence | Nerve Net; Nervous System | Elsevier | 35 |
| Engineering in Medicine and Biology Society | N/A | Biomedical Engineering | IEEE | 31 |
|  | Imaging Science and Photographic Technology; Engineering, Electrical and Electronic; Computer Science, Interdisciplinary Applications; Radiology, Nuclear Medicine, and Medical Imaging; Engineering, Biomedical | Electronics, Medical; Radiography | IEEE | 30 |
|  | Chemistry, Analytical; Electrochemistry; Instruments and Instrumentation; Engineering, Electrical and Electronic | Biosensing Techniques | Multidisciplinary Digital Publishing Institute | 26 |
|  | Biochemical Research Methods; Mathematical and Computational Biology; Biotechnology and Applied Microbiology | Computational Biology; Genome | Oxford University Press | 22 |
|  | Biochemical Research Methods | Biomedical Research/methods; Research Design | Nature Research | 21 |
|  | Radiology, Nuclear Medicine, and Medical Imaging | Biophysics | American Association of Physicists in Medicine | 20 |
|  | Multidisciplinary Sciences | Medicine; Science | Public Library of Science | 20 |

a BMC: BioMed Central.

b IEEE: Institute of Electrical and Electronics Engineers.

c N/A: not applicable.

From a total of 978 records, 96 (9.8%) were unindexed in the WoS Core Collection and were excluded; for the remaining records, an average of 2.02 (SD 1.19) categories was assigned per record. The top ten subject categories mostly pertained to (1) biomedicine, with 22.2% (196/882) of articles published in Radiology, Nuclear Medicine, and Medical Imaging (along with Engineering, Biomedical: 121/882, 13.7%; Mathematical and Computational Biology: 107/882, 12.1%; Biochemical Research Methods: 103/882, 11.7%; Biotechnology and Applied Microbiology: 76/882, 8.6%; Neurosciences: 74/882, 8.4%); (2) computer science and engineering (Computer Science, Interdisciplinary Applications: 112/882, 12.7%; Computer Science, Artificial Intelligence: 75/882, 8.5%; Engineering, Electrical and Electronic: 75/882, 8.5%); or (3) Multidisciplinary Sciences (82/882, 9.3%).

For the main MeSH term or descriptor, an average of 9 (SD 4.21) terms was assigned to each record as subjects. Among them, we present in Figure 1 the diseases that were extracted from the C category. In the figure, the area size is proportional to the record count, and the terms are categorized by color. In addition, terms under >1 category were counted multiple times. For instance, the term Digestive System Neoplasms has two parents in MeSH Tree Structures, Neoplasms and Digestive System Diseases, and as such, we counted articles in this category under Neoplasms by Site as well as under Digestive System Neoplasms. Owing to the limited space, 7 categories whose total record counts were ≤10 (eg, Congenital, Hereditary, and Neonatal Diseases and Abnormalities; Nutritional and Metabolic Diseases; and Stomatognathic Diseases) were combined under the Others category, and individual diseases that had <10 record counts were summed up with each other in the same category to show only their total count (or with one of the diseases included as an example). In the process, we identified Neoplasms as the most frequently studied disease type, with a total of 199 studies.

Figure 1. Disease-related Medical Subject Heading descriptors studied with deep learning.

We further constructed a co-occurrence network of the complete set of major MeSH descriptors assigned to the records to understand the relationships among the biomedical entities. To enhance legibility, we filtered out terms with <5 occurrences. Figure 2 presents the visualized network of nodes (100/966, 10.4% of the total terms) with 612 edges and 7 clusters. In the figure, the sizes of the nodes and edges are proportional to the number of occurrences, and the node color indicates the assigned cluster (although the term deep learning was considered nonexclusive to any cluster as it appeared in all records).

Figure 2. Co-occurrence network of the major Medical Subject Heading descriptors (number of nodes=100; number of edges=612; number of clusters=7).

As depicted in Figure 2, each cluster comprised descriptors from two groups: (1) biomedical domains that deep learning was applied to, including body regions, related diseases, diagnostic imaging methods, and theoretical models, and (2) the purposes of deep learning and techniques used for the tasks, including diagnosis, analysis, and processing of biomedical data. In the first cluster, computer neural networks and software were studied for the purposes of computational biology, specifically protein sequence analysis, drug discovery, and drug design, to achieve precision medicine. These were relevant to the biomedical domains of (1) proteins, related visualization methods (microscopy), and biological models, and (2) neoplasms, related drugs (antineoplastic agents), and diagnostic imaging (radiology). In the second cluster, deep learning and statistical models were used for RNA sequence analysis and computer-assisted radiotherapy planning in relation to the domains of (1) genomics, RNA, and mutation, and (2) brain neoplasms and liver neoplasms. The third cluster comprised (1) heart structures (heart ventricles), cardiovascular diseases, and ultrasonography and (2) eye structures (retina), diseases (glaucoma), and ophthalmological diagnostic techniques. These had been studied for computer-assisted image interpretation using machine learning and deep learning algorithms. The biomedical domain group of the fourth cluster involved specific terms related to neoplasms such as type (adenocarcinoma), different regions (breast neoplasms, lung neoplasms, and colorectal neoplasms), and respective imaging methods (mammography and X-ray computed tomography) to which deep learning and support vector machines have been applied for the purpose of computer-assisted radiographic image interpretation and computer-assisted diagnosis. The fifth cluster included (1) brain disorders (Alzheimer disease), neuroimaging, and neurological models; (2) prostatic neoplasms; and (3) diagnostic magnetic resonance imaging and 3D imaging. Supervised machine learning had been used for computer-assisted image processing of these data. In the sixth cluster, automated pattern recognition and computer-assisted signal processing were studied with (1) human activities (eg, movement and face), (2) abnormal brain activities (epilepsy and seizures) and monitoring methods (electroencephalography), and (3) heart diseases and electrocardiography. In the last cluster, medical informatics, specifically data mining and NLP, including speech perception, had been applied to (1) electronic health records, related information storage and retrieval, and theoretical models and (2) skin diseases (skin neoplasms and melanoma) and diagnostic dermoscopy.

To investigate research collaboration within the field, we analyzed paper-based coauthorships using author affiliations with different levels of granularity, including organization and academic disciplines. We extracted organizations from 98.7% (4844/4908) of the total affiliations and visualized the collaboration of different organization types. The top 10 organizations with the largest publication records included Harvard University (37/844, 4.4%), Chinese Academy of Sciences (21/844, 2.5%; eg, Institute of Computing Technology, Institute of Automation, and Shenzhen Institutes of Advanced Technology), Seoul National University (21/844, 2.5%), Stanford University (20/844, 2.4%), Sun Yat-sen University (14/844, 1.7%; eg, Zhongshan Ophthalmic Center and Collaborative Innovation Center of Cancer Medicine), University of California San Diego (14/844, 1.7%; eg, Institute for Genomic Medicine, Shiley Eye Institute, and Institute for Brain and Mind), University of California San Francisco (14/844, 1.7%), University of Michigan (14/844, 1.7%), Yonsei University (14/844, 1.7%), and the University of Texas Health Science Center at Houston (12/844, 1.4%). The extracted organizations were assigned to one of the following four categories according to their main purpose: universities, hospitals, companies, or research institutes and government agencies. Among these, universities participated in most papers (567/844, 67.2%), followed by hospitals (429/844, 50.8%), companies (139/844, 16.5%), and research institutes or government agencies (88/844, 10.4%). We used a co-occurrence matrix to visualize the degrees of organizational collaboration, with the co-occurrence values log normalized to compare the relative differences (Figure 3).
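The log normalization of the co-occurrence counts is a simple element-wise transform, as in the minimal NumPy sketch below; the counts shown are hypothetical and merely stand in for the organization-type co-occurrence matrix described above.

```python
import numpy as np

types = ["University", "Hospital", "Company", "Institute/Government"]
# hypothetical symmetric matrix of how often each pair of organization types coauthored
cooccurrence = np.array([
    [120, 260, 70, 40],
    [260,  90, 55, 20],
    [ 70,  55, 15, 10],
    [ 40,  20, 10,  5],
], dtype=float)

# log normalization (log1p keeps zero counts at zero) to compare relative differences
log_normalized = np.log1p(cooccurrence)
print(np.round(log_normalized, 2))
```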

Figure 3. Collaboration of organization types.

From Figure 3 , we found that universities were the most active in collaborative research, particularly with hospitals, followed by companies and research institutes or government agencies. Hospitals also frequently collaborated with companies; however, research institutes or government agencies tended not to collaborate much as they published relatively fewer studies.

We also examined the collaborations among academic disciplines, which we could extract, as described in the Methods section, from 76.24% (3742/4908) of the total affiliations. Approximately half (ie, 386/756, 51.1%) of the papers were completed under disciplinary collaboration. Figure 4 depicts the network with 36 nodes (36/148, 24.3% of the total) and 267 edges after we filtered out disciplines with weighted degrees <10, representing the number of times one collaborated with the other disciplines. In the figure, the node and edge sizes are proportional to the weighted degree and link strength, respectively, and the node color indicates the assigned cluster.

Figure 4. Collaboration network of academic disciplines (number of nodes=36; number of edges=267; number of clusters=6).

As shown in the figure, the academic disciplines were assigned to 1 of 6 clusters, including 1 engineering-oriented cluster (cluster 1) and other clusters that encompassed biomedical fields. We specifically looked at the degree of collaboration between the biomedical and engineering disciplines. Figure 4 depicts that the most prominent collaboration was among Radiology, Medical Imaging, and Nuclear Medicine ; Computer Science ; and Electronics and Electrical Engineering . There were also strong links among Computer Science or Electronics and Electrical Engineering and Biomedical Informatics , Biomedical Engineering , and Pathology and Laboratory Medicine .

Among the top 10 disciplines in Figure 4 , the following three had published the most papers and had the highest weighted degree and degree centralities: Computer Science (number of papers=195, weighted degree=193, and degree centrality=32); Radiology, Medical Imaging, and Nuclear Medicine (number of papers=168, weighted degree=166, and degree centrality=30); and Electronics and Electrical Engineering (number of papers=161, weighted degree=160, and degree centrality=32). Meanwhile, some disciplines had high weighted degrees compared with their publication counts, indicating their activeness in collaborative research. These included Pathology and Laboratory Medicine (5th in link strength vs 8th in publications) and Public Health and Preventive Medicine (9th in link strength vs 15th in publications). A counterexample was Computational Biology , which was 12th in link strength but 7th in publications.

We analyzed the content of influential studies that had made significant contributions to the field through the application or development of deep learning architectures. We identified these studies by examining the citation counts from PubMed and Google Scholar, assigning the 109 most-cited records to one of the following categories: (1) review , (2) application of existing deep learning architectures to certain biomedical domains (denoted by A ), or (3) development of a novel deep learning model (denoted by D ). Table 2 summarizes the 92 papers assigned to the application or development category according to their research topic in descending order of citation count.

Table 2. Top 92 studies with the highest citation count under the application or development category, according to the research topic.

Research topic and number | Task type | Data | Deep learning architectures

A1 [ ]ClassificationRetinal disease OCT and chest x-ray with pneumoniaInception

A2 [ ]Segmentation and classificationRetinal disease OCTU-net and CNN

A3 [ ]ClassificationMelanoma dermoscopic imagesInception

A4 [ ]Survival predictionBrain glioblastoma MRI CNN_S

A6 [ ]Classification and segmentationWSI of 13 cancer typesCNN with CAE and DeconvNet

D1 [ ]SegmentationBrain MRIResNet based

A7 [ ]PredictionRetinal fundus images with cardiovascular diseaseInception

D2 [ ]TrackingVideo of freely behaving animalResNet-based DeeperCut subset

A8 [ ]ClassificationColonoscopy video of colorectal polypsInception

A9 [ ]ClassificationLung cancer CT CNN

A10 [ ]Classification and segmentationRetinal OCT with macular diseaseEncoder-decoder CNN

D3 [ ]SegmentationBrain glioma MRICNN based

D4 [ ]Binding affinities predictionProtein-ligand complexes as voxelSqueezeNet based

A11 [ ]Survival classificationBrain glioma MRI, functional MRI, and DTI CNN and mCNN

A12 [ ]ClassificationFundus images with glaucomatous optic neuropathyInception

A13 [ ]ClassificationChest radiographs with pneumoniaResNet and CheXNet

A14 [ ]Classification and segmentationCritical head abnormality CTResNet, U-net, and DeepLab

A15 [ ]ClassificationBrain glioma MRIResNet

D6 [ ]ClassificationThoracic disease radiographsDenseNet based

A16 [ ]Classification and segmentationEchocardiogram video with cardiac diseaseVGGNet and U-net

A17 [ ]ClassificationBrain positron emission tomography with AlzheimerInception

D7 [ ]ClassificationBreast cancer histopathological imagesCNN based

A18 [ ]ClassificationSkin tumor imagesResNet

A19 [ ]Classification and predictionChest CT with chronic obstructive pulmonary disease and acute respiratory diseaseCNN

A20 [ ]SegmentationBrain MRI with autism spectrum disorderFCNN

D8 [ ]SegmentationFetal MRI and brain tumor MRIProposal network (P-Net) based

A21 [ ]Classification, prediction, and reconstructionNatural movies and functional MRI of watching moviesAlexNet and De-CNN

D9 [ ]Detection and classificationFacial images with a genetic syndromeCNN based

A22 [ ]Detection and segmentationMicroscopic images of cellsU-net

A23 [ ]Classification and localizationBreast cancer mammogramsFaster region-based CNN with VGGNet

A24 [ ]Segmentation and predictionLung cancer CTMask-RCNN, CNN with GoogLeNet and RetinaNet

A26 [ ]ClassificationLung cancer CTCNN; fully connected NN; SAE

A27 [ ]Survival classificationLung cancer CTCNN

A29 [ ]PredictionPolar maps of myocardial perfusion imaging with CAD CNN

A30 [ ]ClassificationProstate cancer MRICNN

D12 [ ]ClassificationLiver SWE with chronic hepatitis BCNN based

D14 [ ]SegmentationLiver cancer CTDenseNet with U-net based

A31 [ ]ClassificationFundus images with macular degenerationAlexNet, GoogLeNet, VGGNet, inception, ResNet, and inception-ResNet

A32 [ ]ClassificationBladder cancer CTcuda-convnet

A34 [ ]ClassificationProstate cancer tissue microarray imagesMobileNet

D19 [ ]ClassificationHolographic microscopy of speciesCNN based

A36 [ ]Survival classificationChest CTCNN

D20 [ ]Classification and localizationMalignant lung nodule radiographsResNet based

A37 [ ]ClassificationShoulder radiographs with proximal humerus fractureResNet

A39 [ ]ClassificationFacial images of hetero and homosexualVGG-Face

A41 [ ]Segmentation and classificationCAD CT angiographyCNN and CAE

A42 [ ]Classification and localizationRadiographs with fractureU-net

A43 [ ]Binding classificationPeptide major histocompatibility complex as image-like arrayCNN

A44 [ ]DetectionLung nodule CTCNN

A45 [ ]ClassificationConfocal endomicroscopy video of oral cancerLeNet

A46 [ ]ClassificationWSI of prostate, skin, and breast cancerMIL with ResNet and RNN

D24 [ ]TrackingVideo of freely behaving animalFCNN based

D25 [ ]SegmentationFundus images with glaucomaU-net based

A47 [ ]Segmentation and classificationCardiac disease cine MRIU-net; M-Net; Dense U-net; SVF-Net; Grid-Net; Dilated CNN

D27 [ ]ClassificationKnee abnormality MRIAlexNet based

D28 [ ]Binding affinities predictionProtein-ligand complexes as gridCNN based

A50 [ ]SegmentationAutosomal dominant polycystic kidney disease CTFCNN with VGGNet

A51 [ ]Segmentation and classificationKnee cartilage lesion MRIVGGNet

A52 [ ]ClassificationMammogramsResNet

A54 [ ]PredictionCAD CT angiographyFCNN

D31 [ ]Classification and localizationWSI of lymph nodes in metastatic breast cancerInception based

D35 [ ]ClassificationFluorescence microscopic images of cellsFFNN based

A56 [ ]ClassificationRetinal fundus images with diabetic retinopathy and breast mass mammographyResNet; GoogLeNet

A25 [ ]Artifact reductionBrain and abdomen CT and radial MR dataU-net

A28 [ ]Resolution enhancementFluorescence microscopic imagesGAN with U-net and CNN

D15 [ ]DealiasingCompressed sensing brain lesion and cardiac MRIGAN with U-net and VGGNet based

D16 [ ]Resolution enhancementSuperresolution localization microscopic imagesGAN with U-net–based pix2pix network modified

A33 [ ]ReconstructionBrain and pelvic MRI and CTGAN with FCNN and CNN

D18 [ ]Artifact reductionCTCNN based

A38 [ ]ReconstructionContrast-enhanced brain MRIEncoder-decoder CNN

D22 [ ]ReconstructionBrain MR fingerprinting dataFFNN based

D23 [ ]Resolution enhancementHi-C matrix of chromosomesCNN based

A48 [ ]Resolution enhancementBrain tumor MRIU-net

D26 [ ]ReconstructionLung vessels CTCNN based

D32 [ ]Resolution enhancementKnee MRICNN based

D33 [ ]ReconstructionCTCNN based

D34 [ ]RegistrationCardiac cine MRI and chest CTCNN based

D17 [ ]Novel structures generation and property predictionSMILES Stack-RNN with GRU - and LSTM based

A40 [ ]Novel structures generationSMILESvariational AE ; CNN- and RNN with GRU-based AAE

D21 [ ]Gene expression (variant effects) predictionGenomic sequenceCNN based

D30 [ ]Novel structures generation and classificationSMILESGAN with differentiable neural computer and CNN based

A53 [ ]Novel structures generationSMILESLSTM

A57 [ ]ClassificationAntimicrobial peptide sequenceCNN with LSTM

D13 [ ]Contact predictionProtein sequence to contact matrixResNet based

A5 [ ]Subtype identification (survival classification)Multi-omics data from liver cancerAE

D5 [ ]Phenotype predictionGenotypeGoogLeNet and deeply supervised net based

D10 [ ]Survival predictionGenomic profiles from cancerFFNN based

D11 [ ]Drug synergies predictionGene expression profiles of cancer cell line and chemical descriptors of drugsFFNN based

A35 [ ]NLP (classification)Electronic health record with pediatric diseaseAttention-based BLSTM

A49 [ ]Binding classificationProtein sequence as matrix and drug molecular fingerprintSAE

D29 [ ]ClassificationElectrocardiogram signalBLSTM based

A55 [ ]ClassificationPolysomnogram signalCNN

a OCT: optical coherence tomography.

b CNN: convolutional neural network.

c MRI: magnetic resonance imaging.

d WSI: whole slide image.

e CAE: convolutional autoencoder.

f ResNet: residual networks.

g CT: computed tomography.

h DTI: diffusion tensor imaging.

i mCNN: multicolumn convolutional neural network.

j FCNN: fully convolutional neural network.

k SAE: stacked autoencoder.

l CAD: coronary artery disease.

m SWE: shear wave elastography.

n MIL: multiple instance learning.

o FFNN: feedforward neural network.

p MR: magnetic resonance.

q GAN: generative adversarial network.

r SMILES: simplified molecular input line-entry system.

s RNN: recurrent neural network.

t GRU: gated recurrent unit.

u LSTM: long short-term memory.

v AE: autoencoder.

w AAE: adversarial autoencoder.

x NLP: natural language processing.

y BLSTM: bidirectional long short-term memory.

Research Topics

In these studies, researchers applied or developed deep learning architectures mainly for the following purposes: image analysis, especially for diagnostic purposes, including the classification or prediction of diseases or survival, and the detection, localization, or segmentation of certain areas or abnormalities. These 3 tasks, which aim to identify the location of an object of interest, differ in that detection involves a single reference point, localization involves an area identified through a bounding box, saliency map, or heatmap, and segmentation involves a precise area with clear outlines identified through pixel-wise analysis. Meanwhile, in some studies, models for image analysis unrelated to diagnosis were proposed, such as classifying or segmenting cells in microscopic images and tracking moving animals in videos through pose estimation. Another major objective involved image processing for reconstructing or registering medical images. This included enhancing low-resolution images to high resolution, reconstructing images with different modalities or synthesized targets, reducing artifacts, dealiasing, and aligning medical images.

Meanwhile, several researchers used deep learning architectures to analyze molecules, proteins, and genomes for various purposes. These included drug design or discovery, specifically for generating novel molecular structures through sequence analysis and for predicting binding affinities through image analysis of complexes; understanding protein structure through image analysis of contact matrix; and predicting phenotypes, cancer survival, drug synergies, and genomic variant effects from genes or genomes. Finally, in some studies, deep learning was applied to the diagnostic classification of sequential data, including electrocardiogram or polysomnogram signals and electronic health records. In summary, in the reviewed literature, we identified a predominant focus on applying or developing deep learning models for image analysis regarding localization or diagnosis and image processing, with a few studies focusing on protein or genome analysis.

Deep Learning Architectures

Regarding the main architectures, most were CNNs or were based on ≥1 CNN architecture, such as a fully convolutional neural network (FCNN) and its variants, including U-net; residual neural network (ResNet) and its variants; GoogLeNet (Inception v1) or Inception; VGGNet and its variants; and other architectures. Meanwhile, a few researchers based their models on feedforward neural networks that were not CNNs, including autoencoders (AEs) such as the convolutional AE and stacked AE. Others adapted RNNs, including (bidirectional) long short-term memory and gated recurrent unit networks. Furthermore, models that combined RNNs or AEs with CNNs were also proposed.

Content analysis of the reviewed literature showed that different deep learning architectures were used for different research tasks. Models for classification or prediction tasks using images were predominantly CNN based, with most being ResNet and GoogLeNet or Inception. ResNet with shortcut connections [ 129 ] and GoogLeNet or Inception with 1×1 convolutions, factorized convolutions, and regularizations [ 130 , 131 ] allow networks of increased depth and width by solving problems such as vanishing gradients and computational costs. These mostly analyzed medical images from magnetic resonance imaging or computed tomography, with cancer-related images often used as input data for diagnostic classification, in addition to image-like representations of protein complexes. Meanwhile, when applying these tasks to data other than images, such as genomic or gene expression profiles and protein sequence matrices, researchers used feedforward neural networks, including AEs, that enabled semi- or unsupervised learning and dimensionality reduction.
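As a minimal illustration of the shortcut connections described above, the following sketch (assuming PyTorch, in which such architectures are commonly implemented) shows a basic residual block: the input bypasses the convolutional path and is added back before the final activation.

# Minimal sketch of a ResNet-style residual block (PyTorch assumed).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                        # shortcut path
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + identity                # add the skip connection
        return self.relu(out)

block = ResidualBlock(64)
y = block(torch.randn(1, 64, 32, 32))       # output keeps the input shape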

Image analysis for segmentation and image processing tasks was also achieved through CNN-based architectures, most of them FCNNs, especially U-net. FCNNs produce an input-sized pixel-wise prediction by replacing the last fully connected layers with convolution layers, making them advantageous for the abovementioned tasks [ 132 ], and U-net enhances this performance through long skip connections that concatenate feature maps from the encoder path to the decoder path [ 133 ]. In particular, for medical image processing tasks, a few researchers combined FCNNs (U-net) with other CNNs by adopting the generative adversarial network structure, which generates new instances that mimic the real data through an adversarial process between the generator and discriminator [ 134 ]. We found that images of the brain were often used as input data for these studies.
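The long skip connections of U-net can likewise be sketched in a few lines. The toy network below (PyTorch assumed; channel counts and image sizes chosen only for illustration) concatenates an encoder feature map onto the upsampled decoder path before a 1×1 convolution produces the pixel-wise prediction.

# Minimal sketch of a U-net-style long skip connection (PyTorch assumed).
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.bottleneck = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)
        # decoder sees 16 (upsampled) + 16 (skip) channels after concatenation
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, n_classes, kernel_size=1)   # pixel-wise prediction

    def forward(self, x):
        e = self.enc(x)
        b = self.bottleneck(self.down(e))
        d = self.up(b)
        d = torch.cat([d, e], dim=1)        # long skip connection
        return self.head(self.dec(d))

net = TinyUNet()
logits = net(torch.randn(1, 1, 64, 64))      # shape (1, 2, 64, 64): one score map per class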

On the other hand, RNNs were applied to sequence analysis of the string representation of molecules (simplified molecular input line-entry system) and to pattern analysis of sequential data such as signals. A few of these models, especially those generating novel molecular structures, combined RNNs with CNNs by adopting generative adversarial networks, including the adversarial AE. In summary, the findings showed that current deep learning models were predominantly CNN based, with most of them focusing on analyzing medical image data and with different architectures preferred for specific tasks.
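For the sequence-analysis side, a minimal sketch of a character-level RNN over SMILES strings is shown below (PyTorch assumed; the molecules and the scalar output head are illustrative, and padding handling is omitted for brevity).

# Minimal sketch of an LSTM over SMILES strings (PyTorch assumed).
import torch
import torch.nn as nn

smiles = ["CCO", "c1ccccc1", "CC(=O)O"]             # illustrative molecules
vocab = sorted({ch for s in smiles for ch in s})
stoi = {ch: i + 1 for i, ch in enumerate(vocab)}    # index 0 is reserved for padding

def encode(s, max_len=12):
    ids = [stoi[ch] for ch in s][:max_len]
    return ids + [0] * (max_len - len(ids))

class SmilesLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 1)         # e.g., a scalar property prediction

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return self.out(h[:, -1])                   # last time step (padding ignored for brevity)

model = SmilesLSTM(vocab_size=len(vocab) + 1)
batch = torch.tensor([encode(s) for s in smiles])
print(model(batch).shape)                           # torch.Size([3, 1])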

Among these studies, Table 3 shows, in detail, the objectives and the proposed methods of the 35 studies with novel model development.

Table 3. Content analysis of the top 35 records in the development category.

Number | Development objectives | Methods (proposed model)
D1Segment brain anatomical structures in 3D MRI Voxelwise Residual Network: trained through residual learning of volumetric feature representation and integrated with contextual information of different modalities and levels
D2Estimate poses to track body parts in various animal behaviorsDeeperCut’s subset DeepLabCut: network fine-tuned on labeled body parts, with deconvolutional layers producing spatial probability densities to predict locations
D3Predict isocitrate dehydrogenase 1 mutation in low-grade glioma with MRI radiomics analysisDeep learning–based radiomics: segment tumor regions and directly extract radiomics image features from the last convolutional layer, which is encoded for feature selection and prediction
D4Predict protein-ligand binding affinities represented by 3D descriptorsKDEEP: 3D network to predict binding affinity using voxel representation of protein-ligand complex with assigned property according to its atom type
D5Predict phenotype from genotype through the biological hierarchy of cellular subsystemsDCell: visible neural network with structure following cellular subsystem hierarchy to predict cell growth phenotype and genetic interaction from genotype
D6Classify and localize thoracic diseases in chest radiographsDenseNet-based CheXNeXt: networks trained for each pathology to predict its presence and ensemble and localize indicative parts using class activation mappings
D7Multi-classification of breast cancer from histopathological imagesCSDCNN : trained through end-to-end learning of hierarchical feature representation and optimized feature space distance between breast cancer classes
D8Interactive segmentation of 2D and 3D medical images fine-tuned on a specific imageBounding box and image-specific fine-tuning–based segmentation: trained for interactive image segmentation using bounding box and fine-tuned for specific image with or without scribble and weighted loss function
D9Facial image analysis for identifying phenotypes of genetic syndromesDeepGestalt: preprocessed for face detection and multiple regions and extracts phenotype to predict syndromes per region and aggregate probabilities for classification
D10Predict cancer outcomes with genomic profiles through survival models optimizationSurvivalNet: deep survival model with high-dimensional genomic input and Bayesian hyperparameter optimization, interpreted using risk backpropagation
D11Predict synergy effect of novel drug combinations for cancer treatmentDeepSynergy: predicts drug synergy value using cancer cell line gene expressions and chemical descriptors, which are normalized and combined through conic layers
D12Classify liver fibrosis stages in chronic hepatitis B using radiomics of SWE DLRE : predict the probability of liver fibrosis stages with quantitative radiomics approach through automatic feature extraction from SWE images
D13Predict protein residue contact map at pixel level with protein featuresRaptorX-Contact: combined networks to learn contact occurrence patterns from sequential and pairwise protein features to predict contacts simultaneously at pixel level
D14Segment liver and tumor in abdominal CT scansHybrid Densely connected U-net: 2D and 3D networks to extract intra- and interslice features with volumetric contexts, optimized through hybrid feature fusion layer
D15Reconstruct compressed sensing MRI to dealiased imageDAGAN : conditional GAN stabilized by refinement learning, with the content loss combined adversarial loss incorporating frequency domain data
D16Reconstruct sparse localization microscopy to superresolution imageArtificial Neural Network Accelerated–Photoactivated Localization Microscopy: trained with superresolution PALM as the target, compares reconstructed and target with loss functions containing conditional GAN
D17Generate novel chemical compound design with desired propertiesReinforcement Learning for Structural Evolution: generate chemically feasible molecule as strings and predict its property, which is integrated with reinforcement learning to bias the design
D18Reduce metal artifacts in reconstructed x-ray CT imagesCNN -based Metal Artifact Reduction: trained on images processed by other Metal Artifact Reduction methods and generates prior images through tissue processing and replaces metal-affected projections
D19Predict species to identify anthrax spores in single cell holographic imagesHoloConvNet: trained with raw holographic images to directly recognize interspecies difference through representation learning using error backpropagation
D20Classify and detect malignant pulmonary nodules in chest radiographsDeep learning–based automatic detection: predict the probability of nodules per radiograph for classification and detect nodule location per nodule from activation value
D21Predict tissue-specific gene expression and genomic variant effects on the expressionExPecto: predict regulatory features from sequences and transform to spatial features and use linear models to predict tissue-specific expression and variant effects
D22Reconstruct MRF to obtain tissue parameter mapsDeep reconstruction network: trained with a sparse dictionary that maps magnitude image to quantitative tissue parameter values for MRF reconstruction
D23Generate high-resolution Hi-C interaction matrix of chromosomes from a low-resolution matrixHiCPlus: predict high-resolution matrix through mapping regional interaction features of low-resolution to high-resolution submatrices using neighboring regions
D24Estimate poses to track body parts of freely moving animalsLEAP : videos preprocessed for egocentric alignment and body parts labeled using GUI and predicts each location by confidence maps with probability distributions
D25Jointly segment optic disc and cup in fundus images for glaucoma screeningM-Net: multi-scale network for generating multi-label segmentation prediction maps of disc and cup regions using polar transformation
D26Reconstruct limited-view PAT to high-resolution 3D imagesDeep gradient descent: learned iterative image reconstruction, incorporated with gradient information of the data fit separately computed from training
D27Predict classifications of and localize knee injuries from MRIMRNet: networks trained for each diagnosis according to a series to predict its presence and combine probabilities for classification using logistic regression
D28Predict binding affinities between 3D structures of protein-ligand complexesPafnucy: structure-based prediction using 3D grid representation of molecular complexes with different orientations as having same atom types
D29Classify electrocardiogram signals based on wavelet transformDeep bidirectional LSTM network–based wavelet sequences: generate decomposed frequency subbands of electrocardiogram signal as sequences by wavelet-based layer and use as input for classification
D30Generate novel small molecule structures with possible biological activityReinforced Adversarial Neural Computer: combined with GAN and reinforcement learning, generates sequences matching the key feature distributions in the training molecule data
D31Detect and localize breast cancer metastasis in digitized lymph nodes slidesLYmph Node Assistant: predict the likelihood of tumor in tissue area and generate a heat map for slides identifying likely areas
D32Transform low-resolution thick slice knee MRI to high-resolution thin slicesDeepResolve: trained to compute residual images, which are added to low-resolution images to generate their high-resolution images
D33Reconstruct sparse-view CT to suppress artifact and preserve featureLearned Experts’ Assessment–Based Reconstruction Network: iterative reconstruction using previous compressive sensing methods, with fields of expert-applied regularization terms learned iteration dependently
D34Unsupervised affine and deformable aligning of medical imagesDeep Learning Image Registration: multistage registration network and unsupervised training to predict transformation parameters using image similarity and create warped moving images
D35Classify subcellular localization patterns of proteins in microscopy imagesLocalization Cellular Annotation Tool: predict localization per cell for image-based classification of multi-localizing proteins, combined with gamer annotations for transfer learning

a MRI: magnetic resonance imaging.

b CSDCNN: class structure-based deep convolutional neural network.

c SWE: shear wave elastography.

d DLRE: deep learning radiomics of elastography.

e CT: computed tomography.

f DAGAN: Dealiasing Generative Adversarial Networks.

g GAN: generative adversarial network.

h PALM: photoactivated localization microscopy.

i CNN: convolutional neural network.

j MRF: magnetic resonance fingerprinting.

k LEAP: LEAP Estimates Animal Pose.

l GUI: graphical user interface.

m PAT: photoacoustic tomography.

n LSTM: long short-term memory.

Black Box Problem

In quite a few of the reviewed studies, the black box problem of deep learning was partly addressed, as researchers implemented various methods to improve model interpretability. To understand the prediction results of image analysis models, most used one of the following two techniques to visualize the important regions: (1) activation-based heatmaps [ 45 , 54 , 65 , 70 ], especially class activation maps [ 57 , 61 , 77 , 92 ], and saliency maps [ 59 ] and (2) occlusion testing [ 39 , 75 , 82 , 94 ]. For models analyzing data other than images, there were no generally accepted techniques for model interpretation, and researchers suggested some methods, including adopting an interpretable hierarchical structure such as the cellular subsystem [ 122 ] or anatomical division [ 125 ], using backpropagation [ 123 ], observing gate activations of cells in the neural network [ 114 ], or investigating how corrupted input data affect the prediction and how identical predictions are made for different inputs [ 93 ]. As such, various methods were found to be used to tackle this well-known limitation of deep learning.
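Occlusion testing, one of the interpretability techniques mentioned above, is straightforward to sketch: slide a patch over the input, re-run the model, and record how much the target class score drops. The helper below is a generic illustration in which model is any callable returning class probabilities; the function and parameter names are ours, not a specific library's API.

# Generic occlusion-sensitivity sketch (numpy only); `model` is assumed to return
# class probabilities for a single image with pixel values in [0, 1].
import numpy as np

def occlusion_map(model, image, target_class, patch=16, stride=8, fill=0.5):
    h, w = image.shape[:2]
    baseline = model(image)[target_class]
    heatmap = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill   # hide this region
            drop = baseline - model(occluded)[target_class]
            heatmap[i, j] = drop                        # large drop = important region
    return heatmap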

On average, each examined deep learning study with at least one PubMed-indexed citation (429/978, 43.9%) had 25.8 (SD 20.0) citations. These cited references comprised 9373 unique records that were cited 1.27 times on average (SD 2.16). Excluding the ones that were unindexed in the WoS Core Collection (755/9373, 8.06% of the unique records), an average of 1.77 (SD 1.07) categories were assigned to each of the remaining 8618 records. The top 10 WoS categories, which were assigned to the greatest number of total cited references, pertained to the following three major groups: (1) biomedicine (Radiology, Nuclear Medicine, and Medical Imaging: 2025/11,033, 18.35%; Biochemical Research Methods: 1118/11,033, 10.13%; Mathematical and Computational Biology: 1066/11,033, 9.66%; Biochemistry and Molecular Biology: 1043/11,033, 9.45%; Engineering, Biomedical: 981/11,033, 8.89%; Biotechnology and Applied Microbiology: 916/11,033, 8.3%; Neurosciences: 844/11,033, 7.65%), (2) computer science and engineering (Computer Science, Interdisciplinary Applications: 1041/11,033, 9.44%; Engineering, Electrical and Electronic: 645/11,033, 5.85%), and (3) Multidisciplinary Sciences (1411/11,033, 12.79%).

To understand the intellectual structure of how knowledge is transferred among different areas of study through citations, we visualized the citation network of WoS subject categories. In the directed citation network shown in Figure 5 , the edges were directed clockwise with the source nodes as the WoS categories of the deep learning studies we examined and the target nodes as the WoS categories of the cited references from which knowledge was obtained. To enhance legibility, we filtered out categories with <100 weighted degrees, excluding self-loops, to form a network of 20 nodes (20/158, 12.7% of the total) and 59 edges (59/2380, 2.48% of the total). In the figure, the node color and size are proportional to the PageRank score (probability 0.85; ε=0.001; Figure 5 A) and weighted-out degree ( Figure 5 B), and the edge size and color are proportional to the link strength. PageRank considers not only the quantity but also the quality of incoming edges, identifying important exporters for knowledge diffusion based on how often and by which fields a node is cited. On the other hand, the weighted outdegree measures outgoing edges and identifies major knowledge importers that frequently cite other fields.
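The two measures can be reproduced on a toy graph with the networkx package (assumed here; the nodes and edge weights are illustrative, not the study's data): edges run from the citing category to the cited category, PageRank highlights knowledge exporters, and the weighted outdegree highlights knowledge importers.

# Toy citation graph sketch (networkx assumed; example categories and weights only).
import networkx as nx

G = nx.DiGraph()
G.add_edge("Radiology & Medical Imaging", "Engineering, Biomedical", weight=120)
G.add_edge("Radiology & Medical Imaging", "Computer Science, Interdisciplinary", weight=90)
G.add_edge("Mathematical & Computational Biology", "Biochemical Research Methods", weight=150)
G.add_edge("Biochemical Research Methods", "Multidisciplinary Sciences", weight=80)

# PageRank weighs how often, and by which fields, a category is cited (knowledge exporters)
pagerank = nx.pagerank(G, alpha=0.85, tol=0.001, weight="weight")

# Weighted outdegree counts outgoing citations (knowledge importers)
out_degree = dict(G.out_degree(weight="weight"))

print(sorted(pagerank.items(), key=lambda kv: -kv[1]))
print(sorted(out_degree.items(), key=lambda kv: -kv[1]))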

Figure 5. Citation network of the Web of Science subject categories assigned to the reviewed publications and their cited references according to (A) PageRank and (B) weighted outdegree (number of nodes=20; number of edges=59).

As depicted in Figure 5 A, categories with high PageRank scores mostly coincided with the frequently cited fields identified above and were grouped into two communities through modularity (upper half and lower half). The upper half region centered on Radiology, Nuclear Medicine, and Medical Imaging , which had the highest PageRank score (0.191) and proved to be a field with a significant influence on deep learning studies in biomedicine. Meanwhile, important knowledge exporters to this field included Engineering, Biomedical (0.134); Engineering, Electrical and Electronic (0.110); and Computer Science, Interdisciplinary Applications (0.091). The lower half region mainly comprised categories with comparable PageRank scores in which knowledge was frequently exchanged between one another, including Biochemical Research Methods (0.053), Multidisciplinary Sciences (0.053), Biochemistry and Molecular Biology (0.052), Biotechnology and Applied Microbiology (0.050), and Mathematical and Computational Biology (0.048). Specifically, in Figure 5 B, Mathematical and Computational Biology (1992), Biotechnology and Applied Microbiology (1836), and Biochemical Research Methods (1807) were identified as major knowledge importers with the highest weighted outdegrees, whereas Biochemistry and Molecular Biology (344) had a relatively low weighted outdegree, indicating their role as a source of knowledge for these fields.

We analyzed the 10 most frequently cited studies to gain an in-depth understanding of the most influential works and assigned these papers to one of the three categories: review, application, or development. Review articles provided comprehensive overviews of the development and applications of deep learning [ 1 , 3 ], with 1 focusing on applications to medical image analysis [ 4 ]. We summarize the 7 application (denoted by A ) or development (denoted by D ) studies in Table 4 .

Table 4. Content analysis matrix of the highly cited references in the application or development category.

Category | Citation count, n | Research topic: task type | Objectives | Methods (deep learning architectures)
A1 [ ] | 53 | Diagnostic image analysis: classification | Apply CNN to classifying skin lesions from clinical images | Inception version 3 fine-tuned end to end with images; tested against dermatologists on 2 binary classifications
A2 [ ] | 51 | Diagnostic image analysis: classification | Apply CNN to detecting referrable diabetic retinopathy on retinal fundus images | Inception version 3 trained and validated using 2 data sets of images graded by ophthalmologists
D1 [ ] | 34 | Computer science | Develop a new gradient-based RNN to solve error backflow problems | LSTM achieved constant error flow through memory cells regulated by gate units; tested numerous times against other methods
D2 [ ] | 33 | Sequence analysis: binding (variant effects) prediction | Propose a predictive model for sequence specificities of DNA- and RNA-binding proteins | CNN-based DeepBind trained fully automatically through parallel implementation to predict and visualize binding specificities and variation effects
A3 [ ] | 27 | Diagnostic image analysis: classification | Evaluate factors of using CNNs for thoracoabdominal lymph node detection and interstitial lung disease classification | Compare performances of AlexNet, CifarNet, and GoogLeNet trained with transfer learning and different data set characteristics
D3 [ ] | 23 | Sequence analysis: chromatin profiles (variant effects) prediction | Propose a model for predicting noncoding variant effects from genomic sequence | CNN-based DeepSEA trained for chromatin profile prediction to estimate variant effects with single nucleotide sensitivity and prioritize functional variants
A4 [ ] | 23 | Diagnostic image analysis: classification | Evaluate CNNs for tuberculosis detection on chest radiographs | Compare performances of AlexNet and GoogLeNet and ensemble of 2 trained with transfer learning, augmented data set, and radiologist-augmented approach

a CNN: convolutional neural network.

b RNN: recurrent neural network.

c LSTM: long short-term memory.

In these studies, excluding the study by Hochreiter and Schmidhuber [ 135 ], whose research topic pertained to computer science, deep learning was used for diagnostic image analysis of various areas [ 12 - 14 , 136 ] and for sequence analysis of proteins [ 21 ] or genomes [ 22 ]. The main architectures implemented to achieve the different research objectives mostly comprised CNNs [ 12 - 14 , 136 ] or CNN-based novel models [ 21 , 22 ] and RNNs [ 135 ]. The findings indicated that these deep neural networks either outperformed previous methods or achieved a performance comparable with that of human experts.

Principal Findings

With the increase in biomedical research using deep learning techniques, we aimed to gain a quantitative and qualitative understanding of the scientific domain, as reflected in the published literature. For this purpose, we conducted a scientometric analysis of deep learning studies in biomedicine.

Through the metadata and content analyses of bibliographic records, we identified the current leading fields and research topics, the most prominent being radiology and medical imaging. Other biomedical fields that have led this domain included biomedical engineering, mathematical and computational biology, and biochemical research methods. As part of interdisciplinary research, computer science and electrical engineering were important fields as well. The major research topics that were studied included computer-assisted image interpretation and diagnosis (which involved localizing or segmenting certain areas for classifying or predicting diseases), image processing such as medical image reconstruction or registration, and sequence analysis of proteins or RNA to understand protein structure and discover or design drugs. These topics were particularly prevalent in their application to neoplasms.

Furthermore, although deep learning techniques that had been proposed for these themes were predominantly CNN based, different architectures are preferred for different research tasks. The findings showed that CNN-based models mostly focused on analyzing medical image data, with RNN architectures for sequential data analysis and AEs for unsupervised dimensionality reduction yet to be actively explored. Other deep learning methods, such as deep belief networks [ 137 , 138 ], deep Q network [ 139 ], and dictionary learning [ 140 ], have also been applied to biomedical research but were excluded from the content analysis because of low citation count. As deep learning is a rapidly evolving field, future biomedical researchers should pay attention to the emerging trends and keep aware of state-of-the-art models for enhanced performance, such as transformer-based models, including bidirectional encoder representations from transformers for NLP [ 141 ]; wav2vec for speech recognition [ 142 ]; and the Swin transformer for computer vision tasks of image classification, segmentation, and object detection [ 143 ].
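As a quick, hedged example of trying one of these transformer-based models, the sketch below assumes the Hugging Face transformers package and a public BERT checkpoint; the masked sentence is only illustrative.

# Minimal sketch of loading a transformer model (Hugging Face `transformers` assumed).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill_mask("Deep learning is widely used in medical [MASK] analysis."):
    print(prediction["token_str"], round(prediction["score"], 3))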

The findings from the analysis of the cited references revealed patterns of knowledge diffusion. In the analysis, radiology and medical imaging appeared to be the most significant knowledge source and an important field in the knowledge diffusion network. Relatedly, we identified knowledge exporters to this field, including biomedical engineering, electrical engineering, and computer science, as important, despite their relatively low citation counts. Furthermore, citation patterns revealed clique-like relationships among the four fields—biochemical research methods, biochemistry and molecular biology, biotechnology and applied microbiology, and mathematical and computational biology—with each being a source of knowledge and diffusion for the others.

Beyond knowledge diffusion, knowledge integration was also encouraged through collaboration among authors from different organizations and academic disciplines. Coauthorship analysis revealed active research collaboration between universities and hospitals and between hospitals and companies. Separately, we identified an engineering-oriented cluster and biomedicine-oriented clusters of disciplines, among which we observed a range of disciplinary collaborations; the most prominent were those linking radiology and medical imaging with computer science and with electrical engineering, the 3 disciplines most involved in publishing and collaboration. Meanwhile, pathology and public health showed a high ratio of collaborative research to publications, whereas computational biology showed a low collaborative ratio.

Limitations

This study has the following limitations that may have affected data analysis and interpretation. First, focusing only on published studies may have underrepresented the field. Second, publication data were only retrieved from PubMed; although PubMed is one of the largest databases for biomedical literature, other databases such as DBLP (DataBase systems and Logic Programming) may also include relevant studies. Third, the use of PubMed limited our data to biomedical journals and proceedings. Given that deep learning is an active research area in computer science, computer science conference articles are valuable sources of data that were not considered in this study. Finally, our current data retrieval strategy involved searching deep learning as the major MeSH term, which increased precision but may have omitted relevant studies that were not explicitly tagged as deep learning. We plan to expand our scope in future work to consider other bibliographic databases and search terms as well.
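As an illustration of this kind of MeSH-based retrieval, the sketch below assumes the Biopython package and NCBI's E-utilities; the e-mail address is a placeholder that NCBI requires you to replace with your own.

# Minimal sketch of a MeSH-based PubMed query (Biopython assumed).
from Bio import Entrez

Entrez.email = "[email protected]"   # placeholder; NCBI requires a real contact address
handle = Entrez.esearch(db="pubmed",
                        term='"Deep Learning"[MeSH Major Topic]',
                        retmax=20)
record = Entrez.read(handle)
handle.close()
print("Total matches:", record["Count"])
print("First PMIDs:", record["IdList"])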

In this study, we investigated the landscape of deep learning research in biomedicine and identified major research topics, influential works, knowledge diffusion, and research collaboration through scientometric analyses. The results showed a predominant focus on research applying deep learning techniques, especially CNNs, to radiology and medical imaging and confirmed the interdisciplinary nature of this domain, especially between engineering and biomedical fields. However, diverse biomedical applications of deep learning in the fields of genetics and genomics, medical informatics focusing on text or speech data, and signal processing of various activities (eg, brain, heart, and human) will further boost the contribution of deep learning in addressing biomedical research problems. As such, although deep learning research in biomedicine has been successful, we believe that there is a need for further exploration, and we expect the results of this study to help researchers and communities better align their present and future work.

Abbreviations

AE: autoencoder
CNN: convolutional neural network
FCNN: fully convolutional neural network
MeSH: Medical Subject Heading
NLP: natural language processing
ResNet: residual neural network
RNN: recurrent neural network
WoS: Web of Science

Authors' Contributions: SN and YZ designed the study. SN, DK, and WJ analyzed the data. SN took the lead in the writing of the manuscript. YZ supervised and implemented the study. All authors contributed to critical edits and approved the final manuscript.

Conflicts of Interest: None declared.

RESEARCH PROPOSAL DEEP LEARNING

Get your deep learning proposal prepared by highly trained professionals. Your passion for your areas of interest will be clearly reflected in your proposal. Choose an expert to provide you with customized research proposal work. To understand the current state of the art, its historical context, and its future scope, we conduct a literature survey in Deep Learning (DL) as outlined below.

  • Define Objectives:
  • Clearly outline what we need to achieve with our review.
  • For example, take transformers in Natural Language Processing (NLP) and note their specific tasks and issues.
  • Primary Sources:
  • Research Databases: We can use databases such as Google Scholar, arXiv, PubMed (for biomedical papers), IEEE Xplore, and others.
  • Conferences: NeurIPS, ICML, ICLR, CVPR, ICCV, ACL, and EMNLP are the core conferences in DL.
  • Journals: The Journal of Machine Learning Research (JMLR) and Neural Computation frequently publish DL-related studies.
  • Start with Reviews and Surveys:
  • Find the latest survey and review papers in our area of interest; they give an outline of the literature and frequently point to seminal and recent works.
  • For instance, begin with a survey of Convolutional Neural Network (CNN) architectures if we are searching for CNN research.
  • Reading Papers:
  • Skim: Begin by reading abstracts, introductions, conclusions, and figures.
  • Deep Dive: When a study is highly similar to our work, look in depth at its methodology, experiments, and results.
  • Take Notes: Note the key ideas, methods, datasets, evaluation metrics, and open issues described in each paper.
  • Forward and Backward Search:
  • Forward: We can see how the area is evolving using tools such as Google Scholar's "Cited by" feature to find the latest papers that build on a given work.
  • Backward: We can track the development of designs by following the references that give more background on our study.
  • Organize and Combine:
  • Classify the papers by theme, methodology, and chronology.
  • Analyze the trends, patterns, and gaps in the literature.
  • Keep Updated:
  • Set up alerts on platforms such as Google Scholar and arXiv for keywords related to our topic, because DL is a fast-moving area (see the sketch after this list for a simple arXiv keyword feed).
  • Tools and Platforms:
  • Use tools such as Mendeley, Zotero, and EndNote for managing and citing papers.
  • Find related papers through the AI-driven suggestions of the Semantic Scholar platform.
  • Engage with the Community:
  • Join mailing lists, social media groups, and online communities related to DL. Websites such as Reddit's r/MachineLearning or the AI Alignment Forum frequently surface the latest papers.
  • Attending webinars, workshops, and meetings regularly helps us learn recent techniques and understand what the community considers essential.
  • Report and Share:
  • To document the research, prepare annotated bibliographies, presentations, or review papers based on our objective.
  • Sharing our findings helps others and establishes us as knowledgeable in the topic.
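For the "Keep Updated" step, a simple way to pull the newest arXiv papers for a keyword is the public arXiv Atom API. The sketch below assumes the third-party feedparser package is installed, and the query string is only an example.

# Minimal sketch: fetch the latest arXiv papers matching a keyword query.
import urllib.parse
import urllib.request
import feedparser   # third-party package, assumed installed

query = urllib.parse.quote('all:"deep learning" AND all:"medical imaging"')
url = ("http://export.arxiv.org/api/query?search_query=" + query +
       "&sortBy=submittedDate&sortOrder=descending&max_results=5")

with urllib.request.urlopen(url) as response:
    feed = feedparser.parse(response.read())

for entry in feed.entries:
    print(entry.published[:10], entry.title.replace("\n", " "))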

            The objective of this review is to critically identify and integrate the current state of the art in the area. Though it is time-consuming work, it will be useful for anyone who aims to conduct research and follow the latest developments in DL.

Deep Learning project face recognition with python OpenCV

            Designing a face recognition system using Python and OpenCV is a rewarding project that introduces us to the world of computer vision and DL. The following is a step-by-step guide to constructing a simple face recognition system:

  • Install Necessary Libraries

Make sure that we have the required libraries installed:

pip install opencv-contrib-python numpy pillow   # the LBPH recognizer (cv2.face) ships in the contrib build

  • Capture Faces

We require a dataset for training. We can either use a predefined dataset or capture our own images with OpenCV, as in the following script:

import cv2
import os

os.makedirs('faces', exist_ok=True)   # directory where the training images are stored

cam = cv2.VideoCapture(0)
detector = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

face_id = input('Enter user ID: ')    # numeric label used later for training
sampleNum = 0

while True:
    ret, img = cam.read()
    if not ret:
        break
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in faces:
        sampleNum += 1
        # save the detected face region in grayscale as faces/User.<id>.<n>.jpg
        cv2.imwrite(f"faces/User.{face_id}.{sampleNum}.jpg", gray[y:y+h, x:x+w])
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
        cv2.waitKey(100)
    cv2.imshow('Capture', img)
    cv2.waitKey(1)
    if sampleNum >= 20:   # stop after 20 face samples
        break

cam.release()
cv2.destroyAllWindows()

  • Training the Recognizer

OpenCV has a built-in face recognizer. For this example, we’ll use the LBPH (Local Binary Pattern Histogram) face recognizer.

import os
import cv2
import numpy as np
from PIL import Image

path = 'faces'
os.makedirs('trainer', exist_ok=True)   # directory for the trained model file

recognizer = cv2.face.LBPHFaceRecognizer_create()   # requires opencv-contrib-python
detector = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

def getImagesAndLabels(path):
    imagePaths = [os.path.join(path, f) for f in os.listdir(path)]
    faceSamples = []
    ids = []
    for imagePath in imagePaths:
        PIL_img = Image.open(imagePath).convert('L')            # convert to grayscale
        img_numpy = np.array(PIL_img, 'uint8')
        face_id = int(os.path.split(imagePath)[-1].split(".")[1])   # parse User.<id>.<n>.jpg
        faces = detector.detectMultiScale(img_numpy)
        for (x, y, w, h) in faces:
            faceSamples.append(img_numpy[y:y+h, x:x+w])
            ids.append(face_id)
    return faceSamples, np.array(ids)

faces, ids = getImagesAndLabels(path)
recognizer.train(faces, ids)
recognizer.save('trainer/trainer.yml')

  • Recognizing Faces

import cv2

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read('trainer/trainer.yml')
cascadePath = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
faceCascade = cv2.CascadeClassifier(cascadePath)
font = cv2.FONT_HERSHEY_SIMPLEX

cam = cv2.VideoCapture(0)
minW = 0.1 * cam.get(3)   # minimum face width: 10% of the frame width
minH = 0.1 * cam.get(4)   # minimum face height: 10% of the frame height

while True:
    ret, img = cam.read()
    if not ret:
        break
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.2,
        minNeighbors=5,
        minSize=(int(minW), int(minH)),
    )
    for (x, y, w, h) in faces:
        face_id, confidence = recognizer.predict(gray[y:y+h, x:x+w])
        # LBPH returns a distance: lower means a closer match
        if confidence < 100:
            confidence = f"  {round(100 - confidence)}%"
        else:
            face_id = "unknown"
            confidence = f"  {round(100 - confidence)}%"
        cv2.putText(img, str(face_id), (x+5, y-5), font, 1, (255, 255, 255), 2)
        cv2.putText(img, str(confidence), (x+5, y+h-5), font, 1, (255, 255, 0), 1)
    cv2.imshow('Face Recognition', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cam.release()
cv2.destroyAllWindows()

Make sure the faces and trainer directories exist before running these scripts. This is a basic face recognition system; it can be strengthened with DL models for better accuracy and robustness under varied real-time conditions. To achieve better accuracy in real-world settings, explore recent DL-based techniques such as FaceNet or pre-trained models from DL frameworks, as sketched below.
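As a rough illustration of the pre-trained-model route, the following sketch assumes the third-party face_recognition package (which wraps dlib's deep metric-learning face embeddings); the image file names are placeholders for your own data, and it assumes each image contains at least one detectable face.

# Hedged sketch: compare faces with 128-d deep embeddings (face_recognition assumed).
import face_recognition

known_image = face_recognition.load_image_file("faces/known_person.jpg")   # placeholder path
query_image = face_recognition.load_image_file("faces/query.jpg")          # placeholder path

known_encoding = face_recognition.face_encodings(known_image)[0]   # 128-d embedding
query_encodings = face_recognition.face_encodings(query_image)

for encoding in query_encodings:
    match = face_recognition.compare_faces([known_encoding], encoding, tolerance=0.6)[0]
    distance = face_recognition.face_distance([known_encoding], encoding)[0]
    print("match:", match, "distance:", round(float(distance), 3))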

Deep learning MS Thesis topics

Have a conversation with our faculty members to get the topics that best match your interests. Some unique topic ideas are shared below; contact us for more support.

RESEARCH PROPOSAL DEEP LEARNING BRILLIANT PROJECT IDEAS

  • Modulation Recognition based on Incremental Deep Learning
  • Fast Channel Analysis and Design Approach using Deep Learning Algorithm for 112Gbs HSI Signal Routing Optimization
  • Deep Learning of Process Data with Supervised Variational Auto-encoder for Soft Sensor
  • Methodological Principles for Deep Learning in Software Engineering
  • Recent Trends in Deep Learning for Natural Language Processing and Scope for Asian Languages
  • Adding Context to Source Code Representations for Deep Learning
  • Weekly Power Generation Forecasting using Deep Learning Techniques: Case Study of a 1.5 MWp Floating PV Power Plant
  • A Study of Deep Learning Approaches and Loss Functions for Abundance Fractions Estimation
  • A Trustless Federated Framework for Decentralized and Confidential Deep Learning
  • Research on Financial Data Analysis Based on Applied Deep Learning in Quantitative Trading
  • A Deep Learning model for day-ahead load forecasting taking advantage of expert knowledge
  • Locational marginal price forecasting using Transformer-based deep learning network
  • H-Stegonet: A Hybrid Deep Learning Framework for Robust Steganalysis
  • Comparison of Deep Learning Approaches for Sentiment Classification
  • An Unmanned Network Intrusion Detection Model Based on Deep Reinforcement Learning
  • Indoor Object Localization and Tracking Using Deep Learning over Received Signal Strength
  • Analysis of Deep Learning 3-D Imaging Methods Based on UAV SAR
  • Research and improvement of deep learning tool chain for electric power applications
  • Hybrid Intrusion Detector using Deep Learning Technique
  • Non-Trusted user Classification-Comparative Analysis of Machine and Deep Learning Approaches


