a OCT: optical coherence tomography.
b CNN: convolutional neural network.
c MRI: magnetic resonance imaging.
d WSI: whole slide image.
e CAE: convolutional autoencoder.
f ResNet: residual neural network.
g CT: computed tomography.
h DTI: diffusion tensor imaging.
i mCNN: multicolumn convolutional neural network.
j FCNN: fully convolutional neural network.
k SAE: stacked autoencoder.
l CAD: coronary artery disease.
m SWE: shear wave elastography.
n MIL: multiple instance learning.
o FFNN: feedforward neural network.
p MR: magnetic resonance.
q GAN: generative adversarial network.
r SMILES: simplified molecular input line-entry system.
s RNN: recurrent neural network.
t GRU: gated recurrent unit.
u LSTM: long short-term memory.
v AE: autoencoder.
w AAE: adversarial autoencoder.
x NLP: natural language processing.
y BLSTM: bidirectional long short-term memory.
In these studies, researchers applied or developed deep learning architectures mainly for image analysis, especially for diagnostic purposes, including the classification or prediction of diseases or survival and the detection, localization, or segmentation of certain areas or abnormalities. These 3 tasks all aim to identify the location of an object of interest but differ in scope: detection involves a single reference point, localization involves an area identified through a bounding box, saliency map, or heatmap, and segmentation involves a precise area with clear outlines identified through pixel-wise analysis. Meanwhile, some studies proposed models for image analysis unrelated to diagnosis, such as classifying or segmenting cells in microscopic images and tracking moving animals in videos through pose estimation. Another major objective involved image processing for reconstructing or registering medical images, which included enhancing low-resolution images to high resolution, reconstructing images with different modalities or synthesized targets, reducing artifacts, dealiasing, and aligning medical images.
Meanwhile, several researchers used deep learning architectures to analyze molecules, proteins, and genomes for various purposes. These included drug design or discovery, specifically generating novel molecular structures through sequence analysis and predicting binding affinities through image analysis of complexes; understanding protein structure through image analysis of contact matrices; and predicting phenotypes, cancer survival, drug synergies, and genomic variant effects from genes or genomes. Finally, in some studies, deep learning was applied to the diagnostic classification of sequential data, including electrocardiogram or polysomnogram signals and electronic health records. In summary, in the reviewed literature, we identified a predominant focus on applying or developing deep learning models for image analysis (localization or diagnosis) and image processing, with a few studies focusing on protein or genome analysis.
Regarding the main architectures, most were CNNs based on ≥1 established CNN architecture, such as the fully convolutional neural network (FCNN) and its variants, including U-net; the residual neural network (ResNet) and its variants; GoogLeNet (Inception v1) or Inception; VGGNet and its variants; and other architectures. Meanwhile, a few researchers based their models on feedforward neural networks that were not CNNs, including autoencoders (AEs) such as the convolutional AE and stacked AE. Others adapted RNNs, including (bidirectional) long short-term memory and gated recurrent units. Furthermore, models combining RNNs or AEs with CNNs were also proposed.
Content analysis of the reviewed literature showed that different deep learning architectures were used for different research tasks. Models for classification or prediction tasks using images were predominantly CNN based, with most being ResNet and GoogLeNet or Inception. ResNet with shortcut connections [129] and GoogLeNet or Inception with 1×1 convolutions, factorized convolutions, and regularizations [130,131] allow networks of increased depth and width by mitigating problems such as vanishing gradients and high computational cost. These models mostly analyzed medical images from magnetic resonance imaging or computed tomography, with cancer-related images often used as input data for diagnostic classification, in addition to image-like representations of protein complexes. Meanwhile, when applying these tasks to data other than images, such as genomic or gene expression profiles and protein sequence matrices, researchers used feedforward neural networks, including AEs, that enabled semi- or unsupervised learning and dimensionality reduction.
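To make the shortcut connection concrete, the following is a minimal residual block sketched in PyTorch; the framework, layer sizes, and channel counts are our illustrative assumptions, not those of any reviewed model. The block adds its unchanged input back onto the output of two convolutions, which is what keeps gradients flowing through very deep networks:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # One residual block: two 3x3 convolutions plus a shortcut (identity) connection.
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)  # shortcut connection adds the input back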
Image analysis for segmentation and image processing were also achieved through CNN-based architectures, most of them FCNNs, especially U-net. FCNNs produce an input-sized pixel-wise prediction by replacing the last fully connected layers with convolution layers, making them advantageous for the abovementioned tasks [132], and U-net improves performance through long skip connections that concatenate feature maps from the encoder path to the decoder path [133]. In particular, for medical image processing tasks, a few researchers combined FCNNs (U-net) with other CNNs by adopting the generative adversarial network structure, which generates new instances that mimic the real data through an adversarial process between the generator and discriminator [134]. We found that images of the brain were often used as input data for these studies.
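As an illustration of these long skip connections, the following is a minimal, hypothetical encoder-decoder sketch in PyTorch (the depth and channel counts are our own choices, not those of any reviewed model). The decoder concatenates the upsampled bottleneck features with the encoder features before a final 1×1 convolution that produces an input-sized, pixel-wise prediction:

import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    # Minimal U-net-style network with a single long skip connection.
    def __init__(self, in_channels=1, out_channels=1):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.upsample = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)
        self.decoder = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, out_channels, kernel_size=1)  # pixel-wise prediction

    def forward(self, x):
        encoded = self.encoder(x)
        bottleneck = self.bottleneck(self.pool(encoded))
        upsampled = self.upsample(bottleneck)
        merged = torch.cat([upsampled, encoded], dim=1)  # long skip connection
        return self.head(self.decoder(merged))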
On the other hand, RNNs were applied to sequence analysis of the string representation of molecules (simplified molecular input line-entry system) and to pattern analysis of sequential data such as signals. A few of these models, especially those generating novel molecular structures, combined RNNs with CNNs by adopting generative adversarial networks, including adversarial AEs. In summary, the findings showed that current deep learning models were predominantly CNN based, most of them focusing on medical image data, with different architectures preferred for specific tasks.
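To illustrate how an RNN handles SMILES strings, here is a minimal character-level LSTM language model in PyTorch; the vocabulary size and dimensions are hypothetical, and this sketch is not any of the generative models reviewed above. Such a model predicts the next character of a SMILES string, and sampling from it character by character yields candidate molecular strings:

import torch
import torch.nn as nn

class SmilesLSTM(nn.Module):
    # Character-level language model over SMILES strings.
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        x = self.embed(tokens)            # (batch, sequence, embed_dim)
        out, state = self.lstm(x, state)  # (batch, sequence, hidden_dim)
        return self.head(out), state      # logits over the next character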
Among these studies, Table 3 details the objectives and proposed methods of the 35 studies that developed novel models.
Content analysis of the top 35 records in the development category.
Number | Development objectives | Methods (proposed model) |
D1 | Segment brain anatomical structures in 3D MRI | Voxelwise Residual Network: trained through residual learning of volumetric feature representation and integrated with contextual information of different modalities and levels |
D2 | Estimate poses to track body parts in various animal behaviors | DeeperCut’s subset DeepLabCut: network fine-tuned on labeled body parts, with deconvolutional layers producing spatial probability densities to predict locations |
D3 | Predict isocitrate dehydrogenase 1 mutation in low-grade glioma with MRI radiomics analysis | Deep learning–based radiomics: segment tumor regions and directly extract radiomics image features from the last convolutional layer, which is encoded for feature selection and prediction |
D4 | Predict protein-ligand binding affinities represented by 3D descriptors | KDEEP: 3D network to predict binding affinity using voxel representation of protein-ligand complex with assigned property according to its atom type |
D5 | Predict phenotype from genotype through the biological hierarchy of cellular subsystems | DCell: visible neural network with structure following cellular subsystem hierarchy to predict cell growth phenotype and genetic interaction from genotype |
D6 | Classify and localize thoracic diseases in chest radiographs | DenseNet-based CheXNeXt: networks trained for each pathology to predict its presence and ensemble and localize indicative parts using class activation mappings |
D7 | Multi-classification of breast cancer from histopathological images | CSDCNN: trained through end-to-end learning of hierarchical feature representation and optimized feature space distance between breast cancer classes |
D8 | Interactive segmentation of 2D and 3D medical images fine-tuned on a specific image | Bounding box and image-specific fine-tuning–based segmentation: trained for interactive image segmentation using bounding box and fine-tuned for specific image with or without scribble and weighted loss function |
D9 | Facial image analysis for identifying phenotypes of genetic syndromes | DeepGestalt: preprocessed for face detection and multiple regions and extracts phenotype to predict syndromes per region and aggregate probabilities for classification |
D10 | Predict cancer outcomes with genomic profiles through survival models optimization | SurvivalNet: deep survival model with high-dimensional genomic input and Bayesian hyperparameter optimization, interpreted using risk backpropagation |
D11 | Predict synergy effect of novel drug combinations for cancer treatment | DeepSynergy: predicts drug synergy value using cancer cell line gene expressions and chemical descriptors, which are normalized and combined through conic layers |
D12 | Classify liver fibrosis stages in chronic hepatitis B using radiomics of SWE | DLRE: predict the probability of liver fibrosis stages with quantitative radiomics approach through automatic feature extraction from SWE images |
D13 | Predict protein residue contact map at pixel level with protein features | RaptorX-Contact: combined networks to learn contact occurrence patterns from sequential and pairwise protein features to predict contacts simultaneously at pixel level |
D14 | Segment liver and tumor in abdominal CT scans | Hybrid Densely connected U-net: 2D and 3D networks to extract intra- and interslice features with volumetric contexts, optimized through hybrid feature fusion layer |
D15 | Reconstruct compressed sensing MRI to dealiased image | DAGAN: conditional GAN stabilized by refinement learning, with content loss combined with adversarial loss incorporating frequency-domain data |
D16 | Reconstruct sparse localization microscopy to superresolution image | Artificial Neural Network Accelerated–Photoactivated Localization Microscopy: trained with superresolution PALM as the target, compares reconstructed and target with loss functions containing conditional GAN |
D17 | Generate novel chemical compound design with desired properties | Reinforcement Learning for Structural Evolution: generate chemically feasible molecule as strings and predict its property, which is integrated with reinforcement learning to bias the design |
D18 | Reduce metal artifacts in reconstructed x-ray CT images | CNN-based Metal Artifact Reduction: trained on images processed by other Metal Artifact Reduction methods and generates prior images through tissue processing and replaces metal-affected projections |
D19 | Predict species to identify anthrax spores in single cell holographic images | HoloConvNet: trained with raw holographic images to directly recognize interspecies difference through representation learning using error backpropagation |
D20 | Classify and detect malignant pulmonary nodules in chest radiographs | Deep learning–based automatic detection: predict the probability of nodules per radiograph for classification and detect nodule location per nodule from activation value |
D21 | Predict tissue-specific gene expression and genomic variant effects on the expression | ExPecto: predict regulatory features from sequences and transform to spatial features and use linear models to predict tissue-specific expression and variant effects |
D22 | Reconstruct MRF to obtain tissue parameter maps | Deep reconstruction network: trained with a sparse dictionary that maps magnitude image to quantitative tissue parameter values for MRF reconstruction |
D23 | Generate high-resolution Hi-C interaction matrix of chromosomes from a low-resolution matrix | HiCPlus: predict high-resolution matrix through mapping regional interaction features of low-resolution to high-resolution submatrices using neighboring regions |
D24 | Estimate poses to track body parts of freely moving animals | LEAP: videos preprocessed for egocentric alignment and body parts labeled using GUI and predicts each location by confidence maps with probability distributions |
D25 | Jointly segment optic disc and cup in fundus images for glaucoma screening | M-Net: multi-scale network for generating multi-label segmentation prediction maps of disc and cup regions using polar transformation |
D26 | Reconstruct limited-view PAT to high-resolution 3D images | Deep gradient descent: learned iterative image reconstruction, incorporated with gradient information of the data fit separately computed from training |
D27 | Predict classifications of and localize knee injuries from MRI | MRNet: networks trained for each diagnosis according to a series to predict its presence and combine probabilities for classification using logistic regression |
D28 | Predict binding affinities between 3D structures of protein-ligand complexes | Pafnucy: structure-based prediction using 3D grid representation of molecular complexes with different orientations as having same atom types |
D29 | Classify electrocardiogram signals based on wavelet transform | Deep bidirectional LSTM network–based wavelet sequences: generate decomposed frequency subbands of electrocardiogram signal as sequences by wavelet-based layer and use as input for classification |
D30 | Generate novel small molecule structures with possible biological activity | Reinforced Adversarial Neural Computer: combined with GAN and reinforcement learning, generates sequences matching the key feature distributions in the training molecule data |
D31 | Detect and localize breast cancer metastasis in digitized lymph nodes slides | LYmph Node Assistant: predict the likelihood of tumor in tissue area and generate a heat map for slides identifying likely areas |
D32 | Transform low-resolution thick slice knee MRI to high-resolution thin slices | DeepResolve: trained to compute residual images, which are added to low-resolution images to generate their high-resolution images |
D33 | Reconstruct sparse-view CT to suppress artifact and preserve feature | Learned Experts’ Assessment–Based Reconstruction Network: iterative reconstruction using previous compressive sensing methods, with fields of expert-applied regularization terms learned iteration dependently |
D34 | Unsupervised affine and deformable aligning of medical images | Deep Learning Image Registration: multistage registration network and unsupervised training to predict transformation parameters using image similarity and create warped moving images |
D35 | Classify subcellular localization patterns of proteins in microscopy images | Localization Cellular Annotation Tool: predict localization per cell for image-based classification of multi-localizing proteins, combined with gamer annotations for transfer learning |
a MRI: magnetic resonance imaging.
b CSDCNN: class structure-based deep convolutional neural network.
c SWE: shear wave elastography.
d DLRE: deep learning radiomics of elastography.
e CT: computed tomography.
f DAGAN: Dealiasing Generative Adversarial Networks.
g GAN: generative adversarial network.
h PALM: photoactivated localization microscopy.
i CNN: convolutional neural network.
j MRF: magnetic resonance fingerprinting.
k LEAP: LEAP Estimates Animal Pose.
l GUI: graphical user interface.
m PAT: photoacoustic tomography.
n LSTM: long short-term memory.
In quite a few of the reviewed studies, the black box problem of deep learning was partly addressed, as researchers implemented various methods to improve model interpretability. To understand the prediction results of image analysis models, most used one of the following two techniques to visualize the important regions: (1) activation-based heatmaps [45,54,65,70], especially class activation maps [57,61,77,92], and saliency maps [59] and (2) occlusion testing [39,75,82,94]. For models analyzing data other than images, there were no generally accepted techniques for model interpretation, and researchers suggested some methods, including adopting an interpretable hierarchical structure such as the cellular subsystem [122] or anatomical division [125], using backpropagation [123], observing gate activations of cells in the neural network [114], or investigating how corrupted input data affect the prediction and how identical predictions are made for different inputs [93]. As such, various methods were found to be used to tackle this well-known limitation of deep learning.
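As one concrete example of these techniques, the sketch below outlines occlusion testing in plain NumPy; it is an illustrative outline under our own assumptions (the predict_fn callable, patch size, and stride are hypothetical), not the procedure of any specific reviewed study. A patch is slid across the image, and the drop in the model's predicted probability for the target class at each position forms a heatmap of important regions:

import numpy as np

def occlusion_map(predict_fn, image, patch=16, stride=8, fill=0.0):
    # predict_fn: callable returning the model's probability for the target class.
    height, width = image.shape[:2]
    baseline = predict_fn(image)
    rows = (height - patch) // stride + 1
    cols = (width - patch) // stride + 1
    heatmap = np.zeros((rows, cols))
    for i, y in enumerate(range(0, height - patch + 1, stride)):
        for j, x in enumerate(range(0, width - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill  # mask out one patch
            heatmap[i, j] = baseline - predict_fn(occluded)  # large drop = important region
    return heatmap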
On average, each examined deep learning study with at least one PubMed-indexed citation (429/978, 43.9%) had 25.8 (SD 20.0) citations. These cited references comprised 9373 unique records that were cited 1.27 times on average (SD 2.16). After excluding the records unindexed in the WoS Core Collection (8.06% of the unique records), an average of 1.77 (SD 1.07) categories were assigned to each of the remaining 8618 records. The top 10 WoS categories, which were assigned to the greatest number of total cited references, pertained to the following 3 major groups: (1) biomedicine (Radiology, Nuclear Medicine, and Medical Imaging: 2025/11,033, 18.35%; Biochemical Research Methods: 1118/11,033, 10.13%; Mathematical and Computational Biology: 1066/11,033, 9.66%; Biochemistry and Molecular Biology: 1043/11,033, 9.45%; Engineering, Biomedical: 981/11,033, 8.89%; Biotechnology and Applied Microbiology: 916/11,033, 8.3%; Neurosciences: 844/11,033, 7.65%), (2) computer science and engineering (Computer Science, Interdisciplinary Applications: 1041/11,033, 9.44%; Engineering, Electrical and Electronic: 645/11,033, 5.85%), and (3) Multidisciplinary Sciences (1411/11,033, 12.79%).
To understand the intellectual structure of how knowledge is transferred among different areas of study through citations, we visualized the citation network of WoS subject categories. In the directed citation network shown in Figure 5, the edges were directed clockwise with the source nodes as the WoS categories of the deep learning studies we examined and the target nodes as the WoS categories of the cited references from which knowledge was obtained. To enhance legibility, we filtered out categories with <100 weighted degrees, excluding self-loops, to form a network of 20 nodes (20/158, 12.7% of the total) and 59 edges (59/2380, 2.48% of the total). In the figure, the node color and size are proportional to the PageRank score (probability 0.85; ε=0.001; Figure 5A) and weighted outdegree (Figure 5B), and the edge size and color are proportional to the link strength. PageRank considers not only the quantity but also the quality of incoming edges, identifying important exporters for knowledge diffusion based on how often and by which fields a node is cited. On the other hand, the weighted outdegree measures outgoing edges and identifies major knowledge importers that frequently cite other fields.
Citation network of the Web of Science subject categories assigned to the reviewed publications and their cited references according to (A) PageRank and (B) weighted outdegree (number of nodes=20; number of edges=59).
As depicted in Figure 5A, categories with high PageRank scores mostly coincided with the frequently cited fields identified above and were grouped into 2 communities through modularity (upper half and lower half). The upper half region centered on Radiology, Nuclear Medicine, and Medical Imaging, which had the highest PageRank score (0.191) and proved to be a field with a significant influence on deep learning studies in biomedicine. Meanwhile, important knowledge exporters to this field included Engineering, Biomedical (0.134); Engineering, Electrical and Electronic (0.110); and Computer Science, Interdisciplinary Applications (0.091). The lower half region mainly comprised categories with comparable PageRank scores in which knowledge was frequently exchanged between one another, including Biochemical Research Methods (0.053), Multidisciplinary Sciences (0.053), Biochemistry and Molecular Biology (0.052), Biotechnology and Applied Microbiology (0.050), and Mathematical and Computational Biology (0.048). Specifically, in Figure 5B, Mathematical and Computational Biology (1992), Biotechnology and Applied Microbiology (1836), and Biochemical Research Methods (1807) were identified as major knowledge importers with the highest weighted outdegrees, whereas Biochemistry and Molecular Biology (344) had a relatively low weighted outdegree, indicating its role as a source of knowledge for these fields.
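For readers who wish to reproduce this kind of analysis, the following is a minimal sketch using the networkx Python library on a toy citation graph; the category names and edge weights here are invented placeholders, not our actual data. PageRank (damping 0.85, tolerance 0.001, as above) scores each category by how often and by which fields it is cited, whereas the weighted outdegree counts how often a category cites others:

import networkx as nx

# Toy directed citation network: edges point from the citing field to the cited field,
# and the weights are hypothetical citation counts.
edges = [
    ("Radiology, Nuclear Medicine, and Medical Imaging", "Engineering, Biomedical", 120),
    ("Radiology, Nuclear Medicine, and Medical Imaging", "Computer Science, Interdisciplinary Applications", 90),
    ("Mathematical and Computational Biology", "Biochemical Research Methods", 150),
    ("Biotechnology and Applied Microbiology", "Biochemistry and Molecular Biology", 80),
]
graph = nx.DiGraph()
graph.add_weighted_edges_from(edges)

# High PageRank identifies knowledge exporters (frequently cited fields).
pagerank_scores = nx.pagerank(graph, alpha=0.85, tol=0.001, weight="weight")
# High weighted outdegree identifies knowledge importers (fields that often cite others).
weighted_outdegree = dict(graph.out_degree(weight="weight"))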
We analyzed the 10 most frequently cited studies to gain an in-depth understanding of the most influential works and assigned these papers to one of the three categories: review, application, or development. Review articles provided comprehensive overviews of the development and applications of deep learning [1,3], with 1 focusing on applications to medical image analysis [4]. We summarize the 7 application (denoted by A) or development (denoted by D) studies in Table 4.
Content analysis matrix of the highly cited references in the application or development category.
Category | Citation count, n | Research topic: task type | Objectives | Methods (deep learning architectures) |
A1 [ ] | 53 | Diagnostic image analysis: classification | Apply CNN to classifying skin lesions from clinical images | Inception version 3 fine-tuned end to end with images; tested against dermatologists on 2 binary classifications |
A2 [ ] | 51 | Diagnostic image analysis: classification | Apply CNN to detecting referable diabetic retinopathy on retinal fundus images | Inception version 3 trained and validated using 2 data sets of images graded by ophthalmologists |
D1 [ ] | 34 | Computer science | Develop a new gradient-based RNN to solve error backflow problems | LSTM achieved constant error flow through memory cells regulated by gate units; tested numerous times against other methods |
D2 [ ] | 33 | Sequence analysis: binding (variant effects) prediction | Propose a predictive model for sequence specificities of DNA- and RNA-binding proteins | CNN-based DeepBind trained fully automatically through parallel implementation to predict and visualize binding specificities and variation effects |
A3 [ ] | 27 | Diagnostic image analysis: classification | Evaluate factors of using CNNs for thoracoabdominal lymph node detection and interstitial lung disease classification | Compare performances of AlexNet, CifarNet, and GoogLeNet trained with transfer learning and different data set characteristics |
D3 [ ] | 23 | Sequence analysis: chromatin profiles (variant effects) prediction | Propose a model for predicting noncoding variant effects from genomic sequence | CNN-based DeepSEA trained for chromatin profile prediction to estimate variant effects with single nucleotide sensitivity and prioritize functional variants |
A4 [ ] | 23 | Diagnostic image analysis: classification | Evaluate CNNs for tuberculosis detection on chest radiographs | Compare performances of AlexNet and GoogLeNet and ensemble of 2 trained with transfer learning, augmented data set, and radiologist-augmented approach |
a CNN: convolutional neural network.
b RNN: recurrent neural network.
c LSTM: long short-term memory.
In these studies, excluding the study by Hochreiter and Schmidhuber [135], whose research topic pertained to computer science, deep learning was used for diagnostic image analysis of various areas [12-14,136] and for sequence analysis of proteins [21] or genomes [22]. The main architectures implemented to achieve the different research objectives mostly comprised CNNs [12-14,136] or CNN-based novel models [21,22] and RNNs [135]. The findings indicated that these deep neural networks either outperformed previous methods or achieved a performance comparable with that of human experts.
With the increase in biomedical research using deep learning techniques, we aimed to gain a quantitative and qualitative understanding of the scientific domain, as reflected in the published literature. For this purpose, we conducted a scientometric analysis of deep learning studies in biomedicine.
Through the metadata and content analyses of bibliographic records, we identified the current leading fields and research topics, the most prominent being radiology and medical imaging. Other biomedical fields that have led this domain included biomedical engineering, mathematical and computational biology, and biochemical research methods. As part of interdisciplinary research, computer science and electrical engineering were important fields as well. The major research topics that were studied included computer-assisted image interpretation and diagnosis (which involved localizing or segmenting certain areas for classifying or predicting diseases), image processing such as medical image reconstruction or registration, and sequence analysis of proteins or RNA to understand protein structure and discover or design drugs. These topics were particularly prevalent in their application to neoplasms.
Furthermore, although the deep learning techniques proposed for these themes were predominantly CNN based, different architectures were preferred for different research tasks. The findings showed that CNN-based models mostly focused on analyzing medical image data, with RNN architectures for sequential data analysis and AEs for unsupervised dimensionality reduction yet to be actively explored. Other deep learning methods, such as deep belief networks [137,138], deep Q networks [139], and dictionary learning [140], have also been applied to biomedical research but were excluded from the content analysis because of low citation counts. As deep learning is a rapidly evolving field, future biomedical researchers should pay attention to emerging trends and stay aware of state-of-the-art models for enhanced performance, such as transformer-based models, including bidirectional encoder representations from transformers for NLP [141]; wav2vec for speech recognition [142]; and the Swin transformer for computer vision tasks of image classification, segmentation, and object detection [143].
The findings from the analysis of the cited references revealed patterns of knowledge diffusion. In the analysis, radiology and medical imaging appeared to be the most significant knowledge source and an important field in the knowledge diffusion network. Relatedly, we identified knowledge exporters to this field, including biomedical engineering, electrical engineering, and computer science, as important, despite their relatively low citation counts. Furthermore, citation patterns revealed clique-like relationships among the four fields—biochemical research methods, biochemistry and molecular biology, biotechnology and applied microbiology, and mathematical and computational biology—with each being a source of knowledge and diffusion for the others.
Beyond knowledge diffusion, knowledge integration was also encouraged through collaboration among authors from different organizations and academic disciplines. Coauthorship analysis revealed active research collaboration between universities and hospitals and between hospitals and companies. Separately, we identified an engineering-oriented cluster and biomedicine-oriented clusters of disciplines, among which we observed a range of disciplinary collaborations, with the 2 most prominent occurring among radiology and medical imaging, computer science, and electrical engineering, the 3 disciplines most involved in publishing and collaboration. Meanwhile, pathology and public health showed a high ratio of collaborative research to publications, whereas computational biology showed a low collaborative ratio.
This study has the following limitations that may have affected data analysis and interpretation. First, focusing only on published studies may have underrepresented the field. Second, publication data were retrieved only from PubMed; although PubMed is one of the largest databases for biomedical literature, other databases, such as DBLP (DataBase systems and Logic Programming), may also include relevant studies. Third, the use of PubMed limited our data to biomedical journals and proceedings. Given that deep learning is an active research area in computer science, computer science conference articles are valuable sources of data that were not considered in this study. Finally, our data retrieval strategy involved searching for deep learning as the major MeSH term, which increased precision but may have omitted relevant studies that were not explicitly tagged as deep learning. We plan to expand our scope in future work to consider other bibliographic databases and search terms.
In this study, we investigated the landscape of deep learning research in biomedicine and identified major research topics, influential works, knowledge diffusion, and research collaboration through scientometric analyses. The results showed a predominant focus on research applying deep learning techniques, especially CNNs, to radiology and medical imaging and confirmed the interdisciplinary nature of this domain, especially between engineering and biomedical fields. However, diverse biomedical applications of deep learning in the fields of genetics and genomics, medical informatics focusing on text or speech data, and signal processing of various activities (eg, brain, heart, and human) will further boost the contribution of deep learning in addressing biomedical research problems. As such, although deep learning research in biomedicine has been successful, we believe that there is a need for further exploration, and we expect the results of this study to help researchers and communities better align their present and future work.
AE | autoencoder |
CNN | convolutional neural network |
FCNN | fully convolutional neural network |
MeSH | Medical Subject Headings |
NLP | natural language processing |
ResNet | residual neural network |
RNN | recurrent neural network |
WoS | Web of Science |
Authors' Contributions: SN and YZ designed the study. SN, DK, and WJ analyzed the data. SN took the lead in the writing of the manuscript. YZ supervised and implemented the study. All authors contributed to critical edits and approved the final manuscript.
Conflicts of Interest: None declared.
Deep learning project: face recognition with Python and OpenCV
Building a face recognition system with Python and OpenCV is a practical introduction to computer vision and deep learning. The following is a step-by-step guide to constructing a simple face recognition system.
First, make sure the required libraries are installed; the LBPH face recognizer used below lives in the OpenCV contrib package:
pip install opencv-contrib-python numpy pillow
We need a dataset for training. We can either use an existing dataset or capture our own with OpenCV:
import os
import cv2

os.makedirs('faces', exist_ok=True)  # folder for the captured face crops

cam = cv2.VideoCapture(0)
detector = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
user_id = input('Enter user ID: ')
sampleNum = 0
while True:
    ret, img = cam.read()
    if not ret:
        break
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in faces:
        sampleNum += 1
        # save the grayscale face crop, named by user ID and sample number
        cv2.imwrite(f"faces/User.{user_id}.{sampleNum}.jpg", gray[y:y+h, x:x+w])
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
        cv2.waitKey(100)
    cv2.imshow('Capture', img)
    cv2.waitKey(1)
    if sampleNum > 20:  # stop after about 20 face images
        break
cam.release()
cv2.destroyAllWindows()
OpenCV has a built-in face recognizer. For this example, we’ll use the LBPH (Local Binary Pattern Histogram) face recognizer.
import os
import cv2
import numpy as np
from PIL import Image

path = 'faces'
recognizer = cv2.face.LBPHFaceRecognizer_create()  # requires opencv-contrib-python
detector = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

def getImagesAndLabels(path):
    imagePaths = [os.path.join(path, f) for f in os.listdir(path)]
    faceSamples = []
    ids = []
    for imagePath in imagePaths:
        PIL_img = Image.open(imagePath).convert('L')  # load as grayscale
        img_numpy = np.array(PIL_img, 'uint8')
        # filenames look like User.<id>.<sample>.jpg, so the ID is the second field
        face_id = int(os.path.split(imagePath)[-1].split(".")[1])
        faces = detector.detectMultiScale(img_numpy)
        for (x, y, w, h) in faces:
            faceSamples.append(img_numpy[y:y+h, x:x+w])
            ids.append(face_id)
    return faceSamples, np.array(ids)

os.makedirs('trainer', exist_ok=True)
faces, ids = getImagesAndLabels(path)
recognizer.train(faces, ids)
recognizer.save('trainer/trainer.yml')
Finally, load the trained model and run recognition in real time on the webcam feed:
import cv2

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read('trainer/trainer.yml')
faceCascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
font = cv2.FONT_HERSHEY_SIMPLEX

cam = cv2.VideoCapture(0)
minW = 0.1 * cam.get(3)  # minimum face width: 10% of the frame width
minH = 0.1 * cam.get(4)  # minimum face height: 10% of the frame height

while True:
    ret, img = cam.read()
    if not ret:
        break
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.2,
        minNeighbors=5,
        minSize=(int(minW), int(minH)),
    )
    for (x, y, w, h) in faces:
        label, confidence = recognizer.predict(gray[y:y+h, x:x+w])
        # lower LBPH distance means a better match; treat large distances as unknown
        if confidence < 100:
            confidence_text = f"{round(100 - confidence)}%"
        else:
            label = "unknown"
            confidence_text = f"{round(100 - confidence)}%"
        cv2.putText(img, str(label), (x+5, y-5), font, 1, (255, 255, 255), 2)
        cv2.putText(img, confidence_text, (x+5, y+h-5), font, 1, (255, 255, 0), 1)
    cv2.imshow('Face Recognition', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cam.release()
cv2.destroyAllWindows()
Make sure the faces and trainer directories exist before running the scripts. This is a basic face recognition system; it can be strengthened with deep learning models for better accuracy and robustness under real-world conditions. To achieve better accuracy, consider deep learning–based techniques such as FaceNet or pre-trained models from deep learning frameworks.
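As one possible upgrade path, the sketch below uses the third-party face_recognition package, which wraps a pre-trained deep CNN face encoder; the file names are hypothetical placeholders, and this is only an illustrative sketch, separate from the OpenCV pipeline above:

import face_recognition

# Encode one known face and one unknown face (file paths are placeholders).
known_image = face_recognition.load_image_file("faces/known_person.jpg")
unknown_image = face_recognition.load_image_file("faces/unknown.jpg")
known_encoding = face_recognition.face_encodings(known_image)[0]
unknown_encoding = face_recognition.face_encodings(unknown_image)[0]

# compare_faces returns one boolean per known encoding; a lower tolerance is stricter.
match = face_recognition.compare_faces([known_encoding], unknown_encoding, tolerance=0.6)
print("Same person" if match[0] else "Different person")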