Image processing is manipulation of an image that has been digitised and uploaded into a computer. Software programs modify the image to make it more useful, and can for example be used to enable image recognition.

research papers on applications of image processing

Computer vision for kinematic metrics of the drinking task in a pilot study of neurotypical participants

  • Justin Huber
  • Stacey Slone

A multicentre study to evaluate the diagnostic performance of a novel CAD software, DecXpert, for radiological diagnosis of tuberculosis in the northern Indian population

  • Ankit Shukla

Automatic ploidy prediction and quality assessment of human blastocysts using time-lapse imaging

Assessing human embryos is crucial for in vitro fertilization, a task being revolutionized by artificial intelligence. Here, the authors introduce BELA, an automated AI model for predicting embryo ploidy status and quality using time-lapse imaging.

  • Suraj Rajendran
  • Matthew Brendel
  • Iman Hajirasouliha

An encryption algorithm for color images based on an improved dual-chaotic system combined with DNA encoding

  • Tingting Liu

Automated Association for Osteosynthesis Foundation and Orthopedic Trauma Association classification of pelvic fractures on pelvic radiographs using deep learning

  • Seung Hwan Lee
  • Kwang Gi Kim

A pathology foundation model for cancer diagnosis and prognosis prediction

A study describes the development of a generalizable foundation machine learning framework to extract pathology imaging features for cancer diagnosis and prognosis prediction.

  • Junhan Zhao
  • Kun-Hsing Yu


News and Comment

Cell painting gallery: an open resource for image-based profiling.

  • Erin Weisbart
  • Ankur Kumar
  • Shantanu Singh

The promise of machine learning approaches to capture cellular senescence heterogeneity

The identification of senescent cells is a long-standing unresolved challenge, owing to their intrinsic heterogeneity and the lack of universal markers. In this Comment, we discuss the recent advent of machine-learning-based approaches to identifying senescent cells by using unbiased, multiparameter morphological assessments, and how these tools can assist future senescence research.

  • Imanol Duran
  • Cleo L. Bishop
  • Ryan Wallis

Visual interpretability of bioimaging deep learning models

The success of deep learning in analyzing bioimages comes at the expense of biologically meaningful interpretations. We review the state of the art of explainable artificial intelligence (XAI) in bioimaging and discuss its potential in hypothesis generation and data-driven discovery.

  • Assaf Zaritsky

Next-generation AI for connectomics

New approaches in artificial intelligence (AI), such as foundation models and synthetic data, are having a substantial impact on many areas of applied computer science. Here we discuss the potential to apply these developments to the computational challenges associated with producing synapse-resolution maps of nervous systems, an area in which major ambitions are currently bottlenecked by AI performance.

  • Michał Januszewski

Multimodal large language models for bioimage analysis

Multimodal large language models have been recognized as a historical milestone in the field of artificial intelligence and have demonstrated revolutionary potentials not only in commercial applications, but also for many scientific fields. Here we give a brief overview of multimodal large language models through the lens of bioimage analysis and discuss how we could build these models as a community to facilitate biology research.

  • Shanghang Zhang
  • Jianxu Chen

Neurotransmitters at a glance

Machine learning approaches can distinguish six different classes of presynapses from electron micrographs across the Drosophila brain.

  • Rita Strack

Grand challenges in image processing.


Frédéric Dufaux

  • Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des signaux et Systèmes, Gif-sur-Yvette, France


The field of image processing has been the subject of intensive research and development activities for several decades. This broad area encompasses topics such as image/video processing, image/video analysis, image/video communications, image/video sensing, modeling and representation, computational imaging, electronic imaging, information forensics and security, 3D imaging, medical imaging, and machine learning applied to these respective topics. Hereafter, we will consider both image and video content (i.e. sequence of images), and more generally all forms of visual information.

Rapid technological advances, especially in terms of computing power and network transmission bandwidth, have resulted in many remarkable and successful applications. Nowadays, images are ubiquitous in our daily life. Entertainment is one class of applications that has greatly benefited, including digital TV (e.g., broadcast, cable, and satellite TV), Internet video streaming, digital cinema, and video games. Beyond entertainment, imaging technologies are central in many other applications, including digital photography, video conferencing, video monitoring and surveillance, satellite imaging, but also in more distant domains such as healthcare and medicine, distance learning, digital archiving, cultural heritage or the automotive industry.

In this paper, we highlight a few research grand challenges for future imaging and video systems, in order to achieve breakthroughs to meet the growing expectations of end users. Given the vastness of the field, this list is by no means exhaustive.

A Brief Historical Perspective

We first briefly discuss a few key milestones in the field of image processing. Key inventions in the development of photography and motion pictures can be traced to the 19th century. The earliest surviving photograph of a real-world scene was made by Nicéphore Niépce in 1827 ( Hirsch, 1999 ). The Lumière brothers made the first cinematographic film in 1895, with a public screening the same year ( Lumiere, 1996 ). After decades of remarkable developments, the second half of the 20th century saw the emergence of new technologies launching the digital revolution. While the first prototype digital camera using a Charge-Coupled Device (CCD) was demonstrated in 1975, the first commercial consumer digital cameras started appearing in the early 1990s. These digital cameras quickly surpassed cameras using films and the digital revolution in the field of imaging was underway. As a key consequence, the digital process enabled computational imaging, in other words the use of sophisticated processing algorithms in order to produce high quality images.

In 1992, the Joint Photographic Experts Group (JPEG) released the JPEG standard for still image coding ( Wallace, 1992 ). In parallel, in 1993, the Moving Picture Experts Group (MPEG) published its first standard for coding of moving pictures and associated audio, MPEG-1 ( Le Gall, 1991 ), and a few years later MPEG-2 ( Haskell et al., 1996 ). By guaranteeing interoperability, these standards have been essential in many successful applications and services, for both the consumer and business markets. In particular, it is remarkable that, almost 30 years later, JPEG remains the dominant format for still images and photographs.

In the late 2000s and early 2010s, a paradigm shift occurred with the appearance of smartphones integrating a camera. Thanks to advances in computational photography, these new smartphones soon became capable of rivaling the quality of consumer digital cameras at the time. Moreover, these smartphones were also capable of acquiring video sequences. Almost concurrently, another key evolution was the development of high bandwidth networks. In particular, the launch of 4G wireless services circa 2010 enabled users to quickly and efficiently exchange multimedia content. Since then, most of us carry a camera anywhere and anytime, which allows us to capture images and videos at will and to seamlessly exchange them with our contacts.

As a direct consequence of the above developments, we are currently observing a boom in the usage of multimedia content. It is estimated that today 3.2 billion images are shared each day on social media platforms, and 300 h of video are uploaded every minute on YouTube 1 . In a 2019 report, Cisco estimated that video content represented 75% of all Internet traffic in 2017, and this share is forecasted to grow to 82% in 2022 ( Cisco, 2019 ). While Internet video streaming and Over-The-Top (OTT) media services account for a significant bulk of this traffic, other applications are also expected to see significant increases, including video surveillance and Virtual Reality (VR)/Augmented Reality (AR).

Hyper-Realistic and Immersive Imaging

A major direction and key driver of research and development activities over the years has been the objective of delivering ever-improving image quality and user experience.

For instance, in the realm of video, we have observed constantly increasing spatial and temporal resolutions, with the emergence nowadays of Ultra High Definition (UHD). Another aim has been to provide a sense of the depth in the scene. For this purpose, various 3D video representations have been explored, including stereoscopic 3D and multi-view ( Dufaux et al., 2013 ).

In this context, the ultimate goal is to be able to faithfully represent the physical world and to deliver an immersive and perceptually hyper-realistic experience. For this purpose, we discuss hereafter some emerging innovations. These developments are also very relevant in VR and AR applications ( Slater, 2014 ). Finally, while this paper focuses only on the visual information processing aspects, it is obvious that emerging display technologies ( Masia et al., 2013 ) and audio also play key roles in many application scenarios.

Light Fields, Point Clouds, Volumetric Imaging

In order to wholly represent a scene, the light information coming from all the directions has to be represented. For this purpose, the 7D plenoptic function is a key concept ( Adelson and Bergen, 1991 ), although it is unmanageable in practice.

By introducing additional constraints, the light field representation collects radiance from rays in all directions. It therefore contains much richer information than traditional 2D imaging, which captures a 2D projection of the light in the scene by integrating over the angular domain. For instance, this allows post-capture processing such as refocusing and changing the viewpoint. However, it also entails several technical challenges, in terms of acquisition and calibration, as well as computational image processing steps including depth estimation, super-resolution, compression and image synthesis ( Ihrke et al., 2016 ; Wu et al., 2017 ). The trade-off between spatial and angular resolution is a fundamental issue. With a significant fraction of the earlier work focusing on static light fields, it is also expected that dynamic light field videos will stimulate more interest in the future. In particular, dense multi-camera arrays are becoming more tractable. Finally, the development of efficient light field compression and streaming techniques is a key enabler in many applications ( Conti et al., 2020 ).

Another promising direction is to consider a point cloud representation. A point cloud is a set of points in 3D space represented by their spatial coordinates and additional attributes, including color values, normals, or reflectance. Point clouds are often very large, easily ranging in the millions of points, and are typically sparse. One major distinguishing feature of point clouds is that, unlike images, they do not have a regular structure, calling for new algorithms. To remove the noise often present in acquired data, while preserving the intrinsic characteristics, effective 3D point cloud filtering approaches are needed ( Han et al., 2017 ). It is also important to develop efficient techniques for Point Cloud Compression (PCC). For this purpose, MPEG is developing two standards: Geometry-based PCC (G-PCC) and Video-based PCC (V-PCC) ( Graziosi et al., 2020 ). G-PCC considers the point cloud in its native form and compresses it using 3D data structures such as octrees. Conversely, V-PCC projects the point cloud onto 2D planes and then applies existing video coding schemes. More recently, deep learning-based approaches for PCC have been shown to be effective ( Guarda et al., 2020 ). Another challenge is to develop generic and robust solutions able to handle potentially widely varying characteristics of point clouds, e.g. in terms of size and non-uniform density. Efficient solutions for dynamic point clouds are also needed. Finally, while many techniques focus on the geometric information or the attributes independently, it is paramount to process them jointly.
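
To make the octree idea more concrete, the following minimal sketch subdivides a cube recursively and records, for each occupied node, one byte whose bits flag the occupied child octants. It is only an illustration of the data structure, not the G-PCC codec: the function name `octree_occupancy`, the integer coordinate convention and the breadth-first byte order are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int, int]


def octree_occupancy(points: List[Point], depth: int) -> List[int]:
    """Return one 8-bit occupancy byte per occupied node, breadth-first.

    Assumption: coordinates are integers in [0, 2**depth) along each axis.
    """
    codes: List[int] = []
    # Each node is (origin, points inside it); start from the root cube.
    level = [((0, 0, 0), sorted(set(points)))]
    for d in range(depth):
        half = 1 << (depth - d - 1)  # half the edge length of nodes at this level
        next_level = []
        for (ox, oy, oz), pts in level:
            children: Dict[int, List[Point]] = {}
            for (x, y, z) in pts:
                # One bit per axis, set when the point lies in the upper half.
                idx = ((x - ox >= half) << 2) | ((y - oy >= half) << 1) | (z - oz >= half)
                children.setdefault(idx, []).append((x, y, z))
            byte = 0
            for idx in sorted(children):
                byte |= 1 << idx
                origin = (ox + half * ((idx >> 2) & 1),
                          oy + half * ((idx >> 1) & 1),
                          oz + half * (idx & 1))
                next_level.append((origin, children[idx]))
            codes.append(byte)
        level = next_level
    return codes


# Two points in opposite corners of a 4x4x4 cube.
print(octree_occupancy([(0, 0, 0), (3, 3, 3)], depth=2))
```

Sparse clouds leave most octants empty, so the byte sequence stays short; a real codec would additionally entropy-code these bytes and handle attributes such as color.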

High Dynamic Range and Wide Color Gamut

The human visual system is able to perceive, using various adaptation mechanisms, a broad range of luminous intensities, from very bright to very dark, as experienced every day in the real world. Nonetheless, current imaging technologies are still limited in terms of capturing or rendering such a wide range of conditions. High Dynamic Range (HDR) imaging aims at addressing this issue. Wide Color Gamut (WCG) is also often associated with HDR in order to provide a wider colorimetry.

HDR has reached some level of maturity in the context of photography. However, extending HDR to video sequences raises scientific challenges in order to provide high quality and cost-effective solutions, impacting the whole image processing pipeline, including content acquisition, tone reproduction, color management, coding, and display ( Dufaux et al., 2016 ; Chalmers and Debattista, 2017 ). Backward compatibility with legacy content and traditional systems is another issue. Despite recent progress, the potential of HDR has not been fully exploited yet.
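
As an illustration of the tone reproduction step, the sketch below compresses a wide range of scene luminances into the [0, 1) range of a conventional display. It uses a simple global operator of the form L / (1 + L), in the spirit of Reinhard's photographic operator; this choice is an assumption of the example and is not a method discussed in the text. The function name `reinhard_tone_map` and the `key` default of 0.18 are likewise illustrative.

```python
import math
from typing import List


def reinhard_tone_map(luminance: List[float], key: float = 0.18) -> List[float]:
    """Map positive scene luminances of arbitrary range into [0, 1)."""
    eps = 1e-6  # avoids log(0) for black pixels
    # The log-average luminance summarises the overall brightness of the scene.
    log_avg = math.exp(sum(math.log(eps + l) for l in luminance) / len(luminance))
    scaled = [key * l / log_avg for l in luminance]
    # Small values stay nearly linear, large values saturate toward 1.
    return [s / (1.0 + s) for s in scaled]


# Six orders of magnitude of input luminance.
print(reinhard_tone_map([0.01, 1.0, 100.0, 10000.0]))
```

The operator is monotonic, so the ordering of luminances is preserved while the dynamic range is compressed. Applying it frame by frame to video can cause flicker, which is one reason why HDR video calls for dedicated solutions.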

Coding and Transmission

Three decades of standardization activities have continuously improved the hybrid video coding scheme based on the principles of transform coding and predictive coding. The Versatile Video Coding (VVC) standard was finalized in 2020 ( Bross et al., 2021 ), achieving approximately 50% bit rate reduction for the same subjective quality when compared to its predecessor, High Efficiency Video Coding (HEVC). While substantially outperforming VVC in the short term may be difficult, one encouraging direction is to rely on improved perceptual models to further optimize compression in terms of visual quality. Another direction, which has already shown promising results, is to apply deep learning-based approaches ( Ding et al., 2021 ). Here, one key issue is the ability to generalize these deep models to a wide diversity of video content. The second key issue is the implementation complexity, both in terms of computation and memory requirements, which is a significant obstacle to widespread deployment. Besides, the emergence of new video formats targeting immersive communications is also calling for new coding schemes ( Wien et al., 2019 ).
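
The combination of predictive coding and transform coding can be sketched on a one-dimensional block of samples: a reference block serves as the prediction, the residual is transformed with a DCT, and the coefficients are uniformly quantized. This toy example is far simpler than any standardized codec; the function names, the 1D block and the single quantization step are assumptions chosen for brevity.

```python
import math
from typing import List


def dct(block: List[float]) -> List[float]:
    """Orthonormal DCT-II of a 1D block."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block)))
    return out


def idct(coeffs: List[float]) -> List[float]:
    """Inverse of dct() (DCT-III with matching scaling)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def encode(current: List[float], reference: List[float], step: float) -> List[int]:
    """Predict from the reference, transform the residual, then quantize."""
    residual = [c - r for c, r in zip(current, reference)]
    return [round(v / step) for v in dct(residual)]


def decode(levels: List[int], reference: List[float], step: float) -> List[float]:
    """Dequantize, inverse transform, and add the prediction back."""
    residual = idct([q * step for q in levels])
    return [r + e for r, e in zip(reference, residual)]


reference = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
current = [v + 1.0 for v in reference]  # same content, slightly brighter
levels = encode(current, reference, step=1.0)
print(levels)
print(decode(levels, reference, step=1.0))
```

When the prediction is good, the residual is small and smooth, so most quantized coefficients are zero and cheap to entropy-code. A larger quantization step lowers the bit rate at the cost of a larger reconstruction error.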

Considering that in many application scenarios, videos are processed by intelligent analytic algorithms rather than viewed by users, another interesting track is the development of video coding for machines ( Duan et al., 2020 ). In this context, the compression is optimized taking into account the performance of video analysis tasks.

The push toward hyper-realistic and immersive visual communications entails most often an increasing raw data rate. Despite improved compression schemes, more transmission bandwidth is needed. Moreover, some emerging applications, such as VR/AR, autonomous driving, and Industry 4.0, bring a strong requirement for low latency transmission, with implications on both the imaging processing pipeline and the transmission channel. In this context, the emergence of 5G wireless networks will positively contribute to the deployment of new multimedia applications, and the development of future wireless communication technologies points toward promising advances ( Da Costa and Yang, 2020 ).

Human Perception and Visual Quality Assessment

It is important to develop effective models of human perception. On the one hand, it can contribute to the development of perceptually inspired algorithms. On the other hand, perceptual quality assessment methods are needed in order to optimize and validate new imaging solutions.

The notion of Quality of Experience (QoE) relates to the degree of delight or annoyance of the user of an application or service ( Le Callet et al., 2012 ). QoE is strongly linked to subjective and objective quality assessment methods. Many years of research have resulted in the successful development of perceptual visual quality metrics based on models of human perception ( Lin and Kuo, 2011 ; Bovik, 2013 ). More recently, deep learning-based approaches have also been successfully applied to this problem ( Bosse et al., 2017 ). While these perceptual quality metrics have achieved good performances, several significant challenges remain. First, when applied to video sequences, most current perceptual metrics are applied on individual images, neglecting temporal modeling. Second, whereas color is a key attribute, there are currently no widely accepted perceptual quality metrics explicitly considering color. Finally, new modalities, such as 360° videos, light fields, point clouds, and HDR, require new approaches.
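
The first limitation above, frame-wise evaluation without temporal modeling, can be illustrated with a short sketch. It scores each frame independently and averages the results. PSNR is used here only as a simple stand-in for a quality metric: it is a signal fidelity measure rather than a perceptual model, and the function names `psnr` and `video_score` are assumptions of this example.

```python
import math
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # rows of 8-bit grayscale samples


def psnr(ref: Frame, dist: Frame, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two frames of equal size, in dB."""
    squared_error = 0.0
    count = 0
    for ref_row, dist_row in zip(ref, dist):
        for r, d in zip(ref_row, dist_row):
            squared_error += (r - d) ** 2
            count += 1
    mse = squared_error / count
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def video_score(ref_frames: List[Frame], dist_frames: List[Frame]) -> float:
    """Average of per-frame scores; the order of the frames plays no role."""
    scores = [psnr(r, d) for r, d in zip(ref_frames, dist_frames)]
    return sum(scores) / len(scores)


a, a_dist = [[10, 20], [30, 40]], [[12, 20], [30, 40]]
b, b_dist = [[0, 0], [0, 0]], [[0, 5], [0, 0]]
print(video_score([a, b], [a_dist, b_dist]))
print(video_score([b, a], [b_dist, a_dist]))  # same value after reordering
```

Because the pooled score is unchanged when frames are reordered, temporal artifacts such as flicker or judder cannot influence it, which is precisely the gap that temporal modeling is meant to close.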

Another closely related topic is image esthetic assessment ( Deng et al., 2017 ). The esthetic quality of an image is affected by numerous factors, such as lighting, color, contrast, and composition. Esthetic assessment is useful in different application scenarios such as image retrieval and ranking, recommendation, and photo enhancement. While earlier attempts have used handcrafted features, most recent techniques to predict esthetic quality are data driven and based on deep learning approaches, leveraging the availability of large annotated datasets for training ( Murray et al., 2012 ). One key challenge is the inherently subjective nature of esthetic assessment, resulting in ambiguity in the ground-truth labels. Another important issue is to explain the behavior of deep esthetic prediction models.

Analysis, Interpretation and Understanding

Another major research direction has been the objective to efficiently analyze, interpret and understand visual data. This goal is challenging, due to the high diversity and complexity of visual data. This has led to many research activities, involving both low-level and high-level analysis, addressing topics such as image classification and segmentation, optical flow, image indexing and retrieval, object detection and tracking, and scene interpretation and understanding. Hereafter, we discuss some trends and challenges.

Keypoints Detection and Local Descriptors

Local image matching has been the cornerstone of many analysis tasks. It involves the detection of keypoints, i.e. salient visual points that can be robustly and repeatedly detected, and the computation of descriptors, i.e. compact signatures locally describing the visual features at each keypoint. Pairwise matching between the features can subsequently be computed to reveal local correspondences. In this context, several frameworks have been proposed, including Scale Invariant Feature Transform (SIFT) ( Lowe, 2004 ) and Speeded Up Robust Features (SURF) ( Bay et al., 2008 ), and later binary variants including Binary Robust Independent Elementary Features (BRIEF) ( Calonder et al., 2010 ), Oriented FAST and Rotated BRIEF (ORB) ( Rublee et al., 2011 ) and Binary Robust Invariant Scalable Keypoints (BRISK) ( Leutenegger et al., 2011 ). Although these approaches exhibit scale and rotation invariance, they are less suited to deal with large 3D distortions such as perspective deformations, out-of-plane rotations, and significant viewpoint changes. Besides, they tend to fail under significantly varying and challenging illumination conditions.
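
The pairwise matching step can be sketched for binary descriptors, which are compared using the Hamming distance. The example below assumes that each descriptor has already been extracted and packed into a Python integer; it does not implement BRIEF, ORB or BRISK themselves. The function names, the mutual nearest-neighbour check and the `max_dist` threshold are illustrative choices.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match(desc_a: List[int], desc_b: List[int], max_dist: int) -> List[Tuple[int, int]]:
    """Mutual nearest-neighbour matches between two non-empty descriptor lists."""
    def nearest(query: int, candidates: List[int]) -> int:
        return min(range(len(candidates)), key=lambda j: hamming(query, candidates[j]))

    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)
        # Keep the pair only if each descriptor is the other's best match
        # and the two are close enough.
        if nearest(desc_b[j], desc_a) == i and hamming(d, desc_b[j]) <= max_dist:
            matches.append((i, j))
    return matches


image_a = [0b11110000, 0b00001111]
image_b = [0b00001110, 0b11110001, 0b10101010]
print(match(image_a, image_b, max_dist=2))
```

The mutual check discards ambiguous correspondences at the cost of a second search. The brute-force search shown here scales with the product of the two list sizes, so practical systems typically rely on approximate nearest-neighbour indexing.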

These traditional approaches based on handcrafted features have been successfully applied to problems such as image and video retrieval, object detection, visual Simultaneous Localization And Mapping (SLAM), and visual odometry. Besides, the emergence of new imaging modalities as introduced above can also be beneficial for image analysis tasks, including light fields ( Galdi et al., 2019 ), point clouds ( Guo et al., 2020 ), and HDR ( Rana et al., 2018 ). However, when applied to high-dimensional visual data for semantic analysis and understanding, these approaches based on handcrafted features have been supplanted in recent years by approaches based on deep learning.

Deep Learning-Based Methods

Data-driven deep learning-based approaches ( LeCun et al., 2015 ), and in particular the Convolutional Neural Network (CNN) architecture, represent nowadays the state-of-the-art in terms of performances for complex pattern recognition tasks in scene analysis and understanding. By combining multiple processing layers, deep models are able to learn data representations with different levels of abstraction.

Supervised learning is the most common form of deep learning. It requires a large and fully labeled training dataset, a typically time-consuming and expensive process needed whenever tackling a new application scenario. Moreover, in some specialized domains, e.g. medical data, it can be very difficult to obtain annotations. To alleviate this major burden, methods such as transfer learning and weakly supervised learning have been proposed.

In another direction, deep models have been shown to be vulnerable to adversarial attacks ( Akhtar and Mian, 2018 ). Those attacks consist of introducing subtle perturbations to the input, such that the model predicts an incorrect output. For instance, in the case of images, imperceptible pixel differences are able to fool deep learning models. Such adversarial attacks are definitely an important obstacle to the successful deployment of deep learning, especially in applications where safety and security are critical. While some early solutions have been proposed, a significant challenge is to develop effective defense mechanisms against those attacks.
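
To show how a small perturbation can flip a prediction, the sketch below applies the fast gradient sign method (FGSM), one well-known attack chosen here as an example and not named in the text, to a tiny logistic classifier. The weights, input and `epsilon` are made-up values, and a linear model stands in for a deep network so that the gradient can be written by hand.

```python
import math
from typing import List


def predict(weights: List[float], bias: float, x: List[float]) -> float:
    """Probability of class 1 under a logistic model."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def fgsm(weights: List[float], bias: float, x: List[float],
         label: int, epsilon: float) -> List[float]:
    """Move every input component by +/- epsilon in the direction that raises the loss."""
    p = predict(weights, bias, x)
    # For the cross-entropy loss, the gradient with respect to x is (p - label) * w.
    grad = [(p - label) * w for w in weights]
    return [v + epsilon * ((g > 0) - (g < 0)) for v, g in zip(x, grad)]


weights, bias = [1.0, -1.0, 0.5, -0.5], 0.0  # hypothetical trained model
x = [0.2, 0.1, 0.1, 0.0]                      # correctly classified as class 1
x_adv = fgsm(weights, bias, x, label=1, epsilon=0.1)
print(predict(weights, bias, x), predict(weights, bias, x_adv))
```

Each component changes by at most `epsilon`, yet the contributions add up across all dimensions. With the very high dimensionality of images, this accumulation is why perturbations too small to see can still change the output.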

Finally, another challenge is to enable low complexity and efficient implementations. This is especially important for mobile or embedded applications. For this purpose, further interactions between signal processing and machine learning can potentially bring additional benefits. For instance, one direction is to compress deep neural networks in order to enable their more efficient handling. Moreover, by combining traditional processing techniques with deep learning models, it is possible to develop low complexity solutions while preserving high performance.
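
One simple form of network compression is uniform quantization of the weights, sketched below for a flat list of values. This is only one of several possible techniques and is an assumption of this example, as are the function names and the 8-bit default; pruning and more elaborate schemes are not shown.

```python
from typing import List, Tuple


def quantize(weights: List[float], bits: int = 8) -> Tuple[List[int], float, float]:
    """Map float weights to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes: List[int], scale: float, lo: float) -> List[float]:
    """Recover approximate weights from the integer codes."""
    return [lo + c * scale for c in codes]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, lo = quantize(weights)
print(codes)
print(dequantize(codes, scale, lo))
```

Storing 8-bit codes instead of 32-bit floats reduces the memory footprint by roughly a factor of four, with a reconstruction error per weight of at most half the quantization step.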

Explainability in Deep Learning

While data-driven deep learning models often achieve impressive performances on many visual analysis tasks, their black-box nature makes it inherently very difficult to understand how they reach a predicted output and how it relates to particular characteristics of the input data. This is a major impediment in many decision-critical application scenarios. Moreover, it is important not only to have confidence in the proposed solution, but also to gain further insights from it. Based on these considerations, some deep learning systems aim at promoting explainability ( Adadi and Berrada, 2018 ; Xie et al., 2020 ). This can be achieved by exhibiting traits related to confidence, trust, safety, and ethics.

However, explainable deep learning is still in its early phase. More developments are needed, in particular to develop a systematic theory of model explanation. Important aspects include the need to understand and quantify risk, to comprehend how the model makes predictions for transparency and trustworthiness, and to quantify the uncertainty in the model prediction. This challenge is key in order to deploy and use deep learning-based solutions in an accountable way, for instance in application domains such as healthcare or autonomous driving.

Self-Supervised Learning

Self-supervised learning refers to methods that learn general visual features from large-scale unlabeled data, without the need for manual annotations. Self-supervised learning is therefore very appealing, as it allows exploiting the vast amount of unlabeled images and videos available. Moreover, it is widely believed that it is closer to how humans actually learn. One common approach is to use the data to provide the supervision, leveraging its structure. More generally, a pretext task can be defined, e.g. image inpainting, colorizing grayscale images, predicting future frames in videos, by withholding some parts of the data and by training the neural network to predict it ( Jing and Tian, 2020 ). By learning an objective function corresponding to the pretext task, the network is forced to learn relevant visual features in order to solve the problem. Self-supervised learning has also been successfully applied to autonomous vehicles perception. More specifically, the complementarity between analytical and learning methods can be exploited to address various autonomous driving perception tasks, without the prerequisite of an annotated data set ( Chiaroni et al., 2021 ).
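
The way a pretext task derives its supervision from the data itself can be sketched for inpainting: a patch is withheld from an unlabeled image, and the withheld pixels become the training target. Only the construction of the (input, target) pair is shown, not the network or its training. The function name `make_inpainting_pair`, the square patch and the zero fill value are assumptions of this example.

```python
import random
from typing import List, Tuple

Image = List[List[float]]


def make_inpainting_pair(image: Image, size: int,
                         rng: random.Random) -> Tuple[Image, Image, Tuple[int, int]]:
    """Hide a random size x size patch; return (masked image, hidden patch, position)."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    # The withheld pixels are the label: no manual annotation is involved.
    target = [row[left:left + size] for row in image[top:top + size]]
    masked = [row[:] for row in image]  # copy, so the original stays intact
    for r in range(top, top + size):
        for c in range(left, left + size):
            masked[r][c] = 0.0
    return masked, target, (top, left)


image = [[float(r * 4 + c + 1) for c in range(4)] for r in range(4)]
masked, target, position = make_inpainting_pair(image, size=2, rng=random.Random(0))
print(position, target)
```

A network trained to predict `target` from `masked` has to model the surrounding structure, which is how the pretext task pushes it toward learning useful visual features.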

While good performances have already been obtained using self-supervised learning, further work is still needed. A few promising directions are outlined hereafter. Combining self-supervised learning with other learning methods is a first interesting path. For instance, semi-supervised learning ( Van Engelen and Hoos, 2020 ) and few-shot learning ( Fei-Fei et al., 2006 ) methods have been proposed for scenarios where limited labeled data is available. The performance of these methods can potentially be boosted by incorporating self-supervised pre-training. The pretext task can also serve to add regularization. Another interesting trend in self-supervised learning is to train neural networks with synthetic data. The challenge here is to bridge the domain gap between the synthetic and real data. Finally, another compelling direction is to exploit data from different modalities. A simple example is to consider both the video and audio signals in a video sequence. In another example in the context of autonomous driving, vehicles are typically equipped with multiple sensors, including cameras, LIght Detection And Ranging (LIDAR), Global Positioning System (GPS), and Inertial Measurement Units (IMU). In such cases, it is easy to acquire large unlabeled multimodal datasets, where the different modalities can be effectively exploited in self-supervised learning methods.

Reproducible Research and Large Public Datasets

The reproducible research initiative is another way to further ensure high-quality research for the benefit of our community ( Vandewalle et al., 2009 ). Reproducibility, referring to the ability by someone else working independently to accurately reproduce the results of an experiment, is a key principle of the scientific method. In the context of image and video processing, it is usually not sufficient to provide a detailed description of the proposed algorithm. Most often, it is essential to also provide access to the code and data. This is even more imperative in the case of deep learning-based models.

In parallel, the availability of large public datasets is also highly desirable in order to support research activities. This is especially critical for new emerging modalities or specific application scenarios, where it is difficult to get access to relevant data. Moreover, with the emergence of deep learning, large datasets, along with labels, are often needed for training, which can be another burden.

Conclusion and Perspectives

The field of image processing is very broad and rich, with many successful applications in both the consumer and business markets. However, many technical challenges remain in order to further push the limits in imaging technologies. Two main trends are on the one hand to always improve the quality and realism of image and video content, and on the other hand to be able to effectively interpret and understand this vast and complex amount of visual data. However, the list is certainly not exhaustive and there are many other interesting problems, e.g. related to computational imaging, information security and forensics, or medical imaging. Key innovations will be found at the crossroad of image processing, optics, psychophysics, communication, computer vision, artificial intelligence, and computer graphics. Multi-disciplinary collaborations are therefore critical moving forward, involving actors from both academia and the industry, in order to drive these breakthroughs.

The “Image Processing” section of Frontiers in Signal Processing aims to give the research community a forum to exchange, discuss and improve new ideas, with the goal of contributing to the further advancement of the field of image processing and bringing exciting innovations in the foreseeable future.

Author Contributions

The author confirms being the sole contributor of this work and has approved it for publication.

Conflict of Interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1 (accessed on Feb. 23, 2021).

Keywords: image processing, immersive, image analysis, image understanding, deep learning, video processing

Citation: Dufaux F (2021) Grand Challenges in Image Processing. Front. Sig. Proc. 1:675547. doi: 10.3389/frsip.2021.675547

Received: 03 March 2021; Accepted: 10 March 2021; Published: 12 April 2021.

Copyright © 2021 Dufaux. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Frédéric Dufaux, [email protected]

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Viewpoints on Medical Image Processing: From Science to Application

Thomas M. Deserno (né Lehmann)

1 Department of Medical Informatics, Uniklinik RWTH Aachen, Germany;

Heinz Handels

2 Institute of Medical Informatics, University of Lübeck, Germany;

Klaus H. Maier-Hein (né Fritzsche)

3 Medical and Biological Informatics, German Cancer Research Center, Heidelberg, Germany;

Sven Mersmann

4 Medical and Biological Informatics, Junior Group Computer-assisted Interventions, German Cancer Research Center, Heidelberg, Germany;

Christoph Palm

5 Regensburg – Medical Image Computing (Re-MIC), Faculty of Computer Science and Mathematics, Regensburg University of Applied Sciences, Regensburg, Germany;

Thomas Tolxdorff

6 Institute of Medical Informatics, Charité - Universitätsmedizin Berlin, Germany;

Gudrun Wagenknecht

7 Electronic Systems (ZEA-2), Central Institute of Engineering, Electronics and Analytics, Forschungszentrum Jülich GmbH, Germany;

Thomas Wittenberg

8 Image Processing & Biomedical Engineering Department, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

Medical image processing provides core innovation for medical imaging. This paper focuses on recent developments from science to applications, analyzing the past fifteen years of proceedings of the German annual meeting on medical image processing (BVM). Furthermore, some members of the program committee present their personal points of view: (i) multi-modality for imaging and diagnosis, (ii) analysis of diffusion-weighted imaging, (iii) model-based image analysis, (iv) registration of section images, (v) from images to information in digital endoscopy, and (vi) virtual reality and robotics. Medical imaging and medical image computing are seen as a field of rapid development with clear trends towards integrated applications in diagnostics, treatment planning and treatment.


Current advances in medical imaging are made in fields such as instrumentation, diagnostics, and therapeutic applications and most of them are based on imaging technology and image processing. In fact, medical image processing has been established as a core field of innovation in modern health care [ 1 ] combining medical informatics, neuro-informatics and bioinformatics [ 2 ].

In 1984, the Society of Photo-Optical Instrumentation Engineers (SPIE) launched a multi-track conference on medical imaging, which is still considered the core event for innovation in the field. Analogously in Germany, the workshop “Bildverarbeitung für die Medizin (BVM)” (Image Processing for Medicine) has recently celebrated its 20th annual meeting. Over the years, the meeting has evolved into a multi-track conference of international standard [ 3 , 4 , 5 , 6 , 7 , 8 , 9 ].

Nonetheless, it is hard to name the most important and innovative trends within this broad field, which ranges from image acquisition using novel imaging modalities to information extraction in diagnostics and treatment. Ritter et al. recently emphasized the following aspects: (i) enhancement, (ii) segmentation, (iii) registration, (iv) quantification, (v) visualization, and (vi) computer-aided detection (CAD) [ 10 ].

Another concept of structuring is here referred to as the “from-to” approach. For instance,

  • From nano to macro : In 2002, the Institute of Electrical and Electronics Engineers (IEEE) launched the International Symposium on Biomedical Imaging (ISBI), co-founded by Michael Unser of EPFL, Switzerland. The conference follows the motto “from nano to macro”, covering all aspects of medical imaging from the sub-cellular to the organ level.
  • From production to sharing : Another “from-to” migration is seen in the shift from acquisition to communication [ 11 ]. Clark et al. expected advances in the medical imaging fields along the following four axes: (i) image production and new modalities; (ii) image processing, visualization, and system simulation; (iii) image management and retrieval; and (iv) image communication and telemedicine.
  • From kilobyte to terabyte : Deserno et al. identified another “from-to” migration in the amount of data produced by medical imagery [ 12 ]. Today, high-resolution CT reconstructs images with 8000 x 8000 pixels per slice with 0.7 μm isotropic detail detectability, and whole-body scans at this resolution reach several gigabytes (GB) of data. Microscopic whole-slide scanning systems can also easily provide so-called virtual slices in the range of 30,000 x 50,000 pixels, which equals 16.8 GB on 10 bit gray scale.
  • From science to application : Finally, in this paper, we aim at analyzing recent advances in medical imaging on another level. The focus is to identify core fields that foster the transfer of algorithms into clinical use and to address gaps that remain to be bridged in future research.
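The data volumes behind the “from kilobyte to terabyte” migration can be estimated with simple arithmetic. The following is a minimal sketch, assuming tightly packed, uncompressed samples; the function name `raw_image_bytes` and the 16-bit depth used for the CT slice are illustrative assumptions, not values from the text. The raw estimate depends strongly on the assumed bit depth, channel count and packing, so it need not reproduce the figures quoted above.

```python
def raw_image_bytes(width, height, bits_per_sample, samples_per_pixel=1):
    """Uncompressed size in bytes, assuming tightly packed samples (hypothetical helper)."""
    total_bits = width * height * bits_per_sample * samples_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# One virtual slide of 30,000 x 50,000 pixels at 10 bit, single channel.
slide_bytes = raw_image_bytes(30_000, 50_000, bits_per_sample=10)

# One CT slice of 8000 x 8000 pixels; 16 bit per sample is an assumption.
ct_slice_bytes = raw_image_bytes(8_000, 8_000, bits_per_sample=16)

print(f"virtual slide: {slide_bytes / 1e9:.3f} GB (decimal)")
print(f"CT slice:      {ct_slice_bytes / 1e6:.1f} MB (decimal)")
```

Multiplying the per-slice estimate by the number of slices, channels or focal planes gives the order of magnitude for a complete study.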

The remainder of this review is organized as follows. In Section 2, we briefly analyze the history of the German workshop BVM. More than 15 years of proceedings are currently available, and statistics are applied to identify trends in the content of conference papers. Section 3 then provides personal viewpoints on challenging and pioneering fields. The results are discussed in Section 4.


Since 1994, annual proceedings of the contributions presented at the BVM workshops have been published; since 1996, they have been available electronically in PostScript (PS) or Portable Document Format (PDF). Regardless of the type of presentation (oral, poster, or software demonstration), authors may submit papers of up to five pages; in 2012, the limit was increased to six pages. Both English and German papers are allowed. The number of English contributions has increased steadily over the years and reached about 50% in 2008 [ 8 ].

To identify the most relevant topics discussed at the BVM workshops, the most frequent words were counted in each of the proceedings from 1996 to 2012, which are on average 124k words long. About 300 common words of the German and English languages (e.g. and / und) were excluded from this analysis. Fig. 1 presents a word cloud computed from the 100 most frequent terms used in the proceedings of the 2012 BVM workshop. The font size of each word reflects its counted frequency in the text.

Fig. 1. Word cloud representing the 100 most frequent terms counted from the 469-page BVM proceedings 2012 [13].
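The term counting behind such a word cloud can be sketched in a few lines. This is a minimal illustration, not the authors' actual tooling: the stop-word set below is a tiny hypothetical stand-in for the roughly 300 excluded German and English words, and the sample sentence is invented.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; the study excluded about 300 common words.
STOP_WORDS = {"and", "und", "the", "der", "die", "das", "of", "in", "a", "is", "for"}


def top_terms(text, stop_words=STOP_WORDS, n=100):
    """Return the n most frequent non-stop-word terms with their counts."""
    # [^\W\d_]+ matches runs of Unicode letters, so umlauts are kept intact.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(t for t in tokens if t not in stop_words)
    return counts.most_common(n)


sample = "Image registration and image segmentation: die Segmentierung of the image."
ranking = top_terms(sample, n=3)
print(ranking)
```

The resulting counts can be mapped to font sizes for a word cloud, or collected per year to follow how the use of a term develops over time.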

In 2012, “image” was the most frequent word in the BVM proceedings (920 occurrences), as in all other years (1996-2012: 10,123 occurrences). Together with terms like “reconstruction”, “analysis”, or “processing”, this clearly identifies medical imaging as the major subject of the BVM workshops.

Concerning the scientific direction of the BVM meeting over time, terms such as “segmentation”, “registration”, and “navigation”, which indicate image processing procedures relevant for clinical applications, have been used with increasing frequency (Fig. 2, left). The same holds for terms like “evaluation” or “experiment”, which relate to the validation of the contributions (Fig. 2, middle) and constitute a first step towards the transition of scientific results into clinical application. Fig. 2 (right) shows the occurrence of the words “patient” and “application” in the papers contributed to the BVM workshops between 1996 and 2012. Here, rather constant numbers of occurrences are found, indicating a consistent focus on clinical applications.

Fig. 2. Trends from BVM workshop proceedings for important terms of processing procedures (left), experimental verification (middle), and application to humans (right).


3.1. Multi-Modal Image Processing for Imaging and Diagnosis

Multi-modal imaging refers to (i) different measurements at a single tomographic system (e.g., MRI and functional MRI), (ii) measurements at different tomographic systems (e.g., computed tomography (CT), positron emission tomography (PET), and single photon emission computed tomography (SPECT)), and (iii) measurements at integrated tomographic systems (PET/CT, PET/MR). Hence, multi-modal tomography has become increasingly popular in clinical and preclinical applications (Fig. 3), providing images of morphology and function (Fig. 4).

Fig. 3. PubMed-cited papers for the search “multimodal AND (imaging OR tomography OR image)”.

Fig. 4. Morphological and functional imaging in clinical and pre-clinical applications.

Multi-modal image processing for enhancing multi-modal imaging procedures primarily deals with image reconstruction and artifact reduction. Examples are the integration of additional information about tissue types from MRI as an anatomical prior to the iterative reconstruction of PET images [ 14 ] and the CT- or MR-based correction of attenuation artifacts in PET, respectively, which is an essential prerequisite for quantitative PET analysis [ 15 , 16 ]. Since these algorithms are part of the imaging workflow, only highly automated, fast, and robust algorithms providing adequate accuracy are appropriate solutions. Accordingly, the whole image in the different modalities must be considered.

This requirement differs for multi-modal diagnostic approaches. In most applications, a single organ or parts of an organ are of interest. Anatomical and particularly pathological regions often show a high variability due to structure, deformation, or movement, which is difficult to predict and is thus a great challenge for image processing. In multi-modality applications, images represent complementary information often obtained at different time-scales introducing additional complexity for algorithms. Other inequalities are introduced by the different resolutions and fields of view showing the organ of interest in different degrees of completeness. From a scientific and thus algorithmic point of view, image processing methods for multi-modal images must meet higher requirements than those applied to single-modality images.

Looking exemplarily at segmentation as one of the most complex and demanding problems in medical image processing, the modality showing anatomical and pathological structures in high resolution and contrast (e.g., MRI, CT) is typically used to segment the structure or volume of interest (VOI) to subsequently analyze other properties such as function within these target structures. Here, the different resolutions have to be regarded to correct for partial volume effects in the functional modality (e.g., PET, SPECT). Since the structures to be analyzed are dependent on the disease of the actual patient examined, automatic segmentation approaches are appropriate solutions if the anatomical structures of interest are known beforehand [ 17 ], while semi-automatic approaches are advantageous if flexibility is needed [ 18 , 19 ].
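The idea of segmenting a VOI in the high-resolution anatomical modality and then analyzing the functional modality within it can be sketched as follows. This is a simplified 2D illustration under stated assumptions: both images are already registered, the functional grid is coarser by an integer factor, and the fractional occupancy of each functional pixel is used only as a weight. It is not a full partial volume correction, and all names and values are hypothetical.

```python
def block_average(mask, factor):
    """Downsample a 2D binary mask by averaging factor x factor blocks.

    The result is the fraction of each coarse pixel covered by the VOI.
    """
    rows, cols = len(mask) // factor, len(mask[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [mask[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def weighted_voi_mean(functional, occupancy):
    """Mean functional value inside the VOI, weighted by fractional occupancy."""
    num = sum(f * w for frow, wrow in zip(functional, occupancy)
              for f, w in zip(frow, wrow))
    den = sum(w for wrow in occupancy for w in wrow)
    if den == 0:
        raise ValueError("VOI is empty at the functional resolution")
    return num / den


# Hypothetical 4x4 anatomical segmentation (e.g., from MRI) ...
anatomical_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
# ... and a 2x2 functional image (e.g., PET) covering the same field of view.
functional = [[8.0, 2.0], [4.0, 1.0]]

occupancy = block_average(anatomical_mask, 2)
voi_mean = weighted_voi_mean(functional, occupancy)
print(occupancy, voi_mean)
```

Functional pixels that are only partly covered by the VOI contribute proportionally less, which is the simplest way of accounting for the resolution difference between the two modalities.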

Transferring research into diagnostic application software requires a graphical user interface (GUI) to parameterize the algorithms, 2D and 3D visualization of multi-modal images and segmentation results, and tools to interact with the visualized images during the segmentation procedure. The Medical Imaging Interaction Toolkit (MITK) [ 20 ] and MeVisLab [ 21 ] provide the developer with frameworks for multi-modal visualization and interaction as well as tools to build appropriate GUIs, yielding an interface for integrating new algorithms on the way from science to application.

Another important aspect of transferring algorithms from pure academic research to clinical practice is evaluation. Phantoms can be used for evaluating specific properties of an algorithm, but not for evaluating the real situation with all its uncertainties and variability. Thus, the most important step of this migration is extensive testing of algorithms on large amounts of real clinical data, which is a great challenge, particularly for multi-modal approaches, and should be better supported by publicly available databases in the future.

3.2. Analysis of Diffusion Weighted Images

Due to its sensitivity to micro-structural changes in white matter, diffusion weighted imaging (DWI) is of particular interest to brain research. Stroke is the most common and best-known clinical application of DWI, where the images allow the non-invasive detection of ischemia within minutes of onset and are sensitive and relatively specific in detecting changes triggered by strokes [ 22 ]. The technique has also allowed deeper insights into the pathogenesis of Alzheimer’s disease, Parkinson’s disease, autism spectrum disorder, schizophrenia, and many other psychiatric and non-psychiatric brain diseases. DWI is also applied in the imaging of (mild) traumatic brain injury, where conventional techniques lack the sensitivity to detect the subtle changes occurring in the brain. Here, studies on sports-related trauma in the younger population have sparked considerable debate in the recent past [ 23 ].

Methodologically, recent advances in the generation and analysis of large-scale networks on the basis of DWI are particularly exciting and promise new dimensions in quantitative neuro-imaging by applying the profound set of tools available in graph theory to brain image analysis [ 24 ]. DWI sheds light on the network architecture of the living brain, revealing the organization of fiber connections together with their development and change in disease.
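To illustrate how graph-theoretical tools apply once a DWI-derived network is available, the sketch below computes two basic measures, node degree and the local clustering coefficient, on a small undirected graph. The region names and edges are invented for illustration; in practice, nodes would be brain regions and edges would come from tractography.

```python
def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj, node):
    """Number of direct connections of a node."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbors that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u in neighbors for v in neighbors
                if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical regions A-D with fiber connections between them.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
network = build_graph(edges)

for region in sorted(network):
    print(region, degree(network, region), clustering_coefficient(network, region))
```

Measures like these can then be compared between groups or tracked over time to quantify development and disease-related change in the connection pattern.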

Big challenges remain to be solved, though. Despite many years of methodological development in DWI post-processing, the field still seems to be in its infancy. The reliable tractography-based reconstruction of known or pathological anatomy is still not solved. Reconstruction challenges at the 2011 and 2012 annual meetings of the Medical Image Computing and Computer Assisted Intervention (MICCAI) Society have demonstrated the lack of methods that can reliably reconstruct large and well-known structures like the cortico-spinal tract in datasets of clinical quality [ 25 ]. The lack of reference-based evaluation techniques hinders the well-founded demonstration of the real advantages of novel tractography algorithms over previous methods [ 26 ]. These limitations have impeded a broader application of DWI tractography, e.g. in surgical guidance. Even though the application of DWI, e.g. in surgical resection, has been shown to facilitate the identification of risk structures [ 27 ], the widespread use of these techniques in surgical practice remains limited, mainly by the lack of robust and standardized methods that can be applied across institutions and by the lack of comprehensive evaluation of these algorithms.

However, there are numerous applications of DWI in cancer imaging, which bridge imaging science and clinical application. The imaging modality has shown potential in the detection, staging and characterization of tumors (Fig. 5), the evaluation of therapy response, and even the prediction of therapy outcome [ 28 ]. DWI has also been applied in the detection and characterization of lesions in the abdomen and the pelvis, where the increased cellularity of malignant tissue leads to restricted diffusion compared to the surrounding tissue [ 29 ]. The challenge here again will be the establishment of reliable sequences and post-processing methods for the widespread and multi-centric application of the techniques in the future.
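Restricted diffusion is commonly quantified by the apparent diffusion coefficient (ADC). The sketch below assumes the standard mono-exponential signal model S_b = S_0 · exp(−b · ADC), which is not stated in the text above; the signal values and the b-value are hypothetical and chosen only to show that a smaller signal decay corresponds to a lower ADC.

```python
import math


def apparent_diffusion_coefficient(s0, sb, b_value):
    """ADC in mm^2/s from two signals, assuming S_b = S_0 * exp(-b * ADC).

    s0: signal without diffusion weighting; sb: signal at b_value (s/mm^2).
    """
    if s0 <= 0 or sb <= 0 or b_value <= 0:
        raise ValueError("signals and b-value must be positive")
    return math.log(s0 / sb) / b_value


b = 1000.0  # s/mm^2, hypothetical acquisition setting

# Hypothetical voxel signals: the lesion loses less signal than its surroundings.
adc_surrounding = apparent_diffusion_coefficient(1000.0, 1000.0 * math.exp(-1.0), b)
adc_lesion = apparent_diffusion_coefficient(1000.0, 1000.0 * math.exp(-0.5), b)

print(f"surrounding tissue: {adc_surrounding:.2e} mm^2/s")
print(f"lesion:             {adc_lesion:.2e} mm^2/s")
```

Applied voxel by voxel, this yields an ADC map in which regions of restricted diffusion appear with low values.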

Fig. 5. Depiction of fiber tracts in the vicinity of a grade IV glioblastoma. The volumetric tracking result (yellow) was overlaid on an axial T2-FLAIR image. Red and green arrows indicate the necrotic tumor core and peritumoral hyperintensity, respectively. In the frontal parts, fiber tracts are still depicted, whereas in the dorsal part, tracts seem to be either displaced or destroyed by the tumor.

3.3. Model-Based Image Analysis

As already emphasized in the previous viewpoints, there is a big gap between the state of the art in current research and methods available in clinical application, especially in the field of medical image analysis [ 30 ]. Segmentation of relevant image structures (tissues, tumors, vessels etc.) is still one of the key problems in medical image computing lacking robust and automatic methods. The application of pure data-driven approaches like thresholding, region growing, edge detection, or enhanced data-driven methods like watershed algorithms, Markov random field (MRF)-based approaches, or graph cuts often leads to weak segmentations due to low contrasts between neighboring image objects, image artifacts, noise, partial volume effects etc.
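Region growing, one of the purely data-driven approaches named above, can be sketched as a breadth-first search from a seed pixel. This minimal version uses 4-connectivity and a fixed intensity tolerance around the seed value; the image values and the tolerance are hypothetical.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a 4-connected region of pixels within `tolerance` of the seed value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Hypothetical gray values: dark object (left), bright structure (middle),
# and a second dark area (right) that is not connected to the seed.
image = [
    [10, 11, 50, 52],
    [12, 10, 51, 12],
    [11, 30, 49, 11],
]

region = region_grow(image, seed=(0, 0), tolerance=3)
leaked = region_grow(image, seed=(0, 0), tolerance=45)
print(sorted(region))
print(len(leaked))
```

With a tight tolerance the region stays inside the object, while a tolerance that exceeds the contrast to the neighboring structure lets the region leak across the whole image, which is the weakness under low contrast described above.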

Model-based segmentation integrates a-priori knowledge of the shapes and appearance of relevant structures into the segmentation process. For example, the local shape of a vessel can be characterized by the vesselness operator [ 31 ], which generates images with an enhanced representation of vessels. Using the vesselness information in combination with the original grey value image, the segmentation of vessels can be improved significantly, and in particular the segmentation of small vessels becomes possible (e.g. [ 32 ]).

In statistical or active shape and appearance models [ 33 , 34 ], shape variability in organ distribution among individuals and characteristic gray value distributions in the neighborhood of the organ can be represented. In these approaches, a set of segmented image data is used to train active shape and active appearance models, which include information about the mean shape and shape variations as well as characteristic gray value distributions and their variation in the population represented in the training data set. Instead of the direct point-to-point correspondences used during the generation of classical statistical shape models, Hufnagel et al. have suggested probabilistic point-to-point correspondences [ 35 ]. This approach takes into account that inaccuracies are often unavoidable when defining direct point correspondences between organs of different persons. In probabilistic statistical shape models, these correspondence uncertainties are respected explicitly to improve the robustness and accuracy of shape modeling and model-based segmentation. Integrated in an energy-minimizing level set framework, the probabilistic statistical shape models can be used for enhanced organ segmentation [ 36 ].

In contrast, atlas-based segmentation methods (e.g., [ 37 ]) realize a case-based approach and make use of the segmentation information contained in a single segmented data set, which is transferred to an unseen patient image data set. The atlas segmentation is transferred to the patient image by inter-individual non-linear registration methods. Multi-atlas segmentation methods using several atlases have been proposed (e.g. [ 38 ]) and show improved accuracy and robustness in comparison to single-atlas segmentation methods. Hence, multi-atlas approaches are currently in the focus of further research [ 39 , 40 ].
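One simple way to combine several atlases is majority voting over the propagated labels. The sketch below is an illustration only and is not taken from the cited works: it assumes that each atlas segmentation has already been registered to the patient image and flattened to a list of voxel labels, and the label values are hypothetical.

```python
from collections import Counter


def majority_vote(label_maps):
    """Fuse several registered label maps by per-voxel majority voting.

    Ties are resolved in favor of the label encountered first among the atlases.
    """
    fused = []
    for voxel_labels in zip(*label_maps):
        fused.append(Counter(voxel_labels).most_common(1)[0][0])
    return fused


# Hypothetical labels from three atlases for the same five voxels
# (0 = background, 1 and 2 = two different structures).
atlas_labels = [
    [0, 1, 1, 2, 0],
    [0, 1, 2, 2, 0],
    [0, 0, 1, 2, 1],
]

fused = majority_vote(atlas_labels)
print(fused)
```

Single-atlas errors at individual voxels are outvoted as long as most atlases agree, which gives an intuition for the improved robustness reported for multi-atlas methods.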

In the future, more task-oriented systems integrated into diagnostic processes, intervention planning, therapy and follow-up are needed. In the field of image analysis, due to the limited time of physicians, automatic procedures that segment and extract quantitative object parameters in an accurate, reproducible and robust way are of special interest. Furthermore, intelligent and easy-to-use methods for the fast correction of unavoidable segmentation errors are needed.

3.4. Registration of Section Images

Imaging techniques such as histology [ 41 ] or auto-radiography [ 42 ] are based on thin post-mortem sections. In comparison to in-vivo imaging, e.g. positron emission tomography (PET), magnetic resonance imaging (MRI), or DWI (as addressed in the previous viewpoint, cf. Section 3.2), several properties are considered advantageous. For instance, tissue can be processed after sectioning to enhance contrast (e.g. staining) [ 43 ], to mark specific properties like receptors [ 44 ], or to apply laser ablation for studying the spatial element distribution [ 45 ]; tissue can be scanned in high resolution [ 43 ]; and tissue is thin enough to allow optical light transmission imaging, e.g. polarized light imaging (PLI) [ 46 ]. Therefore, section imaging yields data with high spatial resolution and high contrast, which supports findings such as cytoarchitectonic boundaries [ 47 ], neuronal fiber directions [ 48 ], and receptor or element distributions [ 45 ].

Restacking 2D sections into a 3D volume, followed by the fusion of this stack with an in-vivo volume, is a challenging task of medical image processing on the track from science to application. The 3D section stacks then serve as an atlas for a large variety of applications. Sections are non-linearly deformed during cutting and post-processing. Additionally, discontinuous artifacts like tears or rolled-up tissue hamper the correspondence between the true structure and the tissue imaged.

The so-called “problem of the digitized banana” [ 41 ] prohibits section-by-section registration without a 3D reference. Smoothness of registered stacks is not equivalent to consistency and correctness. Whereas the deformations are section-specific, the orientation of the sections relative to the 3D structure depends on the cutting direction and, thus, is the same for all sections. In this tangled situation, the question arises whether it is better (i) to restack the sections first, register the whole stack afterwards, and correct for deformations last (volume-first approach) or (ii) to register each section individually to the 3D reference volume while correcting deformations at the same time (section-first approach). Both approaches combine

  • Multi-modal registration : The need of a 3D reference and the application to correlate high-resolution section imaging findings with in-vivo imaging are sometimes solved at the same time. If possible, the 3D in-vivo modality itself is used as a reference.

Fig. 6. Characteristic flow chart of volume-first approach and volume generation with (gray boxes) or without blockface images as intermediate reference modality (Column I). Either the in-vivo volume is post-processed to generate a pseudo-high-resolution volume with propagated section gaps (Column II) or the section volume is post-processed to get a low-resolution stack with filled gaps (Column III) [42].
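The basic step shared by both approaches, aligning a section to a reference slice, can be sketched as an exhaustive search over integer translations that minimizes the mean squared difference in the overlapping area. This is a deliberately reduced illustration: real section registration must also handle rotation and non-linear deformation, and multi-modal data usually require a different similarity measure such as mutual information. The function names and test images are hypothetical.

```python
def ssd_at_shift(reference, moving, dy, dx):
    """Mean squared difference between reference and moving shifted by (dy, dx).

    Only the overlapping area of the two images is evaluated.
    """
    rows, cols = len(reference), len(reference[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            mr, mc = r - dy, c - dx  # source pixel in the moving image
            if 0 <= mr < rows and 0 <= mc < cols:
                diff = reference[r][c] - moving[mr][mc]
                total += diff * diff
                count += 1
    return total / count if count else float("inf")


def best_translation(reference, moving, max_shift):
    """Exhaustively search integer shifts in [-max_shift, max_shift]."""
    candidates = [(dy, dx)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return min(candidates, key=lambda s: ssd_at_shift(reference, moving, *s))


# Hypothetical 5x6 images with one bright structure at different positions.
reference = [[0.0] * 6 for _ in range(5)]
moving = [[0.0] * 6 for _ in range(5)]
reference[2][3] = 9.0
moving[1][1] = 9.0

shift = best_translation(reference, moving, max_shift=2)
print(shift)  # (rows down, columns right) to apply to the moving section
```

In a section-first scheme this step would be applied to each section against the corresponding slice of the 3D reference; in a volume-first scheme neighboring sections would be aligned to each other before the stack as a whole is registered.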

Due to the variety of difficulties, missing evaluation possibilities, and section specifics like post-processing, embedding, cutting procedure and tissue type, there is not just one best approach to get from 2D to 3D. But careful work in this field pays off in cutting-edge applications. Not least within the European flagship, the Human Brain Project (HBP), further research in this area of medical image processing is demanded. The state-of-the-art review of HBP states in the context of human brain mapping: “What is missing to date is an integrated open source tool providing a standard application programming interface (API) for data registration and coordinate transformations and guaranteeing multi-scale and multi-modal data accuracy” [ 49 ]. Such a tool will narrow the gap from science to application.

3.5. From Images to Information in Digital Endoscopy

Basic endoscopic technologies and their routine applications (Fig. 7, bottom layers) are still purely data-oriented, as the complete image analysis and interpretation is performed solely by the physician. If the content of endoscopic imagery is analyzed automatically, several new application scenarios for diagnostics and intervention with increasing complexity can be identified (Fig. 7, upper layers). As these new possibilities of endoscopy are inherently coupled with the use of computers, these new endoscopic methods and applications can be referred to as computer-integrated endoscopy [ 50 ]. Information, however, is referred to on the highest of the five levels of semantics (Fig. 7):

Fig. 7. Modules to build computer-integrated endoscopy, which enables information gain from image data.

  • 1. Acquisition : Advancements in diagnostic endoscopy were obtained by glass fibers for the transmission of electric light into and image information out of the body. Besides the purely wire-bound transmission of endoscopic imagery, wireless broadcast has become available in the past 10 years for gastroscopic video data captured from capsule endoscopes [ 51 ].
  • 2. Transportation : Based on digital technologies, essential basic processes of endoscopic still image and image sequence capturing, storage, archiving, documentation, annotation and transmission have been simplified. These developments have initially led to the possibilities for tele-diagnosis and tele-consultations in diagnostic endoscopy, where the image data is shared using local networks or the internet [ 52 ].
  • 3. Enhancement : Methods and applications for image enhancement include intelligent removal of honey-comb patterns in fiberscopic recordings [ 53 ], temporal filtering for the reduction of ablation smoke and moving particles [ 54 ], and image rectification for gastroscopes. Besides their increased complexity, such methods have to work in real time with a maximum delay of 60 milliseconds to be acceptable for surgeons and physicians.
  • 4. Augmentation : Image processing enhances endoscopic views with additional types of information. Examples are an artificial working horizon, key-hole views to endoscopic panorama images [ 55 ], and 3D surfaces computed from point clouds obtained by special endoscopic imaging devices such as stereo endoscopes [ 56 ], time-of-flight endoscopes [ 57 ], or shape-from-polarization approaches [ 58 ]. This level also includes the possibilities of visualization and image fusion of endoscopic views with preoperatively acquired radiological imagery such as angiography or CT data [ 59 ] for better intra-operative orientation and navigation, as well as image-based tracking and navigation through tubular structures [ 60 ].
  • 5. Content : Methods of content-based image analysis consider the automated segmentation, characterization and classification of diagnostic image content. Such methods include computer-assisted detection (CADe) [ 61 ] of lesions (such as polyps) and computer-assisted diagnostics (CADx) [ 62 ], where already detected and delineated regions are characterized and classified into, for instance, benign or malignant tissue areas. Furthermore, such methods automatically identify and track surgical instruments, e.g. supporting robotic surgery approaches.
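As an illustration of the enhancement level, temporal filtering against transient disturbances such as moving particles can be sketched as a per-pixel median over a short window of frames. The temporal median is used here as one plausible example and is not necessarily the method of the cited work; the frame data are hypothetical.

```python
from statistics import median


def temporal_median(frames):
    """Per-pixel median over a list of equally sized 2D gray-value frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(frame[r][c] for frame in frames) for c in range(cols)]
            for r in range(rows)]


# Three hypothetical consecutive 2x2 frames; the second one contains a
# transient bright particle at pixel (0, 1).
frames = [
    [[10, 20], [30, 40]],
    [[10, 255], [30, 40]],
    [[12, 20], [31, 40]],
]

filtered = temporal_median(frames)
print(filtered)
```

If the window contains only the current and previous frames, the filter itself introduces no look-ahead delay, which matters for the real-time constraint mentioned above; the computation time per frame still has to fit within that budget.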

On the technical side the semantics of the extracted image contents increases from the pure image recording up to the image content analysis level. This complexity also relates to the expected time axis needed to bring these methods from science to clinical applications.

From the clinical side, the most complex methods such as automated polyp detection (CADe) are considered as most important. However, it is expected that computer-integrated endoscopy systems will increasingly enter clinical applications and as such will contribute to the quality of the patient’s healthcare.

3.6. Virtual Reality and Robotics

Virtual reality (VR) and robotics are two rapidly expanding fields with growing application in surgery. VR creates three-dimensional environments that increase the capability for sensory immersion and provide the sensation of being present in the virtual space. Applications of VR include surgical planning, case rehearsal, and case playback. These could change the paradigm of surgical training, which is especially necessary as the regulations surrounding residencies continue to change [ 63 ]. Surgeons are enabled to practice in controlled situations with preset variables to gain experience in a wide variety of surgical scenarios [ 64 ].

With the availability of inexpensive computational power and the need for cost-effective solutions in healthcare, medical technology products are being commercialized at an increasingly rapid pace. VR is already incorporated into several emerging products for medical education, radiology, surgical planning and procedures, physical rehabilitation, disability solutions, and mental health [ 65 ]. For example, VR is helping surgeons learn invasive techniques before operating, and allowing physicians to conduct real-time remote diagnosis and treatment. Other applications of VR include the modeling of molecular structures in three dimensions as well as aiding in genetic mapping and drug synthesis.

In addition, the contribution of robotics has accelerated the replacement of many open surgical treatments with more efficient minimally invasive surgical techniques using 3D visualization techniques. Robotics provides mechanical assistance with surgical tasks, contributing greater precision and accuracy and allowing automation. Robots contain features that can augment surgical performance, for instance, by steadying a surgeon’s hand or scaling the surgeon’s hand motions [ 66 ]. Current robots work in tandem with human operators to combine the advantages of human thinking with the capabilities of robots to provide data, to optimize localization on a moving subject, to operate in difficult positions, or to perform without muscle fatigue. Surgical robots require spatial orientation between the robotic manipulators and the human operator, which can be provided by VR environments that re-create the surgical space. This enables surgeons to perform with the advantage of mechanical assistance but without being alienated from the sights, sounds, and touch of surgery [ 67 ].
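Scaling the surgeon’s hand motions and steadying the hand can be illustrated with two elementary operations on a sampled 1D hand position: a constant scale factor relative to the starting position, and a trailing moving average as a simple stand-in for tremor suppression. Both the scale factor and the filter are assumptions chosen for illustration and do not describe any particular surgical robot.

```python
def scale_motion(hand_positions, scale):
    """Map hand positions to tool positions, scaled relative to the start."""
    origin = hand_positions[0]
    return [origin + scale * (p - origin) for p in hand_positions]


def moving_average(values, window):
    """Trailing moving average; early samples use the values available so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical hand positions in millimetres along one axis.
hand = [0.0, 10.0, 20.0]
tool = scale_motion(hand, scale=0.2)  # 10 mm of hand motion -> 2 mm of tool motion

# Hypothetical jittery signal, smoothed over two samples.
jitter = [0.0, 2.0, 0.0, 2.0]
steady = moving_average(jitter, window=2)

print(tool, steady)
```

A smaller scale factor trades range of motion for precision, and a longer averaging window suppresses more jitter at the cost of added lag.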

After many years of research and development, Japanese scientists recently presented an autonomous robot that is able to perform surgery within the human body [ 68 ]. They send a miniature robot inside the patient’s body and perceive what the robot sees and touches before conducting surgery using the robot’s minute arms as though they were the surgeon’s own.

While the possibilities – and the need – for medical VR and robotics are immense, approaches and solutions using new applications require diligent, cooperative efforts among technology developers, medical practitioners and medical consumers to establish where future requirements and demand will lie. Augmented and virtual reality substituting or enhancing the reality can be considered as multi-reality approaches [ 69 ], which are already available in commercial products for clinical applications.


In this paper, we have analyzed the written proceedings of the German annual meeting on Medical Imaging (BVM) and presented personal viewpoints on medical image processing, focusing on the transfer from science to application. Reflecting on successful clinical applications and promising technologies that have recently been developed, it turned out that medical image computing has moved from single images to multiple images, and there are several ways to combine these images:

  • Multi-modality : Figs. 2 and 3 have emphasized that medical image processing has moved away from the simple 2D radiograph via 3D imaging modalities to multi-modal processing and analysis. Successful applications that are transferable into the clinic jointly process imagery from different modalities.
  • Multi-resolution : Here, images with different properties from the same subject and body area need alignment and comparison. Usually, this implies a multi-resolution approach, since different modalities work on different scales of resolution.
  • Multi-scale : If data becomes large, as pointed out for digital pathology, algorithms must operate on different scales, iteratively refining the alignment from coarse to fine. Such an algorithmic design is usually referred to as a multi-scale approach.
  • Multi-subject : Models have been identified as a key issue for implementing applicable image computing. Such models are used for segmentation, content understanding, and intervention planning. They are generated from a reliable set of references, usually based on several subjects.
  • Multi-atlas : Even more complex, the personal viewpoints have identified multi-atlas approaches that are nowadays addressed in research. For instance, in segmentation, the accuracy and robustness of algorithms are improved if they are based on multiple atlases rather than a single one. Both accuracy and robustness are essential requirements for transferring algorithms into clinical use.
  • Multi-semantics : Based on the example of digital endoscopy, another “multi” term is introduced. Image understanding and interpretation has been defined on several levels of semantics, and successful applications in computer-integrated endoscopy are operating on several of such levels.
  • Multi-reality : Finally, our last viewpoint has addressed the augmentation of the physician’s view by means of virtual reality. Medical image computing is applied to generate and superimpose such views, which results in a multi-reality world.
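
The coarse-to-fine idea behind the multi-scale item can be illustrated with a small sketch. The code below is not taken from the paper: it assumes a pure translation between two images, uses a 2x2 averaging pyramid, and searches only a one-pixel neighbourhood at each level. The names `downsample`, `mismatch`, `align` and `blob` are hypothetical helpers introduced for this illustration, and the images are synthetic.

```python
import math
from typing import List, Tuple

Image = List[List[float]]

def downsample(img: Image) -> Image:
    """Halve the resolution by averaging 2x2 blocks (an odd last row/column is dropped)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]

def mismatch(fixed: Image, moving: Image, dy: int, dx: int) -> float:
    """Mean squared difference of fixed[y][x] and moving[y + dy][x + dx] on the overlap."""
    h, w = len(fixed), len(fixed[0])
    total, count = 0.0, 0
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            diff = fixed[y][x] - moving[y + dy][x + dx]
            total += diff * diff
            count += 1
    return total / count if count else float("inf")

def align(fixed: Image, moving: Image, levels: int = 3, radius: int = 1) -> Tuple[int, int]:
    """Estimate the integer shift (dy, dx), refining it from the coarsest level down."""
    pyramid = [(fixed, moving)]
    for _ in range(levels - 1):
        f, m = pyramid[-1]
        pyramid.append((downsample(f), downsample(m)))
    dy = dx = 0
    for level, (f, m) in enumerate(reversed(pyramid)):  # coarsest level first
        if level > 0:
            dy, dx = 2 * dy, 2 * dx  # carry the estimate over to the finer grid
        _, dy, dx = min((mismatch(f, m, dy + a, dx + b), dy + a, dx + b)
                        for a in range(-radius, radius + 1)
                        for b in range(-radius, radius + 1))
    return dy, dx

def blob(cy: int, cx: int, size: int = 32, sigma: float = 3.0) -> Image:
    """Synthetic test image: a Gaussian blob centred at (cy, cx)."""
    return [[math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]

fixed = blob(12, 14)
moving = blob(16, 10)         # the same blob displaced by (+4, -4)
shift = align(fixed, moving)  # (4, -4) for this synthetic pair
```

Each level only tests nine candidate shifts, yet the three levels together recover a displacement of four pixels. Clinical registration methods would typically use richer transformations and similarity measures; the sketch only shows why refining from coarse to fine keeps the search small.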

Andriole, Barish, and Khorasani have also discussed issues to consider for advanced image processing in the clinical arena [70]. Completing the collection of “multi” issues, they emphasized that radiology practices are experiencing a tremendous increase in the number of images associated with each imaging study, due to multi-slice, multi-plane and/or multi-detector 3D imaging equipment. Computer-aided detection, used as a second reader or as a first-pass screener, will help maintain or perhaps improve readers' performance on such big data in terms of sensitivity and specificity.
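
For readers less familiar with the two measures, the following minimal sketch shows how sensitivity and specificity are computed from the four outcomes of a reading study. The function name and the counts are invented for illustration and do not come from the cited work.

```python
from typing import Tuple

def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of diseased cases that were flagged
    specificity = TN / (TN + FP): fraction of healthy cases that were cleared
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy case")
    return tp / (tp + fn), tn / (tn + fp)

# Invented counts for a hypothetical screening set of 1000 studies.
sens, spec = sensitivity_specificity(tp=45, fp=30, tn=920, fn=5)
```

With these made-up counts, 45 of 50 diseased cases are detected and 920 of 950 healthy cases are correctly cleared, so a change in either false negatives or false positives shows up in exactly one of the two numbers.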

Last but not least, with all these “multis”, the computational load of algorithms becomes an issue again. Modern computers provide enormous computational power, which has prompted the revisiting and application of several “old” approaches that had not yet found their way into clinical use simply because of their processing times. However, when many large images are combined, processing time becomes crucial again. Scholl et al. have recently addressed this issue, reviewing applications based on parallel processing and the use of graphics processors for image analysis [12]. These can be seen as multi-processing methods.

In summary, medical image processing is a progressive field of research, and more and more applications are becoming part of clinical practice. These applications are based on one or more of the “multi” concepts that we have addressed in this review. However, the effects of current trends in the Medical Device Directives, which increase the effort needed for clinical trials of new medical imaging procedures, cannot yet be observed. It will hence be interesting to follow how the scientific results of future BVM workshops translate into clinical applications.


Title: Study on Image Filtering -- Techniques, Algorithm and Applications

Abstract: Image processing is one of the most rapidly emerging and growing techniques, making it a lively research field. Image filtering is a technique for altering the size, shape, color, depth, smoothness, and other image properties. This paper introduces various image filtering techniques and their wide applications.
Subjects: Computer Vision and Pattern Recognition (cs.CV)
MSC classes: 68U10
ACM classes: I.4
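
As a concrete instance of the smoothness-altering filters the abstract mentions, here is a minimal sketch of 3x3 neighbourhood filtering. The abstract does not name specific filters, so the choice of a mean and a median filter, the edge-replication border handling, and the helper names `filter3x3` and `mean` are all assumptions made for illustration.

```python
from statistics import median
from typing import Callable, List, Sequence

Image = List[List[float]]

def filter3x3(img: Image, reducer: Callable[[Sequence[float]], float]) -> Image:
    """Replace every pixel by reducer(3x3 neighbourhood); borders replicate the edge pixel."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = reducer(window)
    return out

def mean(values: Sequence[float]) -> float:
    """Arithmetic mean of the window (box blur)."""
    return sum(values) / len(values)

# A flat grey patch with one impulse ("salt") pixel.
noisy = [[10, 10, 10, 10],
         [10, 255, 10, 10],
         [10, 10, 10, 10]]

smoothed = filter3x3(noisy, median)  # the median discards the single outlier
blurred = filter3x3(noisy, mean)     # the mean spreads the outlier over its neighbours
```

In this toy input the median filter restores the flat patch exactly, while the mean filter leaves a dimmer smear around the impulse, which is the usual argument for median filtering against impulse noise. Production code would normally rely on an array library rather than nested lists.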

IEEE Transactions on Image Processing

123  citations

Image Processing On Line

13  citations

SNIS: A Signal Noise Separation-Based Network for Post-Processed Image Forgery Detection

4  citations

Deep learning-based real-world object detection and improved anomaly detection for surveillance videos

3  citations

Noncontact Sensing Techniques for AI-Aided Structural Health Monitoring: A Systematic Review

Automatic seat identification system in smart transport using IoT and image processing

Practical application of digital image processing in measuring concrete crack widths in field studies

2  citations

Saliency map in image visual quality assessment and processing

Integrated diffusion image operator (IDIO): a pipeline for automated configuration and processing of diffusion MRI data

Development of complete image processing system including image filtering, image compression & image security

Android-based herpes disease detection application using image processing

Automated invoice data extraction using image processing

Efficient object detection and classification approach using HTYOLOV4 and M2RFO-CNN

Deep and low-rank quaternion priors for color image processing

Implementation of automated pipeline for resting-state fMRI analysis with PACS integration

Stress detection using machine learning and image processing

Automated extraction of seed morphological traits from images

Identification of counterfeit Indian currency note using image processing and machine learning classifiers

Computer vision on X-ray data in industrial production and security applications: a comprehensive survey

IoT based image processing filters

Comprehensive automatic processing and analysis of adaptive optics flood illumination retinal images on healthy subjects

Research on super-resolution image based on deep learning

OCR-MRD: performance analysis of different optical character recognition engines for medical report digitization

Joint graph attention and asymmetric convolutional neural network for deep image compression

Brain tumor diagnosis using image fusion and deep learning

Brain tumor diagnosis using machine learning: a review

A study of air-water flow in a narrow rectangular duct using an image processing technique

Deep learning using a residual deconvolutional network enables real-time high-density single-molecule localization microscopy

Improved FRQI on superconducting processors and its restrictions in the NISQ era

DarSIA: an open-source Python toolbox for two-scale image processing of dynamics in porous media



Integration of remote sensing and machine learning for precision agriculture: a comprehensive perspective on applications.


| Category | Model Name | Application of Precision Agriculture | Reference |
|---|---|---|---|
| Supervised Learning | Naive Bayes | Classification of different crop diseases, soil types, etc.; prediction of the yield of wheat, corn, and other crops. | [ , ] |
| Supervised Learning | Logistic Regression | Assessment of the risk level of pest occurrence; prediction of the yield of wheat, corn, and other crops. | [ , ] |
| Supervised Learning | Linear Regression | Optimization of the amount of fertilizer application to improve the prediction accuracy of wheat, corn, and other crops yield. | [ , ] |
| Supervised Learning | Lasso Regression | Detection of the extent to which crops are attacked by diseases and insect pests. | [ , ] |
| Supervised Learning | AdaBoost Algorithm | Classification and identification of different crop species and detection of crop diseases and insect pests. | [ , ] |
| Supervised Learning | Linear Discriminant Analysis | Classification of soil types, identification of crop varieties, and determination of the effects of different soil fertilities on crop growth. | [ , ] |
| Supervised Learning | Recurrent Neural Network | Analysis of crop growth time series data and prediction of time series changes in crop diseases and insect pests. | [ , ] |
| Supervised Learning | Decision Tree | Selection of pest management strategies; identification of crop pest types. | [ , ] |
| Supervised Learning | Nearest Neighbor Algorithm | Identification of different crop varieties; evaluation of soil fertility grades. | [ , ] |
| Supervised Learning | XGBoost Algorithm | Prediction of yield of wheat, corn, and other crops based on climate, soil conditions, and other variables. | [ , ] |
| Supervised Learning | Long Short-Term Memory Network | Forecasting the long-term trend of crop yield based on climate variables, such as precipitation and temperature, and prediction of the outbreak of crop diseases and insect pests by time series. | [ , ] |
| Supervised Learning | Support Vector Regression | Crop growth monitoring and modeling, using remote sensing reflectance data to predict crop leaf area index, yield, etc. | [ , ] |
| Supervised Learning | Artificial Neural Network | Identification of crop diseases and insect pests; crop growth monitoring and modeling; prediction of crop leaf area index, yield, etc. | [ , ] |
| Supervised Learning | Convolutional Neural Network | Identification of crop leaf diseases and detection of disease invasion degree of crop leaves; prediction of crop leaf area index, yield, etc. | [ , ] |
| Supervised Learning | Random Forest | Identification of crop diseases and insect pests; crop growth monitoring and modeling; prediction of crop leaf area index, yield, etc. | [ , ] |
| Supervised Learning | Support Vector Machine | Identification of crop diseases and insect pests; crop growth monitoring and modeling; prediction of crop leaf area index, yield, etc. | [ , ] |
| Supervised Learning | CatBoost Algorithm | Identification of crop leaf diseases and detection of disease invasion degree of crop leaves. | [ , ] |
| Supervised Learning | Ridge Regression | Prediction of soil nutrients and key nutrient content based on soil sample data. | [ , ] |
| Supervised Learning | Stochastic Gradient Descent | Optimization of model parameters to improve the accuracy of agricultural prediction and decision-making models; application to complex agricultural system modeling and prediction. | [ , ] |
| Semi-Supervised Learning | Generative Semi-Supervised Learning | Assessment of soil quality; prediction of soil fertility, acidity, alkalinity, etc.; prediction and control of diseases and insect pests. | [ , ] |
| Semi-Supervised Learning | Autoencoders | Identification and classification of diseases and insect pests; assessment of the risk level of pest occurrence. | [ ] |
| Unsupervised Learning | Co-Training | Identification, classification, and risk assessment of diseases and insect pests; soil type classification. | [ ] |
| Unsupervised Learning | Probabilistic Graphical Model | Identification of crop diseases and insect pests; crop growth monitoring and modeling; prediction of crop leaf area index, yield, etc. | [ ] |
| Unsupervised Learning | Independent Component Analysis | Identification, classification, and risk assessment of diseases and insect pests; soil type classification. | [ ] |
| Unsupervised Learning | Anomaly Detection Algorithm | Detection of crop wilt, soil moisture, and pH anomaly. | [ ] |
| Unsupervised Learning | Self-Organizing Maps | Classification of crops and rapid identification of soil types. | [ ] |
| Unsupervised Learning | K-Means Clustering | Accurate identification of crops. | [ ] |
| Unsupervised Learning | Principal Component Analysis | Accurate classification of crops based on their growth characteristics (such as color, texture, size, etc.). | [ ] |
| Reinforcement Learning | Deep Q-Network | Retrieval of key growth information, such as vegetation index, to effectively monitor crop growth and development. | [ ] |
| Reinforcement Learning | Policy Gradient Methods | Optimization of crop irrigation and fertilization strategies. | [ ] |
| Reinforcement Learning | Q-learning | Optimization of agricultural decision making and environmental interaction. | [ ] |

Blockchain-based color medical image cryptosystem for industrial Internet of Healthcare Things (IoHT)

  • Published: 02 September 2024


  • Fatma Khallaf
  • Walid El-Shafai
  • El-Sayed M. El-Rabaie
  • Fathi E. Abd El-Samie

In recent years, smart devices and associated technologies, such as the Internet of Things (IoT), Industrial Internet of Things (IIoT), and Internet of Medical Things (IoMT), have proliferated substantially. However, the limited processing power and storage capacity of smart devices make them vulnerable to cyberattacks and render traditional security and cryptography techniques inadequate. To address these challenges, blockchain (BC) technology has emerged as a promising solution. This study introduces an efficient framework for the Internet of Healthcare Things (IoHT), presenting a novel cryptosystem for color medical images using BC technology in conjunction with the IoT, Secure Hash Algorithm 256-bit (SHA256), shuffling, and bitwise XOR operations. The encryption scheme is specifically designed for an IIoT grid network computing system and relies on the principles of diffusion and confusion. The strength of the proposed cryptosystem against differential attacks is evaluated with several comprehensive metrics. Simulation results and theoretical analysis demonstrate the cryptosystem's effectiveness, showing that it provides high levels of security and immunity to data leakage. The proposed cryptosystem offers a versatile range of technical solutions and strategies that are adaptable to various scenarios. The evaluation metrics, with approximate values of 99.61% for Number of Pixels Change Rate (NPCR), 33.46% for Unified Average Changed Intensity (UACI), and 8 for information entropy, closely align with the desired ideal outcomes. Consequently, this paper contributes to the advancement of secure and private systems for medical image encryption based on BC technology, potentially mitigating the risks associated with cyberattacks on smart medical devices.
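
The abstract names the building blocks (SHA256, shuffling, bitwise XOR) and the evaluation metrics, but not the exact construction. The sketch below is therefore only a toy illustration under stated assumptions, not the authors' scheme: it derives a pixel permutation and an XOR keystream from SHA-256 of a key, omits the blockchain layer entirely, and uses Python's non-cryptographic `random.Random` for the shuffle. All function names are hypothetical. The metric functions follow the standard definitions of NPCR, UACI and Shannon entropy for 8-bit data.

```python
import hashlib
import math
import random
from typing import List

def keystream(key: bytes, n: int) -> bytes:
    """Expand the key into n bytes by hashing key || counter with SHA-256."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])

def permutation(key: bytes, n: int) -> List[int]:
    """Key-dependent pixel order (toy: random.Random is NOT cryptographically secure)."""
    seed = int.from_bytes(hashlib.sha256(b"perm" + key).digest(), "big")
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return order

def encrypt(pixels: bytes, key: bytes) -> bytes:
    order = permutation(key, len(pixels))
    shuffled = bytes(pixels[i] for i in order)           # shuffling: relocate pixels
    mask = keystream(key, len(pixels))
    return bytes(p ^ k for p, k in zip(shuffled, mask))  # XOR: hide pixel values

def decrypt(cipher: bytes, key: bytes) -> bytes:
    mask = keystream(key, len(cipher))
    shuffled = bytes(c ^ k for c, k in zip(cipher, mask))
    plain = bytearray(len(cipher))
    for dst, src in enumerate(permutation(key, len(cipher))):
        plain[src] = shuffled[dst]                       # undo the permutation
    return bytes(plain)

def npcr(a: bytes, b: bytes) -> float:
    """Percentage of positions at which the two images differ."""
    return 100.0 * sum(x != y for x, y in zip(a, b)) / len(a)

def uaci(a: bytes, b: bytes) -> float:
    """Mean absolute intensity difference as a percentage of the 8-bit range."""
    return 100.0 * sum(abs(x - y) for x, y in zip(a, b)) / (255 * len(a))

def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (at most 8 for 8-bit data)."""
    counts = [0] * 256
    for value in data:
        counts[value] += 1
    return -sum(c / len(data) * math.log2(c / len(data)) for c in counts if c)

# Synthetic 64x64 single-channel gradient standing in for an image plane.
image = bytes((4 * x + y) % 256 for y in range(64) for x in range(64))
c1 = encrypt(image, b"key-one")
c2 = encrypt(image, b"key-two")
scores = (npcr(c1, c2), uaci(c1, c2), entropy(c1))
```

For two statistically independent, uniformly random 8-bit images the expected NPCR is 255/256 (about 99.61%) and the expected UACI is 257/768 (about 33.46%), which is why the abstract treats values near those figures as ideal. Note that this toy cipher, used with a fixed key, would not reach such values when only one plaintext pixel changes, because it has no plaintext-dependent diffusion step; the comparison above uses two different keys only to exercise the metric functions.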


Data availability

All data are available upon request from the corresponding author.



The authors are very grateful to all the institutions in the affiliation list for successfully performing this research work. The authors would like to thank Prince Sultan University for their support.

The authors did not receive support from any organization for the submitted work.

Author information

Authors and affiliations.

Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf, 32952, Egypt

Fatma Khallaf, Walid El-Shafai, El-Sayed M. El-Rabaie & Fathi E. Abd El-Samie

Department of Electrical Engineering, Faculty of Engineering, Ahram Canadian University, 6th October City, Giza, Egypt

Fatma Khallaf

Security Engineering Lab, Computer Science Department, Prince Sultan University, 11586, Riyadh, Saudi Arabia

Walid El-Shafai

Department of Information Technology, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, 11671, Riyadh, Saudi Arabia

Fathi E. Abd El-Samie



All authors contributed equally to this work.

Corresponding author

Correspondence to Walid El-Shafai.

Ethics declarations

Ethics approval and consent to participate.

All authors contributed and accepted to submit the current work.

Competing interests

The authors have neither relevant financial nor non-financial interests to disclose.

Conflict of interest

The authors declare that they have no conflicts of interest.

Consent for publication

All authors accept to submit and publish the submitted work.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Khallaf, F., El-Shafai, W., El-Rabaie, ES.M. et al. Blockchain-based color medical image cryptosystem for industrial Internet of Healthcare Things (IoHT). Multimed Tools Appl (2024).


Received : 10 October 2022

Revised : 20 June 2023

Accepted : 31 August 2023

Published : 02 September 2024



  • Medical images
  • Blockchain (BC)
  • Internet of Medical Things (IoMT)
  • Internet of Healthcare Things (IoHT)
  • Healthcare applications
  • Cybersecurity




  1. (PDF) Application of Image Processing in Real World

  2. (PDF) Review Paper On Image Processing

  3. (PDF) Digital image processing and applications

  4. Digital Image Processing Research Proposal [Professional Thesis Writers]

  5. Research paper on digital image processing. Digital Image Processing

  6. (PDF) Application of Image Processing in Agriculture: A Survey




  1. Image Processing: Research Opportunities and Challenges

    Image Processing: Research Opportunities and Challenges. Ravindra S. Hegadi. Department of Computer Science. Karnatak University, Dharwad-580003. ravindrahegadi@rediffmail. Abstract. Interest in ...

  2. (PDF) A Review on Image Processing

    Abstract. Image Processing includes changing the nature of an image in order to improve its pictorial information for human interpretation, for autonomous machine perception. Digital image ...

  3. (PDF) Advances in Artificial Intelligence for Image Processing

    AI has had a substantial influence on image processing, allowing cutting-edge methods and uses. The foundations of image processing are covered in this chapter, along with representation, formats ...

  4. Techniques and Applications of Image and Signal Processing : A

    This paper comprehensively overviews image and signal processing, including their fundamentals, advanced techniques, and applications. Image processing involves analyzing and manipulating digital images, while signal processing focuses on analyzing and interpreting signals in various domains. The fundamentals encompass digital signal representation, Fourier analysis, wavelet transforms ...

  5. Image processing

    Image processing is manipulation of an image that has been digitised and uploaded into a computer. Software programs modify the image to make it more useful, and can for example be used to enable ...

  6. Deep learning models for digital image processing: a review

    Within the domain of image processing, a wide array of methodologies is dedicated to tasks including denoising, enhancement, segmentation, feature extraction, and classification. These techniques collectively address the challenges and opportunities posed by different aspects of image analysis and manipulation, enabling applications across various fields. Each of these methodologies ...

  7. Image Processing Technology Based on Machine Learning

    Machine learning is a relatively new field. With the deepening of people's research in this field, the application of machine learning is increasingly extensive. On the other hand, with the advancement of science and technology, graphics have been an indispensable medium of information transmission, and image processing technology is also booming. However, the traditional image processing ...

  8. Developments in Image Processing using Deep learning and Reinforcement

    The present study thoroughly explores essential and recent improvements, applications, and advancements within the sphere of image processing, offering insights into a domain characterized by continual and swift evolution. Additionally, the paper delineates prospective avenues for future research in this dynamic field.

  9. Advances in image processing using machine learning techniques

    The paper 'Ship Images Detection and Classification Based on Convolutional Neural Network with Multiple Feature Regions', by Zhijing Xu, Jiuwu Sun, and Yuhao Huo (SPR-2021-10-0144.R2), presents an exciting application of image recognition and classification in the maritime industry to cope with significant challenges for intelligent ship ...

  10. Home

    The journal is dedicated to the real-time aspects of image and video processing, bridging the gap between theory and practice. Covers real-time image processing systems and algorithms for various applications. Presents practical and real-time architectures for image processing systems. Provides tools, simulation and modeling for real-time image ...

  11. Digital Image Processing: Advanced Technologies and Applications

    A Feature Paper should be a substantial original Article that involves several techniques or approaches, provides an outlook for future research directions and describes possible research applications. Feature papers are submitted upon individual invitation or recommendation by the scientific editors and must receive positive feedback from the ...

  12. Recent trends in image processing and pattern recognition

    The Call for Papers of the special issue was initially sent out to the participants of the 2018 conference (2nd International Conference on Recent Trends in Image Processing and Pattern Recognition). To attract high quality research articles, we also accepted papers for review from outside the conference event.

  13. Grand Challenges in Image Processing

    Introduction. The field of image processing has been the subject of intensive research and development activities for several decades. This broad area encompasses topics such as image/video processing, image/video analysis, image/video communications, image/video sensing, modeling and representation, computational imaging, electronic imaging, information forensics and security, 3D imaging ...

  14. Techniques and Challenges of Image Segmentation: A Review

    Image segmentation, which has become a research hotspot in the field of image processing and computer vision, refers to the process of dividing an image into meaningful and non-overlapping regions, and it is an essential step in natural scene understanding. Despite decades of effort and many achievements, there are still challenges in feature extraction and model design. In this paper, we ...
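
    The definition above (dividing an image into non-overlapping regions) can be illustrated with a minimal sketch. It is not the method of the reviewed paper: it assumes a grayscale image stored as a list of lists, a hypothetical fixed threshold, and 4-connectivity, and it labels each connected group of pixels that falls on the same side of the threshold. The function name `segment` and the sample data are illustrative only.

```python
from collections import deque


def segment(image, threshold):
    """Label 4-connected regions of pixels lying on the same side of `threshold`.

    Returns (labels, region_count). Every pixel receives exactly one label,
    so the regions are non-overlapping and together cover the whole image.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not yet labelled"
    region_count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # pixel already belongs to a region
            region_count += 1
            bright = image[r][c] >= threshold
            labels[r][c] = region_count
            queue = deque([(r, c)])
            # Breadth-first flood fill over neighbours on the same side.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and (image[ny][nx] >= threshold) == bright):
                        labels[ny][nx] = region_count
                        queue.append((ny, nx))
    return labels, region_count


# Illustrative 4x4 image: a dark background with two separate bright areas.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 10, 10],
    [220, 10, 10, 10],
]
labels, count = segment(image, threshold=128)
```

    A fixed global threshold is the simplest possible criterion; it is used here only to keep the sketch short and does not address the feature-extraction and model-design challenges the snippet mentions.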

  15. Application of artificial intelligence algorithms in image processing

    To achieve better image processing results, this paper focuses on the application of artificial intelligence algorithms in image processing. Image segmentation is a technology that decomposes images into regions with different characteristics and extracts useful targets. ... After the practice and research of image processing, the ...

  16. 471383 PDFs

    All kinds of image processing approaches. | Explore the latest full-text research PDFs, articles, conference papers, preprints and more on IMAGE PROCESSING. Find methods information, sources ...

  17. Viewpoints on Medical Image Processing: From Science to Application

    This paper focuses on recent developments from science to applications, analyzing the past fifteen years of the proceedings of the German annual meeting on medical image processing (BVM). Furthermore, some members of the program committee present their personal points of view: (i) multi-modality for imaging and diagnosis, (ii ...

  18. Deep Learning-based Image Text Processing Research

    Deep learning is a powerful multi-layer architecture that has important applications in image processing and text classification. This paper first introduces the development of deep learning and two important algorithms of deep learning: convolutional neural networks and recurrent neural networks. The paper then introduces three applications of deep learning for image recognition, image ...

  19. Study on Image Filtering -- Techniques, Algorithm and Applications

    Image processing is one of the most rapidly emerging and widely growing techniques, making it a lively research field. Image processing means converting an image to a digital format and then performing different operations on it, such as improving the image or extracting valuable data. Image filtering is one of the fascinating applications of image processing. Image filtering is a technique for ...
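
    As one concrete instance of image filtering, the sketch below applies a 3x3 mean (box) filter to a small grayscale image. The choice of a mean filter, the name `mean_filter`, and the border handling (averaging only over the neighbours that exist) are assumptions made for illustration; the paper itself may cover different filters.

```python
def mean_filter(image):
    """Smooth a grayscale image (list of lists) with a 3x3 mean filter.

    Border pixels are averaged over the neighbours that actually exist,
    so the output has the same size as the input.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the pixel and its in-bounds neighbours.
            window = [
                image[y][x]
                for y in range(max(0, r - 1), min(rows, r + 2))
                for x in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = sum(window) / len(window)
    return out


# Illustrative input: a flat region with one noisy spike in the centre.
noisy = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
smoothed = mean_filter(noisy)
```

    In this example the central spike of 100 is pulled down to 20.0, which shows the noise-suppressing effect; the cost is that sharp edges are blurred in the same way.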

  20. Top 1287 papers published in the topic of Image processing in 2023

    Brain Tumor Diagnosis using Image Fusion and Deep Learning. 22 Mar 2023. TL;DR: In this paper, brain tumor images are used with a discrete cosine transform-based fusion approach to create fused pictures, which can enhance the quality of the final images and hence enhance classifier performance.

  21. Real-time intelligent image processing for the internet of things

    Overall, the eleven papers appearing in this special issue demonstrate multiple perspectives and approaches with implications for the theories, models, and algorithms used in real-time image processing and its IoT applications. These papers identify frameworks and techniques for artificial intelligence and deep learning, helping the field to ...

  22. Integration of Remote Sensing and Machine Learning for Precision ...

  23. Development Model Based on Visual Image Big Data Applied to Art

    This paper explores the application of visual image big data (BD) in art management, and proposes and develops a new art management model. The study first surveys big data and its applications, focusing on the characteristics of big data and its application methods in art management.

  24. (PDF) Studies on application of image processing in ...

    1. Studies on application of image processing in various fields: An overview. T Prabaharan, P Periasamy, V Mugendiran, Ramanan. 1 Research Scholar, St. Peter's Institute of Higher Education and ...

  25. An improved multi‐scale YOLOv8 for apple leaf dense lesion detection

    In order to enhance the detection accuracy of multi-scale disease spots, this paper proposes a more suitable method based on YOLOv8. The proposed approach is validated on a dataset containing eight kinds of apple leaf disease instances in complex field scenarios.

  26. Artificial Intelligence Image Processing Based on Wireless Sensor

    In addition, the popularity of mobile networks enables more users to participate in environmental monitoring, obtain more data through crowdsourcing, and improve the breadth and depth of research. With the advancement of image processing and artificial intelligence technology, it is possible to combine these technologies with wireless sensor ...

  27. Applications of image processing algorithms on the modern digital image

    Abstract. Digital image processing technology is one of the most vital areas of the computer science discipline. Its application areas involve computer-aided design, Fourier transformation, three ...

  28. Application of Wiener Filter Based on Improved BB Gradient Descent in

    Iris recognition, renowned for its exceptional precision, has been extensively utilized across diverse industries. However, the presence of noise and blur frequently compromises the quality of iris images, thereby adversely affecting recognition accuracy. In this research, we have refined the traditional Wiener filter image restoration technique by integrating it with a gradient descent ...

  29. Implementation of Wavelet Transform Analysis Filter Using FPGA

    The Discrete Wavelet Transform (DWT) has been an important mathematical tool in signal processing applications over the last decades. DWT is therefore widely used in several domains such as signal and image processing, compression, statistics, computer vision, and data communication [1,2,3]. Various transmission systems nowadays like WiFi (IEEE 802.11) and WiMAX (IEEE 802.16) are based on ...
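
    To make the idea of a DWT analysis filter concrete, here is a minimal software sketch of a single-level, one-dimensional transform. It assumes the Haar wavelet, which is chosen only because it is the simplest case; the snippet does not say which wavelet or architecture the FPGA design uses, and the function names `haar_dwt` and `haar_idwt` are hypothetical.

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT of an even-length sequence.

    Returns (approx, detail): scaled pairwise sums (low-pass output)
    and scaled pairwise differences (high-pass output).
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, rebuilding the original samples pair by pair."""
    s = math.sqrt(2)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


# Illustrative signal; the values are arbitrary.
signal = [4, 6, 10, 12, 8, 8, 2, 0]
approx, detail = haar_dwt(signal)
restored = haar_idwt(approx, detail)
```

    The approximation coefficients carry a half-length, smoothed version of the signal and the detail coefficients carry the local differences, so the pair can be inverted to recover the input up to floating-point rounding. For a 2D image the same step would typically be applied along rows and then along columns.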

  30. Blockchain-based color medical image cryptosystem for industrial

    Algorithm (1): Color Medical Image Security for Smart IoT E-Healthcare Applications. 1) Initialize the algorithm. 2) Obtain the output processing result. 3) Set up the BC cloud service among network elements. 4) Identify image capturing devices as nodes. 5) Perform initial checks for the image sent to the network. 6)