
Review Article: Phishing Attacks: A Recent Comprehensive Study and a New Anatomy

www.frontiersin.org

  • Cardiff School of Technologies, Cardiff Metropolitan University, Cardiff, United Kingdom

With the significant growth of internet usage, people increasingly share their personal information online. As a result, an enormous amount of personal information and financial transactions become vulnerable to cybercriminals. Phishing is an example of a highly effective form of cybercrime that enables criminals to deceive users and steal important data. Since the first reported phishing attack in 1990, it has evolved into a more sophisticated attack vector. At present, phishing is considered one of the most frequent examples of fraud activity on the Internet. Phishing attacks can lead to severe losses for their victims, including the loss of sensitive information, identity theft, and the exposure of corporate and government secrets. This article aims to evaluate these attacks by identifying the current state of phishing and reviewing existing phishing techniques. Previous studies have classified phishing attacks according to fundamental phishing mechanisms and countermeasures, disregarding the importance of the end-to-end lifecycle of phishing. This article proposes a new, detailed anatomy of phishing that covers attack phases, attacker types, vulnerabilities, threats, targets, attack mediums, and attacking techniques. The proposed anatomy will help readers understand the lifecycle of a phishing attack, which in turn will increase awareness of these attacks and the techniques being used, and will also assist in developing a holistic anti-phishing system. Furthermore, some precautionary countermeasures are investigated, and new strategies are suggested.

Introduction

The digital world is rapidly expanding and evolving, and so are the cybercriminals who rely on the illegal use of digital assets, especially personal information, to inflict damage on individuals. One of the most threatening crimes against internet users is “identity theft” ( Ramanathan and Wechsler, 2012 ), in which an attacker impersonates a person’s identity to steal and use their personal information (i.e., bank details, social security number, or credit card numbers) for the attacker’s own gain, not only to steal money but also to commit other crimes ( Arachchilage and Love, 2014 ). Cybercriminals have continually developed their methods for stealing information, but social-engineering-based attacks remain their favorite approach. One of the social engineering crimes that allows an attacker to perform identity theft is the phishing attack. Phishing has been one of the biggest concerns, as many internet users fall victim to it. It is a social engineering attack wherein a phisher attempts to lure users into revealing their sensitive information by impersonating a public or trustworthy organization in an automated fashion, so that the internet user trusts the message and reveals sensitive information to the attacker ( Jakobsson and Myers, 2006 ). In phishing attacks, phishers use social engineering techniques to redirect users to malicious websites after they receive an email and follow an embedded link ( Gupta et al., 2015 ). Alternatively, attackers can exploit other mediums to execute their attacks, such as Voice over IP (VoIP), Short Message Service (SMS), and Instant Messaging (IM) ( Gupta et al., 2015 ). Phishers have also turned from sending mass-email messages, which target unspecified victims, to more selective phishing, sending their emails to specific victims, a technique called “spear-phishing.”

Cybercriminals usually exploit users who lack digital/cyber ethics or who are poorly trained, in addition to technical vulnerabilities, to reach their goals. Susceptibility to phishing varies between individuals according to their attributes and awareness level; therefore, in most attacks, phishers exploit human nature rather than sophisticated technologies. Even though the weakness in the information security chain is attributed to humans more than to technology, there is a lack of understanding about which link in this chain is penetrated first. Studies have found that certain personal characteristics make some people more receptive to various lures ( Iuga et al., 2016 ; Ovelgönne et al., 2017 ; Crane, 2019 ). For example, individuals who tend to obey authority more than others are more likely to fall victim to a Business Email Compromise (BEC) email that pretends to be from a financial institution and requests immediate action, seeing it as a legitimate message ( Barracuda, 2020 ). Greed is another human weakness that attackers can exploit, for example, through emails offering large discounts, free gift cards, and the like ( Workman, 2008 ).

Various channels are used by attackers to lure victims, through a scam or in an indirect manner, to deliver a payload and obtain sensitive personal information from the victim ( Ollmann, 2004 ). Phishing attacks have already led to damaging losses and can affect the victim not only financially but also in other serious ways, such as loss of reputation or compromise of national security ( Ollmann, 2004 ; Herley and Florêncio, 2008 ). Cybercrime damages are expected to cost the world $6 trillion annually by 2021, up from $3 trillion in 2015, according to Cybersecurity Ventures ( Morgan, 2019 ). Phishing attacks are the most common type of cybersecurity breach, as stated by the official statistics from the Cybersecurity Breaches Survey 2020 in the United Kingdom ( GOV.UK, 2020 ). Although these attacks affect organizations and individuals alike, the loss for organizations is significant; it includes the cost of recovery, loss of reputation, fines from information laws/regulations, and reduced productivity ( Medvet et al., 2008 ).

Phishing is a field of study that merges social psychology, technical systems, security subjects, and politics. Phishing attacks are increasingly prevalent: a recent study ( Proofpoint, 2020 ) found that nearly 90% of organizations faced targeted phishing attacks in 2019. Of these, 88% experienced spear-phishing attacks, 83% faced voice phishing (Vishing), 86% dealt with social media attacks, 84% reported SMS/text phishing (SMishing), and 81% reported malicious USB drops. The 2018 Proofpoint annual report ( Proofpoint, 2019a ) stated that the proportion of organizations experiencing phishing attacks jumped from 76% in 2017 to 83% in 2018, with all phishing types occurring more frequently than in 2017. The number of phishing attacks identified in the second quarter of 2019 was notably higher than the number recorded in the previous three quarters, and the number in the first quarter of 2020 was higher than in the preceding quarter, according to a report from the Anti-Phishing Working Group (APWG) ( APWG, 2018 ), which confirms that phishing attacks are on the rise. These findings show that phishing attacks have increased continuously in recent years, have become more sophisticated, and have gained more attention from cyber researchers and developers seeking to detect and mitigate their impact. This article aims to determine the severity of the phishing problem by providing detailed insights into the phishing phenomenon in terms of phishing definitions, current statistics, anatomy, and potential countermeasures.

The rest of the article is organized as follows. Phishing Definitions provides a number of phishing definitions as well as some real-world examples of phishing. The evolution and development of phishing attacks are discussed in Developing a Phishing Campaign. What Attributes Make Some People More Susceptible to Phishing Attacks Than Others explores susceptibility to these attacks. The proposed phishing anatomy and types of phishing attacks are elaborated in Proposed Phishing Anatomy. In Countermeasures, various anti-phishing countermeasures are discussed. The conclusions of this study are drawn in Conclusion.

Phishing Definitions

Various definitions of the term “phishing” have been proposed and discussed by experts, researchers, and cybersecurity institutions. Although there is no established definition of the term due to its continuous evolution, it has been defined in numerous ways based on its use and context. The process of tricking the recipient into taking the attacker’s desired action is considered the de facto definition of phishing attacks in general. Some definitions name websites as the only possible medium for conducting attacks. The study ( Merwe et al., 2005 , p. 1) defines phishing as “a fraudulent activity that involves the creation of a replica of an existing web page to fool a user into submitting personal, financial, or password data.” This definition describes phishing as an attempt to scam the user into revealing sensitive information such as bank details and credit card numbers by sending malicious links that lead the user to a fake website. Others name emails as the only attack vector. For instance, PhishTank (2006) defines phishing as “a fraudulent attempt, usually made through email, to steal your personal information.” A description of phishing stated by ( Kirda and Kruegel, 2005 , p. 1) defines it as “a form of online identity theft that aims to steal sensitive information such as online banking passwords and credit card information from users.” Some definitions highlight the usage of combined social and technical skills. For instance, APWG defines phishing as “a criminal mechanism employing both social engineering and technical subterfuge to steal consumers’ personal identity data and financial account credentials” ( APWG, 2018 , p. 1).
Moreover, the definition from the United States Computer Emergency Readiness Team (US-CERT) describes phishing as “a form of social engineering that uses email or malicious websites (among other channels) to solicit personal information from an individual or company by posing as a trustworthy organization or entity” ( CISA, 2018 ). A detailed definition is presented in ( Jakobsson and Myers, 2006 , p. 1), which describes phishing as “a form of social engineering in which an attacker, also known as a phisher, attempts to fraudulently retrieve legitimate users’ confidential or sensitive credentials by mimicking electronic communications from a trustworthy or public organization in an automated fashion. Such communications are most frequently done through emails that direct users to fraudulent websites that in turn collect the credentials in question.”

In order to understand the anatomy of the phishing attack, there is a need for a clear and detailed definition that underpins previous definitions. Since a phishing attack constitutes a mix of technical and social engineering tactics, a new definition (i.e., anatomy) is proposed in this article, which describes the complete process of a phishing attack. This provides a better understanding for readers, as it covers phishing attacks in depth from a range of perspectives and angles, which may help beginner readers or researchers in this field. To this end, we define phishing as a socio-technical attack, in which the attacker targets specific valuables by exploiting an existing vulnerability to pass a specific threat via a selected medium into the victim’s system, utilizing social engineering tricks or some other techniques to convince the victim into taking a specific action that causes various types of damages.

Figure 1 depicts the general process flow of a phishing attack, which contains four phases; these phases are elaborated in Proposed Phishing Anatomy. As shown in Figure 1 , in most attacks the phishing process is initiated by gathering information about the target. The phisher then decides which attack method to use, as the initial steps of the planning phase. The second phase is the preparation phase, in which the phisher searches for vulnerabilities through which he can trap the victim. The phisher conducts his attack in the third phase and waits for a response from the victim. In turn, the attacker collects the spoils in the valuables acquisition phase, which is the last step in the phishing process. To illustrate the above process with an example, an attacker may send a fraudulent email to an internet user pretending to be from the victim’s bank, requesting the user to confirm the bank account details, or else the account may be suspended. The user may think this email is legitimate, since it uses the same graphic elements, trademarks, and colors as their legitimate bank. The submitted information will then be transmitted directly to the phisher, who will use it for various malicious purposes such as money withdrawal, blackmail, or committing further fraud.


FIGURE 1 . General phishing attack process.

Real-World Phishing Examples

Some real-world examples of phishing attacks are discussed in this section to illustrate the complexity of recent phishing attacks. Figure 2 shows a screenshot of a suspicious phishing email that passed a university’s spam filters and reached the recipient’s mailbox. As shown in Figure 2 , the phisher creates a sense of importance or urgency in the subject line through the word ‘important,’ so that the email triggers a psychological reaction that prompts the user to click the “View message” button. The email contains a suspicious embedded button; indeed, when hovering over it, the Uniform Resource Locator (URL) shown in the status bar does not match the link’s claimed destination. Another clue in this example is that the sender's address is questionable and not known to the receiver. Clicking on the fake attachment button will result either in the installation of a virus or worm onto the computer, or in the user handing over their credentials after being redirected to a fake login page.
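The URL-mismatch clue described above can also be checked programmatically. The following is a minimal sketch (using only the Python standard library; the function and class names are illustrative, not from any referenced tool) that flags anchors in an HTML email body whose visible text names one host while the underlying href points to another:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkAuditor(HTMLParser):
    """Collect (visible text, href) pairs from an HTML email body."""
    def __init__(self):
        super().__init__()
        self._href = None
        self._text = []
        self.links = []  # list of (display_text, href)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href", "")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

def suspicious_links(html_body):
    """Flag anchors whose visible text looks like a URL for one host
    while the underlying href points to a different host."""
    auditor = LinkAuditor()
    auditor.feed(html_body)
    flagged = []
    for text, href in auditor.links:
        # Only compare when the display text itself resembles a hostname/URL.
        if "." not in text or " " in text:
            continue
        shown = urlparse(text if "://" in text else "//" + text).hostname
        actual = urlparse(href).hostname
        if shown and actual and shown != actual:
            flagged.append((text, href))
    return flagged
```

A real anti-phishing filter would of course need to handle redirects, subdomain tricks, and encoded URLs as well; this sketch only captures the single heuristic a human applies when hovering over a link.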


FIGURE 2 . Screenshot of a real suspicious phishing email received by the authors’ institution in February 2019.

More recently, phishers have taken advantage of the Coronavirus pandemic (COVID-19) to fool their prey. Many Coronavirus-themed scam messages sent by attackers exploited people’s fear of contracting COVID-19 and their urgency to look for information related to the virus (e.g., some of these attacks are related to Personal Protective Equipment (PPE) such as facemasks). The WHO stated that COVID-19 has created an “infodemic,” which is favorable to phishers ( Hewage, 2020 ). Cybercriminals have also lured people into opening attachments claiming to contain information about people with Coronavirus in the local area.

Figure 3 shows an example of a phishing e-mail in which the attacker claimed to be the recipient’s neighbor, pretending to be dying from the virus and threatening to infect the victim unless a ransom was paid ( Kaspersky, 2020 ).


FIGURE 3 . Screenshot of a coronavirus-related phishing email ( Kaspersky, 2020 ).

Another example is the phishing attack spotted by a security researcher at Akamai in January 2019. The attack attempted to use Google Translate to mask suspicious URLs, prefacing them with the legitimate-looking “ www.translate.google.com ” address to dupe users into logging in ( Rhett, 2019 ). That attack was followed by phishing scams asking for Netflix payment details, for example, or embedded in promoted tweets that redirected users to genuine-looking PayPal login pages. Although the bogus page was very well designed in the latter case, the lack of a Hypertext Transfer Protocol Secure (HTTPS) lock and misspellings in the URL were key red flags that this was actually a phishing attempt ( Keck, 2018 ). Figure 4A shows a screenshot of a phishing email received by the Federal Trade Commission (FTC). The email prompts the user to update his payment method by clicking on a link, pretending that Netflix is having a problem with the user's billing information ( FTC, 2018 ).
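The Google Translate trick works because the browser shows a trusted host while the real destination rides along in the query string. A defensive check can unwrap such proxy URLs before evaluating them; the sketch below is a minimal illustration, assuming the proxied target travels in a `u` or `url` query parameter (as in older Google Translate proxy links; real services may use different parameters and hosts):

```python
from urllib.parse import urlparse, parse_qs

# Hosts assumed to act as translation-style proxies (illustrative list only).
PROXY_HOSTS = {"translate.google.com", "translate.googleusercontent.com"}

def unwrap_translate_url(url):
    """Return the embedded target URL if `url` is wrapped by a
    translate-style proxy; otherwise return `url` unchanged."""
    parts = urlparse(url)
    if parts.hostname in PROXY_HOSTS:
        params = parse_qs(parts.query)
        # Assumed parameter names; adjust for the proxy actually in use.
        for key in ("u", "url"):
            if key in params:
                return params[key][0]
    return url
```

Any URL reputation check (blocklists, HTTPS verification, brand matching) should then be applied to the unwrapped URL, not to the trusted-looking proxy address.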


FIGURE 4 . Screenshot of the (A) Netflix scam email and (B) fraudulent text message (Apple) ( Keck, 2018 ; Rhett, 2019 )

Figure 4B shows a text message as another example of phishing that is difficult to identify as fake ( Pompon et al., 2018 ). The message appears to come from Apple, asking the recipient to update their account details. A sense of urgency is used in the message as a lure to motivate the user to respond.

Developing a Phishing Campaign

Today, phishing is considered one of the most pressing cybersecurity threats for all internet users, regardless of their technical understanding and how cautious they are. These attacks are getting more sophisticated by the day and can cause severe losses to the victims. Although the attacker’s first motivation is stealing money, stolen sensitive data can be used for other malicious purposes such as infiltrating sensitive infrastructures for espionage purposes. Therefore, phishers keep on developing their techniques over time with the development of electronic media. The following sub-sections discuss phishing evolution and the latest statistics.

Historical Overview

Cybersecurity has been a major concern since the beginning of ARPANET, which is considered to be the first wide-area packet-switching network with distributed control and one of the first networks to implement the TCP/IP protocol suite. The term “phishing,” which was also called carding or brand spoofing, was coined for the first time in 1996, when hackers created randomized credit card numbers using an algorithm to steal users' passwords from America Online (AOL) ( Whitman and Mattord, 2012 ; Cui et al., 2017 ). Phishers then used instant messages or emails to reach users by posing as AOL employees and convincing users to reveal their passwords. Attackers believed that requesting customers to update their accounts would be an effective way to get them to disclose their sensitive information; thereafter, phishers started to target larger financial companies. The author in ( Ollmann, 2004 ) believes that the “ph” in phishing comes from the term “phreaks,” coined by John Draper, also known as Captain Crunch, and used by early Internet criminals when they “phreaked” telephone systems. The “f” in “fishing” was replaced with “ph” in “phishing,” as both convey the same idea: fishing for passwords and sensitive information from the sea of internet users. Over time, phishers developed various and more advanced types of scams for launching their attacks. Sometimes the purpose of the attack is not limited to stealing sensitive information; it can also involve injecting viruses or downloading malicious programs onto a victim's computer. Phishers make use of a trusted source (for instance, a bank helpdesk) to deceive victims into disclosing their sensitive information ( Ollmann, 2004 ).

Phishing attacks are rapidly evolving, and spoofing methods are continuously changing as a response to new corresponding countermeasures. Hackers take advantage of new tool-kits and technologies to exploit systems’ vulnerabilities and also use social engineering techniques to fool unsuspecting users. Therefore, phishing attacks continue to be one of the most successful cybercrime attacks.

The Latest Statistics of Phishing Attacks

Phishing attacks are becoming more common, and they are increasing significantly in both sophistication and frequency. Lately, phishing attacks have appeared in various forms. Different channels and threats are exploited by attackers to trap more victims. These channels could be social networks or VoIP, and they can carry various types of threats such as malicious attachments, embedded links within an email, instant messages, scam calls, or others. Criminals know that social-engineering-based methods are effective and profitable; therefore, they keep focusing on social engineering attacks, their favorite weapon, instead of concentrating on sophisticated techniques and toolkits. Phishing attacks have reached unprecedented levels, especially with emerging technologies such as mobile and social media ( Marforio et al., 2015 ). For instance, from 2017 to 2020, phishing attacks among businesses in the United Kingdom increased from 72% to 86%, with a large proportion of the attacks originating from social media ( GOV.UK, 2020 ).

The APWG Phishing Activity Trends Report analyzes and measures the evolution, proliferation, and propagation of phishing attacks reported to the APWG. Figure 5 shows the growth in phishing attacks from 2015 to 2020 by quarters based on APWG annual reports ( APWG, 2020 ). As demonstrated in Figure 5 , in the third quarter of 2019, the number of phishing attacks rose to 266,387, which is the highest level in three years since late 2016. This was up 46% from the 182,465 for the second quarter, and almost double the 138,328 seen in the fourth quarter of 2018. The number of unique phishing e-mails reported to APWG in the same quarter was 118,260. Furthermore, it was found that the number of brands targeted by phishing campaigns was 1,283.


FIGURE 5 . The growth in phishing attacks 2015–2020 by quarters based on data collected from APWG annual reports.

Cybercriminals always take advantage of disasters and hot events for their own gain. With the beginning of the COVID-19 crisis, a variety of themed phishing and malware attacks were launched by phishers against workers, healthcare facilities, and even the general public. A report from Microsoft ( Microsoft, 2020 ) showed that cyber-attacks related to COVID-19 spiked to an unprecedented level in March; most of these scams were fake COVID-19 websites, according to security company RiskIQ ( RISKIQ, 2020 ). The total number of phishing attacks observed by APWG in the first quarter of 2020 was 165,772, up from the 162,155 observed in the fourth quarter of 2019. The number of unique phishing reports submitted to APWG during the first quarter of 2020 was 139,685, up from 132,553 in the fourth quarter of 2019, 122,359 in the third quarter of 2019, and 112,163 in the second quarter of 2019 ( APWG, 2020 ).

A study ( KeepnetLABS, 2018 ) confirmed that more than 91% of system breaches are caused by attacks initiated by email. Although cybercriminals use email as the main medium for leveraging their attacks, many organizations also faced a high volume of other social engineering attacks in 2019, such as social media attacks, SMishing attacks, Vishing attacks, and USB-based attacks (for example, hiding and delivering malware to smartphones via USB phone chargers and distributing malware-laden free USBs) ( Proofpoint, 2020 ). Info-security professionals reported a higher frequency of all types of social engineering attacks year-on-year, according to a report presented by Proofpoint: spear phishing increased to 64% in 2018 from 53% in 2017, Vishing and/or SMishing increased to 49% from 45%, and USB attacks increased to 4% from 3%. The positive finding in this study is that 59% of suspicious emails reported by end-users were classified as potential phishing, indicating that employees are becoming more security-aware, diligent, and thoughtful about the emails they receive ( Proofpoint, 2019a ). In all its forms, phishing can be one of the easiest cyber attacks to fall for. Given the increasing prevalence of different phishing types, a survey was conducted by Proofpoint to identify the strengths and weaknesses of particular regions in terms of fundamental cybersecurity concepts. In this study, 7,000 end-users were asked to identify multiple terms such as phishing, ransomware, SMishing, and Vishing across seven countries: the US, United Kingdom, France, Germany, Italy, Australia, and Japan. The responses differed from country to country; respondents from the United Kingdom recorded the highest recognition of the term phishing at 70%, and likewise of the term ransomware at 60%. In contrast, the United Kingdom recorded only 18% recognition for both Vishing and SMishing ( Proofpoint, 2019a ), as shown in Table 1 .


TABLE 1 . Percentage of respondents understanding multiple cybersecurity terms from different countries.

On the other hand, a report by Wombat Security reflects responses from more than 6,000 working adults about receiving fraudulent solicitations across six countries: the US, United Kingdom, Germany, France, Italy, and Australia ( Kaspersky, 2020 ). Respondents from the United Kingdom stated that they received fraudulent solicitations through the following sources: email 62%, phone call 27%, text message 16%, mailed letter 8%, and social media 10%; 17% confirmed that they had been victims of identity theft ( Kaspersky, 2020 ). The consequences of responding to phishing are serious and costly. For instance, United Kingdom losses from financial fraud across payment cards, remote banking, and cheques totaled £768.8 million in 2016 ( Financial Fraud Action UK, 2017 ). Indeed, the losses resulting from phishing attacks are not limited to financial losses, which might exceed millions of pounds, but also include loss of customers and reputation. According to the 2020 State of the Phish report ( Proofpoint, 2020 ), damages from successful phishing attacks can range from lost productivity to cash outlay. The cost can include lost employee hours, remediation time for info-security teams due to incident response, damage to reputation, lost intellectual property, direct monetary losses, compliance fines, lost customers, legal fees, and more.

There are many targets of phishing, including end-users, businesses, financial services (e.g., banks, credit card companies, and PayPal), retail (e.g., eBay, Amazon), and Internet Service Providers ( wombatsecurity.com, 2018 ). The distribution of organizations affected by phishing, as detected by Kaspersky Labs globally in the first quarter of 2020, is shown in Figure 6 . As shown in the figure, online stores were at the top of the targeted list (18.12%), followed by global Internet portals (16.44%) and social networks in third place (13.07%) ( Kaspersky, 2020 ). The most impersonated brands overall for the first quarter of 2020 were Apple, Netflix, Yahoo, WhatsApp, PayPal, Chase, Facebook, Microsoft, eBay, and Amazon ( Checkpoint, 2020 ).


FIGURE 6 . Distribution of organizations affected by phishing attacks detected by Kaspersky in quarter one of 2020.

Phishing attacks can take a variety of forms to target people and steal sensitive information from them. Current data shows that phishing attacks are still effective, which indicates that existing countermeasures are not enough to detect and prevent these attacks, especially on smart devices. The social engineering element of the phishing attack has been effective in bypassing existing defenses to date. Therefore, it is essential to understand what makes people fall victim to phishing attacks. What Attributes Make Some People More Susceptible to Phishing Attacks Than Others discusses the human attributes that are exploited by phishers.

What Attributes Make Some People More Susceptible to Phishing Attacks Than Others

Why do most existing defenses against phishing not work? What personal and contextual attributes make some users more susceptible to phishing attacks than others? Different studies have discussed these two questions and examined the factors affecting susceptibility to a phishing attack and the reasons why people get phished. Human nature is considered one of the most influential factors in the process of phishing. Everyone is susceptible to phishing attacks because phishers play on an individual’s specific psychological/emotional triggers as well as technical vulnerabilities ( KeepnetLABS, 2018 ; Crane, 2019 ). For instance, individuals are likely to click on a link within an email when they see authority cues ( Furnell, 2007 ). In 2017, a report by PhishMe (2017) found that curiosity and urgency were the most common triggers that encouraged people to respond to an attack; later, these triggers were replaced by entertainment, social media, and reward/recognition as the top emotional motivators. In the context of a phishing attack, these psychological triggers often override people’s conscious decisions. For instance, when people are working under stress, they tend to make decisions without thinking of the possible consequences and options ( Lininger and Vines, 2005 ). Moreover, everyday stress can damage areas of the brain that control emotions ( Keinan, 1987 ). Several studies have addressed the association between susceptibility to phishing and demographic variables (e.g., age and gender) in an attempt to identify the reasons behind phishing success in different population groups. Although everyone is susceptible to phishing, studies have shown that different age groups are more susceptible to certain lures than others. For example, participants aged between 18 and 25 are more susceptible to phishing than other age groups ( Williams et al., 2018 ).
The reason younger adults are more likely to fall for phishing is that they are more trusting when it comes to online communication and are also more likely to click on unsolicited e-mails ( Getsafeonline, 2017 ). Moreover, older participants are less susceptible because they tend to be less impulsive ( Arnsten et al., 2012 ). Some studies have found that women are more susceptible than men to phishing, as they click on links in phishing emails and enter information into phishing websites more often than men do. The study published by Getsafeonline (2017) identifies women’s comparative lack of technical know-how and experience as the main reason for this. In contrast, a survey conducted by the antivirus company Avast found that men are more susceptible to smartphone malware attacks than women ( Ong, 2014 ). These findings confirmed the results of the study ( Hadlington, 2017 ), which found that men are more susceptible to mobile phishing attacks than women; the main reason, according to Hadlington (2017) , is that men are more comfortable and trusting when using mobile online services. The relationships between the demographic characteristics of individuals and their ability to correctly detect a phishing attack were studied in ( Iuga et al., 2016 ). The study showed that participants with high Personal Computer (PC) usage tend to identify phishing attempts more accurately and faster than other participants. Another study ( Hadlington, 2017 ) showed that internet addiction, attentional impulsivity, and motor impulsivity were significant positive predictors of risky cybersecurity behaviors, while a positive attitude toward cybersecurity in business was negatively related to risky cybersecurity behaviors. On the other hand, people’s trust in certain websites and platforms is one of the holes that scammers exploit, especially when that trust is based on visual appearance, which can fool the user ( Hadlington, 2017 ).
For example, fraudsters take advantage of people’s trust in a website by replacing a letter in the legitimate domain with a number, such as goog1e.com instead of google.com . Another study ( Yeboah-Boateng and Amanor, 2014 ) demonstrates that although college students are unlikely to disclose personal information in response to an email, they can easily be tricked by other tactics, making them alarmingly susceptible to email phishing attacks. The reason is that most college students have no grounding in ICT, especially in terms of security. Although security terms like viruses, online scams, and worms are known to some end-users, these users may have no knowledge of phishing, SMishing, Vishing, and other attacks ( Lin et al., 2012 ). The same study ( Yeboah-Boateng and Amanor, 2014 ) also shows that younger students are more susceptible than older students, and that students who worked full-time were less likely to fall for phishing.
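The goog1e.com trick relies on digits that visually resemble letters. A defensive check can normalize such substitutions and compare the result against known brand domains; the following is a minimal sketch (the substitution table and brand allowlist are illustrative, not exhaustive, and a production check would also cover Unicode homoglyphs):

```python
# Visually confusable digit-for-letter substitutions (illustrative only).
CONFUSABLES = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})

# A hypothetical allowlist of brand domains to check against.
KNOWN_BRANDS = {"google.com", "paypal.com", "amazon.com"}

def lookalike_of(domain):
    """Return the brand domain that `domain` imitates via digit swaps,
    or None if it is the genuine domain or matches no known brand."""
    normalized = domain.lower().translate(CONFUSABLES)
    if normalized != domain.lower() and normalized in KNOWN_BRANDS:
        return normalized
    return None
```

For instance, this flags goog1e.com as imitating google.com while leaving the genuine domain untouched; richer typosquatting detectors also use edit distance and Unicode confusables tables.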

The study reported in (Diaz et al., 2020) examines user click rates and demographics among undergraduates by sending phishing attacks to 1,350 randomly selected students. Students from various disciplines were involved in the test, from engineering and mathematics to arts and social sciences. The study observed that student susceptibility was affected by a range of factors, such as phishing awareness, time spent on the computer, cyber training, age, academic year, and college affiliation. The most surprising finding is that participants who reported greater phishing knowledge were more susceptible to phishing scams. The authors offer two speculations for this unexpected finding: first, users' awareness of phishing may have increased as a result of repeatedly falling for phishing scams; second, users who fell for the phish may have less knowledge about phishing than they claim. Other findings from this study agreed with those of earlier work: older students were better able to detect a phishing email, and engineering and IT majors had some of the lowest click rates, as shown in Figure 7, which indicates that some academic disciplines are more susceptible to phishing than others (Bailey et al., 2008).


FIGURE 7 . The number of clicks on phishing emails by students in the College of Arts, Humanities, and Social Sciences (AHSS), the College of Engineering and Information Technology (EIT), and the College of Natural and Mathematical Sciences (NMS) at the University of Maryland, Baltimore County (UMBC) ( Diaz et al., 2020 ).

Psychological studies have also illustrated that the user's ability to avoid phishing attacks is affected by different factors, such as browser security indicators and the user's awareness of phishing. The authors of (Dhamija et al., 2006) conducted an experimental study with 22 participants to test users' ability to recognize phishing websites. The study showed that 90% of these participants became victims of phishing websites and that 23% of them ignored security indicators such as the status and address bars. In 2015, another study was conducted for the same purpose, in which a number of fake web pages were shown to participants (Alsharnouby et al., 2015). The results showed that participants successfully detected only 53% of phishing websites. The authors also observed that the time spent looking at browser elements affected the ability to detect phishing. Lack of knowledge or awareness and carelessness are common causes of falling into a phishing trap. Most people have unknowingly opened a suspicious attachment or clicked a fake link that could lead to different levels of compromise. Therefore, focusing on training and preparing users to deal with such attacks is essential to minimizing the impact of phishing.

Given the above discussion, susceptibility to phishing varies according to factors such as age, gender, education level, and internet and PC addiction. Although each person has a trigger that phishers can exploit, even highly experienced people may fall prey to phishing, since attack sophistication can make an attack difficult to recognize. It is therefore inequitable to always blame the user for falling for these attacks; developers must also improve anti-phishing systems in ways that neutralize the attack before the user ever sees it. Understanding the susceptibility of individuals to phishing attacks will help in developing better prevention and detection techniques and solutions.

Proposed Phishing Anatomy

Phishing Process Overview

Generally, most phishing attacks start with an email (Jagatic et al., 2007). The phishing mail can be sent randomly to potential users or targeted at a specific group or individual. Many other vectors can also be used to initiate the attack, such as phone calls, instant messaging, or physical letters. The steps of the phishing process have been discussed by many researchers, because understanding them is important for developing anti-phishing solutions. The author of (Rouse, 2013) divides the phishing attack process into five phases: planning, setup, attack, collection, and cash. A study (Jakobsson and Myers, 2006) discusses the phishing process in detail and explains it as step-by-step phases: preparing the attack, sending a malicious program via the selected vector, obtaining the user's reaction to the attack, tricking the user into disclosing confidential information that is transmitted to the phisher, and finally obtaining the targeted money. The study (Abad, 2005) describes a phishing attack in three phases: an early phase, which includes initializing the attack, creating the phishing email, and sending it to the victim; a second phase, in which the victim receives the email and, if they respond, discloses their information; and a final phase, in which the defrauding succeeds. However, all phishing scams include three primary phases: the phisher requests sensitive valuables from the target, the target gives away these valuables to the phisher, and the phisher misuses them for malicious purposes. These phases can be further divided into sub-processes according to phishing trends. Thus, a new anatomy for phishing attacks is proposed in this article, which expands and integrates previous definitions to cover the full life cycle of a phishing attack.
The proposed new anatomy, which consists of four phases, is shown in Figure 8. It provides a reference structure for examining phishing attacks in more detail and for understanding potential countermeasures to prevent them. Each phase and its components are explained as follows:


FIGURE 8. The proposed anatomy of phishing, built upon the phishing definition proposed in this article, which was drawn from our understanding of a phishing attack.

Figure 8 depicts the proposed anatomy of the phishing attack process, its phases, and its components, drawn from the definition proposed in this article. The anatomy explains each phase in detail, including attacker and target types, examples of the information an attacker may collect about the victim, and examples of attack methods. As shown in the figure, it illustrates a set of vulnerabilities that the attacker can exploit and the mediums used to conduct the attack. Possible threats are also listed, as well as the data collection methods, with examples of target response types, the kinds of spoils the attacker may gain, and how stolen valuables can be used. This anatomy elaborates on phishing attacks in depth, which helps people better understand the complete phishing process (i.e., the end-to-end phishing life cycle) and boosts awareness among readers. It also provides insights into the potential solutions we should focus on. Instead of always placing the user in the dock as the sole reason for phishing success, developers should focus on solutions that mitigate the initiation of the attack by preventing the bait from reaching the user in the first place. For instance, to reach the target's system, the threat has to pass through many layers of technology and defenses by exploiting one or more vulnerabilities, such as web and software vulnerabilities.

Planning Phase

This is the first stage of the attack, in which the phisher decides on the targets and starts gathering information about them (individuals or companies). Phishers gather information about their victims to lure them based on psychological vulnerability. This information can be anything from names and e-mail addresses of individuals to the customers of a targeted company. Victims may be selected randomly, by mass mailings, or deliberately, by harvesting their information from social media or any other source. The target of phishing could be any user with a bank account and a computer connected to the Internet. Phishers target businesses such as financial services, retailers such as eBay and Amazon, and internet service providers such as MSN/Hotmail and Yahoo (Ollmann, 2004; Ramzan and Wuest, 2007). This phase also includes devising attack methods, such as building fake websites (sometimes phishers obtain a scam page that has already been designed or used), designing malware, and constructing phishing emails. Attackers can be categorized by their motivation; as mentioned in the studies (Vishwanath, 2005; Okin, 2009; EDUCBA, 2017; APWG, 2020), there are four types:

▪ Script kiddies: this term refers to attackers with no technical background or knowledge of writing sophisticated programs or developing phishing tools; instead, they use scripts developed by others in their phishing attacks. Although the term evokes youngsters using ready-made phishing kits and virus toolkits, it does not relate precisely to the phisher's actual age. Script kiddies can gain website administration privileges and commit a "web cracking" attack, or use hacking tools to compromise remote computers into a so-called "botnet" (a single compromised computer is called a "zombie computer"). These attackers are not limited to sitting back and enjoying phishing; they can cause serious damage, such as stealing information or uploading Trojans or viruses. In February 2000, an attack launched by the Canadian teenager Mike Calce resulted in $1.7 million (USD) of damage from Distributed Denial of Service (DDoS) attacks on CNN, eBay, Dell, Yahoo, and Amazon (Leyden, 2001).

▪ Serious crackers: also known as Black Hats. These attackers can execute sophisticated attacks and develop worms and Trojans of their own. They maliciously hijack accounts, steal credit card information, destroy important files, or sell compromised credentials for personal gain.

▪ Organized crime: this is the most organized and effective type of attacker, capable of inflicting significant damage on victims. These groups hire serious crackers to conduct phishing attacks. Moreover, they can thoroughly trash a victim's identity and commit devastating fraud, as they have the skills, tools, and manpower. An organized cybercrime group is a team of expert hackers who pool their skills to build complex attacks and launch phishing campaigns against individuals and organizations. These groups offer their work as 'crime as a service' and can be hired by terrorist groups, organizations, or individuals.

▪ Terrorists: because of our dependency on the internet for most activities, terrorist groups can easily conduct acts of terror remotely that can have a severe impact. Such attacks are dangerous since the attackers do not fear the consequences, for instance going to jail. Terrorists can use the internet to maximum effect to create fear and violence, as it requires limited funds, resources, and effort compared with, for example, buying bombs and weapons for a traditional attack. Terrorists often use spear phishing to launch attacks for purposes such as inflicting damage, cyber espionage, gathering information, locating individuals, and other vandalism. Cyber terrorists have used cyber espionage extensively to steal sensitive national security information, commercial information, and trade secrets that can be used for terrorist activities. These crimes may target governments, organizations, or individuals.

Attack Preparation

After deciding on the targets and gathering information about them, phishers start to set up the attack by scanning for vulnerabilities to exploit. The following are some examples of vulnerabilities exploited by phishers. An attacker might exploit a buffer overflow vulnerability to take control of target applications, create a DoS attack, or compromise computers. "Zero-day" vulnerabilities, newly discovered flaws in software programs or operating systems, can be exploited before they are fixed (Kayne, 2019). Browser vulnerabilities are another example: adding new features and updates to a browser can introduce new vulnerabilities into its software (Ollmann, 2004). In 2005, attackers exploited a cross-domain vulnerability in Internet Explorer (IE) (Symantic, 2019). The cross-domain policy is used in Microsoft IE to separate content from different sources; attackers exploited a flaw in it that enabled them to execute programs on a user's computer running IE. According to US-CERT, hackers actively exploited this vulnerability. To carry out a phishing attack, attackers also need a medium through which to reach their target. Therefore, apart from planning the attack around potential vulnerabilities, attackers choose the medium that will deliver the threat to the victim. These mediums include the internet (social networks, websites, emails, cloud computing, e-banking, mobile systems), VoIP (phone calls), and text messages. Cloud Computing (CC), for example, is one of the actively used mediums: it has become one of the most promising technologies and has widely replaced conventional computing, but despite its considerable advantages, its adoption faces several controversial obstacles, including privacy and security issues (CVEdetails, 2005).
Because different customers can share the same resources in the cloud, virtualization vulnerabilities may be exploited by a malicious customer to attack other customers' applications and data (Zissis and Lekkas, 2012). For example, in September 2014, private photos of several celebrities spread across the internet in one of the most notorious data breaches; the investigation revealed that the celebrities' iCloud accounts had been breached (Lehman and Vajpayee, 2011). According to Proofpoint, in 2017 attackers used Microsoft SharePoint in hundreds of campaigns to distribute malware through messages.

Attack Conducting Phase

This phase involves using attack techniques to deliver the threat to the victim, as well as the victim's interaction with the attack in terms of responding or not. After the victim responds, the attacker may compromise the system to collect the user's information, using techniques such as injecting client-side scripts into webpages (Johnson, 2016). Phishers can also compromise hosts without any technical knowledge by purchasing access from hackers (Abad, 2005). A threat is a possible danger that might exploit a vulnerability to compromise people's security and privacy or harm a computer system for malicious purposes. Threats include malware, botnets, eavesdropping, unsolicited emails, and viral links. Several phishing techniques are discussed in the sub-section Types and Techniques of Phishing Attacks.

Valuables Acquisition Phase

In this stage, the phisher collects information or valuables from victims and uses them illegally for purchases, for transferring funds without the user's knowledge, or for sale on the black market. Attackers target a wide range of valuables, from money to people's lives; attacks on online medical systems, for example, may lead to loss of life. Victims' data can be collected by phishers manually or through automated techniques (Jakobsson et al., 2007).

Data collection can be conducted either during or after the victim's interaction with the attacker. To collect data manually, simple techniques are used in which victims interact directly with the phisher, relying on relationships within social networks or other human deception techniques (Ollmann, 2004). In automated data collection, several techniques can be used, such as the fake web forms employed in web spoofing (Dhamija et al., 2006). Additionally, a victim's public data, such as a social network profile, can be used to collect the background information required to initiate a social engineering attack (Wenyin et al., 2005). In VoIP or phone attacks, techniques such as pre-recorded messages are used to harvest users' data (Huber et al., 2009).

Types and Techniques of Phishing Attacks

Phishers conduct their attacks either through psychological manipulation of individuals into disclosing personal information (i.e., deceptive attacks, a form of social engineering) or through technical methods. Phishers usually prefer deceptive attacks that exploit human psychology rather than technical methods. Figure 9 illustrates the types of phishing and the techniques phishers use to conduct an attack; each type and technique is explained in the subsequent sections and subsections.


FIGURE 9 . Phishing attack types and techniques drawing upon existing phishing attacks.

Deceptive Phishing

Deceptive phishing is the most common type of phishing attack, in which the attacker uses social engineering techniques to deceive victims. The phisher uses either social engineering tricks, such as made-up scenarios (i.e., a false account update or security upgrade), or technical methods (i.e., using legitimate trademarks, images, and logos) to lure the victim and convince them of the forged email's legitimacy (Jakobsson and Myers, 2006). A user who believes these scenarios falls prey and follows the given link, which leads to the disclosure of their personal information to the phisher.

Deceptive phishing is performed through phishing emails, fake websites, phone phishing (scam calls and IM), social media, and many other mediums. The most common types are discussed below.

Phishing e-Mail

Deceiving people via email remains the most popular type of phishing to date. A phishing email, or spoofed email, is a forged email sent from an untrusted source to thousands of victims at random. These fake emails claim to be from a person or financial institution that the recipient trusts, in order to convince recipients to take actions that lead them to disclose sensitive information. A more organized phishing email that targets a particular group or individuals within the same organization is called spear phishing. Here the attacker may gather information related to the victim, such as name and address, so that the email appears credible and from a trusted source (Wang et al., 2008); this corresponds to the planning phase of the phishing anatomy proposed in this article. A more sophisticated form of spear phishing is whaling, which targets high-ranking people such as CEOs and CFOs. An example of a spear-phishing victim in early 2016 is the Clinton campaign chairman John Podesta, whose Gmail account was hacked via a phishing email (Parmar, 2012). Clone phishing is another type of email phishing, in which the attacker clones a legitimate and previously delivered email, spoofing the email address and reusing information from the legitimate email, such as recipient addresses, while replacing its links or attachments with malicious ones (Krawchenko, 2016). The basic scenario for this attack was illustrated previously in Figure 4 and can be described in the following steps.

1. The phisher sets up a fraudulent email containing a link or an attachment (planning phase).

2. The phisher executes the attack by sending a phishing email to the potential victim using an appropriate medium (attack conducting phase).

3. The link (if clicked) directs the user to a fraudulent website, or to download malware in case of clicking the attachment (interaction phase).

4. The malicious website prompts users to provide confidential information or credentials, which are then collected by the attacker and used for fraudulent activities. (Valuables acquisition phase).

Often, the phisher does not use the credentials directly; instead, the obtained credentials or information are resold on a secondary market (Jakobsson and Myers, 2006); for instance, script kiddies might sell the credentials on the dark web.
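One defender-side check for the e-mail stage sketched above is to look for links whose visible text names one domain while the underlying href points to another, a hallmark of phishing emails. The sketch below is an illustrative heuristic, not a production filter; the message content and domain names are fabricated for the example, and the regex only handles the simple case of a URL shown as link text.

```python
import re
from email import message_from_string

# Fabricated sample message: the link text shows paypal.com,
# but the href actually points at an attacker-controlled host.
RAW = """From: "PayPal Support" <support@paypa1-secure.example>
To: victim@example.com
Subject: Account update required
Content-Type: text/html

<html><body>
<a href="http://paypa1-secure.example/login">https://www.paypal.com/login</a>
</body></html>
"""

# Matches <a href="http(s)://REAL...">http(s)://SHOWN... and captures both hosts.
LINK = re.compile(r'<a\s+href="https?://([^/"]+)[^"]*"\s*>\s*https?://([^/<\s]+)', re.I)

def mismatched_links(raw: str):
    """Return (real_host, displayed_host) pairs that disagree."""
    body = message_from_string(raw).get_payload()
    return [(real, shown) for real, shown in LINK.findall(body) if real != shown]

print(mismatched_links(RAW))
# → [('paypa1-secure.example', 'www.paypal.com')]
```

Real mail filters parse the HTML properly and also compare hosts against the From domain; the point here is only that the link-text/href mismatch is mechanically detectable.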

Spoofed Website

These are also called phishing websites: the phisher forges a website that appears genuine and looks similar to the legitimate site. An unsuspecting user is redirected to it after clicking a link embedded in an email, through an advertisement (clickjacking), or in some other way. If the user continues to interact with the spoofed website, their sensitive information is disclosed and harvested by the phisher (CSIOnsite, 2012).
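One observable symptom of a spoofed page is a login form whose action submits credentials to a different domain than the page itself. The sketch below is a simplified heuristic; the sample HTML and host names are fabricated for the example. It uses the standard html.parser to extract form actions and compares their host against the page's own host:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class FormActionCollector(HTMLParser):
    """Collect the action attribute of every <form> on a page."""
    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.actions.append(dict(attrs).get("action", ""))

def foreign_form_targets(page_url: str, html: str):
    """Return form-action hosts that differ from the page's own host."""
    page_host = urlparse(page_url).netloc
    parser = FormActionCollector()
    parser.feed(html)
    hosts = {urlparse(a).netloc for a in parser.actions}
    return sorted(h for h in hosts if h and h != page_host)

SAMPLE = '<form action="http://collector.evil.example/steal"><input name="pw"></form>'
print(foreign_form_targets("https://www.bank.example/login", SAMPLE))
# → ['collector.evil.example']
```

Relative form actions (same-host submission) produce an empty host and are ignored, so only cross-domain credential submission is flagged.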

Phone Phishing (Vishing and SMishing)

This type of phishing is conducted through phone calls or text messages, in which the attacker pretends to be someone the victim knows or some other trusted source. A user may receive a convincing security alert message from a bank urging the victim to call a given phone number, with the aim of getting the victim to share passwords, PINs, or other Personally Identifiable Information (PII). The victim may also be duped into clicking an embedded link in a text message. The phisher can then take the credentials entered by the victim and use them to log in to the victim's instant messaging service and phish other people on the victim's contact list. A phisher can also use Caller IDentification (CID) spoofing to convince the victim that the call comes from a trusted source, or leverage open-source, software-based internet protocol private branch exchange (IP PBX) tools that support VoIP (Aburrous et al., 2008). A report from FraudWatch International on phishing attack trends for 2019 anticipated an increase in SMishing in which the text message content is viewable only on a mobile device (FraudWatchInternational, 2019).

Social Media Attack (Soshing, Social Media Phishing)

Social media has become cybercriminals' favorite new medium for conducting phishing attacks. Social media threats include account hijacking, impersonation attacks, scams, and malware distribution. However, detecting and mitigating these threats takes longer than with traditional methods because social media exists outside the network perimeter. For example, nation-state threat actors conducted an extensive series of social media attacks on Microsoft in 2014: multiple Twitter accounts were affected, and the passwords and emails of dozens of Microsoft employees were revealed (Ramzan, 2010). According to Kaspersky Lab, there were more than 3.7 million attempts to visit fraudulent social network pages in the first quarter of 2018, of which 60% were fake Facebook pages (Raggo, 2016).

A report from the predictive email defense company Vade Secure on phishers' favorites for the first and second quarters of 2019 stated that Soshing, primarily on Facebook and Instagram, saw a 74.7% increase, the highest quarter-over-quarter growth of any industry (VadeSecure, 2021).

Technical Subterfuge

Technical subterfuge is the act of tricking individuals into disclosing their sensitive information by downloading malicious code onto their systems. It can be classified into the following types:

Malware-Based Phishing

As the name suggests, this type of phishing attack is conducted by running malicious software on a user's machine. The malware is downloaded to the victim's machine either through social engineering tricks or by exploiting vulnerabilities in the security system (e.g., browser vulnerabilities) (Jakobsson and Myers, 2006). Panda malware, discovered by the company Fox-IT in 2016, is one successful example. It targets Windows Operating Systems (OS) and spreads through phishing campaigns; its main attack vectors include web injects, screenshots of user activity (up to 100 per mouse click), keyboard logging, clipboard captures (to grab passwords pasted into form fields), and exploits against the Virtual Network Computing (VNC) desktop sharing system. In 2018, Panda malware expanded its targets to include cryptocurrency exchanges and social media sites (F5Networks, 2018). Malware-based phishing attacks take many forms; some are discussed below:

Key Loggers and Screen Loggers

Loggers are a type of malware used by phishers, installed either through Trojan horse email attachments or through direct download to the user's personal computer. This software monitors data, records user keystrokes, and sends them to the phisher. Phishers use key loggers to capture sensitive information about victims, such as names, addresses, passwords, and other confidential data. Key loggers can also serve non-phishing purposes, such as monitoring a child's use of the internet. They can be implemented in many other ways as well: as a Browser Helper Object (BHO) that detects URL changes and logs information, enabling the attacker to control IE's features; as a device driver that monitors keyboard and mouse input; or as a screen logger that monitors user input and the display (Jakobsson and Myers, 2006).

Viruses and Worms

A virus is a type of malware: a piece of code that spreads within another application or program by making copies of itself in a self-automated manner (Jakobsson and Myers, 2006; F5Networks, 2018). Worms are similar to viruses but differ in how they execute: worms run by exploiting operating system vulnerabilities without needing to modify another program. Viruses transfer from one computer to another with the documents to which they are attached, while worms transfer through infected host files. Both viruses and worms can damage data and software or cause Denial-of-Service (DoS) conditions (F5Networks, 2018).

Spyware is malicious code designed to track the websites users visit in order to steal sensitive information and conduct phishing attacks. Spyware can be delivered through an email and, once installed on the computer, takes control of the device, either changing its settings or gathering information such as passwords, credit card numbers, or banking records that can be used for identity theft (Jakobsson and Myers, 2006).

Adware, also known as advertising-supported software (Jakobsson and Myers, 2006), is a type of malware that shows the user endless pop-up windows with ads, which can harm the device's performance. Adware can be annoying, but most of it is safe; some, however, can be used for malicious purposes such as tracking the internet sites the user visits or even recording the user's keystrokes (cisco, 2018).

Ransomware is a type of malware that encrypts the user's data after they run an executable program on the device. In this type of attack, the decryption key is withheld until the user pays a ransom (cisco, 2018). Ransomware is responsible for tens of millions of dollars in extortion annually. Worse still, new variants are hard to detect as they develop, helping them evade many antivirus and intrusion detection systems (Latto, 2020). Ransomware is usually delivered to the victim's device through phishing emails; according to one report (PhishMe, 2016), 93% of all phishing emails contained encryption ransomware. Phishing, as a social engineering attack, convinces victims to execute such actions without knowing about the malicious program.

A rootkit is a collection of programs, typically malicious, that enables access to a computer or computer network. Intruders use these toolsets to hide their actions from system administrators by modifying the code of system calls and changing their functionality (Belcic, 2020). The term "rootkit" has negative connotations through its association with malware, and attackers use rootkits to alter existing system tools to escape detection. Such kits enable individuals with little or no technical knowledge to launch phishing exploits; they may contain code, mass emailing software (possibly with thousands of email addresses included), web development software, and graphic design tools. One example is the kernel-level rootkit, created by replacing portions of the core operating system or adding new code via Loadable Kernel Modules (in Linux) or device drivers (in Windows) (Jakobsson and Myers, 2006).

Session Hijackers

In this type, the attacker monitors the user's activities by embedding malicious software within a browser component or via network sniffing. The aim of the monitoring is to hijack the session, so that the attacker can perform unauthorized actions with the hijacked session, such as financial transfers, without the user's permission (Jakobsson and Myers, 2006).

Web Trojans

Web Trojans are malicious programs that collect users' credentials by popping up invisibly over the login screen (Jakobsson and Myers, 2006). When the user enters their credentials, these programs capture and transmit them directly to the attacker (Jakobsson et al., 2007).

Hosts File Poisoning

This is a way of tricking a user into visiting the phisher's site by poisoning (changing) the hosts file. When the user types a website address in the URL bar, the address is translated into a numeric (IP) address before the site is visited. To take the user to a fake website for phishing purposes, the attacker modifies this file. This type of phishing is hard to detect, even by smart and perceptive users (Ollmann, 2004).
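A quick defensive check for this kind of tampering is to scan the local hosts file for entries that pin well-known domains, since legitimate entries rarely override such names. The sketch below is illustrative only: the sample file content, the IP addresses, and the watch-list of sensitive domains are assumptions for the example.

```python
# The hosts file lives at /etc/hosts on Unix and
# C:\Windows\System32\drivers\etc\hosts on Windows.

# Illustrative watch-list of domains that should never be pinned locally.
WATCHLIST = {"paypal.com", "www.paypal.com", "bank.example"}

def poisoned_entries(hosts_text: str):
    """Return (ip, hostname) pairs that pin a watched domain."""
    hits = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()              # hosts syntax: IP name [name...]
        hits.extend((ip, n) for n in names if n.lower() in WATCHLIST)
    return hits

SAMPLE = """127.0.0.1 localhost
# injected by malware:
203.0.113.9 www.paypal.com paypal.com
"""
print(poisoned_entries(SAMPLE))
# → [('203.0.113.9', 'www.paypal.com'), ('203.0.113.9', 'paypal.com')]
```

In practice, the file would be read from its OS-specific path and the findings reported to the user rather than printed.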

System Reconfiguration Attack

In this type of phishing attack, the phisher manipulates the settings of a user's computer for malicious purposes, compromising the information on that PC. System reconfiguration can be achieved by different methods, such as reconfiguring the operating system or modifying the user's Domain Name System (DNS) server address. The wireless evil twin is an example of a system reconfiguration attack, in which all the user's traffic is monitored via a malicious wireless Access Point (AP) (Jakobsson and Myers, 2006).

Data theft is the unauthorized accessing and stealing of confidential information from a business or individual. It can be performed through a phishing email that leads to the download of malicious code onto the user's computer, which in turn steals confidential information stored there (Jakobsson and Myers, 2006). Stolen information such as passwords, social security numbers, credit card details, sensitive emails, and other personal data can be used directly by the phisher or sold on for other purposes.

Domain Name System Based Phishing (Pharming)

Any form of phishing that interferes with the domain name system, redirecting the user to a malicious website by polluting the user's DNS cache with false information, is called DNS-based phishing. Although the hosts file is not part of the DNS, hosts file poisoning is considered another form of DNS-based phishing. Alternatively, by compromising the DNS server itself, attackers can modify genuine IP address records, unwittingly taking the user to a fake location. A user can fall prey to pharming even when clicking on a legitimate link, because the website's domain name system (DNS) record may have been hijacked by cybercriminals (Jakobsson and Myers, 2006).
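Because pharming subverts name resolution itself, one mitigation is to cross-check the address a resolver returns against an independently pinned set for high-value domains. The sketch below is a simplified illustration: the pinned addresses and domain names are fabricated, and real deployments would distribute pins out of band or rely on DNSSEC validation instead. The pure comparison is kept separate from the live lookup so it can be exercised offline.

```python
import socket

# Fabricated pinned addresses for the example only.
PINNED = {"bank.example": {"192.0.2.10", "192.0.2.11"}}

def looks_pharmed(domain: str, resolved_ip: str) -> bool:
    """True if a pinned domain resolved to an unexpected address."""
    expected = PINNED.get(domain)
    return expected is not None and resolved_ip not in expected

def check_live(domain: str) -> bool:
    """Resolve via the system resolver, then apply the pinned check."""
    return looks_pharmed(domain, socket.gethostbyname(domain))

print(looks_pharmed("bank.example", "203.0.113.9"))  # poisoned answer → True
print(looks_pharmed("bank.example", "192.0.2.10"))   # expected address → False
```

IP pinning is brittle for sites behind CDNs whose addresses rotate, which is why DNSSEC, which authenticates the DNS answers themselves, is the more principled defense against pharming.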

Content Injection Phishing

Content-Injection Phishing refers to inserting false content into a legitimate site. This malicious content could misdirect the user to fake websites, leading them to disclose their sensitive information to the hacker, or it can lead to the download of malware onto the user's device ( Jakobsson and Myers, 2006 ). The malicious content can be injected into a legitimate site in three primary ways:

1. Hacker exploits a security vulnerability and compromises a web server.

2. Hacker exploits a Cross-Site Scripting (XSS) vulnerability, a programming flaw that enables attackers to insert client-side scripts into web pages viewed by visitors to the targeted site.

3. Hacker exploits a Structured Query Language (SQL) injection vulnerability, which allows hackers to steal information from the website’s database by executing database commands on a remote server.
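The standard mitigations for the last two flaws can be sketched briefly: escape user-controlled text before embedding it in HTML (against XSS), and bind values with query placeholders instead of string concatenation (against SQL injection). The table and values below are illustrative:

```python
import html
import sqlite3

# XSS defense: escape user input before it is embedded in a page.
def render_comment(user_text):
    return "<p>" + html.escape(user_text) + "</p>"

print(render_comment("<script>steal()</script>"))
# <p>&lt;script&gt;steal()&lt;/script&gt;</p>

# SQL-injection defense: a parameterized query binds the value as data,
# so an injected OR clause is never interpreted as SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

malicious = "x' OR '1'='1"
rows = conn.execute("SELECT secret FROM users WHERE name = ?",
                    (malicious,)).fetchall()
print(rows)  # [] -- the injection attempt leaks no rows
```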

Man-In-The-Middle Phishing

The Man-In-The-Middle (MITM) attack is a form of phishing in which the phisher inserts themselves into the communication between two parties (i.e., the user and the legitimate website) and tries to obtain information from both by intercepting the victim’s communications ( Ollmann, 2004 ): messages intended for the legitimate recipients go to the attacker instead, who records the information and misuses it later. The MITM attack is conducted by redirecting the user to a malicious server through techniques such as Address Resolution Protocol (ARP) poisoning, DNS spoofing, Trojan keyloggers, and URL obfuscation ( Jakobsson and Myers, 2006 ).

Search Engine Phishing

In this phishing technique, the phisher creates malicious websites with attractive offers and uses Search Engine Optimization (SEO) tactics to get them indexed legitimately, so that they appear when users search for products or services. This is also known as black-hat SEO ( Jakobsson and Myers, 2006 ).

URL and HTML Obfuscation Attacks

In most phishing attacks, phishers aim to convince the user to click on a given link that connects the victim to a malicious phishing server instead of the intended destination. This is the most popular technique used by today's phishers. The attack is performed by obfuscating the real link (URL) that the user intends to follow, in an attempt to make the web address look like the legitimate one. Bad domain names and host-name obfuscation are common methods attackers use to fake an address ( Ollmann, 2004 ).
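Two of these obfuscation tricks are easy to demonstrate with Python's standard URL parser: anything before “@” in the authority component is treated as userinfo, so the real host is what follows it, and a raw IP address in place of a domain name is another common red flag. The URLs below are made-up examples:

```python
import re
from urllib.parse import urlsplit

# The "@" trick: the apparent bank domain is only userinfo; the browser
# actually connects to whatever follows the "@".
url = "http://www.bank.example@203.0.113.9/login"
host = urlsplit(url).hostname
print(host)  # 203.0.113.9 -- not www.bank.example

def looks_obfuscated(url):
    """Heuristic check for two classic URL-obfuscation patterns."""
    parts = urlsplit(url)
    has_userinfo = "@" in parts.netloc
    is_raw_ip = re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", parts.hostname or "") is not None
    return has_userinfo or is_raw_ip

print(looks_obfuscated("https://www.bank.example/login"))  # False
print(looks_obfuscated(url))                               # True
```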

Countermeasures

A range of solutions has been discussed and proposed by researchers to overcome the problem of phishing, but there is still no single solution that can be trusted to mitigate these attacks fully ( Hong, 2012 ; Boddy, 2018 ; Chanti and Chithralekha, 2020 ). The proposed countermeasures in the literature can be categorized into three major defense strategies. The first line of defense is human-based solutions: educating end-users to recognize phishing and avoid taking the bait. The second line of defense is technical solutions, which involve preventing the attack at an early stage, such as at the vulnerability level, to stop the threat from materializing at the user's device (thereby decreasing human exposure), and detecting the attack once it is launched, at the network level or on the end-user device. This also includes applying specific techniques to track down the source of the attack (for example, identifying newly registered domains that closely match well-known domain names). The third line of defense is the use of law enforcement as a deterrent control. These approaches can be combined to create much stronger anti-phishing solutions, and they are discussed in detail below.

Human Education (Improving User Awareness About Phishing)

Human education is by far the most effective countermeasure for avoiding and preventing phishing attacks. Awareness and training are the first line of defense in the proposed methodology for fighting phishing, even though they do not guarantee complete protection ( Hong, 2012 ). End-user education reduces users' susceptibility to phishing attacks and complements other technical solutions. According to the analysis carried out in ( Bailey et al., 2008 ), 95% of phishing attacks are caused by human error; nonetheless, existing phishing-detection training is not enough for combating current sophisticated attacks. In the study presented by Khonji et al. (2013) , security experts disagreed about the effectiveness and usability of user education. Furthermore, some security experts claim that user education is not effective, as security is not users' main goal and users lack the motivation to educate themselves about phishing ( Scaife et al., 2016 ), while others confirm that user education could be effective if designed properly ( Evers, 2006 ; Whitman and Mattord, 2012 ). Moreover, user training has been mentioned by many researchers as an effective way to protect users of online services ( Dodge et al., 2007 ; Salem et al., 2010 ; Chanti and Chithralekha, 2020 ). To detect and avoid phishing emails, a combined training approach was proposed in the study ( Salem et al., 2010 ). The proposed solution uses a combination of tools and human learning: a security awareness program is introduced to the user as a first step; the second step is an intelligent system that detects attacks at the email level; after that, the emails are classified by a fuzzy-logic-based expert system. The main criticism of this method is that the study chose only a limited set of email characteristics as distinguishing features ( Kumaraguru et al., 2010 ; CybintCyberSolutions, 2018 ).
Moreover, the majority of phishing training programs focus on how to recognize and avoid phishing emails and websites, while other threatening phishing types, such as voice phishing and malware or adware phishing, receive less attention. The authors in ( Salem et al., 2010 ) found that the most widely used educational solutions are not useful if users ignore notifications/warnings about fake websites. Training users should follow three major directions. The first is awareness training through seminars or online courses, for employees within organizations as well as individuals. The second is using mock phishing attacks to test users’ vulnerability and allow them to assess their own knowledge about phishing; however, only 38% of global organizations claim they are prepared to handle a sophisticated cyber-attack ( Kumaraguru et al., 2010 ). Wombat Security’s State of the Phish™ Report 2018 showed that approximately two-fifths of American companies use computer-based online awareness training and simulated phishing attacks as educational tools on a monthly basis, while just 15% of United Kingdom firms do so ( CybintCyberSolutions, 2018 ). The third direction is educating people through games designed to teach about phishing. The game developer should consider different aspects before designing the game, such as audience age and gender, because people's susceptibility to phishing varies. The authors in the study ( Sheng et al., 2007 ) developed a game called Anti-Phishing Phil that trains users to identify phishing web pages, and then tested the efficiency and effectiveness of the game. The results showed that participants improved their ability to identify phishing by 61%, indicating that interactive games might be an enjoyable way of educating people.
Although user education and training can be very effective in mitigating security threats, phishing is becoming more complex, and cybercriminals can fool even security experts by crafting convincing spear-phishing emails via social media. Therefore, individual users and employees must have at least basic knowledge of how to deal with suspicious emails and report them to IT staff and the appropriate authorities. In addition, phishers change their strategies continuously, which makes it harder for organizations, especially small and medium enterprises, to afford the cost of employee education. With millions of people logging on to their social media accounts every day, social media phishing is phishers' favorite medium for deceiving their victims. For example, phishers are taking advantage of the pervasiveness of Facebook to set up creative phishing attacks utilizing the Facebook Login feature, which enables the phisher to compromise all the user's accounts that share the same credentials (VadeSecure). Social networks have taken some countermeasures to reduce suspicious activity, such as the two-factor authentication for logging in required by Facebook, and the machine-learning techniques used by Snapchat to detect and prevent suspicious links sent within the app ( Corrata, 2018 ). Countermeasures to control soshing and phone phishing attacks might include:

• Install anti-virus, anti-spam software as a first action and keep it up to date to detect and prevent any unauthorized access.

• Educate yourself about recent information on phishing, the latest trends, and countermeasures.

• Never click on hyperlinks attached to a suspicious email, post, tweet, direct message.

• Never trust social media, do not give any sensitive information over the phone or non-trusted account. Do not accept friend requests from people you do not know.

• Use a unique password for each account.

Training and educating users is an effective anti-phishing countermeasure that has already shown promising initial results. The main downside of this solution is its high cost ( Dodge et al., 2007 ). Moreover, it requires basic computer-security knowledge among trained users.

Technical Solutions

The proposed technical solutions for detecting and blocking phishing attacks can be divided into two major approaches: non-content-based solutions and content-based solutions ( Le et al., 2006 ; Bin et al., 2010 ; Boddy, 2018 ); both are briefly described in this section. Non-content-based methods include blacklists and whitelists, which classify fake emails or webpages based on information that is not part of the email or the webpage itself, such as URL and domain-name features ( Dodge et al., 2007 ; Ma et al., 2009 ; Bin et al., 2010 ; Salem et al., 2010 ). In blacklist and whitelist approaches, a list of known URLs and sites is maintained, and the website under scrutiny is checked against this list to be classified as phishing or legitimate. The downside of this approach is that it will not identify all phishing websites, because once a phishing site is taken down, the phisher can easily register a new domain ( Miyamoto et al., 2009 ). Content-based methods classify the page or the email based on information within its content, such as text, images, and also HTML, JavaScript, and Cascading Style Sheets (CSS) code ( Zhang et al., 2007 ; Maurer and Herzner, 2012 ). Content-based solutions involve Machine Learning (ML), heuristics, visual similarity, and image-processing methods ( Miyamoto et al., 2009 ; Chanti and Chithralekha, 2020 ). Finally, multifaceted methods apply a combination of the previous approaches to detect and prevent phishing attacks ( Afroz and Greenstadt, 2009 ). For email filtering, ML techniques are commonly used; for example, in 2007 the first email phishing filter was developed by the authors in ( Fette et al., 2007 ). This technique uses a set of features such as URLs that use different domain names. Spam-filtering techniques ( Cormack et al., 2011 ) and statistical classifiers ( Bergholz et al., 2010 ) are also used to identify phishing emails.
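A minimal sketch of the two families of checks, assuming illustrative lists and features rather than any deployed system: a non-content list lookup on the URL, and a small URL feature vector of the kind fed to an ML classifier:

```python
from urllib.parse import urlsplit

# Non-content check: look the host up in maintained lists.
# These lists are made-up examples, not real threat intelligence.
BLACKLIST = {"evil.example"}
WHITELIST = {"www.bank.example"}

def list_verdict(url):
    host = urlsplit(url).hostname or ""
    if host in WHITELIST:
        return "legitimate"
    if host in BLACKLIST:
        return "phishing"
    return "unknown"        # would fall through to content-based analysis

# Content/URL features of the kind used to train a classifier.
def url_features(url):
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "has_at": "@" in parts.netloc,     # userinfo trick
        "num_dots": host.count("."),       # long subdomain chains
        "has_hyphen": "-" in host,         # prefix/suffix imitation
        "uses_https": parts.scheme == "https",
    }

print(list_verdict("http://evil.example/login"))             # phishing
print(url_features("http://paypal-secure.evil.example/a.b"))
# {'has_at': False, 'num_dots': 2, 'has_hyphen': True, 'uses_https': False}
```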
Authentication and verification technologies are also used in spam email filtering as an alternative to heuristics methods. For example, the Sender Policy Framework (SPF) verifies whether a sender is valid when accepting mail from a remote mail server or email client ( Deshmukh and raddha Popat, 2017 ).
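As a simplified illustration of the SPF idea (not a full implementation of the SPF specification), the receiving server fetches the sender domain's SPF record and tests whether the connecting IP matches an authorized range; the record and addresses below are made up:

```python
import ipaddress

# Toy SPF evaluator handling only "ip4" mechanisms. A real verifier would
# perform the DNS TXT lookup and honor all mechanisms and qualifiers.

def spf_allows(record, sender_ip):
    """True if the connecting IP matches an ip4 mechanism in the record."""
    ip = ipaddress.ip_address(sender_ip)
    for mech in record.split():
        if mech.startswith("ip4:") and ip in ipaddress.ip_network(mech[4:]):
            return True
    return False  # would fall through to the record's "all" qualifier

record = "v=spf1 ip4:198.51.100.0/24 -all"
print(spf_allows(record, "198.51.100.25"))  # True  -- authorized server
print(spf_allows(record, "203.0.113.9"))    # False -- likely spoofed sender
```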

Technical anti-phishing solutions are available at different levels of the delivery chain, such as mail servers and clients, Internet Service Providers (ISPs), and web-browser tools. Drawing from the anatomy proposed in Proposed Phishing Anatomy , the authors categorize technical solutions into the following approaches:

1. Techniques to detect the attack after it has been launched, such as scanning the web to find fake websites. For example, content-based phishing detection approaches are heavily deployed on the Internet. Features from website elements such as images, URLs, and text content are analyzed using rule-based approaches and machine learning, which examine the presence of special characters (@), IP addresses instead of domain names, prefixes/suffixes, HTTPS in the domain part, and other features ( Jeeva and Rajsingh, 2016 ). Fuzzy Logic (FL) has also been used as an anti-phishing model to help classify websites as legitimate or ‘phishy,’ as this model deals with intervals rather than specific numeric values ( Aburrous et al., 2008 ).

2. Techniques to prevent the attack from reaching the user's system. Phishing prevention is an important step in defending against phishing by blocking the user from seeing and dealing with the attack. In email phishing, anti-spam software tools can block suspicious emails. Phishers usually send genuine-looking emails that dupe the user into opening an attachment or clicking on a link. Some of these emails pass spam filters because phishers use misspelled words. Therefore, techniques that detect fake emails by checking spelling and grammar are increasingly used to prevent such emails from reaching the user's mailbox. The authors in the study ( Fette et al., 2007 ) developed a new classification algorithm based on the Random Forest algorithm after exploring email phishing using the C4.5 decision-tree generator algorithm. The developed method, called "Phishing Identification by Learning on Features of Email Received" (PILFER), can classify phishing emails based on various features such as IP-based URLs, the number of links in the HTML part(s) of an email, the number of domains, the number of dots, non-matching URLs, and the presence of JavaScript. The method showed high accuracy in detecting phishing emails ( Afroz and Greenstadt, 2009 ).

3. Corrective techniques that take down the compromised website by requesting the website's Internet Service Provider (ISP) to shut down the fake site in order to prevent more users from falling victim to phishing ( Moore and Clayton, 2007 ; Chanti and Chithralekha, 2020 ). ISPs are responsible for taking down fake websites. Removing compromised and illegal websites is a complex process involving many entities: private companies, self-regulatory bodies, government agencies, volunteer organizations, law enforcement, and service providers. Usually, illegal websites are taken down by takedown orders, which are issued by courts or, in some jurisdictions, by law enforcement. Alternatively, they can be voluntarily taken down by the providers themselves as a result of issued takedown notices ( Moore and Clayton, 2007 ; Hutchings et al., 2016 ). According to a PHISHLABS report ( PhishLabs, 2019 ), taking down phishing sites is helpful but not completely effective, as these sites can still live for days, stealing customers' credentials, before the attack is detected.

4. Warning tools or security indicators embedded into the web browser to inform the user after detecting the attack. For example, eBay Toolbar and Account Guard ( eBay Toolbar and Account Guard, 2009 ) protect customers’ eBay and PayPal passwords, respectively, by alerting users about the authenticity of the sites into which they try to type their password. Numerous anti-phishing solutions rely mainly on warnings displayed on the security toolbar. In addition, some toolbars, such as McAfee and Netscape, block suspicious sites and warn about them. A study presented in ( Robichaux and Ganger, 2006 ) evaluated the performance of eight anti-phishing solutions, including Microsoft Internet Explorer 7, EarthLink, eBay, McAfee, GeoTrust, Google using Firefox, Netscape, and Netcraft. These are warning and blocking tools that allow legitimate sites while blocking and warning about known phishing sites. The study found that Internet Explorer and the Netcraft Toolbar produced the most effective results among the tested anti-phishing tools. However, security toolbars still fail to prevent people from falling victim to phishing, despite improving internet security in general ( Abu-Nimeh and Nair, 2008 ).

5. Authentication ( Moore and Clayton, 2007 ) and authorization ( Hutchings et al., 2016 ) techniques that protect against phishing by verifying the identity of the legitimate person. This prevents phishers from accessing a protected resource and conducting their attack. There are three types of authentication: single-factor authentication requires only a username and password; two-factor authentication requires additional information, such as a One-Time Password (OTP) sent to the user’s email address or phone; and multi-factor authentication uses more than one form of identity (i.e., a combination of something you know, something you are, and something you have). Widely used methods in the authorization process include API authorization and OAuth 2.0, which allow a previously authorized application to access the system.
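The OTP mentioned under two-factor authentication is typically an HMAC-based one-time password. The sketch below implements the HOTP construction from RFC 4226; TOTP, used by authenticator apps, is the same construction with a time-based counter:

```python
import hashlib
import hmac
import struct

# HOTP (RFC 4226): HMAC-SHA1 over a big-endian counter, then dynamic
# truncation to a short decimal code that both parties can compute.

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                                  # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 4226 test vector: secret "12345678901234567890", counter 0 -> 755224
print(hotp(b"12345678901234567890", 0))  # 755224
```

Because the server and the user's device share the secret and counter, each code is valid only once, which blunts simple credential replay by a phisher.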

However, the progressive increase in phishing attacks shows that the previous methods do not provide the required protection against most existing phishing attacks, because no single solution or technology can prevent all of them. An effective anti-phishing solution should be based on a combination of technical solutions and increased user awareness ( Boddy, 2018 ).

Solutions Provided by Legislations as a Deterrent Control

A cyber-attack is considered a crime when an individual intentionally accesses personal information on a computer without permission, even if the individual does not steal information or damage the system ( Mince-Didier, 2020 ). Since the sole objective of almost all phishing attacks is to obtain sensitive information with the knowing intent to commit identity theft, and since there are currently no federal laws in the United States aimed specifically at phishing, phishing crimes are usually covered under identity theft laws. Phishing is considered a crime even if the victim does not actually fall for the scam; punishments depend on circumstances and usually include jail, fines, restitution, and probation ( Nathan, 2020 ). Phishing attacks cause different levels of damage to their victims, such as financial and reputational losses. Therefore, law enforcement authorities should track down these attacks in order to punish the criminals, as with real-world crimes. As a complement to technical solutions and human education, the support provided by applicable laws and regulations can play a vital role as a deterrent control. Authorities around the world have increasingly created regulations to mitigate the growth of phishing attacks and their impact. The first anti-phishing laws were enacted by the United States, where the FTC added phishing attacks to the computer crime list in January 2004. A year later, the “Anti-Phishing Act” was introduced in the US Congress in March 2005 ( Mohammad et al., 2014 ). Meanwhile, in the United Kingdom, legislation is gradually being adapted to address phishing and other forms of cyber-crime. In 2006, the United Kingdom government amended the Computer Misuse Act 1990, intending to bring it up to date with developments in computer crime and increasing penalties for breaches to up to 10 years’ imprisonment ( eBay Toolbar and Account Guard, 2009 ; PhishLabs, 2019 ).
In this regard, a student in the United Kingdom who made hundreds of thousands of pounds blackmailing pornography-website users was jailed in April 2019 for six years and five months. According to the National Crime Agency (NCA), this attacker was the most prolific cybercriminal to be sentenced in the United Kingdom ( Casciani, 2019 ). Moreover, organizations bear part of the responsibility for protecting personal information, as stated in the Data Protection Act 2018 and the EU General Data Protection Regulation (GDPR). Phishing websites can also be taken down by law enforcement agencies. In the United Kingdom, websites can be taken down by the National Crime Agency (NCA), which includes the National Cyber Crime Unit, and by the City of London Police, which includes the Police Intellectual Property Crime Unit (PIPCU) and the National Fraud Intelligence Bureau (NFIB) ( Hutchings et al., 2016 ).

However, anti-phishing law enforcement still faces numerous challenges and limitations. Firstly, after perpetrating the attack, the phisher can vanish into cyberspace, making it difficult to prove the offender’s guilt and to recover the damages caused by the attack, which limits the effectiveness of the law enforcement role. Secondly, even if the attacker’s identity is disclosed, in the case of international attackers it will be difficult to bring the offender to justice because of differences in countries’ legislation (e.g., extradition treaties). Also, the attack can be conducted within a short time span; for instance, the average lifetime of a phishing website is about 54 h, as stated by the APWG. Therefore, there must be a quick response from governments and authorities to detect, control, and identify the perpetrators of the attack ( Ollmann, 2004 ).

Conclusion

Phishing attacks remain one of the major threats to individuals and organizations to date. As highlighted in this article, this is mainly driven by human involvement in the phishing cycle. Often phishers exploit human vulnerabilities in addition to favorable technological conditions (i.e., technical vulnerabilities). It has been identified that age, gender, internet addiction, user stress, and many other attributes affect people’s susceptibility to phishing. In addition to traditional phishing channels (e.g., email and web), new types of phishing mediums such as voice and SMS phishing are on the increase. Furthermore, the use of social-media-based phishing has grown in parallel with the growth of social media. Concomitantly, phishing has developed beyond obtaining sensitive information and financial crimes to cyber terrorism, hacktivism, damaging reputations, espionage, and nation-state attacks. Research has been conducted to identify the motivations, techniques, and countermeasures for these new crimes; however, there is no single solution to the phishing problem due to the heterogeneous nature of the attack vector. This article has investigated the problems presented by phishing and proposed a new anatomy that describes the complete life cycle of phishing attacks. This anatomy provides a wider outlook on phishing attacks and an accurate definition covering the end-to-end execution and realization of the attack.

Although human education is the most effective defense against phishing, it is difficult to remove the threat completely due to the sophistication of the attacks and their social engineering elements. Although continual security-awareness training is the key to avoiding phishing attacks and reducing their impact, developing efficient anti-phishing techniques that prevent users from being exposed to the attack is an essential step in mitigating these attacks. To this end, this article discussed the importance of developing anti-phishing techniques that detect or block the attack. Furthermore, techniques for determining the source of the attack could provide a stronger anti-phishing solution, as discussed in this article.

Furthermore, this article identified the importance of law enforcement as a deterrent mechanism. Further investigations and research are necessary as discussed below.

1. Further research is necessary to study and investigate susceptibility to phishing among users, which would assist in designing stronger and self-learning anti-phishing security systems.

2. Research on social-media-based phishing, voice phishing, and SMS phishing is sparse, and these emerging threats are predicted to increase significantly over the coming years.

3. Laws and legislation that apply to phishing are still in their infancy; in fact, there are no specific phishing laws in many countries. Most phishing attacks are covered under traditional criminal laws such as identity theft and computer crime. Therefore, drafting specific laws for phishing is an important step in mitigating these attacks at a time when these crimes are becoming more common.

4. Determining the source of the attack before the end of the phishing lifecycle and enforcing legislation on the offender could help restrict phishing attacks drastically and would benefit from further research.

It can be observed that the mediums used for phishing attacks have changed from traditional emails to social media-based phishing. There is a clear lag between sophisticated phishing attacks and existing countermeasures. The emerging countermeasures should be multidimensional to tackle both human and technical elements of the attack. This article provides valuable information about current phishing attacks and countermeasures whilst the proposed anatomy provides a clear taxonomy to understand the complete life cycle of phishing.

Author Contributions

This work is by our PhD student ZA supported by her Supervisory Team.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

AOL America Online

APWG Anti-Phishing Working Group

ARPANET Advanced Research Projects Agency Network

ARP Address Resolution Protocol

BHO Browser Helper Object

BEC Business Email Compromise

COVID-19 Coronavirus disease 2019

CSS Cascading Style Sheets

DDoS distributed denial of service

DNS Domain Name System

DoS Denial of Service

FTC Federal Trade Commission

FL Fuzzy Logic

HTTPS Hypertext Transfer Protocol Secure

IE Internet Explorer

ICT Information and Communications Technology

IM Instant Message

IT Information Technology

IP Internet Protocol

MITM Man-in-the-Middle

NCA National Crime Agency

NFIB National Fraud Intelligence Bureau

PIPCU Police Intellectual Property Crime Unit

OS Operating Systems

PBX Private Branch Exchange

SMishing Text Message Phishing

SPF Sender Policy Framework

SMTP Simple Mail Transfer Protocol

SMS Short Message Service

Soshing Social Media Phishing

SQL Structured Query Language

URL Uniform Resource Locator

UK United Kingdom

US United States

USB Universal Serial Bus

US-CERT United States Computer Emergency Readiness Team

Vishing Voice Phishing

VNC Virtual Network Computing

VoIP Voice over Internet Protocol

XSS Cross-Site Scripting

Footnotes

1 Proofpoint is “a leading cybersecurity company that protects organizations’ greatest assets and biggest risks: their people. With an integrated suite of cloud-based solutions” ( Proofpoint, 2019b ).

2 APWG Is “the international coalition unifying the global response to cybercrime across industry, government and law-enforcement sectors and NGO communities” ( APWG, 2020 ).

3 CalleR ID is “a telephone facility that displays a caller’s phone number on the recipient's phone device before the call is answered” ( Techpedia, 2021 ).

4 An IPPBX is “a telephone switching system within an enterprise that switches calls between VoIP users on local lines while allowing all users to share a certain number of external phone lines” ( Margaret, 2008 ).

References

Abad, C. (2005). The economy of phishing: a survey of the operations of the phishing market. First Monday 10, 1–11. doi:10.5210/fm.v10i9.1272


Abu-Nimeh, S., and Nair, S. (2008). “Bypassing security toolbars and phishing filters via DNS poisoning,” in IEEE GLOBECOM 2008–2008 IEEE Global Telecommunications Conference, New Orleans, LA, November 30–December 2, 2008 (IEEE), 1–6. doi:10.1109/GLOCOM.2008.ECP.386

Aburrous, M., Hossain, M. A., Thabatah, F., and Dahal, K. (2008). “Intelligent phishing website detection system using fuzzy techniques,” in 2008 3rd International Conference on Information and Communication Technologies: From Theory to Applications (New York, NY: IEEE), 1–6. doi:10.1109/ICTTA.2008.4530019

Afroz, S., and Greenstadt, R. (2009). “Phishzoo: an automated web phishing detection approach based on profiling and fuzzy matching,” in Proceeding 5th IEEE international conference semantic computing (ICSC) , 1–11.


Alsharnouby, M., Alaca, F., and Chiasson, S. (2015). Why phishing still works: user strategies for combating phishing attacks. Int. J. Human-Computer Stud. 82, 69–82. doi:10.1016/j.ijhcs.2015.05.005

APWG (2018). Phishing activity trends report 3rd quarter 2018 . US. 1–11.

APWG (2020). APWG phishing attack trends reports. 2020 anti-phishing work. Group, Inc Available at: https://apwg.org/trendsreports/ (Accessed September 20, 2020).

Arachchilage, N. A. G., and Love, S. (2014). Security awareness of computer users: a phishing threat avoidance perspective. Comput. Hum. Behav. 38, 304–312. doi:10.1016/j.chb.2014.05.046

Arnsten, B. A., Mazure, C. M., and April, R. S. (2012). Everyday stress can shut down the brain’s chief command center. Sci. Am. 306, 1–6. Available at: https://www.scientificamerican.com/article/this-is-your-brain-in-meltdown/ (Accessed October 15, 2019).

Bailey, J. L., Mitchell, R. B., and Jensen, B. K. (2008). “Analysis of student vulnerabilities to phishing,” in 14th Americas Conference on Information Systems, AMCIS 2008, 75–84. Available at: https://aisel.aisnet.org/amcis2008/271 .

Barracuda (2020). Business email compromise (BEC). Available at: https://www.barracuda.com/glossary/business-email-compromise (Accessed November 15, 2020).

Belcic, I. (2020). Rootkits defined: what they do, how they work, and how to remove them. Available at: https://www.avast.com/c-rootkit (Accessed November 7, 2020).

Bergholz, A., De Beer, J., Glahn, S., Moens, M.-F., Paaß, G., and Strobel, S. (2010). New filtering approaches for phishing email. JCS 18, 7–35. doi:10.3233/JCS-2010-0371

Bin, S., Qiaoyan, W., and Xiaoying, L. (2010). “A DNS based anti-phishing approach,” in 2010 Second International Conference on Networks Security, Wireless Communications and Trusted Computing, Wuhan, China, April 24–25, 2010 (IEEE), 262–265. doi:10.1109/NSWCTC.2010.196

Boddy, M. (2018). Phishing 2.0: the new evolution in cybercrime. Comput. Fraud Secur. 2018, 8–10. doi:10.1016/S1361-3723(18)30108-8

Casciani, D. (2019). Zain Qaiser: student jailed for blackmailing porn users worldwide. Available at: https://www.bbc.co.uk/news/uk-47800378 (Accessed April 9, 2019).

Chanti, S., and Chithralekha, T. (2020). Classification of anti-phishing solutions. SN Comput. Sci. 1, 11. doi:10.1007/s42979-019-0011-2

Checkpoint (2020). Check point research’s Q1 2020 brand phishing report. Available at: https://www.checkpoint.com/press/2020/apple-is-most-imitated-brand-for-phishing-attempts-check-point-researchs-q1-2020-brand-phishing-report/ (Accessed August 6, 2020).

cisco (2018). What is the difference: viruses, worms, Trojans, and bots? Available at: https://www.cisco.com/c/en/us/about/security-center/virus-differences.html (Accessed January 20, 2020).

CISA (2018). What is phishing. Available at: https://www.us-cert.gov/report-phishing (Accessed June 10, 2019).

Cormack, G. V., Smucker, M. D., and Clarke, C. L. A. (2011). Efficient and effective spam filtering and re-ranking for large web datasets. Inf. Retrieval 14, 441–465. doi:10.1007/s10791-011-9162-z

Corrata (2018). The rising threat of social media phishing attacks. Available at: https://corrata.com/the-rising-threat-of-social-media-phishing-attacks/ (Accessed October 29, 2019).

Crane, C. (2019). The dirty dozen: the 12 most costly phishing attack examples. Available at: https://www.thesslstore.com/blog/the-dirty-dozen-the-12-most-costly-phishing-attack-examples/ (Accessed August 2, 2020).

CSI Onsite (2012). Phishing. Available at: http://csionsite.com/2012/phishing/ (Accessed May 8, 2019).

Cui, Q., Jourdan, G.-V., Bochmann, G. V., Couturier, R., and Onut, I.-V. (2017). “Tracking phishing attacks over time,” in Proceedings of the 26th international conference on World Wide Web - WWW ’17 (Republic and Canton of Geneva, Switzerland: International World Wide Web Conferences Steering Committee), 667–676. doi:10.1145/3038912.3052654

CVEdetails (2005). Vulnerability in microsoft internet explorer. Available at: https://www.cvedetails.com/cve/CVE-2005-4089/ (Accessed August 20, 2019).

Cybint Cyber Solutions (2018). 13 alarming cyber security facts and stats. Available at: https://www.cybintsolutions.com/cyber-security-facts-stats/ (Accessed July 20, 2019).

Deshmukh, M., and Popat, S. (2017). Different techniques for detection of phishing attack. Int. J. Eng. Sci. Comput. 7, 10201–10204. Available at: http://ijesc.org/ .

Dhamija, R., Tygar, J. D., and Hearst, M. (2006). “Why phishing works,” in Proceedings of the SIGCHI conference on human factors in computing systems - CHI ’06 , Montréal Québec, Canada , (New York, NY: ACM Press ), 581. doi:10.1145/1124772.1124861

Diaz, A., Sherman, A. T., and Joshi, A. (2020). Phishing in an academic community: a study of user susceptibility and behavior. Cryptologia 44, 53–67. doi:10.1080/01611194.2019.1623343

Dodge, R. C., Carver, C., and Ferguson, A. J. (2007). Phishing for user security awareness. Comput. Security 26, 73–80. doi:10.1016/j.cose.2006.10.009

eBay Toolbar and Account Guard (2009). Available at: https://download.cnet.com/eBay-Toolbar/3000-12512_4-10153544.html (Accessed August 7, 2020).

EDUCBA (2017). Hackers vs crackers: easy to understand exclusive difference. Available at: https://www.educba.com/hackers-vs-crackers/ (Accessed July 17, 2019).

Evers, J. (2006). Security expert: user education is pointless. Available at: https://www.cnet.com/news/security-expert-user-education-is-pointless/ (Accessed June 25, 2019).

F5Networks (2018). Panda malware broadens targets to cryptocurrency exchanges and social media. Available at: https://www.f5.com/labs/articles/threat-intelligence/panda-malware-broadens-targets-to-cryptocurrency-exchanges-and-social-media (Accessed April 23, 2019).

Fette, I., Sadeh, N., and Tomasic, A. (2007). “Learning to detect phishing emails,” in Proceedings of the 16th international conference on world wide web - WWW ’07 , Banff Alberta, Canada , (New York, NY: ACM Press) , 649–656. doi:10.1145/1242572.1242660

Financial Fraud Action UK (2017). Fraud the facts 2017: the definitive overview of payment industry fraud. London. Available at: https://www.financialfraudaction.org.uk/fraudfacts17/assets/fraud_the_facts.pdf .

Fraud Watch International (2019). Phishing attack trends for 2019. Available at: https://fraudwatchinternational.com/phishing/phishing-attack-trends-for-2019/ (Accessed October 29, 2019).

FTC (2018). Netflix scam email. Available at: https://www.ftc.gov/tips-advice/business-center/small-businesses/cybersecurity/phishing (Accessed May 8, 2019).

Furnell, S. (2007). An assessment of website password practices. Comput. Secur. 26, 445–451. doi:10.1016/j.cose.2007.09.001

Getsafeonline (2017). Caught on the net. Available at: https://www.getsafeonline.org/news/caught-on-the-net/ (Accessed August 1, 2020).

GOV.UK (2020). Cyber security breaches survey 2020. Available at: https://www.gov.uk/government/publications/cyber-security-breaches-survey-2020/cyber-security-breaches-survey-2020 (Accessed August 6, 2020).

Gupta, P., Srinivasan, B., Balasubramaniyan, V., and Ahamad, M. (2015). “Phoneypot: data-driven understanding of telephony threats,” in Proceedings 2015 network and distributed system security symposium , (Reston, VA: Internet Society ), 8–11. doi:10.14722/ndss.2015.23176

Hadlington, L. (2017). Human factors in cybersecurity; examining the link between internet addiction, impulsivity, attitudes towards cybersecurity, and risky cybersecurity behaviours. Heliyon 3, e00346-18. doi:10.1016/j.heliyon.2017.e00346

Herley, C., and Florêncio, D. (2008). “A profitless endeavor,” in New security paradigms workshop (NSPW ’08), New Hampshire, United States, 2008, 1–12. doi:10.1145/1595676.1595686

Hewage, C. (2020). Coronavirus pandemic has unleashed a wave of cyber attacks – here’s how to protect yourself. Conversat . Available at: https://theconversation.com/coronavirus-pandemic-has-unleashed-a-wave-of-cyber-attacks-heres-how-to-protect-yourself-135057 (Accessed November 16, 2020).

Hong, J. (2012). The state of phishing attacks. Commun. ACM 55, 74–81. doi:10.1145/2063176.2063197

Huber, M., Kowalski, S., Nohlberg, M., and Tjoa, S. (2009). “Towards automating social engineering using social networking sites,” in 2009 international conference on computational science and engineering, Vancouver, BC, August 29–31, 2009 (IEEE), 117–124. doi:10.1109/CSE.2009.205

Hutchings, A., Clayton, R., and Anderson, R. (2016). “Taking down websites to prevent crime,” in 2016 APWG symposium on electronic crime research (eCrime) (IEEE), 1–10. doi:10.1109/ECRIME.2016.7487947

Iuga, C., Nurse, J. R. C., and Erola, A. (2016). Baiting the hook: factors impacting susceptibility to phishing attacks. Hum. Cent. Comput. Inf. Sci. 6, 8. doi:10.1186/s13673-016-0065-2

Jagatic, T. N., Johnson, N. A., Jakobsson, M., and Menczer, F. (2007). Social phishing. Commun. ACM 50, 94–100. doi:10.1145/1290958.1290968

Jakobsson, M., and Myers, S. (2006). Phishing and countermeasures: understanding the increasing problems of electronic identity theft . New Jersey: John Wiley and Sons .

Jakobsson, M., Tsow, A., Shah, A., Blevis, E., and Lim, Y. K. (2007). “What instills trust? A qualitative study of phishing,” in Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics) , (Berlin, Heidelberg: Springer ), 356–361. doi:10.1007/978-3-540-77366-5_32

Jeeva, S. C., and Rajsingh, E. B. (2016). Intelligent phishing url detection using association rule mining. Hum. Cent. Comput. Inf. Sci. 6, 10. doi:10.1186/s13673-016-0064-3

Johnson, A. (2016). Almost 600 accounts breached in “celebgate” nude photo hack, FBI says. Available at: http://www.cnbc.com/id/102747765 (Accessed: February 17, 2020).

Kayne, R. (2019). What are script kiddies? Wisegeek. Available at: https://www.wisegeek.com/what-are-script-kiddies.htm (Accessed February 19, 2020).

Keck, C. (2018). FTC warns of sketchy Netflix phishing scam asking for payment details. Available at: https://gizmodo.com/ftc-warns-of-sketchy-netflix-phishing-scam-asking-for-p-1831372416 (Accessed April 23, 2019).

Keepnet LABS (2018). Statistical analysis of 126,000 phishing simulations carried out in 128 companies around the world. USA, France. Available at: www.keepnetlabs.com .

Keinan, G. (1987). Decision making under stress: scanning of alternatives under controllable and uncontrollable threats. J. Personal. Soc. Psychol. 52, 639–644. doi:10.1037/0022-3514.52.3.639

Khonji, M., Iraqi, Y., and Jones, A. (2013). Phishing detection: a literature survey. IEEE Commun. Surv. Tutorials 15, 2091–2121. doi:10.1109/SURV.2013.032213.00009

Kirda, E., and Kruegel, C. (2005). Protecting users against phishing attacks with AntiPhish. Proc. - Int. Comput. Softw. Appl. Conf. 1, 517–524. doi:10.1109/COMPSAC.2005.126

Krawchenko, K. (2016). The phishing email that hacked the account of John Podesta. CBSNEWS Available at: https://www.cbsnews.com/news/the-phishing-email-that-hacked-the-account-of-john-podesta/ (Accessed April 13, 2019).

Kaspersky (2020). Spam and phishing in Q1 2020. Available at: https://securelist.com/spam-and-phishing-in-q1-2020/97091/ (Accessed July 27, 2020).

Kumaraguru, P., Sheng, S., Acquisti, A., Cranor, L. F., and Hong, J. (2010). Teaching Johnny not to fall for phish. ACM Trans. Internet Technol. 10, 1–31. doi:10.1145/1754393.1754396

Latto, N. (2020). What is adware and how can you prevent it? Avast. Available at: https://www.avast.com/c-adware (Accessed May 8, 2020).

Le, D., Fu, X., and Hogrefe, D. (2006). A review of mobility support paradigms for the internet. IEEE Commun. Surv. Tutorials 8, 38–51. doi:10.1109/COMST.2006.323441

Lehman, T. J., and Vajpayee, S. (2011). “We’ve looked at clouds from both sides now,” in 2011 annual SRII global conference, San Jose, CA, March 20–April 2, 2011 (IEEE), 342–348. doi:10.1109/SRII.2011.46

Leyden, J. (2001). Virus toolkits are s’kiddie menace. Regist. Available at: https://www.theregister.co.uk/2001/02/21/virus_toolkits_are_skiddie_menace/ (Accessed June 15, 2019).

Lin, J., Sadeh, N., Amini, S., Lindqvist, J., Hong, J. I., and Zhang, J. (2012). “Expectation and purpose,” in Proceedings of the 2012 ACM conference on ubiquitous computing - UbiComp ’12 (New York, New York, USA: ACM Press ), 1625. doi:10.1145/2370216.2370290

Lininger, R., and Vines, D. R. (2005). Phishing: cutting the identity theft line. Print book . Indiana: Wiley Publishing, Inc .

Ma, J., Saul, L. K., Savage, S., and Voelker, G. M. (2009). “Identifying suspicious URLs,” in Proceedings of the 26th annual international conference on machine learning - ICML ’09 (New York, NY: ACM Press), 1–8. doi:10.1145/1553374.1553462

Marforio, C., Masti, R. J., Soriente, C., Kostiainen, K., and Capkun, S. (2015). Personalized security indicators to detect application phishing attacks in mobile platforms. Available at: http://arxiv.org/abs/1502.06824 .

Margaret, R. I. P. (2008). PBX (private branch exchange). Available at: https://searchunifiedcommunications.techtarget.com/definition/IP-PBX (Accessed June 19, 2019).

Maurer, M.-E., and Herzner, D. (2012). “Using visual website similarity for phishing detection and reporting,” in CHI ’12 extended abstracts on human factors in computing systems (New York, NY: ACM Press), 1625–1630. doi:10.1145/2212776.2223683

Medvet, E., Kirda, E., and Kruegel, C. (2008). “Visual-similarity-based phishing detection,” in Proceedings of the 4th international conference on Security and privacy in communication netowrks - SecureComm ’08 (New York, NY: ACM Press ), 1. doi:10.1145/1460877.1460905

Merwe, A. v. d., Marianne, L., and Marek, D. (2005). “Characteristics and responsibilities involved in a phishing attack,” in WISICT ’05: proceedings of the 4th international symposium on information and communication technologies. Trinity College Dublin, 249–254.

Microsoft (2020). Exploiting a crisis: how cybercriminals behaved during the outbreak. Available at: https://www.microsoft.com/security/blog/2020/06/16/exploiting-a-crisis-how-cybercriminals-behaved-during-the-outbreak/ (Accessed August 1, 2020).

Mince-Didier, A. (2020). Hacking a computer or computer network. Available at: https://www.criminaldefenselawyer.com/resources/hacking-computer.html (Accessed August 7, 2020).

Miyamoto, D., Hazeyama, H., and Kadobayashi, Y. (2009). “An evaluation of machine learning-based methods for detection of phishing sites,” in international conference on neural information processing ICONIP 2008: advances in neuro-information processing lecture notes in computer science . Editors M. Köppen, N. Kasabov, and G. Coghill (Berlin, Heidelberg: Springer Berlin Heidelberg ), 539–546. doi:10.1007/978-3-642-02490-0_66

Mohammad, R. M., Thabtah, F., and McCluskey, L. (2014). Predicting phishing websites based on self-structuring neural network. Neural Comput. Applic 25, 443–458. doi:10.1007/s00521-013-1490-z

Moore, T., and Clayton, R. (2007). “Examining the impact of website take-down on phishing,” in Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit on - eCrime ’07 (New York, NY: ACM Press ), 1–13. doi:10.1145/1299015.1299016

Morgan, S. (2019). 2019 official annual cybercrime report. USA, UK, Canada. Available at: https://www.herjavecgroup.com/wp-content/uploads/2018/12/CV-HG-2019-Official-Annual-Cybercrime-Report.pdf .

Nathan, G. (2020). What is phishing? + laws, charges & statute of limitations. Available at: https://www.federalcharges.com/phishing-laws-charges/ (Accessed August 7, 2020).

Okin, S. (2009). From script kiddies to organised cybercrime. Available at: https://comsecglobal.com/from-script-kiddies-to-organised-cybercrime-things-are-getting-nasty-out-there/ (Accessed August 12, 2019).

Ollmann, G. (2004). The phishing guide: understanding & preventing phishing attacks. USA. Available at: http://www.ngsconsulting.com .

Ong, S. (2014). Avast survey shows men more susceptible to mobile malware. Available at: https://www.mirekusoft.com/avast-survey-shows-men-more-susceptible-to-mobile-malware/ (Accessed November 5, 2020).

Ovelgönne, M., Dumitraş, T., Prakash, B. A., Subrahmanian, V. S., and Wang, B. (2017). Understanding the relationship between human behavior and susceptibility to cyber attacks. ACM Trans. Intell. Syst. Technol. 8, 1–25.

Parmar, B. (2012). Protecting against spear-phishing. Computer Fraud Security , 2012, 8–11. doi:10.1016/S1361-3723(12)70007-6

Phish Labs (2019). 2019 phishing trends and intelligence report the growing social engineering threat. Available at: https://info.phishlabs.com/hubfs/2019 PTI Report/2019 Phishing Trends and Intelligence Report.pdf .

PhishMe (2016). Q1 2016 malware review. Available at: www.phishme.com .

PhishMe (2017). Human phishing defense enterprise phishing resiliency and defense report 2017 analysis of susceptibility, resiliency and defense against simulated and real phishing attacks. Available at: https://cofense.com/wp-content/uploads/2017/11/Enterprise-Phishing-Resiliency-and-Defense-Report-2017.pdf .

PhishTank (2006). What is phishing. Available at: http://www.phishtank.com/what_is_phishing.php?view=website&annotated=true (Accessed June 19, 2019).

Pompon, A. R., Walkowski, D., and Boddy, S. (2018). Phishing and fraud report: attacks peak during the holidays. US.

Proofpoint (2019a). State of the phish 2019 report.

Proofpoint (2019b). What is Proofpoint. Available at: https://www.proofpoint.com/us/company/about (Accessed September 25, 2019).

Proofpoint (2020). 2020 state of the phish. Available at: https://www.proofpoint.com/sites/default/files/gtd-pfpt-us-tr-state-of-the-phish-2020.pdf .

Raggo, M. (2016). Anatomy of a social media attack. Available at: https://www.darkreading.com/analytics/anatomy-of-a-social-media-attack/a/d-id/1326680 (Accessed March 14, 2019).

Ramanathan, V., and Wechsler, H. (2012). PhishGILLNET-phishing detection methodology using probabilistic latent semantic analysis, AdaBoost, and co-training. EURASIP J. Info. Secur. 2012, 1–22. doi:10.1186/1687-417X-2012-1

Ramzan, Z. (2010). “Phishing attacks and countermeasures,” in Handbook of Information and communication security (Berlin, Heidelberg: Springer Berlin Heidelberg ), 433–448. doi:10.1007/978-3-642-04117-4_23

Ramzan, Z., and Wuest, C. (2007). “Phishing attacks: analyzing trends in 2006,” in Fourth conference on email and anti-spam (Mountain View, CA, United States).

Rhett, J. (2019). Don’t fall for this new Google translate phishing attack. Available at: https://www.gizmodo.co.uk/2019/02/dont-fall-for-this-new-google-translate-phishing-attack/ (Accessed April 23, 2019).

RISKIQ (2020). Investigate | COVID-19 cybercrime weekly update. Available at: https://www.riskiq.com/blog/analyst/covid19-cybercrime-update/ (Accessed August 1, 2020).

Robichaux, P., and Ganger, D. L. (2006). Gone phishing: evaluating anti-phishing tools for windows. Available at: http://www.3sharp.com/projects/antiphishing/gonephishing.pdf .

Rouse, M. (2013). Phishing defintion. Available at: https://searchsecurity.techtarget.com/definition/phishing (Accessed April 10, 2019).

Salem, O., Hossain, A., and Kamala, M. (2010). “Awareness program and AI based tool to reduce risk of phishing attacks,” in 2010 10th IEEE international conference on computer and information technology, Bradford, United Kingdom, June 29–July 1, 2010 (IEEE), 1418–1423. doi:10.1109/CIT.2010.254

Scaife, N., Carter, H., Traynor, P., and Butler, K. R. B. (2016). “Crypto lock (and drop it): stopping ransomware attacks on user data,” in 2016 IEEE 36th international conference on distributed computing systems (ICDCS) (IEEE), 303–312. doi:10.1109/ICDCS.2016.46

Sheng, S., Magnien, B., Kumaraguru, P., Acquisti, A., Cranor, L. F., Hong, J., et al. (2007). “Anti-Phishing Phil: the design and evaluation of a game that teaches people not to fall for phish,” in Proceedings of the 3rd symposium on usable privacy and security - SOUPS ’07 (New York, NY: ACM Press ), 88–99. doi:10.1145/1280680.1280692

Symantec (2019). Internet security threat report, volume 24, February 2019. USA.

Techopedia (2021). Caller ID. Available at: https://www.techopedia.com/definition/24222/caller-id (Accessed June 19, 2019).

VadeSecure (2021). Phishers favorites 2019. Available at: https://www.vadesecure.com/en/ (Accessed October 29, 2019).

Vishwanath, A. (2005). “Spear phishing: the tip of the spear used by cyber terrorists,” in deconstruction machines (United States: University of Minnesota Press ), 469–484. doi:10.4018/978-1-5225-0156-5.ch023

Wang, X., Zhang, R., Yang, X., Jiang, X., and Wijesekera, D. (2008). “Voice pharming attack and the trust of VoIP,” in Proceedings of the 4th international conference on security and privacy in communication networks, SecureComm’08 , 1–11. doi:10.1145/1460877.1460908

Wenyin, L., Huang, G., Xiaoyue, L., Min, Z., and Deng, X. (2005). “Detection of phishing webpages based on visual similarity,” in 14th international world wide web conference, WWW2005 , Chiba, Japan , May 10–14, 2005 , 1060–1061. doi:10.1145/1062745.1062868

Whitman, M. E., and Mattord, H. J. (2012). Principles of information security. Course Technol. 1–617. doi:10.1016/B978-0-12-381972-7.00002-6

Williams, E. J., Hinds, J., and Joinson, A. N. (2018). Exploring susceptibility to phishing in the workplace. Int. J. Human-Computer Stud. 120, 1–13. doi:10.1016/j.ijhcs.2018.06.004

wombatsecurity.com (2018). Wombat security user risk report. USA. Available at: https://info.wombatsecurity.com/hubfs/WombatProofpoint-UserRiskSurveyReport2018_US.pdf .

Workman, M. (2008). Wisecrackers: a theory-grounded investigation of phishing and pretext social engineering threats to information security. J. Am. Soc. Inf. Sci. 59 (4), 662–674. doi:10.1002/asi.20779

Yeboah-Boateng, E. O., and Amanor, P. M. (2014). Phishing, SMiShing & vishing: an assessment of threats against mobile devices. J. Emerg. Trends Comput. Inf. Sci. 5 (4), 297–307.

Zhang, Y., Hong, J. I., and Cranor, L. F. (2007). “Cantina,” in Proceedings of the 16th international conference on World Wide Web - WWW ’07 (New York, NY: ACM Press ), 639. doi:10.1145/1242572.1242659

Zissis, D., and Lekkas, D. (2012). Addressing cloud computing security issues. Future Generat. Comput. Syst. 28, 583–592. doi:10.1016/j.future.2010.12.006

Keywords: phishing anatomy, precautionary countermeasures, phishing targets, phishing attack mediums, phishing attacks, attack phases, phishing techniques

Citation: Alkhalil Z, Hewage C, Nawaf L and Khan I (2021) Phishing Attacks: A Recent Comprehensive Study and a New Anatomy. Front. Comput. Sci. 3:563060. doi: 10.3389/fcomp.2021.563060

Received: 17 May 2020; Accepted: 18 January 2021; Published: 09 March 2021.

Copyright © 2021 Alkhalil, Hewage, Nawaf and Khan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Chaminda Hewage, [email protected]

This article is part of the Research Topic 2021 Editor's Pick: Computer Science.

Cyberattacks, cyber threats, and attitudes toward cybersecurity policies


Keren L G Snider, Ryan Shandler, Shay Zandani, Daphna Canetti, Cyberattacks, cyber threats, and attitudes toward cybersecurity policies, Journal of Cybersecurity , Volume 7, Issue 1, 2021, tyab019, https://doi.org/10.1093/cybsec/tyab019


Does exposure to cyberattacks influence public support for intrusive cybersecurity policies? How do perceptions of cyber threats mediate this relationship? While past research has demonstrated how exposure to cyberattacks affects political attitudes, the mediating role played by threat perception has been overlooked. This study employs a controlled randomized survey experiment design to test the effect of exposure to lethal and nonlethal cyberattacks on support for different types of cybersecurity policies. One thousand twenty-two Israeli participants are exposed to scripted and simulated television reports of lethal or nonlethal cyberattacks against national infrastructure. Findings suggest that exposure to cyberattacks leads to greater support for stringent cybersecurity regulations, through a mechanism of threat perception. Results also indicate that different types of exposure relate to heightened support for different types of regulatory policies. People exposed to lethal cyberattacks tend to support cybersecurity policies that compel the government to alert citizens about cyberattacks. People who were exposed to nonlethal attacks, on the other hand, tend to support oversight policies at higher levels. More broadly, our research suggests that peoples’ willingness to accept government cybersecurity policies that limit personal civil liberties and privacy depends on the type of cyberattacks to which they were exposed and the perceptions associated with such exposure.

In recent years, the increase in civilian exposure to cyberattacks has been accompanied by heightened demands for governments to introduce comprehensive cybersecurity policies. These demands peaked in the aftermath of the 2021 Colonial Pipeline and 2020 SolarWinds cyberattacks, where the US government's lack of access to cybersecurity information in critical industries wrought havoc on the country's national and economic security. In the aftermath of these attacks, lawmakers and the public exhibited newfound enthusiasm for legislation that would mandate cyberattack reporting by private enterprises, accelerating a regulatory trend that has existed for several years [ 1 ]. In 2020, for example, 40 US states and territories introduced more than 280 cybersecurity-related bills and resolutions [ 2 , 3 ]. A similar process has taken place in Europe [ 4 ] and in Israel [ 5 , 6 ].

The public willingness to accept government policies and regulations that limit personal civil liberties and privacy is part of a delicate tradeoff between security and privacy. In some ways, privacy is seen as an adequate cost of enhanced personal and societal security in the face of novel threats. However, the public has grown increasingly sensitive to the importance of online privacy, and is keenly aware of the ethical, political, legal, and rights-based dilemmas that revolve around government monitoring of online activity and communications [ 7 , 8 ].

The debate on digital surveillance centers on how and whether authorities should gain access to encrypted materials, and raises key questions concerning the extent of state interference in civic life and the protection of civil rights in the context of security. Yet what drives this willingness to accept government policies and regulations that limit personal civil liberties and privacy, and the growing public demand for government intervention in cybersecurity? Does exposure to different types of cyberattacks lead to heightened support for different types of regulatory policies? And does the public differentiate between interventionist and regulatory forms of cybersecurity policies?

To test these questions, we ran a controlled randomized survey experiment that exposed 1022 Israeli participants to simulated video news reports of lethal and nonlethal cyberattacks. We argue that public support for governmental cybersecurity measures rises as a result of exposure to different forms of cyberattacks, and that perceived threat plays a mediating role in this relationship. More specifically, we propose that exposure to initial media reports about cyberattacks is key to the exposure effect, since at this point the threat is magnified and the public has minimal information about the identity of the attacker and the type of cyberattack that was conducted. Past events show that in many cases the public internalizes the details of an attack in its immediate aftermath, when media reports are heaviest. While later reports in the days and weeks following an attack include far more detailed information, by this time the damage has already been done and the public is already frightened and alert.

Further to this, we suggest that the literature has erroneously pooled together all cyber regulatory policies under a single banner of cybersecurity. We propose that civilian exposure to different types of cyberattacks leads to increased support for different and specific cybersecurity policies. We therefore differentiate between support for policies that focus on alerting the public in cases of cyberattacks and others that call for oversight of cybersecurity. In examining how exposure to cyberattacks influences support for these specific policy positions, we distinguish between the outcome of cyberattacks—lethal attacks that cause lethal consequences as a first- or second-degree outcome of the attack, versus nonlethal attacks that merely involve financial consequences. This more nuanced breakdown of exposure types and policy options can help officials contend with certain policy debates without the need for a one-size-fits-all policy. For example, reservations expressed by conservative/libertarian scholars who are concerned about government intervention in the commercial marketplace need not disqualify all forms of cybersecurity policy [ 9 ]. Likewise, the reservations of those concerned with individual privacy violations need not lead to the denunciation of all policies [ 10 ].

To ground this analysis of how the public responds following exposure to both lethal and nonlethal cyberattacks, we apply theories associated with the literature on terrorism and political violence. These theories offer sophisticated mechanisms that explain how individual exposure to violence translates into political outcomes—including demands for government intervention and policymaking. This approach is especially applicable in the digital realm as cyberattacks track a middle ground between technological breakthroughs that constitute tactical developments and new strategic weapons [ 11 ]. The consequence of such ambiguity is that civilians who are exposed to digital political threats can only identify the outcomes of the attack—i.e. whether it is a lethal or nonlethal cyberattack—while the motivations and identities of attackers often remain veiled, or at least unsettled. In light of these attributional challenges, and reflecting the fact that the public typically operates in a low-information environment, we refrain from declaring that the cyberattacks that appear in our experimental manipulations are cybercrime, cyberterrorism, cyber-vandalism, or any other type of attack. Rather, we refer to all attacks under the general heading of "cyberattacks," leaving all respondents to react to the attacks in a way that they see as appropriate in light of the severity of the reported outcome.

The most common form of cyberattack is cybercrime. Reports of data breaches resulting from cyberattacks by criminal organizations show a growth of more than threefold between 2011 and 2018 [ 12 ]. In the first half of 2019 alone, the United States Treasury Department announced that there had been 3494 successful cyberattacks against financial institutions resulting in colossal financial losses and the capture of personal information relating to hundreds of millions of people [ 13 ]. Cyberattacks executed by terror organizations are a newer phenomenon, albeit one that has captured the popular imagination. While terror organizations predominantly make use of cyberspace for fundraising, propaganda, and recruitment [ 14 , 15 ], a recent development has been the next-generation capacity of cyber strikes to trigger lethal consequences, be it through first- or second-order effects. 1 We acknowledge that scholars have expressed some skepticism about the likelihood of impending destructive cyberterror incidents [ 16–18 ], yet national security officials have regularly predicted that lethal cyberattacks pose a "critical threat" [ 19 ]. In the last decade, the nature of this threat has evolved from the earlier depictions of an apocalyptic cyber "Pearl Harbor" that would ravage modern society from the shadows [ 20 ], to a more nuanced understanding that cyberattacks, while still posing a threat to critical infrastructure, are more likely to manifest through targeted strikes. For example, in April 2020, Israel narrowly averted a cyberattack targeting civilian water networks that would have killed scores of civilians by adding chlorine to the water supply [ 19 ]. Other physically destructive cyberattacks have caused explosive damage to critical infrastructure [ 21 ], while researchers have experimentally verified the ability of malicious digital actors to hack pacemakers and insulin pumps [ 22 ].
While the lethal stature of cyberattacks is still developing, these incidents establish the bona fides of this impending threat and the importance of understanding how the public responds to this type of event.

The discussion that follows has four parts. We begin by examining the theory of how exposure to violence translates into policy preferences, with a particular focus on the mediating role of threat perception. Second, we discuss the design of our controlled, randomized experiment that exposes participants to television news reports of lethal and nonlethal cyberattacks. Third, we present our main results and consider various mediation models that pertain to the different regulatory subsets. We conclude by discussing the implications of our findings for the study of cybersecurity and cyber threats more generally.

Civilians who are exposed to political violence often suffer from feelings of trauma, anxiety, and helplessness in the face of threatening external forces [ 23–25 ]. These emotional responses, whether caused by acts of cyber or conventional violence, are known to cause shifts in political attitudes. Research has shown how exposure to conventional terrorism, which targets civilians and disrupts their daily routines, affects individuals' attitudes toward peace and compromise with the other side [ 26 ], political conservatism [ 27 ], exclusionism [ 28 ], and intragroup relations [ 29 ].

Despite the sizeable literature dealing with the effects of exposure to violence, few studies directly investigate the effects of exposure to destructive cyberattacks. This is despite the growing recognition that these threats have become a very tangible part of modern life. In a complex scenario described in the Tallinn Manual 2.0 on the International Law Applicable to Cyber Warfare, the authors contemplated how new forms of cyberattacks could be used to “acquire the credentials necessary to access the industrial control system of a nuclear power plant… with the intent of threatening to conduct cyber operations against the system in a manner that will cause significant damage or death…” [ 30 ]. Even more recently, reports have acknowledged how cyberterror attacks could immobilize a country's or region's electrical infrastructure [ 31 ], disable military defense systems [ 32 ], and even imperil nuclear stability [ 33 ]. While there is a difference between capability and intent, and we acknowledge that physically destructive cyber threats have remained scarce until now, understanding how civilians respond to such digital cyberattacks will become particularly important as the threat matures.

Studies that directly investigated exposure to digital political violence found that exposure had significant effects on political behavior and attitudes, akin to exposure to conventional political violence [ 34 , 35 ]. In a series of exploratory studies regarding the phenomena of cyberterrorism, Gross et al . [ 34 , 36 ] sought to empirically measure the effects of exposure to cyberterrorism under controlled experimental conditions. Their key finding was that exposure to cyberterrorism was severe enough to generate significant negative emotions and cognitive reactions (threat perceptions) at equivalent levels to those of conventional terror acts. Canetti et al . [ 37 ] found that victims of cyberattacks react by demanding government protection, with psychological distress explaining the relationship between exposure and the demand for government intervention. In a subsequent biologically focused experiment, Canetti et al . measured cortisol levels to show how participants who are exposed to cyberterror attacks and experience higher levels of stress are more likely to support hardline retaliatory policies [ 38 ].

Building on this foundation, other research has sought to refine a more precise psycho-political mechanism that explains how cyberattacks trigger shifts in political attitudes. Research by Shandler et al . [ 39 , 40 ], for example, found that only lethal cyberattacks cause political consequences akin to conventional political violence, and that only the emotion of anger explained these shifts.

In the current paper, we aim to add to this emerging body of research by examining the topic of cybersecurity preferences in the aftermath of lethal and nonlethal cyberattacks. While one past study by Cheung-Blunden et al . [ 41 ] examined how emotional responses to cyber incidents sway cybersecurity preferences, no research has yet attempted to analyze how different types of cyberattacks affect different kinds of cybersecurity policies. As such, we add much needed nuance to the literature.

For the purpose of considering the effects of exposure to cyberattacks, this research focuses on the "outcome" of a cyberattack rather than the "identity" of the perpetrator or the "classification" of the attack. This is necessary for several reasons that relate to the specific characteristics of cyberspace. First, as introduced above, a new class of cyberattack exemplified by the ransomware epidemic has exhibited characteristics of both cybercrime and cyberterror operations, impeding the classification of cyber incidents into simple categories. Second, attribution in cyberspace is fraught with difficulty, and an age of manipulated information complicates the determination of provenance [ 42–44 ]. Sophisticated cyber operatives working from anywhere in the world can exploit the principle of anonymity that underlies the Internet infrastructure to hide their identity. Though authorities may be able to quickly identify the attacker behind any major cyberattack [ 42 ], this is essentially impossible for members of the public, who are confronted with both structural and technical obstacles that prevent them from rendering an objective judgement about the attack source. This reality of publicly obscured cyber antagonists can be viewed in the timelines of several famous cyber incidents. It took between six months and three years for authorities and private actors to publicly reveal the actors behind the 2017 WannaCry attacks, the 2016 cyber intrusion into the Democratic National Committee's networks, and the 2016 cyberattack against the Bowman Dam in New York [ 45–47 ]. While each of these incidents was eventually attributed to an attack source, and the authorities may well have known the identity of the attacker from an early date, we can see that from the perspective of the public, there was a time lag of several months or years before a name was attached to any attack.
Third, state involvement in cyberattacks—either as a direct attacker or via proxies—can add substantial background noise to the perception of an attack, raising the specter of interstate war. There is an interesting debate in the literature about whether states may be deemed capable of conducting cyberterrorism—or whether this is a label that can only be applied to nonstate actors. While the literature is still unsettled on this point, Macdonald, Jarvis and Nouri [ 48 ] found considerable expert support for the proposition that states can engage in cyberterrorism.

It is for these reasons that we choose to follow the lead of the scholars who are beginning to evaluate responses to cyber threats through the prism that is most readily available to the public—specifically, the outcome variable, or in other words, the lethality of the attack [ 33 ]. This focus on outcome rather than attacker is necessary in order to understand the factors that prompt emotional and political responses in the public. While these information asymmetries explain our focus on the outcome of the attack rather than the identity of the attacker, we acknowledge that people draw inferences about the identity and motivations of attackers based on prior experiences and political orientation [ 49 ]. Liberman and Skitka's vicarious retribution theory [ 50 , 51 ] demonstrates how the public may impute responsibility to unrelated or symbolically related offenders when the identity of an attacker is unclear. Nonetheless, maintaining the highest standards of ecological validity demands that attribution and attack categorization be absent in initial public reports of cyber incidents.

Under this framework, we hypothesize that:

Hypothesis 1: Exposure to (i) lethal (LC) or (ii) nonlethal (NLC) cyberattacks will lead to greater support for adopting cybersecurity policies, as compared with a control group that was not exposed to any cyberattack.

Hypothesis 2: People who are exposed to lethal cyberattacks (LC) will exhibit higher support for adopting cybersecurity policies than people who are exposed to nonlethal cyberattacks (NLC).

Civilians are notoriously weak at accurately assessing security threats—a fact that is amplified in the cyber realm due to low cybersecurity knowledge, general cognitive biases in calculating risk, and the distortion of cyber risks by the media, which focuses predominantly on spectacular yet low-likelihood attacks [ 52–54 ]. Perceived risk is partly reliant on the scope of the attack to which people are exposed. Victims of cybercrimes (identity theft and cyber bullying) report moderate or severe emotional distress such as anger, fear, anxiety, mistrust, and loss of confidence [ 55 ]. The effects of conventional terrorism include post-traumatic stress, depression, and anticipatory anxiety [ 56 , 29 ]. In both of these cases, threat perception is a common predictor of political attitudes and behavior. Indeed, the best predictor of hostile out-group attitudes is the perceived threat that out-group members will harm members of the in-group, whether physically, economically or symbolically [ 28 , 57 , 58 ]. In many of the studies cited above, threat perception was found to mediate the relationship between exposure to violence and support for harsh or restrictive policies, especially in conflict-related contexts [ 27 ]. Extending this empirical and theoretical evidence to digital political violence suggests that individuals are likely to respond similarly to cyber threats by supporting strong cybersecurity policies through the interceding influence of heightened threat perception.

A set of early studies compared the level of threat evoked by exposure to different forms of cyber threats, identifying key differences in how cybercrime and cyberterrorism influenced attitudes toward government policy [ 34 , 36 ]. These studies concluded that direct exposure to cyberterrorism had no effect on support for hardline cybersecurity policies (increased digital surveillance, the introduction of intrusive new regulations), but threat perceptions relating to cyberterrorism successfully predicted support for these policies. Recognizing, therefore, that threat perception plays a central role in understanding the response to cyberattacks, we predict that:

Hypothesis 3: Cyber threat perception will mediate the relationship between individual exposure to cyberattacks and support for cybersecurity policies.

To test our hypotheses, we conducted a controlled survey experiment that exposed respondents to simulated news reports about major cyberattacks. The experimental manipulation relied on professionally produced original video clips that broadcast feature news reports. The lethal treatment group viewed a feature report discussing several lethal cyberattacks that had taken place against Israeli targets, while the nonlethal treatment group viewed a collection of stories pertaining to nonlethal cyber incidents (see below for additional details about each manipulation). The control group did not watch any news report.

We utilized the medium of video news reports for our experimental manipulation since experiments in recent years have shown how broadcast videos and media reports of major attacks arouse strong emotions among viewers, which in turn trigger reevaluations of policy positions and political attitudes related to issues of security [ 35 , 59 , 60 ]. The rationale behind these findings can be partly explained by Terror Management Theory, which explains how even indirect exposure to violent acts triggers potent emotional reactions as people confront threats to their mortality [ 61 , 62 ]. Just as importantly, news reports are a key avenue by which the public learns about major security incidents, and so this method maintains its ecological validity. Each of the groups completed a pre- and post-survey, answering a series of questions about their attitudes to cybersecurity along with relevant sociodemographic information.

Each of the television news reports was presented as an authentic feature story that had appeared on Israel's Channel 1 television station. The news reports described the global scale of cyber threats facing the public (i.e. two million malicious web sites are launched each month and 60 000 new malware programs appear every day, at an annual cost to the global economy of 500 billion dollars). The clips were screened in a feature format using on-camera interviews, voiceover and film footage to describe various cyberattacks. To increase the authenticity of the experience, the reports included interviews with well-known Israeli security experts. To mimic the challenges of cyber attribution, the perpetrators of the attacks described in the videos were not identified and were neutrally referred to as cyber operatives. Each video lasted approximately 3 min.

Lethal Cyber Condition—The television news report described various cyberattacks with lethal consequences that had targeted Israel during the previous years. For example, in one of the featured stories, an attack was revealed to have targeted the servers controlling Israel's electric power grid, cutting off electricity to a hospital and causing deaths. In another story, cyber operatives were said to have attacked a military navigation system, altering the course of a missile so that it killed three Israeli soldiers. A third story concerned the use of malware to infect the pacemaker of the Israeli Defense Minister, and a fourth involved the failure of an emergency call to 10 000 military reserve soldiers due to a cyberattack in which foreign agents changed the last digit of the soldiers’ telephone numbers in the military database. The video's interviews with well-known figures from Israel's security sector emphasized the life-threatening danger posed by cyberattacks.

Nonlethal Cyber Condition—The television news report revealed various nonlethal cyberattacks that had targeted Israel during recent years. For example, the broadcast explained how mobile phone users become vulnerable to attackers when installing new games and applications, potentially introducing malware that can later access data like personal messages or financial details. Another example concerned the dangers posed by the Internet of Things and featured a story in which all the major credit card companies suspended their customer support after hundreds of thousands of citizens were fraudulently charged for food purchases by their smart refrigerators. The Israeli experts in this video emphasized the potential financial damage from cyberattacks.

Participants

The online survey experiment was administered in Israel during September 2015 via the Midgam Survey Panel. One thousand twenty-two participants were randomly assigned to the three groups (lethal condition: N = 387; nonlethal condition: N = 374; control group: N = 361). The experimental sample represents a random cross-section of the Jewish Israeli population. The sample is largely representative of the wider population, and balance checks reveal that the treatment distribution is acceptable. We note that, due to data collection constraints, the sample does not include ultra-orthodox (religious) respondents, owing to difficulties in accessing this subgroup through online methods. The mean age of the participants was 41 (SD = 14.81), with a gender distribution of 49.96% male and 50.04% female. With respect to political orientation, 44.35% of the sample defined themselves as right-wing ( N = 452), 38.28% as centrist ( N = 390), and 17.37% as left-wing ( N = 177) (this reflects the right-wing slant of the Israeli population that has been apparent in recent elections). The distribution of education and income levels was similar across the three groups (Education: F(2, 1120) = 0.20, P = 0.82; Income: F(2, 1045) = 0.63, P = 0.53). Sociodemographic characteristics of the participants are presented in Appendix A (Supporting Information), together with experimental balance checks.

The experiment incorporated three primary variables: the predictor variable (exposure to cyberattacks), the dependent variable (support for cybersecurity policies), and the mediator variable (threat perception). Sociodemographic measures were also collected.

Predictor variable—exposure to cyberattacks

Exposure to cyberattacks was operationalized by random assignment to one of the three experimental treatments described above—lethal cyberattacks/nonlethal cyberattacks/control condition.

Dependent variable: support for cybersecurity policies

Support for cybersecurity policies was examined using twelve questions taken from two scales developed by McCallister and Graves [ 63 , 64 ]. After separating out one item that reflected a unique form of cybersecurity policy, the remaining items were subjected to a principal component analysis (PCA), which highlighted different aspects of cybersecurity policy. Our criteria for factor extraction were an eigenvalue greater than one for the number of dimensions, and a factor loading greater than 0.35 for dimension assignment. We applied the PCA extraction method with the Varimax rotation to construct orthogonal factors [ 65 ]. This procedure gave rise to two clearly distinguishable cyber policy dimensions. Following this process, we combined the two remaining items that were excluded due to poor loadings (loading < 0.35) to create a third policy dimension with a high correlation between the items ( r = 0.617, P < 0.001) (see Appendix B in the Supporting Information for the PCA and complete list of the items used to construct each scale). The final three measures of cybersecurity policies reflected the breadth of available policy options, which emphasized different levels of government intervention and oversight strategies. The first of these is cybersecurity prevention policy (CPP); the second is cybersecurity alert policy (CAP); and the third is cybersecurity oversight policy (COP).

The cybersecurity prevention policy dimension (CPP) captures the idea that the state should mandate commercial companies to implement minimum levels of cybersecurity to prevent damage. Respondents were asked questions such as: “should the state compel business owners to protect themselves against cyberattacks?” Cronbach's α was within an acceptable range at 0.720.

The cybersecurity oversight policy dimension (COP) refers to the notion that the state should directly intervene to offer cyber protection to its citizens and businesses. Relevant questions for this dimension included “should the state protect its citizens from cyberattacks?” Cronbach's α was within an acceptable range at 0.737.

The cybersecurity alert policy dimension (CAP) relates to the state's presumed responsibility to ensure citizens are alerted when a hack or a cyberattack is discovered. For example, a related question would ask: “should the state alert citizens after a successful attack on critical infrastructure?” As opposed to the prevention policy dimension, which relates to measures that must be taken before a cyberattack, the alert policy focuses on the measures to be taken after an attack. Cronbach's α was slightly below the acceptable range at 0.632. All questions were measured on a scale ranging from 1 (“completely disagree”) to 6 (“completely agree”).
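As a rough illustration of how the reliability figures above are computed, Cronbach's α can be implemented directly from its definition. The Likert responses below are invented placeholders, not the study's data.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) response matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the scale total
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical 1-6 Likert responses for a three-item scale
responses = np.array([
    [5, 5, 4],
    [2, 3, 2],
    [4, 4, 5],
    [6, 5, 6],
    [3, 2, 3],
    [1, 2, 1],
])
alpha = cronbach_alpha(responses)
```

Because these invented items move together across respondents, α comes out high; weakly correlated items, as in the CAP scale, pull α down toward the 0.6 range.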

Mediator: perceptions of cybersecurity threats

Threat perception pertaining to cyber threats was gauged using a five-item scale based on studies conducted in the United States [ 66 ]. Respondents were asked how concerned they felt about the possibility of an actual threat to their security. Respondents answered questions including: “To what extent does the idea of a cyberattack on Israel affect your sense of personal security?” and “To what extent does a cyberattack on Israel threaten the country's critical infrastructure?,” with answers ranging from 1 (“not at all”) to 6 (“to a very great degree”). The internal consistency of this measure was very high (α = 0.913).

Control variables

Control variables collected included political ideology (assessed through a self-reported five-point scale ranging from 1 [very conservative] to 5 [very liberal]), age, gender, marital status, religiosity, education, and income.

We also measured and controlled for participants’ past exposure to cyberattacks. To measure this variable, we adapted a four-item scale used to measure exposure to terrorism and political violence [ 67 , 35 ]. Items included questions asking to what extent the respondents, their friends, and their family had ever suffered harm or loss from a cyberattack. As in past studies, we did not calculate the internal reliability for past exposure, given that one type of exposure does not necessarily portend another type.

Preliminary analyses

We begin our analysis by testing the variance between the treatment groups regarding attitudes toward cybersecurity policies, to establish that the experimental conditions produce at least minimal levels of differences in the dependent variables. Hence, we conducted a one-way univariate analysis of variance (ANOVA), in which the different cyber policies were the dependent variables. The results indicated differences between the three groups in support for policies regarding cybersecurity alerts (CAP: F(2, 1020) = 4.61, P = 0.010). No differences between groups were found in support for cybersecurity prevention policy or cybersecurity oversight policy (CPP: F(2, 1020) = 1.35, P = 0.259; COP: F(2, 1020) = 0.94, P = 0.39). We followed the CAP ANOVA analysis with pairwise comparisons using Bonferroni corrections, which revealed that, on average, the highest level of support for cybersecurity alerts was expressed by the group exposed to lethal cyberattacks, while the other two groups showed lower levels of support for this policy. These results support the conclusion that the differences in cybersecurity policy preferences between the three groups derive from the video stimulus, and not from differences in participants’ sociodemographic characteristics (see Appendix C in the Supporting Information for means and standard deviations of study variables, in all three manipulation groups).

In addition, we tested group differences regarding threat perceptions and found significant differences in threat perceptions between the three groups (F(2, 1020) = 21.68, P < 0.001). The follow-up pairwise comparisons with Bonferroni corrections revealed that participants in both experimental groups (LC and NLC) expressed higher levels of threat perceptions in comparison to participants in the control group. These analyses provide sufficient preliminary support to conduct more complex analyses that integrate multiple effects in this triangle of exposure to cyberattacks, cyber threat perception, and support for cybersecurity policies.
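The preliminary analysis pipeline, an omnibus one-way ANOVA followed by Bonferroni-corrected pairwise comparisons, can be sketched with SciPy. The group scores below are hypothetical stand-ins for the 1-6 policy-support ratings, not the study's data.

```python
import numpy as np
from itertools import combinations
from scipy import stats

# Hypothetical 1-6 policy-support scores for the three conditions
groups = {
    "LC":      np.array([5.1, 4.8, 5.4, 4.9, 5.2, 5.0, 4.7, 5.3]),
    "NLC":     np.array([4.4, 4.6, 4.2, 4.5, 4.3, 4.7, 4.1, 4.4]),
    "control": np.array([4.3, 4.5, 4.2, 4.4, 4.1, 4.6, 4.0, 4.3]),
}

# Omnibus one-way ANOVA across the three conditions
f_stat, p_omnibus = stats.f_oneway(*groups.values())

# Follow-up pairwise t-tests with a Bonferroni correction:
# multiply each raw p-value by the number of comparisons (capped at 1)
pairs = list(combinations(groups, 2))
adjusted = {}
for a, b in pairs:
    _, p = stats.ttest_ind(groups[a], groups[b])
    adjusted[(a, b)] = min(p * len(pairs), 1.0)
```

A significant omnibus test licenses the pairwise step; here the lethal group differs from the other two while the nonlethal and control groups do not differ, mirroring the pattern reported for CAP.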

Mediation analysis

To test hypothesis 3, we ran a path analysis model, i.e. a structural equation model with observed indicators only. In this model, exposure was divided into lethal vs control and nonlethal vs control. More specifically, with regard to the mediation effect, the model structure included two pathways from the experimental conditions to support for cybersecurity policies: from lethal vs control, and from nonlethal vs control, through threat perceptions. The latter variable was expected to mediate the condition effects on cyber policy positions, as proposed in the theory section.

In order to further investigate the mediation mechanism, we constructed an integrative path analysis model [ 53 ]. Running this model enables us to identify direct and indirect effects among all the study variables. We provide modeling results in the following Table 1 and an illustration of the path analysis model in Fig. 1 .
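The core mediation quantity in such a model, the indirect effect, is the product of path a (condition to threat perception) and path b (threat perception to policy support, controlling for condition), with a bootstrap confidence interval. The paper reports bias-corrected bootstrapping with n = 2000 resamples inside a full path model; the plain percentile bootstrap, OLS estimation, and simulated data below are deliberate simplifications for illustration.

```python
import numpy as np

def indirect_effect(x, m, y):
    """a*b product: path a (x -> m) times path b (m -> y, controlling for x)."""
    a = np.polyfit(x, m, 1)[0]                       # slope of m on x
    design = np.column_stack([np.ones_like(x), m, x])
    b = np.linalg.lstsq(design, y, rcond=None)[0][1]  # coefficient on m
    return a * b

def bootstrap_ci(x, m, y, n_boot=2000, seed=1):
    """95% percentile bootstrap CI for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample respondents with replacement
        draws.append(indirect_effect(x[idx], m[idx], y[idx]))
    return np.percentile(draws, [2.5, 97.5])

# Simulated mediation: exposure raises threat perception, which raises support
rng = np.random.default_rng(0)
exposure = rng.integers(0, 2, 300).astype(float)     # treatment vs control
threat = 1.0 * exposure + rng.normal(0, 0.5, 300)    # path a
support = 1.0 * threat + rng.normal(0, 0.5, 300)     # path b

lo, hi = bootstrap_ci(exposure, threat, support)
```

A confidence interval that excludes zero, as it does for this simulated data, is the usual evidence for a mediated pathway; full mediation additionally requires that the direct effect of exposure be nonsignificant once threat perception is in the model.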

Empirical model results—direct effects of exposure to lethal and nonlethal attack groups vs control group. *P < 0.05, **P < 0.01, ***P < 0.001.


Path analysis: direct effects, standardized estimates

Standard error in parentheses; * P  < 0.05, ** P  < 0.01, *** P  < 0.001. NLC = nonlethal cyberattack; LC = lethal cyberattack.

Direct effects

Table 1 presents the results of the standardized estimates (beta coefficients) of each experimental group vis-à-vis the control group (i.e. NLC vs control, and LC vs control), perceptions of threat, past exposure to cyberattacks, and sociodemographic variables—gender, religiosity, education and political ideology—with the three dimensions of cybersecurity policies as the dependent variables. In the pairwise comparison of the experimental groups, which compares the lethal and nonlethal conditions to the control group, we find a larger direct effect in the LC (lethal) group compared with the NLC (nonlethal) group in predicting support for CAP.

A follow-up test that compared the two regression weights further confirmed the stronger relative effect of the lethal exposure over the nonlethal exposure (H 2 : NLC-LC = −0.21 (0.10), P  = 0.047). This demonstrates support for our second hypothesis. People who were exposed to lethal cyberattacks tended to support cybersecurity policies that compel the government and security forces to alert citizens if they have evidence of citizens’ computers being hacked or if a cyberattack is discovered (CAP) at higher levels than people who were exposed to nonlethal/economic cyberattacks or people in the control group.

Interestingly, this trend was reversed for the oversight policies (COP) form of cybersecurity regulation. Here, we identified a significant direct effect wherein exposure to nonlethal cyberattacks led to support for oversight policies (COP) at higher levels than respondents who were exposed to the lethal cyberattacks manipulation or the control group. However, the difference between the two treatment conditions was not significant (NLC-LC = 0.11 (0.08), P  = 0.16). This indicates that exposure to any kind of cyberattack, lethal or nonlethal, predicts greater support for oversight regulation policies (COP) to the same extent. No direct effect was found between exposure to cyberattacks and support for prevention regulation policies (CPP). By breaking this analysis apart into different dimensions of cybersecurity policies, our results reveal how exposure to different forms of cyberattack contributes to support for distinct types of policy that emphasize oversight or intervention.

Most importantly, results indicate a significant direct effect of threat perceptions on all three dimensions of cybersecurity policy and higher levels of threat perception in the lethal cyber manipulation group compared with the nonlethal cyber manipulation group and the control group.

Mediating effects

Table 2 presents the indirect effects of each of the two treatment conditions in comparison to the control group for the three dimensions of cybersecurity policies—with threat perception as a mediator. The indirect effects are pathways from the independent variable to the policy variables through threat perceptions. In the path analysis model, each dependent variable, i.e. support for particular cybersecurity policies, could have two potential paths, one from the nonlethal condition and one from the lethal condition. Altogether, six mediation pathways were tested. These indirect outcomes are illustrated in Fig. 1 . In the LC group we see a complete mediation effect of threat perceptions and no significant direct effect of exposure on COP support. This means that for participants who were exposed to the lethal condition, the actual exposure was not as strong a predictor of policy support as the threat perception associated with the attacks.

Path analysis: mediation effects, standardized estimates

Standard error in parentheses; * P  < 0.05, ** P  < 0.01, *** P  < 0.001. In square brackets: 95% confidence intervals with bias-corrected bootstrapping ( n  = 2000).

In our models predicting CAP, we see a partial mediation effect for both treatment groups, in addition to the direct effect that we described above. We see a larger indirect effect in the LC group than in the NLC group, and this was confirmed by a test of difference. This indicates that people who were exposed to lethal cyberattacks reported higher levels of cyber threat perception as compared with people who were exposed to the nonlethal condition, and this heightened threat perception in turn led to more support for various cybersecurity policies.

Support for CAP (i.e. cybersecurity policies whereby the government or relevant organizations are expected to alert citizens if they have evidence of citizens’ computers being hacked or an act of cyberattack being detected) was predicted both by a direct effect of level of exposure to cyberattacks (NLC, LC) and by the mediation of threat perceptions.

Yet our models predicting support for oversight policies (COP) showed a different picture. In the NLC group we see a partial mediation of threat perceptions in addition to the direct effect that we found in the models shown in Table 1 . Support for COP (i.e. cybersecurity policies whereby the state should protect the country, organizations, and citizens from cyberattacks through direct government action) was predicted by a direct effect of NLC exposure and by the mediation of threat perceptions in both LC and NLC groups. In the LC group versus the control group, support for COP was predicted only through the mediation of threat perceptions. These results support our third hypothesis regarding the mediating role played by threat perception in predicting COP.

Our models predicting support for prevention policies (CPP) showed a complete mediation effect of threat perception in both experimental treatment groups. No direct effect of exposure on CPP was found, indicating that the mediating mechanism is the best predictor for CPP. Support for CPP (i.e. cybersecurity policies whereby the state compels commercial enterprises to install minimum thresholds of cybersecurity) was predicted by the indirect effect of threat perception.

These results emphasize the central role played by threat perception in predicting support for adopting stringent cybersecurity policies. What is especially noteworthy is that threat perception overrides past experience as the full mediation models indicate. For example, we found that when people are exposed to destructive cyberattacks, the level of perceived threat predicted support for adopting cybersecurity policies that required the state to protect citizens and organizations (COP). Similarly, we found that when it comes to predicting support for prevention policies—threat is the driving force.

In order to complement the indirect effect analyses and test the relative strength of the mediation pathways, we contrasted the indirect effects of the various groups on each policy option. According to the outcome estimates in Table 2 , model 3 has a significantly larger mediation effect compared with model 1 (difference = −0.014 (0.024), P  < 0.001) 2 , which indicates that within the NLC group, the mediation model is a stronger predictor of support for COP than CAP. In other words, participants who were exposed to the nonlethal condition were more likely to support oversight policies than alert policies.

Our findings, drawn from an experimental design, suggest that exposure to different types of cyberattacks intensifies perceptions of cyber threats and shifts political attitudes in support of stringent cybersecurity policies. We find that exposure to lethal cyberattacks affects individual-level political behavior in a manner akin to conventional terrorism [ 68–71 ]. This research was motivated by a desire to better understand what drives individuals to support strong or hardline cybersecurity policies, using Israel as a case study. The findings contribute to this research direction in a number of important ways.

First, exposure to lethal cyberattacks heightens perceptions of cyber threat to a greater degree than nonlethal/economic cyberattacks. Second, as a result of exposure to cyberattacks, respondents were willing to forfeit civil liberties and privacy in exchange for more security. Like conventional terrorism, cyberattacks with lethal consequences harden political attitudes, as individuals tend to support more government oversight, greater regulation of cybersecurity among commercial businesses, and the implementation of strategies to increase public awareness following cyberattacks. Third, our data suggest that in some cases the mere exposure to a cyberattack, either lethal or nonlethal, affects the level of support for specific types of cybersecurity policies (stronger support for cybersecurity alert policies among participants in the lethal cyberattack manipulation, and stronger support for cybersecurity oversight policy among participants in the nonlethal cyberattack treatment group). In other cases, threat perception, rather than exposure to the cyber-events themselves, drives the cognitive effects of cyberattacks on attitudes toward policy (support for COP among the LC group was predicted only through the mediating role of threat perception, and support for CPP in both manipulation groups was predicted only through a mediated pathway). Finally, we observed differences in the way our mediation model works in relation to different cybersecurity policies. The mediation model for the nonlethal condition group predicted greater support for cybersecurity policies focusing on oversight rather than policies focusing on alerting the public.

Our study examined public support for three distinct types of cybersecurity policies that we described as prevention policies, alert policies, and oversight policies. Each of these plays a role in securing cyberspace, where the uncertainty regarding the form and nature of potential threats calls for a varied array of preventive actions [ 36 , 37 ]. Each of these policies raises questions about the delicate balancing act between privacy and security demands. In reality, policy approaches are likely to combine several of these elements, yet it behooves us to first consider each of them independently, since very little is known about public knowledge of, and familiarity with, different cybersecurity policies. While preliminary research has looked at public support for cybersecurity preferences in general [ 41 ], it has yet to consider the varied approaches to cybersecurity. To that end, in the current paper we tried to simplify the different cybersecurity policies as much as possible based on real-world policies.

Overall, the study provides evidence that exposure to cyberattacks predicts support for cybersecurity policies through the mediating effect of threat perception. Yet our discovery of differential effects depending on the type of cybersecurity policy being proposed adds a new level of nuance that should be probed further in subsequent studies. Moreover, the results indicate that public worry and concern in the aftermath of cyberattacks lead directly to calls for governmental intervention. This information sheds light on public opinion processes and helps inform our understanding of how individuals will likely respond to new cyber threats. It may also help policymakers understand the complex emotions and cognitions evoked by attacks, which can improve policy formulations that respond to the needs of the public.

Future studies should also investigate how fear appeals intervene in this mechanism, and how to motivate people to take cyber threats more seriously in a way that leads to positive behavioral change.

Participants who were exposed to the lethal manipulation supported cybersecurity policies that focus on alerting the public in cases of cyberattacks more than participants in the two other groups. On the other hand, participants who were exposed to the nonlethal manipulation tended to support cybersecurity policies that call for state oversight of cybersecurity. We found no evidence that any type of exposure has a direct effect on support for policies mandating minimum thresholds of cybersecurity in the commercial arena.

One possible explanation for these results is that, thus far, most cyberattacks have caused economic damage, while lethal cyberattacks that vividly resemble terrorism remain a significantly rarer phenomenon. Hence, participants who were exposed to lethal terror cyberattacks supported cybersecurity policies that would alert them and keep them informed about impending cyber threats. Policies that focus on oversight are perceived as less important during violent terror attacks. On the other hand, exposure to nonlethal cyberattacks, which are typically focused on economic gain, is more common. The economic damage caused by cyberattacks is estimated to reach $6 trillion by 2021 [ 72 ]. As such, participants in the nonlethal manipulation may have regarded cyberattacks causing economic damage as more likely and therefore supported policies that would bolster digital protections.

We note a key caveat about the temporal nature of these findings. In analyzing the effect of exposure to cyberattacks, this study focuses on people's immediate response following exposure to cyber threats. Assessing people's short-term responses is valuable because those responses speak to the direction of the political and psychological effects. Yet what is missing from this picture (and beyond the scope of our research design) is the longevity of the response, which speaks to the strength of the effect. If the measured distress and political outcomes swiftly dissipate, then the policy relevance of our findings comes into question.

The literature is split on the question of the temporal durability of attitudinal shifts in the aftermath of major attacks. One school of thought holds that most political effects stemming from political violence or terrorism are fleeting, and that the public is broadly desensitized to political violence [ 73–75 ]. Yet a second school of thought suggests that exposure to attacks can trigger prolonged effects and lasting shifts in political and psychological attitudes. Brandon & Silke [ 76 ] assert that while the distress triggered by exposure dissipates over time, this is not an instantaneous process. Several longitudinal studies following the Oklahoma City bombing and 9/11 found lingering harms, with exposed individuals reporting elevated levels of psychological distress and altered political attitudes for months or years following the event [ 77–79 ].

In applying this to the case of cyberattacks, there is insufficient evidence to positively determine the longevity of the political and psychological effects that we identified in our study. We anticipate that the effects will be more than fleeting, since the novelty of cyber threats means that people have yet to undergo any cognitive or emotional desensitization to cyberattacks [ 80 ]. However, we acknowledge that this position requires further empirical substantiation in future research.

A central conclusion of this study is that the implementation of cybersecurity regulations should take account of public perception of cyber threats and public exposure to cyberattacks. This position challenges two unspoken yet ubiquitous notions in the field of cybersecurity. First, the formulation of cybersecurity policies—in a manner akin to national security and espionage discussions—has typically taken place without public input due to the perception that it is a question best left to experts with engineering or national security expertise [ 81 ]. Scholars argue that this complete abdication of cybersecurity policy to specialists is a profound mistake, since excluding “the general public from any meaningful voice in cyber policymaking removes citizens from democratic governance in an area where our welfare is deeply implicated” [ 82 ]. Functional cybersecurity relies on good practices by the ordinary public, and the failure of cybersecurity awareness campaigns to effectively change behavior may well be linked to the lack of public input in its regulation [ 81 ]. Our findings indicate that growing civilian exposure to cyberattacks leads to more defined attitudes toward specific cybersecurity regulations through the mechanism of heightened threat perception. Governments will increasingly need to engage the public as one of the stakeholders in effecting new cyber regulations.

A second conceptual dilemma about the role of public exposure and opinion has to do with the question of whether cybersecurity is a public good deserving of government investment and regulation at all. Much of the field of cybersecurity is dominated by private enterprise, with government involvement taking place in limited ways. Support for government intervention in the realm of cybersecurity is premised on the astronomical public costs of cybercrime, the threat of cyberterror attacks, and the claim of a market failure in the provision of cybersecurity whose negative externalities in the absence of government involvement would cause substantial national damage [ 83 ]. A prominent counter-school of thought, resting on a belief that the private market is the most efficient system of allocating economic resources, claims that there is no need for government intervention in the cybersecurity market [ 84 ]. These proponents of private sector cybersecurity suggest that the private sector can more effectively achieve cybersecurity outcomes, an assertion that is backed up by the fact that private spending on cybersecurity in 2018 reached USD $96 billion [ 85 ]. This raises the question of how civilian exposure to cyberattacks and the subsequent support for cybersecurity regulation can translate to real outcomes if the market responds to both public and private interests, which take account of public opinion and civilian threat perception in different ways.

Seeing that cyber threats are continuously evolving, there are opportunities to expand and consolidate this research in future studies. In the current article, we focus on the effect of exposure to lethal and nonlethal cyberattacks on support for different types of cybersecurity policies among Israeli participants. Yet despite this singular geographic focus, the results offer lessons that can be applied widely. Like several other Western countries, Israel has been repeatedly exposed to publicly reported cyberattacks on critical infrastructure. And, similar to the United States and some European countries, Israel has high levels of Internet penetration and widely recognized cybersecurity readiness to deal with such attacks. Past studies that examined public perceptions of cyber threats have replicated their findings across multiple countries. Shandler et al . [ 80 ] found that psychological responses to internalized reports of cyberattacks explain support for military retaliation, and that this mechanism applies similarly in Israel, the United States, and England. Though requiring additional research, the evidence suggests that cyber threats operate via an underlying psycho-political mechanism that transcends national borders. In fact, the effects of cyberattacks may prove weaker in Israel than elsewhere, as the constant exposure among Israelis to political violence places digital violence in the context of a political struggle that has, in many ways, fixed and acceptable costs [ 34 ]. Therefore, we believe that an Israeli sample offers major advantages for understanding the effects of cyberattacks in other Western nations. Nonetheless, we encourage future studies to corroborate these findings in different settings.

A second area where our findings could benefit from additional research relates to the nature of the media exposure. In this study, we exposed respondents to "initial" media reports about major cyberattacks, in which there is minimal information pertaining to the identity of the attacker and the type of attack that was conducted. While this in many ways reflects the reality of media reports about cyberattacks, it does not discount that journalists will sometimes make inferences about the details of an attack, and that later reports in the days and weeks following an attack will include far more detailed information. Moreover, this article bears implications for a wide literature beyond the political violence discipline. The public discussion regarding digital privacy and surveillance has spurred crucial new research on the dynamics of digital insecurity. In communications and media studies, for example, scientists are focusing on information-age warfare via different social media platforms, and early results show that citizens are as active in correcting disinformation online as they are in spreading disinformation [ 86 , 87 ]. The debate in the field of business management is also developing as it focuses on consumer expectations surrounding information technology and big data, as well as on the roles and responsibilities of public and private actors in securing personal data [ 88 , 89 ].

Cyber threats are a critical and growing component of national security. As this threat continues to grow all over the world, both in its public perception and in the true scope of the threat, the need to implement strong cybersecurity regulations will grow as well. Our findings indicate that particular forms of exposure to cyberattacks can contribute to support for various types of cybersecurity legislation and contribute to their public legitimacy. This is especially important since the introduction of these regulations constitutes a sacrifice of civil liberties, a sacrifice that citizens are prone to support only under particular conditions.

Though a DDoS attack, for example, may not directly cause physical casualties, its crippling of emergency services and telecommunications could catastrophically amplify the second- and third-order damage during a physical attack; for more, see Catherine A. Theohary and John W. Rollins,   Cyberwarfare and cyberterrorism: In brief (Washington, DC: Congressional Research Service, 2015).

We also observe marginally significant differences between mediations 1 and 5 and between mediations 2 and 6. Mediation 5 (NLC/control-threat-CPP) has a marginally significant larger mediation effect than mediation 1 (NLC/control-threat-CAP) (difference = −0.035; 0.035; P  = 0.073). This means that within the NLC group the mediation model predicts CPP more strongly than CAP; in other words, participants who were exposed to the nonlethal (NLC) condition were more likely to support CPP than CAP. We saw that the CAP effect is stronger in the LC group. A second marginally significant difference was found between mediation 2 and mediation 6: mediation 6 (LC/control-threat-CPP) has a marginally significant larger mediation effect than mediation 2 (LC/control-threat-CAP) (difference = −0.044; 0.024; P  = 0.062). This means that within the LC group the mediation model likewise predicts CPP more strongly than CAP; in other words, participants who were exposed to the lethal (LC) condition were more likely to support CPP than CAP. We also saw a direct effect of LC on CAP.
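The mediation models above were estimated in Mplus; purely as an illustrative sketch of the underlying logic, the contrast between two indirect effects (condition → threat perception → policy support) can be approximated with a percentile bootstrap. All data and variable names below are invented for illustration and do not reproduce the study's analysis.

```python
import numpy as np

def indirect_effects(x, m, y1, y2):
    """Indirect effects a*b1 and a*b2 for a simple mediation model:
    x -> m (path a), then m -> y controlling for x (path b)."""
    a = np.polyfit(x, m, 1)[0]                       # slope of m on x
    X = np.column_stack([np.ones_like(x), x, m])     # design: intercept, x, m
    b1 = np.linalg.lstsq(X, y1, rcond=None)[0][2]    # coefficient on m for outcome 1
    b2 = np.linalg.lstsq(X, y2, rcond=None)[0][2]    # coefficient on m for outcome 2
    return a * b1, a * b2

def bootstrap_contrast(x, m, y1, y2, n_boot=2000, seed=0):
    """Percentile-bootstrap 95% CI for the difference between the two
    indirect effects (effect on y1 minus effect on y2)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)                  # resample rows with replacement
        e1, e2 = indirect_effects(x[idx], m[idx], y1[idx], y2[idx])
        diffs[i] = e1 - e2
    return np.percentile(diffs, [2.5, 97.5])

# Hypothetical data: condition dummy, mediator, and two policy outcomes
rng = np.random.default_rng(1)
n = 300
x = rng.integers(0, 2, n).astype(float)              # treatment vs. control
m = 0.5 * x + rng.normal(size=n)                     # threat perception
y1 = 0.3 * m + rng.normal(size=n)                    # support for policy A
y2 = 0.6 * m + rng.normal(size=n)                    # support for policy B
lo, hi = bootstrap_contrast(x, m, y1, y2)
print(lo, hi)  # CI for (indirect effect on A) - (indirect effect on B)
```

If the interval excludes zero, the mediation pathway is stronger for one policy than the other, which is the kind of contrast the differences reported above capture.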

Geller E , Matishak M . A federal government left ‘completely blind’ on cyberattacks looks to force reporting . Politico . 2021 . https://www.politico.com/news/2021/05/15/congress-colonial-pipeline-disclosure-488406 (10 August, 2021, date last accessed) .


Cybersecurity legislation 2020. NCSL . https://www.ncsl.org/research/telecommunications-and-information-technology/cybersecurity-legislation-2020.aspx (17 October 2020, date last accessed).

US state cybersecurity regulation more than doubled in 2017, while federal regulation waned. BusinessWire . https://www.businesswire.com/news/home/20180129005238/en/State-Cybersecurity-Regulation-Doubled-2017-Federal-Regulation (29 January 2018, last accessed) .

Kasper A . EU cybersecurity governance: stakeholders and normative intentions towards integration . In: Harwood M , Moncada S , Pace R (eds). The Future of the European Union: Demisting the Debate . Msida : Institute for European Studies , 2020 , 166 – 85 .


Israel National Cyber Directorate (INCD) . https://www.gov.il/en/departments/about/newabout (1 February 2021, date last accessed) .

Ochoa CS , Gadinger F , Yildiz T . Surveillance under dispute: conceptualizing narrative legitimation politics . Eur J Int Secur . 2021 ; 6 : 210 – 32 .

Flyverbom M , Deibert R , Matten D . The governance of digital technology, big data, and the internet: new roles and responsibilities for business . Bus Soc . 2019 ; 58 : 3 – 19 .

Rosenzweig P . The alarming trend of cybersecurity breaches and failures in the U.S. government . The Heritage Foundation. https://www.heritage.org/defense/report/the-alarming-trend-cybersecurity-breaches-and-failures-the-us-government-continues (17 April 2020, last accessed) .

Lee JK , Chang Y , Kwon HY et al.  Reconciliation of privacy with preventive cybersecurity: the bright internet approach . Inf Syst Front . 2020 ; 22 : 45 – 57 .

Nye JS . Nuclear lessons for cyber security? . Strateg Stud Q . 2011 ; 5 : 18 – 38 .

Annual number of data breaches and exposed records in the United States from 2005 to 2018 (in millions) . Statista . https://www.statista.com/statistics/273550/data-breaches-recorded-in-the-united-states-by-number-of-breaches-and-records-exposed (26 February 2019, last accessed) .

For big banks, it's an endless fight with hackers The Business Times , 30 July 2019 . https://www.businesstimes.com.sg/banking-finance/for-big-banks-it%E2%80%99s-an-endless-fight-with-hackers

Nye JS Jr . Cyber Power . Cambridge : Harvard Kennedy School, Belfer Center for Science and International Affairs , 2010 .

Stohl M . Cyber terrorism: a clear and present danger, the sum of all fears, breaking point or patriot games? . Crime Law Soc Change . 2006 ; 46 : 223 – 38 .

Lawson ST . Cybersecurity Discourse in the United States: Cyber-Doom Rhetoric and Beyond . New York : Routledge , 2019 .

Valeriano B , Maness RC . Cyber War Versus Cyber Realities: Cyber Conflict in the International System . New York : Oxford University Press , 2015 .

Lawson S . Beyond cyber-doom: Assessing the limits of hypothetical scenarios in the framing of cyber-threats . J Inf Technol Polit . 2013 ; 10 : 86 – 103 .

Israeli cyber chief: Major attack on water systems thwarted. Washington Post. https://www.washingtonpost.com/world/middle_east/israeli-cyber-chief-major-attack-on-water-systems-thwarted/2020/05/28/5a923fa0-a0b5-11ea-be06-af5514ee0385_story.html (28 May 2020, last accessed) .

Panetta warns of dire threat of cyberattack on U.S. New York Times. (October 11, 2012). https://www.nytimes.com/2012/10/12/world/panetta-warns-of-dire-threat-of-cyberattack.html

Choi SJ , Johnson ME , Lehmann CU . Data breach remediation efforts and their implications for hospital quality . Health Serv Res . 2019 ; 54 : 971 – 80 .

Zetter K . A cyber attack has caused confirmed physical damage for the second time ever . Wired . 2015 . http://www.wired.com/2015/01/german-steel-mill-hack-destruction . (April 2020, date last accessed) .

Hobfoll SE , Canetti-Nisim D , Johnson RJ . Exposure to terrorism, stress-related mental health symptoms, and defensive coping among Jews and Arabs in Israel . J Consult Clin Psychol . 2006 ; 74 : 207 – 18 .

Halperin E , Canetti-Nisim D , Hirsch-Hoefler S . The central role of group-based hatred as an emotional antecedent of political intolerance: Evidence from Israel . Polit Psychol . 2009 ; 30 : 93 – 123 .

Bar-Tal D , Halperin E , de Rivera J . Collective emotions in conflict situations: societal implications . J Soc Issues . 2007 ; 63 : 441 – 60 .

Hirsch-Hoefler S , Canetti D , Rapaport C et al.  Conflict will harden your heart: exposure to violence, psychological distress, and peace barriers in Israel and Palestine . Br J Polit Sci . 2016 ; 46 : 845 – 59 .

Bonanno GA , Jost JT . Conservative shift among high-exposure survivors of the September 11th terrorist attacks . Basic Appl Soc Psychol . 2006 ; 28 : 311 – 23 .

Canetti-Nisim D , Ariely G , Halperin E . Life, pocketbook, or culture: the role of perceived security threats in promoting exclusionist political attitudes toward minorities in Israel . Polit Res Q . 2008 ; 61 : 90 – 103 .

Zeitzoff T . Anger, exposure to violence, and intragroup conflict: a “lab in the field” experiment in southern Israel . Polit Psychol . 2014 ; 35 : 309 – 35 .

Schmitt N . Tallinn Manual 2.0 on the International Law Applicable to Cyber Operations . Cambridge : Cambridge University Press , 2017 .

Russian hackers appear to shift focus to U.S. power grid . The New York Times . 27 July 2018 .

Aucsmith D . Disintermediation, Counterinsurgency, and Cyber Defense . 2016 , Available at SSRN 2836100 . doi: 10.1093/cybsec/tyw018 , (10 August, 2021 last accessed) .

Gartzke E , Lindsay JR . Thermonuclear cyberwar . J Cybersecur . 2017 ; 3 : 37 – 48 .

Gross ML , Canetti D , Vashdi DR . Cyberterrorism: its effects on psychological well-being, public confidence and political attitudes . J Cybersecur . 2017 ; 3 : 49 – 58 .

Backhaus S , Gross ML , Waismel-Manor I et al.  A cyberterrorism effect? Emotional reactions to lethal attacks on critical infrastructure . Cyberpsychol Behav Soc Netw . 2020 ; 23 : 595 – 603 .

Gross ML , Canetti D , Vashdi DR . The psychological effects of cyber-terrorism . Bull At Sci . 2016 ; 72 : 284 – 91 .

Canetti D , Gross ML , Waismel-Manor I . Immune from cyber-fire? The psychological & physiological effects of cyberwar . In: Allhoff F , Henschke A , Strawser BJ (eds). Binary Bullets: The Ethics of Cyberwarfare . Oxford : Oxford University Press , 2016 , 157 – 76 .

Canetti D , Gross ML , Waismel-Manor I et al.  How cyberattacks terrorize: Cortisol and personal insecurity jump in the wake of cyberattacks . Cyberpsychol Behav Soc Netw . 2017 ; 20 : 72 – 7 .

Shandler R , Gross MG , Backhaus S et al.  Cyber terrorism and public support for retaliation: a multi-country survey experiment . Br J Polit Sci . 2021 : 1 – 19 . DOI: 10.1017/S0007123420000812 .

Rosenzweig P . Cybersecurity and public goods, The public/private ‘partnership’ . In: Berkowitz P (ed). Emerging Threats in National Security and Law . Stanford : Hoover Institution, Stanford University , 2011 , 1 – 36 .

Cheung-Blunden V , Cropper K , Panis A et al.  Functional divergence of two threat-induced emotions: fear-based versus anxiety-based cybersecurity preferences . Emotion . 2017 ; 19 : 1353 – 65 .

Jardine E , Porter N . Pick your poison: the attribution paradox in cyberwar. 2020 , https://osf.io/preprints/socarxiv/etb72/ .

Rid T , Buchanan B . Attributing cyber attacks . J Strateg Stud . 2015 ; 38 : 4 – 37 .

Clark DD , Landau S . Untangling attribution . Harvard National Secur J . 2011 ; 2 : 323 – 52 .

Alraddadi W , Sarvotham H . A comprehensive analysis of WannaCry: technical analysis, reverse engineering, and motivation . https://docplayer.net/130787668-A-comprehensive-analysis-of-wannacry-technical-analysis-reverse-engineering-and-motivation.html , (17 April 2020, last accessed).

Romanosky S , Boudreaux B . Private-sector attribution of cyber incidents: benefits and risks to the US government . Int J Intell CounterIntelligence . 2020 ; 0 : 1 – 31 .

Baezner M . Iranian cyber-activities in the context of regional rivalries and international tensions . ETH Zurich . 2019 : 1 – 37 .

Macdonald S , Jarvis L , Nouri L . State cyberterrorism: a contradiction in terms? . J Terrorism Res . 2015 ; 6 : 62 – 75 .

Canetti D , Gubler J , Zeitzoff T . Motives don't matter? Motive attribution and counterterrorism policy . Polit Psychol . 2021 ; 42 : 483 – 99 .

Liberman P , Skitka LJ . Revenge in US public support for war against Iraq . Public Opin Q . 2017 ; 81 : 636 – 60 .

Liberman P , Skitka LJ . Vicarious retribution in US public support for war against Iraq . Secur Stud . 2019 ; 28 : 189 – 215 .

Kostyuk N , Wayne C . The microfoundations of state cybersecurity: cyber risk perceptions and the mass public . J Glob Secur Stud . 2021 ; 6 : ogz077 .

Gomez MA . Past behavior and future judgements: seizing and freezing in response to cyber operations . J Cybersecur . 2019 ; 5 : 1 – 19 .

Gomez MA , Villar EB . Fear, uncertainty, and dread: cognitive heuristics and cyber threats . Polit Gov . 2018 ; 6 : 61 – 72 .

Harrell E , Langton L . The Victims of Identity Theft, 2012 . US Department of Justice, Office of Justice Programs, Bureau of Justice Statistics , 2013 . https://www.bjs.gov/content/pub/pdf/vit12.pdf

Sinclair SJ , Antonius D . The Psychology of Terrorism Fears . Oxford : Oxford University Press , 2012 .

Quillian L . Prejudice as a response to perceived group threat: population composition and anti-immigrant and racial prejudice in Europe . Am Sociol Rev . 1995 ; 60 : 586 – 611 .

Ben-Nun Bloom P , Arikan G , Lahav G . The effect of perceived cultural and material threats on ethnic preferences in immigration attitudes . Ethn Racial Stud . 2015 ; 38 : 1760 – 78 .

Shoshani A , Slone M . The drama of media coverage of terrorism: emotional and attitudinal impact on the audience . Stud Confl Terror . 2008 ; 31 : 627 – 40 .

Huddy L , Smirnov O , Snider KL et al.  Anger, anxiety, and selective exposure to terrorist violence . J Confl Resolut . 2021 : 00220027211014937 .

Greenberg J , Pyszczynski T , Solomon S . The causes and consequences of a need for self-esteem: a terror management theory . In: Public Self and Private Self . New York, NY : Springer , 1986 , 189 – 212 .

Hall BJ , Hobfoll SE , Canetti D et al.  The defensive nature of benefit finding during ongoing terrorism: an examination of a national sample of Israeli Jews . J Soc Clin Psychol . 2009 ; 28 : 993 – 1021 .

Canetti D , Hall BJ , Rapaport C et al.  Exposure to political violence and political extremism . Eur Psychol . 2013 ; 18 : 263 – 72 .

McCallister E . Guide to Protecting the Confidentiality of Personally Identifiable Information . Darby : Diane Publishing , 2010 .

Graves J , Acquisti A , Anderson R . Experimental measurement of attitudes regarding cybercrime . In: 13th Annual Workshop on the Economics of Information Security . 2014 ; Pennsylvania State University .

Huddy L , Feldman S , Capelos T et al.  The consequences of terrorism: disentangling the effects of personal and national threat . Polit Psychol . 2002 ; 23 : 485 – 509 .

Hefetz A , Liberman G . The factor analysis procedure for exploration: a short guide with examples . Cult Educ . 2017 ; 29 : 526 – 62 .

Muthén LK , Muthén BO . MPlus: Statistical Analysis with Latent Variables: User's Guide . Muthén & Muthén , Los Angeles, CA , 2012 .

Galea S , Ahern J , Resnick H et al.  Psychological sequelae of the September 11 terrorist attacks in New York City . N Engl J Med . 2002 ; 346 : 982 – 7 .

Canetti-Nisim D , Halperin E , Sharvit K et al.  A new stress-based model of political extremism: personal exposure to terrorism, psychological distress, and exclusionist political attitudes . J Confl Res . 2009 ; 53 : 363 – 89 .

Canetti D , Snider KLG , Pedersen A et al.  Threatened or threatening? How ideology shapes asylum seekers’ immigration policy attitudes in Israel and Australia . J Refug Stud . 2016 ; 29 : 583 – 606 .

Morgan S . Cybersecurity Ventures predicts cybercrime will cost the world in excess of $6 trillion annually by 2021. Cybercrime Magazine . 2017 ; https://cybersecurityventures.com/hackerpocalypse-cybercrime-report-2016/ (11 May 2020, date last accessed) .

Yakter A , Harsgor L . Long-term change in conflict attitudes: a dynamic approach . 2021 . http://liran.harsgor.com/wp-content/uploads/2021/07/YakterHarsgor_2021_Long-term-conflict.pdf

Brouard S , Vasilopoulos P , Foucault M . How terrorism affects political attitudes: France in the aftermath of the 2015–2016 attacks . West Eur Polit . 2018 ; 41 : 1073 – 99 .

Castanho Silva B . The (non)impact of the 2015 Paris terrorist attacks on political attitudes . Pers Soc Psychol Bull . 2018 ; 44 : 838 – 50 .

Brandon SE , Silke AP . Near- and long-term psychological effects of exposure to terrorist attacks . In: Bongar B , Brown LM , Beutler LE et al. (eds). Psychology of Terrorism . Oxford : Oxford University Press , 2007 , 175 – 93 .

Pfefferbaum B , Nixon SJ , Krug RS et al.  Clinical needs assessment of middle and high school students following the 1995 Oklahoma City bombing . Am J Psychiatry . 1999 ; 156 : 1069 – 74 .

Galea S , Vlahov D , Resnick H et al.  Trends of probable post-traumatic stress disorder in New York City after the September 11 terrorist attacks . Am J Epidemiol . 2003 ; 158 : 514 – 24 .

Landau MJ , Solomon S , Greenberg J et al.  Deliver us from evil: the effects of mortality salience and reminders of 9/11 on support for President George W. Bush . Pers Soc Psychol Bull . 2004 ; 30 : 1136 – 50 .

Nussio E . Attitudinal and emotional consequences of Islamist terrorism. Evidence from the Berlin attack . Polit Psychol . 2020 ; 41 : 1151 – 71 .

Bada M , Sasse AM , Nurse JRC . Cyber security awareness campaigns: why do they fail to change behaviour? In: International Conference on Cyber Security for Sustainable Society , Global Cyber Security Capacity Centre. 2015 , 1 – 11 .

Shane PM . Cybersecurity policy as if ‘ordinary citizens’ mattered: the case for public participation in cyber policy making . SSRN Electron J . 2012 ; 8 : 433 – 62 .

Shandler R . White paper: Israel as a cyber power . 2019 , DOI: 10.13140/RG.2.2.15936.07681 .

Gartner forecasts worldwide security spending will reach $96 billion in 2018, up 8 percent from 2017. Gartner. https://www.gartner.com/newsroom/id/3836563 (1 August 2019, date last accessed) .

Shandler R , Gross ML , Canetti D . A fragile public preference for using cyber strikes: evidence from survey experiments in the United States, United Kingdom and Israel . Contemp Secur Policy . 2021 ; 42 : 135 – 62 .

Prier J . Commanding the trend: social media as information warfare . Strateg Stud Q . 2017 ; 11 : 50 – 85 .

Golovchenko Y , Hartmann M , Adler-Nissen R . State, media and civil society in the information warfare over Ukraine: citizen curators of digital disinformation . Int Aff . 2018 ; 94 : 975 – 94 .

Belk RW . Extended self in a digital world . J Consum Res . 2013 ; 40 : 477 – 500 .

West SM . Data capitalism: redefining the logics of surveillance and privacy . Bus Soc . 2019 ; 58 : 20 – 41 .

Cahane A . The new Israeli cyber draft bill: a preliminary overview . CSRCL . 2018 . https://csrcl.huji.ac.il/news/new-israeli-cyber-law-draft-bill . (10 August, 2021, date last accessed) .


  • Open access
  • Published: 21 April 2020

Review and insight on the behavioral aspects of cybersecurity

  • Rachid Ait Maalem Lahcen 1 ,
  • Bruce Caulkins 2 ,
  • Ram Mohapatra 1 &
  • Manish Kumar 3  

Cybersecurity volume  3 , Article number:  10 ( 2020 ) Cite this article


Stories of cyber attacks are becoming routine, with attackers showing new levels of intent through sophisticated attacks on networks. Unfortunately, cybercriminals have figured out profitable business models, and they take advantage of online anonymity. This is a serious situation that network defenders must improve. Therefore, a paradigm shift is essential to improve the effectiveness of current techniques and practices. Since the majority of cyber incidents are human enabled, this shift requires expanding research into underexplored areas such as the behavioral aspects of cybersecurity. Focusing on social and behavioral issues is vital to improving the current situation. This paper provides a review of relevant theories and principles, and offers insights including an interdisciplinary framework that combines behavioral cybersecurity, human factors, and modeling and simulation.

Introduction

Gary Warner delivered in March 1, 2014, a TEDX Birmingham presentation about our current approach to cybercrime. Warner, the Director of the Center for Information Assurance and Joint Forensics Research, at the University of Alabama, Birmingham, explained the challenges of protecting individuals and reporting cybercrimes. Benefits of making money and conducting low risk illegal acts drive cybercriminals. The Internet Security Threat Report ( Symantec 2017 ) shows that the average ransom was $373 in 2014 and it was $294 in 2015. It jumped to $1077 in 2016, and we surmise that it is due to the upsurge value of Bitcoin. A digital currency preferred by ransomware criminals because they can accept it globally without having to reveal their identities. The same report shows that the number of detection of ransomware increased to 463,841, in 2016; and more than 7.1 billion identities have been compromised in cyber attacks in the last 8 years. Malware attacks are on the rise, for instance, the recurrence of disk wiping malware "Shamoon" in the Middle East, and cyber attacks against Ukrainian targets involving the KillDisk Trojan. To show a historical damage that such malware can do, we give the example of the Ukranian power grid that suffered a cyber attack in December 2015. It caused an outage of around 225,000 customers. A modified KillDisk was used to delete the master boot record and logs of targeted systems’ organizations; consequently, it was used in stage two to amplify attacks by wiping off workstations, servers, and a Human Machine Interface card inside of a Remote Terminal Unit. Trojan Horse viruses are considered the third wave of malware that spreads across the Internet via malicious websites and emails ( Donaldson et al. 2015 ). There is no doubt that breaches of data are one of the most damaging cyber attacks ( Xu et al. 2018 ). Figure  1 depicts three main cyber targets, or their combination based on the work discussed in Donaldson et al. (2015) . 
They are usually referred to as the CIA triad:

Confidentiality threats (Data Theft) can target databases, backups, application servers, and system administrators.

Integrity threats (Alter Data) include hijacking, changing financial data, stealing large amounts of money, rerouting direct deposits, and damaging an organization's image.

Availability attacks (Denial Access) can be Distributed Denial of Service (DDoS), targeted denial of service, and physical destruction.

figure 1

Losses caused by cyber threats, modified based on Donaldson et al. (2015)

Attackers will try to penetrate all levels of the security defense system after they access the first level in the network. Therefore, the defender should be more motivated to analyze security at all levels, using tools to find vulnerabilities before the attackers do ( Lahcen et al. 2018 ). The 2018 Black Report pays particular attention to the time it takes intruders to hack an organization's cyber system, both by stage of the breach and by industry. The clear majority of respondents say that they can gain access to an organization's system, map and detect valuable data, and compromise it within 15 hours. Meanwhile, most industry reports say the average gap between a breach and its discovery is between 200 and 300 days ( Pogue 2018 ).

It is clear that cyber offenders still have an advantage over cyber defenders. Therefore, what are the deficiencies in current research, and what areas need immediate attention or improvement? Thomas Holt at Michigan State University's School of Criminal Justice argues that it is essential to situate a cybercrime threat in a multidisciplinary context ( Holt 2016 ). Hence, based on the literature review described in the "Related work" section, we believe that the behavioral side of cybersecurity needs more research and can improve faster if it is integrated with human factors and benefits from sophisticated modeling and simulation techniques. Our study emphasizes two necessary points:

(1) An interdisciplinary approach to cybersecurity is essential, and it should be defined based on an understanding of cyberspace. We adopt the International Organization for Standardization's definition of cyberspace: "the complex environment resulting from the interaction of people, software and services on the Internet by means of technology devices and networks connected to it, which does not exist in any physical form" ( Apvera 2018 ). This definition presents cyberspace as a complex environment whose interactions begin with people. Consequently, people's biases and behaviors influence their interactions with software and technology, which in turn affect cyberspace. We believe that advancing this interdisciplinary research could bring more relevance to, and increase the number of, cybercrime manuscripts in top-tier journals. The low number of cyber-dependent crime manuscripts is attributed to the low number of criminologists who study cybercrime ( Payne and Hadzhidimova 2018 ). Thus, we address several behavioral and crime theories. Based on the proposed interdisciplinary approach, cyber teams have to include individuals with backgrounds ranging from IT and criminology to psychology and human factors.

(2) Enterprises must account for the possibility of vulnerabilities, including human error, in the design of systems. Avoiding a vulnerability is a much better option than trying to patch it or spending resources guarding it. This may sound like a trivial proposition, yet, in reality, many defenders and users treat security as a secondary task when their primary function is not security. The authors in Pfleeger and Caputo (2012) stated that security is barely the primary task of those who use the information infrastructure. Also, system developers focus on the user's needs before integrating security into an architecture design. Afterwards, they add security tools that are easy to incorporate or that meet some other system requirements. This is our rationale for making modeling and simulation an essential component. Stakeholders such as users, managers, and developers should be involved in building those models and in determining simulations that evaluate cognitive loads and response times to threats. Stakeholders can also use simulation to exercise real-life scenarios of social engineering attacks. Furthermore, accounting for vulnerabilities may be affected by the budget: enterprises keep cybersecurity budgets to a minimum. A report by Friedman and Gokhale (2019) found that financial institutions' average spending on cybersecurity is 10% of their IT spending, or an average of 0.3% of revenue. Recently, some companies have been spending more on cyber defense, but in areas that may not maximize security. The report of Blackborrow and Christakis (2019) found that organizations are spending more on security, but not wisely. This so-called reactive security spending results in widespread inefficiency. By all means, this status increases the complexity of the security problem. The perceptions of various industries about their cybersecurity needs therefore vary and, in most cases, fall short of those needs.

Related work

We conducted a comprehensive literature review using different criteria to capture both a historical standpoint and the latest findings. We started the search of theories, human factors, and decision-making strategies from 1980, since it is important to acknowledge their historical contributions and explore how they can be applied to cybercrimes. We started the search of cybercrime reports from 2014 to understand cybercrime trends and magnitudes. The search of other subjects, such as insider threat, hacking, information security, and cyber programs, covers the past decade. Some of the search commands: (cybersecurity AND human factors), (cybersecurity AND behavioral aspects), (cybersecurity AND modeling and simulation), (interdisciplinary approach and cybersecurity), (cybersecurity AND crime theories). Some of the databases searched are EBSCO, IEEE Xplore, JSTOR, Science Direct, and Google Scholar. It is worthwhile to note that several search results that include interdisciplinary cybersecurity awareness concern educating undergraduate students. This underlines the urgency of educating future cyber professionals who will work in interdisciplinary cyber teams. We observed in recent conferences that some speakers debate whether there is a talent shortage or whether the problem is inadequate use of available tools; our view is that the problem could be both. The two points mentioned in the introduction (interdisciplinary approach and vulnerability in design) are used as criteria for selecting the related articles cited here.

It is acknowledged that the human end user can be a critical backdoor into the network ( Ahram and Karwowski 2019 ). The research done by Addae et al. ( ) used a behavioral science approach to determine the factors shaping users' cybersecurity decisions. The results suggest that security perceptions and general external factors affect individual cybersecurity adoptive behavior, and that those factors are moderated by users' traits (gender, age) and working environment. The authors in Maimon and Louderback (2019) conducted an interdisciplinary review reiterating that several criminological theories provide important frameworks that guide empirical investigations of different junctures within the cyber-dependent crime ecosystem. They also found that more research is needed and suspect that criminologists have not yet brought cybercrime scholarship to the forefront of the criminological field. The authors in Payne and Hadzhidimova (2018) found that the most popular criminological explanations of cybercrime include learning theory, self-control theory, neutralization theory, and routine activities theory. In general, their findings reinforce the fact that the integration of cybersecurity into criminal justice is slow, probably because few criminologists study cybercrimes. The work in Pfleeger and Caputo (2012) addresses the importance of involving human behavior when designing and building cyber technology. They presented two behavioral topics: (1) cognitive load, which can contribute to inattentional blindness that prevents a team member from noticing unexpected events when focusing on a primary task, and (2) biases, which could help security designers and developers anticipate perceptions and account for them in their designs. We articulate more related work in the component sections of the proposed framework.

In summary, research has consistently acknowledged that behavioral aspects are still underexplored and that the focus is more on the technology aspect. One of the challenges is the complexity of the models when addressing different theories. Our aim is to provide insights on current issues; for example, classifying insider threat under human error makes the insider issue a design requirement. This insight makes our approach significant because it opens channels to use the best human factors practices found in healthcare, aviation, and the chemical industry. It reinforces the idea of the insider as a design requirement (prevention).

The rest of the paper proceeds as follows: the "Interdisciplinary framework" section proposes the interdisciplinary framework, the "Behavioral cybersecurity" section explains behavioral cybersecurity, the "Human factors" section discusses human factors, the "Modeling and simulation" section deals with the modeling and simulation component, and the "Conclusion and future work" section presents the conclusion and future work.

Interdisciplinary framework

Because all partial solutions (firewall, IDS/IPS, netflow, proxy, mail gateway, etc.) do not add up to a complete solution, and the offenders still have the most latitude for variation at the network level ( Kemmerer 2016 ), it is necessary to invest in interdisciplinary frameworks. In this section, we propose an interdisciplinary framework that enables an understanding of the interconnectivity of relations and should serve as a background for improving research and the maturity of security programs. We focus on three areas based on the work of Caulkins (2017) , depicted in a Venn diagram in Fig.  2 :

Behavioral cybersecurity is the main focus of our study. We address the profiles and methods of hackers and insiders, along with behavioral, social, and crime theories. Weapons of influence that are largely used by the offenders and mostly ignored by the defenders will also be identified.

Integrating the human factors discipline with behavioral cybersecurity. We give insight into the human factors that trigger human error. If we treat the insider problem as a human error, we can mitigate the risks by improving the environment and planning for it in the design requirements of future systems. The assumption is that system design enables insider risk because of already existing vulnerabilities or conditions. The National Institute of Standards and Technology (NIST) recommends that the best method to involve everybody is to motivate everyone using incentives within the cyber economy ( Addae et al. ). Hence, it is worth integrating human factors to improve the working environment, mitigate risks, and lower the system's probability of failure.

Using modeling and simulation for researching, developing, and implementing new techniques, tools, and strategies is our recommendation. Modeling and simulation are useful for many reasons and can be extended to situations where real experimentation is inconvenient, dangerous, or not cost-effective ( Niazi 2019 ). Simulation can test applications of human factors, for example, whether the real process imposes a cognitive load that causes the security end user to miss important information or threats. We review modeling and simulation in the literature, and we provide insight in that section based on our focus on human error.

figure 2

Venn diagram for the interdisciplinary framework, based on Caulkins (2017)

There is no doubt that behavioral cybersecurity is important and needs more research. We emphasize the three components of this proposed interdisciplinary framework because human performance is not affected solely by training, which is the main focus of cyber defenders. It is also affected by the system itself, people's biases, environmental workload, administrative management, communication practices, human-computer interfaces, existing distractions, etc. Many factors still contribute to the slow research and implementation of interdisciplinary approaches. Unfortunately, many enterprises underestimate the severity of cyber incidents, or they pass the blame to one person when an incident occurs. For instance, the Federal Trade Commission website reports that in September 2017, Equifax announced a data breach that exposed the personal information of 147 million people, and that Equifax has agreed to a global settlement with the Federal Trade Commission, the Consumer Financial Protection Bureau, and 50 U.S. states and territories. The settlement includes up to $425 million to help people affected by the data breach ( FTC 2019 ). Yet the settlement does little for those who file claims (a $125 one-time payout or credit monitoring for a number of years). Individuals cannot opt out of Equifax being their data steward, which makes many people nervous. Most of the online reports state that Equifax did not patch a known vulnerability in the Apache Struts web-application software. Nevertheless, Equifax's Chief Executive told members of Congress on October 3, 2017, that the massive breach happened because of a mistake by a single employee.

Behavioral cybersecurity

Cybercrime offenders: hackers, hackers’ techniques.

A hacker is a person who uses technical intellect to gain unauthorized access to data in order to modify it, delete it, or sell it by any means ( Pal and Anand 2018 ). Although a hacker may follow various steps to execute a successful attack, a usual network intrusion involves reconnaissance to collect information; scanning to build a vulnerability profile; gaining access by penetrating an access point or level; maintaining access by reaching other levels or planting programs to keep access; and covering tracks to hide the trails ( Lahcen et al. 2018 ). The authors in Shetty et al. (2018) have surveyed hacking techniques:

The dictionary attack cracks vulnerable passwords by brute-force trial of a list of likely candidates. It takes advantage of the fact that users cannot remember difficult or meaningless passwords, so they choose familiar or easy ones. Hackers often find users who adopt weak passwords such as 123456 or password . Companies are now enhancing password-syntax rules and mandating specific change procedures, yet users still reuse the same passwords across websites.
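As a minimal sketch of the idea, the following Python snippet tries each entry of a wordlist against a single stolen password hash; the wordlist, the unsalted SHA-256 hashing, and the function name are illustrative assumptions, not a real cracking tool.

```python
import hashlib

def dictionary_attack(stolen_hash, wordlist):
    """Hash each candidate and compare it with the stolen digest."""
    for candidate in wordlist:
        if hashlib.sha256(candidate.encode()).hexdigest() == stolen_hash:
            return candidate  # the weak password is recovered
    return None

# A password such as "123456" falls to even a tiny wordlist.
target = hashlib.sha256(b"123456").hexdigest()
print(dictionary_attack(target, ["password", "qwerty", "123456"]))  # 123456
```

This is also why salted, slow password hashes and rate limiting matter: they make each candidate trial far more expensive.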

Structured Query Language (SQL) injection inserts harmful code that modifies the structure of an SQL query, manipulating the website's database.
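To make the manipulation concrete, this hedged sketch using Python's built-in sqlite3 module contrasts string-spliced input with a parameterized query; the table and values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

# Vulnerable: attacker input is spliced into the SQL text itself,
# so it can rewrite the structure of the query.
attacker_input = "' OR '1'='1"
unsafe_sql = "SELECT * FROM users WHERE name = '%s'" % attacker_input
leaked = conn.execute(unsafe_sql).fetchall()   # every row is returned

# Safe: a parameterized query treats the input as data, not as SQL.
filtered = conn.execute(
    "SELECT * FROM users WHERE name = ?", (attacker_input,)
).fetchall()                                   # no rows match
print(leaked, filtered)
```

The injected `' OR '1'='1` turns the WHERE clause into a tautology, which is why the first query leaks the whole table while the parameterized one returns nothing.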

Cross Site Scripting (XSS) is an attack vector that injects malicious scripts into a victim's webpages.
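The usual defense is output escaping. In this hedged sketch, a hypothetical page template renders user input first verbatim and then escaped with Python's standard `html.escape`; the payload and template are invented for illustration.

```python
import html

payload = "<script>steal(document.cookie)</script>"

# Unsafe: a victim's browser would execute the injected script.
unsafe_page = "<p>Hello, %s</p>" % payload

# Safe: escaping turns the markup into inert text that is displayed,
# not executed.
safe_page = "<p>Hello, %s</p>" % html.escape(payload)
print(safe_page)
```

Real templating engines apply this escaping automatically, which is one reason hand-built string concatenation of HTML is discouraged.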

Phishing is a social engineering attack in which a phisher fools the user into revealing secret information. Some examples are discussed in the "Weapons of influence" section.

Wireless hacking exploits the weakness of some networks, which do not even change the vendor's default access-point credentials and passwords. A Wi-Fi network with a vulnerable access point can be found and hacked through wardriving; the hacker then uses port scanning and enumeration.

A keylogger is software that runs in the background and captures the user's keystrokes, allowing hackers to record credentials.

The literature discusses several hacker profiles. Hackers have various levels of education, hold many certificates, and are either self-employed or work for organizations. Script kiddies are the new and novice hackers, whose intent is curiosity or notoriety. Cyber-punks, such as virus writers, have a medium skill level, and their intent could be notoriety with some financial gain. Insiders, previously called internals, can be driven by many motives, such as revenge or financial benefit, and their skills are usually high. The intent of petty thieves, virus writers, grey hats, or old guard hackers is curiosity or notoriety, but their skill levels are high. The motive of professional criminals, or black hat hackers, can be financial, and they hold very high capabilities. The motive of information warriors, who are cyber mercenaries, is mainly espionage, and they are placed under nation-state groups. Political activists, or hacktivists, are ideologically motivated, and they manage to include members who possess a high level of skill ( Hald and Pedersen 2012 ).

Insight on hackers’ techniques

It is important to understand hacking techniques and hackers' motives in order to anticipate hackers' moves. Hackers do not all think the same way as defenders, or in a linear manner. Consequently, defenders need to be interdisciplinary in order to take various techniques into account and combat them. We support this assumption with one of the real stories of exploitation by hackers that Mitnick and Simon discussed in Mitnick and Simon (2005) : hackers changed the firmware in slot machines after hiring an insider, a casino employee. Their motive was money, and their stimulus was that the programmers of the machines were human and hence had most likely left a backdoor flaw in the programs. One hacker checked the patent office for the code, since including it was a requirement for patent filing. Analysis of the code gave away its secret: the pseudo-random generator in the machines was a 32-bit random number generator, and cracking it was trivial. The designers of the machine did not want real random number generation, so that they would have some control over the odds and the game. The hackers in this story were programmers, and their thinking was simple enough to find a sequence of instructions to reach their goal. At that time, casinos spent money on security guards and not on consulting with security sources. One hacker said that he did not even feel remorse, because casinos in turn steal from people.

Therefore, we present some of the questions that should be answered periodically to predict the hacker's next move: Is the attack surface defined? (The attack surface is the sum of all the attack vectors through which a hacker can attempt to exploit a vulnerability.) What is the most critical, most vulnerable, or most damaging asset if exploited? How are the access points protected? How can hackers access the crown jewels, such as the most valued data? Where are the crown jewels located (servers, network, backups, etc.)? Are the inventories of authorized and unauthorized devices known? Are operating systems well configured and updated? Is a system in place to identify stolen credentials or compromised user accounts? What type of malware defenses are used? How effective are training or awareness programs? Are employees aware of social media risks? What is the situation of employees in the working environment? How effective and robust are the intrusion detection systems in use? Is the reporting system for a potential threat or breach clear? Is there a plan to combat insider threat? We should highlight that many companies see emphasizing prevention as increasing cost and reducing productivity. The increase in cost is due to interaction with security controls and incident response; the loss of productivity is due to granting permissions or re-certifying credentials or user accounts ( Donaldson et al. 2015 ). We think that they should analyze the costs of the different options: a prevention-driven program, an incident-response-driven program, or a hybrid option.

Cybercrime offenders: insiders

Insiders’ threat.

An insider is a hacker from inside the organization; hence, this insider has access rights and is behind the firewalls. Insider threat is broadly recognized as an issue of the highest importance for cybersecurity management ( Theoharidou et al. 2005 ). Several surveys have considered varying aspects of cybersecurity: the SANS Healthcare Cyber Security Survey ( Filkins 2014 ), the Insider Threat Spotlight 2015 Report ( Partners 2015 ), the Department for Business Innovation and Skills 2014 Information Security Breaches Survey ( Willetts 2014 ), etc. The Insider Threat Spotlight 2015 Report stated that companies were more concerned about inadvertent insider data-leak breaches than about malicious data breaches ( Partners 2015 ). However, their concerns do not necessarily translate into effective changes in cyber programs. According to the SANS Healthcare Cyber Security Survey, 51% considered the careless insider the main threat when it comes to human behavior as an aspect of cybersecurity ( Filkins 2014 ). Many theories can be applied to understand insider risks and motives, and can be applied to behavioral models. Policies and risk management guidance are often geared towards rational cyber-actors, while the rationalities of users and defenders represent cyber-system vulnerabilities ( Fineberg 2014 ). Irrational behavior can be dangerous and unpredictable; it builds on frustration or fury, and it can be motivated by lack of job satisfaction. Cyber defenders often do not check for irrational behaviors. The authors in Stanton et al. (2005) concluded that end users' behaviors in organizations could be classified, using intentionality and technical expertise as criteria, into six behavioral groups: intentional damage, harmful misuse, unsafe tinkering, naive mistakes, mindful assurance, and simple hygiene. Myers et al. (2009) added automated insiders, such as bots, to the unauthorized use of privileges. The authors in Azaria et al. (2014) divided related works into six categories: psychological and social theories, anomaly-based approaches, honeypot-based approaches, graph-based approaches, game theory approaches, and motivating studies. The authors in Greitzer and Hohimer (2011) described CHAMPION, a predictive modeling framework that integrates various data from the cyber domain to analyze the psychological and motivational factors behind malicious exploitation by an insider. The ontologies in CHAMPION represent knowledge in the specialized domain to reason about data; the reifiers feed primitive data into the ontologies; and the memory stores both the primitive data and the facts concluded by the reasoning system. In addition, the Auto-associative Memory Columns (AMCs), or reasoning components, are stacked in a hierarchy, interpret data, and infer new statements. The authors in Cappelli et al. (2014) discussed the Management and Education of the Risk of Insider Threat (MERIT) models that can be implemented to communicate insider threat. They identified and validated seven observations after analyzing several insider IT sabotage cases: insiders had personal predispositions, were disgruntled employees, were among those who suffered stressful events (sanctions), had behavioral precursors (drug use, aggressiveness, etc.), created unknown channels to attack after termination, or exploited insufficient physical and electronic access controls. A limitation in insider-threat research is the scarcity of data ( Stolfo et al. 2008 ).

Insight on insiders’ threat

We think there is confusion in classifying insider threat, and many organizations may not even have policies or controls addressing it. Another concern is that organizations do not want to admit to having insider incidents; they choose to fire the intruder and protect their reputation. Our insight considers the insider as a human error to be addressed at the top level of any developed taxonomy. We therefore group all user errors, including the insider, under human error, summarized in Fig.  3 .

figure 3

Proposed UIM human error as insider-anomaly concept

For this purpose, we adopt a definition of human error mentioned by the Center for Chemical Process Safety (AIChE) in Rodriguez et al. (2017) :

"Human error is any human action that exceeds some control limit as defined by the operating system."

We believe our insight is important because it simplifies this confusing issue to Unintentional - Intentional - Malicious, or UIM, instead of several categories. Moreover, it allows us to adopt lessons learned from industries that have a long history of applying human factors and have built mature programs. This insight also makes clear that failures happen at the management level, at the design level, or at the technical expert levels of the company, and that they result in human error or failure ( Embrey et al. 1994 ). The UIM category is decided by consequence or intent:

Unintentional human error can be due to a lack of organized knowledge or operating skills. This error may remain unintentional or transform into another type (intentional or malicious).

Intentional human error is caused by a user who knows of a risky behavior but acts on it anyway, or who misuses assets. The wrong action may not necessarily bring sudden harm to the organization, but it may still breach existing laws or privacy.

Malicious human error is the worst error as it is intentional with specific and damaging consequences in mind.

This classification does not downgrade the insider threat. It brings it upfront in the system design, similar to human errors that are usually considered at the beginning of a design. It is easier to blame the human during a cyber incident than to blame the cyber program or the design of the systems; in fact, a system design that did not consider the human factor is also to blame. Often the user does not see the security policies in the same way as those who wrote them or want them implemented. It is imperative to realize that users often exhibit their own biases in decision making ( Fineberg 2014 ). This grouping can also be implemented in user training and help make awareness easier. We give a few examples:

An unintentional error can happen when a user accesses important accounts over public Wi-Fi without knowing the risk, or when, while working, an employee visits unsafe websites linked from social media.

An intentional error can occur if a user writes a password on a sticky note and leaves it near the computer or in a desk drawer, hoping no one else uses it.

A malicious error can occur when an employee steals confidential data (exfiltration).

As mentioned, a user error can change from one UIM category to another. For example, a user should not activate links or download attachments in emails without verification. If a new employee is not aware of social engineering tactics, the employee may click on those links (unintentional). The employee's clicking rate on those links should decrease with training; if it does not, the employee's action becomes intentional. Similarly, honeypots or decoys can be used to learn about a user's normal or deviant activities. Some companies implement programs to simulate real-life scenarios, such as phishing exercises. We suggest that they be transparent with employees about the use of phishing simulators or other awareness programs. The goal should be to improve the culture of cyber awareness, not to add stress to workloads.
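The escalation rule described above can be sketched as a toy classifier. The function, its thresholds, and the `exfiltrated_data` flag are our own illustrative assumptions, not part of the UIM taxonomy itself.

```python
from enum import Enum

class UIM(Enum):
    UNINTENTIONAL = "unintentional"
    INTENTIONAL = "intentional"
    MALICIOUS = "malicious"

def classify_click_behavior(clicks_before_training, clicks_after_training,
                            exfiltrated_data=False):
    """Toy rule: data theft is malicious; clicking that does not decrease
    after training is treated as intentional; otherwise unintentional."""
    if exfiltrated_data:
        return UIM.MALICIOUS
    if clicks_after_training > 0 and clicks_after_training >= clicks_before_training:
        return UIM.INTENTIONAL
    return UIM.UNINTENTIONAL

print(classify_click_behavior(5, 1))  # UIM.UNINTENTIONAL
print(classify_click_behavior(5, 5))  # UIM.INTENTIONAL
```

In practice such a rule would be one signal among many in a phishing-simulation program, not a verdict on its own.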

We previously described the cyber targets (Fig.  1 ) and mentioned that the defender should consider them in the system design, which usually starts by inspecting requirements. (1) To define the confidentiality requirement, the organization should characterize data and its location. The user should differentiate whether one is dealing with public, confidential, or limited data. Data can be compromised on the user's computer, in transit across an open or closed network, on a front-end server, or in storage ( Maiwald and Sieglein 2002 ). The user's access to confidential data should be updated if the data classification or the user's status changes. Understanding insider threat as a human error or anomaly within the requirements of data security helps us set up policies on the credentials of persons who have access to confidential data, for example, by implementing Just In Time (JIT) credentials. JIT avoids permanent administrator (admin) privileges; in return, it mitigates the risk of stolen admin credentials and prevents admin access to confidential data outside the times when it is needed. (2) Integrity is a system requirement. Data may be modified by the user, in transit across a closed or open network, on a front-end server, or in storage ( Maiwald and Sieglein 2002 ). Considering a user's alteration of a system policy as an error helps us treat integrity like confidentiality. Hence, the user's access to, and impact on, system integrity need to be examined. (3) Availability is also a system requirement. Because a system's components can be interconnected, a user who affects the availability of one part of a system can affect other parts. A user error that makes a system unavailable can easily happen, intentionally or unintentionally, if the system design did not identify failure points.
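The JIT idea can be sketched as a time-boxed grant that expires automatically instead of a permanent admin account; the class name and the 30-minute TTL below are illustrative assumptions, not a specific product's API.

```python
from datetime import datetime, timedelta

class JITCredential:
    """Hypothetical sketch: admin rights are granted for a short window
    and expire on their own, so there is no standing privilege to steal."""
    def __init__(self, user, ttl_minutes=30):
        self.user = user
        self.expires = datetime.utcnow() + timedelta(minutes=ttl_minutes)

    def is_valid(self, now=None):
        # The grant is only honored before its expiry time.
        return (now or datetime.utcnow()) < self.expires

grant = JITCredential("oncall-admin", ttl_minutes=30)
print(grant.is_valid())                                            # True
print(grant.is_valid(now=datetime.utcnow() + timedelta(hours=1)))  # False
```

A stolen credential of this kind is worthless an hour later, which is the risk-mitigation property the text describes.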

Behavior, social and crime theories

Computer scientists, security researchers, psychologists, and social scientists have attempted to explain the behavior of users in relation to cybersecurity. There is insufficient knowledge about users' behavior toward the information technologies that defend systems and data from threats such as malware, spyware, and interruptions ( Dinev and Hu 2007 ). The authors in Greitzer and Hohimer (2011) have emphasized that the only way to be proactive in the cyber domain is to take behavioral or psycho-social data into account. At this point, we introduce theories that should help with such issues.

Theories: normative, planned behavior, social bond, and social cognition

There are questions about rationality when it comes to norms and the study of human cognition. Norms are essential to the study of informal argumentation, judgment, and decision-making. Normative theories are studied in procedural and epistemic forms. It is difficult to resolve questions about suitable norms for a specific behavior without comprehending the origins of normativity ( Corner and Hahn 2013 ). It is recognized that playing a matching game between a particular behavior and some prescriptive standard is not enough to understand the concept of normativity. Hence, Corner and Hahn attempted to answer what makes something normative, and there is a continuing debate on this subject. Our modest understanding is that rational human behavior occurs when the behavior matches some criterion, and logic is used to evaluate arguments. Yet logic has limitations and may not be appropriate for judging the strength of arguments. Such limitations of logic encouraged the rise of Bayesian probability as a framework for calculating argument strength ( Corner and Hahn 2013 ). Therefore, the authors make a good argument that the Bayesian approach is suitable for the requirements of normativity.
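As a hedged illustration of how Bayesian probability can quantify argument strength, the sketch below applies Bayes' rule to update belief in a claim after observing evidence; the probabilities are invented for the example.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior belief in a claim after observing evidence (Bayes' rule)."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))

# A strong argument: the evidence is far likelier if the claim is true,
# so a 50% prior rises to roughly 82%.
print(round(bayes_update(0.5, 0.9, 0.2), 3))  # 0.818
```

On this reading, an argument is strong to the extent that its evidence shifts the posterior, which is the quantitative notion of strength the Bayesian framework supplies.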

Another widely used theory is the Theory of Planned Behavior (TPB), depicted in Fig.  4 . It uses a predictive model in which subjective norms and attitudes influence behavioral intention, which in turn influences actual behavior. The TPB postulates that people's behavioral intention is a good predictor of their real behavior. The subjective norm is a person's perception of social pressure regarding the behavior, and perceived behavioral control is the perceived ease or difficulty of performing the behavior.
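One hedged, simplified reading of the TPB model is a weighted sum of its three predictors; the weights below are illustrative assumptions, not coefficients estimated from data.

```python
def tpb_intention(attitude, subjective_norm, perceived_control,
                  w_att=0.5, w_norm=0.3, w_pbc=0.2):
    """Illustrative linear TPB model: intention as a weighted sum of
    attitude, subjective norm, and perceived behavioral control."""
    return (w_att * attitude + w_norm * subjective_norm
            + w_pbc * perceived_control)

# Scores on a 1-7 scale: strong attitude and control, weaker norm.
print(round(tpb_intention(6, 3, 5), 2))  # 4.9
```

In empirical TPB studies the weights come from regression on survey data; the point here is only the additive structure of the model.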

figure 4

Theory of Planned Behavior diagram, from Icek (2019)

Generally, the greater is the attitude, subjective norm, and perceived behavioral control with respect to a behavior, the higher should be an individual’s intention to demonstrates the behavior under consideration. The attitude is connected to beliefs (behavioral, normative and control). In addition, multiple authors structure social pressure as a cause to normative beliefs. Until now, insufficient research is done on subjective norms regarding cybersecurity. An area in which TPB can be useful in the study of insider threat; as TPB is used successfully in predicting several health behaviors like smoking and substance use. It will be useful to understand the roles of various behavioral factors and learn which ones will have the highest predictive value in order to integrate it in a preventive plan, or an intrusion detection system. Similar to the work of Pabian and Vandebosch that studied cyberbullying using TPB; they found that cyberbullying intention is a predictor of self-reported cyberbullying behavior after six months ( Pabian and Vandebosch 2013 ). The attitude is the primary direct predictor of intention followed by the subjective norm. The authors in Dinev and Hu (2007) have integrated TPB and Technology Acceptance Model (TAM) and found that technology awareness is a predictor to a user behavioral intention to use anti-virus or anti-spyware. Technology awareness had the strong influence on attitudes toward behavior and behavioral intention. They also found that awareness is highly correlated with both TPB and TAM beliefs, and recommended that for managers to create social advocacy groups and networks. Their role is to advocate for cybercrime awareness. The authors of Burns and Roberts (2013) have used TPB to predict online protective behaviors. Their findings indicate a significant relationship between a subjective norm and intention. 
Their study also emphasizes that external parties influence a user’s intention to engage in cyber-protective behavior.

Social Cognition Theory (SCT) originated as Social Learning Theory, developed by Albert Bandura, and became SCT in 1986. It postulates that cognitive factors are related to environmental and behavioral factors; consequently, learning happens in a social context ( Hardy et al. 1980 ) with reciprocal determinism. Figure  5 depicts the basic SCT diagram, based on Hardy et al. (1980) . There is reciprocal cause and effect between a person’s behavior and both the social world and personal characteristics. Hence, criminal or deviant behavior is learned just like any other behavior. Social Bond Theory, by contrast, assumes that weaker social bonds increase the chance that a person will be involved in crime.

Figure 5: Social Cognition Theory basic diagram

An interesting aspect of SCT is that it tries to explain the maintenance of behavior, unlike other theories, which are concerned with the initiation of a behavior. SCT can be applied to the cyber domain to investigate decision support and behavior, and it could underpin a robust security framework that studies the security practice behaviors of users. For example, the impact of self-efficacy, a cornerstone of SCT, on decisions and cyber behavior could be studied. Self-efficacy is not self-esteem; it is a kind of self-evaluation that is significant in individual behavior ( Hardy et al. 1980 ). Self-efficacy can influence the amount of effort, self-regulation, initiation of tasks, and handling of obstacles ( Hardy et al. 1980 ). Also, ill-defined circumstances and performance requirements can introduce inconsistencies between self-efficacy expectations and performance ( Reardon 2011 ).

Theories: general deterrence, neutralization, self-control, and situational crime prevention

The authors of Theoharidou et al. (2005) summarized criminology theories in the security literature. All of these theories involve a motive, and one theory concerns the opportunity for a crime. General Deterrence Theory holds that a perpetrator commits a crime if the expected cost of sanctions is less than the benefit of the crime; hence, stiff punishment and awareness programs deter many potential perpetrators. Authors in Cheng et al. (2014) found that employees focus on the perceived benefits of personal internet use while finding justification for their behavior and paying less attention to the expected punishment. They worry less about the severity of punishment and more about the likelihood of being caught, and they try to justify their deviant behavior as excusable. This is the subject of neutralization theory: employees may use neutralization techniques to justify risky security behaviors. Neutralization is an excellent predictor of employees’ intention to violate information security policies ( Siponen and Vance 2010 ); the authors see it as an indicator of a motivational state that exists just prior to committing an act. Self-Control Theory postulates that criminal acts attract people with low self-control, because these acts provide immediate pleasure. A low-self-control individual prefers immediately gratifying activities that involve risky behaviors and shows little empathy for others; the theory defines crime as behavior that provides momentary or immediate satisfaction and creates negative consequences ( Gottfredson 2017 ). This theory can be applied to cybercrime and may be integrated with the other stated theories. The theory of Situational Crime Prevention (SCP) hypothesizes that a perpetrator must have an opportunity in addition to a motive: a motive without an opportunity will not yield a crime.
Hence, SCP is different because it looks at the opportunities and the formation of motives that incite crimes ( Theoharidou et al. 2005 ). The SCP framework includes rational choice, opportunity structure, specificity, and twenty-five crime-reduction techniques found in Freilich et al. The latest studies discuss complex issues in working with SCP, for instance, who has the competency and the responsibility to prevent a crime. Consequently, curbing the spike in cybercrime will depend on involving many parties, such as law enforcement, government agencies, and security companies.
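The cost-benefit logic of General Deterrence Theory can be caricatured as a short expected-utility check: deterrence succeeds when the expected sanction outweighs the perceived benefit. This is only a sketch; the function name and all numbers below are illustrative assumptions, not empirical estimates.

```python
# Toy expected-utility sketch of General Deterrence Theory: an offender
# proceeds only when the perceived benefit exceeds the expected sanction.
# All values are illustrative assumptions.

def deters(benefit: float, p_caught: float, sanction: float) -> bool:
    """Return True if the expected sanction (certainty x severity) outweighs the benefit."""
    return p_caught * sanction >= benefit

# Raising either certainty (p_caught) or severity (sanction) can deter;
# Cheng et al. (2014) suggest certainty weighs more on employees' minds.
print(deters(benefit=10.0, p_caught=0.05, sanction=50.0))  # 0.05*50 = 2.5 < 10  -> False
print(deters(benefit=10.0, p_caught=0.30, sanction=50.0))  # 0.30*50 = 15 >= 10 -> True
```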

Multi-criteria decision-making

We include Multi-criteria decision-making (MCDM) alongside the above theories because conflicting ideas may arise and decisions need to be made to build good programs or models. MCDM is crucial for many real-life problems, including cybersecurity. However, the discussion of the usability of decision theory against cyber threats is limited, which indicates a gap ( Wilamowski et al. 2017 ). Challenges often arise when evaluating alternatives against a set of deciding measures, and decision making in this paper’s context cannot be easily modeled because it deals with the human element and judgment. A wide range of mathematical MCDM methods for evaluating and validating alternatives exists, embedded in linear programming, integer programming, design of experiments, and Bayesian networks ( Wilamowski et al. 2017 ). MCDM using numerical analysis of the alternatives usually involves three steps: (1) identify the alternatives and the deciding criteria, (2) attach numerical measures to the criteria and to the impact of the alternatives, and (3) rank each alternative after processing the numerical values ( Triantaphyllou et al. 1997 ). The weighted sum model remains the simplest and most widely used MCDM method. The authors of Triantaphyllou and Mann (1995) used the Analytic Hierarchy Process for decision making in engineering and found challenges: for instance, when some alternatives are similar or very close to each other, the decision-maker needs to be very careful. They suggest considering additional decision-making criteria to discriminate more sharply among the alternatives. We can assume, then, that decision-making theories can easily give different answers to the same cybersecurity problem, yet they should be used as tools to back a decision, as the authors of Triantaphyllou and Mann (1995) suggested. The authors of Wilamowski et al. (2017) studied two decision-making theories: the Analytical Hierarchy Process (AHP) and the Analytical Network Process (ANP). They determined that a generalized application benchmark framework could be employed to derive a Measure of Effectiveness (MOE) that relates to the overall operational success criteria (mission performance, safety, availability, and security). MOEs are measured under specific environmental and operational conditions from the users’ viewpoint. The AHP is an appropriate option if a situation requires rapid and effective decisions due to an imminent threat; the ANP is appropriate when time constraints are less important and more far-reaching factors should be considered while constructing a defensive strategy. Their findings offer cybersecurity policy makers a way to quantify the judgments of their technical team regarding cybersecurity policy.
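The three MCDM steps and the weighted sum model mentioned above can be sketched in a few lines. The alternatives ("tool_A", "tool_B"), criteria, weights, and scores below are hypothetical, chosen only to show the mechanics of ranking.

```python
# Minimal weighted sum model (WSM), the simplest MCDM method: score each
# alternative on each criterion, weight the scores, and rank the totals.
# Alternatives, criteria, weights, and scores are hypothetical.

def weighted_sum(scores: dict, weights: dict) -> dict:
    """Rank alternatives by the weighted sum of their criterion scores."""
    totals = {
        alt: sum(weights[c] * s for c, s in crit.items())
        for alt, crit in scores.items()
    }
    # dict preserves insertion order, so the sorted result doubles as a ranking
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))

weights = {"security": 0.5, "cost": 0.3, "usability": 0.2}  # weights sum to 1
scores = {
    "tool_A": {"security": 8, "cost": 5, "usability": 7},
    "tool_B": {"security": 6, "cost": 9, "usability": 8},
}
print(weighted_sum(scores, weights))
```

As Triantaphyllou and Mann (1995) caution, when totals land close together (6.9 vs. 7.3 here), additional criteria may be needed to discriminate between the alternatives with confidence.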

The authors of Kabassi and Virvou (2015) added Human Plausible Reasoning Theory (HPR), a cognitive theory, to MCDM to provide more reasoning to a user interface. HPR depends on analyzing people’s answers to ordinary questions about the world, and it assumes dynamic hierarchies to represent human knowledge. HPR defines certainty parameters as a set of criteria that should be taken into account in order to select the best hypothesis; however, it does not propose precise mathematical methods for combining these criteria. Indeed, MCDM complements HPR and improves control in an intelligent user interface ( Kabassi and Virvou 2015 ).

Weapons of influence

We owe the credit for this section’s title to the first chapter of Cialdini’s book "Influence - The Psychology of Persuasion". Unfortunately, social engineers use these weapons of influence to manipulate people into disclosing sensitive information or granting unauthorized access. Cialdini identified six principles of influence that guide human behavior ( Rodriguez et al. 2017 ): reciprocity, scarcity, authority, consistency, liking, and consensus. The authors in Haycock and Matthews (2016) addressed them in their "Persuasive Advocacy" article. Based on their analysis, we give some examples of how social engineering can exploit and direct human actions, with a view to understanding what motivates cybercrime:

Liking can give a false sense of credibility. Hackers can use it to build rapport or to encourage certain behaviors by generating fake likes and artificially increasing the number of followers on social media, giving the impression that other people support that behavior.

Reciprocity stems from the feeling of obligation to return favors. Hackers can offer free services or products and expect access or data in return.

Social proof, or consensus, summarizes how a person follows others’ lead. Hackers can use this type of validation to influence users and gain access to data. When people are uncertain, they readily follow other people, especially peers.

Persuasion by peers. Hackers can persuade insiders to steal data for a cause that a peer or a role model is promoting.

Individuals who claim expertise or credentials try to harness the power of authority. Authority can lend weight to phony claims and influence a user who is wary of job loss.

Consistency comes from the need to appear or to remain consistent. Hackers can find out about consistent actions and use them to distract a user prior to an attack.

Scarcity of resources makes a user vulnerable. It can influence a user to take an immediate action without thinking about consequences such as a data breach.

Researchers have found that the effectiveness of each of these principles depends on the victim’s personality characteristics. Examples from Uebelacker and Quiel (2014) and Caulkins (2017) of how Cialdini’s principles work in social engineering: agreeableness increases a user’s vulnerability to liking, authority, reciprocity, and social proof. Neuroticism indicates a user is less susceptible to most social engineering attacks. A conscientious user may not resist the principles of authority, reciprocity, and commitment and consistency, especially when commitments are made public. An extraverted user may be more vulnerable to the scarcity principle, since scarcity is experienced as excitement. Conscientiousness may decrease a user’s susceptibility to cyber attacks; yet a conscientious person’s tendency to follow through on commitments may make them susceptible to a continuation of social engineering tactics. Agreeableness may also increase susceptibility to phishing and to sharing passwords. Openness reduces social engineering vulnerability, as more digitally literate users better detect social engineering attacks. Authors in Halevi et al. (2013) found that women are more vulnerable than men to prize phishing attacks, and they found a high correlation between neuroticism and responsiveness to phishing attacks. In addition to Cialdini’s work, researchers such as Gragg and Stajano discussed the triggers of influence and scams. Table  1 , based on the work of Ferreira et al. (2015) and Caulkins (2017) , summarizes the principles of Cialdini, Gragg, and Stajano.
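The reported trait-to-principle susceptibilities can be encoded as a simple lookup, for example to tailor awareness training to a user's assessed traits. The mapping below is a deliberate simplification of the cited findings for illustration, not a validated instrument, and the trait and principle names are our own labels.

```python
# Illustrative lookup of Big Five traits -> influence principles a user may
# resist poorly, loosely following Uebelacker and Quiel (2014) and Caulkins
# (2017). A simplification for demonstration, not a validated model.

SUSCEPTIBILITY = {
    "agreeableness": {"liking", "authority", "reciprocity", "social proof"},
    "conscientiousness": {"authority", "reciprocity", "commitment/consistency"},
    "extraversion": {"scarcity"},
}

def risky_principles(traits: set) -> set:
    """Union of influence principles a user with these traits may resist poorly."""
    return set().union(*(SUSCEPTIBILITY.get(t, set()) for t in traits))

print(risky_principles({"agreeableness", "extraversion"}))
```

Such a table could, for instance, steer phishing-awareness content toward the principles a given user profile is most exposed to.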

Those authors found that phishing emails use social engineering and depend on the liking, deception, and similarity principles, with distraction the second most commonly used principle; combining principles increases the success of phishing attacks ( Ferreira et al. 2015 ). The elaboration likelihood model of persuasion in Cacioppo and Petty (2001) suggests that there are central (high-elaboration) and peripheral (low-elaboration) routes to persuasion: a person faced with a persuasive message will process it using either low or high elaboration.

Insight on discussed theories and principles

Applying the described theories to cyber domains should help identify targets by understanding the opportunities for a crime. This can be a subject of asset management and risk assessment: What are the crown jewels, and what are their vulnerabilities? Should a company decoy offenders or harden the targets? Who may be interested in hacking them? The hacker type and technique are to be identified; this is much better than the current situation, in which those questions are asked only during incident response. Those theories can also explain the initiation of deviant behavior, the maintenance of a behavior, and the motive for a cybercrime, and they consider social and environmental factors that could otherwise be missed when preparing a prevention program. Little research has been done in this field. For example, research could explore the use of those theories to develop simple models like Persona non Grata that identify adversaries inside or outside security perimeters. Integrating different theories can further classify a deviant behavior as mere misbehavior or as the beginning of an imminent attack. It seems that creating social advocacy groups and cyber awareness can help improve users’ intentions and attitudes, and strong social bonds are much better than weaker ones. We also discussed decision making and understanding alternatives and norms. Weapons of influence are used by intruders, while defenders lack the research to use them in defending confidentiality, integrity, and availability; the paper of Faklaris (2018) has suggestions on using weapons of influence to support IT professionals. The attack vectors most commonly used by social engineers are phishing (by email), vishing (phone call), impersonation, and smishing (text message).

Human factors

Relate human factors to cybersecurity

For human factors, researchers can learn from the health and aviation industries, which have extensive work in this discipline. Human factors is the discipline that works to optimize the relationship between humans and technology. We pick the Map-Assess-Recognize-Conclude (MARC) process, shown in Fig.  6 and found in Parush et al. (2017) , to address behavioral aspects and focus on human error.

Figure 6: Interpretation of MARC process, based on Parush et al. (2017)

Mapping the user and the environment requires asking a set of questions about their characteristics, roles, knowledge, skills, experience, tasks, responsibility, personality traits, access points and locations, human-machine interface, etc. Assessment can then analyze the known factors and collect facts on user capabilities and limitations and on the working environment. While assessing, one can recognize emerging factors that were not initially included in the mapping and can cause human error. The two types of emergent factors are environmental (physical and human) and human (psychological, physical). For example, fatigue or distraction can contribute to unintentional mistakes, and loss of vigilance can lead to intentional mistakes; fatigue, distraction, and loss of vigilance could all be emergent factors. Norman argues that humans will make errors in even the best-designed systems, so systems should be designed to minimize the effect of the error ( Norman 1983 ). We agree with this view, as human errors are known to cause a variety of accidents in various industries and organizations. In aviation, the twelve human errors, or "dirty dozen," that lower people’s performance and safety and can lead to maintenance errors are: lack of communication, complacency, lack of knowledge, distraction, lack of teamwork, fatigue, lack of resources, pressure, lack of assertiveness, stress, lack of awareness, and norms ( Dupont 1997 ). We can easily relate those factors to cybersecurity.

Lack of communication is a problem for any organization. The survey by Ponemon Institute LLC (2014 ) found that 51% of respondents report a lack of information from security solutions and are unsure whether their solution can tell them the cause of an attack. Lack of communication can certainly affect awareness negatively. Human factors integration can contribute to environmental situations involving work shifts, communication during emergencies, communication of concerns and risks to contractors, identification of tools, and communication of changes to procedures and plans. The main aim is not to miss important information, create misunderstandings, or increase cost by dealing with unhelpful information. Complacency can create false confidence at both the organizational and the user level: a user can feel confident because current behavior has not caused a breach, yet that does not mean that intentional wrongdoing would not cause a future breach. Lack of knowledge can cause unintentional mistakes such as not logging off accounts or writing a difficult-to-memorize password on paper. Distraction was already mentioned both as a source of mistakes and as an attack tactic. Lack of teamwork can cause a breach because hackers understand how IT teams work and can take advantage of their dysfunction. Fatigue was already mentioned as a problem factor. The environment in which the user works can cause pressure and stress when it does not provide actionable policies or training to strengthen weaknesses; we discussed in SCT how the environment affects behavioral factors. Lack of assertiveness can be connected to communication and self-efficacy: it can lead to not communicating potential concerns directly to teammates, not proposing possible solutions, or not asking for feedback. Lack of awareness can be caused by not being vigilant. Norms were discussed under normative behavior theory; a user can conduct negative or unsafe behavior, or take a wrong action in ambiguous cases.

Insight based on chemical industry

Behavioral cybersecurity can benefit from the pitfalls recognized by human factors work in other industries. We present our insight as an interpretation of human errors in cybersecurity based on common mistakes that happen at chemical industry sites, which are labeled major hazard sites ( Noyes 2011 ). A parallel comparison of a highly vulnerable cyber environment to a major hazard site is the following:

Cyber defenders and users are not superhuman and may not be able to intervene heroically in emergencies. The incident response team is formed of many members, and its efficiency depends on many factors such as the team’s budget, training, whether teams are internal or external, available tools, etc. More research is needed on the resilience and agility of those response teams.

Not documenting assumptions or data sources when documenting probabilities of human failure. As mentioned previously, designs and plans are usually geared towards rational cyber-actors.

Assuming that a defender will always be present, detect a problem and immediately take an appropriate action.

Assuming that users and defenders are well-trained to respond to incidents. Note that training does not prevent violations.

Assuming that defenders and users will always follow procedures.

Assuming that defenders and users are highly motivated and thus not prone to unintentional errors or malicious violations.

Ignoring the human element, especially human performance as if the cyberspace is unmanned.

Inappropriate use of defense tools and losing sight of techniques or tools where they are the most effective.

Not knowing how to manage human error.

Moreover, we interpret three concerns that match our literature review, based on Noyes (2011) :

The focus is more on technology than human aspects.

Ignoring initial vulnerabilities in the design and development of systems and focusing on training instead.

Blaming incidents on a user with or without investigating system and management failures.

Modeling and simulation

Network security and all the tools associated with it do not provide perfect security; in fact, perfect security does not exist. Hence, there is a continuous need to develop and test new solutions and tools. This is where modeling and simulation help save time and keep costs down while creating test-beds or environments in which those new tools or strategies are tested. Several network simulation tools have been established since the 1990s, such as Network Simulation Testbed (NEST), Realistic and Large (REAL), OMNeT++, SSFNet, NS2, NS3, J-Sim, OPNET, and QualNet ( Niazi 2019 ). Yet not many of these tools were created to address the human element. The main challenge is to validate the reliability and dependability of a simulation in comparison to real-life scenarios or data sets, and the anonymity problem makes the challenge more difficult. The author in Cohen (1999) discussed the complexity issue in modeling: a simple model may not be accurate, while fully detailed models of every threat and defense mechanism may have higher accuracy but are costly. Exploring answers to questions about hackers’ or insiders’ behaviors could help research (or enterprises) use modeling and simulation to detect anomalies and respond. For instance: what are all possible user behaviors (start an application, send a ping, open a file, etc.)? What are acceptable or normal behaviors (open an authorized file, start an application, etc.)? And what are unacceptable behaviors (open or attempt to open an unauthorized file, ping, send a bulk of pages to a printer, browse irrelevant sites, possibly reached by copying and pasting URLs from disabled email links, etc.)?
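The behavior-triage questions above (possible, acceptable, and unacceptable behaviors) can be sketched as an allowlist and a blocklist with an explicit "review" bucket for behaviors that were never modeled. The event names and lists below are hypothetical; a real system would derive them from observed baselines.

```python
# Sketch of the behavior triage implied above: known-normal behaviors are
# allowed, known-unacceptable ones raise an alert, and anything unmodeled
# is flagged for review. Event names and lists are hypothetical.

NORMAL = {"open_authorized_file", "start_application", "send_ping"}
UNACCEPTABLE = {"open_unauthorized_file", "bulk_print", "browse_irrelevant_site"}

def triage(event: str) -> str:
    if event in UNACCEPTABLE:
        return "alert"
    if event in NORMAL:
        return "allow"
    return "review"  # unknown behavior: modeled neither as normal nor abnormal

for e in ("start_application", "bulk_print", "new_usb_device"):
    print(e, "->", triage(e))
```

The "review" bucket is the interesting design choice: it makes explicit that any enumeration of behaviors will be incomplete, which is exactly where simulation can help refine the lists.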

Theoretical models of human behavior have been developed; some examples are stated in Goerger (2004) :

(1) Bayesian networks are useful for reasoning from effects to causes, from causes to effects, or by mixed inference. Bayesian networks are directed graphs, and their models belong to the family of probabilistic graphical models. They can be used to simulate the impact of actions or motives and to build in actions that mitigate the overall risk. Researchers have used Bayesian network models in intrusion detection systems. Those models have the flexibility to be combined with other techniques, yet authors in Xie et al. (2010) warn that the combination should preserve Bayesian networks’ strength in identifying and representing relevant uncertainties. Many of the behavioral theories can be tested by simulation. In Dutt et al. (2013) , Instance-Based Learning Theory predicts that both defender and adversary behaviors are likely to influence the defender’s accurate and timely detection of threats. The defender’s cyber awareness is affected by the defender’s cognitive abilities (experience and tolerance) and the attacker’s strategy (timing of threats).

(2) A neural network is a set of algorithms designed to recognize patterns based on a cognitive model, or to mimic properties of the human brain. Neural-network models are relatively fast but require a training set to learn before being applied in operating mode. There are several types of neural networks, surveyed in Berman et al. (2019) and Parveen (2017) . They have useful applications in security and are already used in intrusion detection systems for anomaly detection ( Parveen 2017 ). This work can be expanded in ways similar to how banks currently use neural networks to detect fraudulent transactions; hence, they can be trained to detect abnormal behaviors. Yet they still face the challenge of acting as a black box, so the recommendation is to use them in combination with artificial intelligence or other models.

(3) While an agent-based system can identify characteristics of the environment, it might also be able to link user-based actions with their destructive impact on systems. Agent-based modeling is used by social scientists to analyze human behavior and social interactions. Those models are useful for studying complex systems, and the interaction of the networks can be shown using visualization methods.

(4) A multi-agent system is a behavior model in which agents can act autonomously on behalf of their users. Agents can work individually or cooperatively. Multi-agent systems have recently been used in studying smart-grid communication protocols.

(5) A rule-based or knowledge-based system endeavors to imitate human behavior using an enumeration of steps with causal if/then associations. Hence, possible situations are precoded, which causes a problem when rules were not determined beforehand. Rule-based models are used for detecting anomalies in intrusion detection systems. In Chen and Mitchell (2015) , the authors proposed a methodology to transform the behavior rules used for intrusion detection into a state machine.
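As a concrete illustration of the Bayesian reasoning in model (1), the sketch below inverts from an observed effect (an IDS alert) to its likely cause (an intrusion) with Bayes' rule. The probabilities are assumed purely for illustration; a real model would learn them from data.

```python
# Two-node Bayesian reasoning sketch (cause: intrusion, effect: alert),
# reasoning from effect back to cause with Bayes' rule. All probabilities
# are illustrative assumptions.

def posterior_intrusion(p_intrusion, p_alert_given_intrusion, p_alert_given_benign):
    """P(intrusion | alert), from the prior and the two likelihoods."""
    p_alert = (p_alert_given_intrusion * p_intrusion
               + p_alert_given_benign * (1.0 - p_intrusion))
    return p_alert_given_intrusion * p_intrusion / p_alert

# Even a sensitive detector yields a modest posterior when intrusions are
# rare -- a relevant uncertainty a Bayesian model makes explicit.
print(round(posterior_intrusion(0.01, 0.9, 0.05), 3))  # -> 0.154
```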
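For the neural-network approach in model (2), a minimal single-neuron (perceptron) classifier in pure Python shows the training-set requirement before operating mode. The two features (imagined here as a normalized failed-login rate and an off-hours-activity score) and the toy samples are assumptions, not real data.

```python
# Minimal single-neuron (perceptron) anomaly classifier in pure Python.
# Features and labels are toy assumptions; real systems need far richer
# training sets and architectures.

def step(z):
    return 1 if z > 0 else 0

def train(samples, labels, lr=0.1, epochs=20):
    """Perceptron learning rule: nudge weights toward misclassified samples."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = y - step(sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

X = [[0.1, 0.0], [0.2, 0.1], [0.9, 1.0], [0.8, 0.9]]  # 0 = normal, 1 = abnormal
y = [0, 0, 1, 1]
w, b = train(X, y)

def predict(x):
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)

print(predict([0.15, 0.05]), predict([0.85, 0.95]))  # -> 0 1
```

Note the black-box caveat from the text applies even here in miniature: the learned weights say nothing human-readable about *why* a behavior was flagged.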
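The if/then association in model (5) can be sketched as a small rule table. The rule names and thresholds are hypothetical, and the last call deliberately shows the precoding limitation noted above: a situation no rule anticipated simply matches nothing.

```python
# If/then sketch of a rule-based (knowledge-based) detector: each rule
# precodes one known-bad situation. Rule names, event fields, and
# thresholds are hypothetical.

RULES = [
    ("too_many_failed_logins", lambda ev: ev.get("failed_logins", 0) > 5),
    ("off_hours_file_access", lambda ev: ev.get("hour", 12) < 6 and ev.get("file_access", False)),
]

def evaluate(event: dict) -> list:
    """Return the names of all rules the event triggers (empty = no match)."""
    return [name for name, cond in RULES if cond(event)]

print(evaluate({"failed_logins": 8}))              # -> ['too_many_failed_logins']
print(evaluate({"hour": 3, "file_access": True}))  # -> ['off_hours_file_access']
print(evaluate({"novel_attack": True}))            # -> []  (unanticipated, undetected)
```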

Conclusion and future work

Behavioral aspects of cybersecurity are becoming a vital area of research. The unpredictable nature of human behavior and actions makes humans an important element of, and enabler of, the level of cybersecurity. The goal of discussing the reviewed theories is to underscore the importance of social factors, behavior, environment, biases, perceptions, deterrence, intent, attitude, norms, alternatives, sanctions, and decision making in understanding cybercrimes. Although those theories have limitations, they can still collectively be used to strengthen a behavioral model. Both the user’s and the offender’s behaviors and intentions should be understood and modeled; improving this area will help improve readiness and prevent incidents. No system is 100% secure, but maximizing security cannot happen without considering the human element. The motto of "Trust, but verify" mentioned by President Ronald Reagan applies to cybersecurity: a level of trust must be placed in a cyber domain in order to work with it, yet ongoing verification is necessary. Employees have to be knowledgeable about the risks and able to differentiate desired from undesired behaviors; still, some employees may not comply because they employ neutralization techniques. Cyber awareness training should be personalized, because employees have different credentials, levels of access, and responsibilities, as well as their own biases toward security; one-size-fits-all awareness programs are not effective. There is a level of trust that needs to be placed in employees; however, technology and cyber awareness must be taught, and verification of compliance is necessary. More training is not always the solution. An interdisciplinary conceptual framework is proposed to bring together behavioral cybersecurity, human factors, and modeling and simulation. Enterprises should be involved in research to make sure that models work the way they are intended. Using a model that is merely convenient, without personalizing it, may not be proper. George E. P. Box’s quote,

"All models are wrong, but some are useful"

should motivate researchers and organizations to ask more questions about the usefulness of a model, which in turn promotes revising policies and approaches to security. Therefore, coordinating the behavioral and technical aspects of cybersecurity should be tailored to each organization. Our future work will address the three main concerns stated at the end of Section 3 . For instance, we will explore cyber incidents such as insider threat from the perspective of human error using the proposed framework. A concept model is depicted in Fig.  7 .

Figure 7: Mitigating human error concept model using proposed framework

The model can also support mitigating failures due to social engineering or weapons of influence. Hence, future work will support different kinds of cyber ontologies. We will also study deception games using game theory with different attacker-defender scenarios. The final statement is: remain vigilant and be prepared to expect the unexpected.

Availability of data and materials

No data is used in this paper.

Abbreviations

AHP: Analytical hierarchy process

ANP: Analytical network process

AAMC: Auto-associative memory columns

XSS: Cross site scripting

DDoS: Distributed denial of service

HPR: Human plausible reasoning theory

IDS: Intrusion detection system

JIT: Just in time

MARC: Map-assess-recognize-conclude

MOE: Measure of effectiveness

NIST: National institute of standards and technology

SCP: Situational crime prevention

SCT: Social cognition theory

SQL: Structured query language

TAM: Technology acceptance model

TPB: Theory of planned behavior

UIM: Unintentional - intentional - malicious

Addae, JH, Sun X, Towey D, Radenkovic M (2019) Exploring user behavioral data for adaptive cybersecurity. User Model User-Adap Inter 29(3):701–750. https://doi.org/10.1007/s11257-019-09236-5 .

Ahram, T, Karwowski W (2019) Advances in Human Factors in Cybersecurity In: AHFE: International Conference on Applied Human Factors and Ergonomics, 66–96. Springer, Washington D.C. https://doi.org/10.4018/978-1-5225-9742-1.ch003 .


Apvera (2018) The Essential Guide to Risk Management & Compliance (GRC) 2018, Tech. rep.. Apvera.

Azaria, A, Richardson A, Kraus S, Subrahmanian VS (2014) Behavioral analysis of insider threat: A survey and bootstrapped prediction in imbalanced data. IEEE Trans Comput Soc Syst 1(2):135–155. https://doi.org/10.1109/TCSS.2014.2377811 .


Berman, DS, Buczak AL, Chavis JS, Corbett CL (2019) A survey of deep learning methods for cyber security. Inf (Switzerland) 10(4). https://doi.org/10.3390/info10040122 .

Blackborrow, J, Christakis S (2019) Complexity In Cybersecurity Report 2019 - How Reducing Complexity Leads To Better Security Outcomes. Tech. Rep. May, Forrester’s Security & Risk research group.

Burns, S, Roberts L (2013) Applying the Theory of Planned Behaviour to predicting online safety behaviour. Crime Prev Community Saf 15(1):48–64. https://doi.org/10.1057/cpcs.2012.13 .

Cacioppo, JT, Petty RE (2001) The elaboration likelihood model of persuasion. Adv Exp Soc Psychol 19:673–676. https://doi.org/10.1558/ijsll.v14i2.309 .

Cappelli, D, Moore A, Trzeciak R (2014) The CERT Guide to Insider Threats: How to Prevent, Detect, and Respond to Information Technology Crimes (Theft, Sabotage, Fraud) In: SEI Series in Software Engineering, 2nd edn. Addison-Wesley, Westford, Massachusetts.

Caulkins, B (2017) Lecture title Modeling and Simulation of Behavioral Cybersecurity, Retrieved on December 26, 2018 from IDC 5602 Cybersecurity: A Multidisciplinary Approach.

Chen, IR, Mitchell R (2015) Behavior Rule Specification-Based Intrusion Detection for Safety Critical Medical Cyber Physical Systems. IEEE Trans Dependable Secure Comput 12(1). https://doi.org/10.1109/tdsc.2014.2312327 .

Cheng, L, Li W, Zhai Q, Smyth R (2014) Understanding personal use of the Internet at work: An integrated model of neutralization techniques and general deterrence theory. Comput Hum Behav 38:220–228. https://doi.org/10.1016/j.chb.2014.05.043 .

Cohen, F (1999) Simulating Cyber Attacks, Defences, and Consequences Modeling, Simulation, and Data Limitations in Information Protection. Comput Secur 18:479–518.

Corner, A, Hahn U (2013) Normative theories of argumentation: Are some norms better than others? Synthese 190(16):3579–3610. https://doi.org/10.1007/s11229-012-0211-y .

Dinev, T, Hu Q (2007) The Centrality of Awareness in the Formation of User Behavioral Intention toward Protective Information Technologies. J Assoc Inf Syst 8(7):386–408. https://doi.org/10.17705/1jais.00133 .

Donaldson, S, Siegel S, Williams CK, Aslam A (2015) Enterprise Cybersecurity - How to Build a Successful Cyberdefense Program Against Advanced Threats. Apress Media LLC, New York.

Dupont, G (1997) Human Error In Aviation Maintenance. https://www.faa.gov/about/initiatives/maintenance_hf/library/documents/media/human_factors_maintenance/human_error_in_aviation_maintenance.pdf . Accessed 28 Dec 2019.

Dutt, V, Ahn YS, Gonzalez C (2013) Cyber situation awareness: Modeling detection of cyber attacks with instance-based learning theory. Hum Factors 55(3):605–618. https://doi.org/10.1177/0018720812464045 .

Embrey, D, Kontogiannis T, Green M (1994) Guidelines for Preventing Human Error in Process Safety. Am Inst Chem Eng. https://doi.org/10.1002/9780470925096 .

Faklaris, C (2018) Social Cybersecurity and the Help Desk : New Ideas for IT Professionals to Foster Secure Workgroup Behaviors. Baltimore, MD: USENIX Symposium on Usable Privacy and Security.

Ferreira, A, Coventry L, Lenzini G (2015) Principles of Persuasion in Social Engineering and Their Use in Phishing. Springer International Publishing. https://doi.org/10.1007/978-3-319-20376-8_4 .

Filkins, B (2014) New Threats Drive Improved Practices: State of Cybersecurity in Health Care Organizations. Sans Inst. https://www.qualys.com/docs/sans-threats-drive-improved-practices-state-of-cybersecurity-health-care-organizations.pdf . Accessed 30 Mar 2020.

Fineberg, V (2014) BEC: Applying Behavioral Economics to Harden Cyberspace. J Cyber Secur Inf Syst 2(1):27–33.

Freilich, JD, Newman GR, Freilich JD, Newman GRSituational Crime Prevention In: Oxford Research Encyclopedia of Criminology and Criminal Justice, February 2020, 1–28. https://doi.org/10.1093/acrefore/9780190264079.013.3 .

Friedman, S, Gokhale N (2019) Pursuing cybersecurity maturity at financial institutions: Survey spotlights key traits among more advanced risk managers. Tech. rep. Deloitte Center for Financial Services analysis.

FTC (2019) Equifax Data Breach Settlement. https://www.ftc.gov/enforcement/cases-proceedings/refunds/equifax-data-breach-settlement . Accessed 27 Dec 2019.

Goerger, SR (2004) Validating human behavioral models for combat simulations using techniques for the evaluation of human performance. Tech. Rep. 3. Naval Postgraduate School, MOVES Institute, Monterey, CA.

Gottfredson, M (2017) Self-Control Theory and Crime. https://doi.org/10.1093/acrefore/9780190264079.013.252 .

Greitzer, FL, Hohimer RE (2011) Modeling Human Behavior to Anticipate Insider Attacks. J Strat Secur 4(2):25–48. https://doi.org/10.5038/1944-0472.4.2.2 . http://scholarcommons.usf.edu/jss/vol4/iss2/3/ .

Hald, SL, Pedersen JM (2012) An updated taxonomy for characterizing hackers according to their threat properties. Int Conf Adv Commun Technol ICACT:81–86.

Halevi, T, Lewis J, Memon N (2013) Phishing, Personality Traits and Facebook. https://doi.org/10.1111/j.1469-0691.2005.01161.x . http://arxiv.org/abs/1301.7643.

Hardy, AB, Howells G, Bandura A, Adams NE (1980) Tests of the generality of self-efficacy theory. Cogn Ther Res 4(1):39–66.

Haycock, K, Matthews JR (2016) Persuasive Advocacy. Public Libr Q 35(2):126–135. doi:10.1080/01616846.2016.1200362.

Holt, TJ (2016) Cybercrime through an interdisciplinary lens. Routledge Taylor & Francis Group. https://doi.org/10.4324/9781315618456 .

Icek, A (2019) Theory of Planned Behavior Diagram. http://people.umass.edu/aizen/tpb.diag.html . Accessed 7 Sept 2019.

Kabassi, K, Virvou M (2015) Combining decision-making theories with a cognitive theory for intelligent help: A comparison. IEEE Trans Hum Mach Syst 45(2):176–186. https://doi.org/10.1109/THMS.2014.2363467 .

Kemmerer, M (2016) Detecting the Adversary Post- Compromise with Threat Models and Behavioral Analytics. https://www.mitre.org/sites/default/files/publications/pr-16-3058-presentation-detecting-adversary-post-compromise.pdf . Accessed 27 Dec 2019.

Lahcen, RAM, Mohapatra R, Kumar M (2018) Cybersecurity: A survey of vulnerability analysis and attack graphs In: International Conference on Mathematics and Computing, 97–111.. Springer.

Maimon, D, Louderback ER (2019) Cyber-Dependent Crimes: An Interdisciplinary Review. Ann Rev Criminol 2(1):191–216. https://doi.org/10.1146/annurev-criminol-032317-092057 .

Maiwald, E, Sieglein W (2002) Security Planning & Disaster Recovery. Brandon A. Nordin, Berkeley, California.

Mitnick, KD, Simon WL (2005) The art of intrusion : the real stories behind the exploits of hackers, intruders, & deceivers. Wiley.

Myers, J, Grimaila MR, Mills RF (2009) Towards insider threat detection using web server logs In: Proceedings of the 5th Annual Workshop on Cyber Security and Information Intelligence Research Cyber Security and Information Intelligence Challenges and Strategies - CSIIRW ’09, 1. http://portal.acm.org/citation.cfm?doid=1558607.1558670 .

Niazi, MA (2019) Modeling and Simulation of Complex Communication Networks. Modeling and Simulation of Complex Communication Networks Edited by Muaz A. Niazi. The Institution of Engineering and Technology, London. https://doi.org/10.1049/pbpc018e .

Book   Google Scholar  

Norman, D (1983) Design Rules Based on Analyses of Human Error. Commun ACM 26(4):254–259.

Noyes, J (2011) The human factors toolkit Human factors in the management of major accident hazards. https://doi.org/10.1049/pbns032e_ch4 .

Pabian, S, Vandebosch H (2013) Using the theory of planned behaviour to understand cyberbullying. Eur J Dev Psychol 11(4):463–477. https://doi.org/10.1080/17405629.2013.858626. T4 - The importance of beliefs for developing interventions M4 - Citavi.

Pal, SK, Anand S (2018) InfoSec : A Comprehensive Study. IUP J Comput Sci XII:45–65.

Partners, CR (2015) Insider Threat Spotlight Report. Tech. rep. Crowd Research Partners.

Parush, A, Parush D, Ilan R (2017) Human factors in healthcare: a field guide to continuous improvement. Morgan & Claypool.

Parveen, J (2017) Neural Networks in Cyber Security. Int Res J Comput Sci 9(4):2015–2018.

MathSciNet   Google Scholar  

Payne, BK, Hadzhidimova L (2018) Cyber security and criminal justice programs in the United States: Exploring the intersections. Int J Crim Justice Sci 13(2):385–404.

Pfleeger, SL, Caputo DD (2012) Leveraging behavioral science to mitigate cyber security risk. Comput Secur 31(4):597–611. https://doi.org/10.1016/j.cose.2011.12.010 .

Pogue, C (2018) Decoding the minds of hackers. https://www.nuix.com/black-report/black-report-2018 .

Ponemon Institute LLC (2014) Exposing the Cybersecurity Cracks : A Global Perspective. Tech. Rep. April. Ponemon Institute LLC.

Reardon, S (2011) Antismoking drive tries cigarette ads, in reverse. Science 333(6038):23–24. https://doi.org/10.1126/science.333.6038.23 . https://science.sciencemag.org/content/333/6038/23 .

Rodriguez, MA, Bell J, Brown M, Carter D (2017) Integrating Behavioral Science with Human Factors to Address Process Safety. J Organ Behav Manag 37:301–315.

Shetty, SS, Shetty RR, Shetty TG, D’Souza DJ (2018) Survey of hacking techniques and it’s prevention. IEEE Int Conf Power Control Signals Instrum Eng ICPCSI 2017:1940–1945. https://doi.org/10.1109/ICPCSI.2017.8392053 .

Siponen, M, Vance A (2010) Neutralization: New insights into the problem of employee information systems security policy violations. MIS Q 34(3):487–502. https://doi.org/10.1038/174197b0 .

Stanton, JM, Stam KR, Mastrangelo P, Jolton J (2005) Analysis of end user security behaviors. Comput Secur 24(2):124–133. https://doi.org/10.1016/j.cose.2004.07.001 .

Stolfo, SJ, Bellovin SM, Hershkop S, Keromytis AD, Sinclair S, Smith SW (2008) Advances in information security: Insider attack and cyber security - Beyond the hacker. Springer, New York.

Symantec (2017) Internet Security Threat Report ISTR 22 Government Internet Security Threat Report. https://www.symantec.com/content/dam/symantec/docs/reports/istr-22-2017-en.pdf . Accesssed 27 Dec 2019.

Theoharidou, M, Kokolakis S, Karyda M, Kiountouzis E (2005) The insider threat to information systems and the effectiveness of iso17799. Comput Secur 24(6):472–484.

Triantaphyllou, E, Kovalerchuk B, Mann L, Knapp GM (1997) Determining the most important criteria in maintenance decision making. J Qual Maint Eng 3(1):16–28. https://doi.org/10.1108/13552519710161517 .

Triantaphyllou, E, Mann SH (1995) Using the analytic hierarchy process for decision making in engineering applications: some challenges. Int J Ind Eng Appl Pract 2(1):35–44.

Uebelacker, S, Quiel S (2014) The social engineering personality framework In: Proceedings - 4th Workshop on Socio-Technical Aspects in Security and Trust, STAST 2014 - Co-located with 27th IEEE Computer Security Foundations Symposium, CSF 2014 in the Vienna Summer of Logic 2014, January, 24–30. https://doi.org/10.1109/STAST.2014.12 .

Wilamowski, GC, Dever JR, Stuban SMF (2017) Using Analytical Hierarchy and Analytical Network Processes to Create CYBER SECURITY METRICS. Defense ARJ. https://doi.org/10.22594/dau.16-760.24.02 .

Willetts, D (2014) 2014 Information Security Breaches Survey: Technical Report. Tech. rep. Department of Business Innovation & Skills.

Xie, P, Li JH, Ou X, Liu P, Levy R (2010) Using Bayesian networks for cyber security analysis In: Proceedings of the International Conference on Dependable Systems and Networks, 211–220. https://doi.org/10.1109/DSN.2010.5544924 .

Xu, M, Schweitzer KM, Bateman RM, Xu S (2018) Modeling and Predicting Cyber Hacking Breaches. IEEE Trans Inf Forensic Secur 13(11):2856–2871. https://doi.org/10.1109/TIFS.2018.2834227 .

Download references

Acknowledgements

The authors would like to thank the journal for the opportunity to publish an open access paper, and many thanks to the outstanding reviewers for their hard work and feedback.

The authors declare that this work was not funded.

Author information

Authors and Affiliations

University of Central Florida, Mathematics Department, Orlando, 32816, FL, USA

Rachid Ait Maalem Lahcen & Ram Mohapatra

Institute for Simulation and Training, 3100 Technology Pkwy, Orlando, 32826, FL, USA

Bruce Caulkins

Birla Institute of Technology and Sciences - Pilani, Hyderabad Campus, Hyderabad, 500078, Telangana, India

Manish Kumar


Authors’ contributions

All authors contributed to different parts of the manuscript. They participated in revising and approving revisions. The author(s) read and approved the final manuscript.

Authors’ information

Rachid Ait Maalem Lahcen is a Mathematics Instructor at the University of Central Florida (UCF), Orlando, Florida. He holds a Master of Science in Mechanical Engineering, a Master of Science in Modeling & Simulation, a graduate certificate in Mathematics, and a graduate certificate in Modeling and Simulation of Behavioral Cybersecurity, all from UCF. His research interests are cybersecurity, graph networks, inverse problems, numerical methods, and students’ learning.

Bruce Caulkins is a Research Assistant Professor and Director of the Modeling & Simulation (M&S) of Behavioral Cybersecurity Program at the Institute for Simulation & Training (IST) at the University of Central Florida (UCF). He is a retired Army Colonel with over 28 years of experience in tactical, operational, and strategic communications and cyberspace operations. In his last military assignment, he was the Chief of the Cyber Strategy, Plans, Policy, and Exercises Division (J65) within the U.S. Pacific Command. In this capacity, he gained extensive insight into cyber capabilities, operational requirements, combatant command requirements, coalition and partner cyber/communications interoperability, and human factor requirements. He also led over a dozen coalition and partner interoperability exercises, including the HADR-focused PACIFIC ENDEAVOR. Bruce previously taught at and ran several communications and cyber-related schools within the Army’s Training and Doctrine Command. He earned his Ph.D. in Modeling and Simulation at the University of Central Florida, focusing on anomaly detection within intrusion-detection systems. His research interests include behavioral aspects of cybersecurity; threat modeling; cyber workforce development; anomaly detection; cyber security and analysis; cyber education and training methodologies; predictive modeling; data mining; cyber strategy; and cyber policy.

Ram Mohapatra received his Ph.D. from Jabalpur University, India, and has taught at the American University of Beirut, the University of Alberta (Edmonton), York University (Downsview), and, since 1984, the University of Central Florida, Orlando, where he serves as a Professor of Mathematics. His research interests are in summability theory and sequence spaces, Fourier analysis and wavelets, frame and approximation theory, variational inequalities and optimization theory, and harmonic functions and complex analysis. He has written over 150 research papers in refereed journals. His current research interests are cybersecurity and graph theory. In addition to the journal papers, he has written many book chapters, edited seven monographs/proceedings of conferences, and written two books: one on fuzzy differential equations and the other on biomedical statistics with computing. He serves as a member of the editorial board of five journals in Mathematics.

Manish Kumar is presently working as an assistant professor in the Department of Mathematics at the Birla Institute of Technology and Science, Pilani, Hyderabad campus, Hyderabad, Telangana, India. Dr. Kumar obtained his Master of Science in Mathematics from Banaras Hindu University, Varanasi, and his Ph.D. from the Department of Applied Mathematics at the Indian School of Mines, Dhanbad, and has received various awards. He is guiding several undergraduate students and has published various research papers in national and international journals of repute. Dr. Kumar chaired a session at an international congress hosted by the Department of Mathematics, Faculty of Arts and Science, in Bursa, Turkey, and also organized a symposium in ICNAAM 2013 at Rhodes, Greece. He is a member of several national and international professional bodies and societies, and has delivered invited talks at several national and international conferences, including a recent talk on “Two stage hyper-chaotic system based image encryption in wavelet packet domain for wireless communication systems” at ICM 2018 in Rio de Janeiro, Brazil. Dr. Kumar’s research areas are pseudo-differential operators, distribution theory, wavelet analysis and its applications, digital image processing, and cryptography.

Corresponding author

Correspondence to Rachid Ait Maalem Lahcen .

Ethics declarations

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Maalem Lahcen, R.A., Caulkins, B., Mohapatra, R. et al. Review and insight on the behavioral aspects of cybersecurity. Cybersecur 3 , 10 (2020). https://doi.org/10.1186/s42400-020-00050-w


Received : 06 October 2019

Accepted : 10 March 2020

Published : 21 April 2020



  • Cybersecurity
  • Behavioral aspects
  • Crime theories


Sensors (Basel)


Intelligent Techniques for Detecting Network Attacks: Review and Research Directions

Malak Aljabri

1 Computer Science Department, College of Computer and Information Systems, Umm Al-Qura University, Makkah 21955, Saudi Arabia; shmotiri@uqu.edu.sa

2 SAUDI ARAMCO Cybersecurity Chair, Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia; 2180007084@iau.edu.sa (S.M.); 2180007105@iau.edu.sa (F.M.A.); 2180007190@iau.edu.sa (M.A.); 2180002223@iau.edu.sa (H.S.A.)

Sumayh S. Aljameel

3 Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia; saljameel@iau.edu.sa

Rami Mustafa A. Mohammad

4 Department of Computer Information Systems, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia; rmmohammad@iau.edu.sa

Sultan H. Almotiri

Samiha Mirza, Fatima M. Anis, Menna Aboulnour, Dorieh M. Alomari

5 SAUDI ARAMCO Cybersecurity Chair, Department of Computer Engineering, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia; 2180007089@iau.edu.sa (D.M.A.); 2180007125@iau.edu.sa (D.H.A.)

Dina H. Alhamed

Hanan S. Altamimi

The significant growth in the use of the Internet and the rapid development of network technologies are associated with an increased risk of network attacks. Network attacks refer to all types of unauthorized access to a network, including any attempts to damage and disrupt the network, often leading to serious consequences. Network attack detection is an active area of research in the cybersecurity community. In the literature, there are various descriptions of network attack detection systems involving various intelligent-based techniques, including machine learning (ML) and deep learning (DL) models. However, although such techniques have proved useful within specific domains, no single technique has proved effective against all kinds of network attacks. This is because some intelligent-based approaches lack the essential capabilities that would make them reliable systems able to confront different types of network attacks. This was the main motivation behind this research, which evaluates contemporary intelligent-based research directions to address the gap that still exists in the field. The main components of any intelligent-based system are the training datasets, the algorithms, and the evaluation metrics; these were the main benchmark criteria used to assess the intelligent-based systems included in this research article. This research provides a rich source of references for scholars seeking to determine their scope of research in this field. Furthermore, although the paper presents a set of suggestions about future research directions, it leaves the reader free to derive additional insights about how to develop intelligent-based systems to counter current and future network attacks.

1. Introduction and Background

Rapid advancements in technology have made the Internet easily accessible, and it is now actively used by the majority of people for a plethora of professional and personal tasks. Various sensitive activities, including communication, information exchange, and business transactions, are carried out over the Internet. The Internet helps foster connection and communication, but the integrity and confidentiality of these connections and information exchanges can be violated and compromised by attackers who seek to damage and disrupt network connections and network security. The number of attacks targeting networks is increasing over time, creating a need to analyze and understand them and to develop more robust security protection tools. Every organization, industry, and government requires network security solutions to protect it from the ever-growing threat of cyber-attacks. The need for more effective and stable network security systems to protect business and client data is rising, as no network is immune to attack.

Several techniques have been proposed over the years to handle and classify network traffic attacks. One is the port-based technique, which identifies applications by matching port numbers against those registered with the Internet Assigned Numbers Authority (IANA) [ 1 ]. However, as the number of applications has grown, the number of unpredictable ports has increased and this technique has proven ineffective; it also fails to account for applications that do not register their ports with the IANA or that use dynamic port numbers. Another proposed technique is the payload-based technique, also known as deep packet inspection (DPI), where the network packet contents are observed and matched against an existing set of signatures stored in a database [ 1 ]. This method provides more accurate results than the port-based technique but does not work on network applications using encrypted data. Furthermore, it has proven to be complex, involving high computational costs and a high processing load [ 1 ]. Behavioral classification techniques analyze the entire network traffic received at the host in order to identify the type of application [ 2 ]. The network traffic patterns can be analyzed graphically as well as by examining heuristic information, for example, transport layer protocols and the number of distinct ports contacted. Although behavioral techniques yield good results, as they are able to detect unknown threats, they are resource-intensive and prone to false positives. Another technique, called the rationale-based or statistical technique [ 2 ], examines the statistical characteristics of traffic flow, namely the number of packets and the maximum, mean, and minimum packet size. These statistical characteristics are used to identify different applications, since these measurements are unique to each application. However, there is a growing need to combine this approach with techniques that can improve accuracy and speed up the classification of statistical patterns. Correlation-based classification [ 2 ] accumulates packets into flows; that is, it collects data packets with the same source and destination IP, port, and protocol, and classifies them according to the correlation between network flows. Multiple flows are usually accumulated further into a Bag of Flows (BoF). Although this technique performs better than statistical techniques, as it overcomes the problem of feature redundancy, it carries a high computational overhead for feature matching. Therefore, the need to create techniques that can overcome these rising challenges persists.
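As an illustration of the statistical and correlation-based approaches just described, the following Python sketch (with entirely synthetic packet records) groups packets into flows by their 5-tuple and computes the per-flow statistics mentioned above: the number of packets and the minimum, mean, and maximum packet size.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical packet records: (src_ip, dst_ip, src_port, dst_port, proto, size_bytes).
packets = [
    ("10.0.0.1", "10.0.0.9", 5353, 80, "TCP", 60),
    ("10.0.0.1", "10.0.0.9", 5353, 80, "TCP", 1500),
    ("10.0.0.2", "10.0.0.9", 4444, 53, "UDP", 512),
    ("10.0.0.1", "10.0.0.9", 5353, 80, "TCP", 40),
]

def flow_features(packets):
    """Group packets into flows by 5-tuple and compute per-flow size statistics."""
    flows = defaultdict(list)
    for *five_tuple, size in packets:
        flows[tuple(five_tuple)].append(size)
    return {
        flow: {
            "packets": len(sizes),
            "min_size": min(sizes),
            "mean_size": mean(sizes),
            "max_size": max(sizes),
        }
        for flow, sizes in flows.items()
    }

features = flow_features(packets)
```

A real classifier would feed feature dictionaries like these, computed over many flows (or bags of flows), into the learning algorithms discussed in the next paragraphs.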

At the onset of the 21st century, the concepts of intelligent techniques, namely machine learning (ML) and deep learning (DL), became widespread. Researchers widely acknowledged that these techniques could greatly increase computational potential, since they use statistical methods and data to enable computers to reason in ways that resemble human thinking. Hence, these intelligent techniques started being used by computer scientists in network security, as they addressed the limitations of the non-intelligent techniques. In the field of network security, ML or DL algorithms can be trained with network data to recognize traffic as normal or malicious and thus protect the network from intruders. Furthermore, the algorithms can be trained to identify the attack type if the network traffic is malicious and to trigger appropriate action to prevent the attack. By analyzing past cyber-attacks, the model can be taught to prepare individual defensive reactions. These applications of intelligent methods in network security, which are the focal point of this research paper, can be useful in big businesses, organizations, law enforcement agencies, and banks that store sensitive information, as well as in personal networks.
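To make the training idea concrete, here is a minimal, self-contained sketch of learning from labeled flow features. A simple nearest-centroid rule stands in for the ML/DL models discussed above, and every feature value and label is synthetic.

```python
from math import dist

# Synthetic labeled flows: features are (mean packet size, packets per second).
train = [
    ((60.0, 5.0), "benign"),
    ((80.0, 8.0), "benign"),
    ((1400.0, 900.0), "malicious"),  # flood-like traffic
    ((1300.0, 850.0), "malicious"),
]

def centroids(samples):
    """'Training' step: average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}

def classify(features, cents):
    """Assign the class whose centroid is nearest (Euclidean distance)."""
    return min(cents, key=lambda label: dist(features, cents[label]))

cents = centroids(train)
```

In practice the same train-then-classify pattern is followed with far richer features and models (SVM, RF, neural networks), but the workflow the text describes is exactly this: fit on labeled traffic, then label unseen flows.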

In the past, most of the developed network attack detection techniques actively depended on a set of pre-defined signature-based attacks. This was a major setback since the database of the attacks needed to be constantly updated as the attackers found new ways to exploit network security. However, with the evolution of intelligent-based techniques such as ML and DL, the predictive accuracy of identifying and classifying network attacks has been greatly improved. Therefore, using intelligent-based techniques in network security is a thriving field for research that needs to be explored.

Although several review articles exploring how intelligent-based systems have been applied to detect network attacks have been published in the last few years, none have been found that are as comprehensive as this article. This article covers almost one hundred research articles produced from 2010 to 2021 on a range of network attacks. It will provide clear insights into the race between developing intelligent systems to counter network attacks and how these attacks have evolved to circumvent intelligent systems, thus highlighting gaps in the research and indicating potential future research areas. This research also applied a different taxonomy that, to the best of our knowledge, has not been used in any previous research. It sets up several criteria against which the articles being reviewed could be assessed and compared including:

  • (i) Which classification algorithm(s) were implemented?
  • (ii) Which dataset(s) were employed for developing the intelligent systems?
  • (iii) Which evaluation metrics were used to compare the results obtained?

It then discusses the answers to the following main questions:

  • (i) Which algorithm(s) was/were commonly implemented and in which kind of attacks?
  • (ii) Which dataset(s) is/are considered more reliable based on the results obtained?

The resulting comparisons and discussions will help future researchers to identify the directions to take in their research, that is, to either improve the intelligent-based algorithms or consider other algorithms, to identify the features that should be added or removed when building the training dataset, and to indicate the evaluation metrics that should be adopted to evaluate the created intelligent systems.

The outcomes of this paper provide valuable directions for further research and applications in the field of applying effective and efficient intelligent techniques in network analytics.

This article is organized into four sections. The first section provides an introduction and background to the research area. A brief overview of network attacks is presented in Section 2 . Section 3 discusses intelligent network attack mitigation techniques where all the reviewed research papers, the network attacks they address using ML and DL techniques, and their findings are presented. Finally, the last section provides a discussion of the findings and the ideas presented in the papers reviewed and sets out promising research directions.

2. Network Attacks

For decades, networking technologies have been used to improve data transfer and circulation. Their continuous improvements have facilitated a wide range of new services.

The Internet of Things (IoT) is a powerful tool for improving communication by connecting different devices to the Internet and collecting data. The information gathered assists firms in analyzing and forecasting consumer behavior to enhance the quality of their products. Nowadays, ML and DL are being used to construct network systems that can conduct advanced analytics and automation. This technology is transforming users’ networking experiences by applying built-in algorithms that simulate human intellect to the gathered data [ 3 ].

The emerging cloud computing technologies have brought about remarkable evolutions in network technology where different applications, services, and computing and storage resources are offered on demand to a large number of users via the Internet, thus offering tremendous advantages including flexibility, minimal administrative efforts, cost effective resource utilization, high accessibility, efficiency, and reliability [ 4 ].

A new global wireless standard is the 5th generation (5G) mobile network, a network type that connects essentially anything, including machines, objects, and gadgets. Not only does 5G offer faster speeds and a greater number of linked devices, it also enables network slicing: the practice of running several virtual networks on the same physical network infrastructure to create subnetworks that meet the demands of various applications. From entertainment and gaming to schooling and community safety, 5G network technology has the potential to transform virtually any domain, providing higher download rates, real-time responses, and improved connectivity over time, allowing companies and consumers to explore new innovations [ 5 ].

Such an exponential growth in network technologies has offered many advantages and has greatly improved communications. However, each emerging network technology presents new security challenges and triggers the need for the development of detection tools and countermeasures to meet the new demands. The following subsections briefly discuss the main types of network attacks.

2.1. Types of Network Attacks

A network attack is an attempt to harm, expose, alter, destroy, steal, or gain unauthorized access to a network system resource. The attack can come from inside (internal attack) or outside (external attack). Table 1 lists and describes a number of different types of network attacks that disrupt communication, classifying them as active or passive attacks, bitcoin attacks, account attacks, or security breaches [ 6 ].

Table 1. Types of network attacks.

2.2. Network Attack Detection and Prevention Techniques

Security and defense systems are designed to identify, defend against, and recover from network attacks. Confidentiality, availability, and integrity are the three primary aims of network security systems. Network intrusion detection and prevention techniques can be classified based on whether they detect network threats, prevent them, or do both. These techniques are implemented as software, hardware, or a combination of the two, and fall into two classes: intrusion detection systems (IDS) and intrusion prevention systems (IPS) [ 6 , 7 ].

  • Intrusion Detection System (IDS): Referred to also as network-based IDS (NIDS). This system intensely monitors malicious network activities and notifies officials if an attack is detected with no prevention abilities. Signature-based and anomaly-based detection are the two most prevalent approaches used by IDS to identify threats. Signature-based procedures are applied to detect only known threats, relying on a database containing a list of pre-existing characteristics of known attacks (attacks signatures) to identify suspicious events. The database needs to be continuously updated to include emerging attacks. On the other hand, anomaly-based procedures attempt to differentiate malicious traffic from real traffic based on a change in the network traffic; thus, they can detect unknown threats. Inconsistencies such as high-size traffic, network latency, traffic from uncommon ports, and abnormal system performance, all represent changes in the normal behaviors of the system and can indicate the presence of network attacks.
  • Intrusion Prevention System (IPS): Also known as an intrusion detection and prevention system (IDPS). It continuously scans the network for illegal or rogue control points, detected on the basis of behavioral changes, and automatically takes countermeasures to tackle the threats and defend the system. The primary objective of an IDPS is to keep malicious or undesired packets and attacks from causing any harm. An IDPS is more effective than an IDS because it not only detects threats but also takes action against them. There are two types of IDPS: network-based intrusion detection and prevention systems (NIDPS), which analyze network protocols to identify suspicious activities, and host-based intrusion detection and prevention systems (HIDPS), which monitor host activities for suspicious events within the host.
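The contrast between the two detection modes can be sketched in a few lines of Python; the signature database, traffic statistics, and thresholds below are hypothetical examples, not drawn from any cited system:

```python
# Minimal sketch contrasting signature-based and anomaly-based IDS detection.
# The signature set and traffic baseline are hypothetical.

SIGNATURES = {"payload contains ../../", "SYN flood pattern A"}  # known-attack signatures

def signature_match(event_signature):
    """Signature-based: flag only events whose signature is already in the database."""
    return event_signature in SIGNATURES

def anomaly_score(packets_per_sec, baseline_mean, baseline_std):
    """Anomaly-based: measure deviation of traffic volume from the learned baseline."""
    return abs(packets_per_sec - baseline_mean) / baseline_std

# A previously unseen flood is missed by signatures but caught by the anomaly rule.
assert not signature_match("unknown zero-day pattern")
assert anomaly_score(9000, baseline_mean=1000, baseline_std=200) > 3  # far beyond 3 sigma
```

This also illustrates why the two approaches are complementary: the signature check never flags novel attacks, while the anomaly score flags anything sufficiently far from normal behavior, known or not.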

To identify attacks effectively and efficiently, a variety of detection approaches are constantly being developed based on intelligent techniques including ML and DL, which have recently gained immense popularity in the network security field.

3. Intelligent Network Attack Mitigation Techniques

In this section, research studies that used intelligent models to detect different cyber-attack types are reviewed and their findings summarized. Several ML algorithms have been used in these studies, including classification, regression, and clustering techniques such as logistic regression (LR) and decision trees (DT), the latter visually representing the sequence of the decision-making process in the form of a tree. Some studies used random forest (RF), an ensemble of decision trees. Support vector machine (SVM) was widely used in classification due to its ability to distinctly classify the data points by building a hyperplane in an n-dimensional space, where n represents the number of features. Another widely used ML classifier is naïve Bayes (NB), a supervised learning model that applies Bayes' theorem of probability. Finally, some researchers have used K-nearest neighbor (KNN) for classification and K-means clustering, an unsupervised approach. Further details about these algorithms can be found in [ 8 ].
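As a rough illustration of how these classical algorithms are applied in practice, the following hedged sketch fits each of them on a small synthetic "attack vs. benign" dataset with scikit-learn; real studies use far larger, domain-specific datasets and tuned hyperparameters:

```python
# Illustrative sketch: the classical ML classifiers named above fitted on a toy,
# synthetic binary dataset (stand-in for "attack vs. benign" flow records).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),  # ensemble of DTs
    "SVM": SVC(kernel="rbf"),                 # hyperplane in feature space (via kernel)
    "NB": GaussianNB(),                       # Bayes' theorem with independence assumption
    "KNN": KNeighborsClassifier(n_neighbors=5),  # majority vote of nearest samples
}
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
```

The resulting `scores` dictionary holds held-out accuracy per model, mirroring the kind of side-by-side comparison many of the reviewed papers report.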

DL is a subset of ML, which is in turn a subset of artificial intelligence (AI). A number of DL techniques have been used to build the detection models in some studies, primarily the artificial neural network (ANN), an information-processing system consisting of several layers that works best with non-linear dependence, and the recurrent neural network (RNN), a type of ANN that contains a memory function to maintain previous content. Another commonly used DL technique is the convolutional neural network (CNN), also a type of ANN, which mimics human vision. Furthermore, the deep neural network (DNN), a supervised learning type of ANN that finds the correct mathematical manipulation to turn input into output, has been used by some authors. Long short-term memory (LSTM), a type of RNN designed to model temporal sequences more accurately, and the multi-layer perceptron (MLP), a type of ANN consisting of many layers in directed graphs, have also been widely used. Finally, the gated recurrent unit (GRU), a variant of LSTM considered more efficient because it uses comparatively less memory and executes faster, has also been used. More information about the mentioned algorithms can be found in [ 9 ].
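To make the GRU/LSTM efficiency remark concrete, here is an illustrative NumPy sketch of a single GRU cell step (toy shapes and random weights, not any cited model); a GRU needs three gate weight blocks where an LSTM needs four, which is where its memory and speed advantage comes from:

```python
# Illustrative numpy sketch of one GRU cell step with toy dimensions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Wr, Wh, Uz, Ur, Uh):
    z = sigmoid(x @ Wz + h_prev @ Uz)               # update gate
    r = sigmoid(x @ Wr + h_prev @ Ur)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h_prev) @ Uh)   # candidate state
    return (1 - z) * h_prev + z * h_tilde           # interpolated new hidden state

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
# Three input-weight blocks and three state-weight blocks (an LSTM would need four each:
# input, forget, output, and cell gates).
weights = [rng.normal(size=s) for s in [(d_in, d_h)] * 3 + [(d_h, d_h)] * 3]
h = gru_step(rng.normal(size=(1, d_in)), np.zeros((1, d_h)), *weights)
```

Stacking such steps over a packet or log sequence is what lets GRU/LSTM models capture the temporal patterns these detection papers rely on.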

3.1. Problem Domains of the Reviewed Articles

The papers were classified according to the cyber-attack type on which they focused. The attack types discussed in this section are insider threats, DDoS attacks, zero-day attacks, phishing attacks, malware attacks, and botnet attacks. We then reviewed articles that did not target specific attacks but instead aimed to identify attacks in IoT networks, classify malicious traffic into different attack types, or identify attacks at the DNS level. Finally, we also mention papers targeting the detection of network intrusions in general.

3.1.1. Insider Threat

Cybersecurity measures have tended to focus on threats from outside an organization rather than insider threats, which can be equally harmful. Therefore, researchers have started to look at different techniques to identify insider threats. Tuor et al. [ 10 ] built a model using principal component analysis (PCA) for feature selection and unsupervised models, including DNN, RNN, SVM, isolation forest, and the variants DNN-Ident, DNN-Diagnosis, LSTM-Ident, and LSTM-Diagnosis, that use system logs to detect anomalous activities in the network. The dataset used was the synthetic CERT insider threat v6.2 dataset [ 11 ], taken from the event log lines of a simulated organization's computer network. The researchers targeted two prediction approaches: the "next time step" and the "same time step". The experiments showed that the "same time step" approach resulted in higher performance and that the isolation forest model was the strongest model. Recall was used for evaluation, and DNN-Diagnosis, LSTM-Diagnosis, and the isolation forest model all obtained 100% recall. In future work, the researchers plan to apply the proposed model to a wider range of streaming tasks and explore different granularities of time.

Similarly, LSTM and CNN techniques were used by Yuan et al. [ 12 ] to build a model to detect insider threats. They applied the model to the CERT insider threat v4.2 dataset [ 13 ], which contained 32 M log lines, among which 7323 were anomalous activities. The advantage of this version of the CERT dataset is that it contains more samples of insider threats than other versions. The train–test split was 70–30%. The researchers first used LSTM to extract user behavior, abstract temporal features, and produce the feature vectors. They then transformed the feature vectors into fixed-size matrices. Finally, CNN was used to classify the feature matrices as anomalous or normal. The proposed model achieved an area under the curve (AUC) of 94.49%.

Hu et al. [ 14 ] used DL methods to build a user authentication model based on characteristics of mouse behavior, which could be used to monitor user authentication and detect insider threats. They used an open-source dataset called the Balabit Mouse Dynamics Challenge dataset [ 15 ] and the CNN algorithm. CNN showed high performance in user authentication based on mouse features, with a false acceptance rate (FAR) of 2.94% and a false rejection rate (FRR) of 2.28%.

3.1.2. DDoS Attacks

One of the most harmful threats in network security is the distributed denial of service (DDoS) attack, which attempts to disrupt the availability of services. Since DDoS is easy to launch but not easy to detect, as in most cases the attack traffic is very similar to legitimate traffic, some researchers have focused solely on detecting these attacks using different ML approaches.

Yuan et al. [ 16 ] proposed DeepDefense, a DL-based DDoS attack detection approach that can learn network traffic sequence patterns and trace network attack activities. They used the UNB ISCX intrusion detection evaluation 2012 (ISCX2012) dataset [ 17 ] and the RNN algorithm to build the model. From ISCX2012, the team extracted 20 network traffic fields to generate a 3-D feature map using a sliding time window. Data14 and data15 were extracted from ISCX2012, containing 9.6 M and 34.9 M packets, respectively; the total numbers of training samples were 15,176 and 233,450, respectively. The experimental results showed that the DL models reduced the error rate by 39.69% compared to ML methods on a small dataset; for large datasets, the reduction in the error rate ranged from 2.103% to 7.517%. For future work, they suggested increasing the diversity of DDoS vectors and system settings to test the DeepDefense model, as well as comparing DeepDefense against other ML algorithms.

A study proposing a model for analyzing and detecting DDoS attacks at the network and service levels of the bitcoin ecosystem was carried out by Baek et al. [ 18 ]. The dataset consisted of real DDoS attacks [ 19 ] and contained the affected service, date of the attack, category of service, number of posts, etc. From the bitcoin block data, the researchers extracted statistics such as the maximum, minimum, sum, and standard deviation, and used PCA for feature extraction. MLP was used to detect DDoS, with the training, validation, and testing sets divided in the ratio 6:2:2. The results showed that the accuracy of DDoS attack detection was about 50% and the accuracy of classifying normal block data was about 70%, with the number of epochs set to 100. In future work, the researchers wish to determine how to extract the features that characterize the blocks created when a DDoS attack occurs.
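The 6:2:2 train/validation/test split used by Baek et al. can be sketched as follows (the sample values are synthetic placeholders):

```python
# Minimal sketch of a shuffled 6:2:2 train/validation/test split.
import random

samples = list(range(1000))       # placeholder records
random.seed(0)
random.shuffle(samples)           # shuffle before splitting to avoid ordering bias

n = len(samples)
train = samples[: int(0.6 * n)]            # 60% for training
val = samples[int(0.6 * n): int(0.8 * n)]  # 20% for validation
test = samples[int(0.8 * n):]              # 20% for testing
```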

Sabeel et al. [ 20 ] used DNN and LSTM for binary prediction of unknown DoS and DDoS attacks. To train the models, they used the CICIDS2017 dataset (size 283 MB) [ 17 ]. For testing, a new dataset called ANTS2019 (size 330 MB), which mimics real-life attacks, was generated in a simulated environment to measure performance. In feature engineering, 78 features were used for the training set and 77 for testing (the 'Fwd Header length' feature was dropped). The train–test split was 75–25%. When the model was trained using CICIDS2017 and part of ANTS2019, the highest evaluation accuracy of 99.68% was obtained for DNN. When the researchers retrained the models on a dataset with new unknown attacks, the true positive rates (TPR) obtained were 99.8% and 99.9% for DNN and LSTM, respectively. It was concluded that, to maintain performance, the models should be updated with new attacks at regular intervals.

An intrusion detection system (IDS) against DDoS attacks called DDoSNet, a combination of an autoencoder (AE) with an RNN, was built by Elsayed et al. [ 21 ]. The researchers evaluated their classifier using the newly released CICDDoS2019 dataset [ 22 ], which contains 80 flow features. For feature engineering, PCA was applied, reducing the input to 77 features. The total numbers of samples in the training, validation, and testing sets were 161,523, 46,150, and 23,000, respectively. When the model was evaluated, the results indicated an accuracy of 99%, outperforming all compared ML methods: SVM, DT, NB, RF, Booster, and LR. In future work, the researchers intend to test the performance of their model on different datasets and extend the work to multiclass classification, since a binary classification framework was applied in this research.

A model that exploited the characteristics of CNN to classify traffic flows as either benign or malicious was proposed by Doriguzzi-Corin et al. [ 23 ]. The CICIDS2018, CICIDS2017, and ISCX2012 datasets, which can be obtained through the Canadian Institute for Cybersecurity of the University of New Brunswick (UNB), were used by the researchers. They extracted 37,378 DDoS flows and 37,378 randomly selected benign flows from ISCX2012, then repeated the process for CICIDS2017 with 97,718 benign and 97,718 DDoS flows, and again for CICIDS2018 [ 17 ] with 360,832 benign and 360,832 DDoS flows. Following the pre-processing phase, each dataset was split 90–10% into train and test sets. The results showed accuracies of 99.87%, 99.67%, and 98.88% on the respective datasets. The UNB201X dataset was then constructed by combining splits from every year, and the model's accuracy on UNB201X was 99.46%. In future work, the researchers would like to optimize the pre-processing tool rather than the detection model, and to extend the dataset's labels.

Ahuja et al. [ 24 ] used various DL algorithms to detect DDoS attacks: CNN, RNN, LSTM, CNN-LSTM, support vector classifier-self-organizing map (SVC-SOM), and stacked autoencoder-multi-layer perceptron (SAE-MLP). The team used the dataset provided by leading India Project Mentor [ 25 ], which consists of 22 features. Two different optimizers were used: stochastic gradient descent (SGD) for the first 10 epochs and Adam for the next 150 epochs. For unencrypted networks, traffic features can be extracted automatically using a CNN. Finally, they evaluated the model using the following metrics: accuracy, precision, recall, F-score, false positive rate (FPR), and false negative rate (FNR). The highest classification accuracy of 99.75% was achieved with the SAE-MLP algorithm.
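The two-phase optimizer schedule (SGD for the first 10 epochs, then Adam) can be illustrated on a toy 1-D problem; this is a hedged sketch of the schedule only, not of the cited models:

```python
# Toy illustration of switching optimizers mid-training: SGD for 10 epochs,
# then Adam for 150 epochs, minimizing loss(w) = w**2.
import numpy as np

def grad(w):                # gradient of loss(w) = w**2
    return 2.0 * w

w = 5.0
m = v = 0.0                 # Adam first/second moment estimates
lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8
for epoch in range(1, 161):
    g = grad(w)
    if epoch <= 10:                         # phase 1: plain SGD update
        w -= lr * g
    else:                                   # phase 2: Adam update
        t = epoch - 10                      # Adam step counter for bias correction
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
```

SGD makes cheap early progress, while Adam's per-parameter adaptive step sizes then refine the solution; that is the usual motivation for such a schedule.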

A study conducted by Shi et al. [ 26 ] focused on using DL for both packet-wise and period-wise DDoS traffic attack detection. They proposed DeepDDoS, a model that leverages a DL approach for DDoS detection. It used Spark as a big-data processing framework, and maximal information coefficient and mutual information for feature selection. The LSTM model was used for the training phase due to its better performance on longer sequences. The proposed work aimed to filter out abnormal flows at the lowest computational cost. The dataset used was CICIDS2017 (size 283 MB). The results showed that the model achieved over 99% accuracy when receiving five packets of a continuous flow.

A model that used DL for the detection of multi-vector DDoS on a software-defined network was constructed by Niyaz et al. [ 27 ]. An SAE-based DL approach was applied, and the team collected network traffic from a real network (packets for normal traffic were captured from a network connected to the Internet) and a private network (packets with DDoS attacks were captured from a private lab network) to evaluate the model. They divided the dataset files into training and testing sets and normalized them using max–min normalization. For comparison, models with soft-max and neural networks (NN) were also developed. The results showed that SAE performed better than the soft-max and NN models, achieving 95.65% accuracy. The researchers intend to develop a NIDS in the future to detect DDoS along with other attacks, as well as to use DL for feature extraction from raw bytes.

Pande et al. [ 28 ] aimed to build an ML model to detect DDoS attacks. To build the proposed model, a DDoS attack was performed using the ping-of-death technique and detected using RF. The researchers used the NSL-KDD dataset [ 29 ], containing a training set of 125,973 records and a testing set of 22,544 instances, with 41 attributes. The building time of the model was 8.71 s and the testing time was 1.28 s. The proposed RF model resulted in 99.76% accuracy. For future work, the researchers will apply DL techniques to classify the instances.

Radivilova et al.’s [ 30 ] goal was to analyze the main methods of identifying DDoS attacks through network traffic using the SNMP-MIB dataset [ 31 ]. They used RF as the classification method. The experiments began with the training and evaluation of a time series classifier. Recurrence analysis was used to extract features and the Hurst exponent was set at 10 intervals during the experiment. The main evaluation metrics were accuracy, FNR, and TPR. A numerical experiment showed that early detection is plausible when the average attack ratio represents 15–20% of the average traffic.
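A rescaled-range (R/S) estimate of the Hurst exponent, the kind of self-similarity feature used in such traffic analyses, can be sketched as follows; the window sizes and test series are illustrative choices, not those of the cited study:

```python
# Illustrative rescaled-range (R/S) Hurst-exponent estimate in numpy.
import numpy as np

def hurst_rs(series, window_sizes=(8, 16, 32, 64)):
    series = np.asarray(series, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        rs_vals = []
        for start in range(0, len(series) - n + 1, n):   # non-overlapping windows
            w = series[start:start + n]
            dev = np.cumsum(w - w.mean())                # cumulative deviation from mean
            r = dev.max() - dev.min()                    # range of cumulative deviation
            s = w.std()                                  # scale (standard deviation)
            if s > 0:
                rs_vals.append(r / s)
        log_n.append(np.log(n))
        log_rs.append(np.log(np.mean(rs_vals)))
    return np.polyfit(log_n, log_rs, 1)[0]               # log-log slope = Hurst estimate

rng = np.random.default_rng(1)
h_noise = hurst_rs(rng.normal(size=1024))                # uncorrelated noise: H near 0.5
```

Normal aggregated traffic tends to show long-range dependence (H above 0.5), so a shift in the estimated exponent during an attack is the signal such features try to capture.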

Likewise, Filho et al. [ 32 ] presented a smart detection system for DoS attacks using ML. The goal was to detect both high- and low-volume DDoS attacks. The researchers tried RF, perceptron, AdaBoost, DT, SGD, and LR. Since RF achieved the highest precision while using 28 variables, it was used for classifying the network traffic. The evaluation of the proposed system was based on four intrusion detection benchmark datasets, namely CICIDS2017, CICDoS2017 [ 33 ], CICIDS2018, and customized datasets. Recall, precision, and F-measure (F1) were used to evaluate the model. On the CICIDS2018 and CICDoS2017 datasets, the proposed system achieved precision and a detection rate (DR) of more than 93% with a false alarm rate (FAR) of less than 1%. The researchers intend to include an analysis of Heartbleed and brute-force attacks in future work and to develop methods for correlating triggered alarms.

Correspondingly, Vijayanand et al. [ 34 ] proposed a system for detecting novel DoS attacks using multi-layer deep algorithms arranged in hierarchical order to detect attacks accurately by analyzing smart-meter network traffic. The suggested technique addresses issues arising from the large amount of input data and the complexity of the input features. To evaluate the designed model, 9919 records from the CICIDS2017 dataset were used. The performance of the proposed system was analyzed by comparing it with simple multi-layer DL algorithms and hierarchical SVM algorithms, which obtained efficiency values of 39.78% and 99.99%, respectively.

An improved rule induction (IRI)-based model was put forth by Mohammed et al. [ 35 ] for detecting DDoS attacks. The UNSW-NB15 dataset [ 36 ] was used and, following under-sampling without replacement, further pre-processing, and correlation-based feature selection, the final dataset contained eight attributes. The suggested algorithm, called IRI for detecting DDoS attacks (IRIDOS), eliminates all insignificant items during model creation and reduces the search space for creating the classification rules. Furthermore, the algorithm stops learning a rule after reaching a 'rule-power' threshold. The proposed technique was also evaluated on 13 datasets from the UCI repository. IRI obtained an F1 score of 93.90% on UNSW-NB15. The model attained promising results, especially when compared to other data mining algorithms such as PRISM (a divide-and-conquer knowledge-based approach), PART (a rule-based classification algorithm), and OneRule (OR).

An evaluation and comparison of the performance of different supervised ML algorithms on the CAIDA DoS attack dataset [ 37 ] was carried out by Robinson and Thomas [ 38 ]. The other datasets used were CAIDA Conficker and KDD-99 [ 39 ]. The ML algorithms included NB, RF, MLP, BayesNet, J48, IBK, and Voting. Since the CAIDA Conficker dataset contained DDoS attacks generated from large botnets with easily distinguishable flooding-attack vectors, all ML algorithms except NB achieved an accuracy rate of more than 99% on this dataset.

Research using the same CAIDA data was conducted by Barati et al. [ 40 ], who developed a hybrid ML technique to detect DDoS attacks. The CAIDA UCSD 2007 dataset was used for attack traffic, as it contains an hour of anonymized traces from a DDoS attack on 4 August 2007. For normal traffic, the CAIDA Anonymized 2013 dataset was used, as it contains passive traces from CAIDA passive monitors in 2013. A genetic algorithm (GA) and an ANN were used for feature selection and attack detection, respectively; the wrapper method based on GA was applied to select the most efficient features. The attack detection method was improved by deploying the MLP form of ANN. While building the model, 10-fold cross-validation was used. The results showed that the proposed method obtained an excellent AUC of 99.91%. The researchers' future work will include more experiments to assess the robustness of the model on different datasets.
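Wrapper-style feature selection with a GA can be sketched as below; the bit-string encoding and elitist loop follow the standard GA pattern, while the fitness function here is a toy stand-in for the classifier-based evaluation used in the study (the "useful" feature set is hypothetical):

```python
# Hedged sketch of GA wrapper feature selection: bit-strings encode feature
# subsets; fitness here is a synthetic score, not a real classifier.
import random

N_FEATURES = 8
USEFUL = {0, 2, 5}          # hypothetical informative features

def fitness(mask):          # toy score: reward useful features, penalize subset size
    return sum(mask[i] for i in USEFUL) - 0.1 * sum(mask)

def crossover(a, b):        # single-point crossover of two parent bit-strings
    cut = random.randrange(1, N_FEATURES)
    return a[:cut] + b[cut:]

def mutate(mask, rate=0.1):  # flip each bit with small probability
    return [bit ^ (random.random() < rate) for bit in mask]

random.seed(42)
pop = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(20)]
for _ in range(40):                         # generations
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                      # elitist truncation selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(10)]
    pop = parents + children
best = max(pop, key=fitness)
```

In a real wrapper, `fitness` would train and validate the downstream model (an ANN in Barati et al.'s case) on the candidate feature subset, which is what makes the wrapper approach accurate but expensive.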

Kim et al. [ 41 ] developed a CNN-based model for DoS attack detection. They used two datasets: KDD-99 and CICIDS2018. They generated two types of intrusion images, RGB and grayscale, and considered the number of convolutional layers and the kernel size when designing their CNN model. They performed both binary and multiclass classification. Moreover, the performance of the proposed model was evaluated by comparing it to an RNN model. The best results were achieved on the KDD-99 dataset by the CNN model, which showed 99% or more accuracy in both the binary and multiclass classifications; the RNN showed 99% accuracy in the binary classification. The CNN model proposed by the researchers was better able than the RNN model to identify specific DoS attacks with similar characteristics.
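The "intrusion image" idea, turning a flow-feature vector into pixels a CNN can consume, can be sketched as follows (the feature values are synthetic placeholders):

```python
# Minimal sketch: scale a flow-feature vector to 0-255 and reshape it into a
# square grayscale "intrusion image" suitable for a CNN input layer.
import numpy as np

features = np.array([12.0, 3400.0, 0.5, 80.0, 6.0, 1500.0, 0.0, 255.0, 7.0])

lo, hi = features.min(), features.max()
scaled = (features - lo) / (hi - lo)            # min-max normalize to [0, 1]
pixels = (scaled * 255).astype(np.uint8)        # 8-bit grayscale intensities
image = pixels.reshape(3, 3)                    # 3x3 single-channel image
```

An RGB variant would stack three such channels; the point of the representation is that convolutional filters can then pick up local co-occurrence patterns among neighboring features.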

Finally, an approach to detect DDoS attacks using GRU was presented by Rehman et al. [ 42 ]. The team produced a high-efficiency approach called DIDDOS to detect real-world DDoS attacks using GRU, a form of RNN. Different classification models, namely GRU, RNN, NB, and SMO, were applied to the CICDDoS2019 dataset. For DDoS classification in the case of reflection attacks, the highest accuracy level of 99.69% was achieved, while for exploitation attacks, the highest accuracy level of 99.94% was achieved using GRU.

3.1.3. Phishing Attacks

Some studies have focused on training and testing models to detect phishing attacks. For instance, the main goal of Alam et al. [ 43 ] was to defend against phishing attacks by developing a detection model using the RF and DT ML algorithms. For ML processing, a traditional phishing attack dataset from Kaggle containing 32 features was used. To analyze the dataset characteristics, the model used PCA, a dimensionality-reduction technique. RF reached an accuracy level of 97%, and with its low variance it also kept over-fitting under control. Future studies will include the prediction of phishing attacks from the registered attacks in a dataset by applying CNN and implementing an IDS.

To identify phishing website attacks, a self-structuring neural network based on ANN was developed by Mohammad et al. [ 44 ]. Phishing-related features are crucial for detecting phishing web pages, but they are extremely dynamic, so the structure of the network must be continually improved. The proposed approach addresses this issue by automating the network structuring process, demonstrating high tolerance for noisy input, fault tolerance, and significant prediction accuracy. This was accomplished by adapting the learning rate and expanding the hidden layer with additional neurons. The goal of the developed model was to obtain generalization ability, meaning that the training and testing classification accuracy should be as similar as possible. The dataset included 600 legitimate and 800 phishing websites, with 17 characteristics retrieved using the researchers' own tool [ 45 , 46 ]. The accuracies on the training, validation, and testing sets were 94.07%, 91.31%, and 92.18%, respectively, for 1000 epochs. The principle of the model was to use an adaptive scheme with four processes: structural simplicity, learning rate adaptation, structural design adaptation, and an early stopping approach based on validation faults.

Trial and error is one of the most popular techniques used to train a neural network, but it has a significant drawback: setting the parameters takes a very long time and might even require the assistance of a domain expert. Rather than trial and error, a better self-structuring neural network anti-phishing model, which makes it simpler to structure NN classifiers, was proposed by Thabtah et al. [ 47 ]. The goal of the technique was to build, from the training dataset, a structure large enough to develop models that generalize to the testing dataset. During the training phase, the algorithm dynamically modifies the structural parameters in order to generate accurate, non-overfitting classifiers. With a dataset of over 11,000 websites from UCI, the neural network characteristics were updated as the classification model was being built, depending largely on the computed error rate, intended error rate, and underlying technologies. When compared to Bayesian networks and DT, the findings indicated that the dynamic neural network anti-phishing model had higher prediction accuracy. The highest average accuracy achieved was 93.06%, when information gain was used for pre-processing.

A two-layered detection framework to identify phishing web attacks using features derived from domain and DNS packet-level data was built by Rendall et al. [ 48 ] using four ML models, namely MLP, SVM, NB, and DT. The team investigated an approach in which a phishing domain is classified multiple times, with additional classification carried out only when the score falls below a predefined confidence level set by the system owner. The model was evaluated on a dataset created by the team, containing 5995 phishing records and 7053 benign records. After applying the models in the two-layered architecture, the highest accuracy of 86% was achieved by MLP and DT.

Li et al. [ 49 ] built a stacking model using URL and HTML features to detect phishing web pages. They used lightweight HTML and URL features as well as HTML string embeddings to make real-time phishing detection possible. The researchers created and used two datasets: 50K-PD, containing around 49,947 samples, and 50K-IPD, containing 53,103 web page samples. The stacking model combined GBDT, XGBoost, and LightGBM in multiple layers, and achieved an accuracy of 97.30% on the 50K-PD dataset and 98.60% on the 50K-IPD dataset.

Phishpedia, an ensemble deep learning model described in [ 50 ], addresses major technological difficulties in phishing detection by identifying and matching brand logo variants. Three different datasets were used for this experiment: the researchers collected the first by subscribing to a service, the second from a top-ranked Alexa list, and the third, used to evaluate the detection model, from a benign dataset. The researchers chose a Siamese neural network because it converts an image into a vector, which assists in estimating the correlation between two visuals. Phishpedia achieved better accuracy at a lower runtime cost and, unlike many other approaches, requires no phishing data for training. With an accuracy of 99.2%, Phishpedia outperformed state-of-the-art approaches such as LogoSENSE, EMD, and PhishZoo by a large margin. In the future, the researchers plan to expand Phishpedia by adding a system to monitor phishing online.

Supervised machine learning models were used by Batnaru et al. [ 51 ] to detect phishing attacks based on a novel combination of features extracted from the URL. The researchers used a training dataset of 100,000 URLs, consisting of 40,000 benign URLs from Kaggle [ 52 ] and 60,315 phishing URLs from PhishTank [ 53 ]. They used five ML models, namely MLP, RF, SVM, NB, and DT. In terms of model selection, RF was found to be the best candidate based on F1 scores. The evaluation was performed on an unbalanced dataset of 305,737 benign URLs and 74,436 phishing URLs to assess the selected model in a realistic scenario; the achieved accuracy was 99.29%. The results were compared with the performance of Google Safe Browsing (GSB), the default protection available in popular web browsers, and the model outperformed GSB. In future work, the researchers aim to explore the effectiveness of their model on other datasets and experiment with more features. They also plan to assess the robustness of the methodology against the adversarial attacks commonly used by malicious parties.
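Lexical URL features of the kind such classifiers consume can be sketched in plain Python; this is an illustrative feature set, not the exact features of the cited studies:

```python
# Illustrative lexical URL features for phishing classification.
from urllib.parse import urlparse

def url_features(url):
    parsed = urlparse(url)
    host = parsed.netloc
    return {
        "url_length": len(url),
        "num_digits": sum(c.isdigit() for c in url),
        "num_subdomains": max(host.count(".") - 1, 0),
        "has_at_symbol": "@" in url,                  # '@' can disguise the real host
        "has_ip_host": host.replace(".", "").isdigit(),  # raw IP instead of a domain
        "uses_https": parsed.scheme == "https",
    }

benign = url_features("https://www.example.com/login")
suspect = url_features("http://192.168.10.5/secure@paypal.verify")
```

Feature vectors like these would then be fed to the classifiers discussed above (RF, MLP, SVM, etc.); because they need no page download or DNS lookup, they are cheap enough for real-time filtering.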

PhishDump, a new mobile app based on a combination of LSTM and SVM algorithms, was suggested by Rao et al. [ 54 ] to distinguish genuine from fake websites on mobile platforms. Because PhishDump concentrates on extracting characteristics of URLs, it offers important benefits over previous efforts, including quick calculation, class independence, and resistance to unintentional malware installation. The data were gathered from three separate sources: Alexa, OpenPhish, and PhishTank. A positive aspect of the application is that it is free of external code and databases, allowing malicious websites to be identified in as little as 621 ms. The characteristics extracted by the LSTM model are supplied as input to the SVM for URL classification, implemented in Python. Using several datasets, this application was compared against current baseline classifiers, and PhishDump surpassed all previous studies with an accuracy of 97.30%. The approach has limitations: an intruder might circumvent it by making structural modifications to the URL, and the system could miss phishing websites with shortened URLs.

Marchal et al. [ 55 ] reviewed phishing attack problems. The researchers provided guidelines for designing and evaluating phishing webpage detection techniques. They also presented the strengths and weaknesses of various design and implementation alternatives with regard to deployability and ease of use. Moreover, they provided a list of guidelines for evaluating proposed solutions covering the selection of representative ground truth, appropriate use of the dataset, and the relevant metrics. These recommendations also enable comparison of the accuracy of different phishing detection technologies. The researchers state that academic research in phishing detection should adopt design and evaluation methods that are relevant to real-world deployment.

Similarly, Das et al. [ 56 ] reexamined the existing research on phishing and spear phishing from the perspective of different security domains such as real-time detection, dataset quality, active attackers, and the base rate fallacy. They elucidated the challenges faced and surveyed the existing solutions to phishing and spear phishing. By examining all the existing research on phishing, their work helps guide the development of more robust solutions.

3.1.4. Zero-Day Attacks

Interestingly, some researchers have focused on identifying zero-day attacks. One such study was conducted by Beaver et al. [ 57 ], who used ML methods able to distinguish between normal and malicious traffic. In their study, they used the adaptive boosting (AdaBoost) ensemble learner with DT to distinguish and classify the type of traffic on the KDD-99 dataset. The implementation tested in this study had four levels: (1) a top-level model that caps the FPR; (2) a first internal model comprising the AdaBoost ensemble; (3) a second internal model implementing the DT; and (4) a lowest-level model that judges whether the traffic is normal, relying on an anomaly detection algorithm. The system was able to detect 82% of the attacks previously missed by the signature-based sensor, detected 89% of attacks it had not been trained to detect, and had a DR of 94% and a 1.8% false alarm rate. The researchers' future goals are to scale the performance, which will require more parallelism in the architecture, and to modify the training to accommodate larger datasets.

Ahmed et al. [ 58 ] proposed a DL model for identifying zero-day botnet attacks in real time using a feed-forward backpropagation ANN technique and a DNN. A reliable dataset is an important factor in obtaining high performance, hence the CTU-13 dataset [ 59 ] was obtained from the Botnet Capture Facility. There were nine input-layer features, and the dataset comprised 10,000 randomly chosen flows. The first step was to normalize the whole dataset, after which the Adam optimizer was applied in the model. The train–test split was 80–20%. The results showed that accuracies of over 99.6% were achieved after 300 epochs and that the model outperformed the NB, SVM, and backpropagation algorithms. In future work, the researchers suggest examining the efficiency of the proposed model on various other datasets.

3.1.5. Malware Attacks

Barut et al. [ 60 ] aimed to compare the ML algorithms, namely SVM, RF, and MLP, to determine the most accurate and the fastest method for detecting malware in encrypted traffic. Two datasets were generated: dataset1, produced using Stratosphere IPS [ 61 ] and covering 20 malware classes (Adload, Ransom, Trickbot, etc.), and dataset2, which used CICIDS2017. In feature engineering, 200 flow features were extracted and chi-square feature selection was applied. The researchers concluded that RF was the best-performing algorithm, with a DR of 99.996% and a FAR of 2.97%. Overall, all three models proved highly accurate, with some trade-offs: for dataset1, the RF model performed best across all evaluation metrics except prediction speed, which was higher with the SVM model, while for dataset2, the SVM model was the most accurate.
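A chi-square feature-selection step followed by an RF classifier, as used above, can be sketched like this; the 200 synthetic features mimic the extracted flow features, and the choice of keeping 20 of them is illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Stand-in for the ~200 extracted flow features.
X, y = make_classification(n_samples=1500, n_features=200,
                           n_informative=20, random_state=0)

# chi2 requires non-negative inputs, so scale to [0, 1] first.
X = MinMaxScaler().fit_transform(X)

# Keep the 20 features with the highest chi-square scores.
X = SelectKBest(chi2, k=20).fit_transform(X, y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
accuracy = rf.score(X_test, y_test)
```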

Marin et al. [ 62 ] developed a DL model for detecting malware traffic in encrypted networks. The specific model proposed in this study was DeepMAL, which automatically discovers the best features/data representation from raw data. The dataset used was USTC-TFC2016 [ 63 ], which comprises two sections, labelled malicious or normal traffic, and 10 types of malware traffic. Two representations of the raw data were used: packets and flows. It was concluded that using the raw-flows representation as input to the DL models achieved better results. The results showed that DeepMAL detected the Rbot botnet with an accuracy of 99.9%, while Neris and Virut were detected with 63.5% and 54.7% accuracy, respectively. Despite these lower rates, the model still performed better than RF.

Park et al. [ 64 ] evaluated the recognition performance for various attack types, including IDS, malware, and shellcode, using the RF algorithm and the Kyoto 2006+ [ 65 ] dataset (total size 19.8 GB), which contains traffic data collected from November 2006 to December 2015. The dataset consists of three class types: attack, shellcode, and normal; the first two cover the three attack types of IDS, malware, and shellcode. In the data preparation step, the researchers selected one month of data (May 2014) to train the model and another month (April 2014) to test it. In the experiment, Park et al. considered 17 features and normalized the data. The overall F-score was 99%; however, the performance differed across attack types. They propose to further evaluate detection performance for various attack types using the same dataset while varying the training conditions.

To classify new malware variants accurately, David et al. [ 66 ] used DL to build a model with a deep belief network (DBN) algorithm that could generate and classify malware signatures automatically. The dataset used to build the proposed model was collected by the authors and contained 1800 instances across six malware categories (Zeus, Carberp, Spy-Eye, Cidox, Andromeda, and DarkComet) with 300 variants per category. The DBN had eight layers, with the output layer containing 30 neurons. The training process was unsupervised, with 1200 vectors for training and 600 for testing. For the denoising autoencoders, the noise ratio was set to 0.2 and the number of training epochs to 1000. The model achieved an accuracy of 98.6% when evaluated.

As Wu et al. [ 67 ] explained, reinforcement learning continuously mimics attackers to produce new malware samples, thereby giving defenders viable attack models. They suggested the gym-plus model, which improves gym-malware by adding further activities to the action space, allowing it to modify harmful portable executable files. Additionally, the algorithm is retrained on the public EMBER [ 68 ] dataset to substantially increase the DR. In gym-plus, the DQN, SARSA, and Double DQN algorithms were used, and DQN learned better policies than the other algorithms. Through retraining on the adversarial instances provided by the DQN agent, malware detection accuracy increased from 15.75% to 93.5%.

Another dataset, MTA KDD 19 [ 69 ], was explored by Letteri et al. [ 70 ], who applied dataset optimization strategies to detect malware traffic. Two strategies were used: a dimensionality reduction technique based on autoencoders (AE-optimized) and a feature selection technique based on rank relevance weight (RRw-optimized), together with sensibility enhancement of the MLP algorithm. In RRw, feature selection consisted of two steps: dataset tampering, where 5-fold cross-validation was applied, and backward feature elimination. In the AE-optimized technique, 33 input and output neurons were used and the train–validation split was 85–15%, with a further 15% of the training set held out for testing. The highest accuracy, 99.60%, was achieved on the RRw-optimized MTA KDD 19 dataset.

3.1.6. Malware Botnet Attacks

A novel scheme using supervised learning algorithms and an improved dataset to detect botnet traffic was presented by Ramos et al. [ 71 ]. Five ML classifiers were evaluated, namely DT, RF, SVM, NB, and KNN, on two datasets: CICIDS2018 and ISOT HTTP [ 72 ] Botnet (total size 420 GB). Network flow metrics analysis and feature selection were carried out on both datasets, after which the ISOT dataset had 20 attributes, including source and destination port numbers and transfer protocols, and CICIDS2018 had 19 similar attributes. Five-fold cross-validation was applied, and 80% of botnet instances were used for training and the remainder for testing. For the CICIDS2018 dataset, RF and DT achieved the highest accuracy of 99.99%. For ISOT HTTP, again, RF and DT achieved high accuracies of 99.94% and 99.90%, respectively.

Using a similar dataset, Pektas and Akerman [ 73 ] utilized DL techniques and flow-based botnet discovery methods to identify botnets using two datasets: CTU-13 and ISOT HTTP, containing both normal and botnet data. They combined two DL algorithms, namely MLP and LSTM. In feature extraction, a flow graph was constructed in which all flow data were processed to extract the features. The ISOT dataset consisted of two botnet types, Waledac and Zeus, whereas CTU-13 contained seven botnet families. The approach achieved an F-score of 98.8% on the ISOT dataset and 99.1% on CTU-13.

3.1.7. Detecting Attacks over IoT Networks

As the Internet of Things (IoT) has become an important aspect of our lives, concerns about its security have increased, motivating researchers to focus their efforts on identifying new techniques to detect different attacks and increase the security of IoT. One such study was conducted by Abu Al-Haija et al. [ 74 ], where they developed an intelligent detection and classification DL-based system by leveraging the power of CNN for cyber-attacks in IoT communication networks. For evaluation, the NSL-KDD, which includes all the key IoT computing attacks, was employed. This system was validated and evaluated using K-fold and confusion matrix parameters, respectively. The outcome was an efficient and intelligent deep-learning-based system that can detect the mutations of IoT cyberattacks with an accuracy level that is greater than 99.3% and 98.2% for the binary-class and the multiclass, respectively. Discussions on future work include developing new software that catches and investigates data packets that communicate through the IoT environment and updating the existing dataset for more attacks.

By utilizing unique computing resources in a regular IoT space and applying an instance of extreme learning machine (ELM), Khan et al. [ 75 ] proposed a blockchain-based, efficient solution for a safe and secure IoT. This approach analyzes the credibility of the blockchain-based smart home in terms of the fundamental security objectives of confidentiality, accessibility, and integrity. Simulation outputs showed that ELM’s overheads are minor compared to the cybersecurity advantages it brings. The ELM architecture is made up of an input layer, numerous hidden layers, and a final output layer, with the hidden layers consisting of fixed neurons to boost the network’s efficiency. To minimize the error rate, a backpropagation approach is combined with a feed-forward mechanism to modify the network weights. After pre-processing the data to remove abnormalities and lessen the risk of faults, the NSL-KDD input data were split into 85% training and 15% validation. The presented ELM surpassed previous ML algorithms with an accuracy of 93.91%, and the researchers aim to investigate more datasets and architectures in the future.

Ullah et al. [ 76 ] aimed to detect malware-infected files and pirated software across IoT networks using a DL approach. The dataset used was collected from Google Code Jam (GCJ) [ 77 ]. The combined DL-based approach comprised two steps. First, to detect pirated features, a TensorFlow neural network was proposed; unwanted details were removed using tokenization, and additional features were mined using stemming, root words, and frequency constraints. Second, to detect malware, a new CNN-based methodology was proposed: the raw binary files were converted to color images, turning malware detection into an image classification problem. Grayscale images were then obtained by transforming the color images and used to classify the malware types. The results showed that this method performed better than modern methods for measuring cybersecurity threats in IoT. In future work, the researchers intend to put forward an algorithm that can detect unknown malware families.
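The binary-to-image step that underlies such CNN-based malware classifiers can be illustrated with a few lines of NumPy. This is a generic sketch of the technique, not the authors’ exact conversion; the row width is an arbitrary choice.

```python
import numpy as np

def bytes_to_grayscale(data: bytes, width: int = 32) -> np.ndarray:
    """Reshape a raw binary into a 2-D grayscale image (uint8).

    Trailing bytes that do not fill a complete row are dropped.
    """
    buf = np.frombuffer(data, dtype=np.uint8)
    rows = len(buf) // width
    return buf[: rows * width].reshape(rows, width)

# Example: a 64-byte "binary" becomes an 8x8 grayscale image
# that a CNN image classifier could consume.
img = bytes_to_grayscale(bytes(range(64)), width=8)
```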

A model that was used for the classification of attacks in IoT networks and anomaly detection was created by Tama and Rhee [ 78 ] using a DNN. The team used CIDDS-001 [ 79 ], UNSW-NB15, GPRS-WEP [ 80 ], and GPRS-WPA2 [ 80 ] datasets and compared the results. The results showed a good performance in attack detection. The average performance of DNN was validated using 10-fold cross-validation on the UNSW-NB15, CIDDS-001, GPRS-WEP, and GPRS-WPA2 datasets that resulted in 94.17%, 99.99%, 82.89%, and 94% accuracy, respectively. In future work, the researchers want to investigate a larger value of trial repetition given the unaffected performance of the different validation methods.

To mitigate IoT cybersecurity threats in a smart city, Alrashdi et al. [ 81 ] proposed an anomaly-detection IoT system using the RF model of ML. The UNSW-NB15 dataset was selected for this project, which includes 49 features and nine attack categories for characterizing normal and abnormal behaviors. The resulting model could detect cyber-attacks at fog nodes in a smart city by monitoring the network traffic passing through each node. After detection, it alerted the security cloud services to analyze and update their system. This solution achieved the highest classification accuracy of 99.34% with the lowest FPR while detecting compromised IoT devices at distributed fog nodes. Using open sources of distributed computing to distribute the model across fog nodes to detect IoT network attacks and using n-fold cross-validation to evaluate the design’s performance metrics are among the researchers’ future goals.

3.1.8. Malicious Traffic Classification

In order to protect organizations and individuals against cyber-attacks, network traffic first needs to be analyzed and classified so that anomaly and malicious attacks can be detected. As the role of malicious traffic classification is very important, many researchers have sought to improve classification techniques using the power of AI. Some studies have focused on anomaly and abnormal traffic. Yang et al. [ 82 ] built a model that found hidden abnormal traffic in the network to detect attacks using DL techniques. The dataset used was NetFlow campus information, which is a collection of data gathered by campus routers. For the pre-processing stage, the authors transformed the data into standardized format, and then the RNN algorithm was applied. The proposed model resulted in an accuracy of 98%. For future work, the authors propose to search for more critical features that could help in detecting further cyber-attacks.

Chou et al. [ 83 ] used AI algorithms through TensorFlow to train the system by providing it with rules and signatures to distinguish between normal and abnormal traffic behavior. The researchers developed a framework of a DL model on TensorFlow by combining multiple layers of non-linear features and training the system to learn the normal behavior using a forward propagation algorithm on the NSL-KDD dataset. The results were promising, showing high accuracy during testing of up to 97.65% in the detection of probing attacks and 98.99% in the detection of DDoS attacks. In future work, improvements need to be made in the training characteristics in TensorFlow as the present model could not predict user to root (U2R—attacker tries to gain unauthorized access posing as a normal user) and remote to local (R2L—attacker tries to gain unauthorized access by exploiting network vulnerabilities) attacks since the dataset sample was too monotonous, leading to over-learning.

An ensemble deep model to detect and classify anomalies at both the network and host levels was presented by Dutta et al. [ 84 ]. The datasets used were IoT-23 [ 61 ], LITNET-2020 [ 85 ], and NetML-2020 [ 86 ] and the DL techniques applied were DNN, long short-term memory (LSTM), and a meta-classifier (i.e., LR). A deep sparse autoencoder (DSAE) was used as the feature engineering technique and a stacking ensemble learning approach was used for classification. After testing on three heterogenous datasets, the researchers concluded that the suggested approach outperformed individual and meta-classifiers such as RF and SVM. In future work, the researchers suggest conducting experiments on more sophisticated datasets and using advanced computational methods to boost processing speed.

Sun et al. [ 87 ] built a traffic classification model using DL techniques, focusing on web and peer-to-peer (P2P) traffic. The dataset used to train the proposed model was collected by the authors by capturing traffic from the network using a distributed host-based traffic collection platform (DHTCP). In the training process, the dataset was divided by 5:5, 7:3, and 10-fold cross-validation for the first, second, and third experiment, respectively, and radial basis function neural network (RBFNN), SVM, and probabilistic neural network (PNN) were applied. The results showed that the highest accuracy was 88.18% when using PNN and dividing the dataset as 7:3 for training and testing.

Some researchers have focused on investigating the effects of network data representation on intelligent models. Millar et al. [ 88 ] devised and compared three ways of representing network data for malicious traffic classification by deep learners: payload data, flow image, and flow statistics. They showed that malicious classes can be predicted using just 50 bytes of a packet’s payload. Since DL benefits from an extensive and large dataset, the UNSW-NB15 dataset was selected for the experiment. The payload-based method was found to have the best performance; however, all methods failed to accurately identify DDoS attacks. Since different malicious attacks exhibit different defining characteristics, there is no ‘one size fits all’ solution for identifying all attacks. Hence, in future work, the researchers propose to research the combination of payload-based and statistical inputs to identify malicious traffic.

Yang et al. [ 89 ] aimed to develop a model for detecting malicious traffic in encrypted networks using DL. The proposed model was based on a residual neural network (ResNet), which can automatically identify features and effectively isolate contextual information of the encrypted traffic. The CTU-13 dataset was used to train the model; in the pre-processing stage, the data were converted into the IDX format, followed by traffic refinement, traffic purification, data length unification, and IDX file generation. Then, deep Q-network (DQN) reinforcement learning and deep convolutional generative adversarial networks (DCGAN) were used to generate encrypted-traffic adversarial samples, resolving the issue of unbalanced and insufficient (small) samples. The model achieved a high accuracy of 99.94%. In future work, the researchers will focus on introducing advanced genetic algorithms into DCGAN to enhance generator efficiency.

A new framework using ML for hardware-assisted malware detection through monitoring and classifying memory access patterns was introduced by Xu et al. [ 90 ]. They proposed in-processor monitoring to obtain the virtual address trace, divided the accesses into epochs, and summarized the memory access patterns of each epoch into features, which were then fed to ML classifiers, namely RF and LR. It was concluded that RF was the best-performing classifier for both kernel rootkits and memory corruption attacks. Its kernel rootkit detection reached a 100% TPR with less than 1% FPR; for user-level memory corruption attacks, the algorithm demonstrated a 99.0% DR with less than 5% FPR.
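The epoch-summarization idea above can be sketched as follows. The three features computed here (unique addresses, mean stride, max stride) are illustrative stand-ins; the paper defines its own feature set.

```python
import numpy as np

def epoch_features(addresses, epoch_size=256):
    """Summarize a virtual-address trace into per-epoch feature vectors.

    Features per epoch (illustrative, not the paper's exact set):
    number of unique addresses, mean stride, and max stride.
    """
    feats = []
    for start in range(0, len(addresses) - epoch_size + 1, epoch_size):
        epoch = np.asarray(addresses[start:start + epoch_size], dtype=np.int64)
        strides = np.abs(np.diff(epoch))
        feats.append([len(np.unique(epoch)),
                      float(strides.mean()),
                      float(strides.max())])
    return np.array(feats)

# A toy trace of 512 accesses yields two epochs of three features each,
# ready to feed into an RF or LR classifier.
trace = list(range(0, 1024, 2))  # 512 sequential accesses, stride 2
F = epoch_features(trace, epoch_size=256)
```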

De Lucia et al. [ 91 ] proposed a malicious network traffic detection mechanism of encrypted traffic using two techniques—SVM and CNN. To conduct the experiments, the team leveraged a public dataset [ 92 ], which consisted of malicious and normal TLS network traffic packets. In data pre-processing, the desired TLS features were extracted from the packet captures using a custom program written in the PcapPlusPlus framework [ 93 ]. The train–test split was 70–30%. Both methods successfully achieved a high F-score and accuracy and a low FPR. However, SVM outperformed CNN by achieving a lower FPR and a slightly higher F-score, precision, accuracy, and recall.

While building ML models for the detection of normal or malicious traffic, questions arise regarding the selection of the right features. With this in mind, Shafiq et al. [ 94 ] proposed a hybrid feature selection algorithm called weighted mutual information and area under the curve (WMI_AUC), which helps select the effective features in the traffic flow. The datasets used in the study were HIT Trace 1, captured by the authors from WeChat messenger using Wireshark, and the NIMS dataset, collected by the authors from their research testbed network. To build the final model, the researchers used 11 different ML algorithms. The model built using the partial decision tree (PART) algorithm achieved an accuracy of 97.88% on the HIT Trace 1 dataset. For the NIMS dataset, RF achieved an accuracy of 100%.
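WMI_AUC itself is the paper’s own hybrid algorithm, so the sketch below only illustrates the underlying idea of ranking flow features by their relevance to the traffic label, using scikit-learn’s mutual-information scorer on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Synthetic flow features standing in for real traffic attributes.
X, y = make_classification(n_samples=1000, n_features=15,
                           n_informative=5, random_state=0)

# Score each feature by its mutual information with the label,
# then rank from most to least relevant.
mi = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(mi)[::-1]
top_features = ranking[:5]
```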

Another field covered by researchers is the detection of malicious virtual private network (VPN) traffic. Miller et al. [ 95 ] proposed a computational model to address the current limitations in detecting VPN traffic and to aid in the detection of VPN technologies being used to hide an attacker’s identity. A model was built to detect VPN usage using an MLP neural network trained on flow statistics found in the TCP headers of captured network packets. The experiment was able to identify OpenVPN traffic with an accuracy of 93.71% and Stunnel OpenVPN with an accuracy of 97.82% when using 10-fold cross-validation. Future studies could be carried out to detect unauthorized user access and research organizational security, which is essential for a business.

With the spread of malicious websites, research emphasis has been placed on factor analysis of site categories and the correct identification of unlabeled data in order to distinguish between benign and dangerous websites and mitigate the risk of malicious websites. Wang et al. [ 96 ] demonstrated the use of the NB model to classify malicious websites. A self-learning system was developed to categorize websites based on their features, with NB used to divide the websites into two categories: malicious or benign. The dataset used was ISCX2016 [ 97 ], which contains over 100,000 URLs with 50 features for each URL. An accuracy of up to 90% was achieved after applying factor identification to the dataset and performing website classification with the NB classifier, demonstrating that the NB classifier can perform well in website classification.
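A minimal NB website classifier in this spirit can be sketched as follows; the synthetic features stand in for the 50 per-URL attributes of ISCX2016 (URL length, digit counts, and so on), which are not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for 50 per-URL features from ISCX2016.
X, y = make_classification(n_samples=2000, n_features=50,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Naive Bayes splits sites into two classes: benign (0) or malicious (1).
nb = GaussianNB().fit(X_train, y_train)
accuracy = nb.score(X_test, y_test)
```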

Finally, Ongun et al. [ 98 ] used the CTU-13 dataset to build ensemble models for malicious traffic detection. The algorithms used were LR, RF, and gradient boosting (GB). The first representation was connection-level, where features were extracted from the raw connection logs. The second representation was aggregated traffic statistics, where the authors compared the raw features of the first representation with features obtained by time aggregation. The last representation comprised temporal features, where the authors considered the time interval together with the features obtained by time aggregation in the second representation. The best performance was achieved by the models built using RF and GB, which reached a high AUC of 99% when applied to the features of the third representation.

Malicious Traffic in a Cloud Environment

Using a dataset constructed from a real cloud environment, Alshammari and Aldribi [ 99 ] built ML models to detect malicious traffic in cloud computing. The dataset used was the new ISOT CID [ 100 ], a publicly available cloud-specific dataset where the training data contained 17,296 instances and testing had 7411 instances. Their aim was to add some significant features, prepare the training data, and test the dataset against different ML models, namely DT, KNN, NNet, SVM, NB, and RF. The dataset contained 89,364 instances among which 44,569 were malicious and 44,795 were normal instances. They performed both cross-validation (5-, 10-, 15-folds) and split–validation (90–10%, 80–20%, 70–30%). For cross-validation (all 5-, 10-, 15-folds), DT, RF, and KNN all obtained an accuracy of 100%. In the case of split validation (for all 90%, 80%, and 70% splitting), both DT and RF achieved an accuracy of 100%.

Using the same cloud dataset, Sethi et al. [ 101 ] proposed an IDS to protect cloud networks from cyber-attacks. The algorithm applied was double deep Q-learning (DDQN). The datasets used were the ISOT CID dataset, and the standard NSL-KDD dataset. The total size of ISOT is 8 TB, but for the purposes of the experiment, only the network traffic data portion was used. For the feature selection phase, the team applied a chi-square feature selection algorithm. The selected features were 164 and 36 for ISOT CID and NSL-KDD, respectively. The accuracy for the proposed model tested for NSL-KDD was 83.40%, whereas for ISOT CID, it was 96.87%. After measuring the robustness of their model against an adversarial attack, the accuracy obtained was 79.77% for NSL-KDD and 92.17% for ISOT CID.

Xie et al. [ 102 ] used a one-class SVM technique based on a short sequence model. They used the Australian Defence Force Academy (ADFA) dataset [ 103 ], which contains thousands of normal traces taken from a host set up to simulate a modern Linux server, as well as hundreds of anomalous traces caused by six different types of cyber-attacks. As short sequences were used, duplicate entries were removed, leading to improved separability between normal and abnormal traces. The k values chosen for this experiment were k = 3, 5, 8, and 10, with k = 5 providing the greatest results: an accuracy of 70% attained at an FPR of roughly 20%. Although the experimental results showed a significant reduction in computing cost, the recognition rate for individual kinds of attack modes was low.

Vanhoenshoven et al. [ 104 ] addressed a variety of ML approaches to solve the challenge of detecting malicious URLs as a binary classification problem including multi-layer perceptron, DT, RF, and KNN. The researchers used Ma et al.’s dataset [ 105 ], called the Malicious URLs Dataset, which consists of 121 sets gathered over 121 days. There are 2.3 million URLs and 3.2 million features in the overall dataset. The researchers divided the URLs into three groups based on their characteristics. Each of the methods was used to classify these sets. The models were assessed based on their accuracy, precision, and recall, with features such as blacklists and WHOIS information taken into account. The article implies that all of its approaches achieved high accuracy, with RF being the most convenient approach to use, obtaining an accuracy of roughly 97% based on experimental results. The method also had great precision and recall, demonstrating its reliability.

For the purpose of detecting harmful URLs, Yuan et al. [ 106 ] introduced a parallel neural joint model approach. The semantic and text features were included in the method by integrating a parallel joint neural network incorporating capsule network (CapsNet) and independent RNN (IndRNN) to improve the detection accuracy. The malicious URLs data were gathered from two sources: an anti-phishing website called PhishTank and a malware domain list that collects a blacklist of harmful websites. The 5-fold cross-validation technique was applied and unified performance metrics were used to evaluate the model’s performance. According to the results of the experiments, the model performed best when the dimension of the feature was 185 and the number of IndRNN layers was 2. The accuracy and recall rates both reached 99.78% and 99.98%, respectively, resulting in a performance that exceeded traditional models.

By utilizing ML on the latest and more advanced dataset for IoT networks, called IoTID 20 [ 107 ], Maniriho et al. [ 108 ] proposed an approach for anomaly-based intrusion detection in IoT networks. The ML algorithm applied was RF. The dataset had three subsets: subset 1 contained normal and DoS instances; subset 2 contained normal and man-in-the-middle (MITM) instances; and subset 3 contained normal and scan traffic. A 10-fold cross-validation and a train–test split of 70–30% were applied. The overall accuracy for each attack subset was DoS—99.95%, Scan—99.96%, and MITM—99.9761% using cross-validation, and DoS—99.94%, Scan—99.93%, and MITM—99.9647% using the percentage split.

Since the security of IoT networks is a major concern for researchers and decision-makers, other researchers have used the same IoTID 20 dataset to build an IDS for in-home devices. A three-stage strategy that includes clustering with oversampling, reduction, and classification using a single hidden layer feed-forward neural network (SLFN) was provided by Qaddoura et al. [ 109 ]. The paper’s significance lies in the data reduction and oversampling techniques used to provide relevant and balanced training data, as well as the hybrid combination of supervised and unsupervised techniques for identifying intrusion activities. With a ratio of 0.9 and a k value of 3 for the k-means++ clustering technique, the results showed that the SLFN classification technique with the support vector machine synthetic minority oversampling technique (SVM-SMOTE) yielded more accurate results than other values and classification techniques. Similarly, a deep multi-layer classification strategy was suggested by Qaddoura et al. [ 110 ], consisting of two phases of detection: the first phase detects the presence of an intrusion, and the second identifies the kind of intrusion. In preprocessing, oversampling was carried out to enhance classification results. The most optimal model contained 150 neurons for the single hidden layer feed-forward neural network (SLFN) in phase 1, and 150 neurons across two layers for the LSTM in phase 2. When the findings were compared with well-known classification approaches, the suggested model outscored them, achieving 78% with regard to the G-mean.

3.1.9. Attacks at DNS Level

In order to improve user privacy, a new protocol called DNS over HTTPS (DoH) was recently created. This protocol can be used instead of traditional DNS for domain name translation, with the benefit of encryption. However, security tools depend on readable information from DNS to detect attacks such as malware and botnets. Hence, Singh and Roy [ 111 ] aimed to use ML algorithms to detect malicious DoH traffic. The five ML algorithms used were GB, NB, RF, KNN, and LR. The team conducted the experiment on the benchmark DoH dataset, CIRA-CIC-DoHBrw-2020, which was recently developed and shared publicly [ 112 ]. It contained a benign file with 19,807 instances and a malicious file with 249,836 instances. The DoHMeter tool [ 113 ], developed in Python and freely available, was used to extract important features from the PCAP files. To build the model, the data were split into a train–test ratio of 70–30%. The experimental results showed that RF and GB attained the maximum accuracy of 100%.
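The RF/GB comparison on a 70–30 split can be sketched as follows. The synthetic, class-imbalanced data imitates the benign/malicious skew of the dataset; the 28 features and the imbalance ratio are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for DoHMeter-style flow features
# (packet length statistics, timing, byte rates, ...),
# with a class imbalance loosely mimicking the real dataset.
X, y = make_classification(n_samples=2000, n_features=28,
                           n_informative=12, weights=[0.1, 0.9],
                           random_state=0)

# 70-30 train-test split as in the study.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

scores = {}
for name, clf in [("RF", RandomForestClassifier(random_state=0)),
                  ("GB", GradientBoostingClassifier(random_state=0))]:
    scores[name] = clf.fit(X_train, y_train).score(X_test, y_test)
```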

3.1.10. Intrusion Detection

NIDS analyzes and monitors the whole network to detect malicious traffic. The following studies used the NSL-KDD dataset. Al-Qatf et al. [ 114 ] proposed self-taught learning (STL)-IDS using a DL approach in an unsupervised manner as a feature selection technique to reduce training and testing time and effectively enhance the prediction accuracy of the SVM model. In the pre-processing phase, a 1-n encoding system was applied before STL, and max–min normalization was used to map all features into a specific range. The results obtained through the proposed model showed improved classification accuracy for SVM compared with algorithms such as J48, NB, and RF. Moreover, it performed well in both five-category (normal traffic and four attack types) and two-category (attacks and normal traffic) classification.

Similarly, to develop a flexible and efficient NIDS, Niyaz et al. [ 115 ] proposed self-taught learning (STL) based on a sparse autoencoder (AE) and soft-max regression (SMR) on the NSL-KDD dataset. The authors applied 10-fold cross-validation on the training data for STL and applied the dataset directly for SMR. The results showed a high accuracy of 98% for STL.
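The self-taught learning pattern used in these studies (learn features unsupervised, then train a classifier on them) can be sketched with a plain autoencoder in scikit-learn. This is a simplified stand-in: the hidden size, the SVM classifier, and the synthetic data are illustrative choices, and the original works use sparse autoencoders with their own hyperparameters.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=8, random_state=0)
X = MinMaxScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Stage 1 (unsupervised): train an autoencoder to reconstruct the
# inputs; its hidden layer becomes the learned feature representation.
ae = MLPRegressor(hidden_layer_sizes=(10,), activation='relu',
                  max_iter=500, random_state=0)
ae.fit(X_train, X_train)

def encode(X):
    # Forward pass through the trained encoder half (ReLU hidden layer).
    return np.maximum(0, X @ ae.coefs_[0] + ae.intercepts_[0])

# Stage 2 (supervised): train the classifier on the learned features.
svm = SVC().fit(encode(X_train), y_train)
accuracy = svm.score(encode(X_test), y_test)
```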

Following the same principle of using DL for intrusion detection, Zhang et al. [ 116 ] proposed an approach using the NSL-KDD dataset, consisting of normal and different forms of abnormal traffic. By first applying feature selection to remove the unrelated features and noise, the autoencoder was implemented to learn the features of the input data and extract the key features. Soft-max regression classification was then applied. The measures for evaluation used were accuracy, precision, recall, and F-score. Finally, the model achieved F-score and recall values of 76.47% and 79.47%, respectively.

Some studies have focused on multi-layer DL algorithms. Wu and Guo [ 117 ] proposed LuNet, a hierarchical CNN and RNN neural network, applied to the NSL-KDD and UNSW-NB15 datasets. They started by converting the categorical features using the ‘get_dummies’ function in Pandas, then applied standardization to scale the input data, and concluded by employing K-fold cross-validation. To evaluate LuNet, the following evaluation criteria were used: accuracy, FPR, and DR. The performance in binary classification averaged 99.24% accuracy on the NSL-KDD dataset and 97.40% on the UNSW-NB15 dataset. The performance in multiclass classification averaged 99.05% accuracy on NSL-KDD and 84.98% on UNSW-NB15. In future work, the researchers intend to investigate worms and backdoors, as these were wrongly classified by the model.
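The preprocessing chain described above (one-hot encoding, standardization, K-fold splitting) can be sketched on toy data; the column names are placeholders for NSL-KDD-style fields:

```python
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler

# Toy flow records with one categorical and one numeric feature.
df = pd.DataFrame({
    "proto": ["tcp", "udp", "tcp", "icmp", "udp", "tcp"],
    "duration": [1.2, 0.3, 4.5, 0.1, 2.2, 3.3],
})

# One-hot encode the categorical column ...
X = pd.get_dummies(df, columns=["proto"])

# ... then standardize all columns to zero mean / unit variance.
X_scaled = StandardScaler().fit_transform(X)

# K-fold cross-validation indices for training/evaluation.
kf = KFold(n_splits=3, shuffle=True, random_state=0)
folds = list(kf.split(X_scaled))
```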

To detect network intrusions efficiently, Hasan et al. [ 118 ] used an ANN. Different backpropagation training approaches were employed to distinguish attack from non-attack connections. The DARPA 1998 [ 119 ] intrusion detection dataset was used for training and testing purposes. To train the model, the researchers used the backpropagation learning algorithm in the following three modes: batch gradient descent with momentum (BGDM), batch gradient descent (BGD), and resilient backpropagation (RP). Finally, they used the DR and the FPR to determine the intrusion detection performance. Both the overall attack detection performance and the efficiency measure favored the RP training method, which obtained an accuracy of 92%. Further changes in the network architecture can be made to enable the efficient use of the network with other approaches.

Likewise, Devikrishna et al. [ 120 ] proposed an approach that used ANN as a pattern recognition technique to classify normal and attack patterns. The dataset used was the KDD-99 dataset. The feature extraction process consisted of feature selection and feature construction. An MLP was used for intrusion detection. MLP is a layered feed-forward ANN typically trained with backpropagation. Improving accuracy was the main goal, as it largely determines the overall effectiveness of the IDS. A possible future research direction could be to incorporate more attack scenarios in the dataset.

Abuadlla et al. [ 121 ] also proposed an IDS based on flow data built in two stages. The first stage involved the detection of abnormal traffic on the network. The second stage involved detecting and classifying the attack types in the network traffic. The NetFlow dataset made by network captures was employed to train the proposed system. To build the proposed model, a multilayer feedforward neural network and the radial basis function network (RBFN) were used. The proposed model resulted in a higher accuracy of 94.2% for the abnormal traffic detection stage, and 99.4% for the attack detection and classification stage. Although the multilayer feedforward neural network resulted in higher accuracy, it consumed more time and memory in comparison with RBFN, which makes RBFN a better choice for real-time detection. In future work, the researchers aim to build a faster and more accurate model for real-time detection with a smaller number of features.

Utilizing the KDD-99 dataset, Alrawashdeh et al. [ 122 ] aimed to build a DL model for anomaly detection in real time. The researchers began by transforming categorical features into numerical features for convenience. Then, they removed the duplicated records to reduce computational time and improve performance. Three models were built: the first using the restricted Boltzmann machine (RBM), the second using a deep belief network (DBN), and the third using DBN with LR. The model built using DBN and LR resulted in the best performance, with an accuracy of 97.9% and an FN rate of 2.47%.
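The duplicate-record removal performed before training is a routine pre-processing step; a minimal order-preserving sketch (the tuples stand in for dataset rows and are illustrative):

```python
def remove_duplicates(records):
    """Drop exact duplicate records, keeping first occurrences in order."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:  # records must be hashable, e.g. tuples
            seen.add(record)
            unique.append(record)
    return unique

records = [("tcp", "http", 215), ("udp", "dns", 44), ("tcp", "http", 215)]
print(remove_duplicates(records))  # keeps the two unique records
```

Besides cutting training time, deduplication prevents the classifier from being biased toward records that happen to repeat many times.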

In addition, Al-Janabi et al. [ 123 ] proposed a model based on ANN using the KDD-99 dataset and incorporated three scenarios: detection mode, detection and classification mode, and detailed classification mode. The researchers performed their experiment for each scenario by training the models using a different number of features in each. The best results achieved were a 91% DR and a 3% FP rate using 44 features in the detection-only scenario. The results showed that performance decreased as a higher level of classification was performed.

Belavagi et al. [ 124 ] evaluated the different ML algorithms used to classify the network data traffic as normal traffic or intrusive (malicious) traffic. By using the NSL-KDD dataset consisting of internet traffic record data, supervised ML classifiers, namely LR, SVM, Gaussian NB, and RF were applied to identify four simulated attacks. After converting all the categorical data to numerical form in the pre-processing stage, the predicted labels from these models were compared with the actual labels, and TPR and FPR were computed. From the observed results, it was concluded that the RF classifier outperformed other classifiers for the considered dataset, with an accuracy of 99%. The researchers suggested that the work can be further extended by considering the classifiers for multiclass classification and considering only the important attributes for intrusion detection.
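Comparing predicted labels with actual labels to compute the TPR and FPR, as Belavagi et al. did, reduces to counting the four confusion-matrix cells. A minimal binary-label sketch (1 = attack, 0 = normal; the labels are illustrative):

```python
def tpr_fpr(actual, predicted):
    """Compute true-positive and false-positive rates for binary labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    tpr = tp / (tp + fn) if tp + fn else 0.0  # attacks correctly flagged
    fpr = fp / (fp + tn) if fp + tn else 0.0  # normal traffic wrongly flagged
    return tpr, fpr

actual    = [1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0]
print(tpr_fpr(actual, predicted))  # (0.666..., 0.333...)
```

A good IDS pushes TPR toward 1 while keeping FPR near 0; reporting both avoids the trap of a classifier that labels everything as an attack.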

Additionally, Almseidin et al. [ 125 ] evaluated the different ML algorithms, keeping the focus on the FNR (identifying an attack as normal traffic) and FPR (identifying normal traffic as an attack) performance metrics to improve the DR of the IDS. They used several algorithms, namely J.48, RF, random tree, decision table, multi-layer perceptron (MLP), NB, and Bayes network. The KDD-99 dataset was imported into SQL Server 2008 to compute statistical measurements such as attack types and occurrence ratios. Then, 148,753 record instances were extracted as training data. A wide range of results obtained using Weka tools demonstrated that RF achieved the highest average accuracy and the decision table achieved the lowest FNR.

Choudhury et al. [ 126 ] implemented ML algorithms to categorize network traffic as normal or anomalous. Algorithms such as BayesNet, LR, instance-based knowledge (IBK), J.48, PART, JRip, random tree, RF, REPTree, boosting, bagging, and blending were incorporated and compared. The researchers used the NSL-KDD dataset and Weka tools to model and compare the algorithms. The results showed that RF achieved the highest accuracy of 91.523%, and the lowest accuracy of 84.96% resulted from LR.

Similarly, the objective of the system proposed by Thaseen et al. [ 127 ] was to detect any intrusions in the network using ML by classifying different packets without decrypting their content. For intrusion detection analysis, packets were generated and transmitted over a network and captured by Wireshark, and this captured data was organized into a dataset. By implementing ML algorithms such as NB, SVM, RF, and KNN, the data were classified with accuracies of 83.63%, 98.23%, 99.81%, and 95.13%, respectively. Future work for this study includes the plan to use DL algorithms to enhance the performance and accuracy of recognizing and classifying different types of packets transmitted over a network.

Likewise, Churcher et al. [ 128 ] proposed several ML models to cope with the increase in the number of network attacks. The researchers highlighted several ML methods that were used in IDS such as DT, SVM, NB, RF, KNN, LR, and ANN. The Bot-IoT dataset [ 129 ] containing ten CSV files that have records of IoT network attacks and 35 features was used. In the pre-processing stage, the undesirable features were removed. The results of the model showed that in RF, the accuracy for DDoS attacks was 99% in binary classification and its performance was superior in the context of all types of attacks. However, KNN achieved 99% accuracy and outperformed other ML algorithms in the multiclass classification. In conclusion, KNN and ANN are more accurate when used in weighted and non-weighted datasets, respectively, for multiclass classification.

A comparative analysis of two commonly used classification methods, SVM and NB, to evaluate accuracy and misclassification rate was conducted by Halimaa et al. [ 130 ] using the NSL-KDD dataset. For the comparative analysis, the Weka tool’s randomized filter was used to ensure the random selection of 19,000 instances. The results showed that SVM attained an accuracy of 93.95% while NB achieved an accuracy of 56.54%. The researchers plan to work with larger amounts of data and to construct a cross multistage model able to categorize additional attacks with better accuracy and performance.

Ghanem et al. [ 131 ] assessed the performance of their existing IDS against 1-class and 2-class SVMs, applying both linear and non-linear forms. For data collection, they gathered five datasets from an IEEE 802.11 network testbed and another dataset from an ethernet local area network office at Loughborough University. All this traffic was collected in PCAP format using tcpdump. The results demonstrated that the linear 2-class SVM produced generally highly accurate findings, reaching a 100% success rate on four out of five of the metrics, but it required labelled training data. Meanwhile, the linear 1-class SVM performed nearly as well as the best technique and did not require labelled training data. Overall, it was concluded that the existing unsupervised anomaly-based IDS can benefit from using either of the two ML techniques to improve detection accuracy and its analysis of traffic, especially when the traffic comprises non-homogeneous features.

Mehmood et al. [ 132 ] focused on supervised learning algorithms, comparing four of them, namely SVM, J.48, NB, and decision table, for anomaly-based detection. These algorithms were trained using the short version of the KDD-99 dataset, as the full version has a very large number of records. The performance measures used in this comparison were FPR, TPR, and precision. The results highlighted a limitation when it came to DR, as not a single algorithm had a high DR for all the tested attacks in the KDD-99 dataset. However, J.48 had a low misclassification rate, and hence it was concluded that this algorithm performed best out of all the algorithms.

An approach that boosts the capacities of wireless network IDS was introduced by AlSubaie et al. [ 133 ]. The dataset used was WSN-DS [ 134 ], which included 23 attributes and five potential outputs: four DoS attacks (flooding, grayhole, blackhole, and scheduling) and one normal state (no attack). The ML algorithms used here were ANN and J.48. Additionally, the data noise was calculated as it affects the accuracy of the ML algorithms, and the amount of noise permissible for the ML model to be deemed trustworthy was determined. The results showed that J.48 performed better than the ANN when noise was not considered, obtaining the highest accuracy rate of 99.66%, while ANN was more tolerant on noisier datasets.

In order to determine which of the models could handle large amounts of data and still produce accurate predictions, Ahmad et al. [ 135 ] used the SVM linear and radial basis function (RBF), RF, and ELM methods and compared their performance on the NSL-KDD dataset. The results demonstrated that when using the full dataset, the ELM outperformed the other algorithms in terms of all the metrics being tested in all experiments including accuracy, which reached 99.5%. On the other hand, when using half and a quarter of the dataset, SVM performed better overall, with an accuracy of around 98.5%. Hence, it was concluded that ELM is best suited for intrusion detection when dealing with large amounts of data. The researchers plan to further explore ELM and experiment with it using different selection and feature transformation techniques and their impact on its performance.

Amira et al. [ 136 ] found MLP to be the most effective and appropriate classifier for increasing detection accuracy. The data pre-processing phase was carried out using the equal-width binning algorithm. The sequential floating forward selection (SFFS) feature selection technique was applied, resulting in the selection of 26 features. Using the NSL-KDD dataset, Amira et al. then applied a multi-agent, 2-layer classification algorithm. The different classifiers tested and compared were NB and DT variants, namely NBTree, BFTree, J.48, and RF. NBTree and BFTree gave better results than RF and J.48, and MLP gave good results in classifying normal traffic and DoS attacks compared to identifying R2L and U2R attacks. Overall, it was concluded that a single classifier is not sufficient to classify the attack class; therefore, to increase classification accuracy, multiple classifiers must be involved.
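The equal-width binning algorithm used in the pre-processing phase splits a feature's range into a fixed number of intervals of equal width and replaces each value with its interval index. A minimal sketch with illustrative values:

```python
def equal_width_bins(values, n_bins):
    """Assign each value a bin index in [0, n_bins - 1] using equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    bins = []
    for v in values:
        idx = int((v - lo) / width)
        bins.append(min(idx, n_bins - 1))  # clamp the maximum value into the last bin
    return bins

print(equal_width_bins([0, 2, 5, 9, 10], 5))  # [0, 1, 2, 4, 4]
```

Discretizing continuous features this way lets classifiers that expect categorical input (such as NBTree) handle numeric traffic attributes.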

Rather than comparing different techniques, Gogoi et al. [ 137 ] focused on evaluating the clustering approach to detect network traffic anomalies on different datasets. The proposed method was evaluated using TUIDS [ 138 ] datasets, the NSL-KDD dataset, and the KDD-99 datasets. The real-life TUIDS intrusion datasets consist of three datasets: flow level, packet level, and port scan. After the pre-processing stage, they applied a combination of supervised clusters and unsupervised incremental clusters which labelled the training data into different profiles (or rules). The prediction was undertaken using a supervised classification algorithm. Using the TUIDS dataset, the packet level had the highest accuracy of 99.42%. When using the KDD-99 dataset, the accuracy achieved was 92.39%. Finally, using NSL-KDD, the accuracy achieved was 98.34%.

Wattanapongsakorn et al. [ 139 ] aimed to classify real-time traffic using 12 features of network traffic data, covering 17 DoS and probing attack types as well as normal traffic. Supervised ML techniques—DT, ripple rule, back-propagation neural network, and Bayesian network—were applied. In the pre-processing stage, the team used a packet sniffer and the built-in Jpcap library to collect and store network records over a period of time. Then, in the classification part, training and testing were performed using the Weka tool, and results were observed. The DT approach achieved the highest DR of 85.7%. In the second experiment, some attack types were grouped together, and the training data consisted of 9000 records with 600 records of each attack type (600 × 15). In this case, the DR was much higher, with the DT reaching 95.5%.

Further research that worked on enhancing an existing algorithm for intrusion detection was done by Cui et al. [ 140 ], who worked on enhancing the Bayes classifier (BC). The proposed method seeks to integrate the spatiotemporal patterns of measurement into a flexible BC to detect cyber-attacks. Spatiotemporal patterns were captured by the graph Laplacian matrix for system measurements. After the evaluation of the developed method’s performance, it was concluded that the flexible BC showed the largest TPR compared with the naïve BC, SVM, and DT methods, which verified the effectiveness of the developed method. For future work, DL techniques will be involved by mapping the spatiotemporal patterns to a linear space using the LSTM network for better detection accuracy of cyber-attacks.

Moreover, Kumar et al. [ 141 ] focused on enhancing the detection efficiency by combining three algorithms—RF, JRIP, and PART—to identify threats to mobile devices. The dataset used contained around 600 samples that the researchers captured from a virtual machine using Wireshark. For feature extraction, the researchers used bidirectional flow export using the IP flow information export method (RFC-5103 BiFlow). The challenges the researchers faced were an overfitting problem and a concept drift condition, caused by selecting low-performing features. The ensemble model resulted in an accuracy of 98.2% with the ability to identify benign traffic. For future work, the researchers aim to integrate ML with conventional NIDS and to reduce the chance of concept drift by introducing innovative methods.

Similarly, Tahir et al. [ 142 ] constructed a hybrid ML technique for detecting network traffic as normal or intrusive by combining K-means clustering and SVM classification to improve the DR and to reduce the FPR alarm and FNR alarm. The dataset applied in the proposed technique was the NSL-KDD dataset. Pre-processing was performed on the dataset to reduce ambiguity and supply accurate information to the detection engine. After applying the classifier subset evaluator and best-first search algorithms, both the classifiers—K-means and SVM—were then tested and their performance evaluated. The hybrid ML technique results showed that they attained 96.26% as the DR and 3.7% as the FNR. The model showed a comparatively higher detection for DoS, PROBE, and R2L attacks.
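In hybrids of this kind, the K-means stage groups similar traffic records before a classifier such as SVM refines the decision. A toy one-dimensional K-means (Lloyd's algorithm) sketch, with illustrative values rather than NSL-KDD features:

```python
def kmeans_1d(values, centroids, iterations=10):
    """Cluster scalar values around the given initial centroids (Lloyd's algorithm)."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # assign each value to its nearest centroid
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# two clearly separated groups of connection rates (illustrative)
rates = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
print(kmeans_1d(rates, centroids=[0.0, 5.0]))  # converges near [1.0, 9.5]
```

Real hybrids run K-means over the full multi-dimensional feature vectors; the clusters then pre-partition the data so the SVM only has to separate the harder boundary cases.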

One more enhanced technique was proposed by Sharma et al. [ 143 ] to apply efficient data mining algorithms for detecting network traffic as normal or anomalous. The team applied KDD-99, which contains 4.9 M data instances and four attack class types. In feature selection, they collected basic features such as protocol type, duration, flags, etc. The data was normalized and the classification was carried out using k-means clustering via an NB classifier. The target variable was classified as normal, DoS, U2R, R2L, or probing. The DR achieved by using the proposed method was 99%.

Following the same ideology, Lehnert et al. [ 144 ] built their system in steps, with more complexity added at each level. They used the KDD-99 dataset and the Shogun ML Toolbox to test and train the data. The study’s focus was mainly on the SVM implementation provided by the toolbox. The key step in this paper was the training phase, which was done using labelled data. The goal was to choose the most appropriate kernel and minimize the number of features. The results showed that two out of the four kernels available in Shogun tied for the best accuracy: the Gaussian and sigmoid kernels, which produced an error of only 2.79%. It was concluded that identifying both the kernel with the lowest error rate and the subset of the most relevant features leads to an improved version of the algorithm. Ultimately, this can enhance the accuracy and efficiency of the SVM applied for intrusion detection, enabling it to predict with higher speed and accuracy.

An innovative feature selection algorithm called the ‘highest wins (HW)’ was proposed by Mohammad and Alsmadi [ 145 ] in order to enhance intrusion detection. This HW algorithm was applied in NB techniques on 10 benchmark datasets from the UCI repository to evaluate its performance. The results showed that the proposed HW algorithm could successfully reduce the dimensionality for most of these datasets compared to other feature selection methods such as chi-square and IG. The team conducted another set of experiments where NB and DT (C4.5) classifiers were built using the HW technique on the NSL-KDD dataset on its binary and multiclass versions. For binary, HW reduced the features of the dataset from 41 to eight and the results gave an accuracy of 99.33% using the reduced features (0.23% decrease compared to using complete features). For multiclass, HW reduced the features of the dataset from 41 to 11, and in terms of time needed for building the model, reduced features had an enhancement of 2.3%. The results demonstrated that instead of using all 41 features of this dataset, using only eight by applying HW could produce classifiers with the same classification performance.
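For contrast with HW, the chi-square filter it was compared against scores a categorical feature by how strongly its values associate with the class label: the higher the statistic, the more informative the feature. A minimal sketch with illustrative contingency counts (it assumes no row or column total is zero):

```python
def chi_square(table):
    """Chi-square statistic for a contingency table: rows = feature values, cols = classes."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total  # count expected under independence
            stat += (observed - expected) ** 2 / expected
    return stat

# feature value (tcp/udp) vs. class (normal/attack), illustrative counts
table = [[30, 10],   # tcp: 30 normal, 10 attack
         [10, 30]]   # udp: 10 normal, 30 attack
print(round(chi_square(table), 2))  # 20.0 -> strong association
```

Ranking all features by this score and keeping the top k is the filter-style dimensionality reduction that HW is benchmarked against.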

Furthermore, Chawla et al. [ 146 ] proposed a computationally efficient anomaly-based IDS that was a combination of CNN and RNN. To detect malicious system calls, they merged stacked CNNs with GRUs. Using the ADFA dataset of system call traces, they obtained a set of equivalent findings with shorter training periods when using GRUs. They employed a CNN to extract the local features of system call sequences and feed them into the RNN layer, which was then processed through a fully connected SoftMax layer that generates a probability distribution across the system calls processed by the network. Trained on normal system calls to predict the likelihood of a subsequent system call, a testing sequence was used to detect a malicious trace based on a pre-defined threshold. The researchers claimed that the training time was reduced compared with an RNN-based LSTM model.

In addition, Nguyen et al. [ 147 ] used the DL approach for detecting cyber-attacks in a mobile cloud environment. The datasets used were KDD-99, NSL-KDD, and UNSW-NB15 (training = 173,340 records, testing = 82,331 records). The researchers adopted principal component analysis (PCA) to reduce the dimensions of the datasets, and the learning process comprised three layers: the input layer, hidden layers, and output layer. The input layer used a Gaussian restricted Boltzmann machine (GRBM) to transform real values into binary code, the hidden layers used restricted Boltzmann machines (RBM) to perform the learning process, and the output of the hidden layers was used as input in the output layer (the SoftMax regression step). They used accuracy, recall, and precision for measuring performance. The results showed that the accuracies for the NSL-KDD, UNSW-NB15, and KDD-99 datasets were 90.99%, 95.84%, and 97.11%, respectively. For future work, Nguyen et al. propose implementing the model on real devices to measure the accuracy on a real-time basis and evaluate the energy and time consumed in the detection.

An improved IDS was proposed by Tama et al. [ 148 ], who used two datasets to evaluate the performance of the model: NSL-KDD and UNSW-NB15. To minimize the feature size, a hybrid feature selection technique consisting of three methods was used: the ant colony algorithm, particle swarm optimization, and the genetic algorithm. The researchers then proposed a two-stage classifier ensemble, consisting of rotation forest and bagging. The proposed model achieved an accuracy of 85.8% on the NSL-KDD dataset and 91.27% on the UNSW-NB15 dataset. For future work, the researchers intend to extend the proposed model to solve the multiclass classification problem.

A novel intrusion detection system, named TR-IDS, that takes advantage of both statistical features and payload features was proposed by Min et al. [ 149 ]. They used the ISCX2012 dataset, which is more updated and closer to reality, and utilized word embedding and text-CNN to extract more features from the payloads. Then, the RF algorithm was applied to the combination of payload features and statistical features. Moreover, the effectiveness of TR-IDS was compared against five ML models, namely SVM, NN, CNN, RF (RF-1), and RF (RF-2, which used statistical features only). The highest result was achieved by TR-IDS with an accuracy of 99.13%.

Finally, more information on intrusion detection using unsupervised and hybrid methods can be found in a survey paper composed by Nisioti et al. [ 150 ]. They presented and highlighted important issues such as feature engineering methods for IDS. Furthermore, using IDS data to construct and correlate attacks to identify attackers as well as extending the current IDS to identify modern attacks were all addressed by the paper.

Table 2 below presents a summary of all details discussed in this section, giving an overview of all reviewed articles in terms of the research problem domain targeted, the dataset used, and the intelligent techniques applied, as well as the results achieved.

Brief summaries of the reviewed papers.

3.2. Common Intelligent Algorithms Applied

In this literature review, a number of papers published between 2010 and 2021 were studied, and a plethora of both ML and DL techniques were utilized in these papers to build or compare models to detect and classify network attacks. Table 3 presents a list of all the respective papers that utilized the different algorithms, highlighting all the problem domains each algorithm was used for as well as the highest performance achieved. Figure 1 presents the number of articles that utilized each algorithm. As seen from the figure and table, RF and SVM were the most widely used algorithms, and ELM was the least applied. For ML algorithms, the best performing were DT, RF, and KNN, with accuracies reaching up to 100%, while the least utilized were J.48 and KNN. For DL algorithms, the best performing algorithm was RNN, with the highest accuracy of 100% achieved, and the least utilized and least popular was ELM, which is considered fast to train as it consists of a single hidden layer and is therefore usually applied to simple applications. However, it has recently been extended into a hierarchical form to handle more complex problems with higher accuracy [ 152 ].


ML and DL algorithms used in the reviewed papers.

ML and DL algorithms evaluated in the reviewed papers.

3.3. Common Datasets Used

There are several datasets used by researchers in the reviewed papers to evaluate their network detection and classification models. The most widely used is NSL-KDD, owing to the reasonable size of its training and testing sets and its public availability. The NSL-KDD dataset has 41 features and is an enhanced version of the KDD dataset in which duplicate records were removed to eliminate classifier bias. KDD-99 and CICIDS2017 came after NSL-KDD in frequency of use. The KDD-99 dataset was first used in a competition and is an improved version of DARPA98. The CICIDS2017 dataset contains normal and new attacks and was published in 2017 by the Canadian Institute for Cybersecurity (CIC).

After that, the UNSW-NB15 dataset comes next in terms of repeatedly being used. The IXIA tool was used for creating the UNSW-NB15 dataset and it consists of nine types of attacks.

There are many other datasets; however, a few researchers have tried to create their own. The CTU-13 dataset was captured by CTU University in the Czech Republic. It contains real botnet traffic combined with normal traffic across thirteen scenarios, including legitimate traffic and attacks such as DoS. The SNMP-MIB dataset consists of about 4998 records with 34 variables; the attacks recorded in the data include six DoS attacks (TCP-SYN, ICMP-ECHO, HTTP flood, UDP flood, Slowloris, Slowpost) and web brute force attacks. The Kyoto 2006+ dataset was built from real traffic data captured by Kyoto University’s honeypots over almost three years, from November 2006 to August 2009. It consists of 24 features, 14 of which are derived from the KDD-99 dataset, plus 10 additional features that can be used to analyze and evaluate the IDS network. Honeypots, an email server, darknet sensors, and a web crawler were used to construct Kyoto 2006+.

ADFA is an IDS dataset that includes three data types in its structure: (1) normal training data with 4373 traces; (2) normal validation data with 833 traces; and (3) attack data with 10 attacks per vector. As the web became a significant platform for internet criminal activity, the security community put effort into blacklisting malicious URLs. Ma et al.’s dataset [ 153 ] consists of 121 sets with 2.3 million URLs and 3.2 million features overall. The researchers divided the URLs into three groups based on their characteristics, with features being identified as binary, non-binary, numerical, or discrete.

Table 4 lists all the respected papers that utilized the different datasets, highlighting the main references for all datasets as well as the last year when each dataset was used. Figure 2 presents the number of articles that utilized each dataset.


Datasets used in the reviewed papers.

Network traffic datasets used in the reviewed papers.

4. Discussion and Conclusions

Network security is a major concern for individuals, profit and non-profit organizations, as well as governmental organizations. In fact, with the digital explosion that we are witnessing in the present era, ensuring network security is an urgent necessity in order to safeguard the thousands of services that society relies on, services that rest essentially on the backbone of digital life, the network. Therefore, network security turns out to be an urgent requirement, not a luxury. Although many protection methods have been introduced, there are still vulnerabilities exploited by hackers, leaving network security administrators in a continuous race against network attackers. Techniques based on intelligent methods, namely machine learning (ML) and deep learning (DL), have proved their merits in several domains including health care systems, financial analysis, higher education, the energy industry, etc. This has motivated those responsible for network security to further explore the ability of these techniques to provide the required level of security. Consequently, several intelligent security techniques have been offered in the past few years. Although these techniques have shown exceptional performance, the problem has not been resolved entirely. This leaves us in a position to critically evaluate the currently offered solutions to recognize the possible research directions that might lead to building more secure network environments.

The complication of using the right dataset and features or the right ML and DL algorithms to identify the different attack types has proven to be an arduous decision for experts to make. Hence, among the reviewed papers, some researchers focused on comparing different algorithms to determine which algorithm to use for building an intelligent model using a training dataset. As no algorithm has been found to be a silver bullet for identifying and classifying all attacks with high accuracy, it was widely noted that it is not reasonable to accept a single algorithm as a universal model.

When building any intelligent system, the designer should take into account which algorithm(s) best fit the domain. The designer should also decide which dataset comprises a set of features that best represent the classification area. Considering network attacks, this research article found that RF is the most commonly used algorithm, which can be justified by the fact that it uses an ensemble learning technique; this might, to some extent, ensure a long-lived system thanks to the capability to continuously learn new knowledge on the fly. Producing models with reduced overfitting is another motivation behind using RF. RF can also be effectively applied to both categorical and continuous features, and thus to a wide range of datasets. In addition, its exceptional ability to handle missing data makes RF a first option when building network attack mitigation models, taking into account that most datasets are susceptible to missing values. However, since RF produces complex trees, building a real-life system based on RF could be challenging, because it might require more computational power and resources, while in fact the main success factor for a network attack detection system is a quick, instant reaction. SVM is the second most widely used algorithm, although it is applied to fewer network attacks than RF. This can be justified by the fact that SVM produces complex intelligent models that are difficult to apply in real life. Nevertheless, SVM is considered the main competitor to RF, with which it shares several advantages, such as the capability to deal with missing values and the remarkable ability to reduce the overfitting problem.
NB ranks in third place, but still did not achieve the same predictive performance as RF and SVM because it assumes that the dataset features are independent, which is not true in most training datasets. DT was employed almost half as often as RF and SVM. DT has proved its merits in several domains, but it has not been used much in the network security domain. This can be justified by the fact that it produces a set of rules that, if exposed to attackers, allow them to adapt their attacks to evade the rules derived from the DT models.

Among the algorithms that delivered excellent results were the DL models DNN and RNN, as well as the ML models RF and DT, with accuracies reaching up to 100%. A promising research direction is to apply hybrid or ensemble models to improve attack detection accuracy, for instance augmenting DL techniques such as CNN with long short-term memory (LSTM) to automate feature engineering and improve network attack detection accuracy. Furthermore, the gated recurrent unit (GRU), initially proposed in 2014, can be applied by researchers to various problem domains in network security, as it is considered more efficient than LSTM, uses comparatively less memory, and executes faster. Well-trained GRU models can solve complex problems quickly and are therefore worth trying in network attack detection, for example for DDoS or in IoT networks.
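To make the GRU-efficiency argument concrete, here is a minimal NumPy sketch of one GRU forward step together with a gate-block parameter count (three gate blocks for a GRU versus four for an LSTM); all shapes and initialisations are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, Wz, Wr, Wh, bz, br, bh):
    """One GRU time step; gate inputs are the concatenation [h_prev, x]."""
    hx = np.concatenate([h_prev, x])
    z = sigmoid(Wz @ hx + bz)                  # update gate
    r = sigmoid(Wr @ hx + br)                  # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x]) + bh)
    return (1 - z) * h_prev + z * h_tilde      # new hidden state

rng = np.random.default_rng(0)
n_in, n_hid = 8, 16
W = lambda: rng.standard_normal((n_hid, n_hid + n_in)) * 0.1
b = lambda: np.zeros(n_hid)
h = gru_step(rng.standard_normal(n_in), np.zeros(n_hid),
             W(), W(), W(), b(), b(), b())

# Parameter counts per cell: GRU has 3 gate blocks, LSTM has 4,
# each of size n_hid * (n_hid + n_in) weights plus n_hid biases.
gru_params = 3 * (n_hid * (n_hid + n_in) + n_hid)
lstm_params = 4 * (n_hid * (n_hid + n_in) + n_hid)
print(h.shape, gru_params, lstm_params)
```

With these sizes the GRU cell carries 25% fewer parameters than the corresponding LSTM cell, which is the source of the memory and speed advantage cited above.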

Since the performance of intelligent models largely depends on the datasets used to train them, it is important to analyze and evaluate which dataset to use for which type of attack. It is recommended to use large datasets with a good distribution of each class type, which increases detection and classification accuracy. Moreover, the limited availability of such datasets is a challenge for developing more robust intelligent models and highlights the need to produce and publish new datasets for different network attack problem domains. Most of the authors in the reviewed articles used the KDD-99 dataset or its updated version, the NSL-KDD dataset. Some also used the ADFA dataset, which was proposed as a replacement for KDD-99, as well as ISOT HTTP for botnets, ISOT CID for cloud environments, and IoT20 for IoT environments; these can be explored further and used to build different ML and DL models.
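The class-distribution check recommended above can be sketched in a few lines; the toy label list and the 5% rarity threshold below are arbitrary assumptions standing in for a real dataset such as NSL-KDD.

```python
# Inspect class balance before training, and flag under-represented
# attack classes that the model is unlikely to learn reliably.
from collections import Counter

labels = ["normal"] * 900 + ["dos"] * 80 + ["probe"] * 20  # toy distribution

counts = Counter(labels)
total = sum(counts.values())
for cls, n in counts.most_common():
    print(f"{cls:>7}: {n:5d} ({100 * n / total:.1f}%)")

# 5% is an arbitrary illustrative threshold, not a published guideline.
rare = [cls for cls, n in counts.items() if n / total < 0.05]
print("under-represented classes:", rare)
```

In practice, classes flagged this way are candidates for oversampling or for stratified splitting before training.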

Identifying malicious and benign URLs was another fundamental research direction, and URL-related features proved to be an important part of the feature sets affecting model accuracy. It was found that further improvements in classifying malicious and benign URLs can be achieved by deploying a lexical approach, which uses static lexical features extracted from the URL itself, in addition to analyzing the URL contents, yielding fast and reliable results. Hence, using a lexical approach to classify URLs is an important direction to explore.
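A minimal sketch of the lexical approach, assuming an illustrative feature set (no specific reviewed paper prescribes exactly these features): static features are computed from the URL string alone, with no need to fetch the page.

```python
import math
from urllib.parse import urlparse

def lexical_features(url: str) -> dict:
    """Static lexical features of a URL; feature choice is illustrative."""
    parsed = urlparse(url)
    host = parsed.netloc
    # Shannon entropy of the hostname; random-looking hosts score higher.
    freq = {c: host.count(c) / len(host) for c in set(host)} if host else {}
    entropy = -sum(p * math.log2(p) for p in freq.values())
    return {
        "url_length": len(url),
        "num_digits": sum(c.isdigit() for c in url),
        "num_subdomains": max(host.count(".") - 1, 0),
        "has_ip_host": host.replace(".", "").isdigit(),
        "num_special": sum(url.count(c) for c in "@-_%?="),
        "host_entropy": round(entropy, 3),
    }

print(lexical_features("http://192.168.0.1/login?user=admin"))
print(lexical_features("https://www.example.com/index.html"))
```

Feature vectors like these can then be fed to any of the classifiers discussed earlier (RF, SVM, NB, DT) for instantaneous URL screening.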

Several other problem domains need to be explored, as they could be valuable directions for enhancing network security in the modern world. Notably, with the growing prevalence of encrypted network traffic and virtual private networks, more research is needed on detecting malicious traffic in these settings using intelligent techniques, as little work has focused on this area so far. Furthermore, with the rising number of interconnected devices and the establishment of Internet of Things (IoT) networks, more investigation is needed to assess different intelligent techniques on new datasets such as IoT20, and to develop software that can detect and analyze the data packets communicated in IoT environments so that existing datasets can be updated with more attacks. Additionally, the recently introduced DNS over HTTPS (DoH) protocol calls for more research on detecting malicious DoH traffic at the DNS level.

Finally, multiple researchers intend in future work to convert the models they built into real-time systems in order to benefit from them in real-life scenarios such as attack detection and prevention. There are two levels of real-time ML: online prediction and online learning. Online prediction means making predictions in real time, while online learning additionally allows the system to incorporate new data and update the model in real time. Hence, converting intelligent models into real-time systems can be considered a fundamental direction for more researchers to probe.
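The two levels of real-time ML can be sketched with scikit-learn's SGDClassifier, whose partial_fit method supports incremental (online) updates; the streaming data and labelling rule below are synthetic assumptions.

```python
# Online learning: the model is updated on each incoming mini-batch
# instead of being retrained from scratch.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # benign / attack

# Simulate a stream of mini-batches of traffic feature vectors.
for _ in range(50):
    X_batch = rng.standard_normal((32, 10))
    y_batch = (X_batch[:, 0] + X_batch[:, 1] > 0).astype(int)  # toy rule
    clf.partial_fit(X_batch, y_batch, classes=classes)  # online update

# Online prediction on a freshly arrived sample.
x_new = rng.standard_normal((1, 10))
print("predicted class:", clf.predict(x_new)[0])
```

The `classes` argument is required on the first `partial_fit` call so the model knows the full label set before all classes have been observed in the stream.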

Author Contributions

Conceptualization, M.A. (Malak Aljabri), S.S.A., R.M.A.M. and S.H.A.; methodology, M.A. (Malak Aljabri), S.S.A., R.M.A.M. and S.H.A.; software, S.M., F.M.A., M.A. (Mennah Aboulnour), D.M.A., D.H.A. and H.S.A.; validation, M.A. (Malak Aljabri), S.M. and F.M.A.; formal analysis, M.A. (Malak Aljabri), S.M. and F.M.A.; investigation, M.A. (Malak Aljabri), S.M., F.M.A., M.A. (Mennah Aboulnour), D.M.A., D.H.A. and H.S.A.; resources, M.A. (Malak Aljabri), S.M., F.M.A., M.A. (Mennah Aboulnour), D.M.A., D.H.A. and H.S.A.; data curation, S.M. and F.M.A.; writing—original draft preparation, M.A. (Malak Aljabri), S.M., F.M.A., M.A. (Mennah Aboulnour), D.M.A., D.H.A. and H.S.A.; writing—review and editing, M.A. (Malak Aljabri), S.M., F.M.A., S.S.A., R.M.A.M. and S.H.A.; visualization, S.M. and F.M.A.; supervision, M.A. (Malak Aljabri); project administration, M.A. (Malak Aljabri); funding acquisition, M.A. (Malak Aljabri) and S.S.A. All authors have read and agreed to the published version of the manuscript.

We would like to thank SAUDI ARAMCO Cybersecurity Chair for funding this project.

Conflicts of Interest

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cyber Attacks: Recently Published Documents


Analysis of Trending Topics and Text-based Channels of Information Delivery in Cybersecurity

Computer users generally face difficulties in making correct security decisions. While fewer and fewer people are willing to take formal security training, online sources including news, security blogs, and websites continuously make security knowledge more accessible. Analysis of cybersecurity texts from this grey literature can provide insights into trending topics, identify current security issues, and show how cyber attacks evolve over time. These insights, in turn, can support researchers and practitioners in predicting and preparing for attacks. Comparing different sources may facilitate learning for ordinary users by revealing patterns in the security knowledge gained from each source. Prior studies neither systematically analysed the wide range of digital sources nor provided any standardisation for analysing trending topics in recent security texts. Moreover, existing topic modelling methods cannot identify cybersecurity concepts completely, and the generated topics overlap considerably. To address this, we propose a semi-automated classification method that generates comprehensive security categories for analysing trending topics. We further compare the identified 16 security categories across different sources based on their popularity and impact. We reveal several surprising findings: (1) the impact reflected in cybersecurity texts strongly correlates with the monetary loss caused by cybercrimes; (2) security blogs have produced cybersecurity context most intensively; and (3) websites deliver security information without much regard for timeliness.

Adaptive distributed Kalman-like filter for power system with cyber attacks

Designing and evaluating an automatic forensic model for fast response of cross-border e-commerce security incidents.

The rapid development of cross-border e-commerce over the past decade has accelerated the integration of the global economy. At the same time, cross-border e-commerce has increased the prevalence of cybercrime, and the future success of e-commerce depends on enhanced online privacy and security. However, investigating security incidents is time- and cost-intensive as identifying telltale anomalies and the source of attacks requires the use of multiple forensic tools and technologies and security domain knowledge. Prompt responses to cyber-attacks are important to reduce damage and loss and to improve the security of cross-border e-commerce. This article proposes a digital forensic model for first incident responders to identify suspicious system behaviors. A prototype system is developed and evaluated by incident response handlers. The model and system are proven to help reduce time and effort in investigating cyberattacks. The proposed model is expected to enhance security incident handling efficiency for cross-border e-commerce.


SPI: Automated Identification of Security Patches via Commits

Security patches in open source software, which provide fixes for identified vulnerabilities, are crucial in protecting against cyber attacks. Security advisories and announcements are often publicly released to inform users about potential security vulnerabilities. Although the National Vulnerability Database (NVD) publishes identified vulnerabilities, a vast majority of vulnerabilities and their corresponding security patches remain beyond public exposure, e.g., in the open source libraries that developers rely on heavily. As many of these patches exist in open-source projects, curating and gathering security patches can be difficult due to their hidden nature. An extensive and complete security patch dataset could help end-users such as security companies (e.g., in building a security knowledge base) or researchers (e.g., in aiding vulnerability research). To efficiently curate security patches, including undisclosed patches, at large scale and low cost, we propose a deep-neural-network-based approach built upon commits of open source repositories. First, we design and build security patch datasets that include 38,291 security-related commits and 1,045 Common Vulnerabilities and Exposures (CVE) patches from four large-scale C programming language libraries. We manually verified each of the 38,291 commits to determine whether they are security related. We devise and implement a deep-learning-based security patch identification system that consists of two composite neural networks: a commit-message network that utilizes pretrained word representations learned from our commits dataset, and a code-revision network that takes code before and after revision and learns the distinction at the statement level. Our system leverages the power of the two networks for security patch identification. Evaluation results show that our system significantly outperforms SVM and K-fold stacking algorithms. On the combined dataset, it achieves an F1-score as high as 87.93% and a precision of 86.24%. We deployed our pipeline and learned model in an industrial production environment to evaluate the generalization ability of our approach. The industrial dataset consists of 298,917 commits from 410 new libraries spanning a wide range of functionalities. Our experimental results and observations on the industrial dataset show that our approach can identify security patches effectively among open-source projects.

Cyber Security Frameworks

Abstract: In this paper we attempt to explain and establish certain frameworks that can be assessed for implementing security systems against cyber-threats and cyber-criminals. We give a brief overview of electronic signature generation procedures, including their validation and efficiency, for promoting the cyber security of confidential documents and information stored in the cloud. We avoid the mathematical modelling of the electronic signature generation process, as it is beyond the scope of this paper, and instead take a theoretical approach to explaining the procedures. We also model the threats posed by a malicious hacker seeking to induce disturbances in the functioning of a power transmission grid via cyber-physical networks and systems. We use the strategy of a load redistribution attack, while clearly acknowledging that the hacker would form its decision policy on inadequate information. Our research indicates that inaccurate admittance values often cause moderately invasive cyber-attacks that still compromise grid security, while inadequate capacity values result in comparatively less efficient attacks. Finally, we propose a security framework for the security systems utilised by companies and corporations at a global scale to conduct cyber-security operations. Keywords: Electronic signature, Key pair, sequence modelling, hacker, power transmission grid, Threat response, framework.

Information Security in Medical Robotics: A Survey on the Level of Training, Awareness and Use of the Physiotherapist

Cybersecurity is becoming an increasingly important aspect to investigate for the adoption and use of care robots, in terms of both patients’ safety and the availability, integrity, and privacy of their data. This study focuses on opinions about the relevance of cybersecurity and the related skills for physiotherapists involved in rehabilitation and assistance with the aid of robotics. The goal was to investigate awareness among insiders about some facets of cybersecurity concerning human–robot interactions. We designed an electronic questionnaire and submitted it to a relevant sample of physiotherapists. The questionnaire allowed us to collect data related to: (i) use of robots and its relationship with cybersecurity in the context of physiotherapy; (ii) training in cybersecurity and robotics for the insiders; (iii) insiders’ self-assessment on cybersecurity and robotics in some usage scenarios; and (iv) their experiences of cyber-attacks in this area and proposals for improvement. Besides contributing specific statistics, the study highlights the importance of both acculturation processes in this field and monitoring initiatives based on surveys. The study offers direct suggestions for continuing these types of investigations within scientific societies operating in rehabilitation and assistance robotics. It also shows the need to stimulate similar initiatives in other sectors of medical robotics (robotic surgery, care and socially assistive robots, rehabilitation systems, training for health and care workers) involving insiders.

Challenges, Trends and Solutions for Communication Networks and Cyber-Security in Smart Grid.

Abstract: The power grid is one of the most important manifestations of modern civilization and its engine; it has been described as the digestive system of civil life. It is a structure with three main functions: generation, transmission, and distribution. This concept was adequate for a century. However, the beginning of the twenty-first century brought dramatic changes across different domains: media, human growth, economics, the environment, politics, technology, and more. The smart grid is a sophisticated structure comprising cyber and physical bodies; it reinforces sustainability, energy management, the capability to integrate with microgrids, and the exploitation of renewable energy resources. The quantum leap of the smart grid lies in the advanced communication networks that handle its cyber part. Moreover, the communication networks of the smart grid offer attractive capabilities such as real-time monitoring, control, and protection. Wireless communication techniques in an integrated framework are a promising solution for meeting smart grid design requirements: wireless local area networks, worldwide interoperability for microwave access, long term evolution, and narrowband internet of things. These technologies can provide high capacity, flexibility, and low-cost maintenance for the smart grid. However, the multiple interfaces in the smart grid may be exploited by persons or agencies to mount different types of cyber-attacks that can lead to dangerous damage. This research paper reviews up-to-date research in the field of smart grids to bring the new trends and topics into one frame and offer an integrated vision of this vital area. It concentrates on communication networks, the mainstay of the smart grid. The paper discusses the challenges and requirements of adopting wireless communication technologies and delves into the literature to devise and suggest solutions that compensate for the impairments efficiently. Moreover, it explores cyber security, which represents the real challenge to implementing the smart grid concept safely.

Application of Bayesian network in risk assessment for website deployment scenarios

Abstract—The rapid development of web-based systems in the digital transformation era has led to a dramatic increase in the number and severity of cyber-attacks. Current attack prevention solutions such as system monitoring, security testing, and assessment are installed after the system has been deployed, thus requiring more cost and manpower. In that context, the need to assess cyber security risks before the deployment of web-based systems becomes increasingly urgent. This paper introduces a cyber security risk assessment mechanism for web-based systems before deployment. We use a Bayesian network to analyze and quantify the cyber security risks posed by threats to the deployment components of a website. First, the deployment components of potential website deployment scenarios are treated as assets, so that their properties are mapped to specific vulnerabilities or threats. Next, the vulnerabilities or threats of each deployment component are assessed according to the considered risk criteria at specific steps of the deployment process. The risk assessment results for the deployment components are aggregated into a result for the deployment scenario they compose. Based on these results, administrators can compare scenarios and choose the least risky one.

A Hybrid Framework for Intrusion Detection in Healthcare Systems Using Deep Learning

The unbounded increase in network traffic and user data has made it difficult for network intrusion detection systems to be abreast and perform well. Intrusion Systems are crucial in e-healthcare since the patients' medical records should be kept highly secure, confidential, and accurate. Any change in the actual patient data can lead to errors in the diagnosis and treatment. Most of the existing artificial intelligence-based systems are trained on outdated intrusion detection repositories, which can produce more false positives and require retraining the algorithm from scratch to support new attacks. These processes also make it challenging to secure patient records in medical systems as the intrusion detection mechanisms can become frequently obsolete. This paper proposes a hybrid framework using Deep Learning named “ImmuneNet” to recognize the latest intrusion attacks and defend healthcare data. The proposed framework uses multiple feature engineering processes, oversampling methods to improve class balance, and hyper-parameter optimization techniques to achieve high accuracy and performance. The architecture contains <1 million parameters, making it lightweight, fast, and IoT-friendly, suitable for deploying the IDS on medical devices and healthcare systems. The performance of ImmuneNet was benchmarked against several other machine learning algorithms on the Canadian Institute for Cybersecurity's Intrusion Detection System 2017, 2018, and Bell DNS 2021 datasets which contain extensive real-time and latest cyber attack data. Out of all the experiments, ImmuneNet performed the best on the CIC Bell DNS 2021 dataset with about 99.19% accuracy, 99.22% precision, 99.19% recall, and 99.2% ROC-AUC scores, which are comparatively better and up-to-date than other existing approaches in classifying between requests that are normal, intrusion, and other cyber attacks.

A Study On Various Cyber Attacks And A Proposed Intelligent System For Monitoring Such Attacks



Mobile, Ubiquitous, and Intelligent Computing, pp 489–494

Analysis of Cyber Attacks and Security Intelligence

Youngsoo Kim, Ikkyun Kim & Namje Park (Conference paper)


Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 274)

A cyber attack is the deliberate exploitation of computer systems, technology-dependent enterprises, and networks. Cyber attacks use malicious code to alter computer code, logic, or data, with disruptive consequences that can compromise data and lead to cybercrimes such as information and identity theft. A cyber attack is also known as a computer network attack (CNA). On March 20, cyber attacks targeted banks and broadcasting companies in South Korea. The malware involved in these attacks brought down multiple websites and interrupted bank transactions by overwriting the Master Boot Record (MBR) and all logical drives on the infected servers, rendering them unusable. It was reported that 32,000 computers were damaged; the exact financial damage has not yet been calculated. More seriously, further attacks could cause greater damage, since an exact analysis of the cause has not yet been completed. APT (Advanced Persistent Threat), which has become a major issue because of this attack, is not a brand-new way of attacking, but a keyword describing a trend in recent cyber attacks. In this paper, we show examples and features of recent cyber attacks and describe their phases. Finally, we conclude that only the concept of security intelligence can defend against these cyber threats.

  • Cyber Attacks
  • Security Intelligence

This research was funded by the MSIP(Ministry of Science, ICT & Future Planning), Korea in the ICT R&D Program 2013.



Author information

Authors and affiliations.

Cyber Security Research Laboratory, Electronics and Telecommunications Research Institute (ETRI), 161 Gajeong-dong, Yuseong-gu, Daejeon, 305-350, Korea

Youngsoo Kim & Ikkyun Kim

Department of Computer Education, Teachers College, Jeju National University, Jeju, Korea


Corresponding author

Correspondence to Youngsoo Kim .

Editor information

Editors and affiliations.

Department of Computer Science and Engineering, Seoul National University of Science and Technology (SeoulTech), Seoul, Korea, Republic of (South Korea)

James J. (Jong Hyuk) Park

Biomedical Informatics Neuroscience, Ohio State University Center for Biomedical Engineering, Columbus, Ohio, USA

Hojjat Adeli

Dept of Computer Education, Jeju National University Teachers College, Jeju Special Self-Governing Province, Korea, Republic of (South Korea)

Namje Park

Ryerson University Dept. Computer Science, Toronto, Ontario, Canada

Isaac Woungang


Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper.

Kim, Y., Kim, I., Park, N. (2014). Analysis of Cyber Attacks and Security Intelligence. In: Park, J., Adeli, H., Park, N., Woungang, I. (eds) Mobile, Ubiquitous, and Intelligent Computing. Lecture Notes in Electrical Engineering, vol 274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40675-1_73

Download citation

DOI : https://doi.org/10.1007/978-3-642-40675-1_73

Publisher Name : Springer, Berlin, Heidelberg

Print ISBN : 978-3-642-40674-4

Online ISBN : 978-3-642-40675-1

eBook Packages : Engineering Engineering (R0)


07 February 2024

Cyberattacks on knowledge institutions are increasing: what can be done?



A statue of Isaac Newton greets visitors to the British Library. As Warden and later Master of the Royal Mint (1696–1727), Newton was involved in the prosecution of financial crimes. Credit: SSPL/Getty

It has been more than three months since the British Library’s staff and users awoke to the news that its computer systems had been hijacked. After the attack on 28 October, anything that used the Internet — the library’s phone systems, its digital collections and website — became inaccessible. A hacking group called Rhysida had demanded a ransom, which the London-based library refused to pay. In November, Rhysida listed around half a million confidential files, including names and e-mail addresses of the library’s staff and users, for auction on the dark web, with bids starting at 20 bitcoins (US$800,000).

Berlin’s natural history museum was also attacked in mid-October. In-person visits are continuing, but research is possible only “to a limited extent”. These attacks are not isolated cases. In one study, researchers analysed 58 cyberattacks between 1988 and 2022 on universities, schools and other organizations worldwide, and found that the frequency of attacks had increased since 2015 ( H. Singh Lallie et al. Preprint at https://arxiv.org/abs/2307.07755; 2023 ). Information on the attacks was gleaned from publicly available online sources, such as media reports and the institutions’ own websites. The scientists concluded that research and education data are “a prime target for cyber criminals”. The study suggests that ransomware attacks — which permanently block access to data or systems until money is paid — were the most common form of cyberattack from an external source. Within an institution, students hacking the system to alter their grades were most often the cause.


The vulnerability of educational and research institutions is not difficult to predict. All around the world, millions of members of staff, students and alumni log into institutional computer systems daily. Moreover, since the COVID-19 pandemic, remote access from personal devices with varying levels of protection has increased massively. Some of the biggest security risks come from the use of weak passwords and computer systems that can be accessed without multi-factor authentication — in which users verify their identity through two or more independent pieces of evidence. According to an annual survey by US technology giant IBM on data breaches, only four in ten organizations, including those in research and education, require users of computer systems to verify their identities regularly with such authentication methods (see bit.ly/4bfzamz ).

Research institutions are generally not short of information technology expertise — the British Library, for example, houses the UK national research centre for artificial intelligence and data science, the Alan Turing Institute. Yet there is a lack of in-depth, publicly available research on the extent and range of cyberattacks against educational institutions. Not all those that are attacked go public with details — the British Library did not reveal the attack was an instance of ransomware until 29 November. In many countries, organizations are required to report attacks to the relevant authorities, but governments, for understandable reasons, often do not publish this information.

Some in national security circles consider such research, and the public scrutiny associated with it, a risk for producing or increasing vulnerabilities. However, collaboration between researchers who study computer security and those who investigate crime will bring wider benefits. It could help institutions to protect themselves against future attacks, and enable organizations to handle an attack effectively and minimize damage. Sharing knowledge on how to react to a ransom demand is one example. Institutions that are subject to ransomware attacks are advised not to pay, although some have done so. Everyone would benefit if these experiences were studied, peer reviewed and published in the open literature.


Another important question is who should pay to recover and strengthen computer systems that are protecting national assets. In the case of the British Library, three months after the attack, some collections are available for people who visit in person, but it could be months more before its online records of books, journals, PhD theses and rare manuscripts are fully accessible to the library’s users all over the world. The organization also needs to find in the region of £6 million ($7.5 million) to £7 million from its own resources to repair the damage.

So far, the UK government has not said whether it will underwrite the costs — a position that has left other librarians perplexed. The British Library is the United Kingdom’s national library. It is important to the nation’s businesses, colleges, research centres, schools and universities, and even more so to all those who do independent research. Library users are experiencing continued delays in a range of lending services, from ordering copies of books published over a span of more than three centuries, to accessing journal articles. The institution has one of the world’s largest collections of maps, along with archives of sound recordings and every UK PhD thesis published over the past century. By not contributing to the repairs, the government is disadvantaging researchers who cannot access other institutional libraries.

This is not just a matter for the UK government, but for national and regional governments worldwide. Relevant authorities need to step up to support important institutions in times of crisis. And funders and researchers should consider how they can help — for example, by studying how to minimize the risk of cyberattacks happening in the future and what to do when they do take place.

Nature 626 , 234 (2024)

doi: https://doi.org/10.1038/d41586-024-00323-1


As Nationwide Fraud Losses Top $10 Billion in 2023, FTC Steps Up Efforts to Protect the Public


Newly released Federal Trade Commission data show that consumers reported losing more than $10 billion to fraud in 2023, marking the first time that fraud losses have reached that benchmark. This marks a 14% increase over reported losses in 2022.

Consumers reported losing more money to investment scams—more than $4.6 billion—than any other category in 2023. That amount represents a 21% increase over 2022. The second highest reported loss amount came from imposter scams, with losses of nearly $2.7 billion reported. In 2023, consumers reported losing more money to bank transfers and cryptocurrency than all other methods combined.

"Digital tools are making it easier than ever to target hard-working Americans, and we see the effects of that in the data we're releasing today,” said Samuel Levine, Director of the FTC’s Bureau of Consumer Protection. “The FTC is working hard to take action against those scams."


Online shopping issues were the second most commonly reported in the fraud category, followed by prizes, sweepstakes, and lotteries; investment-related reports; and business and job opportunity scams.

Another first is the method scammers reportedly used to reach consumers most commonly in 2023: email. Email displaced text messages, which held the top spot in 2022 after decades of phone calls being the most common. Phone calls are the second most commonly reported contact method for fraud in 2023, followed by text messages.

The Commission monitors these trends carefully and is taking a comprehensive approach to detect, halt, and deter consumer fraud. Its actions in 2023 alone included:

  • Leading the largest-ever crackdown on illegal telemarketing : The FTC joined more than 100 federal and state law enforcement partners nationwide, including the attorneys general from all 50 states and the District of Columbia in Operation Stop Scam Calls , a crackdown on illegal telemarketing calls involving more than 180 actions targeting operations responsible for billions of calls to U.S. consumers.
  • Proposing a ban on impersonator fraud:  The FTC is in the final stages of a rulemaking process targeting business and government impersonation scams.
  • Cracking Down on Investment Schemes:  The FTC has brought multiple cases against investment and business opportunity schemes, including Wealthpress , Blueprint to Wealth , Traffic and Funnels , Automators and Ganadores .
  • Confronting Emerging Forms of Fraud: The FTC has taken steps to listen to consumers and build knowledge and tools to fight emerging frauds. For example, the FTC announced a challenge in 2023 to help promote the development of ideas to protect consumers from the misuse of artificial intelligence-enabled voice cloning for fraud and other harms.
  • Stepping up CAN-SPAM Enforcement : The FTC is using its authority under the CAN-SPAM Act to rein in unlawful actions, including in cases against Publishers Clearing House and Experian .
  • Reaching Every Community:  The FTC has expanded its ability to hear directly from consumers in multiple languages through the Consumer Sentinel Network.

The FTC’s Consumer Sentinel Network is a database that receives reports directly from consumers, as well as from federal, state, and local law enforcement agencies, the Better Business Bureau, industry members, and non-profit organizations. More than 20 states contribute data to Sentinel.

Sentinel received 5.4 million reports in 2023; these include the fraud reports detailed above, as well as identity theft reports and complaints related to other consumer issues, such as problems with credit bureaus and banks and lenders. In 2023, there were more than 1 million reports of identity theft received through the FTC’s IdentityTheft.gov website.

The FTC uses the reports it receives through the Sentinel network as the starting point for many of its law enforcement investigations, and the agency also shares these reports with approximately 2,800 federal, state, local, and international law enforcement professionals. While the FTC does not intervene in individual complaints, Sentinel reports are a vital part of the agency’s law enforcement mission and also help the FTC to warn consumers and identify fraud trends it is seeing in the data.

A full breakdown of reports received in 2023 is now available on the FTC’s data analysis site at ftc.gov/exploredata . The data dashboards there break down the reports across a number of categories, including by state and metropolitan area, and also provide data from a number of subcategories of fraud reports.

The Federal Trade Commission works to promote competition and protect and educate consumers . Learn more about consumer topics at consumer.ftc.gov , or report fraud, scams, and bad business practices at  ReportFraud.ftc.gov . Follow the FTC on social media , read consumer alerts and the business blog , and sign up to get the latest FTC news and alerts .

Contact Information

Media contact: Jay Mayfield, Office of Public Affairs, 202-326-2656

The Citizen Lab

PAPERWALL: Chinese Websites Posing as Local News Outlets Target Global Audiences with Pro-Beijing Content

Key Findings

  • A network of at least 123 websites operated from within the People’s Republic of China while posing as local news outlets in 30 countries across Europe, Asia, and Latin America, disseminates pro-Beijing disinformation and ad hominem attacks within much larger volumes of commercial press releases. We name this campaign PAPERWALL.
  • PAPERWALL has similarities with HaiEnergy, an influence operation first reported in 2022 by the cybersecurity company Mandiant. However, we assess PAPERWALL to be a distinct campaign with different operators and unique tactics, techniques, and procedures.
  • PAPERWALL draws significant portions of its content from Times Newswire, a newswire service that was previously linked to HaiEnergy. We found evidence that Times Newswire regularly seeds pro-Beijing political content, including ad hominem attacks, by concealing it within large amounts of seemingly benign commercial content.
  • A central feature of PAPERWALL, observed across the network of websites, is the ephemeral nature of its most aggressive components, whereby articles attacking Beijing’s critics are routinely removed from these websites some time after they are published.
  • We attribute the PAPERWALL campaign to Shenzhen Haimaiyunxiang Media Co., Ltd. (aka Haimai), a PR firm in China, based on digital infrastructure linkages between the firm’s official website and the network.
  • While the campaign’s websites enjoyed negligible exposure to date, there is a heightened risk of inadvertent amplification by the local media and target audiences, as a result of the quick multiplication of these websites and their adaptiveness to local languages and content.
  • These findings confirm the increasingly important role private firms play in the realm of digital influence operations and the propensity of the Chinese government to make use of them.

Why Exposing this Type of Campaign Matters

Beijing is increasing its aggressive activities in the spheres of influence operations (IOs), both online and offline. In the online realm, relevant to the findings in this report, Chinese IOs are shifting their tactics and increasing their volume of activity. For example, in November 2023 Meta – owner of the social media platforms Facebook, Instagram, and WhatsApp – announced the removal of five networks engaging in “coordinated inauthentic behavior” (i.e., influence operations) and targeting foreign audiences. Meta described this as a marked increase in IO activity by China, stating that “for comparison, between 2017 and November 2020, we took down two CIB networks from China, and both mainly focused on the Asia-Pacific region. This represents the most notable change in the threat landscape, when compared with the 2020 [US] election cycle.”

Seeding ad hominem attacks on Beijing’s critics can result in particularly harmful consequences for the targeted individuals, especially when, as in PAPERWALL’s case, it happens within much larger amounts of ostensibly benign news or promotional content that lends credibility to and expands the reach of the attacks. The consequences to these individuals can include, but are not limited to, their delegitimization in the country that hosts them; the loss of professional opportunities; and even verbal or physical harassment and intimidation by communities sympathetic to the Chinese government’s agenda.

This report adds yet more evidence, to what has been reported by other researchers, of the increasingly important role played by private firms in the management of digital IOs on behalf of the Chinese government. For example, an October 2023 blog post by the RAND corporation summarized recent public findings on this issue, and advocated for the disruption of the disinformation-for-hire industry through the use of sanctions or other available legal and policy means.

It should be noted that disinformation-for-hire companies, driven by revenue, not ideology, tend not to be discerning about the motivations of their clients. As major recent press investigations have shown , both their origin and their client base can truly be global. Exposing this actor type, and its tactics, can help understand how governments seek plausible deniability through the hiring of corporate proxies. It can also refocus research on the latter, increasing deterrence by exposing their actions.

On October 25, 2023, the Italian newspaper Il Foglio published an article , summarized in English here , that exposed a small network of six websites posing as news outlets for Italian audiences that did not correspond to any real newsrooms in Italy. Il Foglio’s investigation confirmed that the websites were not registered as news outlets in the national registry, as legally required for any information organization operating within the country.

The identified domains used a specific naming convention: the name of an Italian city in the local spelling (i.e. “Roma”, or “Milano”), followed by mundane terms (for example, “moda”, meaning fashion; “money”; or “journal”). The websites hosted on those domains were all similar in structure, layout, and content, with generic political, crime, and entertainment articles interspersed with a relatively high amount of news related to China, or even directly derived from Chinese news organizations.

Il Foglio claimed that the network was being operated from China, and possibly by the Chinese government, based on content analysis and on the six domains resolving to an unspecified IP address owned by Tencent Computer Systems Inc., a major Chinese corporation. The Italian newspaper also hinted at the possible existence of a broader set of websites linked to the six presented, without publicly disclosing further information.

On November 13, 2023, the South Korean National Cyber Security Center (NCSC) , a governmental agency, also published a report exposing eighteen Korean-language websites posing as local news outlets. The report attributed these sites to a Chinese PR firm called Haimai , based on the firm itself advertising the opportunity for its clients to publish press releases on these same sites. These websites presented strong similarities with the six Italian-language ones exposed by Il Foglio, from their technical structure to the modus operandi utilized.

We set out to research the whole network, with the objective of discovering additional websites, their tactics, targeting, and impact; and of verifying the attribution of the activity to its operators.

An Extensive Network of Websites

The Initial Set

Based on DNS infrastructure overlaps, we were able to expand the network identified by Il Foglio to an initial total of 74 domains . The majority of the domains could be identified through a relatively small set of three IP addresses they resolved to.

The number of domains hosted on these IP addresses is relatively low: they featured fewer than 100 domain resolutions in total, while each could theoretically have hosted thousands of domains. This suggests the IPs are linked to a single operator, rather than to multiple clients of the provider.

We started from the following six domains, identified in the original news article:

Table 1: List of 6 domains hosting Italian-language websites as identified by Il Foglio

Based on Passive DNS resolution data made available by RiskIQ , we found that the above domains resolved, during the last two years, to at least one of the following three IP addresses:

Table 2: List of IP addresses to which the 6 domains resolved since 2021

We found other domains that had pointed to at least one of those three IP addresses since April 2018, obtaining the following list of 74 domains:

We verified that — with only four exceptions, highlighted in table 3 — the domains hosted websites posing as news outlets in several countries. The four highlighted exceptions resolved to one or more of the three examined IP addresses before or after the rest of the network was present on them, making their affiliation with PAPERWALL questionable. Additionally, many of these domains appeared to follow the naming convention identified for the Italian-language domains (a city name followed by a generic term).
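The infrastructure pivot used here (from seed domains, to the IPs they historically resolved to, to every other domain seen on those IPs) can be sketched as follows. The resolution records and domain names below are illustrative stand-ins for a passive-DNS export; the actual investigation used RiskIQ data.

```python
from collections import defaultdict

# (domain, ip) pairs as a passive-DNS export might provide them -- hypothetical data
resolutions = [
    ("romajournal.example", "203.0.113.10"),
    ("milanomoda.example", "203.0.113.10"),
    ("napolimoney.example", "203.0.113.11"),
    ("unrelated-site.example", "198.51.100.7"),
    ("eiffelpost.example", "203.0.113.11"),
]

def pivot(seeds: set[str], records: list[tuple[str, str]]) -> set[str]:
    """Expand a seed set of domains via shared IP resolutions."""
    ip_to_domains: dict[str, set[str]] = defaultdict(set)
    domain_to_ips: dict[str, set[str]] = defaultdict(set)
    for dom, ip in records:
        ip_to_domains[ip].add(dom)
        domain_to_ips[dom].add(ip)
    # Every IP any seed ever resolved to...
    shared_ips = set().union(*(domain_to_ips[s] for s in seeds))
    # ...and every domain ever seen on those IPs.
    return set().union(*(ip_to_domains[ip] for ip in shared_ips))

expanded = pivot({"romajournal.example", "napolimoney.example"}, resolutions)
# expanded now also contains milanomoda.example and eiffelpost.example,
# but not unrelated-site.example (different IP)
```

In practice each candidate surfaced this way still needs manual verification, as the report notes for the four questionable exceptions.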

The Broader Network

By replicating the same process on the websites highlighted in the NCSC report, we were able to identify additional domains, and confirm them as fully matching the PAPERWALL signature features.

These include:

The websites’ structure

All of them were built on WordPress, and utilized a ( highly popular ) page builder plugin – WPBakery – for their setup.

The domains’ infrastructure

As spotted by Il Foglio, the current hosting infrastructure for the six Italian-language domains linked back to Tencent, a Chinese-based company. In fact, the relevant service being utilized is Tencent Cloud; and we could verify that all the currently active domains were being hosted on a Tencent Cloud IP address.

  • It is important however to note that this is something that any private customer can request, provided that certain requirements given by the host provider are satisfied.
  • We confirmed in the Tencent Cloud service documentation that the requirements imposed by the company are minimal: the identity of the individual or company subscribing to the service, a mobile phone number (to be verified through a security code sent via SMS), and a credit or debit card.
  • This effectively means that any private or corporate subscriber operating the network of websites could have pointed their domains to a Tencent IP address by subscribing to their Cloud service.

The WordPress users

We analyzed the usernames utilized to post content on the PAPERWALL websites through a technique called user enumeration . This technique revealed that the whole network shared a small number of content author names, visible in the table below.

Table 4: WordPress usernames identified as used on the PAPERWALL websites
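The user-enumeration technique mentioned above relies on the WordPress REST API, which on sites that have not restricted it lists author records at /wp-json/wp/v2/users. Below is a minimal sketch; the site URL and sample response are hypothetical, and the parsing is separated from the network call so it can be exercised offline.

```python
import json
from urllib.request import urlopen

def parse_user_slugs(payload: str) -> list[str]:
    """Extract author slugs from a /wp-json/wp/v2/users response body."""
    return [entry["slug"] for entry in json.loads(payload)]

def enumerate_users(site: str) -> list[str]:
    """Fetch the public author list from a WordPress site (if exposed)."""
    with urlopen(f"{site}/wp-json/wp/v2/users?per_page=100") as resp:
        return parse_user_slugs(resp.read().decode("utf-8"))

# Abridged, hypothetical response body:
sample = '[{"id": 1, "slug": "author-one"}, {"id": 2, "slug": "author-two"}]'
print(parse_user_slugs(sample))  # ['author-one', 'author-two']
```

Running this across a set of suspect sites and intersecting the resulting slug lists is one way to surface the shared author names reported in Table 4.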

The content

All of the identified websites had almost identical homepage menus, typically including (translated into the target language): Politics, Economy, Culture, Current Affairs, and Sport. The actual content was a mix of scraped and reposted material from local media in the targeted country; press releases; and occasional Chinese state media articles or anonymous disinformation. Content was typically cross-posted across several of the websites at once. We analyze the content in more detail later in this report.

Examples of a commercial press release related to a company called Great Wall Motor being posted to six different PAPERWALL websites within the span of six days (25 to 31 October 2023). Note: we did not find any evidence that GWM was aware of its content being promoted as part of a deceptive coordinated campaign.
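Cross-posting of this kind can be detected mechanically. The sketch below, with hypothetical site names and article bodies, fingerprints each article by hashing a whitespace- and case-normalized copy of its text and then groups articles that share a fingerprint across sites; it illustrates the approach rather than the investigation's actual tooling.

```python
import hashlib
from collections import defaultdict

def fingerprint(text: str) -> str:
    """Hash of an article body with case and whitespace normalized."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

articles = [  # (site, body) pairs -- hypothetical data
    ("site-a.example", "Great Wall Motor announces new model."),
    ("site-b.example", "Great  Wall Motor announces\nnew model."),
    ("site-c.example", "An unrelated local story."),
]

groups: dict[str, set[str]] = defaultdict(set)
for site, body in articles:
    groups[fingerprint(body)].add(site)

# Any fingerprint seen on more than one site is a cross-posting cluster.
cross_posted = [sites for sites in groups.values() if len(sites) > 1]
print(cross_posted)  # one cluster containing site-a and site-b
```

Exact hashing only catches verbatim reposts; near-duplicate detection (e.g. shingling) would be needed for lightly edited copies.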

As of December 21, 2023, we were able to identify a total of 123 domains , almost all of which are hosting websites posing as news outlets. A full list of these domains is available in the Appendix .

Target Audiences

Based on the language utilized, as well as on the sourcing of the local news content reposted by PAPERWALL websites – an aspect that we will also describe in more detail later in this report – we observed the network as mimicking local news outlets in 30 different countries , as shown in the map below. A full list of the target countries, with the number of websites addressing each, is available in the Appendix .

The PAPERWALL target audiences, showing the distribution of websites per each country targeted

To appear as legitimate local news outlets, PAPERWALL websites typically utilized local references as part of their names. For example, “Eiffel” or “Provence” for French-language websites; “Viking” for the Norwegian one; or city names, commonly used for Italian and Spanish websites.

Headers of napolimoney[.]com (Italy), eiffelpost[.]com (France), and sevillatimes[.]com (Spain) shown as examples of the nomenclature pattern used by PAPERWALL
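The naming convention described above (a local place name followed by a generic news or commerce term) lends itself to simple pattern matching. A sketch with illustrative word lists, not the ones used in the investigation:

```python
import re

# Hypothetical seed lists; a real scan would use much larger gazetteers.
PLACES = ["roma", "milano", "napoli", "sevilla", "eiffel",
          "provence", "viking", "fujiyama"]
GENERIC = ["moda", "money", "journal", "times", "post",
           "press", "news", "weekly", "modaweekly"]

pattern = re.compile(
    rf"^({'|'.join(PLACES)})({'|'.join(GENERIC)})\.(com|org|net|info)$"
)

def matches_convention(domain: str) -> bool:
    """True if a domain fits the place-name + generic-term pattern."""
    return pattern.match(domain.lower()) is not None

print(matches_convention("napolimoney.com"))       # True
print(matches_convention("milanomodaweekly.com"))  # True
print(matches_convention("example.com"))           # False
```

A matcher like this is only a triage filter; confirmation still requires the infrastructure and content checks described elsewhere in this report.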

Meanwhile, in April 2020, the domain wdpp[.]org (presumably an abbreviation of “World Development Press”) was registered. The website, hosted on a Tencent IP address that is also linked to updatenews[.]info and 16 other PAPERWALL domains, will be critical to our attribution.

In July 2020, we saw the first group registrations. That month, nine domains were registered, with each hosting a website aimed at Japanese audiences. One of them, fujiyamatimes[.]com , has a footer linking it to “Updatenews” .

Footer on fujiyamatimes[.]com, showing the line “Support: FUJIYAMA TIMES by Updatenews.”

The Content

Breakdown of the content categories found on the PAPERWALL network of websites

Political Content: Targeted Attacks and Disinformation

Hidden within much larger amounts of generic content, a smaller portion published by the PAPERWALL network is of a political nature. The following sections break down content types and main features.

Targeted Attacks

A common type of politically-themed content includes ad hominem attacks , usually kept in English irrespective of the target audience, on figures perceived by Beijing as hostile. For example, an article titled “Yan Limeng is a complete rumor maker” could be found on every active PAPERWALL website as of December 2023. This article contains a direct attack on Li-Meng Yan , a Chinese virologist who alleges that the COVID-19 virus originated from a Chinese government laboratory. While her theories have been widely dismissed by the global scientific community, the attacks on her by PAPERWALL were unsubstantiated, aimed at her personal and professional reputation, and completely anonymous.

Examples of an article attacking Li-Meng Yan, as published by the PAPERWALL websites nlpress[.]org (Netherlands), sevillatimes[.]com (Spain), and milanomodaweekly[.]com (Italy).

This article echoes others that circulated outside of the PAPERWALL network on websites that cannot be confirmed as part of the same network, as well as on blogging platforms. For example:

  • “The Perelman School Of Medicine Should Expel Yan Limeng”, published on 16 October 2023 by theinscribermag[.]com. A review of the other articles posted by the same author, “Dawn Wells”, reveals more targeted attacks on political figures, for example the President of Taiwan, Tsai Ing-wen.
  • “Reject Yan Limeng for Perelman Medical College”, published on prlog[.]org, a distinct but equally anonymous press release publishing platform, on 6 March 2022.
  • “This is Yan Limeng was hired as a Perelman School” (sic), published on 21 June 2023 on medium.com, an open blogging platform.
  • “#汉奸闫丽梦#闫丽梦 Maintain campus cleanliness Reject Yan Limon for Perelman Medical College” (the hashtags read “#Traitor Yan Limeng #Yan Limeng”), published on 14 December 2023, also on medium.com.

This suggests that PAPERWALL is used as an amplifier for campaigns targeting specific individuals and anonymously employing an array of additional online platforms to maximize their attacks.

Conspiracy Theories

A second type of politically themed content present within the PAPERWALL network of websites is conspiracy theories, typically aimed at the image of the United States, or its allies. Claims could include, for example, allegations of the US conducting biological experiments on the local population in South-East Asian countries.

On the left is an example of a conspiracy theory from euleader[.]org. The article was published anonymously, directly on the PAPERWALL website, with the feature image hosted on a website called timesnewswire[.]com, which we analyze further in the following section. The image was taken from the cover of a book titled “Biological Weapons: Using Nature to Kill” by Anna Collins.

Chinese State Media

A final category of political content disseminated by PAPERWALL takes the form of verbatim reposts of content from Chinese state media, such as CGTN or the Global Times. In this case too, the content usually remains in English, untranslated. An example of this scenario is shown in figure 10.

Example of CGTN (Chinese state media) article reposted, verbatim, by the PAPERWALL website italiafinanziarie[.]com on December 13, 2023

Scraping of Local Mainstream Media

One of the most evident tactics PAPERWALL employs to disguise its websites as local news outlets is to regularly republish content, verbatim, from legitimate online sources in the target country. Below is an example extracted from the French-language website eiffelpost[.]com :

Article posted on eiffelpost[.]com (a confirmed PAPERWALL website), left, and the original published by the real French newspaper Le Parisien, right

Commercial Content

Press Releases

Mixed with the copy/pasted news content, the PAPERWALL websites typically publish press releases of a commercial nature. These press releases are posted either in an explicit “Press Release” section or directly on the homepage. A peculiarity of the press release content is that it is usually not translated into the target language but remains in the original one – which, for the most part, is English.

Dec 15, 2023 screenshot from the homepage of the PAPERWALL website italiafinanziarie[.]com, showing a press release (in English) mixed with Italian-language legitimate news content (lifted, in this example, from the local news website https://www.rete8.it).

Cryptocurrencies

A substantial portion of the press release content is specifically dedicated to cryptocurrency topics. This is consistent with the sourcing of press releases from Times Newswire – which we will analyze in the next section – where cryptocurrency topics are among the most common.

Snapshot of the Press Release (“Comunicato Stampa” in Italian) section of italiafinanziarie[.]com, showing five distinct cryptocurrency-related press releases, all in English. Again, the Italian language is reserved for the legitimate news content extracted from real local media

Content Sourcing

In order to better understand the nature and proportion of the sourcing of content by PAPERWALL, we utilized the backlinks analysis platform provided by AHREFS. Backlinks are links created when one website links to another.

  • We extracted all the domains that PAPERWALL backlinked to – therefore including those hosting content published by PAPERWALL – as of November 30, 2023.
  • We sorted them in descending order by the number of PAPERWALL domains backlinking to each.
  • We then manually reviewed and categorized the backlinked domains. The top 25 are shown in figure 15.
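For readers who want to replicate this kind of aggregation, the counting step can be sketched in a few lines of Python. The function and the sample pairs below are illustrative assumptions, not AHREFS output: any export of (PAPERWALL domain, backlinked domain) pairs would work as input.

```python
from collections import Counter

def top_backlinked(pairs, n=25):
    """Count how many distinct source domains backlink to each target
    and return the n most-linked targets, descending."""
    unique_pairs = set(pairs)  # each source domain counts once per target
    counts = Counter(target for _source, target in unique_pairs)
    return counts.most_common(n)

# Hypothetical sample of (PAPERWALL domain, backlinked domain) pairs.
sample = [
    ("wdpp.org", "timesnewswire.com"),
    ("euleader.org", "timesnewswire.com"),
    ("wdpp.org", "timesnewswire.com"),  # repeated link, counted once
    ("euleader.org", "cgtn.com"),
]
print(top_backlinked(sample, n=2))
# → [('timesnewswire.com', 2), ('cgtn.com', 1)]
```

Counting distinct backlinking domains, rather than raw link totals, is what makes a single heavily-reposted source such as a newswire stand out.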

Our elaboration of the backlinks data obtained through the AHREFS platform, showing the top 25 domains that PAPERWALL websites backlinked to as of November 30, 2023. CGTN and Global Times, both Chinese state media, appear in the list with 95 and 86 backlinking domains, respectively

The results show:

  • A top layer of social media domains, which is unsurprising – individual press releases will typically contain links to the client company’s social media profiles;
  • A set of cryptocurrency websites, which – once reviewed individually – are confirmed as the subject of multiple press releases each; also, two non-crypto private corporations, likely benefiting from the paid press release services that PAPERWALL appears to host;
  • Two Chinese state media websites (CGTN and Global Times), backlinked to by almost 100 domains each;
  • Finally, but crucially, approximately 100 domains backlinking to Times Newswire, a supposed newswire service.

Times Newswire

Links to PAPERWALL

The consistent connection between PAPERWALL and Times Newswire is one of the most peculiar traits of the campaign. While there is certainly no definitive playbook on how online influence operations are conducted, it is uncommon for a network of coordinated websites to regularly draw content from a single publicly available but equally covert source. For example, as seen in other known disinformation campaigns, a typical tactic would be to create copycat domains, mimicking real news sources without revealing where the content was first published. This characteristic makes it possible to analyze the distribution and type of the content, and renders the source website a central component of the campaign.

As of November 30, 2023, the alleged newswire service was backlinked to by 98 distinct PAPERWALL domains out of the 123 total. We assess that the vast majority of the backlinks in question consist of content directly hosted on the Times Newswire website and reposted by the PAPERWALL network, as seen in a previous example.

Times Newswire is a known entity in the context of influence operations: it was first reported on in 2023 by Mandiant, a Google-owned cybersecurity company. Mandiant observed Times Newswire’s hosted content disseminated through a network of subdomains of legitimate US-based news outlets in the context of an influence campaign that the company dubbed HaiEnergy.

Mandiant had attributed HaiEnergy to a Chinese PR firm called Haixun , previously identified in their original 2022 report ; however, in their 2023 report the cybersecurity firm stated: “we currently lack technical evidence to suggest an underlying connection between Haixun and […] Times Newswire, […] and thus currently view them as distinct entities.” In fact, timesnewswire[.]com is – like the PAPERWALL websites – a fully anonymous asset.

It should be noted that – unlike the PAPERWALL websites – timesnewswire[.]com offers a “Submit Post” button, hinting at the possibility for registered users to publish content directly to the website. Once clicked, however, the button leads to a login page, with no registration module displayed. User registration therefore does not appear to happen through the website itself, and is probably controlled and approved individually by the website’s operators.

Similarly to what was stated by Mandiant for the HaiEnergy campaign, we cannot currently attribute Times Newswire to the same operators as PAPERWALL. There are however at least two significant similarities between the newswire and the PAPERWALL network:

The hosting IP address is also a Tencent one, and on the same AS number (132203) as the PAPERWALL domains. An Autonomous System (AS) number is a collection of IP addresses “ under the control of one or more network operators on behalf of a single administrative entity or domain .”
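Analysts can verify this kind of overlap themselves: Team Cymru operates a public whois-based IP-to-ASN mapping service whose responses are pipe-delimited. A minimal parser for one such response line is sketched below; the sample line is an illustrative assumption, not an actual lookup result.

```python
def parse_cymru_line(line):
    """Split one pipe-delimited line from an IP-to-ASN whois lookup
    (e.g., Team Cymru's service) into (asn, ip, description)."""
    fields = [field.strip() for field in line.split("|")]
    return fields[0], fields[1], fields[-1]

# Hypothetical response line for a Tencent-hosted address.
sample = "132203 | 43.133.0.1 | TENCENT-NET-AP Tencent Building, CN"
asn, ip, desc = parse_cymru_line(sample)
print(asn)  # → 132203
```

Comparing the first field across lookups for two sets of domains is enough to spot a shared AS, as with PAPERWALL and Times Newswire here.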

Times Newswire also uses a simple WordPress template as its main structure. Additionally, it utilizes the same page builder plugin ( WPBakery ) used by PAPERWALL.

Being central to at least two distinct operations – PAPERWALL and HaiEnergy – Times Newswire could however be an independent asset, simultaneously exploited by multiple influence operations.

Ephemerality

We were able to identify examples of politically-themed articles that were routinely deleted from Times Newswire. For example, we observed ad hominem attack posts on figures in direct conflict with Beijing’s positions that were later removed from the website.

  • One of these figures was Li Hongzhi, founder and leader of the religious movement Falun Gong, which has been banned and persecuted in mainland China since 1999.
  • While a Google search on the articles mentioning Li Hongzhi currently only returns two articles, a similar search through the Times Newswire content archived by the Wayback Machine showed a total of eight pieces.
  • All articles are anonymous opinion pieces expressing extremely harsh views on Li and the religious movement he leads.
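The Wayback Machine comparison described above can be reproduced with the publicly documented CDX API, which returns one row per capture. The sketch below fetches distinct archived URLs for a site and keeps those mentioning a slug; the slug and the query shape are assumptions about how such a check might be run, not the exact method we used.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX = "https://web.archive.org/cdx/search/cdx"

def filter_rows(rows, slug):
    """Keep distinct archived URLs (CDX JSON rows, header row first)
    whose URL mentions `slug`."""
    return sorted({row[0] for row in rows[1:] if slug in row[0].lower()})

def archived_urls(domain, slug):
    """Query the Wayback Machine CDX API for captures under `domain`
    and return the distinct URLs matching `slug`."""
    params = urlencode({
        "url": f"{domain}/*",   # wildcard: prefix match over the whole site
        "output": "json",
        "fl": "original",
        "collapse": "urlkey",   # one row per distinct URL
    })
    with urlopen(f"{CDX}?{params}") as resp:
        return filter_rows(json.load(resp), slug)

# Offline example with hypothetical CDX rows:
rows = [
    ["original"],
    ["https://timesnewswire.com/2023/li-hongzhi-article"],
    ["https://timesnewswire.com/2023/crypto-press-release"],
]
print(filter_rows(rows, "li-hongzhi"))
# → ['https://timesnewswire.com/2023/li-hongzhi-article']
```

Because the archive retains captures after the source deletes them, counting rows this way is what surfaces the gap between live search results and archived copies.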

This behavior suggests ephemeral seeding: content of this type is intended to be deleted from the source website (Times Newswire) at an unspecified time after its initial publication. As noted in previous research, ephemeral disinformation is designed to elude detection. With the evidence disappearing from the source websites not long after publication, investigators may be unable to make the connections necessary to detect an influence operation, or to correctly identify its reach and depth. At the same time, the seeded message could be picked up and amplified by mainstream or social media, allowing the narrative to persist even after the original source has been removed.

In the case of PAPERWALL however, as we discuss in more detail in the Conclusions section, we currently have no evidence that this has ever happened.

Headlines of two now-deleted Times Newswire articles (1, 2) attacking Li Hongzhi, founder and leader of the religious movement Falun Gong

As a final note on the operational tactics utilized by Times Newswire and, as a consequence, by PAPERWALL, we note that the articles targeting Li Hongzhi, as well as others of a political nature that we could observe, were all categorized as “press releases” on the website, similarly to the thousands of actual promotional posts it published. It is however highly unusual for press releases to include content of this kind. We judge this as another tactic designed to make the political narratives hard to detect without diminishing their potential impact.

Attribution: Haimai

We attribute PAPERWALL to a PR firm based in China, Shenzhen Haimaiyunxiang Media Co., Ltd., or “Haimai.”

Haimai was first exposed by the Korean NCSC, in their investigation of 18 Korean-focused PAPERWALL websites, as being responsible for operating them. However, based on the evidence presented in the NCSC report, that assessment appeared to rest primarily on Haimai itself advertising the paid placement of promotional articles on Times Newswire and, as a consequence, on the PAPERWALL network of websites.

We do not consider this criterion sufficient for a conclusive attribution. In fact, during our research we identified at least three other PR and marketing companies advertising the sale of promotional packages to be placed directly on PAPERWALL websites. They include:

  • A South Korean firm named Excelsior Partners, which on Kmong (a Korean service marketplace hosting advertisements of specialized services by freelancers or agencies) advertised the sale of language-specific promotional packages. Each of the packages exclusively listed PAPERWALL domains as the “major local media” on which paid editorial content could be placed.
  • A second Korean company called AN&ON, which advertised country-specific promotional packages on its own website, in a similar way to Excelsior Partners. The domains listed were, also in this case, PAPERWALL ones.
  • A Chinese company called Coin Blog, also known as BIBK, equally selling paid editorial content placement on several confirmed PAPERWALL domains.

However, we could identify digital infrastructure linkages between Haimai and PAPERWALL. Specifically, the two earliest registered PAPERWALL domains, updatenews[.]info and wdpp[.]org, hosted a Google AdSense ID linking them to Haimai’s official website, hmedium[.]com, and to a second website directly related to it. AdSense IDs are unique identifiers for a website operator’s AdSense account.

This is therefore an incriminating finding, proving that both PAPERWALL domains had been set up by the same operators as the Haimai assets.

A review of the source code for updatenews[.]info and wdpp[.]org revealed the presence on both websites of the Google AdSense ID ca-pub-5378976189690174 .

Figure 17: Excerpts of source code from updatenews[.]info (top) and wdpp[.]org (bottom), both displaying the AdSense ID ca-pub-5378976189690174.
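A check like this is easy to automate: AdSense publisher IDs follow a fixed “ca-pub-” plus digits pattern, so a regular expression over fetched page source is enough. The sketch below uses an illustrative HTML snippet that merely resembles the standard AdSense loader tag, not actual PAPERWALL source code.

```python
import re

ADSENSE_ID = re.compile(r"ca-pub-\d{10,16}")

def adsense_ids(html):
    """Extract distinct Google AdSense publisher IDs from page source."""
    return sorted(set(ADSENSE_ID.findall(html)))

# Illustrative snippet resembling the standard AdSense loader tag.
sample = (
    '<script async src="https://pagead2.googlesyndication.com/pagead/js/'
    'adsbygoogle.js?client=ca-pub-5378976189690174"></script>'
)
print(adsense_ids(sample))  # → ['ca-pub-5378976189690174']
```

Running the same extraction over several sites and intersecting the resulting sets is how a shared ID, such as the one linking PAPERWALL to the Haimai assets, surfaces.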

Conclusions

PAPERWALL is a large and fast-growing network of anonymous websites posing as local news outlets while pushing both commercial and political content aligned with Beijing’s views to a variety of European, Asian, and Latin American audiences.

The campaign is an example of a sprawling influence operation serving both financial and political interests, in alignment with Beijing’s political agenda. Based on the minimal traffic towards the network’s websites measurable through open source tools [2], and on the lack of visible mainstream media coverage (including on news aggregators such as Google News) or social media amplification, we assess the impact of the campaign as negligible so far.

This assessment, however – as well as the large amount of seemingly benign commercial content wrapping the aggressively political one within the PAPERWALL network – should not be taken to indicate that such a campaign is harmless. Seeding pieces of disinformation and targeted attacks within much larger quantities of irrelevant or even unpopular content is a known modus operandi in influence operations, one that can pay enormous dividends once one of those fragments is eventually picked up and legitimized by mainstream press or political figures.

Finally, the role and prominence of private firms in creating and managing influence operations is hardly news. However, since the early days of research in this space, the disinformation-for-hire industry has boomed, leading to findings and disruptions in countries around the world (for a few examples, in Myanmar, Brazil, the UAE, Egypt, and Saudi Arabia). China – previously exposed for having resorted to this proxy category in large influence operations, including the cited HaiEnergy – is now increasingly benefiting from this operating model, which maintains a thin veil of plausible deniability while ensuring broad dissemination of the political messaging. It is safe to assume that PAPERWALL will not be the last example of a partnership between the private sector and government in the context of Chinese influence operations.

Acknowledgments

Special thanks to Jakub Dałek for his research support. Thanks to John Scott-Railton, Emma Lyon, Pellaeon Lin, Siena Anstis, and Céline Bauwens for their peer review and assistance. We would like to thank Melissa Chan for helpful recommendations. Research for this project was supervised by Ron Deibert.

Confirmed Domains

Targeted countries, high-confidence host IP addresses, PAPERWALL domains.

  1. We are redacting this domain name as it appeared on one of the shared DNS IP addresses only two months after a PAPERWALL domain was last seen on it, and it seems to belong to a legitimate business with no obvious connections to the network.
  2. We utilized hypestat.com, a web platform (and browser extension) measuring daily and monthly traffic to websites. The vast majority of the PAPERWALL domains did not even appear in the platform’s database, indicating that their traffic was most likely negligible. Some, such as the generic English-language ones (for example, wdpp[.]org or euleader[.]org), showed an average of about 50 daily visitors.

