Mistrust Plugins You Must: A Large-Scale Study Of Malicious Plugins In WordPress Marketplaces

March 7, 2023

Research

Authors:

Ranjita Pai Sridhar, Jonathan Fuller, Yiting Sun, Omar Chabklo, Andres Rodriguez, Jeman Park, Brendan Saltaformaggio

Article shepherded by:

Rik Farrow

Many modern websites are built almost entirely from plugins and themes, placing implicit trust in large amounts of unvetted code with unrestricted access to the webserver. This trust is often broken for monetary gain: malicious plugin authors sell plugins packed with malware to unsuspecting victims.

We developed YODA, an automated framework to detect malicious plugins and track down their origin. YODA uncovered 47,337 malicious plugins on 24,931 unique websites by investigating over 400K production webservers dating back to 2012. Among these, $41.5K had been spent on 3,685 malicious plugins sold on legitimate plugin marketplaces while pirated plugins cheated developers out of $228K in revenues. Post-deployment attacks infected $834K worth of previously benign plugins with malware.

1. Perilous Economy Behind CMS Plugins

Economy of CMS Marketplaces. WordPress plugins and themes generate millions of dollars in sales every year. These plugins are created by an individual or teams of developers, including WordPress developers themselves.

Table 1: The Economy of WordPress Plugin Marketplaces.

In collaboration with CodeGuard, we had an opportunity to investigate the nightly backups of over 400K unique WordPress websites dating back to 2012. We performed a preliminary study to understand the scale of the economy in which this fraud takes place. As seen in Table 1, thousands of plugins are freely available in WordPress repositories and on software development platforms (e.g., GitHub), while paid versions of the plugins are sold through marketplaces (e.g., CodeCanyon). WP Plugins is the most popular marketplace overall, with 7.5M average downloads per plugin. Some marketplaces do not sell individual plugins and instead provide a subscription service for all plugins at a flat rate. For example, WPMU DEV has a $49/month subscription and is the most popular paid plugin marketplace in our dataset with 1.4M average downloads per plugin. Less-popular plugins are also directly available from freelance developers or small businesses. As seen in Column 9 of Table 1, website owners from our dataset alone spent $7.3M at plugin marketplaces, and we estimate the revenue earned by these plugins globally to be over $210M based on a conservative estimate of the reported download counts.

Loosely-regulated Marketplaces. While these marketplaces are growing rapidly, the regulations to assess plugins are minimal. For example, our study found a CodeCanyon plugin for $10M with a note from the plugin author to not buy the plugin. The fact that the plugin author was able to set the price so high, to dissuade downloaders, rather than legitimately removing the listing underscores how little oversight these marketplaces provide. We also found that attackers include malicious behaviors in plugins and then sell them on reputable plugin marketplaces, such as the WordPress repository (§5.1). A report by Wordfence, a leading WordPress malware scanner, found nine popular plugins updated at source (i.e., the WordPress plugin store) with malicious code as part of a coordinated spam campaign. It is more urgent now than ever to study the impact of this problem and address the challenges toward securing the plugin ecosystem.

2. Challenges

Despite over a billion dollars in revenue every year, little has been done by the research community to evaluate, assess, and ensure the safety of plugins. Past research studied malicious apps in the Google Play Store [1], malicious extensions on the Chrome Web Store [2], and malicious packages in package registries [3]. Prior work also exposed malicious behaviors on webservers, such as the presence of vulnerabilities [4], webshells [5], and backdoors [6], but none analyzed the underlying plugins which lead to many of these attacks. Further, the complexities of prior research solutions have prevented the average CMS user from adopting them.

Meanwhile, CMS website owners often rely on simple indicators such as plugin popularity, ratings, and reviews on the plugin marketplaces to decide whether a plugin is safe to install on their website. The diligent CMS user may consult freely available or commercial plugin vulnerability scan databases before installing a plugin. Unfortunately, these sources provide neither complete nor robust measures of security. Driven by economic incentives, attackers buy the codebase of popular free plugins, add malicious code, and wait for plugin users to auto-update (§5.2). In such cases, none of the commonly used simple indicators can help prevent malware from infiltrating the website.

Complicated Stakeholders. Mitigating malicious CMS plugins is made harder by the diverse range of stakeholders in the CMS plugin ecosystem. Each has different motivations and a different level of visibility into the malicious plugin problem. Website owners have full visibility over the webserver activity, but they rely on naive indicators when installing plugins. Hosting providers have no visibility into the plugin installations but need to ensure their hosting platform remains malware-free. Plugin marketplaces have visibility over the plugins they host but need a scalable and efficient way to measure the malicious plugins being sold on their marketplaces. An ideal solution must ensure ease of use and reliable detection, since plugins can turn malicious anywhere in this supply chain: from the source marketplace to a post-deployment web attack (i.e., fake plugin injection).

3. Modelling Malicious CMS Plugins

While legitimate marketplaces are still popular ways for CMS website owners to get necessary plugins (Table 1), there also exist "nulled marketplaces" which are the dark markets for pirated CMS plugins.

Nulled Marketplaces. Since most paid plugin marketplaces do not offer a trial option, several marketplaces started a “try before you buy” initiative. Unfortunately, this gave rise to pirated “trial plugin” marketplaces, referred to as nulled marketplaces. Nulled plugins are pirated versions of originally paid plugins, freely distributed via nulled marketplaces (unbeknownst to the original creator). Generally, these plugins have been modified to work indefinitely without a license key, and the modified code often harms users or collects sensitive user data. Our study has found that, more often than not, nulled plugins introduce malicious code onto webservers (§5.2).

Injected Plugins. To evade a vetting system, an attacker can first develop and list a benign plugin in the marketplace and later inject malicious code into it. An attacker can also buy the codebase of a popular free plugin and inject malicious code. We classify CMS plugins that later turn malicious via a code update as injected plugins.

Infected Plugins. Some malicious plugins on the webserver try to increase the attack's coverage by hijacking other plugins. We classify these plugins infected by other malicious plugins as infected plugins.

4. YODA

Figure 1 shows an overview of the YODA pipeline. YODA first conducts Plugin Detection, which hosting providers and website owners can deploy to detect plugins on their websites. Marketplaces and plugin developers, who already have a given plugin in hand, can skip directly to YODA's Malicious Behavior Detection. Next, YODA identifies the Origin of Malicious Plugins. Finally, it performs an Impact Study to understand the scale and impact of the plugin economy.

Figure 1: YODA Design Overview.

4.1 Plugin Detection

YODA first detects all of the webserver's plugins by identifying the plugin root and all the associated files that belong to the plugin. To this end, YODA performs (1) metadata analysis to identify the plugin root files and (2) code analysis to identify all of the associated files as part of the plugin (Figure 1).

Metadata Analysis. YODA parses the comments from all of the server-side code files and performs regular expression matching to identify the plugin root files (i.e., files containing the plugin header, a specially formatted block comment that contains metadata about the plugin). For every plugin root, YODA extracts and records the plugin metadata from the header, including the plugin name, plugin URI, author name, author URI, and plugin version.
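As a rough illustration of the metadata step (a sketch, not YODA's actual implementation), the snippet below scans PHP block comments for the standard WordPress header fields and records them. The function name `parse_plugin_header`, the dictionary keys, and the sample plugin are all hypothetical.

```python
import re

# Standard WordPress plugin header labels; the dictionary keys are our own naming.
HEADER_FIELDS = {
    "name": "Plugin Name",
    "plugin_uri": "Plugin URI",
    "author": "Author",
    "author_uri": "Author URI",
    "version": "Version",
}

BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def parse_plugin_header(php_source):
    """Return header metadata if a block comment looks like a plugin header, else None."""
    for comment in BLOCK_COMMENT.findall(php_source):
        body = comment[2:-2]  # drop the /* and */ delimiters
        meta = {}
        for key, label in HEADER_FIELDS.items():
            pattern = r"^[ \t/*#@]*" + re.escape(label) + r":[ \t]*(.+?)[ \t]*$"
            match = re.search(pattern, body, re.MULTILINE)
            if match:
                meta[key] = match.group(1)
        if "name" in meta:  # a root file must at least declare a plugin name
            return meta
    return None


# Hypothetical plugin root file used only for illustration.
SAMPLE = """<?php
/*
 * Plugin Name: Example Gallery
 * Plugin URI: https://example.com/gallery
 * Version: 1.2.3
 * Author: Jane Doe
 * Author URI: https://example.com/
 */
"""

if __name__ == "__main__":
    print(parse_plugin_header(SAMPLE))
```

A file whose comments contain no `Plugin Name:` line yields `None`, so it is not treated as a plugin root.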

Code Analysis. With the plugin root files identified, YODA proceeds to find all associated plugin files. To do this, YODA generates and parses the abstract syntax tree (AST) of every server-side code file in the directories parallel to, and the sub-directories of, the plugin root. Since several users customize plugins, either through configuration files or by modifying the PHP code, YODA detects files based on three scores. Specifically, YODA identifies the existence of a plugin header (H_j), the number of reference calls linking other files (R_j), and the number of occurrences of plugin-specific API calls (A_j) to calculate the likelihood of a group of files being part of a plugin (detailed in [7]).
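To make the three signals concrete, here is a minimal sketch that counts them for a single file. It uses regular expressions as a stand-in for the AST traversal described above, and the list of plugin-specific API calls is an illustrative assumption. How the signals are combined into a likelihood is specified in [7] and is not reproduced here.

```python
import re

# Assumption: a handful of well-known WordPress functions stand in for
# "plugin-specific API calls"; the real list used by YODA is not given here.
PLUGIN_API_CALL = re.compile(
    r"\b(?:add_action|add_filter|register_activation_hook|plugin_dir_path|plugins_url)\s*\("
)
INCLUDE_CALL = re.compile(r"\b(?:include|require)(?:_once)?\b")
PLUGIN_HEADER = re.compile(r"^[ \t/*#@]*Plugin Name:", re.MULTILINE)


def file_signals(php_source):
    """Return the per-file signals H (header present), R (references), A (API calls)."""
    return {
        "H": 1 if PLUGIN_HEADER.search(php_source) else 0,
        "R": len(INCLUDE_CALL.findall(php_source)),
        "A": len(PLUGIN_API_CALL.findall(php_source)),
    }


# Hypothetical plugin file used only for illustration.
SAMPLE = """<?php
/*
 * Plugin Name: Demo
 */
require_once plugin_dir_path(__FILE__) . 'inc/admin.php';
include 'inc/helpers.php';
add_action('init', 'demo_init');
add_filter('the_content', 'demo_filter');
"""

if __name__ == "__main__":
    print(file_signals(SAMPLE))
```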

4.2 Malicious Behavior Detection

YODA employs both syntactic features (e.g., file metadata, sensitive APIs) and context-aware semantic features of all plugin code files (e.g., AST with resolved file dependencies). Syntactic analysis uses data flow analysis to identify suspicious APIs being used as sinks in plugin code files.
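The first half of this step, locating candidate sinks, can be sketched as a scan for calls to sensitive PHP functions. The list below is an illustrative assumption rather than YODA's actual list, and the regular expression is a simplification of the analysis described above.

```python
import re

# Assumption: a small sample of PHP functions commonly treated as sensitive.
SENSITIVE_APIS = (
    "eval", "assert", "system", "exec", "shell_exec",
    "passthru", "base64_decode", "file_put_contents",
)

# The lookbehind rejects method calls and longer names such as curl_exec().
CALL = re.compile(r"(?<![\w$>:])(" + "|".join(SENSITIVE_APIS) + r")\s*\(")


def find_sinks(php_source):
    """Return (line number, function name) for each sensitive call site."""
    return [
        (php_source.count("\n", 0, match.start()) + 1, match.group(1))
        for match in CALL.finditer(php_source)
    ]


# Hypothetical snippet used only for illustration.
SAMPLE = """<?php
$c = base64_decode($_POST['c']);
eval($c);
curl_exec($handle);
"""

if __name__ == "__main__":
    for line, name in find_sinks(SAMPLE):
        print(line, name)
```

A hit from this scan is only a candidate: benign plugins also call these functions.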

Semantic Analysis. The presence of suspicious APIs alone does not equate to malicious plugin behavior. To ensure that the malicious behaviors are detected across multiple plugin files, YODA performs a context-aware semantic analysis. In the dependency-resolved ASTs, it marks all the sensitive APIs identified earlier as sinks and performs targeted inter-procedural backward slicing on the AST from each sink to the predefined sources. These source-sink dataflows, called “semantic models,” cover 14 malicious behaviors including Webshell, Post Injection, Input Gating, SSO Backdoor, Library Function Exists, Spam Injection, Code Obfuscation, Blackhat SEO, Downloader, Function Reconstruction, Insert User, Malvertising, Fake Plugin, and Cryptominer. You can find the details of how YODA detects these behaviors in our full paper [7].
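The idea of slicing backward from a sink to a source can be shown on a toy example. The sketch below is deliberately simplified: it is flow-insensitive and works on a hand-written map of variable dependencies rather than on a dependency-resolved AST, and the source list is an assumption.

```python
# Assumption: PHP superglobals carrying attacker-controlled input act as sources.
SOURCES = {"$_GET", "$_POST", "$_REQUEST", "$_COOKIE"}


def backward_slice(assignments, sink_args, sources=SOURCES):
    """Walk dependencies backward from a sink's arguments; return the sources reached.

    assignments maps each variable to the set of variables it is computed from.
    """
    seen, stack, reached = set(), list(sink_args), set()
    while stack:
        var = stack.pop()
        if var in seen:
            continue
        seen.add(var)
        if var in sources:
            reached.add(var)
            continue
        stack.extend(assignments.get(var, ()))
    return reached


# Toy webshell: $cmd = $_POST['c']; $x = base64_decode($cmd); eval($x);
WEBSHELL = {"$cmd": {"$_POST"}, "$x": {"$cmd"}}
# Toy benign use: $tpl = 'return 1;'; eval($tpl);
BENIGN = {"$tpl": set()}

if __name__ == "__main__":
    print(backward_slice(WEBSHELL, ["$x"]))
    print(backward_slice(BENIGN, ["$tpl"]))
```

Only the first case reaches a source, which is why the presence of `eval` alone is not treated as malicious.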

4.3 Origin of Malicious Plugins

YODA then determines the origin of these malicious behaviors. This helps work out the different attacker entry points within the CMS ecosystem. To this end, YODA classifies the plugin state as “A” or “M” by investigating temporal snapshots. For example, when the plugin P_i does not exist at time t but is added at t + 1, YODA considers the plugin status of P_i at t + 1 to be “A” (added). On the other hand, if any file within the plugin P_i changed at time t, YODA considers the plugin status of P_i at t to be “M” (modified).
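The state assignment can be sketched as a comparison of consecutive snapshots. Here each snapshot of a plugin is assumed to be a mapping from file path to content hash, or `None` when the plugin is absent; this representation and the function names are hypothetical.

```python
def plugin_state(prev_files, curr_files):
    """Return "A" (added), "M" (modified), or None (absent or unchanged)."""
    if curr_files is None:
        return None          # plugin not present in this snapshot
    if prev_files is None:
        return "A"           # absent at t, present at t + 1
    if prev_files != curr_files:
        return "M"           # some file was added, removed, or changed
    return None


def state_timeline(snapshots):
    """Classify every snapshot after the first against its predecessor."""
    return [plugin_state(prev, curr) for prev, curr in zip(snapshots, snapshots[1:])]


# Hypothetical four-night history of one plugin.
HISTORY = [
    None,
    {"plugin.php": "h1"},
    {"plugin.php": "h1"},
    {"plugin.php": "h2"},
]

if __name__ == "__main__":
    print(state_timeline(HISTORY))
```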

Legitimate Plugin Marketplace. YODA marks the malicious origin as a legitimate plugin marketplace if: (1) one or more malicious behaviors are seen when the effective plugin state is “A”; or (2) the effective plugin state is “M” due to plugin version and/or author change.

Nulled Plugin Marketplace. Nulled plugins commonly include multiple malicious domains to download malicious content on the webserver. If YODA records downloader, malvertising, or spam injection behavior when the effective plugin state is “A”, and if the plugin contains multiple redundant blacklisted URLs, it is categorized as nulled based on its behavior. If the plugin name contains nulled marketplace metadata, it is also categorized as nulled.

Injected Plugin. If YODA finds (1) fake plugin behavior when the effective plugin state is “A”, or (2) fake plugin and code obfuscation behaviors when the effective plugin state is “M”, the plugin is categorized as an injected plugin.

Infected Plugin. If YODA finds malicious behaviors in a plugin with an effective plugin state “M” on an already compromised website, it is marked as infected.
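The four rules above can be read as a small decision procedure. The sketch below encodes them under two assumptions that the text does not state: the more specific nulled and injected rules are checked before the marketplace rule, and “multiple” blacklisted URLs means at least two. The `Observation` fields and behavior labels are hypothetical names.

```python
from dataclasses import dataclass

NULLED_BEHAVIORS = {"downloader", "malvertising", "spam_injection"}


@dataclass
class Observation:
    state: str                          # effective plugin state, "A" or "M"
    behaviors: set                      # malicious behaviors detected
    blacklisted_urls: int = 0
    nulled_metadata: bool = False       # plugin name carries nulled-marketplace metadata
    version_or_author_changed: bool = False
    site_already_compromised: bool = False


def classify_origin(o):
    """Return the origin label for a malicious plugin observation, or None."""
    if not o.behaviors:
        return None
    if o.nulled_metadata or (
        o.state == "A" and o.behaviors & NULLED_BEHAVIORS and o.blacklisted_urls >= 2
    ):
        return "nulled"
    if "fake_plugin" in o.behaviors and (
        o.state == "A" or "code_obfuscation" in o.behaviors
    ):
        return "injected"
    if o.state == "A" or o.version_or_author_changed:
        return "marketplace"
    if o.state == "M" and o.site_already_compromised:
        return "infected"
    return None


if __name__ == "__main__":
    print(classify_origin(Observation("A", {"downloader"}, blacklisted_urls=3)))
    print(classify_origin(Observation("M", {"webshell"}, site_already_compromised=True)))
```

Changing the rule order would change the label for observations that satisfy more than one rule, so the precedence here should be treated as one possible reading.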

4.4 Impact Study

The origin of malicious plugins in §4.3 reveals the broad attacker platforms used to victimize CMS users. To understand the scale of this impact on the plugin marketplaces, YODA extracted the impact metrics associated with each plugin (i.e., monetary impact in terms of plugin cost and popularity impact in terms of the number of downloads) by mapping the plugins in our dataset to the plugin marketplace it originated from.

5. Malicious Plugins in the Wild

We deployed YODA on the full dataset of 410,122 unique WordPress websites’ nightly backups. This dataset provided a realistic view of the plugin ecosystem because over 37% of the world’s websites and over 63% of CMS-based websites run on WordPress. It also allowed us to deploy YODA retroactively over 8 years. The backups contained an average of 406 day-snapshots per website. Each website had between 1 and 68 plugins, with an average of 49 plugins per website.

5.1 Malicious Behavior Evolution

YODA found malicious plugin instances (#P) in 24,931 of the 410,122 websites (#W), shown in Table 2. Over 10K malicious plugin instances used age-old web attack techniques: webshells and code obfuscation. The infection ratio (IR, the ratio of #P to #W) shows a measure of infection spread. Several malicious behaviors have IR >3, implying that multiple plugins within the same website contain these same malicious behaviors. Closer inspection revealed that these are due to plugin-to-plugin infection: a single malicious plugin on the webserver infects multiple benign plugins, replicating the behavior.
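As a worked example of the infection ratio, the snippet below applies the definition to the overall totals reported in this article (47,337 malicious plugin instances on 24,931 websites). The per-behavior ratios come from Table 2 and are not recomputed here; the function name is our own.

```python
def infection_ratio(plugin_instances, websites):
    """IR = #P / #W: malicious plugin instances per affected website."""
    return plugin_instances / websites


if __name__ == "__main__":
    overall = infection_ratio(47_337, 24_931)
    print(f"overall IR = {overall:.2f}")  # prints: overall IR = 1.90
```

An overall ratio near 2 means an affected website carries about two malicious plugin instances on average, so a per-behavior IR above 3 indicates unusually heavy replication within a site.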

Table 2: Distribution of Malicious Behaviors.

Thousands of malicious plugins originated from legitimate plugin marketplaces (Marketplace Columns). Row 2 shows that none of these plugins use code obfuscation techniques: despite being sold on legitimate marketplaces, they brazenly hide in plain sight. Attackers (rightly) assume that an average website owner will not inspect the plugin code before installing it on their webserver. Attackers exploited the scalable CMS infrastructure to inject malicious plugins into websites (Injected Columns). They are injected without the website owner's knowledge, and over 80% of these plugins had fake plugin behaviors (1,336), webshells (994), or obfuscated code (558). We also found 8,525 malicious nulled plugin instances that exploited human vulnerabilities to rapidly spread malware (Nulled Columns). We found that over 91% (7,821 of 8,525) of these plugin instances used input gating (i.e., password-protecting the publicly accessible code) to thwart competing attackers from introducing malicious payloads.

It was concerning that over 40K plugins were infected post-deployment (Infected Columns). Most attackers employed behaviors such as webshells, obfuscation, and downloaders in 9,943, 8,819, and 4,254 plugin instances, respectively.

5.2 Fueling the Malware Economy

Next, we turned our attention to the economic drivers of these malicious plugins. Table 3 categorizes our results based on the origin of malicious behaviors, i.e., legitimate marketplaces, nulled marketplaces, and infected plugins. Since plugin marketplaces do not provide any price history, we used the reported download counts and the prices from July 2020 to estimate the money spent on these plugins.

Table 3: The Economy of Malicious Plugin Marketplaces.

Table 3 begins with malicious plugins originating from legitimate plugin marketplaces. About 70% (2,597 of 3,685) were found on 5 of the 7 most popular marketplaces. Our dataset alone accounted for over $41K in purchases of malicious plugins from legitimate marketplaces, meaning that malicious plugin authors are literally selling plugins packed with malware to unsuspecting victims at an average of $15.78 per plugin. Furthermore, the malicious plugins from these marketplaces were extremely popular, averaging 336K and 945K downloads per plugin, which implies that attackers stand to gain millions of dollars.

Nulled plugins impersonate plugins from legitimate marketplaces. YODA extracts their cost and popularity from legitimate marketplaces. The cost represents the explicit losses incurred by the legitimate plugin authors. About 75% of the malicious nulled plugins (i.e., 6,407 of 8,525) contain legitimate counterparts in the 7 most popular marketplaces. Surprisingly, we found a total of 102 plugins from WP Plugins and WP Themes sold on nulled marketplaces. As expected, we also found that over 77% (349 of 451) of the nulled plugin counterparts were sold on paid marketplaces. Overall, the website owners from our dataset alone contributed $228K in explicit losses to the plugin authors.

For post-deployment infected plugins, about 65% (i.e., 26,655 of 40,533) were downloaded from these 7 popular marketplaces. Since the plugins from WP Plugins and WP Themes are widely used, they are also commonly infected: 34.2% of plugins from WP Themes and 23.8% of plugins from WP Plugins became victims of plugin infections. Website owners spent a total of $834K on these plugins, only to find them compromised. This encapsulates the additional implicit cost of malware cleanup incurred by installing malicious plugins from legitimate and nulled marketplaces.
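The shares quoted in this section follow directly from the reported counts, as the short check below shows. The counts are taken from the text; the labels and helper name are our own.

```python
def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# (part, whole) pairs as reported in this section.
REPORTED = {
    "marketplace plugins on popular marketplaces": (2_597, 3_685),
    "nulled plugins with legitimate counterparts": (6_407, 8_525),
    "nulled counterparts sold on paid marketplaces": (349, 451),
    "infected plugins from popular marketplaces": (26_655, 40_533),
}

if __name__ == "__main__":
    for label, (part, whole) in REPORTED.items():
        print(f"{label}: {share(part, whole):.1f}%")
```

The 451 nulled counterparts also split cleanly into the 349 sold on paid marketplaces and the 102 from WP Plugins and WP Themes.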

5.3 Are Infected Plugins Cleaned Up?

Table 4 studies the plugin clean-up statistics to understand how attackers are evading website owners. Very few website owners (2,697 of 24,931 or 10.8% of the compromised websites overall) attempt to clean up the malicious plugins on their webservers. As seen in Table 4, 24.1% of websites with malicious plugins from legitimate marketplaces are cleaned up, the highest rate by far. Only 6.7% of nulled plugins are cleaned up, which strengthens our hypothesis that despite much later adoption, nulled plugins provide robust persistence for attackers.

Table 4: The Cleanup and Reinfection Distribution of Malicious Plugins.

Of the 2,697 websites that attempted to clean up 7,042 malicious plugins, 12.5% of the websites (336 of 2,697) were reinfected. Interestingly, nulled plugins were most consistently reinfected (17.8% of 353 websites). Plugins downloaded from legitimate marketplaces show the lowest rate of reinfection (9.9%). This can be attributed to community engagement in identifying malicious plugins on legitimate marketplaces. Such plugins are either purged from the marketplace or their authors are forced to remove the malicious code. We also measured the websites that remained infected up to the time of writing. Despite cleanup efforts, over 94% of all websites with malicious plugins remained infected.

6. Persistence of Malicious Plugins

To understand the persistence patterns of malicious plugins, Figure 2 shows a box plot measuring the number of days malicious plugins were identified on the webserver, categorized by their origin. The median persistence ranges from 189 to 209 days, meaning that at least 50% of the malicious plugins persisted for over 6 months. We also noted that over 80% of the remaining malicious plugins (those that persisted for less than 6 months) were introduced between February and March 2020 and persisted through the end of our study, which is consistent with our finding that 94% of the malicious plugins in our dataset installed over 8 years are still active today.
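One way to derive such persistence figures from nightly backups is sketched below. It assumes persistence is the inclusive span between the first and last snapshot in which a plugin was flagged; that definition, the function name, and the sample values are illustrative assumptions, not measurements from the study.

```python
from datetime import date
from statistics import median


def persistence_days(flagged_dates):
    """Inclusive number of days between the first and last malicious snapshot."""
    return (max(flagged_dates) - min(flagged_dates)).days + 1


if __name__ == "__main__":
    # Hypothetical plugin flagged from 1 February to 1 March 2020 (a leap year).
    span = persistence_days([date(2020, 2, 1), date(2020, 2, 15), date(2020, 3, 1)])
    print(span)  # prints: 30

    # Hypothetical persistence values for three plugins of the same origin.
    print(median([span, 200, 400]))  # prints: 200
```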

Figure 2: Persistence of Malicious Plugins.

Popular plugins on legitimate marketplaces mostly introduce malicious behaviors via plugin updates. Thus, we assumed that these behaviors would be cleaned up with updates as well. Unfortunately, malicious plugins from legitimate marketplaces are not immediately identified at source and persist for 176 to 380 days. Over 60% of website owners do not enable auto-updates and use outdated plugin versions. The persistence of nulled plugins (131 to 232 days) is shorter than that of other origins. This can be attributed to the fact that even though nulled marketplaces have existed since 2013, they gained popularity around 2018, and their blackhat SEO campaigns accelerated in early 2019. We found that once nulled plugins are installed on the webserver, they are rarely removed (§5.3). Notably, it is the injected plugins that win the persistence war. Over 75% of these plugins remain active for at least 177 days, and over 25% of these plugins persist for at least 525 days. This suggests that injected plugins largely go unnoticed by website owners, who typically use GUIs to manage their CMS.

7. Concluding Remarks

YODA provides an automated investigation framework that uncovered 47,337 malicious plugin installs on 24,931 unique websites, 94% of which are still active today. We have disclosed the results to CodeGuard and they are working on remediating the identified attacks. We have made YODA's source code available at https://github.com/CyFI-Lab-Public/YODA.

Appendix

References:

[1] Y. Zhou, Z. Wang, W. Zhou, and X. Jiang, “Hey, you, get off of my market: Detecting malicious apps in official and alternative android markets,” in Proceedings of the 19th Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, Feb. 2012.

[2] N. Jagpal, E. Dingle, J.-P. Gravel, P. Mavrommatis, N. Provos, M. A. Rajab, and K. Thomas, “Trends and lessons from three years fighting malicious extensions,” in Proceedings of the 24th USENIX Security Symposium (Security), Washington, DC, Aug. 2015.

[3] R. Duan, O. Alrawi, R. Pai Kasturi, R. Elder, B. Saltaformaggio, and W. Lee, “Towards Measuring Supply Chain Attacks on Package Managers,” in Proceedings of the 2021 Annual Network and Distributed System Security Symposium (NDSS), Virtual Conference, Feb. 2021.

[4] J. Dahse and T. Holz, “Static detection of second-order vulnerabilities in web applications,” in Proceedings of the 23rd USENIX Security Symposium (Security), San Diego, CA, Aug. 2014.

[5] L. Invernizzi, P. M. Comparetti, S. Benvenuti, C. Kruegel, M. Cova, and G. Vigna, “Evilseed: A guided approach to finding malicious web pages,” in Proceedings of the 33rd IEEE Symposium on Security and Privacy (S&P), San Francisco, CA, May 2012.

[6] R. P. Kasturi, Y. Sun, R. Duan, O. Alrawi, E. Asdar, V. Zhu, Y. Kwon, and B. Saltaformaggio, “TARDIS: Rolling Back The Clock On CMS-Targeting Cyber Attacks,” in Proceedings of the 41st IEEE Symposium on Security and Privacy (S&P), Virtual Conference, May 2020.

[7] R. P. Kasturi, J. Fuller, Y. Sun, O. Chabklo, A. Rodriguez, J. Park, and B. Saltaformaggio, “Mistrust plugins you must: A Large-Scale study of malicious plugins in WordPress marketplaces,” in Proceedings of the 31st USENIX Security Symposium (Security), Boston, MA, Aug. 2022.

Article Categories:

Security

Last updated March 20, 2023

Authors:

Ranjita Pai Sridhar is a Data Scientist at Microsoft. She graduated with a Ph.D. in Electrical & Computer Engineering (ECE) at the Georgia Institute of Technology. Her research interests lie in cyber attack forensics, web application security, and applied AI for large-scale security measurement.

Jonathan Fuller is a Research Scientist at the Army Cyber Institute. His research focus is combining cyber forensics and binary program analysis towards detecting, monitoring, and counteracting advanced malware.

Yiting Sun is an MS student in Electrical & Computer Engineering (ECE) at the Georgia Institute of Technology. Her research interest focuses mainly on web security.

Omar Chabklo is a Security Engineer at Praetorian. He graduated with an MS in Electrical & Computer Engineering (ECE) at the Georgia Institute of Technology. His specializations include securing web/mobile applications, cloud, and web3 assets.

Andres Rodriguez is a Software Engineer at Assured Information Security (AIS). He graduated with a BS in Electrical & Computer Engineering (ECE) at the Georgia Institute of Technology.

Jeman Park is currently a Research Scientist in the School of Cybersecurity and Privacy at the Georgia Institute of Technology. His research interest focuses on cyber forensics, malware/binary analysis, web security, and mobile security.

Brendan Saltaformaggio is an Assistant Professor in the School of Cybersecurity and Privacy and the School of Electrical and Computer Engineering at the Georgia Institute of Technology. His research interest lies in computer systems security, cyber forensics with focuses on memory forensics, binary analysis and instrumentation, vetting of untrusted software, and mobile/IoT security.
