Blog Comments – Analysing 100,000 Comments and Spammers

“Nice blog, thanks for the info”

“Awesome site. Great job”

“You should take part in a contest for one of the best blogs on the web. I will recommend this site!”


I know you like flattering comments on your website. And I know you love to see many comments on each one of your posts (say, community participation). Who doesn’t, right? We love them too.

So we decided to take a closer look at the last 100,000 (well, 98,238 to be more exact) comments that were sent to the network of sites that we are monitoring. How many of them are spam? Who are the most annoying spammers? And things like that.

Comment Analysis

We filtered the latest 98,238 comments received (that’s less than a week’s worth of comments), and ran them through our analysis engine. Guess how many of them were spam? How many were good?

  • Spam comments: 79,858 (81.3%)
  • Good comments: 18,380 (18.7%)

Wow! So according to our analysis, more than 80% of the comments were classified as spam. We even took a conservative approach and classified unsure comments as good comments. So out of every 5 comments received, only 1 was valid.

*Unsure comments were ones that we saw hitting only one web site, but whose content was suspicious. There were almost 10,000 of them (9% of the overall total). If we had classified those as spam, the spam share would have grown to more than 90%.
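
To make the arithmetic behind these shares easy to re-check, here is a minimal Python sketch. The spam and good counts come from the post; the unsure count is only an estimate derived from the "9%" figure, since the post gives no exact number, and the helper name `share` is ours.

```python
# Counts reported in the post.
spam = 79_858
good = 18_380
total = spam + good  # 98,238 comments analysed

# Assumption: the post only says "almost 10,000 (9%)" unsure comments,
# so this is an estimate derived from the 9% figure, not an exact count.
unsure_estimate = round(total * 0.09)


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


print(f"spam: {share(spam, total):.1f}%")
print(f"good: {share(good, total):.1f}%")
# Unsure comments were counted as good, so moving them adds to the spam side.
print(f"spam if unsure were counted as spam: "
      f"{share(spam + unsure_estimate, total):.1f}%")
```

Because the unsure comments were counted as good, reclassifying them moves the estimate from the good side to the spam side without changing the total.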

Spam Analysis – Messages

This really amused us. What type of message do you think a spammer was sending? Most of the time, we noticed that they sent a flattering note to increase the odds of the webmaster accepting the comment. Here are the top 10 messages sent by spammers:

The last one in the list is the funniest (“You should take part in a contest for one of the best blogs on the web. I will recommend this site”). Taking out the Viagra and the Louis Vuitton spam, why do they do it?

They do it because in the URL field, they add a link to their own web site (which can increase their page rankings, visitors, etc.). Example:

[author] => Mary Jane
[email] => info@fabfunapps.com
[url] => http://fabfunapps.com
[comment] => Good share! I hope more people will discover your blog because you really know what you’re talking about. Can’t wait to read more from you
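
A ranking of the most repeated messages can be produced by tallying identical comment bodies. The sketch below is one possible way to do it in Python, not a description of our analysis engine: the record layout mirrors the example above, the function names `normalize` and `top_messages` are hypothetical, and the sample records are illustrative (the texts are quotes from this post, the URLs are placeholders).

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variations match."""
    return " ".join(text.lower().split())


def top_messages(comments: list[dict], n: int = 10) -> list[tuple[str, int]]:
    """Return the n most frequent comment bodies with their counts."""
    counts = Counter(normalize(c["comment"]) for c in comments)
    return counts.most_common(n)


# Illustrative records only; the URLs are placeholders.
sample = [
    {"url": "http://example.com/a", "comment": "Nice blog, thanks for the info"},
    {"url": "http://example.com/b", "comment": "nice blog,  thanks for the info"},
    {"url": "http://example.com/c", "comment": "Awesome site. Great job"},
]

for message, count in top_messages(sample):
    print(count, message)
```

Normalizing before counting matters because spam tools often vary case or spacing slightly between submissions.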

Spam Analysis – Emails

This email analysis was not as useful as we had hoped. The emails are very random and mostly from Gmail and Hotmail accounts. These were the top spammer emails:

470 [email] => ofangjiancong@gmail.com
222 [email] => colorado@uymail.com
175 [email] => nhaofangjiancong@gmail.com
172 [email] => Rich@seoplugins.org
167 [email] => n9zvrx.dzpbhniuvb@gmail.com
161 [email] => imtheking@hotmail.com
136 [email] => euq.wxtzlrl17fvbx@gmail.com
133 [email] => crearlynaxzex@gmail.com
132 [email] => alms5eg.m0352vbi3@gmail.com
129 [email] => io6llx3za08izklw@gmail.com
123 [email] => mc.1e0l033z.fbr13z@gmail.com
121 [email] => gr794g4ci1a.bhcju@gmail.com
120 [email] => www.realcazinoz.com@gmail.com
120 [email] => hn.58gmso.jvbhxz36@gmail.com
120 [email] => 18ag5yfa46.io0ll2@gmail.com
115 [email] => plm.n5fqls79vmrop@gmail.com
115 [email] => ofawc5j0lhd9uab.8@gmail.com
113 [email] => yoagxxtp4mciouqx@gmail.com

These were the top domains used by spammers:

16514 gmail.com
7300 hotmail.com
3267 yahoo.com
2309 aol.com
2038 gmail.com
1066 googlemail.com
984 gnumail.com
954 123mail.net
950 yahoomail.com
443 ymail.com
349 yahoo.co.uk
261 cwcom.net
219 live.com
202 magicmail.com
197 mail.com
192 Gmail.com
180 mail.ru
160 msn.com
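
Note that the list above counts gmail.com and Gmail.com as separate entries. Since domain names are case-insensitive, a tally that lowercases the domain first would merge them. A minimal Python sketch, assuming the input is a plain list of address strings (the names `email_domain` and `count_domains` are hypothetical, and one sample address is invented to show the effect):

```python
from collections import Counter


def email_domain(address: str) -> str:
    """Return the lowercased domain part of an email address, or '' if none."""
    address = address.strip()
    if "@" not in address:
        return ""
    # rpartition splits on the last "@", so odd local parts such as
    # "www.realcazinoz.com@gmail.com" still yield "gmail.com".
    _, _, domain = address.rpartition("@")
    return domain.lower()


def count_domains(addresses: list[str]) -> Counter:
    """Tally addresses per normalized domain."""
    return Counter(email_domain(a) for a in addresses)


# A few addresses from the list above, plus one hypothetical
# mixed-case address to show the normalization.
addresses = [
    "ofangjiancong@gmail.com",
    "www.realcazinoz.com@gmail.com",
    "imtheking@hotmail.com",
    "someone@Gmail.com",  # hypothetical
]
print(count_domains(addresses).most_common())
```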

Spam Analysis – URLs

Now it is getting useful. Let’s see the domains that are using comment spam to increase their rankings and visitors. These are the top entries (out of 24,976 different URLs):

1163 [url] => http://www.kitsucesso.com
1114 [url] => http://www.listasegmentada.com
677 [url] => http://stevepavlina.com
481 [url] => http://afriendshipquotes.blogspot.com/p/poem-for-best-friends.html
344 [url] => http://online-viagra-online.com
332 [url] => http://movie-web.org
317 [url] => http://www.divulgaemail.com
314 [url] => http://www.linklegends.com/free-trial
254 [url] => http://earn7800permonth.com
225 [url] => http://www.tvturn.com
208 [url] => http://www.zimbio.com/General/articles/-rEPnqoftf3/live+stream+TV+personal?add=True
208 [url] => http://www.filpan.ru
202 [url] => http://www.prlog.org/11261550-phone-number-lookup-catch-cheater-quickly.html
197 [url] => http://filter-paper.net
193 [url] => http://lnklicious.com
190 [url] => http://www.listadeemail.org
187 [url] => http://www.wordpress-subscribers.info
187 [url] => http://onlinepharmacy-levitra.com
179 [url] => http://eng.umek.su
172 [url] => http://www.seoplugins.org
167 [url] => http://5millionebooks.com
161 [url] => http://whatwhatwhat.com/
150 [url] => http://www.pharmacyreviewer.com/
146 [url] => http://www.japancoachstores.com/
144 [url] => http://www.guccibagoutletjp.com/
132 [url] => http://rsproductsonline.com/
129 [url] => http://diablo-3-for-free.com/review/diablo-3/
127 [url] => http://terbelizzder.com
126 [url] => http://www.cialis.vc
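
The list above is counted per full URL, so two different pages on the same site show up as separate entries. If you would rather group by site, one option is to reduce each URL to its hostname first. This is a sketch using Python's standard `urllib.parse`; the helper name `site_of` and the choice to strip a leading "www." are our assumptions, and one sample URL is invented for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Return the lowercased hostname of a URL without a leading 'www.'."""
    # hostname is None when the URL has no scheme/netloc; fall back to "".
    host = urlsplit(url.strip()).hostname or ""
    return host[4:] if host.startswith("www.") else host


# URLs from the list above, plus one hypothetical deep link.
urls = [
    "http://www.seoplugins.org",
    "http://seoplugins.org/some-page",  # hypothetical
    "http://www.pharmacyreviewer.com/",
    "http://diablo-3-for-free.com/review/diablo-3/",
]

per_site = Counter(site_of(u) for u in urls)
print(per_site.most_common())
```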

Spam Analysis – IP Addresses

To finish, some actionable information for hosting providers and website owners. This is the list of IP addresses sending the most spam, so you can block them:

296 62.75.181.210
238 216.59.22.16
238 188.138.84.93
227 66.85.128.34
182 204.12.237.43
166 204.45.108.226
162 37.59.151.187
161 37.59.173.137
129 178.32.151.208
126 37.59.151.182
119 91.201.64.4
113 178.137.160.195
107 192.162.102.221
104 83.136.86.21
104 83.136.86.105
101 78.112.161.207
100 178.32.201.178
93 94.153.9.47
92 109.73.77.149
91 94.45.168.67
90 91.207.8.26
84 80.84.51.194
83 91.210.104.143
80 69.194.161.228
79 83.22.254.204
78 75.35.174.45
73 120.62.1.232
71 120.62.1.174
69 91.236.74.133
68 71.21.19.133
68 63.141.237.224
68 210.192.65.242
66 120.62.16.89
65 74.108.93.214

The full list is very long (12,190 unique IP addresses), but blocking the top ones is a good start.
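
As a rough illustration of "blocking the top ones", the sketch below builds a blocklist from the most frequent senders in a log and checks incoming addresses against it. It uses Python's standard `ipaddress` module; the function names `build_blocklist` and `is_blocked` are hypothetical, and the sample log reuses addresses from the list above with made-up counts. How you actually enforce the block (firewall, web server, plugin) depends on your setup.

```python
import ipaddress
from collections import Counter


def build_blocklist(ip_log: list[str], top_n: int) -> set:
    """Return the top_n most frequent sender addresses as ip_address objects."""
    counts = Counter(ip_log)
    # ip_address() raises ValueError on malformed input, which is useful
    # for catching bad log lines early.
    return {ipaddress.ip_address(ip) for ip, _ in counts.most_common(top_n)}


def is_blocked(ip: str, blocklist: set) -> bool:
    """Check whether an address is on the blocklist."""
    return ipaddress.ip_address(ip) in blocklist


# Illustrative log: real addresses from the list, made-up counts.
ip_log = ["62.75.181.210"] * 3 + ["216.59.22.16"] * 2 + ["74.108.93.214"]

blocklist = build_blocklist(ip_log, top_n=2)
print(is_blocked("62.75.181.210", blocklist))
print(is_blocked("74.108.93.214", blocklist))
```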

Spam Analysis – Countries

Out of curiosity, we decided to check the top countries sending spam (based on the IP address):

23,899 United States (31%)
16,888 China (22%)
5,145 Russian Federation (6.7%)
3,291 Brazil (4.3%)
3,094 France (4.0%)
2,850 Germany (3.7%)

In the Olympics of spam, the USA is #1 (Gold), followed by China (Silver), Russia (Bronze) and Brazil.

Conclusion

Yes, there is a lot of spam out there. I would say that 9 out of 10 comments are spammy in some way (even if not automated; we only classified automated messages as spam). In any event, let us know if you want any more information from this list. We have the raw data, so we can run numbers and different analyses as requested.

About Daniel Cid

Daniel is the Founder & CTO of Sucuri and also the founder of the open source project OSSEC HIDS. His interests range from intrusion detection and log analysis (log-based intrusion detection) to web-based malware research and secure development.

You can find more about Daniel at his site dcid.me or on Twitter: @danielcid