WordPress, Apache, and Linux Contribute to Majority of Internet Blogger Spam
Posted by Keith Elder | Posted in Internet, Linux | Posted on 07-09-2007
If you have a blog, as I do, one of the things you enjoy looking at is your PingBacks or TrackBacks. PingBacks and TrackBacks were created as a nice way for bloggers to know when other community members linked back to their articles. Essentially they work like this. Let’s say I write an article on my blog. Another blogger reads it and links to it in one of their blog articles. When they publish that article, their blog software notifies my blog, and a comment gets added to my blog article linking back to their site. This is called a TrackBack.
For the original author this provides a way to keep track of who is commenting off-site or linking to his/her information. This worked really well until the porn and drug industries figured out they could post unwanted information to thousands upon thousands of blog sites for free. Spammers and hackers have literally taken this feature away from bloggers like myself by automating TrackBacks for web cams, sex toys, and so on. Surprisingly the mortgage industry has yet to catch on. 🙂
There are some spam-fighting systems out there that plug into several blog packages. For example, the blog software I use (SubText) comes with support for the Akismet API. Akismet actually comes from the makers of WordPress (another popular blog software package, written in PHP) and does a fair, not great, job of stopping TrackBack spam. I say fair because I still wind up having to clean this junk from my comment logs. To make matters worse, if a blogger doesn’t clean this stuff out, it will count against him or her in the long run in search engine rankings. A lot of bloggers have given up and turned TrackBacks off altogether. This is a shame because the feature is really useful. This brings us to our question: where is this stuff coming from?
The results may astonish you. The same blog engine that is supposed to help you fight TrackBack spam is the very one that is creating the spam! One hundred percent of the spam TrackBacks that come through to my site come from one of three things: compromised WordPress blogs, an Apache server, or a server running Linux. Notice I said 100%, not 99% or 95%. I can attribute each spam TrackBack to one or another of these. Don’t believe me? Then let’s look at some examples. Here is a screenshot to show you what I mean. Below are the last four spam TrackBacks I received that were not filtered by Akismet.
Let’s look at what Netcraft says these domains are running below.
University of Alaska Fairbanks is obviously a distance learning community site, and it has public_html folders enabled for user accounts. In this example the user account idesign has been compromised, since we see several hidden directories, “.psy” and “.xml”. For those not in the know, if you EVER see a URL like /.something/ don’t click on it (ever wonder where http://www.slashdot.org gets its name? now you know). This is a hidden folder on Unix servers. It is hidden because when you type “ls” to get a directory listing, the command doesn’t show files and folders that start with a dot unless you ask for them (for example with “ls -a”). It is a way hackers hide information on Unix systems from users. This is one thing all the TrackBacks have in common. The first URL is cut off but trust me, there is a folder starting with a dot in the URL.
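If you want to check a batch of TrackBack URLs for this pattern rather than eyeballing them, something like the following rough Python sketch would do it. The function name and the example URL are made up for illustration (the URL only mimics the pattern described above, it is not one of the real spam links), and a dot folder in a URL is a warning sign, not proof of a compromise.

```python
from urllib.parse import urlparse


def hidden_segments(url):
    """Return the path segments of url that start with a dot.

    "." and ".." are skipped because they are ordinary relative-path
    markers, not hidden folders.
    """
    path = urlparse(url).path
    return [
        seg for seg in path.split("/")
        if seg.startswith(".") and seg not in (".", "..")
    ]


# Hypothetical URL, shaped like the ones discussed above.
print(hidden_segments("http://example.edu/~idesign/.psy/page.html"))
```

An empty list means no dot folders were found in the path.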
The other interesting thing to note is that three of the four URLs are generated from WordPress. The “/wp-content/” folder gives this away, since WordPress folders typically start with the letters wp. Out of the four URLs listed, three are WordPress, all of them are running the Apache web server, and two of the three unique domains are reportedly running Linux.
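The same kind of check works for the WordPress giveaway. Here is a small Python sketch, again with a made-up function name and made-up URLs. It only looks for a path segment starting with “wp-”, so treat it as a hint about what generated the URL and nothing more.

```python
from urllib.parse import urlparse


def looks_like_wordpress(url):
    """Heuristic: True if any path segment starts with "wp-"."""
    segments = urlparse(url).path.split("/")
    return any(seg.startswith("wp-") for seg in segments)


# Hypothetical URLs for illustration only.
print(looks_like_wordpress("http://example.com/wp-content/.xml/page.html"))
print(looks_like_wordpress("http://example.com/archive/2007/09/post.aspx"))
```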
Obviously the most ironic thing about this is that the same blog software that is trying to help stop TrackBack spam is also creating the majority of it. Thank you, WordPress.
Dude, roll your own blog. No spam problems. Just make sure you don’t allow linking to single articles, it’s more fun to annoy southerners and allow linking only to a list.
Man you don’t even know how long I’ve waited for this since disabling my own Movable Type widget (that doesn’t work since Haloscan bypasses that code).
THANK YOU!
Hi Keith,
Interesting theory you have here.
I wanted to pick up on what you said about /. – you implied it was a hidden folder, but that’s not really it. /. is the root folder of the server. The start of the tree of all folders on a system. (The dot is mostly redundant, but I suppose it helps decorate the slash.)
Regards,
-james.
BTW, I like your live comment preview. It’s not only unusual but appears to be very well done.