You’ve heard the claims: up to a third of all web traffic is suspicious and it is costing advertisers and publishers billions of dollars.
You’ve also probably asked yourself a couple of questions. Where is this suspicious traffic coming from? And where is it going? The answer to both questions is complicated.
Understanding Ghost Sites
The traffic comes mainly from software designed to imitate human web users. This software is commonly referred to as “bots,” and these bots are driving traffic to what have become known as “ghost sites.”
So, what exactly is a ghost site?
Ghost sites are designed to look real to the eyes of both consumers and advertisers. From the perspective of a DSP (demand-side platform), a ghost site appears to drive a significant amount of traffic and to have a large number of users. But in fact, these sites are ghost towns.
Their traffic is suspicious, and the sites themselves are vehicles created solely to monetize traffic that advertisers don’t want or need. Ghost sites are created and managed by the same parties that developed the bots that create non-human impressions.
Bot network operators simply signal the thousands of computers under their control to visit their ghost sites. According to researchers, some individual sites are raking in millions from unsuspecting advertisers.
Sites Victimize Advertisers, Users
The main victims of this tactic are advertisers, who pay for the fraudulent impressions, and platforms that inadvertently mistake ghost sites for real ones. Internet users are also victims, since their computers have been infected with bots, but the impact is limited to slight increases in the bandwidth and processing cycles needed for the bot to operate.
So how can you tell if a site is haunted? First off, despite their good looks and generally professional appearance, ghost sites will have no affiliation with any known online or offline brands. These sites typically have domain names that are generic, but logos that are unique and graphically impressive.
How to Uncover Ghost Sites
Sound familiar? You might be thinking this describes many sites on the Internet. Here are the basic investigative steps I suggest ad ops teams and advertisers take when assessing inventory and publisher quality:
Examine for Bland & Generic Content
There are content farms that churn out articles for as little as $1 each, so it’s no surprise that when ghost site content is searched for elsewhere on the Internet, it can’t be found. You’ll find that ghost sites typically offer useless content on ‘how-to’ topics such as cooking, gardening, finance and home improvement.
Check the About Us & Contact Pages
Ghost sites will never have fully verifiable contact information or traceable provenance. This is most apparent when looking at the "About Us" page. There will be flowery language, incoherent sentences (an indicator that the source of the site is unknown) and no further information about the site operators. Moreover, "Contact Us" pages will more than likely consist of no more than a generic web contact form.
Verify the Physical Address
Many trading desks and networks now require a physical address on websites, a requirement driven by the idea that a fraudster won’t have a legitimate address to use. Check whether the listed address is a legitimate business address by using tools such as the White Pages to identify the person or business behind it. Ghost sites commonly list shared office space addresses because information on the occupants is rarely available.
Use Tools Such as WhoIs, Alexa & SimilarWeb
- Always look up domain registration information using WhoIs. This will tell you how long the domain has been active. Be on the lookout for young domains — fraudsters move quickly once up and running; after all, there’s no need to build an audience of real humans!
- Alexa offers information on the site’s "bounce rate," which represents the percentage of visits (real or bot) that end after a single page view. The normal range for legitimate sites with good content is 40 percent to 60 percent. Bounce rates that are too high or too low are red flags. The other valuable signal available on Alexa is "time on site." Some ghost sites, especially those using the site as a front for inventory laundering, show very long time on site (30 minutes and higher).
- SimilarWeb allows you to look at large changes in traffic over the course of one to three months, as well as the sources driving traffic. SimilarWeb shows the percentage of traffic coming from six channels: direct navigation, referrals, search, social, email, and display. Ghost sites will tend to weigh heavily toward referral traffic (and you can view the top referrers to the ghost site, which will also be revealing) or direct traffic. Either of these without an even share of traffic from other sources may indicate bot traffic.
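If you gather these signals for many sites, the checks above can be roughed out as a simple screening script. The following is only a sketch under stated assumptions: it does not call WhoIs, Alexa or SimilarWeb, so you would enter the numbers yourself; the names `SiteSignals` and `red_flags` are hypothetical; and while the bounce-rate range and the 30-minute mark come from the tips above, the 180-day "young domain" cutoff and the 80 percent channel share are illustrative guesses, not industry standards.

```python
from dataclasses import dataclass, field

# From the article: normal bounce rate is 40-60 percent, and a time on
# site of 30 minutes or more is suspicious.
NORMAL_BOUNCE_RANGE = (0.40, 0.60)
LONG_TIME_ON_SITE_MIN = 30.0
# Assumptions for illustration only; tune these to your own inventory.
YOUNG_DOMAIN_DAYS = 180
DOMINANT_CHANNEL_SHARE = 0.80


@dataclass
class SiteSignals:
    """Manually collected signals for one site (hypothetical structure)."""
    domain_age_days: int          # from the WhoIs registration date
    bounce_rate: float            # fraction between 0 and 1
    time_on_site_min: float       # average minutes per visit
    traffic_share: dict = field(default_factory=dict)  # channel -> fraction


def red_flags(site: SiteSignals) -> list:
    """Return the list of heuristics the site trips; empty means none."""
    flags = []
    if site.domain_age_days < YOUNG_DOMAIN_DAYS:
        flags.append("young domain")
    low, high = NORMAL_BOUNCE_RANGE
    if not (low <= site.bounce_rate <= high):
        flags.append("bounce rate outside normal range")
    if site.time_on_site_min >= LONG_TIME_ON_SITE_MIN:
        flags.append("unusually long time on site")
    # Ghost sites tend to lean on referral or direct traffic.
    for channel in ("referrals", "direct"):
        if site.traffic_share.get(channel, 0.0) >= DOMINANT_CHANNEL_SHARE:
            flags.append(f"traffic dominated by {channel}")
    return flags


# Example with made-up numbers for a hypothetical site.
suspect = SiteSignals(
    domain_age_days=45,
    bounce_rate=0.12,
    time_on_site_min=38.0,
    traffic_share={"referrals": 0.91, "direct": 0.05, "search": 0.04},
)
for flag in red_flags(suspect):
    print(flag)
```

A tripped flag is a prompt for a closer manual look, not proof of fraud; plenty of legitimate sites are young or lean on a single traffic source.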
More than anything, trust your instincts. When looking at a ghost site, use common sense and as always, if it looks too good to be true, it’s probably not a site where you want to linger.
Title image by Steinar La Engeland