How to Detect and Prevent Bot Traffic on Your Ad Exchange

By Roman Vrublivskyi. in Internet. Updated on October 16, 2022.

According to Cloudflare, more than 40% of Internet traffic is believed to be bot-driven, and malicious bots represent a huge chunk of this share.

In this article, we’ll look at the cause of this problem, the types of bot traffic that exist, and the anti-fraud tools you can use to protect your advertising platform (an ad exchange, in particular).

What is bot traffic?

Bot is short for “robot” – a program that performs simple and repetitive tasks. And it does so faster than a human (because it’s a robot, right?), so there’s a great deal of good (or bad) that you can achieve using bot automation. A bot, like any technology, is just as helpful or harmful as the intent behind it.

Not all bot traffic has ill intent, but how can you tell good bots from bad ones?

Good bots

Good guys are selfless heroes who help someone in distress, no matter what. They’re unbiased, so don’t even think of bribing them. Crawler bots from search engines like Google, Yandex, or Bing help your website content get noticed by those who’ll likely be interested in it, and that’s how you get your traffic.

There are many other cool things that bots can do, like enforcing your copyrighted content or even talking to your website visitors!

Bad bots

These impersonate real people for the sake of achieving a particular purpose – e.g., imitating ad views and clicks (often used by unscrupulous publishers to get paid more from every impression and click generated on their inventory).

Bad bots also actively steal information from websites, post spam comments, and drain advertisers’ Pay-Per-Click (PPC) budgets.

Malicious bots are used for various reasons – they can be a part of a complex marketing strategy to get a competitive edge or something as straightforward as obtaining personal banking information.

How do traffic bots work?

A bot is essentially an algorithm designed to handle a specific sequence of actions. It can operate on predefined rules, leverage Machine Learning (ML) and Artificial Intelligence (AI), or combine all of them. The more sophisticated the bot’s algorithm, the more advanced the means of protection required to counter it.

Research shows that a website can be hit by bot traffic as many as 120,000 times per hour. How significant that figure is depends on the website’s overall number of visits per month, but it should give you an idea of how much bot traffic can accumulate during just one month.

Knowing that almost half of website visits are bot traffic, think about how many server resources are wasted and how badly that affects the publisher’s website performance.

Even when a bot traffic attack doesn’t reach its desired objective, it can still exhaust servers to the extent where the website becomes unavailable to actual visitors.

So, it’s really about maintaining an online presence, which is a crucial element of keeping a publisher’s side of the bargain.

How to identify bot traffic

Some bot traffic can be relatively harmless (e.g., messing with your website stats), while other bot traffic does actual damage to your business. Ideally, you want to be able to deal with all of the issues mentioned earlier.

Publishers can use the following to recognize bot traffic on their websites:

Abnormally slow page loads. This is an indication that a server is busy fulfilling HTTP(S) requests generated by bot traffic, which affects website performance. A website should typically be able to withstand sudden spikes (given that the hosting provider is reliable enough), so a small amount of bot traffic won’t make a big difference. However, a large-scale Layer 7 DDoS attack can take the entire website down.
High bounce rates and pages per session. Some bots can be programmed to hit a page and leave it instantly, while others can crawl across the entire website. The former produce an abnormally high bounce rate, while the latter result in an increased number of pages per session.
Website traffic comes from unexpected locations. There is no way to control where the website visitors come from unless some location restrictions are applied. However, a sudden influx of visitors from surprising locations can be a clear indication that a website is under a bot traffic attack.
The website content is plagiarized. Duplicated content that appears on strange websites without permission can be the result of scraping bot traffic activity.
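
The bounce rate, pages-per-session, and location signals above can be checked with a few lines of code. Here is a minimal sketch in Python, assuming you can export sessions from your analytics tool as a country code plus a page count. The Session type, the function names, the baseline values, and the 50% tolerance are all illustrative assumptions, not part of any particular analytics product.

```python
from dataclasses import dataclass


@dataclass
class Session:
    country: str  # two-letter country code, e.g. from a geo-IP lookup
    pages: int    # number of pages viewed during the session


def traffic_signals(sessions, expected_countries):
    """Summarise bounce rate, pages per session, and unexpected-location share."""
    total = len(sessions)
    if total == 0:
        return {"bounce_rate": 0.0, "pages_per_session": 0.0,
                "unexpected_geo_share": 0.0}
    bounces = sum(1 for s in sessions if s.pages <= 1)
    unexpected = sum(1 for s in sessions if s.country not in expected_countries)
    return {
        "bounce_rate": bounces / total,
        "pages_per_session": sum(s.pages for s in sessions) / total,
        "unexpected_geo_share": unexpected / total,
    }


def flag_anomalies(current, baseline, tolerance=0.5):
    """Return metric names whose relative deviation from baseline exceeds tolerance."""
    flagged = []
    for name, base in baseline.items():
        value = current[name]
        if base == 0:
            if value > 0:
                flagged.append(name)
        elif abs(value - base) / base > tolerance:
            flagged.append(name)
    return flagged


# Hypothetical data: two instant bounces, one deep crawl, one bounce from "XX".
sessions = [Session("US", 1), Session("US", 1), Session("XX", 40), Session("XX", 1)]
signals = traffic_signals(sessions, expected_countries={"US"})
baseline = {"bounce_rate": 0.4, "pages_per_session": 2.5, "unexpected_geo_share": 0.05}
suspicious = flag_anomalies(signals, baseline)
```

A check like this only tells you that something looks off after the fact. It does not prove the traffic is automated, so treat the flagged metrics as a reason to investigate rather than as a verdict.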

Challenges of fighting bot traffic in programmatic advertising

Malicious bots are getting really good at what they do. Still, the primary reason ad fraud exists in programmatic advertising is simple: fraudsters gain financially by disrupting the media-trading process.

This can be done in a number of ways:

Impression fraud

Impressions are the key metric when it comes to media buying on a Cost-Per-Thousand Impressions (CPM) basis. However, it can be hard to say whether or not valid impressions took place.

Sometimes, dishonest publishers place ads on low-traffic websites and generate excessive impressions there in order to charge advertisers more. The main problem with impression fraud is that it can be hard to detect.
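
To see why inflated impressions hurt, it helps to run the CPM arithmetic. The sketch below uses made-up numbers (2,000,000 impressions, a $2.50 CPM, and a 20% invalid share); the function names are illustrative as well.

```python
def cpm_cost(impressions, cpm):
    """Cost of a campaign billed per thousand impressions."""
    return impressions / 1000 * cpm


def wasted_spend(impressions, cpm, invalid_rate):
    """Share of the CPM bill that paid for invalid (bot) impressions."""
    return cpm_cost(impressions, cpm) * invalid_rate


# Hypothetical campaign: every value here is an assumption for illustration.
total_bill = cpm_cost(2_000_000, 2.50)
lost_to_bots = wasted_spend(2_000_000, 2.50, 0.20)
```

With these assumed inputs, the bill is about $5,000, of which roughly $1,000 pays for impressions no human ever saw.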

Click fraud

This is arguably one of the most common threats that most advertisers have encountered at least once. According to Pixalate, click fraud topped the list of advertising frauds in 2017.

A couple of years later, the ad tech ecosystem learned how to deal with this type of fraud (thanks to the seller and inventory authorization standards developed by the IAB and to new click fraud scanning technologies).

Conversion fraud

Conversions are what the majority of advertisers want when it comes to online advertising. The Cost-Per-Action (CPA) model allows media trading based on users’ valuable actions on a website.

The CPA model tends to cost slightly more than CPM and CPC, making junk conversions more damaging. Dishonest publishers can utilize bot traffic for that purpose since it’s relatively simple to train bots to take specific actions on a website.

How to combat bot traffic?

It’s important to distinguish between simple and advanced bot traffic since dealing with them might require different strategies.

Fighting simple bot traffic is pretty straightforward, so every publisher should be aware of the basic rules for protecting their inventory. These rules take the form of seller and inventory authorization standards established by the Interactive Advertising Bureau (IAB). These days, they are among the most viable mechanisms you can implement in your advertising platform.

Here’s a look at the names and descriptions of IAB standards that every digital publisher should know about:

Ads.txt

This standard is a common yet efficient way of fighting inventory arbitrage and domain spoofing. The ads.txt file lists the sellers and resellers that a publisher has approved, making it harder for fraudsters to manipulate ad impressions and website URLs in order to resell inventory without authorization using bot traffic.
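
To make this concrete, each ads.txt record is a comma-separated line: the ad system’s domain, the seller’s account ID, the relationship (DIRECT or RESELLER), and an optional certification authority ID. Below is a minimal parsing sketch in Python. The sample domains and IDs are placeholders, the function names are illustrative, and the parser deliberately ignores parts of the specification such as extension fields.

```python
from typing import NamedTuple, Optional


class AdsTxtRecord(NamedTuple):
    ad_system: str
    seller_account_id: str
    relationship: str  # "DIRECT" or "RESELLER"
    cert_authority_id: Optional[str]


# Hypothetical file contents; the domains and account IDs are placeholders.
SAMPLE_ADS_TXT = """\
# ads.txt for example-publisher.test
exchange.example, pub-1234, DIRECT, abc123
reseller.example, 5678, RESELLER
contact=adops@example-publisher.test
"""


def parse_ads_txt(text):
    """Parse data records, skipping comments, blank lines, and variable lines."""
    records = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" in line.split(",")[0]:
            continue  # blank line or a variable such as contact=...
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 3:
            continue  # malformed record
        relationship = fields[2].upper()
        if relationship not in ("DIRECT", "RESELLER"):
            continue
        cert = fields[3] if len(fields) > 3 and fields[3] else None
        records.append(AdsTxtRecord(fields[0].lower(), fields[1], relationship, cert))
    return records


def is_authorized(records, ad_system, seller_account_id):
    """True if the publisher's file lists this seller account for this ad system."""
    return any(
        r.ad_system == ad_system.lower() and r.seller_account_id == seller_account_id
        for r in records
    )


records = parse_ads_txt(SAMPLE_ADS_TXT)
```

An exchange could run a check like is_authorized before accepting inventory that claims to come from a given domain, and reject sellers the publisher never listed.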

App-ads.txt

This standard is similar to ads.txt. The only difference is that this one protects the mobile app ecosystem from bot traffic.

Sellers.json

While ads.txt and app-ads.txt are tasked with protecting publishers, sellers.json is meant to protect the supply side from bot traffic. Sellers.json contains a list of authorized sellers (publishers and intermediaries), eliminating the need to contact every single publisher directly.
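
In the IAB specification the file is named sellers.json, and it is a JSON document with a "sellers" array in which each entry carries a seller_id and a seller_type (PUBLISHER, INTERMEDIARY, or BOTH). The following Python sketch shows how a buyer might look up a seller ID. The sample entries and function names are hypothetical, and error handling is omitted for brevity.

```python
import json

# Hypothetical sellers.json contents; the names, domains, and IDs are placeholders.
SAMPLE_SELLERS_JSON = """
{
  "version": "1.0",
  "sellers": [
    {"seller_id": "pub-1234", "name": "Example Publisher",
     "domain": "example-publisher.test", "seller_type": "PUBLISHER"},
    {"seller_id": "5678", "seller_type": "INTERMEDIARY", "is_confidential": 1}
  ]
}
"""


def index_sellers(document):
    """Build a seller_id -> entry lookup from a sellers.json document."""
    data = json.loads(document)
    return {str(s["seller_id"]): s for s in data.get("sellers", [])}


def describe_seller(index, seller_id):
    """Return a short human-readable description of a seller ID."""
    seller = index.get(seller_id)
    if seller is None:
        return "unknown seller"
    if seller.get("is_confidential") == 1:
        return f"{seller['seller_type']} (confidential)"
    return f"{seller['seller_type']}: {seller.get('name', '?')} ({seller.get('domain', '?')})"


sellers = index_sellers(SAMPLE_SELLERS_JSON)
```

A seller ID that shows up in a bid request but not in the exchange’s sellers.json is a reasonable signal that the inventory needs a closer look.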

All in all, using these three IAB standards is a sure way to bring transparency to the supply chain ecosystem. Ad exchange owners should integrate them into their ad exchanges, as it will provide an additional security level for every party involved in media trading.

If traffic is the foundation of your business, and the participants in your ad exchange rely on paid traffic, it is important to protect your ecosystem additionally so that it stays healthy and profitable.

What scanners can protect your ad exchange?

The ad tech industry deals with very specific kinds of fraud: click bots, domain spoofing, and install hijacking (to name just a few). Ad fraud is constantly evolving, and businesses whose primary specialization is not cybersecurity can’t keep pace with fraudsters.

For this reason, it has become common practice for ad tech platforms to collaborate with traffic security providers, as their tools and anti-fraud mechanisms are the most agile and up to date.

Many of these security providers offer specialized scanners that can be integrated into ad exchanges and any other kind of advertising platform. When activated, pre-bid scanners automatically block fraudulent traffic before bidding occurs or the impression takes place. Other scanners work post-bid and focus on discovering where the fraud originated.
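
Conceptually, a pre-bid check sits in front of the auction and decides, request by request, what is allowed through. Here is a minimal Python sketch of that idea. It does not represent any vendor’s API: the BidRequest fields, the 0 to 100 score, the threshold of 50, and the toy scoring function are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BidRequest:
    request_id: str
    ip: str
    user_agent: str


# Hypothetical blocklist; 203.0.113.0/24 is an address range reserved for documentation.
SUSPICIOUS_IPS = {"203.0.113.7"}


def toy_score(req):
    """Stand-in for a scanner's fraud score (0 = clean, 100 = certainly fraud)."""
    score = 0
    if "bot" in req.user_agent.lower():
        score += 80
    if req.ip in SUSPICIOUS_IPS:
        score += 60
    return min(score, 100)


def pre_bid_filter(requests, score_fn, max_score=50):
    """Split requests into those allowed into the auction and those blocked."""
    allowed, blocked = [], []
    for req in requests:
        if score_fn(req) > max_score:
            blocked.append(req)
        else:
            allowed.append(req)
    return allowed, blocked


incoming = [
    BidRequest("a", "198.51.100.1", "Mozilla/5.0"),
    BidRequest("b", "203.0.113.7", "Mozilla/5.0"),
    BidRequest("c", "198.51.100.2", "ExampleBot/1.0"),
]
allowed, blocked = pre_bid_filter(incoming, toy_score)
```

In a real integration, the scoring function would be replaced by a call to the scanner you have connected, and the blocked list would typically be logged for post-bid analysis.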

Let’s review a shortlist of the most popular scanners for ad exchanges, taking SmartHub (a white-label ad exchange) as an example. With SmartHub’s prebuilt platform, you can create your own self-branded ad exchange and integrate the most advantageous anti-fraud scanners, like the ones featured below:

The Media Trust (TMT)

This tool can be used primarily to verify ad creatives. TMT has a database with a list of ads approved for an ad exchange, so it forbids potentially malicious banners and native creatives from showing up.

TMT is an additional option on SmartHub, the integration of which can be negotiated at the stage of ad exchange deployment. Once it becomes accessible on your dashboard, you can configure API and daily scan limits at your convenience.

Forensiq Scanner

This tool allows ad impressions to take place as long as they meet quality traffic requirements. Specifically, the Forensiq scanner uses a fraud score scale to determine whether an ad is eligible to be shown or whether the request is another manifestation of a bot traffic attack.

You can access Forensiq in the scanner section of SmartHub’s dashboard. Setting it up is pretty straightforward: just make sure to define the number of scans per day, and the Forensiq scanning system will do the rest.
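
A daily scan limit is essentially a counter that resets every day. The sketch below shows one way such a budget could behave. It is purely illustrative and says nothing about how SmartHub or Forensiq implement their limits; the class and method names are made up.

```python
from datetime import date


class DailyScanBudget:
    """Allows at most `limit` scans per calendar day."""

    def __init__(self, limit: int):
        self.limit = limit
        self._day = None
        self._used = 0

    def try_consume(self, today: date) -> bool:
        """Return True and count one scan if today's budget is not exhausted."""
        if today != self._day:
            # A new day has started, so the counter resets.
            self._day = today
            self._used = 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True


budget = DailyScanBudget(limit=2)
first_day = date(2022, 10, 16)
results = [budget.try_consume(first_day) for _ in range(3)]
next_day_ok = budget.try_consume(date(2022, 10, 17))
```

Once the budget for the day is used up, the remaining requests would simply go unscanned, which is why the limit is worth sizing against your actual traffic volume.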

Protected Media

Protected Media is a renowned supply scanning tool (available at SmartHub by default) with AI and ML at its core. It provides Real-Time Bidding (RTB) capabilities to detect bot traffic before and after the ad auction.

You are also provided with detailed stats about all ads that made it to the auction, so you can check their credibility by viewing traffic origin. This tool is great at detecting video, image and CTV bot traffic fraud.

Pixalate

Pixalate is another advanced scanner that you can find in ad exchanges like SmartHub. It looks a lot like Protected Media, but it also detects OTT and in-app bot traffic fraud. When setting up Pixalate, you can choose either macros or an IMG pixel as your monitoring method.

As mentioned earlier, the main issue with detecting bot traffic by watching for telltale signs of bot activity on your website is that the detection happens too late. Pixalate automates fraud detection by catching possible threats at the earliest stages.

GeoEdge

GeoEdge helps to fight ads that contain redirect links. When an SSP in your ad exchange gets a response, GeoEdge automatically wraps it in <header> and <footer> tags.

Once you configure GeoEdge, you will begin to receive alert notifications about redirect ads, and bot traffic will be blocked instantly, with no need to do any coding. This scanner works with your ad exchange automatically, which is very convenient.

Botman

A good thing about Botman is that it can identify and block both General Invalid Traffic (GIVT) and Sophisticated Invalid Traffic (SIVT). GIVT consists of web crawlers that, while generally harmless, can distort your stats. SIVT is outright fraudulent bot traffic.

Luckily, Botman handles that, as well. In the SmartHub dashboard, it is easy to set up a Botman pre-bid integration, as all scanners are easily configured in the scanners section.
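
The difference between the two categories can be sketched in a few lines. GIVT, such as a crawler that announces itself in its user agent, can be caught with simple list-based checks, whereas SIVT imitates real users and needs behavioural analysis. The Python sketch below is a toy illustration of that split, not Botman’s method: the regular expression, the behaviour score, and the 0.8 threshold are all assumptions.

```python
import re

# Deliberately short, hypothetical pattern; real crawler lists are far longer.
DECLARED_CRAWLER = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def classify_traffic(user_agent, behaviour_score, sivt_threshold=0.8):
    """Label a request as "GIVT", "SIVT", or "valid".

    behaviour_score is an assumed 0..1 output of some behavioural model,
    where higher values mean the session looks less human.
    """
    if DECLARED_CRAWLER.search(user_agent):
        return "GIVT"  # self-declared automation, caught by a simple list check
    if behaviour_score >= sivt_threshold:
        return "SIVT"  # looks like a browser but behaves like a bot
    return "valid"


labels = [
    classify_traffic("Googlebot/2.1", 0.0),
    classify_traffic("Mozilla/5.0", 0.95),
    classify_traffic("Mozilla/5.0", 0.1),
]
```

The hard part in practice is the behaviour score itself, which is exactly what specialized scanners are built to produce.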

HumanSecurity

The scanning capabilities of this tool are designed to detect and block bot traffic, including both individual bots and entire botnets. And it does so before any of these bad actors can engage in an RTB auction. The environments that HumanSecurity supports include desktop, mobile, and CTV.

To wrap it up

Some bots are relatively easy to detect and block. At the same time, advanced bots are good at mimicking quality traffic and require more advanced strategies and anti-fraud technologies, such as professional scanners.

If you plan to create an advertising marketplace that is secure by design, SmartHub can be the right option. White-label ad exchanges, like SmartHub, are very configurable, so you can integrate the scanners and implement IAB authorization standards to always keep inventory and sellers in your ecosystem verified.