New year, new ways that fraudsters are trying to steal data and revenue from advertisers. This year, “SDK Spoofing” is the latest form of mobile performance fraud: it consumes an advertiser’s budget by generating legitimate-looking installs without any real installs occurring. This type of fraud evolved quickly and dramatically during the course of 2017. Fraudsters use a real device without the device’s user ever installing the app. SDK spoofing, also known as a replay attack, is now harder to spot than fake installs generated in emulation or install farms, because the devices used in this scheme are real and therefore normally active and spread out.
How It All Began
The perpetrators’ main approach was to break open the SSL encryption protecting the communication between a tracking SDK and its backend servers, typically by performing a “man-in-the-middle” (MITM) attack. The most popular method is to use proxy software (e.g. Charles Proxy), which enables this type of attack with the press of a button.
After completing the MITM attack, fraudsters generate a series of test installs for an app they want to defraud. Since they can read the URLs of all server-side connections in clear text, they can learn which URL calls represent specific actions within the app, such as first open, repeated opens, and even different in-app events like purchases, leveling up or anything else being tracked. They also research which parts of these URLs are static and which are dynamic. They then keep the static parts (things like shared secrets, event tokens, etc.) and experiment with the dynamic parts, which include things like advertising identifiers or other data specific to the device and the particular circumstances.
Now, thanks to callbacks and near real-time communication detailing the success of installs and events, the perpetrators can test their setup by simply creating a click and a matching install session. If the install doesn’t go through, then there is a mistake in their URL logic. If it is successfully tracked, they know they’ve nailed the logic. It’s simple trial and error; with only a couple dozen variables, the logic becomes easier to understand the longer the experiment lasts.
Once an install is successfully tracked, the fraudsters will have figured out a URL setup that allows them to create installs from thin air.
It’s important to note that during the early stages of this fraud scheme’s rise to notoriety, the level of sophistication and understanding of the URL structure was low; therefore, spoofing attempts were more easily spotted and blocked. Calls would come from data centers or VPNs, and the data was often nonsensical, or the URL parameters were filled with data that did not match their intended purpose.
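The kind of check that caught these early attempts can be sketched in a few lines. The following is only an illustration, not any attribution provider’s actual filter: the parameter names (`advertising_id`, `os_name`), the reason labels, and the network ranges are all assumptions, and the ranges are reserved documentation networks standing in for a real data-center IP list.

```python
import ipaddress
import re

# Placeholder ranges (reserved documentation networks), standing in for a
# real list of data-center / VPN address blocks. Purely hypothetical.
DATACENTER_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Advertising identifiers are assumed here to be UUID-formatted strings.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def suspicious_reasons(source_ip: str, params: dict) -> list:
    """Return a list of reasons why a tracking call looks spoofed."""
    reasons = []

    # Early spoofed calls came from data centers or VPNs.
    addr = ipaddress.ip_address(source_ip)
    if any(addr in net for net in DATACENTER_NETWORKS):
        reasons.append("source_ip_in_datacenter_range")

    # Parameters filled with data that does not match their purpose.
    ad_id = params.get("advertising_id", "")
    if not UUID_RE.match(ad_id):
        reasons.append("malformed_advertising_id")
    elif set(ad_id) <= {"0", "-"}:
        reasons.append("zeroed_advertising_id")

    if params.get("os_name", "").lower() not in {"android", "ios"}:
        reasons.append("unknown_os_name")

    return reasons
```

Checks like these only work against the unsophisticated variant described above; once the request carries real device data from a real device, every one of them passes.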
How SDK Spoofing Attacks Became So Sophisticated
As an industry, we’ve been collectively fighting fraud for a while. We know that fraudsters are continuously improving their own methods: with every improvement we make, they eventually figure out why their efforts were thwarted and then up their game.
This time, fraudsters really raised the bar in terms of sophistication. Fraudulent device data started to match data from real-device traffic and became consistent over a multitude of device-based parameters (and, later, all of them). This had been unheard of, so how was it possible if everything was fake?
The simple yet stunning answer was (and still is) that not everything is fake anymore. Fraudsters started to collect real device data. They do this by using their own apps or by leveraging any app they have control over. The intent of their data collection is, of course, malicious, but that does not mean that the app being exploited for data is purely malicious or could even be identified as malicious. The perpetrator’s app might have a very real purpose, or it might be someone else’s legitimate app that the perpetrators can access because their SDK is integrated within it. This could be any type of SDK, from monetization SDKs to any closed-source SDK where the information being collected isn’t transparent. Regardless of the specific circumstances, the fraudsters have access to an app that is being used by a large number of users.
Having a source or multiple sources that generate real device data makes the fraudster’s task a lot simpler. They no longer need to randomize or curate troves of data, because now they have access to the real thing. This has made it incredibly hard on the anti-fraud side to research and identify these spoofing attempts.
To make matters worse, this giant leap in the evolution of fraudsters went hand in hand with a second and equally impactful step in the sophistication of SDK Spoofing. The URLs were no longer called from data centers or tunneled through VPNs. Instead, they were proxied directly through the app the perpetrator had access to, on the device of an unsuspecting user. For the non-techies out there, this means the fraudster’s server runs a script that automatically creates a URL that will trigger us (or any attribution company) to track an install or event. Instead of sending this URL directly to our servers (or through an anonymizing network, as they used to), the fraudsters now send it to the app (the one the perpetrators have access to) on a user’s device. This app then executes the URL on the user’s device.
You can see how this method makes it look like the connection came from a real device (and, for that matter, a device that matches the transported data) because it does! The connection is real, the device data is real, the device is real. It’s bad enough that there’s no interaction between the user and any advertising for the advertised app, but the even bigger issue is that there’s no actual install.
How To Protect Data and Advertising from SDK Spoofing Attacks
Because this scheme is so drastic and quickly evolving, it’s key to pull out all the stops. In extreme cases, you may need your fraud prevention or data team to manually research hundreds of thousands of data points to prove that these installs were in fact fake, giving you a chance to recoup your lost advertising budget.
The alternative is to work with your ad network partners and data team to develop a solution that stops this fraud scheme dead in its tracks. We recommend creating a signature hash to sign SDK communication packages. This method ensures that replay attacks do not work, because the signature is a new dynamic parameter in the URL that cannot be guessed or stolen and is only ever used once.
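One way to picture such a signature is an HMAC over the request parameters plus a one-time nonce. The sketch below is an illustration under assumptions, not the signing scheme any particular SDK uses: the parameter names (`nonce`, `ts`, `signature`), the five-minute freshness window, and the in-memory nonce store are all hypothetical choices.

```python
import hashlib
import hmac
import secrets
import time
from urllib.parse import urlencode


def canonical(params: dict) -> bytes:
    """Serialize parameters in a fixed order so both sides hash the same bytes."""
    return urlencode(sorted(params.items())).encode()


def sign_request(params: dict, secret: bytes) -> dict:
    """SDK side: add a fresh nonce and timestamp, then sign everything."""
    signed = dict(params)
    signed["nonce"] = secrets.token_hex(16)   # unguessable, used once
    signed["ts"] = str(int(time.time()))
    signed["signature"] = hmac.new(secret, canonical(signed), hashlib.sha256).hexdigest()
    return signed


class Verifier:
    """Backend side: check the signature, the freshness, and nonce reuse."""

    def __init__(self, secret: bytes, max_age_seconds: int = 300):
        self.secret = secret
        self.max_age = max_age_seconds
        self.seen_nonces = set()  # in-memory only; a real backend needs shared storage

    def verify(self, params: dict, now=None) -> bool:
        received = dict(params)
        signature = received.pop("signature", "")
        expected = hmac.new(self.secret, canonical(received), hashlib.sha256).hexdigest()
        # Constant-time comparison; any tampered parameter changes the digest.
        if not hmac.compare_digest(signature.encode(), expected.encode()):
            return False
        try:
            ts = int(received.get("ts", ""))
        except ValueError:
            return False
        now = time.time() if now is None else now
        if abs(now - ts) > self.max_age:
            return False
        nonce = received.get("nonce", "")
        if not nonce or nonce in self.seen_nonces:
            return False  # replayed request
        self.seen_nonces.add(nonce)
        return True
```

With this in place, a captured URL is accepted at most once: replaying it fails the nonce check, and changing any parameter (for example, swapping in another device’s advertising identifier) invalidates the signature.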
To achieve a reasonably secure hash and an equally reasonable user experience, opt for an additional shared secret. Marketers should also have the opportunity to renew secrets and use different ones for different version releases of their app. This will allow them to deprecate signature versions over time, making sure that attribution is based on the highest security standard for the newest releases and the older releases can be removed from attribution fully.
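The idea of per-release secrets that can be retired over time can be sketched as a small registry keyed by signature version. This is a minimal illustration under assumptions: the `SecretRegistry` class, its method names, and the version labels are hypothetical and do not describe a real SDK API.

```python
import hashlib
import hmac


class SecretRegistry:
    """Maps a signature version (e.g. one per app release) to its shared secret."""

    def __init__(self):
        self._secrets = {}  # signature version -> secret bytes

    def register(self, version: str, secret: bytes) -> None:
        """Add or renew the secret for a signature version."""
        self._secrets[version] = secret

    def deprecate(self, version: str) -> None:
        """Retire a version; traffic signed with it is no longer attributed."""
        self._secrets.pop(version, None)

    def sign(self, version: str, payload: bytes) -> str:
        """Sign a payload with the secret of the given version."""
        return hmac.new(self._secrets[version], payload, hashlib.sha256).hexdigest()

    def is_valid(self, version: str, payload: bytes, signature: str) -> bool:
        """Accept only signatures made with a still-active version's secret."""
        secret = self._secrets.get(version)
        if secret is None:
            return False  # unknown or deprecated signature version
        expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
        return hmac.compare_digest(signature.encode(), expected.encode())
```

For example, after registering secrets for versions "v1" and "v2", calling `deprecate("v1")` makes every request signed with the old release’s secret fail validation, while requests from the newest release continue to be accepted.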
Andreas Naumann is head of fraud at Adjust. Adjust’s dedicated fraud prevention initiative focuses on analyzing traffic on a very large scale and finding the patterns and flaws to identify fraudulent traffic in real-time.