Researchers Aim to Trace PII Data Sharing Through 300 Fake Accounts


Due to a number of high-profile incidents, the way in which big companies and data brokers use (or abuse) our data is well and truly in the spotlight. Governments, regulatory bodies, and consumers are more concerned than ever about how, where, and when our data is being used. However, a lack of transparency persists, thanks to the sheer number of places where we divulge our information as well as the complex web of data trading that takes place once we have done so.

A team of researchers from the Hume Center for National Security and Technology at Virginia Polytechnic Institute and State University in Blacksburg, VA has set out to establish what exactly happens to consumer data once it’s released into the wild. To do so, they created 300 fake identities, which they used to sign up to 185 legitimate websites. Their aim was to test the traceability of PII using this method as well as to get a glimpse of the data privacy and compliance practices of a number of popular websites.




Then, all they had to do was sit back and wait for the emails, text messages, and phone calls to roll in. By seeing which accounts’ data was compromised, they were able to connect the dots to establish how much data sharing a platform takes part in. For this initial study, researchers did not use fake accounts to respond to the initial communications and only passively monitored them.
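To make the "connect the dots" step concrete, here is a minimal sketch of how inbound messages could be attributed to the site that leaked an identity. It is not the researchers' actual tooling: it assumes, for illustration only, that each fake identity was registered with exactly one website, and the names `Message`, `attribute_sharing`, and the example domains are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One inbound communication (hypothetical structure for this sketch)."""
    recipient: str      # contact address of the fake identity that was reached
    sender_domain: str  # domain of whoever sent the message


def attribute_sharing(registrations, messages):
    """Map each registered site to the third-party senders that reached its identity.

    registrations: dict of fake-identity address -> the single site it signed up to
    (the one-identity-per-site mapping is an assumption of this sketch).
    """
    third_parties = defaultdict(set)
    for msg in messages:
        site = registrations.get(msg.recipient)
        if site is None:
            continue  # message to an address we never registered; ignore it
        if msg.sender_domain != site:
            # The identity was only ever given to `site`, so any other sender
            # suggests the data was shared, sold, or leaked by that site.
            third_parties[site].add(msg.sender_domain)
    return dict(third_parties)


registrations = {
    "a@example.test": "shop.example",
    "b@example.test": "news.example",
}
messages = [
    Message("a@example.test", "shop.example"),  # expected first-party mail
    Message("a@example.test", "ads.example"),   # unexpected third party
    Message("b@example.test", "news.example"),  # expected first-party mail
    Message("c@example.test", "x.example"),     # unknown recipient, skipped
]
result = attribute_sharing(registrations, messages)
```

Under these assumptions, `result` flags only `shop.example` as having passed its identity on to another sender. Passive monitoring of this kind can show that data moved, but not whether it was sold, shared with a partner, or breached.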

Their findings indicated that there was both good and bad news for the public when it comes to their data sharing concerns.

Data sharing – the good news

To the researchers’ surprise, fewer companies than expected appeared to have sold user data. Of the 300 accounts, only about 10 experienced widespread sharing of their data with other companies and platforms. Of these, most were from social media sites like Twitter and TikTok.

So, nearly 97% of organizations seemed to be content with collecting user data and keeping it for their own use.

Of the 1,423 attachments they received, all were found to be free from malicious software, although this might be thanks to the University’s security measures.

As a side note, they discovered that Facebook had excellent fake account detection capabilities. Six of the eight fake accounts on the site were immediately rejected, with the other two being picked up in the following week. WeChat was similarly good at preventing fake accounts because it requires a legitimate Chinese phone number to register.

In general, the shared data mostly seemed to be used to market Black Friday sales and was rarely used for political or other nefarious purposes.

Data sharing – the bad news

One of the metrics the researchers used to measure the impact this data sharing has on our lives was the amount of time wasted. They found that some accounts led to over 20 hours of distractions in the form of spam communications.

In total, the experiment generated 16,584 emails, 3,482 phone calls, 948 voicemail messages, and 753 text messages.

While many of these communications were innocuous, there were the usual scam attempts. Some were clear Social Security scams, while others used everyday concerns, such as car or product warranties, as an attempted exploit vector.

The researchers also found that there was little to no correlation between the quality of the various websites and businesses’ data privacy policies and the amount of data sharing they took part in. At worst, this may indicate that many online businesses simply care about protecting their interests and image by stating these policies without intending to enforce them. At best, it indicates a glaring communication gap between these companies’ legal and IT teams.

What’s next?

The study seemed to indicate that while the problem of data sharing might not be as widespread as first thought, it only takes one or two entities that are willing to abuse your data to have a significant impact on your life.

This initial study was used to determine the feasibility of PII data tracing as well as to help develop a quantitative scoring system for privacy policies and terms of service.

The researchers hope to draw more informative and comprehensive conclusions by scaling up their experiment in the future. The Hume Center hopes to repeat this study with up to 100,000 additional fake accounts, using automation to generate responses and analyze the follow-up communications.




Use & Abuse of Personal Information