www.steveandalex.org

Using Digital Masks to Secure Your Data Privacy — A Radical Approach

Amid recent data leaks, including the Cambridge Analytica controversy, online users have become more aware of, and concerned about, how their personal information is being used. Data collection includes not only user inputs (name, date of birth, address) but also insights generated by analysing user behaviour (how you spend time in apps or on websites, and what that says about you).

In an earlier piece, I discussed how users could be empowered to better control their personal data by leveraging GDPR. In a nutshell, the idea relies on a platform that acts as an intermediary, helping users consolidate personal data that is spread across hundreds of companies and delete any excessive information. However, most corporations are not GDPR-ready, which makes it difficult to scale such a hypothetical platform. Furthermore, such a platform would require some level of cooperation from the companies that ‘hold’ the data.

Deception as a Weapon

Another approach (more of a thought experiment) would be to use deception to secure your personal information to some degree. This can play out on two fronts: (1) minimise disclosure of personal information (name, date of birth, etc.) and / or manipulate it to an extent, and (2) manipulate user behaviour patterns so that any insights generated from that behaviour are incorrect.

(1) Disclosure of Personal Information

This is something that users already practise today. According to Verve, 60% of consumers intentionally provide incorrect information when submitting their personal details online, citing privacy concerns.

Problem

However, this can be quite cumbersome, and users still have to enter a valid email, name, and address in order to access products / services.

Solution

There is potential for a service that automatically creates fake ‘digital aliases’ and fills in forms on the user’s behalf, providing only the minimal ‘true’ personal information needed to use the service. In other words, it would serve as a digital ‘mask’, enabling users to access services without revealing too much about themselves. Because a different set of (potentially fake) information would be used for each website or app, information transfer across services is minimised and the impact of any single data breach is limited.
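As a rough sketch of how such a ‘mask’ could work, the snippet below derives a stable, per-site alias from a private master secret: each website sees a different fake identity, but the same site always sees the same one on return visits. Everything here (the name lists, the `example-relay.net` email domain, the `make_alias` helper) is an illustrative assumption, not an existing service.

```python
import hashlib

# Hypothetical word lists for generating plausible alias names.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan", "Casey"]
LAST_NAMES = ["Reed", "Hale", "Brook", "Stone", "Lane", "Frost"]

def make_alias(master_secret: str, site_domain: str) -> dict:
    """Deterministically derive a fake identity for one site."""
    digest = hashlib.sha256(f"{master_secret}:{site_domain}".encode()).digest()
    first = FIRST_NAMES[digest[0] % len(FIRST_NAMES)]
    last = LAST_NAMES[digest[1] % len(LAST_NAMES)]
    # A site-specific email alias (via an assumed relay domain) keeps the
    # real inbox reachable while sharing nothing across services.
    email = f"{first.lower()}.{digest.hex()[:8]}@example-relay.net"
    return {"name": f"{first} {last}", "email": email}

alias_a = make_alias("my-master-secret", "shop.example.com")
alias_b = make_alias("my-master-secret", "news.example.org")

# Same secret + same site -> same alias; different sites -> different aliases.
assert alias_a == make_alias("my-master-secret", "shop.example.com")
assert alias_a != alias_b
```

Deriving the alias from a hash, rather than storing random identities, means the user only has to protect one master secret and nothing needs to sync between devices.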

(2) Disclosure of User Behaviour Patterns

Problem

Sophisticated algorithms can derive very personal details, including personality traits and political leanings, by analysing how a user behaves on a particular website (what they browse, like, and comment on, especially on social networks). While users do not directly disclose this information, it is implied by their behaviour. That makes it more dangerous, as users are generally not even aware of it.

To understand what you reveal about yourself through your browsing behaviour, you can install Dataselfie, a browser plug-in that uses machine learning to analyse how you spend your time on social networks.

An illustrative dashboard, generated by Dataselfie, is shown below.

Copyright © Dataselfie

I found it quite impressive what kind of insights can be derived from how one spends time on a social network. It should be noted that Dataselfie is a third-party service; the social network itself would have far deeper insights as it has access to more layers of data.

Solution

A potential solution would be a browser plug-in that sits ‘on top’ of apps / websites (similar to Dataselfie) and randomly engages with the service (social network, website, etc.) to create a ‘fake’ user behaviour profile, blurring any ‘genuine’ interaction. This would make it difficult for algorithms to predict your typical ‘profile’, as they could not distinguish between the real and the fake ‘you’.
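A minimal sketch of the decoy idea might look like the following. The action names, topic list, and the `perform` callback (which would drive a real browser in an actual plug-in) are all hypothetical assumptions used for illustration.

```python
import random
import time

# Assumed decoy vocabulary; a real plug-in would tailor these per service.
DECOY_ACTIONS = ["view_post", "like_post", "follow_topic", "search_term"]
TOPICS = ["gardening", "astronomy", "jazz", "rock climbing", "baking"]

def generate_decoys(n: int, rng: random.Random) -> list:
    """Pick n random (action, topic) pairs to mix in with real activity."""
    return [(rng.choice(DECOY_ACTIONS), rng.choice(TOPICS)) for _ in range(n)]

def run_decoys(decoys, perform, rng: random.Random, max_delay: float = 0.0):
    """Execute decoy actions with random delays to look less machine-like."""
    for action, topic in decoys:
        time.sleep(rng.uniform(0, max_delay))
        perform(action, topic)

# Example: print three decoy interactions instead of sending them.
rng = random.Random(42)
run_decoys(generate_decoys(3, rng), lambda a, t: print(a, t), rng)
```

The point is not sophistication but volume and randomness: if enough uncorrelated fake signals are mixed into the genuine ones, behaviour-based profiling degrades.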

Disclaimer / Risks

The approaches in this article are thought experiments, constructed to provoke users to think about different ways of securing data privacy. I obviously do not endorse any solution that violates any company’s terms and conditions.

There are several key risks, some of which are listed below:

  • Any such solution would likely violate the terms and conditions of the companies on whose products it is deployed
  • Any manipulation of data would degrade the user experience, as users would receive less relevant, targeted ads
  • Companies would likely catch on to such a service by using pattern recognition and may be able to filter real from fake information
  • It is questionable whether the proposed solution could technically be designed to integrate seamlessly with the various services
  • This approach is not sustainable, as tech majors rely on ad revenue to offer ‘free’ services; these services would break down if a majority of users adopted such a solution