Cyber "Street Cred": Verifiability and Responsibility of Data Processing in Cybersecurity

"Street Cred - commanding a level of respect in particular environments

due to experience in or knowledge of issues

affecting those environments"

Urban Dictionary

Fine-grained data (including cybersecurity-related data) can offer substantial benefits when used to personalise services, create content, optimise complex systems and support novel business models. These benefits can affect users, citizens, businesses, and governments, and they can inform evidence-based policy-making and political processes. AI techniques applied to data enable the tailoring of services, goods, and even information to users’ predicted interests, tastes and preferences (inferred from data-driven analysis of the digital ‘residue’ of users’ online behaviour). The aim is to deliver services continuously, pre-emptively and conveniently, and, allegedly, to simplify processes and ‘assist’ human decision making in complex environments by directing user attention to the information algorithmically identified as most ‘relevant’ to them.

Yet current AI-enabled digital economy practices suffer from several significant deficiencies. Consumers have little practical control over their data: they often have no way to verify how their data are collected, analysed, and used, and they rarely benefit from offering their data to businesses. Developers often work on open-source projects, which are exploited in various ways by large corporations. Businesses have limited ability to manage the risks associated with the technologies they are developing, particularly when those risks emerge in unanticipated ways. Policy makers and government agencies have difficulty regulating and using AI because of its complexity and the lack of responsible practice (e.g., in facial recognition applications). All these issues erode trust and credibility, and more and more people and organisations are becoming cautious about AI technology. This erosion of trust is particularly damaging in the domain of cybersecurity, where safety and security are often traded off against privacy and trust.

The example of Cambridge Analytica’s political micro-targeting using bulk-harvested Facebook profiles highlights how AI and data-driven technology can be exploited to manipulate individuals at scale, affecting a whole range of stakeholders (consumers, voters, businesses, developers). It vividly demonstrates how a single digital platform that enables people to connect with each other grew into a potentially dangerous business model, detrimentally affecting the rights of many consumers and posing significant threats to democracy at national and global levels. Consequently, there have been growing concerns about the adverse impacts of AI applications, including but not limited to threats to privacy, democracy and individual autonomy. These concerns highlight the importance of responsible data collection, sharing and use, including in data-driven business models for both the public and the private sector. Businesses and public sector organisations now face major challenges, as they are expected to ensure that data science and technology are undertaken responsibly. This requires, among other things, increasing the verifiability, credibility and trustworthiness of the digital economy. All of these issues place increasing pressure on AI-intensive organisations to demonstrate responsible behaviour, not only in what they do but in how they do it.

Therefore, the goal of many organisations now is to make AI systems more credible and trustworthy by ensuring that the way data is collected, analysed and subsequently used in business models and governmental processes adheres to a set of 'best practice' guidelines for trustworthy and responsible data practices. In that regard, three major trends emerge:

Increasing Verifiability of Data Use

This trend is concerned with determining and specifying how personal data should be used by various organisations, and how users and other relevant stakeholders can verify this use. A typical use case here is police access to data stored on mobile phones. The police often ask victims of rape (or victims of other crimes, for that matter) to hand over their mobile phones, so that the police and other law enforcement agencies can verify the nature of any communication the victim may have had with the alleged perpetrator. But victims are understandably reluctant to do this: the vast majority of the data on their phone is not relevant, so the procedure seems grossly disproportionate. Secure hardware and cryptographic techniques may help to solve this type of problem. Yet many questions remain unanswered:

  • What technologies can we design that allow the victim to have assurance that only the relevant and permitted data accesses will be made?

  • Additionally, can the victim be provided with a verifiable transcript of actual accesses made?

  • How should the set of permitted data accesses be specified, in such a way as to satisfy the needs of all the stakeholders?

  • How can the police be assured that the data they lawfully extract is complete and forensically sound?
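One way to picture the "verifiable transcript of actual accesses" raised in the questions above is a hash-chained access log. The sketch below is only an illustration under stated assumptions: the names `AccessLog`, `record_access` and `verify_transcript`, the category labels, and the idea that the victim (or an independent party) holds the latest chain hash are all hypothetical and not taken from any existing system. It also does not address how secure hardware would force every access through the log, which is the harder part of the problem.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value of the hash chain


def entry_hash(prev_hash, record):
    """Hash a log record together with the hash of the previous entry."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


class AccessLog:
    """Hypothetical append-only log of accesses to data on a device."""

    def __init__(self, permitted_categories):
        # The agreed set of permitted data categories (an assumption here).
        self.permitted = set(permitted_categories)
        self.entries = []  # list of (record, chained_hash) pairs

    def head(self):
        """Return the latest chain hash, to be held by the data owner."""
        return self.entries[-1][1] if self.entries else GENESIS

    def record_access(self, requester, category, item_id):
        """Refuse accesses outside the permitted set; log the rest."""
        if category not in self.permitted:
            raise PermissionError(f"access to '{category}' is not permitted")
        record = {"requester": requester, "category": category, "item_id": item_id}
        self.entries.append((record, entry_hash(self.head(), record)))


def verify_transcript(entries, expected_head):
    """Recompute the chain and compare it with the hash the owner holds."""
    prev = GENESIS
    for record, stored_hash in entries:
        if entry_hash(prev, record) != stored_hash:
            return False
        prev = stored_hash
    return prev == expected_head
```

In this sketch, changing, removing or reordering any logged access changes the recomputed chain, so a transcript handed to the victim can be checked against the head hash they already hold. Whether such a scheme would satisfy forensic and legal requirements is an open question, as the list above indicates.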

Achieving Greater Trustworthiness of AI-enabled Products

This trend is concerned with understanding how humans currently interact with existing AI systems, and with investigating how the transparency of the data systems supporting AI-enabled technology could be improved to achieve greater trust in these systems and technologies. It also addresses the technological, legal, economic, and business aspects of developing responsible, verifiable, and trustworthy AI, suggesting concrete business models which will encourage responsible use of data.

A typical use case here is the use of home-based IoT devices, including speech-based assistants. Everybody understands that asking Amazon Alexa to turn on the lights allows Amazon to know and record that a particular request took place. But recent criminal cases have included judges ordering Amazon to reveal information about conversations it captured at the scene of a crime. We fill our houses with devices that have microphones and video cameras, but few people would accept that this should allow public authorities to monitor all our activities.

Another use case for this trend is cloud-based and on-device data processing. Apple has positioned itself as the privacy-friendly mobile phone ecosystem. It was the first platform to introduce end-to-end message encryption (in its iMessage product), and the first to use hardware-based security techniques to prevent anyone (even Apple itself) from decrypting data on a user’s phone. However, Apple’s system is entirely proprietary, and no one is in a position to verify that its claims about not being able to access user data are true.

Once again, secure hardware and cryptography are suitable primitives for designing mechanisms that give users evidence that their data is being processed in accordance with the stated aims. Yet, again, many questions are still open:

  • Can we design technologies that automatically limit the extraction of data from our homes? (e.g., machine-learning-enabled reverse-firewalls can correlate inward and outward data flow patterns, to detect outward data flows that don’t appear to be associated with an assistance request)

  • How can accountable decryption be used when the cloud hosting company needs to retain the ability to decrypt data, but the user wants to verify how often that decryption is actually performed?

  • How can businesses be motivated and incentivized to deliver verifiable and responsible services?

  • What are the underlying business models which would strike a balance between the desire of citizens and customers to know how their data is used and the desire of organisations and businesses to deliver personalised services and achieve high profitability?
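The reverse-firewall idea mentioned in the first question above can be sketched in a very simplified form. The example below is not machine learning: it swaps in a plain time-window heuristic to show the kind of correlation involved. It assumes (hypothetically) that an outward flow belongs to an assistance request if an inward flow, such as the assistant's response, follows it within a few seconds. The `Flow` type, the function name `unexplained_outbound`, and the thresholds are all illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """A single observed network flow at the home gateway (hypothetical)."""
    timestamp: float  # seconds since some reference point
    direction: str    # "in" (towards the home) or "out" (leaving the home)
    size_bytes: int


def unexplained_outbound(flows, window_s=5.0, min_bytes=1024):
    """Return outward flows with no inward flow in the following window.

    Assumption: a genuine assistance request produces an outward flow that
    is answered by an inward flow within `window_s` seconds. Outward flows
    smaller than `min_bytes` are ignored as background chatter.
    """
    inbound_times = [f.timestamp for f in flows if f.direction == "in"]
    flagged = []
    for flow in flows:
        if flow.direction != "out" or flow.size_bytes < min_bytes:
            continue
        answered = any(
            0 <= t - flow.timestamp <= window_s for t in inbound_times
        )
        if not answered:
            flagged.append(flow)
    return flagged
```

A learned model would presumably replace the fixed window and size threshold with patterns inferred from traffic, but the output would be similar in spirit: a list of outward transfers that the household may want to question or block.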

Reciprocal Engagement between Humans and Technology

In order for users to be sufficiently empowered to challenge the technology, they need to understand the goals that the technology is pursuing, the notions of responsibility, ethics, etc., in which these goals are framed, and the consequences (intended or otherwise) of pursuing the goals in this manner. Users make data transactions on a daily basis: they exchange their personal and other digital data for digital services (e.g., by downloading and using apps). They also rarely check how their data is used. For example, since the adoption of the GDPR in the European Union, very few consumers have exercised their right to request their personal data from private sector companies.

Within this trend, we also need to explore questions which go beyond simply asking users to read the ‘terms and conditions’ prior to their interactions, and which involve a much deeper appreciation of how users will engage with the technology (and why organisations require users to engage). The main question is:

  • How can the user make sense of the verification processes that technology uses?

Here, the concept of common ground involves linguistic, perceptual and cultural information that needs to be sufficiently aligned to allow informed interaction between users and technology. The management of linguistic sources (through the terminology that both parties, users and companies, use) and perceptual sources (through the display of information that both parties can access) can be approached through user interface and dialogue design. This does not, however, directly address the management of cultural sources (relating to the notions of responsibility or ethics under which each party operates). There is also a need to explore how individuals make choices about trade-offs between values, in ways that can provide 'participatory' input into the design and configuration of digital devices and their operation.


Much of the success of future cybersecurity systems (especially those based on technology rather than human input) relies on establishing "cyber street cred" for the data collection, storage, and processing techniques, as well as the algorithms, that we use in cybersecurity. After all, if humans rely on cybersecurity systems without trusting those systems and if, in fact, humans are treated by those systems as "a priori" untrustworthy "objects" (something that so-called "zero trust systems" often imply), then any shock to these systems will have detrimental and, perhaps, catastrophic consequences.

#cybersecurity #humanfactor #cyberattack #cyberthreat #cyberrisk #infosec #cybermindset #cyberculture #resilience #robustness #cyberstreetcred


© 2020 by Ganna Pogrebna and Boris Taratine