When you hear the word “surveillance” what comes to mind?
You might be thinking about hidden cameras, tracking website history, public health monitoring, or a military operation. It’s often construed as invasive or secretive.
Can the same be said about AI today?
Think about it: If you’re training a model on human behavior, you have to capture the nuances of that behavior somehow. Whether it’s by tracking purchase history or analyzing social media interactions, you’re surveilling humans to some degree.
The predictions your model makes, based on the data it was trained on, may then influence the content people see or even prompt their future behaviors. We become defined by our past behaviors, and our future behaviors are predicted for us.
If this is true, how can we separate the identity of AI from surveillance?
The key difference between AI and surveillance is privacy. With the volume of accessible data growing exponentially, organizations must prioritize privacy in AI. To trust AI, people must trust that their personal information is not being misused.
Here are just a few ways you can ensure privacy is protected in AI.
Although machine learning used to require large amounts of data, it’s now more important to prioritize high-quality, relevant data. When using your own data, ensure that your organization is practicing good data hygiene: audit your data for issues, remove unneeded data, and implement consistent rules and constraints for data collection. It’s also important that your data (both owned and from third-party sources) is fair and representative.
When collecting data, it is critical that individuals are aware of the information being collected and how it will be used. An opportunity to opt out of data collection should also be provided. In addition to data collection transparency, it’s also best practice to inform stakeholders when AI is being used to make decisions about them.
Most biased algorithms point back to biased datasets. Data that is collected on humans and human behavior naturally carries human prejudices and harmful bias with it. For example, a model trained on historically poor credit score ratings manually given to a particular marginalized group will only mirror and amplify these biased ratings.
Organizations can reduce bias in AI and datasets by measuring for bias in all areas of AI design, involving humans in your AI training and testing stages, and using third-party tools, like IBM’s AI Fairness 360 toolkit, to check for fairness.
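To make “measuring for bias” concrete, here is a minimal sketch of one common fairness metric, disparate impact: the ratio of favorable-outcome rates between an unprivileged and a privileged group. Toolkits such as IBM’s AI Fairness 360 report this metric among many others; the plain-Python version below, the group labels, and the 0.8 rule of thumb are illustrative, not the toolkit’s implementation.

```python
def disparate_impact(outcomes, groups, privileged):
    """Ratio of favorable-outcome rates: unprivileged rate / privileged rate.

    outcomes: list of 1 (favorable decision) / 0 (unfavorable decision)
    groups:   parallel list of group labels for each individual
    privileged: the label of the privileged group
    A widely cited rule of thumb flags values below 0.8 as potential bias.
    """
    priv = [o for o, g in zip(outcomes, groups) if g == privileged]
    unpriv = [o for o, g in zip(outcomes, groups) if g != privileged]
    priv_rate = sum(priv) / len(priv)
    unpriv_rate = sum(unpriv) / len(unpriv)
    return unpriv_rate / priv_rate

# Hypothetical example: group "A" receives favorable decisions 3 times in 4,
# group "B" only 1 time in 4.
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
di = disparate_impact(outcomes, groups, privileged="A")  # 0.25 / 0.75 ≈ 0.33
```

A value of roughly 0.33, well below 0.8, would prompt exactly the kind of human review of training data and model design that the paragraph above recommends.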
AI and surveillance may share some similarities, but at the end of the day, it is your responsibility as an AI-ready leader to ensure your AI and data protect the privacy of everyone involved.
Learn more ways to protect privacy in AI, and stay up to date on the latest in trusted AI and data science, by subscribing to our Voices of Trusted AI monthly digest. Each monthly email contains helpful trusted AI resources, reputable information, and actionable tips.
Contributor: Nicole Ponstingle McCaffrey is the COO and AI Translator at Pandata.