Julia Stoyanovich is a professor at NYU’s Tandon School of Engineering and founding Director of the Center for Responsible AI. She recently delivered testimony to the NYC Council’s Committee on Technology about a proposed bill that would regulate the use of AI for hiring and employment decisions.
You are the founding Director of the Center for Responsible AI at NYU. Could you share with us some of the initiatives undertaken by this organization?
I co-direct the Center for Responsible AI (R/AI) at NYU with Steven Kuyan. Steven and I have complementary interests and expertise. I’m an academic, with a computer science background and a strong interest in use-inspired work at the intersection of data engineering, responsible data science, and policy. Steven is the managing director of the NYU Tandon Future Labs, a network of startup incubators and accelerators that has already had a tremendous economic impact in NYC. Our vision for R/AI is to help make “responsible AI” synonymous with “AI”, through a combination of applied research, public education and engagement, and support for companies large and small – especially small – in developing AI responsibly.
In the last few months, R/AI has actively engaged in conversations around ADS (Automated Decision Systems) oversight. Our approach is based on a combination of educational activities and policy engagement.
New York City is considering a proposed law, Int 1894, that would regulate the use of ADS in hiring through a combination of auditing and public disclosure. R/AI submitted public comments on the bill, based on our research and on insights we gathered from jobseekers through several public engagement activities.
We also collaborated with The GovLab at NYU and with the Institute for Ethics in AI at the Technical University of Munich on a free online course called “AI Ethics: Global Perspectives” that was launched earlier this month.
Another recent project of R/AI that has been getting quite a bit of attention is our “Data, Responsibly” comic book series. The first volume, called “Mirror, Mirror”, is available in English, Spanish, and French, and is accessible with a screen reader in all three languages. The comic received the Innovation of the Month award from MetroLab Network and GovTech, and was covered by the Toronto Star, among others.
What are some of the current or potential issues with AI bias for hiring and employment decisions?
This is a complex question that requires us first to be clear about what we mean by “bias”. The key thing to note is that automated hiring systems are “predictive analytics” — they predict the future based on the past. The past is represented by historical data about individuals who were hired by the company, and about how these individuals performed. The system is then “trained” on this data, meaning that it identifies statistical patterns and uses these to make predictions. These statistical patterns are the “magic” of AI; they are what predictive models are based on. Crucially, the historical data from which these patterns were mined is silent about individuals who weren’t hired, because we simply don’t know how they would have done on the job they didn’t get. And this is where bias comes into play. If we systematically hire more individuals from specific demographic and socioeconomic groups, then membership in these groups, and the characteristics that go along with that membership, will become part of the predictive model. For example, if we only ever see graduates of top universities being hired for executive roles, then the system cannot learn that people who went to a different school might also do well. It’s easy to see a similar problem for gender, race, and disability status.
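The point about silent data can be made concrete with a toy sketch. The data and the scoring rule below are entirely synthetic and deliberately simplistic (not any real hiring system): a model that estimates success rates from past hires has no observations at all for groups that were never hired, so it cannot score them favorably.

```python
from collections import defaultdict

# Synthetic historical data: every past hire came from a "top" school,
# so outcomes for candidates from other schools were never observed.
past_hires = [
    {"top_school": True, "performed_well": True},
    {"top_school": True, "performed_well": True},
    {"top_school": True, "performed_well": False},
]

def train(records):
    """Estimate P(performed_well | top_school) from historical hires."""
    counts = defaultdict(lambda: [0, 0])  # feature value -> [successes, total]
    for r in records:
        counts[r["top_school"]][1] += 1
        if r["performed_well"]:
            counts[r["top_school"]][0] += 1

    def score(candidate):
        successes, total = counts[candidate["top_school"]]
        # No data for this group: the model defaults to 0, i.e. it can
        # never recommend candidates it has not seen hired before.
        return successes / total if total else 0.0

    return score

model = train(past_hires)
print(model({"top_school": True}))   # ~0.67, learned from past hires
print(model({"top_school": False}))  # 0.0 — the data is silent here
```

The “bias” here is not a bug in the code: the estimator faithfully reproduces the selection pattern in its training data, which is exactly the mechanism described above.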
Bias in AI is much broader than just bias in the data. It arises when we attempt to use technology where a technical solution is simply inappropriate, or when we set the wrong goals for the AI – often because we don’t have a diverse set of voices at the design table – or when we give up our agency in human-AI interactions after the AI is deployed. Each of these reasons for bias deserves its own discussion that will likely run longer than space in this article permits. And so, for the sake of staying focused, let me return to bias in the data.
When explaining bias in the data, I like to use the mirror reflection metaphor. Data is an image of the world, its mirror reflection. When we think about bias in the data, we interrogate this reflection. One interpretation of “bias in the data” is that the reflection is distorted – our mirror under-represents or over-represents some parts of the world, or otherwise distorts the readings. Another interpretation of “bias in the data” is that, even if the reflection was 100% faithful, it would still be a reflection of the world such as it is today, and not of how it could or should be. Importantly, it’s not up to data or to an algorithm to tell us whether it’s a perfect reflection of a broken world, or a broken reflection of a perfect world, or if these distortions compound. It’s up to people – individuals, groups, society at large – to come to consensus about whether we are OK with the world such as it is, or, if not, how we should go about improving it.
Back to predictive analytics: the stronger the disparities are in the data, as a reflection of the past, the more likely they are to be picked up by predictive models and replicated – and even exacerbated – in the future.
If our goal is to improve our hiring practices with an eye on equity and diversity, then we simply cannot outsource this job to machines. We have to do the hard work of identifying the true causes of bias in hiring and employment head-on, and of negotiating a socio-legal-technical solution with input from all stakeholders. Technology certainly has a role to play in helping us improve on the status quo: it can help us stay honest about our goals and outcomes. But pretending that de-biasing the data or the predictive model will solve the deep-seated problems of discrimination in hiring is naive at best.
You recently delivered testimony to the NYC Council’s Committee on Technology; one striking comment was as follows: “We find that both the advertiser’s budget and the content of the ad each significantly contribute to the skew of Facebook’s ad delivery. Critically, we observe significant skew in delivery along gender and racial lines for ‘real’ ads for employment and housing opportunities despite neutral targeting parameters.” What are some solutions to avoid this type of bias?
This comment is based on a brilliant paper by Ali et al. called “Discrimination through Optimization: How Facebook’s Ad Delivery Can Lead to Biased Outcomes.” The authors find that the ad delivery mechanism itself is responsible for introducing and amplifying discriminatory effects. Needless to say, this finding is highly problematic, especially as it plays out against the backdrop of opacity at Facebook and other platforms, such as Google and Twitter. The burden is on the platforms to urgently and convincingly demonstrate that they can rein in discriminatory effects such as those found by Ali et al. Short of that, I cannot find a justification for continued use of personalized ad targeting in housing, employment, and other domains where people’s lives and livelihoods are at stake.
How can data scientists and AI developers best prevent other unintentional bias from creeping into their systems?
It’s not entirely up to data scientists, or to any one stakeholder group, to ensure that technical systems are aligned with societal values. But data scientists are, indeed, at the forefront of this battle. As a computer scientist myself, I can attest to the attractiveness of thinking that the systems we design are “objective”, “optimal”, or “correct”. How successful computer science and data science are — how influential and how broadly used — is both a blessing and a curse. We technologists no longer have the luxury of hiding behind the unattainable goals of objectivity and correctness. The burden is on us to think carefully about our place in the world, and to educate ourselves on the social and political processes we are impacting. Society cannot afford to have us move fast and break things; we must slow down and reflect.
It is symbolic that philosophy was once the centerpiece of all scientific and societal discourse, then came mathematics, then computer science. Now, with data science taking center stage, we have come full circle and are in need of connecting back with our philosophical roots.
Another recommendation that you made is creating an informed public. How do we inform a public that may not be familiar with AI, or understand the problems associated with AI bias?
There is a dire need to educate non-technical people about technology, and to educate technical people about its social implications. Achieving both of these goals will take a strong commitment and substantial investment on the part of our government. We need to develop materials and educational methodologies for all of these groups, and to find ways to incentivize participation. And we cannot leave this work up to commercial entities. The European Union is leading the way, with several governments providing support for basic AI education of their citizens, and incorporating AI curricula into high school programs. We at R/AI are working on a publicly available and broadly accessible course, aiming to create an engaged public that will help make AI what WE want it to be. We are very excited about this work; please stay tuned for more information in the coming months.
Thank you for the great, detailed responses; readers who wish to learn more should visit the Center for Responsible AI.