Discussing Responsible AI & International Governance


In the previous episode of Private AI’s ML Speaker Series, Patricia Thaine (CEO of Private AI) sat down with Dr. Sarah Shoker (Research Scientist at OpenAI) to discuss her recent work on the responsible use of AI.

Dr. Shoker is part of the policy research team at OpenAI, where she focuses on assessing the geopolitical considerations that drive AI design, who benefits from AI, and how its design shifts state-to-state relations. She is formally trained in political science and international relations and is a former SSHRC postdoctoral scholar from the University of Waterloo, where she worked at the crossroads of emerging technologies and international security. Her PhD from McMaster University examined counterinsurgency, military drones, and bias in decision-making when targeting civilians.

If you missed this latest discussion, scroll down to find a recap of Patricia and Sarah’s chat below.

Discussing Responsible AI & International Governance

PAI: What distinction, if any, do you make between ethical AI and responsible AI?

Sarah: That’s a great question and also a little bit of a difficult one because these terms aren’t standardized. You’ll likely see variations in how the terms are used depending on the speaker. But generally, I would conceptualize ethics as a field of philosophy that addresses questions about morals and values. When it comes to ethical AI, that field would pertain to AI development and use and the morals related to them.

I do want to be clear that my training is not as a philosopher, though of course all of us make value judgments every day in our work. But when we talk about responsible AI, the unspoken word is accountability. Responsibility implies rights and obligations and that there are both moral and sometimes legal requirements on the part of the designer to affected individuals and communities. So I would conceptualize responsible AI as being nested within a larger conversation about ethical AI, but also within discussions about regulations and governance. And I approach the question from a social sciences background; I’m personally very interested in the adoption of AI safety standards and international mechanisms that we can leverage to encourage international cooperation on those issues. 

PAI: Thank you for that clarification. That’s really interesting to think about, because I don’t think that distinction is often made.

So with that in mind, the EU draft AI regulation from 2021 categorizes AI applications into three pockets: unacceptable use, high-risk systems, and limited- and minimal-risk systems.

What are your thoughts on how the legislation has categorized these AI systems?

Sarah: Yeah, the AIA is very controversial. In terms of these categories, I think the most obvious thing that we can point to is that, with the exception of maybe chatbots here, most of these categories are pointing to AI use cases and capabilities that would impact social and political environments. And I think that’s important because we can think of AI systems as dual-use technologies, if I can borrow a term that’s used commonly in the military sphere, meaning technologies that can satisfy both peaceful and malicious aims. I think using a risk assessment rubric that focuses on capabilities rather than the model itself is probably the right method, since historically treaties and export control lists, which I’m very interested in, struggle to regulate digital technologies after they come into existence.

Looking at bad use cases can be one way to preempt this problem while still allowing companies to innovate. Instead of rendering prohibitions on technology types, you can render prohibitions or regulations on certain usages of tech. That being said, the exploitative techniques causing harm could be carried out by a chatbot. So I’m not sure why chatbots are being viewed as unproblematic or low-risk. To me, that’s a bit of a strange carve out. I could see a chatbot, for example, being very good at persuading someone to harm themselves or being developed in such a way that incentivizes extreme psychological dependencies. So I think I’d want to do a little bit more research on why chatbots are being viewed as the least risky. But other than that, I agree largely with the aim behind this categorization system. 

PAI: That makes a lot of sense. And when you consider that chatbots can also memorize personal information if you train them on production data and then spew it back out, it’s a little bit scary to consider them limited risk.

Sarah: Yeah, I think maybe the motivation, and I’m speculating here a little bit, is that chatbots are already in frequent use by governments because they’re thought of as one way to reduce friction between governments and citizenry. So maybe this could be a situation of path dependency where governments are already used to chatbots being in existence, and so they’re viewed less problematically. But that doesn’t mean that in the future they might not be developed in such a way that they could pose the risk of real harm. 
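On the memorization concern raised above, here is a minimal, hypothetical sketch of what scrubbing a few obvious PII patterns from text before it is added to a chatbot training set might look like. The regex patterns and function name are illustrative assumptions only; production redaction systems cover far more entity types and rely on context-aware models rather than simple patterns.

```python
import re

# Illustrative only: a few simple regex patterns for common PII shapes.
# Real redaction needs context-aware detection for names, addresses, IDs, etc.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace matched PII spans with typed placeholder tokens."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

# Example: scrub a line of production data before adding it to a training set.
print(redact("Contact me at jane.doe@example.com or +1 416 555 0199."))
# -> Contact me at [EMAIL] or [PHONE].
```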

PAI: In your 2019 article on how artificial intelligence is reshaping global power and Canadian foreign policy, you mentioned how AI has been blamed for disasters, like the Boeing 737 MAX crashes, without any AI actually being integrated.

Do you think misappropriating the term significantly impacts our ability to legislate the use of AI? 

Sarah: Yeah, that’s an interesting question. To an extent I do think that some countries are finding workarounds to what we might call the definition problem. Countries like Canada, which mandate algorithmic impact assessments from federal agencies, have opted to forgo the term AI entirely and instead use the term data-driven decision-making. And I’m referring to a Treasury Board directive that mandates algorithmic impact assessments from agencies within the federal government itself. So this does not necessarily apply to private companies, but the idea behind that directive is that because data-driven systems can impact our social and political environments well before they reach that AI threshold, the regulation should not necessarily target AI, but the data that powers AI, which is often viewed to be the source of social or political problems that might emerge from AI technologies. You also see this problem at the international level.

I know we’re probably going to talk about lethal autonomous weapons systems at some point, but within the Convention on Certain Conventional Weapons at the UN, there has been this ten-year ongoing debate on how to define autonomy, because there is no legal or technical consensus on when a non-autonomous technology becomes autonomous or even semi-autonomous. And that lack of definition makes it difficult to ban or restrict uses of autonomous systems. That’s been a site of frustration for the UN for the last ten years: the lack of consensus. And I think there’s also just a public communications concern. We’re entering into a world where a great deal of political and social decision-making is being offloaded onto automated systems. And how can you develop the required literacy so that individuals can contest and engage meaningfully with our changing social and political environment? So I do worry a little bit about misappropriating the AI definition, but I think hopefully things will improve as we move forward. And I do think that a number of journalists are raising red flags and sounding the alarm when the term is being used inappropriately.

PAI: You wrote a working paper titled “Algorithmic Bias and the Principle of Distinction: Towards an Audit of Lethal Autonomous Weapons Systems.” 

Are Lethal Autonomous Weapons the most unacceptable application of AI in your opinion, or are there use cases we should be equally if not more concerned about?

Sarah: Yeah. So lethal autonomous weapons systems, I’m going to call them LAWS for short, because that’s usually how they’re shortened in the research. I don’t ever want to say it’s the most inappropriate case, because people continue to surprise me with bad ideas, but I would say it’s within a larger category of use cases that present extremely serious problems, because it is the most overt example of weaponized AI existing in an international legal environment that, I think, has done a very poor job at ensuring that the few rights civilians are granted in these environments are fulfilled. So if we go back to the EU AIA, there is a carve-out there that prevents real-time use cases of biometric information in policing.

Because international violent conflict operates under a different legal structure, those types of use cases aren’t prohibited in violent conflict. So as citizens of liberal democracies, we might expect certain data protections, but that’s rarely the case for civilians caught in violent conflict, who sit within, I would say, an even worse AI legal gap. And I’m also just extremely concerned about object and civilian identification in violent conflict, because historically we’ve struggled to correctly distinguish combatants from civilians and civilian objects. Those are legal identifiers. And if you know anything about labeling training sets, you understand that when labeling imagery or objects, there are always normative values applied to the labeling process. And AI isn’t going to magically resolve that problem for you if you’re mislabeling the data.

So what we call the principle of distinction — the ability to distinguish between combatant and civilian — is, in my mind, one of the most important legal principles that characterize the post-1945 global order. That is what prevents total war.

There are also other scenarios that are extremely problematic. For example, I would say the integration or offloading of AI decision-making within the command, control, and communications (C3) of nuclear weapons is unfortunately something that has been raised by some military observers and higher-level service members. And historically, strategic stability in the international order has been characterized by trying to understand other human beings and their political intentions. So what happens when you try to offload communication to a software entity that does not necessarily behave like a human? With that, we could anticipate that the risk of an accident could increase. So that’s another thing that I think would be right up there in the category of worst use cases.

PAI: One thing that you mentioned in the paper that was really surprising to me was that, correct me if I’m wrong, there were males who were attacked by autonomous weapons, but they just did not categorize them as casualties.

Sarah: Yeah, almost there. I would say that we don’t necessarily have lethal autonomous weapons systems yet, though there was a recent incident with a Turkish-made drone called the Kargu-2, which caused people to argue about whether or not it fulfills the definition of autonomy. There’s the problem of definitions there again, but you are right. Military-age male, which is another identifier, is a technocratic identifier. It’s not a legal identifier. You will not find that term in international humanitarian law. It’s used within the military environment to describe boys and men who are generally aged 16 and above. But of course, we are biased human beings, so often we can’t tell just by looking at someone what their age is.

In practice, it has also been applied to children younger than the age of 16. If they were killed within a drone strike, they were excluded from the collateral damage count. You can ask any military observer and they will tell you that a military-age male is not the same thing as a combatant. They are technically a civilian until proven otherwise. That’s international humanitarian law. But if they were in the vicinity of a drone strike, they were then, in death, considered to be a combatant. They were counted as a combatant, stripped of their civilian identity, and excluded from the collateral damage count. And so the collateral damage count, the civilian death count, was artificially low. That’s why I say we have historically really struggled to distinguish civilian from combatant objects in violent conflict.

PAI: We haven’t been hearing much about lethal autonomous weapons in recent years. Is it because it’s old news being crowded out by a number of newer international concerns, or has something happened to actually limit the creation or use of these weapons?

Sarah: Nothing’s happened to limit the creation or use. I’m going to blame this on COVID fatigue. It is true that you’re pointing to a lack of public dialogue and discussion; the meetings are still occurring. I think this might be because the responsibility for highlighting lethal autonomous weapons systems in the public sphere has generally fallen on the shoulders of civil society, and it’s probably just harder to mobilize in a pandemic.

There was a second “Killer Robots” video recently released by the Future of Life Institute. The first video, which went somewhat viral, looked at the possibility of, say, drone swarms targeting college-aged students, and the follow-up was released a few months ago, so I think mobilization is still occurring. I’m not sure if people are just really tired. Goodness knows that lethal autonomous weapons systems are one of many challenges the global order is currently facing. So yeah, I could put it down to COVID fatigue, but that’s just the best guess on my part.

PAI: How would you balance the risk of certain countries developing unethical AI, like lethal autonomous weapons, and using the technology against countries who don’t? 

Sarah: The literature on international relations will tell you that it’s very difficult to simply ban innovation and research into new technologies. But what I think you’re pointing to is a question about international stability, and that is a very big question. There are a few ways that we can approach this question. One way is to point to historical lessons about mitigating the spread of dual-use technologies. I think policymakers are also actively exploring the modernization of export control. 

I know in the United States, semiconductor export controls and limitations on semiconductor transfers to other countries are actively being discussed right now. There’s also a small explosion of new AI initiatives that seek to develop consensus about responsible use and the development of AI that conforms to human rights standards. The Global Partnership on AI, I think, would be a really key instance of that; Canada and France are the original founding members, but it does contain a number of countries. The US, Singapore, and Japan are involved, and we actually have a center of excellence that comes out of the Global Partnership on AI and is located in Montreal. So this is an active political agenda item, I would say.

PAI: Are there any ways you’ve seen AI practitioners effectively prevent the misuse of their products or their work in unethical applications? 

Sarah: I am personally a big proponent of algorithmic audits and impact assessments. Other researchers have pointed to this already. I don’t think I’m inventing anything new here, but model cards, system cards, these types of accountability mechanisms have analogies in other industries. For example, we have nutrition labels, so we know what’s going on with the food that we consume. We have other consumer protections to ensure that our products don’t contain harmful ingredients. 

The government of Canada, as I mentioned earlier, does mandate algorithmic impact assessments. I would also point to the UK NHS, which recently partnered with the Ada Lovelace Institute to develop an impact assessment of AI used in the healthcare context, which I think is very exciting. I’m personally interested in seeing what kinds of lessons we can learn from algorithmic impact assessments and how they might transfer to the international sphere. This particular initiative is currently led by a few domestic governments, mostly pushed by civil society, and some industry. So it’s still nascent. There are no standards or consensus on what an algorithmic impact assessment should currently look like. But as the research develops, I think there will likely be some lessons and parallels that we can use at the international level when it comes to AI regulation.
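To make the idea of a model or system card a bit more concrete, here is a minimal, hypothetical sketch of one represented as structured data. The field names and values are assumptions for illustration only; as noted above, there is no single standardized schema for these disclosures yet.

```python
from dataclasses import dataclass, field

# Hypothetical, minimal model card as structured data; field names are
# illustrative and do not follow any official template.
@dataclass
class ModelCard:
    model_name: str
    intended_use: str
    out_of_scope_uses: list[str] = field(default_factory=list)
    training_data_summary: str = ""
    known_limitations: list[str] = field(default_factory=list)
    evaluation_notes: str = ""

card = ModelCard(
    model_name="example-chatbot-v1",
    intended_use="English-language customer support question answering.",
    out_of_scope_uses=["Medical or legal advice", "Decisions about individuals"],
    training_data_summary="Public web text plus anonymized support transcripts.",
    known_limitations=["Can produce plausible but incorrect answers."],
    evaluation_notes="Red-teamed for prompt injection and PII leakage.",
)
print(f"{card.model_name}: {card.intended_use}")
```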

PAI: There’s some work by Dr. Saif Mohammad from the National Research Council of Canada; he came up with an ethics sheet for AI practitioners or researchers to fill out when they’re thinking through the misuses of their work or the ethical implications. Really interesting work to check out if anybody’s interested as well.

Can you tell us about how OpenAI is ensuring its technology is not used for unethical purposes?

Sarah: We take risk mitigation and safety very seriously, and I can tell you a little bit about our deployment strategy. I think we’ve actually been pretty public about this; our deployment strategy has emphasized gradual rollouts to a small user base, then expanding access once we assess the impact of our API and models on the social and political environment. And that’s the strategy we’ve taken with GPT-2 and GPT-3, along with DALL·E 2, which we recently released. Personally, and I’m just speaking as an individual researcher now, I think that’s just a basic form of due diligence, especially in an international environment where regulatory agreements and export control lists struggle to keep pace with the proliferation of dual-use technologies that emerge from the private sector.

Now I will also say that there is no such thing as a zero-risk safety-critical system in terms of things like the elimination of bias, which I sometimes hear about in journalistic media. The idea that we can somehow have a bias-free AI, or that we can expect a world where we have zero accidents from automated technologies, is something I personally don’t agree with. I think regardless of what you are building, you are going to be making design decisions that are based on a normative value set. And so it’s extremely important to be explicit about what values you’re actually optimizing for.

I’ll also take this moment to invite people to view our system card for DALL·E 2, which is another risk mitigation and accountability tool that we’ve published. It’s the result of our red-teaming effort, and I think it’s pretty comprehensive in terms of the risks we’ve identified with our model. I’d also say that’s the level of public disclosure that I think is important for moving the dialogue on AI safety forward. I’m really excited that the AI space is moving in that direction with the publication of model and system cards. So that’s one thing that we’re doing, and a number of other AI companies are doing it as well.
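As a rough sketch of the gradual-rollout idea described above, the snippet below shows one way a staged access gate for an API might be expressed. The stage names, allowlists, and logic are entirely hypothetical illustrations, not a description of OpenAI’s actual deployment tooling.

```python
from enum import Enum

# Hypothetical staged-rollout gate: access widens from an internal allowlist
# to an approved waitlist, and finally to general availability.
class Stage(Enum):
    INTERNAL = 1
    LIMITED_BETA = 2
    GENERAL = 3

CURRENT_STAGE = Stage.LIMITED_BETA
INTERNAL_ALLOWLIST = {"researcher@example.org"}
WAITLIST_APPROVED = {"partner@example.com"}

def has_access(user_email: str) -> bool:
    """Return True if the user may call the API at the current rollout stage."""
    if CURRENT_STAGE is Stage.GENERAL:
        return True
    if CURRENT_STAGE is Stage.LIMITED_BETA:
        return user_email in INTERNAL_ALLOWLIST or user_email in WAITLIST_APPROVED
    return user_email in INTERNAL_ALLOWLIST

print(has_access("partner@example.com"))   # True at LIMITED_BETA
print(has_access("someone@example.net"))   # False until GENERAL
```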

PAI: Is there anything else you’re excited to be working on and would like to share with us? 

Sarah: I’m actually going to take this opportunity to talk a little bit about some of the stuff that we’re doing at OpenAI that might be of interest to people. We’re currently in the process of launching what we’re calling the Researcher Access Program. If you’re a researcher, you don’t have to be affiliated with a university; this is also open to researchers from think tanks and civil society. If you want subsidized access to our models to conduct research, this program is now available. And we prioritize questions that deal with social impact and risk and safety mitigation. You do not have to be in STEM, so this is open to all disciplines. You just have to have a good research question. So if you’re interested in that, just keep a lookout on the OpenAI blog. We’re going to be announcing that shortly. So I’m pretty excited about that. More collaboration with academia and the public sphere. (Edit: The program is now launched!)

PAI: How do you think that the CCPA and GDPR will limit data collection in beneficial ways? 

Sarah: What are the benefits of those data protections? If I can interpret the question that way, I’m going to take a little step back and think about this conceptually and broadly. I would characterize data collection as being on the periphery of our awareness. We’re not actually sure how that data collection functions, what data is even being collected, and then how it influences what we see and what socialization pressures might be derived from it. One benefit of having the GDPR is that there’s just greater public awareness.

If data collection is being used to influence our social and political environments, one could argue that means there’s this kind of pressure moving towards an anti-democratic politics where we are less able to contest and argue against some of the forces that are shaping our lives. The GDPR provides a remedy that allows us to understand what our political environment even looks like. It also creates a platform, I think, for individual users to contest how their own data, how their own lives, are being shaped by a technology that we often have no input in designing. That would, to me, be the primary benefit. I would say it provides a counterbalance to what is potentially an anti-democratic politics. And as a political scientist, that is of concern to me. Delegating important decision-making to non-human entities has profound moral implications. So we need to have a conversation on that. The GDPR makes those sorts of normative decisions and trade-offs more explicit. Hopefully that answers the question. It’s a good question. It’s a tough one too.

PAI: How do you feel about supervisory authorities that require full deletion of algorithms in instances where data collection was not done properly?

Sarah: Yes, this has now happened a few times. If I recall correctly, I think the most obvious case has to do with the improper collection of children’s data. In one particular case, the company was actually collecting children’s data, and children have even fewer rights than adults. So this is an extremely vulnerable population that has very limited ability to consent. I think this question is referring to new-ish FTC guidelines. I guess the idea behind algorithmic destruction is that a company shouldn’t be able to profit from poor data collection or simply pay a fine, because historically that has not been enough of a disincentive to behave properly, right? If you’re large enough, you can weather all the fines that the FTC might levy against you. It becomes no big deal. So I’m largely supportive. I could change my mind in the future. But yeah, I think if you’re going to have any kind of rulebook, that rulebook needs to have teeth and it actually needs to create incentives for proper behavior. If people are willing to pay the fine for poor behavior, then you’re not actually creating a regulation that’s effective. So you want to create regulation that actually impacts and alters behavior towards the positive end of the spectrum.

Missed this webinar? Sign up for Private AI’s newsletter to receive updates on upcoming events.
