Microsoft’s framework for building AI systems responsibly
Today we are sharing publicly Microsoft’s Responsible AI Standard, a framework to guide how we build AI systems. It is an important step in our journey to develop better, more trustworthy AI. We are releasing our latest Responsible AI Standard to share what we have learned, invite feedback from others, and contribute to the discussion about building better norms and practices around AI.
Guiding product development towards more responsible outcomes
AI systems are the product of many different decisions made by those who develop and deploy them. From system purpose to how people interact with AI systems, we need to proactively guide these decisions toward more beneficial and equitable outcomes. That means keeping people and their goals at the center of system design decisions and respecting enduring values like fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability.
The Responsible AI Standard sets out our best thinking on how we will build AI systems to uphold these values and earn society’s trust. It provides specific, actionable guidance for our teams that goes beyond the high-level principles that have dominated the AI landscape to date.
The Standard details concrete goals or outcomes that teams developing AI systems must strive to secure. These goals help break down a broad principle like ‘accountability’ into its key enablers, such as impact assessments, data governance, and human oversight. Each goal is then composed of a set of requirements, which are steps that teams must take to ensure that AI systems meet the goals throughout the system lifecycle. Finally, the Standard maps available tools and practices to specific requirements so that Microsoft’s teams implementing it have resources to help them succeed.
The need for this type of practical guidance is growing. AI is becoming more and more a part of our lives, and yet, our laws are lagging behind. They have not caught up with AI’s unique risks or society’s needs. While we see signs that government action on AI is expanding, we also recognize our responsibility to act. We believe that we need to work towards ensuring AI systems are responsible by design.
Refining our policy and learning from our product experiences
Over the course of a year, a multidisciplinary group of researchers, engineers, and policy experts crafted the second version of our Responsible AI Standard. It builds on our previous responsible AI efforts, including the first version of the Standard that launched internally in the fall of 2019, as well as the latest research and some important lessons learned from our own product experiences.
Fairness in Speech-to-Text Technology
The potential of AI systems to exacerbate societal biases and inequities is one of the most widely recognized harms associated with these systems. In March 2020, an academic study revealed that speech-to-text technology across the tech sector produced error rates for members of some Black and African American communities that were nearly double those for white users. We stepped back, considered the study’s findings, and learned that our pre-release testing had not accounted satisfactorily for the rich diversity of speech across people with different backgrounds and from different regions. After the study was published, we engaged an expert sociolinguist to help us better understand this diversity and sought to expand our data collection efforts to narrow the performance gap in our speech-to-text technology. In the process, we found that we needed to grapple with challenging questions about how best to collect data from communities in a way that engages them appropriately and respectfully. We also learned the value of bringing experts into the process early, including to better understand factors that might account for variations in system performance.
The Responsible AI Standard records the pattern we followed to improve our speech-to-text technology. As we continue to roll out the Standard across the company, we expect the Fairness Goals and Requirements identified in it will help us get ahead of potential fairness harms.
Appropriate Use Controls for Custom Neural Voice and Facial Recognition
Azure AI’s Custom Neural Voice is another innovative Microsoft speech technology that enables the creation of a synthetic voice that sounds nearly identical to the original source. AT&T has brought this technology to life with an award-winning in-store Bugs Bunny experience, and Progressive has brought Flo’s voice to online customer interactions, among uses by many other customers. This technology has exciting potential in education, accessibility, and entertainment, and yet it is also easy to imagine how it could be used to inappropriately impersonate speakers and deceive listeners.
Our review of this technology through our Responsible AI program, including the Sensitive Uses review process required by the Responsible AI Standard, led us to adopt a layered control framework: we restricted customer access to the service, ensured acceptable use cases were proactively defined and communicated through a Transparency Note and Code of Conduct, and established technical guardrails to help ensure the active participation of the speaker when creating a synthetic voice. Through these and other controls, we helped protect against misuse, while maintaining beneficial uses of the technology.
Building upon what we learned from Custom Neural Voice, we will apply similar controls to our facial recognition services. After a transition period for existing customers, we are limiting access to these services to managed customers and partners, narrowing the use cases to pre-defined acceptable ones, and leveraging technical controls engineered into the services.
Fit for Purpose and Azure Face Capabilities
Finally, we recognize that for AI systems to be trustworthy, they need to be appropriate solutions to the problems they are designed to solve. As part of our work to align our Azure Face service to the requirements of the Responsible AI Standard, we are also retiring capabilities that infer emotional states and identity attributes such as gender, age, smile, facial hair, hair, and makeup.
Taking emotional states as an example, we have decided we will not provide open-ended API access to technology that can scan people’s faces and purport to infer their emotional states based on their facial expressions or movements. Experts inside and outside the company have highlighted the lack of scientific consensus on the definition of “emotions,” the challenges in how inferences generalize across use cases, regions, and demographics, and the heightened privacy concerns around this type of capability. We also decided that we need to carefully analyze all AI systems that purport to infer people’s emotional states, whether the systems use facial analysis or any other AI technology. The Fit for Purpose Goal and Requirements in the Responsible AI Standard now help us to make system-specific validity assessments upfront, and our Sensitive Uses process helps us provide nuanced guidance for high-impact use cases, grounded in science.
These real-world challenges informed the development of Microsoft’s Responsible AI Standard and demonstrate its impact on the way we design, develop, and deploy AI systems.
For those wanting to dig into our approach further, we have also made available some key resources that support the Responsible AI Standard: our Impact Assessment template and guide, and a collection of Transparency Notes. Impact Assessments have proven valuable at Microsoft to ensure teams explore the impact of their AI system – including its stakeholders, intended benefits, and potential harms – in depth at the earliest design stages. Transparency Notes are a new form of documentation in which we disclose to our customers the capabilities and limitations of our core building block technologies, so they have the knowledge necessary to make responsible deployment choices.