Why responsible AI is a business imperative


The arrival of generative artificial intelligence (AI) prompted a great deal of discussion of speculative AI harms, with controversies over everything from training data to military use to how many people AI vendors employed in their ethics departments. Although so-called “AI doomers” have been very vocal in calling for regulation to tackle long-term harms they view as existential, they’ve also criticised emerging legislation that focuses more on competition, access and consumer protection.

Meanwhile, large AI vendors such as Microsoft and Google are publishing annual transparency reports on how they build and test their AI services. Increasingly, these highlight a shared responsibility model for enterprise customers using their tools and services that will be familiar from cloud security and particularly important as “agentic” AI tools that can autonomously take action arrive.

With systems using AI – whether generative or more traditional approaches – already in use inside organisations and in customer-facing tools, AI governance needs to move from data science teams (who rarely have expertise in either ethics or business risk) to CIOs who can tackle AI ethics in a practical rather than theoretical way, covering risk tolerance, regulatory requirements and possible changes to business operations. According to Accenture, only a tiny percentage of organisations have managed to both assess the risk and implement best practices at scale.

The most pressing real-world issues are “lack of transparency, problems with bias, accuracy issues, and issues with purpose boundaries”, according to Gartner analyst Frank Buytendijk. Under regulations such as GDPR, data collected for one purpose can’t be used for another just because AI enables the new use.

“If you are an insurance company, it is problematic to use social media data in the form of pictures and use AI to scan them for people who are smoking, while in their insurance application they said they don’t,” says Buytendijk.

Even with the threat of upcoming AI-specific legislation, there are better reasons than just compliance to make sure your AI tools are aligned with your organisation’s core values and business objectives, says Forrester principal analyst Brandon Purcell.

“There are a lot of looming sticks waiting to hit companies that get this wrong, but we’re missing the carrot: when you take the time to ensure the objective you’re giving an AI system is as close as possible to the intended outcome in the real world, you’re actually going to do better business. You’re going to achieve more profitability, more revenue, more efficiencies. AI ethics, responsibility, alignment all go hand in hand with using AI well.”

Paula Goldman, chief ethical and humane use officer at Salesforce, agrees: “It’s a matter of compliance, but it’s also a matter of how well AI works and how much juice you’re going to get from the squeeze. The more we build trust into the system, the better AI is going to work and the more productive it’s going to be.”

Start with principles

Responsible AI is both a business imperative and just good business, suggests Diya Wynn, responsible AI lead at AWS. She prefers the term to AI ethics because it broadens the conversation from moral connotations to the security, privacy and compliance perspectives organisations will need to address risks and unintended impacts.

The good news is that most companies with compliance teams in place for GDPR already have many of the structures necessary for AI governance, although you may need to add ethical expertise to the technical capabilities of data science teams.

Responsible AI is about quality, safety, fairness and reliability. To deliver that, Purcell urges organisations to start with a set of ethical AI principles covering accountability, competence, dependability, empathy, factual consistency, integrity and transparency that articulate corporate culture and values.

That will help when AI optimisations expose tensions between different business teams, such as AI lending tools improving sales by offering loans to riskier applicants, and give you a strong basis for adopting effective controls using tools increasingly designed for business leaders as well as data scientists. “AI ethics is a human discipline, not a technology category,” warns Buytendijk.

The metrics businesses care about when deploying AI systems are rarely the purely technical measures of machine learning accuracy found in many early responsible AI tools. Instead, they are more technosocial measurements, such as getting the right balance of productivity improvements, customer satisfaction and return on investment.

Generative AI chatbots that close calls faster and avoid escalations may not reduce the overall time human agents spend on the phone if the chatbots only handle the simple cases quickly, leaving agents to deal with complex queries that take more time and expertise to address – but also matter most for high-value customer loyalty. Equally, time saved is irrelevant if customers give up and go to another vendor because the chatbot just gets in the way.

Many organisations want custom measures of how their AI system treats users, says Mehrnoosh Sameki, responsible AI tools lead for Azure: “Maybe they want friendliness to be a metric, or the apology score. We hear a lot that folks want to see to what extent their application is polite and apologetic, and a lot of customers want to understand the ‘emotional intelligence’ of their models.”

Emerging AI governance tools from vendors such as Microsoft, Google, Salesforce and AWS (which worked with Accenture on its Responsible Artificial Intelligence Platform) cover multiple stages of a responsible AI process: from picking models using model cards and transparency notes that cover the capabilities and risks, to creating input and output guardrails, grounding, managing user experience and monitoring production systems.

Gateways and guardrails for genAI

There are unique risks with generative AI models, over and above issues of fairness, transparency and bias that can occur with more traditional machine learning, requiring layered mitigations.

Input guardrails help keep an AI tool on topic: for example, a service agent that can process refunds and answer questions about the status of orders but passes any other queries to a human. That improves the accuracy of responses, which is good for both customer service and business reputation.

It also keeps costs down by avoiding expensive multi-turn conversations that could be either an attempted jailbreak or a frustrated customer trying to fix a problem the tool can’t help with. That becomes even more important as organisations start to deploy agents that can take action rather than just give answers.
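As a rough illustration of the on-topic guardrail described above, the following minimal sketch routes a query to one of two supported topics or hands it to a human. The topic names, the keyword patterns and the `route_query` function are all hypothetical; a production guardrail would more likely use a trained classifier or a vendor content-safety service than keyword matching.

```python
import re

# Hypothetical topics the service agent is allowed to handle.
# Keyword patterns are illustrative stand-ins for a real intent classifier.
ALLOWED_TOPICS = {
    "refund": re.compile(r"\b(refund|money back)\b", re.IGNORECASE),
    "order_status": re.compile(r"\b(order|delivery|shipping|tracking)\b", re.IGNORECASE),
}


def route_query(text: str) -> str:
    """Return the first matching topic name, or 'human' for anything off topic."""
    for topic, pattern in ALLOWED_TOPICS.items():
        if pattern.search(text):
            return topic
    # Off-topic queries never reach the model, which also avoids long,
    # costly conversations the tool cannot resolve.
    return "human"


if __name__ == "__main__":
    for query in ["Where is my order?", "I want a refund", "Write me a poem"]:
        print(query, "->", route_query(query))
```

Checking the topic before calling the model means the decision to escalate is made once, up front, rather than after several turns.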

Guardrails also address compliance concerns like making sure personally identifiable information (PII) and confidential information are never sent to the model, but it’s the output guardrails where ethical decisions may be most important, avoiding toxic or inaccurate responses as well as copyrighted material.
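The PII part of that input guardrail can be sketched in a few lines. This is an assumption-laden example rather than any vendor's implementation: the two regular expressions (email addresses and card-like digit runs) and the `redact_pii` function are hypothetical and far from exhaustive, and real deployments typically rely on dedicated PII detection services.

```python
import re

# Illustrative patterns only; real PII detection covers many more categories.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact_pii(prompt: str) -> str:
    """Replace email addresses and card-like numbers before the prompt is sent to a model."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    prompt = CARD.sub("[CARD]", prompt)
    return prompt


if __name__ == "__main__":
    print(redact_pii("Refund card 4111 1111 1111 1111 and email jane@example.com"))
```

Replacing the value with a placeholder, rather than dropping the whole prompt, keeps the request usable while ensuring the sensitive value itself is not passed on.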

Azure AI Content Safety handles both input and output guardrails, with new options added recently, Sameki explains. “It can filter for a lot of risky content, like sexual, hateful, harmful, violent content. It can detect that someone is attempting a jailbreak, either directly or indirectly, and filter that. It could filter IP protected materials as a part of the response. It could even see that your response is containing ungrounded, hallucinated content, and rewrite it for you.”

Hallucinations are probably the best-known issue with generative AI: the model accidentally misuses data or generates content that is not grounded in the context that was available to it. But equally important, she suggests, are omissions, when the model deliberately leaves information out of its responses.

Rather than just filtering out hallucinations, it’s better to ground the model with relevant data.

Handling hallucinations

Training data is a thorny question for generative AI, with questions over both the legality and ethics of mass scraping to create training sets for large language models. That’s a concern for employees as well as CIOs; in a recent Salesforce study, 54% said they don’t trust the data used to train their AI systems, and the majority of those who don’t trust that training data think AI doesn’t have the information to be useful and are hesitant to use it.

While we wait for legislation, licence agreements and other progress on LLM training data, organisations can do a lot to improve results by grounding generative AI in their own data, which tells the model what information to draw on in its answers. The most common technique for this is retrieval-augmented generation (RAG), although the tools often use catchier names. It is a form of metaprompting: improving the user’s original prompt, in this case by adding information from your own data sources.
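To make the idea of adding your own data to the prompt concrete, here is a minimal sketch. It is a toy under stated assumptions: the `retrieve` and `build_prompt` functions are hypothetical, and retrieval is done by simple word overlap, whereas real RAG systems generally use embeddings and a vector index.

```python
import re


def tokenize(text: str) -> set:
    """Lower-case the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list, k: int = 2) -> list:
    """Return up to k documents sharing at least one word with the question."""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return [d for d in ranked[:k] if q_words & tokenize(d)]


def build_prompt(question: str, documents: list, k: int = 2) -> str:
    """Augment the user's question with retrieved context before it goes to the model."""
    lines = [
        "Answer using only the context below. If the answer is not there, say so.",
        "",
        "Context:",
    ]
    lines += [f"- {doc}" for doc in retrieve(question, documents, k)]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


if __name__ == "__main__":
    docs = [
        "Refunds are processed within 5 working days.",
        "Our office is closed on public holidays.",
        "Orders ship from the Leeds warehouse.",
    ]
    print(build_prompt("How long do refunds take?", docs))
```

The model itself is unchanged; only the prompt is enriched, which is why this approach is much cheaper than retraining.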

While fully fine-tuning LLMs is too expensive for most organisations, you can look for models that have been tuned for specific, specialist domains using a technique called Low-Rank Adaptation (LoRA), which trains a small set of additional parameters rather than updating all of the model’s weights.
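Some back-of-the-envelope arithmetic shows why LoRA is cheaper. In the standard LoRA formulation, the update to a weight matrix of shape d_out x d_in is expressed as the product of two much smaller matrices of rank r, so only r x (d_out + d_in) values are trained for that matrix. The layer size and rank below are illustrative assumptions, not figures from any particular model.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Parameters updated when a d_out x d_in weight matrix is fully fine-tuned."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters trained by LoRA: a (d_out x rank) matrix plus a (rank x d_in) matrix."""
    return rank * (d_out + d_in)


def trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Share of the matrix's parameters that LoRA actually trains."""
    return lora_params(d_out, d_in, rank) / full_finetune_params(d_out, d_in)


if __name__ == "__main__":
    # Hypothetical layer: 4096 x 4096 weights, adapted at rank 8.
    print(full_finetune_params(4096, 4096))   # 16777216
    print(lora_params(4096, 4096, 8))         # 65536
    print(trainable_fraction(4096, 4096, 8))  # 0.00390625
```

For this hypothetical layer, LoRA trains roughly 0.4% of the parameters that full fine-tuning would touch.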

Collecting and using feedback is key to improving systems (on top of the usual considerations like tracking usage and costs). Content safety services from Azure and Salesforce, for example, include audit trails: if the AI system has to remove off-colour language from a prompt, you want to know that you caught it, but you also want to know whether it came from a user expressing frustration because they’re not getting useful results, which suggests you need to give the model extra information.

Similarly, monitoring hallucinations tells you not just the quality of AI outputs, but whether the model has retrieved the relevant documents to answer questions. Gather detailed feedback from users and you can effectively allow expert users to train the system with feedback.

Designing the user experience to make feedback natural also gives you a chance to consider how to effectively present AI information: not just making it clear what’s AI generated but also explaining how it was created or how reliable it is to avoid people accepting it without evaluating and verifying.

“This handoff between AI and people is really important, and people need to understand where they should trust the response coming from AI and where they may need to lean in and use judgement,” Goldman says.

Human-at-the-helm patterns can include what she calls “mindful friction”, such as the way demographics are never selected by default for marketing segmentation in Salesforce.

“Sometimes, when you’re creating a segment for a marketing campaign using demographic attributes, a customer base may be perfectly appropriate. Other times, it may be an unintentional bias that limits your customer base,” Goldman adds.

Turn training into a trump card

How AI systems affect your employees is a key part of ethical AI usage. Customer support is one of the successful areas for generative AI adoption, and the biggest benefits come not from replacing call centre employees but from helping them resolve difficult calls faster, using systems with RAG and fine-tuning so they know the product, the documentation and the customer better, says Peter Guagenti, president of Tabnine.

“The most successful applications of this are highly trained on a company’s unique offerings, the unique relationship they have with their customers and the unique expectations of customers,” he says.

And training isn’t just important for AI systems. The European Union’s (EU) AI Act will require businesses to foster “AI literacy” inside the organisation. As with other technologies, familiarity allows power users who have had specific, relevant training to get the most out of AI tools.

“The people who use the tool the most are the ones who make the best use of the tool and see the greatest benefit,” says Guagenti, and building on the automation efficiencies rather than attempting to replace users with AI will benefit the business.

“If you can coach and teach your people how to do these things, you’re going to benefit from it, and if you build an actual curriculum around it with your HR function, then you’ll become the employer of choice, because people know that these are critical skills.”

As well as making your investment in AI worthwhile by improving adoption, asking employees about the “pain points they’re trying to solve” is also the best way to create effective AI tools, Goldman says.

Getting all of that right will require cooperation between business and development teams. Operationalising responsible AI needs a governance layer, and tools such as the Azure impact assessment template or an open source generative AI governance assessment can help to create a process where technical teams start by setting out not just the business use case, but also the risks that will need mitigating, so the CIO knows what to expect.

Bring in a red team to stress the models and test mitigations: the impact assessment means the CIO can clearly see if the system is ready to be approved and put into production – and monitored to see how it behaves.

A broader assessment of your risk exposure from AI can pay unexpected dividends by finding places where you need to improve the way PII is treated and who has access to data, Purcell says. “One of the real benefits of this moment is using it as a catalyst to solve a lot of your existing governance problems, whether they’re specific to AI or not.”


