Too late to shine a light into the AI black box?

ILLUSTRATION: Google DeepMind on Unsplash

What can technology companies do to rein in the problems that AI creates as it becomes a big part of everyday work? That was a question put to me in a conversation earlier this week.

Though the likes of Microsoft and Google have called for tougher regulation – yes, they are asking to be regulated – the point at which the genie could be put back in the bottle may already have passed, I replied.

To be sure, AI had been around for years before the phenomenon known as ChatGPT started taking the world by storm late last year. In just months, it has captured the imagination like few other technologies ever have.

Just as with earlier technologies, there are new opportunities to broaden one’s horizons, as well as a spot of moral panic over the things the new technology could upend.

What is less often discussed, however, is one crucial issue – by now, it may simply be too difficult to unravel or explain how an AI has come up with a result or output.

Yes, I’m thinking of black box AI, where you throw a question into a black box and it spits out an answer, with no transparency into the process.

Today’s most popular generative AI tools get their smarts from machine learning. This involves three components – the algorithm, the training data and a model.

The algorithm learns to identify patterns by looking through huge volumes of training data, say, to spot what is a dog, and the result is the machine-learning model, as Scientific American succinctly explains.
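To make those three components concrete, here is a minimal sketch in Python, assuming the scikit-learn library and a toy dog-versus-cat task with made-up numbers standing in for real training data. It is nothing like the scale of a generative AI, but the algorithm, the training data and the model are all visible:

```python
# Minimal sketch of the three components, with invented toy data.
from sklearn.ensemble import RandomForestClassifier

# Training data: feature vectors (say, simple image statistics) and labels.
X_train = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
y_train = ["dog", "dog", "cat", "cat"]

# The algorithm: a learner that looks for patterns in the training data.
algorithm = RandomForestClassifier(n_estimators=10, random_state=0)

# The model: the result of the algorithm having learned from the data.
model = algorithm.fit(X_train, y_train)

print(model.predict([[0.85, 0.15]]))  # expected: ['dog']
```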

Now, much of the generative AI out there today may have algorithms that are open-source, but the training data may be collected by companies themselves and the model is often not available openly to be examined.

In other words, one or more of the three components needed to create a generative AI used by people today can be hidden or blocked from public view in a black box.

And there is also a sense among some Big Tech companies such as Google and OpenAI that the money and effort they plough into today’s open-source AI code may not pay off in the long term if the “crown jewels” are given away for competitors to copy easily.

These Big Tech firms are organisations that exist for profit, so that’s not surprising. If AI becomes a competitive advantage, they would want to keep that edge instead of blunting it by letting everyone have it.

That, however, is just one issue. With generative AI, it also becomes harder to trace how a machine has come up with a result, say, in the form of an article or an image.

Perhaps asking ChatGPT to draft a recommendation letter or a congratulatory message in a promotion letter won’t cause much of an issue.

What happens, however, if you ask ChatGPT to write news articles that then become a trusted record for others to write more stories on? This is where things get murky.

I asked ChatGPT just this week where it had got the text for a piece of work I had tasked it with, and it simply told me the source was its training data. No, it could not pinpoint any precise sources.

For reporters, knowing the source is the most basic and important part of getting things right. If you ask ChatGPT or another generative AI to write an article, or even parts of one, without checking thoroughly, you may end up generating fake news. You may also be plagiarising, since you cannot attribute the content to the original author.

This is just one example of the issues that emerge with AI when there is no longer any way to verify how it works.

Expand this to other areas where we might ask AI to do a job and the issues become more complex.

What happens if AI is used, for example, to determine if one person receives social security benefits over another based on a set of criteria? Can the government welfare agency still explain to citizens how the AI arrived at a decision?

If not, then the agency and government can expect lawsuits lobbed their way, if not social unrest, when AI helps make decisions for citizens and the decision makers cannot explain how a decision was made.

Such problems may be worsened by an endless feedback loop. First, AI generates answers based on biased or problematic data, which it spews out to human users in the form of inaccurate content.

These users then put the generated data out on the Net, which is later ingested again by AI to spit out even more biased or skewed answers. Yes, garbage in, garbage out.
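For what it is worth, that loop can be sketched in a few lines of Python. The starting skew and the five per cent amplification per cycle are invented numbers, purely to illustrate how a small bias can compound when output is recycled as input:

```python
# Toy sketch of the feedback loop: skewed output is published, re-ingested
# as training data, and amplified. All figures here are made up.
bias = 0.55  # share of training content already leaning one way

for generation in range(1, 6):
    bias = min(1.0, bias * 1.05)  # the model slightly amplifies what it was fed
    print(f"Generation {generation}: {bias:.0%} of generated content leans one way")
```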

I put this question to June Yang, Google Cloud’s vice-president of cloud AI and industry solutions, recently, and she acknowledged the issue as well.

While there may be ways to spot AI-generated content, she noted how difficult it was to get a generative AI to explain how it came to an answer. More research in the area was needed, she added.

Others have pointed to different approaches to AI, for example, by pushing for what’s broadly known as Explainable AI.

As its name implies, this means AI whose processes and methods people can understand. More importantly, people can trust the AI-generated results because of this.

So, if a radiologist uses AI to look for a medical issue, or if a finance team is tasked to audit high-risk decisions enabled by AI, they can look at how and why the results came about, as this blog from Carnegie Mellon University explains.
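As an illustration of what such an explanation might look like, here is a minimal sketch of one common technique, permutation importance, applied to the welfare-eligibility scenario raised earlier. The features, data and scikit-learn model are all assumptions made for the example, not how any real agency works:

```python
# Minimal sketch of one explainability technique: permutation importance,
# which scores how much each input feature swayed the model's decisions.
# Features, data and labels are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

features = ["income", "dependants", "age"]
X = np.array([[20, 3, 40], [80, 0, 35], [15, 2, 60],
              [90, 1, 30], [25, 4, 55], [70, 0, 45]])
y = np.array([1, 0, 1, 0, 1, 0])  # 1 = granted the benefit, 0 = not

model = RandomForestClassifier(random_state=0).fit(X, y)
scores = permutation_importance(model, X, y, n_repeats=10, random_state=0)

for name, score in zip(features, scores.importances_mean):
    print(f"{name}: {score:.2f}")  # higher = weighed more on the decision
```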

There are many challenges around explainable AI, to be sure. Aligning the efforts and definitions of what makes up explainable AI is one.

Plus, of course, Big Tech companies are in a race to outgun each other in AI and they might resist slowing down to explain what is happening in the black box.

They may have to be forced to. The European Union is considering proposals to make AI more transparent by making generative AI providers register their models in a new public database.

They may also be told to disclose that content is generated by AI and publish summaries of copyrighted data used for training, as part of ongoing amendments to the EU’s AI Act. The final form of the law may be passed by the end of the year.

Unlike much legislation that is reactive, the efforts here are forward looking. They could yet help to mitigate technology risks in a world where AI is expected to be used so broadly, both for good and bad.

What’s unclear is how Big Tech firms will react, or if they are still able to peer into the black box that their AI has grown from, understand what has happened and explain how it gives us all its answers. In other words, putting the genie back into the bottle.
