By Guido Willemsen
Artificial Intelligence is hot! ChatGPT, DALL-E, Watson, Observe, Chorus and DataRobot are only a few popular solutions in the vanguard of an exponentially growing pool of AI applications. The popularity and rise of AI solutions over the last two decades are largely driven by a dramatic reduction in data processing costs, which has allowed earlier concepts from the field of artificial neural networks to come to fruition. With these developments it has become feasible to find solutions for complex puzzles that could never be solved by the human brain alone. But there is a significant pitfall we have to recognize in our reliance on AI applications: how do we differentiate between right and wrong?
What if our judges were to rely fully on the outcomes of AI algorithms? Imagine a suspect in a robbery who is labelled as the perpetrator based on the details of the crime scene, his statements during the interrogation, the statements of the victim, his personal identity, his social environment, his spending profile and other relevant characteristics. A possible outcome of the AI would be to declare him guilty of this crime. But what if his alibi is actually correct and, after a rigorous trial before the judge, the victim even confesses that he lied and the robbery never took place? AI is not (yet) able to replace the judge in his analysis and the trial process. Qualifying situations as wrong is not a purely analytical matter but a philosophical one. And there is no common ground on which we can base our decision (not even always in court)!
In a far-reaching conversation on the Complexity podcast, Chris Moore, professor at the Santa Fe Institute, elaborates on the subjective elements in decision making. As he states, many optimization or classification problems must deal with noise in data sets, especially when the number of variable dimensions or the relations between data increase (Garfield, 2021). From a complexity perspective, we must deal with phase transitions, which can disrupt earlier models and algorithms and force us to reconsider their applicability when the ground is shifting.
According to Lebovitz et al. (Lebovitz, Lifshitz-Assaf, & Levina, 2023), the ground base for AI should be carefully (re)considered when the outcomes of smart algorithms are assessed. A common metric to qualify the outcome of an artificial intelligence task is the area under the receiver operating characteristic curve (AUC), which expresses how well the model's predictions agree with the reference against which they are judged. That reference is the ground base. For example, for an optimal transportation plan for a grocery chain, the ground base could be the lowest operating cost and timely delivery. But what happens when this ground base results in many damaged products because truck and package type were not considered, and 20% of the beer bottles arrive broken? That is not a good plan!
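To make this concrete, the sketch below computes an AUC with scikit-learn on a small, entirely hypothetical set of labels and model scores; none of the names or numbers come from a real case. The point is that the number only has meaning relative to the labels we declared to be the ground base.

```python
# A minimal sketch, assuming scikit-learn is installed; labels and scores are made up.
from sklearn.metrics import roc_auc_score

# Hypothetical ground-base labels (1 = acceptable outcome, 0 = not) and model scores.
ground_base_labels = [1, 0, 0, 1, 0, 1, 0, 0]
model_scores       = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8, 0.3, 0.6]

# AUC measures how well the scores rank cases according to these labels;
# if the labels themselves are poorly chosen, a high AUC is meaningless.
auc = roc_auc_score(ground_base_labels, model_scores)
print(f"AUC against the chosen ground base: {auc:.2f}")
```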
Often, AI developers weigh the relative costs and benefits when deciding how to assign ground truth qualifiers, a decision that has a significant influence on the overall quality and potential value of the tool. During the development of an AI solution, the product owner has to explicitly validate the modeling decisions that are embedded in the algorithms. Moreover, many AI tools on the market focus on more subjective decision contexts, where experts often disagree about whether a decision was "true" (Lebovitz et al., 2023). The product owner has to care about the level of variability and subjectivity with which the ground base was developed.
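One way to surface that variability before committing to a ground base is to measure how much the experts actually agree with each other. The sketch below is a hypothetical example using Cohen's kappa from scikit-learn; the label sets are invented for illustration.

```python
# A minimal sketch with invented labels from two hypothetical experts.
from sklearn.metrics import cohen_kappa_score

expert_a = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]
expert_b = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

# Kappa close to 1 means the experts largely agree; values near 0 suggest the
# "ground truth" is too subjective to take at face value.
kappa = cohen_kappa_score(expert_a, expert_b)
print(f"Inter-expert agreement (Cohen's kappa): {kappa:.2f}")
```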
There are many different AI solutions for many different problems. In general, these tools are used to cluster and classify data, develop algorithms, make decisions and apply algorithms to create new artifacts or solve problems. Trusting a fully automated, artificial solution is still a hurdle that many humans have to overcome, and there is a good reason for this hesitancy. Source data can be biased, and big data does not always correspond with good data. Although there are many initiatives to improve biased data, there is also a tendency towards standardization and towards involving citizens through expert review of data (Jean-Claude, 2022). As he states, a contemporary AI system could threaten human autonomy, freedom and even survival. This implies that we have to be very careful in interpreting the data, algorithms and knowledge used in these AI systems. On the other hand, when the ground base for artificial decisions is well thought out, AI can be a powerful decision-making tool that humans can rely on. But how do we get unbiased data and a solid ground base to refer to?
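As a simple illustration of how bias in source data can be made visible, the sketch below uses pandas on an invented data set to compare the share of positive labels per group; the column names and values are purely hypothetical.

```python
# A minimal sketch, using pandas and an entirely made-up data set, to check
# whether the labels we intend to learn from are skewed per group.
import pandas as pd

data = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "B", "B"],
    "label": [1, 1, 0, 0, 0, 1, 0, 0],
})

# Share of positive labels per group; large gaps hint at biased source data
# rather than a real difference between the groups.
print(data.groupby("group")["label"].mean())
```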
Traditionally, as a manager you are taught to develop SMART key performance indicators (KPIs) to gain control over your strategic and tactical goals. With the rise of BI dashboards this has become even more important, as the dimensioning of data and the selection of reliable, harmonized data structures are critical preconditions for good reporting.
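The same discipline can be applied before any KPI is reported. The sketch below, with a hypothetical pandas DataFrame and an arbitrary 10% threshold, simply refuses to report a KPI when too much of the underlying data is missing.

```python
# A minimal sketch; the columns, values and 10% threshold are hypothetical.
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "delivery_time_h": [24.0, None, 30.0, 22.0],
    "damaged": [0, 0, None, 1],
})

missing_share = orders.isna().mean()  # fraction of missing values per column
print(missing_share)

# Refuse to report the KPI when the ground under it is too thin.
if (missing_share > 0.10).any():
    print("Data too incomplete to report a reliable KPI")
else:
    print(f"On-time share: {(orders['delivery_time_h'] <= 24).mean():.0%}")
```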
When data is incomplete, biased or fuzzy, we lose ground and decisions are taken based on wrong perceptions of the situation. So even in traditional management systems we should use a solid ground base for decision making. Why, then, do we skip this step in the development of our algorithms for artificial learning and decision systems? Do we fully rely on the expertise of our AI provider and remain blind to possible misconceptions of reality? Too often we build our AI systems on fancy technology and place the foundation of the future company on unstable quicksand.