How to see, understand and evaluate AI Vision
AI vision, i.e. image processing with artificial intelligence, is a much-discussed topic. In many areas, however, such as industrial applications, the new technology has not yet fully arrived, so long-term experience is lacking. Although there are embedded vision systems on the market that allow AI to be used under industrial conditions, many facility operators still hesitate to buy one of these platforms and upgrade their applications. Yet AI has already opened up pioneering new possibilities in cases where rule-based image processing ran out of rules and had no solution to offer. So what is preventing the new technology from spreading faster?
AI, or machine learning (ML), works quite differently from classical, rule-based image processing, and this changes how image processing tasks are approached and handled. The quality of the results is no longer the product of program code developed manually by an image processing expert, as was previously the case, but is determined by how the neural networks used learn from suitable image data. In other words, the object features relevant for inspection are no longer specified by predefined rules; instead, the AI must be taught to recognise them itself in a training process. The more varied the training data, the more likely the ML algorithms are to recognise the truly relevant features later in operation.
As simple as this sounds, it only leads to the desired goal with sufficient expertise and experience. Without a skilled eye for the right image data, errors will occur here as well. The key competences for working with machine learning methods are therefore no longer the same as those for rule-based image processing, and not everyone has the time or staff to work their way into the subject from scratch. Unfortunately, that is the problem with new things: they cannot be used productively straight away. And when they do deliver good results without much effort, but those results cannot be clearly reproduced, users find them hard to believe and do not trust them.
(Not) a black box
The way neural networks work is therefore often wrongly perceived as a black box whose decisions cannot be understood. “Although DL models are undoubtedly complex, they are not black boxes. In fact, it would be more accurate to call them glass boxes, because we can literally look inside and see what each component is doing.” [Quote from “The black box metaphor in machine learning”]. The inference decisions of neural networks are not based on classical, comprehensible rules, and the complex interactions of their artificial neurons may not be easy for humans to follow, but they are nevertheless the results of a mathematical system and are thus reproducible and analysable. All that is (still) missing are the right tools to support us. There is still a lot of room for improvement in this area of AI, and this is where it becomes apparent how well the various AI systems on the market can support users in their endeavours.
Software makes AI explainable
For this reason, IDS Imaging Development Systems GmbH is researching and working in this field together with institutes and universities to develop precisely these tools. The IDS NXT AI vision system incorporates the results of this cooperation. Statistical analysis using a so-called confusion matrix makes it possible to determine and understand the quality of a trained neural network. After training, the network can be validated with a previously selected series of images whose results are already known. The expected results and the results actually determined by inference are compared side by side in a table. This makes it clear how often the test objects were recognised correctly or incorrectly for each trained object class. From these hit rates, an overall quality figure for the trained convolutional neural network (CNN) can then be derived. The matrix also clearly shows where the recognition accuracy may still be too low for productive use. However, it does not show the reason for this.
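To make the idea of the confusion matrix more concrete, the following minimal Python sketch compares expected and inferred labels for a small validation set and derives per-class hit rates and an overall accuracy. The class names, the labels and all function names are hypothetical and chosen purely for illustration; this is not the IDS NXT implementation or its API, only one common way to compute such a table.

```python
from collections import Counter


def confusion_matrix(expected, predicted, classes):
    """Build a table: rows are expected classes, columns are predicted classes."""
    counts = Counter(zip(expected, predicted))
    return [[counts[(e, p)] for p in classes] for e in classes]


def per_class_hit_rate(matrix, classes):
    """Share of correctly recognised test objects for each expected class."""
    rates = {}
    for i, name in enumerate(classes):
        total = sum(matrix[i])
        rates[name] = matrix[i][i] / total if total else 0.0
    return rates


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical validation run with known (expected) results.
classes = ["ok", "scratch", "dent"]
expected = ["ok", "ok", "ok", "scratch", "scratch", "dent", "dent", "dent"]
predicted = ["ok", "ok", "scratch", "scratch", "scratch", "dent", "ok", "dent"]

matrix = confusion_matrix(expected, predicted, classes)
rates = per_class_hit_rate(matrix, classes)
accuracy = overall_accuracy(matrix)

for name, row in zip(classes, matrix):
    print(f"{name:>8}: {row}  hit rate {rates[name]:.2f}")
print(f"overall accuracy: {accuracy:.2f}")
```

In this made-up example the off-diagonal entries show that one "ok" part was taken for a scratch and one dent was taken for "ok". The table reveals which classes are confused, but not why the network decided that way.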
This is where the attention map comes in: a kind of heat map that highlights the areas or image contents that receive the most attention from the neural network and thus influence its decisions. During the training process in IDS NXT lighthouse, the creation of this form of visualisation is activated on the basis of the decision paths generated during training, so that the network can produce such a heat map for each image during analysis. This makes critical or inexplicable decisions made by the AI easier to understand, which ultimately increases the acceptance of neural networks in industrial environments.
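The article does not describe how the attention map is computed internally. To illustrate the general idea of a heat map that shows which image regions influence a decision, the sketch below uses a different and much simpler technique, occlusion sensitivity: each image patch is masked in turn, and the resulting drop in the model score is recorded. The function `toy_score` is a hypothetical stand-in for a trained network, and all names and values are assumptions made for this example.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Heat map of score drops: how much the score falls when a patch is masked."""
    height, width = len(image), len(image[0])
    baseline = score_fn(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            # Copy the image and blank out the current patch.
            masked = [row[:] for row in image]
            for y in rows:
                for x in cols:
                    masked[y][x] = fill
            drop = baseline - score_fn(masked)
            # A large drop means the patch mattered for the decision.
            for y in rows:
                for x in cols:
                    heat[y][x] = drop
    return heat


def toy_score(image):
    """Hypothetical model: reacts only to brightness in the top-left 2x2 region."""
    return sum(image[y][x] for y in range(2) for x in range(2)) / 4.0


# Hypothetical 4x4 grey-scale image with uniform brightness.
image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(image, toy_score, patch=2)

for row in heat:
    print(row)
```

Only the top-left patch produces a score drop here, so it is the only "hot" region in the map. With a real network, the same loop would call the network's inference on each masked image, which is slow but needs no access to the network's internals.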
We are only at the beginning
Used correctly, AI vision has the potential to improve many vision processes. With IDS NXT, an embedded AI system is available whose comprehensive and user-friendly software environment allows any user group to use it quickly and easily as an industrial tool, even without in-depth knowledge of machine learning, image processing or application programming.
IDS Imaging Development Systems Limited
Landmark House, Station Road
RG27 9HA Hook
Phone: +44 1256 962910