Although a buzzword of our time, Artificial Intelligence (AI) is not a new concept. The earliest substantial work in the field was done by logician and computer pioneer Alan Turing, famed for his contributions to the Allies’ code breaking efforts during the Second World War. In 1947, he gave the earliest known public lecture to mention computer intelligence, and in 1948, he introduced many of the central concepts of AI in a report titled “Intelligent Machinery”.
But in the years since, the evolution of AI has not been a smooth one; we have seen bouts of optimism followed by many disappointments, and periods of outright stagnation. Until recently, the idea of AI being used in the mainstream was a non-starter.
Like many technological evolutions, the next step for AI has been enabled by advancements in other fields. These have included: 1) the vast computing power provided by cloud computing and ever-increasing levels of computational efficiency; 2) the falling cost of data storage and exponential growth in the data volumes used for training AI models; and 3) the emergence of open source frameworks which enable the sharing and integration of different data sets and applications.
With 75% of executives saying AI will be actively implemented in their organisations within three years, the technology is growing in its importance. So what is AI? How does it work? And why are we likely to see more of it?
AI is not one type of technology. It encompasses many different technologies working together to mimic the intelligence or behavioural patterns of humans. It is used to predict, automate, and optimise tasks that humans do. Perhaps the two most commonly discussed subfields of AI are machine learning and deep learning.
Machine learning depends on human intervention to help it learn. Humans determine the hierarchy of features from which computers can then learn to tell data inputs apart, which means it needs more structured data. For example, to distinguish between images of different car models, a human operator may specify the characteristics that differentiate them, e.g. the shape of a car’s bonnet.
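To make this concrete, here is a minimal sketch of classic machine learning on hand-crafted features. The feature names and numbers are hypothetical: a human has decided in advance that, say, bonnet length and headlight width separate two car types, and the algorithm only learns from those pre-chosen measurements.

```python
# Classic machine learning sketch: a human picks the features,
# the algorithm learns only from those numbers.
# Features and values below are hypothetical, for illustration only.

def centroid(rows):
    """Mean of each feature column across the labelled examples."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

def nearest_centroid(sample, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(sample, centroids[label]))

# Hand-labelled training data: [bonnet_length_m, headlight_width_m]
training = {
    "hatchback": [[0.9, 0.30], [1.0, 0.32]],
    "saloon":    [[1.4, 0.40], [1.5, 0.42]],
}
centroids = {label: centroid(rows) for label, rows in training.items()}

print(nearest_centroid([0.95, 0.31], centroids))  # hatchback
```

The key point is that the model never sees a raw image; it only ever sees the two numbers a human decided were relevant.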
Where things get interesting is with deep learning, a subset of machine learning. It is primarily used for more complex use cases such as language and image recognition.
Deep learning is loosely modelled on the biology of our brains: interconnections between neurons – a neural network. Unlike our brains, where any neuron can connect to another within a certain physical distance, artificial neural networks have separate layers, connections, and directions of data propagation. The application of deep learning can be thought of in two phases: training and inference.
When training a neural network, data is fed into the first layer and individual neurons assign a probability weighting to the input (how correct or incorrect it is) depending on the task being performed. Unlike machine learning models, a deep network can ingest unstructured data in its raw form (e.g. text, images), and can automatically determine the set of features that distinguish one input from another. So for image recognition in our car model example, the first layer might look for edges. The next might look for how these edges form shapes, to determine whether the image is a car. The third might look for particular features of a car model, such as bonnets, headlights and wing mirrors. Each layer passes the image on to the next, until the final output is determined by the sum of all the weightings produced.
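The layer-by-layer idea above can be sketched in a few lines of code. This is a toy forward pass through a three-layer network, loosely mirroring the edges → shapes → car-parts progression; the weights are fixed by hand purely for illustration, whereas a real network learns them during training.

```python
# Toy forward pass through a small layered network.
# Weights are invented for illustration; training would learn them.

def relu(v):
    """Simple activation: negative signals are switched off."""
    return [max(0.0, x) for x in v]

def dense(inputs, weights):
    """One fully connected layer: each output neuron sums weighted inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def forward(pixels, layers):
    """Pass the input through each layer in turn."""
    activation = pixels
    for weights in layers:
        activation = relu(dense(activation, weights))
    return activation

layers = [
    [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],  # layer 1: "edge detectors"
    [[1.0, -1.0], [0.6, 0.4]],             # layer 2: "shape detectors"
    [[0.7, 0.3]],                          # layer 3: score for "is a car"
]
score = forward([0.2, 0.9, 0.4], layers)
print(score)
```

Each layer's output becomes the next layer's input, exactly as the article describes the image being passed from layer to layer.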
Where the neural networks of deep learning differ from our own brains is in how they learn from mistakes. When the network produces a wrong answer, it is not simply handed the right one; instead, an error signal is computed and propagated back through the network’s layers, and the network adjusts its guesses. In each attempt it considers other attributes, like a car bumper, and weighs the attributes examined at each layer higher or lower. It then guesses again. And again. And again. Until it has the correct weightings and gets the correct answer practically every time: it’s a 2012 Volkswagen Golf. Our model has been ‘trained’.
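The guess-check-adjust loop can be sketched with the simplest possible case: gradient descent on a single weight. The network “guesses”, measures its error against the known answer, and nudges the weight to shrink that error, repeating until it is right. The learning rate and step count here are arbitrary choices for the sketch.

```python
# Minimal sketch of the guess-check-adjust training loop:
# gradient descent on one weight. Hyperparameters are illustrative.

def train(samples, lr=0.1, steps=200):
    w = 0.0  # initial guess
    for _ in range(steps):
        for x, target in samples:
            prediction = w * x
            error = prediction - target   # how wrong the guess was
            w -= lr * error * x           # push the error back into the weight
    return w

# Learn the rule y = 2x from labelled examples.
w = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
print(round(w, 3))
```

Real deep learning does this across millions of weights at once (backpropagation), but the principle is the same: the error, not the answer, is what flows backwards.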
The problem now is that our deep learning neural network is essentially a large, clunky model which requires a huge amount of computational power to run. If it is to be used practically, on a day-to-day basis, it needs to be fast while retaining what it has learned and applying it quickly to data it has never seen. That’s where inference comes into play.
Inference can be made efficient in two main ways. The first, pruning, looks for parts of the neural network that don’t get activated after training; they can be snipped away. The second looks for ways to fuse multiple layers of the neural network into a single computational step, much as a digital image is compressed before being uploaded to the internet.
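The “snipping away” idea can be sketched as magnitude pruning: weights whose absolute value falls below a threshold contribute little, so they are zeroed out, leaving a sparser, cheaper network for inference. The threshold here is arbitrary, chosen for illustration.

```python
# Minimal sketch of magnitude pruning for inference.
# Threshold and example weights are illustrative only.

def prune(weights, threshold=0.05):
    """Zero out small-magnitude weights; return pruned weights and sparsity."""
    pruned = [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in pruned for w in row if w == 0.0)
    return pruned, zeros / total

layer = [[0.9, 0.01, -0.3], [-0.02, 0.6, 0.04]]
pruned, sparsity = prune(layer)
print(pruned)     # small weights replaced by 0.0
print(sparsity)   # fraction of the layer snipped away
```

Production toolchains go further, skipping the zeroed weights entirely and fusing adjacent layers into single kernels, but the intuition is this simple: remove what the trained network barely uses.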
Either way, inference means AI can be used all the time. Apple’s Siri voice-activated assistant uses inference, as does Google’s image search, and Amazon’s and Netflix’s recommendation engines. More importantly, inferencing was crucial in enabling Moderna to quickly and successfully develop a COVID-19 vaccine, by modelling which subcomponents of a vaccine were most likely to trigger an immune response. In the future, inference will enable autonomous vehicles to roam our roads via image recognition.
What is clear is that over time such AI models and applications will get smarter, faster and more accurate. Training will get less cumbersome, and inference will assist with ever more sophisticated tasks and bring new applications to every aspect of our lives. Although not a new concept, AI technology is still very much in its infancy.