To date, a wide variety of approaches and mathematical algorithms have been developed for constructing artificial intelligence (AI) systems. They include Bayesian methods, logistic regression, support vector machines, decision trees, and algorithm ensembles. A number of experts have concluded that most of the truly successful modern implementations of AI are built on deep neural networks and deep machine learning.
Neural networks are based on an attempt to recreate a primitive model of the nervous systems of biological organisms. A human neuron is an electrically excitable cell that processes, stores, and transmits information using electrical and chemical signals through synaptic connections. The neuron has a complex structure and a narrow specialization. By connecting with each other through synapses to transmit signals, neurons form biological neural networks. The human brain contains on average about 86 billion neurons and on the order of 100 trillion synapses. In essence, the formation and strengthening of these connections is the basic mechanism of learning and brain activity in all living beings, i.e., of their intelligence. For example, in Pavlov's classical experiment, a bell rang immediately before each feeding of a dog. The dog quickly learned to associate the bell with food. From a physiological point of view, the experience established synaptic connections in the dog's brain between the areas of the cerebral cortex responsible for hearing and the areas responsible for controlling the salivary glands. As a result, when the dog heard the sound of the bell, salivation began. In this way the dog learned to respond to signals (data) coming from the outside world and to draw the "right" conclusion.
The ability of the nervous system to learn and correct its mistakes was the basis for research in artificial intelligence. The initial task was to artificially reproduce the low-level structure of the brain, that is, to create an "artificial brain" for a computer. As a result, the concept of an "artificial neuron" was proposed: a mathematical function that converts several inputs into one output by assigning a weight to each input. Each artificial neuron computes a weighted sum of its input signals and, if that sum exceeds a certain threshold, passes a binary signal on.
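The threshold behaviour described above can be sketched in a few lines of Python. This is only an illustration: the function name, the weights, and the threshold are hypothetical values chosen so that the neuron happens to behave like a logical AND.

```python
def artificial_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0


# Hypothetical parameters: with these values the neuron fires only
# when both inputs are 1, i.e. it acts like a logical AND.
and_weights = [1.0, 1.0]
and_threshold = 1.5

and_outputs = [
    artificial_neuron([a, b], and_weights, and_threshold)
    for a in (0, 1)
    for b in (0, 1)
]
print(and_outputs)
```

Changing only the weights or the threshold changes what the neuron computes, which is why training a network comes down to adjusting these numbers.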
Artificial neurons are joined into a network by connecting the outputs of some neurons to the inputs of others. Interconnected artificial neurons form an artificial neural network, a specific mathematical model that can be implemented in software or hardware. In simple terms, a neural network is a "black box" program that receives input data and provides answers. Because it is built from a very large number of simple elements, a neural network is capable of solving extremely complex tasks.
Currently, there are many ways to implement neural networks. "Classic" single-layer neural networks are used to solve simple problems. In other mathematical models, the output of one layer of neurons is fed to the input of the next, creating cascades of connections. These are the so-called multilayer neural networks (MNN).
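As an illustration of such a cascade, the sketch below wires two layers of threshold neurons together so that the network computes XOR, a function that no single threshold neuron can represent. The weights and thresholds are hand-picked for this example rather than learned, and the function names are hypothetical.

```python
def step_layer(inputs, weight_rows, thresholds):
    """Apply one layer of threshold neurons; each row of weights is one neuron."""
    return [
        1 if sum(x * w for x, w in zip(inputs, row)) > threshold else 0
        for row, threshold in zip(weight_rows, thresholds)
    ]


def xor_network(a, b):
    """Two-layer network: the hidden layer's outputs feed the output layer."""
    # Hidden layer: first neuron acts as OR, second neuron acts as AND.
    hidden = step_layer([a, b], [[1, 1], [1, 1]], [0.5, 1.5])
    # Output layer: fires when OR is on and AND is off.
    output = step_layer(hidden, [[1, -1]], [0.5])
    return output[0]


xor_outputs = [xor_network(a, b) for a in (0, 1) for b in (0, 1)]
print(xor_outputs)
```

The hidden layer turns the raw inputs into intermediate features, and the output layer combines them, which is the basic idea behind stacking layers.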
Multilayer networks offer great computational power, but they also require huge computational resources. With the spread of cloud infrastructure, multilayer neural networks have become available to a larger number of users, and they are now the foundation of modern AI solutions. In 2016, Digital Reasoning, a US company working on cognitive computing technology, created and trained a neural network with 160 billion parameters. That is far larger than the neural networks reported by Google (11.2 billion parameters) and the US national laboratory in Livermore (15 billion parameters).
Another interesting type of neural network is the network with feedback (RNN, or recurrent neural network), where the output of a network layer is fed back to one of its inputs. Such networks have a "memory effect" and are able to track how the input factors change over time. A simple example is a smile. A person begins to smile with barely noticeable movements of the muscles around the eyes and face before the emotion shows clearly. An RNN makes it possible to detect such movement in its early phases, which is useful for predicting the behavior of a living object over time by analyzing a series of images, or for constructing a sequential flow of speech in natural language.
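The "memory effect" can be shown with a single recurrent unit whose previous output is fed back in alongside each new input. This is a minimal sketch under assumed values: the weights are hypothetical, the tanh activation is one common choice, and the numbers stand in for something like a measured facial movement per video frame.

```python
import math


def run_recurrent_unit(inputs, input_weight=1.0, feedback_weight=0.9):
    """Process a sequence one step at a time, feeding the state back in."""
    state = 0.0
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state.
        state = math.tanh(input_weight * x + feedback_weight * state)
        states.append(state)
    return states


# Both sequences end with the same input value, but their histories differ.
gradual_onset = run_recurrent_unit([0.1, 0.1, 0.1])
sudden_input = run_recurrent_unit([0.0, 0.0, 0.1])
print(gradual_onset[-1], sudden_input[-1])
```

Although the last input is identical in both runs, the final state is larger for the sequence with a gradual build-up, so the unit carries information about what came before.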
Machine learning is the process of automatically analyzing gathered statistical data to find patterns and to build the necessary algorithms from them, thus setting the neural network parameters that will later be used for predictions. The algorithms created at the machine learning stage allow the AI system to draw correct conclusions from the data provided to it.
Two main approaches to machine learning are discussed here:
Training with a teacher (supervised learning)
Learning without a teacher (self-study, or unsupervised learning)
In training with a teacher, pre-selected data is used. The correct answers are already known and reliably defined, and the parameters of the neural network are adjusted to minimize the error. In this method, the AI can match the correct answer to each input example and identify possible dependencies of the response on the input data. For example, a collection of X-ray images with their associated diagnoses can serve as the basis for training the system. From the series of models obtained, a person ultimately chooses the most appropriate one, based on which gives the most accurate predictions.
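To make "adjusting the parameters to reduce the error" concrete, here is a small sketch that uses the classic perceptron learning rule as a simple stand-in for the training procedures used in practice. The labeled examples, learning rate, and number of passes are hypothetical choices for illustration.

```python
def predict(weights, bias, features):
    """Threshold neuron: output 1 if the weighted sum plus bias is positive."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train_perceptron(examples, learning_rate=0.1, epochs=20):
    """Nudge the weights after every wrong answer on the labeled examples."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)  # -1, 0, or 1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical labeled data: inputs paired with known correct answers (logical OR).
labeled_examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train_perceptron(labeled_examples)
predictions = [predict(weights, bias, features) for features, _ in labeled_examples]
print(predictions)
```

The "teacher" here is simply the label attached to each example: whenever the prediction disagrees with it, the weights move a small step in the direction that would have reduced the mistake.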
Often, preparing such data and the retrospective answers requires a lot of human intervention and manual selection. The quality of the result is also influenced by the subjectivity of the human expert. If, for any reason, the expert does not consider the entire sample and all of its attributes during training, the conceptual model will be limited by the expert's choices and by the current level of development of science and technology. The resulting AI will then have a certain "blindness".
Therefore, it is important to teach the AI system with examples that are adequate and that occur with frequencies representative of real-life conditions. Geographical and socio-demographic factors can have a great impact, which is why it is usually not a good idea to use mathematical models trained on population data from other countries and regions. The expert is also responsible for the representativeness of the training set.
Self-study (unsupervised learning) is used when there are no ready answers or classification algorithms. In this case, the AI focuses on independently identifying hidden dependencies. Machine self-learning makes it possible to categorize information by analyzing hidden patterns and automatically recovering the internal structure and nature of the information. This removes the systemic "blindness" of the researcher.
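One familiar example of learning without a teacher is clustering. The sketch below uses a simple one-dimensional k-means procedure, chosen here purely for illustration, to split unlabeled numbers into two groups. The measurements and the starting centers are hypothetical.

```python
def k_means_1d(values, initial_centers, iterations=10):
    """Group numbers around centers, with no labels provided."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled measurements that happen to form two groups.
measurements = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = k_means_1d(measurements, initial_centers=[0.0, 6.0])
print(centers, clusters)
```

Nobody tells the procedure which group a value belongs to; the grouping emerges from the structure of the data itself.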
Deep learning usually relies on multilayer neural networks and a very large number of example objects for training. The training sample should contain hundreds of thousands of records or more. To teach AI to recognize a person's face in a photo, the Facebook team needed millions of images with metadata and tags indicating the presence of a face in the photo. Facebook's success in implementing facial recognition was due to the huge amount of initial training information at its disposal: the social network has hundreds of millions of accounts whose owners uploaded a huge number of photos and tagged (identified) the people in them. Deep machine learning on this amount of data allowed Facebook to create a reliable artificial intelligence application. The application not only detects a person's face in an image within milliseconds, but also quite often guesses who is shown in the photo.
Training with a teacher is more convenient and preferable in situations where accumulated and reliable retrospective baseline data is available. Training on such data takes less time and lets you get a working AI solution quickly. When it is not possible to obtain a database that pairs information with answers, it is necessary to apply self-learning methods based on deep machine learning. Such solutions do not need human supervision.
Victoria Liset is a strategic business & technology consultant to SMEs. She helps businesses improve their performance by using data more efficiently and by helping them understand the implications of new technologies such as AI, machine learning, big data, blockchain, and IoT.