Some time ago, an entertaining article with a critical appraisal of the results of deep learning was published on Geektimes. It is an engaging and interesting read that reveals a great deal to those who are interested in artificial intelligence, artificial neural networks, and deep learning but not deeply immersed in the subject.
So I would recommend reading it first and then returning to this note of mine, which is a kind of response to that article. But if you have no time or inclination to read it, here are its theses:
- For deep learning systems to work, a great deal of high-quality data is needed, which must be pre-cleaned and labeled by a specialist. The deeper the system, the more data is required.
- Deep learning systems work only with the types of data on which they were trained; they still cannot generalize and transfer the patterns they find to other types of data, even very similar ones.
- Deep learning systems find it very difficult to work with hierarchical structures, so language processing is very hard for them, since natural language is a deeply hierarchical structure.
- As a consequence of the previous point, it is very difficult for deep learning systems to handle inaccurate and fuzzy data; they often see no difference where, for a person, the difference is huge.
- Deep learning systems inherited from artificial neural networks, and exacerbated, the problem of the high complexity (to the point of practical impossibility) of explaining the results they produce and the inferences they make.
- Deep learning systems do not take the existing body of knowledge into account; instead, they retrain from the input data, simply interpreting it in their own way.
- Identifying causal relationships and separating them from mere correlations is an important task, but deep learning systems still struggle to do this.
- Deep learning can be easily fooled, especially when it is on the verge of overfitting. This vulnerability opens wide scope for a variety of attacks, the consequences of which are not yet fully understood, and work on solving this problem has barely begun.
- Practical applications for deep learning systems have not really been found yet, and this point is, in fact, a consequence of all the previous ones.
The author of that polemical note goes on to make his disheartening predictions about where the considerable hype around deep learning may lead (namely, to a new, third AI winter), and offers his vision of how all this could be overcome. His advice includes: using unsupervised learning methods, tackling more complex problems, drawing on modern knowledge from the field of psychology and, finally, using hybrid models with symbol-manipulation technologies.
It is on this last opportunity that I would like to focus my attention today.
I have already touched briefly on hybrid artificial intelligence. Let's find out in more detail what it is. First, let's recall its general scheme:
As can be seen from the diagram, a hybrid artificial intelligent system is nothing other than a universal cybernetic machine with three main elements: affectors, a control subsystem, and effectors. Through its affectors, the cybernetic machine perceives signals from the environment; these are processed in the control subsystem, whose output signals are then sent to the effectors, which act on the environment. This is the general scheme of any autonomous agent, so a hybrid artificial intelligent system is also an intelligent agent implementing the agent-based approach.
A hybrid intelligent system differs in that its affectors (sensors) and effectors (actuators) are connected to the control and decision-making subsystem through neural networks.
This exploits the strengths of the bottom-up, or "dirty", approach. An afferent neural network receives environmental signals captured by the sensors and converts them into symbols, which are fed to the input of a universal inference engine. The latter carries out inference based on the symbolic knowledge in its knowledge base and outputs a result that is also represented as symbols.
Thereby the strengths of the top-down, or "clean", approach are realized. The symbolic result is fed to the input of the motor neural network, which converts high-level symbols into specific control signals for the actuators.
Among other things, control connections from all of the system's elements to its sensors must be implemented within the hybrid intelligent system. This realizes adaptation mechanisms based on the homeostasis of the system's internal state.
The sensors record changes in the internal state of each subsystem, its elements, and their complexes, and when a monitored value goes beyond its established homeostatic interval, the control subsystem makes a decision aimed at returning the changed indicator to its setpoint interval.
It is a system with such an architecture that, on crossing a certain threshold of complexity, can be considered intelligent. Intelligence in this sense is defined as an adequate response not only to stimuli from the external environment but also to internal states, including constant monitoring of the state of one's own control subsystem. This is called "self-reflection", and it leads to awareness.
So now let's look at all the bottlenecks of deep learning technology listed in the original article from the perspective of the hybrid approach.
1. Deep learning needs data
The more data, especially well-labeled data, the better the results of deep neural networks. Moreover, they can generalize information and find hidden dependencies, building something like "if … then …" production rules.
In fact, any neural network does this during training, since it effectively builds rules for transforming inputs into outputs, which at a higher level of abstraction can be represented as productions of the form "If the input takes such-and-such values, then the output equals such-and-such".
However, the problem is that even if the neural network makes such generalizations, their representation is hidden in its depths as implicit information, expressed only in the weight coefficients on the connections between neurons. Interpreting these coefficients is very difficult, if not impossible, especially since the same inter-neuron connection may contribute to several different rules.
Consider the example from the original article, where the word "schmister" is introduced, defined as "a sister whose age is from 10 to 21". A neural network may well learn to determine who is a schmister, who is just a sister, and who is neither.
The only question is the training sample. Once the neural network has learned, it will perfectly separate schmisters from sisters, but it will never be able to explain who a schmister is or why it recognized a schmister in a given case.
However, if we are building a hybrid artificial intelligent system, its architecture can be arranged as follows. The sensory neural network determines the basic parameters of the input: for example, that the picture shows a woman and that her estimated age is 18.
This is something neural networks already do very well. These recognized basic characteristics are then fed to the input of a universal inference engine whose knowledge base contains the rule "If a woman is a sister and she is between 10 and 21 years old, then she is a schmister", after which the inference process starts: the system checks whether the recognized woman is a sister, and if so, it becomes absolutely clear that she is a schmister. The result of such an inference can be explained very simply by descending to the basic characteristics recognized by the neural network.
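As a rough sketch of this division of labor (a toy example of mine — the function, fact names, and rule encoding are assumptions, not part of any existing system), the perception network is taken to have already emitted symbolic facts, and a single production rule fires on them:

```python
# Symbolic half of the hybrid scheme, in miniature. A neural network (not
# shown) is assumed to have already turned an image into symbolic facts;
# the rule below then fires on them and can justify its verdict.

def infer_schmister(facts):
    """Rule: if she is a sister and aged 10..21, she is a schmister."""
    if facts.get("is_sister") and 10 <= facts.get("age", -1) <= 21:
        return True, (f"she is a sister and her estimated age "
                      f"({facts['age']}) lies within [10, 21]")
    return False, "the rule's conditions are not satisfied"

# Facts as they might arrive from the perception layer:
percept = {"is_woman": True, "is_sister": True, "age": 18}
verdict, explanation = infer_schmister(percept)
print(verdict, "-", explanation)
```

Unlike the weights of a trained network, the rule's conditions can be read straight back to the user as the explanation.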
And such a hybrid system can be trained both during development and in the process of interacting with it.
For example, if we communicate with it through a dialog interface (a chat bot), then when the word "schmister" appears in the text, the system may ask what it means, and we can then define it.
So the first problem of deep learning systems is fairly easily solved by the hybrid approach.
2. Deep learning so far has little depth and does not transfer the acquired knowledge to other data
Deep learning systems find interesting patterns but cannot generalize them, and from this come the various amusing incidents described in the original article.
Generalization is an operation at a higher level of abstraction than merely finding patterns, even hidden ones. While neural networks cope with the latter very easily and already surpass people at it, they have serious problems with generalization. Then again, so do many people.
In fact, generalization is a symbolic operation: when generalizing, we transfer the patterns found in individual instances to types and classes of objects in the external world, or to abstract entities.
And this is done precisely at the symbolic level, where inference rules are modified by meta-rules. For example, one such meta-rule might be: "If several objects of the same class have the same property, then assume that all objects of this class have this property." Classes can be grouped into higher-order classes, generalizations can be made for those, and so on.
All this can be done with a self-learning universal inference engine that has meta-rules, including meta-rules for changing the meta-rules themselves.
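The quoted meta-rule can be sketched in a few lines (a toy illustration; the confirmation threshold and the observation format are my assumptions):

```python
from collections import defaultdict

THRESHOLD = 3  # how many confirming instances justify a generalization

def generalize(observations):
    """Apply the meta-rule: if >= THRESHOLD observed objects of one class
    share a property, hypothesize that all objects of the class have it."""
    counts = defaultdict(int)
    for cls, prop in observations:
        counts[(cls, prop)] += 1
    return {f"all {cls}s have the property {prop}"
            for (cls, prop), n in counts.items() if n >= THRESHOLD}

obs = [("raven", "black")] * 3 + [("swan", "white")] * 2
print(generalize(obs))  # only the raven observations clear the threshold
```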
But this requires a hierarchy of concepts, and in deep learning systems the implementation of hierarchies is still very weak, since…
3. Deep learning does not yet have a natural way to work with hierarchical structure.
Neural networks still struggle to cope with hierarchies of concepts. Attempts to represent hierarchical structures in neural network models lead either to overfitting, or to networks that cannot fully differentiate the objects being recognized as they descend the hierarchy.
Deep learning networks have achieved some success in this matter, since hierarchies of representations form in them implicitly and on their own; the whole question is how to put a given hierarchy into a neural network in explicit form. So far this causes certain difficulties.
There are, for example, neural network models that transform sequences of words into "vectors in a three-hundred-dimensional space of meanings, whose essence is not clear to human consciousness", and these vectors even seem to group into clusters. But again, these are not hierarchies; they are the same vector transformations that neural network models essentially are.
The top-down paradigm, on the other hand, has every means to represent hierarchical concepts natively. This can be done even with productions, not to mention semantic networks and ontologies, which are designed precisely for this.
But here we are again talking about explicit knowledge representation and supervised, deductive learning, although it is quite possible to build artificial intelligent systems that are trained not in a special learning mode but in the process of interacting with the environment. The main thing is that the right reinforcement comes from the environment.
So with the hybrid approach we can again take the best of both paradigms. At the lower level, recognition neural networks determine the specific objects and concepts the system has to interact with, and at the higher level these recognized concepts are fitted into semantic networks in order to determine hierarchical relationships and obtain generalized conclusions based on the higher-order classes that include the recognized object. For example, a visual neural network recognizes a man's face in a photograph, and at the upper level, to the specific characteristics of the recognized face are added not only personal data and physical characteristics but also concepts such as "adult", "male", "human", "primate", "mammal", and all the other classes that include the underlying objects of the hierarchy.
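The upward walk through such a hierarchy can be sketched directly (the is-a table below is a deliberately tiny, hand-made illustration, not a real ontology):

```python
# A hand-written is-a hierarchy through which the symbolic level walks
# upward from whatever concept the vision network has recognized.
IS_A = {
    "recognized face": "man",
    "man": "adult",
    "adult": "human",
    "human": "primate",
    "primate": "mammal",
}

def superclasses(concept):
    """Collect every class above `concept` in the is-a hierarchy."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain

print(superclasses("recognized face"))
# ['man', 'adult', 'human', 'primate', 'mammal']
```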
4. Deep Learning still has a hard time dealing with imprecisely defined concepts.
Throughout their lives, people constantly face situations where data is incomplete, information is inaccurate, and knowledge is contradictory, fuzzy, and uncertain. These so-called NOT-factors of knowledge permeate the entire informational fabric of our reality by virtue of its very nature, since the accuracy of data cannot be increased indefinitely due to fundamental limitations.
The human mind has learned to handle such situations, but deep learning systems have not, since they require clearly labeled data and the most complete possible set of input parameter values for training. Only recently have combined mechanisms begun to appear for processing fuzzy or uncertain knowledge with neural networks.
The symbolic approach in artificial intelligence, meanwhile, handles NOT-factors excellently.
A variety of formalisms are intended for this, from the Dempster-Shafer method and Lotfi Zadeh's fuzzy logic to soft and linguistic computing. These formalisms make it possible to carry out inference even under a high degree of uncertainty and obtain quite acceptable results.
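As the simplest taste of the fuzzy-logic side (the breakpoints here are arbitrary assumptions of mine), a crisp age threshold can be replaced by a degree of membership:

```python
def membership_young(age):
    """Trapezoidal fuzzy membership in the set 'young': fully young up to
    20, definitely not young from 35, linear descent in between."""
    if age <= 20:
        return 1.0
    if age >= 35:
        return 0.0
    return (35 - age) / 15.0

for a in (18, 25, 40):
    print(a, round(membership_young(a), 2))
```

Instead of a yes/no answer, the inference engine can then propagate degrees of truth through its rules.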
Nevertheless, knowledge also has NOT-factors of a different kind, relating not to specific pieces of information but to knowledge as a whole: the already mentioned incompleteness, as well as inconsistency, incorrectness, inadequacy, and some other properties of knowledge. These manifest themselves in people to varying degrees and characterize each person's level of expertise in a given area (one can be an expert in one field, possessing all the completeness of knowledge available in it, and a layman in another, with no knowledge of it at all). Hybrid artificial intelligence systems can combine the power of deep learning for revealing new knowledge in data with formal methods for processing NOT-factors, in order to work under uncertainty and incompleteness of information.
5. Deep learning is still not transparent enough
A lot has already been said about the "impossibility" of interpreting the results of training neural networks. The issue is not unknowability but the prohibitive computational cost of understanding the weight adjustments on the connections between neurons produced by training. If a network consists of hundreds of thousands of neurons and thousands of layers, the number of weights on the connections is simply cyclopean, and "disassembling" the learning results is not feasible from a computational point of view.
This problem has already led to the emergence of a new paradigm: XAI, eXplainable Artificial Intelligence.
XAI must explain its decisions and results, and this is quite possible for hybrid intelligent systems, since in a top-down approach, explanation is one of the key features. Symbolic systems can always explain the results of inference, since both the inference rules and the input data are known.
Even in the case of processing of non-factors of knowledge, it is quite possible to explain how certain results were obtained or certain decisions were made.
When hybrid AI systems are created and used, deep learning networks prepare the basic information for decision-making by a universal solver. It is at this level, in fact, that the "inexplicable" part of such systems' operation ends. Human consciousness works in much the same way.
For example, when a person sees a cat, he usually cannot explain exactly how he recognized the cat in the animal (rationalizing explanations along the lines of "it is a small animal with pointed ears and a striped tail" do not count, since they are precisely post-hoc rationalization). But then the symbol "cat" and all of its superordinate symbols can participate in further derivation, and the results of such an inference can be fully explained.
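A forward-chaining trace makes this concrete (the rules are toy ones of my own; the point is only that every step after the symbol "cat" appears is recorded and replayable as an explanation):

```python
# After the (unexplainable) perception step emits the symbol "cat", every
# symbolic inference step is logged, so the derivation can be explained.
RULES = [
    ("cat", "felid"),
    ("felid", "predator"),
    ("predator", "needs meat"),
]

def derive(start_symbol):
    """Forward-chain over RULES, logging each fired rule as a trace line."""
    facts, trace = {start_symbol}, []
    changed = True
    while changed:
        changed = False
        for premise, conclusion in RULES:
            if premise in facts and conclusion not in facts:
                facts.add(conclusion)
                trace.append(f"{premise} -> {conclusion}")
                changed = True
    return facts, trace

facts, trace = derive("cat")
print(trace)  # ['cat -> felid', 'felid -> predator', 'predator -> needs meat']
```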
6. Deep learning does not yet integrate well with existing knowledge
This problem is a consequence of all of the above. To integrate artificial neural networks with knowledge that has already been accumulated and even formalized, a great deal of work is needed to bring that knowledge into a form that deep learning systems can perceive.
However, much knowledge has already been reduced to a form that symbolic systems, and therefore hybrid ones, can perceive. And work on formalizing knowledge proceeds in exactly this direction: knowledge is represented within symbolic formalisms, not marked up for consumption by neural networks.
This means it makes sense to focus precisely on combining the two approaches, which is the path to hybridizing the top-down and bottom-up paradigms. A hybrid AI system, by its very nature, will be integrated with the entire body of existing knowledge.
7. Deep Learning is not yet able to automatically distinguish causation from correlation.
To be honest, even people with their natural intelligence often cannot distinguish causality from correlation. Natural selection has made the human neocortex inclined to find causal relationships where none exist or can exist. From the point of view of evolution this is quite reasonable: it is better to run away and survive than to ponder at length whether one event is really a direct consequence of another. So the excessive human tendency to see cause-and-effect relationships is most likely a tool for reducing Type II errors in pattern recognition.
At the same time, formal logic provides all the tools needed to determine cause-and-effect relationships in observational results. Most often the apparatus of fuzzy logic and evidence theory has to be used, since observations and experiments are usually never "pure".
So we must allow some leeway for the fact that undetected factors are mixed into the results. Applying the methods of mathematical statistics (finding correlations) and formal logic (establishing cause-and-effect relationships) simultaneously therefore works quite well and reliably.
Deep learning technologies implement statistical methods with ease, but formal logic is the domain of the symbolic approach. So solving this problem of neural networks again requires hybridizing the paradigms.
8. Deep Learning works well as an approximation so far, but its answers often cannot be completely trusted.
This problem is a consequence of the difficulty of interpreting the results obtained at the output of deep learning networks, and of the previous problem. Yes, deep learning does a pretty good job with statistical models, as already shown.
Approximation is simplification: replacing a complex computational process with a simpler one. If some computational process can be at least approximately represented as a product of matrices, then using deep learning methods it is quite possible to approximate such a process.
This is perhaps akin to processes occurring within natural intelligence. Neural networks in the human nervous system (and in many animals) quite successfully approximate rather complex computational problems. How do you catch a thrown ball with your hand? By quickly solving a system of second-order differential equations? There are serious doubts that some biochemical process in neurons solves such equations. Rather, neural networks successfully approximate the solution of such problems after numerous acts of training. But try to explain how you catch a thrown ball.
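The "approximation without explanation" point can be shown with a one-weight model (an invented toy: the hidden process and hyperparameters are arbitrary):

```python
# A one-weight "network" y ≈ w*x fitted by gradient descent on samples of a
# hidden process (y = 3x). It learns to imitate the process without ever
# being given its closed form.
data = [(x, 3.0 * x) for x in range(1, 6)]

w = 0.0
for _ in range(200):
    # gradient of the mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= 0.01 * grad

print(round(w, 3))  # converges to roughly 3.0
```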
And again the hybrid paradigm comes to the rescue. It does not matter how the neural network's weights are tuned: if the network has learned to successfully approximate a process, then at a higher level of abstraction such results can be explained by certain rules (at least in the same way a person would explain them). And then the apparatus of some theory of evidence (for example, Dempster-Shafer theory) could be used to obtain a degree of confidence in the conclusion and its explanation.
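A minimal sketch of Dempster's rule of combination (the two mass functions below are invented; focal elements are frozensets over a two-hypothesis frame):

```python
from itertools import product

def combine(m1, m2):
    """Fuse two Dempster-Shafer mass functions by Dempster's rule."""
    combined, conflict = {}, 0.0
    for (a, p), (b, q) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + p * q
        else:
            conflict += p * q  # mass assigned to contradictory pairs
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

CAT, DOG = frozenset({"cat"}), frozenset({"dog"})
EITHER = CAT | DOG

m1 = {CAT: 0.7, EITHER: 0.3}            # e.g. the neural network's evidence
m2 = {CAT: 0.6, DOG: 0.1, EITHER: 0.3}  # e.g. a symbolic rule's evidence

fused = combine(m1, m2)
print({tuple(sorted(k)): round(v, 3) for k, v in fused.items()})
```

The fused masses quantify how much confidence the system may attach to the "cat" conclusion it will then explain symbolically.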
9. Deep learning is hard to use for applied purposes
And finally, a defeatist problem of sorts. In fact, there is no problem here. Deep learning systems can easily be applied to various practical tasks; the industry itself is simply still too young for widespread applied development. But the processes are already underway. Machine vision systems have long served industries such as security and traffic management.
Recognizing incidents on highways and in video surveillance zones is a thoroughly applied task. Recommendations based on purchase histories or likes on social networks are also applied problems successfully solved by deep learning methods.
Of course, artificial intelligence methods are not yet used as widely as we would like. In time, AI will enter all spheres of human life and society and change them, often beyond recognition. The following areas, in particular, will definitely be most seriously affected:
- State and municipal administration.
- Ensuring personal and public safety.
- Transport and logistics.
- Science.
I foresee that in all these areas powerful decision support systems will appear and develop; in essence, they are vivid examples of the implementation of the top-down paradigm. But these DSSs will increasingly absorb deep learning methods and other technologies of the bottom-up paradigm, ultimately becoming hybrid DSSs: decision-making systems from which the notorious human factor is often excluded in principle.
✹ ✹ ✹
All this says just one thing: it is necessary to become a broad specialist with knowledge in a wide variety of areas. To create advanced AI systems it is no longer enough simply to learn artificial neural network technology; one must immerse oneself in the methods and technologies of artificial intelligence that have been developed since the very birth of this field of science and technology. Otherwise, the third winter awaits.