Support Vector Machines, Neural Networks and Fuzzy Logic Models

Preface


  This is a book about learning from experimental data and about transferring human knowledge into analytical models.

  Performing such tasks belongs to soft computing. Neural networks (NNs) and support vector machines (SVMs) are the mathematical structures (models) that stand behind the idea of learning, and fuzzy logic (FL) systems are aimed at embedding structured human knowledge into workable algorithms. However, there is no clear boundary between these two modeling approaches. The notions, basic ideas, fundamental approaches, and concepts common to these two fields, as well as the differences between them, are discussed in some detail. The sources of this book are course material presented by the author in undergraduate and graduate lectures and seminars, and the research of the author and his graduate students. The text is therefore both class- and practice-tested.

  The primary idea of the book is that not only is it useful to treat support vector machines, neural networks, and fuzzy logic systems as parts of a connected whole but it is in fact necessary. Thus, a systematic and unified presentation is given of these seemingly different fields - learning from experimental data and transferring human knowledge into mathematical models.

  Each chapter is arranged so that the basic theory and algorithms are illustrated by practical examples and followed by a set of problems and simulation experiments. In the author's experience, this approach is the most accessible, pleasant, and useful way to master this material, which contains many new (and potentially difficult) concepts. To some extent, the problems are intended to help the reader acquire technique, but most of them serve to illustrate and develop further the basic subject matter of the chapter. The author feels that this structure is suitable both for a textbook used in a formal course and for self-study.

  How should one read this book? A kind of newspaper reading, starting from the back pages, is possible but not a good idea. There are, however, useful sections at the back: chapters 8 and 9 form an armory of mathematical weaponry and tools containing many useful and necessary concepts, equations, and methods, and more or less frequent trips to them are probably unavoidable. But in the usual way of books, one should begin with this preface and continue reading to the end of chapter 1. That first chapter provides a pathway into the learning and soft computing field, and after that, readers may continue with any chapters they feel will be useful. Note, however, that chapters 3 and 4 are connected and should be read in that order. (See the figure in the Chapters' Survey, which shows the connections between the chapters, and the concise chapter descriptions given there.)

  In senior undergraduate classes, the order followed was chapters 1, 3, 4, 5, and 6, with chapters 8 and 9 used when needed. In graduate classes, chapter 2 on support vector machines is not omitted, and the order is the natural one, working directly through chapters 1-6.

  There is some redundancy in this book, for two main reasons. First, the subject of this book is a blend of different areas. The various fields bound together here used to be separate, and today they are amalgamated in the broad area of learning and soft computing. Therefore, in order to present each particular segment of this field, one must follow the approaches, tools, and terminology of each specific area. Because each area was developed separately by researchers, scientists, and enthusiasts with different backgrounds, many things were repeated. Thus, there are some echoes in this presentation but, the author believes, not too many. Second, he agrees with the old Latin saying, Repetitio est mater studiorum - repetition is the mother of learning. This provides the second explanation of the 'redundancy' in this volume.

  A few words about the accompanying software are in order. All the software is based on MATLAB. All programs run in R11 and R12, i.e., in versions 5 and 6.

  The author designed and created the complete aproxim directory, the entire SVM toolbox for classification and regression, the multilayer perceptron routine that includes error backpropagation learning, all first versions of the core programs for RBF models with n-dimensional inputs, and some of the core fuzzy logic models. Some programs date back as far as 1992, so they may not be very elegant. However, all are effective and perform their allotted tasks as well as needed.

  The author's students played an important part in creating user-friendly programs with attractive pop-up menus and boxes. Those students came from different parts of the world, and the software was developed in different countries - Yugoslavia, the United States, Germany, and New Zealand - although most of it was developed in New Zealand. These facts are mentioned to explain why readers may find program notes and comments in English, Serbian, and German. (All the basic comments, however, are written in English.) These lines were deliberately left in various languages as nice traces of the small modern world. Without the work of these multilingual, ingenious, diligent students and colleagues, many of the programs would be less user-friendly and, consequently, less adequate for learning purposes.

  The MATLAB version of the package LEARNSC (as pre-parsed pseudocode files, P-files) needed for the simulation experiments is freely downloadable from this site. It is compatible with Releases R11 and R12, i.e., with MATLAB versions 5 and 6, respectively.

  Readers interested in the author's programming solutions may purchase the MATLAB source code (M-files) of the program package LEARNSC (for Releases R11 and R12) from the same site. Note that the software package LEARNSC that accompanies this book, and each of its routines, are for fair use only and free for all educational purposes. They may not be used for any kind of commercial activity.

  A preliminary draft of this book was used in the author's senior undergraduate and graduate courses at various universities in Yugoslavia, Germany, the United States, and New Zealand. The valuable feedback from the curious students who took these courses made many parts of this book easier to read. The author thanks them for that.


Introductory part

  In this book no suppositions are made about preexisting analytical models. There are, however, no limits to human curiosity and the need for mathematical models. Thus, when devising algebraic, differential, discrete, or any other models from first principles is not feasible, one seeks other avenues to obtain analytical models. Such models are devised by solving two cardinal problems in modern science and engineering:

  • Learning from experimental data (examples, samples, measurements, records, patterns, or observations) by support vector machines (SVMs) and neural networks (NNs)
  • Embedding existing structured human knowledge (experience, expertise, heuristics) into workable mathematics by fuzzy logic models (FLMs).

These problems seem to be very different, and in practice that may well be the case. However, after NN or SVM modeling from experimental data is complete, and after the knowledge transfer into an FLM is finished, the two resulting models are mathematically very similar or even equivalent. This equivalence, discussed in section 6.2, is a very attractive property, and it may well be used to the benefit of both fields.
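The equivalence just mentioned can be made concrete in a few lines. The sketch below (in Python rather than the book's MATLAB, with purely illustrative parameter values) shows one well-known instance: a normalized RBF network with Gaussian basis functions computes exactly the same mapping as a Sugeno-type fuzzy model whose rules use Gaussian membership functions and weighted-average defuzzification, when the centers, widths, output weights, and rule consequents coincide.

```python
import numpy as np

# Illustrative (assumed) parameters shared by both models.
centers = np.array([-1.0, 0.0, 1.0])   # basis-function centers / rule centers
sigma = 0.5                             # common Gaussian width
weights = np.array([0.2, 1.0, -0.5])    # output weights / rule consequents

def gaussian(x, c, s):
    return np.exp(-((x - c) ** 2) / (2.0 * s ** 2))

def normalized_rbf(x):
    """Normalized RBF network output for a scalar input x."""
    phi = gaussian(x, centers, sigma)          # hidden-layer activations
    return np.dot(weights, phi) / np.sum(phi)  # normalized weighted sum

def fuzzy_model(x):
    """Sugeno fuzzy model: firing strengths are Gaussian memberships,
    output is the weighted average of the rule consequents."""
    mu = gaussian(x, centers, sigma)
    return np.sum(mu * weights) / np.sum(mu)
```

Algebraically the two functions are identical term by term, which is why a trained network of this kind can be read as a set of fuzzy rules, and vice versa.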

  The need for a book about these topics is clear. Recently, many new 'intelligent' products (theoretical approaches, software and hardware solutions, concepts, devices, systems, and so on) have been launched on the market. Much effort has been made at universities and R&D departments around the world, and numerous papers have been written on how to apply NNs, FLMs, and SVMs and the related ideas of learning from data and embedding structured human knowledge. These two concepts and their associated algorithms form the new field of soft computing, which has been recognized as an attractive alternative to the standard, well-established 'hard computing' paradigms. Traditional hard computing methods are often too cumbersome for today's problems: they always require a precisely stated analytical model and often a great deal of computation time. Soft computing techniques, which trade off unnecessary precision for gains in understanding system behavior, have proved to be important practical tools for many contemporary problems. Because they are universal approximators of any multivariate function, NNs, FLMs, and SVMs are of particular interest for modeling highly nonlinear, unknown, or partially known complex systems, plants, or processes. Many promising results have been reported. The whole field is developing rapidly and is still in its initial, exciting phase.
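A small illustration of the universal-approximation claim may help here. The sketch below (in Python rather than the book's MATLAB, with all parameter values chosen only for illustration) fits a Gaussian RBF network with fixed centers and widths to samples of the nonlinear function sin(x); because the basis functions are fixed, the output weights follow from a single linear least-squares problem.

```python
import numpy as np

# Learn y = sin(x) from samples with a Gaussian RBF network
# trained by linear least squares (illustrative values throughout).
x_train = np.linspace(-np.pi, np.pi, 40)
y_train = np.sin(x_train)

centers = np.linspace(-np.pi, np.pi, 12)   # fixed basis-function centers
sigma = 0.6                                 # fixed width (a design choice)

def design_matrix(x):
    # Each column is one Gaussian basis function evaluated at all inputs x.
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * sigma ** 2))

# Output-layer weights: solve the linear least-squares problem  G w ~= y.
w, *_ = np.linalg.lstsq(design_matrix(x_train), y_train, rcond=None)

# Maximum approximation error on a dense test grid inside the training range.
x_test = np.linspace(-3.0, 3.0, 100)
error = np.max(np.abs(design_matrix(x_test) @ w - np.sin(x_test)))
```

With only a dozen basis functions the worst-case error on the test grid is already tiny; adding basis functions drives it down further, which is the practical face of the universal-approximation property.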

   At the very beginning, it should be stated clearly that there are times when there is no need for these two novel model-building techniques. Whenever there is an analytical closed form model, using a reasonable number of equations, that can solve the given problem in a reasonable time, at reasonable cost, and with reasonable accuracy, there is no need to resort to learning from experimental data or fuzzy logic modeling. Today, however, these two approaches are vital tools when at least one of those criteria is not fulfilled. There are many such instances in contemporary science and engineering.

  The title of the book gives only a partial description of the subject, mainly because the meaning of learning is variable and indeterminate. Similarly, the meaning of soft computing can change quickly and unpredictably. Usually, learning means acquiring knowledge about a previously unknown or little known system or concept. Adding that the knowledge will be acquired from experimental data yields the phrase statistical learning. Very often, the devices and algorithms that can learn from data are characterized as intelligent. The author wants to be cautious by stating that learning is only a part of intelligence, and no definition of intelligence is given here. This issue used to be, and still is, addressed by many other disciplines (notably neuroscience, biology, psychology, and philosophy). However, staying firmly in the engineering and science domain, a few comments on the terms intelligent systems or smart machines are now in order.

  Without any doubt the human mental faculties of learning, generalizing, memorizing, and predicting should be the foundation of any intelligent artificial device or smart system. Many products incorporating NNs, SVMs, and FLMs already exhibit these properties. Yet we are still far away from achieving anything similar to human intelligence. Part of a machine's intelligence in the future should be an ability to cope with a large amount of noisy data coming simultaneously from different sensors. Intelligent devices and systems will also have to be able to plan under large uncertainties, to set the hierarchy of priorities, and to coordinate many different tasks simultaneously. In addition, the duties of smart machines will include the detection or early diagnosis of faults, in order to leave enough time for reconfiguration of strategies, maintenance, or repair. These tasks will be only a small part of the smart decision making capabilities of the next generation of intelligent machines. It is certain that the techniques presented here will be an integral part of these future intelligent systems.



Copyright Kecman © 2000 - All Rights Reserved