PREFACE
about 60% correct on 100 categories), the fact that we pull it off seemingly effortlessly serves as a “proof of concept” that it can be done. But there is no doubt in my mind that building truly intelligent machines will involve learning from data.
The first reason for the recent successes of machine learning and the growth of the field as a whole is rooted in its multidisciplinary character. Machine learning emerged from AI but quickly incorporated ideas from fields as diverse as statistics, probability, computer science, information theory, convex optimization, control theory, cognitive science, theoretical neuroscience, physics and more. To give an example, the main conference in this field is called Advances in Neural Information Processing Systems, a name that refers to information theory, theoretical neuroscience and cognitive science.
The second, perhaps more important reason for the growth of machine learning is the exponential growth of both available data and computer power. While the field is built on theory and tools developed in statistics, machine learning recognizes that the most exciting progress can be made by leveraging the enormous flood of data that is generated each year by satellites, sky observatories, particle accelerators, the human genome project, banks, the stock market, the army, seismic measurements, the internet, video, scanned text and so on. It is difficult to appreciate the exponential growth of data that our society is generating. To give an example, a modern satellite generates roughly the same amount of data as all previous satellites produced together. This insight has shifted the attention from highly sophisticated modeling techniques on small datasets to more basic analysis on much larger datasets (the latter sometimes called data mining). Hence the emphasis shifted to algorithmic efficiency, and as a result many machine learning faculty (like myself) can typically be found in computer science departments. To give some examples of recent successes of this approach, one would only have to turn on one's computer and perform an internet search. Modern search engines do not run terribly sophisticated algorithms, but they manage to store and sift through almost the entire content of the internet to return sensible search results. There has also been much success in the field of machine translation, not because a new model was invented but because many more translated documents became available.
The field of machine learning is multifaceted and expanding fast. To sample a few sub-disciplines: statistical learning, kernel methods, graphical models, artificial neural networks, fuzzy logic, Bayesian methods and so on. The field also covers many types of learning problems, such as supervised learning, unsupervised learning, semi-supervised learning, active learning, reinforcement learning etc. I will only cover the most basic approaches in this book from a highly personal perspective. Instead of trying to cover all aspects of the entire field I have chosen to present a few popular and perhaps useful tools and approaches. But what will (hopefully) be significantly different from most other scientific books is the manner in which I will present these methods. I have always been frustrated by the lack of proper explanation of equations. Many times I have stared at a formula without the slightest clue where it came from or how it was derived. Many books also excel in stating facts in an almost encyclopedic style, without providing the proper intuition of the method. This is my primary mission: to write a book which conveys intuition. The first chapter will be devoted to why I think this is important.