Deep Learning

26th August 2022 By indiafreenotes

Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised.

Deep-learning architectures such as deep neural networks, deep belief networks, deep reinforcement learning, recurrent neural networks, convolutional neural networks and Transformers have been applied to fields including computer vision, speech recognition, natural language processing, machine translation, bioinformatics, drug design, medical image analysis, climate science, material inspection and board game programs, where they have produced results comparable to and in some cases surpassing human expert performance.

Artificial neural networks (ANNs) were inspired by information processing and distributed communication nodes in biological systems. ANNs have various differences from biological brains. Specifically, artificial neural networks tend to be static and symbolic, while the biological brain of most living organisms is dynamic (plastic) and analogue.

The adjective “deep” in deep learning refers to the use of multiple layers in the network. Early work showed that a linear perceptron cannot be a universal classifier, but that a network with a nonpolynomial activation function with one hidden layer of unbounded width can. Deep learning is a modern variation which is concerned with an unbounded number of layers of bounded size, which permits practical application and optimized implementation, while retaining theoretical universality under mild conditions. In deep learning the layers are also permitted to be heterogeneous and to deviate widely from biologically informed connectionist models, for the sake of efficiency, trainability and understandability, hence the “structured” part.
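To make the idea of "multiple layers" concrete, here is a minimal sketch of a forward pass through a small deep network in plain Python. The layer sizes, the random weight initialisation and the helper names (`dense`, `init_layer`, `forward`) are assumptions chosen for illustration and do not come from any particular library.

```python
import random

def relu(v):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, x) for x in v]

def dense(weights, biases, v):
    """One fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Random weights in [-1, 1] and zero biases (an illustrative choice)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layers, v):
    """Apply each layer in turn, with ReLU after every layer except the last."""
    for i, (weights, biases) in enumerate(layers):
        v = dense(weights, biases, v)
        if i < len(layers) - 1:
            v = relu(v)
    return v

rng = random.Random(0)
# Input width 3, three hidden layers of bounded width 4, output width 2.
sizes = [3, 4, 4, 4, 2]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
output = forward(layers, [0.5, -1.0, 2.0])
print(len(layers), "layers ->", output)
```

Making the network deeper only means appending more entries to `sizes`; the width of each layer stays bounded, which is the regime the paragraph above describes.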


Deep neural networks are generally interpreted in terms of the universal approximation theorem or probabilistic inference.

The classic universal approximation theorem concerns the capacity of feedforward neural networks with a single hidden layer of finite size to approximate continuous functions. George Cybenko published the first proof in 1989 for sigmoid activation functions, and Kurt Hornik generalised it to feed-forward multi-layer architectures in 1991. More recent work showed that universal approximation also holds for unbounded activation functions such as the rectified linear unit (ReLU).
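The single-hidden-layer case can be illustrated in one dimension without any training. The sketch below constructs the weights directly, so that a network with one hidden layer of ReLU units linearly interpolates a target function at evenly spaced points. This is an illustrative construction under assumptions (a scalar input, a fixed interval, the sine function as target, and the hypothetical helper names shown), not the argument used in the proofs cited above.

```python
import math

def build_relu_network(f, a, b, width):
    """Return (knots, coeffs, offset) for a one-hidden-layer ReLU network that
    linearly interpolates f at width + 1 evenly spaced points on [a, b]."""
    step = (b - a) / width
    knots = [a + i * step for i in range(width + 1)]
    slopes = [(f(knots[i + 1]) - f(knots[i])) / step for i in range(width)]
    # The output weight of hidden unit i is the change in slope at knot i.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, width)]
    return knots[:width], coeffs, f(a)

def evaluate(network, x):
    """Hidden layer computes relu(x - knot_i); output is a weighted sum plus bias."""
    knots, coeffs, offset = network
    return offset + sum(c * max(0.0, x - k) for k, c in zip(knots, coeffs))

def max_error(f, network, a, b, samples=1000):
    """Largest absolute error on an evenly spaced grid over [a, b]."""
    xs = [a + (b - a) * i / samples for i in range(samples + 1)]
    return max(abs(f(x) - evaluate(network, x)) for x in xs)

a, b = 0.0, 2.0 * math.pi
errors = {}
for width in (4, 16, 64):
    network = build_relu_network(math.sin, a, b, width)
    errors[width] = max_error(math.sin, network, a, b)
    print(width, "hidden units, max error:", errors[width])
```

The error shrinks as the hidden layer gets wider, which is the behaviour the theorem guarantees for continuous targets on a bounded interval.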

The universal approximation theorem for deep neural networks concerns the capacity of networks whose width is bounded but whose depth is allowed to grow. Lu et al. proved that if the width of a deep neural network with ReLU activation is strictly larger than the input dimension, then the network can approximate any Lebesgue integrable function; if the width is smaller than or equal to the input dimension, then a deep neural network is not a universal approximator.
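One way to write out what "can approximate any Lebesgue integrable function" means is sketched below. The symbols are notation introduced here for illustration and are not taken from the source: n is the input dimension, f is the target function, and \mathcal{N}_w is the set of functions computed by ReLU networks of width at most w and arbitrary depth. The distance is assumed to be the integral of the absolute difference.

```latex
\[
\forall\, \varepsilon > 0 \;\; \exists\, F \in \mathcal{N}_w : \quad
\int_{\mathbb{R}^n} \lvert f(x) - F(x) \rvert \, dx < \varepsilon
\]
```

Read this way, the statement above says that such an F exists for every integrable f when w is larger than n, and that some integrable f has no such F when w is at most n.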

The probabilistic interpretation derives from the field of machine learning. It features inference, as well as the optimization concepts of training and testing, related to fitting and generalization, respectively. More specifically, it treats the activation nonlinearity as a cumulative distribution function. This view led to the introduction of dropout as a regularizer in neural networks. It was introduced by researchers including Hopfield, Widrow and Narendra and popularized in surveys such as the one by Bishop.
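Both ideas in this paragraph can be sketched in a few lines of Python. The logistic sigmoid is used as the example activation because it is also the cumulative distribution function of the standard logistic distribution. The dropout function uses the "inverted" variant, which rescales the surviving units; that variant, the drop probability and the function names are assumptions made for illustration, not details from the source.

```python
import math
import random

def sigmoid(x):
    """Logistic sigmoid; also the CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))

def dropout(activations, drop_prob, rng):
    """Inverted dropout: zero each unit with probability drop_prob and rescale
    the survivors so that the expected activation is unchanged."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(42)
hidden = [sigmoid(x) for x in (-2.0, -0.5, 0.0, 0.5, 2.0)]
dropped = dropout(hidden, 0.5, rng)
print("activations:  ", hidden)
print("after dropout:", dropped)
```

Dropout of this kind is applied only during training; at test time the activations are passed through unchanged, which is why the survivors are rescaled here.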

Rise of Deep Learning

Machine learning is said to have originated in the 1950s, when Alan Turing, a British mathematician, proposed his artificially intelligent “learning machine.” Arthur Samuel wrote the first computer learning program, which made an IBM computer improve at the game of checkers the longer it played. In the decades that followed, various machine learning techniques came in and out of fashion.

Neural networks were mostly ignored by machine learning researchers, as they were plagued by the ‘local minima’ problem, in which a set of weights incorrectly appears to give the fewest errors. However, some machine learning applications, such as computer vision and facial recognition, moved forward. In 2001, a detector built on the AdaBoost machine learning algorithm was able to find faces within an image in real time. It filtered images through a sequence of simple decisions such as “does the image have a bright spot between dark patches, possibly denoting the bridge of a nose?” The further the data moved down this sequence of decisions, the higher the probability of selecting the right face from the image.

Neural networks did not return to favor for several more years, until powerful graphics processing units entered the market. The new hardware enabled researchers to use desktop computers instead of supercomputers to run, manipulate, and process images. The most significant leap forward for neural networks came with the introduction of substantial amounts of labeled data through ImageNet, a database of millions of labeled images from the Internet. The cumbersome task of manually labeling images was handed over to crowdsourcing, giving networks a virtually unlimited source of training material. In the years since, technology companies have made their deep learning libraries open source. Examples include Google TensorFlow, Facebook open-source modules for Torch, Amazon DSSTNE on GitHub, and Microsoft CNTK.

Deep Learning Career Prospects

The field of artificial intelligence is seriously understaffed. Not all companies are hiring professionals with deep learning skills yet, but having such trained experts is expected to gradually become a crucial requirement for organizations looking to remain competitive and drive innovation. Machine learning engineers are in high demand because neither data scientists nor software engineers have precisely the skills needed for the field of machine learning. The role of machine learning engineer has evolved to fill the gap. What does deep learning promise in terms of career opportunities and pay? Quite a bit. Glassdoor lists the average salary for a machine learning engineer at nearly $115,000 annually. According to PayScale, the salary range spans $100,000 to $166,000. Growth is expected to accelerate in the coming years as deep learning systems and tools improve and expand into all industries.