Artificial Neural Networks: Introduction
Artificial Neural Networks - I, 09/10/2018

Table of Contents
- Introduction to ANNs
- Taxonomy
- Features
- Learning
- Applications

Contents - I
- Introduction to ANNs
  - Processing elements (neurons)
  - Architecture
- Functional Taxonomy of ANNs
- Structural Taxonomy of ANNs
- Features
- Learning Paradigms
- Applications

The Biological Neuron
- 10 billion neurons in the human brain
- Summation of input stimuli
  - Spatial (signals)
  - Temporal (pulses)
- Threshold over composed inputs
- Constant firing strength
- [...] billion synapses in the human brain
- Chemical transmission and modulation of signals
  - Inhibitory synapses
  - Excitatory synapses

Biological Neural Networks
- 10,000 synapses per neuron
- Computational power = connectivity
- Plasticity
  - New connections (?)
  - Strength of connections modified

Neural Dynamics
- Action potential: 100 mV
- Activation threshold: 20-30 mV
- Rest potential: -65 mV
- Spike time: 1-2 ms
- Refractory time: 10-20 ms

Complexity of Neural Networks
The complexity and diversity of neural networks stem not only from the large number of neurons and synapses, the complex ways in which they combine, and their extensive interconnections, but also from the complexity of synaptic transmission. The transmission mechanisms discovered and explained so far include postsynaptic excitation, postsynaptic inhibition, presynaptic inhibition, presynaptic excitation, and "long-range" inhibition, among others. Within these mechanisms, the release of neurotransmitters is the central step in synaptic transmission, and different neurotransmitters have different modes of action and characteristics.

Research on Neural Networks
Nervous system activity, whether sensation, movement, or the higher functions of the brain (such as learning, memory, and emotion), shows itself at the level of the whole system, so analysing its neural basis and mechanisms inevitably involves several levels. Studies at these levels inform and advance one another. Work at the lower levels (cellular and molecular) provides the analytical basis for observations at higher levels, while higher-level observations help to guide lower-level work and reveal its functional significance. There are specialised physical, chemical, physiological, and psychological studies as well as integrative ones.

The Artificial Neuron
- Figure: a stimulus x1(t), ..., x5(t) reaches neuron i through the connections wi1, ..., wi5 and produces the response yi(t)
- urest = resting potential
- xj(t) = output of neuron j at time t
- wij = connection strength between neuron i and neuron j
- u(t) = total stimulus at time t

Artificial Neural Models
- McCulloch-Pitts-type neurons (static)
  - Digital neurons: activation state interpretation (snapshot of the system each time a unit fires)
  - Analog neurons: firing rate interpretation (activation of units equal to firing rate)
  - Activation of neurons encodes information
- Spiking neurons (dynamic)
  - Firing pattern interpretation (spike trains of units)
  - Timing of spike trains encodes information (time to first spike, phase of signal, correlation and synchronicity)

Binary Neurons
- "Hard" threshold: the response switches from off to on when the stimulus reaches the threshold
- Examples: Perceptrons, Hopfield NNs, Boltzmann Machines
- Main drawbacks: can only map binary functions, biologically implausible

Analog Neurons
- "Soft" threshold: the response moves gradually from off to on as the stimulus grows
- Examples: MLPs, Recurrent NNs, RBF NNs
- Main drawbacks: difficult to process time patterns, biologically implausible

Spiking Neurons
- Spike and afterspike potential
- urest = resting potential
- e(t, u(t)) = trace at time t of input at time t
- Threshold
- xj(t) = output of neuron j at time t
- wij = efficacy of synapse from neuron i to neuron j
- u(t) = input stimulus at time t

Spiking Neuron Dynamics
[figure]

Hebb's Rule
The Canadian psychologist Donald Hebb published The Organization of Behavior, in which he proposed that learning increases the connection strength and transmission efficacy of synapses. This is known as Hebb's rule. Building on it, researchers have proposed a variety of learning rules and algorithms to suit different network models. An effective learning algorithm lets a neural network build an internal representation of the external world by adjusting its connection weights. The result is a distinctive style of information processing in which both storage and processing reside in the network's connections.

Hebb's Postulate of Learning
Biological formulation: When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency as one of the cells firing B is increased.

Hebb's Postulate: Revisited
Stent (1973), and Changeux and Danchin (1976), have expanded Hebb's rule such that it also models inhibitory synapses:
- If two neurons on either side of a synapse are activated simultaneously (synchronously), then the strength of that synapse is selectively increased.
- If two neurons on either side of a synapse are activated asynchronously, then that synapse is selectively weakened or eliminated.

Artificial Neural Networks
- Figure: input layer, hidden layers, output layer; fully connected and sparsely connected variants

Feedforward ANN Architectures
- Information flow unidirectional
- Static mapping: y = f(x)
- Multi-Layer Perceptron (MLP)
- Radial Basis Function (RBF)
- Kohonen Self-Organising Map (SOM)
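
The static mapping y = f(x) of a feedforward network, built from the "hard" and "soft" threshold neurons described earlier, can be sketched as follows. This is a minimal illustration rather than the slides' own formulation: the logistic sigmoid as the soft threshold, the bias terms, and the 2-2-1 weight values are all assumptions made for the example.

```python
import math

def hard_threshold(u, theta=0.0):
    """Binary neuron: response is 1 (on) once the stimulus reaches theta, else 0 (off)."""
    return 1.0 if u >= theta else 0.0

def soft_threshold(u):
    """Analog neuron: logistic sigmoid, assumed here as the smooth threshold."""
    return 1.0 / (1.0 + math.exp(-u))

def layer(x, weights, biases, activation):
    """Fully connected layer: neuron i receives the stimulus sum_j w_ij * x_j + b_i."""
    return [activation(sum(w_ij * x_j for w_ij, x_j in zip(w_i, x)) + b_i)
            for w_i, b_i in zip(weights, biases)]

def feedforward(x, layers, activation=soft_threshold):
    """Static mapping y = f(x): information flows one way, from input to output."""
    for weights, biases in layers:
        x = layer(x, weights, biases, activation)
    return x

# Hypothetical 2-2-1 network; the weights are arbitrary illustration values.
network = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer
    ([[1.0, 1.0]], [-1.0]),                    # output layer
]
y_soft = feedforward([0.5, -0.5], network)
y_hard = feedforward([0.5, -0.5], network, activation=hard_threshold)
```

The same `network` can be evaluated with either activation, which makes the difference between binary and analog neurons visible on identical weights.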

Recurrent ANN Architectures
- Feedback connections
- Dynamic memory: y(t+1) = f(x(τ), y(τ), s(τ)), τ ∈ (t, t-1, ...)
- Jordan/Elman ANNs
- Hopfield
- Adaptive Resonance Theory (ART)

History
- Early stages
  - 1943 McCulloch-Pitts: neuron as computing element
  - 1948 Wiener: cybernetics
  - 1949 Hebb: learning rule
  - 1958 Rosenblatt: perceptron
  - 1960 Widrow-Hoff: least mean square algorithm
- Recession
  - 1969 Minsky-Papert: limitations of the perceptron model
- Revival
  - 1982 Hopfield: recurrent network model
  - 1982 Kohonen: self-organizing maps
  - 1986 Rumelhart et al.: backpropagation

History
In the 1940s the psychologist McCulloch and the mathematician Pitts jointly proposed a neuron model with excitatory and inhibitory inputs, and Hebb proposed a rule for modifying the connection strength between neurons. Their results remain the basis of many neural network models today. Representative work of the 1950s and 1960s includes Rosenblatt's perceptron and Widrow's adaptive element, Adaline. In 1969 Minsky and Papert published the influential book Perceptrons, which reached pessimistic conclusions. Digital computers were also in their heyday at the time and were achieving notable results in artificial intelligence, so research on artificial neural networks went through a low period in the 1970s.
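
History (continued)
After the 1980s, conventional von Neumann digital computers ran into physically insurmountable limits in artificial intelligence tasks such as simulating vision and hearing. Meanwhile Rumelhart and McClelland, Hopfield, and others achieved breakthroughs in neural networks, and enthusiasm for the field surged again. Developments from this period include:
- Adaptive Resonance Theory (ART)
- Self-organizing feature map theory
- The Helmholtz machine, recently proposed by Hinton et al.
- The Ying-Yang machine theoretical model proposed by Lei Xu
- Methods based on statistical manifolds, pioneered and developed by Shun-ichi Amari (S. Amari), applied to artificial neural network research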

ANN Capabilities
- Learning
- Approximate reasoning
- Generalisation capability
- Noise filtering
- Parallel processing
- Distributed knowledge base
- Fault tolerance

Main Problems with ANN
- Knowledge base not transparent (black box) (partially resolved)
- Learning sometimes difficult/slow
- Limited storage capability

ANN Learning Paradigms
- Supervised learning
  - Classification
  - Control
  - Function approximation
  - Associative memory
- Unsupervised learning
  - Clustering
- Reinforcement learning
  - Control

Supervised Learning
- Teacher presents ANN with input-output pairs
- ANN weights adjusted according to error
- Iterative algorithms (e.g. Delta rule, BP rule)
- One-shot learning (Hopfield)
- Quality of training examples is critical

Linear Separability in Perceptrons
Presented by Martin Ho, Eddy Li, Eric Wong and Kitty Wong - Copyright 2000
[figure]

Learning Linearly Separable Functions (1)
What can perceptrons learn?
- Bad news: there are not many linearly separable functions.
- Good news: there is a perceptron algorithm that will learn any linearly separable function, given enough training examples.

Delta Rule
- a.k.a. Least Mean Squares
- Widrow-Hoff iterative delta rule (1960)
- Gradient descent of the error surface
- Guaranteed to find minimum error configuration in single layer ANNs
- Stochastic approximation of desired behaviour

Unsupervised Learning
- ANN adapts weights to cluster input data
- Hebbian learning
  - Connection stimulus-response strengthened (Hebbian)
- Competitive learning algorithms
  - Kohonen & ART
  - Input weights adjusted to resemble stimulus

Hebbian Learning
- Hebb postulate (1949)
- Correlation-based learning
- Connections between concurrently firing neurons are strengthened
- Experimentally verified (1973)
- l = learning coefficient
- wij = connection from neuron xj to yi
- Update rules shown on the slide (equations not preserved): general formulation; Hebb postulate; Kohonen & Grossberg (ART)

Learning principle for artificial neural networks
ENERGY MINIMIZATION
We need an appropriate definition of energy for artificial neural networks. Once we have it, we can use mathematical optimisation techniques to find how to change the weights of the synaptic connections between neurons.
ENERGY = measure of task performance error

Neural network mathematics
- Figure: inputs and output of a network
- Neural network: input / output transformation
- W is the matrix of all weight vectors.

MLP neural networks
- MLP = multi-layer perceptron
- Perceptron and MLP neural network: [equations not preserved]
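
The claim about linearly separable functions can be illustrated with a single perceptron. The sketch below uses the classic perceptron learning rule with a hard threshold at zero. The function names, the learning rate, and the choice of AND and XOR as test functions are assumptions made for the example and do not come from the slides.

```python
def predict(w, b, x):
    """Perceptron with a hard threshold at zero."""
    return 1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) + b > 0 else 0

def train_perceptron(samples, rate=1.0, max_epochs=100):
    """Perceptron learning rule: weights change only on misclassified samples."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(w, b, x)
            if error != 0:
                w = [w_i + rate * error * x_i for w_i, x_i in zip(w, x)]
                b += rate * error
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly
            break
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # linearly separable
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # not linearly separable

w_and, b_and = train_perceptron(AND)
and_outputs = [predict(w_and, b_and, x) for x, _ in AND]

w_xor, b_xor = train_perceptron(XOR)
xor_correct = sum(predict(w_xor, b_xor, x) == t for x, t in XOR)
```

On AND the loop stops as soon as an epoch passes without mistakes. On XOR no weight setting classifies all four samples, so training runs until `max_epochs` and at least one sample stays misclassified.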

RBF neural networks
- RBF = radial basis function
- Example: Gaussian RBF
- Figure: network with input x and output yout

Neural network tasks
- control
- classification
- prediction
- approximation
These can be reformulated in general as FUNCTION APPROXIMATION tasks.
Approximation: given a set of values of a function g(x), build a neural network that approximates the g(x) values for any input x.

Neural network approximation
Task specification:
- Data: set of value pairs (xt, yt), with yt = g(xt) + zt, where zt is random measurement noise.
- Objective: find a neural network that represents the input / output transformation (a function) F(x,W) such that F(x,W) approximates g(x) for every x.

Learning to approximate
- c is the learning parameter (usually a constant)
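
One common way to read "learning to approximate" is gradient descent on a squared error, with c as the step size. The sketch below assumes that reading, since the slide's own equations are not available. The target g, the linear form of F(x, W), the noise level, and the squared-error definition are all hypothetical choices made for illustration.

```python
import random

def g(x):
    """Hypothetical target function, chosen only for this illustration."""
    return 2.0 * x + 1.0

def F(x, W):
    """Network transformation F(x, W); here a single linear neuron with a bias."""
    return W[0] + W[1] * x

def learn(data, c=0.1, epochs=50):
    """Gradient descent on E = 1/2 * (F(x, W) - y)^2, one data pair at a time."""
    W = [0.0, 0.0]
    for _ in range(epochs):
        for x_t, y_t in data:
            err = F(x_t, W) - y_t   # dE/dF
            W[0] -= c * err         # dF/dW[0] = 1
            W[1] -= c * err * x_t   # dF/dW[1] = x
    return W

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
data = [(x_t, g(x_t) + rng.gauss(0.0, 0.05)) for x_t in xs]  # y_t = g(x_t) + z_t
W = learn(data)
```

With a small constant c the weights settle close to the intercept and slope of g, with a small residual jitter caused by the measurement noise z_t.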

Learning with a perceptron
- Perceptron, data, error and learning rule: [equations not preserved]
- A perceptron is able to learn a linear function.

Learning with RBF neural networks
- Only the synaptic weights of the output neuron are modified.
- An RBF neural network learns a nonlinear function.

Learning with MLP neural networks
- MLP neural network with p layers (layers 1, 2, ..., p-1, p; input x, output yout)
- Data and error: [equations not preserved]

Learning with backpropagation
Learning: apply the chain rule for differentiation:
- calculate first the changes for the synaptic weights of the output neuron;
- calculate the changes backward starting from layer p-1, and propagate backward the local error terms.
The method is still relatively complicated but it is much simpler than the original optimisation problem.
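
The backward pass described above can be sketched in plain Python. This is an illustrative version under stated assumptions, not the slides' exact formulation: sigmoid neurons in every layer, a squared-error measure, a bias weight stored as the last entry of each row, and the helper names are all choices made for the example.

```python
import math
import random

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def init_network(sizes, rng):
    """One weight matrix per layer; each row is a neuron's weights plus a trailing bias."""
    return [[[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(net, x):
    """Return the activations of every layer, starting with the input itself."""
    acts = [list(x)]
    for layer in net:
        a = acts[-1] + [1.0]  # the constant 1.0 feeds the bias weight
        acts.append([sigmoid(sum(w * v for w, v in zip(row, a))) for row in layer])
    return acts

def error(net, x, y):
    """E = 1/2 * sum_k (yout_k - y_k)^2 for one data pair."""
    out = forward(net, x)[-1]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, y))

def backprop(net, x, y):
    """Gradient of E with respect to every weight, via the chain rule."""
    acts = forward(net, x)
    # Local error terms of the output layer: (yout - y) * f'(u), with f' = a * (1 - a).
    delta = [(o - t) * o * (1.0 - o) for o, t in zip(acts[-1], y)]
    grads = [None] * len(net)
    for l in range(len(net) - 1, -1, -1):
        a_prev = acts[l] + [1.0]
        grads[l] = [[d * v for v in a_prev] for d in delta]
        if l > 0:
            # Propagate the local error terms backward to the previous layer.
            delta = [sum(net[l][i][j] * delta[i] for i in range(len(delta)))
                     * acts[l][j] * (1.0 - acts[l][j])
                     for j in range(len(acts[l]))]
    return grads

def train_step(net, x, y, c=0.1):
    """One gradient-descent update with learning parameter c (modifies net in place)."""
    for layer, g_layer in zip(net, backprop(net, x, y)):
        for row, g_row in zip(layer, g_layer):
            for j, g in enumerate(g_row):
                row[j] -= c * g

rng = random.Random(0)
net = init_network([2, 3, 1], rng)  # hypothetical 2-3-1 network
x, y = [0.0, 1.0], [1.0]
error_before = error(net, x, y)
```

A finite-difference comparison against `backprop` is a cheap way to check gradients of this kind before using them for training.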

Learning with general optimization
In general it is enough to have a single layer of nonlinear neurons in a neural network in order to learn to approximate a nonlinear function. In such a case general optimisation may be applied without too much difficulty.
Example: an MLP neural network with a single hidden layer: [equations not preserved]

New methods for learning with neural networks
- Bayesian learning: the distribution of the neural network parameters is learnt
- Support vector learning: the minimal representative subset of the available data is used to calculate the synaptic weights of the neurons

Reinforcement Learning
- Sequential tasks
- Desired action may not be known
- Critic evaluation of ANN behaviour
- Weights adjusted according to critic
- May require credit assignment
- Population-based learning
  - Evolutionary Algorithms
  - Swarming Techniques
  - Immune Networks

ANN Summary
[figure]

Neural Network Ensembles
In 1996 Sollich and Krogh defined a neural network ensemble as follows: "A neural network ensemble uses a finite number of neural networks to learn the same problem; the output of the ensemble for a given input example is jointly determined by the outputs of its constituent networks for that example."

ANN Application Areas
- Classification
- Clustering
- Associative memory
- Control
- Function approximation

ANN Classifier systems
- Learning capability
- Statistical classifier systems
- Data driven
- Generalisation capability
- Handle and filter large input data
- Reconstruct noisy and incomplete patterns
- Classification rules not transparent

Applications for ANN Classifiers
- Pattern recognition
- Industrial inspection
- Fault diagnosis
- Image recognition
- Target recognition
- Speech recognition
- Natural language processing
- Character recognition
- Handwriting recognition
- Automatic text-to-speech conversion

Clustering with ANNs
- Fast parallel distributed processing
- Handle large input information
- Robust to noise and incomplete patterns
- Data driven
- Plasticity/Adaptation
- Visualisation of results
- Accuracy sometimes poor

ANN Clustering Applications
- Natural language processing
- Document clustering
- Document retrieval
- Automatic query
- Image segmentation
- Data mining
- Data set partitioning
- Detection of emerging clusters
- Fuzzy partitioning
- Condition-action association

Associative ANN Memories
- Stimulus-response association
- Auto-associative memory
- Content addressable memory
- Fast parallel distributed processing
- Robust to noise and incomplete patterns
- Limited storage capability

Application of ANN Associative Memories
- Character recognition
- Handwriting recognition
- Noise filtering
- Data compression
- Information retrieval

ANN Control Systems
- Learning/adaptation capability
- Data driven
- Non-linear mapping
- Fast response
- Fault tolerance
- Generalisation capability
- Handle and filter large input data
- Reconstruct noisy and incomplete patterns
- Control rules not transparent
- Learning may be problematic

ANN Control Schemes
- ANN controller
- Conventional controller + ANN for unknown or non-linear dynamics
- Indirect control schemes
  - ANN models direct plant dynamics
  - ANN models inverse plant dynamics

ANN Control Applications
- Non-linear process control
- Chemical reaction control
- Industrial process control
- Water treatment
- Intensive care of patients
- Servo control
- Robot manipulators
- Autonomous vehicles
- Automotive control
- Dynamic system control
- Helicopter flight control
- Underwater robot control

ANN Function Modelling
- ANN as universal function approximator
- Dynamic system modelling
- Learning capability
- Data driven
- Non-linear mapping
- Generalisation capability
- Handle and filter large input data
- Reconstruct noisy and incomplete inputs
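
As a small illustration of an ANN used for function modelling, the sketch below fits a Gaussian RBF network to a nonlinear target by adjusting only the output weights, as the RBF learning slide describes. The target sin(x), the fixed centres and width, the learning rate, and the LMS-style update are assumptions chosen for the example.

```python
import math

def gaussian_rbf(x, centre, width):
    """Gaussian radial basis function centred at `centre`."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)

def hidden(x, centres, width):
    """Responses of the fixed hidden RBF units to the input x."""
    return [gaussian_rbf(x, c_k, width) for c_k in centres]

def rbf_output(x, centres, width, weights):
    """yout = sum_k w_k * phi_k(x)."""
    return sum(w * h for w, h in zip(weights, hidden(x, centres, width)))

def fit_output_weights(data, centres, width, rate=0.1, epochs=500):
    """Only the output weights are trained; centres and width stay fixed."""
    weights = [0.0] * len(centres)
    for _ in range(epochs):
        for x_t, y_t in data:
            h = hidden(x_t, centres, width)
            err = y_t - sum(w * h_k for w, h_k in zip(weights, h))
            weights = [w + rate * err * h_k for w, h_k in zip(weights, h)]
    return weights

def mse(data, centres, width, weights):
    """Mean squared error of the network on the data set."""
    return sum((rbf_output(x_t, centres, width, weights) - y_t) ** 2
               for x_t, y_t in data) / len(data)

# Hypothetical setup: 10 centres spread evenly over [0, 2*pi], target sin(x).
centres = [2.0 * math.pi * k / 9 for k in range(10)]
width = 0.7
data = [(x_t, math.sin(x_t)) for x_t in (2.0 * math.pi * t / 49 for t in range(50))]

mse_before = mse(data, centres, width, [0.0] * len(centres))
weights = fit_output_weights(data, centres, width)
mse_after = mse(data, centres, width, weights)
```

Because the hidden units are fixed, the output is linear in the trained weights, which keeps the learning problem simple even though the overall mapping is nonlinear.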