Introduction to Neural Networks
John Paxton, Montana State University, Summer 2003

Chapter 7: A Sampler Of Other Neural Nets
- Optimization Problems
- Common Extensions
- Adaptive Architectures
- Neocognitron

I. Optimization Problems
- Travelling Salesperson Problem
- Map coloring
- Job shop scheduling
- RNA secondary structure

Advantages of Neural Nets
- Can find near optimal solutions.
- Can handle weak (desirable, but not required) constraints.

TSP Topology
- Each row has 1 unit that is on.
- Each column has 1 unit that is on.
- (Figure: a grid of units with rows City A, City B, City C and columns 1st, 2nd, 3rd.)

Boltzmann Machine
- Hinton, Sejnowski (1983)
- Can be modelled using Markov chains
- Uses simulated annealing
- Each row is fully interconnected
- Each column is fully interconnected
Architecture
- u_i,j (city i in tour position j) is connected to u_k,j+1 with weight -d_i,k
- u_i,1 is connected to u_k,n with weight -d_i,k (the tour wraps around)
- (Figure: the grid of units U_1,1 ... U_n,n, with self-connection weight b and weight -p between units in the same row or column.)

Algorithm
1. Initialize weights b and p, with p > b and p greater than the greatest distance between cities. Initialize the temperature T. Initialize the activations of the units to random binary values.
2. While the stopping condition is false, do steps 3-8.
3. Do steps 4-7 n^2 times (1 epoch).
4. Choose i and j randomly, 1 <= i, j <= n. u_ij is the candidate to change state.
5. Compute the consensus change c = [1 - 2u_ij][b + Σ Σ u_km (-p)], where the double sum runs over the units u_km connected to u_ij (k ≠ i, m ≠ j).
6. Compute the probability of accepting the change: a = 1 / (1 + e^(-c/T)).
7. Accept the change if a random number in [0, 1) is less than a. If the change is accepted, u_ij = 1 - u_ij.
8. Adjust the temperature: T = 0.95T.

Stopping Condition
- No state change for a specified number of epochs.
- Temperature reaches a certain value.
- (A runnable sketch of steps 1-8 follows below.)
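Below is a minimal NumPy sketch of the full loop (steps 1-8), not from the slides: it assumes symmetric random distances under 1, takes b = 60, p = 70, and T(0) = 20 from the example slide that follows, and reads the -p and -d_i,k weights off the architecture slide. The helper name consensus_change is mine.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 10                                 # 10 cities, 10 tour positions
d = rng.random((n, n))                 # all distances less than 1
d = (d + d.T) / 2                      # symmetric distances (an assumption)
np.fill_diagonal(d, 0)
b, p, T = 60.0, 70.0, 20.0             # weights and T(0) from the example slide
u = np.ones((n, n), dtype=int)         # units on initially; rows = cities, columns = positions

def consensus_change(u, i, j):
    """c = [1 - 2u_ij] * (weighted input to u_ij, counting its self-connection b)."""
    row = u[i, :].sum() - u[i, j]      # other units in row i: weight -p
    col = u[:, j].sum() - u[i, j]      # other units in column j: weight -p
    nxt, prv = (j + 1) % n, (j - 1) % n    # adjacent tour positions (wrap around)
    dist = sum(d[i, k] * (u[k, nxt] + u[k, prv]) for k in range(n) if k != i)
    return (1 - 2 * u[i, j]) * (b - p * (row + col) - dist)

for epoch in range(200):               # the example needed 200 or fewer epochs
    changed = False
    for _ in range(n * n):             # steps 4-7, n^2 trials = 1 epoch
        i, j = rng.integers(n, size=2)
        c = consensus_change(u, i, j)
        a = 1.0 / (1.0 + np.exp(-np.clip(c / T, -500, 500)))  # step 6
        if rng.random() < a:           # step 7: accept the change probabilistically
            u[i, j] = 1 - u[i, j]
            changed = True
    T *= 0.95                          # step 8
    if not changed:                    # stopping condition: no state change this epoch
        break

print(u)                               # ideally one unit on per row and per column
```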

Example
- T(0) = 20
- units are on initially
- b = 60, p = 70
- 10 cities, all distances less than 1
- 200 or fewer epochs were needed to find a stable configuration in 100 random trials

Other Optimization Architectures
- Continuous Hopfield Net
- Gaussian Machine
- Cauchy Machine: adds noise to the input in an attempt to escape from local minima; a faster annealing schedule can be used as a consequence
II. Extensions
- Modified Hebbian Learning: find parameters for an optimal surface fit of the training patterns

Boltzmann Machine With Learning
- Add hidden units.
- The 2-1-2 net below could be used for simple encoding/decoding (data compression).
- (Figure: inputs x1, x2; hidden unit z1; outputs y1, y2.)

Simple Recurrent Net
- Learns sequential or time-varying patterns.
- Doesn't necessarily have a steady state output.
- Four kinds of units: input units, context units, hidden units, output units.

Architecture
- (Figure: input units x1 ... xn and context units c1 ... cp feed hidden units z1 ... zp, which feed output units y1 ... ym.)

Simple Recurrent Net
- f(c_i(t)) = f(z_i(t-1)), i.e. each context unit copies the previous activation of its hidden unit
- f(c_i(0)) = 0.5
- Can use backpropagation (see the sketch below).
- Can learn strings of characters.
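A minimal sketch of the forward pass, not from the slides: it assumes logistic activations (the deck does not fix f), uses the layer sizes of the finite state automaton example on the next slide, and stands in random weights for trained ones. The copy step c = z implements the context rule above.

```python
import numpy as np

rng = np.random.default_rng(1)

n_in, n_hidden, n_out = 4, 2, 4        # 4 x_i, 2 z_i (= 2 c_i), 4 y_i
W_xh = rng.normal(0, 0.5, (n_hidden, n_in))      # input -> hidden weights
W_ch = rng.normal(0, 0.5, (n_hidden, n_hidden))  # context -> hidden weights
W_hy = rng.normal(0, 0.5, (n_out, n_hidden))     # hidden -> output weights

def f(x):
    return 1.0 / (1.0 + np.exp(-x))    # logistic activation (an assumed choice of f)

def run(sequence):
    c = np.full(n_hidden, 0.5)         # f(c_i(0)) = 0.5
    outputs = []
    for x in sequence:                 # one input pattern per time step
        z = f(W_xh @ x + W_ch @ c)     # hidden units see the input and the context
        y = f(W_hy @ z)
        c = z                          # context copies hidden: f(c_i(t)) = f(z_i(t-1))
        outputs.append(y)
    return outputs

# e.g. a one-hot encoding of a four-symbol string
sequence = [np.eye(n_in)[k] for k in (0, 1, 2, 3)]
for y in run(sequence):
    print(np.round(y, 2))
```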

Example: Finite State Automaton
- 4 x_i, 4 y_i, 2 z_i, 2 c_i
- (Figure: an automaton with states BEGIN, A, B, END.)

Backpropagation In Time
- Rumelhart, Williams, Hinton (1986)
- Application: a simple shift register.
- (Figure: inputs x1, x2; hidden unit z1; outputs y1, y2 with targets x2, x1; two of the connections are fixed at weight 1.)

Backpropagation Training for Fully Recurrent Nets
- Adapts backpropagation to arbitrary connection patterns.

III. Adaptive Architectures
- Probabilistic Neural Net (Specht 1988)
- Cascade Correlation (Fahlman, Lebiere 1990)

Probabilistic Neural Net
- Builds its own architecture as training progresses.
- Chooses class A over class B if h_A c_A f_A(x) > h_B c_B f_B(x) (see the sketch below).
- c_A is the cost of classifying an example as belonging to A when it belongs to B.
- h_A is the a priori probability of an example belonging to class A.
- f_A(x) is the probability density function for class A; f_A(x) is learned by the net.
- z_A1: pattern unit; f_A: summation unit.
- (Figure: inputs x1 ... xn feed pattern units z_A1 ... z_Aj and z_B1 ... z_Bk, which feed summation units f_A and f_B and output unit y.)
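A sketch of the decision rule, assuming Gaussian (Parzen-window) pattern units to estimate f_A and f_B; the data, priors, costs, and smoothing width sigma are illustrative stand-ins.

```python
import numpy as np

def f_hat(x, patterns, sigma=0.5):
    """Density estimate for one class: each stored training pattern is a
    pattern unit; the summation unit averages their Gaussian responses."""
    sq = np.sum((patterns - x) ** 2, axis=1)
    return np.mean(np.exp(-sq / (2.0 * sigma ** 2)))

# illustrative training patterns for classes A and B
class_A = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3]])
class_B = np.array([[1.0, 1.0], [0.9, 1.2], [1.1, 0.8]])

h_A, h_B = 0.5, 0.5                    # a priori probabilities
c_A, c_B = 1.0, 1.0                    # misclassification costs

x = np.array([0.15, 0.2])
score_A = h_A * c_A * f_hat(x, class_A)
score_B = h_B * c_B * f_hat(x, class_B)
print("A" if score_A > score_B else "B")   # choose A iff h_A c_A f_A(x) > h_B c_B f_B(x)
```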

Cascade Correlation
- Builds its own architecture while training progresses.
- Tries to overcome the slow rate of convergence of other neural nets.
- Dynamically adds hidden units (as few as possible).
- Trains one layer at a time.

Cascade Correlation
- Stage 1: (Figure: inputs x0, x1, x2 connected directly to outputs y1, y2.)

Cascade Correlation
- Stage 2 (fix weights into z1): (Figure: hidden unit z1 is added between the inputs and the outputs.)

Cascade Correlation
- Stage 3 (fix weights into z2): (Figure: a second hidden unit z2 is added, receiving input from x0, x1, x2, and z1.)

Algorithm
1. Train stage 1. If the error is not acceptable, proceed.
2. Train stage 2. If the error is not acceptable, proceed.
3. Etc. (A simplified sketch follows below.)
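A highly simplified sketch of the staged idea, under several assumptions not in the slides: a least-squares linear output layer, a single tanh candidate unit per stage trained by plain gradient ascent on its correlation with the residual error, and a fixed training set (so a frozen unit's outputs can be cached as a feature column). Fahlman and Lebiere's candidate pools and quickprop updates are omitted.

```python
import numpy as np

rng = np.random.default_rng(2)

# toy task: XOR, which the stage-1 net (no hidden units) cannot fit
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0.0, 1.0, 1.0, 0.0])

def with_bias(F):
    return np.hstack([F, np.ones((len(F), 1))])

def train_outputs(F, t):
    """Train the output layer (least squares) on the current feature set."""
    w, *_ = np.linalg.lstsq(with_bias(F), t, rcond=None)
    return with_bias(F) @ w

def train_candidate(F, err, steps=3000, lr=0.1):
    """Gradient ascent on |correlation| between a tanh unit and the residual error."""
    Fb = with_bias(F)
    w = rng.normal(0, 1, Fb.shape[1])
    for _ in range(steps):
        v = np.tanh(Fb @ w)
        cv, ce = v - v.mean(), err - err.mean()
        grad = Fb.T @ (np.sign(cv @ ce) * ce * (1 - v ** 2))   # d|corr|/dw
        w += lr * grad
    return np.tanh(Fb @ w)

F = X.copy()                       # features: the inputs, plus frozen hidden units
for stage in range(1, 4):
    y = train_outputs(F, t)
    err = t - y
    print(f"stage {stage}: mse = {np.mean(err ** 2):.4f}")
    if np.mean(err ** 2) < 1e-3:
        break                      # error acceptable: stop adding units
    v = train_candidate(F, err)    # train a new hidden unit on the residual...
    F = np.hstack([F, v[:, None]]) # ...then freeze it: its output becomes a feature
```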

IV. Neocognitron
- Fukushima, Miyake, Ito (1983)
- Many layers, hierarchical.
- Very sparse and localized connections.
- Self organizing.
- Supervised learning, layer by layer.
- Recognizes handwritten 0, 1, 2, 3, ..., 9, regardless of position and style.

Architecture
- (Figure: alternating layers of S and C planes.)
- S layers respond to patterns.
- C layers combine results and use a larger field of view.
- For example, S11 responds to the 3x3 pattern (a horizontal bar; see the sketch below):
  0 0 0
  1 1 1
  0 0 0

Training
- Progresses layer by layer.
- S1 connections to C1 are fixed.
- C1 connections to S2 are adaptable.
- A V2 layer is introduced between C1 and S2; V2 is inhibitory.
- C1 to V2 connections are fixed.
- V2 to S2 connections are adaptable.
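A toy sketch of the S/C division of labor, assuming binary images and simple threshold S-cells; the inhibitory V-cells from the training slide are omitted. The 3x3 mask is the horizontal-bar feature from the slide, and the 2x2 C-cell field of view is an illustrative choice.

```python
import numpy as np

# the 3x3 feature from the slide: S11 responds to a horizontal bar
mask = np.array([[0, 0, 0],
                 [1, 1, 1],
                 [0, 0, 0]])

def s_plane(image, mask, threshold=3):
    """S-cells: each cell watches one 3x3 window of the input and fires
    when the window contains its fixed feature (threshold = mask pixel count)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2), dtype=int)
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = int(np.sum(image[i:i+3, j:j+3] * mask) >= threshold)
    return out

def c_plane(s):
    """C-cells: combine S-cell results over a larger (2x2) field of view,
    so the response tolerates small shifts in the feature's position."""
    h, w = s.shape
    return np.array([[s[i:i+2, j:j+2].max() for j in range(0, w - 1, 2)]
                     for i in range(0, h - 1, 2)])

img = np.zeros((7, 7), dtype=int)
img[3, 1:6] = 1                    # a horizontal stroke
s = s_plane(img, mask)
print(s)                           # S-cells fire where the bar appears
print(c_plane(s))                  # C-cells respond despite small position shifts
```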
