Introduction to Neural Networks in Medical Diagnosis
Włodzisław Duch
Dept. of Informatics, Nicholas Copernicus University, Toruń, Poland

What is it about?
Data is precious! But also overwhelming.
Statistical methods are important, but new techniques may frequently be more accurate and give more insight
into the data.
Data analysis requires intelligence.
Inspirations come from many sources, including biology: artificial neural networks, evolutionary computing, immune systems ...

Computational Intelligence
[Diagram labels: Computational Intelligence; Data + Knowledge; Artificial Intelligence]

What do these methods do?
Provide non-parametric models of data.
Allow new data to be classified into pre-defined categories, supporting diagnosis and prognosis.
Allow new categories to be discovered.
Help to understand the data by creating fuzzy or crisp logical rules.
Help to visualize multi-dimensional relationships among data samples.
Help to model real neural networks!

GhostMiner Philosophy
There is no free lunch: provide different types of tools for knowledge discovery (decision tree, neural, neurofuzzy, similarity-based, committees).
Provide tools for visualization of data.
Support the process of knowledge discovery/model building
and evaluating, organizing it into projects.
GhostMiner: data mining tools from our lab.
Separate the process of model building and knowledge discovery from model use => GhostMiner Developer & GhostMiner Analyzer.

Neural networks
Inspired by neurobiology: simple elements cooperate, changing internal parameters.
Large field, dozens of different models, over 500 papers on NN in medicine each year.
Supervised networks: heteroassociative mapping X -> Y, symptoms -> diseases, universal approximators.
Unsupervised networks: clusterization, competitive learning, autoassociation.
Reinforcement learning: modeling behavior, playing games, sequential data.

Supervised learning
Compare the desired with the achieved outputs: you can't always get what you want.

Unsupervised learning
Find interesting structures in data.

Reinforcement learning
Reward comes after the sequence of actions.

Real and artificial neurons
[Figure labels. Real neuron: synapses, axon, dendrites. Artificial network: synapses (weights), nodes (artificial neurons), signals.]

Neural network for MI diagnosis
[Diagram labels: inputs Sex, Age, Smoking, ECG: ST elevation, Pain duration; input weights; output weights; output p(MI|X) for Myocardial Infarction, shown as 0.7.]

MI network function
Training: setting the values of weights and thresholds; efficient algorithms exist.
Effect: non-linear regression function.
Such networks are universal approximators: they may learn any mapping X -> Y.

Learning dynamics
Decision regions shown every 200 training epochs in x3, x4 coordinates; borders are optimally placed with wide margins.

Neurofuzzy systems
Feature Space Mapping (FSM) neurofuzzy system.
Neural adaptation, estimation of the probability density function (PDF) using a single hidden layer network (RBF-like) with nodes realizing separable functions.
Fuzzy: m(x) = 0 or 1 (no/yes) is replaced by a degree m(x) in [0,1].
Triangular, trapezoidal, Gaussian ... membership functions (MF).
Membership functions in many dimensions: [figure]

Knowledge from networks
Simplify networks: force most weights to 0, quantize remaining parameters, be constructive!
Regularization: mathematical technique improving predictive abilities of the network.
Result: MLP2LN neural networks that are equivalent to logical rules.

MLP2LN
Converts MLP neural networks into a network performing logical operations (LN).
[Diagram labels: input layer; aggregation: better features; output: one node per class; rule units: threshold logic; linguistic units: windows, filters.]

Recurrence of breast cancer
Data from: Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia.
286 cases: 201 no-recurrence (70.3%), 85 recurrence (29.7%).
Example record: no-recurrence-events, 40-49, premeno, 25-29, 0-2, ?, 2, left, right_low, yes
9 nominal features: age (9 bins), menopause, tumor-size (12 bins), nodes involved (13 bins), node-caps, degree-malignant (1, 2, 3), breast, breast quad, radiation.

Recurrence of breast cancer
Many systems used, 65-78% accuracy reported.
Single rule: IF (nodes-involved not in [0,2] AND degree-malignant = 3) THEN recurrence, ELSE no-recurrence.
76.2% accuracy, only
trivial knowledge in the data: highly malignant breast cancer involving many nodes is likely to strike back.

Recurrence - comparison.
Method: 10xCV accuracy (%)
MLP2LN, 1 rule: 76.2
SSV DT, stable rules: 75.7 ± 1.0
k-NN, k=10, Canberra: 74.1 ± 1.2
MLP+backprop.: 73.5 ± 9.4 (Zarndt)
CART DT: 71.4 ± 5.0 (Zarndt)
FSM, Gaussian nodes: 71.7 ± 6.8
Naive Bayes: 69.3 ± 10.0 (Zarndt)
Other decision trees: 70.0

Breast cancer diagnosis.
Data from University of Wisconsin Hospital, Madison, collected by Dr. W.H. Wolberg.
699 cases, 9 features quantized from 1 to 10: clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, mitoses.
Task: distinguish benign from malignant cases.

Breast cancer rules.
Simplest rule from MLP2LN, large regularization: IF uniformity of cell size is below a threshold of 3 THEN benign ELSE malignant.
Sensitivity = 0.97, Specificity = 0.85.
More complex NN solutions, from 10CV estimate: Sensitivity = 0.98, Specificity = 0.94.

Breast cancer comparison.
Method: 10xCV accuracy (%)
k-NN, k=3, Manh: 97.0 ± 2.1 (GM)
FSM, neurofuzzy: 96.9 ± 1.4 (GM)
Fisher LDA: 96.8
MLP+backprop.: 96.7 (Ster, Dobnikar)
LVQ: 96.6 (Ster, Dobnikar)
IncNet (neural): 96.4 ± 2.1 (GM)
Naive Bayes: 96.4
SSV DT, 3 crisp rules: 96.0 ± 2.9 (GM)
LDA (linear discriminant): 96.0
Various decision trees: 93.5-95.6

Melanoma skin cancer
Collected in the Outpatient Center of Dermatology in Rzeszów, Poland.
Four types of Melanoma: benign, blue, suspicious, or malignant.
250 cases, with almost equal class distribution.
Each record in the database has 13 attributes: asymmetry, border, color (6), diversity (5).
TDS (Total Dermatoscopy Score): single index.
Goal: hardware scanner for preliminary diagnosis.

Melanoma results
Method: number of rules; training %; test %
MLP2LN, crisp rules: 4; 98.0 (all); 100
SSV Tree, crisp rules: 4; 97.5 ± 0.3; 100
FSM, rectangular f.: 7; 95.5 ± 1.0; 100
knn + prototype selection: 13; 97.5 ± 0.0; 100
FSM, Gaussian f.: 15; 93.7 ± 1.0; 95 ± 3.6
knn k=1, Manh, 2 features: -; 97.4 ± 0.3; 100
LERS, rough rules: 21; -; 96.2

Antibiotic activity of pyrimidine compounds.
Pyrimidines: which compound has stronger antibiotic activity?
Common template, substitutions added at 3 positions, R3, R4 and R5.
27 features taken into account: polarity, size, hydrogen-bond donor or acceptor, pi-donor or acceptor, polarizability, sigma effect.
Pairs of chemicals (54 features) are compared: which one has higher activity?
2788 cases, 5-fold crossvalidation tests.

Antibiotic activity - results.
Mean Spearman's rank correlation coefficient used: -1 <= rs <= +1
Method: rank correlation
FSM, 41 Gaussian rules: 0.77 ± 0.03
Golem (ILP): 0.68
Linear regression: 0.65
CART (decision tree): 0.50

Thyroid screening.
Garavan Institute, Sydney, Australia.
15 binary, 6 continuous features.
Training: 93+191+3488 cases. Validate: 73+177+3178 cases.
Determine important clinical factors.
Calculate prob. of each diagnosis.

Thyroid: some results.
Accuracy of diagnoses obtained with different systems.
Method: rules/features; training %; test %
MLP2LN optimized: 4/6; 99.9; 99.36
CART/SSV Decision Trees: 3/5; 99.8; 99.33
Best Backprop MLP: -/21; 100; 98.5
Naive Bayes: -/-; 97.0; 96.1
k-nearest neighbors: -/-; -; 93.8

Psychometry
MMPI (Minnesota Multiphasic Personality Inventory) psychometric test.
Printed forms are scanned, or a computerized version of the test is used.
Raw data: 550 questions, e.g. "I am getting tired quickly": Yes - Don't know - No.
Results are combined into 10 clinical scales and 4 validity scales using fixed coefficients.
Each scale measures tendencies towards hypochondria, schizophrenia, psychopathic deviations, depression, hysteria, paranoia etc.
[Figures: scanned form; computer input; scales]

Psychometry
There is no simple correlation between single values and final diagnosis.
Results are displayed in the form of a histogram, called a psychogram.
Interpretation depends on the experience and skill of an expert and takes into account correlations between peaks.
Goal: an expert system providing evaluation and interpretation of MMPI tests at an expert level.
Problem: agreement between experts only 70% of the time; alternative diagnoses and personality changes over time are important.
[Figure: psychogram]

Psychometric data
1600 cases for women, same number for men.
27 classes: norm, psychopathic, schizophrenia, paranoia, neurosis, mania, simulation, alcoholism, drug addiction, criminal tendencies, abnormal behavior due to ...
Extraction of logical rules: 14 scales = features.
Define linguistic variables and use FSM, MLP2LN, SSV, giving about 2-3 rules/class.

Psychometric data
10-CV accuracy for FSM is 82-85%, for C4.5 is 79-84%.
Input uncertainty +Gx around 1.5% (best ROC) improves FSM results to 90-92%.

Psychometric Expert
Probabilities for different classes. For greater uncertainties more classes are predicted.
Fitting the rules to the conditions: typically 3-5 conditions per rule; Gaussian distributions around measured values that fall into the rule interval are shown in green.
Verbal interpretation of each case, rule and scale dependent.

[Screenshot slides: MMPI probabilities; MMPI rules; MMPI verbal comments]

Visualization
Probability of classes versus input uncertainty.
Detailed input probabilities around the measured values vs. change in the single scale; changes over time define the patient's trajectory.
Interactive multidimensional scaling: zooming on the new case to inspect its similarity to other cases.
[Figure slides: Class probability/uncertainty; Class probability/feature; MDS visualization]

Summary
Neural networks and other computational intelligence methods are useful additions to the multivariate statistical tools.
They support diagnosis, predictions, and data understanding: extracting rules, prototypes.
FDA has approved many devices that use ANNs: Oxford Instruments Ltd EEG analyzer, Cardionetics (UK) ECG analyzer, PAPNET (NSI) analysis of Pap smears ...

Challenges
Discovery of theories rather than data models.
Integration with image/signal analysis.
Integration with reasoning in complex domains.
Combining expert systems with neural networks.
Fully automatic universal data analysis systems: press the button and wait for the truth ...
We are slowly getting there. More & more computational intelligence tools (including our own) are available.
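
Sketch for the "MI network function" slide. The slide describes the network output as a non-linear function fixed by weights and thresholds. The code below is a minimal illustration of that idea, assuming one hidden layer of logistic units. The input encoding and every weight and threshold are made-up values chosen only to show the computation; they are not the trained network from the talk.

```python
import math

def sigmoid(a):
    """Logistic squashing function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def mlp_output(x, hidden_weights, hidden_thresholds, output_weights, output_threshold):
    """One-hidden-layer network:
    p = sigmoid(sum_j v_j * sigmoid(w_j . x - t_j) - t_out)."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) - threshold)
        for weights, threshold in zip(hidden_weights, hidden_thresholds)
    ]
    return sigmoid(sum(v * h for v, h in zip(output_weights, hidden)) - output_threshold)

# Hypothetical encoded inputs: sex, scaled age, smoking, ST elevation, scaled pain duration
x = [1.0, 0.6, 1.0, 1.0, 0.4]
hidden_weights = [[0.5, 1.2, 0.8, 2.0, 0.7], [-0.3, 0.4, 0.1, 1.5, 0.9]]  # illustrative only
hidden_thresholds = [1.0, 0.5]
output_weights = [2.0, 1.5]
output_threshold = 1.2

p_mi = mlp_output(x, hidden_weights, hidden_thresholds, output_weights, output_threshold)
print(f"p(MI|X) = {p_mi:.2f}")
```

Training would adjust the weights and thresholds so that outputs match known diagnoses; the sketch covers only the forward computation.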
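
Sketch for the "Neurofuzzy systems" slide. The slide names triangular, trapezoidal and Gaussian membership functions and nodes realizing separable functions. The code below is one possible reading of that, assuming a separable node is a product of one-dimensional membership values, one per feature. The function names and all parameter values are hypothetical, and this is not the FSM implementation.

```python
import math

def triangular(x, left, peak, right):
    """Triangular membership (requires left < peak < right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def gaussian(x, center, width):
    """Gaussian membership with maximum 1 at the center."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def separable_node(xs, memberships):
    """Product of one-dimensional membership values, one factor per feature."""
    result = 1.0
    for x, mf in zip(xs, memberships):
        result *= mf(x)
    return result

# Hypothetical two-feature node: Gaussian in feature 1, trapezoid in feature 2
node = [lambda v: gaussian(v, 5.0, 1.0), lambda v: trapezoidal(v, 0.0, 1.0, 3.0, 4.0)]
print(separable_node([5.0, 2.0], node))
```

With rectangular instead of smooth factors, the same product reduces to a crisp conjunction of interval conditions, which is how such nodes relate to logical rules.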
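
Sketch for the "Breast cancer rules" slide. The slide reports sensitivity and specificity for a single-threshold rule on uniformity of cell size. The code below shows how such figures are computed for a crisp rule. The comparison operator is not legible in the slide text, so the strict `<` used here is an assumption, and the records are made-up toy values, not the Wisconsin data.

```python
def rule_predict(cell_size_uniformity, threshold=3):
    """Crisp one-condition rule; the strict '<' comparison is an assumption."""
    return "benign" if cell_size_uniformity < threshold else "malignant"

def sensitivity_specificity(y_true, y_pred, positive="malignant"):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up toy records: (uniformity of cell size, true class)
records = [(1, "benign"), (2, "benign"), (4, "benign"), (1, "benign"),
           (8, "malignant"), (10, "malignant"), (2, "malignant"), (5, "malignant")]
y_true = [label for _, label in records]
y_pred = [rule_predict(value) for value, _ in records]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Treating malignant as the positive class means sensitivity counts malignant cases caught by the rule, which is usually the quantity of clinical interest.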
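
Sketch for the "Psychometric Expert" slide. The slide describes Gaussian distributions around measured values and the part of each distribution that falls into a rule interval. The code below is one way to turn that into a number, assuming independent Gaussian uncertainties per scale and taking the rule probability as the product over its conditions. Both assumptions, the scale names and the intervals are hypothetical; the system shown in the talk may compute this differently.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_probability(measured, sigma, low, high):
    """Probability mass of N(measured, sigma^2) lying inside [low, high]."""
    return normal_cdf((high - measured) / sigma) - normal_cdf((low - measured) / sigma)

def rule_probability(case, sigma, conditions):
    """Product of per-condition probabilities (assumes independent uncertainties)."""
    p = 1.0
    for scale, (low, high) in conditions.items():
        p *= interval_probability(case[scale], sigma, low, high)
    return p

# Hypothetical case and rule: scale names and intervals are illustrative only
case = {"scale_A": 72.0, "scale_B": 55.0}
rule = {"scale_A": (70.0, 90.0), "scale_B": (40.0, 60.0)}

for sigma in (0.5, 2.0, 5.0):
    print(sigma, round(rule_probability(case, sigma, rule), 3))
```

As the assumed uncertainty grows, the probability of a rule whose conditions are only just satisfied drops and neighbouring rules gain probability, consistent with the slide's remark that more classes are predicted for greater uncertainties.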