Slide No.: 1, Introduction to Clinical Terminology and Classification
Clinical Decision Support L4
AL Rector
OpenGALEN TopThing UK
The Medical Informatics Group, U of Manchester
www.cs.man.ac.uk/mig/galen
www.opengalen.org
rector@cs.man.ac.uk

Slide No.: 2, The Vision
[Diagram labels: Best Practice, Best Practice]

Slide No.: 3, OpenGALEN: Philosophy
Terminology is software
  Terminology is the interface between people and machines
  Re-use is the key
  Patient-centred information
Terminology must have a purpose
  Always ask: “What's it for?”
  Not art for art's sake
Terminology supports clinical applications - not vice versa
  Applications for someone to do something for somebody
  Keep the Horse before the Cart
Always ask:
  “How will we know if it works?”
  “How will we know if it fails?”

Slide No.: 4, OpenGALEN: Key ideas
Separation of kinds of knowledge
  Terminology, medical record and information system schemas
  Concepts, language, Coding, Indexing, Pragmatics
  Machine level, User level
Knowledge is fractal!
  There will always be more detail to be added
  Therefore terminologies must be extensible
Formal logical Support
  Too big and complicated to maintain by hand
  Extensibility requires rules
  Software needs logical rigour

Slide No.: 5, Axes for kinds of Knowledge
[Diagram axes: Machine level, Human level; Concepts, Language, Coding, Indexing, Pragmatics & User Interface; Terminology, Medical Records / Information systems]

Slide No.: 6, Uses of Terminology
Clinical
Epidemiology and quality assurance
  Reproducibility / Comparability
Indexing
Software
  Re-use!
Integration and messaging between systems
Authoring and configuring systems
Data capture and presentation (user interface)
Indexing information and knowledge (meta-data, The Web)

Slide No.: 7, History: Origins of existing terminologies
Epidemiology
  ICD - Farr in 1860s to ICD9 in 1979
    International reporting of morbidity/mortality
  ICPC - 1980s
    Clinically validated epidemiology in primary care
    Now expanded for use in Dutch GP software
Librarianship
  MeSH - NLM from around 1900 - Index Medicus
aimed at US insurance reimbursement

Slide No.: 8, Traditional Systems
Built by people for interpretation by people (Coding clerks)
  Most knowledge implicit in rubrics
  Must understand medicine to use intelligently
Not built for software
  On paper for use on paper
  Enumerated - top down, all possibilities listed
  Serial - Single use - Single View
Hierarchical Thesauri
  Traditional terminological techniques from librarianship
  Broader than / Narrower than (ISO 1087)
  No logical foundation
Focused on terms
  Language and concepts mixed
  Synonyms, preferred terms, etc. caused confusion

Slide No.: 9, History (2)
Pathology indexing
  SNOMED 1970s to 1990 (SNOMED International)
  First faceted or combinatorial system
  Topography, morphology, aetiology, function
  Plus diseases cross referenced to ICD9
Specialty Systems
  Mostly similar hierarchical systems
  ACR/NEMA/SDM - Radiology
  NANDA, ICNP - Nursing

Slide No.: 10, History (3)
Early computer systems
  Read I (4 digit Read)
  Aimed at saving space on early computers
    1-5 Mbyte / 10,000 patients
  Hierarchical, modelled on ICD9
  Detailed signs and symptoms for primary care
  Purchased by UK government in 1990
Single use
  Morbidity indexing
  Medical Entities Dictionary (MED) - Jim Cimino

Slide No.: 11, History (4)
Aspirations for electronic patient records (EPRs)
  Weed's Problem Oriented Medical Record
  Direct entry by health care professionals
Aspirations for decision support
  Ted Shortliffe (MYCIN), Clem McDonald (Computer based reminders), Perry Miller (Critiquing), ...
Aspirations for re-use
  Patient centred information
Needed common multi-use multi-purpose terminology
  None worked

Slide No.: 12, Summary of Changes at end of 1st Generation
From terminologies for people to terminologies for machines
From paper to software
From single use to multiple re-use for patient centred systems
From entry by coding clerks to direct entry by health care professionals
From pre-defined reporting for statistics to reliable indexing for decision support

Slide No.: 13, Problems with First Generation Enumerated Systems in coping with these changes

Slide No.: 14, Problems (1)
Scaling!
  More detail and more specialities required scaling up, but ...
The combinatorial explosion
  Example: Burns
    100 sites x 3 depths -> 404 codes
    x 5 subsites/site x chemical or thermal -> 7,272
    x 3 extents x 3 durations -> 116,352
The Persian chessboard
  2^64 ≈ 10^19
  10^19 grains of rice ≈ 100 billion tonnes of rice
  10^19 nanoseconds ≈ 300 years
Read II grew from 20,000 to 250,000 terms in 100 staff-years
  Still too small to be useful but too big to use

Slide No.: 15, Problems (2)
Information implicit in the rubrics
  “Hypertension excluding pregnancy”
  Computers can't read!
Invisible to software
  No explicit information except the hierarchy
Minimal support for software
  No opportunity to use software to help
Language and concepts confused
  Synonyms
  Preferred terms
  Homonyms
  Only simple look up and spelling correction

Slide No.: 16, Problems (3)
Mixed Organisation
  Heart diseases in 13 of 19 chapters of ICD
    Tumours, infections, congenital abnormalities, toxic, ...
  Steroids in five chapters of standard drug classifications
    Anti-inflammatories, anti-asthmatics, ...
Unreliable for indexing or abstractions
  How to say something about all heart diseases?
Fixed organisation
  Single hierarchy - Single use
  Where to put gout - arthritis or metabolic disease?
    Back and forth in each edition of ICD
No re-use

Slide No.: 17, Problems 3b Thesauri rather than Classifications
[Diagram labels: A Mixed Hierarchy; A correct kind-of (subsumption) hierarchy]

Slide No.: 18, Problems (4)
Semantic identifiers
  Codes really paths - moving a concept meant changing its code
    3 Cardiovascular disorders
    3.4 Disorders of Artery
    3.4.2 Disorders of coronary artery
    3.4.2.3 Coronary thrombosis
Easy to process but ...
  Reorganisation requires changing codes
  Codes cannot be permanent

Slide No.: 19, Problems (5)
Maintenance
  20 years from ICD9 to ICD10
  100 person-years from Read 1 to Read 3
  Mega francs/guilders/crowns/marks on European coding schemes
  Thousands of unpaid hours of committee time
Impossible / meaningless decisions take longest
  You can search forever for something that is not there
Multiple uses compete - Must choose one use
  Most successful were clear about their purpose - ICD, ICPC, MeSH
Codes change meaning with version changes
  Old data misleading!

Slide No.: 20, Problems (6)
Version specific artefacts
  “Not otherwise specified” (NOS)
    Used to move a general concept down
  “Not elsewhere classified” (NEC)
    Catch all - Nowhere else in coding system, e.g. “Tumour not elsewhere classified”
    Dependent on version
  “Other”
    Catch all - Not listed below, e.g. “Other diseases of the cardiovascular system”
    Dependent on version
  Not used consistently

Slide No.: 21, Problem (7): Language is slippery: Two hands or Four?

Slide No.: 22, Language/Concepts are slippery
Human cognition makes it look easy
  Logic fails to capture it
Classification is easy until you try to do it
  Trying since Aristotle in the West and Ancient Chinese in the East
Words/Concepts mean what a community decides they mean
  Does a chimpanzee have four hands?
  Is a prion alive?
  Is surgery on the ovary a kind of Endocrine surgery?
Easier to agree on the concrete than the abstract
  Easy to agree on useful abstractions and generalisations
  Harder to agree on how to name them

Slide No.: 23, Problems (8)
There is no re-use - there is no standard
  The grand challenge: A common controlled vocabulary for medicine
But re-use requires multiple different views
People's needs differ / People do and find different things
  By profession
    Doctors and specialties, nurses, physiotherapists, dentists
  By situation
    Inpatient, outpatient, primary care, community
  By task
    Diagnosis, management, prescribing, patient care, public health, quality assurance, management, planning
  By country and community
    US, UK, France, Germany,
    Japan, Korea, ...

Slide No.: 24, Summary of Problems 1st Generation Enumerated Systems
Enumerated Single Hierarchies
  List all possibilities in advance
  Cannot cope with fractal knowledge
Most knowledge implicit
  Invisible to software
Can't agree on common concepts and classification
  Unreliable for indexing
Difficult to use for healthcare professionals
  No support for user interface
Can't build and maintain big classifications
Language and concepts don't translate easily to logic and software

Slide No.: 25, Cimino's Desiderata (1)
Concept orientation
  Separate language (terms) and concepts (codes)
Concept permanence
  Never re-use a code (retire it)
Nonsemantic concept identifiers
  Separate the code from the path
Polyhierarchy
  Allow one concept to be classified in multiple ways
  Gout can be both a metabolic disease and an arthritis

Slide No.: 26, Cimino's Desiderata (2)
Formal Definitions
  i.e. Be compositional
Reject “Not elsewhere classified”
  Concept permanence and NEC
Multiple granularities
  Organ, tissue, cellular, molecular
  Grades, types, classes of diseases
  Special clinical criteria
Multiple consistent views
  Allow different organisations, e.g. functional, anatomical, pathological

Slide No.: 27, Cimino's Desiderata (3)
Represent context
  Family history, risk, source of information
Evolve gracefully
  Allow controlled changes
Recognise redundancy (equivalence)
  Carcinoma + Lung ?=? Carcinoma of the lung
  How would we know? How could a machine know?

Solution Generation 1: Megaterm + Crossmapping = UMLS
[Diagram labels: Clinical Applications, Medical Records, Data entry, Decision support]

Slide No.: 29, Solution 1 Cross-mapping & UMLS
Unified Medical Language System (UMLS) from US National Library of Medicine
  De facto common registry for vocabularies
  Concept Unique Identifiers (CUIs) and Lexical Unique Identifiers (LUIs) are de facto the common nomenclature

Slide No.: 30, Solution 1 Cross-mapping & UMLS
An invaluable resource, but ...
No better than the vocabularies which are mapped
  Limited detail for patient care
  Unreliable for indexing or abstraction of knowledge
  Best for relating everything to MeSH for indexing literature
Still limited by combinatorial explosion
Still can't cope with fractal knowledge
  Not extensible - no help in building or extending terminologies
  No help in reorganising existing terminologies to re-use for new purposes
  Top down
Information still implicit
  Minimal help with software
  No help with data capture, user interfaces

Slide No.: 31, Solutions Generations 2-3 Compositional Systems
Beat the combinatorial explosion
  Build concepts out of pieces - Lego
  Dictionary and grammar rather than phrasebook
But hard

Slide No.: 32, Solution Generation 1.5: Faceted
Faceted systems: SNOMED International
  Inflammation + Lung + Infection + Pneumococcus -> Pneumococcal pneumonia
Limit combinatorial explosion, but
  Rigid - a limited number of axes / facets / chapters
  Each facet has the problems of a first generation enumerated system
Much knowledge still implicit
  No way to know how identifiers relate
  No explicit relations, only +
  No way to recognise redundancy / equivalence
No help with data capture or user interface / No way to recognise nonsense
  Carcinoma + Hair + Donkey + Emotional ?
Still can't cope with fractal knowledge
  Limited extensibility: limited help with building, extending or reorganising
  Still Top Down

Slide No.: 33, Generation 2: Enumerated Compositional
Read III with qualifiers
  Inflammation: site: lung, cause: pneumococcus -> Pneumococcal Pneumonia
More semantics but
  Limited qualifiers - limited views - limited re-use
  Limited help with data capture - User interface difficult
  Much information still implicit - limited software support
  No way to recognise redundancy / equivalence / errors
  Organisation still mixed - indexing better but still unreliable
  Limited separation of language and concepts
Still can't cope with fractal knowledge
  Limited extensibility; limited help with building and reorganising terminologies
  Top down

Slide No.: 34, CT Vocabulary
“Reference Terminology” vs “Interface Terminologies”
  Reference terminology = enumerated hierarchy of formally defined terms
  Interface terminology = navigation structure for user interface
    Explicitly excluded from SNOMED-RT
“Terming”, “Coding”, and “Grouping”
  Terming - finding the lexical string
  Coding - finding the correct unique code (concept)
  Grouping - putting codes into groupers for epidemiological or other purposes

Slide No.: 35, Generation 2.5 Pre-coordinated Formal Compositions
SNOMED-RT (SNOMED-CT?)
  Formal logical model for classifying a fixed list of definitions
  Simple fixed ontology (7 links)
GALEN derived terminologies
  UK Drug Ontology
  Procedure classifications

Slide No.: 36, Generation 2.5 Pre-coordinated Formal Compositions: More semantics
Limited ability to cope with combinatorial explosion
  Any one pre-coordinated terminology of fixed size, but arbitrarily many terminologies might be derived
Limited ability to cope with fractal knowledge
  Limited extensibility
  Extensibility requires access to Workbench
  Bottom up / middle out
More explicit information
  Logical criteria for correctness / redundancy / equivalence
  Based on knowledge representation (ontologies) and description logics
Limited support for data capture and user interface

Slide No.: 37, Generation 3: Post-Coordinated Formal Concept Model with Constraints delivered as Software Services
OpenGALEN Reference Model - PEN&PAD/Clinergy
  Inflammation which hasCause (Infection which hasCause Pneumococcus)
    -> PneumococcalPneumonia
    -> “Pneumococcal Pneumonia”
A dictionary and grammar rather than a phrase book
Software rather than data
A sound logical and ontological foundation

Slide No.: 38, Generation 3: Post-Coordinated Formal Concept Models
Copes with combinatorial explosion
  Indefinitely many compositions possible
  Lists not pre-enumerated
Copes with fractal knowledge
  Easily extensible to add more detail
Most information explicit
  More comprehensive ontology (50-250 links)
Good support for data capture / user interface
  But requires additional pragmatic knowledge layer
Separates user view and machine view
  Intermediate representation vs GRAIL

Slide No.: 39, Case Study 1: The exploding bicycle
ICD-9 (E826)      8
READ-2 (T30)      81
READ-3            87
ICD-10 (V10-19)   587
V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on
outside of vehicle, nontraffic accident, while working for income
W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity
X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities

Slide No.: 40, Description Logics: A crash course
[Diagram labels: Thing; + feature: pathological; + (feature: pathological)]

Slide No.: 41, Defusing the exploding bicycle: 500 codes in pieces
10 things to hit
  Pedestrian / cycle / motorbike / car / HGV / train / unpowered vehicle / a tree / other
5 roles for the injured
  Driving / passenger / cyclist / getting in / other
5 activities when injured
  Resting / at work / sporting / at leisure / other
2 contexts
  In traffic / not in traffic
V12.24 Pedal cyclist injured in collision with two- or three-wheeled motor vehicle, unspecified pedal cyclist, nontraffic accident, while resting, sleeping, eating or engaging in other vital activities
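
The arithmetic behind Slide 41 can be checked with a short sketch. It uses only the axis sizes stated on the slide (10, 5, 5 and 2); the dictionary keys are illustrative labels, not names from any coding system.

```python
from math import prod

# Axis sizes as stated on Slide 41; the keys are illustrative labels only.
axes = {
    "thing_hit": 10,
    "role_of_injured": 5,
    "activity_when_injured": 5,
    "traffic_context": 2,
}

pieces = sum(axes.values())         # values a compositional system has to define
combinations = prod(axes.values())  # codes an enumerated system has to list

print(f"{pieces} pieces generate {combinations} combinations")
```

The point of the slide is the gap between the two numbers: the pieces grow by addition, while the enumerated codes grow by multiplication.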
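
The burns figures on Slide 14 (404, 7,272 and 116,352) are consistent with each axis carrying one extra "unspecified" value, so that 100 sites and 3 depths give (100 + 1) x (3 + 1) = 404. That reading is an inference from the numbers, not something the slide states. A minimal sketch under that assumption, with a check on the Persian chessboard magnitudes:

```python
from math import prod

def enumerated_codes(axis_sizes):
    """Codes needed if every combination is listed and each axis may be left unspecified."""
    return prod(size + 1 for size in axis_sizes)

# Axis sizes read off Slide 14; the "+1 per axis" rule is an assumption.
stages = [
    ("sites x depths", [100, 3]),
    ("+ subsites, chemical or thermal", [100, 3, 5, 2]),
    ("+ extents, durations", [100, 3, 5, 2, 3, 3]),
]
totals = [enumerated_codes(sizes) for _, sizes in stages]
for (label, _), total in zip(stages, totals):
    print(f"{label}: {total:,}")

# Persian chessboard: 2^64 is the figure quoted on the slide.
grains = 2 ** 64
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years_for_1e19_ns = 1e19 * 1e-9 / SECONDS_PER_YEAR
print(f"2^64 = {grains:.2e}")
print(f"10^19 nanoseconds is about {years_for_1e19_ns:.0f} years")
```

Each added axis multiplies the total, which is why a few more clinical details moved Read II from tens of thousands of terms to hundreds of thousands.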
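
The contrast between the path-style codes of Slide 18 and the nonsemantic identifiers and polyhierarchy of Slide 25 can be sketched as follows. The path codes are the ones on Slide 18; the `Concept` class and the identifiers `C001` to `C003` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field

# Path-style ("semantic") codes from Slide 18: the code is the position.
path_codes = {
    "3": "Cardiovascular disorders",
    "3.4": "Disorders of artery",
    "3.4.2": "Disorders of coronary artery",
    "3.4.2.3": "Coronary thrombosis",
}

def path_parent(code):
    """Recover the parent by trimming the last path segment."""
    head, _, _ = code.rpartition(".")
    return head or None

@dataclass
class Concept:
    concept_id: str   # opaque and permanent (hypothetical values below)
    term: str
    parents: set = field(default_factory=set)

concepts = {
    "C001": Concept("C001", "Metabolic disease"),
    "C002": Concept("C002", "Arthritis"),
    "C003": Concept("C003", "Gout", {"C001"}),
}

def ancestors(concept_id):
    """All concepts reachable by following parent links upwards."""
    found = set()
    stack = list(concepts[concept_id].parents)
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(concepts[current].parents)
    return found

# Reclassifying gout adds a parent link; the identifier "C003" is untouched.
concepts["C003"].parents.add("C002")
print(path_parent("3.4.2.3"), sorted(ancestors("C003")))
```

With path codes, moving "Coronary thrombosis" elsewhere would force a new code. With opaque identifiers, the hierarchy lives in the parent links, so gout can sit under both parents and old records keep their meaning.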
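
Slides 27, 32 and 37 ask how a machine could recognise equivalence and nonsense. The sketch below is a toy illustration only: it is not GRAIL and not a description-logic classifier. It shows the bare idea that explicit, order-independent criteria make two compositions comparable, and that a table of permitted links can reject combinations such as Carcinoma + Hair. The `hasCause` link is from Slide 37; `hasLocation`, the `which` helper and the `SANCTIONS` table are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Expr:
    base: str
    criteria: frozenset = frozenset()  # set of (attribute, Expr) pairs

def which(base, **criteria):
    """Build 'base which <attribute filler, ...>'; string fillers become atomic expressions."""
    pairs = frozenset(
        (attr, filler if isinstance(filler, Expr) else Expr(filler))
        for attr, filler in criteria.items()
    )
    return Expr(base, pairs)

# Hypothetical constraint table: which fillers each (base, attribute) pair may take.
SANCTIONS = {
    ("Inflammation", "hasLocation"): {"Lung"},
    ("Inflammation", "hasCause"): {"Infection"},
    ("Infection", "hasCause"): {"Pneumococcus"},
}

def is_sanctioned(expr):
    """True if every criterion, at every depth, appears in the constraint table."""
    return all(
        filler.base in SANCTIONS.get((expr.base, attr), set()) and is_sanctioned(filler)
        for attr, filler in expr.criteria
    )

a = which("Inflammation", hasLocation="Lung",
          hasCause=which("Infection", hasCause="Pneumococcus"))
b = which("Inflammation", hasCause=which("Infection", hasCause="Pneumococcus"),
          hasLocation="Lung")
nonsense = which("Carcinoma", hasLocation="Hair")

NAMED = {a: "PneumococcalPneumonia"}  # definition -> concept name

print(a == b, NAMED.get(b), is_sanctioned(a), is_sanctioned(nonsense))
```

Because the criteria are held in a set, the order in which they were written does not matter, which a "+"-only faceted code cannot guarantee. A real classifier would also infer subsumption between different definitions; this sketch only detects structurally identical ones.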