Automated Model-Building with TEXTAL
  Thomas R. Ioerger
  Department of Computer Science
  Texas A&M University

Automated model-building program
  - Can we automate the kind of visual processing of patterns that crystallographers use?
  - Intelligent methods to interpret density, despite noise
  - Exploit knowledge about typical protein structure
  - Focus on medium-resolution maps
      - optimized for 2.8 Å (actually, 2.6-3.2 Å is fine)
      - typical for MAD data (useful for high-throughput)
      - other programs exist for higher-res data (ARP/wARP)

Overview of TEXTAL
  Electron density map (not structure factors) -> TEXTAL -> Protein model (may need refinement)

Main Stages of TEXTAL
  electron density map -> CAPRA -> Cα chains -> LOOKUP -> model (initial coordinates) -> Post-processing routines -> model (final coordinates)
  - LOOKUP: build in side-chain and main-chain atoms locally around each Cα
  - Post-processing routines, example: real-space refinement
  - Reciprocal-space refinement/DM
  - Human Crystallographer (editing)

CAPRA: C-Alpha Pattern-Recognition Algorithm
  - tracing
  - linking
  - Neural network: estimates which pseudo-atoms are closest to true Cαs

Example of Cα-chains fit by CAPRA
  - % built: 84%
  - # chains: 2
  - lengths: 47, 88
  - RMSD: 0.82 Å
  - Rat α2 urinary protein (P. Adams)
  - data: 2.5 Å MR map generated at 2.8 Å

Stage 2: LOOKUP
  - LOOKUP is based on Pattern Recognition
      - Given a local (5 Å-spherical) region of density, have we seen a pattern like this before (in another map)?
      - If so, use similar atomic coordinates.
  - Use a database of maps with known structures
      - 200 proteins from PDB-Select (non-redundant)
      - back-transformed (calculated) maps at 2.8 Å (no noise)
      - regions centered on 50,000 Cαs
  - Use feature extraction to match regions efficiently
      - features (e.g. moments) represent local density patterns
      - features must be rotation-invariant (independent of 3D orientation)
      - use density correlation for more precise evaluation

Examples of Numeric Density Features
  - Distance from center-of-sphere to center-of-mass
  - Moments of inertia: relative dispersion along orthogonal axes
  - Geometric features like "Spoke angles"
  - Local variance and other statistics
  TEXTAL uses 19 distinct numeric features to represent the pattern of density in a region, each calculated over 4 different radii, for a total of 76 features.

The LOOKUP Process
  [Diagram labels: F= (four feature vectors); Database of known maps; Region in map to be interpreted; Find optimal rotation]

Stage 3: Post-Processing

Interfaces for Using TEXTAL
  - Stand-alone commands and scripts
      - capra-scale prot.xplor prot-scaled.xplor
      - neotex.sh myprotein textal.log
      - lots of intermediate files and logs
  - WINTEX: Tcl/Tk interface
      - creates jobs in sub-directories
      - Public Release: July 2004
      - http://textal.tamu.edu:12321
  - Integrated into Phenix
      - http://phenix-online.org
      - Python module
      - model-building tasks in GUI

Gallery of Examples

Conclusions
  - Pattern recognition is a successful technique for macromolecular model-building
  - Future directions:
      - building ligands, co-factors, etc.
      - recognizing disulfide bridges
      - phase improvement (iterating with refinement)
      - loop-building
      - further integration with Phenix
      - Intelligent Agent-based methods for guiding/automating model-building
      - interactive graphics for specialized needs (e.g. fixing chains, editing identities)

Acknowledgements
  - Funding: National Institutes of Health
  - People: James C. Sacchettini, Kevin Childs, Kreshna Gopal, Lalji Kanbi, Erik McKee, Reetal Pai, Tod Romo
  - Our association with the PHENIX group:
      - Paul Adams (Lawrence Berkeley National Lab)
      - Randy Read (Cambridge University)
      - Tom Terwilliger (Los Alamos National Lab)
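
Appendix: sketch of rotation-invariant feature matching
  The feature-based matching described under Stage 2: LOOKUP can be illustrated with a small, hedged sketch. It is not TEXTAL's code: it computes only three simple rotation-invariant features (distance from sphere center to center of mass, density-weighted mean squared distance from the center, and local variance) over four radii, giving 12 numbers in the same spirit as the 19 x 4 = 76 features on the slides. The radii, function names, and sample data are all assumptions made for illustration, and the density-correlation step used for more precise evaluation is omitted.

```python
import math

# Hypothetical radii in Angstroms; the slides only say "4 different radii".
RADII = (2.0, 3.0, 4.0, 5.0)


def region_features(points, radius):
    """Return three rotation-invariant features for one radius.

    `points` is a list of (x, y, z, rho) density samples whose coordinates
    are relative to the center of the region (e.g. a candidate C-alpha).
    """
    inside = [p for p in points
              if p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= radius ** 2]
    total = sum(p[3] for p in inside)
    if not inside or total == 0:
        return [0.0, 0.0, 0.0]
    # Distance from center-of-sphere (the origin) to the center of mass.
    cx = sum(p[0] * p[3] for p in inside) / total
    cy = sum(p[1] * p[3] for p in inside) / total
    cz = sum(p[2] * p[3] for p in inside) / total
    com_dist = math.sqrt(cx ** 2 + cy ** 2 + cz ** 2)
    # Density-weighted mean squared distance from the center
    # (a simple stand-in for the moment-of-inertia features).
    spread = sum(p[3] * (p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
                 for p in inside) / total
    # Local variance of the density values.
    mean = total / len(inside)
    variance = sum((p[3] - mean) ** 2 for p in inside) / len(inside)
    return [com_dist, spread, variance]


def feature_vector(points):
    """Concatenate the per-radius features into one vector."""
    vec = []
    for radius in RADII:
        vec.extend(region_features(points, radius))
    return vec


def feature_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_matches(query, database, k=1):
    """Return the names of the k database regions closest to `query`."""
    ranked = sorted(database,
                    key=lambda name: feature_distance(query, database[name]))
    return ranked[:k]


def rotate_z(points, angle):
    """Rotate samples about the z axis (used to check rotation invariance)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z, rho) for x, y, z, rho in points]


# Made-up sample regions, purely for demonstration.
region_a = [(1.0, 0.5, 0.2, 1.0), (-1.5, 2.0, 0.3, 0.8),
            (0.4, -2.5, 1.5, 0.6), (3.0, 1.0, -2.0, 0.4),
            (0.1, 0.1, 0.1, 1.2)]
region_b = [(2.5, 0.0, 0.0, 1.0), (0.0, 1.0, 1.0, 0.5),
            (-0.5, -0.5, 3.0, 0.9)]
database = {"a": feature_vector(region_a), "b": feature_vector(region_b)}

# A rotated copy of region_a should still match "a" in the database.
query = feature_vector(rotate_z(region_a, 1.1))
print(best_matches(query, database))
```

  Because every feature depends only on distances from the center and on density values, a region and its rotated copy produce the same vector, so candidate matches can be found before any orientation search. Finding the optimal rotation and scoring by density correlation would only be needed for the top-ranked candidates.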