Frank Delaglio: Agilent Technologies
Frank Delaglio knew he wanted a career in biomedical research at age 7, in 1968, when he saw his baby brother in an incubator being prepared for open heart surgery. Today, he is one of the go-to software experts in nuclear magnetic resonance (NMR), having designed or contributed significantly to the field's key software tools, such as NMRPipe and TALOS. But the path he took to get to this point — and to the point of having a direct impact on biomedicine — was circuitous and long, driven in equal parts by luck and preparation.
Delaglio landed his first job in a crystallography lab as a chemistry undergraduate at Syracuse University because he had some experience developing film. In that lab, he ended up producing software for analyzing small angle X-ray scattering data and published five papers, one as first author, on numerical methods and mathematical modeling.
What he did not produce, however, was a degree. Instead, Delaglio left the lab for a full-time software job in a newly established and very well funded NMR lab. At that time, in the early 1980s, two-dimensional NMR was just emerging, as were minicomputers, so the demand for numerical analysis software for NMR was on the rise. "It seemed that every week people were designing new NMR experiments that revealed something different about molecular structure," he says. "It was pretty exciting."
That laboratory's software work spun off as a startup, taking Delaglio with it to help produce and commercialize their 2D NMR product called NMR2. Delaglio looks back on these years spent working outside of the realm of academic science as tremendously valuable. "I constantly had to demonstrate my work to convince people it had value and I had to learn to communicate with clients to figure out how to serve them," he says.
In fact, that ability to communicate helped Delaglio score his next position, a coveted role on staff at the National Institutes of Health in the lab of Ad Bax, an NMR pioneer. During a commercial visit to the Bax lab, Delaglio impressed them so much that they hired him. "It's one of the moments of my career that I'm most proud of," he says.
At the NIH, Delaglio spent his time designing a new kind of software to support NMR. "I needed to make something flexible and extensible, with a clearly understandable framework," says Delaglio, who settled on a modular design built on the concept of UNIX pipes, aptly named NMRPipe and first released in 1995.
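The pipe concept is simple to see in miniature: each stage takes a spectrum in, transforms it, and hands it on, so stages can be recombined freely. The following Python sketch is purely illustrative; the function names, parameters, and toy signal are invented here and are not NMRPipe's actual interface.

```python
import numpy as np

def apodize(fid):
    """Multiply the free-induction decay by a sine-bell window."""
    window = np.sin(np.linspace(0.05, np.pi - 0.05, len(fid))) ** 2
    return fid * window

def zero_fill(fid):
    """Double the length with zeros to improve digital resolution."""
    return np.concatenate([fid, np.zeros(len(fid), dtype=fid.dtype)])

def fourier_transform(fid):
    """Convert the time-domain signal into a frequency spectrum."""
    return np.fft.fftshift(np.fft.fft(fid))

def pipeline(data, *stages):
    """Chain stages the way a UNIX shell chains commands with '|'."""
    for stage in stages:
        data = stage(data)
    return data

# A toy decaying sinusoid standing in for a recorded FID.
t = np.arange(512)
fid = np.exp(2j * np.pi * 0.1 * t) * np.exp(-t / 100)
spectrum = pipeline(fid, apodize, zero_fill, fourier_transform)
```

The design payoff is the same as with shell pipes: adding a new processing function requires no changes to any existing stage.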
During this time, Delaglio was living the life of an academic, yet he'd never finished his degrees. When Professor Yuji Kobayashi of Osaka University heard this news — while sitting outdoors in a hot tub at a scientific conference, as Delaglio recalls — he exclaimed: "Frank-san, come have a PhD in my lab!"
A year and a half later, at the age of 40, Delaglio was awarded his bachelor's degree from Syracuse and his PhD from Osaka. "These things, they just fell into my lap," he says, not just of the degrees, but of his entire career.
Today, Delaglio works for Agilent Technologies designing NMR solutions for research. "I can see all the parts of NMR spectroscopy translated into the practice of clinical medicine," he says. "So my dream career only took 30 years to actually happen."
- Elizabeth Dougherty
Published February 24, 2014
Graeme Winter, author of the xia2 X-ray crystallography data processing software, got his start programming during a stint as an astrophysics graduate student working on software to simulate galaxies. He left astrophysics behind, leveraging his newly minted programming skills to land a job as a programmer in crystallography at the Medical Research Council's Laboratory of Molecular Biology in Cambridge, UK.
Crystallography stuck. It wasn't the programming that hooked him, but the mathematics. "Each step in the crystallographic process involves several different areas of mathematics, so I quite enjoyed that aspect of it," says Winter, who is now a scientist at the UK's Diamond Light Source.
The idea for xia2, originally xia, short for Crystallographic Infrastructure for Automation, came to Winter while working at LMB on a graphical user interface to guide users through data processing. "I realized people often run the same sequence," he says. "Why not have a program that does it for them?"
In 2002, Winter moved to the Daresbury Laboratory, former home of the UK synchrotron, and began work on xia to automate the invocation of existing data processing programs, such as XDS, Mosflm, and programs from the CCP4 suite. "I quickly reached a dead end because I'd never designed it," he says. "I'd just started writing scripts."
He started over by taking a big step back. He picked the brains of other crystallographers and processed hundreds of data sets himself to determine the decisions involved in processing raw diffraction data. The result was xia2, this time designed as an expert system to navigate the data processing decisions. "Basically, it understands data processing just about as well as I do," he says.
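The expert-system idea can be sketched in a few lines: encode the decisions a crystallographer would make as explicit, testable rules that the program walks through automatically. The rule and thresholds below are invented for illustration and are not xia2's actual logic.

```python
def assess_processing(r_merge, completeness):
    """Walk through a (toy) decision a crystallographer might make."""
    if r_merge > 0.15:
        return "symmetry suspect: retry indexing in a lower-symmetry lattice"
    if completeness < 0.90:
        return "data incomplete: integrate more images before scaling"
    return "accept: proceed to scaling and merging"

print(assess_processing(r_merge=0.08, completeness=0.97))
```

Encoding decisions this way also makes them inspectable, which matters for the reproducibility Winter emphasizes below.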
This research ended up amounting to a doctoral thesis, so Winter earned his PhD part-time at the University of Manchester while still employed at Daresbury.
By automating what in the past was an arcane, manual process, xia2 makes these crucial early data processing decisions reproducible and traceable. "It's exceedingly hard to scientifically reproduce what someone has done if the data processing was done manually," says Winter, who has made xia2 an open source program so that others can see how it works and, if necessary, alter it.
Winter recently collaborated with colleagues at Diamond Light Source to solve the structure of a histamine receptor, a membrane protein, in complex with an antihistamine drug. The work, published in Nature in 2011, presented a data processing challenge: Winter had to stitch together data from dozens of crystals to get a sufficiently strong signal. He did the data processing manually, but is now working to code his decisions in xia2. "We're trying to automate the process because the interesting research, like membrane or virus protein structures, involves stitching together data from multiple, possibly hundreds, of crystals," he says.
In addition, Winter is delving into the underlying data processing programs themselves. While the existing programs work very well, new data collection technologies, such as free electron lasers and microfocus beamlines, are creating data sets with more subtle signals. "To detect those signals, we have to get closer to the raw data and do a much more complete mathematical treatment. You can't make those improvements without getting your hands dirty," says Winter, who is working on this in collaboration with a team at Diamond Light Source and groups at LMB and Lawrence Berkeley National Laboratory. "Throughout the development of xia2, I've been very lucky to have a lot of very experienced people to work with."
Published January 24, 2014
In the late 1960s, only a dozen or so proteins had been solved using X-ray crystallography. Jane Richardson and her husband, David, solved one of them, staphylococcal nuclease, while working at MIT, and a second, superoxide dismutase, among the first 20, at Duke University, where they still work today. The problem was, even with the solutions in hand, no one could quite comprehend all the complex information in such structures. There was no standard way of visualizing them.
So Richardson, now a James B. Duke professor of biochemistry at Duke University, spent two years teasing out a method for drawing and labeling these structures. Her ribbon drawings and seminal 1981 paper, "The Anatomy and Taxonomy of Protein Structure," became the standard for visualizing proteins. "It's probably the most notable thing I've done in my whole career," says Richardson, who without formal doctoral training has advanced in academia, earning prestigious awards such as a MacArthur Fellowship in 1985, and election into the National Academy of Sciences, the American Academy of Arts and Sciences, and the Institute of Medicine.
Working before the rise of computer graphics programs, Richardson hand-drew 75 different structures. In doing this painstaking work, she developed an intuition about proteins that marks, essentially, the very beginnings of structure validation. Richardson could just tell when a solution wasn't right. "It's having looked at the detailed structures," she says. "Some of it you can get at the schematic level, where you know they haven't done it quite right. The most obvious one is if there isn't much of any secondary structure. People always under-assign secondary structure at low resolution."
Since then, structure validation and its necessary partner process, repair, have become the primary focus of her career.
From Pen to MolProbity
In the early 1980s, Richardson was a non-tenured "Associate" in the lab of her husband, David Richardson, professor of biochemistry at Duke University. Their joint research was at the forefront of synthetic and computational biology, doing de novo protein design. Their aim was to design for structure rather than function. According to Richardson, they had some success, usually getting the right overall folds, but they never got well-ordered, unique proteins. "We got molten globules," she says.
They realized that they weren't getting the internal contacts right. "You really can't think about molecular contacts either inside a structure or between molecules without having all the hydrogen atoms there. They're what make all the contacts," she says.
By necessity, the Richardsons began looking into adding the hydrogens, the start of what would become a 5-year effort to develop methods to add hydrogens to a structural model and examine their contacts. They named the tool ALL-ATOM CONTACTS. This and other related tools, such as REDUCE, which optimizes the hydrogen contacts, and PROBE, which shows them graphically, gave shape to the larger software package now called MolProbity.
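The all-atom contact idea reduces to simple geometry once hydrogens are in place: compare each interatomic distance with the sum of the two van der Waals radii. The sketch below uses illustrative radii and cutoffs, not the values REDUCE and PROBE actually use, and for simplicity it ignores the bonded pairs a real tool would exclude.

```python
import numpy as np

# Illustrative van der Waals radii, in angstroms.
VDW = {"H": 1.17, "C": 1.75, "N": 1.55, "O": 1.40}

def contact_report(elements, coords, clash_cutoff=0.4):
    """Flag pairs whose van der Waals surfaces overlap or nearly touch."""
    coords = np.asarray(coords, dtype=float)
    for i in range(len(elements)):
        for j in range(i + 1, len(elements)):
            gap = (np.linalg.norm(coords[i] - coords[j])
                   - VDW[elements[i]] - VDW[elements[j]])
            if gap < -clash_cutoff:
                print(f"clash   {i}-{j}: overlap {-gap:.2f} A")
            elif gap < 0.25:
                print(f"contact {i}-{j}: gap {gap:.2f} A")

contact_report(["O", "H", "N"], [[0.0, 0, 0], [2.4, 0, 0], [3.4, 0, 0]])
```

Without the hydrogens present, most of these pairwise checks simply cannot be made, which is the point Richardson makes above.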
Before MolProbity, validation tools developed in other labs, such as PROCHECK, first available in 1993, and WHATCHECK, developed in 1997, provided statistical evaluations of solutions, applying Ramachandran and rotamer distributions, for example, to flag problem areas.
Today, MolProbity has also implemented these statistical methods for validation. The Richardsons' original implementation of rotamers in MolProbity relied on a database of the top 100 structures. They are now working on an update that uses an 8,000-structure database.
The additional data makes validation more complex. "Almost nothing is actually a normal distribution," she says. "Most of the errors are systematic errors rather than random errors. As a programmer, you have to try to understand what causes the systematic error, so that you can do a much better job of fixing it."
The validation parameters used by MolProbity come from a combination of a small-molecule database and the well-ordered parts of high-resolution Protein Data Bank (PDB) structures. Others come from a landmark 1992 paper by Engh and Huber that defined bond-length and bond-angle parameters for proteins. "Everything is still pretty close to those classic parameters," says Richardson. "We've recently spent a couple of years sweating over a redo of the hydrogen parameters and we've almost finished it. We've gotten them a whole lot better than they were."
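In the spirit of those geometry checks, a single bond can be validated by asking how many standard deviations it sits from its target. The target below is the classic Engh and Huber peptide C-N bond length; treating four sigma as the outlier threshold is an illustrative choice, not MolProbity's exact rule.

```python
def bond_outlier(observed, target=1.329, sigma=0.014, n_sigma=4.0):
    """Flag a bond length deviating more than n_sigma from its target."""
    return abs(observed - target) / sigma > n_sigma

print(bond_outlier(1.40))  # True: 0.071 A off target, roughly 5 sigma
```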
ABCs of PDB Validation
The PDB is a vast database of protein structures. In recent years, it has put structure validation front and center, using tools like MolProbity to score submissions, providing a means for users of the database to assess the quality of a structure. Richardson is a member of several validation committees for the PDB. She helped write the validation requirements for X-ray crystallography that came out in 2011.
From Richardson's point of view, validation should be a way of life for the crystallographer, not an afterthought. "People want to leave things open because, for instance, you can get rotamer outliers that are real, even occasionally Ramachandran outliers that are real," says Richardson. That, however, is the exception and should not be counted on. "The best bet initially is to set things in the expected places, then see what the data is telling you. If the outlier is real, it will win out. Doing it the other way around is very dangerous."
Richardson also has several outstanding concerns about structure deposition. One is that the PDB does not currently allow hydrogen atom coordinates in a deposited file. "Some people think that it's trivial to add them and that everyone would add them the same way. That is simply not true," she says.
Another concern is resolution, an important quality measure in protein crystallography. The problem is that the measure is not actually well defined: data sets are too asymmetrical, too peculiar, and too inconsistent from one experiment to the next. Consequently, depositors' decisions about how to define resolution are somewhat subjective and arbitrary. "From our point of view it is a real nuisance," says Richardson. "If you look from 10,000 feet, resolution is perfect as the independent variable. It is the basic measure of how much information you have, going in. But the detailed view isn't great."
There are many competing proposals, but little agreement on which one should win out. "We'd love to have a standard definition even if it isn't perfect," she says.
The Devil in the Details
At present, MolProbity does not validate the data themselves, nor does it validate how well the model matches the data. However, MolProbity is currently undergoing a major renovation. A research associate in the Richardsons' lab, Jeff Headd, is rewriting the underpinnings of MolProbity in Python so that it will be compatible with Phenix tools. The goal is to build a more effective connection with Phenix and its crystallography tools, which will allow MolProbity to pull up refinements, maps, and crystal symmetries and enable the addition of new validation features, says Richardson.
Another focus of their work is generalizing the methods for validation and repair to work better at low resolution. It's hard, she says, because there isn't enough data. "But we're trying to put in more of the info from the combined knowledge of 100,000 structures and try to do a better job with it," she says.
Her lab is also working on adding features to the ROSETTA and ERRASER programs, both developed in other labs, to allow ERRASER to do RNA structure corrections. "It's just amazing how different DNA and RNA are, given that the only fundamental difference between them is an extra OH group on the sugar ring. That makes an enormous difference to how they interact," she says.
While DNA wants to form a helix structure to fulfill its primary and predominant function as an information storage medium, RNA has many different forms. "RNA is more like proteins in terms of overall tertiary structure and the fact that it can do catalysis and specific binding. It has many more conformational states than DNA," she says.
Oddly, says Richardson, it's easier to deal with RNA in validation than DNA because RNA conformations are "sort of rotameric." Sort of. As with the many other validation problems Richardson has worked on over the years, the devil is in the details.
Jane Richardson photo by Rita Lo
Story by Elizabeth Dougherty
Published October 28, 2013
Taking technology to the next level in structural biology
When Tim Stevens finished his PhD in biochemistry at the University of Cambridge in 1999, he needed a job to tide him over for a few months. When he discovered that his department had 9 months of grant funding for someone to do nuclear magnetic resonance (NMR) analysis, he applied.
Even though he'd never done NMR work before, he got the job, and so defined the next decade of his career.
During that 9-month stint, Stevens solved one structure on his own and assisted with another. "I'd done a lot of computing work and I knew my way around a protein very well," he says, "so using the NMR software came naturally to me."
The software Stevens was using had been written in the 1980s. "It was very good, but old fashioned," says Stevens. "I began mumbling that I could do better."
Meanwhile, on the floor just below him, Ernest Laue, a professor of structural biology at Cambridge, had a similar notion. He wanted to create a modern NMR software suite for structural biology, patterned after CCP4 in funding and philosophy, which he called CCPN. He was looking for people to join his team. Stevens, with his computing, structural biology, and new NMR experience, was a perfect fit.
Stevens joined Rasmus Fogh and Wayne Boucher, two other postdoctoral fellows working on the new software suite. Stevens focused on applications, such as the Analysis graphical tool. Within a few years, they had a product.
Today, CCPN has about 1000 users. "We've had great interactions with the user community," says Stevens, who has traveled around the world to participate in CCPN workshops.
Recently, Stevens has focused on retooling the user interface, taking it through yet another round of modernization. "That work is about halfway done," he says. He debuted SpecView and ChemBuild, previews of CCPN version 3, in an SBGrid Webinar in 2012.
Stevens is quick to point out that computing isn't his only gig. "I'm a bioinformatician that dabbles in NMR analysis," he says. During the past decade, Stevens has continued his work characterizing membrane proteins and their distinguishing features.
He has also worked on applying NMR to metabolomics. Unlike X-ray crystallography, which essentially snaps a picture of a molecule, NMR spectra contain a jumble of signals that must be pieced together like a jigsaw puzzle. Those signals can also be useful indicators for screening, such as scanning blood or urine for specific compounds or scanning proteins to detect strong or weak binding with a drug candidate. "NMR is very sensitive and can detect even subtle changes. You can even measure protein folding," he says. "You can do an awful lot with it."
Stevens left his long-time post at CCPN at the end of 2012 due to funding constraints. The change proved fortunate, allowing him to transition to a position at the Babraham Institute and potentially defining the coming decade of his career.
At Babraham, Stevens will apply the analysis techniques of NMR to look at the macro structure of chromosomes. He and his colleagues hope to determine whether physical locations of genes contribute to their regulation. "When DNA is damaged, does that affect the physical arrangements of genes? If so, does that contribute to cancer or aging?" he asks.
To answer these questions, Stevens will combine his expertise in genomic sequencing and bioinformatics, structural biology and NMR analysis. "It's somewhat new territory," he says. "But we're hitting the technology at just the right stage."
- Elizabeth Dougherty
Published January 15, 2013
The Mathematical and Collaborative Artistry of Eleanor Dodson
Back in the mid-1970s, the British government funded several collaborative computing projects. Among them (14 in all) was Collaborative Computational Project No. 4, known by structural biologists as CCP4. "The idea was that computers were so expensive, you'd probably only have one in London and maybe one in Manchester, so everybody would have to collaborate on using the hardware and developing software," says Eleanor Dodson, Professor Emeritus at the York Structural Biology Laboratory and a contributor to CCP4 from the beginning.
By then, Dodson had already been involved in structural biology for over a decade. With just a bachelor's degree, she began working in the lab of protein crystallography pioneer and Nobel Laureate Dorothy Hodgkin, who solved structures for penicillin, Vitamin B12 and insulin. "Dorothy needed a technician and I needed a job," says Dodson, who took on the work of hand-contouring electron density maps, despite her limited artistic talent. "I had great difficulty drawing a curve that joined up with its own tail," she laughs.
But what Dodson lacked in artistry, she more than made up for in mathematics. She had a hand in developing fundamental mathematical methods of crystallography, including molecular replacement, experimental phasing and refinement. "When there aren't hard and fast rules about how things are to be done, it's very great fun to be working in that field," says Dodson.
Though tedious at times, plotting curves on transparent tracing paper and stacking them up to make an image of a molecule taught Dodson the value of interdisciplinary collaboration. "Without that time spent on mundane chores, we'd have failed to appreciate the contributions of others," she says. "That's one of the dangers of today's quicker scientific results. You aren't as involved in the process and so you don't see how others contributed."
In the mid-70s, Dodson and her husband Guy Dodson, also a scientist, left Hodgkin's lab and moved to York. Hodgkin retired. The informal collaborations the software developers had formed began to dissolve. "We all realized how much we depended upon each other," recalls Dodson. "You need someone who will shout at you and say, look! That's a stupid result. You must have done something wrong!"
To reconnect, the group got funding for CCP4, and established a network of programs still used and maintained today. The effort established fundamental principles such as using centralized, well-tested libraries for common functions and standardized data formats. "We weren't aware that we were being innovative," says Dodson. "We were just practicing common sense."
Dodson continued to work within CCP4 throughout her career. A recent development is the program ACORN (also freely available in CCP4), a phasing procedure for determining protein structures from atomic resolution data, based on the ideas of the physicist Michael Woolfson, also at York. In 2009, Dodson and Woolfson demonstrated that the ACORN technique could be extended to solve structures with more limited data sets, provided the data were artificially extended using the so-called "free-lunch" approach.
Though Dodson has been a keen driver for more computer automation in structural biology, she has some concerns about the increasingly successful applications. "One of the disadvantages is that you're failing to educate the next generation of crystallographers," she says.
However, she continues, the many training workshops that have been funded by projects like CCP4 and, more recently, SBGrid, help counter this drawback by bringing people together to solve hard problems. "We try to solve such problems in several ways, to use different software, and to start from different points of view," she says. "They give the innovators of the future a chance to get their fingers in the pie."
– Elizabeth Dougherty
Published November 5, 2012
Using Digital Signal Processing and SPARX in cryo-Electron Microscopy
When Pawel Penczek took his first job in the lab of Joachim Frank, a pioneer in cryo-electron microscopy (cryo-EM), he had never heard of the technique. "My interest was in digital signal processing," says Penczek, now director of the Structural Biology Imaging Center at the University of Texas - Houston Medical School and lead developer of SPARX, a cryo-EM image processing software tool. "I was only remotely aware of using EM for biological applications."
When he arrived in Frank's lab in 1989, he became part of the team working on the first cryo-EM reconstruction of the ribosome, which finally emerged at 45 Å resolution. "It was a major milestone," he says. As part of the team, Penczek, who had studied physics at the University of Warsaw, took on the task of improving the image processing algorithms in SPIDER, a tool developed by the Frank lab that mathematically averages large numbers of images of a molecule to construct a meaningful 3D representation.
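The principle behind that averaging is easy to demonstrate: noise in independent images cancels roughly as one over the square root of the number of images, so the underlying particle emerges from data that individually look like static. The toy illustration below shows only the averaging step; SPIDER's real workflow also aligns and classifies the images.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = np.zeros((64, 64))
signal[24:40, 24:40] = 1.0                      # stand-in for a particle

n_images = 500                                  # aligned noisy copies
noisy = signal + rng.normal(0.0, 5.0, size=(n_images, 64, 64))
average = noisy.mean(axis=0)

# Per-pixel noise falls from 5.0 to roughly 5.0 / sqrt(500), about 0.22.
print(f"residual noise: {np.std(average - signal):.2f}")
```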
Penczek, along with others in the Frank lab, spent the next 5 years on technical advancements, all in an effort to squeeze more meaning out of the data. "Only beginning from the mid-1990s did the biological payoff start," says Penczek. "That's when people started to become interested in cryo-EM findings."
Still, the effort to make cryo-EM a workable technique proved to be much more challenging than anyone had expected. "I'm still working on it 2 decades later," laughs Penczek.
Cryo-EM's main advantage is that it allows imaging of a molecule in its native state. "Proteins appear as they are," says Penczek, "like flies in amber." In contrast, crystallized proteins used in X-ray crystallography may be constrained by the crystal.
The downside of cryo-EM is that in a single-molecule sample, the signal-to-noise ratio is very low. Solving the structure takes expertise, and, even then, the resolution is typically limited to 10-15 Å.
For Penczek, the obvious task ahead is to improve the technique so that it becomes routine to obtain high-resolution solutions. However, says Penczek, "paradoxically, it is the resolution that I'd consider the last interesting challenge."
Rather, Penczek is focusing on finding ways to validate the resulting shape. He is also working to find ways to handle the conformational variability that comes from having unpredictable single-particle samples. "Since the molecules aren't constrained when they are frozen, they can have floppy or moving parts," he says. "We can take as many pictures as we want, but it's all fuzzy, like slow shutter speed pictures of moving cars."
According to Penczek, conformational variability is a blessing and a curse. "It limits resolution," he says, "but it also provides insight into the functioning of proteins."
For this reason, Penczek has been struggling with conformational variability for the last decade. His primary focus today is developing new image processing methods to improve validation and handle variability and implementing them in a software package called SPARX.
"SPARX is the place where my ideas take shape," he says, though his true research focus is on methods. "It's my venue for propagating ideas." For more information about SPARX, visit the SPARX Wiki at: http://sparx-em.org/sparxwiki/SparxWiki
Published June 4, 2012
From Mockery to Reality
“I was an angry young man,” says Gerard Kleywegt of his early days in the 1990s as a structural biologist. He’d found his way from the University of Utrecht, in the Netherlands, where he’d done his PhD on Nuclear Magnetic Resonance (NMR) spectroscopy, to Uppsala, in Sweden, where as a young post-doc he was learning X-ray crystallography from Alwyn Jones. “I thought quality and validation of structures was so important that, when I found an error, I was almost shocked.” And he wasn’t quiet about it.
Kleywegt now laughs at his “zeal,” and says, “I’ve mellowed a lot.” But the nickname Jones gave him, “CD,” has stuck. It stands for “Charm and Diplomacy” — the two least likely characteristics to be associated with Kleywegt, according to Jones.
Kleywegt’s passion for validation has also stuck. Today, working at the EMBL-European Bioinformatics Institute (EBI) as Head of the Protein Data Bank in Europe (PDBe), he oversees a major effort to improve validation throughout the PDB, a role that every turn of his career has prepared him for.
Originally, when Kleywegt went to Uppsala, he expected to spend just a year or two there learning crystallography before going on to a career in drug design. “I ended up staying for nearly 18 years,” he laughs. “I really enjoyed it. It was a very dynamic lab.”
While in Uppsala, he solved structures and wrote software tools, mainly to meet his own crystallography needs. He created what is now known as the Uppsala Software Factory, a name chosen tongue-in-cheek, he says, “because the so-called factory was just me.”
But what Kleywegt is probably best known for are his efforts to improve and advance structure validation. In fact, when he took on his new role at PDBe, he was already a member of the Worldwide Protein Data Bank (wwPDB) validation task force for crystallography. “When I came here, I transitioned from being an advisor to advisee,” he says.
The goal of the wwPDB validation effort is to make it easier for scientists who are not experts in structural biology to select structures intelligently from the PDB. This requires automating a number of very complex processes. To do this, task forces in each field (i.e. NMR, X-ray crystallography, EM) recommend validation methods and tools, which the wwPDB then knits into a “validation pipeline,” explains Kleywegt.
This automated pipeline consolidates results and assigns each structure a score to indicate its relative quality. Scores appear on a simple, intuitive sliding color scale from red (poorer scores than other structures) to blue (better scores). Each newly deposited structure gets validated and scored in the same way. Depositors are informed of possible errors or concerns that may need closer scrutiny, and are given an opportunity to address the issues before the data are made public. “Hopefully, this will prevent errors from making it into the archive in the first place,” says Kleywegt.
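The scoring idea amounts to a percentile rank: compare one structure's value for a metric, such as MolProbity's clashscore, against the distribution across the archive, then map the rank onto the color scale. In the sketch below, the archive distribution is invented purely for illustration.

```python
import numpy as np

def percentile_rank(value, archive_values, lower_is_better=True):
    """Fraction of archive structures this structure outperforms."""
    archive = np.asarray(archive_values)
    better_than = archive > value if lower_is_better else archive < value
    return better_than.mean()

# Invented stand-in for a quality metric across the whole archive.
archive = np.random.default_rng(1).lognormal(mean=2.0, sigma=0.8, size=10_000)
rank = percentile_rank(4.5, archive)
print(f"better than {rank:.0%} of archive structures")  # high rank: blue end
```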
Though Kleywegt no longer has time to write software, which used to be his reward for completing other work, he says (with both charm and diplomacy): “This job is ideal for me, with my background in chemistry, NMR, crystallography and bioinformatics. But much as I love working here, I miss Sweden every single day.”
Published March 7, 2012
How a holiday lark became structural biology's Coot
A little over a decade ago, Paul Emsley, biochemistry professor at the University of Oxford, was looking to ditch his white coat. What he really wanted was to spend more time programming in the computer lab. “I was happy using existing software tools,” said Emsley, who had used O and other tools in his research. “But you go down the pub and think, if only the tool did this, and if only it did that. That festered for years.”
In the late 1990s, Emsley had the opportunity to join the lab of Kevin Cowtan at the University of York with the task of implementing software to perform crystallographic “ridge line tracing” in three dimensions, a concept first imagined in the 1970s by Jonathan Greer, now Director of Structural Biology with Abbott Global Pharmaceuticals.
Early on in this work, and at the beginning of a Christmas holiday, Emsley realized that he wanted his program to be visual, to show the molecules, electron density maps and noncrystallographic symmetry. “There was no easy way of showing these with other people's tools,” says Emsley. “I spent that holiday playing with this idea, and I have just not stopped playing.”
That holiday's work eventually produced Coot.
Since then, Emsley, along with other contributors including Cowtan, has continued to expand Coot. It includes a feature similar to the Lego-like fragment selection tools in O. Coot also includes a novel way of representing electron density in 3D using a technique called “marching cubes” rather than conventional contouring algorithms. In addition, Coot provides very convenient ways to transform data and switch from one representation to another.
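As a rough illustration of the marching-cubes idea, the sketch below triangulates a contour surface from a synthetic density grid using scikit-image; Coot's own implementation is its own code, independent of this library.

```python
import numpy as np
from skimage import measure

# Synthetic "electron density": a Gaussian blob on a 32x32x32 grid.
x, y, z = np.mgrid[-1:1:32j, -1:1:32j, -1:1:32j]
density = np.exp(-(x**2 + y**2 + z**2) / 0.1)

# Marching cubes triangulates the surface where density crosses the level.
verts, faces, normals, values = measure.marching_cubes(density, level=0.5)
print(f"{len(verts)} vertices, {len(faces)} triangles")
```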
Much of that convenience comes from Coot's graphical interface. “Click-click on the data, and up pops a map,” says Emsley. The easy-to-navigate menu systems allow users to “discover” the program's features rather than memorizing commands. Also, the program “is forgiving,” he says. “You can drag things around in the model and then undo them. Coot doesn't punish you for exploring or experimenting.”
Coot is also an open source program with a GPLv3 license. Anyone can view the source code and submit suggested changes. Emsley recently updated Coot to include changes supplied by a researcher who models RNA. In addition, Phenix integrates with Coot through its Python interface.
In an upcoming release, Coot will include a novel model representation that is more familiar to chemists. “When you try to get medicinal chemists to look at Coot, it just doesn't work,” says Emsley. “They want to see a standard chemical structure diagram, so that's what we're working on.”
While it has become a bit of a standing joke, Emsley believes that version 1.0 will be released within the 18 months promised on Coot's website (http://www.biop.ox.ac.uk/coot/). Regardless, nightly builds provide fully tested releases daily, so users awaiting new features can begin using them as soon as they are complete. – Elizabeth Dougherty
Published October 15, 2011
A Welshman's journey into computer graphics
An unexpected side-effect of Alwyn Jones' decision to write Frodo, one of the first computer graphics programs for X-ray crystallography, was learning to swear in German. His teacher? Johann Deisenhofer, the 1988 winner of the Nobel Prize in Chemistry.
“He was always using my experimental versions,” said Jones, then at the Max Planck Institute for Biochemistry, now professor of structural biology at Uppsala University in Sweden. “He used to swear at me when my program exploded, which it often did.” Back then, in 1976, Jones had happened into computer graphics. “I took a wrong turn and I just kept going,” he says, his voice just slightly less gritty than that of The Boss.
Jones programmed on what at the time was a sophisticated computer. “I could have bought three Ferraris for what we paid for that system,” says Jones. But with only 32,000 words of memory and just a megabyte of disk storage, writing modeling software was a challenge. To do anything, he had to link his computer to a larger system, creating a flow of data he thought of as a ring. That ring inspired him to name his program Frodo.
In 1979, Jones took Frodo to Uppsala. There he implemented Frodo's original Lego-like model building tool. He built the tool because he had noticed over the years that the same structure fragments kept reappearing in solved structures. He wondered if he could use these fragments to jump-start solving new structures. Pursuing this idea, Jones found many well-refined fragments. “I was surprised to see that I could use these fragments to build a whole protein,” says Jones.
Without such a tool, he says, “someone could end up sitting in front of one of these computer systems and fitting to noise rather than fitting to a realistic expectation. They might end up with models that have incorrect stereochemistry, or parts of the main chain pointing in the wrong direction.”
By the mid-eighties, Frodo's mini-computer hardware platform had become obsolete, replaced by Digital's VAX. “The VAX was the first computer where crystallographers could actually control their computing destiny,” says Jones. “You could do all of the computing on it and not have to use a computer center.”
Jones decided to write Frodo again from scratch for the new hardware, creating “O.” Jones closely guards the meaning of the name. Even collaborator Morten Kjeldgaard of Denmark, who contributed the cartoon representations of proteins in O, does not know. “O is the end of Frodo,” he guessed. Not so. Perhaps O is The One Ring? Jones will not tell.
Regardless, O includes a newer, more elegant implementation of Frodo's Lego-like graphics tool. In O, Jones used a database to store the fragments and other data the program needs, making it easy to support new features as he conceives them. In addition to updating O, Jones also still relies on O to take electron density maps and turn them into models of proteins. For more information about O, please see Essential O, available online.
– Elizabeth Dougherty
Published June 17, 2011
Wolfgang Kabsch and the making of XDS
As Wolfgang Kabsch headed for the darkroom, facing another day of developing films of X-ray diffraction patterns, he passed by a new machine sitting on a bench, unused. It was the mid-1980s and the machine was an early electronic X-ray detector, full of new technology but lacking the software to make it usable.
“It was just sitting there, looking at me,” says Kabsch, staff scientist emeritus in biophysics at the Max Planck Institute for Medical Research. “I decided rather than wasting my time in the darkroom, I could program the detector to do something useful.”
Kabsch's efforts led to the development of XDS, the X-ray Detector Software. First released in 1986, the tool is still widely used today to translate raw X-ray data from a variety of detectors into standardized data: the H, K, and L indices of each reflection, the intensity of the reflection, and the standard deviation of the intensity.
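A reader for that kind of record is straightforward. The sketch below assumes the plain-text layout of an XDS reflection file, with '!'-prefixed header lines followed by whitespace-separated columns beginning H, K, L, intensity, and sigma; treat that column order as an assumption to check against the actual file header.

```python
def read_reflections(path):
    """Collect (h, k, l, intensity, sigma) tuples from an XDS-style file."""
    reflections = []
    with open(path) as handle:
        for line in handle:
            if line.startswith("!"):          # header or end-of-data marker
                continue
            fields = line.split()
            if len(fields) < 5:               # skip malformed or blank lines
                continue
            h, k, l = (int(float(v)) for v in fields[:3])
            intensity, sigma = float(fields[3]), float(fields[4])
            reflections.append((h, k, l, intensity, sigma))
    return reflections
```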
At the time, Kabsch was working on his main project, solving the structure of the muscle protein actin. After 14 months of exploring the detector and coding all of its features into a Fortran-based software tool, he was able to solve the actin structure with the excellent data from the new detector, rendering the film-based approach obsolete. “It was a lot of work,” says Kabsch, “but when it was done, the structures just came out like nothing.”
Kabsch released XDS in 1986 and published the actin structure in Nature in 1990. He continued to use the detector for other projects, such as solving, in collaboration with SBGrid member Emil Pai, biochemistry professor at the University of Toronto, the structure of the Ras p21 protein, an oncoprotein implicated in human cancer.
Since that first software release, Kabsch has run XDS as a one-man shop. As new detectors come on the market, he extends the software to support them. In all, the software supports over 22 different detectors.
Recently, Kabsch added support for a novel pixel detector called PILATUS from the Paul Scherrer Institut in Switzerland. This data-intensive detector delivers ten images per second, each 6 megapixels in size. On the docket is support for a new type of detector assembled from many separate, planar segments for recording FEL (free electron laser) data at the Linac Coherent Light Source at Stanford.
Kabsch, now retired, is always looking out for ways to keep XDS up to date. “When I see a better algorithm or a better way of computing, I put it in,” he says. He announces improvements on the XDS web page. Kabsch has also contributed to other applications, including the development of a dictionary of protein secondary structure, used in applications such as PROCHECK.
While Kabsch remains the primary coder of XDS, he has tapped a successor. Kay Diederichs, professor of protein crystallography and molecular bioinformatics at the University of Konstanz, has worked with Kabsch and XDS for over two decades and will continue to evolve XDS when Kabsch starts taking his retirement more seriously.
– Elizabeth Dougherty
Published May 19, 2011
Advancing structural biology discoveries, methods, and tools
It took Victor Lamzin nearly a year to solve his first structure, the 800-residue enzyme formate dehydrogenase. Later, as a post-doc, he asked his supervisor to let him re-solve it, this time in just two months.
Lamzin, now a group leader and the Deputy Head of the Hamburg Unit of the European Molecular Biology Laboratory, did it. “That's when I realized things could be done even quicker than that. I realized that much of the experience I had garnered and what I'd deciphered from reading the literature and talking to colleagues could be put into software,” says Lamzin. “Especially the boring, repetitive things.”
While Lamzin had scant knowledge of computer programming—in fact, he says, scant knowledge of crystallography—he dove in anyway. After all, as a scientist, learning is his job.
Lamzin's challenge led to the development of ARP/wARP. Pronounced “Arp-Warp,” the software transforms electron density maps derived from X-ray diffraction data into three-dimensional models. When ARP/wARP debuted in 1998, it was the only software of its kind. The program, written mostly in Fortran with scripting languages to support the user interface, was a collaborative effort between Lamzin's lab and that of Anastassis Perrakis of the Netherlands Cancer Institute.
“ARP/wARP, like human crystallographers, links model building and refinement together into a unified process that iteratively proceeds towards the final macromolecular model,” wrote Lamzin and Perrakis in a 2008 Nature Protocols paper describing a new release of the program. The software includes multiple, unified approaches to model building as well as ligand and solvent building components.
More recently, ARP/wARP's resolution range has expanded to allow the solution of larger structures and complexes, which typically produce lower-resolution diffraction patterns. “These structures don't always produce the X-ray diffraction patterns that work best with ARP/wARP,” Lamzin says. “That means we need to develop novel methods if we want to get biologically relevant information from those structures.”
While the software still performs best at resolutions better than 2 angstroms, it can automatically solve 65% of structures at resolutions approaching 3.5 angstroms. Ultimately, Lamzin would like to solve much, much larger structures—his dream being the structure of a cell—so the software is continually under improvement.
The latest release, version 7.2, which will be available late spring 2011, contains an advanced molecular graphics front-end and an RNA/DNA builder. It integrates with CCP4 and also uses REFMAC for model refinement. For more details, visit the ARP/wARP website.
While ARP/wARP is a significant project in Lamzin's lab, his own structural biology work employs atomic resolution X-ray crystallography to study the enzymatic action of proteins such as those using nicotinamide adenine dinucleotide (NAD/NADH), the most abundant electron carrier in cell metabolism. Lamzin's work typically leads not only to publications about structural discoveries, but also to the development of technology to share. “We like to contribute with structural biology interpretations,” says Lamzin, “and at the same time with methodological advances.”
– Elizabeth Dougherty
Published May 17, 2011