Taking technology to the next level in structural biology
When Tim Stevens finished his PhD in biochemistry at the University of Cambridge in 1999, he needed a job to tide him over for a few months. When he discovered that his department had 9 months of grant funding for someone to do nuclear magnetic resonance (NMR) analysis, he applied.
Even though he'd never done NMR work before, he got the job, and so defined the next decade of his career.
During that 9-month stint, Stevens solved one structure on his own and assisted with another. "I'd done a lot of computing work and I knew my way around a protein very well," he says, "so using the NMR software came naturally to me."
The software Stevens was using had been written in the 1980s. "It was very good, but old fashioned," says Stevens. "I began mumbling that I could do better."
Meanwhile, on the floor just below him, Ernest Laue, a professor of structural biology at Cambridge, had a similar notion. He wanted to create a modern NMR software suite for structural biology. Patterned after CCP4 in funding and philosophy, he called it CCPN. He was looking for people to join his team. Stevens, with his computing, structural biology and new NMR experience, was a perfect fit.
Stevens joined Rasmus Fogh and Wayne Boucher, two other postdoctoral fellows working on the new software suite. Stevens focused on applications, such as the Analysis graphical tool. Within a few years, they had a product.
Today, CCPN has about 1000 users. "We've had great interactions with the user community," says Stevens, who has traveled around the world to participate in CCPN workshops.
Recently, Stevens has focused on retooling the user interface, taking it through yet another round of modernization. "That work is about halfway done," he says. He debuted SpecView and ChemBuild, previews of CCPN version 3, in an SBGrid Webinar in 2012.
Stevens is quick to point out that computing isn't his only gig. "I'm a bioinformatician that dabbles in NMR analysis," he says. During the past decade, Stevens has continued his work characterizing membrane proteins and their distinguishing features.
He has also worked on applying NMR to metabolomics. Unlike X-ray crystallography, which essentially snaps a picture of a molecule, NMR spectra contain a jumble of signals that must be pieced together like a jigsaw puzzle. Those signals can also be useful indicators for screening, such as scanning blood or urine for specific compounds or scanning proteins to detect strong or weak binding with a drug candidate. "NMR is very sensitive and can detect even subtle changes. You can even measure protein folding," he says. "You can do an awful lot with it."
Stevens left his long-time post at CCPN at the end of 2012 due to funding constraints. The change is somewhat fortuitous, allowing him to transition to a position at the Babraham Institute, potentially defining the coming decade of his career.
At Babraham, Stevens will apply the analysis techniques of NMR to look at the macro structure of chromosomes. He and his colleagues hope to determine whether physical locations of genes contribute to their regulation. "When DNA is damaged, does that affect the physical arrangements of genes? If so, does that contribute to cancer or aging?" he asks.
To answer these questions, Stevens will combine his expertise in genomic sequencing and bioinformatics, structural biology and NMR analysis. "It's somewhat new territory," he says. "But we're hitting the technology at just the right stage."
- Elizabeth Dougherty
Published January 15, 2013
The Mathematical and Collaborative Artistry of Eleanor Dodson
Back in the mid-1970s, the British government funded several collaborative computing projects. Among them (14 in all) was Collaborative Computing Project 4, known by structural biologists as CCP4. "The idea was that computers were so expensive, you'd probably only have one in London and maybe one in Manchester, so everybody would have to collaborate on using the hardware and developing software," says Eleanor Dodson, Professor Emeritus at the York Structural Biology Laboratory and a contributor to CCP4 from the beginning.
By then, Dodson had already been involved in structural biology for over a decade. With just a bachelor's degree, she began working in the lab of protein crystallography pioneer and Nobel Laureate Dorothy Hodgkin, who solved structures for penicillin, Vitamin B12 and insulin. "Dorothy needed a technician and I needed a job," says Dodson, who took on the work of hand-contouring electron density maps, despite her limited artistic talent. "I had great difficulty drawing a curve that joined up with its own tail," she laughs.
But what Dodson lacked in artistry, she more than made up for in mathematics. She had a hand in developing fundamental mathematical methods of crystallography, including molecular replacement, experimental phasing and refinement. "When there aren't hard and fast rules about how things are to be done, it's very great fun to be working in that field," says Dodson.
Though tedious at times, plotting curves on transparent tracing paper and stacking them up to make an image of a molecule taught Dodson the value of interdisciplinary collaboration. "Without that time spent on mundane chores, we'd have failed to appreciate the contributions of others," she says. "That's one of the dangers of today's quicker scientific results. You aren't as involved in the process and so you don't see how others contributed."
In the mid-70s, Dodson and her husband Guy Dodson, also a scientist, left Hodgkin's lab and moved to York. Hodgkin retired. The informal collaborations the software developers had formed began to dissolve. "We all realized how much we depended upon each other," recalls Dodson. "You need someone who will shout at you and say, look! That's a stupid result. You must have done something wrong!"
To reconnect, the group got funding for CCP4, and established a network of programs still used and maintained today. The effort established fundamental principles such as using centralized, well-tested libraries for common functions and standardized data formats. "We weren't aware that we were being innovative," says Dodson. "We were just practicing common sense."
Dodson continued to work within CCP4 throughout her career. A recent development is the program ACORN (also freely available in CCP4), a phasing procedure for determining protein structures from atomic-resolution data, based on the ideas of the physicist Michael Woolfson, also at York. In 2009, Dodson and Woolfson demonstrated that the ACORN technique could be extended to solve structures with more limited data sets, provided the data were artificially extended using the so-called "free-lunch" approach.
Though Dodson has been a keen driver for more computer automation in structural biology, she has some concerns about the increasingly successful applications. "One of the disadvantages is that you're failing to educate the next generation of crystallographers," she says.
However, she continues, the many training workshops that have been funded by projects like CCP4 and, more recently, SBGrid, help counter this drawback by bringing people together to solve hard problems. "We try to solve such problems in several ways, to use different software, and to start from different points of view," she says. "They give the innovators of the future a chance to get their fingers in the pie."
- Elizabeth Dougherty
Published November 5, 2012
Using Digital Signal Processing and SPARX in cryo-Electron Microscopy
When Pawel Penczek took his first job in the lab of Joachim Frank, a pioneer in cryo-electron microscopy (cryo-EM), he had never heard of the technique. "My interest was in digital signal processing," says Penczek, now director of the Structural Biology Imaging Center at the University of Texas - Houston Medical School and lead developer of SPARX, a cryo-EM image processing software tool. "I was only remotely aware of using EM for biological applications."
When he arrived in Frank's lab in 1989, he became part of the team working on the first cryo-EM reconstruction of the ribosome, which finally emerged at 45 Å resolution. "It was a major milestone," he says. As part of the team, Penczek, who had studied physics at the University of Warsaw, took on the task of improving the image processing algorithms in SPIDER, a tool developed by the Frank lab that mathematically averages large numbers of images of a molecule to construct a meaningful 3-D representation of it.
Penczek, along with others in the Frank lab, spent the next 5 years on technical advancements, all in an effort to squeeze more meaning out of the data. "Only beginning from the mid-1990s did the biological payoff start," says Penczek. "That's when people started to become interested in cryo-EM findings."
Still, the effort to make cryo-EM a workable technique proved to be much more challenging than anyone had expected. "I'm still working on it 2 decades later," laughs Penczek.
Cryo-EM's main advantage is that it allows imaging of a molecule in its native state. "Proteins appear as they are," says Penczek, "like flies in amber." In contrast, crystallized proteins used in X-ray crystallography may be constrained by the crystal.
The downside of cryo-EM is that in a single-molecule sample, the signal-to-noise ratio is very low. Solving the structure takes expertise, and, even then, the resolution is typically limited to 10-15 Å.
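Why averaging many images helps with a low signal-to-noise ratio can be illustrated with a toy example. The sketch below is not SPIDER or SPARX code: it assumes a one-dimensional "image", Gaussian noise, and copies that are already perfectly aligned, and all names and numbers in it are hypothetical. Under those assumptions, averaging N noisy copies should shrink the noise by roughly the square root of N.

```python
import random

def noisy_copy(signal, noise_sd, rng):
    """Return the signal with independent Gaussian noise added to each pixel."""
    return [s + rng.gauss(0.0, noise_sd) for s in signal]

def average_images(images):
    """Average a list of equally sized images pixel by pixel."""
    n = len(images)
    return [sum(pixels) / n for pixels in zip(*images)]

def rms_error(image, signal):
    """Root-mean-square difference between an image and the true signal."""
    squared = sum((a - b) ** 2 for a, b in zip(image, signal))
    return (squared / len(signal)) ** 0.5

rng = random.Random(0)                          # fixed seed for repeatability
signal = [0.0] * 40 + [1.0] * 20 + [0.0] * 40   # toy 1-D "particle" (assumption)
noise_sd = 5.0                                  # noise far stronger than signal

single = noisy_copy(signal, noise_sd, rng)
stack = [noisy_copy(signal, noise_sd, rng) for _ in range(2500)]
averaged = average_images(stack)

print(f"RMS error, single image:  {rms_error(single, signal):.2f}")
print(f"RMS error, 2500 averaged: {rms_error(averaged, signal):.2f}")
```

In real cryo-EM data the particles sit in unknown orientations, so the images have to be aligned and classified before they can be averaged, which is where most of the difficulty lies.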
For Penczek, the obvious task ahead is to improve the technique so that it becomes routine to obtain high-resolution solutions. However, says Penczek, "paradoxically, it is the resolution that I'd consider the last interesting challenge."
Rather, Penczek is focusing on finding ways to validate the resulting shape. He is also working to find ways to handle the conformational variability that comes from having unpredictable single-particle samples. "Since the molecules aren't constrained when they are frozen, they can have floppy or moving parts," he says. "We can take as many pictures as we want, but it's all fuzzy, like slow shutter speed pictures of moving cars."
According to Penczek, conformational variability is a blessing and a curse. "It limits resolution," he says, "but it also provides insight into the functioning of proteins."
For this reason, Penczek has been struggling with conformational variability for the last decade. His primary focus today is developing new image processing methods to improve validation and handle variability and implementing them in a software package called SPARX.
"SPARX is the place where my ideas take shape," he says, though his true research focus is on methods. "It's my venue for propagating ideas." For more information about SPARX, visit the SPARX Wiki at: http://sparx-em.org/sparxwiki/SparxWiki
Published June 4, 2012
From Mockery to Reality
“I was an angry young man,” says Gerard Kleywegt of his early days in the 1990s as a structural biologist. He’d found his way from the University of Utrecht, in the Netherlands, where he’d done his PhD on Nuclear Magnetic Resonance (NMR) spectroscopy, to Uppsala, in Sweden, where as a young post-doc he was learning X-ray crystallography from Alwyn Jones. “I thought quality and validation of structures was so important that, when I found an error, I was almost shocked.” And he wasn’t quiet about it.
Kleywegt now laughs at his “zeal,” and says, “I’ve mellowed a lot.” But the nickname Jones gave him, “CD,” has stuck. It stands for “Charm and Diplomacy” — the two least likely characteristics to be associated with Kleywegt, according to Jones.
Kleywegt’s passion for validation has also stuck. Today, working at the EMBL-European Bioinformatics Institute (EBI) as Head of the Protein Data Bank in Europe (PDBe), he oversees a major effort to improve validation throughout the PDB, a role that every turn of his career has prepared him for.
Originally, when Kleywegt went to Uppsala, he expected to spend just a year or two there learning crystallography before going on to a career in drug design. “I ended up staying for nearly 18 years,” he laughs. “I really enjoyed it. It was a very dynamic lab.”
While in Uppsala, he solved structures and wrote software tools, mainly to meet his own crystallography needs. He created what is now known as the Uppsala Software Factory, also tongue-in-cheek, he says, “because the so-called factory was just me.”
But what Kleywegt is probably best known for are his efforts to improve and advance structure validation. In fact, when he took on his new role at PDBe, he was already a member of the Worldwide Protein Data Bank (wwPDB) validation task force for crystallography. “When I came here, I transitioned from being an advisor to advisee,” he says.
The goal of the wwPDB validation effort is to make it easier for scientists who are not experts in structural biology to select structures intelligently from the PDB. This requires automating a number of very complex processes. To do this, task forces in each field (i.e. NMR, X-ray crystallography, EM) recommend validation methods and tools, which the wwPDB then knits into a “validation pipeline,” explains Kleywegt.
This automated pipeline consolidates results and assigns each structure a score to indicate its relative quality. Scores appear on a simple, intuitive sliding color scale from red (poorer scores than other structures) to blue (better scores). Each newly deposited structure gets validated and scored in the same way. Depositors are informed of possible errors or concerns that may need closer scrutiny, and are given an opportunity to address the issues before the data are made public. “Hopefully, this will prevent errors from making it into the archive in the first place,” says Kleywegt.
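The idea of scoring a structure relative to the rest of the archive and showing the result on a red-to-blue scale can be sketched as a percentile rank plus a color mapping. This is only an illustration, not the wwPDB pipeline: the scores are made up, "higher is better" is an assumption, and the function names are hypothetical.

```python
from bisect import bisect_left

def percentile_rank(score, archive_scores):
    """Percentage of archive structures whose score is strictly worse.

    Assumes higher scores are better (an assumption for this sketch).
    """
    ordered = sorted(archive_scores)
    return 100.0 * bisect_left(ordered, score) / len(ordered)

def slider_color(percentile):
    """Map a percentile (0..100) to an RGB tuple from red (0) to blue (100)."""
    t = max(0.0, min(100.0, percentile)) / 100.0
    return (round(255 * (1 - t)), 0, round(255 * t))

# Hypothetical quality scores for structures already in the archive.
archive = [0.2, 0.4, 0.5, 0.7, 0.9]

rank = percentile_rank(0.7, archive)
print(rank, slider_color(rank))
```

A relative score like this has the advantage of staying meaningful to non-experts: it says how a structure compares with its peers without requiring the reader to know what a good absolute value looks like.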
Though Kleywegt no longer has time to write software, which used to be his reward for completing other work, he says (with both charm and diplomacy): “This job is ideal for me, with my background in chemistry, NMR, crystallography and bioinformatics. But much as I love working here, I miss Sweden every single day.”
Published March 7, 2012
How a holiday lark became structural biology's Coot
A little over a decade ago, Paul Emsley, biochemistry professor at the University of Oxford, was looking to ditch his white coat. What he really wanted was to spend more time programming in the computer lab. “I was happy using existing software tools,” said Emsley, who had used O and other tools in his research. “But you go down the pub and think, if only the tool did this, and if only it did that. That festered for years.”
In the late 1990s, Emsley had the opportunity to join the lab of Kevin Cowtan at the University of York with the task of implementing software to perform crystallographic “ridge line tracing” in three dimensions, a concept first imagined in the 1970s by Jonathan Greer, now Director of Structural Biology with Abbott Global Pharmaceuticals.
Early on in this work, and at the beginning of a Christmas holiday, Emsley realized that he wanted his program to be visual, to show the molecules, electron density maps and noncrystallographic symmetry. “There was no easy way of showing these with other people's tools,” says Emsley. “I spent that holiday playing with this idea, and I have just not stopped playing.”
That holiday's work eventually produced Coot.
Since then, Emsley, along with other contributors including Cowtan, has continued to expand Coot. It includes a feature similar to the Lego-like fragment selection tools in O. Coot also includes a novel way of representing electron densities in 3D using a technique called “marching cubes” rather than using contour algorithms. In addition, Coot provides very convenient ways to transform data and view from one representation to another.
That convenience comes from Coot's graphical interface. “Click-click on the data, and up pops a map,” says Emsley. The easy to navigate menu systems allow users to “discover” the program's features rather than memorizing commands. Also, the program “is forgiving,” he says. “You can drag things around in the model and then undo them. Coot doesn't punish you for exploring or experimenting.”
Coot is also an open source program with a GPLv3 license. Anyone can view the source code and submit suggested changes. Emsley recently updated Coot to include changes supplied by a researcher who models RNA. In addition, Phenix integrates with Coot through its Python interface.
In an upcoming release, Coot will include a novel model representation that is more familiar to chemists. “When you try to get medicinal chemists to look at Coot, it just doesn't work,” says Emsley. “They want to see a standard chemical structure diagram, so that's what we're working on.”
While it has become a bit of a standing joke, Emsley believes that version 1.0 will be released within the 18 months promised on Coot's website (http://www.biop.ox.ac.uk/coot/). Regardless, nightly builds provide fully tested releases daily, so users awaiting new features can begin using them as soon as they are complete.
– Elizabeth Dougherty
Published October 15, 2011
A Welshman's journey into computer graphics
An unexpected side-effect of Alwyn Jones' decision to write Frodo, one of the first computer graphics programs written for X-ray crystallography, was learning to swear in German. His teacher? Johann Deisenhofer, the 1988 winner of the Nobel Prize in Chemistry.
“He was always using my experimental versions,” said Jones, then at the Max Planck Institute for Biochemistry, now professor of structural biology at Uppsala University in Sweden. “He used to swear at me when my program exploded, which it often did.” Back then, in 1976, Jones had happened into computer graphics. “I took a wrong turn and I just kept going,” he says, his voice just slightly less gritty than that of The Boss.
Jones programmed on what at the time was a sophisticated computer. “I could have bought three Ferraris for what we paid for that system,” says Jones. But with only 32,000 words of memory and just a megabyte of disk storage, writing modeling software was a challenge. To do anything, he had to link his computer to a larger system, creating a flow of data he thought of as a ring. That ring inspired him to name his program Frodo.
In 1979, Jones took Frodo to Uppsala. There he implemented Frodo's original Lego-like model building tool. He built the tool because he had noticed over the years that the same structure fragments kept reappearing in solved structures. He wondered if he could use these fragments to jump-start solving new structures. Pursuing this idea, Jones found many well-refined fragments. “I was surprised to see that I could use these fragments to build a whole protein,” says Jones.
Without such a tool, he says, “someone could end up sitting in front of one of these computer systems and fitting to noise rather than fitting to a realistic expectation. They might end up with models that have incorrect stereochemistry, or parts of the main chain pointing in the wrong direction.”
By the mid-eighties, Frodo's mini-computer hardware platform had become obsolete, replaced by Digital's VAX. “The VAX was the first computer where crystallographers could actually control their computing destiny,” says Jones. “You could do all of the computing on it and not have to use a computer center.”
Jones decided to write Frodo again from scratch for the new hardware, creating “O.” Jones closely guards the meaning of the name. Even collaborator Morten Kjeldgaard of Denmark, who contributed the cartoon representations of proteins in O, does not know. “O is the end of Frodo,” he guessed. Not so. Perhaps O is The One Ring? Jones will not tell.
Regardless, O includes a newer, more elegant implementation of Frodo's Lego-like graphics tool. In O, Jones used a database to store the fragments and other data the program needs, making it easy to support new features as he conceives them. In addition to updating O, Jones also still relies on O to take electron density maps and turn them into models of proteins. For more information about O, please see Essential O, available online.
– Elizabeth Dougherty
Published June 17, 2011
Wolfgang Kabsch and the making of XDS
As Wolfgang Kabsch headed for the darkroom, facing another day of developing films of X-ray diffraction patterns, he passed by a new machine sitting on a bench, unused. It was the mid-1980s and the machine was an early electronic X-ray detector, full of new technology but lacking the software to make it usable.
“It was just sitting there, looking at me,” says Kabsch, staff scientist emeritus in biophysics at the Max Planck Institute for Medical Research. “I decided rather than wasting my time in the darkroom, I could program the detector to do something useful.”
Kabsch's efforts led to the development of XDS, X-ray Detector Software. First released in 1986, the tool is still widely used today to translate raw X-ray data from a variety of detectors into standardized data that includes the H, K and L indices of each reflection, the intensity of the reflection, and the standard deviation of the intensity.
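The standardized data described above amounts to one record per reflection. The sketch below shows what such a record might look like in code. It is not the actual XDS output format: the whitespace-separated "h k l intensity sigma" layout, the class, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reflection:
    """One measured reflection: Miller indices, intensity and its uncertainty."""
    h: int
    k: int
    l: int
    intensity: float
    sigma: float  # standard deviation of the intensity

    @property
    def signal_to_noise(self) -> float:
        """Intensity divided by its standard deviation (I/sigma)."""
        return self.intensity / self.sigma

def parse_reflection(line: str) -> Reflection:
    """Parse one 'h k l intensity sigma' record (hypothetical layout)."""
    h, k, l, intensity, sigma = line.split()[:5]
    return Reflection(int(h), int(k), int(l), float(intensity), float(sigma))

# A made-up record, not real diffraction data.
record = parse_reflection("1 0 -3 1520.0 38.0")
print(record, record.signal_to_noise)
```

Keeping the standard deviation alongside each intensity is what lets downstream programs weigh strong, well-measured reflections more heavily than weak or noisy ones.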
At the time, Kabsch was working on his main project, solving the structure of the muscle protein actin. After 14 months of exploring the detector and coding all of its features into a Fortran-based software tool, he was able to solve the actin structure with the excellent data from the new detector, rendering the film-based approach obsolete. “It was a lot of work,” says Kabsch, “but when it was done, the structures just came out like nothing.”
Kabsch released XDS in 1986 and published the actin structure in Nature in 1990. He continued to use the detector for other projects, such as solving (in collaboration with SBGrid member Emil Pai, biochemistry professor at the University of Toronto) the RAS p21 protein, an oncoprotein in human cancer.
Since that first software release, Kabsch has run XDS as a one-man shop. As new detectors come on the market, he extends the software to support them. In all, the software supports over 22 different detectors.
Recently, Kabsch added support for a novel pixel detector called PILATUS from the Paul Scherrer Institut in Switzerland. This data-intensive detector delivers ten images per second, each 6 megapixels in size. On the docket is support for new types of detectors assembled from many separate, planar segments for recording FEL (free electron laser) data at the Linac Coherent Light Source at Stanford.
Kabsch, now retired, is always looking out for ways to keep XDS up to date. “When I see a better algorithm or a better way of computing, I put it in,” he says. He announces improvements on the XDS web page. Kabsch has also contributed to other applications, including the development of a dictionary of protein secondary structure, used in applications such as Procheck.
While Kabsch remains the primary coder of XDS, he has tapped a successor. Kay Diederichs, professor of protein crystallography and molecular bioinformatics at the University of Konstanz, has worked with Kabsch and XDS for over two decades and will continue to evolve XDS when Kabsch starts taking his retirement more seriously.
– Elizabeth Dougherty
Published May 19, 2011
Advancing structural biology discoveries, methods, and tools
It took Victor Lamzin nearly a year to solve his first structure, an 800-residue enzyme formate dehydrogenase. Later, as a post-doc, he asked his supervisor to let him re-solve it, but this time in just 2 months.
Lamzin, now a group leader and the Deputy Head of the Hamburg Unit of the European Molecular Biology Laboratory, did it. “That's when I realized things could be done even quicker than that. I realized that much of the experience I had garnered and what I'd deciphered from reading the literature and talking to colleagues could be put into software,” says Lamzin. “Especially the boring, repetitive things.”
While Lamzin had scant knowledge of computer programming—in fact, he says, scant knowledge of crystallography—he dove in anyway. After all, as a scientist, learning is his job.
Lamzin's challenge led to the development of ARP/wARP. Pronounced “Arp-Warp,” the software transforms electron density maps of X-ray diffraction patterns into 3-dimensional models. When ARP/wARP debuted in 1998, it was the only software of its kind. The program, written mostly in Fortran with scripting languages to support the user interface, was a collaborative effort between Lamzin's lab and that of Anastassis Perrakis of the Netherlands Cancer Institute.
“ARP/wARP, like human crystallographers, links model building and refinement together into a unified process that iteratively proceeds towards the final macromolecular model,” wrote Lamzin and Perrakis in a 2008 Nature Protocols paper describing a new release of the program. The software includes multiple, unified approaches to model building as well as ligand and solvent building components.
More recently, ARP/wARP's resolution range has expanded to allow the solution of larger structures and complexes, which have lower resolution diffraction patterns. “These structures don't always produce the X-ray diffraction patterns that work best with ARP/wARP,” Lamzin says. “That means we need to develop novel methods if we want to get biologically relevant information from those structures.”
While the software still performs best at resolutions better than 2 angstroms, it can automatically solve 65% of structures at resolutions approaching 3.5 angstroms. Ultimately, Lamzin would like to solve much, much larger structures—his dream being the structure of a cell—so the software is continually under improvement.
The latest release, version 7.2, which will be available late spring 2011, contains an advanced molecular graphics front-end and an RNA/DNA builder. It integrates with CCP4 and also uses REFMAC for model refinement. For more details, visit the ARP/wARP website.
While ARP/wARP is a significant project in Lamzin's lab, his own structural biology work employs atomic resolution X-ray crystallography to study the enzymatic action of proteins such as those using nicotinamide adenine dinucleotide (NAD/NADH), the most abundant electron carrier in cell metabolism. Lamzin's work typically leads not only to publications about structural discoveries, but also to the development of technology to share. “We like to contribute with structural biology interpretations,” says Lamzin, “and at the same time with methodological advances.”
– Elizabeth Dougherty
Published May 17, 2011