The Creevey Lab: Finding Novel Antimicrobial Peptides with AMPLY

In this post Ben Thomas gives an overview of the rationale behind his tool AMPLY:

Rise of the superbugs

The rise of antimicrobial-resistant strains of microbes (bacteria, parasites, viruses and fungi) is probably the leading threat facing humankind. Increasingly desperate warnings about the real-world implications of growing resistance are now front-page news. The problem is a multifaceted one. Antimicrobial resistance is not just about the loss of human life; it is inextricably intertwined with increased patient morbidity and massive economic consequences for global healthcare systems. There are two possible solutions: a socio-political/behavioural change or a technical/scientific response. Humankind has shown itself remarkably intransigent when faced with doom-laden prophecies that require behavioural modification to circumvent (see also climate change), so it is probably prudent to assume that a managed technical response may be our best hope. But new antibiotics are unlikely to arise spontaneously. Mokyr highlights one of the issues with relying on the existing pharmaceutical industry to address the problem: “…few economies have ever left [decisions like these] entirely to the decentralized decision-making processes of competitive firms. The market test by itself is not always enough” (Mokyr, 1998).

Discovery of AMPs

Figure 1: A cationic, helical AMP (Taliecin-1)

The discovery of antimicrobial peptides (AMPs) dates back to 1939, when Dubos extracted an antimicrobial agent from a soil Bacillus strain. The term has since been extended to cover a broad group: anionic antimicrobial proteins/peptides, host defence peptides, cationic amphipathic peptides and cationic AMPs. In contrast to acquired immune mechanisms, these endogenous peptides provide a fast and effective means of defence against pathogens as part of the innate immune response. Antimicrobial peptides are evolutionarily ancient weapons, and their ubiquity throughout the animal and plant kingdoms supports the hypothesis that they have played a key role in the successful evolution of complex multicellular organisms. Such is their diversity that they can be found in locations as disparate as the skin secretions of a frog and the defensive arsenal of a protozoan.

Dolby Bioinformatics

Figure 2: The Dolby certification logo (dolby.com)

One feature of AMPs that makes them difficult to find is their size: they are often fewer than 20 amino acids long, which is tiny compared with typical proteins. In a typical ‘omic dataset containing potentially millions of datapoints, isolating interesting AMPs for synthesis and testing is a challenging task. For inspiration we can look to the music industry. In the mid-20th century recordings were made on magnetic tape, and engineers wrestled with an ever-present low-level hiss in the background that threatened to drown out the music. Various ingenious solutions were devised to mitigate the persistent hiss: forms of “low-noise” tape which recorded more signal; running the tape at a higher speed; or using dynamic pre-emphasis during recording and a matching dynamic de-emphasis during playback. This last approach became the backbone of the Dolby noise reduction system, which became all-pervasive in home audio equipment from the late 60s onwards. The audio engineer’s struggle to maximise signal-to-noise is the same core problem that faces computational biologists analysing ‘omic “big data” in search of tiny novel AMPs. There is music there, but at the moment the hiss is tremendous.

The detection of AMPs in metagenomic data is nevertheless a tantalising low-hanging fruit for computational biologists. Post-computational wet-lab work is relatively cheap: spot synthesis of peptides up to around 25 amino acids long is available from a wide array of third-party companies, with prices from as low as £2.50 per amino acid. A well-organised screening program can test in excess of 100 peptides a day, per person, for activity against a model bacterial organism. As a potential workflow, the rapid assessment of multiple ‘omic datasets, identification of homologues of pattern-matched AMPs, rapid synthesis and screening, and a rush to publication would appear to provide a grant-friendly drug-discovery goldmine! But to tap this rich vein, the process of picking putative AMPs out of ‘omic data needs to be streamlined and its hit rate improved.
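To put rough numbers on "relatively cheap", here is a back-of-the-envelope sketch in Python. It uses only the two figures quoted above (£2.50 per amino acid as a floor price, and 100 peptides per person per day as a screening rate); the class name, field names and the example batch of 250 peptides are made up for illustration, and real quotes will vary by supplier, purity and scale.

```python
from dataclasses import dataclass


@dataclass
class ScreeningPlan:
    """Hypothetical budget helper; not part of AMPLY."""

    n_peptides: int
    peptide_length: int                  # residues per peptide
    price_per_residue_gbp: float = 2.50  # floor price quoted in the post
    peptides_per_person_day: int = 100   # screening rate quoted in the post

    def synthesis_cost_gbp(self) -> float:
        # Lower bound: every peptide priced at the cheapest per-residue rate.
        return self.n_peptides * self.peptide_length * self.price_per_residue_gbp

    def person_days(self) -> int:
        # Ceiling division: a part-filled day still costs a day at the bench.
        return -(-self.n_peptides // self.peptides_per_person_day)


# Illustrative batch: 250 candidate peptides of 20 residues each.
plan = ScreeningPlan(n_peptides=250, peptide_length=20)
print(f"Synthesis: at least £{plan.synthesis_cost_gbp():,.2f}")
print(f"Screening: about {plan.person_days()} person-days")
```

Even under these optimistic assumptions the cost scales linearly with the number of candidates, which is why the quality of the shortlist coming out of the computational step matters so much.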

The AMPLY Pipeline

Finding small sequences (that you’re interested in) that often look a lot like other small sequences (that you’re not interested in) in datafiles that can run to gigabytes is a trickier task than it first appears. Annotation in metagenomics is an art, and determining what’s real and what’s not often relies purely on mutually agreed thresholds. However, as the aligned sequences get shorter, many of the usual assumptions about percentage identity, bit score and E-value thresholds begin to fall away. It’s here we return to the Dolby signal-to-noise analogy: the “music” of the AMPs in metagenomic datasets is often drowned out by the sheer volume of background noise, and to find them we need to adopt a novel strategy of aggressive emphasis.
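One way to see why the usual thresholds struggle with short sequences is the simplified textbook relation between bit score and E-value, E = m × n × 2^(−S′), where m is the query length, n is the database length and S′ is the bit score. The Python sketch below inverts that relation to ask how many bits a hit needs to clear a given E-value cut-off. The database size and cut-off are hypothetical values chosen for illustration, and real search tools apply further corrections (effective lengths, composition adjustments) that this ignores.

```python
import math


def required_bit_score(query_len: int, db_len: int, max_evalue: float) -> float:
    """Smallest bit score S' satisfying m * n * 2**(-S') <= max_evalue."""
    return math.log2(query_len * db_len / max_evalue)


DB_LEN = 1_000_000_000  # hypothetical database size, in residues
MAX_E = 1e-5            # hypothetical E-value cut-off

for length in (20, 50, 300):
    bits = required_bit_score(length, DB_LEN, MAX_E)
    # Per-residue figure assumes the alignment spans the whole query.
    print(f"{length:>4} aa query: needs {bits:.1f} bits "
          f"({bits / length:.2f} bits per aligned residue)")
```

Under these assumptions the total score required barely changes with query length, but a 20-residue peptide has to earn it at roughly 2.5 bits per residue, whereas a 300-residue protein needs under 0.2. A cut-off tuned for full-length proteins can therefore discard a short but genuine AMP hit.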

Designed by Ben Thomas in the CreeveyLab (http://www.creeveylab.org/), AMPLY (http://amply.life) is a pipeline built to plug this gap between the ‘omic data and lab work. AMPLY provides a basis for sifting out AMPs suitable as synthesis candidates, and for suggesting potential regions for crude synthesis, by adopting a hyper-wide “balance of evidence” approach. AMPLY passes over the data with a series of detection methods, then gathers their combined results into a final tableau (known as the “bitpad”) where each potential AMP can be evaluated on the strength of hundreds of datapoints, rather than just a couple of numeric values.
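The general idea of a "balance of evidence" table can be illustrated with a toy Python sketch. To be clear, this is not AMPLY's code or its actual detection methods: the three detectors, their thresholds and the example sequences below are all invented for illustration. Several independent checks are run over each candidate, and every result is kept side by side rather than collapsed into a single score.

```python
from typing import Callable, Dict, List


def net_charge(seq: str) -> int:
    # Crude net charge at neutral pH: K and R count +1, D and E count -1.
    return sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")


def is_short(seq: str) -> bool:
    return len(seq) <= 25  # hypothetical cut-off, matching the spot-synthesis limit


def is_cationic(seq: str) -> bool:
    return net_charge(seq) >= 2  # hypothetical threshold


def is_partly_hydrophobic(seq: str) -> bool:
    hydrophobic = sum(seq.count(a) for a in "AILMFVW")
    return 0.3 <= hydrophobic / len(seq) <= 0.7  # hypothetical window


# Hypothetical detector set; a real pipeline would use many more lines of evidence.
DETECTORS: Dict[str, Callable[[str], bool]] = {
    "short": is_short,
    "cationic": is_cationic,
    "partly_hydrophobic": is_partly_hydrophobic,
}


def evidence_table(candidates: Dict[str, str]) -> List[dict]:
    """Run every detector on every candidate and keep all results in one row each."""
    rows = []
    for name, seq in candidates.items():
        row = {"id": name, "length": len(seq)}
        for label, detector in DETECTORS.items():
            row[label] = detector(seq)
        row["support"] = sum(row[label] for label in DETECTORS)
        rows.append(row)
    # Best-supported candidates first; the individual columns stay visible.
    return sorted(rows, key=lambda r: r["support"], reverse=True)


# Made-up sequences for illustration only.
CANDIDATES = {
    "pep_A": "KRLAKKLLKAIWKV",
    "pep_B": "DEEDSGSGDEEDSG",
    "pep_C": "MAEDLLKKGASTDEVLQAFGSNPELVKKAIEEG",
}

for row in evidence_table(CANDIDATES):
    print(row)
```

The point of keeping every column is that a candidate failing one check can still be rescued by the others, which is the opposite of filtering on a single hard threshold.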

Figure 3: The AMPLY workflow

To date, AMPLY has been used to find, characterise and synthesise thousands of novel AMPs. Many of the AMPs discovered by AMPLY are highly active against MRSA (a key superbug) and offer encouraging avenues for future treatment development. While there is still much work to be done, results so far have been extremely promising. AMPLY has been used to find bioactive AMPs in datasets as diverse as the skin of Peruvian poison dart frogs and the testicles of a salamander, so the only limitation of the AMPLY pipeline is the diversity of the ‘omic data provided to it.

So, if you’re reading this blog, have interesting data and would like to be part of the drive to find new antimicrobials, then get in touch about potential collaborations. We are always interested.

Contact Ben Thomas at B.Thomas@qub.ac.uk, or via Twitter @flwrs4algrnon

References
Mokyr, Joel. “The political economy of technological change.” Technological revolutions in Europe (1998): 39-64.
