Combining Mutual Information with Structural Analysis to Screen for Functionally Important Residues in Influenza Hemagglutinin.

Peter M. Kasson and Vijay S. Pande. Pacific Symposium on Biocomputing 14:492-503 (2009).

The influenza hemagglutinin protein performs several important
functions, including attaching the virus to cells it will infect and
releasing the viral genome into the interior of the cell. Most
protective antibodies against influenza also bind to the hemagglutinin
protein. We wish to understand how mutations to hemagglutinin affect
viral function, including what keeps avian influenza (“bird flu”) from
being readily transmissible between humans. In this paper, we have
applied a technique from information theory known as mutual
information to genetic sequence data to predict important mutation
sites on the hemagglutinin protein. In follow-up work, we are
combining this technique with other methods to refine these
predictions and test some of them using Folding@home.

Influenza hemagglutinin mediates both cell-surface binding and cell
entry by the virus. Mutations to hemagglutinin are thus critical in
determining host species specificity and viral infectivity. Previous
approaches have primarily considered point mutations and sequence
conservation; here we develop a complementary approach using mutual
information to examine concerted mutations. For hemagglutinin,
several overlapping selective pressures can cause such concerted
mutations, including the host immune response, ligand recognition and
host specificity, and functional requirements for pH-induced
activation and membrane fusion. Using sequence mutual information as
a metric, we extracted clusters of concerted mutation sites and
analyzed them in the context of crystallographic data. Comparison of
influenza isolates from two subtypes (human H3N2 strains, and human and
avian H5N1 strains) yielded substantial differences in the spatial
localization of the clustered residues. We hypothesize that the
clusters on the globular head of H3N2 hemagglutinin may relate to
antibody recognition (as many protective antibodies are known to bind
in that region), while the clusters common to H3N2 and H5N1
hemagglutinin may indicate shared functional roles. We propose that
these shared sites may be particularly fruitful for mutagenesis
studies in understanding the infectivity of this common human
pathogen. The combination of sequence mutual information and
structural analysis thus helps generate novel functional hypotheses
that would not be apparent via either method alone.
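
As an illustration of the sequence mutual information metric mentioned
above, the following is a minimal sketch of how mutual information
between two columns of a multiple sequence alignment can be estimated
from observed residue frequencies. It is not the implementation used in
the paper: the function names (column_mutual_information, pairwise_mi)
are hypothetical, the estimator is a plain plug-in estimate in bits, and
gap handling, sequence weighting, significance testing, and the
clustering step are all omitted because this abstract does not specify
them.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Plug-in estimate of mutual information (bits) between two alignment columns.

    col_a and col_b hold the residue observed at each of two positions,
    one entry per sequence, in the same sequence order.
    Hypothetical helper for illustration only.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must come from the same set of sequences")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_a = Counter(col_a)                # marginal counts at position A
    count_b = Counter(col_b)                # marginal counts at position B
    count_ab = Counter(zip(col_a, col_b))   # joint counts of residue pairs

    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p(a,b) / (p(a) * p(b)) written in terms of raw counts
        ratio = c * n / (count_a[a] * count_b[b])
        mi += p_ab * log2(ratio)
    return mi


def pairwise_mi(sequences):
    """Mutual information for every pair of columns in an aligned sequence set.

    Assumes all sequences have equal length (already aligned).
    Returns a dict mapping (i, j) with i < j to the estimate in bits.
    """
    if not sequences:
        return {}
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    columns = [[s[i] for s in sequences] for i in range(length)]
    return {
        (i, j): column_mutual_information(columns[i], columns[j])
        for i in range(length)
        for j in range(i + 1, length)
    }
```

In a toy alignment such as ["AKL", "AKL", "TRL", "TRL"], the first two
positions always change together and so share high mutual information,
while the invariant third position shares none with either. Pairs of
sites with high values are the kind of candidates for concerted mutation
that would then be examined against the crystal structure.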