The fate of a mutation in a population depends on its impact on the fitness of the individual that carries it. This fitness effect depends, in turn, on the location of the mutation in the genome: a mutation occurring in a non-coding region typically generates a new allele that evolves neutrally, while a mutation located within a functional region can be deleterious or advantageous, depending in part on the function of the underlying gene. Yet even within a given gene, mutations can have very distinct effects. For genes encoding a macromolecule, RNA or protein, an important determinant of these effects is the structure of the encoded molecule. Here I will present some insights that we gained regarding the impact of protein structure on sequence evolution, with a focus on protein-encoding sequences. In particular, we ask the following questions: (1) what is the distribution of adaptive mutations along 3D protein structures, and (2) to what extent does protein structure generate coevolution between positions? To obtain information about the distribution of fitness effects, we relied on comparative genome analyses. I will present two statistical approaches: an extension of the McDonald-Kreitman approach that infers the rate of adaptive non-synonymous substitutions by modeling the distribution of fitness effects of mutations, and a substitution mapping procedure for inferring coevolving positions.
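For readers unfamiliar with the McDonald-Kreitman framework, the sketch below computes the classic point estimate of the proportion of adaptive non-synonymous substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps). This is the basic estimator only, not the extension presented in the talk, which additionally models the distribution of fitness effects. The function name `mk_alpha` and the counts in the usage example are illustrative assumptions.

```python
def mk_alpha(dn, ds, pn, ps):
    """Classic McDonald-Kreitman estimate of alpha.

    dn, ds: non-synonymous and synonymous fixed differences (divergence).
    pn, ps: non-synonymous and synonymous polymorphisms.
    Returns 1 - (ds * pn) / (dn * ps).
    """
    if dn <= 0 or ps <= 0:
        # The ratio is undefined when either denominator count is zero.
        raise ValueError("dn and ps must be positive")
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Illustrative counts only (hypothetical, not results from the talk).
    alpha = mk_alpha(dn=7, ds=17, pn=2, ps=42)
    print(f"alpha = {alpha:.3f}")
```

Because this simple estimator ignores slightly deleterious polymorphisms, it tends to underestimate alpha, which is one motivation for approaches that model the distribution of fitness effects explicitly.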
RSG-Turkey is a member of the International Society for Computational Biology (ISCB) Student Council (SC) Regional Student Groups (RSG). We are a non-profit community of early-career researchers interested in computational biology and bioinformatics.