A short review: Integrative Modeling of Biomolecular Complexes

The structures of biological macromolecules are difficult to determine because the molecules are flexible: their conformations change while they function [1]. Characterizing them is nevertheless essential, and this step can be very challenging [2]. Macromolecular structures can be determined with well-known techniques such as X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy and small-angle X-ray scattering (SAXS) [3]. Each technique provides different information about a molecule's structure, and none of them alone is enough to describe the whole structure [4]. In addition, their results need to be interpreted with computational analysis to obtain more precise structures [2]. Integrative modeling has become a common technique in recent years for determining molecular structures more accurately.

What is integrative modeling?

Figure 1: Integrative structure examples of some systems [5].

As the name suggests, integrative modeling uses more than one source of information to model the structure and mechanism of biological molecules and systems [5]. It combines all available experimental data with computational techniques to obtain a more accurate, precise and complete model more efficiently (Figure 1).

Integrative modeling is an iterative process with four stages: (1) gathering information, (2) representing the system and translating the information into spatial restraints, (3) sampling structural models and (4) validating the models (Figure 2).

  Figure 2: Iterative integrative modeling process [5].

Briefly:

  1. Gathering information: The first stage in Figure 2 is collecting all available data about the structure of the system [6]. Structural information from any method or technique can be used, including information based on theory.
  2. Representing the system and translating information into spatial restraints: The information gathered in the first step is used to define the model [5]. Based on the input information, variables are chosen to describe the features of the model. In structural biology, these variables can represent atoms, coarse-grained particles or whole subunits of a complex.
  3. Structural sampling: Models start from random configurations. Different configurations are then sampled, guided by the scoring function [6]. The resulting models are filtered by how well they fit the input information.
  4. Validating the model: The good-scoring models, which together form an ensemble, are selected for validation [5]. After assessment, one model, several models or no model at all may be accepted as the result, depending on the accuracy estimated from the input data.
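
The four stages above can be sketched as a toy program. This is only an illustration under invented assumptions, not how IMP or any other package works: three coarse-grained beads, three made-up distance restraints, a sum-of-squares score and a simple Metropolis Monte Carlo sampler. All names (`RESTRAINTS`, `score`, `random_model`, `sample`, `run`) and all parameter values are hypothetical.

```python
import math
import random

# Stage 1: gathered information. These distance restraints are invented
# for illustration: (bead_i, bead_j, target distance in arbitrary units).
RESTRAINTS = [(0, 1, 3.0), (1, 2, 3.0), (0, 2, 3.0)]
N_BEADS = 3


def score(model, restraints=RESTRAINTS):
    """Stage 2: translate the information into a score.

    The score is the sum of squared restraint violations, so 0 means
    that every restraint is satisfied exactly.
    """
    total = 0.0
    for i, j, target in restraints:
        total += (math.dist(model[i], model[j]) - target) ** 2
    return total


def random_model(rng, n_beads=N_BEADS, box=10.0):
    """Random starting configuration: one 3D point per coarse-grained bead."""
    return [tuple(rng.uniform(0.0, box) for _ in range(3)) for _ in range(n_beads)]


def sample(rng, steps=5000, step_size=0.5, temperature=0.05):
    """Stage 3: Metropolis Monte Carlo sampling guided by the score."""
    model = random_model(rng)
    current = score(model)
    best, best_score = model, current
    for _ in range(steps):
        # Propose a small random displacement of one bead.
        k = rng.randrange(len(model))
        moved = tuple(c + rng.uniform(-step_size, step_size) for c in model[k])
        trial = model[:k] + [moved] + model[k + 1:]
        trial_score = score(trial)
        delta = trial_score - current
        # Always accept improvements; sometimes accept worse models.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            model, current = trial, trial_score
            if current < best_score:
                best, best_score = model, current
    return best, best_score


def run(n_runs=5, threshold=0.05, seed=0):
    """Start of stage 4: keep only the good-scoring models as the ensemble."""
    rng = random.Random(seed)
    results = [sample(rng) for _ in range(n_runs)]
    return [(model, s) for model, s in results if s <= threshold]


if __name__ == "__main__":
    ensemble = run()
    print(f"{len(ensemble)} good-scoring model(s) kept for validation")
```

In a real study each piece is far more elaborate, but the division of labor is the same: the restraints encode the data, the score measures agreement with them, the sampler explores configurations, and only models that fit the data well go on to validation.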

Figure 3 shows another representation of these four iterative stages, illustrating different application areas [7].

Figure 3: Structure determination of protein complexes and genome assemblies with integrative modeling [7].

Types of Structural Information & Software Resources for Integrative Modeling 

X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy are common techniques for determining the structure of biological molecules [4]. They provide atomic-level structural information. However, obtaining crystal structures is very challenging for large biological complexes, so other, more recent techniques are also used, as listed in Table 1.

Table 1: Types of structural information [4].

Each technique reveals a different aspect of a molecule's structure, so the techniques can be used to determine different structural properties [5]. Some of the techniques in Table 1 are briefly explained below:

  • Cryo-EM: Cryo-electron microscopy is commonly used to obtain global structural information and to characterize large complexes [4]. Single-particle cryo-EM analysis yields a three-dimensional (3D) electron density map of a macromolecular complex.
  • XL-MS: Cross-linking coupled to mass spectrometry provides distance restraints between pairs of residues in a biological molecule.
  • SAXS: Small-angle X-ray scattering is a commonly used method for obtaining information about the shapes of macromolecules.
  • Sequence information: Sequences reveal evolutionarily conserved positions, which can be related to the folding, function, interactions and dynamics of the molecule.
  • FRET: Förster resonance energy transfer is used to obtain information about the structures, dynamics and interactions of proteins.
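
To make the idea of a distance restraint concrete, here is a minimal sketch of how cross-link data could be turned into a score. The residue labels, coordinates and the 30 Å upper bound are assumptions chosen for illustration (the appropriate bound depends on the cross-linker), and `crosslink_violations` and `crosslink_score` are hypothetical helper names, not functions from any of the packages discussed here.

```python
import math


def crosslink_violations(coords, crosslinks, max_distance=30.0):
    """List the cross-linked residue pairs that are too far apart.

    coords maps a residue label to its (x, y, z) position in angstroms,
    crosslinks is a list of (label_a, label_b) pairs, and max_distance
    is the assumed upper bound on the distance between linked residues.
    """
    violated = []
    for a, b in crosslinks:
        distance = math.dist(coords[a], coords[b])
        if distance > max_distance:
            violated.append(((a, b), distance))
    return violated


def crosslink_score(coords, crosslinks, max_distance=30.0):
    """Flat-bottom penalty: zero within the bound, quadratic beyond it."""
    return sum(
        (distance - max_distance) ** 2
        for _, distance in crosslink_violations(coords, crosslinks, max_distance)
    )


if __name__ == "__main__":
    # Invented coordinates for three residues of a candidate model.
    coords = {"A10": (0.0, 0.0, 0.0), "B25": (0.0, 0.0, 20.0), "C7": (40.0, 0.0, 0.0)}
    crosslinks = [("A10", "B25"), ("A10", "C7")]
    print(crosslink_violations(coords, crosslinks))
    print(crosslink_score(coords, crosslinks))
```

A model that satisfies every cross-link scores zero, so a term like this can be added to the scores derived from other data sources such as cryo-EM or SAXS.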

Table 2 lists some commonly used software suites and their methods, which were also used to create the structures in Figure 1.

   Table 2: Example of existing software resources for integrative modeling [5].

What is the accuracy of integrative modeling?

Figure 4: Comparing the integrative structures of the Yeast NPC [5].

Figure 4 compares structures of the yeast nuclear pore complex (NPC) modeled in two different years [5]. The 2018 structure is more detailed than the 2007 structure. Over the years, the precision value has become smaller, which corresponds to a more detailed and more accurate model. In other words, a decrease in the precision value means an increase in resolution, and higher resolution lets us see the structure more clearly. Neither structure was modeled from a single source of information: both combine information from multiple sources, and the sources used for the 2018 model provide more detailed information. The purpose of integrative modeling is to get as close as possible to the true structure by using all available data, and higher-quality data increases the accuracy of the resulting model. Figure 4 is a good example of how well integrative modeling works and how the input sources affect the result.
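
One way to see why a smaller precision value means a better-defined model is to compute the variability within an ensemble. The sketch below uses the average pairwise RMSD between good-scoring models as the precision value. It assumes the models are already superposed and list the same particles in the same order. The function names and the two tiny ensembles are invented for illustration and are not taken from the NPC studies.

```python
import itertools
import math


def rmsd(model_a, model_b):
    """Root-mean-square deviation between two models with matching particles."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(model_a, model_b))
    return math.sqrt(squared / len(model_a))


def ensemble_precision(models):
    """Average pairwise RMSD across an ensemble (smaller means more precise)."""
    pairs = list(itertools.combinations(models, 2))
    if not pairs:
        raise ValueError("at least two models are needed")
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Two invented ensembles of two-particle models, coordinates in angstroms.
    tight = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]]
    loose = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]]
    print(ensemble_precision(tight), ensemble_precision(loose))
```

The models in the tight ensemble agree with each other more closely, so its precision value is smaller. Each particle position is pinned down more narrowly, which is what shows up as extra detail in the newer structure.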

——————————————————————————————————————————————————————

I suggest you take a look at the articles I used for this short review. “Principles for Integrative Structural Biology Studies” [5] is a very informative review of integrative modeling; if you read only one, I highly recommend that article. I hope this short review is helpful. Thank you for reading.

——————————————————————————————————————————————————————

References

[1]      A. Panjkovich and D. I. Svergun, “Deciphering conformational transitions of proteins by small angle X-ray scattering and normal mode analysis,” Phys. Chem. Chem. Phys., vol. 18, no. 8, pp. 5707–5719, 2016.

[2]      E. Karaca and A. M. J. J. Bonvin, “Advances in integrative modeling of biomolecular complexes,” Methods, vol. 59, no. 3, pp. 372–381, 2013.

[3]      C. E. M. Schindler, S. J. de Vries, A. Sasse, and M. Zacharias, “SAXS Data Alone can Generate High-Quality Models of Protein-Protein Complexes,” Structure, vol. 24, no. 8, pp. 1387–1397, 2016.

[4]      M. Braitbard, D. Schneidman-Duhovny, and N. Kalisman, “Integrative Structure Modeling: Overview and Assessment,” Annu. Rev. Biochem., vol. 88, no. 1, pp. 113–135, Jun. 2019.

[5]      M. P. Rout and A. Sali, “Principles for Integrative Structural Biology Studies,” Cell, vol. 177, no. 6, pp. 1384–1403, 2019.

[6]      B. Webb et al., “Integrative structure modeling with the Integrative Modeling Platform,” Protein Sci., vol. 27, no. 1, pp. 245–258, Jan. 2018.

[7]      A. P. Joseph, G. Polles, F. Alber, and M. Topf, “Integrative modelling of cellular assemblies,” Curr. Opin. Struct. Biol., vol. 46, pp. 102–109, 2017.

Comparison of SARS-CoV-2 variants with INSaFLU and galaxyproject – RSG-Türkiye Active Members

Contributors

  • Nazlı S. Kara, İstinye University
  • Meltem Kutnu, Middle East Technical University (METU)
  • Yasemin Utkueri, Sabancı University
  • Funda Yılmaz, Radboud University
  • Elif Bozlak, University of Veterinary Medicine Vienna; Vienna Graduate School of Population Genetics
  • Evrim Fer, University of Arizona

Abstract

The 2020 BioHackathon hosted efforts to improve existing variant-calling workflows for COVID-19 and to build new workflows for analyzing the large amount of data being produced. Galaxy Project, INSaFLU and nf-core are some of these. These workflows analyze genome data generated with next-generation sequencing technology and output annotated single-nucleotide polymorphism (SNP) and short insertion-deletion (indel) variants. Depending on the algorithms they use, they have different advantages and disadvantages. In this study, we aimed to compare the SARS-CoV-2 genome variants published by Galaxy Project with the variants identified by the INSaFLU workflow, and thereby to evaluate the performance of the two workflows. As a result, we found close to 600 variants that were detected by both workflows. Almost half of these variants are in replicase polyprotein 1ab. Among the shared variants, we observed more non-synonymous than synonymous variants. The shared and workflow-specific variants identified in this study can be examined in more detail in future research.

Date: June 21st, 2020 – 8:00 pm (GMT+3)

Language: Turkish

To register for the webinar, you can visit this link:

https://www.bigmarker.com/bioinfonet/INSaFLU-ve-galaxyproject-ile-SARSCoV2-varyantlarinin-karsilastirilmasi

Evolution and Unprecedented Variants of the Mitochondrial Genetic Code in a Lineage of Green Algae – David Žihala

Presenter

David Žihala

Abstract

Mitochondria of diverse eukaryotes have evolved various departures from the standard genetic code, but the breadth of possible modifications and their phylogenetic distribution are known only incompletely. Furthermore, it is possible that some codon reassignments in previously sequenced mitogenomes have been missed, resulting in inaccurate protein sequences in databases. Considering the distribution of codons at conserved amino acid positions in mitogenome-encoded proteins, mitochondria of the green algal order Sphaeropleales exhibit a diversity of codon reassignments, including previously missed ones and some that are unprecedented in any translation system examined so far, necessitating redefinition of existing translation tables and creating at least seven new ones. We resolve a previous controversy concerning the meaning of the UAG codon in Hydrodictyaceae, which beyond any doubt encodes alanine. We further demonstrate that AGG, sometimes together with AGA, encodes alanine instead of arginine in diverse sphaeroplealeans. Further newly detected changes include Arg-to-Met reassignment of the AGG codon and Arg-to-Leu reassignment of the CGG codon in particular species. Analysis of tRNAs specified by sphaeroplealean mitogenomes provides direct support for and molecular underpinning of the proposed reassignments. Furthermore, we point to unique mutations in the mitochondrial release factor mtRF1a that correlate with changes in the use of termination codons in Sphaeropleales, including the two independent stop-to-sense UAG reassignments, the reintroduction of UGA in some Scenedesmaceae, and the sense-to-stop reassignment of UCA widespread in the group. Codon disappearance seems to be the main drive of the dynamic evolution of the mitochondrial genetic code in Sphaeropleales.

Date: April 28th, 2020 – 7:00 pm (GMT+3)

Language: English

To register for the webinar, you can visit this link:
https://www.bigmarker.com/bioinfonet/Evolution-and-Unprecedented-Variants-of-the-Mitochondrial-Genetic-Code-in-a-Lineage-of-Green-Algae

Connecting to a Virtual Machine from Windows Using PuTTY (3 Steps)

Big data requires big infrastructure. If your computer cannot handle big data, you need to connect to a server or virtual machine to store and process your data.

I have been participating in COVID19-bh20. If you are a newbie like me at such events and inexperienced in handling big data in such a big hackathon, here is the first thing you need to know about managing that data: connecting to the virtual machine (VM) via PuTTY.

  • First, download PuTTY.
  • Then open the PuTTY Key Generator (PuTTYgen).

Step-1

  • Generate the public and private keys in the format requested by the admin, such as RSA (shown in the yellow box).
  • Save both keys.
  • After generating them, share the public key (shown in the red box) with the admin of the virtual machine/server.
  • Also set a key passphrase (shown in the green box); you will need it when you log in.

Step-2

  • Next, open PuTTY and type the IP address into the Host Name (or IP address) box (shown in the purple box).
  • (Do not click Open yet; the Connection settings must be changed first, as described in the following steps.)
  • You will then provide the private key needed to access the VM by changing the Connection settings (shown with an orange arrow).

Step-3

  • Click Connection (orange arrow).
  • Next, click SSH (orange arrow).
  • Then select Auth (orange arrow).
  • Under Auth, browse to the private key file to add its path (shown in the red box).
  • Now click Open to connect (green arrow).
  • Log in with the username given by the admin, in the form username@IP_address (highlighted in bold).
  • The password is the key passphrase you set while generating the key.
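
If you prefer not to click through the dialogs every time, the same settings can be passed to PuTTY on the command line. The sketch below only builds that command; it relies on PuTTY's `-ssh` and `-i` command-line options and assumes that `putty` is on your PATH. The username, IP address and key path are placeholders, and `putty_command` is a hypothetical helper name.

```python
def putty_command(username, ip_address, private_key_path, executable="putty"):
    """Build a PuTTY command line matching the three steps above.

    -ssh selects the SSH protocol and -i points to the private key file
    (.ppk) that was saved from the PuTTY Key Generator.
    """
    return [executable, "-ssh", f"{username}@{ip_address}", "-i", private_key_path]


if __name__ == "__main__":
    # Placeholder values: use the username and IP address given by the admin.
    command = putty_command("username", "203.0.113.5", r"C:\keys\my_vm_key.ppk")
    print(" ".join(command))
```

Passing the resulting list to `subprocess.run` would open the session, and PuTTY would then ask for the key passphrase as in the last step.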

I hope you find this post useful.

For detailed information, you can check the Microsoft Azure page.

PS: Although my labmates had shown me how to do this before, I had forgotten it. Thanks to the hackathon, I had a chance to refresh my memory. In case you are a newbie like me, this post might be useful.

All the best with your analysis!

About Us

Who are we?

  • We are young researchers
  • We are interested in computational biology and bioinformatics
  • We are connected in some way to the computational biology community in Turkey
  • We love meeting new people! Join us now!

What do we do?


Why do we have such a long name? (For our fun explanation, see our homepage!)

Let's break the name into its parts!
  • ISCB – The International Society for Computational Biology (ISCB)
  • ISCB SC – ISCB Student Council
  • ISCB SC RSG – ISCB Student Council – Regional Student Groups
  • ISCB SC RSG Turkiye – that's us!

Follow us: Facebook Twitter

RSG-Turkey is a member of The International Society for Computational Biology (ISCB) Student Council (SC) Regional Student Groups (RSG). We are a non-profit community composed of early career researchers interested in computational biology and bioinformatics.

Contact: turkey.rsg@gmail.com

Follow us on social media!