DiMA: Sequence Diversity Dynamics Analyser for Viruses


Tharanga S., Hu Y., Unlu E. S., Sjaugi M. F., Celik M. A., Hekimoglu H., et al.

arXiv, vol.2205, no.13915, pp.1-17, 2022 (Non Peer-Reviewed Journal)

  • Publication Type: Article
  • Volume: 2205 Issue: 13915
  • Publication Date: 2022
  • Journal Name: arXiv
  • Page Numbers: pp.1-17
  • Bezmialem Vakıf University Affiliated: Yes

Abstract

Sequence diversity is one of the major challenges in the design of diagnostic, prophylactic and therapeutic interventions against viruses. Herein, we present DiMA, a tool designed to facilitate the dissection of sequence diversity dynamics for viruses. As a base, DiMA provides a quantitative overview of sequence diversity using Shannon's entropy, applied via a user-defined k-mer sliding window to an input alignment file. Its distinctive feature is that it interrogates diversity dynamics by dissecting each k-mer position into diversity motifs, defined by the incidence of distinct sequences. At a given position, the index is the predominant sequence, while all other sequences are (total) variants of the index, sub-classified into the major (most common) variant, minor variants (occurring more than once and at a frequency lower than the major), and unique (singleton) variants. Moreover, DiMA allows for metadata enrichment of the motifs. DiMA is big-data ready and provides an interactive output, depicting multiple facets of sequence diversity, with download options. It enables comparative genome/proteome diversity dynamics analyses within and between sequences of viral species. The web server is publicly available at this https URL.
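
The entropy and motif definitions in the abstract can be illustrated with a minimal Python sketch. This is not DiMA's own code or API: the function names (`kmer_counts`, `shannon_entropy`, `classify_motifs`, `diversity_profile`) are hypothetical, the input is assumed to be a list of equal-length aligned strings, and two tie-handling rules are assumptions made here because the abstract does not specify them (a variant seen only once is always treated as unique, and variants tied with the major are listed as minor).

```python
from collections import Counter
from math import log2


def kmer_counts(alignment, start, k):
    """Count the distinct k-mers that begin at column `start` across all sequences."""
    return Counter(seq[start:start + k] for seq in alignment)


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the k-mer frequency distribution."""
    total = sum(counts.values())
    return sum(-(n / total) * log2(n / total) for n in counts.values())


def classify_motifs(counts):
    """Split the k-mers at one position into index, major, minor and unique motifs."""
    # most_common() sorts by count; equal counts keep first-seen order.
    ranked = counts.most_common()
    motifs = {"index": ranked[0][0], "major": None, "minor": [], "unique": []}
    for kmer, n in ranked[1:]:
        if n == 1:
            # Assumption: singletons are always "unique", never "major".
            motifs["unique"].append(kmer)
        elif motifs["major"] is None:
            motifs["major"] = kmer
        else:
            # Assumption: variants tied with the major are listed as minor.
            motifs["minor"].append(kmer)
    return motifs


def diversity_profile(alignment, k):
    """Slide a k-mer window over the alignment, one column at a time."""
    length = len(alignment[0])
    profile = []
    for start in range(length - k + 1):
        counts = kmer_counts(alignment, start, k)
        profile.append({
            "position": start + 1,  # 1-based k-mer position
            "entropy": shannon_entropy(counts),
            "motifs": classify_motifs(counts),
        })
    return profile
```

For example, calling `diversity_profile(alignment, k=9)` on a list of aligned protein sequences would return one record per 9-mer position, each holding the entropy value and the motif breakdown for that window. Gap characters and ambiguous residues are not handled in this sketch.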