Public databases are treasure troves of sequence data. Because viral genomes are small, viruses are among the entities with the largest numbers of full-genome sequences available. Genetic diversity is one of the mechanisms by which viruses evade the host immune response. Viruses, in particular those with RNA genomes, mutate rapidly and thus give rise to a large number of variants. In this talk, we describe the dynamics of viral diversity at the protein sequence level and its implications for vaccine design.