A collection of high quality multiple sequence alignments for objective, comparative studies of alignment algorithms. The alignments are constructed based on 3D structure superposition and manually refined to ensure alignment of important functional residues. A number of subsets are defined covering many of the most important problems encountered when aligning real sets of proteins. It is specifically designed to serve as an evaluation resource to address all the problems encountered when aligning complete sequences. The first release provided sets of reference alignments dealing with the problems of high variability, unequal repartition and large N/C-terminal extensions and internal insertions. Version 2.0 of the database incorporates three new reference sets of alignments containing structural repeats, trans-membrane sequences and circular permutations to evaluate the accuracy of detection/prediction and alignment of these complex sequences.
Within the resource, users can look at a list of all the alignments, download the whole database by ftp, get the "c" program to compare a test alignment with the BAliBASE reference (The source code for the program is freely available), or look at the results of a comparison study of several multiple alignment programs, using BAliBASE reference sets.