fastaguy88
an hour ago
One of the major breakthroughs in bioinformatics was the recognition that local similarity scores (which can be thought of as runs of positive sequence similarity) are extreme-value distributed.[0] The logic of that discovery uses almost exactly the same mathematical argument as this paper [1]; indeed, I recognized some of the same equations.
It is difficult to overstate the importance of this discovery for biology: today, the vast majority of protein functional inferences for newly sequenced genomes are based on the statistics of long runs of sequence similarity.
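For anyone who wants to see what those statistics look like in practice, here is a rough sketch of the usual Karlin-Altschul form, E = K * m * n * exp(-lambda * S), with the extreme-value tail P = 1 - exp(-E). The function names and the lambda and K values are placeholders I picked for illustration, not numbers taken from the links; real values depend on the scoring matrix and gap penalties.

```python
import math


def expected_hits(score, m, n, lam, k):
    """E-value: expected number of chance local alignments scoring >= score.

    m and n are the query and database lengths; lam and k are the
    scoring-system parameters (assumed inputs here).
    """
    return k * m * n * math.exp(-lam * score)


def p_value(evalue):
    """Probability of at least one chance hit, P = 1 - exp(-E).

    expm1 keeps precision when E is very small.
    """
    return -math.expm1(-evalue)


def bit_score(score, lam, k):
    """Normalized score, so that E = m * n * 2 ** (-bits)."""
    return (lam * score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # Placeholder parameters, chosen only to make the example run.
    lam, k = 0.267, 0.041
    m, n = 300, 50_000_000
    for raw in (40, 80, 120):
        e = expected_hits(raw, m, n, lam, k)
        print(raw, round(bit_score(raw, lam, k), 1), e, p_value(e))
```

The point is the exponential: every extra unit of raw score cuts the expected number of chance hits by a constant factor, which is why a long run of similarity is so much more convincing than a short one.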
[0] https://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html
[1] https://www.pnas.org/doi/epdf/10.1073/pnas.87.6.2264