Searching for Multiple Words in a Markov Sequence

Yonil Park
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA

John L. Spouge
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA

Published Online: 1 Nov 2004
https://doi.org/10.1287/ijoc.1040.0095

References

Arratia R., Goldstein L., Gordon L. Poisson approximation and the Chen-Stein method. Statist. Sci. (1990) 5:403–434
Biggins J. D., Cannings C. Markov renewal processes, counters and repeated sequences in Markov-chains. Adv. Appl. Probab. (1987) 19:521–545
Bishop D. T., Williamson J. A., Skolnick M. H. A model for restriction fragment length distributions. Amer. J. Human Genetics (1983) 35:795–815
Blondia C., Casals O. Statistical multiplexing of VBR sources: A matrix-analytic approach. Performance Evaluation (1992) 16:5–20
Breen S., Waterman M. S., Zhang N. Renewal theory for several patterns. J. Appl. Probab. (1985) 22:228–234
Chryssaphinou O., Papastavridis S. A limit-theorem for the number of nonoverlapping occurrences of a pattern in a sequence of independent trials. J. Appl. Probab. (1988) 25:428–431
Chryssaphinou O., Papastavridis S., Vaggelatou E. Poisson approximation for the nonoverlapping appearances of several words in Markov chains. Combinatorics Probab. Comput. (2001) 10:293–308
Fu J. C. Distribution theory of runs and patterns associated with a sequence of multi-state trials. Statistica Sinica (1996) 6:957–974
Fu J. C., Koutras M. V. Distribution-theory of runs: A Markov-chain approach. J. Amer. Statist. Association (1994) 89:1050–1058
Fu J. C., Lou W. Y. W. Distribution Theory of Runs and Patterns and its Applications: A Finite Markov Chain Imbedding Approach (2003) (World Scientific, River Edge, NJ) 69
Gelfand M. S., Kozhukhin C. G., Pevzner P. A. Extendable words in nucleotide sequences. Comput. Appl. Biosci. (1992) 8:129–135
Gentleman J. F., Mullin R. C. The distribution of the frequency of occurrence of nucleotide subsequences, based on their overlap capability. Biometrics (1989) 45:35–52
Kleffe J., Borodovsky M. 1st and 2nd moment of counts of words in random texts generated by Markov-chains. Comput. Appl. Biosci. (1992) 8:433–441
Lucantoni D. M., Meier-Hellstern K. S., Neuts M. F. A single-server queue with server vacations and a class of nonrenewal arrival processes. Adv. Appl. Probab. (1990) 22:676–705
Pevzner P. A. Nucleotide sequences versus Markov models. Comput. Chem. (1992) 16:103–106
Prum B., Rodolphe F., de Turckheim E. Finding words with unexpected frequencies in deoxyribonucleic-acid sequences. J. Roy. Statist. Soc. Ser. B-Methodological (1995) 57:205–220
Regnier M. A unified approach to word occurrence probabilities. Discrete Appl. Math. (2000) 104:259–280
Reinert G., Schbath S. Compound Poisson and Poisson process approximations for occurrences of multiple words in Markov chains. J. Comput. Biol. (1998) 5:223–253
Reinert G., Schbath S., Waterman M. S. Probabilistic and statistical properties of words: An overview. J. Comput. Biol. (2000) 7:1–46
Robin S., Daudin J. J. Exact distribution of word occurrences in a random sequence of letters. J. Appl. Probab. (1999) 36:179–193
Robin S., Daudin J. J. Exact distribution of the distances between any occurrences of a set of words. Ann. Inst. Statist. Math. (2001) 53:895–905
Robin S., Schbath S. Numerical comparison of several approximations of the word count distribution in random sequences. J. Comput. Biol. (2001) 8:349–359
Schbath S. An efficient statistic to detect over- and under-represented words in DNA sequences. J. Comput. Biol. (1997) 4:189–192
Schbath S., Prum B., de Turckheim E. Exceptional motifs in different Markov chain models for a statistical analysis of DNA sequences. J. Comput. Biol. (1995) 2:417–437
Stuckle E. E., Emmrich C., Grob U., Nielsen P. J. Statistical-analysis of nucleotide-sequences. Nucleic Acids Res. (1990) 18:6641–6647
Tanushev M. S., Arratia R. Central limit theorem for renewal theory for several patterns. J. Comput. Biol. (1997) 4:35–44
Waterman M. S. Introduction to Computational Biology (1995) (Chapman & Hall/CRC, Boca Raton, FL)

INFORMS Journal on Computing, Volume 16, Issue 4, Fall 2004, Pages 329-494

Received: July 01, 2003
Accepted: April 01, 2004
Published Online: November 01, 2004

Cite as

Yonil Park, John L. Spouge (2004) Searching for Multiple Words in a Markov Sequence. INFORMS Journal on Computing 16(4):341-347.

https://doi.org/10.1287/ijoc.1040.0095
