skip to main content
10.1145/1014052.1014114acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
Article

IncSpan: incremental mining of sequential patterns in large database

Authors Info & Claims
Published:22 August 2004Publication History

ABSTRACT

Many real life sequence databases grow incrementally. It is undesirable to mine sequential patterns from scratch each time when a small set of sequences grow, or when some new sequences are added into the database. Incremental algorithm should be developed for sequential pattern mining so that mining can be adapted to incremental database updates. However, it is nontrivial to mine sequential patterns incrementally, especially when the existing sequences grow incrementally because such growth may lead to the generation of many new patterns due to the interactions of the growing subsequences with the original ones. In this study, we develop an efficient algorithm, IncSpan, for incremental mining of sequential patterns, by exploring some interesting properties. Our performance study shows that IncSpan outperforms some previously proposed incremental algorithms as well as a non-incremental one with a wide margin.

References

  1. R. Agrawal and R. Srikant. Mining sequential patterns. In Proc. 1995 Int. Conf. Data Engineering (ICDE '95), pages 3--14, March 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. J. Ayres, J.E. Gehrke, T. Yiu, and J. Flannik. Sequential pattern mining using bitmaps. In Proc. 2002 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases (KDD '02), July 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. D. Cheung, J. Han, V. Ng, and C. Wong. Maintenance of discovered association rules in large databases: An incremental update technique. In Proc. of the 12th Int. Conf. on Data Engineering (ICDE '96), March 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. Garofalakis, R. Rastogi, and K. Shim. SPIRIT: Sequential pattern mining with regular expression constraints. In Proc. 1999 Int. Conf. Very Large Data Bases (VLDB '99), pages 223--234, Sept 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. H. Mannila, H. Toivonen, and A.I. Verkamo. Discovering frequent episodes in sequences. In Proc. 1995 Int. Conf. Knowledge Discovery and Data Mining (KDD '95), pages 210--215, Aug 1995.Google ScholarGoogle Scholar
  6. F. Masseglia, P. Poncelet, and M. Teisseire. Incremental mining of sequential patterns in large databases. Data Knowl. Eng., 46(1): 97--121, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Parthasarathy, M. Zaki, M. Ogihara, and S. Dwarkadas. Incremental and interactive sequence mining. In Proc. of the 8th Int. Conf. on Information and Knowledge Management (CIKM '99), Nov 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proc. 2001 Int. Conf. Data Engineering (ICDE '01), pages 215--224, April 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. In Proc. 1995 Int. Conf. Very Large Data Bases (VLDB '95), Sept 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. R. Srikant and R. Agrawal. Mining sequential patterns: Generalizations and performance improvements. In Proc. of the 5th Int. Conf. on Extending Database Technology (EDBT '96), Mar 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. J. Wang and J. Han. Bide: Efficient mining of frequent losed sequences. In Proc. of 2004 Int. Conf. on Data Engineering (ICDE '04), March 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. X. Yan, J. Han,and R. Afshar. CloSpan: Mining closed sequential patterns in large datasets. In Proc. 2003 SIAM Int. Conf. on Data Mining (SDM '03), May 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. M. Zaki. SPADE: An efficient algorithm for mining frequent sequences. Machine Learning, 40:31--60, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Zhang, B. Kao, D. Cheung, and C. Yip. Efficient algorithms for incremental updates of frequent sequences. In Proc. of Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD '02), May 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. IncSpan: incremental mining of sequential patterns in large database

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2004
      874 pages
      ISBN:1581138881
      DOI:10.1145/1014052

      Copyright © 2004 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 22 August 2004

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate1,133of8,635submissions,13%

      Upcoming Conference

      KDD '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader