Result: Efficient Main Memory Algorithms For Significant Interval And Frequent Episode Discovery

Title:
Efficient Main Memory Algorithms For Significant Interval And Frequent Episode Discovery
Contributors:
Chakravarthy, Sharma
Publisher Information:
Computer Science & Engineering
Publication Year:
2006
Collection:
University of Texas Arlington: UTA ResearchCommons
Document Type:
Dissertation master thesis
File Description:
application/pdf
Language:
English
Accession Number:
edsbas.C347544D
Database:
BASE

Further Information

There is considerable research on sequential mining of time-series data. Sensor-based applications such as MavHome require prediction of events in order to automate the environment using time-series data collected over a period of time. In these applications, it is important to predict tight and accurate intervals of interest to automate the application effectively. Detection of frequent patterns is also needed to automate sequences of events. Although there is a considerable body of work on sequential mining of transactional data, most of it deals with time-point data and makes several iterations over the entire data set to discover frequently occurring patterns. An alternative approach consisting of three phases has been proposed for detecting significant intervals and frequent episodes. In the first phase, time-series data is folded over a periodicity (day, week, etc.), and intervals are formed from the folded data. The advantage of this approach is that the data set is compressed substantially, which reduces the size of the input and hence the computation. Significant intervals that satisfy the user-specified criteria of minimum confidence and maximum interval length are discovered from this compressed interval data. In this thesis, we present a new single-pass main memory algorithm (the OnePass-SI algorithm) for detecting significant intervals. Unlike its counterparts, the OnePass-SI algorithm does not follow the classic Apriori style to discover significant intervals. While analyzing the OnePass-SI algorithm, we discuss its characteristics, complexity, scalability issues, and advantages over other algorithms. We also compare the performance of our algorithm with the previously developed SID and SQL-based algorithms. For the second phase, we propose a frequent episode discovery (FED) algorithm (the OnePass-FED algorithm), which is also a main memory algorithm.
The OnePass-FED algorithm works on the significant intervals discovered in the first phase to ...
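
The folding step and the two user thresholds mentioned in the abstract can be illustrated with a small sketch. This is not the OnePass-SI algorithm, whose details are not given in this record; it is a brute-force illustration under stated assumptions. The function names `fold_daily` and `significant_intervals` are hypothetical, the period is assumed to be one day at minute granularity, and confidence is assumed to mean the number of occurrences inside an interval divided by the number of observed days (the thesis may define it differently).

```python
from collections import Counter
from datetime import datetime


def fold_daily(timestamps):
    """Fold event timestamps over a one-day period (assumed granularity: minutes).

    Returns a Counter mapping minute-of-day to occurrence count, plus the
    number of distinct days observed.
    """
    counts = Counter()
    days = set()
    for ts in timestamps:
        counts[ts.hour * 60 + ts.minute] += 1
        days.add(ts.date())
    return counts, len(days)


def significant_intervals(counts, num_periods, min_confidence, max_length):
    """Brute-force search for intervals meeting both thresholds.

    Assumption: confidence = occurrences inside the interval / num_periods.
    For each starting point, only the tightest qualifying interval is kept.
    """
    points = sorted(counts)
    found = []
    for i, start in enumerate(points):
        total = 0
        for end in points[i:]:
            if end - start > max_length:
                break  # interval would exceed the maximum length
            total += counts[end]
            confidence = total / num_periods
            if confidence >= min_confidence:
                found.append((start, end, confidence))
                break  # keep only the tightest interval from this start
    return found


# Hypothetical data: one event per day over four days.
events = [
    datetime(2006, 1, 1, 7, 0),
    datetime(2006, 1, 2, 7, 5),
    datetime(2006, 1, 3, 7, 0),
    datetime(2006, 1, 4, 19, 0),
]
counts, num_days = fold_daily(events)
print(significant_intervals(counts, num_days, min_confidence=0.75, max_length=10))
# prints [(420, 425, 0.75)], i.e. 07:00 to 07:05 on three of four days
```

Folding reduces four timestamped events to three distinct points with counts, which is the kind of compression the abstract refers to; the interval search then runs over the folded points only.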