Held in conjunction with KDD'16
Aug 14, 2016 - San Francisco, CA
2nd SIGKDD Workshop on
Mining and Learning from Time Series

Introduction

Time series data are ubiquitous. The explosion of new sensing technologies (wearable sensors, satellites, mobile phones, etc.), combined with increasingly cheap and effective storage, is generating an unprecedented and growing amount of time series data in a variety of domains. The volume and complexity of these data present new and significant challenges to existing and even state-of-the-art methods. The focus of the MiLeTS workshop is to synergize research in this area and to discuss both new and open problems in time series analysis and mining. The solutions to these problems may be algorithmic, theoretical, statistical, or systems-based in nature. Further, MiLeTS emphasizes applications to high impact or relatively new domains, including but not limited to biology, health and medicine, climate and weather, road traffic, astronomy, and energy.
The MiLeTS workshop will discuss a broad variety of topics related to time series, including:

  • Time series pattern mining and detection, representation, searching and indexing, classification, clustering, prediction, forecasting, and rule mining.
  • BIG time series data.
  • Hardware acceleration techniques using GPUs, FPGAs and special processors.
  • Online, high-speed learning and mining from streaming time series.
  • Uncertain time series mining.
  • Privacy preserving time series mining and learning.
  • Time series that are multivariate, high-dimensional, heterogeneous, etc., or that possess other atypical properties.
  • Time series with special structure: spatiotemporal (e.g., wind patterns at different locations), relational (e.g., patients with similar diseases), hierarchical, etc.
  • Time series with sparse or irregular sampling, non-random missing values, and special types of measurement noise or bias.
  • Time series analysis using less traditional approaches, such as deep learning and subspace clustering.
  • Applications to high impact or relatively new time series domains, such as health and medicine, road traffic, and air quality.
  • New, open, or unsolved problems in time series analysis and mining.

Schedule

Imperial Ballroom A & B
8:00 - 17:00
August 14, 2016

 

Morning Session

08:00-08:10  Opening remarks
08:10-09:00  Keynote Latifur Khan, UTD
     Stream Data Mining: A Big Data Perspective
09:00-10:00  Paper Presentation 1: Forecasting

      Parallel News-Article Traffic Forecasting with ADMM. Stratis Ioannidis, Yunjiang Jiang, Saeed Amizadeh and Nikolay Laptev
       Using Time Series Techniques to Forecast and Analyze Wake and Sleep Behavior. Jennifer A. Williams and Diane J. Cook
       Short-term Time Series Forecasting with Regression Automata. Qin Lin, Christian Hammerschmidt, Gaetano Pellegrino and Sicco Verwer

10:00-10:30  Coffee Break
10:30-11:20  Keynote Le Song, GaTech
     Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference
11:20-12:00  Paper Presentation 2: Clustering

       Scalable Clustering of Correlated Time Series using Expectation Propagation. Christopher Aicher and Emily B. Fox
       Space-Time Clustering with Stability Probe while Riding Downhill. Xin Huang, Iliyan Iliev, Alexander Brenning and Yulia Gel

Afternoon Session

13:00-13:50  Keynote Naren Ramakrishnan, VT
     New Time Series Methods for Flu Forecasting
13:50-15:30  Paper Presentation 3: Warping and Dependencies

       On the Effect of Endpoints on Dynamic Time Warping. Diego Silva, Gustavo Batista and Eamonn Keogh
       Evaluating Improvements to the Shapelet Transform. Aaron Bostrom, Anthony Bagnall and Jason Lines
       Sparse plus low-rank graphical models of time series for functional connectivity in MEG. Nicholas J. Foti, Rahul Nadkarni, Adrian Kc Lee and Emily B. Fox
       Time Lag Concerned Dynamic Dependency Network Structure Learning. Sizhen Du, Haikun Hong and Guojie Song
       Granger Causality Networks for Categorical Time Series. Alex Tank, Emily Fox and Ali Shojaie

15:30-16:00  Coffee Break
16:00-16:20  Paper Presentation 4: Evaluation

       The Great Time Series Classification Bake Off: An Experimental Evaluation of Recently Proposed Algorithms. Anthony Bagnall, Aaron Bostrom, James Large and Jason Lines

16:20-17:00  Poster Presentation

       Open Problem: Accurately Measuring Event Impacts on Time Series. Lianhua Chi, Bo Han and Yun Wang
       Flexible Similarity Search for Enriched Trajectories. Hideaki Ohashi, Toshiyuki Shimizu and Masatoshi Yoshikawa
       Extreme Traffic Forecasting: A Deep Learning Approach. Rose Yu, Yaguang Li, Cyrus Shahabi, Ugur Demiryurek and Yan Liu
       Precursor Mining in Time Series Data. Vijay Manikandan Janakiraman, Bryan Matthews and Nikunj Oza

 

Keynote Speakers

Latifur Khan

Professor
The University of Texas at Dallas

Stream Data Mining: A Big Data Perspective

Data streams are continuous flows of data. Examples of data streams include network traffic, sensor data, call center records and so on. Data streams demonstrate several unique properties that together conform to the characteristics of big data (i.e., volume, velocity, variety and veracity) and add challenges to data stream mining. In this talk we will present an organized picture of how to apply various data mining techniques to data streams.
Most existing data stream classification techniques ignore one important aspect of stream data: the arrival of a novel class. We address this issue and propose a data stream classification technique that integrates a novel class detection mechanism into traditional classifiers, enabling automatic detection of novel classes before the true labels of the novel class instances arrive. The novel class detection problem becomes more challenging in the presence of concept drift, when the underlying data distributions evolve in streams. In this talk we will show how to make fast and correct classification decisions under this constraint with limited labeled training data and apply them to real benchmark data. In addition, we will present a number of stream classification applications such as adaptive malicious code detection, website fingerprinting, evolving insider threat detection and textual stream classification.
This research was funded in part by NSF, NASA, Air Force Office of Scientific Research (AFOSR) and Raytheon.
Bio
Dr. Latifur Khan is currently a full Professor (tenured) in the Computer Science department at the University of Texas at Dallas, USA, where he has been teaching and conducting research since September 2000. He received his Ph.D. and M.S. degrees in Computer Science from the University of Southern California in August 2000 and December 1996, respectively. Dr. Khan is an ACM Distinguished Scientist. He has received prestigious awards including the IEEE Technical Achievement Award for Intelligence and Security Informatics.
Dr. Khan has published over 200 papers in prestigious journals and in peer-reviewed conference proceedings. Currently, his research focuses on big data management and analytics, data mining, and complex data management, including geo-spatial data and multimedia data. More details can be found at: www.utdallas.edu/~lkhan/

Le Song

Assistant Professor
Georgia Institute of Technology

Dynamic Processes over Information Networks: Representation, Modeling, Learning and Inference

Nowadays, large-scale human activity data from online social platforms, such as Twitter, Facebook, Reddit, Stackoverflow, Wikipedia and Yelp, are becoming increasingly available at increasing spatial and temporal resolutions. Such data provide great opportunities for understanding and modeling both macroscopic (network-level) and microscopic (node-level) patterns in human dynamics. Such data have also fueled increasing efforts to develop realistic representations and models, as well as learning, inference and control algorithms, to understand, predict, control and distill knowledge from these dynamic processes over networks.
A trend has emerged of taking a bottom-up approach, which starts by considering the stochastic mechanism driving the behavior of each node in a network and later produces global, macroscopic patterns at the network level. However, this bottom-up approach also raises significant modeling, algorithmic and computational challenges. In this talk, I will present a machine learning framework for representing, modeling, and performing learning and inference for human activity data. The framework leverages methods from temporal point process theory, probabilistic graphical models and optimization, and often produces state-of-the-art results on various modeling and time-sensitive inference tasks.
Bio
Le Song is an assistant professor in the Department of Computational Science and Engineering, College of Computing, Georgia Institute of Technology. He received his Ph.D. in Machine Learning from the University of Sydney and NICTA in 2008, and then conducted his post-doctoral research in the Department of Machine Learning, Carnegie Mellon University, between 2008 and 2011. Before he joined Georgia Institute of Technology, he was a research scientist at Google. His principal research direction is machine learning, especially nonlinear methods and probabilistic graphical models for large-scale and complex problems arising from artificial intelligence, social network analysis, healthcare analytics, and other interdisciplinary domains. He is the recipient of the NSF CAREER Award ’14, AISTATS ’16 Best Student Paper Award, IPDPS ’15 Best Paper Award, NIPS ’13 Outstanding Paper Award, and ICML ’10 Best Paper Award. He has also served as an area chair for leading machine learning conferences such as ICML, NIPS and AISTATS, and as an action editor for JMLR.

Naren Ramakrishnan

Thomas L. Phillips Professor
Virginia Tech

New Time Series Methods for Flu Forecasting

There has been recent concerted interest in computational methods for forecasting the flu, spurred by competitions organized by agencies like the CDC and IARPA. The CDC competition aimed to forecast flu seasonal characteristics in the US, and the IARPA Open Source Indicators (OSI) forecasting tournament focused on disease forecasting (flu and rare diseases) in countries of Latin America. The speaker's team was declared the winner in the IARPA OSI competition, and this talk will communicate our lessons learned about what goes into a successful flu forecasting engine, how to evaluate its performance, and how best to ensure its relevance to public health policy and planning purposes.
Bio
Naren Ramakrishnan is the Thomas L. Phillips Professor of Engineering at Virginia Tech. He directs the Discovery Analytics Center, a university-wide effort that brings together researchers from computer science, statistics, mathematics, and electrical and computer engineering to tackle knowledge discovery problems in important areas of national interest. His work has been featured in the Wall Street Journal, Newsweek, Smithsonian Magazine, PBS/NoVA Next, Chronicle of Higher Education, and Popular Science, among other venues. Ramakrishnan serves on the editorial boards of IEEE Computer, ACM Transactions on Knowledge Discovery from Data, Data Mining and Knowledge Discovery, IEEE Transactions on Knowledge and Data Engineering, and other journals. He received his PhD in Computer Sciences from Purdue University.

Accepted Papers

 

The Great Time Series Classification Bake Off: An Experimental Evaluation of Recently Proposed Algorithms. Anthony Bagnall, Aaron Bostrom, James Large and Jason Lines

On the Effect of Endpoints on Dynamic Time Warping. Diego Silva, Gustavo Batista and Eamonn Keogh

Evaluating Improvements to the Shapelet Transform. Aaron Bostrom, Anthony Bagnall and Jason Lines

Parallel News-Article Traffic Forecasting with ADMM. Stratis Ioannidis, Yunjiang Jiang, Saeed Amizadeh and Nikolay Laptev

Using Time Series Techniques to Forecast and Analyze Wake and Sleep Behavior. Jennifer A. Williams and Diane J. Cook

Short-term Time Series Forecasting with Regression Automata. Qin Lin, Christian Hammerschmidt, Gaetano Pellegrino and Sicco Verwer

Time Lag Concerned Dynamic Dependency Network Structure Learning. Sizhen Du, Haikun Hong and Guojie Song

Space-Time Clustering with Stability Probe while Riding Downhill. Xin Huang, Iliyan Iliev, Alexander Brenning and Yulia Gel

Sparse plus low-rank graphical models of time series for functional connectivity in MEG. Nicholas J. Foti, Rahul Nadkarni, Adrian Kc Lee and Emily B. Fox

Scalable Clustering of Correlated Time Series using Expectation Propagation. Christopher Aicher and Emily B. Fox

Granger Causality Networks for Categorical Time Series. Alex Tank, Emily Fox and Ali Shojaie

 

Accepted Posters

 

Open Problem: Accurately Measuring Event Impacts on Time Series. Lianhua Chi, Bo Han and Yun Wang

Flexible Similarity Search for Enriched Trajectories. Hideaki Ohashi, Toshiyuki Shimizu and Masatoshi Yoshikawa

Extreme Traffic Forecasting: A Deep Learning Approach. Rose Yu, Yaguang Li, Cyrus Shahabi, Ugur Demiryurek and Yan Liu

Precursor Mining in Time Series Data. Vijay Manikandan Janakiraman, Bryan Matthews and Nikunj Oza

 

Call for Papers

Submissions should follow the SIGKDD formatting requirements and will be evaluated using the SIGKDD Research Track evaluation criteria. Preference will be given to papers that are reproducible, and authors are encouraged to share their data and code publicly whenever possible.

Note on open problem submissions: In order to promote new and innovative research on time series, we plan to accept a small number of high quality manuscripts describing open problems in time series analysis and mining. Such papers should provide a clear, detailed description and analysis of a new or open problem that poses a significant challenge to existing techniques, as well as a thorough empirical investigation demonstrating that current methods are insufficient.

Submissions will be managed via the MiLeTS 2016 EasyChair website.

Key Dates

 

Paper Submission Deadline: May 27, 2016, 11:59 PM PST

Author Notification: June 13, 2016, 11:59 PM PST

Final Version: June 27, 2016, 11:59 PM PST

Workshop: August 14, 2016

Workshop Organizers

 

Eamonn Keogh

University of California Riverside

 

Yan Liu

University of Southern California

 

Abdullah Mueen

University of New Mexico

 

Dehua Cheng

University of Southern California

 

Program Committee

 

Jessica Lin (George Mason University)
Anthony Bagnall (University of East Anglia)
Gustavo Batista (University of São Paulo)
Sanjay Purushotham (University of Southern California)
Zeeshan Syed (University of Michigan)
Francois Petitjean (Monash University)
Nurjahan Begum (University of California, Riverside)
Diego Silva (University of São Paulo)
Chotirat Ratanamahatana (Chulalongkorn University)
Pavel Senin (Los Alamos National Laboratory)
Dean Teffer (University of Texas at Austin)
Souhaib Ben Taieb (Monash University)
Krishnamurthy Viswanathan (HP Labs)
Liudmila Ulanova (University of California, Riverside)
Zhenhui Li (Penn State University)
Mohammad Taha Bahadori (Georgia Institute of Technology)