Time series data are ubiquitous. In domains as diverse as finance, retail, entertainment, transportation and health care, we observe a fundamental shift away from parsimonious, infrequent measurement to nearly continuous monitoring and recording. Recent
advances in diverse sensing technologies, ranging from remote sensors to wearables and social sensing, are generating a rapid growth in the size and complexity of time series archives. Thus, although time series analysis has been
studied extensively, its importance only continues to grow. What is more, modern time series data pose significant challenges to existing techniques (e.g., irregular sampling in hospital records and spatiotemporal structure in
climate data). Finally, time series mining research is challenging and rewarding because it bridges a variety of disciplines and demands interdisciplinary solutions. Now is the time to discuss the next generation of temporal mining
algorithms. The focus of the MiLeTS workshop is to synergize the research in this area and discuss both new and open problems in time series analysis and mining. The solutions to these problems may be algorithmic, theoretical, statistical,
or systems-based in nature. Further, MiLeTS emphasizes applications to high impact or relatively new domains, including but not limited to biology, health and medicine, climate and weather, road traffic, astronomy, and energy.
The MiLeTS workshop will discuss a broad variety of topics related to time series, including:
08:00-08:10 Opening remarks
08:10-09:10 Keynote Talk
09:10-10:30 Contributed Talks
10:30-11:00 Coffee Break
11:00-12:00 Keynote Talk
12:00-13:00 Lunch Break
13:00-14:00 Keynote Talk
14:00-14:30 Poster Spotlights
14:30-16:00 Coffee Break & Poster Session
16:00-16:50 Keynote Talk
16:50-17:00 Concluding Remarks
Georgia Institute of Technology
The devastating impact of the currently unfolding global COVID-19 pandemic and those of the Zika, SARS, MERS, and Ebola outbreaks over the past decade has sharply illustrated our enormous vulnerability to emerging infectious diseases. There are many questions that are being studied by epidemiologists and public officials during these outbreaks. Building on our prior work, we have been pursuing multiple activities amidst the COVID-19 pandemic in the United States, collaborating with partners in academia, industry and public health agencies, from award-winning work on helping forecast pandemic trajectories (also shown on the CDC website) to designing more localized and less burdensome campus interventions. In this talk, I will briefly give an overview of our recent research in designing well calibrated, robust, accurate and interpretable deep learning models for epidemic forecasting, illustrating the important role data science and machine learning have to play for pandemic prevention and prediction.
B. Aditya Prakash is an Associate Professor in the College of Computing at the Georgia Institute of Technology (“Georgia Tech”). He received a Ph.D. from the Computer Science Department at Carnegie Mellon University in 2012, and a B.Tech (in CS) from the Indian Institute of Technology (IIT) -- Bombay in 2007. He has published one book and more than 80 papers in major venues, holds two U.S. patents, and has given several tutorials at leading conferences. His work has also received multiple best-of-conference, best paper and travel awards. His research interests include Data Science, Machine Learning and AI, with emphasis on big-data problems in large real-world networks and time-series, with applications to computational epidemiology/public health, urban computing, security and the Web. Tools developed by his group have been in use in many places including ORNL, Walmart and Facebook. He has received several awards, including a Facebook Faculty Award and the NSF CAREER award, and was named one of the ‘AI Ten to Watch’ by IEEE. His work has also won awards in multiple data science challenges (e.g., the Facebook COVID19 Symptom Challenge) and been highlighted by several media outlets and the popular press, such as FiveThirtyEight.com. He is also a member of the infectious diseases modeling MIDAS network and core faculty at the Center for Machine Learning (ML@GT) and the Institute for Data Engineering and Science (IDEaS) at Georgia Tech. Aditya’s Twitter handle is @badityap.
University of Toronto
Much real-world data is sampled at irregular intervals, but most time series models require regularly-sampled data. Continuous-time models address this problem, but until now only deterministic models (based on ordinary differential equations) or linear-Gaussian
models were efficiently trainable with millions of parameters. We construct a scalable algorithm for computing gradients through samples from stochastic differential equations (SDEs), and for gradient-based stochastic variational
inference in function space, all with the use of adaptive black-box SDE solvers. This allows us to fit a new family of richly-parameterized distributions over time series, in which neural networks can parameterize both dynamics
and likelihoods. We demonstrate these latent SDEs on motion capture data, and provide an open-source PyTorch library for fitting large SDE models.
[Slides] [Paper] [Code]
David Duvenaud is an assistant professor at the University of Toronto. His research focuses on constructing deep probabilistic models to help predict, explain and design things. Examples include Neural ODEs (a kind of continuous-depth neural network), automatic chemical design using generative models, gradient-based hyperparameter tuning, structured latent-variable models for modeling video, and convolutional networks on graphs. Previously, he was a postdoc in the Harvard Intelligent Probabilistic Systems group with Ryan Adams. He did his Ph.D. at the University of Cambridge, where his advisors were Carl Rasmussen and Zoubin Ghahramani. His M.Sc. advisor was Kevin Murphy at the University of British Columbia. He spent a summer working on probabilistic numerics at the Max Planck Institute for Intelligent Systems, and the two summers before that at Google Research, doing machine vision. He co-founded Invenia, an energy forecasting and trading firm where he still consults. He is also a founding member of the Vector Institute.
University of California - Riverside
In the last five years there has been an explosion of papers on time series anomaly detection (TSAD) appearing in all the top conferences. In this talk I will make a surprising claim: almost all such papers suffer from various flaws, including testing
on deeply flawed datasets, use of inappropriate measures of success, non-reproducible experiments, unjustified complexity, and ignoring competitive decades-old methods. I will demonstrate that because of these flaws, we should
not believe the claims of most papers on time series anomaly detection. Rather than a completely pessimistic talk, I will take this opportunity to release a new set of 250 benchmark datasets and guidelines that will go some way
to mitigating the problems.
Eamonn Keogh is a Distinguished Professor of Computer Science at the University of California Riverside. With his students, he invented many of the most commonly used time series primitives, including Shapelets, Discords, Motifs, SAX, PAA, LB_keogh and the Matrix Profile. He has won at least one best paper award in the major data mining conferences (SIGMOD, SIGKDD, ICDM, SDM) and his h-index of 100 reflects the significant amount of citations his work has attracted.
CREST - ENSAE, Institut Polytechnique de Paris, Google Brain
I will start this talk with a short intro to optimal transport theory, and mention a few interesting areas at the intersection of both optimal transport and time series. I will describe in more detail the JKO proximal gradient descent scheme and show how it can be used to model time series of snapshots of populations (joint work with Charlotte Bunne, Laetitia Papaxanthos and Andreas Krause) and also the Wasserstein DTW discrepancy for such time series.
Marco Cuturi joined Google Brain, in Paris, in October 2018. He graduated from ENSAE (2001), ENS Cachan (Master MVA, 2002) and holds a PhD in applied maths obtained in 2005 at Ecole des Mines de Paris. He worked as a post-doctoral researcher at the Institute of Statistical Mathematics, Tokyo, between 11/2005 and 03/2007. He worked in the financial industry between 04/2007 and 09/2008. After working at the ORFE department of Princeton University between 02/2009 and 08/2010 as a lecturer, he was at the Graduate School of Informatics of Kyoto University between 09/2010 and 09/2016 as a tenured associate professor. He then joined ENSAE, the French national school for statistics and economics, in 09/2016, where he still teaches. His proposal to solve optimal transport using an entropic regularization has re-ignited interest in optimal transport and Wasserstein distances in the machine learning community. His work has recently focused on applying that loss function to problems involving probability distributions, e.g. topic models / dictionary learning for text and images, parametric inference for generative models, regression with a Wasserstein loss and probabilistic embeddings for words.
RLAD: Time Series Anomaly Detection through Reinforcement Learning and Active Learning Tong Wu and Jorge Ortiz.
Personalized and Environment-Aware Battery Prediction for Electric Vehicles Dongyue Li, Guangyu Li, Bo Jiang, Zhengping Che and Yan Liu.
Exploring Generative Data Augmentation in Multivariate Time Series Forecasting: Opportunities and Challenges Ankur Debnath, Govind Waghmare, Hardik Wadhwa, Siddhartha Asthana and Ankur Arora.
Short Text Clustering in Continuous Time Using Stacked Dirichlet-Hawkes Process with Inverse Cluster Frequency Prior Avirup Saha and Balaji Ganesan.
TE-ESN: Time Encoding Echo State Network for Prediction Based on Irregularly Sampled Time Series Data Chenxi Sun, Shenda Hong, Moxian Song and Hongyan Li.
Learning Robust Representations using a Change Point Framework Ame Osotsi and Qunhua Li.
Low-Rank Autoregressive Tensor Completion for Spatiotemporal Traffic Data Imputation Xinyu Chen, Mengying Lei, Nicolas Saunier and Lijun Sun.
Improving COVID-19 Forecasting using eXogenous Variables Mohammadhossein Toutiaee, Xiaochuan Li, Yogesh Chaudhari, Shophine Sivaraja, Aishwarya Venkataraj, Indrajeet Javeri, Yuan Ke, Ismailcem Arpinar, Nicole Lazar and John Miller.
Visual Time Series Forecasting: An Image-driven Approach Naftali Cohen, Srijan Sood, Zhen Zeng, Tucker Balch and Manuela Veloso.
Multi-Window-Finder: Domain Agnostic Window Size for Time Series Data Shima Imani, Alireza Abdoli, Ali Beyram, Azam Imani and Eamonn Keogh.
Detection and clustering of lead-lag networks for multivariate time series with an application to financial markets Stefanos Bennett, Mihai Cucuringu and Gesine Reinert.
Temporal Progression: A case study in Porcine Survivability through Hemostatic Nanoparticles Chhaya K, Nuzhat Maisha, Leasha J Schaub, Jacob Glaser, Erin Lavik and Vandana Janeja.
Recurrent Attentive Kernel Learning for Shark Activity Recognition Matthew Buchholz, Wenlu Zhang, Emily N. Meese, Yu Yang, Christopher G. Lowe and Hen-Geul Yeh.
HIVE-COTE 2.0: a new meta ensemble for time series classification Matthew Middlehurst, James Large, Michael Flynn, Jason Lines, Aaron Bostrom and Anthony Bagnall.
Aggregate Learning for Mixed-Frequency Data Daisuke Moriwaki, Takamichi Toda and Kazuhiro Ota.
Multivariate time series forecasting with diffusion kernels: Freeway traffic prediction Semin Kwak, Nikolasa Geroliminis and Pascal Frossard.
Time Series Features for Classification of Contaminated Cell Cultures Laura Tupper, Charles Keese and David Matteson.
Actionable Insights in Multivariate Time-series for Urban Analytics Anika Tabassum, Supriya Chinthavali, Varisara Tansakul and B. Aditya Prakash.
Forecasting COVID-19 Counts At A Single Hospital: A Hierarchical Bayesian Approach Alexandra Lee, Panagiotis Lymperopoulos, Joshua T. Cohen, John B. Wong and Michael Hughes.
Submissions should follow the SIGKDD formatting requirements (unless otherwise stated) and will be evaluated using the SIGKDD Research Track evaluation criteria. Preference will be given to papers that are reproducible, and authors are encouraged to share their data and code publicly whenever possible. Submissions are strongly recommended to be no more than 4 pages, excluding references or supplementary materials (all in a single PDF). The appropriateness of using additional pages over the recommended length will be judged by reviewers. All submissions must be in PDF format using the workshop template (LaTeX, Word). Submissions will be managed via the MiLeTS 2021 EasyChair website: https://easychair.org/conferences/?conf=milets2021.
Note on open problem submissions: In order to promote new and innovative research on time series, we plan to accept a small number of high quality manuscripts describing open problems in time series analysis and mining. Such papers should provide a clear, detailed description and analysis of a new or open problem that poses a significant challenge to existing techniques, as well as a thorough empirical investigation demonstrating that current methods are insufficient.
COVID-19 Time Series Analysis Special Track: The COVID-19 pandemic is impacting almost everyone worldwide and is expected to have life-altering short and long-term effects. There are many potential applications of time series analysis and mining that can contribute to understanding of this pandemic. We encourage submission of high quality manuscripts describing original problems, time series datasets, and novel solutions for time series analysis and forecasting of COVID-19.
The review process is single-round and double-blind (submission files have to be anonymized). Concurrent submissions to other journals and conferences are acceptable. Accepted papers will be presented as posters during the workshop and listed on the website (non-archival, without proceedings). In addition, a small number of accepted papers will be selected for presentation as contributed talks.
Any questions may be directed to the workshop e-mail address: firstname.lastname@example.org.
Paper Submission Deadline: June 1st, 2021 (extended from May 20th), 11:59 PM Alofi Time
Author Notification: July 2nd, 2021
Camera Ready Version: July 16th, 2021
Workshop: August 15th, 2021, 8:00 AM - 5:00 PM Pacific Time.