Table of Contents
Keynote Address
The Architecture of SciDB Michael Stonebraker Paul Brown Alex Poliakov Suchi Raman 1
Ranked Search
Location-Based Instant Search Shengyue Ji Chen Li 17
Continuous Inverse Ranking Queries in Uncertain Streams Thomas Bernecker Hans-Peter Kriegel Nikos Mamoulis Matthias Renz Andreas Zuefle 37
Finding Haystacks with Needles: Ranked Search for Data using Geospatial and Temporal Characteristics V. M. Megler David Maier 55
Using Medians to Generate Consensus Rankings for Biological Data Sarah Cohen-Boulakia Alain Denise Sylvie Hamel 73
A Truly Dynamic Data Structure for Top-k Queries on Uncertain Data Manish Patil Rahul Shah Sharma V. Thankachan 91
Temporal Data and Queries
Efficient Storage and Temporal Query Evaluation in Hierarchical Data Archiving Systems Hui (Wendy) Wang Ruilin Liu Dimitri Theodoratos Xiaoying Wu 109
Update Propagation in a Streaming Warehouse Theodore Johnson Vladislav Shkapenyuk 129
Efficient Processing of Multiple DTW Queries in Time Series Databases Hardy Kremer Stephan Günnemann Anca-Maria Ivanescu Ira Assent Thomas Seidl 150
Probabilistic Time Consistent Queries over Moving Objects Xiang Lian Lei Chen 168
Workflows and Provenance
Knowledge Annotations in Scientific Workflows: An Implementation in Kepler Aída Gándara George Chin Paulo Pinheiro da Silva Signe White Chandrika Sivaramakrishnan Terence Critchlow 189
Improving Workflow Fault Tolerance through Provenance-Based Recovery Sven Köhler Sean Riddle Daniel Zinn Timothy McPhillips Bertram Ludäscher 207
PROPUB: Towards a Declarative Approach for Publishing Customized, Policy-Aware Provenance Saumen C. Dey Daniel Zinn Bertram Ludäscher 225
Provenance-Enabled Automatic Data Publishing James Frew Greg Janée Peter Slaughter 244
Panel I
A Panel Discussion on Data Intensive Science: Moving towards Solutions Terence Critchlow 253
Querying Graphs
Querying Shortest Path Distance with Bounded Errors in Large Graphs Miao Qiao Hong Cheng Jeffrey Xu Yu 255
PG-Join: Proximity Graph Based String Similarity Joins Michail Kazimianec Nikolaus Augsten 274
A Flexible Graph Pattern Matching Framework via Indexing Wei Jin Jiong Yang 293
Subgraph Search over Massive Disk Resident Graphs Peng Peng Lei Zou Lei Chen Xuemin Lin Dongyan Zhao 312
BR-Index: An Indexing Structure for Subgraph Matching in Very Large Dynamic Graphs Jiong Yang Wei Jin 322
Clustering and Data Mining
CloudVista: Visual Cluster Exploration for Extreme Scale Data in the Cloud Keke Chen Huiqi Xu Fengguang Tian Shumin Guo 332
Efficient Selectivity Estimation by Histogram Construction Based on Subspace Clustering Andranik Khachatryan Emmanuel Müller Klemens Böhm Jonida Kopper 351
Finding Closed MEMOs Htoo Htet Aung Kian-Lee Tan 369
Density Based Subspace Clustering over Dynamic Data Hans-Peter Kriegel Peer Kröger Irene Ntoutsi Arthur Zimek 387
Hierarchical Clustering for Real-Time Stream Data with Noise Philipp Kranen Felix Reidl Fernando Sanchez Villaamil Thomas Seidl 405
Architectures and Privacy
Energy Proportionality and Performance in Data Parallel Computing Clusters Jinoh Kim Jerry Chou Doron Rotem 414
Privacy Preserving Group Linkage Fengjun Li Yuxin Chen Bo Luo Dongwon Lee Peng Liu 432
Dynamic Anonymization for Marginal Publication Xianmang He Yanghua Xiao Yujia Li Qing Wang Wei Wang Baile Shi 451
Pantheon: Exascale File System Search for Scientific Computing Joseph L. Naps Mohamed F. Mokbel David H.C. Du 461
Massive-Scale RDF Processing using Compressed Bitmap Indexes Kamesh Madduri Kesheng Wu 470
Database-as-a-Service for Long-Tail Science Bill Howe Garret Cole Emad Soroush Paraschos Koutris Alicia Key Nodira Khoussainova Leilani Battle 480
Panel II
Data Scientists, Data Management and Data Policy Sylvia Spengler 490
Applications and Models
Context-Aware Parameter Estimation for Forecast Models in the Energy Domain Lars Dannecker Robert Schulze Matthias Böhm Wolfgang Lehner Gregor Hackenbroich 491
Implementing a General Spatial Indexing Library for Relational Databases of Large Numerical Simulations Gerard Lemson Tamás Budavári Alexander Szalay 509
Histogram and Other Aggregate Queries in Wireless Sensor Networks Khaled Ammar Mario A. Nascimento 527
Efficient In-Database Maintenance of ARIMA Models Frank Rosenthal Wolfgang Lehner 537
Recipes for Baking Black Forest Databases: Building and Querying Black Hole Merger Trees from Cosmological Simulations Julio López Colin Degraf Tiziana DiMatteo Bin Fu Eugene Fink Garth Gibson 546
CrowdLabs: Social Analysis and Visualization for the Sciences Phillip Mates Emanuele Santos Juliana Freire Cláudio T. Silva 555
Posters
Heidi Visualization of R-tree Structures over High Dimensional Data Shraddha Agrawal Soujanya Vadapalli Kamalakar Karlapalem 565
Towards Efficient and Precise Queries over Ten Million Asteroid Trajectory Models Yusra AlSayyad K. Simon Krughoff Bill Howe Andrew J. Connolly Magdalena Balazinska Lynne Jones 568
Keyword Search Support for Automating Scientific Workflow Composition David Chiu Travis Hall Farhana Kabir Gagan Agrawal 571
FastQuery: A General Indexing and Querying System for Scientific Data Jerry Chou Kesheng Wu Prabhat 573
Retrieving Accurate Estimates to OLAP Queries over Uncertain and Imprecise Multidimensional Data Streams Alfredo Cuzzocrea 575
Hybrid Data-Flow Graphs for Procedural Domain-Specific Query Languages Bernhard Jaecksch Franz Faerber Frank Rosenthal Wolfgang Lehner 577
Scalable and Automated Workflow in Mining Large-Scale Severe-Storm Simulations Lei Jiang Gabrielle Allen Qin Chen 579
Accurate Cost Estimation using Distribution-Based Cardinality Estimates for Multi-dimensional Queries Andranik Khachatryan Klemens Böhm 581
Session-Based Browsing for More Effective Query Reuse Nodira Khoussainova YongChul Kwon Wei-Ting Liao Magdalena Balazinska Wolfgang Gatterbauer Dan Suciu 583
The ETLMR MapReduce-Based ETL Framework Xiufeng Liu Christian Thomsen Torben Bach Pedersen 586
Top-k Similarity Search on Uncertain Trajectories Chunyang Ma Hua Lu Lidan Shou Gang Chen Shujie Chen 589
Fast and Accurate Trajectory Streams Clustering Elio Masciari 592
Data-Driven Multidimensional Design for OLAP Oscar Romero Alberto Abelló 594
An Adaptive Outlier Detection Technique for Data Streams Shiblee Sadik Le Gruenwald 596
Power-Aware DBMS: Potential and Challenges Yi-cheng Tu Xiaorui Wang Zichen Xu 598
Author Index 601