OLAPS: Online Load-Balancing in Range-Partitioned Main Memory Database with Approximate Partition Statistics

Djahida Belayadi1, Khaled-Walid Hidouci1 and Ladjel Bellatreche2

  1. Laboratoire de la Communication dans les Systmes Informatiques, Ecole nationale Sup´erieure d’Informatique
    BP 68M, 16309, Oued-Smar, Algiers, Algeria
    {d_belayadi,w_hidouci}@esi.dz
  2. LIAS/ISAE-ENSMA – Poitiers University
    86960 Futuroscope, France
    bellatreche@ensma.fr

Abstract

Modern database systems can achieve high throughput main-memory query execution by being aware of the dynamics of highly parallel hardware. In such systems, data is partitioned into smaller pieces to reach a better parallelism. Unfortunately, data skew is one of the main problems faced during parallel processing in a parallel main memory database. In some data-intensive applications, parallel range queries over a dynamic range partitioned system are important. Continuous insertions/deletions can lead to a very high degree of data skew and consequently a poor performance of parallel range queries. In this paper, we propose an approach for maintaining balanced loads over a set of nodes as in a system of communicating vessels, by migrating tuples between neighboring nodes. These frequent (or even continuous) data transfers inevitably involve dynamic changes in the partition statistics. To avoid the performance degradation typically associated with this dynamism, we provide a solution based on an approximate Partition Statistics Table. The basic idea behind this table is that both clients and nodes may have an imperfect knowledge about the effective load distribution. They can nevertheless locate any data with almost the same efficiency as using exact partition statistics. Furthermore, maintaining load distribution statistics do not require exchanging additional messages as opposed to the cost of efficient solutions from the state-of-art (which requires at least O(logn) messages). We show through intensive experiments that our proposal supports efficient range queries, while simultaneously guaranteeing storage balance even in the presence of numerous concurrent insertions/deletions generating a heavy skewed data distribution.

Key words

Load balancing, Parallel main memory database systems (MMBD), Range partitioning, Data skew, Range query

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS170320007B

Publication information

Volume 15, Issue 2 (June 2018)
Year of Publication: 2018
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

Full text

DownloadAvailable in PDF
Portable Document Format

How to cite

Belayadi, D., Hidouci, K., Bellatreche, L.: OLAPS: Online Load-Balancing in Range-Partitioned Main Memory Database with Approximate Partition Statistics. Computer Science and Information Systems, Vol. 15, No. 2. (2018), https://doi.org/10.2298/CSIS170320007B