Result: Scheduling Problems in Parallel Query Optimization
We introduce a class of novel multiprocessor scheduling problems that arise in the optimization of SQL queries for parallel machines. These problems consist of scheduling a tree of interdependent, communicating operators while exploiting both inter-operator and intra-operator parallelism. We develop algorithms for the specific problem of scheduling a Pipelined Operator Tree, in which all operators run in parallel using inter-operator parallelism. Weights on nodes and edges represent the cost of operators and of communication, respectively. Communication cost is incurred only if adjacent operators are assigned to different processors. The optimization problem is to assign operators to processors so as to minimize the maximum processor load. We develop two approximation algorithms for this NP-hard problem. The faster algorithm has a performance ratio of 3.56, while the slower algorithm has a ratio of 2.87.

1 Introduction

Exploiting parallel execution [DG92, Val93] to speed up database queries…
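The objective stated in the abstract (assign operators to processors so as to minimize the maximum processor load) can be illustrated with a small Python sketch that evaluates one candidate assignment. This is only an illustration of the cost model, not either of the approximation algorithms. The function name `max_processor_load`, the dictionary layout, and the three-operator example are hypothetical. The sketch also assumes that the weight of an edge whose endpoints sit on different processors is charged to both of those processors; the abstract says only that the cost is incurred, not how it is attributed.

```python
from collections import defaultdict


def max_processor_load(node_cost, edge_cost, assignment):
    """Return the maximum processor load under a given assignment.

    node_cost:  dict mapping operator -> operator cost (node weight)
    edge_cost:  dict mapping (u, v) -> communication cost (edge weight)
    assignment: dict mapping operator -> processor id
    """
    load = defaultdict(float)

    # Every operator contributes its own cost to its processor.
    for op, cost in node_cost.items():
        load[assignment[op]] += cost

    # Communication is paid only when the edge crosses processors.
    for (u, v), cost in edge_cost.items():
        pu, pv = assignment[u], assignment[v]
        if pu != pv:
            # Assumption: a crossing edge is charged to both endpoints.
            load[pu] += cost
            load[pv] += cost

    return max(load.values())


# Hypothetical pipelined operator tree: scan -> join -> sort.
node_cost = {"scan": 4, "join": 6, "sort": 3}
edge_cost = {("scan", "join"): 2, ("join", "sort"): 1}

# All operators on one processor: no communication, but no parallelism.
together = {"scan": 0, "join": 0, "sort": 0}

# The join on its own processor: both edges now incur communication.
split = {"scan": 0, "join": 1, "sort": 0}

print(max_processor_load(node_cost, edge_cost, together))
print(max_processor_load(node_cost, edge_cost, split))
```

Comparing the two assignments shows the trade-off the problem captures: spreading operators across processors reduces the computation each processor carries, but adds communication cost on every edge that is cut.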