mirror of
https://github.com/mmueller41/mxtasking.git
synced 2026-01-21 20:52:57 +01:00
2.3 KiB
2.3 KiB
HashJoin Benchmark
Benchmark of a parallel, task-based HashJoin.
How to generate workload
- Download TPC-H benchmark and generate tables
- Specify joined tables and key-indices via CLI arguments
Important CLI arguments
- The first argument is the number of cores:
./bin/hashjoin_benchmark 1for using a single core../bin/hashjoin_benchmark 1:24for using cores1up to24.
-i <NUMBER>specifies the number of repetitions of each workload.-s <NUMBER>steps of the cores:-s 1will increase the used cores by one (core ids:0,1,2,3,4,5,6,7,..,23).-s 2will skip every second core (core ids:0,1,3,5,7,..23).
-pd <NUMBER>specifies the prefetch distance.-por--perfwill activate performance counter (result will be printed to console and output file).-Rspecifies the TPC-H table file for the left relation.-R-keyspecifies the index of the join key forR.-Sspecifies the TPC-H table file for the right relation.-S-keyspecifies the index of the join key forS.--batchspecifies the records per task (comma separated:8,16,64,256)
Understanding the output
After started, the benchmark will print a summary of configured cores and workload:
core configuration:
1: 0
2: 0 1
4: 0 1 2 3
workload: customer.tbl.0 (#3000000) JOIN orders.tbl.1 (#30000000)
Here, we configured the benchmark to use one to four cores; each line of the core configuration displays the number of cores and the core identifiers.
Following, the benchmark will be started and print the results for every iteration:
1 1 64 1478 ms 3.38295e+06 op/s
2 1 64 964 ms 5.18672e+06 op/s
4 1 64 935 ms 5.34759e+06 op/s
- The first column is the number of used cores.
- The second column displays the iteration of the benchmark (configured by
-i X). - Thirdly, the granularity of how many records per task will be processed.
- After that, the time and throughput are written.
- If
--perfis enabled, the output will be extended by some perf counters, which are labeled (like throughput).
Plot the results
When using -o FILE, the results will be written to the given file, using JSON format.
The plot script scripts/plot_hashjoin_benchmark INPUT_FILE will aggregate and plot the results using one JSON file.