Commit 29b593a

Merge pull request #215 from Morwenn/develop
Release 1.14.0
2 parents 23424cc + cbad910 commit 29b593a

126 files changed, +2283 -1015 lines changed


CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ cmake_minimum_required(VERSION 3.8.0)

 list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)

-project(cpp-sort VERSION 1.13.2 LANGUAGES CXX)
+project(cpp-sort VERSION 1.14.0 LANGUAGES CXX)

 include(CMakePackageConfigHelpers)
 include(GNUInstallDirs)

NOTICE.txt

Lines changed: 10 additions & 0 deletions
@@ -75,6 +75,16 @@ In addition, certain files include the notices provided below.

 ----------------------

+// boost heap: d-ary heap as container adaptor
+//
+// Copyright (C) 2010 Tim Blechmann
+//
+// Distributed under the Boost Software License, Version 1.0. (See
+// accompanying file LICENSE_1_0.txt or copy at
+// http://www.boost.org/LICENSE_1_0.txt)
+
+----------------------
+
 //----------------------------------------------------------------------------
 /// @file merge.hpp
 /// @brief low level merge functions

README.md

Lines changed: 31 additions & 11 deletions
@@ -1,7 +1,7 @@
 ![cpp-sort logo](docs/images/cpp-sort-logo.svg)

-[![Latest Release](https://img.shields.io/badge/release-1.13.2-blue.svg)](https://github.com/Morwenn/cpp-sort/releases/tag/1.13.2)
-[![Conan Package](https://img.shields.io/badge/conan-cpp--sort%2F1.13.2-blue.svg)](https://conan.io/center/cpp-sort?version=1.13.2)
+[![Latest Release](https://img.shields.io/badge/release-1.14.0-blue.svg)](https://github.com/Morwenn/cpp-sort/releases/tag/1.14.0)
+[![Conan Package](https://img.shields.io/badge/conan-cpp--sort%2F1.14.0-blue.svg)](https://conan.io/center/cpp-sort?version=1.14.0)
 [![Code Coverage](https://codecov.io/gh/Morwenn/cpp-sort/branch/develop/graph/badge.svg)](https://codecov.io/gh/Morwenn/cpp-sort)
 [![Pitchfork Layout](https://img.shields.io/badge/standard-PFL-orange.svg)](https://github.com/vector-of-bool/pitchfork)

@@ -98,16 +98,22 @@ and extending **cpp-sort** in [the wiki](https://github.com/Morwenn/cpp-sort/wik
 # Benchmarks

 The following graph has been generated with a script found in the benchmarks
-directory. It shows the time needed for a sorting algorithm to sort one million
-shuffled `std::array<int, N>` of sizes 0 to 32. It compares the sorters generally
-used to sort small arrays:
+directory. It shows the time needed for [`heap_sort`][heap-sorter] to sort one
+million elements without being adapted, then when it is adapted with either
+[`drop_merge_adapter`][drop-merge-adapter] or [`split_adapter`][split-adapter].

-![Benchmark speed of small sorts with increasing size for std::array<int>](https://i.imgur.com/dOa3vyl.png)
+![Graph showing the speed difference between heap_sort raw, then adapted with
+split_adapter and drop_merge_adapter, when the number of inversions in the
+std::vector<int> to sort increases](https://i.imgur.com/IcjUkYF.png)

-These results were generated with MinGW-w64 g++ 10.1 with the compiler options
-`-std=c++2a -O3 -march=native`. That benchmark is merely an example to make this
-introduction look good. You can find more commented benchmarks in the [dedicated
-wiki page](https://github.com/Morwenn/cpp-sort/wiki/Benchmarks).
+As can be seen above, wrapping `heap_sort` with either of the adapters makes it
+[*adaptive*][adaptive-sort] to the number of inversions in a non-intrusive
+manner. The algorithms used to adapt it have different pros and cons, it is up
+to you to use either.
+
+This benchmark is mostly there to show the possibilities offered by the
+library. You can find more such commented benchmarks in the [dedicated wiki
+page][benchmarks].

 # Compiler support & tooling

@@ -156,7 +162,14 @@ parts of the benchmarks come from there as well.
 of a Timsort](https://github.com/gfx/cpp-TimSort).

 * The three algorithms used by `spread_sorter` come from Steven Ross [Boost.Sort
-module](https://www.boost.org/doc/libs/1_71_0/libs/sort/doc/html/index.html).
+module](https://www.boost.org/doc/libs/1_80_0/libs/sort/doc/html/index.html).
+
+* The algorithm used by `d_ary_spread_sorter` comes from Tim Blechmann's
+[Boost.Heap module](https://www.boost.org/doc/libs/1_80_0/doc/html/heap.html).
+
+* The algorithm used by `spin_sorter` comes from the eponymous algorithm implemented
+in [Boost.Sort](https://www.boost.org/doc/libs/1_80_0/libs/sort/doc/html/index.html).
+by Francisco Jose Tapia.

 * [`utility::as_function`](https://github.com/Morwenn/cpp-sort/wiki/Miscellaneous-utilities#as_function),
 [`utility::static_const`](https://github.com/Morwenn/cpp-sort/wiki/Miscellaneous-utilities#static_const),
@@ -227,3 +240,10 @@ and [Crascit/DownloadProject](https://github.com/Crascit/DownloadProject).

 * Some of the benchmarks use a [colorblind-friendly palette](https://gist.github.com/thriveth/8560036)
 developed by Thøger Rivera-Thorsen.
+
+
+[adaptive-sort]: https://en.wikipedia.org/wiki/Adaptive_sort
+[benchmarks]: https://github.com/Morwenn/cpp-sort/wiki/Benchmarks
+[drop-merge-adapter]: https://github.com/Morwenn/cpp-sort/wiki/Sorter-adapters#drop_merge_adapter
+[heap-sorter]: https://github.com/Morwenn/cpp-sort/wiki/Sorters#heap_sorter
+[split-adapter]: https://github.com/Morwenn/cpp-sort/wiki/Sorter-adapters#split_adapter
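
The new README paragraph says that wrapping `heap_sort` in an adapter makes it adaptive to the number of inversions. The sketch below illustrates the general idea only, using nothing but the standard library. It is not the cpp-sort implementation: `sort_adaptively` and this `heap_sort` are hypothetical names, and the greedy "keep the longest run seen so far" rule is a simplification of what a real drop-merge algorithm does.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper, not part of cpp-sort: keep the elements that already
// form a non-decreasing run, sort only the remaining ones with the wrapped
// sorter, then merge both sorted parts back together.
template<typename Sorter>
void sort_adaptively(std::vector<int>& values, Sorter fallback_sort)
{
    std::vector<int> kept;     // elements already in non-decreasing order
    std::vector<int> dropped;  // elements that break that order
    for (int value : values) {
        if (kept.empty() || kept.back() <= value) {
            kept.push_back(value);
        } else {
            dropped.push_back(value);
        }
    }

    // Only the out-of-order elements pay the full sorting cost
    fallback_sort(dropped);

    values.clear();
    std::merge(kept.begin(), kept.end(),
               dropped.begin(), dropped.end(),
               std::back_inserter(values));
}

// Plain heapsort built from the standard heap algorithms
inline void heap_sort(std::vector<int>& values)
{
    std::make_heap(values.begin(), values.end());
    std::sort_heap(values.begin(), values.end());
}
```

Calling `sort_adaptively(v, heap_sort)` on a nearly sorted vector hands only a few elements to the heapsort, which is why the cost tracks how disordered the input is.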

benchmarks/benchmarking-tools/distributions.h

Lines changed: 2 additions & 2 deletions
@@ -359,9 +359,9 @@ namespace dist

     for (long long int i = 0 ; i < size ; ++i) {
         if (percent_dis(distributions_prng) < factor) {
-            *out++ = value_dis(distributions_prng);
+            *out++ = proj(value_dis(distributions_prng));
         } else {
-            *out++ = i;
+            *out++ = proj(i);
         }
     }
 }
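
The change above routes every generated value through `proj` before writing it out. A minimal self-contained sketch of that loop follows. The function name `fill_with_inversions`, the explicit `prng` parameter and the ranges chosen for `value_dis` and `percent_dis` are assumptions, since the hunk does not show how they are defined in `distributions.h`.

```cpp
#include <cassert>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical stand-in for the benchmark's "inversions" distribution:
// each position holds its own index, except that roughly a fraction
// `factor` of the positions receive a random value instead.
template<typename OutputIterator, typename Projection>
void fill_with_inversions(OutputIterator out, long long int size, double factor,
                          std::mt19937_64& prng, Projection proj)
{
    if (size <= 0) {
        return;  // nothing to generate, and avoids an invalid (0, -1) range
    }
    // Assumed ranges: random values in [0, size - 1], probabilities in [0, 1)
    std::uniform_int_distribution<long long int> value_dis(0, size - 1);
    std::uniform_real_distribution<double> percent_dis(0.0, 1.0);

    for (long long int i = 0; i < size; ++i) {
        if (percent_dis(prng) < factor) {
            *out++ = proj(value_dis(prng));  // out-of-place random value
        } else {
            *out++ = proj(i);                // value already in sorted position
        }
    }
}
```

With `factor` set to `0.0` the output is the projected sequence `proj(0), proj(1), ...`, so the projection is what lets the same distribution produce element types other than plain integers.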

benchmarks/inversions/inv-bench.cpp

Lines changed: 24 additions & 10 deletions
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2020-2021 Morwenn
+ * Copyright (c) 2020-2022 Morwenn
  * SPDX-License-Identifier: MIT
  */
 #include <cassert>
@@ -17,7 +17,6 @@
 #include "../benchmarking-tools/distributions.h"
 #include "../benchmarking-tools/filesystem.h"
 #include "../benchmarking-tools/rdtsc.h"
-#include "../benchmarking-tools/statistics.h"

 using namespace std::chrono_literals;

@@ -34,14 +33,14 @@ using sort_f = void (*)(collection_t&);
 std::pair<std::string, sort_f> sorts[] = {
     { "drop_merge_sort", cppsort::drop_merge_sort },
     { "pdq_sort", cppsort::pdq_sort },
-    { "split_sort", cppsort::split_sort }
+    { "split_sort", cppsort::split_sort },
 };

 // Size of the collections to sort
 constexpr std::size_t size = 1'000'000;

 // Maximum time to let the benchmark run for a given size before giving up
-auto max_run_time = 3s;
+auto max_run_time = 5s;
 // Maximum number of benchmark runs per size
 std::size_t max_runs_per_size = 25;

@@ -68,18 +67,24 @@ int main(int argc, char* argv[])
     std::uint_fast32_t seed = std::time(nullptr);
     std::cout << "SEED: " << seed << '\n';

+    int sort_number = 0;
     for (auto& sort: sorts) {
         // Create a file to store the results
-        std::string output_filename = output_directory + '/' + safe_file_name(sort.first) + ".csv";
-        std::ofstream output_file(output_filename);
+        auto sort_number_str = std::to_string(sort_number);
+        auto output_filename =
+            std::string(3 - sort_number_str.size(), '0') +
+            std::move(sort_number_str) +
+            '-' + safe_file_name(sort.first) + ".csv";
+        std::string output_path = output_directory + '/' + output_filename;
+        std::ofstream output_file(output_path);
         output_file << sort.first << '\n';
         std::cout << sort.first << '\n';

         // Seed the distribution manually to ensure that all algorithms
         // sort the same collections when there is randomness
         distributions_prng.seed(seed);

-        for (int idx = 0 ; idx <= 100 ; ++idx) {
+        for (int idx = 0; idx <= 100; ++idx) {
             double factor = 0.01 * idx;
             auto distribution = dist::inversions(factor);

@@ -100,9 +105,18 @@ int main(int argc, char* argv[])
             }

             // Compute and display stats & numbers
-            double avg = average(cycles);
-            output_file << idx << ", " << avg << '\n';
-            std::cout << idx << ", " << avg << std::endl;
+            output_file << idx << ",";
+            std::cout << idx << ",";
+            auto it = cycles.begin();
+            output_file << *it;
+            std::cout << *it;
+            while (++it != cycles.end()) {
+                output_file << "," << *it;
+                std::cout << "," << *it;
+            }
+            output_file << '\n';
+            std::cout << std::endl;
         }
+        ++sort_number;
     }
 }
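
The benchmark now prefixes each result file with a zero-padded sequence number, which is what lets `plot.py` recover the declaration order of the sorters with a plain `result_files.sort()`. The sketch below isolates that naming scheme. `numbered_file_name` is a hypothetical helper, it assumes a non-negative index, and the guard on `padding` is an addition: the expression in the commit, `3 - sort_number_str.size()`, is unsigned and would wrap around for a number with more than three digits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper mirroring the naming scheme of the benchmark:
// "<zero-padded index>-<name>.csv", so that a lexicographic sort of the
// file names matches the order in which the sorters were declared.
inline std::string numbered_file_name(int index, const std::string& name,
                                      std::size_t width = 3)
{
    std::string number = std::to_string(index);
    // Pad only when the number is shorter than the requested width; this
    // avoids the unsigned underflow of `width - number.size()`
    std::size_t padding = number.size() < width ? width - number.size() : 0;
    return std::string(padding, '0') + number + '-' + name + ".csv";
}
```

For instance, `numbered_file_name(0, "drop_merge_sort")` yields `000-drop_merge_sort.csv`, and index 12 yields a `012-` prefix.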

benchmarks/inversions/plot.py

Lines changed: 17 additions & 10 deletions
@@ -1,10 +1,11 @@
 # -*- coding: utf-8 -*-

-# Copyright (c) 2020-2021 Morwenn
+# Copyright (c) 2020-2022 Morwenn
 # SPDX-License-Identifier: MIT

 import argparse
 import pathlib
+import sys

 import numpy
 from matplotlib import pyplot
@@ -15,7 +16,7 @@ def fetch_results(fresults):
     results.pop()
     return [float(elem) for elem in results]

-
+
 if __name__ == '__main__':
     parser = argparse.ArgumentParser(description="Plot the results of the errorbar-plot benchmark.")
     parser.add_argument('root', help="directory with the result files to plot")
@@ -26,6 +27,7 @@ def fetch_results(fresults):

     root = pathlib.Path(args.root)
     result_files = list(root.glob('*.csv'))
+    result_files.sort()
     if len(result_files) == 0:
         print(f"There are no files to plot in {root}")
         sys.exit(1)
@@ -42,22 +44,27 @@ def fetch_results(fresults):
     colors = iter(palette)

     for result_file in result_files:
+        percent_inversions = []
+        averages = []
         with result_file.open() as fd:
             # Read the first line
             algo_name = fd.readline().strip()
             # Read the rest of the file
-            data = numpy.genfromtxt(fd, delimiter=',').transpose()
-            percent_inversions, avg = data
+            for line in fd:
+                pct, *data = line.strip().split(',')
+                data = list(map(int, data))
+                percent_inversions.append(pct)
+                averages.append(numpy.average(data))

         # Plot the results
-        pyplot.plot(percent_inversions,
-                    avg,
+        pyplot.plot(list(map(int, percent_inversions)),
+                    averages,
                     label=algo_name,
                     color=next(colors))

     # Add a legend
-    pyplot.legend(loc='best')
-    pyplot.title('Sorting std::vector<int> with $10^6$ elements')
-    pyplot.xlabel('Percentage of inversions')
-    pyplot.ylabel('Cycles (lower is better)')
+    pyplot.legend()
+    pyplot.title("Sorting std::vector<int> with $10^6$ elements")
+    pyplot.xlabel("Percentage of inversions")
+    pyplot.ylabel("Cycles (lower is better)")
     pyplot.show()
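
With this commit the result files store every measured run on a row of the form `<percentage>,<cycles>,<cycles>,...`, and the averaging moves from the C++ benchmark into the plotting script. For readers who want to post-process those rows outside Python, here is a rough C++ equivalent of the per-row logic. `parse_result_row` is a hypothetical name, and the sketch assumes well-formed rows with no spaces around the commas.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parser for one result row: "<percentage>,<cycles>,<cycles>,..."
// Returns the percentage of inversions and the average of the cycle counts.
inline std::pair<int, double> parse_result_row(const std::string& row)
{
    std::istringstream stream(row);
    std::string field;

    // First field: percentage of inversions
    std::getline(stream, field, ',');
    int percentage = std::stoi(field);

    // Remaining fields: one cycle count per benchmark run
    std::vector<std::uint64_t> cycles;
    while (std::getline(stream, field, ',')) {
        cycles.push_back(std::stoull(field));
    }

    double sum = std::accumulate(cycles.begin(), cycles.end(), 0.0);
    double average = cycles.empty() ? 0.0 : sum / cycles.size();
    return { percentage, average };
}
```

Keeping the raw runs in the file means a later script can switch from the average to another statistic without rerunning the benchmark.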

benchmarks/small-array/benchmark.cpp

Lines changed: 26 additions & 24 deletions
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2015-2021 Morwenn
+ * Copyright (c) 2015-2022 Morwenn
  * SPDX-License-Identifier: MIT
  */
 #include <algorithm>
@@ -46,8 +46,10 @@ template<
     typename DistributionFunction
 >
 auto time_it(Sorter sorter, DistributionFunction distribution)
-    -> double
+    -> std::uint64_t
 {
+    static_assert(N > 0, "this benchmark does not support zero-sized arrays");
+
     // Seed the distribution manually to ensure that all algorithms
     // sort the same collections when there is randomness
     distributions_prng.seed(seed);
@@ -65,53 +67,53 @@ auto time_it(Sorter sorter, DistributionFunction distribution)
         sorter(arr);
         std::uint64_t end = rdtsc();
         assert(std::is_sorted(arr.begin(), arr.end()));
-        cycles.push_back(end - start);
+        cycles.push_back(double(end - start) / N);
         total_end = clock_type::now();
     }

-    // Return the average number of cycles it took to sort the arrays
-    std::uint64_t avg = 0;
-    for (auto value: cycles) {
-        avg += value;
-    }
-    return avg / double(cycles.size());
+    // Return the median number of cycles per element
+    auto cycles_median = cycles.begin() + cycles.size() / 2;
+    std::nth_element(cycles.begin(), cycles_median, cycles.end());
+    return *cycles_median;
 }

 template<
     typename T,
-    typename Distribution,
+    typename Dist,
     std::size_t... Ind
 >
 auto time_distribution(std::index_sequence<Ind...>)
     -> void
 {
-    using sorting_network_sorter = cppsort::small_array_adapter<
-        cppsort::sorting_network_sorter
-    >;
-
     using low_comparisons_sorter = cppsort::small_array_adapter<
         cppsort::low_comparisons_sorter
     >;
-
     using low_moves_sorter = cppsort::small_array_adapter<
         cppsort::low_moves_sorter
     >;
+    using merge_exchange_network_sorter = cppsort::small_array_adapter<
+        cppsort::merge_exchange_network_sorter
+    >;
+    using sorting_network_sorter = cppsort::small_array_adapter<
+        cppsort::sorting_network_sorter
+    >;

     // Compute results for the different sorting algorithms
-    std::pair<const char*, std::array<double, sizeof...(Ind)>> results[] = {
-        { "insertion_sorter", { time_it<T, Ind>(cppsort::insertion_sort, Distribution{})... } },
-        { "selection_sorter", { time_it<T, Ind>(cppsort::selection_sort, Distribution{})... } },
-        { "low_moves_sorter", { time_it<T, Ind>(low_moves_sorter{}, Distribution{})... } },
-        { "low_comparisons_sorter", { time_it<T, Ind>(low_comparisons_sorter{}, Distribution{})... } },
-        { "sorting_network_sorter", { time_it<T, Ind>(sorting_network_sorter{}, Distribution{})... } },
+    std::pair<const char*, std::array<std::uint64_t, sizeof...(Ind)>> results[] = {
+        { "insertion_sorter", { time_it<T, Ind + 1>(cppsort::insertion_sort, Dist{})... } },
+        { "selection_sorter", { time_it<T, Ind + 1>(cppsort::selection_sort, Dist{})... } },
+        { "low_comparisons_sorter", { time_it<T, Ind + 1>(low_comparisons_sorter{}, Dist{})... } },
+        { "low_moves_sorter", { time_it<T, Ind + 1>(low_moves_sorter{}, Dist{})... } },
+        { "merge_exchange_network_sorter", { time_it<T, Ind + 1>(merge_exchange_network_sorter{}, Dist{})... } },
+        { "sorting_network_sorter", { time_it<T, Ind + 1>(sorting_network_sorter{}, Dist{})... } },
     };

     // Output the results to their respective files
     std::ofstream output(Distribution::output);
     for (auto&& sort_result: results) {
-        output << std::get<0>(sort_result) << ' ';
+        output << std::get<0>(sort_result) << ',';
         for (auto&& nb_cycles: std::get<1>(sort_result)) {
-            output << nb_cycles << ' ';
+            output << nb_cycles << ',';
         }
         output << '\n';
     }
@@ -125,7 +127,7 @@ template<
 auto time_distributions()
     -> void
 {
-    using indices = std::make_index_sequence<N>;
+    using indices = std::make_index_sequence<N - 1>;

     // Variadic dispatch only works with expressions
     int dummy[] = {
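
The small-array benchmark now reports the median number of cycles per element instead of the average number of cycles per array, which makes results for different array sizes comparable and less sensitive to outlier runs. A minimal sketch of that computation is shown here. `median_cycles_per_element` is a hypothetical helper: it divides once after selecting the median rather than dividing every sample as the commit does, which gives the same ordering because the divisor is a positive constant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: median of the measured cycle counts, normalized by
// the number of sorted elements. Takes the samples by value because
// std::nth_element reorders them.
// Preconditions: cycles is not empty and element_count > 0.
inline double median_cycles_per_element(std::vector<std::uint64_t> cycles,
                                        std::size_t element_count)
{
    // For an even number of samples this selects the upper of the two
    // middle values, as in the benchmark
    auto median = cycles.begin() + cycles.size() / 2;
    // Partial sort: only the median position is guaranteed to be correct,
    // which is cheaper than sorting every sample
    std::nth_element(cycles.begin(), median, cycles.end());
    return static_cast<double>(*median) / element_count;
}
```

A single slow run, for example one interrupted by the scheduler, shifts an average noticeably but leaves the median untouched.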
