From: Kent Overstreet <kent.overstreet@linux.dev>
To: linux-bcachefs@vger.kernel.org
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Subject: [PATCH 17/17] bcachefs: Inline btree write buffer sort
Date: Fri, 10 Nov 2023 11:31:54 -0500 [thread overview]
Message-ID: <20231110163157.2736111-18-kent.overstreet@linux.dev> (raw)
In-Reply-To: <20231110163157.2736111-1-kent.overstreet@linux.dev>
The sort in the btree write buffer flush path is a very hot path, and
it's particularly performance sensitive since it's single threaded and
can block every other thread on a multithreaded write workload.
It's well worth doing a sort with inlined cmp and swap functions.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
fs/bcachefs/btree_write_buffer.c | 57 +++++++++++++++++++++++++++-----
1 file changed, 48 insertions(+), 9 deletions(-)
diff --git a/fs/bcachefs/btree_write_buffer.c b/fs/bcachefs/btree_write_buffer.c
index e961cf33db1e..e482bbb767e6 100644
--- a/fs/bcachefs/btree_write_buffer.c
+++ b/fs/bcachefs/btree_write_buffer.c
@@ -10,23 +10,64 @@
#include "journal_io.h"
#include "journal_reclaim.h"
-#include <linux/sort.h>
-
static int bch2_btree_write_buffer_journal_flush(struct journal *,
struct journal_entry_pin *, u64);
static int bch2_journal_keys_to_write_buffer(struct bch_fs *, struct journal_buf *);
-static inline int wb_key_cmp(const void *_l, const void *_r)
+static inline int wb_key_cmp(struct btree_write_buffered_key_ref *l,
+ struct btree_write_buffered_key_ref *r)
{
- const struct btree_write_buffered_key_ref *l = _l;
- const struct btree_write_buffered_key_ref *r = _r;
-
return cmp_int(l->btree, r->btree) ?:
bpos_cmp(l->pos, r->pos) ?:
cmp_int(l->idx, r->idx);
}
+static void wb_sort(struct btree_write_buffered_key_ref *base, size_t num)
+{
+ size_t n = num, a = num / 2;
+
+ if (!a) /* num < 2 || size == 0 */
+ return;
+
+ for (;;) {
+ size_t b, c, d;
+
+ if (a) /* Building heap: sift down --a */
+ --a;
+ else if (--n) /* Sorting: Extract root to --n */
+ swap(base[0], base[n]);
+ else /* Sort complete */
+ break;
+
+ /*
+ * Sift element at "a" down into heap. This is the
+ * "bottom-up" variant, which significantly reduces
+ * calls to cmp_func(): we find the sift-down path all
+ * the way to the leaves (one compare per level), then
+ * backtrack to find where to insert the target element.
+ *
+ * Because elements tend to sift down close to the leaves,
+ * this uses fewer compares than doing two per level
+ * on the way down. (A bit more than half as many on
+ * average, 3/4 worst-case.)
+ */
+ for (b = a; c = 2*b + 1, (d = c + 1) < n;)
+ b = wb_key_cmp(base + c, base + d) >= 0 ? c : d;
+ if (d == n) /* Special case last leaf with no sibling */
+ b = c;
+
+ /* Now backtrack from "b" to the correct location for "a" */
+ while (b != a && wb_key_cmp(base + a, base + b) >= 0)
+ b = (b - 1) / 2;
+ c = b; /* Where "a" belongs */
+ while (b != a) { /* Shift it into place */
+ b = (b - 1) / 2;
+ swap(base[b], base[c]);
+ }
+ }
+}
+
static noinline int wb_flush_one_slowpath(struct btree_trans *trans,
struct btree_iter *iter,
struct btree_write_buffered_key *wb)
@@ -202,9 +243,7 @@ static int bch2_btree_write_buffer_flush_locked(struct btree_trans *trans)
* If that happens, simply skip the key so we can optimistically insert
* as many keys as possible in the fast path.
*/
- sort(wb->sorted.data, wb->sorted.nr,
- sizeof(wb->sorted.data[0]),
- wb_key_cmp, NULL);
+ wb_sort(wb->sorted.data, wb->sorted.nr);
darray_for_each(wb->sorted, i) {
struct btree_write_buffered_key *k = &wb->flushing.keys.data[i->idx];
--
2.42.0
next prev parent reply other threads:[~2023-11-10 16:32 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-10 16:31 [PATCH 00/17] btree write buffer & journal optimizations Kent Overstreet
2023-11-10 16:31 ` [PATCH 01/17] bcachefs: Kill journal pre-reservations Kent Overstreet
2023-11-10 16:31 ` [PATCH 02/17] bcachefs: track_event_change() Kent Overstreet
2023-11-10 16:31 ` [PATCH 03/17] bcachefs: Journal pins must always have a flush_fn Kent Overstreet
2023-11-13 15:22 ` Brian Foster
2023-11-13 16:36 ` Kent Overstreet
2023-11-13 17:08 ` Brian Foster
2023-11-10 16:31 ` [PATCH 04/17] bcachefs: BTREE_INSERT_JOURNAL_REPLAY now "don't init trans->journal_res" Kent Overstreet
2023-11-10 16:31 ` [PATCH 05/17] bcachefs: Kill BTREE_UPDATE_PREJOURNAL Kent Overstreet
2023-11-13 15:29 ` Brian Foster
2023-11-13 16:49 ` Kent Overstreet
2023-11-14 13:17 ` Brian Foster
2023-11-10 16:31 ` [PATCH 06/17] bcachefs: Go rw before journal replay Kent Overstreet
2023-11-10 16:31 ` [PATCH 07/17] bcachefs: Make journal replay more efficient Kent Overstreet
2023-11-14 13:19 ` Brian Foster
2023-11-15 1:50 ` Kent Overstreet
2023-11-10 16:31 ` [PATCH 08/17] bcachefs: Don't flush journal after replay Kent Overstreet
2023-11-10 16:31 ` [PATCH 09/17] bcachefs: Unwritten journal buffers are always dirty Kent Overstreet
2023-11-10 16:31 ` [PATCH 10/17] bcachefs: journal->buf_lock Kent Overstreet
2023-11-10 16:31 ` [PATCH 11/17] bcachefs: bch2_journal_block_reservations() Kent Overstreet
2023-11-10 16:31 ` [PATCH 12/17] bcachefs: Clean up btree write buffer write ref handling Kent Overstreet
2023-11-10 16:31 ` [PATCH 13/17] bcachefs: bch2_btree_write_buffer_flush_locked() Kent Overstreet
2023-11-10 16:31 ` [PATCH 14/17] bcachefs: bch2_btree_write_buffer_flush() -> bch2_btree_write_buffer_tryflush() Kent Overstreet
2023-11-10 16:31 ` [PATCH 15/17] bcachefs: Improve btree write buffer tracepoints Kent Overstreet
2023-11-10 16:31 ` [PATCH 16/17] bcachefs: btree write buffer now slurps keys from journal Kent Overstreet
2023-11-21 10:56 ` Geert Uytterhoeven
2023-11-21 16:52 ` Kent Overstreet
2023-11-10 16:31 ` Kent Overstreet [this message]
2023-11-10 16:42 ` [PATCH 00/17] btree write buffer & journal optimizations Kent Overstreet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231110163157.2736111-18-kent.overstreet@linux.dev \
--to=kent.overstreet@linux.dev \
--cc=linux-bcachefs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.