public inbox for linux-block@vger.kernel.org
From: Jens Axboe <axboe@kernel.dk>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] block: switch to atomic_t for request references
Date: Mon, 6 Dec 2021 11:13:20 -0700
Message-ID: <282666e2-93d4-0302-b2d0-47d03395a6d4@kernel.dk>
In-Reply-To: <CAHk-=wjXmGt9-JQp-wvup4y2tFNUCVjvx2W7MHzuAaxpryP4mg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 928 bytes --]

On 12/6/21 10:35 AM, Linus Torvalds wrote:
> On Mon, Dec 6, 2021 at 12:31 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> Quite; and for something that pretends to be about performance, it also
>> lacks any actual numbers to back that claim.
>>
>> The proposed implementation also doesn't do nearly as much as the
>> refcount_t one does.
> 
> Stop pretending refcount_t is that great.
> 
> It's horrid. The code it generates is disgusting. It should never
> have been inlined in the first place, and the design decisions were
> questionable to begin with.
> 
> There's a reason core stuff (like the page counters) DO NOT USE REFCOUNT_T.
> 
> I seriously believe that refcount_t should be used for things like
> device reference counting or similar issues, and not for _any_ truly
> core code.

Maybe we just need to embrace it generically. I took a quick stab at it,
which is attached. Totally untested...

-- 
Jens Axboe


[-- Attachment #2: 0004-mm-convert-to-using-atomic-ref.patch --]
[-- Type: text/x-patch, Size: 1388 bytes --]

From 2a755b779681c300261ec38007b482ae8257cfc1 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 6 Dec 2021 11:11:36 -0700
Subject: [PATCH 4/4] mm: convert to using atomic-ref

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/linux/mm.h | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a7e4a9e7d807..954a74492b5a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -32,6 +32,7 @@
 #include <linux/sched.h>
 #include <linux/pgtable.h>
 #include <linux/kasan.h>
+#include <linux/atomic-ref.h>
 
 struct mempolicy;
 struct anon_vma;
@@ -1181,10 +1182,6 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
 		page->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA;
 }
 
-/* 127: arbitrary random number, small enough to assemble well */
-#define folio_ref_zero_or_close_to_overflow(folio) \
-	((unsigned int) folio_ref_count(folio) + 127u <= 127u)
-
 /**
  * folio_get - Increment the reference count on a folio.
  * @folio: The folio.
@@ -1195,7 +1192,7 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
  */
 static inline void folio_get(struct folio *folio)
 {
-	VM_BUG_ON_FOLIO(folio_ref_zero_or_close_to_overflow(folio), folio);
+	VM_BUG_ON_FOLIO(atomic_ref_zero_or_close_to_overflow(&folio->page._refcount), folio);
 	folio_ref_inc(folio);
 }
 
-- 
2.34.1


[-- Attachment #3: 0003-block-convert-to-using-atomic-ref.patch --]
[-- Type: text/x-patch, Size: 5148 bytes --]

From 784487e60aceb37cf0b6664e8949a87ea27a0cd2 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 6 Dec 2021 11:11:19 -0700
Subject: [PATCH 3/4] block: convert to using atomic-ref

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-flush.c      |  4 ++--
 block/blk-mq-tag.c     |  2 +-
 block/blk-mq.c         | 12 ++++++------
 block/blk.h            | 31 -------------------------------
 include/linux/blk-mq.h |  1 +
 5 files changed, 10 insertions(+), 40 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index e4df894189ce..e957902af17c 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -229,7 +229,7 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 	/* release the tag's ownership to the req cloned from */
 	spin_lock_irqsave(&fq->mq_flush_lock, flags);
 
-	if (!req_ref_put_and_test(flush_rq)) {
+	if (!atomic_ref_put_and_test(&flush_rq->ref)) {
 		fq->rq_status = error;
 		spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
 		return;
@@ -349,7 +349,7 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
 	 * and READ flush_rq->end_io
 	 */
 	smp_wmb();
-	req_ref_set(flush_rq, 1);
+	atomic_set(&flush_rq->ref, 1);
 
 	blk_flush_queue_rq(flush_rq, false);
 }
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 380e2dd31bfc..d9f961320652 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -228,7 +228,7 @@ static struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
 
 	spin_lock_irqsave(&tags->lock, flags);
 	rq = tags->rqs[bitnr];
-	if (!rq || rq->tag != bitnr || !req_ref_inc_not_zero(rq))
+	if (!rq || rq->tag != bitnr || !atomic_ref_inc_not_zero(&rq->ref))
 		rq = NULL;
 	spin_unlock_irqrestore(&tags->lock, flags);
 	return rq;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0bf3523dd1f5..2be5557a77c9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -386,7 +386,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	INIT_LIST_HEAD(&rq->queuelist);
 	/* tag was already set */
 	WRITE_ONCE(rq->deadline, 0);
-	req_ref_set(rq, 1);
+	atomic_set(&rq->ref, 1);
 
 	if (rq->rq_flags & RQF_ELV) {
 		struct elevator_queue *e = data->q->elevator;
@@ -634,7 +634,7 @@ void blk_mq_free_request(struct request *rq)
 	rq_qos_done(q, rq);
 
 	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
-	if (req_ref_put_and_test(rq))
+	if (atomic_ref_put_and_test(&rq->ref))
 		__blk_mq_free_request(rq);
 }
 EXPORT_SYMBOL_GPL(blk_mq_free_request);
@@ -930,7 +930,7 @@ void blk_mq_end_request_batch(struct io_comp_batch *iob)
 		rq_qos_done(rq->q, rq);
 
 		WRITE_ONCE(rq->state, MQ_RQ_IDLE);
-		if (!req_ref_put_and_test(rq))
+		if (!atomic_ref_put_and_test(&rq->ref))
 			continue;
 
 		blk_crypto_free_request(rq);
@@ -1373,7 +1373,7 @@ void blk_mq_put_rq_ref(struct request *rq)
 {
 	if (is_flush_rq(rq))
 		rq->end_io(rq, 0);
-	else if (req_ref_put_and_test(rq))
+	else if (atomic_ref_put_and_test(&rq->ref))
 		__blk_mq_free_request(rq);
 }
 
@@ -3005,7 +3005,7 @@ static void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
 			unsigned long rq_addr = (unsigned long)rq;
 
 			if (rq_addr >= start && rq_addr < end) {
-				WARN_ON_ONCE(req_ref_read(rq) != 0);
+				WARN_ON_ONCE(atomic_read(&rq->ref) != 0);
 				cmpxchg(&drv_tags->rqs[i], rq, NULL);
 			}
 		}
@@ -3339,7 +3339,7 @@ static void blk_mq_clear_flush_rq_mapping(struct blk_mq_tags *tags,
 	if (!tags)
 		return;
 
-	WARN_ON_ONCE(req_ref_read(flush_rq) != 0);
+	WARN_ON_ONCE(atomic_read(&flush_rq->ref) != 0);
 
 	for (i = 0; i < queue_depth; i++)
 		cmpxchg(&tags->rqs[i], flush_rq, NULL);
diff --git a/block/blk.h b/block/blk.h
index 7ccb7c7d86b3..0114e18b9903 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -469,35 +469,4 @@ static inline bool should_fail_request(struct block_device *part,
 }
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
-/*
- * Optimized request reference counting. Ideally we'd make timeouts be more
- * clever, as that's the only reason we need references at all... But until
- * this happens, this is faster than using refcount_t. Also see:
- *
- * abc54d634334 ("io_uring: switch to atomic_t for io_kiocb reference count")
- */
-#define req_ref_zero_or_close_to_overflow(req)	\
-	((unsigned int) atomic_read(&(req->ref)) + 127u <= 127u)
-
-static inline bool req_ref_inc_not_zero(struct request *req)
-{
-	return atomic_inc_not_zero(&req->ref);
-}
-
-static inline bool req_ref_put_and_test(struct request *req)
-{
-	WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
-	return atomic_dec_and_test(&req->ref);
-}
-
-static inline void req_ref_set(struct request *req, int value)
-{
-	atomic_set(&req->ref, value);
-}
-
-static inline int req_ref_read(struct request *req)
-{
-	return atomic_read(&req->ref);
-}
-
 #endif /* BLK_INTERNAL_H */
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index ecdc049b52fa..02abf08f5765 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -7,6 +7,7 @@
 #include <linux/lockdep.h>
 #include <linux/scatterlist.h>
 #include <linux/prefetch.h>
+#include <linux/atomic-ref.h>
 
 struct blk_mq_tags;
 struct blk_flush_queue;
-- 
2.34.1
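
For readers following the conversion above: the lookup in blk_mq_find_and_get_req() relies on "take a reference only if the count is not already zero", and the free paths rely on "drop a reference and report whether it was the last one". Below is a minimal userspace sketch of those two semantics using C11 atomics. It is not the kernel implementation: ref_t, ref_inc_not_zero() and ref_put_and_test() are hypothetical stand-ins for atomic_t, atomic_inc_not_zero() and atomic_dec_and_test(), and the sketch leaves out the WARN_ON_ONCE() sanity check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's atomic_t. */
typedef struct { atomic_int counter; } ref_t;

/* Take a reference only if the object is still live (count != 0). */
static bool ref_inc_not_zero(ref_t *ref)
{
	int old = atomic_load(&ref->counter);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&ref->counter, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool ref_put_and_test(ref_t *ref)
{
	return atomic_fetch_sub(&ref->counter, 1) == 1;
}

/* Walk one object through init, lookup, and the final put. */
static void ref_selftest(void)
{
	ref_t r;

	atomic_init(&r.counter, 1);		/* like atomic_set(&rq->ref, 1) */
	assert(ref_inc_not_zero(&r));		/* lookup succeeds, count is 2 */
	assert(!ref_put_and_test(&r));		/* count is 1, not the last ref */
	assert(ref_put_and_test(&r));		/* count is 0, caller frees */
	assert(!ref_inc_not_zero(&r));		/* dead object is not revived */
}
```

Calling ref_selftest() from a main() exercises the whole lifecycle. The point of the compare-and-exchange loop is that a count which has already reached zero is never incremented again, so a racing lookup cannot bring a request back once it is on its way to being freed.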


[-- Attachment #4: 0002-io_uring-convert-to-using-atomic-ref.patch --]
[-- Type: text/x-patch, Size: 2141 bytes --]

From a2e04d4d855f85faa913399f4fadedf45d09142a Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 6 Dec 2021 11:11:00 -0700
Subject: [PATCH 2/4] io_uring: convert to using atomic-ref

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 59fd8b785262..2ce076fd85dc 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -81,6 +81,7 @@
 #include <linux/tracehook.h>
 #include <linux/audit.h>
 #include <linux/security.h>
+#include <linux/atomic-ref.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -1170,17 +1171,10 @@ static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
 #define io_for_each_link(pos, head) \
 	for (pos = (head); pos; pos = pos->link)
 
-/*
- * Shamelessly stolen from the mm implementation of page reference checking,
- * see commit f958d7b528b1 for details.
- */
-#define req_ref_zero_or_close_to_overflow(req)	\
-	((unsigned int) atomic_read(&(req->refs)) + 127u <= 127u)
-
 static inline bool req_ref_inc_not_zero(struct io_kiocb *req)
 {
 	WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
-	return atomic_inc_not_zero(&req->refs);
+	return atomic_ref_inc_not_zero(&req->refs);
 }
 
 static inline bool req_ref_put_and_test(struct io_kiocb *req)
@@ -1188,21 +1182,19 @@ static inline bool req_ref_put_and_test(struct io_kiocb *req)
 	if (likely(!(req->flags & REQ_F_REFCOUNT)))
 		return true;
 
-	WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
-	return atomic_dec_and_test(&req->refs);
+	return atomic_ref_put_and_test(&req->refs);
 }
 
 static inline void req_ref_put(struct io_kiocb *req)
 {
 	WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
-	WARN_ON_ONCE(req_ref_put_and_test(req));
+	atomic_ref_put(&req->refs);
 }
 
 static inline void req_ref_get(struct io_kiocb *req)
 {
 	WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
-	WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
-	atomic_inc(&req->refs);
+	atomic_ref_get(&req->refs);
 }
 
 static inline void io_submit_flush_completions(struct io_ring_ctx *ctx)
-- 
2.34.1


[-- Attachment #5: 0001-atomic-ref-add-basic-infrastructure-for-atomic-refs-.patch --]
[-- Type: text/x-patch, Size: 1538 bytes --]

From cfec670a6240b84173f3b3719a24df9a4c7424e5 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 6 Dec 2021 10:54:55 -0700
Subject: [PATCH 1/4] atomic-ref: add basic infrastructure for atomic refs
 based on atomic_t

Make the atomic_t reference counting from commit f958d7b528b1 generic
and available for other users.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/linux/atomic-ref.h | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100644 include/linux/atomic-ref.h

diff --git a/include/linux/atomic-ref.h b/include/linux/atomic-ref.h
new file mode 100644
index 000000000000..dfba69dd9d37
--- /dev/null
+++ b/include/linux/atomic-ref.h
@@ -0,0 +1,33 @@
+#ifndef LINUX_ATOMIC_REF_H
+#define LINUX_ATOMIC_REF_H
+
+/*
+ * Shamelessly stolen from the mm implementation of page reference checking,
+ * see commit f958d7b528b1 for details.
+ */
+#define atomic_ref_zero_or_close_to_overflow(ref)	\
+	((unsigned int) atomic_read(ref) + 127u <= 127u)
+
+static inline bool atomic_ref_inc_not_zero(atomic_t *ref)
+{
+	return atomic_inc_not_zero(ref);
+}
+
+static inline bool atomic_ref_put_and_test(atomic_t *ref)
+{
+	WARN_ON_ONCE(atomic_ref_zero_or_close_to_overflow(ref));
+	return atomic_dec_and_test(ref);
+}
+
+static inline void atomic_ref_put(atomic_t *ref)
+{
+	WARN_ON_ONCE(atomic_ref_put_and_test(ref));
+}
+
+static inline void atomic_ref_get(atomic_t *ref)
+{
+	WARN_ON_ONCE(atomic_ref_zero_or_close_to_overflow(ref));
+	atomic_inc(ref);
+}
+
+#endif
-- 
2.34.1
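
A note on the arithmetic in atomic_ref_zero_or_close_to_overflow(), since the macro is terse: converting the counter to unsigned int and adding 127 maps the signed range [-127, 0] onto [0, 127], so a single unsigned comparison flags both a zero count and a count that has gone slightly negative. The userspace sketch below works on a plain int instead of an atomic_t; value_in_warn_window() is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Same expression as the macro, applied to a plain int. The conversion of
 * a negative int to unsigned int is well defined (modulo 2^N), so
 * -127..0 becomes 0..127 after the addition and everything else lands
 * above 127.
 */
static bool value_in_warn_window(int v)
{
	return (unsigned int)v + 127u <= 127u;
}

/* Check the edges of the window and a few values outside it. */
static void warn_window_selftest(void)
{
	assert(value_in_warn_window(0));	/* zero references */
	assert(value_in_warn_window(-1));	/* one put too many */
	assert(value_in_warn_window(-127));	/* lower edge of the window */
	assert(!value_in_warn_window(-128));	/* just below the window */
	assert(!value_in_warn_window(1));	/* a normal live count */
	assert(!value_in_warn_window(INT_MAX));	/* large positive count */
}
```

With this check in front of the get and put operations, a get on a count of zero or a put on a count that is already zero or slightly negative trips the warning, while any positive count passes through untouched.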


Thread overview: 36+ messages
2021-12-03 15:35 [PATCH] block: switch to atomic_t for request references Jens Axboe
2021-12-03 15:56 ` Keith Busch
2021-12-06  6:53 ` Christoph Hellwig
2021-12-06  8:31   ` Peter Zijlstra
2021-12-06 16:32     ` Jens Axboe
2021-12-06 17:19       ` Peter Zijlstra
2021-12-06 17:35     ` Linus Torvalds
2021-12-06 18:13       ` Jens Axboe [this message]
2021-12-06 20:51         ` Kees Cook
2021-12-06 21:17           ` Linus Torvalds
2021-12-06 23:28             ` Kees Cook
2021-12-07  0:13               ` Linus Torvalds
2021-12-07  4:56                 ` Kees Cook
2021-12-07  9:34                 ` Peter Zijlstra
2021-12-07 16:03                   ` Linus Torvalds
2021-12-07 10:30                 ` Peter Zijlstra
2021-12-07 16:10                   ` Linus Torvalds
2021-12-07 16:23                     ` Peter Zijlstra
2021-12-06 16:31   ` Jens Axboe
2021-12-07 11:26   ` Peter Zijlstra
2021-12-07 13:28     ` Peter Zijlstra
2021-12-07 15:51       ` Peter Zijlstra
2021-12-07 16:13       ` Linus Torvalds
2021-12-07 16:52         ` Peter Zijlstra
2021-12-07 17:41           ` Peter Zijlstra
2021-12-07 17:43           ` Linus Torvalds
2021-12-07 17:45             ` Linus Torvalds
2021-12-07 20:28       ` Peter Zijlstra
2021-12-07 23:23         ` Linus Torvalds
2021-12-08 17:07           ` Peter Zijlstra
2021-12-08 18:00             ` Linus Torvalds
2021-12-08 18:44               ` Peter Zijlstra
2021-12-08 18:50                 ` Linus Torvalds
2021-12-08 20:32                   ` Peter Zijlstra
2021-12-10 10:57                   ` Peter Zijlstra
2021-12-10 12:38               ` Peter Zijlstra
