From: Suresh Jayaraman <sjayaraman@suse.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: LKML <linux-kernel@vger.kernel.org>,
Shaohua Li <shaohua.li@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Jonathan Corbet <corbet@lwn.net>
Subject: [PATCH v2] block: document blk-plug
Date: Mon, 29 Aug 2011 16:58:21 +0530
Message-ID: <4E5B77D5.7090006@suse.de>
Thus spake Andrew Morton:
"And I have the usual maintainability whine. If someone comes up to
vmscan.c and sees it calling blk_start_plug(), how are they supposed to
work out why that call is there? They go look at the blk_start_plug()
definition and it is undocumented. I think we can do better than this?"
Adapted from the LWN article - http://lwn.net/Articles/438256/ by Jens
Axboe and from an earlier attempt by Shaohua Li to document blk-plug.
Changes since -v1:
* explain how blk_plug helps with potential deadlock avoidance.
* explain why we need blk-plug.
* add a note that cb_list is required by md.
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
---
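Not part of the patch: a small userspace C model of the behaviour the new
comments describe, for readers who want to see the batching idea in
isolation. It is only a sketch under stated assumptions, not kernel code.
Every sim_* and demo_* name, the array-backed plug and the counters are
hypothetical; the kernel keeps requests on list_head lists, tracks the plug
inside the task_struct and flushes it from schedule().

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_PLUGGED 16	/* hypothetical cap, in the spirit of BLK_MAX_REQUEST_COUNT */

struct sim_request { unsigned long sector; unsigned int nr_sectors; };

struct sim_plug {		/* per-task holding area for requests */
	struct sim_request reqs[SIM_MAX_PLUGGED];
	unsigned int count;
};

struct sim_queue {		/* stands in for the device's request_queue */
	unsigned int lock_trips;	/* how often the queue lock was taken */
	unsigned int dispatched;	/* requests handed to the "device" */
};

static struct sim_plug *sim_current_plug;	/* models the plug tracked per task */

/* Move everything held in the plug to the queue under a single lock trip. */
static void sim_flush_plug(struct sim_plug *plug, struct sim_queue *q)
{
	if (plug->count == 0)
		return;
	q->lock_trips++;
	q->dispatched += plug->count;
	plug->count = 0;
}

static void sim_start_plug(struct sim_plug *plug)
{
	plug->count = 0;
	sim_current_plug = plug;
}

static void sim_finish_plug(struct sim_plug *plug, struct sim_queue *q)
{
	sim_flush_plug(plug, q);
	sim_current_plug = NULL;
}

/* Models the task going to sleep: pending I/O is flushed automatically. */
static void sim_schedule(struct sim_queue *q)
{
	if (sim_current_plug)
		sim_flush_plug(sim_current_plug, q);
}

static void sim_submit(struct sim_queue *q, unsigned long sector, unsigned int nr)
{
	struct sim_plug *plug = sim_current_plug;

	if (!plug) {			/* unplugged: one lock trip per request */
		q->lock_trips++;
		q->dispatched++;
		return;
	}
	if (plug->count > 0) {		/* try to merge with the previous request */
		struct sim_request *last = &plug->reqs[plug->count - 1];

		if (last->sector + last->nr_sectors == sector) {
			last->nr_sectors += nr;
			return;
		}
	}
	if (plug->count == SIM_MAX_PLUGGED)	/* keep the held batch bounded */
		sim_flush_plug(plug, q);
	assert(plug->count < SIM_MAX_PLUGGED);
	plug->reqs[plug->count].sector = sector;
	plug->reqs[plug->count].nr_sectors = nr;
	plug->count++;
}

/* Submit eight back-to-back 8-sector requests; return the lock trips needed. */
unsigned int demo_lock_trips(int use_plug)
{
	struct sim_queue q = { 0, 0 };
	struct sim_plug plug;
	unsigned long i;

	if (use_plug)
		sim_start_plug(&plug);
	for (i = 0; i < 8; i++)
		sim_submit(&q, i * 8, 8);
	if (use_plug)
		sim_finish_plug(&plug, &q);
	return q.lock_trips;
}

/* Two non-adjacent requests are dispatched as soon as the task "sleeps". */
unsigned int demo_flush_on_sleep(void)
{
	struct sim_queue q = { 0, 0 };
	struct sim_plug plug;
	unsigned int seen;

	sim_start_plug(&plug);
	sim_submit(&q, 0, 8);
	sim_submit(&q, 100, 8);
	sim_schedule(&q);		/* flush happens here, before finish */
	seen = q.dispatched;
	sim_finish_plug(&plug, &q);
	return seen;
}
```

In this model demo_lock_trips(0) takes the queue lock eight times, while
demo_lock_trips(1) merges the sequential requests and takes it once. The
sim_schedule() hook mirrors the auto-flush on blocking: nothing is left
parked in the private plug while the task sleeps.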
block/blk-core.c | 14 ++++++++++++++
include/linux/blkdev.h | 16 +++++++++++-----
2 files changed, 25 insertions(+), 5 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 90e1ffd..ea360c8 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2626,6 +2626,20 @@ EXPORT_SYMBOL(kblockd_schedule_delayed_work);
#define PLUG_MAGIC 0x91827364
+/**
+ * blk_start_plug - initialize blk_plug and track it inside the task_struct
+ * @plug: The &struct blk_plug that needs to be initialized
+ *
+ * Description:
+ * Tracking blk_plug inside the task_struct will help with auto-flushing the
+ * pending I/O should the task end up blocking between blk_start_plug() and
+ * blk_finish_plug(). This is important from a performance perspective, but
+ * also ensures that we don't deadlock. For instance, if the task is blocking
+ * for a memory allocation, memory reclaim could end up wanting to free a
+ * page belonging to that request that is currently residing in our private
+ * plug. By flushing the pending I/O when the process goes to sleep, we avoid
+ * this kind of deadlock.
+ */
void blk_start_plug(struct blk_plug *plug)
{
struct task_struct *tsk = current;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 84b15d5..f45d783 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -863,17 +863,23 @@ struct request_queue *blk_alloc_queue_node(gfp_t, int);
extern void blk_put_queue(struct request_queue *);
/*
+ * blk_plug permits building a queue of related requests by holding the I/O
+ * fragments for a short period. This allows merging of sequential requests
+ * into a single larger request. As the requests are moved from a per-task
+ * list to the device's request_queue in a batch, this results in improved
+ * scalability as contention for the request_queue lock is reduced.
+ *
* Note: Code in between changing the blk_plug list/cb_list or element of such
* lists is preemptable, but such code can't do sleep (or be very careful),
* otherwise data is corrupted. For details, please check schedule() where
* blk_schedule_flush_plug() is called.
*/
struct blk_plug {
- unsigned long magic;
- struct list_head list;
- struct list_head cb_list;
- unsigned int should_sort;
- unsigned int count;
+ unsigned long magic; /* detect uninitialized use-cases */
+ struct list_head list; /* requests */
+ struct list_head cb_list; /* md requires an unplug callback */
+ unsigned int should_sort; /* list to be sorted before flushing? */
+ unsigned int count; /* request count to avoid list getting too big */
};
#define BLK_MAX_REQUEST_COUNT 16