All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shaohua Li <shli@fb.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: <linux-kernel@vger.kernel.org>, <axboe@fb.com>, <hch@lst.de>,
	<neilb@suse.de>
Subject: Re: [PATCH 2/5] sched: always use blk_schedule_flush_plug in io_schedule_out
Date: Fri, 1 May 2015 11:28:07 -0700	[thread overview]
Message-ID: <20150501182807.GA2041646@devbig257.prn2.facebook.com> (raw)
In-Reply-To: <x49383g189d.fsf@segfault.boston.devel.redhat.com>

On Fri, May 01, 2015 at 02:07:42PM -0400, Jeff Moyer wrote:
> Jeff Moyer <jmoyer@redhat.com> writes:
> 
> > Shaohua Li <shli@fb.com> writes:
> >
> >> block plug callback could sleep, so we introduce a parameter
> >> 'from_schedule' and corresponding drivers can use it to destinguish a
> >> schedule plug flush or a plug finish. Unfortunately io_schedule_out
> >> still uses blk_flush_plug(). This causes below output (Note, I added a
> >> might_sleep() in raid1_unplug to make it trigger faster, but the whole
> >> thing doesn't matter if I add might_sleep). In raid1/10, this can cause
> >> deadlock.
> >>
> >> This patch makes io_schedule_out always uses blk_schedule_flush_plug.
> >> This should only impact drivers (as far as I know, raid 1/10) which are
> >> sensitive to the 'from_schedule' parameter.
> >
> > Why wouldn't you change io_schedule_timeout to use blk_flush_plug_list
> > instead?
> 
> Sorry, I did not fully engage my brain on this one.  I see that's
> exactly what you did.
> 
> I'll add my reviewed-by once you've addressed hch's concern.

Thanks! Updated patch.


>From c7c6e5ab4071e9fc20e898a3e3741bf315449d59 Mon Sep 17 00:00:00 2001
Message-Id: <c7c6e5ab4071e9fc20e898a3e3741bf315449d59.1430504716.git.shli@fb.com>
From: Shaohua Li <shli@fb.com>
Date: Wed, 29 Apr 2015 12:12:03 -0700
Subject: [PATCH] sched: always use blk_schedule_flush_plug in io_schedule_out

block plug callback could sleep, so we introduce a parameter
'from_schedule' and corresponding drivers can use it to destinguish a
schedule plug flush or a plug finish. Unfortunately io_schedule_out
still uses blk_flush_plug(). This causes below output (Note, I added a
might_sleep() in raid1_unplug to make it trigger faster, but the whole
thing doesn't matter if I add might_sleep). In raid1/10, this can cause
deadlock.

This patch makes io_schedule_out always uses blk_schedule_flush_plug.
This should only impact drivers (as far as I know, raid 1/10) which are
sensitive to the 'from_schedule' parameter.

[  370.817949] ------------[ cut here ]------------
[  370.817960] WARNING: CPU: 7 PID: 145 at ../kernel/sched/core.c:7306 __might_sleep+0x7f/0x90()
[  370.817969] do not call blocking ops when !TASK_RUNNING; state=2 set at [<ffffffff81092fcf>] prepare_to_wait+0x2f/0x90
[  370.817971] Modules linked in: raid1
[  370.817976] CPU: 7 PID: 145 Comm: kworker/u16:9 Tainted: G        W       4.0.0+ #361
[  370.817977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
[  370.817983] Workqueue: writeback bdi_writeback_workfn (flush-9:1)
[  370.817985]  ffffffff81cd83be ffff8800ba8cb298 ffffffff819dd7af 0000000000000001
[  370.817988]  ffff8800ba8cb2e8 ffff8800ba8cb2d8 ffffffff81051afc ffff8800ba8cb2c8
[  370.817990]  ffffffffa00061a8 000000000000041e 0000000000000000 ffff8800ba8cba28
[  370.817993] Call Trace:
[  370.817999]  [<ffffffff819dd7af>] dump_stack+0x4f/0x7b
[  370.818002]  [<ffffffff81051afc>] warn_slowpath_common+0x8c/0xd0
[  370.818004]  [<ffffffff81051b86>] warn_slowpath_fmt+0x46/0x50
[  370.818006]  [<ffffffff81092fcf>] ? prepare_to_wait+0x2f/0x90
[  370.818008]  [<ffffffff81092fcf>] ? prepare_to_wait+0x2f/0x90
[  370.818010]  [<ffffffff810776ef>] __might_sleep+0x7f/0x90
[  370.818014]  [<ffffffffa0000c03>] raid1_unplug+0xd3/0x170 [raid1]
[  370.818024]  [<ffffffff81421d9a>] blk_flush_plug_list+0x8a/0x1e0
[  370.818028]  [<ffffffff819e3550>] ? bit_wait+0x50/0x50
[  370.818031]  [<ffffffff819e21b0>] io_schedule_timeout+0x130/0x140
[  370.818033]  [<ffffffff819e3586>] bit_wait_io+0x36/0x50
[  370.818034]  [<ffffffff819e31b5>] __wait_on_bit+0x65/0x90
[  370.818041]  [<ffffffff8125b67c>] ? ext4_read_block_bitmap_nowait+0xbc/0x630
[  370.818043]  [<ffffffff819e3550>] ? bit_wait+0x50/0x50
[  370.818045]  [<ffffffff819e3302>] out_of_line_wait_on_bit+0x72/0x80
[  370.818047]  [<ffffffff810935e0>] ? autoremove_wake_function+0x40/0x40
[  370.818050]  [<ffffffff811de744>] __wait_on_buffer+0x44/0x50
[  370.818053]  [<ffffffff8125ae80>] ext4_wait_block_bitmap+0xe0/0xf0
[  370.818058]  [<ffffffff812975d6>] ext4_mb_init_cache+0x206/0x790
[  370.818062]  [<ffffffff8114bc6c>] ? lru_cache_add+0x1c/0x50
[  370.818064]  [<ffffffff81297c7e>] ext4_mb_init_group+0x11e/0x200
[  370.818066]  [<ffffffff81298231>] ext4_mb_load_buddy+0x341/0x360
[  370.818068]  [<ffffffff8129a1a3>] ext4_mb_find_by_goal+0x93/0x2f0
[  370.818070]  [<ffffffff81295b54>] ? ext4_mb_normalize_request+0x1e4/0x5b0
[  370.818072]  [<ffffffff8129ab67>] ext4_mb_regular_allocator+0x67/0x460
[  370.818074]  [<ffffffff81295b54>] ? ext4_mb_normalize_request+0x1e4/0x5b0
[  370.818076]  [<ffffffff8129ca4b>] ext4_mb_new_blocks+0x4cb/0x620
[  370.818079]  [<ffffffff81290956>] ext4_ext_map_blocks+0x4c6/0x14d0
[  370.818081]  [<ffffffff812a4d4e>] ? ext4_es_lookup_extent+0x4e/0x290
[  370.818085]  [<ffffffff8126399d>] ext4_map_blocks+0x14d/0x4f0
[  370.818088]  [<ffffffff81266fbd>] ext4_writepages+0x76d/0xe50
[  370.818094]  [<ffffffff81149691>] do_writepages+0x21/0x50
[  370.818097]  [<ffffffff811d5c00>] __writeback_single_inode+0x60/0x490
[  370.818099]  [<ffffffff811d630a>] writeback_sb_inodes+0x2da/0x590
[  370.818103]  [<ffffffff811abf4b>] ? trylock_super+0x1b/0x50
[  370.818105]  [<ffffffff811abf4b>] ? trylock_super+0x1b/0x50
[  370.818107]  [<ffffffff811d665f>] __writeback_inodes_wb+0x9f/0xd0
[  370.818109]  [<ffffffff811d69db>] wb_writeback+0x34b/0x3c0
[  370.818111]  [<ffffffff811d70df>] bdi_writeback_workfn+0x23f/0x550
[  370.818116]  [<ffffffff8106bbd8>] process_one_work+0x1c8/0x570
[  370.818117]  [<ffffffff8106bb5b>] ? process_one_work+0x14b/0x570
[  370.818119]  [<ffffffff8106c09b>] worker_thread+0x11b/0x470
[  370.818121]  [<ffffffff8106bf80>] ? process_one_work+0x570/0x570
[  370.818124]  [<ffffffff81071868>] kthread+0xf8/0x110
[  370.818126]  [<ffffffff81071770>] ? kthread_create_on_node+0x210/0x210
[  370.818129]  [<ffffffff819e9322>] ret_from_fork+0x42/0x70
[  370.818131]  [<ffffffff81071770>] ? kthread_create_on_node+0x210/0x210
[  370.818132] ---[ end trace 7b4deb71e68b6605 ]---

V2: don't change ->in_iowait

Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Shaohua Li <shli@fb.com>
---
 kernel/sched/core.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f9123a8..1ab3d2d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4402,10 +4402,7 @@ long __sched io_schedule_timeout(long timeout)
 	long ret;
 
 	current->in_iowait = 1;
-	if (old_iowait)
-		blk_schedule_flush_plug(current);
-	else
-		blk_flush_plug(current);
+	blk_schedule_flush_plug(current);
 
 	delayacct_blkio_start();
 	rq = raw_rq();
-- 
1.8.1


  reply	other threads:[~2015-05-01 18:28 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-30 17:45 [PATCH 0/5] blk plug fixes Shaohua Li
2015-04-30 17:45 ` [PATCH 1/5] blk: clean up plug Shaohua Li
2015-05-01 17:11   ` Christoph Hellwig
2015-04-30 17:45 ` [PATCH 2/5] sched: always use blk_schedule_flush_plug in io_schedule_out Shaohua Li
2015-05-01 17:14   ` Christoph Hellwig
2015-05-01 18:05     ` Shaohua Li
2015-05-01 17:42   ` Jeff Moyer
2015-05-01 18:07     ` Jeff Moyer
2015-05-01 18:28       ` Shaohua Li [this message]
2015-05-01 19:37         ` Jeff Moyer
2015-04-30 17:45 ` [PATCH 3/5] blk-mq: fix plugging in blk_sq_make_request Shaohua Li
2015-05-01 17:16   ` Christoph Hellwig
2015-05-01 17:47     ` Jeff Moyer
2015-04-30 17:45 ` [PATCH 4/5] blk-mq: do limited block plug for multiple queue case Shaohua Li
2015-05-01 20:16   ` Jeff Moyer
2015-05-04 19:40     ` Shaohua Li
2015-05-04 19:46       ` Jens Axboe
2015-05-04 20:33         ` Shaohua Li
2015-05-04 20:35           ` Jens Axboe
2015-04-30 17:45 ` [PATCH 5/5] blk-mq: make plug work for mutiple disks and queues Shaohua Li
2015-05-01 20:55   ` Jeff Moyer
2015-05-04 19:44     ` Shaohua Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150501182807.GA2041646@devbig257.prn2.facebook.com \
    --to=shli@fb.com \
    --cc=axboe@fb.com \
    --cc=hch@lst.de \
    --cc=jmoyer@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.