All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Mikulas Patocka <mpatocka@redhat.com>,
	Lars Ellenberg <lars.ellenberg@linbit.com>,
	axboe@kernel.dk
Cc: linux-block@vger.kernel.org, dm-devel@redhat.com,
	Roland Kammerer <roland.kammerer@linbit.com>,
	"Alasdair G. Kergon" <agk@redhat.com>,
	Zdenek Kabelac <zkabelac@redhat.com>
Subject: Re: [PATCH 0/3] offload bios to a thread
Date: Thu, 30 Jun 2016 15:40:24 -0400	[thread overview]
Message-ID: <20160630194023.GA21338@redhat.com> (raw)
In-Reply-To: <alpine.LRH.2.02.1606282014380.7298@file01.intranet.prod.int.rdu2.redhat.com>

[cc'ing linux-block and drbd folks]

On Tue, Jun 28 2016 at  8:16pm -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> Hi
> 
> Here I'm sending three patches to fix the deadlocks in snapshot and 
> snapshot-merge.
> 
> The first patch fixes the deadlock, the following 2 patches introduce a 
> timer, so that bios are not offloaded immediatelly, they are offloaded 
> after a specified timeout, because immediate offloading can change order 
> of bios and it could theoretically produce regressions. I don't know if 
> these regressions really exist or not.
> 
> If there is some way to push the patches upstream, try it.

Some fix must happen before the more recent upstream kernels can be
reliably used in stacked bio-based workloads (in production).  We simply
cannot ignore this issue any more.

drbd is also hitting the same generic_make_request (current->bio_list)
problem, see:
https://www.redhat.com/archives/dm-devel/2016-June/msg00326.html

Mikulas, I've taken your 3 proposed patches patches and refactored them
some to split out intermediate patches that hopefully make review
easier.  Nothing other than variable names and some other style stuff
was changed -- headers were tweaked some to help with clarity.

Please see the 5 topmost "block: ..." patches here:
http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=wip

It should be noted that Jens had a quick look at this set and wanted to
throw up a little when he saw the (ab)use of a timer to defer punting to
the workqueue.  I explained that without the timer, always punting to
the workqueue, we could hurt performance by reordering IO or crippling
onstack plugging.  He said he'd try to think of a cleaner way forward.

Lars, please feel free to see if this set addresses the similar deadlock
you saw/fixed with drbd.  We need to converge on an acceptable fix for
this problem -- preferably sooner rather than later!

Conversely, Mikulas: if you can easily reproduce the dm-snapshot
deadlock please try Lars' fix to see if it is workable for our DM needs.

Thanks,
Mike

p.s. I'm on holiday until next Wednesday (7/6).. so may be slow to
     respond until then.

WARNING: multiple messages have this Message-ID (diff)
From: Mike Snitzer <snitzer@redhat.com>
To: Mikulas Patocka <mpatocka@redhat.com>,
	Lars Ellenberg <lars.ellenberg@linbit.com>,
	axboe@kernel.dk
Cc: "Alasdair G. Kergon" <agk@redhat.com>,
	Zdenek Kabelac <zkabelac@redhat.com>,
	dm-devel@redhat.com, linux-block@vger.kernel.org,
	Roland Kammerer <roland.kammerer@linbit.com>
Subject: Re: [PATCH 0/3] offload bios to a thread
Date: Thu, 30 Jun 2016 15:40:24 -0400	[thread overview]
Message-ID: <20160630194023.GA21338@redhat.com> (raw)
In-Reply-To: <alpine.LRH.2.02.1606282014380.7298@file01.intranet.prod.int.rdu2.redhat.com>

[cc'ing linux-block and drbd folks]

On Tue, Jun 28 2016 at  8:16pm -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> Hi
> 
> Here I'm sending three patches to fix the deadlocks in snapshot and 
> snapshot-merge.
> 
> The first patch fixes the deadlock, the following 2 patches introduce a 
> timer, so that bios are not offloaded immediatelly, they are offloaded 
> after a specified timeout, because immediate offloading can change order 
> of bios and it could theoretically produce regressions. I don't know if 
> these regressions really exist or not.
> 
> If there is some way to push the patches upstream, try it.

Some fix must happen before the more recent upstream kernels can be
reliably used in stacked bio-based workloads (in production).  We simply
cannot ignore this issue any more.

drbd is also hitting the same generic_make_request (current->bio_list)
problem, see:
https://www.redhat.com/archives/dm-devel/2016-June/msg00326.html

Mikulas, I've taken your 3 proposed patches patches and refactored them
some to split out intermediate patches that hopefully make review
easier.  Nothing other than variable names and some other style stuff
was changed -- headers were tweaked some to help with clarity.

Please see the 5 topmost "block: ..." patches here:
http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=wip

It should be noted that Jens had a quick look at this set and wanted to
throw up a little when he saw the (ab)use of a timer to defer punting to
the workqueue.  I explained that without the timer, always punting to
the workqueue, we could hurt performance by reordering IO or crippling
onstack plugging.  He said he'd try to think of a cleaner way forward.

Lars, please feel free to see if this set addresses the similar deadlock
you saw/fixed with drbd.  We need to converge on an acceptable fix for
this problem -- preferably sooner rather than later!

Conversely, Mikulas: if you can easily reproduce the dm-snapshot
deadlock please try Lars' fix to see if it is workable for our DM needs.

Thanks,
Mike

p.s. I'm on holiday until next Wednesday (7/6).. so may be slow to
     respond until then.

  reply	other threads:[~2016-06-30 19:40 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-29  0:16 [PATCH 0/3] offload bios to a thread Mikulas Patocka
2016-06-30 19:40 ` Mike Snitzer [this message]
2016-06-30 19:40   ` Mike Snitzer
2016-06-30 23:15   ` Mike Snitzer
2016-06-30 23:15     ` Mike Snitzer
2016-07-04  8:09     ` Lars Ellenberg
2016-07-04  8:09       ` Lars Ellenberg
2016-07-04 22:27       ` Mikulas Patocka
2016-07-04 22:27         ` Mikulas Patocka
2016-07-04 22:45   ` Mikulas Patocka
2016-07-04 22:45     ` Mikulas Patocka
  -- strict thread matches above, loose matches on Subject: below --
2016-07-04 22:53 Mikulas Patocka
2016-07-06 13:36 ` Mike Snitzer
2016-07-06 13:53   ` Mikulas Patocka
2016-07-06 13:55     ` Mike Snitzer
2016-07-06 15:23       ` Mikulas Patocka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160630194023.GA21338@redhat.com \
    --to=snitzer@redhat.com \
    --cc=agk@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@redhat.com \
    --cc=lars.ellenberg@linbit.com \
    --cc=linux-block@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=roland.kammerer@linbit.com \
    --cc=zkabelac@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.