From: Steven Rostedt <rostedt@goodmis.org>
To: Waiman Long <longman@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
Jeff Layton <jlayton@poochiereds.net>,
"J. Bruce Fields" <bfields@fieldses.org>,
Ingo Molnar <mingo@kernel.org>,
linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2] blktrace: Fix potentail deadlock between delete & sysfs ops
Date: Thu, 17 Aug 2017 17:10:07 -0400 [thread overview]
Message-ID: <20170817171007.1ab33b8f@gandalf.local.home> (raw)
In-Reply-To: <b0719c60-5c07-d121-1c84-d5993d36afb0@redhat.com>
On Thu, 17 Aug 2017 12:24:39 -0400
Waiman Long <longman@redhat.com> wrote:
> >> + * sysfs file and then acquiring the bd_mutex. Deleting a block device
> >> + * requires acquiring the bd_mutex and then waiting for all the sysfs
> >> + * references to be gone. This can lead to deadlock if both operations
> >> + * happen simultaneously. To avoid this problem, read/write to the
> >> + * the tracing sysfs files can now fail if the bd_mutex cannot be
> >> + * acquired while a deletion operation is in progress.
> >> + *
> >> + * A mutex trylock loop is used assuming that tracing sysfs operations
> > A mutex trylock loop is not enough to stop a deadlock. But I'm guessing
> > the undocumented bd_deleting may prevent that.
>
> Yes, that is what the bd_deleting flag is for.
>
> I was thinking about failing the sysfs operation after a certain number
> of trylock attempts, but then it will require changes to user space code
> to handle the occasional failures. Finally I decided to fail it only
> when a delete operation is in progress as all hopes are lost in this case.
Actually, why not just fail the attr read on deletion? If it is being
deleted, and one is reading the attribute, perhaps -ENODEV is the
proper return value?
>
> The root cause is the lock inversion under this circumstance. I think
> modifying the blk_trace code has the least impact overall. I agree that
> the code is ugly. If you have a better suggestion, I will certainly like
> to hear it.
Instead of playing games with taking the lock, the only way this race
is hit, is if the partition is being deleted and the sysfs attribute is
being read at the same time, correct? In that case, just return
-ENODEV, and be done with it.
-- Steve
next prev parent reply other threads:[~2017-08-17 21:10 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-16 20:40 [PATCH v2] blktrace: Fix potentail deadlock between delete & sysfs ops Waiman Long
2017-08-17 13:34 ` Steven Rostedt
2017-08-17 16:24 ` Waiman Long
2017-08-17 20:30 ` Steven Rostedt
2017-08-17 20:41 ` Bart Van Assche
2017-08-17 20:56 ` Steven Rostedt
2017-08-17 21:10 ` Steven Rostedt [this message]
2017-08-17 21:27 ` Waiman Long
2017-08-17 22:13 ` Steven Rostedt
2017-08-17 22:18 ` Steven Rostedt
2017-08-17 23:23 ` Steven Rostedt
2017-08-18 13:42 ` Waiman Long
2017-08-17 21:30 ` Steven Rostedt
2017-08-18 13:55 ` Waiman Long
2017-08-18 16:21 ` Bart Van Assche
2017-08-18 17:22 ` Waiman Long
2017-08-18 17:26 ` Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170817171007.1ab33b8f@gandalf.local.home \
--to=rostedt@goodmis.org \
--cc=axboe@kernel.dk \
--cc=bfields@fieldses.org \
--cc=jlayton@poochiereds.net \
--cc=linux-block@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mingo@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).