From: James Bottomley <jejb@linux.vnet.ibm.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Greg Kroah-Hartman <greg@kroah.com>,
Hannes Reinecke <hare@suse.de>,
Johannes Thumshirn <jthumshirn@suse.de>,
Sagi Grimberg <sagi@grimberg.me>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
Tejun Heo <htejun@gmail.com>
Subject: Re: [PATCH] Avoid that SCSI device removal through sysfs triggers a deadlock
Date: Thu, 10 Nov 2016 17:37:24 -0800 [thread overview]
Message-ID: <1478828244.3267.49.camel@linux.vnet.ibm.com> (raw)
In-Reply-To: <87pom5o123.fsf@xmission.com>
On Tue, 2016-11-08 at 20:10 -0600, Eric W. Biederman wrote:
> James Bottomley <jejb@linux.vnet.ibm.com> writes:
>
> > On Tue, 2016-11-08 at 18:57 -0600, Eric W. Biederman wrote:
[...]
> > > I am pretty certain that if you are going to make
> > > kernfs_remove_self and kernfs_remove_by_name_ns to be safe to
> > > race against each other, not just the KERNFS_SUICIDAL check, but
> > > the wait when KERNFS_SUICIDAL is set needs to be added ot
> > > kernfs_remove_by_name_ns.
> >
> > I don't think you can do that: waiting for SUICIDED would introduce
> > another potential lock entanglement. I'm reasonably happy that the
> > deactivation offset coupled with kernfs_drain in the non self
> > remove path means that the necessary cleanup is done when the
> > directory itself is removed. That seems to be a common pattern in
> > all non-self removes.
>
> But if we don't I am pretty certain there will be asynchrounous
> behavior in some cases that could potentially confuse userspace.
But the original behaviour kernfs_remove_self() eliminated was the
asynchronous callback. If we go back to that, we're definitely going
to introduce far more asynchronous behaviour.
> Which is partly why I would like to kill kernfs_remove_self.
I took a look at it. It's definitely not cleanly revertible given
what's gone on since. Even just trying to excise it is going to be
hard given all the tentacles it has.
> > > And I suspect if you add the appropriate lockdep annotations to
> > > that mess you will find yourself in a similar but related ABBA
> > > deadlock.
> >
> > I can't prove the negative, but as long as there's no waiting on
> > the SUICIDED/AL flags in the non-self remove path, I believe we're
> > safe with the current patch.
> >
> > > Which is why I would like a simpler easier to understand
> > > mechanism if we can.
> >
> > I don't disagree: If you want to clean out the Augean Stables, I
> > can lend you the thigh length rubber boots and the gas mask.
> > However, I think that what we're currently proposing: a simple
> > patch to make device_remove_file_self() actually work for everyone,
> > along with stringent testing is the better approach.
> >
> > After all, if you look at
> >
> > commit ac0ece9174aca9aa895ce0accc54f1f8ff12d117
> > Author: Tejun Heo <tj@kernel.org>
> > Date: Mon Feb 3 14:03:03 2014 -0500
> >
> > scsi: use device_remove_file_self() instead of
> > device_schedule_callback()
> >
> > You'll see Tejun added all this stuff just to remove the async
> > callback we originally had. Simply restoring the async callback
> > back makes us quite considerably worse off because the
> > device_remove_file_self() mechanism is in use elsewhere. We need
> > either to fix it and move on or junk it and go back to the
> > original.
>
> I vote we junk it and go back to something that resembles the
> original. there are only about 5 other callers so this isn't that
> bad to do.
>
> Tejun's work clearly added this deadlock 2 and a half years ago, and
> it was nasty enough no one noticed until recently.
>
> Using task_work_add(current, ...) as I posted earlier let's us retain
> the synchronous property of the current API.
>
> While we debate the details I am happy to look at scsi as a special
> case and solve for scsi. Then when we have the details worked out we
> can go fix the other cases. Given my preliminary patch in my last
> reply it looks very straight forward to fix this sanely.
I don't think there's any urgency to fix SCSI. You can only really
trigger this by hammering the device and host remove paths, which isn't
what users normally do ... as the fact it's been in the field for 2.5
years with no apparent problems shows.
I'd like Greg and Tejun to weigh in on this before we start doing
something, since they created the initial problem.
James
next prev parent reply other threads:[~2016-11-11 1:37 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-11-08 0:32 [PATCH] Avoid that SCSI device removal through sysfs triggers a deadlock Bart Van Assche
2016-11-08 7:01 ` Greg KH
2016-11-08 15:34 ` James Bottomley
2016-11-08 15:28 ` James Bottomley
2016-11-08 16:52 ` Bart Van Assche
2016-11-08 18:01 ` James Bottomley
2016-11-08 19:13 ` Eric W. Biederman
2016-11-08 23:33 ` Bart Van Assche
2016-11-09 1:22 ` Eric W. Biederman
2016-11-08 23:44 ` James Bottomley
2016-11-09 0:57 ` Eric W. Biederman
2016-11-09 1:43 ` James Bottomley
2016-11-09 2:10 ` Eric W. Biederman
2016-11-11 1:37 ` James Bottomley [this message]
2016-11-11 4:13 ` Eric W. Biederman
-- strict thread matches above, loose matches on Subject: below --
2018-06-26 22:25 Bart Van Assche
2016-10-26 18:44 Bart Van Assche
2016-10-27 9:36 ` Sagi Grimberg
2016-10-27 15:39 ` Bart Van Assche
2016-10-27 9:46 ` Johannes Thumshirn
2016-10-27 15:38 ` Bart Van Assche
2016-10-29 0:12 ` Johannes Thumshirn
2016-10-29 2:08 ` James Bottomley
2016-10-30 19:22 ` Bart Van Assche
2016-10-30 20:25 ` James Bottomley
2016-11-03 22:27 ` Bart Van Assche
2016-11-04 13:47 ` James Bottomley
2016-11-04 18:17 ` Bart Van Assche
2016-11-07 20:51 ` James Bottomley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1478828244.3267.49.camel@linux.vnet.ibm.com \
--to=jejb@linux.vnet.ibm.com \
--cc=bart.vanassche@sandisk.com \
--cc=ebiederm@xmission.com \
--cc=greg@kroah.com \
--cc=hare@suse.de \
--cc=htejun@gmail.com \
--cc=jthumshirn@suse.de \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).