From: Ted Ts'o <tytso@mit.edu>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: Theodore Tso <tytso@google.com>, Greg KH <greg@kroah.com>,
Paul Taysom <taysom@google.com>,
Paul Taysom <taysom@chromium.org>,
Mandeep Baines <msb@chromium.org>, Jens Axboe <axboe@kernel.dk>,
Andrew Morton <akpm@google.com>,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
Alexander Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org, stable@kernel.org
Subject: Re: [PATCH] fs: Fix mod_timer crash when removing USB sticks
Date: Sat, 17 Mar 2012 23:44:12 -0400 [thread overview]
Message-ID: <20120318034412.GA22531@thunk.org> (raw)
In-Reply-To: <Pine.LNX.4.44L0.1203171010510.31791-100000@netrider.rowland.org>
On Sat, Mar 17, 2012 at 10:21:43AM -0400, Alan Stern wrote:
> On Fri, 16 Mar 2012, Theodore Tso wrote:
>
> > I thought another fix at the USB layer also went in that attempted to
> > fix this problem for 3.2, and so with two separate band-aid patches, I
> > think we had thought the problem had been addressed.
>
> I can't recall any USB fix like that. The only thing I remember is
> your change to ext4 (leaving the problem still present in ext3).
I could have sworn there was another patch that was tried to fix this
at another layer, because I distinctly remember thinking, oh well,
maybe I didn't need to merge my patch, when I saw another patch go by
that tried to fix it somewhere else.
I can't find it though, so maybe my memory is playing tricks on me....
> > The real problem is that all of the patches which I've seen to date
> > are band-aids, in that we aren't properly sending a "device as
> > disappeared" notification to the file system layer, but instead we are
> > trying to keep enough of the pointers valid (while also freeing other
> > data structures), such that the file system can blindly write into a
> > partially dismantled block device, and hopefully not oops.
>
> That's not a band-aid approach; it's the way reference counting is
> _intended_ to work. The whole idea of refcounting is that instead of
> synchronizing every single operation, you keep data structures around
> so long as anyone might be using them.
Yeah, maybe. It's just that when I went looking in the git logs
trying to find the other patch which I was sure had gone by on LKML, I
can across this:
commit 95f28604a65b1c40b6c6cd95e58439cd7ded3add
Author: Jens Axboe <jaxboe@fusionio.com>
Date: Thu Mar 17 11:13:12 2011 +0100
fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
We don't have proper reference counting for this yet, so we run into
cases where the device is pulled and we OOPS on flushing the fs data.
This happens even though the dirty inodes have already been
migrated to the default_backing_dev_info.
Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
I can't help thinking that the fact that we're constantly playing
whack-a-mole trying to fix various random crashes when devices
disappear that perhaps we should consider if there's a better way to
do things.
The fact that at the file system layer I have **no** idea that a
device has disappeared, and just blindly going on trying to write to a
device which is gone just seems a little crazy to me... why shouldn't
block layer inform the upper layers about something as fundamental as,
"the device is gone and is never coming back"?
> I suspect Paul's patch is the right thing to do. It might even make
> the ext4 fix unnecessary, although I don't understand the details well
> enough to verify it. Maybe Paul can check -- the commit I'm referring
> to is 7c2e70879fc0949b4220ee61b7c4553f6976a94d (ext4: add ext4-specific
> kludge to avoid an oops after the disk disappears).
I have no idea either, because it's not obvious to me what data
structures can be relied upon, and what can't, and when things are
supposed to get freed on sudden device disconnects. The fact that
none of us are sure is part of what makes me think that the current
scheme is, perhaps, non-optimal...
Regards,
- Ted
next prev parent reply other threads:[~2012-03-18 3:44 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-12 21:57 [PATCH] fs: Fix mod_timer crash when removing USB sticks Paul Taysom
2012-03-16 17:36 ` Greg KH
2012-03-16 19:29 ` Paul Taysom
2012-03-16 19:43 ` Greg KH
2012-03-16 21:10 ` Theodore Tso
2012-03-17 0:06 ` Greg KH
2012-03-17 14:21 ` Alan Stern
2012-03-18 3:44 ` Ted Ts'o [this message]
2012-03-18 20:23 ` Alan Stern
2012-03-18 22:25 ` Mandeep Singh Baines
[not found] ` <CACBanvpzOdC4ns-pg1f92ptxrCJ2O=_oJhpKFD4NOB0hyF_+aA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-03-20 0:24 ` Paul Taysom
2012-03-20 2:02 ` Alan Stern
2012-03-22 16:15 ` Paul Taysom
-- strict thread matches above, loose matches on Subject: below --
2012-01-12 21:15 Paul Taysom
2012-01-12 21:38 ` Greg KH
2012-01-12 21:53 ` Mandeep Singh Baines
2012-01-12 22:02 ` Greg KH
[not found] ` <20120112215331.GB18166-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-01-12 22:11 ` Theodore Tso
[not found] ` <CAGagf4dk4KsZSkaWTO9Yegi=_wRJsYBPgfyks1z=wMZJV8gX0w@mail.gmail.com>
2012-01-12 22:35 ` Mandeep Singh Baines
[not found] ` <20120112223544.GC18166-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-01-12 23:23 ` Andrew Morton
[not found] ` <1326402935-31002-1-git-send-email-taysom-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2012-01-13 5:42 ` Josh Boyer
[not found] ` <CA+5PVA7Yffs3-qdq6pSqDKLLngU7kBsdE92e31NnAM0=wrwp4w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-01-13 15:39 ` Paul Taysom
2012-01-12 19:57 Paul Taysom
2012-01-12 20:15 ` Greg KH
2012-01-13 11:28 ` Sergei Shtylyov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120318034412.GA22531@thunk.org \
--to=tytso@mit.edu \
--cc=akpm@google.com \
--cc=axboe@kernel.dk \
--cc=greg@kroah.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-usb@vger.kernel.org \
--cc=msb@chromium.org \
--cc=stable@kernel.org \
--cc=stern@rowland.harvard.edu \
--cc=taysom@chromium.org \
--cc=taysom@google.com \
--cc=tytso@google.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).