From: Andrew Morton <akpm@linux-foundation.org>
To: Roland Dreier <rdreier@cisco.com>
Cc: Al Viro <viro@ftp.linux.org.uk>, linux-kernel@vger.kernel.org
Subject: Re: [RESEND] file operations: release can race with read/write?
Date: Wed, 17 Oct 2007 12:16:56 -0700 [thread overview]
Message-ID: <20071017121656.484646a5.akpm@linux-foundation.org> (raw)
In-Reply-To: <adaodexh9uz.fsf@cisco.com>
On Wed, 17 Oct 2007 11:30:44 -0700 Roland Dreier <rdreier@cisco.com> wrote:
> [Resending, directly to a couple likely suspects this time, in the
> hope of getting a reply... thanks]
>
> I have a question about the synchronization of file_operations: is it
> intended that the .release method of a file can be called while a
> .read or .write method is still running for that file?
>
> The reason I ask is that I've seen a crash in practice that appears to
> be caused by this race, and I'm wondering whether the correct fix is
> to add synchronization between .read/.write and .release to the
> driver's methods (this is a character device), or whether the core
> kernel is supposed to provide this synchronization.
>
> I've written a quick test case that seems to show this happen in
> practice, and reading the code in fs/open.c and fs/read_write.c I see
> no reason why this race can't happen: sys_read() and sys_write() just
> do fget_light(), which will not increment the file's reference count
> on the fast path, so a racing sys_close() from another thread could do
> the final fput() that ends up calling the file's .release method
> before the read or write has finished.
After pthread_create()'s clone() you should have files->count==2, so
fget_light() will do the full atomic_inc_not_zero() thing?
> I think the actual file data structure is OK, because it is freed with
> RCU, so it won't be freed too soon. But a .release method may free
> other driver-internal state that is still in use because of this race.
>
> It seems that we wouldn't want to give up the fget_light()
> optimization in read/write, so probably the right place to handle this
> is in the driver. But on the other hand, this is kind of a subtle
> booby trap that has been laid, and maybe there is a clever way to
> handle this. So I would appreciate guidance from smarter people.
>
> Thanks,
> Roland
>
>
> BTW, here's the test case -- it seems pretty conclusive to me, but
> maybe I'm all wet and misunderstanding things. There's a kernel
> module that just prints "Write raced with close?" if it sees the race,
> and a userspace driver that just spawns two threads, one that does
> open/close in a loop and one that does write in a loop. Running this
> on my machine, I see several race messages get printed.
>
That seems pretty conclusive. I haven't actually thought about this yet -
I just wanted to clarify that fget_light() thing.
next prev parent reply other threads:[~2007-10-17 19:17 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-10-16 18:53 file operations: release can race with read/write? Roland Dreier
2007-10-17 18:30 ` [RESEND] " Roland Dreier
2007-10-17 19:16 ` Andrew Morton [this message]
2007-10-17 22:06 ` Roland Dreier
2007-10-17 19:27 ` Al Viro
2007-10-17 19:33 ` Al Viro
2007-10-17 22:10 ` Roland Dreier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20071017121656.484646a5.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rdreier@cisco.com \
--cc=viro@ftp.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.