From: Al Viro <viro@ZenIV.linux.org.uk>
To: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>, Sergey Dyasly <dserrg@gmail.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 4/4] proc: avoid ->f_pos overflows in proc_task_readdir() paths
Date: Tue, 4 Jun 2013 22:06:34 +0100 [thread overview]
Message-ID: <20130604210633.GC13110@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20130604195700.GA31933@redhat.com>
On Tue, Jun 04, 2013 at 09:57:00PM +0200, Oleg Nesterov wrote:
> Will do... but so far I am confused.
>
> I do not see how they could race (I mean /proc/pid/task only). OK, OK,
> the usage of ->f_pos in sys_getdents() looks "obviously wrong", but this
> is another story? And "put f_pos in a local variable" can't help.
For one thing, a bunch of directories use generic_file_llseek(), which
does *not* use ->i_mutex. For another, there's a very unpleasant problem
with read(2) (failing) attempt racing with ->f_pos modifications in
->readdir(). Take a look at sys_read() and note that it is done with no
serialization at all (not in the top level, that is) and that it puts the
(unmodified by generic_read_dir()) value of pos back into file->f_pos as
soon as vfs_read() passes -EISDIR (returned by generic_read_dir()) back to
sys_read().
I.e. ->f_pos is silently reset back to the value it used to have on the
entry to read(2). Despite foo_readdir() assumptions that it won't be
changed behind its back.
Reset itself wouldn't be a problem - if several threads mess with read()
on the same opened file in parallel, you are not promised anything good
about the resulting IO pointer position. The same applies here. However,
many ->readdir() instances use file->f_pos as a variable they can use for
internal needs and _that_ leads to very unpleasant races.
The sane solution is to do what ->read()/->write()/etc. are doing - pass
an address of local copy of ->f_pos, so they are able to use it without
worrying about concurrent modifications of that value. That obviously
solves all problems with generic_file_lseek(), etc., as well as this
sys_read() shite.
next prev parent reply other threads:[~2013-06-04 21:06 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-03 19:06 [PATCH v2 0/4] proc: first_tid() fix/cleanup Oleg Nesterov
2013-06-03 19:06 ` [PATCH v2 1/4] proc: first_tid: fix the potential use-after-free Oleg Nesterov
2013-06-03 19:07 ` [PATCH v2 2/4] proc: change first_tid() to use while_each_thread() Oleg Nesterov
2013-06-03 19:07 ` [PATCH v2 3/4] proc: simplify proc_task_readdir/first_tid paths Oleg Nesterov
2013-06-03 22:06 ` Eric W. Biederman
2013-06-03 19:07 ` [PATCH v2 4/4] proc: avoid ->f_pos overflows in proc_task_readdir() paths Oleg Nesterov
2013-06-03 22:18 ` Eric W. Biederman
2013-06-04 17:14 ` Oleg Nesterov
2013-06-04 17:39 ` Al Viro
2013-06-04 19:57 ` Oleg Nesterov
2013-06-04 21:06 ` Al Viro [this message]
2013-06-04 0:58 ` Al Viro
2013-06-04 17:35 ` Oleg Nesterov
2013-06-04 17:32 ` [PATCH v2 0/4] proc: first_tid() fix/cleanup Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130604210633.GC13110@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=akpm@linux-foundation.org \
--cc=dserrg@gmail.com \
--cc=ebiederm@xmission.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mhocko@suse.cz \
--cc=oleg@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.