Re: [git pull] vfs pile 1 (splice)

From: Linus Torvalds <torvalds@linux-foundation.org>
To: Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@fb.com>, "Ted Ts'o" <tytso@mit.edu>,
	Christoph Lameter <cl@linux.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [git pull] vfs pile 1 (splice)
Date: Sun, 9 Oct 2016 11:40:27 -0700
Message-ID: <CA+55aFxJsPM0OihozMUmCecg0zdG0izVDr_=z55CXkdXU3qT+w@mail.gmail.com>
In-Reply-To: <CA+55aFzXgWSRYeBX-qSUWPv2uhxEQ+80poQbvwvgCbf=RsKXTg@mail.gmail.com>

On Sat, Oct 8, 2016 at 11:05 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Hmm. I've now gotten two oopses today, both at __kmalloc+0xc3/0x1f0,
> which seems to be the
>
>   *(void **)(object + s->offset);
>
> in get_freepointer().

Actually, it's in "get_freepointer_safe()"; it's just that without
DEBUG_PAGEALLOC the two end up being the same.

> I guess I'll need to just run with slab debugging on, but I wanted to
> bring this to people's attention in case it rings a bell for somebody.
> I haven't been merging anything today, partly because of this.

Hmm. When I enabled SLUB debugging, I also enabled DEBUG_PAGEALLOC,
because "why not". But it turns out that may have been a mistake,
because it changes the very path that failed so that it no longer does
the failing access directly (or rather, it does it as a
"probe_kernel_read()", which traps and ignores the failure).
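
For anyone following along, here is a rough user-space model of the two
paths described above. It is only a sketch: every name ending in _model
is made up for illustration, the struct carries just the one field used
here, and the probe helper is a plain memcpy() standing in for
probe_kernel_read(), so it cannot actually catch a fault the way the
kernel helper does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct kmem_cache: only the field used here. */
struct cache_model {
	size_t offset;	/* where the free pointer sits inside a free object */
};

/* Stand-in for the DEBUG_PAGEALLOC switch (an assumption of this model). */
static bool pagealloc_debug_model;

/* Plain accessor: a direct load. In the kernel this is the access that
 * oopses when 'object' is a bogus pointer. */
static void *get_freepointer_model(const struct cache_model *s, void *object)
{
	assert(s != NULL && object != NULL);
	return *(void **)((char *)object + s->offset);
}

/* Stand-in for probe_kernel_read(). The kernel helper traps a faulting
 * access and returns an error; this model just copies the bytes. */
static int probe_read_model(void *dst, const void *src, size_t size)
{
	memcpy(dst, src, size);
	return 0;
}

/* "Safe" accessor: identical to the plain one unless page-alloc debugging
 * is on, in which case the read goes through the probing helper and a
 * failure is silently ignored. */
static void *get_freepointer_safe_model(const struct cache_model *s,
					void *object)
{
	void *p = NULL;

	if (!pagealloc_debug_model)
		return get_freepointer_model(s, object);

	probe_read_model(&p, (char *)object + s->offset, sizeof(p));
	return p;
}
```

With the switch off, the "safe" variant collapses into the direct load,
which is why the oops only shows up in that configuration.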

So all my "careful" testing seems to have been pointless, because I
enabled too much debugging, making sure that the problem cannot
happen. No wonder I couldn't reproduce this.

I'll continue with *just* SLUB debugging on, but I thought it was
interesting how enabling more memory access debugging actually ends up
changing some really subtle code.

The "get_freepointer_safe()" thing is explicitly doing a read that
could be to freed memory, and it then relies on the
this_cpu_cmpxchg_double() to abort the operation if what it read is no
longer valid.
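
The shape of that optimistic protocol can be sketched in user space. This
is a simplified model, not the SLUB code: objects are array slots rather
than pointers, the (freelist, tid) pair is packed into one 64-bit word so
that an ordinary C11 compare-and-swap can stand in for
this_cpu_cmpxchg_double(), and all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NOBJ 4
#define NIL  UINT32_MAX

/* next_free[i] models the free pointer stored inside free object i. */
static uint32_t next_free[NOBJ];

/* Low 32 bits: index of the first free object. High 32 bits: transaction id. */
static _Atomic uint64_t head_tid;

static uint64_t pack(uint32_t head, uint32_t tid)
{
	return ((uint64_t)tid << 32) | head;
}

/* Chain all slots into one freelist: 0 -> 1 -> 2 -> 3 -> NIL. */
static void cache_init_model(void)
{
	for (uint32_t i = 0; i < NOBJ; i++)
		next_free[i] = (i + 1 < NOBJ) ? i + 1 : NIL;
	atomic_store(&head_tid, pack(0, 0));
}

/* Optimistic allocation: returns a slot index, or NIL when empty. */
static uint32_t cache_alloc_model(void)
{
	uint64_t old = atomic_load(&head_tid);

	for (;;) {
		uint32_t object = (uint32_t)old;
		uint32_t tid = (uint32_t)(old >> 32);
		uint32_t next;

		if (object == NIL)
			return NIL;
		assert(object < NOBJ);

		/* Speculative read: if someone raced with us, this value is
		 * stale and must not be used. */
		next = next_free[object];

		/* Commit only if head and tid are both unchanged. On failure
		 * 'old' is reloaded and the stale 'next' is thrown away. */
		if (atomic_compare_exchange_weak(&head_tid, &old,
						 pack(next, tid + 1)))
			return object;
	}
}
```

The compare-and-swap can discard a stale value, but it cannot stop the
speculative load itself from faulting. In this model the load is a bounded
array index; in the kernel it goes through a real pointer, so a corrupted
freelist entry turns into an oops before the validation step is reached.
The model is also single-threaded in spirit: a truly concurrent version
would need atomic accesses for next_free as well.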

I'm adding Christoph to the cc, not because the slub code has changed
lately (this optimistic access logic is 5+ years old), but because
maybe Christoph remembers what tends to trigger these kinds of issues.

Christoph, the problem is that something is triggering an oops or page
fault (depending on how bogus the address is) in __kmalloc() when it
does that get_freepointer_safe() thing without DEBUG_PAGEALLOC. I've
seen two different cases on two different boots, but they both were on
that one instruction that did that

     void *next_object = get_freepointer_safe(s, object);

access. Both were to random kmalloc'ed memory (it *may* be a very
specific size that sees the corruption, but it's hard to tell: the
callchains were different and in both cases depended on some dynamic
length thing - once the directory entry name, in another case the
xattr name length).
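
One reason two unrelated callchains with dynamic lengths could still land
in the same slab cache is that kmalloc() rounds each request up to one of
a small set of size classes. A rough sketch of that rounding, assuming the
usual general-purpose cache sizes (8, 16, 32, 64, 96, 128, 192, 256, then
powers of two); the function name is hypothetical and this is not the
kernel's own lookup code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a request up to the size class that would serve it, under the
 * assumed class list above. Returns 0 for a zero-sized request. */
static size_t kmalloc_class_model(size_t size)
{
	size_t c;

	if (size == 0)
		return 0;
	/* The two non-power-of-two classes are checked first. */
	if (size > 64 && size <= 96)
		return 96;
	if (size > 128 && size <= 192)
		return 192;

	/* Otherwise: the smallest power of two that is >= size, minimum 8. */
	for (c = 8; c < size; c <<= 1)
		assert(c <= SIZE_MAX / 2);	/* guard against shift overflow */
	return c;
}
```

Under these assumptions a 20-byte name and a 30-byte name both map to the
32-byte class, so a corruption tied to one cache could surface from quite
different callers.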

The subject line is about Al's splice pull, but that's only one of the
potential causes I suspect. It could easily be Andrew's pile (maybe
that nice fsnotify locking cleanup causes double frees?), Ted's ext4
changes (I didn't look at whether those could change allocation
patterns in a buggy way) or Jens' block layer changes.

Could be elsewhere too. I saw it twice in one day which would *tend*
to mean that it's recent, but maybe I was just lucky the previous days
and didn't hit it. I haven't been able to repro it now, but maybe I
figured out one reason why my reproductions have been failing ;)

                  Linus

Thread overview: 7+ messages
2016-10-07 22:20 [git pull] vfs pile 1 (splice) Al Viro
2016-10-09  6:05 ` Linus Torvalds
2016-10-09 18:40   ` Linus Torvalds [this message]
2016-10-09 19:11     ` Linus Torvalds
2016-10-10 14:03     ` Christoph Lameter
2016-10-10 19:56       ` Linus Torvalds
2016-10-12 14:10         ` Christoph Lameter
