All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Akinobu Mita <akinobu.mita@gmail.com>,
	linux-fsdevel@vger.kernel.org, syzkaller-bugs@googlegroups.com,
	syzkaller <syzkaller@googlegroups.com>
Subject: Re: xarray, fault injection and syzkaller
Date: Fri, 4 Nov 2022 15:03:21 -0300	[thread overview]
Message-ID: <Y2VT6b/AgwddWxYj@nvidia.com> (raw)
In-Reply-To: <CACT4Y+aENA5FouC3fkUHiYqo0hv9xdRoRS043ukJf+qPZU1gbQ@mail.gmail.com>

On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote:
> > > Do we know how common/useful such an allocation pattern is?
> >
> > I have coded something like this a few times, in my cases it is
> > usually something like: try to allocate a big chunk of memory hoping
> > for a huge page, then fall back to a smaller allocation
> >
> > Most likely the key consideration is that the callsites are using
> > GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
> > NOWARN case assuming that another allocation attempt will closely
> > follow?
> 
> GFP_NOWARN is also extensively used for allocations with
> user-controlled size, e.g.:
> https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451
> 
> That's different and these allocations are usually not repeated.
> So looking at GFP_NOWARN does not look like the right thing to do.

This may be the best option then, arguably perhaps even more
"realistic" than normal fail_nth as in a real system if this stuff
starts failing there is a good chance things from then on will fail
too during the error cleanup.

> > However, this would also have to fix the obnoxious behavior of fail
> > nth where it fails its own copy_from_user on its write system call -
> > meaning there would be no way to turn it off.
> 
> Oh, interesting. We added failing of copy_from/to_user later and did
> not consider such interaction.
> Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.

Oh, I will tell you the other two bugish things I noticed

__should_failslab() has this:

	if (gfpflags & __GFP_NOWARN)
		failslab.attr.no_warn = true;

	return should_fail(&failslab.attr, s->object_size);

Which always permanently turns off no_warn for slab during early
boot. This is why syzkaller reports are so confusing. They trigger a
slab fault injection, which in all other cases gives a notification
backtrace, but in slab cases there is no hint about the fault
injection in the log.

Once that is fixed we can quickly explain why the socketpair() example
in the docs shows success ret codes in the middle of the sweep when
run on syzkaller kernels

fail_nth interacts badly with other kernel features typically enabled
in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation:

[   18.499559] FAULT_INJECTION: forcing a failure.
[   18.499559] name failslab, interval 1, probability 0, space 0, times 0
[   18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted 6.1.0-rc3+ #34
[   18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[   18.499971] Call Trace:
[   18.500010]  <TASK>
[   18.500048]  show_stack+0x3d/0x3f
[   18.500114]  dump_stack_lvl+0x92/0xbd
[   18.500171]  dump_stack+0x15/0x17
[   18.500232]  should_fail.cold+0x5/0xa
[   18.500291]  __should_failslab+0xb6/0x100
[   18.500349]  should_failslab+0x9/0x20
[   18.500416]  kmem_cache_alloc+0x64/0x4e0
[   18.500477]  ? __create_object+0x40/0xc50
[   18.500539]  __create_object+0x40/0xc50
[   18.500620]  ? kasan_poison+0x3a/0x50
[   18.500690]  ? kasan_unpoison+0x28/0x50
[***18.500753]  kmemleak_alloc+0x24/0x30 
[   18.500816]  __kmem_cache_alloc_node+0x1de/0x400
[   18.500900]  ? iopt_alloc_area_pages+0x95/0x560 [iommufd]
[   18.500993]  kmalloc_trace+0x26/0x110
[   18.501059]  iopt_alloc_area_pages+0x95/0x560 [iommufd]

Which has the consequence of syzkaller wasting half its fail_nth
effort because it is triggering failures in hidden instrumentation
that has no impact on the main code path.

Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few
cases.

Jason

  reply	other threads:[~2022-11-04 18:03 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-03 19:09 xarray, fault injection and syzkaller Jason Gunthorpe
2022-11-03 20:00 ` Matthew Wilcox
2022-11-03 20:07   ` Jason Gunthorpe
2022-11-04  0:11     ` Dmitry Vyukov
2022-11-04  0:21       ` Jason Gunthorpe
2022-11-04 17:47         ` Dmitry Vyukov
2022-11-04 18:03           ` Jason Gunthorpe [this message]
2022-11-04 18:21             ` Dmitry Vyukov
2022-11-04 18:30               ` Jason Gunthorpe
2022-11-04 18:38                 ` Dmitry Vyukov
2022-11-04 18:45                   ` Jason Gunthorpe
2022-11-04 22:43               ` Qi Zheng
2022-11-05 12:16                 ` Qi Zheng
2022-11-06 17:36                   ` Dmitry Vyukov
2022-11-07  2:13                     ` Qi Zheng
2022-11-07  3:31                     ` [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr Qi Zheng
2022-11-07 12:42                       ` Jason Gunthorpe
2022-11-07 15:05                         ` Qi Zheng
2022-11-07 16:26                           ` Jason Gunthorpe
2022-11-08  2:47                             ` Qi Zheng
2022-11-08  3:52                             ` [PATCH v2] " Qi Zheng
2022-11-08  8:44                               ` Wei Yongjun
2022-11-08  8:58                                 ` Qi Zheng
2022-11-08  9:32                                   ` Wei Yongjun
2022-11-08  9:45                                     ` Qi Zheng
2022-11-08 12:04                                     ` Jason Gunthorpe
2022-11-09  3:57                                       ` Wei Yongjun
2022-11-08 17:36                               ` Akinobu Mita
2022-11-14  3:59                                 ` Qi Zheng
2022-11-04 18:42             ` xarray, fault injection and syzkaller Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y2VT6b/AgwddWxYj@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=akinobu.mita@gmail.com \
    --cc=dvyukov@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=syzkaller@googlegroups.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.