linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: Dmitry Vyukov <dvyukov@google.com>, Jason Gunthorpe <jgg@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Akinobu Mita <akinobu.mita@gmail.com>,
	linux-fsdevel@vger.kernel.org, syzkaller-bugs@googlegroups.com,
	syzkaller <syzkaller@googlegroups.com>
Subject: Re: xarray, fault injection and syzkaller
Date: Sat, 5 Nov 2022 06:43:37 +0800	[thread overview]
Message-ID: <6e33dd02-99b0-0899-aed5-07f770340a74@bytedance.com> (raw)
In-Reply-To: <CACT4Y+aog92JBEGqga1QxZ7w6iPsEvEKE=6v7m78pROGAQ7KEA@mail.gmail.com>



On 2022/11/5 02:21, Dmitry Vyukov wrote:
> On Fri, 4 Nov 2022 at 11:03, Jason Gunthorpe <jgg@nvidia.com> wrote:
>>
>> On Fri, Nov 04, 2022 at 10:47:17AM -0700, Dmitry Vyukov wrote:
>>>>> Do we know how common/useful such an allocation pattern is?
>>>>
>>>> I have coded something like this a few times, in my cases it is
>>>> usually something like: try to allocate a big chunk of memory hoping
>>>> for a huge page, then fall back to a smaller allocation
>>>>
>>>> Most likely the key consideration is that the callsites are using
>>>> GFP_NOWARN, so perhaps we can just avoid decrementing the nth on a
>>>> NOWARN case assuming that another allocation attempt will closely
>>>> follow?
>>>
>>> GFP_NOWARN is also extensively used for allocations with
>>> user-controlled size, e.g.:
>>> https://elixir.bootlin.com/linux/v6.1-rc3/source/net/unix/af_unix.c#L3451
>>>
>>> That's different and these allocations are usually not repeated.
>>> So looking at GFP_NOWARN does not look like the right thing to do.
>>
>> This may be the best option then, arguably perhaps even more
>> "realistic" than normal fail_nth as in a real system if this stuff
>> starts failing there is a good chance things from then on will fail
>> too during the error cleanup.
>>
>>>> However, this would also have to fix the obnoxious behavior of fail
>>>> nth where it fails its own copy_from_user on its write system call -
>>>> meaning there would be no way to turn it off.
>>>
>>> Oh, interesting. We added failing of copy_from/to_user later and did
>>> not consider such interaction.
>>> Filed https://bugzilla.kernel.org/show_bug.cgi?id=216660 for this.
>>
>> Oh, I will tell you the other two bugish things I noticed
>>
>> __should_failslab() has this:
>>
>>          if (gfpflags & __GFP_NOWARN)
>>                  failslab.attr.no_warn = true;
>>
>>          return should_fail(&failslab.attr, s->object_size);
>>
>> Which always permanently turns off no_warn for slab during early
>> boot. This is why syzkaller reports are so confusing. They trigger a
>> slab fault injection, which in all other cases gives a notification
>> backtrace, but in slab cases there is no hint about the fault
>> injection in the log.
> 
> Ouch, this looks like a bug in:
> 
> commit 3f913fc5f9745613088d3c569778c9813ab9c129
> Author: Qi Zheng <zhengqi.arch@bytedance.com>
> Date:   Thu May 19 14:08:55 2022 -0700
>       mm: fix missing handler for __GFP_NOWARN
> 
> +Qi could you please fix it?
> 
> At the very least the local gfpflags should not alter the global
> failslab.attr that is persistent and shared by all tasks.

Oh, It indeed shouldn't alter the global failslab.attr, I'll fix it.

But a warning should not be printed for callers that currently specify
__GFP_NOWARN, because that could lead to deadlocks, such as the deadlock
case mentioned in commit 6b9dbedbe349 ("tty: fix deadlock caused by 
calling printk() under tty_port->lock").

Thanks,
Qi

> 
> But I am not sure if we really don't want to issue the fault injection
> stack in this case. It's not a WARNING, it's merely an information
> message. It looks useful in all cases, even with GFP_NOWARN. Why
> should it be suppressed?
> 
> 
>> Once that is fixed we can quickly explain why the socketpair() example
>> in the docs shows success ret codes in the middle of the sweep when
>> run on syzkaller kernels
>>
>> fail_nth interacts badly with other kernel features typically enabled
>> in syzkaller kernels. Eg it fails in hidden kmemleak instrumentation:
>>
>> [   18.499559] FAULT_INJECTION: forcing a failure.
>> [   18.499559] name failslab, interval 1, probability 0, space 0, times 0
>> [   18.499720] CPU: 10 PID: 386 Comm: iommufd_fail_nt Not tainted 6.1.0-rc3+ #34
>> [   18.499826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>> [   18.499971] Call Trace:
>> [   18.500010]  <TASK>
>> [   18.500048]  show_stack+0x3d/0x3f
>> [   18.500114]  dump_stack_lvl+0x92/0xbd
>> [   18.500171]  dump_stack+0x15/0x17
>> [   18.500232]  should_fail.cold+0x5/0xa
>> [   18.500291]  __should_failslab+0xb6/0x100
>> [   18.500349]  should_failslab+0x9/0x20
>> [   18.500416]  kmem_cache_alloc+0x64/0x4e0
>> [   18.500477]  ? __create_object+0x40/0xc50
>> [   18.500539]  __create_object+0x40/0xc50
>> [   18.500620]  ? kasan_poison+0x3a/0x50
>> [   18.500690]  ? kasan_unpoison+0x28/0x50
>> [***18.500753]  kmemleak_alloc+0x24/0x30
>> [   18.500816]  __kmem_cache_alloc_node+0x1de/0x400
>> [   18.500900]  ? iopt_alloc_area_pages+0x95/0x560 [iommufd]
>> [   18.500993]  kmalloc_trace+0x26/0x110
>> [   18.501059]  iopt_alloc_area_pages+0x95/0x560 [iommufd]
>>
>> Which has the consequence of syzkaller wasting half its fail_nth
>> effort because it is triggering failures in hidden instrumentation
>> that has no impact on the main code path.
>>
>> Maybe a kmem_cache_alloc_no_fault_inject() would be helpful for a few
>> cases.
>>
>> Jason

-- 
Thanks,
Qi

  parent reply	other threads:[~2022-11-04 22:49 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-03 19:09 xarray, fault injection and syzkaller Jason Gunthorpe
2022-11-03 20:00 ` Matthew Wilcox
2022-11-03 20:07   ` Jason Gunthorpe
2022-11-04  0:11     ` Dmitry Vyukov
2022-11-04  0:21       ` Jason Gunthorpe
2022-11-04 17:47         ` Dmitry Vyukov
2022-11-04 18:03           ` Jason Gunthorpe
2022-11-04 18:21             ` Dmitry Vyukov
2022-11-04 18:30               ` Jason Gunthorpe
2022-11-04 18:38                 ` Dmitry Vyukov
2022-11-04 18:45                   ` Jason Gunthorpe
2022-11-04 22:43               ` Qi Zheng [this message]
2022-11-05 12:16                 ` Qi Zheng
2022-11-06 17:36                   ` Dmitry Vyukov
2022-11-07  2:13                     ` Qi Zheng
2022-11-07  3:31                     ` [PATCH] mm: fix unexpected changes to {failslab|fail_page_alloc}.attr Qi Zheng
2022-11-07 12:42                       ` Jason Gunthorpe
2022-11-07 15:05                         ` Qi Zheng
2022-11-07 16:26                           ` Jason Gunthorpe
2022-11-08  2:47                             ` Qi Zheng
2022-11-08  3:52                             ` [PATCH v2] " Qi Zheng
2022-11-08  8:44                               ` Wei Yongjun
2022-11-08  8:58                                 ` Qi Zheng
2022-11-08  9:32                                   ` Wei Yongjun
2022-11-08  9:45                                     ` Qi Zheng
2022-11-08 12:04                                     ` Jason Gunthorpe
2022-11-09  3:57                                       ` Wei Yongjun
2022-11-08 17:36                               ` Akinobu Mita
2022-11-14  3:59                                 ` Qi Zheng
2022-11-04 18:42             ` xarray, fault injection and syzkaller Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6e33dd02-99b0-0899-aed5-07f770340a74@bytedance.com \
    --to=zhengqi.arch@bytedance.com \
    --cc=akinobu.mita@gmail.com \
    --cc=dvyukov@google.com \
    --cc=jgg@nvidia.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=syzkaller@googlegroups.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).