* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Paolo Bonzini @ 2019-12-05 10:13 UTC (permalink / raw)
To: syzbot, aryabinin, b.zolnierkie, daniel.thompson, daniel.vetter,
dri-devel, dvyukov, ghalat, gleb, gwshan, hpa, jmorris, kasan-dev,
kvm, linux-fbdev, linux-kernel, linux-security-module,
maarten.lankhorst, mingo, mpe, penguin-kernel, ruscur, sam, serge,
stewart, syzkaller-bugs, takedakn, tglx, x86
In-Reply-To: <0000000000003e640e0598e7abc3@google.com>
On 04/12/19 22:41, syzbot wrote:
> syzbot has bisected this bug to:
>
> commit 2de50e9674fc4ca3c6174b04477f69eb26b4ee31
> Author: Russell Currey <ruscur@russell.cc>
> Date: Mon Feb 8 04:08:20 2016 +0000
>
> powerpc/powernv: Remove support for p5ioc2
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=127a042ae00000
> start commit: 76bb8b05 Merge tag 'kbuild-v5.5' of
> git://git.kernel.org/p..
> git tree: upstream
> final crash: https://syzkaller.appspot.com/x/report.txt?x=117a042ae00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=167a042ae00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=dd226651cb0f364b
> dashboard link:
> https://syzkaller.appspot.com/bug?extid=4455ca3b3291de891abc
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11181edae00000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=105cbb7ae00000
>
> Reported-by: syzbot+4455ca3b3291de891abc@syzkaller.appspotmail.com
> Fixes: 2de50e9674fc ("powerpc/powernv: Remove support for p5ioc2")
>
> For information about bisection process see:
> https://goo.gl/tpsmEJ#bisection
>
Why is everybody being CC'd, even if the bug has nothing to do with the
person's subsystem?
Thanks,
Paolo
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Dmitry Vyukov @ 2019-12-05 10:16 UTC (permalink / raw)
To: Paolo Bonzini
Cc: syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Tetsuo Handa,
Russell Currey, Sam Ravnborg, Serge E. Hallyn, stewart,
syzkaller-bugs, Kentaro Takeda, Thomas Gleixner,
the arch/x86 maintainers
In-Reply-To: <41c082f5-5d22-d398-3bdd-3f4bf69d7ea3@redhat.com>
On Thu, Dec 5, 2019 at 11:13 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 04/12/19 22:41, syzbot wrote:
> > syzbot has bisected this bug to:
> >
> > commit 2de50e9674fc4ca3c6174b04477f69eb26b4ee31
> > Author: Russell Currey <ruscur@russell.cc>
> > Date: Mon Feb 8 04:08:20 2016 +0000
> >
> > powerpc/powernv: Remove support for p5ioc2
> >
> > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=127a042ae00000
> > start commit: 76bb8b05 Merge tag 'kbuild-v5.5' of
> > git://git.kernel.org/p..
> > git tree: upstream
> > final crash: https://syzkaller.appspot.com/x/report.txt?x=117a042ae00000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=167a042ae00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=dd226651cb0f364b
> > dashboard link:
> > https://syzkaller.appspot.com/bug?extid=4455ca3b3291de891abc
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11181edae00000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=105cbb7ae00000
> >
> > Reported-by: syzbot+4455ca3b3291de891abc@syzkaller.appspotmail.com
> > Fixes: 2de50e9674fc ("powerpc/powernv: Remove support for p5ioc2")
> >
> > For information about bisection process see:
> > https://goo.gl/tpsmEJ#bisection
> >
>
> Why is everybody being CC'd, even if the bug has nothing to do with the
> person's subsystem?
The To list should be intersection of 2 groups of emails: result of
get_maintainers.pl on the file identified as culprit in the crash
message + emails extracted from the bisected to commit.
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Paolo Bonzini @ 2019-12-05 10:22 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Tetsuo Handa,
Russell Currey, Sam Ravnborg, Serge E. Hallyn, stewart,
syzkaller-bugs, Kentaro Takeda, Thomas Gleixner,
the arch/x86 maintainers
In-Reply-To: <CACT4Y+bCHOCLYF+TW062n8+tqfK9vizaRvyjUXNPdneciq0Ahg@mail.gmail.com>
On 05/12/19 11:16, Dmitry Vyukov wrote:
> On Thu, Dec 5, 2019 at 11:13 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 04/12/19 22:41, syzbot wrote:
>>> syzbot has bisected this bug to:
>>>
>>> commit 2de50e9674fc4ca3c6174b04477f69eb26b4ee31
>>> Author: Russell Currey <ruscur@russell.cc>
>>> Date: Mon Feb 8 04:08:20 2016 +0000
>>>
>>> powerpc/powernv: Remove support for p5ioc2
>>>
>>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=127a042ae00000
>>> start commit: 76bb8b05 Merge tag 'kbuild-v5.5' of
>>> git://git.kernel.org/p..
>>> git tree: upstream
>>> final crash: https://syzkaller.appspot.com/x/report.txt?x=117a042ae00000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=167a042ae00000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=dd226651cb0f364b
>>> dashboard link:
>>> https://syzkaller.appspot.com/bug?extid=4455ca3b3291de891abc
>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11181edae00000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=105cbb7ae00000
>>>
>>> Reported-by: syzbot+4455ca3b3291de891abc@syzkaller.appspotmail.com
>>> Fixes: 2de50e9674fc ("powerpc/powernv: Remove support for p5ioc2")
>>>
>>> For information about bisection process see:
>>> https://goo.gl/tpsmEJ#bisection
>>>
>>
>> Why is everybody being CC'd, even if the bug has nothing to do with the
>> person's subsystem?
>
> The To list should be intersection of 2 groups of emails: result of
> get_maintainers.pl on the file identified as culprit in the crash
> message + emails extracted from the bisected to commit.
Ah, and because the machine is a KVM guest, kvm_wait appears in a lot of
backtrace and I get to share syzkaller's joy every time. :)
This bisect result is bogus, though Tetsuo found the bug anyway.
Perhaps you can exclude commits that only touch architectures other than
x86?
Paolo
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Dmitry Vyukov @ 2019-12-05 10:31 UTC (permalink / raw)
To: Paolo Bonzini
Cc: syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Tetsuo Handa,
Russell Currey, Sam Ravnborg, Serge E. Hallyn, stewart,
syzkaller-bugs, Kentaro Takeda, Thomas Gleixner,
the arch/x86 maintainers
In-Reply-To: <f4db22f2-53a3-68ed-0f85-9f4541530f5d@redhat.com>
On Thu, Dec 5, 2019 at 11:22 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 05/12/19 11:16, Dmitry Vyukov wrote:
> > On Thu, Dec 5, 2019 at 11:13 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>
> >> On 04/12/19 22:41, syzbot wrote:
> >>> syzbot has bisected this bug to:
> >>>
> >>> commit 2de50e9674fc4ca3c6174b04477f69eb26b4ee31
> >>> Author: Russell Currey <ruscur@russell.cc>
> >>> Date: Mon Feb 8 04:08:20 2016 +0000
> >>>
> >>> powerpc/powernv: Remove support for p5ioc2
> >>>
> >>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=127a042ae00000
> >>> start commit: 76bb8b05 Merge tag 'kbuild-v5.5' of
> >>> git://git.kernel.org/p..
> >>> git tree: upstream
> >>> final crash: https://syzkaller.appspot.com/x/report.txt?x=117a042ae00000
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=167a042ae00000
> >>> kernel config: https://syzkaller.appspot.com/x/.config?x=dd226651cb0f364b
> >>> dashboard link:
> >>> https://syzkaller.appspot.com/bug?extid=4455ca3b3291de891abc
> >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11181edae00000
> >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=105cbb7ae00000
> >>>
> >>> Reported-by: syzbot+4455ca3b3291de891abc@syzkaller.appspotmail.com
> >>> Fixes: 2de50e9674fc ("powerpc/powernv: Remove support for p5ioc2")
> >>>
> >>> For information about bisection process see:
> >>> https://goo.gl/tpsmEJ#bisection
> >>>
> >>
> >> Why is everybody being CC'd, even if the bug has nothing to do with the
> >> person's subsystem?
> >
> > The To list should be intersection of 2 groups of emails: result of
> > get_maintainers.pl on the file identified as culprit in the crash
> > message + emails extracted from the bisected to commit.
>
> Ah, and because the machine is a KVM guest, kvm_wait appears in a lot of
> backtrace and I get to share syzkaller's joy every time. :)
I don't see any mention of "kvm" in the crash report. And it's only 1
file, not all of them, in this case I would expect it to be
drivers/video/fbdev/core/fbcon.c. So it should be something different.
> This bisect result is bogus, though Tetsuo found the bug anyway.
> Perhaps you can exclude commits that only touch architectures other than
> x86?
We do this. It work sometimes. But sometimes it hits non-deterministic
kernel build bugs:
https://github.com/google/syzkaller/issues/1271#issuecomment-559093018
And in this case it hit some git bisect weirdness which I can't explain yet:
https://github.com/google/syzkaller/issues/1527
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Tetsuo Handa @ 2019-12-05 10:30 UTC (permalink / raw)
To: Dmitry Vyukov, Paolo Bonzini
Cc: syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Russell Currey,
Sam Ravnborg, Serge E. Hallyn, stewart, syzkaller-bugs,
Kentaro Takeda, Thomas Gleixner, the arch/x86 maintainers
In-Reply-To: <CACT4Y+bCHOCLYF+TW062n8+tqfK9vizaRvyjUXNPdneciq0Ahg@mail.gmail.com>
On 2019/12/05 19:16, Dmitry Vyukov wrote:
> On Thu, Dec 5, 2019 at 11:13 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 04/12/19 22:41, syzbot wrote:
>>> syzbot has bisected this bug to:
>>>
>>> commit 2de50e9674fc4ca3c6174b04477f69eb26b4ee31
>>> Author: Russell Currey <ruscur@russell.cc>
>>> Date: Mon Feb 8 04:08:20 2016 +0000
>>>
>>> powerpc/powernv: Remove support for p5ioc2
>>>
>>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=127a042ae00000
>>> start commit: 76bb8b05 Merge tag 'kbuild-v5.5' of
>>> git://git.kernel.org/p..
>>> git tree: upstream
>>> final crash: https://syzkaller.appspot.com/x/report.txt?x=117a042ae00000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=167a042ae00000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=dd226651cb0f364b
>>> dashboard link:
>>> https://syzkaller.appspot.com/bug?extid=4455ca3b3291de891abc
>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11181edae00000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=105cbb7ae00000
>>>
>>> Reported-by: syzbot+4455ca3b3291de891abc@syzkaller.appspotmail.com
>>> Fixes: 2de50e9674fc ("powerpc/powernv: Remove support for p5ioc2")
>>>
>>> For information about bisection process see:
>>> https://goo.gl/tpsmEJ#bisection
>>>
>>
>> Why is everybody being CC'd, even if the bug has nothing to do with the
>> person's subsystem?
>
> The To list should be intersection of 2 groups of emails: result of
> get_maintainers.pl on the file identified as culprit in the crash
> message + emails extracted from the bisected to commit.
>
There is "#syz uncc" command but it is too hard to utilize?
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Tetsuo Handa @ 2019-12-05 10:41 UTC (permalink / raw)
To: Paolo Bonzini, Dmitry Vyukov
Cc: syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Russell Currey,
Sam Ravnborg, Serge E. Hallyn, stewart, syzkaller-bugs,
Kentaro Takeda, Thomas Gleixner, the arch/x86 maintainers
In-Reply-To: <f4db22f2-53a3-68ed-0f85-9f4541530f5d@redhat.com>
On 2019/12/05 19:22, Paolo Bonzini wrote:
> Ah, and because the machine is a KVM guest, kvm_wait appears in a lot of
> backtrace and I get to share syzkaller's joy every time. :)
>
> This bisect result is bogus, though Tetsuo found the bug anyway.
> Perhaps you can exclude commits that only touch architectures other than
> x86?
>
It would be nice if coverage functionality can extract filenames in the source
code and supply the list of filenames as arguments for bisect operation.
Also, (unrelated but) it would be nice if we can have "make yes2modconfig"
target which converts CONFIG_FOO=y to CONFIG_FOO=m if FOO is tristate.
syzbot is testing kernel configs close to "make allyesconfig" but I want to
save kernel rebuild time by disabling unrelated functionality when manually
"debug printk()ing" kernels.
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Paolo Bonzini @ 2019-12-05 10:53 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Tetsuo Handa,
Russell Currey, Sam Ravnborg, Serge E. Hallyn, stewart,
syzkaller-bugs, Kentaro Takeda, Thomas Gleixner,
the arch/x86 maintainers
In-Reply-To: <CACT4Y+ZHCmTu4tdfP+iCswU3r6+_NBM9M-pAZEypVSZ9DEq3TQ@mail.gmail.com>
On 05/12/19 11:31, Dmitry Vyukov wrote:
>> Ah, and because the machine is a KVM guest, kvm_wait appears in a lot of
>> backtrace and I get to share syzkaller's joy every time. :)
> I don't see any mention of "kvm" in the crash report.
It's there in the stack trace, not sure if this is what triggered my Cc:
[<ffffffff810c7c3a>] kvm_wait+0xca/0xe0 arch/x86/kernel/kvm.c:612
Paolo
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Dmitry Vyukov @ 2019-12-05 11:27 UTC (permalink / raw)
To: Paolo Bonzini
Cc: syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Tetsuo Handa,
Russell Currey, Sam Ravnborg, Serge E. Hallyn, stewart,
syzkaller-bugs, Kentaro Takeda, Thomas Gleixner,
the arch/x86 maintainers
In-Reply-To: <e03140c6-8ff5-9abb-1af6-17a5f68d1829@redhat.com>
On Thu, Dec 5, 2019 at 11:53 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 05/12/19 11:31, Dmitry Vyukov wrote:
> >> Ah, and because the machine is a KVM guest, kvm_wait appears in a lot of
> >> backtrace and I get to share syzkaller's joy every time. :)
> > I don't see any mention of "kvm" in the crash report.
>
> It's there in the stack trace, not sure if this is what triggered my Cc:
>
> [<ffffffff810c7c3a>] kvm_wait+0xca/0xe0 arch/x86/kernel/kvm.c:612
>
> Paolo
Oh, you mean the final bisection crash. Indeed it contains a kvm frame
and it turns out to be a bug in syzkaller code that indeed
misattributed it to kvm instead of netfilter.
Should be fixed now, you may read the commit message for details:
https://github.com/google/syzkaller/commit/4fb74474cf0af2126be3a8989d770c3947ae9478
Overall this "making sense out of kernel output" task is the ultimate
insanity, you may skim through this file to get a taste of amount of
hardcoding and special corner cases that need to be handled:
https://github.com/google/syzkaller/blob/master/pkg/report/linux.go
And this is never done, such "exception from exception corner case"
things pop up every week. There is always something to shuffle and
tune. It only keeps functioning due to 500+ test cases for all
possible insane kernel outputs:
https://github.com/google/syzkaller/tree/master/pkg/report/testdata/linux/report
https://github.com/google/syzkaller/tree/master/pkg/report/testdata/linux/guilty
So thanks for persisting and questioning! We are getting better with
each new test.
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Paolo Bonzini @ 2019-12-05 11:29 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Tetsuo Handa,
Russell Currey, Sam Ravnborg, Serge E. Hallyn, stewart,
syzkaller-bugs, Kentaro Takeda, Thomas Gleixner,
the arch/x86 maintainers
In-Reply-To: <CACT4Y+YopHoCFDRHCE6brnWfHb5YUsTJS1Mc+58GgO8CDEcgHQ@mail.gmail.com>
On 05/12/19 12:27, Dmitry Vyukov wrote:
> Oh, you mean the final bisection crash. Indeed it contains a kvm frame
> and it turns out to be a bug in syzkaller code that indeed
> misattributed it to kvm instead of netfilter.
> Should be fixed now, you may read the commit message for details:
> https://github.com/google/syzkaller/commit/4fb74474cf0af2126be3a8989d770c3947ae9478
>
> Overall this "making sense out of kernel output" task is the ultimate
> insanity, you may skim through this file to get a taste of amount of
> hardcoding and special corner cases that need to be handled:
> https://github.com/google/syzkaller/blob/master/pkg/report/linux.go
> And this is never done, such "exception from exception corner case"
> things pop up every week. There is always something to shuffle and
> tune. It only keeps functioning due to 500+ test cases for all
> possible insane kernel outputs:
> https://github.com/google/syzkaller/tree/master/pkg/report/testdata/linux/report
> https://github.com/google/syzkaller/tree/master/pkg/report/testdata/linux/guilty
>
> So thanks for persisting and questioning! We are getting better with
> each new test.
Thanks to you! I "complain" because I know you're so responsive. :)
Paolo
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Dmitry Vyukov @ 2019-12-05 11:35 UTC (permalink / raw)
To: Tetsuo Handa
Cc: Paolo Bonzini, syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Russell Currey,
Sam Ravnborg, Serge E. Hallyn, stewart, syzkaller-bugs,
Kentaro Takeda, Thomas Gleixner, the arch/x86 maintainers
In-Reply-To: <397ad276-ee2b-3883-9ed4-b5b1a2f8cf67@i-love.sakura.ne.jp>
On Thu, Dec 5, 2019 at 11:41 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2019/12/05 19:22, Paolo Bonzini wrote:
> > Ah, and because the machine is a KVM guest, kvm_wait appears in a lot of
> > backtrace and I get to share syzkaller's joy every time. :)
> >
> > This bisect result is bogus, though Tetsuo found the bug anyway.
> > Perhaps you can exclude commits that only touch architectures other than
> > x86?
> >
>
> It would be nice if coverage functionality can extract filenames in the source
> code and supply the list of filenames as arguments for bisect operation.
What is the criteria for file name extraction? What will bisect
operation do with the set of files?
If you have a feature/improvement request, please file it at:
https://github.com/google/syzkaller/issues/new
^ permalink raw reply
* Re: KASAN: slab-out-of-bounds Read in fbcon_get_font
From: Dmitry Vyukov @ 2019-12-05 11:36 UTC (permalink / raw)
To: Tetsuo Handa
Cc: Paolo Bonzini, syzbot, Andrey Ryabinin, Bartlomiej Zolnierkiewicz,
Daniel Thompson, Daniel Vetter, DRI, ghalat, Gleb Natapov, gwshan,
H. Peter Anvin, James Morris, kasan-dev, KVM list,
Linux Fbdev development list, LKML, linux-security-module,
Maarten Lankhorst, Ingo Molnar, Michael Ellerman, Russell Currey,
Sam Ravnborg, Serge E. Hallyn, stewart, syzkaller-bugs,
Kentaro Takeda, Thomas Gleixner, the arch/x86 maintainers
In-Reply-To: <397ad276-ee2b-3883-9ed4-b5b1a2f8cf67@i-love.sakura.ne.jp>
On Thu, Dec 5, 2019 at 11:41 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2019/12/05 19:22, Paolo Bonzini wrote:
> > Ah, and because the machine is a KVM guest, kvm_wait appears in a lot of
> > backtrace and I get to share syzkaller's joy every time. :)
> >
> > This bisect result is bogus, though Tetsuo found the bug anyway.
> > Perhaps you can exclude commits that only touch architectures other than
> > x86?
> >
>
> It would be nice if coverage functionality can extract filenames in the source
> code and supply the list of filenames as arguments for bisect operation.
>
> Also, (unrelated but) it would be nice if we can have "make yes2modconfig"
> target which converts CONFIG_FOO=y to CONFIG_FOO=m if FOO is tristate.
> syzbot is testing kernel configs close to "make allyesconfig" but I want to
> save kernel rebuild time by disabling unrelated functionality when manually
> "debug printk()ing" kernels.
I thought that maybe sed "s#=y#=m#g" && make olddefconfig will do, but
unfortunately, it turns off non-tristate configs...
$ egrep "CONFIG_MEMORY_HOTPLUG|CONFIG_TCP_CONG_DCTCP" .config
CONFIG_MEMORY_HOTPLUG=y
CONFIG_TCP_CONG_DCTCP=y
# sed -i "s/CONFIG_MEMORY_HOTPLUG=y/CONFIG_MEMORY_HOTPLUG=m/g" .config
# sed -i "s/CONFIG_TCP_CONG_DCTCP=y/CONFIG_TCP_CONG_DCTCP=m/g" .config
# egrep "CONFIG_MEMORY_HOTPLUG|CONFIG_TCP_CONG_DCTCP" .config
CONFIG_MEMORY_HOTPLUG=m
CONFIG_TCP_CONG_DCTCP=m
# make olddefconfig
# egrep "CONFIG_MEMORY_HOTPLUG|CONFIG_TCP_CONG_DCTCP" .config
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_TCP_CONG_DCTCP=m
^ permalink raw reply
* Re: [GIT PULL] pipe: Notification queue preparation
From: David Sterba @ 2019-12-05 12:58 UTC (permalink / raw)
To: David Howells
Cc: torvalds, Rasmus Villemoes, Greg Kroah-Hartman, Peter Zijlstra,
raven, Christian Brauner, keyrings, linux-usb, linux-block,
linux-security-module, linux-fsdevel, linux-api, linux-kernel
In-Reply-To: <31452.1574721589@warthog.procyon.org.uk>
On Mon, Nov 25, 2019 at 10:39:49PM +0000, David Howells wrote:
> David Howells (12):
> pipe: Reduce #inclusion of pipe_fs_i.h
> Remove the nr_exclusive argument from __wake_up_sync_key()
> Add wake_up_interruptible_sync_poll_locked()
> pipe: Use head and tail pointers for the ring, not cursor and length
This commit (8cefc107ca54c8b06438b7dc9) causes hangs of 'btrfs send'
commands, that's eg. inside fstests or in btrfs-progs testsuite. The
'send' uses pipe/splice to get data from kernel to userspace.
The test that reliably worked for me leaves the process hanging with
this stacktrace (no CPU or IO activity):
[<0>] pipe_write+0x1be/0x4b0
[<0>] new_sync_write+0x11e/0x1c0
[<0>] vfs_write+0xc1/0x1d0
[<0>] kernel_write+0x2c/0x40
[<0>] send_cmd+0x78/0xf0 [btrfs]
[<0>] send_extent_data+0x4b1/0x52c [btrfs]
[<0>] process_extent+0xe46/0xe9d [btrfs]
[<0>] changed_cb+0xcf5/0xd2f [btrfs]
[<0>] send_subvol+0x8ad/0xc0b [btrfs]
[<0>] btrfs_ioctl_send+0xe50/0xf30 [btrfs]
[<0>] _btrfs_ioctl_send+0x80/0x110 [btrfs]
[<0>] btrfs_ioctl+0x18e1/0x3450 [btrfs]
[<0>] do_vfs_ioctl+0xa5/0x710
[<0>] ksys_ioctl+0x70/0x80
[<0>] __x64_sys_ioctl+0x16/0x20
[<0>] do_syscall_64+0x56/0x220
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
I've bisected that to the mentioned commit, using test in btrfs-progs
testsuite with command "make TEST=016\* test-misc". Normally the test
can run up to 10 seconds, in the buggy case it stays there.
I can help testing fixes or gathering more information, please let me
know.
Full bisect log:
# bad: [3c0edea9b29f9be6c093f236f762202b30ac9431] pipe: Remove sync on wake_ups
# good: [da0c9ea146cbe92b832f1b0f694840ea8eb33cce] Linux 5.4-rc2
git bisect start '3c0edea9b29f9be6c093f236f762202b30ac9431' 'd055b4fb4d165b06d912e7f846610d120c3bb9fb^'
# bad: [b667b867344301e24f21d4a4c844675ff61d89e1] pipe: Advance tail pointer inside of wait spinlock in pipe_read()
git bisect bad b667b867344301e24f21d4a4c844675ff61d89e1
# good: [f94df9890e98f2090c6a8d70c795134863b70201] Add wake_up_interruptible_sync_poll_locked()
git bisect good f94df9890e98f2090c6a8d70c795134863b70201
# bad: [6718b6f855a0b4962d54bd625be2718cb820cec6] pipe: Allow pipes to have kernel-reserved slots
git bisect bad 6718b6f855a0b4962d54bd625be2718cb820cec6
# bad: [8cefc107ca54c8b06438b7dc9cc08bc0a11d5b98] pipe: Use head and tail pointers for the ring, not cursor and length
git bisect bad 8cefc107ca54c8b06438b7dc9cc08bc0a11d5b98
# first bad commit: [8cefc107ca54c8b06438b7dc9cc08bc0a11d5b98] pipe: Use head and tail pointers for the ring, not cursor and length
> pipe: Allow pipes to have kernel-reserved slots
> pipe: Advance tail pointer inside of wait spinlock in pipe_read()
> pipe: Conditionalise wakeup in pipe_read()
> pipe: Rearrange sequence in pipe_write() to preallocate slot
> pipe: Remove redundant wakeup from pipe_write()
> pipe: Check for ring full inside of the spinlock in pipe_write()
> pipe: Increase the writer-wakeup threshold to reduce context-switch count
> pipe: Remove sync on wake_ups
^ permalink raw reply
* Re: [GIT PULL] pipe: Notification queue preparation
From: David Howells @ 2019-12-05 13:56 UTC (permalink / raw)
To: dsterba
Cc: dhowells, torvalds, Rasmus Villemoes, Greg Kroah-Hartman,
Peter Zijlstra, raven, Christian Brauner, keyrings, linux-usb,
linux-block, linux-security-module, linux-fsdevel, linux-api,
linux-kernel
In-Reply-To: <20191205125826.GK2734@twin.jikos.cz>
David Sterba <dsterba@suse.cz> wrote:
> [<0>] pipe_write+0x1be/0x4b0
Can you get me a line number of that? Assuming you've built with -g, load
vmlinux into gdb and do "i li pipe_write+0x1be".
David
^ permalink raw reply
* [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups
From: Alexey Budankov @ 2019-12-05 15:41 UTC (permalink / raw)
To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar
Cc: Jiri Olsa, Andi Kleen, elena.reshetova, Alexander Shishkin,
Jann Horn, Kees Cook, Stephane Eranian, Namhyung Kim,
linux-security-module, selinux, linux-kernel-owner
Currently access to perf_events functionality [1] beyond the scope permitted
by perf_event_paranoid [1] kernel setting is allowed to a privileged process
[2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
governing role for perf_events based performance monitoring of a system.
CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
performance using perf_events subsystem by processes and Perf privileged users
[2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
privileged processes [3].
CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
performance monitoring functionality of perf_events and balance amount of
CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
see Notes to kernel developers, below."
For backward compatibility reasons performance monitoring functionality of
perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
secure performance monitoring use cases is discouraged with respect to the
introduced CAP_SYS_PERFMON capability.
In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
[2] to conduct secure performance monitoring using perf_events in the scope
of available online CPUs when executing code in kernel and user modes.
Possible alternative solution to this capabilities balancing, system security
hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
perf_events' performance monitoring functionality, since process debugging is
similar to performance monitoring with respect to providing insights into
process memory and execution details. However CAP_SYS_PTRACE still provides
users with more credentials than are required for secure performance monitoring
using perf_events subsystem and this excess is avoided by using the dedicated
CAP_SYS_PERFMON capability.
libcap library utilities [4], [5] and Perf tool can be used to apply
CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
permitted by system wide perf_event_paranoid kernel setting and below are the
steps to evaluate the advancement suggested by the patch set:
- patch, build and boot the kernel
- patch, build Perf tool e.g. to /home/user/perf
...
# git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
# pushd libcap
# patch libcap/include/uapi/linux/capabilities.h with [PATCH 1/3]
# make
# pushd progs
# ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
# ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
/home/user/perf: OK
# ./getcap /home/user/perf
/home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
# echo 2 > /proc/sys/kernel/perf_event_paranoid
# cat /proc/sys/kernel/perf_event_paranoid
2
...
$ /home/user/perf top
... works as expected ...
$ cat /proc/`pidof perf`/status
Name: perf
Umask: 0002
State: S (sleeping)
Tgid: 2958
Ngid: 0
Pid: 2958
PPid: 9847
TracerPid: 0
Uid: 500 500 500 500
Gid: 500 500 500 500
FDSize: 256
...
CapInh: 0000000000000000
CapPrm: 0000004400080000
CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
cap_sys_perfmon,cap_sys_ptrace,cap_syslog
CapBnd: 0000007fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: ff
Cpus_allowed_list: 0-7
...
Usage of cap_sys_perfmon effectively avoids unused credentials excess:
- with cap_sys_admin:
CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
- with cap_sys_perfmon:
CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
38 34 19
sys_perfmon syslog sys_ptrace
The patch set is for tip perf/core repository:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
[1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
[2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
[3] http://man7.org/linux/man-pages/man7/capabilities.7.html
[4] http://man7.org/linux/man-pages/man8/setcap.8.html
[5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
[6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
---
Alexey Budankov (3):
capabilities: introduce CAP_SYS_PERFMON to kernel and user space
perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
perf tool: extend Perf tool with CAP_SYS_PERFMON support
include/linux/perf_event.h | 6 ++++--
include/uapi/linux/capability.h | 10 +++++++++-
security/selinux/include/classmap.h | 4 ++--
tools/perf/design.txt | 3 ++-
tools/perf/util/cap.h | 4 ++++
tools/perf/util/evsel.c | 10 +++++-----
tools/perf/util/util.c | 15 +++++++++++++--
7 files changed, 39 insertions(+), 13 deletions(-)
--
2.20.1
^ permalink raw reply
* [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups
From: Alexey Budankov @ 2019-12-05 16:15 UTC (permalink / raw)
To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar
Cc: Jiri Olsa, Andi Kleen, elena.reshetova, Alexander Shishkin,
Jann Horn, Kees Cook, Stephane Eranian, Namhyung Kim,
linux-security-module, selinux, linux-kernel
Currently access to perf_events functionality [1] beyond the scope permitted
by perf_event_paranoid [1] kernel setting is allowed to a privileged process
[2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
governing role for perf_events based performance monitoring of a system.
CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
performance using perf_events subsystem by processes and Perf privileged users
[2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
privileged processes [3].
CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
performance monitoring functionality of perf_events and balance amount of
CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
see Notes to kernel developers, below."
For backward compatibility reasons performance monitoring functionality of
perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
secure performance monitoring use cases is discouraged with respect to the
introduced CAP_SYS_PERFMON capability.
In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
[2] to conduct secure performance monitoring using perf_events in the scope
of available online CPUs when executing code in kernel and user modes.
Possible alternative solution to this capabilities balancing, system security
hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
perf_events' performance monitoring functionality, since process debugging is
similar to performance monitoring with respect to providing insights into
process memory and execution details. However CAP_SYS_PTRACE still provides
users with more credentials than are required for secure performance monitoring
using perf_events subsystem and this excess is avoided by using the dedicated
CAP_SYS_PERFMON capability.
libcap library utilities [4], [5] and Perf tool can be used to apply
CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
permitted by system wide perf_event_paranoid kernel setting and below are the
steps to evaluate the advancement suggested by the patch set:
- patch, build and boot the kernel
- patch, build Perf tool e.g. to /home/user/perf
...
# git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
# pushd libcap
# patch libcap/include/uapi/linux/capabilities.h with [PATCH 1/3]
# make
# pushd progs
# ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
# ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
/home/user/perf: OK
# ./getcap /home/user/perf
/home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
# echo 2 > /proc/sys/kernel/perf_event_paranoid
# cat /proc/sys/kernel/perf_event_paranoid
2
...
$ /home/user/perf top
... works as expected ...
$ cat /proc/`pidof perf`/status
Name: perf
Umask: 0002
State: S (sleeping)
Tgid: 2958
Ngid: 0
Pid: 2958
PPid: 9847
TracerPid: 0
Uid: 500 500 500 500
Gid: 500 500 500 500
FDSize: 256
...
CapInh: 0000000000000000
CapPrm: 0000004400080000
CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
cap_sys_perfmon,cap_sys_ptrace,cap_syslog
CapBnd: 0000007fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: ff
Cpus_allowed_list: 0-7
...
Usage of cap_sys_perfmon effectively avoids unused credentials excess:
- with cap_sys_admin:
CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
- with cap_sys_perfmon:
CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
38 34 19
sys_perfmon syslog sys_ptrace
The patch set is for tip perf/core repository:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
[1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
[2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
[3] http://man7.org/linux/man-pages/man7/capabilities.7.html
[4] http://man7.org/linux/man-pages/man8/setcap.8.html
[5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
[6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
---
Alexey Budankov (3):
capabilities: introduce CAP_SYS_PERFMON to kernel and user space
perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
perf tool: extend Perf tool with CAP_SYS_PERFMON support
include/linux/perf_event.h | 6 ++++--
include/uapi/linux/capability.h | 10 +++++++++-
security/selinux/include/classmap.h | 4 ++--
tools/perf/design.txt | 3 ++-
tools/perf/util/cap.h | 4 ++++
tools/perf/util/evsel.c | 10 +++++-----
tools/perf/util/util.c | 15 +++++++++++++--
7 files changed, 39 insertions(+), 13 deletions(-)
--
2.20.1
^ permalink raw reply
* [PATCH v1 1/3] capabilities: introduce CAP_SYS_PERFMON to kernel and user space
From: Alexey Budankov @ 2019-12-05 16:19 UTC (permalink / raw)
To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar
Cc: Jiri Olsa, Andi Kleen, elena.reshetova, Alexander Shishkin,
Jann Horn, Kees Cook, Stephane Eranian, Namhyung Kim,
linux-security-module, selinux, linux-kernel
In-Reply-To: <283f09a5-33bd-eac3-bdfd-83d775045bf9@linux.intel.com>
Introduce CAP_SYS_PERFMON capability dedicated to secure performance
monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN
capability in its governing role for perf_events based performance
monitoring of a system.
CAP_SYS_PERFMON aims to harden system security and integrity during
performance monitoring by decreasing attack surface that is available
to CAP_SYS_ADMIN privileged processes.
CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
performance monitoring functionality of perf_events and balance amount of
CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
see Notes to kernel developers, below."
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
include/uapi/linux/capability.h | 10 +++++++++-
security/selinux/include/classmap.h | 4 ++--
2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..c9514f034be1 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -366,8 +366,16 @@ struct vfs_ns_cap_data {
#define CAP_AUDIT_READ 37
+/*
+ * Allow usage of perf_event_open() syscall (perf_events subsystem):
+ * http://man7.org/linux/man-pages/man2/perf_event_open.2.html
+ * beyond the scope permitted by perf_event_paranoid kernel setting.
+ * See Documentation/admin-guide/perf-security.rst for more information.
+ */
+
+#define CAP_SYS_PERFMON 38
-#define CAP_LAST_CAP CAP_AUDIT_READ
+#define CAP_LAST_CAP CAP_SYS_PERFMON
#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 7db24855e12d..bae602c623b0 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -27,9 +27,9 @@
"audit_control", "setfcap"
#define COMMON_CAP2_PERMS "mac_override", "mac_admin", "syslog", \
- "wake_alarm", "block_suspend", "audit_read"
+ "wake_alarm", "block_suspend", "audit_read", "sys_perfmon"
-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_SYS_PERFMON
#error New capability defined, please update COMMON_CAP2_PERMS.
#endif
--
2.20.1
^ permalink raw reply related
* [PATCH v1 2/3] perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
From: Alexey Budankov @ 2019-12-05 16:21 UTC (permalink / raw)
To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar
Cc: Jiri Olsa, Andi Kleen, elena.reshetova, Alexander Shishkin,
Jann Horn, Kees Cook, Stephane Eranian, Namhyung Kim,
linux-security-module, selinux, linux-kernel
In-Reply-To: <283f09a5-33bd-eac3-bdfd-83d775045bf9@linux.intel.com>
Enable CAP_SYS_PERFMON privileged process with secure performance monitoring
of available online CPUs, when executing code in kernel and user modes.
For backward compatibility reasons performance monitoring functionality of
perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
secure performance monitoring use cases is discouraged with respect to the
introduced CAP_SYS_PERFMON capability.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
include/linux/perf_event.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 34c7c6910026..e8dc8411de9a 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1285,7 +1285,8 @@ static inline int perf_is_paranoid(void)
static inline int perf_allow_kernel(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > 1 &&
+ !(capable(CAP_SYS_PERFMON) || capable(CAP_SYS_ADMIN)))
return -EACCES;
return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
@@ -1293,7 +1294,8 @@ static inline int perf_allow_kernel(struct perf_event_attr *attr)
static inline int perf_allow_cpu(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > 0 &&
+ !(capable(CAP_SYS_PERFMON) || capable(CAP_SYS_ADMIN)))
return -EACCES;
return security_perf_event_open(attr, PERF_SECURITY_CPU);
--
2.20.1
^ permalink raw reply related
* [PATCH v1 3/3] perf tool: extend Perf tool with CAP_SYS_PERFMON support
From: Alexey Budankov @ 2019-12-05 16:22 UTC (permalink / raw)
To: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar
Cc: Jiri Olsa, Andi Kleen, elena.reshetova, Alexander Shishkin,
Jann Horn, Kees Cook, Stephane Eranian, Namhyung Kim,
linux-security-module, selinux, linux-kernel
In-Reply-To: <283f09a5-33bd-eac3-bdfd-83d775045bf9@linux.intel.com>
Extend error messages to mention CAP_SYS_PERFMON capability as an option
to substitute CAP_SYS_ADMIN credentials where applicable.
Make perf_event_paranoid_check() to be aware of CAP_SYS_PERFMON in case
perf_event_paranoid value >= 0.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
tools/perf/design.txt | 3 ++-
tools/perf/util/cap.h | 4 ++++
tools/perf/util/evsel.c | 10 +++++-----
tools/perf/util/util.c | 15 +++++++++++++--
4 files changed, 24 insertions(+), 8 deletions(-)
diff --git a/tools/perf/design.txt b/tools/perf/design.txt
index 0453ba26cdbd..71755b3e1303 100644
--- a/tools/perf/design.txt
+++ b/tools/perf/design.txt
@@ -258,7 +258,8 @@ gets schedule to. Per task counters can be created by any user, for
their own tasks.
A 'pid == -1' and 'cpu == x' counter is a per CPU counter that counts
-all events on CPU-x. Per CPU counters need CAP_SYS_ADMIN privilege.
+all events on CPU-x. Per CPU counters need CAP_SYS_PERFMON or
+CAP_SYS_ADMIN privilege.
The 'flags' parameter is currently unused and must be zero.
diff --git a/tools/perf/util/cap.h b/tools/perf/util/cap.h
index 051dc590ceee..0f79fbf6638b 100644
--- a/tools/perf/util/cap.h
+++ b/tools/perf/util/cap.h
@@ -29,4 +29,8 @@ static inline bool perf_cap__capable(int cap __maybe_unused)
#define CAP_SYSLOG 34
#endif
+#ifndef CAP_SYS_PERFMON
+#define CAP_SYS_PERFMON 38
+#endif
+
#endif /* __PERF_CAP_H */
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f4dea055b080..3a46325e3702 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2468,14 +2468,14 @@ int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
"You may not have permission to collect %sstats.\n\n"
"Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n"
"which controls use of the performance events system by\n"
- "unprivileged users (without CAP_SYS_ADMIN).\n\n"
+ "unprivileged users (without CAP_SYS_PERFMON or CAP_SYS_ADMIN).\n\n"
"The current value is %d:\n\n"
" -1: Allow use of (almost) all events by all users\n"
" Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n"
- ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN\n"
- " Disallow raw tracepoint access by users without CAP_SYS_ADMIN\n"
- ">= 1: Disallow CPU event access by users without CAP_SYS_ADMIN\n"
- ">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN\n\n"
+ ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+ " Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+ ">= 1: Disallow CPU event access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+ ">= 2: Disallow kernel profiling by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n\n"
"To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
" kernel.perf_event_paranoid = -1\n" ,
target->system_wide ? "system-wide " : "",
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 969ae560dad9..d8334fa97c85 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -271,8 +271,19 @@ int perf_event_paranoid(void)
bool perf_event_paranoid_check(int max_level)
{
- return perf_cap__capable(CAP_SYS_ADMIN) ||
- perf_event_paranoid() <= max_level;
+ bool res = false;
+
+ res = perf_cap__capable(CAP_SYS_ADMIN);
+
+ if (!res) {
+ if (max_level >= 0)
+ res = perf_cap__capable(CAP_SYS_PERFMON);
+ }
+
+ if (!res)
+ res = perf_event_paranoid() <= max_level;
+
+ return res;
}
static int
--
2.20.1
^ permalink raw reply related
* Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups
From: Casey Schaufler @ 2019-12-05 16:49 UTC (permalink / raw)
To: Alexey Budankov, Peter Zijlstra, Arnaldo Carvalho de Melo,
Ingo Molnar
Cc: Jiri Olsa, Andi Kleen, elena.reshetova, Alexander Shishkin,
Jann Horn, Kees Cook, Stephane Eranian, Namhyung Kim,
linux-security-module, selinux, linux-kernel, Casey Schaufler
In-Reply-To: <283f09a5-33bd-eac3-bdfd-83d775045bf9@linux.intel.com>
On 12/5/2019 8:15 AM, Alexey Budankov wrote:
> Currently access to perf_events functionality [1] beyond the scope permitted
> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>
> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
> governing role for perf_events based performance monitoring of a system.
>
> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
> performance using perf_events subsystem by processes and Perf privileged users
> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
> privileged processes [3].
Are there use cases where you would need CAP_SYS_PERFMON where you
would not also need CAP_SYS_ADMIN? If you separate a new capability
from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
with the new capability it is all rather pointless.
The scope you've defined for this CAP_SYS_PERFMON is very small.
Is there a larger set of privilege checks that might be applicable
for it?
>
> CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
> performance monitoring functionality of perf_events and balance amount of
> CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
> the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
> see Notes to kernel developers, below."
>
> For backward compatibility reasons performance monitoring functionality of
> perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
> secure performance monitoring use cases is discouraged with respect to the
> introduced CAP_SYS_PERFMON capability.
>
> In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
> [2] to conduct secure performance monitoring using perf_events in the scope
> of available online CPUs when executing code in kernel and user modes.
>
> Possible alternative solution to this capabilities balancing, system security
> hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
> perf_events' performance monitoring functionality, since process debugging is
> similar to performance monitoring with respect to providing insights into
> process memory and execution details. However CAP_SYS_PTRACE still provides
> users with more credentials than are required for secure performance monitoring
> using perf_events subsystem and this excess is avoided by using the dedicated
> CAP_SYS_PERFMON capability.
>
> libcap library utilities [4], [5] and Perf tool can be used to apply
> CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
> permitted by system wide perf_event_paranoid kernel setting and below are the
> steps to evaluate the advancement suggested by the patch set:
>
> - patch, build and boot the kernel
> - patch, build Perf tool e.g. to /home/user/perf
> ...
> # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
> # pushd libcap
> # patch libcap/include/uapi/linux/capabilities.h with [PATCH 1/3]
> # make
> # pushd progs
> # ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> # ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> /home/user/perf: OK
> # ./getcap /home/user/perf
> /home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
> # echo 2 > /proc/sys/kernel/perf_event_paranoid
> # cat /proc/sys/kernel/perf_event_paranoid
> 2
> ...
> $ /home/user/perf top
> ... works as expected ...
> $ cat /proc/`pidof perf`/status
> Name: perf
> Umask: 0002
> State: S (sleeping)
> Tgid: 2958
> Ngid: 0
> Pid: 2958
> PPid: 9847
> TracerPid: 0
> Uid: 500 500 500 500
> Gid: 500 500 500 500
> FDSize: 256
> ...
> CapInh: 0000000000000000
> CapPrm: 0000004400080000
> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> cap_sys_perfmon,cap_sys_ptrace,cap_syslog
> CapBnd: 0000007fffffffff
> CapAmb: 0000000000000000
> NoNewPrivs: 0
> Seccomp: 0
> Speculation_Store_Bypass: thread vulnerable
> Cpus_allowed: ff
> Cpus_allowed_list: 0-7
> ...
>
> Usage of cap_sys_perfmon effectively avoids unused credentials excess:
> - with cap_sys_admin:
> CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
> - with cap_sys_perfmon:
> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> 38 34 19
> sys_perfmon syslog sys_ptrace
>
> The patch set is for tip perf/core repository:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
>
> [1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
> [2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
> [3] http://man7.org/linux/man-pages/man7/capabilities.7.html
> [4] http://man7.org/linux/man-pages/man8/setcap.8.html
> [5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
> [6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
>
> ---
> Alexey Budankov (3):
> capabilities: introduce CAP_SYS_PERFMON to kernel and user space
> perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
> perf tool: extend Perf tool with CAP_SYS_PERFMON support
>
> include/linux/perf_event.h | 6 ++++--
> include/uapi/linux/capability.h | 10 +++++++++-
> security/selinux/include/classmap.h | 4 ++--
> tools/perf/design.txt | 3 ++-
> tools/perf/util/cap.h | 4 ++++
> tools/perf/util/evsel.c | 10 +++++-----
> tools/perf/util/util.c | 15 +++++++++++++--
> 7 files changed, 39 insertions(+), 13 deletions(-)
>
^ permalink raw reply
* Re: [PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups
From: Alexey Budankov @ 2019-12-05 17:05 UTC (permalink / raw)
To: Casey Schaufler, Peter Zijlstra, Arnaldo Carvalho de Melo,
Ingo Molnar
Cc: Jiri Olsa, Andi Kleen, elena.reshetova, Alexander Shishkin,
Jann Horn, Kees Cook, Stephane Eranian, Namhyung Kim,
linux-security-module, selinux, linux-kernel
In-Reply-To: <1e836f34-eda3-542d-f7ce-9a3e87ac5e2e@schaufler-ca.com>
Hello Casey,
On 05.12.2019 19:49, Casey Schaufler wrote:
> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
>> Currently access to perf_events functionality [1] beyond the scope permitted
>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>>
>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
>> governing role for perf_events based performance monitoring of a system.
>>
>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
>> performance using perf_events subsystem by processes and Perf privileged users
>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
>> privileged processes [3].
>
> Are there use cases where you would need CAP_SYS_PERFMON where you
> would not also need CAP_SYS_ADMIN? If you separate a new capability
Actually, there are. Perf tool that has record, stat and top modes could run with
CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.
> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
> with the new capability it is all rather pointless.
>
> The scope you've defined for this CAP_SYS_PERFMON is very small.
> Is there a larger set of privilege checks that might be applicable
> for it?
CAP_SYS_PERFMON could be applied broadly, though, this patch set enables record
and stat mode use cases for system wide performance monitoring in kernel and
user modes.
Thanks,
Alexey
>
>
>>
>> CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
>> performance monitoring functionality of perf_events and balance amount of
>> CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
>> the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
>> see Notes to kernel developers, below."
>>
>> For backward compatibility reasons performance monitoring functionality of
>> perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
>> secure performance monitoring use cases is discouraged with respect to the
>> introduced CAP_SYS_PERFMON capability.
>>
>> In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
>> [2] to conduct secure performance monitoring using perf_events in the scope
>> of available online CPUs when executing code in kernel and user modes.
>>
>> Possible alternative solution to this capabilities balancing, system security
>> hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
>> perf_events' performance monitoring functionality, since process debugging is
>> similar to performance monitoring with respect to providing insights into
>> process memory and execution details. However CAP_SYS_PTRACE still provides
>> users with more credentials than are required for secure performance monitoring
>> using perf_events subsystem and this excess is avoided by using the dedicated
>> CAP_SYS_PERFMON capability.
>>
>> libcap library utilities [4], [5] and Perf tool can be used to apply
>> CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
>> permitted by system wide perf_event_paranoid kernel setting and below are the
>> steps to evaluate the advancement suggested by the patch set:
>>
>> - patch, build and boot the kernel
>> - patch, build Perf tool e.g. to /home/user/perf
>> ...
>> # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
>> # pushd libcap
>> # patch libcap/include/uapi/linux/capabilities.h with [PATCH 1/3]
>> # make
>> # pushd progs
>> # ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
>> # ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
>> /home/user/perf: OK
>> # ./getcap /home/user/perf
>> /home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
>> # echo 2 > /proc/sys/kernel/perf_event_paranoid
>> # cat /proc/sys/kernel/perf_event_paranoid
>> 2
>> ...
>> $ /home/user/perf top
>> ... works as expected ...
>> $ cat /proc/`pidof perf`/status
>> Name: perf
>> Umask: 0002
>> State: S (sleeping)
>> Tgid: 2958
>> Ngid: 0
>> Pid: 2958
>> PPid: 9847
>> TracerPid: 0
>> Uid: 500 500 500 500
>> Gid: 500 500 500 500
>> FDSize: 256
>> ...
>> CapInh: 0000000000000000
>> CapPrm: 0000004400080000
>> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
>> cap_sys_perfmon,cap_sys_ptrace,cap_syslog
>> CapBnd: 0000007fffffffff
>> CapAmb: 0000000000000000
>> NoNewPrivs: 0
>> Seccomp: 0
>> Speculation_Store_Bypass: thread vulnerable
>> Cpus_allowed: ff
>> Cpus_allowed_list: 0-7
>> ...
>>
>> Usage of cap_sys_perfmon effectively avoids unused credentials excess:
>> - with cap_sys_admin:
>> CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
>> - with cap_sys_perfmon:
>> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
>> 38 34 19
>> sys_perfmon syslog sys_ptrace
>>
>> The patch set is for tip perf/core repository:
>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
>> tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
>>
>> [1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
>> [2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
>> [3] http://man7.org/linux/man-pages/man7/capabilities.7.html
>> [4] http://man7.org/linux/man-pages/man8/setcap.8.html
>> [5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
>> [6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
>>
>> ---
>> Alexey Budankov (3):
>> capabilities: introduce CAP_SYS_PERFMON to kernel and user space
>> perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
>> perf tool: extend Perf tool with CAP_SYS_PERFMON support
>>
>> include/linux/perf_event.h | 6 ++++--
>> include/uapi/linux/capability.h | 10 +++++++++-
>> security/selinux/include/classmap.h | 4 ++--
>> tools/perf/design.txt | 3 ++-
>> tools/perf/util/cap.h | 4 ++++
>> tools/perf/util/evsel.c | 10 +++++-----
>> tools/perf/util/util.c | 15 +++++++++++++--
>> 7 files changed, 39 insertions(+), 13 deletions(-)
>>
>
>
^ permalink raw reply
* Re: [GIT PULL] pipe: Notification queue preparation
From: Linus Torvalds @ 2019-12-05 17:12 UTC (permalink / raw)
To: David Howells
Cc: David Sterba, Rasmus Villemoes, Greg Kroah-Hartman,
Peter Zijlstra, raven, Christian Brauner, keyrings, linux-usb,
linux-block, LSM List, linux-fsdevel, Linux API,
Linux Kernel Mailing List
In-Reply-To: <1593.1575554217@warthog.procyon.org.uk>
On Thu, Dec 5, 2019 at 5:57 AM David Howells <dhowells@redhat.com> wrote:
>
> David Sterba <dsterba@suse.cz> wrote:
>
> > [<0>] pipe_write+0x1be/0x4b0
>
> Can you get me a line number of that? Assuming you've built with -g, load
> vmlinux into gdb and do "i li pipe_write+0x1be".
If the kernel is built with debug info (which you need for the gdb
command anyway), it's much better to just use
./scripts/decode_stacktrace.sh
which gives all the information for the whole backtrace.
It would be interesting to hear if somebody else is waiting on the
read side too.
Linus
^ permalink raw reply
* Re: [GIT PULL] pipe: Notification queue preparation
From: David Sterba @ 2019-12-05 17:21 UTC (permalink / raw)
To: David Howells
Cc: torvalds, Rasmus Villemoes, Greg Kroah-Hartman, Peter Zijlstra,
raven, Christian Brauner, keyrings, linux-usb, linux-block,
linux-security-module, linux-fsdevel, linux-api, linux-kernel
In-Reply-To: <1593.1575554217@warthog.procyon.org.uk>
On Thu, Dec 05, 2019 at 01:56:57PM +0000, David Howells wrote:
> David Sterba <dsterba@suse.cz> wrote:
>
> > [<0>] pipe_write+0x1be/0x4b0
>
> Can you get me a line number of that? Assuming you've built with -g, load
> vmlinux into gdb and do "i li pipe_write+0x1be".
I built it with -g (DEBUG_INFO) but there's no output for the command (gdb 8.2):
(gdb) i li pipe_write+0x1be
Function "pipe_write+0x1be" not defined.
But the address can tell something:
(gdb) l *(pipe_write+0x1be)
0xffffffff81390b8e is in pipe_write (fs/pipe.c:509).
warning: Source file is more recent than executable.
504 kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
505 do_wakeup = 0;
506 }
507 pipe->waiting_writers++;
508 pipe_wait(pipe);
509 pipe->waiting_writers--;
510 }
511 out:
512 __pipe_unlock(pipe);
513 if (do_wakeup) {
I rerun the test again (with a different address where it's stuck), there's
nothing better I can get from the debug info, it always points to pipe_wait,
disassembly points to:
0xffffffff81390b71 <+417>: jne 0xffffffff81390c23 <pipe_write+595>
0xffffffff81390b77 <+423>: test %ecx,%ecx
0xffffffff81390b79 <+425>: jne 0xffffffff81390b95 <pipe_write+453>
0xffffffff81390b7b <+427>: addl $0x1,0x110(%rbx)
0xffffffff81390b82 <+434>: mov %rbx,%rdi
0xffffffff81390b85 <+437>: callq 0xffffffff813908c0 <pipe_wait>
0xffffffff81390b8a <+442>: subl $0x1,0x110(%rbx)
(pipe_write+0x1ba == 0xffffffff81390b8a)
^ permalink raw reply
* Re: [GIT PULL] pipe: Notification queue preparation
From: Linus Torvalds @ 2019-12-05 17:24 UTC (permalink / raw)
To: David Howells
Cc: David Sterba, Rasmus Villemoes, Greg Kroah-Hartman,
Peter Zijlstra, raven, Christian Brauner, keyrings, linux-usb,
linux-block, LSM List, linux-fsdevel, Linux API,
Linux Kernel Mailing List
In-Reply-To: <CAHk-=wgwwJ+ZEtycujFdNmpS8TjwCYyT+oHfV7d-GekyaX91xg@mail.gmail.com>
On Thu, Dec 5, 2019 at 9:12 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> It would be interesting to hear if somebody else is waiting on the
> read side too.
Looking once more at that commit, I find at least one bug, but I'm
pretty sure that's not the cause of this problem.
DavidH, watch out for things like this:
- for (idx = 0; idx < pipe->nrbufs && rem < len; idx++)
+ for (idx = tail; idx < head && rem < len; idx++)
which is a completely buggy conversion.
You can't compare tail and head with an inequality, it gets the
wraparound case wrong.
But I found only one of those, and it's fuse-specific, plus the
overflow would take a long time to trigger, so I'm pretty sure this
isn't what DavidS is reporting.
Linus
^ permalink raw reply
* Re: [GIT PULL] pipe: Notification queue preparation
From: David Howells @ 2019-12-05 17:25 UTC (permalink / raw)
To: dsterba
Cc: dhowells, torvalds, Rasmus Villemoes, Greg Kroah-Hartman,
Peter Zijlstra, raven, Christian Brauner, keyrings, linux-usb,
linux-block, linux-security-module, linux-fsdevel, linux-api,
linux-kernel
In-Reply-To: <20191205172127.GW2734@suse.cz>
I've just posted a couple of patches - can you check to see if they fix your
problem?
https://lore.kernel.org/linux-fsdevel/157556649610.20869.8537079649495343567.stgit@warthog.procyon.org.uk/T/#t
Thanks,
David
^ permalink raw reply
* Re: [GIT PULL] pipe: Notification queue preparation
From: Linus Torvalds @ 2019-12-05 17:33 UTC (permalink / raw)
To: David Sterba, David Howells, Linus Torvalds, Rasmus Villemoes,
Greg Kroah-Hartman, Peter Zijlstra, raven, Christian Brauner,
keyrings, linux-usb, linux-block, LSM List, linux-fsdevel,
Linux API, Linux Kernel Mailing List
In-Reply-To: <20191205172127.GW2734@suse.cz>
On Thu, Dec 5, 2019 at 9:22 AM David Sterba <dsterba@suse.cz> wrote:
>
> I rerun the test again (with a different address where it's stuck), there's
> nothing better I can get from the debug info, it always points to pipe_wait,
> disassembly points to.
Hah. I see another bug.
"pipe_wait()" depends on the fact that all events that wake it up
happen with the pipe lock held.
But we do some of the "do_wakeup()" handling outside the pipe lock now
on the reader side
__pipe_unlock(pipe);
/* Signal writers asynchronously that there is more room. */
if (do_wakeup) {
wake_up_interruptible_poll(&pipe->wait, EPOLLOUT | EPOLLWRNORM);
kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
}
However, that isn't new to this series _either_, so I don't think
that's it. It does wake up things inside the lock _too_ if it ended up
emptying a whole buffer.
So it could be triggered by timing and behavior changes, but I doubt
this pipe_wait() thing is it either. The fact that it bisects to the
thing that changes things to use head/tail pointers makes me think
there's some other incorrect update or comparison somewhere.
That said, "pipe_wait()" is an abomination. It should use a proper
wait condition and use wait_event(), but the code predates all of
that. I suspect pipe_wait() goes back to the dark ages with the BKL
and no actual races between kernel code.
Linus
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox