* Re: WARNING: refcount bug in find_key_to_update
2019-10-17 2:42 ` WARNING: refcount bug in find_key_to_update syzbot
@ 2019-10-17 15:53 ` Linus Torvalds
2019-10-17 16:00 ` Eric Biggers
2019-10-18 16:38 ` David Howells
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Linus Torvalds @ 2019-10-17 15:53 UTC (permalink / raw)
To: syzbot
Cc: aou, Palmer Dabbelt, syzkaller-bugs, James Morris James Morris,
Jarkko Sakkinen, Linux Kernel Mailing List, David Howells,
LSM List, keyrings, linux-riscv, Serge E. Hallyn
On Wed, Oct 16, 2019 at 7:42 PM syzbot
<syzbot+6455648abc28dbdd1e7f@syzkaller.appspotmail.com> wrote:
>
> syzbot has bisected this bug to 0570bc8b7c9b ("Merge tag
> 'riscv/for-v5.3-rc1' ...")
Yeah, that looks unlikely. The only non-riscv changes are from
documentation updates and moving a config variable around.
Looks like the crash is quite unlikely, and only happens in one out of
ten runs for the ones it has happened to.
The backtrace looks simple enough, though:
RIP: 0010:refcount_inc_checked+0x2b/0x30 lib/refcount.c:156
__key_get include/linux/key.h:281 [inline]
find_key_to_update+0x67/0x80 security/keys/keyring.c:1127
key_create_or_update+0x4e5/0xb20 security/keys/key.c:905
__do_sys_add_key security/keys/keyctl.c:132 [inline]
__se_sys_add_key security/keys/keyctl.c:72 [inline]
__x64_sys_add_key+0x219/0x3f0 security/keys/keyctl.c:72
do_syscall_64+0xd0/0x540 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x49/0xbe
which to me implies that there's some locking bug, and somebody
released the key without holding a lock.
That code looks a bit confused to me. Releasing a key without holding
a lock looks permitted, but if that's the case then __key_get() is
complete garbage. It would need to use 'refcount_inc_not_zero()' and
failure would require failing the caller.
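The difference between the two primitives can be sketched in userspace. This is a minimal sketch, assuming C11 atomics stand in for the kernel's refcount_t; struct ref, ref_get_not_zero() and ref_put() are hypothetical names for illustration, not kernel API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's refcount_t. */
struct ref {
	atomic_int count;
};

/*
 * Take a reference only if the object is still live.
 * Returns false when the count has already hit zero, in which case the
 * caller must not touch the object and has to fail or retry its lookup.
 */
static bool ref_get_not_zero(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference. Returns true if this was the final put. */
static bool ref_put(struct ref *r)
{
	return atomic_fetch_sub(&r->count, 1) == 1;
}
```

A caller that can race with the final put has to check the return value and bail out when it is false, which is the "failing the caller" part. An unconditional increment is only safe when something else guarantees the count cannot be zero.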
But I haven't followed the key locking rules, so who knows. That "put
without lock" scenario would explain the crash, though.
David?
Linus
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: WARNING: refcount bug in find_key_to_update
2019-10-17 2:42 ` WARNING: refcount bug in find_key_to_update syzbot
2019-10-17 15:53 ` Linus Torvalds
@ 2019-10-18 16:38 ` David Howells
2019-10-22 10:35 ` David Howells
2019-10-22 13:17 ` David Howells
3 siblings, 0 replies; 6+ messages in thread
From: David Howells @ 2019-10-18 16:38 UTC (permalink / raw)
To: Linus Torvalds
Cc: aou, syzbot, Palmer Dabbelt, syzkaller-bugs,
James Morris James Morris, Jarkko Sakkinen,
Linux Kernel Mailing List, dhowells, LSM List, keyrings,
linux-riscv, Serge E. Hallyn
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> The backtrace looks simple enough, though:
>
> RIP: 0010:refcount_inc_checked+0x2b/0x30 lib/refcount.c:156
> __key_get include/linux/key.h:281 [inline]
> find_key_to_update+0x67/0x80 security/keys/keyring.c:1127
> key_create_or_update+0x4e5/0xb20 security/keys/key.c:905
> __do_sys_add_key security/keys/keyctl.c:132 [inline]
> __se_sys_add_key security/keys/keyctl.c:72 [inline]
> __x64_sys_add_key+0x219/0x3f0 security/keys/keyctl.c:72
> do_syscall_64+0xd0/0x540 arch/x86/entry/common.c:296
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> which to me implies that there's some locking bug, and somebody
> released the key without holding a lock.
>
> That code looks a bit confused to me. Releasing a key without holding
> a lock looks permitted, but if that's the case then __key_get() is
> complete garbage. It would need to use 'refcount_inc_not_zero()' and
> failure would require failing the caller.
find_key_to_update() must be called with the keyring-to-be-searched locked, as
stated in the comment on that function.
If a key-to-be-updated can be found in that keyring, then the keyring must be
holding a ref on that key already, so its refcount must be > 0, so it
shouldn't be necessary to use refcount_inc_not_zero().
There shouldn't be a race with key_link(), key_unlink(), key_move(),
keyring_clear() or keyring_gc() (garbage collection) as all of those take a
write-lock on the keyring.
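The invariant described above can be modelled in a few lines. This is a single-threaded toy, assuming a boolean flag in place of the real keyring lock; every toy_* name is hypothetical and none of this is the actual keyring code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_SLOTS 4

struct toy_key {
	int usage;          /* reference count */
	const char *desc;   /* description, e.g. "syz" */
};

struct toy_keyring {
	bool locked;                       /* stands in for the keyring lock */
	struct toy_key *slots[TOY_SLOTS];  /* each non-NULL slot owns one ref */
};

/* Link a key: the keyring takes its own reference. Caller holds the lock. */
static bool toy_key_link(struct toy_keyring *kr, struct toy_key *key)
{
	assert(kr->locked);
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		if (!kr->slots[i]) {
			key->usage++;
			kr->slots[i] = key;
			return true;
		}
	}
	return false;
}

/* Unlink a key: the keyring drops its reference. Caller holds the lock. */
static void toy_key_unlink(struct toy_keyring *kr, struct toy_key *key)
{
	assert(kr->locked);
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		if (kr->slots[i] == key) {
			kr->slots[i] = NULL;
			key->usage--;
		}
	}
}

/*
 * Search with the lock held. Any key found here is still linked, so the
 * keyring's own ref keeps usage >= 1 and a plain increment is enough.
 */
static struct toy_key *toy_find_key_to_update(struct toy_keyring *kr,
					      const char *desc)
{
	assert(kr->locked);
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		struct toy_key *key = kr->slots[i];

		if (key && strcmp(key->desc, desc) == 0) {
			assert(key->usage > 0); /* invariant under the lock */
			key->usage++;
			return key;
		}
	}
	return NULL;
}
```

The argument only holds as long as every get and put elsewhere is balanced; the model says nothing about a caller that drops a ref it never took.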
> But I haven't followed the key locking rules, so who knows. That "put
> without lock" scenario would explain the crash, though.
That shouldn't explain it. When key_put() reduces the refcount to 0, it just
schedules the garbage collector. It doesn't touch the key again directly.
I would guess that something incorrectly put a ref when it shouldn't have. Do
we know which type of key is involved? Looking at the syzkaller reproducer,
it's adding an encrypted key and a user key to the process keyring -
presumably repeating the procedure within the same process, hence how it finds
something to update.
David
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: WARNING: refcount bug in find_key_to_update
2019-10-17 2:42 ` WARNING: refcount bug in find_key_to_update syzbot
2019-10-17 15:53 ` Linus Torvalds
2019-10-18 16:38 ` David Howells
@ 2019-10-22 10:35 ` David Howells
2019-10-22 13:17 ` David Howells
3 siblings, 0 replies; 6+ messages in thread
From: David Howells @ 2019-10-22 10:35 UTC (permalink / raw)
To: Linus Torvalds, Mimi Zohar
Cc: aou, syzbot, Palmer Dabbelt, syzkaller-bugs,
James Morris James Morris, Jarkko Sakkinen,
Linux Kernel Mailing List, dhowells, LSM List, keyrings,
linux-riscv, Serge E. Hallyn
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > syzbot has bisected this bug to 0570bc8b7c9b ("Merge tag
> > 'riscv/for-v5.3-rc1' ...")
>
> Yeah, that looks unlikely. The only non-riscv changes are from
> documentation updates and moving a config variable around.
>
> Looks like the crash is quite unlikely, and only happens in one out of
> ten runs for the ones it has happened to.
>
> The backtrace looks simple enough, though:
>
> RIP: 0010:refcount_inc_checked+0x2b/0x30 lib/refcount.c:156
> __key_get include/linux/key.h:281 [inline]
> find_key_to_update+0x67/0x80 security/keys/keyring.c:1127
> key_create_or_update+0x4e5/0xb20 security/keys/key.c:905
> __do_sys_add_key security/keys/keyctl.c:132 [inline]
> __se_sys_add_key security/keys/keyctl.c:72 [inline]
> __x64_sys_add_key+0x219/0x3f0 security/keys/keyctl.c:72
> do_syscall_64+0xd0/0x540 arch/x86/entry/common.c:296
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> which to me implies that there's some locking bug, and somebody
> released the key without holding a lock.
I'm wondering if this is actually a bug in the error handling in the encrypted
key type. Looking in the syzbot console log, there's a lot of output from
there prior to the crash, of which the following is an excerpt:
[ 248.516746][T27381] encrypted_key: key user:syz not found
[ 248.524392][T27382] encrypted_key: key user:syz not found
[ 248.616141][T27392] encrypted_key: key user:syz not found
[ 248.618890][T27393] encrypted_key: key user:syz not found
[ 248.690844][T27404] encrypted_key: key user:syz not found
[ 248.739405][T27403] encrypted_key: key user:syz not found
[ 248.804881][T27417] encrypted_key: key user:syz not found
[ 248.828354][T27418] encrypted_key: keyword 'new' not allowed when called from .update method
[ 248.925249][T27427] encrypted_key: keyword 'new' not allowed when called from .update method
[ 248.928200][T27415] Bad refcount user syz
[ 248.934043][T27428] encrypted_key: key user:syz not found
[ 248.939502][T27429] encrypted_key: key user:syz not found
[ 248.968744][T27434] encrypted_key: key user:syz not found
[ 248.982201][T27415] ==================================================================
[ 248.996072][T27415] BUG: KASAN: use-after-free in refcount_inc_not_zero_checked+0x81/0x200
Note that the "Bad refcount user syz" is a bit I patched in to print the type
and description of the key that incurred the error.
It's a tad difficult to say exactly what's going on since I've no idea what
the syzbot reproducer is actually doing.
#{"threaded":true,"collide":true,"repeat":true,"procs":6,"sandbox":"namespace","fault_call":-1,"tun":true,"netdev":true,"resetnet":true,"cgroups":true,"binfmt_misc":true,"close_fds":true,"tmpdir":true,"segv":true}
perf_event_open(&(0x7f000001d000)={0x1, 0x70, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext}, 0x0, 0xffffffffffffffff, 0xffffffffffffffff, 0x0)
keyctl$instantiate(0xc, 0x0, &(0x7f0000000100)=ANY=[@ANYBLOB='new default user:syz 04096'], 0x1, 0x0)
r0 = add_key(&(0x7f0000000140)='encrypted\x00', &(0x7f0000000180)={'syz'}, &(0x7f0000000100), 0xca, 0xfffffffffffffffe)
add_key$user(&(0x7f0000000040)='user\x00', &(0x7f0000000000)={'syz'}, &(0x7f0000000440)='X', 0x1, 0xfffffffffffffffe)
keyctl$read(0xb, r0, &(0x7f0000000240)=""/112, 0x349b7f55)
However, it looks like the encrypted key type is trying to access a user key,
so maybe there's an overput there? I'm trying to insert more debugging, but
the test doesn't always fail.
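To make the "overput" hypothesis concrete, here is a purely illustrative sketch. It assumes a plain integer counter and a made-up error path; toy_key, toy_key_get(), toy_key_put() and toy_use_master_key() are hypothetical names, and this does not describe the real encrypted-keys code, which has not been shown to be at fault.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_key {
	int usage; /* reference count */
};

static void toy_key_get(struct toy_key *k) { k->usage++; }
static void toy_key_put(struct toy_key *k) { k->usage--; }

/*
 * Hypothetical consumer of a master key. It takes one ref on entry.
 * With 'buggy' set, the error path drops two refs instead of one,
 * stealing the ref that belongs to the keyring linking the key.
 */
static int toy_use_master_key(struct toy_key *master, bool fail, bool buggy)
{
	toy_key_get(master);
	if (fail) {
		toy_key_put(master);
		if (buggy)
			toy_key_put(master); /* the overput */
		return -1;
	}
	toy_key_put(master);
	return 0;
}
```

If the counter starts at 1 (the keyring's ref), one pass through the buggy error path leaves it at 0 while the keyring still points at the key. A later search of that keyring would then find a key with a zero refcount, which is the state the warning complains about.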
syzbot <syzbot+6455648abc28dbdd1e7f@syzkaller.appspotmail.com> wrote:
> HEAD commit: bc88f85c kthread: make __kthread_queue_delayed_work static
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1730584b600000
> kernel config: https://syzkaller.appspot.com/x/.config?x=e0ac4d9b35046343
> dashboard link: https://syzkaller.appspot.com/bug?extid=6455648abc28dbdd1e7f
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c8adab600000
David
^ permalink raw reply [flat|nested] 6+ messages in thread