All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Bruno Prémont" <bonbons@sysophe.eu>
To: David Howells <dhowells@redhat.com>
Cc: <netfs@lists.linux.dev>, <linux-fsdevel@vger.kernel.org>,
	linux-cifs@vger.kernel.org
Subject: [BUG] 6.8.x general protection fault during cifs/netfs write
Date: Tue, 30 Apr 2024 09:49:50 +0200	[thread overview]
Message-ID: <20240430094651.17461d73@hemera> (raw)

Hi,

I've been experiencing general protection fault during cifs IO recently
with 6.8 kernels (6.8.7 and 6.8.0).
For the 6.8.7 it was during a `cp -a $localdir /path/to/cifs/mount`.

[276389.427110] general protection fault, probably for non-canonical address 0x5a5a5a5a5a5a5ae2: 0000 [#1] SMP NOPTI
[276389.427129] CPU: 4 PID: 4373 Comm: cp Not tainted 6.8.7 #2
[276389.427137] Hardware name: LENOVO 20N4S13W00/20N4S13W00, BIOS N2IETA2W (1.80 ) 06/21/2023
[276389.427143] RIP: 0010:__fscache_use_cookie+0x1e/0x2b0
[276389.427156] Code: 90 90 90 90 90 90 90 90 90 90 90 90 41 57 41 56 41 55 41 54 55 53 48 83 ec 48 65 48 8b 04 25 28 00 00 00 48 89 44 24 40 31 c0 <48> 8b 87 88 00 00 00 89 c5 83 e5 01 0f 85 c3 01 00 00 4c 8d 6f 14
[276389.427164] RSP: 0018:ffffaf3e85053ca8 EFLAGS: 00010246
[276389.427173] RAX: 0000000000000000 RBX: ffff9b8feefd1ef8 RCX: 0000000000000000
[276389.427178] RDX: 0000000000040004 RSI: 0000000000000001 RDI: 5a5a5a5a5a5a5a5a
[276389.427184] RBP: 5a5a5a5a5a5a5a5a R08: 0000000000000000 R09: 0000000000000000
[276389.427189] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9b8feefd1ef8
[276389.427194] R13: 0000000000001000 R14: ffff9b8feefd2070 R15: ffff9b8fe2db4a00
[276389.427199] FS:  00007fac30b10740(0000) GS:ffff9b922e700000(0000) knlGS:0000000000000000
[276389.427206] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[276389.427211] CR2: 00007fac30aa8000 CR3: 00000001dcafa006 CR4: 00000000003706f0
[276389.427217] Call Trace:
[276389.427223]  <TASK>
[276389.427227]  ? die_addr+0x2d/0x80
[276389.427239]  ? exc_general_protection+0x2ba/0x340
[276389.427251]  ? asm_exc_general_protection+0x22/0x30
[276389.427262]  ? __fscache_use_cookie+0x1e/0x2b0
[276389.427269]  ? queue_delayed_work_on+0x27/0x30
[276389.427280]  netfs_dirty_folio+0x8b/0xa0
[276389.427290]  cifs_write_end+0x145/0x1d0
[276389.427299]  generic_perform_write+0x11e/0x230
[276389.427308]  cifs_strict_writev+0x256/0x2d0
[276389.427318]  vfs_write+0x274/0x420
[276389.427326]  ksys_write+0x66/0xf0
[276389.427332]  do_syscall_64+0x49/0x120
[276389.427340]  entry_SYSCALL_64_after_hwframe+0x78/0x80
[276389.427351] RIP: 0033:0x7fac30c02264
[276389.427358] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 05 31 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
[276389.427364] RSP: 002b:00007ffd93d03ac8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[276389.427372] RAX: ffffffffffffffda RBX: 0000000000099ec3 RCX: 00007fac30c02264
[276389.427378] RDX: 0000000000099ec3 RSI: 00007fac30a0f000 RDI: 0000000000000004
[276389.427382] RBP: 0000000000099ec3 R08: 00007fac30a0f000 R09: 0000000000000000
[276389.427387] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fac30a0f000
[276389.427392] R13: 0000000000000004 R14: 0000000000099ec3 R15: 0000000000099ec3
[276389.427398]  </TASK>
[276389.427401] Modules linked in:
[276389.427409] ---[ end trace 0000000000000000 ]---
[276389.427414] RIP: 0010:__fscache_use_cookie+0x1e/0x2b0
[276389.427421] Code: 90 90 90 90 90 90 90 90 90 90 90 90 41 57 41 56 41 55 41 54 55 53 48 83 ec 48 65 48 8b 04 25 28 00 00 00 48 89 44 24 40 31 c0 <48> 8b 87 88 00 00 00 89 c5 83 e5 01 0f 85 c3 01 00 00 4c 8d 6f 14
[276389.427427] RSP: 0018:ffffaf3e85053ca8 EFLAGS: 00010246
[276389.427433] RAX: 0000000000000000 RBX: ffff9b8feefd1ef8 RCX: 0000000000000000
[276389.427438] RDX: 0000000000040004 RSI: 0000000000000001 RDI: 5a5a5a5a5a5a5a5a
[276389.427442] RBP: 5a5a5a5a5a5a5a5a R08: 0000000000000000 R09: 0000000000000000
[276389.427447] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9b8feefd1ef8
[276389.427451] R13: 0000000000001000 R14: ffff9b8feefd2070 R15: ffff9b8fe2db4a00
[276389.427456] FS:  00007fac30b10740(0000) GS:ffff9b922e700000(0000) knlGS:0000000000000000
[276389.427462] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[276389.427467] CR2: 00007fac30aa8000 CR3: 00000001dcafa006 CR4: 00000000003706f0


This looks like the delayed work is suffering from some use-after-free.
Note that I have `slub_debug=ZP` on my kernel cmdline which will make
use after free issues more visible.

The kernel not being built with debugging symbols, here is what I get
from objdump:
0000000000001130 <__fscache_use_cookie>:
__fscache_use_cookie():
    1130:       41 57                   push   %r15
    1132:       41 56                   push   %r14
    1134:       41 55                   push   %r13
    1136:       41 54                   push   %r12
    1138:       55                      push   %rbp
    1139:       53                      push   %rbx
    113a:       48 83 ec 48             sub    $0x48,%rsp
    113e:       65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
    1145:       00 00 
    1147:       48 89 44 24 40          mov    %rax,0x40(%rsp)
    114c:       31 c0                   xor    %eax,%eax
    114e:       48 8b 87 88 00 00 00    mov    0x88(%rdi),%rax
                ^^^^^^^^^^^^^^^^^^^^
    1155:       89 c5                   mov    %eax,%ebp
    1157:       83 e5 01                and    $0x1,%ebp
    115a:       0f 85 c3 01 00 00       jne    1323 <__fscache_use_cookie+0x1f3>
    1160:       4c 8d 6f 14             lea    0x14(%rdi),%r13
    1164:       48 89 fb                mov    %rdi,%rbx
    1167:       41 89 f4                mov    %esi,%r12d
    116a:       4c 89 ef                mov    %r13,%rdi
    116d:       e8 00 00 00 00          call   1172 <__fscache_use_cookie+0x42>
    1172:       b8 01 00 00 00          mov    $0x1,%eax
    1177:       f0 0f c1 43 04          lock xadd %eax,0x4(%rbx)
    117c:       8b 43 08                mov    0x8(%rbx),%eax
    117f:       8b 03                   mov    (%rbx),%eax
    1181:       0f b6 b3 90 00 00 00    movzbl 0x90(%rbx),%esi
    1188:       40 80 fe 04             cmp    $0x4,%sil
    118c:       77 26                   ja     11b4 <__fscache_use_cookie+0x84>
    118e:       40 80 fe 02             cmp    $0x2,%sil
    1192:       77 7b                   ja     120f <__fscache_use_cookie+0xdf>
    1194:       40 84 f6                test   %sil,%sil
    1197:       0f 84 0f 01 00 00       je     12ac <__fscache_use_cookie+0x17c>
    119d:       45 84 e4                test   %r12b,%r12b
    11a0:       0f 84 fd 00 00 00       je     12a3 <__fscache_use_cookie+0x173>
    11a6:       f0 80 8b 88 00 00 00    lock orb $0x80,0x88(%rbx)
    11ad:       80 

Unless I'm misreading code that feels like the cookie was freed
already and the first access at cookie details crashes.


A previous trace with 6.8.0:
[1164231.756488] CPU: 7 PID: 23080 Comm: cp Tainted: G      D            6.8.0 #1
[1164231.756491] Hardware name: LENOVO 20N4S13W00/20N4S13W00, BIOS N2IETA2W (1.80 ) 06/21/2023
[1164231.756493] RIP: 0010:__fscache_use_cookie+0x1e/0x2b0
[1164231.756500] Code: 90 90 90 90 90 90 90 90 90 90 90 90 41 57 41 56 41 55 41 54 55 53 48 83 ec 48 65 48 8b 04 25 28 00 00 00 48 89 44 24 40 31 c0 <48> 8b 87 88 00 00 00 89 c5 83 e5 01 0f 85 c3 01 00 00 4c 8d 6f 14
[1164231.756502] RSP: 0018:ffffaedd076cfca8 EFLAGS: 00010246
[1164231.756504] RAX: 0000000000000000 RBX: ffffa3179d9863b8 RCX: ffffa3179d9864a0
[1164231.756506] RDX: 0000000000040004 RSI: 0000000000000001 RDI: 5a5a5a5a5a5a5a5a
[1164231.756507] RBP: 5a5a5a5a5a5a5a5a R08: ffffa3179d9864a0 R09: 0000000000000000
[1164231.756508] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa3179d9863b8
[1164231.756510] R13: 0000000000001000 R14: ffffa3179d986530 R15: ffffa31617172c00
[1164231.756511] FS:  00007f66d8685740(0000) GS:ffffa31a2e7c0000(0000) knlGS:0000000000000000
[1164231.756513] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1164231.756514] CR2: 00007f66d8589000 CR3: 00000001b9dce001 CR4: 00000000003706f0
[1164231.756516] Call Trace:
[1164231.756518]  <TASK>
[1164231.756520]  ? die_addr+0x2d/0x80
[1164231.756526]  ? exc_general_protection+0x2ba/0x340
[1164231.756531]  ? asm_exc_general_protection+0x22/0x30
[1164231.756535]  ? __fscache_use_cookie+0x1e/0x2b0
[1164231.756536]  ? locked_inode_to_wb_and_lock_list+0x3b/0x130
[1164231.756541]  ? __mark_inode_dirty+0x12e/0x220
[1164231.756543]  netfs_dirty_folio+0x8b/0xa0
[1164231.756547]  cifs_write_end+0x145/0x1d0
[1164231.756552]  generic_perform_write+0x11e/0x230
[1164231.756556]  cifs_strict_writev+0x256/0x2d0
[1164231.756559]  vfs_write+0x274/0x420
[1164231.756563]  ksys_write+0x66/0xf0
[1164231.756565]  do_syscall_64+0x4e/0x120
[1164231.756568]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[1164231.756573] RIP: 0033:0x7f66d8777264
[1164231.756576] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 05 31 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
[1164231.756577] RSP: 002b:00007ffea7f86228 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[1164231.756579] RAX: ffffffffffffffda RBX: 00000000000059d5 RCX: 00007f66d8777264
[1164231.756581] RDX: 00000000000059d5 RSI: 00007f66d8584000 RDI: 0000000000000004
[1164231.756582] RBP: 00000000000059d5 R08: 00007f66d8584000 R09: 0000000000000000
[1164231.756583] R10: 0000000000000000 R11: 0000000000000202 R12: 00007f66d8584000
[1164231.756584] R13: 0000000000000004 R14: 00000000000059d5 R15: 00000000000059d5
[1164231.756585]  </TASK>
[1164231.756586] Modules linked in:
[1164231.756589] ---[ end trace 0000000000000000 ]---
[1164231.756591] RIP: 0010:__fscache_use_cookie+0x1e/0x2b0
[1164231.756593] Code: 90 90 90 90 90 90 90 90 90 90 90 90 41 57 41 56 41 55 41 54 55 53 48 83 ec 48 65 48 8b 04 25 28 00 00 00 48 89 44 24 40 31 c0 <48> 8b 87 88 00 00 00 89 c5 83 e5 01 0f 85 c3 01 00 00 4c 8d 6f 14
[1164231.756594] RSP: 0018:ffffaedd03103ca0 EFLAGS: 00010246
[1164231.756595] RAX: 0000000000000000 RBX: ffffa3179d9825d8 RCX: 0000000000000000
[1164231.756597] RDX: 0000000000040004 RSI: 0000000000000001 RDI: 5a5a5a5a5a5a5a5a
[1164231.756598] RBP: 5a5a5a5a5a5a5a5a R08: 0000000000000000 R09: 0000000000000000
[1164231.756599] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa3179d9825d8
[1164231.756600] R13: 000000000000005c R14: ffffa3179d982750 R15: ffffa316e1f66e00
[1164231.756601] FS:  00007f66d8685740(0000) GS:ffffa31a2e7c0000(0000) knlGS:0000000000000000
[1164231.756602] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1164231.756604] CR2: 00007f66d8589000 CR3: 00000001b9dce001 CR4: 00000000003706f0



Cheers,
Bruno

                 reply	other threads:[~2024-04-30  8:00 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240430094651.17461d73@hemera \
    --to=bonbons@sysophe.eu \
    --cc=dhowells@redhat.com \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=netfs@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.