* Re: request_module DoS
       [not found] <YnXiuhdZ49pKL/dK@gondor.apana.org.au>
@ 2022-05-07 7:10 ` Christophe Leroy
  2022-05-07 8:02 ` Luis Chamberlain
       [not found] ` <874k1zt0ec.fsf@mpe.ellerman.id.au>
  1 sibling, 1 reply; 9+ messages in thread
From: Christophe Leroy @ 2022-05-07 7:10 UTC (permalink / raw)
  To: Herbert Xu, Luis Chamberlain, linux-kernel@vger.kernel.org,
      linux-modules@vger.kernel.org, linuxppc-dev
  Cc: fnovak@us.ibm.com

+ linuxppc list

On 07/05/2022 at 05:08, Herbert Xu wrote:
> Hi:
>
> There are some code paths in the kernel where you can reliably
> trigger a request_module of a non-existent module. For example,
> if you attempt to load a non-existent crypto algorithm, or create
> a socket of a non-existent network family, it will result in a
> request_module call that is guaranteed to fail.
>
> As user-space can do this repeatedly, it can quickly overwhelm
> the concurrency limit in kmod. This in itself is expected;
> however, at least on some platforms this appears to result in
> a live-lock. Here is an example triggered by stress-ng on ppc64:
>
> [ 529.853264] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling...
> [ 529.854329] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling...
> [ 529.854341] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling...
> [ 529.854419] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling...
> [ 529.925327] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling...
> [ 529.925328] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling...
> [ 529.925328] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling...
> [ 529.925356] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128, throttling... > [ 529.925373] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling... > [ 529.925397] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling... > [ 534.863623] __request_module: 572 callbacks suppressed > [ 534.863632] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 534.863642] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 534.864113] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 534.864989] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 534.865908] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 534.873626] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 534.873682] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l-all, throttling... > [ 534.874487] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 534.875200] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-rfc4106(gcm(aes))-all, throttling... > [ 534.883333] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 539.903506] __request_module: 604 callbacks suppressed > [ 539.903514] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... 
> [ 539.923693] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-anubis-all, throttling... > [ 539.985508] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-rsa-all, throttling... > [ 540.005381] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 540.033224] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 540.035282] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 540.044614] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 540.045344] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 540.063380] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 540.073839] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 545.013451] __request_module: 364 callbacks suppressed > [ 545.013463] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-morus640-all, throttling... > [ 545.055639] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128-all, throttling... > [ 545.073121] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aes, throttling... > [ 545.113218] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-morus640-all, throttling... > [ 545.143335] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-anubis, throttling... 
> [ 545.153122] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 545.213393] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128-all, throttling... > [ 545.423560] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-blowfish-all, throttling... > [ 545.485459] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aes, throttling... > [ 545.493302] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aes, throttling... > [ 546.373762] request_module: modprobe crypto-blowfish cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 550.114824] __request_module: 89 callbacks suppressed > [ 550.114836] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128-all, throttling... > [ 550.133698] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 550.134293] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-blowfish-all, throttling... > [ 550.134367] request_module: modprobe crypto-aegis128 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 550.134380] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128-all, throttling... > [ 550.143479] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128-all, throttling... > [ 550.184477] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 550.213325] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128-all, throttling... 
> [ 550.273658] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module cryptomgr, throttling... > [ 550.354497] request_module: modprobe crypto-aegis128 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 550.354531] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128-all, throttling... > [ 550.373253] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 551.553129] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 555.125406] __request_module: 463 callbacks suppressed > [ 555.125414] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l-all, throttling... > [ 555.144260] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l-all, throttling... > [ 555.353349] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l-all, throttling... > [ 555.363333] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-camellia-all, throttling... > [ 555.374176] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l-all, throttling... > [ 555.404280] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l-all, throttling... > [ 555.424795] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l-all, throttling... > [ 555.425009] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-camellia-all, throttling... > [ 555.433594] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-morus1280-all, throttling... 
> [ 555.434605] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-camellia-all, throttling... > [ 560.135515] __request_module: 528 callbacks suppressed > [ 560.135525] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 560.213142] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 560.213155] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 560.253160] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 560.273546] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 560.295392] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 560.295393] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 560.295447] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 560.295493] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 560.295539] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256, throttling... > [ 565.313118] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 565.313191] __request_module: 269 callbacks suppressed > [ 565.313193] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... 
> [ 565.313211] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 565.313224] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 565.313241] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 565.313253] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 565.934584] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-cast6-all, throttling... > [ 565.993559] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 566.163898] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module cryptomgr, throttling... > [ 566.324557] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 566.885018] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 567.123450] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 567.144416] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... 
> [ 568.224505] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 568.224517] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 568.263714] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 568.263737] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 569.123115] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 570.323756] __request_module: 27 callbacks suppressed > [ 570.323763] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 570.383775] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 570.393602] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 570.443781] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 570.473465] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 570.583827] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 570.833842] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 570.863734] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-cast6-all, throttling... > [ 570.915448] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aes, throttling... 
> [ 570.923497] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aes, throttling... > [ 573.374203] request_module: modprobe crypto-aegis256-all cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 573.485584] request_module: modprobe crypto-morus1280 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 573.745565] request_module: modprobe crypto-aegis256-all cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 573.853349] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 573.853453] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 574.053100] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 574.073611] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 574.073679] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 574.114243] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 574.204498] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 575.384942] __request_module: 37 callbacks suppressed > [ 575.384948] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 575.554612] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 575.614579] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... 
> [ 575.623600] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 575.635387] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 575.654233] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 575.764383] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 575.783091] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 575.783802] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 575.823309] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 578.783308] __request_module: 17 callbacks suppressed > [ 578.783319] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 578.943468] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 579.013776] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 579.074271] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 579.124351] request_module: modprobe crypto-cast5-all cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 579.473229] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 579.744561] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 579.744565] request_module: modprobe crypto-aegis256 cannot be processed, 
kmod busy with 50 threads for more than 5 seconds now > [ 579.833100] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 579.845320] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now > [ 580.414590] __request_module: 25 callbacks suppressed > [ 580.414597] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling... > [ 580.423082] watchdog: CPU 784 self-detected hard LOCKUP @ plpar_hcall_norets_notrace+0x18/0x2c > [ 580.423097] watchdog: CPU 784 TB:1297691958559475, last heartbeat TB:1297686321743840 (11009ms ago) > [ 580.423099] Modules linked in: cast6_generic cast5_generic cast_common camellia_generic blowfish_generic blowfish_common tun nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill bonding tls ip_set nf_tables nfnetlink pseries_rng binfmt_misc drm drm_panel_orientation_quirks xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp vmx_crypto dm_mirror dm_region_hash dm_log dm_mod fuse > [ 580.423136] CPU: 784 PID: 77071 Comm: stress-ng Kdump: loaded Not tainted 5.14.0-55.el9.ppc64le #1 > [ 580.423139] NIP: c0000000000f8ff4 LR: c0000000001f7c38 CTR: 0000000000000000 > [ 580.423140] REGS: c0000043fdd7bd60 TRAP: 0900 Not tainted (5.14.0-55.el9.ppc64le) > [ 580.423142] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28008202 XER: 20040000 > [ 580.423148] CFAR: 0000000000000c00 IRQMASK: 1 > GPR00: 0000000028008202 c0000044c46b3850 c000000002a46f00 0000000000000000 > GPR04: ffffffffffffffff 0000000000000000 0000000000000010 c000000002a83060 > GPR08: 0000000000000000 0000000000000001 0000000000000001 0000000000000000 > GPR12: c0000000001b9530 c0000043ffe16700 0000000200000117 0000000010185ea8 > GPR16: 0000000010212150 
0000000010186198 00000000101863a0 000000001021b3c0 > GPR20: 0000000000000001 0000000000000000 0000000000000001 00000000000000ff > GPR24: c0000043f4a00e14 c0000043fafe0e00 000000000c440000 0000000000000000 > GPR28: c0000043f4a00e00 c0000043f4a00e00 c0000000021e0e00 c000000002561aa0 > [ 580.423166] NIP [c0000000000f8ff4] plpar_hcall_norets_notrace+0x18/0x2c > [ 580.423168] LR [c0000000001f7c38] __pv_queued_spin_lock_slowpath+0x528/0x530 > [ 580.423173] Call Trace: > [ 580.423174] [c0000044c46b3850] [0000000100006b60] 0x100006b60 (unreliable) > [ 580.423177] [c0000044c46b3910] [c000000000ea6948] _raw_spin_lock_irqsave+0xa8/0xc0 > [ 580.423182] [c0000044c46b3940] [c0000000001dd7c0] prepare_to_wait_event+0x40/0x200 > [ 580.423185] [c0000044c46b39a0] [c00000000019e9e0] __request_module+0x320/0x510 > [ 580.423188] [c0000044c46b3ac0] [c0000000006f1a14] crypto_alg_mod_lookup+0x1e4/0x2e0 > [ 580.423192] [c0000044c46b3b60] [c0000000006f2178] crypto_alloc_tfm_node+0xa8/0x1a0 > [ 580.423194] [c0000044c46b3be0] [c0000000006f84f8] crypto_alloc_aead+0x38/0x50 > [ 580.423196] [c0000044c46b3c00] [c00000000072cba0] aead_bind+0x70/0x140 > [ 580.423199] [c0000044c46b3c40] [c000000000727824] alg_bind+0xb4/0x210 > [ 580.423201] [c0000044c46b3cc0] [c000000000bc2ad4] __sys_bind+0x114/0x160 > [ 580.423205] [c0000044c46b3d90] [c000000000bc2b48] sys_bind+0x28/0x40 > [ 580.423207] [c0000044c46b3db0] [c000000000030880] system_call_exception+0x160/0x300 > [ 580.423209] [c0000044c46b3e10] [c00000000000c168] system_call_vectored_common+0xe8/0x278 > [ 580.423213] --- interrupt: 3000 at 0x7fff9b824464 > [ 580.423214] NIP: 00007fff9b824464 LR: 0000000000000000 CTR: 0000000000000000 > [ 580.423215] REGS: c0000044c46b3e80 TRAP: 3000 Not tainted (5.14.0-55.el9.ppc64le) > [ 580.423216] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42004802 XER: 00000000 > [ 580.423221] IRQMASK: 0 > GPR00: 0000000000000147 00007fffdcff2780 00007fff9b917100 0000000000000004 > GPR04: 00007fffdcff27e0 
0000000000000058 0000000000000000 0000000000000000 > GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR12: 0000000000000000 00007fff9bc9efe0 0000000200000117 0000000010185ea8 > GPR16: 0000000010212150 0000000010186198 00000000101863a0 000000001021b3c0 > GPR20: 0000000000000004 00007fffdcff2a00 0000000300000117 00000000101862b8 > GPR24: 0000000000000004 0000000046401570 0000000046401120 0000000046404650 > GPR28: 0000000000000020 0000000000000020 0000000000000060 0000000046404bf0 > [ 580.423236] NIP [00007fff9b824464] 0x7fff9b824464 > [ 580.423237] LR [0000000000000000] 0x0 > [ 580.423238] --- interrupt: 3000 > [ 580.423239] Instruction dump: > [ 580.423241] e8690000 7c0803a6 3884fff8 78630100 78840020 4bfffeb8 3c4c0295 3842df24 > [ 580.423244] 7c421378 7c000026 90010008 44000022 <38800000> 988d0931 80010008 7c0ff120 > > Would it be possible to modify kmod so that in such cases that > request_module calls fail more quickly rather than repeatedly > obtaining a spinlock that appears to be under high contention? > > Thanks, ^ permalink raw reply [flat|nested] 9+ messages in thread
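When triaging a log like the one quoted above, it can help to tally the throttling messages per module. The following is a minimal sketch, assuming the dmesg output has been saved to a file with one message per line; the function name `count_throttled` is hypothetical, and the match pattern is taken only from the message format visible in the log.

```shell
#!/bin/sh
# Hypothetical helper: count "throttling..." messages per module name
# in saved dmesg output. $1 is the path to the saved log.
count_throttled() {
	awk '
		# Example line:
		#   request_module: kmod_concurrent_max (0) close to 0
		#   (max_modprobes: 50), for module crypto-aes, throttling...
		/for module .*, throttling\.\.\./ {
			sub(/.*for module /, "")          # drop everything before the name
			sub(/, throttling\.\.\..*/, "")   # drop everything after the name
			count[$0]++
		}
		END { for (m in count) print count[m], m }
	' "$1" | sort -rn
}
```

Called as `count_throttled dmesg.txt`, it prints one `count module` pair per line, most frequent first. Lines dropped by the kernel's rate limiting ("callbacks suppressed") are not recoverable this way, so the counts are a lower bound.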
* Re: request_module DoS
  2022-05-07 7:10 ` request_module DoS Christophe Leroy
@ 2022-05-07 8:02 ` Luis Chamberlain
  2022-05-07 19:14 ` Luis Chamberlain
  0 siblings, 1 reply; 9+ messages in thread
From: Luis Chamberlain @ 2022-05-07 8:02 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: linuxppc-dev, fnovak@us.ibm.com, Herbert Xu,
      linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org

On Sat, May 07, 2022 at 07:10:23AM +0000, Christophe Leroy wrote:
> > There are some code paths in the kernel where you can reliably
> > trigger a request_module of a non-existent module. For example,
> > if you attempt to load a non-existent crypto algorithm, or create
> > a socket of a non-existent network family, it will result in a
> > request_module call that is guaranteed to fail.
> >
> > As user-space can do this repeatedly, it can quickly overwhelm
> > the concurrency limit in kmod. This in itself is expected;
> > however, at least on some platforms this appears to result in
> > a live-lock. Here is an example triggered by stress-ng on ppc64:
> >
> > [ 579.845320] request_module: modprobe crypto-aegis256 cannot be processed, kmod busy with 50 threads for more than 5 seconds now
> > [ 580.414590] __request_module: 25 callbacks suppressed
> > [ 580.414597] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling...
> > [ 580.423082] watchdog: CPU 784 self-detected hard LOCKUP @ plpar_hcall_norets_notrace+0x18/0x2c > > [ 580.423097] watchdog: CPU 784 TB:1297691958559475, last heartbeat TB:1297686321743840 (11009ms ago) > > [ 580.423099] Modules linked in: cast6_generic cast5_generic cast_common camellia_generic blowfish_generic blowfish_common tun nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill bonding tls ip_set nf_tables nfnetlink pseries_rng binfmt_misc drm drm_panel_orientation_quirks xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp vmx_crypto dm_mirror dm_region_hash dm_log dm_mod fuse > > [ 580.423136] CPU: 784 PID: 77071 Comm: stress-ng Kdump: loaded Not tainted 5.14.0-55.el9.ppc64le #1 > > [ 580.423139] NIP: c0000000000f8ff4 LR: c0000000001f7c38 CTR: 0000000000000000 > > [ 580.423140] REGS: c0000043fdd7bd60 TRAP: 0900 Not tainted (5.14.0-55.el9.ppc64le) > > [ 580.423142] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28008202 XER: 20040000 > > [ 580.423148] CFAR: 0000000000000c00 IRQMASK: 1 > > GPR00: 0000000028008202 c0000044c46b3850 c000000002a46f00 0000000000000000 > > GPR04: ffffffffffffffff 0000000000000000 0000000000000010 c000000002a83060 > > GPR08: 0000000000000000 0000000000000001 0000000000000001 0000000000000000 > > GPR12: c0000000001b9530 c0000043ffe16700 0000000200000117 0000000010185ea8 > > GPR16: 0000000010212150 0000000010186198 00000000101863a0 000000001021b3c0 > > GPR20: 0000000000000001 0000000000000000 0000000000000001 00000000000000ff > > GPR24: c0000043f4a00e14 c0000043fafe0e00 000000000c440000 0000000000000000 > > GPR28: c0000043f4a00e00 c0000043f4a00e00 c0000000021e0e00 c000000002561aa0 > > [ 580.423166] NIP [c0000000000f8ff4] plpar_hcall_norets_notrace+0x18/0x2c > > [ 580.423168] LR [c0000000001f7c38] __pv_queued_spin_lock_slowpath+0x528/0x530 > > [ 580.423173] Call Trace: > > [ 
580.423174] [c0000044c46b3850] [0000000100006b60] 0x100006b60 (unreliable) > > [ 580.423177] [c0000044c46b3910] [c000000000ea6948] _raw_spin_lock_irqsave+0xa8/0xc0 > > [ 580.423182] [c0000044c46b3940] [c0000000001dd7c0] prepare_to_wait_event+0x40/0x200 > > [ 580.423185] [c0000044c46b39a0] [c00000000019e9e0] __request_module+0x320/0x510 > > [ 580.423188] [c0000044c46b3ac0] [c0000000006f1a14] crypto_alg_mod_lookup+0x1e4/0x2e0 > > [ 580.423192] [c0000044c46b3b60] [c0000000006f2178] crypto_alloc_tfm_node+0xa8/0x1a0 > > [ 580.423194] [c0000044c46b3be0] [c0000000006f84f8] crypto_alloc_aead+0x38/0x50 > > [ 580.423196] [c0000044c46b3c00] [c00000000072cba0] aead_bind+0x70/0x140 > > [ 580.423199] [c0000044c46b3c40] [c000000000727824] alg_bind+0xb4/0x210 > > [ 580.423201] [c0000044c46b3cc0] [c000000000bc2ad4] __sys_bind+0x114/0x160 > > [ 580.423205] [c0000044c46b3d90] [c000000000bc2b48] sys_bind+0x28/0x40 > > [ 580.423207] [c0000044c46b3db0] [c000000000030880] system_call_exception+0x160/0x300 > > [ 580.423209] [c0000044c46b3e10] [c00000000000c168] system_call_vectored_common+0xe8/0x278 > > [ 580.423213] --- interrupt: 3000 at 0x7fff9b824464 > > [ 580.423214] NIP: 00007fff9b824464 LR: 0000000000000000 CTR: 0000000000000000 > > [ 580.423215] REGS: c0000044c46b3e80 TRAP: 3000 Not tainted (5.14.0-55.el9.ppc64le) > > [ 580.423216] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42004802 XER: 00000000 > > [ 580.423221] IRQMASK: 0 > > GPR00: 0000000000000147 00007fffdcff2780 00007fff9b917100 0000000000000004 > > GPR04: 00007fffdcff27e0 0000000000000058 0000000000000000 0000000000000000 > > GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > > GPR12: 0000000000000000 00007fff9bc9efe0 0000000200000117 0000000010185ea8 > > GPR16: 0000000010212150 0000000010186198 00000000101863a0 000000001021b3c0 > > GPR20: 0000000000000004 00007fffdcff2a00 0000000300000117 00000000101862b8 > > GPR24: 0000000000000004 0000000046401570 0000000046401120 
0000000046404650
> > GPR28: 0000000000000020 0000000000000020 0000000000000060 0000000046404bf0
> > [ 580.423236] NIP [00007fff9b824464] 0x7fff9b824464
> > [ 580.423237] LR [0000000000000000] 0x0
> > [ 580.423238] --- interrupt: 3000
> > [ 580.423239] Instruction dump:
> > [ 580.423241] e8690000 7c0803a6 3884fff8 78630100 78840020 4bfffeb8 3c4c0295 3842df24
> > [ 580.423244] 7c421378 7c000026 90010008 44000022 <38800000> 988d0931 80010008 7c0ff120
> >
> > Would it be possible to modify kmod so that in such cases that
> > request_module calls fail more quickly rather than repeatedly
> > obtaining a spinlock that appears to be under high contention?

The kmod count limit is lockless, but prepare_to_wait_event() does hold a lock...

You can try to reproduce by adding a new test type for crypto-aegis256
to lib/test_kmod.c. These tests, however, try something similar with other
modules.

/tools/testing/selftests/kmod/kmod.sh -t 0008
/tools/testing/selftests/kmod/kmod.sh -t 0009

I can't decipher this yet.

  Luis

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: request_module DoS
  2022-05-07 8:02 ` Luis Chamberlain
@ 2022-05-07 19:14 ` Luis Chamberlain
  2022-05-09 1:42 ` Luis Chamberlain
  0 siblings, 1 reply; 9+ messages in thread
From: Luis Chamberlain @ 2022-05-07 19:14 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: linuxppc-dev, fnovak@us.ibm.com, Herbert Xu,
      linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org

On Sat, May 07, 2022 at 01:02:20AM -0700, Luis Chamberlain wrote:
> You can try to reproduce by adding a new test type for crypto-aegis256
> to lib/test_kmod.c. These tests, however, try something similar with other
> modules.
>
> /tools/testing/selftests/kmod/kmod.sh -t 0008
> /tools/testing/selftests/kmod/kmod.sh -t 0009
>
> I can't decipher this yet.

Without testing it... but something like this might be an easier
reproducer:

diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
index afd42387e8b2..48b6b5ec6c1e 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -41,6 +41,7 @@ set -e
 TEST_NAME="kmod"
 TEST_DRIVER="test_${TEST_NAME}"
 TEST_DIR=$(dirname $0)
+PROC_CONFIG="/proc/config.gz"

 # This represents
 #
@@ -65,6 +66,7 @@ ALL_TESTS="$ALL_TESTS 0010:1:1"
 ALL_TESTS="$ALL_TESTS 0011:1:1"
 ALL_TESTS="$ALL_TESTS 0012:1:1"
 ALL_TESTS="$ALL_TESTS 0013:1:1"
+ALL_TESTS="$ALL_TESTS 0014:150:1"

 # Kselftest framework requirement - SKIP code is 4.
 ksft_skip=4
@@ -79,6 +81,19 @@ test_modprobe()
 	fi
 }

+kconfig_has()
+{
+	if [ -f $PROC_CONFIG ]; then
+		if zgrep -q $1 $PROC_CONFIG 2>/dev/null; then
+			echo "yes"
+		else
+			echo "no"
+		fi
+	else
+		echo "no"
+	fi
+}
+
 function allow_user_defaults()
 {
 	if [ -z $DEFAULT_KMOD_DRIVER ]; then
@@ -106,6 +121,8 @@ function allow_user_defaults()
 	fi

 	MODPROBE_LIMIT_FILE="${PROC_DIR}/kmod-limit"
+	HAS_CRYPTO_AEGIS256_MOD="$(kconfig_has CONFIG_CRYPTO_AEGIS256=m)"
+	HAS_CRYPTO_AEGIS256_BUILTIN="$(kconfig_has CONFIG_CRYPTO_AEGIS256=y)"
 }

 test_reqs()
@@ -504,6 +521,21 @@ kmod_test_0013()
 		"cat /sys/module/${DEFAULT_KMOD_DRIVER}/sections/.*text | head -n1"
 }

+kmod_test_0014()
+{
+	kmod_defaults_driver
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA=$MODPROBE_LIMIT/6
+	config_set_driver crypto-aegis256
+	config_num_thread_limit_extra $EXTRA
+	config_trigger ${FUNCNAME[0]}
+	if [[ "$HAS_CRYPTO_AEGIS256_MOD" == "yes" || "$HAS_CRYPTO_AEGIS256_BUILTIN" == "yes" ]]; then
+		config_expect_result ${FUNCNAME[0]} SUCCESS
+	else
+		config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+	fi
+}
+
 list_tests()
 {
 	echo "Test ID list:"
@@ -525,6 +557,7 @@ list_tests()
 	echo "0011 x $(get_test_count 0011) - test completely disabling module autoloading"
 	echo "0012 x $(get_test_count 0012) - test /proc/modules address visibility under CAP_SYSLOG"
 	echo "0013 x $(get_test_count 0013) - test /sys/module/*/sections/* visibility under CAP_SYSLOG"
+	echo "0014 x $(get_test_count 0014) - multithreaded - push kmod_concurrent over max_modprobes for request_module() for crypto-aegis256"
 }

 usage()

^ permalink raw reply related [flat|nested] 9+ messages in thread
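The `kconfig_has` helper in the patch above reads `/proc/config.gz`, which is only present when the kernel was built with CONFIG_IKCONFIG_PROC; on other kernels it reports "no" for every option. Below is a sketch of the same check that can be tried outside the selftest. It is not part of the posted patch: the `KCONFIG_FILE` override is hypothetical, the arguments are quoted, and `gzip -dc | grep -qxF` is swapped in for `zgrep -q` so that the option is matched as a whole fixed-string line.

```shell
#!/bin/sh
# Sketch of a quoted variant of kconfig_has from the patch above.
# KCONFIG_FILE is a hypothetical override; it defaults to /proc/config.gz.
KCONFIG_FILE="${KCONFIG_FILE:-/proc/config.gz}"

kconfig_has() {
	# Print "yes" if the exact line $1 (for example CONFIG_FOO=m) appears
	# in the gzip-compressed kernel config, and "no" otherwise, including
	# when the config file does not exist.
	if [ -f "$KCONFIG_FILE" ] &&
	   gzip -dc "$KCONFIG_FILE" 2>/dev/null | grep -qxF -- "$1"; then
		echo "yes"
	else
		echo "no"
	fi
}
```

For example, `kconfig_has CONFIG_CRYPTO_AEGIS256=m` answers whether the algorithm is built as a module. A "no" from a missing config file is indistinguishable from a "no" for a disabled option, which is the same limitation the helper in the patch has.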
* Re: request_module DoS
  2022-05-07 19:14 ` Luis Chamberlain
@ 2022-05-09  1:42   ` Luis Chamberlain
  0 siblings, 0 replies; 9+ messages in thread
From: Luis Chamberlain @ 2022-05-09 1:42 UTC
To: Christophe Leroy
Cc: linuxppc-dev, fnovak@us.ibm.com, Herbert Xu, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org

On Sat, May 07, 2022 at 12:14:47PM -0700, Luis Chamberlain wrote:
> On Sat, May 07, 2022 at 01:02:20AM -0700, Luis Chamberlain wrote:
> > You can try to reproduce by adding a new test type for crypto-aegis256
> > in lib/test_kmod.c. These tests, however, try something similar with other
> > modules.
> >
> > /tools/testing/selftests/kmod/kmod.sh -t 0008
> > /tools/testing/selftests/kmod/kmod.sh -t 0009
> >
> > I can't decipher this yet.
>
> Without testing it... but something like this might be an easier
> reproducer:
>
> +	config_set_driver crypto-aegis256

If the module is not present, though, nothing really happens, so is it
possible this is another issue? Below is a bogus module request.

diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh
index afd42387e8b2..a747ad549940 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -65,6 +66,7 @@ ALL_TESTS="$ALL_TESTS 0010:1:1"
 ALL_TESTS="$ALL_TESTS 0011:1:1"
 ALL_TESTS="$ALL_TESTS 0012:1:1"
 ALL_TESTS="$ALL_TESTS 0013:1:1"
+ALL_TESTS="$ALL_TESTS 0014:150:1"
 
 # Kselftest framework requirement - SKIP code is 4.
 ksft_skip=4
@@ -504,6 +506,17 @@ kmod_test_0013()
 		"cat /sys/module/${DEFAULT_KMOD_DRIVER}/sections/.*text | head -n1"
 }
 
+kmod_test_0014()
+{
+	kmod_defaults_driver
+	MODPROBE_LIMIT=$(config_get_modprobe_limit)
+	let EXTRA=$MODPROBE_LIMIT/6
+	config_set_driver bogus_module_does_not_exist
+	config_num_thread_limit_extra $EXTRA
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
 list_tests()
 {
 	echo "Test ID list:"
@@ -525,6 +538,7 @@ list_tests()
 	echo "0011 x $(get_test_count 0011) - test completely disabling module autoloading"
 	echo "0012 x $(get_test_count 0012) - test /proc/modules address visibility under CAP_SYSLOG"
 	echo "0013 x $(get_test_count 0013) - test /sys/module/*/sections/* visibility under CAP_SYSLOG"
+	echo "0014 x $(get_test_count 0014) - multithreaded - push kmod_concurrent over max_modprobes for request_module() for a missing module"
 }
 
 usage()
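To see how far kmod_test_0014 would push past the limit, the arithmetic can be sketched on its own. This is a rough sketch under one assumption that the thread does not confirm: that config_num_thread_limit_extra adds its argument to the modprobe limit to get the thread count, as its name suggests. The helper threads_for_limit is hypothetical and exists only for illustration.

```shell
#!/bin/bash
# Sketch only: threads_for_limit is a hypothetical helper, not part of
# kmod.sh. It assumes the thread count is the modprobe limit plus EXTRA.
threads_for_limit()
{
	local limit="$1"
	# Integer division, the same as `let EXTRA=$MODPROBE_LIMIT/6`.
	local extra=$(( limit / 6 ))
	echo $(( limit + extra ))
}

# max_modprobes is 50 in the log quoted at the start of the thread.
threads_for_limit 50
```

With the limit of 50 from the original log, EXTRA is 8, so under this assumption the test would start 58 threads that each request the missing module.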
[parent not found: <874k1zt0ec.fsf@mpe.ellerman.id.au>]
* Re: request_module DoS
       [not found] ` <874k1zt0ec.fsf@mpe.ellerman.id.au>
@ 2022-05-09 16:13   ` Luis Chamberlain
  2022-05-11 16:35     ` Luis Chamberlain
  0 siblings, 1 reply; 9+ messages in thread
From: Luis Chamberlain @ 2022-05-09 16:13 UTC
To: Michael Ellerman
Cc: Herbert Xu, linux-kernel, linuxppc-dev, fnovak, linux-modules

On Mon, May 09, 2022 at 09:23:39PM +1000, Michael Ellerman wrote:
> Herbert Xu <herbert@gondor.apana.org.au> writes:
> > Hi:
> >
> > There are some code paths in the kernel where you can reliably
> > trigger a request_module of a non-existent module. For example,
> > if you attempt to load a non-existent crypto algorithm, or create
> > a socket of a non-existent network family, it will result in a
> > request_module call that is guaranteed to fail.
> >
> > As user-space can do this repeatedly, it can quickly overwhelm
> > the concurrency limit in kmod. This in itself is expected,
> > however, at least on some platforms this appears to result in
> > a live-lock. Here is an example triggered by stress-ng on ppc64:
> >
> > [ 529.853264] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis128l, throttling...
> ...
> > [ 580.414590] __request_module: 25 callbacks suppressed
> > [ 580.414597] request_module: kmod_concurrent_max (0) close to 0 (max_modprobes: 50), for module crypto-aegis256-all, throttling...
> > [ 580.423082] watchdog: CPU 784 self-detected hard LOCKUP @ plpar_hcall_norets_notrace+0x18/0x2c
> > [ 580.423097] watchdog: CPU 784 TB:1297691958559475, last heartbeat TB:1297686321743840 (11009ms ago)
> > [ 580.423099] Modules linked in: cast6_generic cast5_generic cast_common camellia_generic blowfish_generic blowfish_common tun nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill bonding tls ip_set nf_tables nfnetlink pseries_rng binfmt_misc drm drm_panel_orientation_quirks xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp vmx_crypto dm_mirror dm_region_hash dm_log dm_mod fuse
> > [ 580.423136] CPU: 784 PID: 77071 Comm: stress-ng Kdump: loaded Not tainted 5.14.0-55.el9.ppc64le #1
> > [ 580.423139] NIP: c0000000000f8ff4 LR: c0000000001f7c38 CTR: 0000000000000000
> > [ 580.423140] REGS: c0000043fdd7bd60 TRAP: 0900 Not tainted (5.14.0-55.el9.ppc64le)
> > [ 580.423142] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28008202 XER: 20040000
> > [ 580.423148] CFAR: 0000000000000c00 IRQMASK: 1
> > GPR00: 0000000028008202 c0000044c46b3850 c000000002a46f00 0000000000000000
> > GPR04: ffffffffffffffff 0000000000000000 0000000000000010 c000000002a83060
> > GPR08: 0000000000000000 0000000000000001 0000000000000001 0000000000000000
> > GPR12: c0000000001b9530 c0000043ffe16700 0000000200000117 0000000010185ea8
> > GPR16: 0000000010212150 0000000010186198 00000000101863a0 000000001021b3c0
> > GPR20: 0000000000000001 0000000000000000 0000000000000001 00000000000000ff
> > GPR24: c0000043f4a00e14 c0000043fafe0e00 000000000c440000 0000000000000000
> > GPR28: c0000043f4a00e00 c0000043f4a00e00 c0000000021e0e00 c000000002561aa0
> > [ 580.423166] NIP [c0000000000f8ff4] plpar_hcall_norets_notrace+0x18/0x2c
> > [ 580.423168] LR [c0000000001f7c38] __pv_queued_spin_lock_slowpath+0x528/0x530
> > [ 580.423173] Call Trace:
> > [ 580.423174] [c0000044c46b3850] [0000000100006b60] 0x100006b60 (unreliable)
> > [ 580.423177] [c0000044c46b3910] [c000000000ea6948] _raw_spin_lock_irqsave+0xa8/0xc0
> > [ 580.423182] [c0000044c46b3940] [c0000000001dd7c0] prepare_to_wait_event+0x40/0x200
> > [ 580.423185] [c0000044c46b39a0] [c00000000019e9e0] __request_module+0x320/0x510
> > [ 580.423188] [c0000044c46b3ac0] [c0000000006f1a14] crypto_alg_mod_lookup+0x1e4/0x2e0
> > [ 580.423192] [c0000044c46b3b60] [c0000000006f2178] crypto_alloc_tfm_node+0xa8/0x1a0
> > [ 580.423194] [c0000044c46b3be0] [c0000000006f84f8] crypto_alloc_aead+0x38/0x50
> > [ 580.423196] [c0000044c46b3c00] [c00000000072cba0] aead_bind+0x70/0x140
> > [ 580.423199] [c0000044c46b3c40] [c000000000727824] alg_bind+0xb4/0x210
> > [ 580.423201] [c0000044c46b3cc0] [c000000000bc2ad4] __sys_bind+0x114/0x160
> > [ 580.423205] [c0000044c46b3d90] [c000000000bc2b48] sys_bind+0x28/0x40
> > [ 580.423207] [c0000044c46b3db0] [c000000000030880] system_call_exception+0x160/0x300
> > [ 580.423209] [c0000044c46b3e10] [c00000000000c168] system_call_vectored_common+0xe8/0x278
> > [ 580.423213] --- interrupt: 3000 at 0x7fff9b824464
> > [ 580.423214] NIP: 00007fff9b824464 LR: 0000000000000000 CTR: 0000000000000000
> > [ 580.423215] REGS: c0000044c46b3e80 TRAP: 3000 Not tainted (5.14.0-55.el9.ppc64le)
> > [ 580.423216] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42004802 XER: 00000000
> > [ 580.423221] IRQMASK: 0
> > GPR00: 0000000000000147 00007fffdcff2780 00007fff9b917100 0000000000000004
> > GPR04: 00007fffdcff27e0 0000000000000058 0000000000000000 0000000000000000
> > GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR12: 0000000000000000 00007fff9bc9efe0 0000000200000117 0000000010185ea8
> > GPR16: 0000000010212150 0000000010186198 00000000101863a0 000000001021b3c0
> > GPR20: 0000000000000004 00007fffdcff2a00 0000000300000117 00000000101862b8
> > GPR24: 0000000000000004 0000000046401570 0000000046401120 0000000046404650
> > GPR28: 0000000000000020 0000000000000020 0000000000000060 0000000046404bf0
> > [ 580.423236] NIP [00007fff9b824464] 0x7fff9b824464
> > [ 580.423237] LR [0000000000000000] 0x0
> > [ 580.423238] --- interrupt: 3000
> > [ 580.423239] Instruction dump:
> > [ 580.423241] e8690000 7c0803a6 3884fff8 78630100 78840020 4bfffeb8 3c4c0295 3842df24
> > [ 580.423244] 7c421378 7c000026 90010008 44000022 <38800000> 988d0931 80010008 7c0ff120
> >
> > Would it be possible to modify kmod so that in such cases that
> > request_module calls fail more quickly rather than repeatedly
> > obtaining a spinlock that appears to be under high contention?
>
> If you run stress-ng with a timeout does the system eventually recover?

OK the respective stress-ng test should be something like:

./stress-ng --af-alg 8192

I had left this running overnight on x86_64 without issues:

sudo ./tools/testing/selftests/kmod/kmod.sh -t 0009

Going to leave the above stress-ng call running in a loop to see
if I can reproduce the live lock on x86_64.

  Luis
* Re: request_module DoS
  2022-05-09 16:13 ` Luis Chamberlain
@ 2022-05-11 16:35   ` Luis Chamberlain
  2022-05-12  7:36     ` Michael Ellerman
  0 siblings, 1 reply; 9+ messages in thread
From: Luis Chamberlain @ 2022-05-11 16:35 UTC
To: Michael Ellerman
Cc: Herbert Xu, linux-kernel, linuxppc-dev, fnovak, linux-modules

On Mon, May 09, 2022 at 09:13:03AM -0700, Luis Chamberlain wrote:
> On Mon, May 09, 2022 at 09:23:39PM +1000, Michael Ellerman wrote:
> > Herbert Xu <herbert@gondor.apana.org.au> writes:
> > > [...]
> > >
> > > Would it be possible to modify kmod so that in such cases that
> > > request_module calls fail more quickly rather than repeatedly
> > > obtaining a spinlock that appears to be under high contention?
> >
> > If you run stress-ng with a timeout does the system eventually recover?
>
> OK the respective stress-ng test should be something like:
>
> ./stress-ng --af-alg 8192
>
> I had left this running overnight on x86_64 without issues:
>
> sudo ./tools/testing/selftests/kmod/kmod.sh -t 0009
>
> Going to leave the above stress-ng call running in a loop to see
> if I can reproduce the live lock on x86_64.

The following loop has been running on 5.18.0-rc5-next-20220506 since
May 9 without any issues on x86_64:

while true; do sudo ./stress-ng --af-alg 8192; done

Can someone try this on a ppc64le system? At this point I am not convinced
this issue is generic.

  Luis
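The `while true` loop above never ends on its own. A bounded variant makes it easier to report how many rounds completed, or which round first failed. The sketch below is a generic wrapper, with run_rounds being a hypothetical name rather than anything used in the thread: it runs a command a fixed number of times and stops at the first non-zero exit status.

```shell
#!/bin/bash
# Sketch only: run_rounds is a hypothetical helper, not from the thread.
# Usage: run_rounds ROUNDS CMD [ARGS...]
# Runs CMD up to ROUNDS times and stops at the first failure.
run_rounds()
{
	local rounds="$1"
	shift
	local i
	for (( i = 1; i <= rounds; i++ )); do
		if ! "$@"; then
			echo "round $i failed"
			return 1
		fi
	done
	echo "completed $rounds rounds"
}
```

For the reproducer in this thread it could be invoked as `run_rounds 100 sudo ./stress-ng --af-alg 8192 --timeout 60s`. The round count and the 60 second timeout are illustrative values, chosen here and not taken from the thread.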
* Re: request_module DoS
  2022-05-11 16:35 ` Luis Chamberlain
@ 2022-05-12  7:36   ` Michael Ellerman
  2022-05-12 12:07     ` Michael Ellerman
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2022-05-12 7:36 UTC
To: Luis Chamberlain
Cc: Herbert Xu, linux-kernel, linuxppc-dev, fnovak, linux-modules

Luis Chamberlain <mcgrof@kernel.org> writes:
> On Mon, May 09, 2022 at 09:13:03AM -0700, Luis Chamberlain wrote:
>> On Mon, May 09, 2022 at 09:23:39PM +1000, Michael Ellerman wrote:
>> > Herbert Xu <herbert@gondor.apana.org.au> writes:
>> > > [...]
>> > >
>> > > Would it be possible to modify kmod so that in such cases that
>> > > request_module calls fail more quickly rather than repeatedly
>> > > obtaining a spinlock that appears to be under high contention?
>> >
>> > If you run stress-ng with a timeout does the system eventually recover?
>>
>> OK the respective stress-ng test should be something like:
>>
>> ./stress-ng --af-alg 8192
>>
>> I had left this running overnight on x86_64 without issues:
>>
>> sudo ./tools/testing/selftests/kmod/kmod.sh -t 0009
>>
>> Going to leave the above stress-ng call running in a loop to see
>> if I can reproduce the live lock on x86_64.
>
> The following loop has been running on 5.18.0-rc5-next-20220506 since
> May 9 without any issues on x86_64:
>
> while true; do sudo ./stress-ng --af-alg 8192; done

I ran the above on a ppc64le system here, no issues. But it's only a 32
CPU machine.

> Can someone try this on a ppc64le system? At this point I am not convinced
> this issue is generic.

Does your x86 system have at least 784 CPUs?

I don't know where the original report came from, but the trace shows
"CPU 784", which would usually indicate a system with at least that many
CPUs.

I would hope we can handle that many CPUs banging on a spin lock without
throwing traces, but maybe not.

It could also be that the system is under load and the hypervisor has
scheduled us off for too long.

cheers
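Whether a test machine is comparable in size to the one in the report can be checked before starting a long run. The sketch below uses a hypothetical helper, has_at_least_cpus, which compares the number of online CPUs against a threshold. Its optional second argument overrides the detected count and exists only so the comparison can be tried without a large machine.

```shell
#!/bin/bash
# Sketch only: has_at_least_cpus is a hypothetical helper.
# Usage: has_at_least_cpus WANT [HAVE]
# HAVE defaults to the number of CPUs currently online.
has_at_least_cpus()
{
	local want="$1"
	local have="${2:-$(getconf _NPROCESSORS_ONLN)}"

	if [ "$have" -ge "$want" ]; then
		echo "yes"
	else
		echo "no"
	fi
}
```

CPU numbers start at 0, so a trace that names CPU 784 suggests at least 785 CPU ids on that system. `has_at_least_cpus 785` prints "no" on the 32 CPU machine mentioned above.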
* Re: request_module DoS
  2022-05-12  7:36 ` Michael Ellerman
@ 2022-05-12 12:07   ` Michael Ellerman
  2022-05-12 17:43     ` Luis Chamberlain
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2022-05-12 12:07 UTC
To: Luis Chamberlain
Cc: Herbert Xu, linux-kernel, linuxppc-dev, fnovak, linux-modules

Michael Ellerman <mpe@ellerman.id.au> writes:
> Luis Chamberlain <mcgrof@kernel.org> writes:
...
>
>> Can someone try this on a ppc64le system? At this point I am not convinced
>> this issue is generic.
>
> Does your x86 system have at least 784 CPUs?
>
> I don't know where the original report came from, but the trace shows
> "CPU 784", which would usually indicate a system with at least that many
> CPUs.

Update, apparently the report originally came from IBM, so I'll chase it
up internally.

I think you're right that there's probably no issue in the module code,
sorry to waste your time.

cheers
* Re: request_module DoS
  2022-05-12 12:07 ` Michael Ellerman
@ 2022-05-12 17:43   ` Luis Chamberlain
  0 siblings, 0 replies; 9+ messages in thread
From: Luis Chamberlain @ 2022-05-12 17:43 UTC
To: Michael Ellerman
Cc: Herbert Xu, linux-kernel, linuxppc-dev, fnovak, linux-modules

On Thu, May 12, 2022 at 10:07:26PM +1000, Michael Ellerman wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> > Luis Chamberlain <mcgrof@kernel.org> writes:
> ...
> >
> >> Can someone try this on a ppc64le system? At this point I am not convinced
> >> this issue is generic.
> >
> > Does your x86 system have at least 784 CPUs?
> >
> > I don't know where the original report came from, but the trace shows
> > "CPU 784", which would usually indicate a system with at least that many
> > CPUs.
>
> Update, apparently the report originally came from IBM, so I'll chase it
> up internally.
>
> I think you're right that there's probably no issue in the module code,
> sorry to waste your time.

It gives me testing happiness to know that may be the case :)

  Luis