From: Sasha Levin <sashal@kernel.org>
To: Hugh Dickins <hughd@google.com>
Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
Jan Kara <jack@suse.cz>, Keith Busch <kbusch@kernel.org>,
Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org
Subject: Re: [PATCH AUTOSEL 6.0 64/67] sbitmap: fix lockup while swapping
Date: Thu, 13 Oct 2022 13:54:39 -0400
Message-ID: <Y0hQ300MiPc4GBvh@sashalap>
In-Reply-To: <d095e91-046-10e9-225e-de3aecd5e8b3@google.com>

On Wed, Oct 12, 2022 at 06:08:50PM -0700, Hugh Dickins wrote:
>On Wed, 12 Oct 2022, Sasha Levin wrote:
>
>> From: Hugh Dickins <hughd@google.com>
>>
>> [ Upstream commit 30514bd2dd4e86a3ecfd6a93a3eadf7b9ea164a0 ]
>>
>> Commit 4acb83417cad ("sbitmap: fix batched wait_cnt accounting")
>> is a big improvement: without it, I had to revert to before commit
>> 040b83fcecfb ("sbitmap: fix possible io hung due to lost wakeup")
>> to avoid the high system time and freezes which that had introduced.
>>
>> Now okay on the NVME laptop, but 4acb83417cad is a disaster for heavy
>> swapping (kernel builds in low memory) on another: soon locking up in
>> sbitmap_queue_wake_up() (into which __sbq_wake_up() is inlined), cycling
>> around with waitqueue_active() but wait_cnt 0. Here is a backtrace,
>> showing the common pattern of outer sbitmap_queue_wake_up() interrupted
>> before setting wait_cnt 0 back to wake_batch (in some cases other CPUs
>> are idle, in other cases they're spinning for a lock in dd_bio_merge()):
>>
>> sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag <
>> __blk_mq_free_request < blk_mq_free_request < __blk_mq_end_request <
>> scsi_end_request < scsi_io_completion < scsi_finish_command <
>> scsi_complete < blk_complete_reqs < blk_done_softirq < __do_softirq <
>> __irq_exit_rcu < irq_exit_rcu < common_interrupt < asm_common_interrupt <
>> _raw_spin_unlock_irqrestore < __wake_up_common_lock < __wake_up <
>> sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag <
>> __blk_mq_free_request < blk_mq_free_request < dd_bio_merge <
>> blk_mq_sched_bio_merge < blk_mq_attempt_bio_merge < blk_mq_submit_bio <
>> __submit_bio < submit_bio_noacct_nocheck < submit_bio_noacct <
>> submit_bio < __swap_writepage < swap_writepage < pageout <
>> shrink_folio_list < evict_folios < lru_gen_shrink_lruvec <
>> shrink_lruvec < shrink_node < do_try_to_free_pages < try_to_free_pages <
>> __alloc_pages_slowpath < __alloc_pages < folio_alloc < vma_alloc_folio <
>> do_anonymous_page < __handle_mm_fault < handle_mm_fault <
>> do_user_addr_fault < exc_page_fault < asm_exc_page_fault
>>
>> See how the process-context sbitmap_queue_wake_up() has been interrupted,
>> after bringing wait_cnt down to 0 (and in this example, after doing its
>> wakeups), before advancing wake_index and refilling wait_cnt: an
>> interrupt-context sbitmap_queue_wake_up() of the same sbq gets stuck.
>>
>> I have almost no grasp of all the possible sbitmap races, and their
>> consequences: but __sbq_wake_up() can do nothing useful while wait_cnt 0,
>> so it is better if sbq_wake_ptr() skips on to the next ws in that case:
>> which fixes the lockup and shows no adverse consequence for me.
>>
>> The check for wait_cnt being 0 is obviously racy, and ultimately can lead
>> to lost wakeups: for example, when there is only a single waitqueue with
>> waiters. However, lost wakeups are unlikely to matter in these cases,
>> and a proper fix requires redesign (and benchmarking) of the batched
>> wakeup code: so let's plug the hole with this bandaid for now.
>>
>> Signed-off-by: Hugh Dickins <hughd@google.com>
>> Reviewed-by: Jan Kara <jack@suse.cz>
>> Reviewed-by: Keith Busch <kbusch@kernel.org>
>> Link: https://lore.kernel.org/r/9c2038a7-cdc5-5ee-854c-fbc6168bf16@google.com
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>
>Whoa! NAK to this 6.0 backport, and to the 5.19, 5.15, 5.10, 5.4
>AUTOSEL backports of the same commit. I never experienced such a
>lockup on those releases. Or have I missed announcements of stable
>backports of the whole series of 6.1-rc commits to which this one
>is a fix? (I hope not.)

Happy to drop it.

>I'm happy for my NAK to be overruled by Jens or Jan or Keith,
>if they see virtue in this commit, beyond what I'm aware of:
>but as it stands, it looks like AUTOSEL out of control again -
>it found the word "fix", and found that the commit applies cleanly,
>so thinks it must be a good stable addition. Not necessarily so!

I'm a bit confused: the subject of the patch is "fix lockup while
swapping" and the body describes a lockup and that this patch "fixes the
lockup and shows no adverse consequence". What am I missing?

--
Thanks,
Sasha