Date: Fri, 6 Feb 2026 07:27:55 +1100
From: Dave Chinner <david@fromorbit.com>
To: Jinliang Zheng
Cc: alexjlzheng@tencent.com, cem@kernel.org, linux-kernel@vger.kernel.org,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] xfs: take a breath in xlog_ioend_work()
References: <20260205124940.2323931-1-alexjlzheng@tencent.com>
In-Reply-To: <20260205124940.2323931-1-alexjlzheng@tencent.com>

On Thu, Feb 05, 2026 at 08:49:38PM +0800, Jinliang Zheng wrote:
> On Thu, 5 Feb 2026 21:54:49 +1100, david@fromorbit.com wrote:
> > On Thu, Feb 05, 2026 at 04:26:20PM +0800, alexjlzheng@gmail.com wrote:
> > > From: Jinliang Zheng
> > >
> > > The xlog_ioend_work() function contains several nested loops with
> > > fairly complex operations, which may lead to:
> > >
> > > PID: 2604722  TASK: ffff88c08306b1c0  CPU: 263  COMMAND: "kworker/263:0H"
> > >  #0 [ffffc9001cbf8d58] machine_kexec at ffffffff9d086081
> > >  #1 [ffffc9001cbf8db8] __crash_kexec at ffffffff9d20817a
> > >  #2 [ffffc9001cbf8e78] panic at ffffffff9d107d8f
> > >  #3 [ffffc9001cbf8ef8] watchdog_timer_fn at ffffffff9d243511
> > >  #4 [ffffc9001cbf8f28] __hrtimer_run_queues at ffffffff9d1e62ff
> > >  #5 [ffffc9001cbf8f80] hrtimer_interrupt at ffffffff9d1e73d4
> > >  #6 [ffffc9001cbf8fd8] __sysvec_apic_timer_interrupt at ffffffff9d07bb29
> > >  #7 [ffffc9001cbf8ff0] sysvec_apic_timer_interrupt at ffffffff9dd689f9
> > > --- ---
> > >  #8 [ffffc900460a7c28] asm_sysvec_apic_timer_interrupt at ffffffff9de00e86
> > >     [exception RIP: slab_free_freelist_hook.constprop.0+107]
> > >     RIP: ffffffff9d3ef74b  RSP: ffffc900460a7cd0  RFLAGS: 00000286
> > >     RAX: ffff89ea4de06b00  RBX: ffff89ea4de06a00  RCX: ffff89ea4de06a00
> > >     RDX: 0000000000000100  RSI: ffffc900460a7d28  RDI: ffff888100044c80
> > >     RBP: ffff888100044c80   R8: 0000000000000000   R9: ffffffffc21e8500
> > >     R10: ffff88c867e93200  R11: 0000000000000001  R12: ffff89ea4de06a00
> > >     R13: ffffc900460a7d28  R14: ffff89ea4de06a00  R15: ffffc900460a7d30
> > >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > >  #9 [ffffc900460a7d18] __kmem_cache_free at ffffffff9d3f65a0
> > > #10 [ffffc900460a7d70] xlog_cil_committed at ffffffffc21e85af [xfs]
> > > #11 [ffffc900460a7da0] xlog_cil_process_committed at ffffffffc21e9747 [xfs]
> > > #12 [ffffc900460a7dd0] xlog_state_do_iclog_callbacks at ffffffffc21e41eb [xfs]
> > > #13 [ffffc900460a7e28] xlog_state_do_callback at ffffffffc21e436f [xfs]
> > > #14 [ffffc900460a7e50] xlog_ioend_work at ffffffffc21e6e1c [xfs]
> > > #15 [ffffc900460a7e70] process_one_work at ffffffff9d12de69
> > > #16 [ffffc900460a7ea8] worker_thread at ffffffff9d12e79b
> > > #17 [ffffc900460a7ef8] kthread at ffffffff9d1378fc
> > > #18 [ffffc900460a7f30] ret_from_fork at ffffffff9d042dd0
> > > #19 [ffffc900460a7f50] ret_from_fork_asm at ffffffff9d007e2b
> > >
> > > This patch adds cond_resched() to avoid softlockups similar to the
> > > one described above.
> >
> > You've elided the soft lockup messages that tell us how long this
> > task was holding the CPU. What is the soft lockup timer set to?
> > What workload causes this to happen? How do we reproduce it?
>
> Thanks for your reply. :)
>
> The soft lockup timer is set to 20s, and the CPU was held for 22s.

Yep, something else must be screwed up here - an iclog completion
should never have enough items attached to it that it takes this long
to process them without yielding the CPU.

We can only loop once around the iclogs in this path these days,
because the iclog we are completing runs xlog_ioend_work() with the
iclog->ic_sema held. This locks out new IO being issued on that iclog
whilst we are processing the completion, and hence the iclogbuf ring
will stall trying to write new items to this iclog.

Hence whilst we are processing an iclog completion, the entire journal
(and hence filesystem) can stall waiting for the completion to finish
and release the iclog->ic_sema.

This also means we can have, at most, 8 iclogs worth of journal writes
to complete in xlog_ioend_work(). That greatly limits the number of
items we are processing in the xlog_state_do_callback() loops. Yes, it
can be tens of thousands of items, but it is bound by journal size and
the checkpoint pipeline depth (maximum of 4 checkpoints in flight at
once).

So I don't see how the number of items that we are asking to be
processed during journal completion, by itself, can cause such a long
processing time. We typically process upwards of a thousand items per
millisecond here...

> The workload is a test suite combining stress-ng, LTP, and fio,
> executed concurrently. I believe the issue only reproduces with a
> certain probability.
>
> thanks,
> Jinliang Zheng. :)
>
> > FWIW, yes, there might be several tens of thousands of objects to
> > move to the AIL in this journal IO completion path, but if this
> > takes more than a couple of hundred milliseconds of processing time
> > then there is something else going wrong....
>
> Is it possible that the kernel's CPU and memory were under high load,
> causing each iteration of the loop to take more time, eventually
> accumulating to over 20 seconds?

That's a characteristic behaviour of catastrophic spin lock
contention...

-Dave.
-- 
Dave Chinner
david@fromorbit.com
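For readers following along, here is a minimal sketch of the
serialisation pattern Dave describes above. This is hypothetical,
simplified code, not the actual XFS implementation: struct my_iclog,
my_ioend_work() and run_item_callbacks() are invented names for
illustration, while iclog->ic_sema and cond_resched() are the real
mechanisms under discussion in the thread.

	#include <linux/workqueue.h>
	#include <linux/semaphore.h>
	#include <linux/sched.h>
	#include <linux/list.h>

	struct my_iclog {
		struct work_struct	ic_end_io_work;	/* queued from bio completion */
		struct semaphore	ic_sema;	/* taken at I/O submission */
		struct list_head	ic_callbacks;	/* committed items to process */
	};

	static void run_item_callbacks(struct list_head *callbacks)
	{
		/* walk the committed items, move them to the AIL, free them */
	}

	static void my_ioend_work(struct work_struct *work)
	{
		struct my_iclog *iclog =
			container_of(work, struct my_iclog, ic_end_io_work);

		/*
		 * ic_sema was taken when this iclog was submitted for I/O,
		 * so no new journal write can reuse the buffer until we
		 * release it below. That serialisation is what bounds the
		 * completion work here to the small, fixed number of
		 * iclogs in the ring.
		 */
		run_item_callbacks(&iclog->ic_callbacks);

		/*
		 * The patch under discussion adds a scheduling point like
		 * this so a long callback run cannot trip the soft lockup
		 * watchdog on a non-preemptible kernel.
		 */
		cond_resched();

		up(&iclog->ic_sema);	/* unblock writers waiting on this iclog */
	}

The design point Dave's argument rests on is visible here: because
ic_sema is held from submission until the up() at the end of
completion, the amount of callback work is naturally bounded by the
iclog ring size, so the added cond_resched() is a scheduling safety
net rather than a fix for unbounded work.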