From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 1 Jul 2026 04:33:26 -0400
From: "Michael S. Tsirkin"
To: "David Hildenbrand (Arm)"
Cc: linux-kernel@vger.kernel.org, Miaohe Lin, Naoya Horiguchi,
 Andrew Morton, Oscar Salvador, Andi Kleen, Hidehiro Kawai, Rik van Riel,
 Vlastimil Babka, Lorenzo Stoakes, "Liam R.
 Howlett", Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
 Brendan Jackman, Johannes Weiner, Zi Yan, Baolin Wang, Nico Pache,
 Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Christoph Lameter,
 David Rientjes, Roman Gushchin, Harry Yoo, Hao Li, Kiryl Shutsemau,
 Byungchul Park, linux-mm@kvack.org, linux-cxl@vger.kernel.org
Subject: Re: [PATCH 0/2] mm: memory-failure: fix HWPoison flag race with non-atomic page flag ops
Message-ID: <20260701043024-mutt-send-email-mst@kernel.org>
References: <4f5ba5d6-246c-4430-9737-e8dd8e4c5142@kernel.org>
 <20260629092856-mutt-send-email-mst@kernel.org>
 <54c8cbee-9b26-458c-93ba-5aa594f5d1e8@kernel.org>
 <20260629174225-mutt-send-email-mst@kernel.org>
 <20260630174852-mutt-send-email-mst@kernel.org>
 <2f884bfa-3cd5-4fba-8aa4-c2e68890ab64@kernel.org>
 <20260701041112-mutt-send-email-mst@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Jul 01, 2026 at 10:26:26AM +0200, David Hildenbrand (Arm) wrote:
> On 7/1/26 10:18, Michael S. Tsirkin wrote:
> > On Wed, Jul 01, 2026 at 10:08:45AM +0200, David Hildenbrand (Arm) wrote:
> >>>
> >>> Yay. I did that + dropped the extra lock/unlock and now it's in the noise in
> >>> my testing. needs much more testing of course.
> >>
> >> Cool. I'd expect that latency-sensitive workloads (PREEMPT_RT) would not want to
> >> have hwpoison handling either way, so using the no_resched variants at these
> >> places might be doable.
> >>
> >>>
> >>> If you want me to post (including addressing your other feedback) let me
> >>> know.
> >>>
> >>
> >> Let's first discuss the options. We essentially have the following one so far:
> >>
> >> 1) Ignore the problem
> >>
> >> It's been there forever ... but I am not quite happy about that.
> >>
> >> 2) Use atomics everywhere
> >>
> >> The easiest+cleanest, but as measured, the performance hit is real.
> >>
> >> 3) Keep retrying for a couple of times
> >>
> >> The big problem is "how long". A CPU in a hypervisor might be stalled for quite
> >> a while (20s? can be longer).
> >
> > So on this idea. It might not matter. What I had in mind is:
> > 1. run the current logic
> > 2. add page to a list of pages to check, then invoke e.g. call_rcu_tasks
> > (or call_rcu_tasks_rude) maybe
> > 3. in the callback, recheck and if poison cleared, go back to 1
> > 4. otherwise everyone will see the bit set, remove from list we are done
> >
> > it seems to not regress anything, and for the rare race, we set
> > the bit eventually.
> >
>
> So test-and-set (and friends) would also have to check the data structure that
> remembers bit to set/clear (and possibly update the data structure).
>
> That does seem doable. Do you have a prototype?

what do you think ;) post it?

> --
> Cheers,
>
> David
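
For readers following the thread, the four-step scheme quoted above, together with the follow-up point that flag tests would also have to consult the list of pending pages, can be modelled in plain userspace C. This is only a rough sketch under loud assumptions, not the prototype discussed here: `struct page_model`, `PG_HWPOISON`, `PG_OTHER`, `poison_set()`, `poison_queue()`, `poison_recheck_all()`, `page_poisoned()` and `racy_set_other()` are all hypothetical names, the model is single-threaded, and a direct call to `poison_recheck_all()` stands in for the callback that would run after a `call_rcu_tasks` grace period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this model; not the kernel's real layout. */
#define PG_HWPOISON (1UL << 0)
#define PG_OTHER    (1UL << 1)

struct page_model {
    unsigned long flags;
    struct page_model *next_pending;
    bool pending;                 /* true while on the recheck list */
};

static struct page_model *pending_head;

/* Step 1: "run the current logic", reduced here to setting the bit. */
static void poison_set(struct page_model *p) { p->flags |= PG_HWPOISON; }

/* Step 2: remember the page so that a later pass rechecks it. */
static void poison_queue(struct page_model *p)
{
    if (p->pending)
        return;
    p->pending = true;
    p->next_pending = pending_head;
    pending_head = p;
}

/*
 * Steps 3 and 4: stand-in for the deferred callback. Returns how many
 * pages had lost the bit and therefore stay queued for another pass.
 */
static size_t poison_recheck_all(void)
{
    struct page_model **link = &pending_head;
    size_t requeued = 0;

    while (*link) {
        struct page_model *p = *link;

        if (p->flags & PG_HWPOISON) {
            *link = p->next_pending;      /* step 4: bit stuck, dequeue */
            p->next_pending = NULL;
            p->pending = false;
        } else {
            poison_set(p);                /* step 3: bit lost, redo step 1 */
            requeued++;
            link = &p->next_pending;
        }
    }
    return requeued;
}

/* A flag test that also consults the pending list. */
static bool page_poisoned(const struct page_model *p)
{
    return (p->flags & PG_HWPOISON) || p->pending;
}

/* Models the race: a non-atomic update writes back a stale flags word. */
static void racy_set_other(struct page_model *p, unsigned long stale)
{
    p->flags = stale | PG_OTHER;
}

/* Usage: replay the lost update and let two recheck passes repair it. */
static int poison_demo(void)
{
    struct page_model pg = {0};
    unsigned long stale = pg.flags;       /* racing CPU loads flags first */

    poison_set(&pg);
    poison_queue(&pg);
    racy_set_other(&pg, stale);           /* stale write-back drops the bit */
    assert(!(pg.flags & PG_HWPOISON));
    assert(page_poisoned(&pg));           /* pending list still reports it */

    assert(poison_recheck_all() == 1);    /* lost, set again, still queued */
    assert(poison_recheck_all() == 0);    /* stable now, dequeued */
    return ((pg.flags & PG_HWPOISON) && !pg.pending) ? 0 : 1;
}
```

In this model a page leaves the list only after a full pass has seen the bit still set, which mirrors "otherwise everyone will see the bit set, remove from list we are done". A real implementation would additionally need locking around the list and an actual grace period between passes, both of which are left out of the sketch.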