Date: Mon, 8 Jun 2026 16:17:49 -0400
From: "Michael S. Tsirkin"
To: Lorenzo Stoakes
Cc: linux-kernel@vger.kernel.org, "David Hildenbrand (Arm)", Jason Wang, Xuan Zhuo, Eugenio Pérez, Muchun Song, Oscar Salvador, Andrew Morton, "Liam R.
Howlett", Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Johannes Weiner, Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Hugh Dickins, Matthew Brost, Joshua Hahn, Rakie Kim, Byungchul Park, Gregory Price, Ying Huang, Alistair Popple, Christoph Lameter, David Rientjes, Roman Gushchin, Harry Yoo, Axel Rasmussen, Yuanchu Xie, Wei Xu, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, virtualization@lists.linux.dev, linux-mm@kvack.org, Andrea Arcangeli, Miaohe Lin
Subject: Re: [PATCH v10 02/37] mm: memory-failure: serialize TestSetPageHWPoison with zone->lock
Message-ID: <20260608160954-mutt-send-email-mst@kernel.org>
References: <20260608094153-mutt-send-email-mst@kernel.org>

On Mon, Jun 08, 2026 at 03:14:51PM +0100, Lorenzo Stoakes wrote:
> On Mon, Jun 08, 2026 at 09:48:34AM -0400, Michael S. Tsirkin wrote:
> > On Mon, Jun 08, 2026 at 10:43:21AM +0100, Lorenzo Stoakes wrote:
> > > On Mon, Jun 08, 2026 at 04:34:23AM -0400, Michael S. Tsirkin wrote:
> > > > TestSetPageHWPoison() is called without zone->lock, so its atomic
> > > > update to page->flags can race with non-atomic flag operations
> > > > that run under zone->lock in the buddy allocator.
> > > >
> > > > In particular, __free_pages_prepare() does:
> > > >
> > > >     page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;
> > > >
> > > > This non-atomic read-modify-write, while correctly excluding
> > > > __PG_HWPOISON from the mask, can still lose a concurrent
> > > > TestSetPageHWPoison if the read happens before the poison bit
> > > > is set and the write happens after. Follow-up patches in this
> > > > series add similar non-atomic flag operations as well.
> > > >
> > > > Fix by acquiring zone->lock around TestSetPageHWPoison and
> > > > around ClearPageHWPoison in the retry path. This
> > > > serializes with all buddy flag manipulation. The cost is
> > > > negligible: one lock/unlock in an extremely rare path
> > > > (hardware memory errors).
> > > >
> > > > Note: SetPageHWPoison and TestClearPageHWPoison calls elsewhere
> > > > in this file operate on pages already removed from the buddy
> > > > allocator or on non-buddy pages (DAX, hugetlb), so they do not
> > > > need zone->lock protection.
> > > >
> > > > Acked-by: Miaohe Lin
> > > > Signed-off-by: Michael S. Tsirkin
> > >
> > > Can we have Fixes: and Cc: stable and also send this separately please?
> > >
> > > These patches seem like unrelated fixups that you've discovered along the way,
> > > and don't belong as part of the already rather large series, unless I'm missing
> > > something here.
> > >
> > > Thanks, Lorenzo
> >
> > I think you are missing that they are a dependency, not unrelated.
>
> Then say so.
>
> > For example, this issue gets worse with the patchset as there are more
> > places that manipulate flags without atomics. No?
>
> It's your job to make that case, not mine.
>
> >
> >
> > You are welcome to send this to stable, but I think stable rules
> > preclude theoretical bugfixes.
>
> It's a dependency but also theoretical?

As in, the race is extremely hard to trigger and I have no idea
if it triggers for anyone, but it's obvious from reading the code that
theoretically it exists? Yes.

> >
> > As for Fixes: the issue has been there for decades. I wouldn't know
> > what to attribute it for.
>
> Again, your job.

Alright, if you insist:

Fixes: 6a46079cf57a ("HWPOISON: The high level memory error handler in the VM v7")

now everyone running 2.6 kernels will backport this fix, I presume.

> >
> >
> > I guess I could send these separately, too, why not. Not sure
> > what this accomplishes, but hey. But is that an ack? You want
> > this fix merged even before the feature?
>
> I already made the case as to why, as have other maintainers.
>
> If you need to review what an ack looks like please consult
> https://docs.kernel.org/process/5.Posting.html
>
> Thanks, Lorenzo

I am merely asking if you want this patch in the set including all these
nits I had to fix.

--
MST
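
The lost update described in the quoted commit message can be modelled
in userspace roughly as below. This is only a sketch, not kernel code:
it is single-threaded and replays the problematic interleaving by hand,
and the DEMO_* bit values and demo_* function names are made up for
illustration (they are not the kernel's real flag layout or API).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit layout, for illustration only. */
#define DEMO_PG_HWPOISON          (1UL << 0)
/* Mask cleared on the free path; deliberately excludes the poison bit. */
#define DEMO_FLAGS_CHECK_AT_PREP  (1UL << 1)

/*
 * Racy interleaving: the free path reads flags, the poison setter
 * lands in between, then the free path writes back its stale value.
 */
static bool demo_poison_survives_racy(unsigned long flags)
{
	unsigned long snapshot = flags;               /* free path: read   */

	flags |= DEMO_PG_HWPOISON;                    /* poison path: set  */
	flags = snapshot & ~DEMO_FLAGS_CHECK_AT_PREP; /* free path: write  */

	return (flags & DEMO_PG_HWPOISON) != 0;
}

/*
 * Serialized: each path runs to completion before the other starts,
 * which is what taking a common lock around both would guarantee.
 */
static bool demo_poison_survives_serialized(unsigned long flags,
					    bool poison_first)
{
	if (poison_first)
		flags |= DEMO_PG_HWPOISON;

	flags &= ~DEMO_FLAGS_CHECK_AT_PREP;

	if (!poison_first)
		flags |= DEMO_PG_HWPOISON;

	return (flags & DEMO_PG_HWPOISON) != 0;
}

/* Small self-check that can be called from a test harness or main(). */
static void demo_self_check(void)
{
	/* The mask never contains the poison bit, yet the bit is lost. */
	assert(!demo_poison_survives_racy(DEMO_FLAGS_CHECK_AT_PREP));
	/* With either serialized ordering the bit is preserved. */
	assert(demo_poison_survives_serialized(DEMO_FLAGS_CHECK_AT_PREP, true));
	assert(demo_poison_survives_serialized(DEMO_FLAGS_CHECK_AT_PREP, false));
}
```

The point of the model is that excluding the poison bit from the mask
is not enough: the store writes back every bit of the stale snapshot,
so the only protection is to keep the two paths from interleaving.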