From: Jiaqi Yan
Date: Fri, 4 Oct 2024 11:32:14 -0700
Subject: Re: [RFC PATCH v1 0/2] Userspace Can Control Memory Failure Recovery
To: Jason Gunthorpe
Cc: nao.horiguchi@gmail.com, linmiaohe@huawei.com, tony.luck@intel.com,
 wangkefeng.wang@huawei.com, jane.chu@oracle.com, akpm@linux-foundation.org,
 osalvador@suse.de, rientjes@google.com, duenwen@google.com,
 jthoughton@google.com, ankita@nvidia.com, peterx@redhat.com,
 linux-mm@kvack.org
In-Reply-To: <20241003231957.GE1365916@nvidia.com>
References: <20240924043924.3562257-1-jiaqiyan@google.com>
 <20241002150217.GR1365916@nvidia.com>
 <20241003231957.GE1365916@nvidia.com>

On Thu, Oct 3, 2024 at 4:20 PM Jason Gunthorpe wrote:
>
> On Thu, Oct 03, 2024 at 03:45:09PM -0700, Jiaqi Yan wrote:
> > Hi Jason,
> >
> > On Wed, Oct 2, 2024 at 8:02 AM Jason Gunthorpe wrote:
> > >
> > > On Tue, Sep 24, 2024 at 04:39:18AM +0000, Jiaqi Yan wrote:
> > >
> > > > So far I personally prefer the global MFR policy but open to feedbacks to both
> > > > options, or new ideas.
> > >
> > > Why?
> > > It seems more natural that only processes that can handle the
> > > SIGBUS semantics would opt into them?
> >
> > Are you suggesting you prefer the per-VMA policy, or proposing a new
> > "per-process policy" added via prctl? By "per-process", I imagine the
> > policy to keep or offline the poisoned page will apply to all its
> > VMAs?
>
> I'm just asking why you "personally prefer" as the direction seems a
> bit awkward

I assume the "awkward" part is the concern about what userspace will do
if the kernel is configured to keep poisoned pages. Admittedly, this
direction has the highest return on investment for me, as we already
have memory failure recovery and repair in userspace that work well
with poisoned pages not being offlined until the hardware is repaired.
But I don't assume that is also the case for everyone else, so I also
want to propose alternatives (limited to just a VMA, or to memory owned
by a process, and limited to their lifetime) that I hope will work for
more people.

>
> Jason