Date: Thu, 19 Jun 2025 15:00:03 +0300
From: Mike Rapoport
To: Pasha Tatashin
Cc: Pratyush Yadav, Jason Gunthorpe, jasonmiu@google.com,
	graf@amazon.com, changyuanl@google.com, dmatlack@google.com,
	rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org,
	ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com,
	ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org,
	akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr,
	mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com,
	axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com,
	vincent.guittot@linaro.org, hannes@cmpxchg.org,
	dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
	rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
	zhangguopeng@kylinos.cn, linux@weissschuh.net,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
	bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
	myungjoo.ham@samsung.com, yesanishhere@gmail.com,
	Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
	aleksander.lobakin@intel.com, ira.weiny@intel.com,
	andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
	bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
	stuart.w.hayes@gmail.com
Subject: Re: [RFC v2 05/16] luo: luo_core: integrate with KHO
References: <20250617152357.GB1376515@ziepe.ca>

On Wed, Jun 18, 2025 at 01:43:18PM -0400, Pasha Tatashin wrote:
> > > > > >> >
> > > > > >> > What I meant is that even without KHO_DEBUGFS, LUO drives KHO, but then
> > > > > >> > KHO calls into LUO from the notifier, which makes the control flow
> > > > > >> > somewhat convoluted.
> > > > > >> > If LUO is supposed to be the only thing that
> > > > > >> > interacts directly with KHO, maybe we should get rid of the notifier and
> > > > > >> > only let LUO drive things.
> > > > > >> >
> > > > > >> Yes, we should. I think we should consider the KHO notifiers and self
> > > > > >> orchestration as obsoleted by LUO. That's why it was in debugfs
> > > > > >> because we were not ready to commit to it.
> > > > > >
> > > > > > We could do that, however, there is one example KHO user
> > > > > > `reserve_mem`, that is also not liveupdate related. So, it should
> > > > > > either be removed or modified to be handled by LUO.
> > > > >
> > > > > It still depends on kho_finalize() being called, so it still needs
> > > > > something to trigger its serialization. It is not automatic. And with
> > > > > your proposed patch to make debugfs interface optional, it can't even be
> > > > > used with the config disabled.
> > > >
> > > > At least for now, it can still be used via LUO going into prepare
> > > > state, since LUO changes KHO into finalized state and reserve_mem is
> > > > registered to be called back from KHO.
> > > >
> > > > > So if it must be explicitly triggered to be preserved, why not let the
> > > > > trigger point be LUO instead of KHO? You can make reservemem a LUO
> > > > > subsystem instead.
> > > >
> > > > Yes, LUO can do that, the only concern I raised is that `reserve_mem`
> > > > is not really live update related.
> > >
> > > I only now realized what bothered me about "liveupdate". It's the name of
> > > the driving usecase rather than the name of the technology it implements.
> > > In the end what LUO does is a (more) sophisticated control for KHO.
> > >
> > > But essentially it's not that it actually implements live update, it
> > > provides kexec handover control plane that enables live update.
> > >
> > > And since the same machinery can be used regardless of live update, and I'm
> > > sure other usecases will appear as soon as the technology will become more
> > > mature, it makes me think that we probably should just
> > > s/liveupdate_/kho_control/g or something along those lines.
> >
> > I disagree, LUO is for liveupdate flows, and is designed specifically
> > around the live update flows: brownout/blackout/post-liveupdate, it
> > should not be generalized to anticipate some other random states, and
> > it should only support participants that are related to live update:
> > iommufd/vfiofd/kvmfd/memfd/eventfd and controlled via "liveupdated" the
> > userspace agent.

But that's not how things work. Once there's an API anyone can use it,
right? How do you intend to restrict this API usage to subsystems that
are related to the live update flow? Or prevent userspace from driving
the ioctls outside the "liveupdated" user agent?

There are a lot of examples of kernel subsystems that were designed for
a particular thing and later were extended to support additional use
cases. I'm not saying LUO should "anticipate some other random states";
what I'm saying is that usecases other than liveupdate may appear and
use the APIs LUO provides for something else.

> > KHO is for preserving memory, LUO uses KHO as a backbone for Live Update.

If we make LUO the only uABI to drive KHO, it becomes misnamed from the
start. As you mentioned yourself, reserve_mem and potentially IMA and
kexec telemetry are not necessarily related to LUO, but it still would
be useful to support them without LUO.

While it's easy to make memblock a LUO subsystem, to me the naming
seems semantically wrong.

> > > > > Although to be honest, things like reservemem (or IMA perhaps?) don't
> > > > > really fit well with the explicit trigger mechanism. They can be carried
> > > >
> > > > Agreed.
> > > > Another example I was thinking about is "kexec telemetry":
> > > > precise time information about kexec, including shutdown, purgatory,
> > > > boot. We are planning to propose kexec telemetry, and it could be LUO
> > > > subsystem. On the other hand, it could be useful even without live
> > > > update, just to measure precise kexec reboot time.
> > > >
> > > > > across kexec without needing userspace explicitly driving it. Maybe we
> > > > > allow LUO subsystems to mark themselves as auto-preservable and LUO will
> > > > > preserve them regardless of state being prepared? Something to think
> > > > > about later down the line I suppose.
> > > >
> > > > We can start with adding `reserve_mem` as regular subsystem, and make
> > > > this auto-preserve option a future expansion, when and if needed.
> > > > Presumably, `luoctl prepare` would work for whoever plans to use just
> > > > `reserve_mem`.
> > >
> > > I think it would be nice to support auto-preserve sooner than later.
> >
> > Makes sense.
> >
> > > reserve_mem can already be useful for ftrace and pstore folks and if it
> > > would survive a kexec without any userspace intervention it would be great.
> >
> > The pstore use case is only potential, correct? Or can it already use
> > reserve_mem?

pstore can use reserve_mem already.

> So currently, KHO provides the following two types of internal API:
>
> Preserve memory and metadata
> =========================
> kho_preserve_folio() / kho_preserve_phys()
> kho_unpreserve_folio() / kho_unpreserve_phys()
> kho_restore_folio()
>
> kho_add_subtree() kho_retrieve_subtree()
>
> State machine
> ===========
> register_kho_notifier() / unregister_kho_notifier()
>
> kho_finalize() / kho_abort()
>
> We should remove the "State machine", and only keep the "Preserve
> Memory" API functions. At the time these functions are called, KHO
> should do the magic of making sure that the memory gets preserved
> across the reboot.
>
> This way, reserve_mem_init() would call: kho_preserve_folio() and
> kho_add_subtree() during boot, and be done with it.

Right, but we still need something to drive kho_mem_serialize(). And it
has to be done before kexec load, at least until we resolve this.

Currently this is triggered either by KHO debugfs or by LUO ioctls. If
we completely drop KHO debugfs and notifiers, we still need something
that would trigger the magic.

I'm not saying we should keep KHO debugfs and notifiers, I'm saying that
if we make LUO the only thing driving KHO, liveupdate is not an
appropriate name.

> Pasha
>

-- 
Sincerely yours,
Mike.