From: Pasha Tatashin <pasha.tatashin@soleen.com>
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
	changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
	dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
	rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
	kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
	masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
	yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
	chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
	jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
	dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
	rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
	zhangguopeng@kylinos.cn, linux@weissschuh.net,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
	bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
	myungjoo.ham@samsung.com, yesanishhere@gmail.com,
	Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
	aleksander.lobakin@intel.com, ira.weiny@intel.com,
	andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
	bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
	stuart.w.hayes@gmail.com, ptyadav@amazon.de
Subject: [RFC v2 10/16] luo: luo_ioctl: add ioctl interface
Date: Thu, 15 May 2025 18:23:14 +0000
Message-ID: <20250515182322.117840-11-pasha.tatashin@soleen.com>
In-Reply-To: <20250515182322.117840-1-pasha.tatashin@soleen.com>
References: <20250515182322.117840-1-pasha.tatashin@soleen.com>

Introduce the user-space interface for the Live Update Orchestrator via ioctl
commands, enabling external control over the live update process and
management of preserved resources.

Create a misc character device at /dev/liveupdate. Access to this device
requires the CAP_SYS_ADMIN capability.

A new UAPI header, include/uapi/linux/liveupdate.h, defines the necessary
structures. The magic number is registered in
Documentation/userspace-api/ioctl/ioctl-number.rst.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 .../userspace-api/ioctl/ioctl-number.rst |   1 +
 drivers/misc/liveupdate/Makefile         |   1 +
 drivers/misc/liveupdate/luo_ioctl.c      | 199 ++++++++++++
 include/linux/liveupdate.h               |  34 +-
 include/uapi/linux/liveupdate.h          | 300 ++++++++++++++++++
 5 files changed, 502 insertions(+), 33 deletions(-)
 create mode 100644 drivers/misc/liveupdate/luo_ioctl.c
 create mode 100644 include/uapi/linux/liveupdate.h

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 7a1409ecc238..279c124048f2 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -375,6 +375,7 @@ Code  Seq#    Include File                 Comments
 0xB8  01-02  uapi/misc/mrvl_cn10k_dpi.h   Marvell CN10K DPI driver
 0xB8  all    uapi/linux/mshv.h            Microsoft Hyper-V /dev/mshv driver
+0xBA  all    uapi/linux/liveupdate.h
 0xC0  00-0F  linux/usb/iowarrior.h
 0xCA  00-0F  uapi/misc/cxl.h              Dead since 6.15
 0xCA  10-2F  uapi/misc/ocxl.h
diff --git a/drivers/misc/liveupdate/Makefile b/drivers/misc/liveupdate/Makefile
index b4cdd162574f..7a0cd08919c9 100644
--- a/drivers/misc/liveupdate/Makefile
+++ b/drivers/misc/liveupdate/Makefile
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
+obj-y += luo_ioctl.o
 obj-y += luo_core.o
 obj-y += luo_files.o
 obj-y += luo_subsystems.o
diff --git a/drivers/misc/liveupdate/luo_ioctl.c b/drivers/misc/liveupdate/luo_ioctl.c
new file mode 100644
index 000000000000..76c687ff650b
--- /dev/null
+++ b/drivers/misc/liveupdate/luo_ioctl.c
@@ -0,0 +1,199 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+/**
+ * DOC: LUO ioctl Interface
+ *
+ * The IOCTL user-space control interface for the LUO subsystem.
+ * It registers a misc character device, typically found at ``/dev/liveupdate``,
+ * which allows privileged userspace applications (requiring %CAP_SYS_ADMIN) to
+ * manage and monitor the LUO state machine and associated resources like
+ * preservable file descriptors.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "luo_internal.h"
+
+static int luo_ioctl_fd_preserve(struct liveupdate_fd *luo_fd)
+{
+	struct file *file;
+	int ret;
+
+	file = fget(luo_fd->fd);
+	if (!file) {
+		pr_err("Bad file descriptor\n");
+		return -EBADF;
+	}
+
+	ret = luo_register_file(&luo_fd->token, file);
+	if (ret)
+		fput(file);
+
+	return ret;
+}
+
+static int luo_ioctl_fd_unpreserve(u64 token)
+{
+	return luo_unregister_file(token);
+}
+
+static int luo_ioctl_fd_restore(struct liveupdate_fd *luo_fd)
+{
+	struct file *file;
+	int ret;
+	int fd;
+
+	fd = get_unused_fd_flags(O_CLOEXEC);
+	if (fd < 0) {
+		pr_err("Failed to allocate new fd: %d\n", fd);
+		return fd;
+	}
+
+	ret = luo_retrieve_file(luo_fd->token, &file);
+	if (ret < 0) {
+		put_unused_fd(fd);
+
+		return ret;
+	}
+
+	fd_install(fd, file);
+	luo_fd->fd = fd;
+
+	return 0;
+}
+
+static int luo_open(struct inode *inodep, struct file *filep)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	if (filep->f_flags & O_EXCL)
+		return -EINVAL;
+
+	return 0;
+}
+
+static long luo_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
+{
+	void __user *argp = (void __user *)arg;
+	struct liveupdate_fd luo_fd;
+	enum liveupdate_state state;
+	int ret = 0;
+	u64 token;
+
+	if (_IOC_TYPE(cmd) != LIVEUPDATE_IOCTL_TYPE)
+		return -ENOTTY;
+
+	switch (cmd) {
+	case LIVEUPDATE_IOCTL_GET_STATE:
+		state = READ_ONCE(luo_state);
+		if (copy_to_user(argp, &state, sizeof(state)))
+			ret = -EFAULT;
+		break;
+
+	case LIVEUPDATE_IOCTL_EVENT_PREPARE:
+		ret = luo_prepare();
+		break;
+
+	case LIVEUPDATE_IOCTL_EVENT_FREEZE:
+		ret = luo_freeze();
+		break;
+
+	case LIVEUPDATE_IOCTL_EVENT_FINISH:
+		ret = luo_finish();
+		break;
+
+	case LIVEUPDATE_IOCTL_EVENT_CANCEL:
+		ret = luo_cancel();
+		break;
+
+	case LIVEUPDATE_IOCTL_FD_PRESERVE:
+		if (copy_from_user(&luo_fd, argp, sizeof(luo_fd))) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = luo_ioctl_fd_preserve(&luo_fd);
+		if (!ret && copy_to_user(argp, &luo_fd, sizeof(luo_fd)))
+			ret = -EFAULT;
+		break;
+
+	case LIVEUPDATE_IOCTL_FD_UNPRESERVE:
+		if (copy_from_user(&token, argp, sizeof(u64))) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = luo_ioctl_fd_unpreserve(token);
+		break;
+
+	case LIVEUPDATE_IOCTL_FD_RESTORE:
+		if (copy_from_user(&luo_fd, argp, sizeof(luo_fd))) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = luo_ioctl_fd_restore(&luo_fd);
+		if (!ret && copy_to_user(argp, &luo_fd, sizeof(luo_fd)))
+			ret = -EFAULT;
+		break;
+
+	default:
+		pr_warn("ioctl: unknown command nr: 0x%x\n", _IOC_NR(cmd));
+		ret = -ENOTTY;
+		break;
+	}
+
+	return ret;
+}
+
+static const struct file_operations fops = {
+	.owner = THIS_MODULE,
+	.open = luo_open,
+	.unlocked_ioctl = luo_ioctl,
+};
+
+static struct miscdevice liveupdate_miscdev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "liveupdate",
+	.fops = &fops,
+};
+
+static int __init liveupdate_init(void)
+{
+	int err;
+
+	err = misc_register(&liveupdate_miscdev);
+	if (err < 0) {
+		pr_err("Failed to register misc device '%s': %d\n",
+		       liveupdate_miscdev.name, err);
+	}
+
+	return err;
+}
+module_init(liveupdate_init);
+
+static void __exit liveupdate_exit(void)
+{
+	misc_deregister(&liveupdate_miscdev);
+}
+module_exit(liveupdate_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Pasha Tatashin");
+MODULE_DESCRIPTION("Live Update Orchestrator");
+MODULE_VERSION("0.1");
diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index 7afe0aac5ce4..ff4f2ab5c673 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * enum liveupdate_event - Events that trigger live update callbacks.
@@ -53,39 +54,6 @@ enum liveupdate_event {
 	LIVEUPDATE_CANCEL,
 };
 
-/**
- * enum liveupdate_state - Defines the possible states of the live update
- * orchestrator.
- * @LIVEUPDATE_STATE_NORMAL:    Default state, no live update in progress.
- * @LIVEUPDATE_STATE_PREPARED:  Live update is prepared for reboot; the
- *                              LIVEUPDATE_PREPARE callbacks have completed
- *                              successfully.
- *                              Devices might operate in a limited state
- *                              for example the participating devices might
- *                              not be allowed to unbind, and also the
- *                              setting up of new DMA mappings might be
- *                              disabled in this state.
- * @LIVEUPDATE_STATE_FROZEN:    The final reboot event
- *                              (%LIVEUPDATE_FREEZE) has been sent, and the
- *                              system is performing its final state saving
- *                              within the "blackout window". User
- *                              workloads must be suspended. The actual
- *                              reboot (kexec) into the next kernel is
- *                              imminent.
- * @LIVEUPDATE_STATE_UPDATED:   The system has rebooted into the next
- *                              kernel via live update the system is now
- *                              running the next kernel, awaiting the
- *                              finish event.
- *
- * These states track the progress and outcome of a live update operation.
- */
-enum liveupdate_state {
-	LIVEUPDATE_STATE_NORMAL = 0,
-	LIVEUPDATE_STATE_PREPARED = 1,
-	LIVEUPDATE_STATE_FROZEN = 2,
-	LIVEUPDATE_STATE_UPDATED = 3,
-};
-
 /* Forward declaration needed if definition isn't included */
 struct file;
 
diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdate.h
new file mode 100644
index 000000000000..c673d08a29ea
--- /dev/null
+++ b/include/uapi/linux/liveupdate.h
@@ -0,0 +1,300 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+/*
+ * Userspace interface for /dev/liveupdate
+ * Live Update Orchestrator
+ *
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#ifndef _UAPI_LIVEUPDATE_H
+#define _UAPI_LIVEUPDATE_H
+
+#include
+#include
+
+/**
+ * enum liveupdate_state - Defines the possible states of the live update
+ * orchestrator.
+ * @LIVEUPDATE_STATE_NORMAL:    Default state, no live update in progress.
+ * @LIVEUPDATE_STATE_PREPARED:  Live update is prepared for reboot; the
+ *                              LIVEUPDATE_PREPARE callbacks have completed
+ *                              successfully.
+ *                              Devices might operate in a limited state;
+ *                              for example, the participating devices might
+ *                              not be allowed to unbind, and also the
+ *                              setting up of new DMA mappings might be
+ *                              disabled in this state.
+ * @LIVEUPDATE_STATE_FROZEN:    The final reboot event
+ *                              (%LIVEUPDATE_FREEZE) has been sent, and the
+ *                              system is performing its final state saving
+ *                              within the "blackout window". User
+ *                              workloads must be suspended. The actual
+ *                              reboot (kexec) into the next kernel is
+ *                              imminent.
+ * @LIVEUPDATE_STATE_UPDATED:   The system has rebooted into the next
+ *                              kernel via live update; the system is now
+ *                              running the next kernel, awaiting the
+ *                              finish event.
+ *
+ * These states track the progress and outcome of a live update operation.
+ */
+enum liveupdate_state {
+	LIVEUPDATE_STATE_NORMAL = 0,
+	LIVEUPDATE_STATE_PREPARED = 1,
+	LIVEUPDATE_STATE_FROZEN = 2,
+	LIVEUPDATE_STATE_UPDATED = 3,
+};
+
+/**
+ * struct liveupdate_fd - Holds parameters for preserving and restoring file
+ * descriptors across live update.
+ * @fd:    Input for %LIVEUPDATE_IOCTL_FD_PRESERVE: The user-space file
+ *         descriptor to be preserved.
+ *         Output for %LIVEUPDATE_IOCTL_FD_RESTORE: The new file descriptor
+ *         representing the fully restored kernel resource.
+ * @flags: Unused, reserved for future expansion, must be set to 0.
+ * @token: Output for %LIVEUPDATE_IOCTL_FD_PRESERVE: An opaque, unique token
+ *         generated by the kernel representing the successfully preserved
+ *         resource state.
+ *         Input for %LIVEUPDATE_IOCTL_FD_RESTORE: The token previously
+ *         returned by the preserve ioctl for the resource to be restored.
+ *
+ * This structure is used as the argument for the %LIVEUPDATE_IOCTL_FD_PRESERVE
+ * and %LIVEUPDATE_IOCTL_FD_RESTORE ioctls. These ioctls allow specific types
+ * of file descriptors (for example memfd, kvm, iommufd, and VFIO) to have their
+ * underlying kernel state preserved across a live update cycle.
+ *
+ * To preserve an FD, user space passes this struct to
+ * %LIVEUPDATE_IOCTL_FD_PRESERVE with the @fd field set. On success, the
+ * kernel populates the @token field.
+ *
+ * After the live update transition, user space passes the struct populated with
+ * the *same* @token to %LIVEUPDATE_IOCTL_FD_RESTORE. The kernel uses the @token
+ * to find the preserved state and, on success, populates the @fd field with a
+ * new file descriptor referring to the fully restored resource.
+ */
+struct liveupdate_fd {
+	int fd;
+	__u32 flags;
+	__u64 token;
+};
+
+/* The ioctl type, documented in ioctl-number.rst */
+#define LIVEUPDATE_IOCTL_TYPE 0xBA
+
+/**
+ * LIVEUPDATE_IOCTL_FD_PRESERVE - Validate and initiate preservation for a file
+ * descriptor.
+ *
+ * Argument: Pointer to &struct liveupdate_fd.
+ *
+ * User sets the @fd field identifying the file descriptor to preserve
+ * (e.g., memfd, kvm, iommufd, VFIO). The kernel validates if this FD type
+ * and its dependencies are supported for preservation. If validation passes,
+ * the kernel marks the FD internally and *initiates the process* of preparing
+ * its state for saving. The actual snapshotting of the state typically occurs
+ * during the subsequent %LIVEUPDATE_IOCTL_EVENT_PREPARE execution phase, though
+ * some finalization might occur during %LIVEUPDATE_IOCTL_EVENT_FREEZE.
+ * On successful validation and initiation, the kernel populates the @token
+ * field with an opaque identifier representing the resource being preserved.
+ * This token confirms the FD is targeted for preservation and is required for
+ * the subsequent %LIVEUPDATE_IOCTL_FD_RESTORE call after the live update. This
+ * is an I/O read/write operation.
+ *
+ * Return: 0 on success (validation passed, preservation initiated), negative
+ * error code on failure (e.g., unsupported FD type, dependency issue,
+ * validation failed).
+ */
+#define LIVEUPDATE_IOCTL_FD_PRESERVE \
+	_IOWR(LIVEUPDATE_IOCTL_TYPE, 0x00, struct liveupdate_fd)
+
+/**
+ * LIVEUPDATE_IOCTL_FD_UNPRESERVE - Remove a file descriptor from the
+ * preservation list.
+ *
+ * Argument: Pointer to __u64 token.
+ *
+ * Allows user space to explicitly remove a file descriptor from the set of
+ * items marked as potentially preservable. User space provides a pointer to the
+ * __u64 @token that was previously returned by a successful
+ * %LIVEUPDATE_IOCTL_FD_PRESERVE call (potentially from a prior, possibly
+ * cancelled, live update attempt). The kernel reads the token value from the
+ * provided user-space address.
+ *
+ * On success, the kernel removes the corresponding entry (identified by the
+ * token value read from the user pointer) from its internal preservation list.
+ * The provided @token (representing the now-removed entry) becomes invalid
+ * after this call.
+ *
+ * This operation can only be called when the live update orchestrator is in the
+ * %LIVEUPDATE_STATE_NORMAL state.
+ *
+ * This is an I/O write operation (_IOW), signifying the kernel reads data (the
+ * token) from the user-provided pointer.
+ *
+ * Return: 0 on success, negative error code on failure (e.g., -EBUSY or -EINVAL
+ * if not in %LIVEUPDATE_STATE_NORMAL, bad address provided, invalid token value
+ * read, token not found).
+ */
+#define LIVEUPDATE_IOCTL_FD_UNPRESERVE \
+	_IOW(LIVEUPDATE_IOCTL_TYPE, 0x01, __u64)
+
+/**
+ * LIVEUPDATE_IOCTL_FD_RESTORE - Restore a previously preserved file descriptor.
+ *
+ * Argument: Pointer to &struct liveupdate_fd.
+ *
+ * User sets the @token field to the value obtained from a successful
+ * %LIVEUPDATE_IOCTL_FD_PRESERVE call before the live update. On success,
+ * the kernel restores the state (saved during the PREPARE/FREEZE phases)
+ * associated with the token and populates the @fd field with a new file
+ * descriptor referencing the restored resource in the current (new) kernel.
+ * This operation must be performed *before* signaling completion via
+ * %LIVEUPDATE_IOCTL_EVENT_FINISH. This is an I/O read/write operation.
+ *
+ * Return: 0 on success, negative error code on failure (e.g., invalid token).
+ */
+#define LIVEUPDATE_IOCTL_FD_RESTORE \
+	_IOWR(LIVEUPDATE_IOCTL_TYPE, 0x02, struct liveupdate_fd)
+
+/**
+ * LIVEUPDATE_IOCTL_GET_STATE - Query the current state of the live update
+ * orchestrator.
+ *
+ * Argument: Pointer to &enum liveupdate_state.
+ *
+ * The kernel fills the enum value pointed to by the argument with the current
+ * state of the live update subsystem. Possible states are:
+ *
+ * - %LIVEUPDATE_STATE_NORMAL:   Default state; no live update operation is
+ *                               currently in progress.
+ * - %LIVEUPDATE_STATE_PREPARED: The preparation phase (triggered by
+ *                               %LIVEUPDATE_IOCTL_EVENT_PREPARE) has completed
+ *                               successfully. The system is ready for the
+ *                               reboot transition initiated by
+ *                               %LIVEUPDATE_IOCTL_EVENT_FREEZE. Note that some
+ *                               device operations (e.g., unbinding, new DMA
+ *                               mappings) might be restricted in this state.
+ * - %LIVEUPDATE_STATE_UPDATED:  The system has successfully rebooted into the
+ *                               new kernel via live update. It is now running
+ *                               the new kernel code and is awaiting the
+ *                               completion signal from user space via
+ *                               %LIVEUPDATE_IOCTL_EVENT_FINISH after
+ *                               restoration tasks are done.
+ *
+ * See the definition of &enum liveupdate_state for more details on each state.
+ * This is an I/O read operation (kernel writes to the user-provided pointer).
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+#define LIVEUPDATE_IOCTL_GET_STATE \
+	_IOR(LIVEUPDATE_IOCTL_TYPE, 0x03, enum liveupdate_state)
+
+/**
+ * LIVEUPDATE_IOCTL_EVENT_PREPARE - Initiate preparation phase and trigger state
+ * saving.
+ *
+ * Argument: None.
+ *
+ * Initiates the live update preparation phase. This action corresponds to
+ * the internal %LIVEUPDATE_PREPARE kernel event and can also be triggered
+ * by writing '1' to ``/sys/kernel/liveupdate/prepare``. This typically
+ * triggers the main state saving process for items marked via the PRESERVE
+ * ioctls. This occurs *before* the main "blackout window", while user
+ * applications (e.g., VMs) may still be running. Kernel subsystems
+ * receiving the %LIVEUPDATE_PREPARE event should serialize necessary state.
+ * This command does not transfer data.
+ *
+ * Return: 0 on success, negative error code on failure. Transitions state
+ * towards %LIVEUPDATE_STATE_PREPARED on success.
+ */
+#define LIVEUPDATE_IOCTL_EVENT_PREPARE \
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x04)
+
+/**
+ * LIVEUPDATE_IOCTL_EVENT_FREEZE - Notify subsystems of imminent reboot
+ * transition.
+ *
+ * Argument: None.
+ *
+ * Notifies the live update subsystem and associated components that the kernel
+ * is about to execute the final reboot transition into the new kernel (e.g.,
+ * via kexec). This action triggers the internal %LIVEUPDATE_FREEZE kernel
+ * event. This event provides subsystems a final, brief opportunity (within the
+ * "blackout window") to save critical state or perform last-moment quiescing.
+ * Any remaining or deferred state saving for items marked via the PRESERVE
+ * ioctls typically occurs in response to the %LIVEUPDATE_FREEZE event.
+ *
+ * This ioctl should only be called when the system is in the
+ * %LIVEUPDATE_STATE_PREPARED state. This command does not transfer data.
+ *
+ * Return: 0 if the notification is successfully processed by the kernel (but
+ * reboot follows). Returns a negative error code if the notification fails
+ * or if the system is not in the %LIVEUPDATE_STATE_PREPARED state.
+ */
+#define LIVEUPDATE_IOCTL_EVENT_FREEZE \
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x05)
+
+/**
+ * LIVEUPDATE_IOCTL_EVENT_CANCEL - Cancel the live update preparation phase.
+ *
+ * Argument: None.
+ *
+ * Notifies the live update subsystem to abort the preparation sequence
+ * potentially initiated by %LIVEUPDATE_IOCTL_EVENT_PREPARE. This action
+ * typically corresponds to the internal %LIVEUPDATE_CANCEL kernel event,
+ * which might also be triggered automatically if the PREPARE stage fails
+ * internally.
+ *
+ * When triggered, subsystems receiving the %LIVEUPDATE_CANCEL event should
+ * revert any state changes or actions taken specifically for the aborted
+ * prepare phase (e.g., discard partially serialized state). The kernel
+ * releases resources allocated specifically for this *aborted preparation
+ * attempt*.
+ *
+ * This operation cancels the current *attempt* to prepare for a live update
+ * but does **not** remove previously validated items from the internal list
+ * of potentially preservable resources. Consequently, preservation tokens
+ * previously generated by successful %LIVEUPDATE_IOCTL_FD_PRESERVE calls
+ * generally **remain valid** as identifiers for those potentially preservable
+ * resources. However, since the system state returns towards
+ * %LIVEUPDATE_STATE_NORMAL, user space must initiate a new live update sequence
+ * (starting with %LIVEUPDATE_IOCTL_EVENT_PREPARE) to proceed with an update
+ * using these (or other) tokens.
+ *
+ * This command does not transfer data. Kernel callbacks for the
+ * %LIVEUPDATE_CANCEL event must not fail.
+ *
+ * Return: 0 on success, negative error code on failure. Transitions state back
+ * towards %LIVEUPDATE_STATE_NORMAL on success.
+ */
+#define LIVEUPDATE_IOCTL_EVENT_CANCEL \
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x06)
+
+/**
+ * LIVEUPDATE_IOCTL_EVENT_FINISH - Signal restoration completion and trigger
+ * cleanup.
+ *
+ * Argument: None.
+ *
+ * Signals that user space has completed all necessary restoration actions in
+ * the new kernel (after a live update reboot). This action corresponds to the
+ * internal %LIVEUPDATE_FINISH kernel event and may also be triggerable via
+ * sysfs (e.g., writing '1' to ``/sys/kernel/liveupdate/finish``).
+ * Calling this ioctl triggers the cleanup phase: any resources that were
+ * successfully preserved but were *not* subsequently restored (reclaimed) via
+ * the RESTORE ioctls will have their preserved state discarded and associated
+ * kernel resources released. Involved devices may be reset. All desired
+ * restorations *must* be completed *before* this. Kernel callbacks for the
+ * %LIVEUPDATE_FINISH event must not fail. Successfully completing this phase
+ * transitions the system state from %LIVEUPDATE_STATE_UPDATED back to
+ * %LIVEUPDATE_STATE_NORMAL. This command does not transfer data.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+#define LIVEUPDATE_IOCTL_EVENT_FINISH \
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x07)
+
+#endif /* _UAPI_LIVEUPDATE_H */
-- 
2.49.0.1101.gccaa498523-goog
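
For reviewers who want to exercise the interface from user space, the preserve/restore flow documented in the UAPI header might look roughly like the sketch below. It is not part of the patch. The struct and ioctl numbers are copied from the proposed include/uapi/linux/liveupdate.h so the snippet builds without the patched headers installed; the helper names (luo_preserve_fd, luo_restore_fd, luo_event) are hypothetical and exist only for illustration.

```c
/* Illustrative user-space sketch only; not part of this patch. */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Local mirror of the proposed UAPI definitions (assumed to stay unchanged). */
struct liveupdate_fd {
	int fd;
	uint32_t flags;		/* reserved, must be 0 */
	uint64_t token;
};
static_assert(sizeof(struct liveupdate_fd) == 16, "unexpected struct layout");

#define LIVEUPDATE_IOCTL_TYPE		0xBA
#define LIVEUPDATE_IOCTL_FD_PRESERVE \
	_IOWR(LIVEUPDATE_IOCTL_TYPE, 0x00, struct liveupdate_fd)
#define LIVEUPDATE_IOCTL_FD_RESTORE \
	_IOWR(LIVEUPDATE_IOCTL_TYPE, 0x02, struct liveupdate_fd)
#define LIVEUPDATE_IOCTL_EVENT_PREPARE	_IO(LIVEUPDATE_IOCTL_TYPE, 0x04)
#define LIVEUPDATE_IOCTL_EVENT_FINISH	_IO(LIVEUPDATE_IOCTL_TYPE, 0x07)

/* Hypothetical helper: mark @fd for preservation and return its token. */
static inline int luo_preserve_fd(int luo, int fd, uint64_t *token)
{
	struct liveupdate_fd arg = { .fd = fd, .flags = 0, .token = 0 };

	if (ioctl(luo, LIVEUPDATE_IOCTL_FD_PRESERVE, &arg) < 0)
		return -errno;
	*token = arg.token;
	return 0;
}

/* Hypothetical helper: returns the restored fd, or a negative errno. */
static inline int luo_restore_fd(int luo, uint64_t token)
{
	struct liveupdate_fd arg = { .fd = -1, .flags = 0, .token = token };

	if (ioctl(luo, LIVEUPDATE_IOCTL_FD_RESTORE, &arg) < 0)
		return -errno;
	return arg.fd;
}

/* Hypothetical helper for the argument-less EVENT_* commands. */
static inline int luo_event(int luo, unsigned long cmd)
{
	return ioctl(luo, cmd) < 0 ? -errno : 0;
}
```

A caller would open /dev/liveupdate (with CAP_SYS_ADMIN), call luo_preserve_fd() for each descriptor, and then issue luo_event(luo, LIVEUPDATE_IOCTL_EVENT_PREPARE). In the new kernel it would call luo_restore_fd() for each saved token before luo_event(luo, LIVEUPDATE_IOCTL_EVENT_FINISH). How the tokens are carried across the reboot is left to user space and is not shown here.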