From: madvenka@linux.microsoft.com
To: gregkh@linuxfoundation.org, pbonzini@redhat.com, rppt@kernel.org,
	jgowans@amazon.com, graf@amazon.de, arnd@arndb.de,
	keescook@chromium.org, stanislav.kinsburskii@gmail.com,
	anthony.yznaga@oracle.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, madvenka@linux.microsoft.com,
	jamorris@linux.microsoft.com
Subject: [RFC PATCH v1 07/10] mm/prmem: Implement named Persistent Instances.
Date: Mon, 16 Oct 2023 18:32:12 -0500
Message-Id: <20231016233215.13090-8-madvenka@linux.microsoft.com>
In-Reply-To: <20231016233215.13090-1-madvenka@linux.microsoft.com>
References: <1b1bc25eb87355b91fcde1de7c2f93f38abb2bf9>
	<20231016233215.13090-1-madvenka@linux.microsoft.com>

From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>

To persist any data, a consumer needs to do the following:

	- Create a persistent instance for it. The instance gets recorded
	  in the metadata.

	- Name the instance.

	- Record the instance data in the instance.

	- Retrieve the instance by name after kexec.

	- Retrieve instance data.

Implement the following API for consumers:

prmem_get(subsystem, name, create)

	Get/Create a persistent instance. The consumer provides the name
	of the subsystem and the name of the instance within the subsystem.
	E.g., for a persistent ramdisk block device:

		subsystem = "ramdisk"
		instance  = "pram0"

prmem_set_data()

	Record a data pointer and a size for the instance. An instance may
	contain many data structures connected to each other using pointers,
	etc. A consumer is expected to record the top level data structure
	in the instance. All other data structures must be reachable from
	the top level data structure.

prmem_get_data()

	Retrieve the data pointer and the size for the instance.

prmem_put()

	Destroy a persistent instance. The instance data must be NULL at
	this point. So, the consumer is responsible for freeing the
	instance data and setting it to NULL in the instance prior to
	destroying.

prmem_list()

	Walk the instances of a subsystem and call a callback for each.
	This allows a consumer to enumerate all of the instances associated
	with a subsystem.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
---
 include/linux/prmem.h         |  36 +++++++++
 kernel/prmem/Makefile         |   2 +-
 kernel/prmem/prmem_init.c     |   1 +
 kernel/prmem/prmem_instance.c | 139 ++++++++++++++++++++++++++++++++++
 4 files changed, 177 insertions(+), 1 deletion(-)
 create mode 100644 kernel/prmem/prmem_instance.c

diff --git a/include/linux/prmem.h b/include/linux/prmem.h
index 1cb4660cf35e..c7034690f7cb 100644
--- a/include/linux/prmem.h
+++ b/include/linux/prmem.h
@@ -50,6 +50,28 @@ struct prmem_region {
 	struct gen_pool_chunk *chunk;
 };
 
+#define PRMEM_MAX_NAME	32
+
+/*
+ * To persist any data, a persistent instance is created for it and the data is
+ * "remembered" in the instance.
+ *
+ * node		List node
+ * subsystem	Subsystem/driver/module that created the instance. E.g.,
+ *		"ramdisk" for the ramdisk driver.
+ * name		Instance name within the subsystem/driver/module. E.g., "pram0"
+ *		for a persistent ramdisk instance.
+ * data		Pointer to data. E.g., the radix tree of pages in a ram disk.
+ * size		Size of data.
+ */
+struct prmem_instance {
+	struct list_head node;
+	char subsystem[PRMEM_MAX_NAME];
+	char name[PRMEM_MAX_NAME];
+	void *data;
+	size_t size;
+};
+
 #define PRMEM_MAX_CACHES	14
 
 /*
@@ -63,6 +85,8 @@ struct prmem_region {
  *
  * regions	List of memory regions.
  *
+ * instances	Persistent instances.
+ *
  * caches	Caches for different object sizes. For allocations smaller than
  *		PAGE_SIZE, these caches are used.
  */
@@ -74,6 +98,9 @@ struct prmem {
 	/* Persistent Regions. */
 	struct list_head regions;
 
+	/* Persistent Instances. */
+	struct list_head instances;
+
 	/* Allocation caches. */
 	void *caches[PRMEM_MAX_CACHES];
 };
@@ -85,6 +112,8 @@ extern size_t prmem_size;
 extern bool prmem_inited;
 extern spinlock_t prmem_lock;
 
+typedef int (*prmem_list_func_t)(struct prmem_instance *instance, void *arg);
+
 /* Kernel API. */
 void prmem_reserve_early(void);
 void prmem_reserve(void);
@@ -98,6 +127,13 @@ void prmem_free_pages(struct page *pages, unsigned int order);
 void *prmem_alloc(size_t size, gfp_t gfp);
 void prmem_free(void *va, size_t size);
 
+/* Persistent Instance API. */
+void *prmem_get(char *subsystem, char *name, bool create);
+void prmem_set_data(struct prmem_instance *instance, void *data, size_t size);
+void prmem_get_data(struct prmem_instance *instance, void **data, size_t *size);
+bool prmem_put(struct prmem_instance *instance);
+int prmem_list(char *subsystem, prmem_list_func_t func, void *arg);
+
 /* Internal functions. */
 struct prmem_region *prmem_add_region(unsigned long pa, size_t size);
 bool prmem_create_pool(struct prmem_region *region, bool new_region);
diff --git a/kernel/prmem/Makefile b/kernel/prmem/Makefile
index 99bb19f0afd3..0ed7976580d6 100644
--- a/kernel/prmem/Makefile
+++ b/kernel/prmem/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-y += prmem_parse.o prmem_reserve.o prmem_init.o prmem_region.o prmem_misc.o
-obj-y += prmem_allocator.o
+obj-y += prmem_allocator.o prmem_instance.o
diff --git a/kernel/prmem/prmem_init.c b/kernel/prmem/prmem_init.c
index d23833d296fe..166fca688ab3 100644
--- a/kernel/prmem/prmem_init.c
+++ b/kernel/prmem/prmem_init.c
@@ -21,6 +21,7 @@ void __init prmem_init(void)
 	prmem->metadata = prmem_metadata;
 	prmem->size = prmem_size;
 	INIT_LIST_HEAD(&prmem->regions);
+	INIT_LIST_HEAD(&prmem->instances);
 
 	if (!prmem_add_region(prmem_pa, prmem_size))
 		return;
diff --git a/kernel/prmem/prmem_instance.c b/kernel/prmem/prmem_instance.c
new file mode 100644
index 000000000000..ee3554d0ab8b
--- /dev/null
+++ b/kernel/prmem/prmem_instance.c
@@ -0,0 +1,139 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Persistent-Across-Kexec memory (prmem) - Persistent instances.
+ *
+ * Copyright (C) 2023 Microsoft Corporation
+ * Author: Madhavan T. Venkataraman (madvenka@linux.microsoft.com)
+ */
+#include <linux/prmem.h>
+
+static struct prmem_instance *prmem_find(char *subsystem, char *name)
+{
+	struct prmem_instance *instance;
+
+	list_for_each_entry(instance, &prmem->instances, node) {
+		if (!strcmp(instance->subsystem, subsystem) &&
+		    !strcmp(instance->name, name)) {
+			return instance;
+		}
+	}
+	return NULL;
+}
+
+void *prmem_get(char *subsystem, char *name, bool create)
+{
+	int subsystem_len = strlen(subsystem);
+	int name_len = strlen(name);
+	struct prmem_instance *instance;
+
+	/*
+	 * In early boot, you are allowed to get an existing instance. But
+	 * you are not allowed to create one until prmem is fully initialized.
+	 */
+	if (!prmem || (!prmem_inited && create))
+		return NULL;
+
+	if (!subsystem_len || subsystem_len >= PRMEM_MAX_NAME ||
+	    !name_len || name_len >= PRMEM_MAX_NAME) {
+		return NULL;
+	}
+
+	spin_lock(&prmem_lock);
+
+	/* Check if it already exists. */
+	instance = prmem_find(subsystem, name);
+	if (instance || !create)
+		goto unlock;
+
+	instance = prmem_alloc_locked(sizeof(*instance));
+	if (!instance)
+		goto unlock;
+
+	strcpy(instance->subsystem, subsystem);
+	strcpy(instance->name, name);
+	instance->data = NULL;
+	instance->size = 0;
+
+	list_add_tail(&instance->node, &prmem->instances);
+unlock:
+	spin_unlock(&prmem_lock);
+	return instance;
+}
+EXPORT_SYMBOL_GPL(prmem_get);
+
+void prmem_set_data(struct prmem_instance *instance, void *data, size_t size)
+{
+	if (!prmem_inited)
+		return;
+
+	spin_lock(&prmem_lock);
+	instance->data = data;
+	instance->size = size;
+	spin_unlock(&prmem_lock);
+}
+EXPORT_SYMBOL_GPL(prmem_set_data);
+
+void prmem_get_data(struct prmem_instance *instance, void **data, size_t *size)
+{
+	if (!prmem)
+		return;
+
+	spin_lock(&prmem_lock);
+	*data = instance->data;
+	*size = instance->size;
+	spin_unlock(&prmem_lock);
+}
+EXPORT_SYMBOL_GPL(prmem_get_data);
+
+bool prmem_put(struct prmem_instance *instance)
+{
+	if (!prmem_inited)
+		return true;
+
+	spin_lock(&prmem_lock);
+
+	if (instance->data) {
+		/*
+		 * Caller is responsible for freeing instance data and setting
+		 * it to NULL.
+		 */
+		spin_unlock(&prmem_lock);
+		return false;
+	}
+
+	/* Free instance. */
+	list_del(&instance->node);
+	prmem_free_locked(instance, sizeof(*instance));
+
+	spin_unlock(&prmem_lock);
+	return true;
+}
+EXPORT_SYMBOL_GPL(prmem_put);
+
+int prmem_list(char *subsystem, prmem_list_func_t func, void *arg)
+{
+	int subsystem_len = strlen(subsystem);
+	struct prmem_instance *instance;
+	int ret = 0;
+
+	if (!prmem)
+		return 0;
+
+	if (!subsystem_len || subsystem_len >= PRMEM_MAX_NAME)
+		return -EINVAL;
+
+	spin_lock(&prmem_lock);
+
+	list_for_each_entry(instance, &prmem->instances, node) {
+		if (strcmp(instance->subsystem, subsystem))
+			continue;
+
+		ret = func(instance, arg);
+		if (ret)
+			break;
+	}
+
+	spin_unlock(&prmem_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(prmem_list);
-- 
2.25.1
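
Illustration only, not part of the patch: the consumer steps from the commit message (create and name an instance, record its data, retrieve it by name, free the data, destroy the instance) can be modeled in user space. The sketch below is an assumption-laden stand-in, not the kernel implementation. All demo_* names are hypothetical, malloc()/calloc() replace the persistent allocator, a singly linked list replaces list_head, and there is no locking and no kexec.

```c
/*
 * User-space model of the persistent instance lifecycle. NOT kernel code:
 * no locking, and calloc() stands in for the persistent allocator.
 * All demo_* names are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_NAME 32

struct demo_instance {
	struct demo_instance *next;
	char subsystem[DEMO_MAX_NAME];
	char name[DEMO_MAX_NAME];
	void *data;
	size_t size;
};

static struct demo_instance *demo_instances;

/* Look up an instance by (subsystem, name); optionally create it. */
static struct demo_instance *demo_get(const char *subsystem, const char *name,
				      bool create)
{
	size_t slen = strlen(subsystem), nlen = strlen(name);
	struct demo_instance *inst;

	if (!slen || slen >= DEMO_MAX_NAME || !nlen || nlen >= DEMO_MAX_NAME)
		return NULL;

	for (inst = demo_instances; inst; inst = inst->next) {
		if (!strcmp(inst->subsystem, subsystem) &&
		    !strcmp(inst->name, name))
			return inst;
	}
	if (!create)
		return NULL;

	inst = calloc(1, sizeof(*inst));
	if (!inst)
		return NULL;
	strcpy(inst->subsystem, subsystem);	/* lengths checked above */
	strcpy(inst->name, name);
	inst->next = demo_instances;
	demo_instances = inst;
	return inst;
}

/* Destroy an instance; refuse while it still references data. */
static bool demo_put(struct demo_instance *inst)
{
	struct demo_instance **pp;

	if (inst->data)
		return false;
	for (pp = &demo_instances; *pp; pp = &(*pp)->next) {
		if (*pp == inst) {
			*pp = inst->next;
			free(inst);
			return true;
		}
	}
	return false;
}

/* Walk the consumer steps; returns 0 if every step behaves as expected. */
static int demo_lifecycle(void)
{
	struct demo_instance *inst, *again;
	int *payload = malloc(sizeof(*payload));

	assert(payload);
	*payload = 42;

	inst = demo_get("ramdisk", "pram0", true);	/* create and name */
	if (!inst)
		return -1;
	inst->data = payload;				/* record top level data */
	inst->size = sizeof(*payload);

	again = demo_get("ramdisk", "pram0", false);	/* retrieve by name */
	if (again != inst || *(int *)again->data != 42)
		return -1;

	if (demo_put(inst))				/* must fail: data still set */
		return -1;

	free(inst->data);				/* consumer frees its data */
	inst->data = NULL;
	inst->size = 0;
	if (!demo_put(inst))				/* now the instance can go */
		return -1;

	return demo_get("ramdisk", "pram0", false) ? -1 : 0;
}
```

The model keeps two properties of the patch: names of PRMEM_MAX_NAME bytes or longer are rejected so that strcpy() cannot overflow, and destroying an instance fails until the consumer has freed its data and cleared the pointer.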