From: Mateusz Guzik
To: brauner@kernel.org
Cc: viro@zeniv.linux.org.uk, jack@suse.cz, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, adobriyan@gmail.com, Mateusz Guzik
Subject: [PATCH v3 3/3] fs: cache the string generated by reading /proc/filesystems
Date: Sun, 26 Apr 2026 00:08:44 +0200
Message-ID: <20260425220844.1763933-4-mjguzik@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260425220844.1763933-1-mjguzik@gmail.com>
References: <20260425220844.1763933-1-mjguzik@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

/proc/filesystems is read surprisingly often (e.g., by mkdir, ls and even
sed!).

Each read is lock-protected pointer chasing over a linked list, paying for
an sprintf for every registered fs (32 on my boxen).

Cache the result instead.

While here, mark the file as permanent to avoid spurious ref trips in
procfs.

Signed-off-by: Mateusz Guzik
---
 fs/filesystems.c | 156 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 154 insertions(+), 2 deletions(-)

diff --git a/fs/filesystems.c b/fs/filesystems.c
index 7976366d4197..771fc31a69b8 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -31,6 +31,36 @@
 static HLIST_HEAD(file_systems);
 static DEFINE_SPINLOCK(file_systems_lock);
 
+#ifdef CONFIG_PROC_FS
+/*
+ * Cache a stringified version of the filesystem list.
+ *
+ * The fs list gets queried a lot by userspace because of libselinux, including
+ * rather surprising programs (would you guess *sed* is on the list?). In order
+ * to reduce the overhead we cache the resulting string, which normally stays
+ * below 512 bytes in size.
+ *
+ * As the list almost never changes, its creation is not particularly optimized
+ * to keep things simple.
+ *
+ * The string is built lazily on read so that fs registration gains no failure
+ * point (in principle we may be unable to allocate memory for the string).
+ */
+struct file_systems_string {
+	struct rcu_head rcu;
+	unsigned long gen;
+	size_t len;
+	char string[];
+};
+
+static unsigned long file_systems_gen;
+static struct file_systems_string __rcu *file_systems_string;
+
+static void invalidate_filesystems_string(void);
+#else
+static inline void invalidate_filesystems_string(void) { }
+#endif
+
 /* WARNING: This can be used only if we _already_ own a reference */
 struct file_system_type *get_filesystem(struct file_system_type *fs)
 {
@@ -80,6 +110,7 @@ int register_filesystem(struct file_system_type *fs)
 	if (find_filesystem(fs->name, strlen(fs->name)))
 		return -EBUSY;
 	hlist_add_tail_rcu(&fs->list, &file_systems);
+	invalidate_filesystems_string();
 	return 0;
 }
 EXPORT_SYMBOL(register_filesystem);
@@ -101,6 +132,7 @@ int unregister_filesystem(struct file_system_type *fs)
 		if (hlist_unhashed(&fs->list))
 			return -EINVAL;
 		hlist_del_init_rcu(&fs->list);
+		invalidate_filesystems_string();
 	}
 	synchronize_rcu();
 	return 0;
@@ -209,7 +241,103 @@ int __init list_bdev_fs_names(char *buf, size_t size)
 }
 
 #ifdef CONFIG_PROC_FS
-static int filesystems_proc_show(struct seq_file *m, void *v)
+static void invalidate_filesystems_string(void)
+{
+	struct file_systems_string *old;
+
+	lockdep_assert_held_write(&file_systems_lock);
+	file_systems_gen++;
+	old = rcu_replace_pointer(file_systems_string, NULL,
+				  lockdep_is_held(&file_systems_lock));
+	if (old)
+		kfree_rcu(old, rcu);
+}
+
+static __cold noinline int regen_filesystems_string(void)
+{
+	struct file_system_type *p;
+	struct file_systems_string *old, *new;
+	size_t newlen, usedlen;
+	unsigned long gen;
+
+retry:
+	newlen = 0;
+
+	/* pre-calc space for each fs */
+	spin_lock(&file_systems_lock);
+	gen = file_systems_gen;
+	hlist_for_each_entry_rcu(p, &file_systems, list) {
+		if (!(p->fs_flags & FS_REQUIRES_DEV))
+			newlen += strlen("nodev");
+		newlen += strlen("\t") + strlen(p->name) + strlen("\n");
+	}
+	spin_unlock(&file_systems_lock);
+
+	new = kmalloc(offsetof(struct file_systems_string, string) + newlen + 1,
+		      GFP_KERNEL);
+	if (!new)
+		return -ENOMEM;
+
+	new->gen = gen;
+	new->len = newlen;
+	new->string[newlen] = '\0';
+
+	spin_lock(&file_systems_lock);
+	old = file_systems_string;
+
+	/*
+	 * Did someone beat us to it?
+	 */
+	if (old && old->gen == file_systems_gen) {
+		spin_unlock(&file_systems_lock);
+		kfree(new);
+		return 0;
+	}
+
+	/*
+	 * Did the list change in the meantime?
+	 */
+	if (gen != file_systems_gen) {
+		spin_unlock(&file_systems_lock);
+		kfree(new);
+		goto retry;
+	}
+
+	/*
+	 * Populate the string.
+	 *
+	 * We know we have just enough space because we calculated the right
+	 * size the previous time we had the lock and confirmed the list has
+	 * not changed after reacquiring it.
+	 */
+	usedlen = 0;
+	hlist_for_each_entry_rcu(p, &file_systems, list) {
+		usedlen += sprintf(&new->string[usedlen], "%s\t%s\n",
+				   (p->fs_flags & FS_REQUIRES_DEV) ? "" : "nodev",
+				   p->name);
+	}
+
+	if (WARN_ON_ONCE(new->len != strlen(new->string))) {
+		/*
+		 * Should never happen; kept in case someone changes the string
+		 * generation above and gets it wrong.
+		 */
+		spin_unlock(&file_systems_lock);
+		kfree(new);
+		return -EINVAL;
+	}
+
+	/*
+	 * Paired with rcu_dereference() in filesystems_proc_show()
+	 */
+	smp_store_release(&file_systems_string, new);
+	spin_unlock(&file_systems_lock);
+	if (old)
+		kfree_rcu(old, rcu);
+	return 0;
+}
+
+static __cold noinline int filesystems_proc_show_fallback(struct seq_file *m, void *v)
 {
 	struct file_system_type *p;
 
@@ -222,9 +350,33 @@ static int filesystems_proc_show(struct seq_file *m, void *v)
 	return 0;
 }
 
+static int filesystems_proc_show(struct seq_file *m, void *v)
+{
+	struct file_systems_string *fss;
+
+	for (;;) {
+		scoped_guard(rcu) {
+			fss = rcu_dereference(file_systems_string);
+			if (likely(fss)) {
+				seq_write(m, fss->string, fss->len);
+				return 0;
+			}
+		}
+
+		int err = regen_filesystems_string();
+		if (unlikely(err))
+			return filesystems_proc_show_fallback(m, v);
+	}
+}
+
 static int __init proc_filesystems_init(void)
 {
-	proc_create_single("filesystems", 0, NULL, filesystems_proc_show);
+	struct proc_dir_entry *pde;
+
+	pde = proc_create_single("filesystems", 0, NULL, filesystems_proc_show);
+	if (!pde)
+		return -ENOMEM;
+	proc_make_permanent(pde);
 	return 0;
 }
 module_init(proc_filesystems_init);
-- 
2.48.1
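
For readers who want to experiment with the regeneration scheme outside the kernel, here is a minimal userspace sketch of it. This is not part of the patch and makes several assumptions: a pthread mutex stands in for file_systems_lock, a fixed array stands in for the hlist, and readers take the mutex because the sketch has no RCU. All names (fs_register, cache_regen, fs_read, fs_list, list_gen) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these names come from the patch. */
#define MAX_FS 64
struct fs_entry { const char *name; int requires_dev; };
struct cached_string { unsigned long gen; size_t len; char string[]; };

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct fs_entry fs_list[MAX_FS];
static size_t fs_count;
static unsigned long list_gen;      /* bumped on every list change */
static struct cached_string *cache; /* NULL when stale; guarded by list_lock */

/* Register an entry and drop the cache. Allocates nothing, so cannot hit OOM. */
int fs_register(const char *name, int requires_dev)
{
	int ret = -1;

	pthread_mutex_lock(&list_lock);
	if (fs_count < MAX_FS) {
		fs_list[fs_count++] = (struct fs_entry){ name, requires_dev };
		list_gen++;
		free(cache);
		cache = NULL;
		ret = 0;
	}
	pthread_mutex_unlock(&list_lock);
	return ret;
}

/* Size under the lock, allocate unlocked, then revalidate before filling. */
int cache_regen(void)
{
	struct cached_string *new;
	unsigned long gen;
	size_t len, used, i;

retry:
	len = 0;
	pthread_mutex_lock(&list_lock);
	gen = list_gen;
	for (i = 0; i < fs_count; i++)
		len += (fs_list[i].requires_dev ? 0 : strlen("nodev")) +
		       strlen("\t") + strlen(fs_list[i].name) + strlen("\n");
	pthread_mutex_unlock(&list_lock);

	new = malloc(sizeof(*new) + len + 1);
	if (!new)
		return -1;
	new->gen = gen;
	new->len = len;
	new->string[len] = '\0';

	pthread_mutex_lock(&list_lock);
	if (cache && cache->gen == list_gen) {	/* someone beat us to it */
		pthread_mutex_unlock(&list_lock);
		free(new);
		return 0;
	}
	if (gen != list_gen) {			/* list changed, size is stale */
		pthread_mutex_unlock(&list_lock);
		free(new);
		goto retry;
	}
	used = 0;
	for (i = 0; i < fs_count; i++)
		used += (size_t)sprintf(&new->string[used], "%s\t%s\n",
					fs_list[i].requires_dev ? "" : "nodev",
					fs_list[i].name);
	assert(used == len);	/* analogous to the WARN_ON_ONCE() sanity check */
	cache = new;
	pthread_mutex_unlock(&list_lock);
	return 0;
}

/* Copy the cached string into buf; returns its length, or -1 on failure. */
long fs_read(char *buf, size_t size)
{
	long ret = -1;

	for (;;) {
		pthread_mutex_lock(&list_lock);
		if (cache) {
			if (cache->len < size) {
				memcpy(buf, cache->string, cache->len + 1);
				ret = (long)cache->len;
			}
			pthread_mutex_unlock(&list_lock);
			return ret;
		}
		pthread_mutex_unlock(&list_lock);
		if (cache_regen() != 0)
			return -1;
	}
}
```

Every early exit after the second lock acquisition drops the mutex before freeing or retrying. The generation check is what makes it safe to reuse the size computed during the first pass.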