From: Al Viro <viro@zeniv.linux.org.uk>
To: viro@zeniv.linux.org.uk
Cc: brauner@kernel.org, jack@suse.cz, linux-fsdevel@vger.kernel.org
Subject: [PATCH 10/11] alloc_fdtable(): change calling conventions.
Date: Mon, 12 Aug 2024 07:44:26 +0100 [thread overview]
Message-ID: <20240812064427.240190-10-viro@zeniv.linux.org.uk> (raw)
In-Reply-To: <20240812064427.240190-1-viro@zeniv.linux.org.uk>
First of all, tell it how many slots do we want, not which slot
is wanted. It makes one caller (dup_fd()) more straightforward
and doesn't harm another (expand_fdtable()).
Furthermore, make it return ERR_PTR() on failure rather than
returning NULL. Simplifies the callers.
Simplify the size calculation, while we are at it - note that we
always have slots_wanted greater than BITS_PER_LONG. What the
rules boil down to is
* use the smallest power of two large enough to give us
that many slots
* on 32bit skip 64 and 128 - the minimal capacity we want
there is 256 slots (i.e. 1Kb fd array).
* on 64bit don't skip anything, the minimal capacity is
128 - and we'll never be asked for 64 or less. 128 slots means
1Kb fd array, again.
* on 128bit, if that ever happens, don't skip anything -
we'll never be asked for 128 or less, so the fd array allocation
will be at least 2Kb.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
fs/file.c | 58 ++++++++++++++++++-------------------------------------
1 file changed, 19 insertions(+), 39 deletions(-)
diff --git a/fs/file.c b/fs/file.c
index 1ee85e061ade..01cef75ef132 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -89,18 +89,11 @@ static void copy_fdtable(struct fdtable *nfdt, struct fdtable *ofdt)
* 'unsigned long' in some places, but simply because that is how the Linux
* kernel bitmaps are defined to work: they are not "bits in an array of bytes",
* they are very much "bits in an array of unsigned long".
- *
- * The ALIGN(nr, BITS_PER_LONG) here is for clarity: since we just multiplied
- * by that "1024/sizeof(ptr)" before, we already know there are sufficient
- * clear low bits. Clang seems to realize that, gcc ends up being confused.
- *
- * On a 128-bit machine, the ALIGN() would actually matter. In the meantime,
- * let's consider it documentation (and maybe a test-case for gcc to improve
- * its code generation ;)
*/
-static struct fdtable * alloc_fdtable(unsigned int nr)
+static struct fdtable *alloc_fdtable(unsigned int slots_wanted)
{
struct fdtable *fdt;
+ unsigned int nr;
void *data;
/*
@@ -110,20 +103,22 @@ static struct fdtable * alloc_fdtable(unsigned int nr)
* the fdarray into comfortable page-tuned chunks: starting at 1024B
* and growing in powers of two from there on.
*/
- nr /= (1024 / sizeof(struct file *));
- nr = roundup_pow_of_two(nr + 1);
- nr *= (1024 / sizeof(struct file *));
- nr = ALIGN(nr, BITS_PER_LONG);
+ if (IS_ENABLED(CONFIG_32BIT) && slots_wanted < 256)
+ nr = 256;
+ else
+ nr = roundup_pow_of_two(slots_wanted);
/*
* Note that this can drive nr *below* what we had passed if sysctl_nr_open
- * had been set lower between the check in expand_files() and here. Deal
- * with that in caller, it's cheaper that way.
+ * had been set lower between the check in expand_files() and here.
*
* We make sure that nr remains a multiple of BITS_PER_LONG - otherwise
* bitmaps handling below becomes unpleasant, to put it mildly...
*/
- if (unlikely(nr > sysctl_nr_open))
- nr = ((sysctl_nr_open - 1) | (BITS_PER_LONG - 1)) + 1;
+ if (unlikely(nr > sysctl_nr_open)) {
+ nr = round_down(sysctl_nr_open, BITS_PER_LONG);
+ if (nr < slots_wanted)
+ return ERR_PTR(-EMFILE);
+ }
fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL_ACCOUNT);
if (!fdt)
@@ -152,7 +147,7 @@ static struct fdtable * alloc_fdtable(unsigned int nr)
out_fdt:
kfree(fdt);
out:
- return NULL;
+ return ERR_PTR(-ENOMEM);
}
/*
@@ -169,7 +164,7 @@ static int expand_fdtable(struct files_struct *files, unsigned int nr)
struct fdtable *new_fdt, *cur_fdt;
spin_unlock(&files->file_lock);
- new_fdt = alloc_fdtable(nr);
+ new_fdt = alloc_fdtable(nr + 1);
/* make sure all fd_install() have seen resize_in_progress
* or have finished their rcu_read_lock_sched() section.
@@ -178,16 +173,8 @@ static int expand_fdtable(struct files_struct *files, unsigned int nr)
synchronize_rcu();
spin_lock(&files->file_lock);
- if (!new_fdt)
- return -ENOMEM;
- /*
- * extremely unlikely race - sysctl_nr_open decreased between the check in
- * caller and alloc_fdtable(). Cheaper to catch it here...
- */
- if (unlikely(new_fdt->max_fds <= nr)) {
- __free_fdtable(new_fdt);
- return -EMFILE;
- }
+ if (IS_ERR(new_fdt))
+ return PTR_ERR(new_fdt);
cur_fdt = files_fdtable(files);
BUG_ON(nr < cur_fdt->max_fds);
copy_fdtable(new_fdt, cur_fdt);
@@ -357,16 +344,9 @@ struct files_struct *dup_fd(struct files_struct *oldf, unsigned int max_fds, int
if (new_fdt != &newf->fdtab)
__free_fdtable(new_fdt);
- new_fdt = alloc_fdtable(open_files - 1);
- if (!new_fdt) {
- *errorp = -ENOMEM;
- goto out_release;
- }
-
- /* beyond sysctl_nr_open; nothing to do */
- if (unlikely(new_fdt->max_fds < open_files)) {
- __free_fdtable(new_fdt);
- *errorp = -EMFILE;
+ new_fdt = alloc_fdtable(open_files);
+ if (IS_ERR(new_fdt)) {
+ *errorp = PTR_ERR(new_fdt);
goto out_release;
}
--
2.39.2
next prev parent reply other threads:[~2024-08-12 6:44 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-12 6:42 [PATCHES] fs/file.c stuff Al Viro
2024-08-12 6:44 ` [PATCH 01/11] get rid of ...lookup...fdget_rcu() family Al Viro
2024-08-12 6:44 ` [PATCH 02/11] remove pointless includes of <linux/fdtable.h> Al Viro
2024-08-12 9:25 ` Christian Brauner
2024-08-12 6:44 ` [PATCH 03/11] close_files(): don't bother with xchg() Al Viro
2024-08-12 7:56 ` [PATCH] close_files(): reimplement based on do_close_on_exec() Mateusz Guzik
2024-08-14 5:24 ` Al Viro
2024-08-14 5:34 ` Mateusz Guzik
2024-08-12 6:44 ` [PATCH 04/11] proc_fd_getattr(): don't bother with S_ISDIR() check Al Viro
2024-08-12 6:44 ` [PATCH 05/11] move close_range(2) into fs/file.c, fold __close_range() into it Al Viro
2024-08-12 6:44 ` [PATCH 06/11] sane_fdtable_size(): don't bother looking at descriptors we are not going to copy Al Viro
2024-08-12 9:30 ` Christian Brauner
2024-08-12 6:44 ` [PATCH 07/11] fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd() Al Viro
2024-08-12 6:44 ` [PATCH 08/11] fs/file.c: conditionally clear full_fds Al Viro
2024-08-12 6:44 ` [PATCH 09/11] fs/file.c: add fast path in find_next_fd() Al Viro
2024-08-12 6:44 ` Al Viro [this message]
2024-08-12 9:35 ` [PATCH 10/11] alloc_fdtable(): change calling conventions Christian Brauner
2024-08-12 6:44 ` [PATCH 11/11] dup_fd(): " Al Viro
2024-08-12 9:32 ` Christian Brauner
2024-08-12 9:24 ` [PATCH 01/11] get rid of ...lookup...fdget_rcu() family Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240812064427.240190-10-viro@zeniv.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=brauner@kernel.org \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).