Linux Trace Kernel

Linux Trace Kernel
 help / color / mirror / Atom feed

* Re: [PATCH v2 2/3] lib/bootconfig: check bounds before writing in __xbc_open_brace()
From: Masami Hiramatsu @ 2026-03-13  4:18 UTC (permalink / raw)
  To: Josh Law
  Cc: Andrew Morton, Steven Rostedt, Masami Hiramatsu, Josh Law,
	linux-kernel, linux-trace-kernel
In-Reply-To: <1bd0e36d-0ea0-4a79-bbee-e38de553f1ea@gmail.com>

On Thu, 12 Mar 2026 21:30:10 +0000
Josh Law <hlcj1234567@gmail.com> wrote:

> 12 Mar 2026 21:28:11 Andrew Morton <akpm@linux-foundation.org>:
> 
> > On Thu, 12 Mar 2026 21:09:52 +0000 Josh Law <hlcj1234567@gmail.com> wrote:
> >
> >>> That's a fair point, Steve. Given that brace_index isn't touched elsewhere and the current check effectively prevents the overflow, I agree this isn't strictly necessary. I'll drop this patch and stick with the fix for the off-by-one reporting error instead. Thanks for the feedback!
> >>
> >> Wait Steve,
> >> Thanks for the look. I see your point that it's currently redundant given the call patterns. It looks like Andrew has already merged this into the -mm tree, likely as a 'belt-and-suspenders' safety measure. I'll keep your feedback in mind for future cleanup, but I'm glad we got the other off-by-one fix in as well!
> >
> > Please wordwrap the emails.
> >
> >> And in my opinion, merging it is a decent idea.
> >
> > You've changed your position without explaining why?
> 
> Sorry, I think it should be merged because it's better to be safe than sorry, I know there is different methods of implementation, but this one still works... I know it's churn (and I'm sorry)

I would like to keep this original >= because it is safer.
Andrew, I will pick these patches with my test patch.

Thank you,


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH v2 3/3] lib/bootconfig: fix snprintf truncation check in xbc_node_compose_key_after()
From: Masami Hiramatsu @ 2026-03-13  2:26 UTC (permalink / raw)
  To: Josh Law; +Cc: Andrew Morton, Josh Law, linux-kernel, linux-trace-kernel
In-Reply-To: <20260312191143.28719-4-objecting@objecting.org>

On Thu, 12 Mar 2026 19:11:43 +0000
Josh Law <hlcj1234567@gmail.com> wrote:

> From: Josh Law <objecting@objecting.org>
> 
> snprintf() returns the number of characters that would have been
> written excluding the NUL terminator.  Output is truncated when the
> return value is >= the buffer size, not just > the buffer size.
> 
> When ret == size, the current code takes the non-truncated path,
> advancing buf by ret and reducing size to 0.  This is wrong because
> the output was actually truncated (the last character was replaced by
> NUL).  Fix by using >= so the truncation path is taken correctly.
> 

OK, but this one is a minor issue, because either way, remaining
size becomes 0.

		ret = snprintf(buf, size, "%s%s", xbc_node_get_data(node),
			       depth ? "." : "");
		if (ret < 0)
			return ret;
		if (ret > size) {
			size = 0;
		} else {
			size -= ret; // if ret == size, the size becomes 0
			buf += ret;
		}

Anyway, to be clear the correct error case handling, this
should be applied.

Thank you!

> Signed-off-by: Josh Law <objecting@objecting.org>
> ---
>  lib/bootconfig.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/bootconfig.c b/lib/bootconfig.c
> index 62b4ed7a0ba6..b0ef1e74e98a 100644
> --- a/lib/bootconfig.c
> +++ b/lib/bootconfig.c
> @@ -316,7 +316,7 @@ int __init xbc_node_compose_key_after(struct xbc_node *root,
>  			       depth ? "." : "");
>  		if (ret < 0)
>  			return ret;
> -		if (ret > size) {
> +		if (ret >= size) {
>  			size = 0;
>  		} else {
>  			size -= ret;
> -- 
> 2.34.1
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH v2 2/3] lib/bootconfig: check bounds before writing in __xbc_open_brace()
From: Masami Hiramatsu @ 2026-03-13  2:10 UTC (permalink / raw)
  To: Josh Law; +Cc: Andrew Morton, Josh Law, linux-kernel, linux-trace-kernel
In-Reply-To: <20260312191143.28719-3-objecting@objecting.org>

On Thu, 12 Mar 2026 19:11:42 +0000
Josh Law <hlcj1234567@gmail.com> wrote:

> From: Josh Law <objecting@objecting.org>
> 
> The bounds check for brace_index happens after the array write.
> While the current call pattern prevents an actual out-of-bounds
> access (the previous call would have returned an error), the
> write-before-check pattern is fragile and would become a real
> out-of-bounds write if the error return were ever not propagated.
> 
> Move the bounds check before the array write so the function is
> self-contained and safe regardless of caller behavior.
> 

Hmm good catch. This is a kind of specification mistake.
If we pass below to the bootconfig tool, currently it fails.
----
key1 {
key2 {
key3 {
key4 {
key5 {
key6 {
key7 {
key8 {
key9 {
key10 {
key11 {
key12 {
key13 {
key14 {
key15 {
key16 {
}}}}}}}}}}}}}}}}
----
But since "open_brace[]" array has 16 entries, it should
accept the 16th brace.

Let me add a good and a bad example for this case too.

Thanks,

> Signed-off-by: Josh Law <objecting@objecting.org>
> ---
>  lib/bootconfig.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/bootconfig.c b/lib/bootconfig.c
> index a1e6a2e14b01..62b4ed7a0ba6 100644
> --- a/lib/bootconfig.c
> +++ b/lib/bootconfig.c
> @@ -532,9 +532,9 @@ static char *skip_spaces_until_newline(char *p)
>  static int __init __xbc_open_brace(char *p)
>  {
>  	/* Push the last key as open brace */
> -	open_brace[brace_index++] = xbc_node_index(last_parent);
>  	if (brace_index >= XBC_DEPTH_MAX)
>  		return xbc_parse_error("Exceed max depth of braces", p);
> +	open_brace[brace_index++] = xbc_node_index(last_parent);
>  
>  	return 0;
>  }
> -- 
> 2.34.1
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH v2 1/3] lib/bootconfig: fix off-by-one in xbc_verify_tree() unclosed brace error
From: Masami Hiramatsu @ 2026-03-13  1:39 UTC (permalink / raw)
  To: Josh Law
  Cc: Steven Rostedt, Masami Hiramatsu, Andrew Morton, Josh Law,
	linux-kernel, linux-trace-kernel
In-Reply-To: <48aee348-5859-4d13-aa88-d0c21a4c0289@gmail.com>

On Thu, 12 Mar 2026 21:03:52 +0000
Josh Law <hlcj1234567@gmail.com> wrote:

> 12 Mar 2026 21:02:51 Steven Rostedt <rostedt@goodmis.org>:
> 
> > On Thu, 12 Mar 2026 19:11:41 +0000
> > Josh Law <hlcj1234567@gmail.com> wrote:
> >
> >> From: Josh Law <objecting@objecting.org>
> >>
> >> __xbc_open_brace() pushes entries with post-increment
> >> (open_brace[brace_index++]), so brace_index always points one past
> >> the last valid entry.  xbc_verify_tree() reads open_brace[brace_index]
> >> to report which brace is unclosed, but this is one past the last
> >> pushed entry and contains stale/zero data, causing the error message
> >> to reference the wrong node.
> >>
> >> Use open_brace[brace_index - 1] to correctly identify the unclosed
> >> brace.  brace_index is known to be > 0 here since we are inside the
> >> if (brace_index) guard.
> >>
> >> Signed-off-by: Josh Law <objecting@objecting.org>
> >
> > Nice catch. May I ask how you found this.
> >
> > Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Thanks for patch and review!

> >
> > -- Steve
> >
> >> ---
> >> lib/bootconfig.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/lib/bootconfig.c b/lib/bootconfig.c
> >> index 2bcd5c2aa87e..a1e6a2e14b01 100644
> >> --- a/lib/bootconfig.c
> >> +++ b/lib/bootconfig.c
> >> @@ -802,7 +802,7 @@ static int __init xbc_verify_tree(void)
> >>
> >>     /* Brace closing */
> >>     if (brace_index) {
> >> -       n = &xbc_nodes[open_brace[brace_index]];
> >> +       n = &xbc_nodes[open_brace[brace_index - 1]];
> >>         return xbc_parse_error("Brace is not closed",
> >>                     xbc_node_get_data(n));
> >>     }
> 
> Hi Steve,
> Thanks for the review!
> I found this while doing a manual audit of the bootconfig parser's error handling. I noticed that the post-increment in __xbc_open_brace() didn't seem to align with how xbc_verify_tree() was accessing the index. I verified it by intentionally passing a 
malformed config with an unclosed brace and saw it reporting a 'stale' or incorrect node location

Thanks, I confirmed it with below config.

$ cat samples/bad-non-closed-brace.bconf 
foo {
	bar {
		buz
}

This closed the 2nd `{`, but not close the first one.

Without patch;
$ ./bootconfig samples/bad-non-closed-brace.bconf 
Parse Error: Brace is not closed at 2:2

With this fix;
$ ./bootconfig samples/bad-non-closed-brace.bconf 
Parse Error: Brace is not closed at 1:1

Than you!

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH 07/12] HID: spi_hid: add ACPI support for SPI over HID
From: Jingyuan Liang @ 2026-03-13  1:24 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: Jiri Kosina, Benjamin Tissoires, Jonathan Corbet, Mark Brown,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-input, linux-doc,
	linux-kernel, linux-spi, linux-trace-kernel, devicetree, hbarnor,
	Angela Czubak
In-Reply-To: <abD6RJZa5D7LN3x0@google.com>

On Tue, Mar 10, 2026 at 10:27 PM Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
>
> On Tue, Mar 03, 2026 at 06:12:59AM +0000, Jingyuan Liang wrote:
> > From: Angela Czubak <acz@semihalf.com>
> >
> > Detect SPI HID devices described in ACPI.
> >
> > Signed-off-by: Angela Czubak <acz@semihalf.com>
> > Signed-off-by: Jingyuan Liang <jingyliang@chromium.org>
> > ---
> >  drivers/hid/spi-hid/Kconfig        |  15 +++
> >  drivers/hid/spi-hid/Makefile       |   1 +
> >  drivers/hid/spi-hid/spi-hid-acpi.c | 253 +++++++++++++++++++++++++++++++++++++
> >  drivers/hid/spi-hid/spi-hid-core.c |  27 +---
> >  drivers/hid/spi-hid/spi-hid.h      |  44 +++++++
> >  5 files changed, 316 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/hid/spi-hid/Kconfig b/drivers/hid/spi-hid/Kconfig
> > index 836fdefe8345..114b1e00da39 100644
> > --- a/drivers/hid/spi-hid/Kconfig
> > +++ b/drivers/hid/spi-hid/Kconfig
> > @@ -10,6 +10,21 @@ menuconfig SPI_HID
> >
> >  if SPI_HID
> >
> > +config SPI_HID_ACPI
> > +     tristate "HID over SPI transport layer ACPI driver"
> > +     depends on ACPI
> > +     select SPI_HID_CORE
> > +     help
> > +       Say Y here if you use a keyboard, a touchpad, a touchscreen, or any
> > +       other HID based devices which are connected to your computer via SPI.
> > +       This driver supports ACPI-based systems.
> > +
> > +       If unsure, say N.
> > +
> > +       This support is also available as a module.  If so, the module
> > +       will be called spi-hid-acpi. It will also build/depend on the
> > +       module spi-hid.
> > +
> >  config SPI_HID_CORE
> >       tristate
> >  endif
> > diff --git a/drivers/hid/spi-hid/Makefile b/drivers/hid/spi-hid/Makefile
> > index 92e24cddbfc2..753c7b7a7844 100644
> > --- a/drivers/hid/spi-hid/Makefile
> > +++ b/drivers/hid/spi-hid/Makefile
> > @@ -7,3 +7,4 @@
> >
> >  obj-$(CONFIG_SPI_HID_CORE)   += spi-hid.o
> >  spi-hid-objs                         = spi-hid-core.o
> > +obj-$(CONFIG_SPI_HID_ACPI)   += spi-hid-acpi.o
> > diff --git a/drivers/hid/spi-hid/spi-hid-acpi.c b/drivers/hid/spi-hid/spi-hid-acpi.c
> > new file mode 100644
> > index 000000000000..612e74fe72f9
> > --- /dev/null
> > +++ b/drivers/hid/spi-hid/spi-hid-acpi.c
> > @@ -0,0 +1,253 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * HID over SPI protocol, ACPI related code
> > + *
> > + * Copyright (c) 2021 Microsoft Corporation
> > + * Copyright (c) 2026 Google LLC
> > + *
> > + * This code was forked out of the HID over SPI core code, which is partially
> > + * based on "HID over I2C protocol implementation:
> > + *
> > + * Copyright (c) 2012 Benjamin Tissoires <benjamin.tissoires@gmail.com>
> > + * Copyright (c) 2012 Ecole Nationale de l'Aviation Civile, France
> > + * Copyright (c) 2012 Red Hat, Inc
> > + *
> > + * which in turn is partially based on "USB HID support for Linux":
> > + *
> > + * Copyright (c) 1999 Andreas Gal
> > + * Copyright (c) 2000-2005 Vojtech Pavlik <vojtech@suse.cz>
> > + * Copyright (c) 2005 Michael Haboustak <mike-@cinci.rr.com> for Concept2, Inc
> > + * Copyright (c) 2007-2008 Oliver Neukum
> > + * Copyright (c) 2006-2010 Jiri Kosina
> > + */
> > +
> > +#include <linux/acpi.h>
> > +#include <linux/delay.h>
> > +#include <linux/device.h>
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/reset.h>
> > +#include <linux/uuid.h>
> > +
> > +#include "spi-hid.h"
> > +
> > +/* Config structure is filled with data from ACPI */
> > +struct spi_hid_acpi_config {
> > +     struct spihid_ops ops;
> > +
> > +     struct spi_hid_conf property_conf;
> > +     u32 post_power_on_delay_ms;
> > +     u32 minimal_reset_delay_ms;
> > +     struct acpi_device *adev;
> > +};
> > +
> > +/* HID SPI Device: 6e2ac436-0fcf41af-a265-b32a220dcfab */
> > +static guid_t spi_hid_guid =
> > +     GUID_INIT(0x6E2AC436, 0x0FCF, 0x41AF,
> > +               0xA2, 0x65, 0xB3, 0x2A, 0x22, 0x0D, 0xCF, 0xAB);
> > +
> > +static int spi_hid_acpi_populate_config(struct spi_hid_acpi_config *conf,
> > +                                     struct acpi_device *adev)
> > +{
> > +     acpi_handle handle = acpi_device_handle(adev);
> > +     union acpi_object *obj;
> > +
> > +     conf->adev = adev;
> > +
> > +     /* Revision 3 for HID over SPI V1, see specification. */
> > +     obj = acpi_evaluate_dsm_typed(handle, &spi_hid_guid, 3, 1, NULL,
> > +                                   ACPI_TYPE_INTEGER);
> > +     if (!obj) {
> > +             acpi_handle_err(handle,
> > +                             "Error _DSM call to get HID input report header address failed");
> > +             return -ENODEV;
> > +     }
> > +     conf->property_conf.input_report_header_address = obj->integer.value;
> > +     ACPI_FREE(obj);
> > +
> > +     obj = acpi_evaluate_dsm_typed(handle, &spi_hid_guid, 3, 2, NULL,
> > +                                   ACPI_TYPE_INTEGER);
> > +     if (!obj) {
> > +             acpi_handle_err(handle,
> > +                             "Error _DSM call to get HID input report body address failed");
> > +             return -ENODEV;
> > +     }
> > +     conf->property_conf.input_report_body_address = obj->integer.value;
> > +     ACPI_FREE(obj);
> > +
> > +     obj = acpi_evaluate_dsm_typed(handle, &spi_hid_guid, 3, 3, NULL,
> > +                                   ACPI_TYPE_INTEGER);
> > +     if (!obj) {
> > +             acpi_handle_err(handle,
> > +                             "Error _DSM call to get HID output report header address failed");
> > +             return -ENODEV;
> > +     }
> > +     conf->property_conf.output_report_address = obj->integer.value;
> > +     ACPI_FREE(obj);
> > +
> > +     obj = acpi_evaluate_dsm_typed(handle, &spi_hid_guid, 3, 4, NULL,
> > +                                   ACPI_TYPE_BUFFER);
> > +     if (!obj) {
> > +             acpi_handle_err(handle,
> > +                             "Error _DSM call to get HID read opcode failed");
> > +             return -ENODEV;
> > +     }
> > +     if (obj->buffer.length == 1) {
> > +             conf->property_conf.read_opcode = obj->buffer.pointer[0];
> > +     } else {
> > +             acpi_handle_err(handle,
> > +                             "Error _DSM call to get HID read opcode, too long buffer");
> > +             ACPI_FREE(obj);
> > +             return -ENODEV;
> > +     }
> > +     ACPI_FREE(obj);
> > +
> > +     obj = acpi_evaluate_dsm_typed(handle, &spi_hid_guid, 3, 5, NULL,
> > +                                   ACPI_TYPE_BUFFER);
> > +     if (!obj) {
> > +             acpi_handle_err(handle,
> > +                             "Error _DSM call to get HID write opcode failed");
> > +             return -ENODEV;
> > +     }
> > +     if (obj->buffer.length == 1) {
> > +             conf->property_conf.write_opcode = obj->buffer.pointer[0];
> > +     } else {
> > +             acpi_handle_err(handle,
> > +                             "Error _DSM call to get HID write opcode, too long buffer");
> > +             ACPI_FREE(obj);
> > +             return -ENODEV;
> > +     }
> > +     ACPI_FREE(obj);
> > +
> > +     /* Value not provided in ACPI,*/
> > +     conf->post_power_on_delay_ms = 5;
> > +     conf->minimal_reset_delay_ms = 150;
> > +
> > +     if (!acpi_has_method(handle, "_RST")) {
> > +             acpi_handle_err(handle, "No reset method for acpi handle");
> > +             return -ENODEV;
>
> I would return -EINVAL as we have the device with right _DSM but without
> mandated by the spec _RST.

Thanks! Will fix this in v2.

>
> > +     }
> > +
> > +     /* FIXME: not reading hid-over-spi-flags, multi-SPI not supported */
> > +
> > +     return 0;
> > +}
> > +
> > +static int spi_hid_acpi_power_none(struct spihid_ops *ops)
> > +{
> > +     return 0;
> > +}
> > +
> > +static int spi_hid_acpi_power_down(struct spihid_ops *ops)
> > +{
> > +     struct spi_hid_acpi_config *conf = container_of(ops,
> > +                                                     struct spi_hid_acpi_config,
> > +                                                     ops);
> > +
> > +     return acpi_device_set_power(conf->adev, ACPI_STATE_D3);
> > +}
> > +
> > +static int spi_hid_acpi_power_up(struct spihid_ops *ops)
> > +{
> > +     struct spi_hid_acpi_config *conf = container_of(ops,
> > +                                                     struct spi_hid_acpi_config,
> > +                                                     ops);
> > +     int error;
> > +
> > +     error = acpi_device_set_power(conf->adev, ACPI_STATE_D0);
> > +     if (error) {
> > +             dev_err(&conf->adev->dev, "Error could not power up ACPI device: %d.", error);
> > +             return error;
> > +     }
> > +
> > +     if (conf->post_power_on_delay_ms)
> > +             msleep(conf->post_power_on_delay_ms);
> > +
> > +     return 0;
> > +}
> > +
> > +static int spi_hid_acpi_assert_reset(struct spihid_ops *ops)
> > +{
> > +     return 0;
> > +}
> > +
> > +static int spi_hid_acpi_deassert_reset(struct spihid_ops *ops)
> > +{
> > +     struct spi_hid_acpi_config *conf = container_of(ops,
> > +                                                     struct spi_hid_acpi_config,
> > +                                                     ops);
> > +
> > +     return device_reset(&conf->adev->dev);
> > +}
> > +
> > +static void spi_hid_acpi_sleep_minimal_reset_delay(struct spihid_ops *ops)
> > +{
> > +     struct spi_hid_acpi_config *conf = container_of(ops,
> > +                                                     struct spi_hid_acpi_config,
> > +                                                     ops);
> > +     usleep_range(1000 * conf->minimal_reset_delay_ms,
> > +                  1000 * (conf->minimal_reset_delay_ms + 1));
>
> I'd probably use "fsleep(conf->minimal_reset_delay_ms * 1000)".

I will fix this in v2. And do the same for the of driver.

>
> > +}
> > +
> > +static int spi_hid_acpi_probe(struct spi_device *spi)
> > +{
> > +     struct device *dev = &spi->dev;
> > +     struct acpi_device *adev;
> > +     struct spi_hid_acpi_config *config;
> > +     int error;
> > +
> > +     adev = ACPI_COMPANION(dev);
> > +     if (!adev) {
> > +             dev_err(dev, "Error could not get ACPI device.");
> > +             return -ENODEV;
> > +     }
> > +
> > +     config = devm_kzalloc(dev, sizeof(struct spi_hid_acpi_config),
> > +                           GFP_KERNEL);
> > +     if (!config)
> > +             return -ENOMEM;
> > +
> > +     if (acpi_device_power_manageable(adev)) {
> > +             config->ops.power_up = spi_hid_acpi_power_up;
> > +             config->ops.power_down = spi_hid_acpi_power_down;
> > +     } else {
> > +             config->ops.power_up = spi_hid_acpi_power_none;
> > +             config->ops.power_down = spi_hid_acpi_power_none;
> > +     }
> > +     config->ops.assert_reset = spi_hid_acpi_assert_reset;
> > +     config->ops.deassert_reset = spi_hid_acpi_deassert_reset;
> > +     config->ops.sleep_minimal_reset_delay =
> > +             spi_hid_acpi_sleep_minimal_reset_delay;
> > +
> > +     error = spi_hid_acpi_populate_config(config, adev);
> > +     if (error) {
> > +             dev_err(dev, "%s: unable to populate config data.", __func__);
> > +             return error;
> > +     }
>
> I would add a blank line.

Sure! Will fix this in v2.

>
> > +     return spi_hid_core_probe(spi, &config->ops, &config->property_conf);
> > +}
> > +
> > +static const struct acpi_device_id spi_hid_acpi_match[] = {
> > +     { "ACPI0C51", 0 },
> > +     { "PNP0C51", 0 },
> > +     { },
>
> No comma on sentinels.

Will fix this in v2.

>
> > +};
> > +MODULE_DEVICE_TABLE(acpi, spi_hid_acpi_match);
> > +
> > +static struct spi_driver spi_hid_acpi_driver = {
> > +     .driver = {
> > +             .name   = "spi_hid_acpi",
> > +             .owner  = THIS_MODULE,
> > +             .acpi_match_table = ACPI_PTR(spi_hid_acpi_match),
>
> This is dependent on ACPI, so no need to sue ACPI_PTR().

Will fix this in v2 and remove of_match_ptr in the of driver as well.

>
> > +             .probe_type = PROBE_PREFER_ASYNCHRONOUS,
> > +             .dev_groups = spi_hid_groups,
> > +     },
> > +     .probe          = spi_hid_acpi_probe,
> > +     .remove         = spi_hid_core_remove,
> > +};
> > +
> > +module_spi_driver(spi_hid_acpi_driver);
> > +
> > +MODULE_DESCRIPTION("HID over SPI ACPI transport driver");
> > +MODULE_AUTHOR("Angela Czubak <aczubak@google.com>");
> > +MODULE_LICENSE("GPL");
> > diff --git a/drivers/hid/spi-hid/spi-hid-core.c b/drivers/hid/spi-hid/spi-hid-core.c
> > index e3273846267e..02beb209a92d 100644
> > --- a/drivers/hid/spi-hid/spi-hid-core.c
> > +++ b/drivers/hid/spi-hid/spi-hid-core.c
> > @@ -43,6 +43,9 @@
> >  #include <linux/wait.h>
> >  #include <linux/workqueue.h>
> >
> > +#include "spi-hid.h"
> > +#include "spi-hid-core.h"
> > +
> >  /* Protocol constants */
> >  #define SPI_HID_READ_APPROVAL_CONSTANT               0xff
> >  #define SPI_HID_INPUT_HEADER_SYNC_BYTE               0x5a
> > @@ -105,30 +108,6 @@ struct spi_hid_output_report {
> >       u8 *content;
> >  };
> >
> > -/* struct spi_hid_conf - Conf provided to the core */
> > -struct spi_hid_conf {
> > -     u32 input_report_header_address;
> > -     u32 input_report_body_address;
> > -     u32 output_report_address;
> > -     u8 read_opcode;
> > -     u8 write_opcode;
> > -};
> > -
> > -/**
> > - * struct spihid_ops - Ops provided to the core
> > - * @power_up: do sequencing to power up the device
> > - * @power_down: do sequencing to power down the device
> > - * @assert_reset: do sequencing to assert the reset line
> > - * @deassert_reset: do sequencing to deassert the reset line
> > - */
> > -struct spihid_ops {
> > -     int (*power_up)(struct spihid_ops *ops);
> > -     int (*power_down)(struct spihid_ops *ops);
> > -     int (*assert_reset)(struct spihid_ops *ops);
> > -     int (*deassert_reset)(struct spihid_ops *ops);
> > -     void (*sleep_minimal_reset_delay)(struct spihid_ops *ops);
> > -};
> > -
> >  static struct hid_ll_driver spi_hid_ll_driver;
> >
> >  static void spi_hid_populate_read_approvals(const struct spi_hid_conf *conf,
> > diff --git a/drivers/hid/spi-hid/spi-hid.h b/drivers/hid/spi-hid/spi-hid.h
> > new file mode 100644
> > index 000000000000..1fdd45262647
> > --- /dev/null
> > +++ b/drivers/hid/spi-hid/spi-hid.h
> > @@ -0,0 +1,44 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2021 Microsoft Corporation
> > + * Copyright (c) 2026 Google LLC
> > + */
> > +
> > +#ifndef SPI_HID_H
> > +#define SPI_HID_H
> > +
> > +#include <linux/spi/spi.h>
> > +#include <linux/sysfs.h>
> > +
> > +/* struct spi_hid_conf - Conf provided to the core */
> > +struct spi_hid_conf {
> > +     u32 input_report_header_address;
> > +     u32 input_report_body_address;
> > +     u32 output_report_address;
> > +     u8 read_opcode;
> > +     u8 write_opcode;
> > +};
> > +
> > +/**
> > + * struct spihid_ops - Ops provided to the core
> > + * @power_up: do sequencing to power up the device
> > + * @power_down: do sequencing to power down the device
> > + * @assert_reset: do sequencing to assert the reset line
> > + * @deassert_reset: do sequencing to deassert the reset line
> > + */
> > +struct spihid_ops {
> > +     int (*power_up)(struct spihid_ops *ops);
> > +     int (*power_down)(struct spihid_ops *ops);
> > +     int (*assert_reset)(struct spihid_ops *ops);
> > +     int (*deassert_reset)(struct spihid_ops *ops);
> > +     void (*sleep_minimal_reset_delay)(struct spihid_ops *ops);
> > +};
> > +
> > +int spi_hid_core_probe(struct spi_device *spi, struct spihid_ops *ops,
> > +                    struct spi_hid_conf *conf);
> > +
> > +void spi_hid_core_remove(struct spi_device *spi);
> > +
> > +extern const struct attribute_group *spi_hid_groups[];
> > +
> > +#endif /* SPI_HID_H */
>
> I am not sure if this belongs to this patch or if it should be better in
> the patch introducing the main driver from the beginning.

These definitions were in spi-hid-core.c in the previous patch introducing
the main driver because it was only used in one .c file. This patch introduces
spi-hid-acpi.c and now two .c files need it so I created a separate .h
file here.

>
> For the ACPI part:
>
> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
>
> Thanks.
>
> --
> Dmitry

^ permalink raw reply

* Re: [PATCH 09/12] dt-bindings: input: Document hid-over-spi DT schema
From: Jingyuan Liang @ 2026-03-13  1:14 UTC (permalink / raw)
  To: Rob Herring
  Cc: Jiri Kosina, Benjamin Tissoires, Jonathan Corbet, Mark Brown,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Dmitry Torokhov, Krzysztof Kozlowski, Conor Dooley, linux-input,
	linux-doc, LKML, linux-spi, linux-trace-kernel, devicetree,
	Henry Barnor, Dmitry Antipov, Jarrett Schultz
In-Reply-To: <CAEe3GZE_79Lqtzv4ZRi6zWuV_DgkCREBDcZJjpR+X9OaGZCA2g@mail.gmail.com>

On Wed, Mar 11, 2026 at 5:58 PM Jingyuan Liang <jingyliang@chromium.org> wrote:
>
> (Resending to the list. Apologies, I accidentally dropped the CCs on
> my initial reply!)
>
> On Tue, Mar 3, 2026 at 5:53 AM Rob Herring <robh@kernel.org> wrote:
> >
> > On Tue, Mar 3, 2026 at 12:14 AM Jingyuan Liang <jingyliang@chromium.org> wrote:
> > >
> > > Documentation describes the required and optional properties for
> > > implementing Device Tree for a Microsoft G6 Touch Digitizer that
> > > supports HID over SPI Protocol 1.0 specification.
> > >
> > > The properties are common to HID over SPI.
> > >
> > > Signed-off-by: Dmitry Antipov <dmanti@microsoft.com>
> > > Signed-off-by: Jarrett Schultz <jaschultz@microsoft.com>
> > > Signed-off-by: Jingyuan Liang <jingyliang@chromium.org>
> > > ---
> > >  .../devicetree/bindings/input/hid-over-spi.yaml    | 153 +++++++++++++++++++++
> > >  1 file changed, 153 insertions(+)
> > >
> > > diff --git a/Documentation/devicetree/bindings/input/hid-over-spi.yaml b/Documentation/devicetree/bindings/input/hid-over-spi.yaml
> > > new file mode 100644
> > > index 000000000000..b623629ed9d3
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/input/hid-over-spi.yaml
> > > @@ -0,0 +1,153 @@
> > > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > +%YAML 1.2
> > > +---
> > > +$id: http://devicetree.org/schemas/input/hid-over-spi.yaml#
> > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > +
> > > +title: HID over SPI Devices
> > > +
> > > +maintainers:
> > > +  - Benjamin Tissoires <benjamin.tissoires@redhat.com>
> > > +  - Jiri Kosina <jkosina@suse.cz>
> > > +
> > > +description: |+
> > > +  HID over SPI provides support for various Human Interface Devices over the
> > > +  SPI bus. These devices can be for example touchpads, keyboards, touch screens
> > > +  or sensors.
> > > +
> > > +  The specification has been written by Microsoft and is currently available here:
> > > +  https://www.microsoft.com/en-us/download/details.aspx?id=103325
> > > +
> > > +  If this binding is used, the kernel module spi-hid will handle the communication
> > > +  with the device and the generic hid core layer will handle the protocol.
> > > +
> > > +allOf:
> > > +  - $ref: /schemas/input/touchscreen/touchscreen.yaml#
> > > +
> > > +properties:
> > > +  compatible:
> > > +    oneOf:
> > > +      - items:
> > > +          - enum:
> > > +              - microsoft,g6-touch-digitizer
> > > +          - const: hid-over-spi
> > > +      - description: Just "hid-over-spi" alone is allowed, but not recommended.
> > > +        const: hid-over-spi
> > > +
> > > +  reg:
> > > +    maxItems: 1
> > > +
> > > +  interrupts:
> > > +    maxItems: 1
> > > +
> > > +  reset-gpios:
> > > +    maxItems: 1
> > > +    description:
> > > +      GPIO specifier for the digitizer's reset pin (active low). The line must
> > > +      be flagged with GPIO_ACTIVE_LOW.
> > > +
> > > +  vdd-supply:
> > > +    description:
> > > +      Regulator for the VDD supply voltage.
> >
> > Is this part of the spec? This won't scale for multiple devices with
> > different power rails.
>
> This is not part of the spec but is needed for power management. Is it okay I
> mark it as optional? Thank you.
>
> >
> > > +
> > > +  input-report-header-address:
> > > +    $ref: /schemas/types.yaml#/definitions/uint32
> > > +    minimum: 0
> > > +    maximum: 0xffffff
> > > +    description:
> > > +      A value to be included in the Read Approval packet, listing an address of
> > > +      the input report header to be put on the SPI bus. This address has 24
> > > +      bits.
> > > +
> > > +  input-report-body-address:
> > > +    $ref: /schemas/types.yaml#/definitions/uint32
> > > +    minimum: 0
> > > +    maximum: 0xffffff
> > > +    description:
> > > +     A value to be included in the Read Approval packet, listing an address of
> > > +      the input report body to be put on the SPI bus. This address has 24 bits.
> > > +
> > > +  output-report-address:
> > > +    $ref: /schemas/types.yaml#/definitions/uint32
> > > +    minimum: 0
> > > +    maximum: 0xffffff
> > > +    description:
> > > +      A value to be included in the Output Report sent by the host, listing an
> > > +      address where the output report on the SPI bus is to be written to. This
> > > +      address has 24 bits.
> > > +
> > > +  post-power-on-delay-ms:
> > > +    description:
> > > +      Optional time in ms required by the device after enabling its regulators
> > > +      or powering it on, before it is ready for communication.
> >
> > Drop. This should be implied by the compatible.
>
> Thank you, I will fix this in v2.
>
> >
> > > +
> > > +  minimal-reset-delay-ms:
> > > +    description:
> > > +      Optional minimum amount of time in ms that device needs to be in reset
> > > +      state for the reset to take effect.
> >
> > Drop. This should be implied by the compatible.
>
> I will fix this in v2.
>
> >
> > > +
> > > +  read-opcode:
> > > +  $ref: /schemas/types.yaml#/definitions/uint8
> > > +    description:
> > > +      Value to be used in Read Approval packets. 1 byte.
> > > +
> > > +  write-opcode:
> > > +  $ref: /schemas/types.yaml#/definitions/uint8
> > > +    description:
> > > +      Value to be used in Write Approval packets. 1 byte.
> >
> > Why are these and the address properties above not defined by the
> > spec? Do they vary for a specific device? If not, then they should be
> > implied by the compatible.
>
> These properties are not defined by the spec:
>
> "The Input Report Address (header or body) and READ opcode are retrieved
> from ACPI."
>
> Same for the output report address and write opcode. I will drop these in v2.

Hi Rob,

Please disregard my previous reply about dropping the read/write opcodes and
input/output addresses. The spec does not define these fields. Instead, it says
these values are retrieved from ACPI _DSM methods. If we remove them from
the binding, the driver is not very generic. Would it be okay to keep them?

>
> >
> > > +
> > > +  hid-over-spi-flags:
> > > +  $ref: /schemas/types.yaml#/definitions/uint16
> > > +    description:
> > > +      16 bits.
> > > +      Bits 0-12 - Reserved (must be 0)
> > > +      Bit 13 - SPI Write Mode. Possible values -
> > > +        * 0b0- Writes are carried out in Single-SPI mode
> > > +        * 0b1- Writes are carried out in the Multi-SPI mode specified by bits
> > > +               14-15
> > > +      Bits 14-15 - Multi-SPI Mode. Possible values -
> > > +        * 0b00- Single SPI
> > > +        * 0b01- Dual SPI
> > > +        * 0b10- Quad SPI
> >
> > We already have SPI properties to define the bus width for read and write.
>
> Will fix this in v2.
>
> >
> > > +
> > > +required:
> > > +  - compatible
> > > +  - interrupts
> > > +  - reset-gpios
> > > +  - vdd-supply
> > > +  - input-report-header-address
> > > +  - input-report-body-address
> > > +  - output-report-address
> > > +  - read-opcode
> > > +  - write-opcode
> > > +  - hid-over-spi-flags
> > > +
> > > +additionalProperties: false
> > > +
> > > +examples:
> > > +  - |
> > > +    #include <dt-bindings/interrupt-controller/irq.h>
> > > +    #include <dt-bindings/gpio/gpio.h>
> > > +
> > > +    spi {
> > > +      #address-cells = <1>;
> > > +      #size-cells = <0>;
> > > +
> > > +      hid@0 {
> > > +        compatible = "hid-over-spi";
> > > +        reg = <0x0>;
> > > +        interrupts-extended = <&gpio 42 IRQ_TYPE_EDGE_FALLING>;
> > > +        reset-gpios = <&gpio 27 GPIO_ACTIVE_LOW>;
> > > +        vdd-supply = <&pm8350c_l3>;
> > > +        pinctrl-names = "default";
> > > +        pinctrl-0 = <&ts_d6_reset_assert &ts_d6_int_bias>;
> > > +        input-report-header-address = <0x1000>;
> > > +        input-report-body-address = <0x1004>;
> > > +        output-report-address = <0x2000>;
> > > +        read-opcode = <0x0b>;
> > > +        write-opcode = <0x02>;
> > > +        hid-over-spi-flags = <0x0000>;
> > > +        post-power-on-delay-ms = <5>;
> > > +        minimal-reset-delay-ms = <5>;
> > > +      };
> > > +    };
> > > \ No newline at end of file
> >
> > Fix this.
>
> Will fix this in v2.
>
> >
> > Rob

^ permalink raw reply

* Re: [PATCH 09/12] dt-bindings: input: Document hid-over-spi DT schema
From: Jingyuan Liang @ 2026-03-13  1:00 UTC (permalink / raw)
  To: Val Packett
  Cc: Jiri Kosina, Benjamin Tissoires, Jonathan Corbet, Mark Brown,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Dmitry Torokhov, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	linux-input, linux-doc, linux-kernel, linux-spi,
	linux-trace-kernel, devicetree, hbarnor, Dmitry Antipov,
	Jarrett Schultz
In-Reply-To: <1cc6de61-8b56-492e-ab78-e3aa448f58ad@packett.cool>

On Fri, Mar 6, 2026 at 11:25 PM Val Packett <val@packett.cool> wrote:
>
>
> On 3/3/26 3:13 AM, Jingyuan Liang wrote:
> > Documentation describes the required and optional properties for
> > implementing Device Tree for a Microsoft G6 Touch Digitizer that
> > supports HID over SPI Protocol 1.0 specification.
> > […]
> > +properties:
> > +  compatible:
> > +    oneOf:
> > +      - items:
> > +          - enum:
> > +              - microsoft,g6-touch-digitizer
> > +          - const: hid-over-spi
> > +      - description: Just "hid-over-spi" alone is allowed, but not recommended.
> > […]
> > +required:
> > +  - compatible
> > +  - interrupts
> > +  - reset-gpios
>
> Why is reset required? Is it so implausible on some device implementing
> the spec there wouldn't be a reset gpio?
>
> > +  - vdd-supply
> Linux makes up a dummy regulator if DT doesn't provide one, so can
> regulators even be required?
> > […]
> > +        compatible = "hid-over-spi";
> Not following your own recommendation from above :)

Thanks! I will fix this in v2.

> > +        reg = <0x0>;
> > +        interrupts-extended = <&gpio 42 IRQ_TYPE_EDGE_FALLING>;
> > +        reset-gpios = <&gpio 27 GPIO_ACTIVE_LOW>;
> > +        vdd-supply = <&pm8350c_l3>;
> > +        pinctrl-names = "default";
> > +        pinctrl-0 = <&ts_d6_reset_assert &ts_d6_int_bias>;
>
> Heh, "reset_assert" is a name implying it would actually set the value
> from the pinctrl properties, which is what had to be done before
> reset-gpios were supported. But now reset-gpios are supported.

Taken from the original patch. Will fix this in v2.

>
>
> Thanks,
> ~val
>
>
> P.S. happy to see work on this happen again!
>

^ permalink raw reply

* [PATCH 19/53] afs: use d_time instead of d_fsdata
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

afs uses ->d_fsdata to store version information for the parent
directory.  ->d_time is arguably a better field to store this
information as the version is like a time stamp, and ->d_time is an
unsigned long, while ->d_fsdata is a void *.

This will leave ->d_fsdata free for a different use ...  which
admittedly is also not a void*, but is certainly not at all a time.

Interesting the value stored in ->d_time or d_fsdata is u64 which is a
different size of 32 bit hosts.  Maybe that doesn't matter.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/afs/dir.c      | 18 +++++++++---------
 fs/afs/internal.h |  8 ++++----
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 78caef3f1338..a0417292314c 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -808,7 +808,7 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry)
 		afs_dir_iterate(dir, &cookie->ctx, NULL, &data_version);
 	}
 
-	dentry->d_fsdata = (void *)(unsigned long)data_version;
+	dentry->d_time = (unsigned long)data_version;
 
 	/* Check to see if we already have an inode for the primary fid. */
 	inode = ilookup5(dir->i_sb, cookie->fids[1].vnode,
@@ -895,9 +895,9 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry)
 	}
 
 	if (op->file[0].scb.have_status)
-		dentry->d_fsdata = (void *)(unsigned long)op->file[0].scb.status.data_version;
+		dentry->d_time = (unsigned long)op->file[0].scb.status.data_version;
 	else
-		dentry->d_fsdata = (void *)(unsigned long)op->file[0].dv_before;
+		dentry->d_time = (unsigned long)op->file[0].dv_before;
 	ret = afs_put_operation(op);
 out:
 	kfree(cookie);
@@ -1010,7 +1010,7 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 	_debug("splice %p", dentry->d_inode);
 	d = d_splice_alias(inode, dentry);
 	if (!IS_ERR_OR_NULL(d)) {
-		d->d_fsdata = dentry->d_fsdata;
+		d->d_time = dentry->d_time;
 		trace_afs_lookup(dvnode, &d->d_name, &fid);
 	} else {
 		trace_afs_lookup(dvnode, &dentry->d_name, &fid);
@@ -1040,7 +1040,7 @@ static int afs_d_revalidate_rcu(struct afs_vnode *dvnode, struct dentry *dentry)
 	 * version.
 	 */
 	dir_version = (long)READ_ONCE(dvnode->status.data_version);
-	de_version = (long)READ_ONCE(dentry->d_fsdata);
+	de_version = (long)READ_ONCE(dentry->d_time);
 	if (de_version != dir_version) {
 		dir_version = (long)READ_ONCE(dvnode->invalid_before);
 		if (de_version - dir_version < 0)
@@ -1100,7 +1100,7 @@ static int afs_d_revalidate(struct inode *parent_dir, const struct qstr *name,
 	 * version.
 	 */
 	dir_version = dir->status.data_version;
-	de_version = (long)dentry->d_fsdata;
+	de_version = (long)dentry->d_time;
 	if (de_version == (long)dir_version)
 		goto out_valid_noupdate;
 
@@ -1161,7 +1161,7 @@ static int afs_d_revalidate(struct inode *parent_dir, const struct qstr *name,
 	}
 
 out_valid:
-	dentry->d_fsdata = (void *)(unsigned long)dir_version;
+	dentry->d_time = (unsigned long)dir_version;
 out_valid_noupdate:
 	key_put(key);
 	_leave(" = 1 [valid]");
@@ -1931,7 +1931,7 @@ static void afs_rename_edit_dir(struct afs_operation *op)
 		spin_unlock(&new_inode->i_lock);
 	}
 
-	/* Now we can update d_fsdata on the dentries to reflect their
+	/* Now we can update d_time on the dentries to reflect their
 	 * new parent's data_version.
 	 */
 	afs_update_dentry_version(op, new_dvp, op->dentry);
@@ -2167,7 +2167,7 @@ static int afs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 	}
 
 	/* This bit is potentially nasty as there's a potential race with
-	 * afs_d_revalidate{,_rcu}().  We have to change d_fsdata on the dentry
+	 * afs_d_revalidate{,_rcu}().  We have to change d_time_ on the dentry
 	 * to reflect it's new parent's new data_version after the op, but
 	 * d_revalidate may see old_dentry between the op having taken place
 	 * and the version being updated.
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 009064b8d661..106a7fe06b56 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1746,17 +1746,17 @@ static inline struct inode *AFS_VNODE_TO_I(struct afs_vnode *vnode)
 }
 
 /*
- * Note that a dentry got changed.  We need to set d_fsdata to the data version
+ * Note that a dentry got changed.  We need to set d_time to the data version
  * number derived from the result of the operation.  It doesn't matter if
- * d_fsdata goes backwards as we'll just revalidate.
+ * d_time goes backwards as we'll just revalidate.
  */
 static inline void afs_update_dentry_version(struct afs_operation *op,
 					     struct afs_vnode_param *dir_vp,
 					     struct dentry *dentry)
 {
 	if (!op->cumul_error.error)
-		dentry->d_fsdata =
-			(void *)(unsigned long)dir_vp->scb.status.data_version;
+		dentry->d_time =
+			(unsigned long)dir_vp->scb.status.data_version;
 }
 
 /*
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 20/53] afs: don't unhash/rehash dentries during unlink/rename
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

afs needs to block lookup of dentries during unlink and rename.
There are two reasons:
1/ If the target is to be removed, not silly-renamed, the subsequent
   opens cannot be allowed as the file won't exist on the server.
2/ If the rename source is being moved between directories a lookup,
   particularly d_revalidate, might change ->d_time asynchronously
   with rename changing ->d_time with possible incorrect results.

afs current unhashes the dentry to force a lookup which will wait on the
directory lock, and rehashes afterwards.  This is incompatible with
proposed changed to directory locking which will require a dentry to
remain hashed throughout rename/unlink/etc operations.

This patch copies a mechanism developed for NFS.  ->d_fsdata which is
currently unused is now set to a non-NULL value when lookups must be
blocked.  d_revalidate checks for this value, and waits for it to become
NULL.

->d_lock is used to ensure d_revalidate never updates ->d_time while
->d_fsdata is set.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/afs/afs.h      |  7 ++++++
 fs/afs/dir.c      | 64 +++++++++++++++++++++++++++++------------------
 fs/afs/internal.h |  5 +---
 3 files changed, 47 insertions(+), 29 deletions(-)

diff --git a/fs/afs/afs.h b/fs/afs/afs.h
index ec3db00bd081..019e77b08458 100644
--- a/fs/afs/afs.h
+++ b/fs/afs/afs.h
@@ -26,6 +26,13 @@ typedef u64			afs_volid_t;
 typedef u64			afs_vnodeid_t;
 typedef u64			afs_dataversion_t;
 
+/* This is stored in ->d_fsdata to stop d_revalidate looking at,
+ * and possibly changing, ->d_time on a dentry which is being moved
+ * between directories, and to block lookup for dentry that is
+ * being removed without silly-rename.
+ */
+#define AFS_FSDATA_BLOCKED ((void*)1)
+
 typedef enum {
 	AFSVL_RWVOL,			/* read/write volume */
 	AFSVL_ROVOL,			/* read-only volume */
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index a0417292314c..9c57614feccf 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -1034,6 +1034,10 @@ static int afs_d_revalidate_rcu(struct afs_vnode *dvnode, struct dentry *dentry)
 	if (!afs_check_validity(dvnode))
 		return -ECHILD;
 
+	/* A rename/unlink is pending */
+	if (dentry->d_fsdata)
+		return -ECHILD;
+
 	/* We only need to invalidate a dentry if the server's copy changed
 	 * behind our back.  If we made the change, it's no problem.  Note that
 	 * on a 32-bit system, we only have 32 bits in the dentry to store the
@@ -1069,6 +1073,10 @@ static int afs_d_revalidate(struct inode *parent_dir, const struct qstr *name,
 	if (flags & LOOKUP_RCU)
 		return afs_d_revalidate_rcu(dir, dentry);
 
+	/* Wait for rename/unlink to complete */
+wait_for_rename:
+	wait_var_event(&dentry->d_fsdata, dentry->d_fsdata == NULL);
+
 	if (d_really_is_positive(dentry)) {
 		vnode = AFS_FS_I(d_inode(dentry));
 		_enter("{v={%llx:%llu} n=%pd fl=%lx},",
@@ -1161,7 +1169,13 @@ static int afs_d_revalidate(struct inode *parent_dir, const struct qstr *name,
 	}
 
 out_valid:
+	spin_lock(&dentry->d_lock);
+	if (dentry->d_fsdata) {
+		spin_unlock(&dentry->d_lock);
+		goto wait_for_rename;
+	}
 	dentry->d_time = (unsigned long)dir_version;
+	spin_unlock(&dentry->d_lock);
 out_valid_noupdate:
 	key_put(key);
 	_leave(" = 1 [valid]");
@@ -1536,8 +1550,7 @@ static void afs_unlink_edit_dir(struct afs_operation *op)
 static void afs_unlink_put(struct afs_operation *op)
 {
 	_enter("op=%08x", op->debug_id);
-	if (op->unlink.need_rehash && afs_op_error(op) < 0 && afs_op_error(op) != -ENOENT)
-		d_rehash(op->dentry);
+	store_release_wake_up(&op->dentry->d_fsdata, NULL);
 }
 
 static const struct afs_operation_ops afs_unlink_operation = {
@@ -1591,11 +1604,7 @@ static int afs_unlink(struct inode *dir, struct dentry *dentry)
 		afs_op_set_error(op, afs_sillyrename(dvnode, vnode, dentry, op->key));
 		goto error;
 	}
-	if (!d_unhashed(dentry)) {
-		/* Prevent a race with RCU lookup. */
-		__d_drop(dentry);
-		op->unlink.need_rehash = true;
-	}
+	dentry->d_fsdata = AFS_FSDATA_BLOCKED;
 	spin_unlock(&dentry->d_lock);
 
 	op->file[1].vnode = vnode;
@@ -1885,9 +1894,10 @@ static void afs_rename_edit_dir(struct afs_operation *op)
 
 	_enter("op=%08x", op->debug_id);
 
-	if (op->rename.rehash) {
-		d_rehash(op->rename.rehash);
-		op->rename.rehash = NULL;
+	if (op->rename.unblock) {
+		/* Rename has finished, so unlocks lookups to target */
+		store_release_wake_up(&op->rename.unblock->d_fsdata, NULL);
+		op->rename.unblock = NULL;
 	}
 
 	fscache_begin_write_operation(&orig_cres, afs_vnode_cache(orig_dvnode));
@@ -1970,6 +1980,9 @@ static void afs_rename_exchange_edit_dir(struct afs_operation *op)
 
 		d_exchange(old_dentry, new_dentry);
 		up_write(&orig_dvnode->validate_lock);
+	/* dentry has been moved, so d_validate can safely proceed */
+	store_release_wake_up(&old_dentry->d_fsdata, NULL);
+
 	} else {
 		down_write(&orig_dvnode->validate_lock);
 		if (test_bit(AFS_VNODE_DIR_VALID, &orig_dvnode->flags) &&
@@ -2009,11 +2022,10 @@ static void afs_rename_exchange_edit_dir(struct afs_operation *op)
 static void afs_rename_put(struct afs_operation *op)
 {
 	_enter("op=%08x", op->debug_id);
-	if (op->rename.rehash)
-		d_rehash(op->rename.rehash);
+	if (op->rename.unblock)
+		store_release_wake_up(&op->rename.unblock->d_fsdata, NULL);
+	store_release_wake_up(&op->dentry->d_fsdata, NULL);
 	dput(op->rename.tmp);
-	if (afs_op_error(op))
-		d_rehash(op->dentry);
 }
 
 static const struct afs_operation_ops afs_rename_operation = {
@@ -2121,7 +2133,6 @@ static int afs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 		op->ops		= &afs_rename_noreplace_operation;
 	} else if (flags & RENAME_EXCHANGE) {
 		op->ops		= &afs_rename_exchange_operation;
-		d_drop(new_dentry);
 	} else {
 		/* If we might displace the target, we might need to do silly
 		 * rename.
@@ -2135,14 +2146,12 @@ static int afs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 		 */
 		if (d_is_positive(new_dentry) && !d_is_dir(new_dentry)) {
 			/* To prevent any new references to the target during
-			 * the rename, we unhash the dentry in advance.
+			 * the rename, we set d_fsdata which afs_d_revalidate will wait for.
+			 * d_lock ensures d_count() and ->d_fsdata are consistent.
 			 */
-			if (!d_unhashed(new_dentry)) {
-				d_drop(new_dentry);
-				op->rename.rehash = new_dentry;
-			}
-
+			spin_lock(&new_dentry->d_lock);
 			if (d_count(new_dentry) > 2) {
+				spin_unlock(&new_dentry->d_lock);
 				/* copy the target dentry's name */
 				op->rename.tmp = d_alloc(new_dentry->d_parent,
 							 &new_dentry->d_name);
@@ -2160,8 +2169,12 @@ static int afs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 				}
 
 				op->dentry_2 = op->rename.tmp;
-				op->rename.rehash = NULL;
 				op->rename.new_negative = true;
+			} else {
+				/* Block any lookups to target until the rename completes */
+				new_dentry->d_fsdata = AFS_FSDATA_BLOCKED;
+				op->rename.unblock = new_dentry;
+				spin_unlock(&new_dentry->d_lock);
 			}
 		}
 	}
@@ -2172,10 +2185,11 @@ static int afs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 	 * d_revalidate may see old_dentry between the op having taken place
 	 * and the version being updated.
 	 *
-	 * So drop the old_dentry for now to make other threads go through
-	 * lookup instead - which we hold a lock against.
+	 * So block revalidate on the old_dentry until the rename completes.
 	 */
-	d_drop(old_dentry);
+	spin_lock(&old_dentry->d_lock);
+	old_dentry->d_fsdata = AFS_FSDATA_BLOCKED;
+	spin_unlock(&old_dentry->d_lock);
 
 	ret = afs_do_sync_operation(op);
 	if (ret == -ENOTSUPP)
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 106a7fe06b56..f2898ce9c0e6 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -891,10 +891,7 @@ struct afs_operation {
 			const char *symlink;
 		} create;
 		struct {
-			bool	need_rehash;
-		} unlink;
-		struct {
-			struct dentry	*rehash;
+			struct dentry	*unblock;
 			struct dentry	*tmp;
 			unsigned int	rename_flags;
 			bool		new_negative;
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 21/53] afs: use d_splice_alias() in afs_vnode_new_inode()
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

As afs supports the fhandle interfaces there is a theoretical possibility
that the inode created for mkdir could be found by open_by_handle_at()
and given a dentry before d_instantiate() is called.  This would result
in two dentries for the one directory inode, which is not permitted.

So this patch changes afs_mkdir() to use d_splice_alias() and to
return the alternate dentry from ->mkdir() if appropriate.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/afs/dir.c      | 14 ++++++++++----
 fs/afs/internal.h |  1 +
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 9c57614feccf..1e472768e1f1 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -1248,7 +1248,7 @@ void afs_check_for_remote_deletion(struct afs_operation *op)
 /*
  * Create a new inode for create/mkdir/symlink
  */
-static void afs_vnode_new_inode(struct afs_operation *op)
+static struct dentry *afs_vnode_new_inode(struct afs_operation *op)
 {
 	struct afs_vnode_param *dvp = &op->file[0];
 	struct afs_vnode_param *vp = &op->file[1];
@@ -1265,7 +1265,7 @@ static void afs_vnode_new_inode(struct afs_operation *op)
 		 * the new directory on the server.
 		 */
 		afs_op_accumulate_error(op, PTR_ERR(inode), 0);
-		return;
+		return NULL;
 	}
 
 	vnode = AFS_FS_I(inode);
@@ -1276,7 +1276,7 @@ static void afs_vnode_new_inode(struct afs_operation *op)
 		afs_init_new_symlink(vnode, op);
 	if (!afs_op_error(op))
 		afs_cache_permit(vnode, op->key, vnode->cb_break, &vp->scb);
-	d_instantiate(op->dentry, inode);
+	return d_splice_alias(inode, op->dentry);
 }
 
 static void afs_create_success(struct afs_operation *op)
@@ -1285,7 +1285,7 @@ static void afs_create_success(struct afs_operation *op)
 	op->ctime = op->file[0].scb.status.mtime_client;
 	afs_vnode_commit_status(op, &op->file[0]);
 	afs_update_dentry_version(op, &op->file[0], op->dentry);
-	afs_vnode_new_inode(op);
+	op->create.ret = afs_vnode_new_inode(op);
 }
 
 static void afs_create_edit_dir(struct afs_operation *op)
@@ -1356,6 +1356,12 @@ static struct dentry *afs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
 	op->ops		= &afs_mkdir_operation;
 	ret = afs_do_sync_operation(op);
 	afs_dir_unuse_cookie(dvnode, ret);
+	if (op->create.ret) {
+		/* Alternate dentry */
+		if (ret == 0)
+			return op->create.ret;
+		dput(op->create.ret);
+	}
 	return ERR_PTR(ret);
 }
 
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index f2898ce9c0e6..ce94f10a14c0 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -889,6 +889,7 @@ struct afs_operation {
 			int	reason;		/* enum afs_edit_dir_reason */
 			mode_t	mode;
 			const char *symlink;
+			struct dentry *ret;
 		} create;
 		struct {
 			struct dentry	*unblock;
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 22/53] afs: use d_alloc_nonblock in afs_sillyrename()
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

Rather than performing a normal lookup (which will be awkward with future
locking changes) use d_alloc_noblock() to find a dentry for an
unused name, and use an open-coded lookup_slow() to see if it is free on
the server.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/afs/dir_silly.c | 51 ++++++++++++++++++++++++++++++----------------
 1 file changed, 34 insertions(+), 17 deletions(-)

diff --git a/fs/afs/dir_silly.c b/fs/afs/dir_silly.c
index 982bb6ec15f0..699143b21cdd 100644
--- a/fs/afs/dir_silly.c
+++ b/fs/afs/dir_silly.c
@@ -112,7 +112,9 @@ int afs_sillyrename(struct afs_vnode *dvnode, struct afs_vnode *vnode,
 		    struct dentry *dentry, struct key *key)
 {
 	static unsigned int sillycounter;
-	struct dentry *sdentry = NULL;
+	struct dentry *sdentry = NULL, *old;
+	struct inode *dir = dentry->d_parent->d_inode;
+	struct qstr qsilly;
 	unsigned char silly[16];
 	int ret = -EBUSY;
 
@@ -122,23 +124,38 @@ int afs_sillyrename(struct afs_vnode *dvnode, struct afs_vnode *vnode,
 	if (dentry->d_flags & DCACHE_NFSFS_RENAMED)
 		return -EBUSY;
 
-	sdentry = NULL;
-	do {
-		dput(sdentry);
-		sillycounter++;
-
-		/* Create a silly name.  Note that the ".__afs" prefix is
-		 * understood by the salvager and must not be changed.
-		 */
-		scnprintf(silly, sizeof(silly), ".__afs%04X", sillycounter);
-		sdentry = lookup_noperm(&QSTR(silly), dentry->d_parent);
+newname:
+	sillycounter++;
 
-		/* N.B. Better to return EBUSY here ... it could be dangerous
-		 * to delete the file while it's in use.
-		 */
-		if (IS_ERR(sdentry))
-			goto out;
-	} while (!d_is_negative(sdentry));
+	/* Create a silly name.  Note that the ".__afs" prefix is
+	 * understood by the salvager and must not be changed.
+	 */
+	scnprintf(silly, sizeof(silly), ".__afs%04X", sillycounter);
+	qsilly = QSTR(silly);
+	sdentry = try_lookup_noperm(&qsilly, dentry->d_parent);
+	if (!sdentry)
+		sdentry = d_alloc_noblock(dentry->d_parent, &qsilly);
+	if (sdentry == ERR_PTR(-EWOULDBLOCK))
+		/* try another name */
+		goto newname;
+	/* N.B. Better to return EBUSY here ... it could be dangerous
+	 * to delete the file while it's in use.
+	 */
+	if (IS_ERR(sdentry))
+		goto out;
+	if (d_is_positive(sdentry)) {
+		dput(sdentry);
+		goto newname;
+	}
+	/* This name isn't known locally - check on server */
+	old = dir->i_op->lookup(dir, sdentry, 0);
+	d_lookup_done(sdentry);
+	if (old || d_is_positive(sdentry)) {
+		if (!IS_ERR(old))
+			dput(old);
+		dput(sdentry);
+		goto newname;
+	}
 
 	ihold(&vnode->netfs.inode);
 
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 23/53] afs: lookup_atsys to drop and reclaim lock.
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

If afs is asked to lookup a name ending with @sys, it needs to look up
a different name for which is allocates a dentry with
d_alloc_parallel().
This is done while the parent lock is held which will be a problem in
a future patch where the ordering of the parent lock and
d_alloc_parallel() locking is reversed.

There is no actual need to hold the lock while d_alloc_parallel() is
called, so with this patch we drop the lock and reclaim it.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/afs/dir.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 1e472768e1f1..c195ee851191 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -908,12 +908,14 @@ static struct inode *afs_do_lookup(struct inode *dir, struct dentry *dentry)
 /*
  * Look up an entry in a directory with @sys substitution.
  */
-static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry)
+static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry,
+				       unsigned int flags)
 {
 	struct afs_sysnames *subs;
 	struct afs_net *net = afs_i2net(dir);
 	struct dentry *ret;
 	char *buf, *p, *name;
+	struct qstr nm;
 	int len, i;
 
 	_enter("");
@@ -933,6 +935,13 @@ static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry)
 	refcount_inc(&subs->usage);
 	read_unlock(&net->sysnames_lock);
 
+	/* Calling d_alloc_parallel() while holding parent locked is undesirable.
+	 * We don't really need the lock any more.
+	 */
+	if (flags & LOOKUP_SHARED)
+		inode_unlock_shared(dir);
+	else
+		inode_unlock(dir);
 	for (i = 0; i < subs->nr; i++) {
 		name = subs->subs[i];
 		len = dentry->d_name.len - 4 + strlen(name);
@@ -942,7 +951,10 @@ static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry)
 		}
 
 		strcpy(p, name);
-		ret = lookup_noperm(&QSTR(buf), dentry->d_parent);
+		nm = QSTR(buf);
+		ret = try_lookup_noperm(&nm, dentry->d_parent);
+		if (!ret)
+			ret = d_alloc_parallel(dentry->d_parent, &nm);
 		if (IS_ERR(ret) || d_is_positive(ret))
 			goto out_s;
 		dput(ret);
@@ -953,6 +965,10 @@ static struct dentry *afs_lookup_atsys(struct inode *dir, struct dentry *dentry)
 	 */
 	ret = NULL;
 out_s:
+	if (flags & LOOKUP_SHARED)
+		inode_lock_shared(dir);
+	else
+		inode_lock_nested(dir, I_MUTEX_PARENT);
 	afs_put_sysnames(subs);
 	kfree(buf);
 out_p:
@@ -998,7 +1014,7 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
 	    dentry->d_name.name[dentry->d_name.len - 3] == 's' &&
 	    dentry->d_name.name[dentry->d_name.len - 2] == 'y' &&
 	    dentry->d_name.name[dentry->d_name.len - 1] == 's')
-		return afs_lookup_atsys(dir, dentry);
+		return afs_lookup_atsys(dir, dentry, flags);
 
 	afs_stat_v(dvnode, n_lookup);
 	inode = afs_do_lookup(dir, dentry);
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 24/53] afs: use d_duplicate()
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

To prepare for d_alloc_parallel() being permitted without a directory
lock, use d_duplicate() when duplicating a dentry in order to create a
whiteout.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/afs/dir.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index c195ee851191..b5c593f50079 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -2047,6 +2047,8 @@ static void afs_rename_put(struct afs_operation *op)
 	if (op->rename.unblock)
 		store_release_wake_up(&op->rename.unblock->d_fsdata, NULL);
 	store_release_wake_up(&op->dentry->d_fsdata, NULL);
+	if (op->rename.tmp)
+		d_lookup_done(op->rename.tmp);
 	dput(op->rename.tmp);
 }
 
@@ -2175,8 +2177,7 @@ static int afs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
 			if (d_count(new_dentry) > 2) {
 				spin_unlock(&new_dentry->d_lock);
 				/* copy the target dentry's name */
-				op->rename.tmp = d_alloc(new_dentry->d_parent,
-							 &new_dentry->d_name);
+				op->rename.tmp = d_duplicate(new_dentry);
 				if (!op->rename.tmp) {
 					afs_op_nomem(op);
 					goto error;
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 25/53] smb/client: use d_time to store a timestamp in dentry, not d_fsdata
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

smb/client uses d_fsdata is exactly the way that d_time is intended to
be used.  It previous used d_time but this was changed in
  Commit: a00be0e31f8d ("cifs: don't use ->d_time")
without any reason being given.

This patch effectively reverts that patch (though it doesn't remove the
helpers) so that d_fsdata can be used for something more generic.

Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/smb/client/cifsfs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/smb/client/cifsfs.h b/fs/smb/client/cifsfs.h
index e320d39b01f5..5153e811c50b 100644
--- a/fs/smb/client/cifsfs.h
+++ b/fs/smb/client/cifsfs.h
@@ -30,12 +30,12 @@ cifs_uniqueid_to_ino_t(u64 fileid)
 
 static inline void cifs_set_time(struct dentry *dentry, unsigned long time)
 {
-	dentry->d_fsdata = (void *) time;
+	dentry->d_time = time;
 }
 
 static inline unsigned long cifs_get_time(struct dentry *dentry)
 {
-	return (unsigned long) dentry->d_fsdata;
+	return dentry->d_time;
 }
 
 extern struct file_system_type cifs_fs_type, smb3_fs_type;
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 26/53] smb/client: don't unhashed and rehash to prevent new opens.
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

smb/client needs to block new opens of the target of unlink and rename
while the operation is progressing.  This stablises d_count() and allows
a determination of whether a "silly-rename" is required.

It currently unhashes the dentry which will cause lookup to block on
the parent directory i_rwsem.  Proposed changes to locking will cause
this approach to stop working and the exclusivity will be provided for
the dentry only, and only while it is hashed.

So we introduce a new machanism similar to that used by nfs and afs.
->d_fsdata (currently unused by smb/client) is set to a non-NULL
value when lookups need to be blocked.  ->d_revalidate checks for this
and blocks.  This might still allow d_count() to increment, but once it
has been tested as 1, there can be no new opens completed.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/smb/client/dir.c   |  3 +++
 fs/smb/client/inode.c | 48 +++++++++++++++++--------------------------
 2 files changed, 22 insertions(+), 29 deletions(-)

diff --git a/fs/smb/client/dir.c b/fs/smb/client/dir.c
index cb10088197d2..cecbc0cce5c5 100644
--- a/fs/smb/client/dir.c
+++ b/fs/smb/client/dir.c
@@ -790,6 +790,9 @@ cifs_d_revalidate(struct inode *dir, const struct qstr *name,
 	if (flags & LOOKUP_RCU)
 		return -ECHILD;
 
+	/* Wait for pending rename/unlink */
+	wait_var_event(&direntry->d_fsdata, direntry->d_fsdata == NULL);
+
 	if (d_really_is_positive(direntry)) {
 		int rc;
 		struct inode *inode = d_inode(direntry);
diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c
index d4d3cfeb6c90..3549605fa9c2 100644
--- a/fs/smb/client/inode.c
+++ b/fs/smb/client/inode.c
@@ -28,6 +28,13 @@
 #include "cached_dir.h"
 #include "reparse.h"
 
+/* This is stored in ->d_fsdata to block d_revalidate on a
+ * file dentry that is being removed - unlink or rename target.
+ * This causes any open attempt to block.  There may be existing opens
+ * but they can be detected by checking d_count() under ->d_lock.
+ */
+#define CIFS_FSDATA_BLOCKED ((void *)1)
+
 /*
  * Set parameters for the netfs library
  */
@@ -1946,27 +1953,21 @@ static int __cifs_unlink(struct inode *dir, struct dentry *dentry, bool sillyren
 	__u32 dosattr = 0, origattr = 0;
 	struct TCP_Server_Info *server;
 	struct iattr *attrs = NULL;
-	bool rehash = false;
 
 	cifs_dbg(FYI, "cifs_unlink, dir=0x%p, dentry=0x%p\n", dir, dentry);
 
 	if (unlikely(cifs_forced_shutdown(cifs_sb)))
 		return smb_EIO(smb_eio_trace_forced_shutdown);
 
-	/* Unhash dentry in advance to prevent any concurrent opens */
-	spin_lock(&dentry->d_lock);
-	if (!d_unhashed(dentry)) {
-		__d_drop(dentry);
-		rehash = true;
-	}
-	spin_unlock(&dentry->d_lock);
-
 	tlink = cifs_sb_tlink(cifs_sb);
 	if (IS_ERR(tlink))
 		return PTR_ERR(tlink);
 	tcon = tlink_tcon(tlink);
 	server = tcon->ses->server;
 
+	/* Set d_fsdata to prevent any concurrent opens */
+	dentry->d_fsdata = CIFS_FSDATA_BLOCKED;
+
 	xid = get_xid();
 	page = alloc_dentry_path();
 
@@ -2083,8 +2084,9 @@ static int __cifs_unlink(struct inode *dir, struct dentry *dentry, bool sillyren
 	kfree(attrs);
 	free_xid(xid);
 	cifs_put_tlink(tlink);
-	if (rehash)
-		d_rehash(dentry);
+
+	/* Allow lookups */
+	store_release_wake_up(&dentry->d_fsdata, NULL);
 	return rc;
 }
 
@@ -2501,7 +2503,6 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir,
 	struct cifs_sb_info *cifs_sb;
 	struct tcon_link *tlink;
 	struct cifs_tcon *tcon;
-	bool rehash = false;
 	unsigned int xid;
 	int rc, tmprc;
 	int retry_count = 0;
@@ -2517,23 +2518,15 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir,
 	if (unlikely(cifs_forced_shutdown(cifs_sb)))
 		return smb_EIO(smb_eio_trace_forced_shutdown);
 
-	/*
-	 * Prevent any concurrent opens on the target by unhashing the dentry.
-	 * VFS already unhashes the target when renaming directories.
-	 */
-	if (d_is_positive(target_dentry) && !d_is_dir(target_dentry)) {
-		if (!d_unhashed(target_dentry)) {
-			d_drop(target_dentry);
-			rehash = true;
-		}
-	}
-
 	tlink = cifs_sb_tlink(cifs_sb);
 	if (IS_ERR(tlink))
 		return PTR_ERR(tlink);
 	tcon = tlink_tcon(tlink);
 	server = tcon->ses->server;
 
+	/* Set d_fsdata to prevent any concurrent opens */
+	target_dentry->d_fsdata = CIFS_FSDATA_BLOCKED;
+
 	page1 = alloc_dentry_path();
 	page2 = alloc_dentry_path();
 	xid = get_xid();
@@ -2570,8 +2563,6 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir,
 		}
 	}
 
-	if (!rc)
-		rehash = false;
 	/*
 	 * No-replace is the natural behavior for CIFS, so skip unlink hacks.
 	 */
@@ -2662,8 +2653,6 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir,
 			}
 			rc = cifs_do_rename(xid, source_dentry, from_name,
 					    target_dentry, to_name);
-			if (!rc)
-				rehash = false;
 		}
 	}
 
@@ -2671,8 +2660,9 @@ cifs_rename2(struct mnt_idmap *idmap, struct inode *source_dir,
 	CIFS_I(source_dir)->time = CIFS_I(target_dir)->time = 0;
 
 cifs_rename_exit:
-	if (rehash)
-		d_rehash(target_dentry);
+	/* Allow lookups */
+	store_release_wake_up(&target_dentry->d_fsdata, NULL);
+
 	kfree(info_buf_source);
 	free_dentry_path(page2);
 	free_dentry_path(page1);
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 27/53] smb/client: use d_splice_alias() in atomic_open
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

atomic_open can be called with a hashed-negative dentry or an in-lookup
dentry.  Rather than d_drop() and d_add() we can use d_splice_alias()
which keeps the dentry hashed - important for proposed locking changes.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/smb/client/dir.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/smb/client/dir.c b/fs/smb/client/dir.c
index cecbc0cce5c5..361a20987927 100644
--- a/fs/smb/client/dir.c
+++ b/fs/smb/client/dir.c
@@ -439,8 +439,7 @@ static int cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned
 			goto out_err;
 		}
 
-	d_drop(direntry);
-	d_add(direntry, newinode);
+	d_splice_alias(newinode, direntry);
 
 out:
 	free_dentry_path(page);
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 28/53] smb/client: Use d_alloc_noblock() in cifs_prime_dcache()
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

cifs uses the results of readdir to prime the dcache.  Using
d_alloc_parallel() can block if there is a concurrent lookup.  Blocking
in that case is pointless as the lookup will add info to the dcache and
there is no value in the readdir waiting to see if it should add the
info too.

Also this call to d_alloc_parallel() is made while the parent
directory is locked.  A proposed change to locking will lock the parent
later, after d_alloc_parallel().  This means it won't be safe to wait in
d_alloc_parallel() while holding the directory lock.

So change to use d_alloc_noblock().

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/smb/client/readdir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/smb/client/readdir.c b/fs/smb/client/readdir.c
index 47f5d620b750..dabf9507bc40 100644
--- a/fs/smb/client/readdir.c
+++ b/fs/smb/client/readdir.c
@@ -104,7 +104,7 @@ cifs_prime_dcache(struct dentry *parent, struct qstr *name,
 		    (fattr->cf_flags & CIFS_FATTR_NEED_REVAL))
 			return;
 
-		dentry = d_alloc_parallel(parent, name);
+		dentry = d_alloc_noblock(parent, name);
 	}
 	if (IS_ERR(dentry))
 		return;
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 29/53] exfat: simplify exfat_lookup()
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

1/ exfat_d_anon_disconn() serves no purpose.
  It is only called (on alias) when
         alias->d_parent == dentry->d_parent
  and in that case IS_ROOT(dentry) will return false, so the whole
  function will return false.
  So we can remove it.

2/ When an alias for the inode is found in the same parent
  it is always sufficient to d_move() the alias to the new
  name.  This will keep just one dentry around when there are multiple
  effective names, and it will always show the most recently used name,
  which appears to be the intention.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/exfat/namei.c | 36 +++++++-----------------------------
 1 file changed, 7 insertions(+), 29 deletions(-)

diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c
index 670116ae9ec8..e04cda7425da 100644
--- a/fs/exfat/namei.c
+++ b/fs/exfat/namei.c
@@ -711,11 +711,6 @@ static int exfat_find(struct inode *dir, const struct qstr *qname,
 	return 0;
 }
 
-static int exfat_d_anon_disconn(struct dentry *dentry)
-{
-	return IS_ROOT(dentry) && (dentry->d_flags & DCACHE_DISCONNECTED);
-}
-
 static struct dentry *exfat_lookup(struct inode *dir, struct dentry *dentry,
 		unsigned int flags)
 {
@@ -750,32 +745,15 @@ static struct dentry *exfat_lookup(struct inode *dir, struct dentry *dentry,
 	 * Checking "alias->d_parent == dentry->d_parent" to make sure
 	 * FS is not corrupted (especially double linked dir).
 	 */
-	if (alias && alias->d_parent == dentry->d_parent &&
-			!exfat_d_anon_disconn(alias)) {
-
+	if (alias && alias->d_parent == dentry->d_parent) {
 		/*
-		 * Unhashed alias is able to exist because of revalidate()
-		 * called by lookup_fast. You can easily make this status
-		 * by calling create and lookup concurrently
-		 * In such case, we reuse an alias instead of new dentry
+		 * As EXFAT does not support hard-links this must
+		 * be an alternate name for the same file,
+		 * possibly longname vs 8.3 alias.
+		 * Rather than allocating a new dentry, use the old
+		 * one but keep the most recently used name.
 		 */
-		if (d_unhashed(alias)) {
-			WARN_ON(alias->d_name.hash_len !=
-				dentry->d_name.hash_len);
-			exfat_info(sb, "rehashed a dentry(%p) in read lookup",
-				   alias);
-			d_drop(dentry);
-			d_rehash(alias);
-		} else if (!S_ISDIR(i_mode)) {
-			/*
-			 * This inode has non anonymous-DCACHE_DISCONNECTED
-			 * dentry. This means, the user did ->lookup() by an
-			 * another name (longname vs 8.3 alias of it) in past.
-			 *
-			 * Switch to new one for reason of locality if possible.
-			 */
-			d_move(alias, dentry);
-		}
+		d_move(alias, dentry);
 		iput(inode);
 		mutex_unlock(&EXFAT_SB(sb)->s_lock);
 		return alias;
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 30/53] configfs: remove d_add() calls before configfs_attach_group()
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

These d_add() calls cannot be necessary.  The inode given is NULL so all
they do is attach the dentry to the hash table.

If configfs_attach_group() fails, then d_drop() is called so the dentry
will be detached.
If configfs_attach_group() succeeds, then
 configfs_attach_group -> configfs_attach_item ->configfs_create_dir
must have succeeded, so d_instantiate() will have been called and the
dentry hashed there.

So the only effect is that the dentry will be hashed-negative for a
short period which will allow a lookup to find nothing without waiting
for the directory i_rwsem.  I can find no indication that this might be
important.

Adding a dentry as negative, and then later making it positive is an
unusual pattern and appears to be unnecessary, so it is best avoided.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/configfs/dir.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index 362b6ff9b908..c82eca0b5d73 100644
--- a/fs/configfs/dir.c
+++ b/fs/configfs/dir.c
@@ -706,8 +706,6 @@ static int create_default_group(struct config_group *parent_group,
 	ret = -ENOMEM;
 	child = d_alloc_name(parent, group->cg_item.ci_name);
 	if (child) {
-		d_add(child, NULL);
-
 		ret = configfs_attach_group(&parent_group->cg_item,
 					    &group->cg_item, child, frag);
 		if (!ret) {
@@ -1904,8 +1902,6 @@ int configfs_register_subsystem(struct configfs_subsystem *subsys)
 	err = -ENOMEM;
 	dentry = d_alloc_name(root, group->cg_item.ci_name);
 	if (dentry) {
-		d_add(dentry, NULL);
-
 		err = configfs_dirent_exists(dentry);
 		if (!err)
 			err = configfs_attach_group(sd->s_element,
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 31/53] configfs: stop using d_add().
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

"Best practice" is to use d_splice_alias() at the end of a ->lookup
function.  d_add() often works and is not incorrect in configfs, but as
it is planned to remove d_add(), change to use d_splice_alias().

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/configfs/dir.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index c82eca0b5d73..6ec589b6b8a4 100644
--- a/fs/configfs/dir.c
+++ b/fs/configfs/dir.c
@@ -501,8 +501,7 @@ static struct dentry * configfs_lookup(struct inode *dir,
 	}
 	spin_unlock(&configfs_dirent_lock);
 done:
-	d_add(dentry, inode);
-	return NULL;
+	return d_splice_alias(inode, dentry);
 }
 
 /*
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 32/53] ext4: move dcache modifying code out of __ext4_link()
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

__ext4_link() is separate from ext4_link() so that it can be used for
replaying a "fast_commit" which records a link operation.
Replaying the fast_commit does not require any interaction with the
dcache - it is purely ext4-local - but it uses a dentry to simplify code
reuse.

An interface it uses - d_alloc() - is not generally useful and will soon
be removed.  This patch prepares ext4 for that removal.  Specifically it
removes all dcache-modification code from __ext4_link().  This will mean
that __ext4_link() treats the dentry as read-only so fast_commit reply
can simply provide an on-stack dentry.

Various "const" markers are sprinkled around to confirm that the dentry
is treated read-only.

This patch only rearranges code and slightly re-orders it, so those
changes can be reviewed separately.  The next patch will remove the use
of d_alloc().

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/dcache.c                 |  2 +-
 fs/ext4/ext4.h              |  4 ++--
 fs/ext4/fast_commit.c       | 14 +++++++++++---
 fs/ext4/namei.c             | 23 +++++++++++++----------
 include/linux/dcache.h      |  2 +-
 include/trace/events/ext4.h |  4 ++--
 6 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index a1219b446b74..c48337d95f9a 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -358,7 +358,7 @@ static inline int dname_external(const struct dentry *dentry)
 	return dentry->d_name.name != dentry->d_shortname.string;
 }
 
-void take_dentry_name_snapshot(struct name_snapshot *name, struct dentry *dentry)
+void take_dentry_name_snapshot(struct name_snapshot *name, const struct dentry *dentry)
 {
 	unsigned seq;
 	const unsigned char *s;
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 293f698b7042..1794407652ff 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2972,7 +2972,7 @@ void ext4_fc_track_range(handle_t *handle, struct inode *inode, ext4_lblk_t star
 void __ext4_fc_track_unlink(handle_t *handle, struct inode *inode,
 	struct dentry *dentry);
 void __ext4_fc_track_link(handle_t *handle, struct inode *inode,
-	struct dentry *dentry);
+	const struct dentry *dentry);
 void ext4_fc_track_unlink(handle_t *handle, struct dentry *dentry);
 void ext4_fc_track_link(handle_t *handle, struct dentry *dentry);
 void __ext4_fc_track_create(handle_t *handle, struct inode *inode,
@@ -3719,7 +3719,7 @@ extern int ext4_handle_dirty_dirblock(handle_t *handle, struct inode *inode,
 extern int __ext4_unlink(struct inode *dir, const struct qstr *d_name,
 			 struct inode *inode, struct dentry *dentry);
 extern int __ext4_link(struct inode *dir, struct inode *inode,
-		       struct dentry *dentry);
+		       const struct dentry *dentry);
 
 #define S_SHIFT 12
 static const unsigned char ext4_type_by_mode[(S_IFMT >> S_SHIFT) + 1] = {
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index f575751f1cae..2a5daf1d9667 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -388,7 +388,7 @@ static int ext4_fc_track_template(
 }
 
 struct __track_dentry_update_args {
-	struct dentry *dentry;
+	const struct dentry *dentry;
 	int op;
 };
 
@@ -400,7 +400,7 @@ static int __track_dentry_update(handle_t *handle, struct inode *inode,
 	struct ext4_inode_info *ei = EXT4_I(inode);
 	struct __track_dentry_update_args *dentry_update =
 		(struct __track_dentry_update_args *)arg;
-	struct dentry *dentry = dentry_update->dentry;
+	const struct dentry *dentry = dentry_update->dentry;
 	struct inode *dir = dentry->d_parent->d_inode;
 	struct super_block *sb = inode->i_sb;
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -483,7 +483,7 @@ void ext4_fc_track_unlink(handle_t *handle, struct dentry *dentry)
 }
 
 void __ext4_fc_track_link(handle_t *handle,
-	struct inode *inode, struct dentry *dentry)
+	struct inode *inode, const struct dentry *dentry)
 {
 	struct __track_dentry_update_args args;
 	int ret;
@@ -1471,7 +1471,15 @@ static int ext4_fc_replay_link_internal(struct super_block *sb,
 		goto out;
 	}
 
+	ihold(inode);
+	inc_nlink(inode);
 	ret = __ext4_link(dir, inode, dentry_inode);
+	if (ret) {
+		drop_nlink(inode);
+		iput(inode);
+	} else {
+		d_instantiate(dentry_inode, inode);
+	}
 	/*
 	 * It's possible that link already existed since data blocks
 	 * for the dir in question got persisted before we crashed OR
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index c4b5e252af0e..80e1051cab44 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2353,7 +2353,7 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname,
  * may not sleep between calling this and putting something into
  * the entry, as someone else might have used it while you slept.
  */
-static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
+static int ext4_add_entry(handle_t *handle, const struct dentry *dentry,
 			  struct inode *inode)
 {
 	struct inode *dir = d_inode(dentry->d_parent);
@@ -3445,7 +3445,7 @@ static int ext4_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	return err;
 }
 
-int __ext4_link(struct inode *dir, struct inode *inode, struct dentry *dentry)
+int __ext4_link(struct inode *dir, struct inode *inode, const struct dentry *dentry)
 {
 	handle_t *handle;
 	int err, retries = 0;
@@ -3460,8 +3460,6 @@ int __ext4_link(struct inode *dir, struct inode *inode, struct dentry *dentry)
 		ext4_handle_sync(handle);
 
 	inode_set_ctime_current(inode);
-	ext4_inc_count(inode);
-	ihold(inode);
 
 	err = ext4_add_entry(handle, dentry, inode);
 	if (!err) {
@@ -3471,11 +3469,7 @@ int __ext4_link(struct inode *dir, struct inode *inode, struct dentry *dentry)
 		 */
 		if (inode->i_nlink == 1)
 			ext4_orphan_del(handle, inode);
-		d_instantiate(dentry, inode);
-		ext4_fc_track_link(handle, dentry);
-	} else {
-		drop_nlink(inode);
-		iput(inode);
+		__ext4_fc_track_link(handle, inode, dentry);
 	}
 	ext4_journal_stop(handle);
 	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries))
@@ -3504,7 +3498,16 @@ static int ext4_link(struct dentry *old_dentry,
 	err = dquot_initialize(dir);
 	if (err)
 		return err;
-	return __ext4_link(dir, inode, dentry);
+	ihold(inode);
+	ext4_inc_count(inode);
+	err = __ext4_link(dir, inode, dentry);
+	if (err) {
+		drop_nlink(inode);
+		iput(inode);
+	} else {
+		d_instantiate(dentry, inode);
+	}
+	return err;
 }
 
 /*
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index a97eb151d9db..3b12577ddfbb 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -600,7 +600,7 @@ struct name_snapshot {
 	struct qstr name;
 	union shortname_store inline_name;
 };
-void take_dentry_name_snapshot(struct name_snapshot *, struct dentry *);
+void take_dentry_name_snapshot(struct name_snapshot *, const struct dentry *);
 void release_dentry_name_snapshot(struct name_snapshot *);
 
 static inline struct dentry *d_first_child(const struct dentry *dentry)
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index a3e8fe414df8..efcf1018c208 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2870,7 +2870,7 @@ TRACE_EVENT(ext4_fc_stats,
 DECLARE_EVENT_CLASS(ext4_fc_track_dentry,
 
 	TP_PROTO(handle_t *handle, struct inode *inode,
-		 struct dentry *dentry, int ret),
+		 const struct dentry *dentry, int ret),
 
 	TP_ARGS(handle, inode, dentry, ret),
 
@@ -2902,7 +2902,7 @@ DECLARE_EVENT_CLASS(ext4_fc_track_dentry,
 #define DEFINE_EVENT_CLASS_DENTRY(__type)				\
 DEFINE_EVENT(ext4_fc_track_dentry, ext4_fc_track_##__type,		\
 	TP_PROTO(handle_t *handle, struct inode *inode,			\
-		 struct dentry *dentry, int ret),			\
+		 const struct dentry *dentry, int ret),			\
 	TP_ARGS(handle, inode, dentry, ret)				\
 )
 
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 33/53] ext4: use on-stack dentries in ext4_fc_replay_link_internal()
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

ext4_fc_replay_link_internal() uses two dentries to simply code-reuse
when replaying a "link" operation.  It does not need to interact with
the dcache and removes the dentries shortly after adding them.

They are passed to __ext4_link() which only performs read accesses on
these dentries and only uses the name and parent of dentry_inode (plus
checking a flag is unset) and only uses the inode of the parent.

So instead of allocating dentries and adding them to the dcache, allocat
two dentries on the stack, set up the required fields, and pass these to
__ext4_link().

This substantially simplifies the code and removes on of the few uses of
d_alloc() - preparing for its removal.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/ext4/fast_commit.c | 40 ++++++++--------------------------------
 1 file changed, 8 insertions(+), 32 deletions(-)

diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 2a5daf1d9667..e3593bb90a62 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1446,8 +1446,6 @@ static int ext4_fc_replay_link_internal(struct super_block *sb,
 				struct inode *inode)
 {
 	struct inode *dir = NULL;
-	struct dentry *dentry_dir = NULL, *dentry_inode = NULL;
-	struct qstr qstr_dname = QSTR_INIT(darg->dname, darg->dname_len);
 	int ret = 0;
 
 	dir = ext4_iget(sb, darg->parent_ino, EXT4_IGET_NORMAL);
@@ -1457,28 +1455,14 @@ static int ext4_fc_replay_link_internal(struct super_block *sb,
 		goto out;
 	}
 
-	dentry_dir = d_obtain_alias(dir);
-	if (IS_ERR(dentry_dir)) {
-		ext4_debug("Failed to obtain dentry");
-		dentry_dir = NULL;
-		goto out;
-	}
+	{
+		struct dentry dentry_dir = { .d_inode = dir };
+		const struct dentry dentry_inode = {
+			.d_parent = &dentry_dir,
+			.d_name = QSTR_LEN(darg->dname, darg->dname_len),
+		};
 
-	dentry_inode = d_alloc(dentry_dir, &qstr_dname);
-	if (!dentry_inode) {
-		ext4_debug("Inode dentry not created.");
-		ret = -ENOMEM;
-		goto out;
-	}
-
-	ihold(inode);
-	inc_nlink(inode);
-	ret = __ext4_link(dir, inode, dentry_inode);
-	if (ret) {
-		drop_nlink(inode);
-		iput(inode);
-	} else {
-		d_instantiate(dentry_inode, inode);
+		ret = __ext4_link(dir, inode, &dentry_inode);
 	}
 	/*
 	 * It's possible that link already existed since data blocks
@@ -1493,16 +1477,8 @@ static int ext4_fc_replay_link_internal(struct super_block *sb,
 
 	ret = 0;
 out:
-	if (dentry_dir) {
-		d_drop(dentry_dir);
-		dput(dentry_dir);
-	} else if (dir) {
+	if (dir)
 		iput(dir);
-	}
-	if (dentry_inode) {
-		d_drop(dentry_inode);
-		dput(dentry_inode);
-	}
 
 	return ret;
 }
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 34/53] tracefs: stop using d_add().
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

"Best practice" is to use d_splice_alias() at the end of a ->lookup
function.  d_add() often works and is not incorrect in tracefs, but as
it is planned to remove d_add(), change to use d_splice_alias().

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/tracefs/event_inode.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 8e5ac464b328..c30567b5331e 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -393,8 +393,7 @@ static struct dentry *lookup_file(struct eventfs_inode *parent_ei,
 	// Files have their parent's ei as their fsdata
 	dentry->d_fsdata = get_ei(parent_ei);
 
-	d_add(dentry, inode);
-	return NULL;
+	return d_splice_alias(inode, dentry);
 };
 
 /**
@@ -424,8 +423,7 @@ static struct dentry *lookup_dir_entry(struct dentry *dentry,
 
 	dentry->d_fsdata = get_ei(ei);
 
-	d_add(dentry, inode);
-	return NULL;
+	return d_splice_alias(inode, dentry);
 }
 
 static inline struct eventfs_inode *init_ei(struct eventfs_inode *ei, const char *name)
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 35/53] cephfs: stop using d_add().
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

"Best practice" is to use d_splice_alias() at the end of a ->lookup
function.  d_add() often works and is not incorrect in tracefs, as the
inode is always NULL, but as it is planned to remove d_add(), change to
use d_splice_alias().

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/ceph/dir.c   | 5 ++---
 fs/ceph/inode.c | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 86d7aa594ea9..c7dac71b55bd 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -769,7 +769,7 @@ struct dentry *ceph_finish_lookup(struct ceph_mds_request *req,
 				d_drop(dentry);
 				err = -ENOENT;
 			} else {
-				d_add(dentry, NULL);
+				d_splice_alias(NULL, dentry);
 			}
 		}
 	}
@@ -840,9 +840,8 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
 			spin_unlock(&ci->i_ceph_lock);
 			doutc(cl, " dir %llx.%llx complete, -ENOENT\n",
 			      ceph_vinop(dir));
-			d_add(dentry, NULL);
 			di->lease_shared_gen = atomic_read(&ci->i_shared_gen);
-			return NULL;
+			return d_splice_alias(NULL, dentry);
 		}
 		spin_unlock(&ci->i_ceph_lock);
 	}
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index d76f9a79dc0c..59f9f6948bb2 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -1773,7 +1773,7 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req)
 				d_delete(dn);
 			} else if (have_lease) {
 				if (d_unhashed(dn))
-					d_add(dn, NULL);
+					d_splice_alias(NULL, dn);
 			}
 
 			if (!d_unhashed(dn) && have_lease)
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

* [PATCH 36/53] cephfs: remove d_alloc from CEPH_MDS_OP_LOOKUPNAME handling in ceph_fill_trace()
From: NeilBrown @ 2026-03-12 21:12 UTC (permalink / raw)
  To: Linus Torvalds, Alexander Viro, Christian Brauner, Jan Kara,
	Jeff Layton, Trond Myklebust, Anna Schumaker, Carlos Maiolino,
	Miklos Szeredi, Amir Goldstein, Jan Harkes, Hugh Dickins,
	Baolin Wang, David Howells, Marc Dionne, Steve French,
	Namjae Jeon, Sungjong Seo, Yuezhang Mo, Andreas Hindborg,
	Breno Leitao, Theodore Ts'o, Andreas Dilger, Steven Rostedt,
	Masami Hiramatsu, Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko,
	Tyler Hicks, Andreas Gruenbacher, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Jeremy Kerr, Ard Biesheuvel
  Cc: linux-fsdevel, linux-nfs, linux-xfs, linux-unionfs, coda,
	linux-mm, linux-afs, linux-cifs, linux-ext4, linux-kernel,
	linux-trace-kernel, ceph-devel, ecryptfs, gfs2, linux-um,
	linux-efi
In-Reply-To: <20260312214330.3885211-1-neilb@ownmail.net>

From: NeilBrown <neil@brown.name>

When performing a get_name export_operation, ceph sends a LOOKUPNAME op
to the server.  When it gets a reply it tries to look up the name
locally and if the name exists in the dcache with the wrong inode, it
discards the result and tries again.

If it doesn't find the name in the dcache it will allocate a new dentry
and never make any use of it.  The dentry is never instantiated and is
assigned to ->r_dentry which is then freed by post-op cleanup.

As this is a waste, and as there is a plan to remove d_alloc(), this
code is discarded.

Also try_lookup_noperm() is used in place of full_name_hash() and
d_lookup(), and QSTR_LEN() is used to initialise dname.

Signed-off-by: NeilBrown <neil@brown.name>
---
 fs/ceph/inode.c | 29 +++++++----------------------
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 59f9f6948bb2..0982fbda2a82 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -15,6 +15,7 @@
 #include <linux/sort.h>
 #include <linux/iversion.h>
 #include <linux/fscrypt.h>
+#include <linux/namei.h>
 
 #include "super.h"
 #include "mds_client.h"
@@ -1623,33 +1624,17 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req)
 				ceph_fname_free_buffer(parent_dir, &oname);
 				goto done;
 			}
-			dname.name = oname.name;
-			dname.len = oname.len;
-			dname.hash = full_name_hash(parent, dname.name, dname.len);
+			dname = QSTR_LEN(oname.name, oname.len);
 			tvino.ino = le64_to_cpu(rinfo->targeti.in->ino);
 			tvino.snap = le64_to_cpu(rinfo->targeti.in->snapid);
 retry_lookup:
-			dn = d_lookup(parent, &dname);
+			dn = try_lookup_noperm(&dname, parent);
 			doutc(cl, "d_lookup on parent=%p name=%.*s got %p\n",
 			      parent, dname.len, dname.name, dn);
-
-			if (!dn) {
-				dn = d_alloc(parent, &dname);
-				doutc(cl, "d_alloc %p '%.*s' = %p\n", parent,
-				      dname.len, dname.name, dn);
-				if (!dn) {
-					dput(parent);
-					ceph_fname_free_buffer(parent_dir, &oname);
-					err = -ENOMEM;
-					goto done;
-				}
-				if (is_nokey) {
-					spin_lock(&dn->d_lock);
-					dn->d_flags |= DCACHE_NOKEY_NAME;
-					spin_unlock(&dn->d_lock);
-				}
-				err = 0;
-			} else if (d_really_is_positive(dn) &&
+			if (IS_ERR(dn))
+				/* should be impossible */
+				dn = NULL;
+			if (dn && d_really_is_positive(dn) &&
 				   (ceph_ino(d_inode(dn)) != tvino.ino ||
 				    ceph_snap(d_inode(dn)) != tvino.snap)) {
 				doutc(cl, " dn %p points to wrong inode %p\n",
-- 
2.50.0.107.gf914562f5916.dirty


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox