public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: linux-kernel@vger.kernel.org,
	Bartosz Golaszewski <bartosz.golaszewski@linaro.org>,
	Marc Zyngier <maz@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak
Date: Wed, 30 Aug 2023 00:29:41 +0200	[thread overview]
Message-ID: <877cpd7a96.ffs@tglx> (raw)
In-Reply-To: <CAMRc=Md6NA6-rBWL1ti66X5Rt3C4Y2irfrSZnCo3wQSCqT6nPQ@mail.gmail.com>

On Tue, Aug 29 2023 at 14:24, Bartosz Golaszewski wrote:
> On Tue, Aug 29, 2023 at 11:11 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> > On Mon, Aug 28, 2023 at 11:44 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> So when the USB device disconnects, then something needs to tell the
>> i2c-xx-driver that the I2C interface is not longer available, right?
>>
>> IOW, the unbind operation needs the following notification and teardown
>> order:
>>
>>    1) USB core notifies hid-cp2112
>>
>>    2) hid-cp2112 notifies i2c-xx-driver
>>
>>    3) i2c-xx-driver mops up and invokes free_irq()
>>
>>    4) hid-cp2112 removes the interrupt domain
>>
>> But it seems that you end up with a situation where the notification of
>> the i2c-xx-driver is either not happening or the driver in question is
>> simply failing to mop up and free the requested interrupt.
>>
>
> Yes, there's no notification of any kind.

I'm not buying that.

  usb disconnect
    ...
      cp2112_remove()
        i2c_del_adapter()
          i2c_unregister_device(client)
            ...
            device_unregister()
              device_del()
                bus_notify()            // Mechanism #1
                  i2c_device_remove()
                    if (dev->remove)
                       dev->remove()
                ...
                device_unbind_cleanup()
                  devres_release_all()  // Mechanism #2

        gpiochip_remove()

There are very well notifications to the drivers about unplug of a
device. Otherwise this would end up in a complete disaster and a lot
more stale data and state than just a procfs file or a requested
interrupt.

So the mechanisms are there, no?

If this is just about the problem that some device driver writers fail
to implement them correctly, then yes it makes sense to have a last
resort fallback which cleans them up and emits a big fat warning.

Making this a programming model would be beyond wrong.

> It's a common problem unfortunately across different subsystems. We
> have hot-unpluggable consumers using resources that don't support it
> (like interrupts in this example).

All hotpluggable consumers have at least one mechanism to mop up the
resources they allocated. There are a lot of resources in the kernel
which do not clean themself up magically.

So what's so special about interrupts? They are not any different from a
pointer which is registered at some entity and the device driver writer
forgets to unregister it, but the underlying resource is freed. That's
even worse than the leaked interrupt and cannot be magically undone at
all.

Whatever you try, you can't turn driver programming into a task which
can be accomplished w/o brains.

> For interrupts it would mean that when the consumer calls
> request_irq(), the number it gets is a weak reference to the irq_desc.

Interrupt numbers are weak references by definition. request_irq() does
not return an interrupt number, it returns success or fail. The
interrupt number is handed to request_irq(), no?

The entities which hand out the interrupt number are a complete
different story. But that number is from their perspective a weak
reference too.

> For any management operation we lock irq_desc.

That's required anyway, but irq_desc::lock is not a sufficient
protection against a teardown race.

> If the domain is destroyed, irq_descs get destroyed with it

Interrupts consist of more than just an interrupt descriptor. If you
care to look at the internals, then the descriptor is the last entity
which goes away simply because all other related resources hang off the
interrupt descriptor.

So they obviously need to be mopped up first and trying to mop up
requested interrupts at the point where the interrupt descriptor is
freed is way too late.

In fact they need to mopped up first, which is obvious from the setup
ordering as everything underneath must be in place already before
request_irq() can succeed. The only sensible ordering for teardown is
obviously the reverse of setup, but that's not specific to interrupts at
all.

> (after all users leave the critical section).

Which critical section? Interrupts have more than just the
irq_desc::lock critical section which needs to be left.

> Next call to any of the functions looks up the irq number and sees
> it's gone. It fails or silently returns depending on the function
> (e.g. irq_free() would have to ignore the missing lookup).

That's what happens today already.

The missing bit is the magic function which mops up when the driver
writer blew it up. But that's far from a 'put a trivial free_irq() call
somewhere'.

First of all we have too many interrupt mechanisms ranging from the
linux 0.1 model with static interrupt spaces to hierarchical interrupt
domains.

  - Almost everything which is interrupt domain based (hierarchical or
    not) can probably be "addressed" by some definition of "addressed"
    because there are only a few places in the core code which are
    involved in releasing individual or domain wide resources.

    But see below.

  - The incarnations which do not use interrupt domains are way harder
    because the teardown of the interrupt resources is not happening in
    the core code. It's happening at the driver side and the core is
    only involved to release the interrupt descriptor. But that's too
    late.

    The good news about those is that probably the vast majority of the
    instances is built in, mostly legacy and not pluggable.

So lets assume that anything which is not interrupt domain based is not
relevant, which reduces the problem space significantly. But the
remaining space is still non-trivial.

  1) Life-time

     Interrupt descriptors are RCU freed but that's mostly for external
     usage like /proc/interrupts

     Still most of the core code relies on the proper setup/teardown
     order, so there would be quite some work to do vs. validating that
     an interrupt descriptor is consistent at all hierarchy levels.

     That's going to be interesting vs. interfaces which can be invoked
     from pretty much any context (except NMI).

     irq_desc::lock is not the panacea here because it obviously can
     only be held for real short truly atomic sections. But quite some
     of the setup/teardown at all levels requires sleepable context.

  2) Concurrency

     Protection against concurrent interrupt handler execution is
     covered in free_irq() as it fully synchronizes.

     That's the least of my worries.

     Whatever the invalidation mechanism might be, pretty much every
     usage site of irq_to_desc() and any usage site of interrupt
     descriptors which are stored outside of the maple_tree for
     performance reasons need to be audited.

     The validation has to take into account whether that's an requested
     descriptor or an unused one.
  
So no, neither removing some procfs entry at the wrong point nor
slapping some variant of free_irq() into random places is going to cut
it.

Your idea of tracking request_irq()/free_irq() at some subsystem level
is not going to work either simply because it requires that such muck is
sprinkled all over the place.

I completely agree that the interrupt core code does not have enough
places which emit a prominent warning when the rules are violated.

So sure, in some way or the other a "fix" by some definition of "fix"
could be implemented, but I'm really not convinced that's the right way
to waste^Wspend our time with.

Especially not because interrupt handling is a hot-path and any
life-time/conncurrency validation mechanism will introduce overhead no
matter what.

Thanks,

        tglx

  parent reply	other threads:[~2023-08-29 22:30 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-14  9:36 [PATCH 0/2] genirq: don't leak handler procfs entries Bartosz Golaszewski
2023-08-14  9:36 ` [PATCH 1/2] genirq: proc: drop unused argument from unregister_handler_proc() Bartosz Golaszewski
2023-08-14  9:36 ` [PATCH 2/2] genirq: proc: fix a procfs entry leak Bartosz Golaszewski
2023-08-24 20:12   ` Thomas Gleixner
2023-08-25  7:36     ` brgl
2023-08-25  8:11       ` Thomas Gleixner
2023-08-25 11:01         ` Bartosz Golaszewski
2023-08-25 14:58           ` Thomas Gleixner
2023-08-25 17:08           ` Thomas Gleixner
2023-08-25 20:07             ` Bartosz Golaszewski
2023-08-26 15:08               ` Thomas Gleixner
2023-08-28 10:06                 ` Bartosz Golaszewski
2023-08-28 12:41                   ` Thomas Gleixner
2023-08-28 19:03                     ` Bartosz Golaszewski
2023-08-28 21:44                       ` Thomas Gleixner
2023-08-29  6:26                         ` Bartosz Golaszewski
2023-08-29  9:11                           ` Thomas Gleixner
2023-08-29 12:24                             ` Bartosz Golaszewski
2023-08-29 20:18                               ` Greg Kroah-Hartman
2023-08-29 22:29                               ` Thomas Gleixner [this message]
2023-09-06 14:54                                 ` Bartosz Golaszewski
2023-09-12 18:16                                   ` Thomas Gleixner
2023-09-15 19:50                                     ` Bartosz Golaszewski
2023-11-13 20:53                                       ` Bartosz Golaszewski
2023-11-14 12:27                                         ` Greg Kroah-Hartman
2023-11-14 13:25                                         ` Linus Walleij

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877cpd7a96.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=bartosz.golaszewski@linaro.org \
    --cc=brgl@bgdev.pl \
    --cc=gregkh@linuxfoundation.org \
    --cc=linus.walleij@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox