public inbox for linux-kernel@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: Wen Xiong <wenxiong@linux.ibm.com>
Cc: linux-kernel@vger.kernel.org, gjoyce@linux.ibm.com,
	linux-pci@vger.kernel.org, Bjorn Helgaas <helgaas@kernel.org>,
	linux-scsi@vger.kernel.org
Subject: Re: [PATCH 1/1] genirq/msi: Dynamic remove/add stroage adapter hits EEH
Date: Thu, 20 Mar 2025 09:48:23 +0100
Message-ID: <874izo3x60.ffs@tglx>
In-Reply-To: <877c4k3yc5.ffs@tglx>

On Thu, Mar 20 2025 at 09:23, Thomas Gleixner wrote:
> On Wed, Mar 19 2025 at 21:58, Wen Xiong wrote:
>> We don't see the issue without dynamic remove/add operations.
>> There is a small window in which the irqbalance daemon kicks in during
>> device reset, so it took over 6 hours to recreate the issue when running
>> a remove/add loop.
>
> Sure. You need a loop to hit the window. And it does not matter whether
> it's the probe or the remove which triggers it. Fact is that the reset
> wipes out the config space, which means that any read from the config
> space between reset and restore will return garbage. That problem is not
> restricted to the interrupt code. It's a general problem.

After looking at the code again, it's a problem in the remove()
function:

__ipr_remove()
  ipr_initiate_ioa_bringdown() 
    // resets device
    restore_config_space()
  ....
  ipr_free_all_resources()
    free_irqs()

So yes, it's not probe(). But the question is pretty much the same.

Why is a reset issued while the driver is fully operational and
resources are still in use?

Don't even think about telling me that this is a problem of the MSI
interrupt rework. It is not. It's been broken forever.

You _cannot_ pull the rug out from under a fully operational driver and
expect that all involved parts will just magically handle this gracefully.

What about tearing down resources first and then issuing the reset?
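
To illustrate the ordering question, here is a minimal userspace model. It is
a sketch only: every name in it (model_dev, model_cfg_read(),
remove_reset_first(), ...) is made up for illustration and none of it is the
actual ipr or genirq code. It assumes that an affinity change touches config
space only while interrupts are still set up, and it uses an arbitrary
placeholder value for a "garbage" read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a PCI function. Not a kernel structure. */
struct model_dev {
	bool     cfg_valid;      /* false between reset and restore */
	bool     irqs_active;    /* true while interrupts are still set up */
	uint32_t msi_addr;       /* stand-in for an MSI register in config space */
	unsigned garbage_reads;  /* reads that hit the wiped config space */
};

/* Config space read: returns a placeholder garbage value while wiped. */
static uint32_t model_cfg_read(struct model_dev *d)
{
	if (!d->cfg_valid) {
		d->garbage_reads++;
		return 0xdeadbeefu;  /* arbitrary placeholder, not a real value */
	}
	return d->msi_addr;
}

/*
 * Stand-in for an affinity change, e.g. one triggered by irqbalance.
 * Assumption: it only touches the device while interrupts are set up.
 */
static void model_affinity_change(struct model_dev *d)
{
	if (d->irqs_active)
		(void)model_cfg_read(d);
}

static void model_reset(struct model_dev *d)     { d->cfg_valid = false; }
static void model_restore(struct model_dev *d)   { d->cfg_valid = true; }
static void model_free_irqs(struct model_dev *d) { d->irqs_active = false; }

/* Ordering from the call chain above: reset while interrupts are live. */
unsigned remove_reset_first(struct model_dev *d)
{
	model_reset(d);
	model_affinity_change(d);  /* lands between reset and restore */
	model_restore(d);
	model_free_irqs(d);
	return d->garbage_reads;
}

/* Suggested ordering: tear down resources first, then reset. */
unsigned remove_teardown_first(struct model_dev *d)
{
	model_free_irqs(d);
	model_reset(d);
	model_affinity_change(d);  /* no-op, nothing left to reprogram */
	model_restore(d);
	return d->garbage_reads;
}
```

Called on a freshly initialized model_dev (cfg_valid and irqs_active both
true), remove_reset_first() counts one read against the wiped config space,
while remove_teardown_first() counts none, because the interrupt side is
gone before the reset happens.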

Thanks,

        tglx

