From: Thomas Gleixner <tglx@linutronix.de>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: Eli Cohen <elic@nvidia.com>, Leon Romanovsky <leon@kernel.org>,
Saeed Mahameed <saeedm@nvidia.com>,
linux-rdma <linux-rdma@vger.kernel.org>,
"open list:NETWORKING [GENERAL]" <netdev@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: system hang on start-up (mlx5?)
Date: Wed, 31 May 2023 19:11:59 +0200 [thread overview]
Message-ID: <87leh4fmsg.ffs@tglx> (raw)
In-Reply-To: <48B0BC74-5F5C-4212-BC5A-552356E9FFB1@oracle.com>
On Wed, May 31 2023 at 15:06, Chuck Lever III wrote:
>> On May 31, 2023, at 10:43 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> mlx5_irq_alloc(af_desc)
>>   pci_msix_alloc_irq_at(af_desc)
>>     msi_domain_alloc_irq_at(af_desc)
>>       __msi_domain_alloc_irqs(af_desc)
>> 1)      msidesc->affinity = kmemdup(af_desc);
>>         __irq_domain_alloc_irqs()
>>           __irq_domain_alloc_irqs(aff=msidesc->affinity)
>>             irq_domain_alloc_irqs_locked(aff)
>>               irq_domain_alloc_irqs_locked(aff)
>>                 irq_domain_alloc_descs(aff)
>>                   alloc_desc(mask=&aff->mask)
>>                     desc_smp_init(mask)
>> 2)                    cpumask_copy(desc->irq_common_data.affinity, mask);
>>                 irq_domain_alloc_irqs_hierarchy()
>>                   msi_domain_alloc()
>>                     intel_irq_remapping_alloc()
>>                       x86_vector_alloc_irqs()
>
> It is x86_vector_alloc_irqs() where the struct irq_data is
> fabricated that ends up in irq_matrix_reserve_managed().
Kinda fabricated :)
irqd = irq_domain_get_irq_data(domain, virq + i);
That's looking up the irq_data associated with the vector domain, which
was allocated earlier. The affinity mask is retrieved via:
const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
which does:
return irqd->common->affinity;
irqd->common points to desc->irq_common_data. The affinity there was
copied in #2 above.
>> This also ends up in the wrong place. That mlx code does:
>>
>> af_desc.is_managed = false;
>>
>> but the allocation ends up allocating a managed vector.
>
> That line was changed in 6.4-rc4 to address another bug,
> and it avoids the crash by not calling into the misbehaving
> code. It doesn't address the mlx5_core initialization issue
> though, because as I said before, execution continues and
> crashes in a similar scenario later on.
Ok.
> On my system, I've reverted that fix:
>
> - af_desc.is_managed = false;
> + af_desc.is_managed = 1;
>
> so that we can continue debugging this crash.
Ah.
>> Can you please instrument this along the call chain so we can see where
>> or at least when this gets corrupted? Please print the relevant pointer
>> addresses too so we can see whether that's consistent or not.
>
> I will continue working on this today.
>> But that's just the symptom, not the root cause. That code is perfectly
>> fine when all callers use the proper cpumask functions.
>
> Agreed: we're crashing here because of the extra bits
> in the affinity mask, but those bits should not be set
> in the first place.
Correct.
> I wasn't sure if for_each_cpu() was supposed to iterate
> into non-present CPUs -- and I guess the answer
> is yes, it will iterate the full length of the mask.
> The caller is responsible for ensuring the mask is valid.
Yes, that's the assumption behind this constant optimization for the
small-number-of-CPUs case. All other cases use nr_cpu_ids as the limit
and won't iterate into non-possible CPUs. I didn't spot it last night
either.
Thanks,
tglx