All of lore.kernel.org
 help / color / mirror / Atom feed
From: Breno Leitao <leitao@debian.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	leit@meta.com, "Peter Zijlstra (Intel)" <peterz@infradead.org>,
	Wei Liu <wei.liu@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Adrian Huang <ahuang12@lenovo.com>,
	"open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86/apic: Add retry mechanism to add_pin_to_irq_node()
Date: Mon, 29 Jul 2024 09:55:53 -0700	[thread overview]
Message-ID: <ZqfJmUF8sXIyuSHN@gmail.com> (raw)
In-Reply-To: <874j8889ch.ffs@tglx>

Hello Thomas,

On Mon, Jul 29, 2024 at 06:13:34PM +0200, Thomas Gleixner wrote:
> On Mon, Jul 29 2024 at 07:06, Breno Leitao wrote:
> > I've been running some experiments with failslab fault injector running
> > to detect a different problem, and the machine always crash with the
> > following stack:
> >
> > 	can not alloc irq_pin_list (-1,0,20)
> > 	Kernel panic - not syncing: IO-APIC: failed to add irq-pin. Can not proceed
> >
> > 	Call Trace:
> > 	 panic
> > 	   _printk
> > 	   panic_smp_self_stop
> > 	   rcu_is_watching
> > 	   intel_irq_remapping_free
> 
> This completely lacks context. When does this happen? What's the system
> state? What has intel_irq_remapping_free() to do with the allocation path?

Sorry, let me clarify it a bit better:

1) This happens when the machine is booted up, and being under stress
2) This happens when I have failslab fault injection enabled.
3) The machine crashes after hitting this error.
4) This is reproducible with `stress-ng` using the `--aggressive` parameter
5) This is the full stack (sorry for not decoding the stack, but if you
   need it, I am more than happy to give you a decoded stack)


 04:12:34  can not alloc irq_pin_list (-1,0,20)
           Kernel panic - not syncing: IO-APIC: failed to add irq-pin. Can not proceed
           CPU: 11 UID: 0 PID: 335023 Comm: stress-ng-dev Kdump: loaded Tainted: G S          E    N 6.10.0-12563-gdb0610128a16 #48

           Call Trace:
            <TASK>
            panic+0x4e9/0x590
            ? _printk+0xb3/0xe0
            ? panic_smp_self_stop+0x70/0x70
            ? rcu_is_watching+0xe/0xb0
            ? intel_irq_remapping_free+0x30/0x30
            ? __add_pin_to_irq_node+0xf4/0x2d0
            ? rcu_is_watching+0xe/0xb0
            mp_irqdomain_alloc+0x9ab/0xa80
            ? IO_APIC_get_PCI_irq_vector+0x850/0x850
            ? __kmalloc_cache_node_noprof+0x1e0/0x360
            ? mutex_lock_io_nested+0x1420/0x1420
            irq_domain_alloc_irqs_locked+0x25d/0x8d0
            __irq_domain_alloc_irqs+0x80/0x110
            mp_map_pin_to_irq+0x645/0x890
            ? __acpi_get_override_irq+0x350/0x350
            ? mutex_lock_io_nested+0x1420/0x1420
            ? lockdep_hardirqs_on_prepare+0x400/0x400
            ? mp_map_gsi_to_irq+0xe6/0x1b0
            acpi_register_gsi_ioapic+0xe6/0x150
            ? acpi_unregister_gsi_ioapic+0x40/0x40
            ? mark_held_locks+0x9f/0xe0
            ? _raw_spin_unlock_irq+0x24/0x50
            hpet_open+0x313/0x480
            misc_open+0x306/0x420
            chrdev_open+0x218/0x660
            ? __unregister_chrdev+0xe0/0xe0
            ? security_file_open+0x3d4/0x740
            do_dentry_open+0x4a1/0x1300
            ? __unregister_chrdev+0xe0/0xe0
            vfs_open+0x7e/0x350
            path_openat+0xb46/0x2740
            ? kernel_tmpfile_open+0x60/0x60
            ? lock_acquire+0x1e4/0x650
            do_filp_open+0x1af/0x3e0
            ? path_openat+0x2740/0x2740
            ? do_raw_spin_lock+0x12d/0x270
            ? spin_bug+0x1d0/0x1d0
            ? _raw_spin_unlock+0x29/0x40
            ? alloc_fd+0x1e6/0x640
            do_sys_openat2+0x117/0x150
            ? build_open_flags+0x450/0x450
            ? lock_downgrade+0x690/0x690
            __x64_sys_openat+0x11f/0x1d0
            ? __x64_sys_open+0x1a0/0x1a0
            ? do_syscall_64+0x36/0x190
            do_syscall_64+0x6e/0x190
            entry_SYSCALL_64_after_hwframe+0x4b/0x53
           RIP: 0033:0x7f6c406fd784
           Code: 24 20 eb 8f 66 90 44 89 54 24 0c e8 d6 88 f8 ff 44 8b 54 24 0c 44 89 e2 48 89 ee 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 34 44 89 c7 89 44 24 0c e8 28 89 f8 ff 8b 44
           RSP: 002b:00007fff72413a70 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
           RAX: ffffffffffffffda RBX: 00007f6c408c43a8 RCX: 00007f6c406fd784
           RDX: 0000000000000800 RSI: 000055759a5fc910 RDI: 00000000ffffff9c
           RBP: 000055759a5fc910 R08: 0000000000000000 R09: 0000000000000001
           R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000800
           R13: 00007fff72413c90 R14: 000055759a5fc910 R15: 00007f6c408c43a8
            </TASK>

> > This happens because add_pin_to_irq_node() function would panic if
> > adding a pin to an IRQ failed due to -ENOMEM (which was injected by
> > failslab fault injector).  I've been running with this patch in my test
> > cases in order to be able to pick real bugs, and I thought it might be a
> > good idea to have it upstream also, so, other people trying to find real
> > bugs don't stumble upon this one. Also, this makes sense in a real
> > world(?), when retrying a few times might be better than just
> > panicking.
> 
> While it seems to make sense, the reality is that this is mostly early
> boot code. If there is a real world memory allocation failure during
> early boot then retries will not help at all.

This is not happening at early boot, this is reproducible when running
stress-ng in this aggressive mode.

Since I have failslab injecting a kmalloc fault,
__add_pin_to_irq_noder() returns -ENOMEM, which causes the undesired
panic().

> > Introduce a retry mechanism that attempts to add the pin up to 3 times
> > before giving up and panicking. This should improve the robustness of
> > the IO-APIC code in the face of transient errors.
> 
> I'm absolutely not convinced by this loop heuristic. That's just a bad
> hack.

I will not disagree with you here, but I need to use this patch in order
to be able t keep the system not panicking and stable while fault
injecting slab errors and trying to reproduce a real bug in the network
stack.

Thanks for the review,
--breno

  reply	other threads:[~2024-07-29 16:55 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-29 14:06 [PATCH] x86/apic: Add retry mechanism to add_pin_to_irq_node() Breno Leitao
2024-07-29 16:13 ` Thomas Gleixner
2024-07-29 16:55   ` Breno Leitao [this message]
2024-07-29 17:44     ` Thomas Gleixner
2024-07-30 10:28       ` Breno Leitao
2024-08-07 16:25     ` [tip: x86/apic] x86/ioapic: Handle allocation failures gracefully tip-bot2 for Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZqfJmUF8sXIyuSHN@gmail.com \
    --to=leitao@debian.org \
    --cc=ahuang12@lenovo.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=leit@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=wei.liu@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.