From: Waiman Long
Date: Thu, 9 Oct 2025 22:52:21 -0400
Subject: Re: [PATCH v2] pci/aer_inject: switching inject_lock to raw_spinlock_t
To: Guangbo Cui, Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Thomas Gleixner, Bjorn Helgaas
Cc: Jonathan Cameron, linux-rt-devel@lists.linux.dev, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org
Message-ID: <3c20a451-ebc6-4057-be77-2caaf6d2317c@redhat.com>
In-Reply-To: <20251009150651.93618-1-jckeep.cuiguangbo@gmail.com>
References: <20251009150651.93618-1-jckeep.cuiguangbo@gmail.com>
X-Mailing-List: linux-rt-devel@lists.linux.dev

On 10/9/25 11:06 AM, Guangbo Cui wrote:
> When injecting AER errors under PREEMPT_RT, the kernel may trigger a
> lockdep warning about an invalid wait context:
>
> ```
> [ 1850.950780] [ BUG: Invalid wait context ]
> [ 1850.951152] 6.17.0-11316-g7a405dbb0f03-dirty #7 Not tainted
> [ 1850.951457] -----------------------------
> [ 1850.951680] irq/16-PCIe PME/56 is trying to lock:
> [ 1850.952004] ffff800082865238 (inject_lock){+.+.}-{3:3}, at: aer_inj_read_config+0x38/0x1dc
> [ 1850.952731] other info that might help us debug this:
> [ 1850.952997] context-{5:5}
> [ 1850.953192] 5 locks held by irq/16-PCIe PME/56:
> [ 1850.953415] #0: ffff800082647390 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x30/0x268
> [ 1850.953931] #1: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> [ 1850.954453] #2: ffff000004bb6c58 (&data->lock){+...}-{3:3}, at: pcie_pme_irq+0x34/0xc4
> [ 1850.954949] #3: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> [ 1850.955420] #4: ffff800082863d10 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x5c/0xd8
> ```
>
> This happens because the AER injection path (`aer_inj_read_config()`)
> is called in the context of the PCIe PME interrupt thread, which runs
> through `irq_forced_thread_fn()` under PREEMPT_RT. In this context,
> `pci_lock` (a raw_spinlock_t) is held with interrupts disabled
> (`spin_lock_irqsave()`), and then `aer_inj_read_config()` tries to
> acquire `inject_lock`, which is a `rt_spin_lock`. (Thanks Waiman Long)
>
> `rt_spin_lock` may sleep, so acquiring it while holding a raw spinlock
> with IRQs disabled violates the lock ordering rules. This leads to
> the “Invalid wait context” lockdep warning.
>
> In other words, the lock order looks like this:
>
> ```
> raw_spin_lock_irqsave(&pci_lock);
> ↓
> rt_spin_lock(&inject_lock); <-- not allowed
> ```
>
> To fix this, convert `inject_lock` from an `rt_spin_lock` to a
> `raw_spinlock_t`, a raw spinlock is safe and consistent with the
> surrounding locking scheme.
>
> This resolves the lockdep “Invalid wait context” warning observed when
> injecting correctable AER errors through `/dev/aer_inject` on PREEMPT_RT.
>
> This was discovered while testing PCIe AER error injection on an arm64
> QEMU virtual machine:
>
> ```
> qemu-system-aarch64 \
> -nographic \
> -machine virt,highmem=off,gic-version=3 \
> -cpu cortex-a72 \
> -kernel arch/arm64/boot/Image \
> -initrd initramfs.cpio.gz \
> -append "console=ttyAMA0 root=/dev/ram rdinit=/linuxrc earlyprintk nokaslr" \
> -m 2G \
> -smp 1 \
> -netdev user,id=net0,hostfwd=tcp::2223-:22 \
> -device virtio-net-pci,netdev=net0 \
> -device pcie-root-port,id=rp0,chassis=1,slot=0x0 \
> -device pci-testdev -s -S
> ```
>
> Injecting a correctable PCIe error via /dev/aer_inject caused a BUG
> report with "Invalid wait context" in the irq/PCIe thread.
>
> ```
> ~ # export HEX="00020000000000000100000000000000000000000000000000000000"
> ~ # echo -n "$HEX" | xxd -r -p | tee /dev/aer_inject >/dev/null
> [ 1850.947170] pcieport 0000:00:02.0: aer_inject: Injecting errors 00000001/00000000 into device 0000:00:02.0
> [ 1850.949951]
> [ 1850.950479] =============================
> [ 1850.950780] [ BUG: Invalid wait context ]
> [ 1850.951152] 6.17.0-11316-g7a405dbb0f03-dirty #7 Not tainted
> [ 1850.951457] -----------------------------
> [ 1850.951680] irq/16-PCIe PME/56 is trying to lock:
> [ 1850.952004] ffff800082865238 (inject_lock){+.+.}-{3:3}, at: aer_inj_read_config+0x38/0x1dc
> [ 1850.952731] other info that might help us debug this:
> [ 1850.952997] context-{5:5}
> [ 1850.953192] 5 locks held by irq/16-PCIe PME/56:
> [ 1850.953415] #0: ffff800082647390 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x30/0x268
> [ 1850.953931] #1: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> [ 1850.954453] #2: ffff000004bb6c58 (&data->lock){+...}-{3:3}, at: pcie_pme_irq+0x34/0xc4
> [ 1850.954949] #3: ffff8000826c6b38 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
> [ 1850.955420] #4: ffff800082863d10 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x5c/0xd8
> [ 1850.955932] stack backtrace:
> [ 1850.956412] CPU: 0 UID: 0 PID: 56 Comm: irq/16-PCIe PME Not tainted 6.17.0-11316-g7a405dbb0f03-dirty #7 PREEMPT_{RT,(full)}
> [ 1850.957039] Hardware name: linux,dummy-virt (DT)
> [ 1850.957409] Call trace:
> [ 1850.957727] show_stack+0x18/0x24 (C)
> [ 1850.958089] dump_stack_lvl+0x40/0xbc
> [ 1850.958339] dump_stack+0x18/0x24
> [ 1850.958586] __lock_acquire+0xa84/0x3008
> [ 1850.958907] lock_acquire+0x128/0x2a8
> [ 1850.959171] rt_spin_lock+0x50/0x1b8
> [ 1850.959476] aer_inj_read_config+0x38/0x1dc
> [ 1850.959821] pci_bus_read_config_dword+0x80/0xd8
> [ 1850.960079] pcie_capability_read_dword+0xac/0xd8
> [ 1850.960454] pcie_pme_irq+0x44/0xc4
> [ 1850.960728] irq_forced_thread_fn+0x30/0x94
> [ 1850.960984] irq_thread+0x1ac/0x3a4
> [ 1850.961308] kthread+0x1b4/0x208
> [ 1850.961557] ret_from_fork+0x10/0x20
> [ 1850.963088] pcieport 0000:00:02.0: AER: Correctable error message received from 0000:00:02.0
> [ 1850.963330] pcieport 0000:00:02.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
> [ 1850.963351] pcieport 0000:00:02.0: device [1b36:000c] error status/mask=00000001/0000e000
> [ 1850.963385] pcieport 0000:00:02.0: [ 0] RxErr (First)
> ```
>
> Signed-off-by: Guangbo Cui
> ---
> Changes in v2:
> - Pulling kfree() out from the lock critical section. (Thanks Waiman Long)
> - Link to v1: https://lore.kernel.org/linux-pci/20251007060218.57222-1-jckeep.cuiguangbo@gmail.com/

As PCI error injection is mainly for debug/development, I think it is
OK to change inject_lock to a raw_spinlock_t.

I think you should also mention moving kfree() out of the lock critical
section in the commit log. Or better yet, break it out as a separate
patch. It is just a nit, so I am fine with the current version too. Now
it is up to the PCI maintainer to decide whether further changes are
needed.

Acked-by: Waiman Long
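
For anyone following along, the kfree() point can be illustrated with a
small userspace sketch. This is not the aer_inject code: `struct inj_entry`,
`inj_list`, `inj_lock` and the `inj_*()` helpers are hypothetical names, a
`pthread_spinlock_t` stands in for the raw_spinlock_t, and `free()` stands
in for `kfree()`. The idea, as I understand it, is that the entry is only
unlinked while the non-sleeping lock is held, and the free happens after
the unlock, since on PREEMPT_RT the allocator may take sleeping locks.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical record type for illustration; not the driver's struct. */
struct inj_entry {
	struct inj_entry *next;
	unsigned int devfn;
};

static struct inj_entry *inj_list;   /* protected by inj_lock */
static pthread_spinlock_t inj_lock;  /* userspace stand-in for a raw_spinlock_t */

static void inj_init(void)
{
	int ret = pthread_spin_init(&inj_lock, PTHREAD_PROCESS_PRIVATE);

	assert(ret == 0);
	(void)ret;
}

/* Allocate before taking the lock, then only link the entry under it. */
static int inj_add(unsigned int devfn)
{
	struct inj_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->devfn = devfn;

	pthread_spin_lock(&inj_lock);
	e->next = inj_list;
	inj_list = e;
	pthread_spin_unlock(&inj_lock);
	return 0;
}

/* Returns 1 if an entry was removed, 0 if none matched. */
static int inj_remove(unsigned int devfn)
{
	struct inj_entry **pp, *victim = NULL;
	int found;

	pthread_spin_lock(&inj_lock);
	for (pp = &inj_list; *pp; pp = &(*pp)->next) {
		if ((*pp)->devfn == devfn) {
			victim = *pp;
			*pp = victim->next;  /* unlink only; no free under the lock */
			break;
		}
	}
	pthread_spin_unlock(&inj_lock);

	found = victim != NULL;
	free(victim);  /* outside the critical section; free(NULL) is a no-op */
	return found;
}
```

A caller would run `inj_init()` once, then pair `inj_add()` and
`inj_remove()`. The critical sections stay short and contain nothing that
can block, which is the property the raw lock conversion relies on.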