From: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: dumazet@google.com, Dany Madden <drt@linux.ibm.com>,
netdev <netdev@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
Abdul Haleem <abdhalee@linux.vnet.ibm.com>,
alexandr.lobakin@intel.com,
brian King <brking@linux.vnet.ibm.com>,
Jakub Kicinski <kuba@kernel.org>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [5.16.0-rc5][ppc][net] kernel oops when hotplug remove of vNIC interface
Date: Thu, 6 Jan 2022 14:24:56 -0800
Message-ID: <YddsOKc9DaRg5HTf@us.ibm.com>
In-Reply-To: <87lezt3398.fsf@mpe.ellerman.id.au>

Michael Ellerman [mpe@ellerman.id.au] wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> > On Wed, 5 Jan 2022 13:56:53 +0530 Abdul Haleem wrote:
> >> Greetings,
> >>
> >> Mainline kernel 5.16.0-rc5 panics on DLPAR add of a vNIC device on my
> >> PowerPC LPAR.
> >>
> >> Run the DLPAR commands below in a loop from the Linux OS:
> >>
> >> drmgr -r -c slot -s U9080.HEX.134C488-V1-C3 -w 5 -d 1
> >> drmgr -a -c slot -s U9080.HEX.134C488-V1-C3 -w 5 -d 1
> >>
> >> After the 7th iteration, the kernel panics with the messages below.
> >>
> >> console messages:
> >> [102056] ibmvnic 30000003 env3: Sending CRQ: 801e000864000000
> >> 0060000000000000
> >> <intr> ibmvnic 30000003 env3: Handling CRQ: 809e000800000000
> >> 0000000000000000
> >> [102056] ibmvnic 30000003 env3: Disabling tx_scrq[0] irq
> >> [102056] ibmvnic 30000003 env3: Disabling tx_scrq[1] irq
> >> [102056] ibmvnic 30000003 env3: Disabling rx_scrq[0] irq
> >> [102056] ibmvnic 30000003 env3: Disabling rx_scrq[1] irq
> >> [102056] ibmvnic 30000003 env3: Disabling rx_scrq[2] irq
> >> [102056] ibmvnic 30000003 env3: Disabling rx_scrq[3] irq
> >> [102056] ibmvnic 30000003 env3: Disabling rx_scrq[4] irq
> >> [102056] ibmvnic 30000003 env3: Disabling rx_scrq[5] irq
> >> [102056] ibmvnic 30000003 env3: Disabling rx_scrq[6] irq
> >> [102056] ibmvnic 30000003 env3: Disabling rx_scrq[7] irq
> >> [102056] ibmvnic 30000003 env3: Replenished 8 pools
> >> Kernel attempted to read user page (10) - exploit attempt? (uid: 0)
> >> BUG: Kernel NULL pointer dereference on read at 0x00000010
> >> Faulting instruction address: 0xc000000000a3c840
> >> Oops: Kernel access of bad area, sig: 11 [#1]
> >> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> >> Modules linked in: bridge stp llc ib_core rpadlpar_io rpaphp nfnetlink
> >> tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag
> >> bonding rfkill ibmvnic sunrpc pseries_rng xts vmx_crypto gf128mul
> >> sch_fq_codel binfmt_misc ip_tables ext4 mbcache jbd2 dm_service_time
> >> sd_mod t10_pi sg ibmvfc scsi_transport_fc ibmveth dm_multipath dm_mirror
> >> dm_region_hash dm_log dm_mod fuse
> >> CPU: 9 PID: 102056 Comm: kworker/9:2 Kdump: loaded Not tainted
> >> 5.16.0-rc5-autotest-g6441998e2e37 #1
> >> Workqueue: events_long __ibmvnic_reset [ibmvnic]
> >> NIP: c000000000a3c840 LR: c0080000029b5378 CTR: c000000000a3c820
> >> REGS: c0000000548e37e0 TRAP: 0300 Not tainted
> >> (5.16.0-rc5-autotest-g6441998e2e37)
> >> MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28248484 XER: 00000004
> >> CFAR: c0080000029bdd24 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0
> >> GPR00: c0080000029b55d0 c0000000548e3a80 c0000000028f0200 0000000000000000
> >> GPR04: c000000c7d1a7e00 fffffffffffffff6 0000000000000027 c000000c7d1a7e08
> >> GPR08: 0000000000000023 0000000000000000 0000000000000010 c0080000029bdd10
> >> GPR12: c000000000a3c820 c000000c7fca6680 0000000000000000 c000000133016bf8
> >> GPR16: 00000000000003fe 0000000000001000 0000000000000002 0000000000000008
> >> GPR20: c000000133016eb0 0000000000000000 0000000000000000 0000000000000003
> >> GPR24: c000000133016000 c000000133017168 0000000020000000 c000000133016a00
> >> GPR28: 0000000000000006 c000000133016a00 0000000000000001 c000000133016000
> >> NIP [c000000000a3c840] napi_enable+0x20/0xc0
> >> LR [c0080000029b5378] __ibmvnic_open+0xf0/0x430 [ibmvnic]
> >> Call Trace:
> >> [c0000000548e3a80] [0000000000000006] 0x6 (unreliable)
> >> [c0000000548e3ab0] [c0080000029b55d0] __ibmvnic_open+0x348/0x430 [ibmvnic]
> >> [c0000000548e3b40] [c0080000029bcc28] __ibmvnic_reset+0x500/0xdf0 [ibmvnic]
> >> [c0000000548e3c60] [c000000000176228] process_one_work+0x288/0x570
> >> [c0000000548e3d00] [c000000000176588] worker_thread+0x78/0x660
> >> [c0000000548e3da0] [c0000000001822f0] kthread+0x1c0/0x1d0
> >> [c0000000548e3e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64
> >> Instruction dump:
> >> 7d2948f8 792307e0 4e800020 60000000 3c4c01eb 384239e0 f821ffd1 39430010
> >> 38a0fff6 e92d1100 f9210028 39200000 <e9030010> f9010020 60420000 e9210020
> >> ---[ end trace 5f8033b08fd27706 ]---
> >> radix-mmu: Page sizes from device-tree:
> >>
> >> the fault instruction points to
> >>
> >> [root@ltcden11-lp1 boot]# gdb -batch
> >> vmlinuz-5.16.0-rc5-autotest-g6441998e2e37 -ex 'list *(0xc000000000a3c840)'
> >> 0xc000000000a3c840 is in napi_enable (net/core/dev.c:6966).
> >> 6961 void napi_enable(struct napi_struct *n)
> >> 6962 {
> >> 6963 unsigned long val, new;
> >> 6964
> >> 6965 do {
> >> 6966 val = READ_ONCE(n->state);
> >
> > If n is NULL here that's gotta be a driver problem.
>
> Definitely looks like it, the disassembly is:
>
> not r9,r9
> clrldi r3,r9,63
> blr # end of previous function
> nop
> addis r2,r12,491 # function entry
> addi r2,r2,14816
> stdu r1,-48(r1) # stack frame creation
> li r5,-10
> ld r9,4352(r13)
> std r9,40(r1)
> li r9,0
> ld r8,16(r3) # load from r3 (n) + 16
>
>
> The register dump shows that r3 is NULL, and it comes directly from the
> caller. So we've been called with n = NULL.

Yeah, good catch, Abdul.

I suspect it's due to the release_resources() call in __ibmvnic_open(). The
problem is hard to reproduce, but we are testing the following patch with
error injection. Will formally submit after testing/review.

---
From 8a78083e5ec6914be197352f391bfa17420a147c Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Date: Wed, 5 Jan 2022 16:22:58 -0500
Subject: [PATCH 1/1] ibmvnic: don't release napi in __ibmvnic_open()

If __ibmvnic_open() encounters an error, such as when setting the link state,
it calls release_resources(), which frees the napi structures needlessly.
Instead, have __ibmvnic_open() only clean up the work it did so far (i.e.
disable napi and irqs) and leave the rest to the callers.

If the caller of __ibmvnic_open() is ibmvnic_open(), it should release the
resources immediately. If the caller is do_reset() or do_hard_reset(),
they will release the resources on the next reset.

Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 0bb3911dd014..34efba6c117b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -110,6 +110,7 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
 					 struct ibmvnic_sub_crq_queue *tx_scrq);
 static void free_long_term_buff(struct ibmvnic_adapter *adapter,
 				struct ibmvnic_long_term_buff *ltb);
+static void ibmvnic_disable_irqs(struct ibmvnic_adapter *adapter);

 struct ibmvnic_stat {
 	char name[ETH_GSTRING_LEN];
@@ -1418,7 +1419,7 @@ static int __ibmvnic_open(struct net_device *netdev)
 	rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_UP);
 	if (rc) {
 		ibmvnic_napi_disable(adapter);
-		release_resources(adapter);
+		ibmvnic_disable_irqs(adapter);
 		return rc;
 	}

@@ -1468,9 +1469,6 @@ static int ibmvnic_open(struct net_device *netdev)
 		rc = init_resources(adapter);
 		if (rc) {
 			netdev_err(netdev, "failed to initialize resources\n");
-			release_resources(adapter);
-			release_rx_pools(adapter);
-			release_tx_pools(adapter);
 			goto out;
 		}
 	}
@@ -1487,6 +1485,12 @@ static int ibmvnic_open(struct net_device *netdev)
 		adapter->state = VNIC_OPEN;
 		rc = 0;
 	}
+	if (rc) {
+		release_resources(adapter);
+		release_rx_pools(adapter);
+		release_tx_pools(adapter);
+	}
+
 	return rc;
 }

--
2.27.0
>
> cheers