* [PATCH] vmxnet3: fix netpoll race condition
@ 2014-03-07 21:25 Neil Horman
2014-03-08 5:14 ` Shreyas Bhatewara
2014-03-10 10:55 ` [PATCH v2] " Neil Horman
0 siblings, 2 replies; 8+ messages in thread
From: Neil Horman @ 2014-03-07 21:25 UTC
To: netdev
Cc: Neil Horman, Shreyas Bhatewara, VMware, Inc., David S. Miller,
stable
vmxnet3's netpoll driver is incorrectly coded. It directly calls
vmxnet3_do_poll, which is the driver internal napi poll routine. As the netpoll
controller method doesn't block real napi polls in any way, there is a potential
for race conditions in which the netpoll controller method and the napi poll
method run concurrently. The result is data corruption causing panics such as this
one recently observed:
PID: 1371 TASK: ffff88023762caa0 CPU: 1 COMMAND: "rs:main Q:Reg"
#0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b
#1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92
#2 [ffff88023abd58b0] oops_end at ffffffff8152b570
#3 [ffff88023abd58e0] die at ffffffff81010e0b
#4 [ffff88023abd5910] do_trap at ffffffff8152add4
#5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95
#6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b
[exception RIP: vmxnet3_rq_rx_complete+1968]
RIP: ffffffffa00f1e80 RSP: ffff88023abd5ac8 RFLAGS: 00010086
RAX: 0000000000000000 RBX: ffff88023b5dcee0 RCX: 00000000000000c0
RDX: 0000000000000000 RSI: 00000000000005f2 RDI: ffff88023b5dcee0
RBP: ffff88023abd5b48 R8: 0000000000000000 R9: ffff88023a3b6048
R10: 0000000000000000 R11: 0000000000000002 R12: ffff8802398d4cd8
R13: ffff88023af35140 R14: ffff88023b60c890 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3]
#8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3]
#9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7
The fix is to do as other drivers do, and have the poll controller call the top
half interrupt handler, which schedules a napi poll properly to receive frames.
Tested by myself, successfully.
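As a rough illustration of why routing netpoll through the interrupt handler
closes the race, the following user-space model mimics the "only one owner may
poll" rule that NAPI scheduling enforces. It is a sketch, not driver code: the
struct and every model_* name are hypothetical, and the single atomic flag only
approximates the kernel's scheduled-state bit. A caller that goes through the
scheduling step either wins ownership and polls, or backs off. The old
vmxnet3_netpoll skipped this step by calling the poll routine directly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a NAPI context; not the kernel's napi_struct. */
struct napi_model {
	atomic_bool scheduled;	/* models the "poll already scheduled" state */
	int polls_run;		/* how many times a poll was actually started */
};

static void model_init(struct napi_model *n)
{
	atomic_init(&n->scheduled, false);
	n->polls_run = 0;
}

/* Returns true only for the single caller that wins ownership of the poll. */
static bool model_schedule_prep(struct napi_model *n)
{
	return !atomic_exchange(&n->scheduled, true);
}

/* Models poll completion: releases ownership so a later event can schedule. */
static void model_complete(struct napi_model *n)
{
	assert(atomic_load(&n->scheduled));	/* only an owner may complete */
	atomic_store(&n->scheduled, false);
}

/*
 * Models an interrupt handler, or a netpoll call routed through it.
 * Returns true if this caller started a poll, false if one was pending.
 */
static bool model_irq_begin(struct napi_model *n)
{
	if (!model_schedule_prep(n))
		return false;
	n->polls_run++;
	return true;
}
```

In this model, a second model_irq_begin() issued before model_complete() is
refused, so two pollers never touch the same ring state at once.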
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Shreyas Bhatewara <sbhatewara@vmware.com>
CC: "VMware, Inc." <pv-drivers@vmware.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: stable@vger.kernel.org
---
drivers/net/vmxnet3/vmxnet3_drv.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 3be786f..473687f 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1762,11 +1762,13 @@ vmxnet3_netpoll(struct net_device *netdev)
{
struct vmxnet3_adapter *adapter = netdev_priv(netdev);
- if (adapter->intr.mask_mode == VMXNET3_IMM_ACTIVE)
- vmxnet3_disable_all_intrs(adapter);
-
- vmxnet3_do_poll(adapter, adapter->rx_queue[0].rx_ring[0].size);
- vmxnet3_enable_all_intrs(adapter);
+ switch (adapter->intr.type) {
+ case VMXNET3_IT_MSIX:
+ vmxnet3_msix_rx(0, &adapter->rx_queue[0]);
+ case VMXNET3_IT_MSI:
+ default:
+ vmxnet3_intr(0, adapter->netdev);
+ }
}
#endif /* CONFIG_NET_POLL_CONTROLLER */
--
1.8.3.1
* Re: [PATCH] vmxnet3: fix netpoll race condition
2014-03-07 21:25 [PATCH] vmxnet3: fix netpoll race condition Neil Horman
@ 2014-03-08 5:14 ` Shreyas Bhatewara
2014-03-08 14:36 ` Neil Horman
2014-03-10 10:55 ` [PATCH v2] " Neil Horman
1 sibling, 1 reply; 8+ messages in thread
From: Shreyas Bhatewara @ 2014-03-08 5:14 UTC
To: Neil Horman; +Cc: netdev, VMware, Inc., David S. Miller, stable
Thanks for the patch Neil.
> --- a/drivers/net/vmxnet3/vmxnet3_drv.c
> +++ b/drivers/net/vmxnet3/vmxnet3_drv.c
> @@ -1762,11 +1762,13 @@ vmxnet3_netpoll(struct net_device *netdev)
> {
> struct vmxnet3_adapter *adapter = netdev_priv(netdev);
>
> - if (adapter->intr.mask_mode == VMXNET3_IMM_ACTIVE)
> - vmxnet3_disable_all_intrs(adapter);
> -
> - vmxnet3_do_poll(adapter, adapter->rx_queue[0].rx_ring[0].size);
> - vmxnet3_enable_all_intrs(adapter);
> + switch (adapter->intr.type) {
> + case VMXNET3_IT_MSIX:
> + vmxnet3_msix_rx(0, &adapter->rx_queue[0]);
This should be called for each rx queue, just calling it for
1st queue does not suffice.
Also there should be a break; here
> + case VMXNET3_IT_MSI:
> + default:
> + vmxnet3_intr(0, adapter->netdev);
> + }
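The two review comments above can be reproduced outside the driver with a small
stand-alone sketch. Everything here is hypothetical scaffolding (the enum, the
counters, and the dispatch_* names are not driver symbols); the counters simply
record which handler would have been invoked. dispatch_v1 has the shape of the
patch as posted, and dispatch_fixed has the shape the review asks for.

```c
#include <assert.h>

/* Hypothetical interrupt types, named after VMXNET3_IT_* for readability. */
enum intr_type { IT_MSIX, IT_MSI, IT_INTX };

/* Records how often each (mock) handler would have been called. */
struct call_counts {
	int msix_rx;	/* per-queue MSI-X rx handler invocations */
	int intr;	/* shared interrupt handler invocations */
};

/* Shape of the v1 patch: first rx queue only, and no break after MSI-X. */
static struct call_counts dispatch_v1(enum intr_type type, int num_rx_queues)
{
	struct call_counts c = { 0, 0 };

	(void)num_rx_queues;	/* v1 ignores the queue count */
	switch (type) {
	case IT_MSIX:
		c.msix_rx++;
		/* falls through */
	case IT_MSI:
	default:
		c.intr++;
	}
	return c;
}

/* Shape requested in review: every rx queue, and a break after each case. */
static struct call_counts dispatch_fixed(enum intr_type type, int num_rx_queues)
{
	struct call_counts c = { 0, 0 };
	int i;

	assert(num_rx_queues >= 0);
	switch (type) {
	case IT_MSIX:
		for (i = 0; i < num_rx_queues; i++)
			c.msix_rx++;
		break;
	case IT_MSI:
	default:
		c.intr++;
		break;
	}
	return c;
}
```

With four rx queues in MSI-X mode, dispatch_v1 records one rx handler call plus
an unintended shared-handler call, while dispatch_fixed records four rx handler
calls and nothing else.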
* Re: [PATCH] vmxnet3: fix netpoll race condition
2014-03-08 5:14 ` Shreyas Bhatewara
@ 2014-03-08 14:36 ` Neil Horman
0 siblings, 0 replies; 8+ messages in thread
From: Neil Horman @ 2014-03-08 14:36 UTC
To: Shreyas Bhatewara; +Cc: netdev, VMware, Inc., David S. Miller, stable
On Fri, Mar 07, 2014 at 09:14:21PM -0800, Shreyas Bhatewara wrote:
> Thanks for the patch Neil.
>
>
> > --- a/drivers/net/vmxnet3/vmxnet3_drv.c
> > +++ b/drivers/net/vmxnet3/vmxnet3_drv.c
> > @@ -1762,11 +1762,13 @@ vmxnet3_netpoll(struct net_device *netdev)
> > {
> > struct vmxnet3_adapter *adapter = netdev_priv(netdev);
> >
> > - if (adapter->intr.mask_mode == VMXNET3_IMM_ACTIVE)
> > - vmxnet3_disable_all_intrs(adapter);
> > -
> > - vmxnet3_do_poll(adapter, adapter->rx_queue[0].rx_ring[0].size);
> > - vmxnet3_enable_all_intrs(adapter);
> > + switch (adapter->intr.type) {
> > + case VMXNET3_IT_MSIX:
> > + vmxnet3_msix_rx(0, &adapter->rx_queue[0]);
>
> This should be called for each rx queue, just calling it for
> 1st queue does not suffice.
You're right, I'll fix that up.
> Also there should be a break; here
>
that too. thanks
Neil
>
> > + case VMXNET3_IT_MSI:
> > + default:
> > + vmxnet3_intr(0, adapter->netdev);
> > + }
>
* [PATCH v2] vmxnet3: fix netpoll race condition
2014-03-07 21:25 [PATCH] vmxnet3: fix netpoll race condition Neil Horman
2014-03-08 5:14 ` Shreyas Bhatewara
@ 2014-03-10 10:55 ` Neil Horman
2014-03-11 17:16 ` David Miller
2014-03-11 20:15 ` David Miller
1 sibling, 2 replies; 8+ messages in thread
From: Neil Horman @ 2014-03-10 10:55 UTC
To: netdev
Cc: Neil Horman, Shreyas Bhatewara, VMware, Inc., David S. Miller,
stable
vmxnet3's netpoll driver is incorrectly coded. It directly calls
vmxnet3_do_poll, which is the driver internal napi poll routine. As the netpoll
controller method doesn't block real napi polls in any way, there is a potential
for race conditions in which the netpoll controller method and the napi poll
method run concurrently. The result is data corruption causing panics such as this
one recently observed:
PID: 1371 TASK: ffff88023762caa0 CPU: 1 COMMAND: "rs:main Q:Reg"
#0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b
#1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92
#2 [ffff88023abd58b0] oops_end at ffffffff8152b570
#3 [ffff88023abd58e0] die at ffffffff81010e0b
#4 [ffff88023abd5910] do_trap at ffffffff8152add4
#5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95
#6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b
[exception RIP: vmxnet3_rq_rx_complete+1968]
RIP: ffffffffa00f1e80 RSP: ffff88023abd5ac8 RFLAGS: 00010086
RAX: 0000000000000000 RBX: ffff88023b5dcee0 RCX: 00000000000000c0
RDX: 0000000000000000 RSI: 00000000000005f2 RDI: ffff88023b5dcee0
RBP: ffff88023abd5b48 R8: 0000000000000000 R9: ffff88023a3b6048
R10: 0000000000000000 R11: 0000000000000002 R12: ffff8802398d4cd8
R13: ffff88023af35140 R14: ffff88023b60c890 R15: 0000000000000000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3]
#8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3]
#9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7
The fix is to do as other drivers do, and have the poll controller call the top
half interrupt handler, which schedules a napi poll properly to receive frames.
Tested by myself, successfully.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Shreyas Bhatewara <sbhatewara@vmware.com>
CC: "VMware, Inc." <pv-drivers@vmware.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: stable@vger.kernel.org
---
Change notes:
v2)
* Fixed missing break statements
* Added loop for each rx queue
---
drivers/net/vmxnet3/vmxnet3_drv.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 3be786f..b7daa02 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1761,12 +1761,18 @@ static void
vmxnet3_netpoll(struct net_device *netdev)
{
struct vmxnet3_adapter *adapter = netdev_priv(netdev);
+ int i;
- if (adapter->intr.mask_mode == VMXNET3_IMM_ACTIVE)
- vmxnet3_disable_all_intrs(adapter);
-
- vmxnet3_do_poll(adapter, adapter->rx_queue[0].rx_ring[0].size);
- vmxnet3_enable_all_intrs(adapter);
+ switch (adapter->intr.type) {
+ case VMXNET3_IT_MSIX:
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ vmxnet3_msix_rx(0, &adapter->rx_queue[i]);
+ break;
+ case VMXNET3_IT_MSI:
+ default:
+ vmxnet3_intr(0, adapter->netdev);
+ break;
+ }
}
#endif /* CONFIG_NET_POLL_CONTROLLER */
--
1.8.3.1
* Re: [PATCH v2] vmxnet3: fix netpoll race condition
2014-03-10 10:55 ` [PATCH v2] " Neil Horman
@ 2014-03-11 17:16 ` David Miller
2014-03-11 18:24 ` Shreyas Bhatewara
2014-03-11 20:15 ` David Miller
1 sibling, 1 reply; 8+ messages in thread
From: David Miller @ 2014-03-11 17:16 UTC
To: nhorman; +Cc: netdev, sbhatewara, pv-drivers, stable
From: Neil Horman <nhorman@tuxdriver.com>
Date: Mon, 10 Mar 2014 06:55:55 -0400
> vmxnet3's netpoll driver is incorrectly coded. It directly calls
> vmxnet3_do_poll, which is the driver internal napi poll routine. As the netpoll
> controller method doesn't block real napi polls in any way, there is a potential
> for race conditions in which the netpoll controller method and the napi poll
> method run concurrently. The result is data corruption causing panics such as this
> one recently observed:
> PID: 1371 TASK: ffff88023762caa0 CPU: 1 COMMAND: "rs:main Q:Reg"
> #0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b
> #1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92
> #2 [ffff88023abd58b0] oops_end at ffffffff8152b570
> #3 [ffff88023abd58e0] die at ffffffff81010e0b
> #4 [ffff88023abd5910] do_trap at ffffffff8152add4
> #5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95
> #6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b
> [exception RIP: vmxnet3_rq_rx_complete+1968]
> RIP: ffffffffa00f1e80 RSP: ffff88023abd5ac8 RFLAGS: 00010086
> RAX: 0000000000000000 RBX: ffff88023b5dcee0 RCX: 00000000000000c0
> RDX: 0000000000000000 RSI: 00000000000005f2 RDI: ffff88023b5dcee0
> RBP: ffff88023abd5b48 R8: 0000000000000000 R9: ffff88023a3b6048
> R10: 0000000000000000 R11: 0000000000000002 R12: ffff8802398d4cd8
> R13: ffff88023af35140 R14: ffff88023b60c890 R15: 0000000000000000
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3]
> #8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3]
> #9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7
>
> The fix is to do as other drivers do, and have the poll controller call the top
> half interrupt handler, which schedules a napi poll properly to receive frames.
>
> Tested by myself, successfully.
>
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
VMware folks, please review.
* Re: [PATCH v2] vmxnet3: fix netpoll race condition
2014-03-11 17:16 ` David Miller
@ 2014-03-11 18:24 ` Shreyas Bhatewara
2014-03-11 20:08 ` Neil Horman
0 siblings, 1 reply; 8+ messages in thread
From: Shreyas Bhatewara @ 2014-03-11 18:24 UTC
To: David Miller; +Cc: nhorman, netdev, pv-drivers, stable
> > The fix is to do as other drivers do, and have the poll controller call the
> > top
> > half interrupt handler, which schedules a napi poll properly to receive
> > frames.
> >
> > Tested by myself, successfully.
> >
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
>
> VMware folks, please review
>
Neil, thanks for the updated change. It looks good.
Reviewed-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
* Re: [PATCH v2] vmxnet3: fix netpoll race condition
2014-03-11 18:24 ` Shreyas Bhatewara
@ 2014-03-11 20:08 ` Neil Horman
0 siblings, 0 replies; 8+ messages in thread
From: Neil Horman @ 2014-03-11 20:08 UTC
To: Shreyas Bhatewara; +Cc: David Miller, netdev, pv-drivers, stable
On Tue, Mar 11, 2014 at 11:24:08AM -0700, Shreyas Bhatewara wrote:
> > > The fix is to do as other drivers do, and have the poll controller call the
> > > top
> > > half interrupt handler, which schedules a napi poll properly to receive
> > > frames.
> > >
> > > Tested by myself, successfully.
> > >
> > > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> >
> > VMware folks, please review
> >
> Neil, thanks for the updated change. It looks good.
>
> Reviewed-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
>
Thanks guys!
Neil
* Re: [PATCH v2] vmxnet3: fix netpoll race condition
2014-03-10 10:55 ` [PATCH v2] " Neil Horman
2014-03-11 17:16 ` David Miller
@ 2014-03-11 20:15 ` David Miller
1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2014-03-11 20:15 UTC
To: nhorman; +Cc: netdev, sbhatewara, pv-drivers, stable
From: Neil Horman <nhorman@tuxdriver.com>
Date: Mon, 10 Mar 2014 06:55:55 -0400
> vmxnet3's netpoll driver is incorrectly coded. It directly calls
> vmxnet3_do_poll, which is the driver internal napi poll routine. As the netpoll
> controller method doesn't block real napi polls in any way, there is a potential
> for race conditions in which the netpoll controller method and the napi poll
> method run concurrently. The result is data corruption causing panics such as this
> one recently observed:
...
> The fix is to do as other drivers do, and have the poll controller call the top
> half interrupt handler, which schedules a napi poll properly to receive frames.
>
> Tested by myself, successfully.
>
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Applied.
Thread overview: 8+ messages
2014-03-07 21:25 [PATCH] vmxnet3: fix netpoll race condition Neil Horman
2014-03-08 5:14 ` Shreyas Bhatewara
2014-03-08 14:36 ` Neil Horman
2014-03-10 10:55 ` [PATCH v2] " Neil Horman
2014-03-11 17:16 ` David Miller
2014-03-11 18:24 ` Shreyas Bhatewara
2014-03-11 20:08 ` Neil Horman
2014-03-11 20:15 ` David Miller