* Fw: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
@ 2020-06-22 16:08 Stephen Hemminger
2020-06-22 17:46 ` Björn Töpel
2020-06-23 6:27 ` Yahui Chen
0 siblings, 2 replies; 8+ messages in thread
From: Stephen Hemminger @ 2020-06-22 16:08 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh
Cc: bpf
Begin forwarded message:
Date: Mon, 22 Jun 2020 10:13:52 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
https://bugzilla.kernel.org/show_bug.cgi?id=208275
Bug ID: 208275
Summary: kernel hang occasionally while running the sample of
xdpsock
Product: Networking
Version: 2.5
Kernel Version: 5.7.0
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Other
Assignee: stephen@networkplumber.org
Reporter: goodluckwillcomesoon@gmail.com
Regression: No
Distribution:
5.7.0-1.el7.centos.x86_64
Hardware Environment:
Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.1.7 06/16/2016
Software Environment:
Problem Description:
kernel hang occasionally while running the sample of xdpsock
Steps to reproduce:
I want to test the RX performance of AF_XDP. I set the NIC to 4 queues with
`ethtool -L p6p1 combined 4`, then create one socket per queue.
for ((i=0; i<4; i++));
do
./xdpsock -r -z -i p6p1 -m -q $i &
done
I run xdpsock from samples/bpf using the shell loop above.
Occasionally the kernel hangs, and I have to power-cycle the machine.
Additional information:
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Fw: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
2020-06-22 16:08 Fw: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock Stephen Hemminger
@ 2020-06-22 17:46 ` Björn Töpel
2020-06-22 17:49 ` [Intel-wired-lan] " Björn Töpel
2020-06-23 6:27 ` Yahui Chen
1 sibling, 1 reply; 8+ messages in thread
From: Björn Töpel @ 2020-06-22 17:46 UTC (permalink / raw)
To: Stephen Hemminger, Karlsson, Magnus, Björn Töpel
Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh, bpf
On Mon, 22 Jun 2020 at 18:08, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
>
>
> Begin forwarded message:
>
> Date: Mon, 22 Jun 2020 10:13:52 +0000
> From: bugzilla-daemon@bugzilla.kernel.org
> To: stephen@networkplumber.org
> Subject: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
>
Thanks for forwarding, Stephen.
I'll have a look!
Björn
>
> https://bugzilla.kernel.org/show_bug.cgi?id=208275
>
> Bug ID: 208275
> Summary: kernel hang occasionally while running the sample of
> xdpsock
> Product: Networking
> Version: 2.5
> Kernel Version: 5.7.0
> Hardware: All
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: Other
> Assignee: stephen@networkplumber.org
> Reporter: goodluckwillcomesoon@gmail.com
> Regression: No
>
> Distribution:
> 5.7.0-1.el7.centos.x86_64
>
> Hardware Environment:
> Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.1.7 06/16/2016
>
> Software Environment:
>
>
> Problem Description:
> kernel hang occasionally while running the sample of xdpsock
>
> Steps to reproduce:
>
> I want to test the RX performance of AF_XDP. I set the NIC to 4 queues with
> `ethtool -L p6p1 combined 4`, then create one socket per queue.
>
> for ((i=0; i<4; i++));
> do
> ./xdpsock -r -z -i p6p1 -m -q $i &
> done
>
> I run xdpsock from samples/bpf using the shell loop above.
> Occasionally the kernel hangs, and I have to power-cycle the machine.
>
> Additional information:
>
> --
> You are receiving this mail because:
> You are the assignee for the bug.
* Re: Fw: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
2020-06-22 17:46 ` Björn Töpel
@ 2020-06-22 17:49 ` Björn Töpel
0 siblings, 0 replies; 8+ messages in thread
From: Björn Töpel @ 2020-06-22 17:49 UTC (permalink / raw)
To: Stephen Hemminger, Karlsson, Magnus, Björn Töpel,
intel-wired-lan
Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh, bpf
On Mon, 22 Jun 2020 at 19:46, Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> On Mon, 22 Jun 2020 at 18:08, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> >
> >
> > Begin forwarded message:
> >
> > Date: Mon, 22 Jun 2020 10:13:52 +0000
> > From: bugzilla-daemon@bugzilla.kernel.org
> > To: stephen@networkplumber.org
> > Subject: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
> >
>
> Thanks for forwarding, Stephen.
>
> I'll have a look!
>
Intel ixgbe splat. Adding intel-wired-lan to To:.
* [Intel-wired-lan] Fw: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
2020-06-22 17:49 ` [Intel-wired-lan] " Björn Töpel
(?)
@ 2020-06-23 3:44 ` Yahui Chen
2020-06-23 9:18 ` [Intel-wired-lan] " Björn Töpel
-1 siblings, 1 reply; 8+ messages in thread
From: Yahui Chen @ 2020-06-23 3:44 UTC (permalink / raw)
To: intel-wired-lan
Hi Bjorn, thank you for your response.
Could you describe it in more detail? I don't quite follow.
Thanks.
Björn Töpel <bjorn.topel@gmail.com> wrote on Tue, Jun 23, 2020 at 1:50 AM:
> On Mon, 22 Jun 2020 at 19:46, Björn Töpel <bjorn.topel@gmail.com> wrote:
> >
> > On Mon, 22 Jun 2020 at 18:08, Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > >
> > >
> > > Begin forwarded message:
> > >
> > > Date: Mon, 22 Jun 2020 10:13:52 +0000
> > > From: bugzilla-daemon@bugzilla.kernel.org
> > > To: stephen@networkplumber.org
> > > Subject: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
> > >
> >
> > Thanks for forwarding, Stephen.
> >
> > I'll have a look!
> >
>
> Intel ixgbe splat. Adding intel-wired-lan to To:.
>
* Re: Fw: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
2020-06-22 16:08 Fw: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock Stephen Hemminger
2020-06-22 17:46 ` Björn Töpel
@ 2020-06-23 6:27 ` Yahui Chen
1 sibling, 0 replies; 8+ messages in thread
From: Yahui Chen @ 2020-06-23 6:27 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh,
bpf@vger.kernel.org
Hi Bjorn, thanks for your response.
I don't quite follow. Could you describe the details more clearly?
Thanks.
Stephen Hemminger <stephen@networkplumber.org> wrote on Tue, Jun 23, 2020 at 12:08 AM:
>
>
>
> Begin forwarded message:
>
> Date: Mon, 22 Jun 2020 10:13:52 +0000
> From: bugzilla-daemon@bugzilla.kernel.org
> To: stephen@networkplumber.org
> Subject: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=208275
>
> Bug ID: 208275
> Summary: kernel hang occasionally while running the sample of
> xdpsock
> Product: Networking
> Version: 2.5
> Kernel Version: 5.7.0
> Hardware: All
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: Other
> Assignee: stephen@networkplumber.org
> Reporter: goodluckwillcomesoon@gmail.com
> Regression: No
>
> Distribution:
> 5.7.0-1.el7.centos.x86_64
>
> Hardware Environment:
> Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.1.7 06/16/2016
>
> Software Environment:
>
>
> Problem Description:
> kernel hang occasionally while running the sample of xdpsock
>
> Steps to reproduce:
>
> I want to test the RX performance of AF_XDP. I set the NIC to 4 queues with
> `ethtool -L p6p1 combined 4`, then create one socket per queue.
>
> for ((i=0; i<4; i++));
> do
> ./xdpsock -r -z -i p6p1 -m -q $i &
> done
>
> I run xdpsock from samples/bpf using the shell loop above.
> Occasionally the kernel hangs, and I have to power-cycle the machine.
>
> Additional information:
>
> --
> You are receiving this mail because:
> You are the assignee for the bug.
* Re: Fw: [Bug 208275] New: kernel hang occasionally while running the sample of xdpsock
2020-06-23 3:44 ` Yahui Chen
@ 2020-06-23 9:18 ` Björn Töpel
0 siblings, 0 replies; 8+ messages in thread
From: Björn Töpel @ 2020-06-23 9:18 UTC (permalink / raw)
To: Yahui Chen
Cc: Stephen Hemminger, Karlsson, Magnus, Björn Töpel,
intel-wired-lan, Alexei Starovoitov, Daniel Borkmann,
Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
John Fastabend, KP Singh, bpf
On Tue, 23 Jun 2020 at 05:45, Yahui Chen <goodluckwillcomesoon@gmail.com> wrote:
>
> Hi Bjorn, thank you for your response.
> Could you describe it in more detail? I don't quite follow.
> Thanks.
>
When XDP is enabled, the ixgbe NIC does a (somewhat heavy)
reconfiguration. During the reconfiguration, for some reason,
rx_buffer->page is NULL in the following call chain:
ixgbe_down()->ixgbe_clean_all_rx_rings()->ixgbe_clean_rx_ring()->__page_frag_cache_drain()
As a result, when __page_frag_cache_drain() tries to update the page
reference counter, you get a NULL pointer dereference.
[277994.329145] BUG: kernel NULL pointer dereference, address: 0000000000000034
...
[277994.329428] RIP: 0010:__page_frag_cache_drain+0x5/0x40
[277994.329463] Code: d2 ff ff 31 f6 84 c0 74 04 0f b6 73 51 48 89 df e8 70 ff ff ff eb dc 48 83 eb 01 eb d0 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <f0> 29 77 34 74 01 c3 48 8b 07 55 48 89 e5 a9 00 00 01 00 74 0f 0f
2a:* f0 29 77 34 lock sub %esi,0x34(%rdi) <-- trapping instruction
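The reported address follows from the trapping instruction: `lock sub %esi,0x34(%rdi)` writes to %rdi + 0x34, so a NULL page pointer in %rdi faults at 0x34. A minimal sketch of that arithmetic, assuming %rdi held 0 on entry (the variable names are illustrative and not taken from the oops):

```shell
#!/bin/sh
# Assumption: rx_buffer->page was NULL, so %rdi (the struct page
# pointer passed to __page_frag_cache_drain()) held 0.
rdi=0x0
# Displacement from the trapping instruction: lock sub %esi,0x34(%rdi)
disp=0x34
# Effective address of the write: base register plus displacement,
# zero-padded to 16 hex digits like the address in the BUG line.
fault_addr=$(printf '%016x' $(( $rdi + $disp )))
echo "address: $fault_addr"
```

The printed value matches the address in the "BUG: kernel NULL pointer dereference" line above, which is consistent with the page pointer itself being NULL rather than pointing at a freed or corrupted page.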
I tried to reproduce the issue, but without success so far. I'll keep
looking for the bug. Hopefully someone from Intel with better insight
into ixgbe can help!
Björn