From: Benjamin Poirier <bpoirier@suse.com>
To: Nicolas Belouin <nicolas.belouin@gandi.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Subject: Re: Re: [Bug] net/ipv6: skb_over_panic in mld_newpack
Date: Wed, 5 Dec 2018 15:57:56 +0900 [thread overview]
Message-ID: <20181205065756.GA23806@f2> (raw)
In-Reply-To: <20181204105224.kpzcuipekqqjfe5f@diconico07.dev>
On 2018/12/04 11:52, Nicolas Belouin wrote:
> On 03/12 07:59, Eric Dumazet wrote:
> >
> >
> > On 12/03/2018 07:20 AM, Nicolas Belouin wrote:
> > > Hi,
> > > I ran into a panic while adding an interface to a bridge with a vxlan
> > > interface already attached to it, as it seems related mtu=9000.
> > >
> > > I get the following panic info :
> > >
> > > [ 2482.419893] br100: port 2(vif1.1) entered blocking state
> > > [ 2482.425427] br100: port 2(vif1.1) entered forwarding state
> > > [ 2482.431797] skbuff: skb_over_panic: text:ffffffff816e4f78 len:40 put:40 head:ffff880146449000 data:ffff880146458fd0 tail:0xfff8 end:0xec0 dev:vif1.1
> > > [ 2482.442891] ------------[ cut here ]------------
> > > [ 2482.448254] kernel BUG at /srv/jenkins/workspace/workspace/hosting-xen-dom0-kernel/build/src/linux-4.9/net/core/skbuff.c:105!
> > > [ 2482.459009] invalid opcode: 0000 [#1] PREEMPT SMP
> > > [ 2482.464371] Modules linked in:
> > > [ 2482.469682] CPU: 19 PID: 1317 Comm: kworker/19:1 Not tainted 4.9.135-dom0-e9d15b2-x86_64-iaas #2
> > > [ 2482.480362] Hardware name: Dell Inc. PowerEdge C8220/09N44V, BIOS 2.7.1 03/04/2015
> > > [ 2482.491008] Workqueue: ipv6_addrconf addrconf_dad_work
> > > [ 2482.496380] task: ffff88017eef1a00 task.stack: ffffc90001fcc000
> > > [ 2482.501785] RIP: e030:[<ffffffff815ed71f>] [<ffffffff815ed71f>] skb_panic+0x5f/0x70
> > > [ 2482.512450] RSP: e02b:ffffc90001fcfba8 EFLAGS: 00010296
> > > [ 2482.517817] RAX: 0000000000000088 RBX: ffff880117fb0800 RCX: 0000000000000000
> > > [ 2482.528447] RDX: 0000000000000088 RSI: ffff880184cd03c8 RDI: ffff880184cd03c8
> > > [ 2482.539085] RBP: ffffc90001fcfc00 R08: 00000000000006a8 R09: ffffffff81ea7359
> > > [ 2482.549717] R10: ffff880180406f80 R11: 00000000000006a8 R12: ffff880147258cc0
> > > [ 2482.560350] R13: ffffc90001fcfc20 R14: ffffffff81d13440 R15: 0000000000000000
> > > [ 2482.570993] FS: 0000000000000000(0000) GS:ffff880184cc0000(0000) knlGS:0000000000000000
> > > [ 2482.581646] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 2482.587039] CR2: 00007f5b17f032b0 CR3: 0000000001c08000 CR4: 0000000000042660
> > > [ 2482.597675] Stack:
> > > [ 2482.602958] ffff880146458fd0 000000000000fff8 0000000000000ec0 ffff88017f3f0000
> > > [ 2482.613619] ffffffff815efa62 ffffffff816e4f78 ffff880117fb0800 ffffc90001fcfc20
> > > [ 2482.624288] ffff880147258cc0 ffff88017f3f0000 ffff880146502000 ffffc90001fcfc68
> > > [ 2482.634955] Call Trace:
> > > [ 2482.640254] [<ffffffff815efa62>] ? skb_put+0x42/0x50
> > > [ 2482.645633] [<ffffffff816e4f78>] ? ip6_mc_hdr.constprop.36+0x58/0xd0
> > > [ 2482.651045] [<ffffffff816e511a>] ? mld_newpack+0x12a/0x1e0
> > > [ 2482.656421] [<ffffffff816e5257>] ? add_grhead.isra.28+0x87/0xa0
> > > [ 2482.661825] [<ffffffff816e60d6>] ? add_grec+0x446/0x4c0
> > > [ 2482.667198] [<ffffffff8108b06b>] ? __local_bh_enable_ip+0x1b/0xb0
> > > [ 2482.672609] [<ffffffff816e6328>] ? mld_send_initial_cr.part.29+0x58/0xa0
> > > [ 2482.678022] [<ffffffff816e83d6>] ? ipv6_mc_dad_complete+0x26/0x60
> > > [ 2482.683441] [<ffffffff816cc2cf>] ? addrconf_dad_completed+0x29f/0x2c0
> > > [ 2482.688850] [<ffffffff816e6b84>] ? ipv6_dev_mc_inc+0x194/0x2c0
> > > [ 2482.694249] [<ffffffff816cc3ee>] ? addrconf_dad_work+0xfe/0x3d0
> > > [ 2482.699650] [<ffffffff817484ed>] ? _raw_spin_unlock_irq+0xd/0x20
> > > [ 2482.705052] [<ffffffff8109de12>] ? process_one_work+0x142/0x3e0
> > > [ 2482.710453] [<ffffffff8109e112>] ? worker_thread+0x62/0x480
> > > [ 2482.715848] [<ffffffff8109e0b0>] ? process_one_work+0x3e0/0x3e0
> > > [ 2482.721256] [<ffffffff810a3472>] ? kthread+0xe2/0x100
> > > [ 2482.726621] [<ffffffff81028701>] ? __switch_to+0x261/0x6b0
> > > [ 2482.732006] [<ffffffff810a3390>] ? kthread_park+0x60/0x60
> > > [ 2482.737379] [<ffffffff81748c37>] ? ret_from_fork+0x57/0x70
> > > [ 2482.742761] Code: 00 00 48 89 44 24 10 8b 87 b0 00 00 00 48 89 44 24 08 48 8b 87 c0 00 00 00 48 c7 c7 50 8e a2 81 48 89 04 24 31 c0 e8 b5 07 b6 ff <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
> > > [ 2482.759199] RIP [<ffffffff815ed71f>] skb_panic+0x5f/0x70
> > > [ 2482.764672] RSP <ffffc90001fcfba8>
> > > [ 2482.771186] ---[ end trace 6d0fe52ed049d841 ]---
> > > [ 2482.776641] Kernel panic - not syncing: Fatal exception in interrupt
> > > [ 2482.861621] Kernel Offset: disabled
> > >
> > > I circumvented the bug by applying this patch:
> > > diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> > > index 21f6deb2aec9..2762c3dcc883 100644
> > > --- a/net/ipv6/mcast.c
> > > +++ b/net/ipv6/mcast.c
> > > @@ -1605,8 +1605,6 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu)
> > > IPV6_TLV_PADN, 0 };
> > >
> > > /* we assume size > sizeof(ra) here */
> > > - /* limit our allocations to order-0 page */
> > > - size = min_t(int, size, SKB_MAX_ORDER(0, 0));
> > > skb = sock_alloc_send_skb(sk, size, 1, &err);
> > >
> > > if (!skb)
> > >
> > > The lines are introduced by commit 72e09ad107e78d69ff4d3b97a69f0aad2b77280f
> > > stating that "order-2 GRP_ATOMIC allocations are very unreliable"
> > > I then wonder if this statement is still relevant, or if such a patch
> > > would be acceptable to you.
> >
> > Thanks for the report, but this patch is not correct.
> >
> > I rather suspect commit 1837b2e2bcd23137766555a63867e649c0b637f0 ("mld, igmp: Fix reserved tailroom calculation")
> > is the problem.
> >
>
> If I follow the code here, skb_over_panic is triggered if
> skb->tail > skb->end, with the order-0 page limitation, and mtu 9000,
> we can assume (in the case of PAGE_SIZE=4k) skb->end to be 4k.
>
> Tail is set to 0 by the alloc, and then to LL_RESERVED_SPACE of the net
> device by the skb_reserve. We then only change it during the skb_put
> leading to the skb_over_panic by setting it to
> LL_RESERVED_SPACE(dev) + sizeof(struct ipv6hdr)
> So I don't see what is the role of the commit you suspect here, the only
> cause I can see here is that the needed space is just larger than a 4k
> page.
> If you can explain to me why you think the reserved tailroom calculation
> has an impact on this skb_put, I might be able to look at it more in
> depth
Indeed, commit 1837b2e2bcd2 ("mld, igmp: Fix reserved tailroom
calculation", v4.5) only changes the assignment of
skb->reserved_tailroom which has no impact on the failure of the
skb_put().
Given the panic splat that Nicolas posted, it looks like vif1.1 has
LL_RESERVED_SPACE() === 65488, which is quite a lot. What kind of
interface is vif1.1?
>
> The patch I posted is of course a bit harsh and I do not really expect
> it to be the right solution, but unless I set a lesser MTU (<PAGE_SIZE),
> I don't really see any way to get it to work with the order 0 page
> limitation
It's not the right solution because high order atomic allocations are
especially unreliable; reclaim and compaction are not possible at that
time. igmp_sk has sk_allocation = GFP_ATOMIC.
Maybe there is some problem around the needed_headroom manipulations
done in br_add_if().
Can you try again without your changes but with the following extra
debugging prints and post the output (just the extra bridge messages and
the skb_over_panic line are enough I think).
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 9b46d2dc4c22..a911e7653bc7 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -641,11 +641,15 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
netdev_update_features(br->dev);
br_hr = br->dev->needed_headroom;
+ br_info(br, "before: n_h %u port %s n_h %u\n", br_hr, dev->name,
+ dev->needed_headroom);
dev_hr = netdev_get_fwd_headroom(dev);
if (br_hr < dev_hr)
update_headroom(br, dev_hr);
else
netdev_set_rx_headroom(dev, br_hr);
+ br_info(br, "after: n_h %u port %s n_h %u\n",
+ br->dev->needed_headroom, dev->name, dev->needed_headroom);
if (br_fdb_insert(br, p, dev->dev_addr, 0))
netdev_err(dev, "failed insert local address bridge forwarding table\n");
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 946de0e24c87..9b26e9201a92 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -97,10 +97,12 @@ EXPORT_SYMBOL(sysctl_max_skb_frags);
static void skb_panic(struct sk_buff *skb, unsigned int sz, void *addr,
const char msg[])
{
- pr_emerg("%s: text:%p len:%d put:%d head:%p data:%p tail:%#lx end:%#lx dev:%s\n",
+ pr_emerg("%s: text:%p len:%d put:%d head:%p data:%p tail:%#lx end:%#lx dev:%s hhl:%u n_h:%u\n",
msg, addr, skb->len, sz, skb->head, skb->data,
(unsigned long)skb->tail, (unsigned long)skb->end,
- skb->dev ? skb->dev->name : "<NULL>");
+ skb->dev ? skb->dev->name : "<NULL>",
+ skb->dev ? skb->dev->hard_header_len : 0,
+ skb->dev ? skb->dev->needed_headroom : 0);
BUG();
}
next prev parent reply other threads:[~2018-12-05 6:58 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-03 15:20 [Bug] net/ipv6: skb_over_panic in mld_newpack Nicolas Belouin
2018-12-03 15:59 ` Eric Dumazet
2018-12-04 10:52 ` Nicolas Belouin
2018-12-05 6:57 ` Benjamin Poirier [this message]
2018-12-05 15:57 ` Nicolas Belouin
2018-12-05 23:34 ` Benjamin Poirier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181205065756.GA23806@f2 \
--to=bpoirier@suse.com \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=nicolas.belouin@gandi.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.