From: "Keller, Jacob E" <jacob.e.keller@intel.com>
To: "eric.dumazet@gmail.com" <eric.dumazet@gmail.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"richardcochran@gmail.com" <richardcochran@gmail.com>
Subject: Re: kernel WARNING on skb_complete_tx_timestamp
Date: Mon, 13 Apr 2015 16:31:11 +0000
Message-ID: <1428942671.28752.14.camel@intel.com>
In-Reply-To: <1428724826.25985.337.camel@edumazet-glaptop2.roam.corp.google.com>

Hi Eric,

On Fri, 2015-04-10 at 21:00 -0700, Eric Dumazet wrote:
> On Sat, 2015-04-11 at 01:34 +0000, Keller, Jacob E wrote:
> > Hi Richard,
> >
> > I don't know if you'd know this but I believe you're the best person to ask. If anyone else has an answer please feel free to chime in.
> >
> > [ 2744.552823] Modules linked in: fm10k(OE) vxlan ip6_udp_tunnel udp_tunnel uio fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc openvswitch libcrc32c joydev x86_pkg_temp_thermal coretemp kvm_intel kvm mei_me crct10dif_pclmul crc32_pclmul wmi crc32c_intel mei ipmi_ssif iTCO_wdt iTCO_vendor_support ipmi_devintf ghash_clmulni_intel lpc_ich ipmi_si tpm_tis mfd_core tpm ipmi_msghandler ioatdma sb_edac shpchp edac_core i2c_i801 mgag200 drm_kms_helper ttm drm isci igb firewire_ohci firewire_core libsas ptp crc_itu_t dca scsi_transport_sas pps_core i2c_algo_bit [last unloaded: fm10k]
> > [ 2744.552896] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W OE 3.19.3-200.fc21.x86_64 #1
> > [ 2744.552899] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.8x23.060520140825 06/05/2014
> > [ 2744.552901] 0000000000000000 2f4c8b10ea3f9848 ffff88081ee03a38 ffffffff8176e215
> > [ 2744.552906] 0000000000000000 0000000000000000 ffff88081ee03a78 ffffffff8109bc1a
> > [ 2744.552910] ffff88081ee03c50 ffff88080e55fc00 ffff88080e55fc00 ffffffff81647c50
> > [ 2744.552914] Call Trace:
> > [ 2744.552917] <IRQ> [<ffffffff8176e215>] dump_stack+0x45/0x57
> > [ 2744.552931] [<ffffffff8109bc1a>] warn_slowpath_common+0x8a/0xc0
> > [ 2744.552936] [<ffffffff81647c50>] ? skb_queue_purge+0x20/0x40
> > [ 2744.552941] [<ffffffff8109bd4a>] warn_slowpath_null+0x1a/0x20
> > [ 2744.552946] [<ffffffff81646911>] skb_release_head_state+0xe1/0xf0
> > [ 2744.552950] [<ffffffff81647b26>] skb_release_all+0x16/0x30
> > [ 2744.552954] [<ffffffff81647ba6>] kfree_skb+0x36/0x90
> > [ 2744.552958] [<ffffffff81647c50>] skb_queue_purge+0x20/0x40
> > [ 2744.552964] [<ffffffff81751f8d>] packet_sock_destruct+0x1d/0x90
> > [ 2744.552968] [<ffffffff81642053>] __sk_free+0x23/0x140
> > [ 2744.552973] [<ffffffff81642189>] sk_free+0x19/0x20
> > [ 2744.552977] [<ffffffff81647d60>] skb_complete_tx_timestamp+0x50/0x60
> > [ 2744.552988] [<ffffffffa02eee40>] fm10k_ts_tx_hwtstamp+0xd0/0x100 [fm10k]
> > [ 2744.552994] [<ffffffffa02e054e>] fm10k_1588_msg_pf+0x12e/0x140 [fm10k]
> > [ 2744.553002] [<ffffffffa02edf1d>] fm10k_tlv_msg_parse+0x8d/0xc0 [fm10k]
> > [ 2744.553010] [<ffffffffa02eb2d0>] fm10k_mbx_dequeue_rx+0x60/0xb0 [fm10k]
> > [ 2744.553016] [<ffffffffa02ebf98>] fm10k_sm_mbx_process+0x178/0x3c0 [fm10k]
> > [ 2744.553022] [<ffffffffa02e09ca>] fm10k_msix_mbx_pf+0xfa/0x360 [fm10k]
> > [ 2744.553030] [<ffffffff811030a7>] ? get_next_timer_interrupt+0x1f7/0x270
> > [ 2744.553036] [<ffffffff810f2a47>] handle_irq_event_percpu+0x77/0x1a0
> > [ 2744.553041] [<ffffffff810f2bab>] handle_irq_event+0x3b/0x60
> > [ 2744.553045] [<ffffffff810f5d6e>] handle_edge_irq+0x6e/0x120
> > [ 2744.553054] [<ffffffff81017414>] handle_irq+0x74/0x140
> > [ 2744.553061] [<ffffffff810bb54a>] ? atomic_notifier_call_chain+0x1a/0x20
> > [ 2744.553066] [<ffffffff8177777f>] do_IRQ+0x4f/0xf0
> > [ 2744.553072] [<ffffffff8177556d>] common_interrupt+0x6d/0x6d
> > [ 2744.553074] <EOI> [<ffffffff81609b16>] ? cpuidle_enter_state+0x66/0x160
> > [ 2744.553084] [<ffffffff81609b01>] ? cpuidle_enter_state+0x51/0x160
> > [ 2744.553087] [<ffffffff81609cf7>] cpuidle_enter+0x17/0x20
> > [ 2744.553092] [<ffffffff810de101>] cpu_startup_entry+0x321/0x3c0
> > [ 2744.553098] [<ffffffff81764497>] rest_init+0x77/0x80
> > [ 2744.553103] [<ffffffff81d4f02c>] start_kernel+0x4a4/0x4c5
> > [ 2744.553107] [<ffffffff81d4e120>] ? early_idt_handlers+0x120/0x120
> > [ 2744.553110] [<ffffffff81d4e4d7>] x86_64_start_reservations+0x2a/0x2c
> > [ 2744.553114] [<ffffffff81d4e62b>] x86_64_start_kernel+0x152/0x175
> >
> > This occurs because of the WARN_ON in kfree_skb which results from sock_hold(sk) and sock_put(sk)
> >
> > I have a driver (fm10k) which receives a notification of Tx timestamp via an IRQ, and then when I call skb_complete_tx_timestamp I get this warning. I believe this is a result of calling sk_free, which the description *says* is ok to call in any context.. but then we get this warning.
> >
> > I'm really not sure exactly how this situation occurred. Eventually we call kfree_skb() while we are in irq context which results in the warning.
>
> At first look, there are some issues with this driver.
>
I should clarify here: I am working on fixing the issues with 1588
support in this driver. The warning I hit did not come from the same
code that is currently upstream, and there are a lot of issues we are
actively debugging right now. However, your suggestion below still seems
to apply. Hopefully the full set of fixes will be ready to post soon.

> fm10k_ts_tx_enqueue() is racy and seems also buggy, freeing wrong skb.
>
> Could you try :
>
> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c b/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
> index 02008e976d186f754470340089f344e781e9bb04..070d4f0b3c03bb0e31e216eb82d00f4fdcb4ea9f 100644
> --- a/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_ptp.c
> @@ -70,16 +70,15 @@ void fm10k_ts_tx_enqueue(struct fm10k_intfc *interface, struct sk_buff *skb)
> * if none are present then insert skb in tail of list
> */
> skb = fm10k_ts_tx_skb(interface, FM10K_CB(clone)->fi.w.dglort);
> - if (!skb)
> + if (!skb) {
> + skb_shinfo(clone)->tx_flags |= SKBTX_IN_PROGRESS;
> __skb_queue_tail(list, clone);
> -
> + }
> spin_unlock_irqrestore(&list->lock, flags);
>
> /* if list is already has one then we just free the clone */
> if (skb)
> - kfree_skb(skb);
> - else
> - skb_shinfo(clone)->tx_flags |= SKBTX_IN_PROGRESS;
> + kfree_skb(clone);
So free the clone instead of the original? I'm not entirely sure how
this changes the flow, but I will give this a shot.
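
To check my understanding, here is a rough userspace model of what I
think the patched enqueue path does, going only by the hunk above. None
of these names exist in the driver: struct model_skb, struct model_queue
and the model_* helpers are made up for illustration, the in_progress
field only stands in for SKBTX_IN_PROGRESS, and the list lock is left
out entirely. The point it tries to capture is that a buffer already
sitting in the queue is left alone, and the new clone is the one that
gets dropped when a buffer with the same dglort is already queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an sk_buff; not the kernel structure. */
struct model_skb {
	uint16_t dglort;        /* lookup key, as in the hunk */
	bool in_progress;       /* models SKBTX_IN_PROGRESS */
	struct model_skb *next;
};

/* Hypothetical stand-in for the Tx timestamp list (no locking here). */
struct model_queue {
	struct model_skb *head;
	struct model_skb *tail;
};

/* Allocate a zeroed buffer carrying the given key. */
static struct model_skb *model_clone(uint16_t dglort)
{
	struct model_skb *s = calloc(1, sizeof(*s));

	if (s)
		s->dglort = dglort;
	return s;
}

/* Return the queued buffer with a matching key, or NULL. */
static struct model_skb *model_find(const struct model_queue *q,
				    uint16_t dglort)
{
	struct model_skb *s;

	for (s = q->head; s; s = s->next)
		if (s->dglort == dglort)
			return s;
	return NULL;
}

/*
 * Queue the clone unless a buffer with the same key is already queued.
 * Returns true if the clone was queued, false if it was freed.
 */
static bool model_enqueue(struct model_queue *q, struct model_skb *clone)
{
	struct model_skb *found = model_find(q, clone->dglort);

	if (!found) {
		/* mark the clone before it becomes visible in the queue */
		clone->in_progress = true;
		clone->next = NULL;
		if (q->tail)
			q->tail->next = clone;
		else
			q->head = clone;
		q->tail = clone;
		return true;
	}

	/* the queued buffer stays valid; only the new clone is dropped */
	free(clone);
	return false;
}
```

If that reading is right, the old code freed the buffer that the lookup
returned while it was still linked in the list, so a later walk of the
list could touch freed memory. In the model, a second model_enqueue()
with the same dglort returns false and the first buffer stays queued.
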
Regards,
Jake
> }
>
> void fm10k_ts_tx_hwtstamp(struct fm10k_intfc *interface, __le16 dglort,
>
>
>