From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Ma
Subject: Re: Corrupted SKB
Date: Tue, 25 Apr 2017 21:42:10 -0700
Message-ID:
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: Linux Kernel Network Developers , jin.oyj@alibaba-inc.com
To: Cong Wang
Return-path:
Received: from mail-oi0-f46.google.com ([209.85.218.46]:34859 "EHLO mail-oi0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1176338AbdDZEmL (ORCPT ); Wed, 26 Apr 2017 00:42:11 -0400
Received: by mail-oi0-f46.google.com with SMTP id j201so195933084oih.2 for ; Tue, 25 Apr 2017 21:42:11 -0700 (PDT)
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

2017-04-18 21:46 GMT-07:00 Michael Ma :
> 2017-04-18 16:12 GMT-07:00 Cong Wang :
>> On Mon, Apr 17, 2017 at 5:39 PM, Michael Ma wrote:
>>> Hi -
>>>
>>> We've implemented a "glue" qdisc similar to mqprio which can associate
>>> one qdisc to multiple txqs as the root qdisc. Reference count of the
>>> child qdiscs have been adjusted properly in this case so that it
>>> represents the number of txqs it has been attached to. However when
>>> sending packets we saw the skb from dequeue_skb() corrupted with the
>>> following call stack:
>>>
>>> [exception RIP: netif_skb_features+51]
>>> RIP: ffffffff815292b3 RSP: ffff8817f6987940 RFLAGS: 00010246
>>>
>>> #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
>>> #10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
>>> #11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
>>> #12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03
>>>
>>> It looks like the skb has already been released since its dev pointer
>>> field is invalid.
>>>
>>> Any clue on how this can be investigated further? My current thought
>>> is to add some instrumentation to the place where skb is released and
>>> analyze whether there is any race condition happening there. However
>>
>> Either dropwatch or perf could do the work to instrument kfree_skb().
>
> Thanks - will try it out.

I'm using perf to collect the call stack for kfree_skb() and trying to
correlate it with the address of the corrupted skb. However, when the
system crashes, the perf.data file is also corrupted. How can I view
this file if the system crashes before perf exits?

>>
>>> by looking through the existing code I think the case where one root
>>> qdisc is associated with multiple txqs already exists (when mqprio is
>>> not used) so not sure why it won't work when we group txqs and assign
>>> each group a root qdisc. Any insight on this issue would be much
>>> appreciated!
>>
>> How do you implement ->attach()? How does it work with netdev_pick_tx()?
>
> attach() essentially grafts the default qdisc(pfifo) to each "txq
> group" represented by a TC class. For netdev_pick_txq() we use classid
> of the socket to select a class based on a "class id base" and the
> class to txq mapping defined together with this glue qdisc - it's
> pretty much the same as mqprio with the difference of mapping one
> class to multiple txqs and selecting the txq through a hash.
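
For reference, the txq selection described in the quoted text above can be
sketched in userspace C roughly as follows. This is only an illustration
under stated assumptions, not the actual qdisc code: struct txq_group,
pick_txq(), mix32() and the layout of each group as a contiguous txq range
are hypothetical names and choices made for the example, and flow_hash
stands in for whatever per-flow hash the real implementation uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one "txq group" (one TC class) as a contiguous txq range. */
struct txq_group {
	uint16_t first_txq;	/* index of the first txq in the group */
	uint16_t num_txqs;	/* number of txqs in the group */
};

/* Simple 32-bit mixer, a stand-in for a real flow hash (assumption). */
static uint32_t mix32(uint32_t x)
{
	x ^= x >> 16;
	x *= 0x45d9f3bU;
	x ^= x >> 16;
	return x;
}

/*
 * Map a socket classid to a txq:
 *   1. subtract the "class id base" to get the class (group) index;
 *   2. hash the flow and reduce it modulo the group size to pick one
 *      txq inside that group.
 * Returns the txq index, or -1 if the classid does not map to a group.
 */
static int pick_txq(const struct txq_group *groups, uint32_t num_groups,
		    uint32_t classid_base, uint32_t sock_classid,
		    uint32_t flow_hash)
{
	uint32_t idx;

	assert(groups != NULL);

	if (sock_classid < classid_base)
		return -1;

	idx = sock_classid - classid_base;
	if (idx >= num_groups || groups[idx].num_txqs == 0)
		return -1;

	return (int)(groups[idx].first_txq +
		     mix32(flow_hash) % groups[idx].num_txqs);
}
```

With two groups of four txqs each ({0, 4} and {4, 4}) and a class id base of
0x10, a socket with classid 0x11 always lands on a txq in the range 4..7, and
the same flow_hash always yields the same txq within that range.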