netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Ben Greear <greearb@candelatech.com>
Cc: NetDev <netdev@vger.kernel.org>,
	robert@herjulf.net, "David S. Miller" <davem@davemloft.net>
Subject: Re: pktgen and spin_lock_bh in xmit path
Date: Tue, 20 Oct 2009 23:22:44 +0200	[thread overview]
Message-ID: <4ADE2A24.6080300@gmail.com> (raw)
In-Reply-To: <4ADE2735.9000807@candelatech.com>

Ben Greear a écrit :
> On 10/20/2009 11:54 AM, Eric Dumazet wrote:
> 
>> -    queue_map = skb_get_queue_mapping(pkt_dev->skb);
>> +    queue_map = pkt_dev->cur_queue_map;
>> +    /*
>> +     * tells skb_tx_hash() to use this tx queue.
>> +     * We should reset skb->mapping before each xmit() because
>> +     * xmit() might change it.
>> +     */
>> +    skb_record_rx_queue(pkt_dev->skb, queue_map);
>>       txq = netdev_get_tx_queue(odev, queue_map);
> 
> I think that must be wrong.  The record_rx_queue sets it to queue_map + 1,
> but the hard-start-xmit method (in ixgbe/ixgbe_main.c, at least), takes the
> skb->queue_map and uses it as an index with no subtraction.


Yes but check net/core/dev.c I quoted in my previous mail :
We change queue_map if skb goes through dev_queue_xmit()
(as done by macvlan)

u16 skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb)
{
        u32 hash;

        if (skb_rx_queue_recorded(skb)) {
                hash = skb_get_rx_queue(skb);
                while (unlikely(hash >= dev->real_num_tx_queues))
                        hash -= dev->real_num_tx_queues;
                return hash;
        }

        if (skb->sk && skb->sk->sk_hash)
                hash = skb->sk->sk_hash;
        else
                hash = skb->protocol;

        hash = jhash_1word(hash, skb_tx_hashrnd);

        return (u16) (((u64) hash * dev->real_num_tx_queues) >> 32);
}
EXPORT_SYMBOL(skb_tx_hash);

static struct netdev_queue *dev_pick_tx(struct net_device *dev,
                                        struct sk_buff *skb)
{
        const struct net_device_ops *ops = dev->netdev_ops;
        u16 queue_index = 0;

        if (ops->ndo_select_queue)
                queue_index = ops->ndo_select_queue(dev, skb);
        else if (dev->real_num_tx_queues > 1)
                queue_index = skb_tx_hash(dev, skb);

        skb_set_queue_mapping(skb, queue_index);
        return netdev_get_tx_queue(dev, queue_index);
}

So if skb->queue_mapping was X+1 before entering dev_pick_tx(), it is X when
leaving dev_pick_tx()

> 
> This causes watchdog timeouts because we are calling txq_trans_update in
> pktgen on
> queue 0, for instance, but sending pkts on queue 1.  If queue 1 is ever
> busy when the WD fires, link is reset.
> 

Problem is not pktgen IMHO. 

Problem is skb->queue_mapping has different meaning if skb is directly given to a real device -> start_xmit()

( In this case skb->queue_mapping should be between [O ... real_num_tx_queues-1])

But if it goes through dev_queue_xmit(), it should be set between [1 .. real_num_tx_queues], because
dev_pick_tx() will decrement skb->queue_mapping

In fact skb->queue_mapping only works for forwarded packets, not locally generated ones.

I am too tired to cook a fix at this moment, sorry :(

  reply	other threads:[~2009-10-20 21:22 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-20  3:38 pktgen and spin_lock_bh in xmit path Ben Greear
2009-10-20  3:48 ` Eric Dumazet
2009-10-20  4:52   ` Ben Greear
2009-10-20 17:37     ` Ben Greear
2009-10-20 17:44       ` Eric Dumazet
2009-10-20 17:54         ` Ben Greear
2009-10-20 18:35           ` Eric Dumazet
2009-10-20 18:54             ` Eric Dumazet
2009-10-20 20:16               ` Ben Greear
2009-10-20 21:10               ` Ben Greear
2009-10-20 21:22                 ` Eric Dumazet [this message]
2009-10-20 21:30                   ` Ben Greear
2009-10-20 21:57                     ` Eric Dumazet
2009-10-20 23:17                       ` Ben Greear
2009-10-21  3:05                         ` Ben Greear
2009-10-21  3:12                           ` Eric Dumazet
2009-10-21  3:59                             ` Eric Dumazet
2009-10-21  5:00                               ` Ben Greear
2009-10-21  5:14                                 ` Eric Dumazet
2009-10-21  5:40                                   ` Ben Greear
2009-10-21  5:12                 ` Krishna Kumar2
2009-10-21  5:32                   ` Ben Greear

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ADE2A24.6080300@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=davem@davemloft.net \
    --cc=greearb@candelatech.com \
    --cc=netdev@vger.kernel.org \
    --cc=robert@herjulf.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).