* [PATCH 1/2] bonding: fix incorrect transmit queue offset @ 2011-02-23 19:42 Andy Gospodarek 2011-02-23 22:19 ` Jay Vosburgh 2011-02-23 23:08 ` Phil Oester 0 siblings, 2 replies; 12+ messages in thread From: Andy Gospodarek @ 2011-02-23 19:42 UTC (permalink / raw) To: netdev, Phil Oester; +Cc: Ben Hutchings, Jay Vosburgh Users noticed the following messages: kernel: bond0 selects TX queue 16, but real number of TX queues is 16 were filling their logs when frames received from a device that contained 16 input queues were being forwarded out of a bonded device. Ben pointed out that the contents of skb->queue_mapping cannot be directly mapped to a transmit queue, so I went with his suggestion for a solution to the offset problem. The function now also warns the user to increase the default value for tx_queues if needed and returns a valid tx queue to dev_pick_tx. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Reported-by: Phil Oester <kernel@linuxace.com> CC: Ben Hutchings <bhutchings@solarflare.com> --- drivers/net/bonding/bond_main.c | 24 +++++++++++++++++------- 1 files changed, 17 insertions(+), 7 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 77e3c6a..1512c83 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -4533,15 +4533,25 @@ out: return res; } +/* + * This helper function exists to help dev_pick_tx get the correct + * destination queue. Using a helper function skips the a call to + * skb_tx_hash and will put the skbs in the queue we expect on their + * way down to the bonding driver. + */ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb) { - /* - * This helper function exists to help dev_pick_tx get the correct - * destination queue. Using a helper function skips the a call to - * skb_tx_hash and will put the skbs in the queue we expect on their - * way down to the bonding driver. - */ - return skb->queue_mapping; + u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0; + + while (txq >= dev->real_num_tx_queues) { + /* let the user know if we do not have enough tx queues */ + if (net_ratelimit()) + pr_warning("%s selects invalid tx queue %d. Consider" + " setting module option tx_queues > %d.", + dev->name, txq, dev->real_num_tx_queues); + txq -= dev->real_num_tx_queues; + } + return txq; } static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev) -- 1.7.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] bonding: fix incorrect transmit queue offset 2011-02-23 19:42 [PATCH 1/2] bonding: fix incorrect transmit queue offset Andy Gospodarek @ 2011-02-23 22:19 ` Jay Vosburgh 2011-02-23 23:12 ` Andy Gospodarek 2011-02-23 23:08 ` Phil Oester 1 sibling, 1 reply; 12+ messages in thread From: Jay Vosburgh @ 2011-02-23 22:19 UTC (permalink / raw) To: Andy Gospodarek; +Cc: netdev, Phil Oester, Ben Hutchings Andy Gospodarek <andy@greyhouse.net> wrote: >Users noticed the following messages: > >kernel: bond0 selects TX queue 16, but real number of TX queues is 16 > >were filling their logs when frames received from a device that >contained 16 input queues were being forwarded out of a bonded device. >Ben pointed out that the contents of skb->queue_mapping cannot be >directly mapped to a transmit queue, so I went with his suggestion for a >solution to the offset problem. > >The function now also warns the user to increase the default value for >tx_queues if needed and returns a valid tx queue to dev_pick_tx. > >Signed-off-by: Andy Gospodarek <andy@greyhouse.net> >Reported-by: Phil Oester <kernel@linuxace.com> >CC: Ben Hutchings <bhutchings@solarflare.com> >--- > drivers/net/bonding/bond_main.c | 24 +++++++++++++++++------- > 1 files changed, 17 insertions(+), 7 deletions(-) > >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >index 77e3c6a..1512c83 100644 >--- a/drivers/net/bonding/bond_main.c >+++ b/drivers/net/bonding/bond_main.c >@@ -4533,15 +4533,25 @@ out: > return res; > } > >+/* >+ * This helper function exists to help dev_pick_tx get the correct >+ * destination queue. Using a helper function skips the a call to >+ * skb_tx_hash and will put the skbs in the queue we expect on their >+ * way down to the bonding driver. >+ */ > static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb) > { >- /* >- * This helper function exists to help dev_pick_tx get the correct >- * destination queue. Using a helper function skips the a call to >- * skb_tx_hash and will put the skbs in the queue we expect on their >- * way down to the bonding driver. >- */ >- return skb->queue_mapping; >+ u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0; >+ >+ while (txq >= dev->real_num_tx_queues) { >+ /* let the user know if we do not have enough tx queues */ >+ if (net_ratelimit()) >+ pr_warning("%s selects invalid tx queue %d. Consider" >+ " setting module option tx_queues > %d.", >+ dev->name, txq, dev->real_num_tx_queues); >+ txq -= dev->real_num_tx_queues; Would this be better as: if (txq >= dev->real_num_tx_queues) { /* let the user know if we do not have enough tx queues */ if (net_ratelimit()) pr_warning("%s selects invalid tx queue %d. Consider" " setting module option tx_queues > %d.", dev->name, txq, dev->real_num_tx_queues); txq %= dev->real_num_tx_queues; } I.e., use one modulus instead of a looping subtraction? If the queue id in the packet is substantially out of range, the loop may interate multiple times. Presumably you want to distribute the out of range queue mappings (not just assign them all to the same queue id). -J >+ } >+ return txq; > } > > static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev) >-- >1.7.4 > --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] bonding: fix incorrect transmit queue offset 2011-02-23 22:19 ` Jay Vosburgh @ 2011-02-23 23:12 ` Andy Gospodarek 0 siblings, 0 replies; 12+ messages in thread From: Andy Gospodarek @ 2011-02-23 23:12 UTC (permalink / raw) To: Jay Vosburgh; +Cc: Andy Gospodarek, netdev, Phil Oester, Ben Hutchings On Wed, Feb 23, 2011 at 02:19:05PM -0800, Jay Vosburgh wrote: > Andy Gospodarek <andy@greyhouse.net> wrote: > > >Users noticed the following messages: > > > >kernel: bond0 selects TX queue 16, but real number of TX queues is 16 > > > >were filling their logs when frames received from a device that > >contained 16 input queues were being forwarded out of a bonded device. > >Ben pointed out that the contents of skb->queue_mapping cannot be > >directly mapped to a transmit queue, so I went with his suggestion for a > >solution to the offset problem. > > > >The function now also warns the user to increase the default value for > >tx_queues if needed and returns a valid tx queue to dev_pick_tx. > > > >Signed-off-by: Andy Gospodarek <andy@greyhouse.net> > >Reported-by: Phil Oester <kernel@linuxace.com> > >CC: Ben Hutchings <bhutchings@solarflare.com> > >--- > > drivers/net/bonding/bond_main.c | 24 +++++++++++++++++------- > > 1 files changed, 17 insertions(+), 7 deletions(-) > > > >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > >index 77e3c6a..1512c83 100644 > >--- a/drivers/net/bonding/bond_main.c > >+++ b/drivers/net/bonding/bond_main.c > >@@ -4533,15 +4533,25 @@ out: > > return res; > > } > > > >+/* > >+ * This helper function exists to help dev_pick_tx get the correct > >+ * destination queue. Using a helper function skips the a call to > >+ * skb_tx_hash and will put the skbs in the queue we expect on their > >+ * way down to the bonding driver. > >+ */ > > static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb) > > { > >- /* > >- * This helper function exists to help dev_pick_tx get the correct > >- * destination queue. Using a helper function skips the a call to > >- * skb_tx_hash and will put the skbs in the queue we expect on their > >- * way down to the bonding driver. > >- */ > >- return skb->queue_mapping; > >+ u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0; > >+ > >+ while (txq >= dev->real_num_tx_queues) { > >+ /* let the user know if we do not have enough tx queues */ > >+ if (net_ratelimit()) > >+ pr_warning("%s selects invalid tx queue %d. Consider" > >+ " setting module option tx_queues > %d.", > >+ dev->name, txq, dev->real_num_tx_queues); > >+ txq -= dev->real_num_tx_queues; > > Would this be better as: > > if (txq >= dev->real_num_tx_queues) { > /* let the user know if we do not have enough tx queues */ > if (net_ratelimit()) > pr_warning("%s selects invalid tx queue %d. Consider" > " setting module option tx_queues > %d.", > dev->name, txq, dev->real_num_tx_queues); > txq %= dev->real_num_tx_queues; > } > > I.e., use one modulus instead of a looping subtraction? If the > queue id in the packet is substantially out of range, the loop may > interate multiple times. Presumably you want to distribute the out of > range queue mappings (not just assign them all to the same queue id). > I hesitated to put the printk inside the while, but decided to do it as it looked cleaner than a bunch of if/while statements and this loop was likely to get run only once based on current settings and hardware out there. I'm not a big fan of modulo as it is generally pretty expensive when most cases will be covered with a single subtraction. I could be over-stating the expense, though. While it would be nice to distribute things more evenly, this code essentially does what the __skb_tx_hash does right now, so it seemed logical to do the same. If you would rather not see the pr_warning in the loop, this could work as well and doesn't look too nasty. diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 77e3c6a..2d3ae54 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -4533,15 +4533,27 @@ out: return res; } +/* + * This helper function exists to help dev_pick_tx get the correct + * destination queue. Using a helper function skips the a call to + * skb_tx_hash and will put the skbs in the queue we expect on their + * way down to the bonding driver. + */ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb) { - /* - * This helper function exists to help dev_pick_tx get the correct - * destination queue. Using a helper function skips the a call to - * skb_tx_hash and will put the skbs in the queue we expect on their - * way down to the bonding driver. - */ - return skb->queue_mapping; + u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0; + + if (txq >= dev->real_num_tx_queues) { + /* let the user know if we do not have enough tx queues */ + if (net_ratelimit()) + pr_warning("%s selects invalid tx queue %d. Consider" + " setting module option tx_queues > %d.", + dev->name, txq, dev->real_num_tx_queues); + do + txq -= dev->real_num_tx_queues; + while (txq >= dev->real_num_tx_queues); + } + return txq; } static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev) ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] bonding: fix incorrect transmit queue offset 2011-02-23 19:42 [PATCH 1/2] bonding: fix incorrect transmit queue offset Andy Gospodarek 2011-02-23 22:19 ` Jay Vosburgh @ 2011-02-23 23:08 ` Phil Oester 2011-02-23 23:13 ` David Miller 1 sibling, 1 reply; 12+ messages in thread From: Phil Oester @ 2011-02-23 23:08 UTC (permalink / raw) To: Andy Gospodarek; +Cc: netdev, Ben Hutchings, Jay Vosburgh On Wed, Feb 23, 2011 at 02:42:49PM -0500, Andy Gospodarek wrote: > + * destination queue. Using a helper function skips the a call to s/the a/a/ or s/the a/the/ > + while (txq >= dev->real_num_tx_queues) { > + /* let the user know if we do not have enough tx queues */ > + if (net_ratelimit()) > + pr_warning("%s selects invalid tx queue %d. Consider" > + " setting module option tx_queues > %d.", > + dev->name, txq, dev->real_num_tx_queues); > + txq -= dev->real_num_tx_queues; > + } Think this would be better as a WARN_ONCE, as otherwise syslog will still get flooded with this - even when ratelimited. See get_rps_cpu in net/core/dev.c as an example.o Will test this out. Phil ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] bonding: fix incorrect transmit queue offset 2011-02-23 23:08 ` Phil Oester @ 2011-02-23 23:13 ` David Miller 2011-02-23 23:37 ` Ben Hutchings 0 siblings, 1 reply; 12+ messages in thread From: David Miller @ 2011-02-23 23:13 UTC (permalink / raw) To: kernel; +Cc: andy, netdev, bhutchings, fubar From: Phil Oester <kernel@linuxace.com> Date: Wed, 23 Feb 2011 15:08:44 -0800 > On Wed, Feb 23, 2011 at 02:42:49PM -0500, Andy Gospodarek wrote: >> + * destination queue. Using a helper function skips the a call to > > s/the a/a/ or s/the a/the/ > >> + while (txq >= dev->real_num_tx_queues) { >> + /* let the user know if we do not have enough tx queues */ >> + if (net_ratelimit()) >> + pr_warning("%s selects invalid tx queue %d. Consider" >> + " setting module option tx_queues > %d.", >> + dev->name, txq, dev->real_num_tx_queues); >> + txq -= dev->real_num_tx_queues; >> + } > > Think this would be better as a WARN_ONCE, as otherwise syslog will still > get flooded with this - even when ratelimited. See get_rps_cpu in > net/core/dev.c as an example.o Agreed. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] bonding: fix incorrect transmit queue offset 2011-02-23 23:13 ` David Miller @ 2011-02-23 23:37 ` Ben Hutchings 2011-02-23 23:43 ` Andy Gospodarek 2011-02-23 23:54 ` David Miller 0 siblings, 2 replies; 12+ messages in thread From: Ben Hutchings @ 2011-02-23 23:37 UTC (permalink / raw) To: David Miller; +Cc: kernel, andy, netdev, fubar On Wed, 2011-02-23 at 15:13 -0800, David Miller wrote: > From: Phil Oester <kernel@linuxace.com> > Date: Wed, 23 Feb 2011 15:08:44 -0800 > > > On Wed, Feb 23, 2011 at 02:42:49PM -0500, Andy Gospodarek wrote: > >> + * destination queue. Using a helper function skips the a call to > > > > s/the a/a/ or s/the a/the/ > > > >> + while (txq >= dev->real_num_tx_queues) { > >> + /* let the user know if we do not have enough tx queues */ > >> + if (net_ratelimit()) > >> + pr_warning("%s selects invalid tx queue %d. Consider" > >> + " setting module option tx_queues > %d.", > >> + dev->name, txq, dev->real_num_tx_queues); > >> + txq -= dev->real_num_tx_queues; > >> + } > > > > Think this would be better as a WARN_ONCE, as otherwise syslog will still > > get flooded with this - even when ratelimited. See get_rps_cpu in > > net/core/dev.c as an example.o > > Agreed. This shouldn't WARN at all. It is perfectly valid (though non-optimal) to have different numbers of queues on two different multiqueue devices. Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] bonding: fix incorrect transmit queue offset 2011-02-23 23:37 ` Ben Hutchings @ 2011-02-23 23:43 ` Andy Gospodarek 2011-02-23 23:54 ` David Miller 1 sibling, 0 replies; 12+ messages in thread From: Andy Gospodarek @ 2011-02-23 23:43 UTC (permalink / raw) To: Ben Hutchings; +Cc: David Miller, kernel, andy, netdev, fubar On Wed, Feb 23, 2011 at 11:37:49PM +0000, Ben Hutchings wrote: > On Wed, 2011-02-23 at 15:13 -0800, David Miller wrote: > > From: Phil Oester <kernel@linuxace.com> > > Date: Wed, 23 Feb 2011 15:08:44 -0800 > > > > > On Wed, Feb 23, 2011 at 02:42:49PM -0500, Andy Gospodarek wrote: > > >> + * destination queue. Using a helper function skips the a call to > > > > > > s/the a/a/ or s/the a/the/ > > > > > >> + while (txq >= dev->real_num_tx_queues) { > > >> + /* let the user know if we do not have enough tx queues */ > > >> + if (net_ratelimit()) > > >> + pr_warning("%s selects invalid tx queue %d. Consider" > > >> + " setting module option tx_queues > %d.", > > >> + dev->name, txq, dev->real_num_tx_queues); > > >> + txq -= dev->real_num_tx_queues; > > >> + } > > > > > > Think this would be better as a WARN_ONCE, as otherwise syslog will still > > > get flooded with this - even when ratelimited. See get_rps_cpu in > > > net/core/dev.c as an example.o > > > > Agreed. > > This shouldn't WARN at all. It is perfectly valid (though non-optimal) > to have different numbers of queues on two different multiqueue devices. > Agreed. Plus WARN seemed way to 'loud' for something like this. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] bonding: fix incorrect transmit queue offset 2011-02-23 23:37 ` Ben Hutchings 2011-02-23 23:43 ` Andy Gospodarek @ 2011-02-23 23:54 ` David Miller 2011-02-25 22:56 ` Phil Oester 1 sibling, 1 reply; 12+ messages in thread From: David Miller @ 2011-02-23 23:54 UTC (permalink / raw) To: bhutchings; +Cc: kernel, andy, netdev, fubar From: Ben Hutchings <bhutchings@solarflare.com> Date: Wed, 23 Feb 2011 23:37:49 +0000 > On Wed, 2011-02-23 at 15:13 -0800, David Miller wrote: >> From: Phil Oester <kernel@linuxace.com> >> Date: Wed, 23 Feb 2011 15:08:44 -0800 >> >> > On Wed, Feb 23, 2011 at 02:42:49PM -0500, Andy Gospodarek wrote: >> >> + * destination queue. Using a helper function skips the a call to >> > >> > s/the a/a/ or s/the a/the/ >> > >> >> + while (txq >= dev->real_num_tx_queues) { >> >> + /* let the user know if we do not have enough tx queues */ >> >> + if (net_ratelimit()) >> >> + pr_warning("%s selects invalid tx queue %d. Consider" >> >> + " setting module option tx_queues > %d.", >> >> + dev->name, txq, dev->real_num_tx_queues); >> >> + txq -= dev->real_num_tx_queues; >> >> + } >> > >> > Think this would be better as a WARN_ONCE, as otherwise syslog will still >> > get flooded with this - even when ratelimited. See get_rps_cpu in >> > net/core/dev.c as an example.o >> >> Agreed. > > This shouldn't WARN at all. It is perfectly valid (though non-optimal) > to have different numbers of queues on two different multiqueue devices. That's also a good point. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] bonding: fix incorrect transmit queue offset 2011-02-23 23:54 ` David Miller @ 2011-02-25 22:56 ` Phil Oester 2011-03-01 15:31 ` [PATCH 1/2 net-next][v2] " Andy Gospodarek 0 siblings, 1 reply; 12+ messages in thread From: Phil Oester @ 2011-02-25 22:56 UTC (permalink / raw) To: David Miller; +Cc: bhutchings, andy, netdev, fubar On Wed, Feb 23, 2011 at 03:54:51PM -0800, David Miller wrote: > From: Ben Hutchings <bhutchings@solarflare.com> > Date: Wed, 23 Feb 2011 23:37:49 +0000 > > > On Wed, 2011-02-23 at 15:13 -0800, David Miller wrote: > >> From: Phil Oester <kernel@linuxace.com> > >> Date: Wed, 23 Feb 2011 15:08:44 -0800 > >> > >> > On Wed, Feb 23, 2011 at 02:42:49PM -0500, Andy Gospodarek wrote: > >> >> + while (txq >= dev->real_num_tx_queues) { > >> >> + /* let the user know if we do not have enough tx queues */ > >> >> + if (net_ratelimit()) > >> >> + pr_warning("%s selects invalid tx queue %d. Consider" > >> >> + " setting module option tx_queues > %d.", > >> >> + dev->name, txq, dev->real_num_tx_queues); > >> >> + txq -= dev->real_num_tx_queues; > >> >> + } > >> > > >> > Think this would be better as a WARN_ONCE, as otherwise syslog will still > >> > get flooded with this - even when ratelimited. See get_rps_cpu in > >> > net/core/dev.c as an example.o > >> > >> Agreed. > > > > This shouldn't WARN at all. It is perfectly valid (though non-optimal) > > to have different numbers of queues on two different multiqueue devices. > > That's also a good point. The patch works as expected. Do we have any agreement on a final version? Phil ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2 net-next][v2] bonding: fix incorrect transmit queue offset 2011-02-25 22:56 ` Phil Oester @ 2011-03-01 15:31 ` Andy Gospodarek 2011-03-02 1:40 ` Phil Oester 0 siblings, 1 reply; 12+ messages in thread From: Andy Gospodarek @ 2011-03-01 15:31 UTC (permalink / raw) To: Phil Oester; +Cc: David Miller, bhutchings, andy, netdev, fubar On Fri, Feb 25, 2011 at 02:56:36PM -0800, Phil Oester wrote: > On Wed, Feb 23, 2011 at 03:54:51PM -0800, David Miller wrote: > > From: Ben Hutchings <bhutchings@solarflare.com> > > Date: Wed, 23 Feb 2011 23:37:49 +0000 > > > > > On Wed, 2011-02-23 at 15:13 -0800, David Miller wrote: > > >> From: Phil Oester <kernel@linuxace.com> > > >> Date: Wed, 23 Feb 2011 15:08:44 -0800 > > >> > > >> > On Wed, Feb 23, 2011 at 02:42:49PM -0500, Andy Gospodarek wrote: > > >> >> + while (txq >= dev->real_num_tx_queues) { > > >> >> + /* let the user know if we do not have enough tx queues */ > > >> >> + if (net_ratelimit()) > > >> >> + pr_warning("%s selects invalid tx queue %d. Consider" > > >> >> + " setting module option tx_queues > %d.", > > >> >> + dev->name, txq, dev->real_num_tx_queues); > > >> >> + txq -= dev->real_num_tx_queues; > > >> >> + } > > >> > > > >> > Think this would be better as a WARN_ONCE, as otherwise syslog will still > > >> > get flooded with this - even when ratelimited. See get_rps_cpu in > > >> > net/core/dev.c as an example.o > > >> > > >> Agreed. > > > > > > This shouldn't WARN at all. It is perfectly valid (though non-optimal) > > > to have different numbers of queues on two different multiqueue devices. > > > > That's also a good point. > > The patch works as expected. Do we have any agreement on a final version? > Thanks for the testing, Phil. I'm in favor of this patch as it does alert the admin that bonding may not have enough default queues, but it is not as verbose (backtrace et al) and likely to create bug reports as a message from WARN_ON. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> --- drivers/net/bonding/bond_main.c | 26 +++++++++++++++++++------- 1 files changed, 19 insertions(+), 7 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 584f97b..acc05d6 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -4643,15 +4643,27 @@ out: return res; } +/* + * This helper function exists to help dev_pick_tx get the correct + * destination queue. Using a helper function skips the a call to + * skb_tx_hash and will put the skbs in the queue we expect on their + * way down to the bonding driver. + */ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb) { - /* - * This helper function exists to help dev_pick_tx get the correct - * destination queue. Using a helper function skips the a call to - * skb_tx_hash and will put the skbs in the queue we expect on their - * way down to the bonding driver. - */ - return skb->queue_mapping; + u16 txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0; + + if (txq >= dev->real_num_tx_queues) { + /* let the user know if we do not have enough tx queues */ + if (net_ratelimit()) + pr_warning("%s selects invalid tx queue %d. Consider" + " setting module option tx_queues > %d.", + dev->name, txq, dev->real_num_tx_queues); + do + txq -= dev->real_num_tx_queues; + while (txq >= dev->real_num_tx_queues); + } + return txq; } static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev) ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2 net-next][v2] bonding: fix incorrect transmit queue offset 2011-03-01 15:31 ` [PATCH 1/2 net-next][v2] " Andy Gospodarek @ 2011-03-02 1:40 ` Phil Oester 2011-03-04 17:37 ` Jay Vosburgh 0 siblings, 1 reply; 12+ messages in thread From: Phil Oester @ 2011-03-02 1:40 UTC (permalink / raw) To: Andy Gospodarek; +Cc: David Miller, bhutchings, netdev, fubar On Tue, Mar 01, 2011 at 10:31:36AM -0500, Andy Gospodarek wrote: > > The patch works as expected. Do we have any agreement on a final version? > > > > Thanks for the testing, Phil. > > I'm in favor of this patch as it does alert the admin that bonding may > not have enough default queues, but it is not as verbose (backtrace et > al) and likely to create bug reports as a message from WARN_ON. > + if (net_ratelimit()) > + pr_warning("%s selects invalid tx queue %d. Consider" > + " setting module option tx_queues > %d.", > + dev->name, txq, dev->real_num_tx_queues); It is unclear why we need to alert the admin to this situation (repeatedly). Say the incoming nic has 32 queues, and is headed out a bond (with 16). With your patch, we will log 50% of the time, no? What benefit is this log spew? While WARN_ONCE may be a bit extreme due to the backtrace, perhaps we should at least throw a 'static bool warned' variable in there to lessen the nuisance? Phil ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2 net-next][v2] bonding: fix incorrect transmit queue offset 2011-03-02 1:40 ` Phil Oester @ 2011-03-04 17:37 ` Jay Vosburgh 0 siblings, 0 replies; 12+ messages in thread From: Jay Vosburgh @ 2011-03-04 17:37 UTC (permalink / raw) To: Phil Oester; +Cc: Andy Gospodarek, David Miller, bhutchings, netdev Phil Oester <kernel@linuxace.com> wrote: >On Tue, Mar 01, 2011 at 10:31:36AM -0500, Andy Gospodarek wrote: >> > The patch works as expected. Do we have any agreement on a final version? >> > >> >> Thanks for the testing, Phil. >> >> I'm in favor of this patch as it does alert the admin that bonding may >> not have enough default queues, but it is not as verbose (backtrace et >> al) and likely to create bug reports as a message from WARN_ON. >> + if (net_ratelimit()) >> + pr_warning("%s selects invalid tx queue %d. Consider" >> + " setting module option tx_queues > %d.", >> + dev->name, txq, dev->real_num_tx_queues); > >It is unclear why we need to alert the admin to this situation (repeatedly). >Say the incoming nic has 32 queues, and is headed out a bond (with 16). >With your patch, we will log 50% of the time, no? What benefit is this >log spew? > >While WARN_ONCE may be a bit extreme due to the backtrace, perhaps we >should at least throw a 'static bool warned' variable in there to lessen >the nuisance? I'm also concerned that the log messages will be excessive. Should we instead create a bonding driver-private ethtool statistics and count these events that way? -J --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2011-03-04 17:37 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-02-23 19:42 [PATCH 1/2] bonding: fix incorrect transmit queue offset Andy Gospodarek 2011-02-23 22:19 ` Jay Vosburgh 2011-02-23 23:12 ` Andy Gospodarek 2011-02-23 23:08 ` Phil Oester 2011-02-23 23:13 ` David Miller 2011-02-23 23:37 ` Ben Hutchings 2011-02-23 23:43 ` Andy Gospodarek 2011-02-23 23:54 ` David Miller 2011-02-25 22:56 ` Phil Oester 2011-03-01 15:31 ` [PATCH 1/2 net-next][v2] " Andy Gospodarek 2011-03-02 1:40 ` Phil Oester 2011-03-04 17:37 ` Jay Vosburgh
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).