[PATCH] catch up device stats when multicast

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] catch up device stats when multicast > total frames
@ 2008-08-05 11:50 Neil Horman
  2008-08-05 12:15 ` Ben Hutchings
  0 siblings, 1 reply; 8+ messages in thread
From: Neil Horman @ 2008-08-05 11:50 UTC (permalink / raw)
  To: netdev
  Cc: nhorman, e1000-devel, jeffrey.t.kirsher, jesse.brandeburg,
	bruce.w.allan, john.ronciak

Hey-
	REcently observed a problem wherein, if a BMC or other IPMI device is
attached to a NIC, multicast frames can be consumed by the aformentioned device
without ever being seen by the driver.  Since multicast frames are counted in
the hardware and the total frame counter is counted in the driver napi routine,
its possible to have a condition in which the number of received multicast
frames exceeds the total number of frames.  This patch re-aligns those values on
receive, so that the total frame count is always at least as large as the
received frame count.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>


 e1000_main.c |    6 ++++++
 1 file changed, 6 insertions(+)


diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 59579b1..66afee8 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3761,6 +3761,8 @@ e1000_update_stats(struct e1000_adapter *adapter)
 	/* Fill out the OS statistics structure */
 	adapter->net_stats.multicast = adapter->stats.mprc;
 	adapter->net_stats.collisions = adapter->stats.colc;
+	if (adapter->net_stats.rx_packets < adapter->net_stats.multicast)
+		adapter->net_stats.rx_packets = adapter->net_stats.multicast;
 
 	/* Rx Errors */
 
@@ -4347,6 +4349,8 @@ next_desc:
 		adapter->alloc_rx_buf(adapter, rx_ring, cleaned_count);
 
 	adapter->total_rx_packets += total_rx_packets;
+	if (adapter->total_rx_packets < adapter->net_stats.multicast)
+		adapter->total_rx_packets = adapter->net_stats.multicast;
 	adapter->total_rx_bytes += total_rx_bytes;
 	adapter->net_stats.rx_bytes += total_rx_bytes;
 	adapter->net_stats.rx_packets += total_rx_packets;
@@ -4536,6 +4540,8 @@ next_desc:
 		adapter->alloc_rx_buf(adapter, rx_ring, cleaned_count);
 
 	adapter->total_rx_packets += total_rx_packets;
+	if (adapter->total_rx_packets < adapter->net_stats.multicast)
+		adapter->total_rx_packets = adapter->net_stats.multicast;
 	adapter->total_rx_bytes += total_rx_bytes;
 	adapter->net_stats.rx_bytes += total_rx_bytes;
 	adapter->net_stats.rx_packets += total_rx_packets;
-- 
/****************************************************
 * Neil Horman <nhorman@tuxdriver.com>
 * Software Engineer, Red Hat
 ****************************************************/

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] catch up device stats when multicast > total frames
  2008-08-05 11:50 [PATCH] catch up device stats when multicast > total frames Neil Horman
@ 2008-08-05 12:15 ` Ben Hutchings
  2008-08-05 13:16   ` Neil Horman
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Hutchings @ 2008-08-05 12:15 UTC (permalink / raw)
  To: Neil Horman
  Cc: e1000-devel, netdev, bruce.w.allan, jesse.brandeburg,
	john.ronciak, jeffrey.t.kirsher

Neil Horman wrote:
> Hey-
> 	REcently observed a problem wherein, if a BMC or other IPMI device is
> attached to a NIC, multicast frames can be consumed by the aformentioned device
> without ever being seen by the driver.  Since multicast frames are counted in
> the hardware and the total frame counter is counted in the driver napi routine,

I'd be surprised if the hardware does not also maintain a total frame
counter.  If not, you can possibly calculate the total as good + bad
packets, or unicast + multicast + broadcast + bad, or something like that.

[...]
> --- a/drivers/net/e1000/e1000_main.c
> +++ b/drivers/net/e1000/e1000_main.c
> @@ -3761,6 +3761,8 @@ e1000_update_stats(struct e1000_adapter *adapter)
>  	/* Fill out the OS statistics structure */
>  	adapter->net_stats.multicast = adapter->stats.mprc;
>  	adapter->net_stats.collisions = adapter->stats.colc;
> +	if (adapter->net_stats.rx_packets < adapter->net_stats.multicast)
> +		adapter->net_stats.rx_packets = adapter->net_stats.multicast;
>  
>  	/* Rx Errors */
>  
[...]

This is a botch - it means the numbers can't be so obviously wrong, but
doesn't make them correct.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] catch up device stats when multicast > total frames
  2008-08-05 12:15 ` Ben Hutchings
@ 2008-08-05 13:16   ` Neil Horman
  2008-08-05 16:43     ` [E1000-devel] [PATCH] catch up device stats when multicast >total frames Brandeburg, Jesse
  2008-08-05 16:52     ` [PATCH] catch up device stats when multicast > total frames Ronciak, John
  0 siblings, 2 replies; 8+ messages in thread
From: Neil Horman @ 2008-08-05 13:16 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: netdev, e1000-devel, jeffrey.t.kirsher, jesse.brandeburg,
	bruce.w.allan, john.ronciak

On Tue, Aug 05, 2008 at 01:15:29PM +0100, Ben Hutchings wrote:
> Neil Horman wrote:
> > Hey-
> > 	REcently observed a problem wherein, if a BMC or other IPMI device is
> > attached to a NIC, multicast frames can be consumed by the aformentioned device
> > without ever being seen by the driver.  Since multicast frames are counted in
> > the hardware and the total frame counter is counted in the driver napi routine,
> 
> I'd be surprised if the hardware does not also maintain a total frame
> counter.  If not, you can possibly calculate the total as good + bad
> packets, or unicast + multicast + broadcast + bad, or something like that.
> 

It does, however,  for whatever reason the drriver relies on a software
computed total packets counter rather than the hardware based on (possibly for
the same reason, in that it counts frames the driver never actually sees).  I
asked John R. in the origional bug report why e1000 might do this, and while
neither of us were sure, that seemed to be the consensus.

> [...]
> > --- a/drivers/net/e1000/e1000_main.c
> > +++ b/drivers/net/e1000/e1000_main.c
> > @@ -3761,6 +3761,8 @@ e1000_update_stats(struct e1000_adapter *adapter)
> >  	/* Fill out the OS statistics structure */
> >  	adapter->net_stats.multicast = adapter->stats.mprc;
> >  	adapter->net_stats.collisions = adapter->stats.colc;
> > +	if (adapter->net_stats.rx_packets < adapter->net_stats.multicast)
> > +		adapter->net_stats.rx_packets = adapter->net_stats.multicast;
> >  
> >  	/* Rx Errors */
> >  
> [...]
> 
> This is a botch - it means the numbers can't be so obviously wrong, but
> doesn't make them correct.
> 
I agree, but as it stands currently the numbers are obviously wrong, and I don't
see a way to make them undenyably correct.  We could use the hardware counter
instead, but then we would be counting frames that various IPMI endpoints might
consume, so that wouldn't be quite right.  We could fix up the sofware counter
based on the hardware counter, but that doesn't seem right either.  We could
inspect the packet in the driver, and do a software multicast frame counter, but
that would be a big performance hit.  So there is no great fix to this problem.
This at least makes the stats sane, doesn't impact performance, and doesn't
affect stats for which IPMI reception isn't an issue.

John, you stopped commenting in the origional bug report, do you have any
further thoughts on this as a fix to the issue at hand?

Neil


-- 
/****************************************************
 * Neil Horman <nhorman@tuxdriver.com>
 * Software Engineer, Red Hat
 ****************************************************/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [E1000-devel] [PATCH] catch up device stats when multicast >total frames
  2008-08-05 13:16   ` Neil Horman
@ 2008-08-05 16:43     ` Brandeburg, Jesse
  2008-08-05 16:53       ` Neil Horman
  2008-08-05 16:52     ` [PATCH] catch up device stats when multicast > total frames Ronciak, John
  1 sibling, 1 reply; 8+ messages in thread
From: Brandeburg, Jesse @ 2008-08-05 16:43 UTC (permalink / raw)
  To: Neil Horman, Ben Hutchings
  Cc: e1000-devel, netdev, Allan, Bruce W, Ronciak, John,
	Kirsher, Jeffrey T

Neil Horman wrote:
> On Tue, Aug 05, 2008 at 01:15:29PM +0100, Ben Hutchings wrote:
>>> 	REcently observed a problem wherein, if a BMC or other IPMI
device
>>> is attached to a NIC, multicast frames can be consumed by the
>>> aformentioned device without ever being seen by the driver.  Since
>>> multicast frames are counted in the hardware and the total frame
>>> counter is counted in the driver napi routine, 
>> 
>> I'd be surprised if the hardware does not also maintain a total frame
>> counter.  If not, you can possibly calculate the total as good + bad
>> packets, or unicast + multicast + broadcast + bad, or something like
>> that. 
> 
> It does, however,  for whatever reason the drriver relies on a
> software computed total packets counter rather than the hardware
> based on (possibly for the same reason, in that it counts frames the

We changed the driver to do real time updates based on user feedback
that updating the stats every two seconds was not fast enough.  It was
the simplest solution at the time.

>> This is a botch - it means the numbers can't be so obviously wrong,
>> but doesn't make them correct. 
>> 
> I agree, but as it stands currently the numbers are obviously wrong,
> and I don't see a way to make them undenyably correct.  We could use
> the hardware counter instead, but then we would be counting frames
> that various IPMI endpoints might consume, so that wouldn't be quite

There are counters at the end of the ethtool -S output called
   tx_smbus: 0
   rx_smbus: 0
   dropped_smbus: 0

that are indicating when the SMBUS is active. (SMBUS is used to tx / rx
BMC packets)  Those at least tell you when IPMI is active.

> right.  We could fix up the sofware counter based on the hardware
> counter, but that doesn't seem right either.  We could inspect the
> packet in the driver, and do a software multicast frame counter, but
> that would be a big performance hit.  So there is no great fix to
> this problem. This at least makes the stats sane, doesn't impact
> performance, and doesn't affect stats for which IPMI reception isn't
> an issue.  

There doesn't seem to be much of a middle ground, you either count in
real time or you update every two seconds.

> John, you stopped commenting in the origional bug report, do you have
> any further thoughts on this as a fix to the issue at hand?

I guess this comes down to a question of "how big a problem is this" as
we can really do it either way.  And in fact we had recently developed a
patch that allowed ethtool to control the stats update frequency (the
rate the hw counters are read)  I'm not sure, but reverting the change
that removed the hardware byte counters, and adding this "stats
frequency update" would kind of fix both issues.  The only reason I'm
not fond of the real-time updates up until this point is the hot path
cache misses.

Jesse

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH] catch up device stats when multicast > total frames
  2008-08-05 13:16   ` Neil Horman
  2008-08-05 16:43     ` [E1000-devel] [PATCH] catch up device stats when multicast >total frames Brandeburg, Jesse
@ 2008-08-05 16:52     ` Ronciak, John
  2008-08-05 18:03       ` Neil Horman
  1 sibling, 1 reply; 8+ messages in thread
From: Ronciak, John @ 2008-08-05 16:52 UTC (permalink / raw)
  To: Neil Horman, Ben Hutchings
  Cc: netdev@vger.kernel.org, e1000-devel@lists.sourceforge.net,
	Kirsher, Jeffrey T, Brandeburg, Jesse, Allan, Bruce W

The total packet count in the driver was done so that the count is always right even in between stats updates.  Believe it or not there are some people that think this is an important thing for some reason.  We don't understand that but agreed to do this to correct what they were seeing with the stats being updated every 2 seconds.

The problem that seems to be happening here is that the BMC is consuming frames that the driver never sees.  This I think is what is messing up the counts.  The problem we have is that we aren't sure if all the BMC's are doing the same thing.  We can't seem to find that information out.  So we end up with a fix like that that prevents the calculations from going negative.

I'm not sure what the right thing to do is.  I suggested in the past (in the BZ) that the total HW packet count be used for this calculation but Niel didn't like that for some reason, at least I think that's why it wasn't used.  I suggested that the HW counts be used when looking for these MC frame counts and not use the real time counts it should work correctly without regard to which BMC is being used.  It would always calculate correctly.  The calculation could come out of the update stats routine.

BTW, why did this move from the RH BZ to here?

Cheers,
John
-----------------------------------------------------------
167 days until the end of the error!
-----------------------------------------------------------
"Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety.", Benjamin Franklin 1755

-----Original Message-----
From: Neil Horman [mailto:nhorman@tuxdriver.com]
Sent: Tuesday, August 05, 2008 6:16 AM
To: Ben Hutchings
Cc: netdev@vger.kernel.org; e1000-devel@lists.sourceforge.net; Kirsher, Jeffrey T; Brandeburg, Jesse; Allan, Bruce W; Ronciak, John
Subject: Re: [PATCH] catch up device stats when multicast > total frames

On Tue, Aug 05, 2008 at 01:15:29PM +0100, Ben Hutchings wrote:
> Neil Horman wrote:
> > Hey-
> >     REcently observed a problem wherein, if a BMC or other IPMI device is
> > attached to a NIC, multicast frames can be consumed by the aformentioned device
> > without ever being seen by the driver.  Since multicast frames are counted in
> > the hardware and the total frame counter is counted in the driver napi routine,
>
> I'd be surprised if the hardware does not also maintain a total frame
> counter.  If not, you can possibly calculate the total as good + bad
> packets, or unicast + multicast + broadcast + bad, or something like that.
>

It does, however,  for whatever reason the drriver relies on a software
computed total packets counter rather than the hardware based on (possibly for
the same reason, in that it counts frames the driver never actually sees).  I
asked John R. in the origional bug report why e1000 might do this, and while
neither of us were sure, that seemed to be the consensus.

> [...]
> > --- a/drivers/net/e1000/e1000_main.c
> > +++ b/drivers/net/e1000/e1000_main.c
> > @@ -3761,6 +3761,8 @@ e1000_update_stats(struct e1000_adapter *adapter)
> >     /* Fill out the OS statistics structure */
> >     adapter->net_stats.multicast = adapter->stats.mprc;
> >     adapter->net_stats.collisions = adapter->stats.colc;
> > +   if (adapter->net_stats.rx_packets < adapter->net_stats.multicast)
> > +           adapter->net_stats.rx_packets = adapter->net_stats.multicast;
> >
> >     /* Rx Errors */
> >
> [...]
>
> This is a botch - it means the numbers can't be so obviously wrong, but
> doesn't make them correct.
>
I agree, but as it stands currently the numbers are obviously wrong, and I don't
see a way to make them undenyably correct.  We could use the hardware counter
instead, but then we would be counting frames that various IPMI endpoints might
consume, so that wouldn't be quite right.  We could fix up the sofware counter
based on the hardware counter, but that doesn't seem right either.  We could
inspect the packet in the driver, and do a software multicast frame counter, but
that would be a big performance hit.  So there is no great fix to this problem.
This at least makes the stats sane, doesn't impact performance, and doesn't
affect stats for which IPMI reception isn't an issue.

John, you stopped commenting in the origional bug report, do you have any
further thoughts on this as a fix to the issue at hand?

Neil

--
/****************************************************
 * Neil Horman <nhorman@tuxdriver.com>
 * Software Engineer, Red Hat
 ****************************************************/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [E1000-devel] [PATCH] catch up device stats when multicast >total frames
  2008-08-05 16:43     ` [E1000-devel] [PATCH] catch up device stats when multicast >total frames Brandeburg, Jesse
@ 2008-08-05 16:53       ` Neil Horman
  2008-08-05 17:12         ` Ingo Oeser
  0 siblings, 1 reply; 8+ messages in thread
From: Neil Horman @ 2008-08-05 16:53 UTC (permalink / raw)
  To: Brandeburg, Jesse
  Cc: Ben Hutchings, e1000-devel, netdev, Allan, Bruce W, Ronciak, John,
	Kirsher, Jeffrey T

On Tue, Aug 05, 2008 at 09:43:19AM -0700, Brandeburg, Jesse wrote:
> Neil Horman wrote:
> > On Tue, Aug 05, 2008 at 01:15:29PM +0100, Ben Hutchings wrote:
> >>> 	REcently observed a problem wherein, if a BMC or other IPMI
> device
> >>> is attached to a NIC, multicast frames can be consumed by the
> >>> aformentioned device without ever being seen by the driver.  Since
> >>> multicast frames are counted in the hardware and the total frame
> >>> counter is counted in the driver napi routine, 
> >> 
> >> I'd be surprised if the hardware does not also maintain a total frame
> >> counter.  If not, you can possibly calculate the total as good + bad
> >> packets, or unicast + multicast + broadcast + bad, or something like
> >> that. 
> > 
> > It does, however,  for whatever reason the drriver relies on a
> > software computed total packets counter rather than the hardware
> > based on (possibly for the same reason, in that it counts frames the
> 
> We changed the driver to do real time updates based on user feedback
> that updating the stats every two seconds was not fast enough.  It was
> the simplest solution at the time.
> 
Makes sense to me.

> >> This is a botch - it means the numbers can't be so obviously wrong,
> >> but doesn't make them correct. 
> >> 
> > I agree, but as it stands currently the numbers are obviously wrong,
> > and I don't see a way to make them undenyably correct.  We could use
> > the hardware counter instead, but then we would be counting frames
> > that various IPMI endpoints might consume, so that wouldn't be quite
> 
> There are counters at the end of the ethtool -S output called
>    tx_smbus: 0
>    rx_smbus: 0
>    dropped_smbus: 0
> 
> that are indicating when the SMBUS is active. (SMBUS is used to tx / rx
> BMC packets)  Those at least tell you when IPMI is active.
> 
I didn't see those, I'll need to look again. Still though, knowing this still
leaves us with the possibility that multicast packets > total_rx_packets (where
the former is counted in hw, while the latter is counted in sw). 

> > right.  We could fix up the sofware counter based on the hardware
> > counter, but that doesn't seem right either.  We could inspect the
> > packet in the driver, and do a software multicast frame counter, but
> > that would be a big performance hit.  So there is no great fix to
> > this problem. This at least makes the stats sane, doesn't impact
> > performance, and doesn't affect stats for which IPMI reception isn't
> > an issue.  
> 
> There doesn't seem to be much of a middle ground, you either count in
> real time or you update every two seconds.
> 
Agreed, theres no great solution here.


> > John, you stopped commenting in the origional bug report, do you have
> > any further thoughts on this as a fix to the issue at hand?
> 
> I guess this comes down to a question of "how big a problem is this" as
> we can really do it either way.  And in fact we had recently developed a
> patch that allowed ethtool to control the stats update frequency (the
> rate the hw counters are read)  I'm not sure, but reverting the change
> that removed the hardware byte counters, and adding this "stats
> frequency update" would kind of fix both issues.  The only reason I'm
> not fond of the real-time updates up until this point is the hot path
> cache misses.
> 
> Jesse
I can't really answer how big a problem this is.  What I can tell you is that it
was reported to me origionally as a net-snmp issue.  net-snmp was getting some
very odd counter values because it was computing unicast rx frames from a given
interface as unicast = total - multicast, as read from /proc/net/dev.  The
assumption here being that total >= multicast in that file.  Given the e1000
driver as it currently stands, thats not a safe assumption for net-snmp, nor any
other application using /proc/net/dev to make. Clearly an obvious solution here
is for applications to sanity check this input data, but that just begs the
question, what then?  If total < [some reasonable component of total], what do
we do?  I think making sure that we put up sane counters is a resonable
solution, but I'm certainly open to other solutions, or arguments for fixing
this elsewhere.

Regards
Neil


-- 
/****************************************************
 * Neil Horman <nhorman@tuxdriver.com>
 * Software Engineer, Red Hat
 ****************************************************/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] catch up device stats when multicast >total frames
  2008-08-05 16:53       ` Neil Horman
@ 2008-08-05 17:12         ` Ingo Oeser
  0 siblings, 0 replies; 8+ messages in thread
From: Ingo Oeser @ 2008-08-05 17:12 UTC (permalink / raw)
  To: Neil Horman
  Cc: e1000-devel, netdev, Allan,  Bruce W, Brandeburg,  Jesse,
	Ronciak,  John, Kirsher,  Jeffrey T, Ben Hutchings

Hi all,

Neil Horman schrieb:
> Agreed, theres no great solution here.

There maybe:
1. store software counters counted in software in stat_soft_last
2. store software counters counted in hardware in stat_hard_last
3. diff them after getting the fresh counters from hardware and software
4. wait for any "get statistic" request as done with hardware counters today
5. store the combination, taking "unseen packets" into account, 
    as the real netstat
6. goto 1.

What do you think?


Best Regards

Ingo Oeser

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] catch up device stats when multicast > total frames
  2008-08-05 16:52     ` [PATCH] catch up device stats when multicast > total frames Ronciak, John
@ 2008-08-05 18:03       ` Neil Horman
  0 siblings, 0 replies; 8+ messages in thread
From: Neil Horman @ 2008-08-05 18:03 UTC (permalink / raw)
  To: Ronciak, John
  Cc: e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org,
	Allan,  Bruce W, Brandeburg,  Jesse, Kirsher,  Jeffrey T,
	Ben Hutchings

On Tue, Aug 05, 2008 at 09:52:28AM -0700, Ronciak, John wrote:
> The total packet count in the driver was done so that the count is always right even in between stats updates.  Believe it or not there are some people that think this is an important thing for some reason.  We don't understand that but agreed to do this to correct what they were seeing with the stats being updated every 2 seconds.
> 
Yuk.  I can't imagine why anyone would want that, but if thats what were stuck
with...

> The problem that seems to be happening here is that the BMC is consuming frames that the driver never sees.  This I think is what is messing up the counts.  The problem we have is that we aren't sure if all the BMC's are doing the same thing.  We can't seem to find that information out.  So we end up with a fix like that that prevents the calculations from going negative.

Thats the consensus that we came to in the bz, yes.

> 
> I'm not sure what the right thing to do is.  I suggested in the past (in the BZ) that the total HW packet count be used for this calculation but Niel didn't like that for some reason, at least I think that's why it wasn't used.  I suggested that the HW counts be used when looking for these MC frame counts and not use the real time counts it should work correctly without regard to which BMC is being used.  It would always calculate correctly.  The calculation could come out of the update stats routine.
We went back and forth on this point several times in the bugzilla, please re-read
(specifically see comment #93).  I have no problem
with e1000 recording total frame recieved count from hardware directly.  In fact
it seems to make much more sense to me to do just that.  However, e1000 doesn't
do that currently, opting instead to use this software counter.  Why that is I
don't know, but I was hoping that one of you might.  I'm working under the
assumption that some subset of hardware driven by e1000 doesn't have a total
packets counter, or has some other issue which necessitates a software based
total_rx_packets counter.  I don't really know though, and you never were able
to confirm that there was some alternate reason this software counter was
needed.  If the consensus is though, that the software counter can be removed
and the hardware counter read directly, then I'm happy to recind this patch, and
submit one that only uses the hardware counter.
 
 
> BTW, why did this move from the RH BZ to here?
Because we'd had some discussion in the bz about the topic above, but were never
able to conclude if using the hardware counter was a universally safe/reasonable
thing to do.  I'd asked you in comment #111 a month ago if you had any further
comment on the subject, when test results started comming back, indicating this
patch fixed the problem, but I never heard from you.  I wanted to get some more
commentary on this (and make sure it was upstream before it went into RHEL), so
I posted here.

Regards
Neil

> 
> 
> Cheers,
> John
> -----------------------------------------------------------
> 167 days until the end of the error!
> -----------------------------------------------------------
> "Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety.", Benjamin Franklin 1755
> 
> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Tuesday, August 05, 2008 6:16 AM
> To: Ben Hutchings
> Cc: netdev@vger.kernel.org; e1000-devel@lists.sourceforge.net; Kirsher, Jeffrey T; Brandeburg, Jesse; Allan, Bruce W; Ronciak, John
> Subject: Re: [PATCH] catch up device stats when multicast > total frames
> 
> On Tue, Aug 05, 2008 at 01:15:29PM +0100, Ben Hutchings wrote:
> > Neil Horman wrote:
> > > Hey-
> > >     REcently observed a problem wherein, if a BMC or other IPMI device is
> > > attached to a NIC, multicast frames can be consumed by the aformentioned device
> > > without ever being seen by the driver.  Since multicast frames are counted in
> > > the hardware and the total frame counter is counted in the driver napi routine,
> >
> > I'd be surprised if the hardware does not also maintain a total frame
> > counter.  If not, you can possibly calculate the total as good + bad
> > packets, or unicast + multicast + broadcast + bad, or something like that.
> >
> 
> It does, however,  for whatever reason the drriver relies on a software
> computed total packets counter rather than the hardware based on (possibly for
> the same reason, in that it counts frames the driver never actually sees).  I
> asked John R. in the origional bug report why e1000 might do this, and while
> neither of us were sure, that seemed to be the consensus.
> 
> > [...]
> > > --- a/drivers/net/e1000/e1000_main.c
> > > +++ b/drivers/net/e1000/e1000_main.c
> > > @@ -3761,6 +3761,8 @@ e1000_update_stats(struct e1000_adapter *adapter)
> > >     /* Fill out the OS statistics structure */
> > >     adapter->net_stats.multicast = adapter->stats.mprc;
> > >     adapter->net_stats.collisions = adapter->stats.colc;
> > > +   if (adapter->net_stats.rx_packets < adapter->net_stats.multicast)
> > > +           adapter->net_stats.rx_packets = adapter->net_stats.multicast;
> > >
> > >     /* Rx Errors */
> > >
> > [...]
> >
> > This is a botch - it means the numbers can't be so obviously wrong, but
> > doesn't make them correct.
> >
> I agree, but as it stands currently the numbers are obviously wrong, and I don't
> see a way to make them undenyably correct.  We could use the hardware counter
> instead, but then we would be counting frames that various IPMI endpoints might
> consume, so that wouldn't be quite right.  We could fix up the sofware counter
> based on the hardware counter, but that doesn't seem right either.  We could
> inspect the packet in the driver, and do a software multicast frame counter, but
> that would be a big performance hit.  So there is no great fix to this problem.
> This at least makes the stats sane, doesn't impact performance, and doesn't
> affect stats for which IPMI reception isn't an issue.
> 
> John, you stopped commenting in the origional bug report, do you have any
> further thoughts on this as a fix to the issue at hand?
> 
> Neil
> 
> 
> --
> /****************************************************
>  * Neil Horman <nhorman@tuxdriver.com>
>  * Software Engineer, Red Hat
>  ****************************************************/

-- 
/****************************************************
 * Neil Horman <nhorman@tuxdriver.com>
 * Software Engineer, Red Hat
 ****************************************************/

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2008-08-05 18:03 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-05 11:50 [PATCH] catch up device stats when multicast > total frames Neil Horman
2008-08-05 12:15 ` Ben Hutchings
2008-08-05 13:16   ` Neil Horman
2008-08-05 16:43     ` [E1000-devel] [PATCH] catch up device stats when multicast >total frames Brandeburg, Jesse
2008-08-05 16:53       ` Neil Horman
2008-08-05 17:12         ` Ingo Oeser
2008-08-05 16:52     ` [PATCH] catch up device stats when multicast > total frames Ronciak, John
2008-08-05 18:03       ` Neil Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).