* [PATCH] igb: Record hardware RX overruns in net_stats @ 2009-05-04 11:06 Jesper Dangaard Brouer 2009-05-05 18:47 ` Jeff Kirsher 0 siblings, 1 reply; 14+ messages in thread From: Jesper Dangaard Brouer @ 2009-05-04 11:06 UTC (permalink / raw) To: David S. Miller, jeffrey.t.kirsher Cc: netdev@vger.kernel.org, e1000-devel, jesse.brandeburg, bruce.w.allan, peter.p.waskiewicz.jr, john.ronciak Hardware RX fifo overruns for the 82576 is stored in the RNBC (Receive No Buffers Count) register. I choose the store the RNBC value in net_stats.rx_fifo_errors. Saving the stats in dev->net_stats makes it visible via /proc/net/dev as "fifo", and thus viewable to ifconfig as "overruns" and 'netstat -i' as "RX-OVR". The Receive No Buffers Count (RNBC) can already be queried by ethtool -S as "rx_no_buffer_count". Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> --- drivers/net/igb/igb_main.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c index 183235d..ef26e6a 100644 --- a/drivers/net/igb/igb_main.c +++ b/drivers/net/igb/igb_main.c @@ -3596,6 +3596,7 @@ void igb_update_stats(struct igb_adapter *adapter) adapter->net_stats.rx_crc_errors = adapter->stats.crcerrs; adapter->net_stats.rx_frame_errors = adapter->stats.algnerrc; adapter->net_stats.rx_missed_errors = adapter->stats.mpc; + adapter->net_stats.rx_fifo_errors = adapter->stats.rnbc; /* Tx Errors */ adapter->net_stats.tx_errors = adapter->stats.ecol + ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-04 11:06 [PATCH] igb: Record hardware RX overruns in net_stats Jesper Dangaard Brouer @ 2009-05-05 18:47 ` Jeff Kirsher 2009-05-05 18:58 ` David Miller 0 siblings, 1 reply; 14+ messages in thread From: Jeff Kirsher @ 2009-05-05 18:47 UTC (permalink / raw) To: Jesper Dangaard Brouer Cc: David S. Miller, netdev@vger.kernel.org, e1000-devel, jesse.brandeburg, bruce.w.allan, peter.p.waskiewicz.jr, john.ronciak On Mon, May 4, 2009 at 4:06 AM, Jesper Dangaard Brouer <hawk@comx.dk> wrote: > > Hardware RX fifo overruns for the 82576 is stored in the > RNBC (Receive No Buffers Count) register. > > I choose the store the RNBC value in net_stats.rx_fifo_errors. > > Saving the stats in dev->net_stats makes it visible via > /proc/net/dev as "fifo", and thus viewable to ifconfig > as "overruns" and 'netstat -i' as "RX-OVR". > > The Receive No Buffers Count (RNBC) can already be queried by > ethtool -S as "rx_no_buffer_count". > > Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> > --- > NAK. RNBC is not a counter for buffer overruns, and so should not be counted as such. -- Cheers, Jeff ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-05 18:47 ` Jeff Kirsher @ 2009-05-05 18:58 ` David Miller 2009-05-05 21:24 ` Jesper Dangaard Brouer 0 siblings, 1 reply; 14+ messages in thread From: David Miller @ 2009-05-05 18:58 UTC (permalink / raw) To: jeffrey.t.kirsher Cc: hawk, netdev, e1000-devel, jesse.brandeburg, bruce.w.allan, peter.p.waskiewicz.jr, john.ronciak From: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Date: Tue, 5 May 2009 11:47:09 -0700 > NAK. RNBC is not a counter for buffer overruns, and so should not be > counted as such. I'd say technically it is, it indicates that more packets arrived than the available receive buffers could handle. If anything, this is the closest this device has for this kind of situation, and it's useful for diagnosing problems. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-05 18:58 ` David Miller @ 2009-05-05 21:24 ` Jesper Dangaard Brouer 2009-05-05 21:32 ` Jeff Kirsher 2009-05-05 22:38 ` Ronciak, John 0 siblings, 2 replies; 14+ messages in thread From: Jesper Dangaard Brouer @ 2009-05-05 21:24 UTC (permalink / raw) To: David Miller Cc: jeffrey.t.kirsher, hawk, netdev, e1000-devel, jesse.brandeburg, bruce.w.allan, peter.p.waskiewicz.jr, john.ronciak On Tue, 5 May 2009, David Miller wrote: > From: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > Date: Tue, 5 May 2009 11:47:09 -0700 > >> NAK. RNBC is not a counter for buffer overruns, and so should not be >> counted as such. > > I'd say technically it is, it indicates that more packets arrived than > the available receive buffers could handle. I agree with DaveM. Technically it _is_ a buffer overflow, but in the host memory not the NIC. I'm sort of pushing the system into a situation where it cannot empty the receive buffers fast enough. I can fairly easily provoke this situation by adding too many iptables rules, which (intentionally) cause high CPU load and causes ksoftirqd to run (I'm Oprofiling netfilter modules). > If anything, this is the closest this device has for this kind of > situation, and it's useful for diagnosing problems. Its really useful for diagnosing problems, and I'm betting that this is a real-life situation which people is going to experience. We might as well help our self to more easily identify this issue when people report drop problems. Notice that I'm seeing: rx_no_buffer_count: 136955 rx_missed_errors: 0 Thus, the rx_missed_errors is zero, which according to the datasheet is the "real" fifo drop (the MPC register, Missed Packets Count) and PCI bandwidth problem indications. If we really should nitpick, then: adapter->net_stats.rx_missed_errors = adapter->stats.mpc Should then have been stored in the rx_fifo_errors. Notice that rx_missed_errors is presented to userspace as drops (see net/core/dev.c:2624). I think that both MPC and RNBC should be stored in rx_fifo_errors (and of cause still keeping them seperate to ethtool -S). I'll post two patches with these changes tomorrow, for you evaluation. Please reconsider you NAK. Greetings, Jesper Brouer -- ------------------------------------------------------------------- MSc. Master of Computer Science Dept. of Computer Science, University of Copenhagen Author of http://www.adsl-optimizer.dk ------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-05 21:24 ` Jesper Dangaard Brouer @ 2009-05-05 21:32 ` Jeff Kirsher 2009-05-05 21:35 ` David Miller 2009-05-05 22:38 ` Ronciak, John 1 sibling, 1 reply; 14+ messages in thread From: Jeff Kirsher @ 2009-05-05 21:32 UTC (permalink / raw) To: Jesper Dangaard Brouer Cc: hawk, e1000-devel, netdev, peter.p.waskiewicz.jr, bruce.w.allan, jesse.brandeburg, john.ronciak, David Miller On Tue, May 5, 2009 at 2:24 PM, Jesper Dangaard Brouer <hawk@diku.dk> wrote: > On Tue, 5 May 2009, David Miller wrote: > >> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com> >> Date: Tue, 5 May 2009 11:47:09 -0700 >> >>> NAK. RNBC is not a counter for buffer overruns, and so should not be >>> counted as such. >> >> I'd say technically it is, it indicates that more packets arrived than >> the available receive buffers could handle. > > I agree with DaveM. Technically it _is_ a buffer overflow, but in the host > memory not the NIC. I'm sort of pushing the system into a situation where it > cannot empty the receive buffers fast enough. > > I can fairly easily provoke this situation by adding too many iptables > rules, which (intentionally) cause high CPU load and causes ksoftirqd to run > (I'm Oprofiling netfilter modules). > > >> If anything, this is the closest this device has for this kind of >> situation, and it's useful for diagnosing problems. > > Its really useful for diagnosing problems, and I'm betting that this is a > real-life situation which people is going to experience. We might as well > help our self to more easily identify this issue when people report drop > problems. > > Notice that I'm seeing: > > rx_no_buffer_count: 136955 > rx_missed_errors: 0 > > Thus, the rx_missed_errors is zero, which according to the datasheet is the > "real" fifo drop (the MPC register, Missed Packets Count) and PCI bandwidth > problem indications. > > If we really should nitpick, then: > > adapter->net_stats.rx_missed_errors = adapter->stats.mpc > > Should then have been stored in the rx_fifo_errors. Notice that > rx_missed_errors is presented to userspace as drops (see > net/core/dev.c:2624). > > I think that both MPC and RNBC should be stored in rx_fifo_errors (and of > cause still keeping them seperate to ethtool -S). > > I'll post two patches with these changes tomorrow, for you evaluation. > > Please reconsider you NAK. > > Greetings, > Jesper Brouer > > -- the manual[1] for the hardware says: RNBC: This register counts the number of times that frames were received when there were no available buffers in host memory to store those frames (receive descriptor head and tail pointers were equal). The packet is still received if there is space in the FIFO. This register only increments if receives are enabled. This register does not increment when flow control packets are received. The critical bit "The packet is still received if there is space in the FIFO" (AND a host memory buffer becomes available) So the reason we don't want to put it in the net_stats stats for drops is that the packet *wasn't* necessarily dropped. The rx_missed errors is for packets that were definitely dropped, and is already stored in the net_stats structure. -- Cheers, Jeff ------------------------------------------------------------------------------ The NEW KODAK i700 Series Scanners deliver under ANY circumstances! Your production scanning environment may not be a perfect world - but thanks to Kodak, there's a perfect scanner to get the job done! With the NEW KODAK i700 Series Scanner you'll get full speed at 300 dpi even with all image processing features enabled. http://p.sf.net/sfu/kodak-com _______________________________________________ E1000-devel mailing list E1000-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/e1000-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-05 21:32 ` Jeff Kirsher @ 2009-05-05 21:35 ` David Miller 2009-05-06 7:46 ` Jesper Dangaard Brouer 0 siblings, 1 reply; 14+ messages in thread From: David Miller @ 2009-05-05 21:35 UTC (permalink / raw) To: jeffrey.t.kirsher Cc: hawk, hawk, netdev, e1000-devel, jesse.brandeburg, bruce.w.allan, peter.p.waskiewicz.jr, john.ronciak From: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Date: Tue, 5 May 2009 14:32:04 -0700 > the manual[1] for the hardware says: > RNBC: > This register counts the number of times that frames were received > when there were no available buffers in host memory to store those > frames (receive descriptor head and tail pointers were equal). The > packet is still received if there is space in the FIFO. This register > only increments if receives are enabled. This register does not > increment when flow control packets are received. > > The critical bit "The packet is still received if there is space in > the FIFO" (AND a host memory buffer becomes available) So the reason > we don't want to put it in the net_stats stats for drops is that the > packet > *wasn't* necessarily dropped. > > The rx_missed errors is for packets that were definitely dropped, and > is already stored in the net_stats structure. While not an "rx_missed" because we do eventually take the packet, conceptually it is a "fifo overflow" in the sense that we exceeded available receive resources at the time that the packet arrived. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-05 21:35 ` David Miller @ 2009-05-06 7:46 ` Jesper Dangaard Brouer 2009-05-06 8:11 ` Waskiewicz Jr, Peter P 0 siblings, 1 reply; 14+ messages in thread From: Jesper Dangaard Brouer @ 2009-05-06 7:46 UTC (permalink / raw) To: David Miller Cc: jeffrey.t.kirsher, hawk, netdev, e1000-devel, jesse.brandeburg, bruce.w.allan, peter.p.waskiewicz.jr, john.ronciak On Tue, 2009-05-05 at 14:35 -0700, David Miller wrote: > From: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > Date: Tue, 5 May 2009 14:32:04 -0700 > > > the manual[1] for the hardware says: > > RNBC: > > This register counts the number of times that frames were received > > when there were no available buffers in host memory to store those > > frames (receive descriptor head and tail pointers were equal). The > > packet is still received if there is space in the FIFO. This register > > only increments if receives are enabled. This register does not > > increment when flow control packets are received. > > > > The critical bit "The packet is still received if there is space in > > the FIFO" (AND a host memory buffer becomes available) So the reason > > we don't want to put it in the net_stats stats for drops is that the > > packet > > *wasn't* necessarily dropped. > > > > The rx_missed errors is for packets that were definitely dropped, and > > is already stored in the net_stats structure. > > While not an "rx_missed" because we do eventually take the > packet, conceptually it is a "fifo overflow" in the sense > that we exceeded available receive resources at the time that > the packet arrived. Yes, with this argumentation, the MPC should then be kept as "rx_missed" packets. And the RNBC stored as "rx_fifo_errors" as its an overflow indication, not a number of packets dropped. -- Med venlig hilsen / Best regards Jesper Brouer ComX Networks A/S Linux Network developer Cand. Scient Datalog / MSc. Author of http://adsl-optimizer.dk LinkedIn: http://www.linkedin.com/in/brouer ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-06 7:46 ` Jesper Dangaard Brouer @ 2009-05-06 8:11 ` Waskiewicz Jr, Peter P 2009-05-06 13:09 ` Jesper Dangaard Brouer 0 siblings, 1 reply; 14+ messages in thread From: Waskiewicz Jr, Peter P @ 2009-05-06 8:11 UTC (permalink / raw) To: Jesper Dangaard Brouer Cc: hawk@diku.dk, e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org, Waskiewicz Jr, Peter P, Allan, Bruce W, Brandeburg, Jesse, Ronciak, John, Kirsher, Jeffrey T, David Miller On Wed, 6 May 2009, Jesper Dangaard Brouer wrote: > On Tue, 2009-05-05 at 14:35 -0700, David Miller wrote: > > From: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > > Date: Tue, 5 May 2009 14:32:04 -0700 > > > > > the manual[1] for the hardware says: > > > RNBC: > > > This register counts the number of times that frames were received > > > when there were no available buffers in host memory to store those > > > frames (receive descriptor head and tail pointers were equal). The > > > packet is still received if there is space in the FIFO. This register > > > only increments if receives are enabled. This register does not > > > increment when flow control packets are received. > > > > > > The critical bit "The packet is still received if there is space in > > > the FIFO" (AND a host memory buffer becomes available) So the reason > > > we don't want to put it in the net_stats stats for drops is that the > > > packet > > > *wasn't* necessarily dropped. > > > > > > The rx_missed errors is for packets that were definitely dropped, and > > > is already stored in the net_stats structure. > > > > While not an "rx_missed" because we do eventually take the > > packet, conceptually it is a "fifo overflow" in the sense > > that we exceeded available receive resources at the time that > > the packet arrived. > > Yes, with this argumentation, the MPC should then be kept as "rx_missed" > packets. And the RNBC stored as "rx_fifo_errors" as its an overflow > indication, not a number of packets dropped. The way RNBC works depends on how the queues themselves are configured. Specifically, if you have packet drop enabled per queue or not will affect RNBC. In the SRRCTL registers, there is a DROP_EN bit, bit 31. If this bit is set to 1b for the queue in question, then the packet will be dropped when there are no buffers in the packet buffer. This does not mean the FIFO is full or has been overrun, it just means there's no more descriptors available in the Rx ring for that queue. In this case, RNBC is incremented, MPC is not. If bit 31 in SRRCTL is 0b, then if there's no room in the packet buffer (no more descriptors available), the device tries to store the packet in the FIFO. RNBC will *not* be incremented in this case. If there's no space in the FIFO, then the packet is dropped. RNBC still is not incremented in this case, rather MPC will be incremented, since the packet was dropped due to the FIFO being full. In 82576, according to the manual, SRRCTL bit 31 is 0b for queue 0 by default, and is 1b for all other queues by default. I hope this helps explain what the hardware is doing, and how these two counters get used in overrun cases. Cheers, -PJ Waskiewicz ------------------------------------------------------------------------------ The NEW KODAK i700 Series Scanners deliver under ANY circumstances! Your production scanning environment may not be a perfect world - but thanks to Kodak, there's a perfect scanner to get the job done! With the NEW KODAK i700 Series Scanner you'll get full speed at 300 dpi even with all image processing features enabled. http://p.sf.net/sfu/kodak-com ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-06 8:11 ` Waskiewicz Jr, Peter P @ 2009-05-06 13:09 ` Jesper Dangaard Brouer 2009-05-06 20:59 ` Jesper Dangaard Brouer 0 siblings, 1 reply; 14+ messages in thread From: Jesper Dangaard Brouer @ 2009-05-06 13:09 UTC (permalink / raw) To: Waskiewicz Jr, Peter P Cc: hawk@diku.dk, e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org, Allan, Bruce W, Brandeburg, Jesse, Ronciak, John, Kirsher, Jeffrey T, David Miller On Wed, 2009-05-06 at 01:11 -0700, Waskiewicz Jr, Peter P wrote: > On Wed, 6 May 2009, Jesper Dangaard Brouer wrote: > > > On Tue, 2009-05-05 at 14:35 -0700, David Miller wrote: > > > From: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > > > Date: Tue, 5 May 2009 14:32:04 -0700 > > > > > > > the manual[1] for the hardware says: > > > > RNBC: > > > > This register counts the number of times that frames were received > > > > when there were no available buffers in host memory to store those > > > > frames (receive descriptor head and tail pointers were equal). The > > > > packet is still received if there is space in the FIFO. This register > > > > only increments if receives are enabled. This register does not > > > > increment when flow control packets are received. > > > > > > > > The critical bit "The packet is still received if there is space in > > > > the FIFO" (AND a host memory buffer becomes available) So the reason > > > > we don't want to put it in the net_stats stats for drops is that the > > > > packet > > > > *wasn't* necessarily dropped. > > > > > > > > The rx_missed errors is for packets that were definitely dropped, and > > > > is already stored in the net_stats structure. > > > > > > While not an "rx_missed" because we do eventually take the > > > packet, conceptually it is a "fifo overflow" in the sense > > > that we exceeded available receive resources at the time that > > > the packet arrived. > > > > Yes, with this argumentation, the MPC should then be kept as "rx_missed" > > packets. And the RNBC stored as "rx_fifo_errors" as its an overflow > > indication, not a number of packets dropped. > > The way RNBC works depends on how the queues themselves are configured. > Specifically, if you have packet drop enabled per queue or not will affect > RNBC. Very good description, thank you Peter. But I could not resist to actually verify/test it, and my observations differ some! ;-) (patch in bottom indicate where I set it in the code) > In the SRRCTL registers, there is a DROP_EN bit, bit 31. If this > bit is set to 1b for the queue in question, then the packet will be > dropped when there are no buffers in the packet buffer. This does not > mean the FIFO is full or has been overrun, it just means there's no more > descriptors available in the Rx ring for that queue. In this case, RNBC > is incremented, MPC is not. My experience is that if DROP_EN bit is *set*. Then I cannot find the drop count anywhere... not in RNBC and not in MPC ... and I can still see the drops with my netfilter module mp2t. ethtool -S eth21 | egrep 'rx_no_buffer_count|rx_miss' rx_no_buffer_count: 0 rx_missed_errors: 0 I'm guessing that the drop counters are now in the per queue RQDPC register (Receive Queue Drop Packet Count), but reading that is not implemented in the driver. (kernel: [438792.665028] Hawk hack -- Register: srrctl:[0x82000002]) > If bit 31 in SRRCTL is 0b, then if there's no room in the packet buffer > (no more descriptors available), the device tries to store the packet in > the FIFO. RNBC will *not* be incremented in this case. If there's no space > in the FIFO, then the packet is dropped. RNBC still is not incremented in this > case, rather MPC will be incremented, since the packet was dropped due to the FIFO > being full. My experience is that if DROP_EN bit is *NOT* set. Then the RNBC *is* incremented... ethtool -S eth21 | egrep 'rx_no_buffer_count|rx_miss' rx_no_buffer_count: 26436 rx_missed_errors: 0 (kernel: [439261.463628] Hawk hack -- Register: srrctl:[0x2000002]) > In 82576, according to the manual, SRRCTL bit 31 is 0b for queue 0 by > default, and is 1b for all other queues by default. Funny default... > I hope this helps explain what the hardware is doing, and how these two > counters get used in overrun cases. -- Med venlig hilsen / Best regards Jesper Brouer ComX Networks A/S Linux Network developer Cand. Scient Datalog / MSc. Author of http://adsl-optimizer.dk LinkedIn: http://www.linkedin.com/in/brouer Testing the SRRCTL_DROP_EN bit behavior. From: Jesper Dangaard Brouer <hawk@comx.dk> --- drivers/net/igb/igb_main.c | 7 ++++++- 1 files changed, 6 insertions(+), 1 deletions(-) diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c index 3ee00a5..20117ce 100644 --- a/drivers/net/igb/igb_main.c +++ b/drivers/net/igb/igb_main.c @@ -49,7 +49,7 @@ #endif #include "igb.h" -#define DRV_VERSION "1.3.16-k2" +#define DRV_VERSION "1.3.16-k2-test-drop-bit" char igb_driver_name[] = "igb"; char igb_driver_version[] = DRV_VERSION; static const char igb_driver_string[] = @@ -2091,6 +2091,11 @@ static void igb_setup_rctl(struct igb_adapter *adapter) wr32(E1000_VMOLR(j), vmolr); } + /* Hawk: Hack to test the SRRCTL_DROP_EN bit behavior */ + srrctl &= ~E1000_SRRCTL_DROP_EN; /* Unset bit */ + //srrctl |= E1000_SRRCTL_DROP_EN; /* Set bit */ + printk(KERN_INFO "Hawk hack -- Register: srrctl:[0x%X]\n", srrctl); + for (i = 0; i < adapter->num_rx_queues; i++) { j = adapter->rx_ring[i].reg_idx; wr32(E1000_SRRCTL(j), srrctl); ------------------------------------------------------------------------------ The NEW KODAK i700 Series Scanners deliver under ANY circumstances! Your production scanning environment may not be a perfect world - but thanks to Kodak, there's a perfect scanner to get the job done! With the NEW KODAK i700 Series Scanner you'll get full speed at 300 dpi even with all image processing features enabled. http://p.sf.net/sfu/kodak-com ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-06 13:09 ` Jesper Dangaard Brouer @ 2009-05-06 20:59 ` Jesper Dangaard Brouer 2009-05-06 21:24 ` Waskiewicz Jr, Peter P 0 siblings, 1 reply; 14+ messages in thread From: Jesper Dangaard Brouer @ 2009-05-06 20:59 UTC (permalink / raw) To: Waskiewicz Jr, Peter P Cc: David Miller, Kirsher, Jeffrey T, hawk@diku.dk, netdev@vger.kernel.org, e1000-devel@lists.sourceforge.net, Brandeburg, Jesse, Allan, Bruce W, Ronciak, John On Wed, 2009-05-06 at 15:09 +0200, Jesper Dangaard Brouer wrote: > > In the SRRCTL registers, there is a DROP_EN bit, bit 31. If this > > bit is set to 1b for the queue in question, then the packet will be > > dropped when there are no buffers in the packet buffer. This does > not > > mean the FIFO is full or has been overrun, it just means there's no > more > > descriptors available in the Rx ring for that queue. In this case, > RNBC > > is incremented, MPC is not. > > My experience is that if DROP_EN bit is *set*. Then I cannot find the > drop count anywhere... not in RNBC and not in MPC ... and I can still > see the drops with my netfilter module mp2t. > > ethtool -S eth21 | egrep 'rx_no_buffer_count|rx_miss' > rx_no_buffer_count: 0 > rx_missed_errors: 0 > > I'm guessing that the drop counters are now in the per queue RQDPC > register (Receive Queue Drop Packet Count), but reading that is not > implemented in the driver. Just did a quick implemenation in the driver, see draft patch below (not ready for inclusion yet!). An my guess was correct, the drops count is stored per queue in RQDPC. The only really anoying thing is that this counter is only 12 bit, thus overruns can occur quickly. (This reminds me of my drop count implemenation for Sun niu driver. This counter has less bits that the Sun counter, but at least it automatically clears on reads) I'll post some more clean patches tomorrow... -- Med venlig hilsen / Best regards Jesper Brouer ComX Networks A/S Linux Network developer Cand. Scient Datalog / MSc. Author of http://adsl-optimizer.dk LinkedIn: http://www.linkedin.com/in/brouer Quick implemenation of register: Receive Queue Drop Packet Count - RQDPC (0xC030 + 0x40*n [n=0...15]; RC). --- drivers/net/igb/e1000_regs.h | 2 ++ drivers/net/igb/igb.h | 1 + drivers/net/igb/igb_ethtool.c | 4 ++++ drivers/net/igb/igb_main.c | 6 ++++++ 4 files changed, 13 insertions(+), 0 deletions(-) diff --git a/drivers/net/igb/e1000_regs.h b/drivers/net/igb/e1000_regs.h index 0bd7728..85683e2 100644 --- a/drivers/net/igb/e1000_regs.h +++ b/drivers/net/igb/e1000_regs.h @@ -165,6 +165,8 @@ enum { : (0x0C018 + ((_n) * 0x40))) #define E1000_RXDCTL(_n) ((_n) < 4 ? (0x02828 + ((_n) * 0x100)) \ : (0x0C028 + ((_n) * 0x40))) +#define E1000_RQDPC(_n) ((_n) < 4 ? (0x02830 + ((_n) * 0x100)) \ + : (0x0C030 + ((_n) * 0x40))) #define E1000_TDBAL(_n) ((_n) < 4 ? (0x03800 + ((_n) * 0x100)) \ : (0x0E000 + ((_n) * 0x40))) #define E1000_TDBAH(_n) ((_n) < 4 ? (0x03804 + ((_n) * 0x100)) \ diff --git a/drivers/net/igb/igb.h b/drivers/net/igb/igb.h index 4e8464b..4a16ac0 100644 --- a/drivers/net/igb/igb.h +++ b/drivers/net/igb/igb.h @@ -140,6 +140,7 @@ struct igb_buffer { struct igb_queue_stats { u64 packets; u64 bytes; + u64 drops; }; struct igb_ring { diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c index 27eae49..b0d3100 100644 --- a/drivers/net/igb/igb_ethtool.c +++ b/drivers/net/igb/igb_ethtool.c @@ -1998,12 +1998,16 @@ static void igb_get_strings(struct net_device *netdev, u32 stringset, u8 *data) p += ETH_GSTRING_LEN; sprintf(p, "tx_queue_%u_bytes", i); p += ETH_GSTRING_LEN; + sprintf(p, "tx_queue_%u_drops", i); + p += ETH_GSTRING_LEN; } for (i = 0; i < adapter->num_rx_queues; i++) { sprintf(p, "rx_queue_%u_packets", i); p += ETH_GSTRING_LEN; sprintf(p, "rx_queue_%u_bytes", i); p += ETH_GSTRING_LEN; + sprintf(p, "rx_queue_%u_drops", i); + p += ETH_GSTRING_LEN; } /* BUG_ON(p - data != IGB_STATS_LEN * ETH_GSTRING_LEN); */ break; diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c index 183235d..20190bb 100644 --- a/drivers/net/igb/igb_main.c +++ b/drivers/net/igb/igb_main.c @@ -3494,6 +3494,7 @@ void igb_update_stats(struct igb_adapter *adapter) struct e1000_hw *hw = &adapter->hw; struct pci_dev *pdev = adapter->pdev; u16 phy_tmp; + int i; #define PHY_IDLE_ERROR_COUNT_MASK 0x00FF @@ -3583,6 +3584,11 @@ void igb_update_stats(struct igb_adapter *adapter) adapter->net_stats.multicast = adapter->stats.mprc; adapter->net_stats.collisions = adapter->stats.colc; + /* Read out drops stats per RX queue */ + for (i = 0; i < adapter->num_rx_queues; i++) { + adapter->rx_ring[i].rx_stats.drops += rd32(E1000_RQDPC(i)); + } + /* Rx Errors */ /* RLEC on some newer hardware can be incorrect so build ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-06 20:59 ` Jesper Dangaard Brouer @ 2009-05-06 21:24 ` Waskiewicz Jr, Peter P 0 siblings, 0 replies; 14+ messages in thread From: Waskiewicz Jr, Peter P @ 2009-05-06 21:24 UTC (permalink / raw) To: Jesper Dangaard Brouer Cc: hawk@diku.dk, e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org, Waskiewicz Jr, Peter P, Allan, Bruce W, Brandeburg, Jesse, Ronciak, John, Kirsher, Jeffrey T, David Miller On Wed, 6 May 2009, Jesper Dangaard Brouer wrote: > On Wed, 2009-05-06 at 15:09 +0200, Jesper Dangaard Brouer wrote: > > > In the SRRCTL registers, there is a DROP_EN bit, bit 31. If this > > > bit is set to 1b for the queue in question, then the packet will be > > > dropped when there are no buffers in the packet buffer. This does > > not > > > mean the FIFO is full or has been overrun, it just means there's no > > more > > > descriptors available in the Rx ring for that queue. In this case, > > RNBC > > > is incremented, MPC is not. > > > > My experience is that if DROP_EN bit is *set*. Then I cannot find the > > drop count anywhere... not in RNBC and not in MPC ... and I can still > > see the drops with my netfilter module mp2t. > > > > ethtool -S eth21 | egrep 'rx_no_buffer_count|rx_miss' > > rx_no_buffer_count: 0 > > rx_missed_errors: 0 > > > > I'm guessing that the drop counters are now in the per queue RQDPC > > register (Receive Queue Drop Packet Count), but reading that is not > > implemented in the driver. > > Just did a quick implemenation in the driver, see draft patch below (not > ready for inclusion yet!). > > An my guess was correct, the drops count is stored per queue in RQDPC. > The only really anoying thing is that this counter is only 12 bit, thus > overruns can occur quickly. (This reminds me of my drop count > implemenation for Sun niu driver. This counter has less bits that the > Sun counter, but at least it automatically clears on reads) > > I'll post some more clean patches tomorrow... Ok, so 82576 is much more like 82599 with the queue counters, but different enough to be annoying. Yes, I'm seeing this in the data manual for 82576. Honestly, we have too many damn counters for packet accounting. I'm very happy you're getting what you wanted now. Your patch looks reasonable to me; please let us know if you have any other questions about the part that we might be able to help with. Cheers, -PJ Waskiewicz ------------------------------------------------------------------------------ The NEW KODAK i700 Series Scanners deliver under ANY circumstances! Your production scanning environment may not be a perfect world - but thanks to Kodak, there's a perfect scanner to get the job done! With the NEW KODAK i700 Series Scanner you'll get full speed at 300 dpi even with all image processing features enabled. http://p.sf.net/sfu/kodak-com ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-05 21:24 ` Jesper Dangaard Brouer 2009-05-05 21:32 ` Jeff Kirsher @ 2009-05-05 22:38 ` Ronciak, John 2009-05-06 8:12 ` Jesper Dangaard Brouer 1 sibling, 1 reply; 14+ messages in thread From: Ronciak, John @ 2009-05-05 22:38 UTC (permalink / raw) To: Jesper Dangaard Brouer, David Miller Cc: Kirsher, Jeffrey T, hawk@comx.dk, netdev, e1000-devel@lists.sourceforge.net, Brandeburg, Jesse, Allan, Bruce W, Waskiewicz Jr, Peter P >Its really useful for diagnosing problems, and I'm betting >that this is a >real-life situation which people is going to experience. We >might as well >help our self to more easily identify this issue when people >report drop problems. The problem is that the RNBC aren't dropped packets as the numbers you have show. While we can agree that the MPC are the actual dropped packets and could eaily be be used in the fifo overflow count since the packets were really dropped. >I think that both MPC and RNBC should be stored in >rx_fifo_errors (and of >cause still keeping them seperate to ethtool -S). This would count RNBC packets as packets that the stack did not process, which it did. The MPC packets were not processed by the stack and should be counted as dropped. As you point out, both counts are available via ethtool -S. >I'll post two patches with these changes tomorrow, for you evaluation. Thanks, we look forward to see them. Cheers, John ----------------------------------------------------------- "...that your people will judge you on what you can build, not what you destroy.", B. Obama, 2009 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] igb: Record hardware RX overruns in net_stats 2009-05-05 22:38 ` Ronciak, John @ 2009-05-06 8:12 ` Jesper Dangaard Brouer 2009-05-06 8:56 ` [PATCH v2] igb: Record host memory receive overflow " Jesper Dangaard Brouer 0 siblings, 1 reply; 14+ messages in thread From: Jesper Dangaard Brouer @ 2009-05-06 8:12 UTC (permalink / raw) To: Ronciak, John Cc: Jesper Dangaard Brouer, e1000-devel@lists.sourceforge.net, netdev, Waskiewicz Jr, Peter P, Allan, Bruce W, Brandeburg, Jesse, Kirsher, Jeffrey T, David Miller On Tue, 2009-05-05 at 15:38 -0700, Ronciak, John wrote: > >Its really useful for diagnosing problems, and I'm betting > >that this is a real-life situation which people is going to experience. We > >might as well help our self to more easily identify this issue when people > >report drop problems. > > The problem is that the RNBC aren't dropped packets as the numbers you > have show. Just to make it clear, I am experiencing dropped packets. The reason I positivly know this, is that I'm writing a MPEG2-TS drop detection netfilter module. Which were the only reason I discovered the packet drops and the "hidden" RNBC counter via ethtool. I have read the datasheeet, and with Jeffrey's detailed explaination, I do know that this number might be higher than the actually drops I'm experiencing. > While we can agree that the MPC are the actual dropped packets and > could eaily be be used in the fifo overflow count since the packets > were really dropped. Well, then I think we should keep MPC as drops, and use the fifo_errors as an fifo overflow indication containing RNBC count. > > I think that both MPC and RNBC should be stored in rx_fifo_errors > > (and of cause still keeping them seperate to ethtool -S). > > This would count RNBC packets as packets that the stack did not > process, which it did. The MPC packets were not processed by the > stack and should be counted as dropped. As you point out, both counts > are available via ethtool -S. > > >I'll post two patches with these changes tomorrow, for you evaluation. > > Thanks, we look forward to see them. I'll keep it to one patch (with an extra comment reflecting this disscussion), as you have convinced me that the MPC should stay as "rx_missed", as this is presented to userspace as a positive drop. -- Med venlig hilsen / Best regards Jesper Brouer ComX Networks A/S Linux Network developer Cand. Scient Datalog / MSc. Author of http://adsl-optimizer.dk LinkedIn: http://www.linkedin.com/in/brouer ------------------------------------------------------------------------------ The NEW KODAK i700 Series Scanners deliver under ANY circumstances! Your production scanning environment may not be a perfect world - but thanks to Kodak, there's a perfect scanner to get the job done! With the NEW KODAK i700 Series Scanner you'll get full speed at 300 dpi even with all image processing features enabled. http://p.sf.net/sfu/kodak-com ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2] igb: Record host memory receive overflow in net_stats 2009-05-06 8:12 ` Jesper Dangaard Brouer @ 2009-05-06 8:56 ` Jesper Dangaard Brouer 0 siblings, 0 replies; 14+ messages in thread From: Jesper Dangaard Brouer @ 2009-05-06 8:56 UTC (permalink / raw) To: Ronciak, John, David S. Miller Cc: Jesper Dangaard Brouer, Kirsher, Jeffrey T, netdev, e1000-devel@lists.sourceforge.net, Brandeburg, Jesse, Allan, Bruce W, Waskiewicz Jr, Peter P The RNBC (Receive No Buffers Count) register for the 82576, indicate that frames were received when there were no available buffers in host memory to store those frames (receive descriptor head and tail pointers were equal). The packet is still received by the NIC if there is space in the FIFO I have shown that this situation can arise when the kernel is too busy else where. As the the RNBC value is not necessary a packet drop, I choose to store the RNBC value in net_stats.rx_fifo_errors. Saving the stats in dev->net_stats makes it visible via /proc/net/dev as "fifo", and thus viewable to ifconfig as "overruns" and 'netstat -i' as "RX-OVR". The Receive No Buffers Count (RNBC) can already be queried by ethtool -S as "rx_no_buffer_count". Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> --- drivers/net/igb/igb_main.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c index 183235d..3ee00a5 100644 --- a/drivers/net/igb/igb_main.c +++ b/drivers/net/igb/igb_main.c @@ -3597,6 +3597,10 @@ void igb_update_stats(struct igb_adapter *adapter) adapter->net_stats.rx_frame_errors = adapter->stats.algnerrc; adapter->net_stats.rx_missed_errors = adapter->stats.mpc; + /* Note RNBC (Receive No Buffers Count) is an overflow + * indication, thus not necessary a dropped packet. */ + adapter->net_stats.rx_fifo_errors = adapter->stats.rnbc; + /* Tx Errors */ adapter->net_stats.tx_errors = adapter->stats.ecol + adapter->stats.latecol; ^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2009-05-06 21:24 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-05-04 11:06 [PATCH] igb: Record hardware RX overruns in net_stats Jesper Dangaard Brouer 2009-05-05 18:47 ` Jeff Kirsher 2009-05-05 18:58 ` David Miller 2009-05-05 21:24 ` Jesper Dangaard Brouer 2009-05-05 21:32 ` Jeff Kirsher 2009-05-05 21:35 ` David Miller 2009-05-06 7:46 ` Jesper Dangaard Brouer 2009-05-06 8:11 ` Waskiewicz Jr, Peter P 2009-05-06 13:09 ` Jesper Dangaard Brouer 2009-05-06 20:59 ` Jesper Dangaard Brouer 2009-05-06 21:24 ` Waskiewicz Jr, Peter P 2009-05-05 22:38 ` Ronciak, John 2009-05-06 8:12 ` Jesper Dangaard Brouer 2009-05-06 8:56 ` [PATCH v2] igb: Record host memory receive overflow " Jesper Dangaard Brouer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox