* [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path
@ 2005-04-20 5:50 Malli Chilakala
2005-04-21 3:05 ` Randy.Dunlap
2005-04-21 16:55 ` Scott Feldman
0 siblings, 2 replies; 7+ messages in thread
From: Malli Chilakala @ 2005-04-20 5:50 UTC (permalink / raw)
To: jgarzik@pobox.com; +Cc: netdev
Performance optimizations to e100 Tx Path
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Ganesh Venkatesan <ganesh.venkatesan@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
diff -up net-drivers-2.6/drivers/net/e100.c net-drivers-2.6/drivers/net/e100.c.new
--- net-drivers-2.6/drivers/net/e100.c 2005-04-07 03:22:36.805273784 -0700
+++ net-drivers-2.6/drivers/net/e100.c.new 2005-04-07 03:22:37.251205992 -0700
@@ -777,7 +777,7 @@ static int e100_eeprom_save(struct nic *
return 0;
}
-#define E100_WAIT_SCB_TIMEOUT 40
+#define E100_WAIT_SCB_TIMEOUT 20000 /* we might have to wait 100ms!!! */
static inline int e100_exec_cmd(struct nic *nic, u8 cmd, dma_addr_t dma_addr)
{
unsigned long flags;
@@ -847,6 +847,10 @@ static inline int e100_exec_cb(struct ni
* because the controller is too busy, so
* let's just queue the command and try again
* when another command is scheduled. */
+ if(err == -ENOSPC) {
+ //request a reset
+ schedule_work(&nic->tx_timeout_task);
+ }
break;
} else {
nic->cuc_cmd = cuc_resume;
@@ -891,7 +895,7 @@ static void mdio_write(struct net_device
static void e100_get_defaults(struct nic *nic)
{
- struct param_range rfds = { .min = 64, .max = 256, .count = 64 };
+ struct param_range rfds = { .min = 16, .max = 256, .count = 64 };
struct param_range cbs = { .min = 64, .max = 256, .count = 64 };
pci_read_config_byte(nic->pdev, PCI_REVISION_ID, &nic->rev_id);
@@ -906,8 +910,9 @@ static void e100_get_defaults(struct nic
/* Quadwords to DMA into FIFO before starting frame transmit */
nic->tx_threshold = 0xE0;
- nic->tx_command = cpu_to_le16(cb_tx | cb_i | cb_tx_sf |
- ((nic->mac >= mac_82558_D101_A4) ? cb_cid : 0));
+ /* no interrupt for every tx completion, delay = 256us if not 557*/
+ nic->tx_command = cpu_to_le16(cb_tx | cb_tx_sf |
+ ((nic->mac >= mac_82558_D101_A4) ? cb_cid : cb_i));
/* Template for a freshly allocated RFD */
nic->blank_rfd.command = cpu_to_le16(cb_el);
@@ -1289,12 +1294,15 @@ static inline void e100_xmit_prepare(str
struct sk_buff *skb)
{
cb->command = nic->tx_command;
+ /* interrupt every 16 packets regardless of delay */
+ if((nic->cbs_avail & ~15) == nic->cbs_avail) cb->command |= cb_i;
cb->u.tcb.tbd_array = cb->dma_addr + offsetof(struct cb, u.tcb.tbd);
cb->u.tcb.tcb_byte_count = 0;
cb->u.tcb.threshold = nic->tx_threshold;
cb->u.tcb.tbd_count = 1;
cb->u.tcb.tbd.buf_addr = cpu_to_le32(pci_map_single(nic->pdev,
skb->data, skb->len, PCI_DMA_TODEVICE));
+ // check for mapping failure?
cb->u.tcb.tbd.size = cpu_to_le16(skb->len);
}
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path
2005-04-20 5:50 [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path Malli Chilakala
@ 2005-04-21 3:05 ` Randy.Dunlap
2005-04-21 16:51 ` Ganesh Venkatesan
2005-04-21 16:55 ` Scott Feldman
1 sibling, 1 reply; 7+ messages in thread
From: Randy.Dunlap @ 2005-04-21 3:05 UTC (permalink / raw)
To: Malli Chilakala; +Cc: jgarzik, netdev
On Tue, 19 Apr 2005 22:50:31 -0700 (PDT) Malli Chilakala wrote:
| Performance optimizations to e100 Tx Path
|
| Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
| Signed-off-by: Ganesh Venkatesan <ganesh.venkatesan@intel.com>
| Signed-off-by: John Ronciak <john.ronciak@intel.com>
| diff -up net-drivers-2.6/drivers/net/e100.c net-drivers-2.6/drivers/net/e100.c.new
| --- net-drivers-2.6/drivers/net/e100.c 2005-04-07 03:22:36.805273784 -0700
| +++ net-drivers-2.6/drivers/net/e100.c.new 2005-04-07 03:22:37.251205992 -0700
| @@ -777,7 +777,7 @@ static int e100_eeprom_save(struct nic *
| return 0;
| }
|
| -#define E100_WAIT_SCB_TIMEOUT 40
| +#define E100_WAIT_SCB_TIMEOUT 20000 /* we might have to wait 100ms!!! */
What correlation is there between 20000 and 100 ms ?
| static inline int e100_exec_cmd(struct nic *nic, u8 cmd, dma_addr_t dma_addr)
| {
| unsigned long flags;
| @@ -847,6 +847,10 @@ static inline int e100_exec_cb(struct ni
| * because the controller is too busy, so
| * let's just queue the command and try again
| * when another command is scheduled. */
| + if(err == -ENOSPC) {
if (err == -ENOSPC) {
is preferred (with space after if).
(same comment for below)
| + //request a reset
Kernel comment style is /* ... */, not //.
(same comment for below)
| + schedule_work(&nic->tx_timeout_task);
| + }
| break;
| } else {
| nic->cuc_cmd = cuc_resume;
| @@ -1289,12 +1294,15 @@ static inline void e100_xmit_prepare(str
| struct sk_buff *skb)
| {
| cb->command = nic->tx_command;
| + /* interrupt every 16 packets regardless of delay */
| + if((nic->cbs_avail & ~15) == nic->cbs_avail) cb->command |= cb_i;
Don't put if() and statement on one line, please.
It tends to hide code unintentionally.
| cb->u.tcb.tbd_array = cb->dma_addr + offsetof(struct cb, u.tcb.tbd);
| cb->u.tcb.tcb_byte_count = 0;
| cb->u.tcb.threshold = nic->tx_threshold;
| cb->u.tcb.tbd_count = 1;
| cb->u.tcb.tbd.buf_addr = cpu_to_le32(pci_map_single(nic->pdev,
| skb->data, skb->len, PCI_DMA_TODEVICE));
| + // check for mapping failure?
| cb->u.tcb.tbd.size = cpu_to_le16(skb->len);
| }
---
~Randy
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path
2005-04-21 3:05 ` Randy.Dunlap
@ 2005-04-21 16:51 ` Ganesh Venkatesan
2005-04-21 17:10 ` Randy.Dunlap
0 siblings, 1 reply; 7+ messages in thread
From: Ganesh Venkatesan @ 2005-04-21 16:51 UTC (permalink / raw)
To: Randy.Dunlap; +Cc: Malli Chilakala, jgarzik, netdev
On 4/20/05, Randy.Dunlap <rddunlap@osdl.org> wrote:
> On Tue, 19 Apr 2005 22:50:31 -0700 (PDT) Malli Chilakala wrote:
>
> |
> | -#define E100_WAIT_SCB_TIMEOUT 40
> | +#define E100_WAIT_SCB_TIMEOUT 20000 /* we might have to wait 100ms!!! */
>
> What correlation is there between 20000 and 100 ms ?
>
This needs some review and fixing on our side.
> | static inline int e100_exec_cmd(struct nic *nic, u8 cmd, dma_addr_t dma_addr)
> | {
> | unsigned long flags;
> | @@ -847,6 +847,10 @@ static inline int e100_exec_cb(struct ni
> | * because the controller is too busy, so
> | * let's just queue the command and try again
> | * when another command is scheduled. */
> | + if(err == -ENOSPC) {
> if (err == -ENOSPC) {
> is preferred (with space after if).
> (same comment for below)
Is there a clear directive on 'if(' versus 'if ('? I see both styles
being used. We are trying to stay consistent with 'if('.
>
> | + //request a reset
> Kernel comment style is /* ... */, not //.
> (same comment for below)
>
Agreed. Will fix this.
> | + schedule_work(&nic->tx_timeout_task);
> | + }
> | break;
> | } else {
> | nic->cuc_cmd = cuc_resume;
> | @@ -1289,12 +1294,15 @@ static inline void e100_xmit_prepare(str
> | struct sk_buff *skb)
> | {
> | cb->command = nic->tx_command;
> | + /* interrupt every 16 packets regardless of delay */
> | + if((nic->cbs_avail & ~15) == nic->cbs_avail) cb->command |= cb_i;
> Don't put if() and statement on one line, please.
> It tends to hide code unintentionally.
Will fix this.
>
> | cb->u.tcb.tbd_array = cb->dma_addr + offsetof(struct cb, u.tcb.tbd);
> | cb->u.tcb.tcb_byte_count = 0;
> | cb->u.tcb.threshold = nic->tx_threshold;
> | cb->u.tcb.tbd_count = 1;
> | cb->u.tcb.tbd.buf_addr = cpu_to_le32(pci_map_single(nic->pdev,
> | skb->data, skb->len, PCI_DMA_TODEVICE));
> | + // check for mapping failure?
> | cb->u.tcb.tbd.size = cpu_to_le16(skb->len);
> | }
>
> ---
> ~Randy
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path
2005-04-21 16:51 ` Ganesh Venkatesan
@ 2005-04-21 17:10 ` Randy.Dunlap
0 siblings, 0 replies; 7+ messages in thread
From: Randy.Dunlap @ 2005-04-21 17:10 UTC (permalink / raw)
To: Ganesh Venkatesan; +Cc: mallikarjuna.chilakala, jgarzik, netdev
On Thu, 21 Apr 2005 09:51:01 -0700 Ganesh Venkatesan wrote:
|
| > | static inline int e100_exec_cmd(struct nic *nic, u8 cmd, dma_addr_t dma_addr)
| > | {
| > | unsigned long flags;
| > | @@ -847,6 +847,10 @@ static inline int e100_exec_cb(struct ni
| > | * because the controller is too busy, so
| > | * let's just queue the command and try again
| > | * when another command is scheduled. */
| > | + if(err == -ENOSPC) {
| > if (err == -ENOSPC) {
| > is preferred (with space after if).
| > (same comment for below)
|
| Is there a clear directive on 'if(' versus 'if ('? I see both styles
| being used. We are trying to stay consistent with 'if('.
There's nothing explicit in CodingStyle, but all of
the examples in CodingStyle use a space after 'if'.
And a few if's in this driver use a space after 'if',
but most don't iirc from counting them yesterday.
Thanks,
---
~Randy
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path
2005-04-20 5:50 [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path Malli Chilakala
2005-04-21 3:05 ` Randy.Dunlap
@ 2005-04-21 16:55 ` Scott Feldman
1 sibling, 0 replies; 7+ messages in thread
From: Scott Feldman @ 2005-04-21 16:55 UTC (permalink / raw)
To: Malli Chilakala; +Cc: netdev, jgarzik@pobox.com
On Apr 19, 2005, at 10:50 PM, Malli Chilakala wrote:
> Performance optimizations to e100 Tx Path
So what is the net performance improvement with these changes?
> - nic->tx_command = cpu_to_le16(cb_tx | cb_i | cb_tx_sf |
> - ((nic->mac >= mac_82558_D101_A4) ? cb_cid : 0));
> + /* no interrupt for every tx completion, delay = 256us if not 557*/
> + nic->tx_command = cpu_to_le16(cb_tx | cb_tx_sf |
> + ((nic->mac >= mac_82558_D101_A4) ? cb_cid : cb_i));
Where is this 256us delay coming from?
> /* Template for a freshly allocated RFD */
> nic->blank_rfd.command = cpu_to_le16(cb_el);
> @@ -1289,12 +1294,15 @@ static inline void e100_xmit_prepare(str
> struct sk_buff *skb)
> {
> cb->command = nic->tx_command;
> + /* interrupt every 16 packets regardless of delay */
> + if((nic->cbs_avail & ~15) == nic->cbs_avail) cb->command |= cb_i;
You messed up Big Endian.
So you send out 15 packets with no i-bit set. Then what? No interrupt
means no NAPI means no cleanup of those 15 packet skbs.
-scott
P.S. I only saw this one patch for e100 but the subject line indicates
there are 6.
^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path
@ 2005-04-21 18:59 Brandeburg, Jesse
2005-04-21 19:17 ` Scott Feldman
0 siblings, 1 reply; 7+ messages in thread
From: Brandeburg, Jesse @ 2005-04-21 18:59 UTC (permalink / raw)
To: Scott Feldman; +Cc: netdev, Chilakala, Mallikarjuna, Venkatesan, Ganesh
From: Scott Feldman
>On Apr 19, 2005, at 10:50 PM, Malli Chilakala wrote:
>
>> Performance optimizations to e100 Tx Path
>
>So what is the net performance improvement with these changes?
A massive drop in the number of transmit interrupts per second (yielding
more cpu) when sending lots of frames (like forwarding). I don't
remember my data specifically (stoopid brain).
>> - nic->tx_command = cpu_to_le16(cb_tx | cb_i | cb_tx_sf |
>> - ((nic->mac >= mac_82558_D101_A4) ? cb_cid : 0));
>> + /* no interrupt for every tx completion, delay = 256us
>if not 557*/
>> + nic->tx_command = cpu_to_le16(cb_tx | cb_tx_sf |
>> + ((nic->mac >= mac_82558_D101_A4) ? cb_cid : cb_i));
>
>Where is this 256us delay coming from?
Turning off the 'I' bit allows the transmit interrupt moderation to come
on, and setting the cid (which was already set) to all 1's indicates the
max delay (256 us)
>> /* Template for a freshly allocated RFD */
>> nic->blank_rfd.command = cpu_to_le16(cb_el);
>> @@ -1289,12 +1294,15 @@ static inline void e100_xmit_prepare(str
>> struct sk_buff *skb)
>> {
>> cb->command = nic->tx_command;
>> + /* interrupt every 16 packets regardless of delay */
>> + if((nic->cbs_avail & ~15) == nic->cbs_avail)
>cb->command |= cb_i;
>
>You messed up Big Endian.
Yes, I did. Doh.
>So you send out 15 packets with no i-bit set. Then what? No
>interrupt
>means no NAPI means no cleanup of those 15 packet skbs.
This doesn't mean what you said. The i-bit setting every 15 is an
optimization for minimzing latency. i-bit means interrupt immediately
ignoring cid delay.
This means every packet gets worst case a 256us delay before interrupt.
If a receive or watchdog fires the interrupt they will get cleaned up
sooner.
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path
2005-04-21 18:59 Brandeburg, Jesse
@ 2005-04-21 19:17 ` Scott Feldman
0 siblings, 0 replies; 7+ messages in thread
From: Scott Feldman @ 2005-04-21 19:17 UTC (permalink / raw)
To: Brandeburg, Jesse; +Cc: Chilakala, Mallikarjuna, Venkatesan, Ganesh, netdev
On Apr 21, 2005, at 11:59 AM, Brandeburg, Jesse wrote:
>> Where is this 256us delay coming from?
>
> Turning off the 'I' bit allows the transmit interrupt moderation to
> come
> on, and setting the cid (which was already set) to all 1's indicates
> the
> max delay (256 us)
This makes sense. I forgot about the cid delay. I was afraid the "CPU
Saver" microcode was coming back. Thanks Jesse.
-scott
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2005-04-21 19:17 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-04-20 5:50 [PATCH net-drivers-2.6 5/6] e100: Performance optimizations to e100 Tx Path Malli Chilakala
2005-04-21 3:05 ` Randy.Dunlap
2005-04-21 16:51 ` Ganesh Venkatesan
2005-04-21 17:10 ` Randy.Dunlap
2005-04-21 16:55 ` Scott Feldman
-- strict thread matches above, loose matches on Subject: below --
2005-04-21 18:59 Brandeburg, Jesse
2005-04-21 19:17 ` Scott Feldman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).