netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH]  Break 'budget' dependency on netdev_max_backlog.
@ 2002-10-21  6:46 Ben Greear
  2002-10-21 12:51 ` jamal
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Greear @ 2002-10-21  6:46 UTC (permalink / raw)
  To: 'netdev@oss.sgi.com', Robert Olsson

I want to have a very high netdev_max_backlog, but that makes
NAPI with the tulip driver (and 4 running ports) drop packets,
I assume because by the time it polls the first 3, the 4th has
been neglected too long...

So, this patch makes the budget NAPI variable independently tunable.
The default of 300 seems to work well, and is the current default.

In the future, I'm guessing we want a per-interface budget, because
a high performance GigE NIC will want a higher budget that the
venerable 100bt.


--- linux-2.4.19.p3/include/linux/sysctl.h      Tue Oct 15 14:26:40 2002
+++ linux-2.4.19.p4/include/linux/sysctl.h      Sun Oct 20 23:24:13 2002
@@ -206,7 +206,8 @@
         NET_CORE_NO_CONG=14,
         NET_CORE_LO_CONG=15,
         NET_CORE_MOD_CONG=16,
-       NET_CORE_DEV_WEIGHT=17
+       NET_CORE_DEV_WEIGHT=17,
+       NET_CORE_MAX_POLL=18
  };

  /* /proc/sys/net/ethernet */
--- linux-2.4.19.p3/net/core/dev.c      Tue Oct 15 14:23:39 2002
+++ linux-2.4.19.p4/net/core/dev.c      Sun Oct 20 23:36:02 2002
@@ -1092,6 +1092,7 @@
    =======================================================================*/

  int netdev_max_backlog = 300;
+int netdev_max_poll = 300; /* 'Budget' for the NAPI poll method */
  int weight_p = 64;            /* old backlog weight */
  /* These numbers are selected based on intuition and some
   * experimentatiom, if you have more scientific way of doing this
@@ -1604,7 +1605,7 @@
         int this_cpu = smp_processor_id();
         struct softnet_data *queue = &softnet_data[this_cpu];
         unsigned long start_time = jiffies;
-       int budget = netdev_max_backlog;
+       int budget = netdev_max_poll;

         br_read_lock(BR_NETPROTO_LOCK);
         local_irq_disable();
--- linux-2.4.19.p3/net/core/sysctl_net_core.c  Wed Oct  9 00:24:02 2002
+++ linux-2.4.19.p4/net/core/sysctl_net_core.c  Sun Oct 20 23:35:23 2002
@@ -12,6 +12,7 @@
  #ifdef CONFIG_SYSCTL

  extern int netdev_max_backlog;
+extern int netdev_max_poll;
  extern int weight_p;
  extern int no_cong_thresh;
  extern int no_cong;
@@ -54,6 +55,9 @@
         {NET_CORE_MAX_BACKLOG, "netdev_max_backlog",
          &netdev_max_backlog, sizeof(int), 0644, NULL,
          &proc_dointvec},
+       {NET_CORE_MAX_POLL, "netdev_max_poll",
+        &netdev_max_poll, sizeof(int), 0644, NULL,
+        &proc_dointvec},
         {NET_CORE_NO_CONG_THRESH, "no_cong_thresh",
          &no_cong, sizeof(int), 0644, NULL,
          &proc_dointvec},


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-21  6:46 [PATCH] Break 'budget' dependency on netdev_max_backlog Ben Greear
@ 2002-10-21 12:51 ` jamal
  2002-10-21 16:25   ` Ben Greear
  0 siblings, 1 reply; 12+ messages in thread
From: jamal @ 2002-10-21 12:51 UTC (permalink / raw)
  To: Ben Greear; +Cc: 'netdev@oss.sgi.com', Robert Olsson


Patch looks fine except for the 300 number. What are you smoking?
Please retain the original value.

cheers,
jamal

On Sun, 20 Oct 2002, Ben Greear wrote:

> I want to have a very high netdev_max_backlog, but that makes
> NAPI with the tulip driver (and 4 running ports) drop packets,
> I assume because by the time it polls the first 3, the 4th has
> been neglected too long...
>
> So, this patch makes the budget NAPI variable independently tunable.
> The default of 300 seems to work well, and is the current default.
>
> In the future, I'm guessing we want a per-interface budget, because
> a high performance GigE NIC will want a higher budget that the
> venerable 100bt.
>
>
> --- linux-2.4.19.p3/include/linux/sysctl.h      Tue Oct 15 14:26:40 2002
> +++ linux-2.4.19.p4/include/linux/sysctl.h      Sun Oct 20 23:24:13 2002
> @@ -206,7 +206,8 @@
>          NET_CORE_NO_CONG=14,
>          NET_CORE_LO_CONG=15,
>          NET_CORE_MOD_CONG=16,
> -       NET_CORE_DEV_WEIGHT=17
> +       NET_CORE_DEV_WEIGHT=17,
> +       NET_CORE_MAX_POLL=18
>   };
>
>   /* /proc/sys/net/ethernet */
> --- linux-2.4.19.p3/net/core/dev.c      Tue Oct 15 14:23:39 2002
> +++ linux-2.4.19.p4/net/core/dev.c      Sun Oct 20 23:36:02 2002
> @@ -1092,6 +1092,7 @@
>     =======================================================================*/
>
>   int netdev_max_backlog = 300;
> +int netdev_max_poll = 300; /* 'Budget' for the NAPI poll method */
>   int weight_p = 64;            /* old backlog weight */
>   /* These numbers are selected based on intuition and some
>    * experimentatiom, if you have more scientific way of doing this
> @@ -1604,7 +1605,7 @@
>          int this_cpu = smp_processor_id();
>          struct softnet_data *queue = &softnet_data[this_cpu];
>          unsigned long start_time = jiffies;
> -       int budget = netdev_max_backlog;
> +       int budget = netdev_max_poll;
>
>          br_read_lock(BR_NETPROTO_LOCK);
>          local_irq_disable();
> --- linux-2.4.19.p3/net/core/sysctl_net_core.c  Wed Oct  9 00:24:02 2002
> +++ linux-2.4.19.p4/net/core/sysctl_net_core.c  Sun Oct 20 23:35:23 2002
> @@ -12,6 +12,7 @@
>   #ifdef CONFIG_SYSCTL
>
>   extern int netdev_max_backlog;
> +extern int netdev_max_poll;
>   extern int weight_p;
>   extern int no_cong_thresh;
>   extern int no_cong;
> @@ -54,6 +55,9 @@
>          {NET_CORE_MAX_BACKLOG, "netdev_max_backlog",
>           &netdev_max_backlog, sizeof(int), 0644, NULL,
>           &proc_dointvec},
> +       {NET_CORE_MAX_POLL, "netdev_max_poll",
> +        &netdev_max_poll, sizeof(int), 0644, NULL,
> +        &proc_dointvec},
>          {NET_CORE_NO_CONG_THRESH, "no_cong_thresh",
>           &no_cong, sizeof(int), 0644, NULL,
>           &proc_dointvec},
>
>
> --
> Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
> President of Candela Technologies Inc      http://www.candelatech.com
> ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear
>
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-21 12:51 ` jamal
@ 2002-10-21 16:25   ` Ben Greear
  2002-10-23  0:47     ` jamal
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Greear @ 2002-10-21 16:25 UTC (permalink / raw)
  To: jamal, 'netdev@oss.sgi.com'

jamal wrote:
> Patch looks fine except for the 300 number. What are you smoking?
> Please retain the original value.

The original value was 300, what do you want it to be?

(Check out what backlog defaults to, that is what is currently
used for the budget, in 2.4.20-pre9 at least.)

Ben

-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-21 16:25   ` Ben Greear
@ 2002-10-23  0:47     ` jamal
  2002-10-23  1:06       ` Ben Greear
  0 siblings, 1 reply; 12+ messages in thread
From: jamal @ 2002-10-23  0:47 UTC (permalink / raw)
  To: Ben Greear; +Cc: 'netdev@oss.sgi.com'



On Mon, 21 Oct 2002, Ben Greear wrote:

> jamal wrote:
> > Patch looks fine except for the 300 number. What are you smoking?
> > Please retain the original value.
>
> The original value was 300, what do you want it to be?
>

Ok, i take back what i said then. Your patch is not that useful.
For some reason i thought you were introducing NET_CORE_DEV_WEIGHT
which happens to already exist with default value of 64.
The value you put out MUST be the same as netdev_max_backlog in order
to be fair to non-NAPI devices.
Any change to one must be reflected to the other. So a useful patch
will try to shadow NET_CORE_MAX_BACKLOG to that value and allow only
changes to happen to NET_CORE_MAX_BACKLOG; i.e no need for two separate
parameters.

cheers,
jamal

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-23  0:47     ` jamal
@ 2002-10-23  1:06       ` Ben Greear
  2002-10-23  1:18         ` jamal
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Greear @ 2002-10-23  1:06 UTC (permalink / raw)
  To: jamal; +Cc: 'netdev@oss.sgi.com'

jamal wrote:
> 
> On Mon, 21 Oct 2002, Ben Greear wrote:
> 
> 
>>jamal wrote:
>>
>>>Patch looks fine except for the 300 number. What are you smoking?
>>>Please retain the original value.
>>
>>The original value was 300, what do you want it to be?
>>
> 
> 
> Ok, i take back what i said then. Your patch is not that useful.
> For some reason i thought you were introducing NET_CORE_DEV_WEIGHT
> which happens to already exist with default value of 64.
> The value you put out MUST be the same as netdev_max_backlog in order
> to be fair to non-NAPI devices.
> Any change to one must be reflected to the other. So a useful patch
> will try to shadow NET_CORE_MAX_BACKLOG to that value and allow only
> changes to happen to NET_CORE_MAX_BACKLOG; i.e no need for two separate
> parameters.

Actually, with the changes the way they are, if you have a backlog value
of 300, and the first NAPI driver reads 300 packets, then all the rest of the
NAPI drivers will just throw away packets because the max-backlog has
already been hit (if I understand things correctly).

If we make the backlog big enough to handle a large burst of incomming
traffic w/out dropping them, then the first NAPI driver can hog a very
large amount of CPU because its budget is now huge.  The rest of
the devices in the poll loop will not be serviced in time, and will
drop packets (I saw this when testing out the 4-port tulip driver with
the NAPI patch).

Ben

> 
> cheers,
> jamal
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-23  1:06       ` Ben Greear
@ 2002-10-23  1:18         ` jamal
  2002-10-23  2:37           ` Ben Greear
  0 siblings, 1 reply; 12+ messages in thread
From: jamal @ 2002-10-23  1:18 UTC (permalink / raw)
  To: Ben Greear; +Cc: 'netdev@oss.sgi.com'



On Tue, 22 Oct 2002, Ben Greear wrote:

> Actually, with the changes the way they are, if you have a backlog value
> of 300, and the first NAPI driver reads 300 packets, then all the rest of the
> NAPI drivers will just throw away packets because the max-backlog has
> already been hit (if I understand things correctly).
>

The 300 is shared amongst all the drivers which have something on their
rx DMA. This is done on a roundrobin fashion.
Also all the drivers which are non-napi are given the same quota.

> If we make the backlog big enough to handle a large burst of incomming
> traffic w/out dropping them, then the first NAPI driver can hog a very
> large amount of CPU because its budget is now huge.  The rest of
> the devices in the poll loop will not be serviced in time, and will
> drop packets (I saw this when testing out the 4-port tulip driver with
> the NAPI patch).
>

I think you misunderstood. Look at the dev->weight.

cheers,
jamal

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-23  1:18         ` jamal
@ 2002-10-23  2:37           ` Ben Greear
  2002-10-23  3:21             ` jamal
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Greear @ 2002-10-23  2:37 UTC (permalink / raw)
  To: jamal; +Cc: 'netdev@oss.sgi.com'

jamal wrote:

> I think you misunderstood. Look at the dev->weight.

Yep, I was definately confused.  I don't see how changing the
value as I did could affect anything in a good way, but I definately
saw changes in dropped packets...maybe it was too late at night :)

So, after further looking at the code, it appears that the dev->weight
is basically hard-coded in the tulip driver, and the weight_p value
in dev.c (and settable though sysctl) is not used anywhere except
netdev_init (ie, not soon enough to actually set via sysctl).

Think we should add an IOCTL to change the weight of a device
dynamically?  (I want my e1000 and tg3 to have higher weight than
the tulip nics, I imagine)

Is there a ready-built proc interface to do things to individual devices?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-23  2:37           ` Ben Greear
@ 2002-10-23  3:21             ` jamal
  2002-10-23  3:43               ` Ben Greear
  0 siblings, 1 reply; 12+ messages in thread
From: jamal @ 2002-10-23  3:21 UTC (permalink / raw)
  To: Ben Greear; +Cc: 'netdev@oss.sgi.com'



On Tue, 22 Oct 2002, Ben Greear wrote:

> Think we should add an IOCTL to change the weight of a device
> dynamically?  (I want my e1000 and tg3 to have higher weight than
> the tulip nics, I imagine)
>
> Is there a ready-built proc interface to do things to individual devices?
>

tulip hardcodes it; net/core/dev.c::weight_p is supposed to be the general
one and is proc settable via NET_CORE_DEV_WEIGHT; only problem is that
is not shadowed by the devices.
I am not sure if ioctls or module parameters are the best way to override
the default weight_p value. Somehow we need something to set this value.

cheers,
jamal

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-23  3:21             ` jamal
@ 2002-10-23  3:43               ` Ben Greear
  2002-10-23  3:44                 ` jamal
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Greear @ 2002-10-23  3:43 UTC (permalink / raw)
  To: jamal; +Cc: 'netdev@oss.sgi.com'

jamal wrote:
> 
> On Tue, 22 Oct 2002, Ben Greear wrote:
> 
> 
>>Think we should add an IOCTL to change the weight of a device
>>dynamically?  (I want my e1000 and tg3 to have higher weight than
>>the tulip nics, I imagine)
>>
>>Is there a ready-built proc interface to do things to individual devices?
>>
> 
> 
> tulip hardcodes it; net/core/dev.c::weight_p is supposed to be the general
> one and is proc settable via NET_CORE_DEV_WEIGHT; only problem is that
> is not shadowed by the devices.
> I am not sure if ioctls or module parameters are the best way to override
> the default weight_p value. Somehow we need something to set this value.

No reason not to provide both!  The IOCTL can take care of any drivers
that don't take module parameters for it, and can do it in a generic
way.

The tulip driver I am poking at hard-coded it to 16.  64 seems to work
just as good.

Any idea (other than trial and error) for how to determine a good value
for this?  Maybe should take the number of active devices into account
and wiggle things dynamically from user-space?

Ben

> 
> cheers,
> jamal
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-23  3:43               ` Ben Greear
@ 2002-10-23  3:44                 ` jamal
  2002-10-23  4:00                   ` Ben Greear
  0 siblings, 1 reply; 12+ messages in thread
From: jamal @ 2002-10-23  3:44 UTC (permalink / raw)
  To: Ben Greear; +Cc: 'netdev@oss.sgi.com'



On Tue, 22 Oct 2002, Ben Greear wrote:

> No reason not to provide both!  The IOCTL can take care of any drivers
> that don't take module parameters for it, and can do it in a generic
> way.

sure maybe via the ethertool or whatever tool that is being used would
be a good start. The default should still be weight_p.

>
> The tulip driver I am poking at hard-coded it to 16.  64 seems to work
> just as good.
>
> Any idea (other than trial and error) for how to determine a good value
> for this?  Maybe should take the number of active devices into account
> and wiggle things dynamically from user-space?
>

What i recall is 64 was a good number and thats why we had it as default.
I cant remember who changed the value to 16(Robert?), but it does seem to
be a good value as well. If that was the only driver and it had a lot of
packets, it would still get to use all the 300 packets in the budget; it
will just have to do it in 300/16 times ..

cheers,
jamal

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-23  4:00                   ` Ben Greear
@ 2002-10-23  3:58                     ` jamal
  0 siblings, 0 replies; 12+ messages in thread
From: jamal @ 2002-10-23  3:58 UTC (permalink / raw)
  To: Ben Greear; +Cc: 'netdev@oss.sgi.com'



On Tue, 22 Oct 2002, Ben Greear wrote:

> Is there any way to influence how often the poll loop gets run?

the softirq is scheduled like others. You could play with niceness of
ksoftirqd, however note that you may starve user space etc.
You could also increase that backlog queue, but you will be preempted if
that softirq runs for too long.

cheers,
jamal

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH]  Break 'budget' dependency on netdev_max_backlog.
  2002-10-23  3:44                 ` jamal
@ 2002-10-23  4:00                   ` Ben Greear
  2002-10-23  3:58                     ` jamal
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Greear @ 2002-10-23  4:00 UTC (permalink / raw)
  To: jamal; +Cc: 'netdev@oss.sgi.com'

jamal wrote:

> What i recall is 64 was a good number and thats why we had it as default.
> I cant remember who changed the value to 16(Robert?), but it does seem to
> be a good value as well. If that was the only driver and it had a lot of
> packets, it would still get to use all the 300 packets in the budget; it
> will just have to do it in 300/16 times ..

Is there any way to influence how often the poll loop gets run?

I'll have a patch to set the weight via IOCTL soon....

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2002-10-23  4:00 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-10-21  6:46 [PATCH] Break 'budget' dependency on netdev_max_backlog Ben Greear
2002-10-21 12:51 ` jamal
2002-10-21 16:25   ` Ben Greear
2002-10-23  0:47     ` jamal
2002-10-23  1:06       ` Ben Greear
2002-10-23  1:18         ` jamal
2002-10-23  2:37           ` Ben Greear
2002-10-23  3:21             ` jamal
2002-10-23  3:43               ` Ben Greear
2002-10-23  3:44                 ` jamal
2002-10-23  4:00                   ` Ben Greear
2002-10-23  3:58                     ` jamal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).