netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Simon Horman <simon.horman@netronome.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jiri Pirko <jiri@mellanox.com>,
	netdev@vger.kernel.org, oss-drivers@netronome.com,
	Baowen Zheng <baowen.zheng@corigine.com>,
	Louis Peens <louis.peens@netronome.com>
Subject: Re: [PATCH RFC net-next] net/sched: act_police: add support for packet-per-second policing
Date: Wed, 27 Jan 2021 12:02:23 +0100	[thread overview]
Message-ID: <20210127110222.GA29081@netronome.com> (raw)
In-Reply-To: <20210126183812.180d0d61@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

Hi Jakub,

On Tue, Jan 26, 2021 at 06:38:12PM -0800, Jakub Kicinski wrote:
> On Mon, 25 Jan 2021 16:18:19 +0100 Simon Horman wrote:
> > From: Baowen Zheng <baowen.zheng@corigine.com>
> > 
> > Allow a policer action to enforce a rate-limit based on packets-per-second,
> > configurable using a packet-per-second rate and burst parameters. This may
> > be used in conjunction with existing byte-per-second rate limiting in the
> > same policer action.
> > 
> > e.g.
> > tc filter add dev tap1 parent ffff: u32 match \
> >               u32 0 0 police pkts_rate 3000 pkts_burst 1000
> > 
> > Testing was unable to uncover a performance impact of this change on
> > existing features.
> > 
> > Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
> > Signed-off-by: Simon Horman <simon.horman@netronome.com>
> > Signed-off-by: Louis Peens <louis.peens@netronome.com>
> 
> > diff --git a/net/sched/act_police.c b/net/sched/act_police.c
> > index 8d8452b1cdd4..d700b2105535 100644
> > --- a/net/sched/act_police.c
> > +++ b/net/sched/act_police.c
> > @@ -42,6 +42,8 @@ static const struct nla_policy police_policy[TCA_POLICE_MAX + 1] = {
> >  	[TCA_POLICE_RESULT]	= { .type = NLA_U32 },
> >  	[TCA_POLICE_RATE64]     = { .type = NLA_U64 },
> >  	[TCA_POLICE_PEAKRATE64] = { .type = NLA_U64 },
> > +	[TCA_POLICE_PKTRATE64]  = { .type = NLA_U64 },
> > +	[TCA_POLICE_PKTBURST64] = { .type = NLA_U64 },
> 
> Should we set the policy so that .min = 1?

Yes, I think so.
Thanks for spotting that.

> >  };
> >  
> >  static int tcf_police_init(struct net *net, struct nlattr *nla,
> > @@ -61,6 +63,7 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
> >  	bool exists = false;
> >  	u32 index;
> >  	u64 rate64, prate64;
> > +	u64 pps, ppsburst;
> >  
> >  	if (nla == NULL)
> >  		return -EINVAL;
> > @@ -183,6 +186,16 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
> >  	if (tb[TCA_POLICE_AVRATE])
> >  		new->tcfp_ewma_rate = nla_get_u32(tb[TCA_POLICE_AVRATE]);
> >  
> > +	if (tb[TCA_POLICE_PKTRATE64] && tb[TCA_POLICE_PKTBURST64]) {
> 
> Should we reject if only one is present?

Again, yes I think so.
I'll confirm this with the author too.

> > +		pps = nla_get_u64(tb[TCA_POLICE_PKTRATE64]);
> > +		ppsburst = nla_get_u64(tb[TCA_POLICE_PKTBURST64]);
> > +		if (pps) {
> > +			new->pps_present = true;
> > +			new->tcfp_pkt_burst = PSCHED_TICKS2NS(ppsburst);
> > +			psched_ppscfg_precompute(&new->ppsrate, pps);
> > +		}
> > +	}
> > +
> >  	spin_lock_bh(&police->tcf_lock);
> >  	spin_lock_bh(&police->tcfp_lock);
> >  	police->tcfp_t_c = ktime_get_ns();
> 
> > +void psched_ppscfg_precompute(struct psched_pktrate *r,
> > +			      u64 pktrate64)
> > +{
> > +	memset(r, 0, sizeof(*r));
> > +	r->rate_pkts_ps = pktrate64;
> > +	r->mult = 1;
> > +	/* The deal here is to replace a divide by a reciprocal one
> > +	 * in fast path (a reciprocal divide is a multiply and a shift)
> > +	 *
> > +	 * Normal formula would be :
> > +	 *  time_in_ns = (NSEC_PER_SEC * pkt_num) / pktrate64
> > +	 *
> > +	 * We compute mult/shift to use instead :
> > +	 *  time_in_ns = (len * mult) >> shift;
> > +	 *
> > +	 * We try to get the highest possible mult value for accuracy,
> > +	 * but have to make sure no overflows will ever happen.
> > +	 */
> > +	if (r->rate_pkts_ps > 0) {
> > +		u64 factor = NSEC_PER_SEC;
> > +
> > +		for (;;) {
> > +			r->mult = div64_u64(factor, r->rate_pkts_ps);
> > +			if (r->mult & (1U << 31) || factor & (1ULL << 63))
> > +				break;
> > +			factor <<= 1;
> > +			r->shift++;
> 
> Aren't there helpers somewhere for the reciprocal divide
> pre-calculation?

Now that you mention it, yes.

Looking over reciprocal_divide() I don't think it a good fit here as it
operates on 32bit values, whereas the packet rate is 64 bit.

Packet rate could be changed to a 32 bit entity if we convince ourselves we
don't want more than 2^32 - 1 packets per second (a plausible position
IMHO) - but that leads us to a secondary issue.

The code above is very similar to an existing (long existing)
byte rate variant of this helper - psched_ratecfg_precompute().
And I do think we want to:
a) Support 64-bit byte rates. Indeed such support seems to have
   been added to support 25G use-cases
b) Calculate byte and packet rates the same way

So I feel less and less that reciprocal_divide() is a good fit.
But perhaps I am mistaken.

In the meantime I will take a look to see if a helper common function can
be made to do (64 bit) reciprocal divides for the packet and byte rate
use-cases.  I.e. the common code in psched_ppscfg_precompute() and
psched_ratecfg_precompute().

> > +		}
> > +	}
> > +}
> > +EXPORT_SYMBOL(psched_ppscfg_precompute);
> 
> 

  reply	other threads:[~2021-01-27 11:06 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25 15:18 [PATCH RFC net-next] net/sched: act_police: add support for packet-per-second policing Simon Horman
2021-01-27  2:38 ` Jakub Kicinski
2021-01-27 11:02   ` Simon Horman [this message]
2021-01-27 20:59     ` Jakub Kicinski
2021-01-28 12:02       ` Simon Horman
2021-01-28 12:08     ` Simon Horman
2021-01-28 16:19 ` Ido Schimmel
2021-02-01 12:31   ` Simon Horman
2021-02-01 12:38     ` Simon Horman
2021-02-01 17:41     ` Ido Schimmel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210127110222.GA29081@netronome.com \
    --to=simon.horman@netronome.com \
    --cc=baowen.zheng@corigine.com \
    --cc=jhs@mojatatu.com \
    --cc=jiri@mellanox.com \
    --cc=kuba@kernel.org \
    --cc=louis.peens@netronome.com \
    --cc=netdev@vger.kernel.org \
    --cc=oss-drivers@netronome.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).