From: Florian Westphal <fw@strlen.de>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Florian Westphal <fw@strlen.de>,
"liujian (CE)" <liujian56@huawei.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"edumazet@google.com" <edumazet@google.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"Wangkefeng (Kevin)" <wangkefeng.wang@huawei.com>,
"weiyongjun (A)" <weiyongjun1@huawei.com>
Subject: Re: Question about ip_defrag
Date: Wed, 30 Aug 2017 13:58:20 +0200
Message-ID: <20170830115820.GC9993@breakpoint.cc>
In-Reply-To: <20170830125843.250c91c1@redhat.com>

Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> > I take 2) back. It's wrong to do this; for large NR_CPU values it
> > would even overflow.
>
> Alternative solution 3:
> Why do we want to maintain a (4MBytes) memory limit, across all CPUs?
> Couldn't we just allow each CPU to have a memory limit?

Consider ipv4, ipv6, nf ipv6 defrag, 6lowpan, and 8k cpus... A per-CPU
limit multiplied across all of those would render any limit useless.

> > > To me it looks like we/I have been using the wrong API for comparing
> > > against percpu_counters. I guess we should have used __percpu_counter_compare().
> >
> > Are you sure? For liujian's use case (64 cores) it looks like we would
> > always fall through to percpu_counter_sum(), so we eat the spinlock_irqsave
> > cost for all compares.
> >
> > Before we entertain this we should consider reducing frag_percpu_counter_batch
> > to a smaller value.
>
> Yes, I agree, we really need to lower frag_percpu_counter_batch.
> As you say, otherwise the __percpu_counter_compare() call will be useless
> (on systems with >= 32 CPUs).
>
> I think the bug is in frag_mem_limit(). It just reads the global
> counter (fbc->count), without considering that other CPUs can each hold
> up to 130K that hasn't been subtracted yet (given the 3M low limit, this
> becomes dangerous at >= 24 CPUs). __percpu_counter_compare() does the
> right thing: it takes into account the number of (online) CPUs and the
> batch size to account for this.

Right, I think we should at the very least use __percpu_counter_compare()
before denying a new frag queue allocation request.

I'll create a patch.