From: "dust.li" <dust.li@linux.alibaba.com>
To: Julian Anastasov <ja@ssi.bg>
Cc: yunhong-cgl jiang <xintian1976@gmail.com>,
horms@verge.net.au, netdev@vger.kernel.org,
lvs-devel@vger.kernel.org, Yunhong Jiang <yunhjiang@ebay.com>
Subject: Re: Long delay on estimation_timer causes packet latency
Date: Fri, 4 Dec 2020 21:59:02 +0800
Message-ID: <20201204135902.GA14129@linux.alibaba.com>
In-Reply-To: <47e05b8-a4fc-24a1-e796-2a44cf7bbd77@ssi.bg>
On Fri, Dec 04, 2020 at 07:42:56AM +0200, Julian Anastasov wrote:
>
> Hello,
>
>On Fri, 4 Dec 2020, dust.li wrote:
>
>>
>> On 12/3/20 4:48 PM, Julian Anastasov wrote:
>> >
>> > - work will use spin_lock_bh(&s->lock) to protect the
>> > entries, we do not want delays between /proc readers and
>> > the work if using mutex. But _bh locks stop timers and
>> > networking for short time :( Not sure yet if just spin_lock
>> > is safe for both /proc and estimator's work.
>
> Here stopping BH may not be so fatal if some
>CPUs are used for networking and others for workqueues.
>
>> Thanks for sharing your thoughts!
>>
>> I think it's a good idea to split the est_list into different
>> slots. I believe it will dramatically reduce the delay brought
>> by estimation.
>
> 268ms/64 => 4ms average. As the estimation with single
>work does not utilize many CPUs simultaneously, this can be a
>problem for 300000-400000 services but this looks crazy.
Yes. Consider the largest server we use now, which has 256 hyper-threads
across 4 NUMA nodes. Even that should not be a big problem.
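
The arithmetic behind these numbers can be checked with a small userspace
sketch. This is only an illustration, not kernel code: the helper names
per_slot_ms() and cpu_share_percent() are hypothetical, and it assumes the
268 ms pass is split evenly across 64 slots and repeats every 2000 ms.

```c
#include <assert.h>

/* Average time per slot when one estimation pass is split evenly
 * across `slots` slots (assumption: every slot holds the same number
 * of estimators). */
static double per_slot_ms(double pass_ms, unsigned int slots)
{
	assert(slots > 0);
	return pass_ms / slots;
}

/* Share of one CPU hyper-thread, in percent, spent on estimation when
 * a pass costing `pass_ms` runs once every `period_ms`. */
static double cpu_share_percent(double pass_ms, double period_ms)
{
	assert(period_ms > 0.0);
	return 100.0 * pass_ms / period_ms;
}
```

With the figures from this thread, per_slot_ms(268.0, 64) is about 4.19 ms
and cpu_share_percent(268.0, 2000.0) is 13.4. Splitting shortens each
uninterrupted run, but the total share of the CPU stays the same.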
>
>> My only concern is the cost of the estimation when the number of
>> services is large. Splitting the est_list won't reduce the real
>> work to do.
>>
>> In our case, each estimation costs at most 268ms/2000ms, which is
>> about 13% of one CPU hyper-thread, and this should be a common case
>> in a large K8S cluster with lots of services.
>>
>> Since the estimation is not needed in our environment at all, it's
>> just a waste of CPU resources. Have you ever considered adding a
>> switch to let the user turn the estimator off?
>
> No problem to add a sysctl var for this; we usually add a function
>to check it, which can be used in ip_vs_in_stats, ip_vs_out_stats,
>ip_vs_conn_stats. If the switch can be changed at any time, what should
>we do? Add/Del est entries as normal but do not start the
>delayed work if the flag disables stats. When the flag is enabled, counters
>will increase and we will start the delayed work.
Yes, this would be perfect for me!
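
The behaviour described above can be modelled in userspace roughly as
follows. This is a minimal sketch under stated assumptions, not the IPVS
implementation: struct est_model and every function name here are
hypothetical, a boolean stands in for the real delayed work, and locking
is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the estimator switch. */
struct est_model {
	bool stats_enabled;    /* stands in for the proposed sysctl flag */
	bool work_scheduled;   /* stands in for a pending delayed work */
	unsigned int entries;  /* number of registered est entries */
	unsigned long packets; /* stands in for the stats counters */
};

/* Arm the work only if stats are enabled and there is something to do. */
static void est_maybe_schedule(struct est_model *m)
{
	if (m->stats_enabled && m->entries > 0)
		m->work_scheduled = true;
}

/* Entries are added and deleted as normal, whatever the flag says. */
static void est_add(struct est_model *m)
{
	m->entries++;
	est_maybe_schedule(m);
}

static void est_del(struct est_model *m)
{
	assert(m->entries > 0);
	m->entries--;
}

/* Packet path: check the flag first, so nothing is counted and no
 * work is started while stats are disabled. */
static void est_count_packet(struct est_model *m)
{
	if (!m->stats_enabled)
		return;
	m->packets++;
	est_maybe_schedule(m);
}

/* One run of the work: it re-arms itself only while still enabled. */
static void est_work_run(struct est_model *m)
{
	m->work_scheduled = false;
	est_maybe_schedule(m);
}
```

In this model, flipping the flag at any time is safe: once it is cleared,
the next run of the work simply does not re-arm, and once it is set, the
first counted packet starts the work again.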