netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Aaron Conole <aconole@redhat.com>
To: "Nicholas Piggin" <npiggin@gmail.com>
Cc: <netdev@vger.kernel.org>,  <dev@openvswitch.org>,
	 "Pravin B Shelar" <pshelar@ovn.org>,
	 "Eelco Chaudron" <echaudro@redhat.com>,
	 "Ilya Maximets" <imaximet@redhat.com>,
	 "Flavio Leitner" <fbl@redhat.com>,
	"Paolo Abeni" <pabeni@redhat.com>,
	 "Jakub Kicinski" <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	 "Eric Dumazet" <edumazet@google.com>
Subject: Re: [PATCH 0/7] net: openvswitch: Reduce stack usage
Date: Fri, 20 Oct 2023 13:04:13 -0400	[thread overview]
Message-ID: <f7til71dy42.fsf@redhat.com> (raw)
In-Reply-To: <CW62DEF1LEWB.3KK4CQJNGIRYO@wheely> (Nicholas Piggin's message of "Thu, 12 Oct 2023 11:19:28 +1000")

"Nicholas Piggin" <npiggin@gmail.com> writes:

> On Wed Oct 11, 2023 at 11:23 PM AEST, Aaron Conole wrote:
>> Nicholas Piggin <npiggin@gmail.com> writes:
>>
>> > Hi,
>> >
>> > I'll post this out again to keep discussion going. Thanks all for the
>> > testing and comments so far.
>>
>> Thanks for the update - did you mean for this to be tagged RFC as well?
>
> Yeah, it wasn't intended for merge with no RB or tests of course.
> I intended to tag it RFC v2.

I only did a basic test with this because of some other stuff, and I
only tested 1, 2, and 3.  I didn't see any real performance changes, but
that is only with a simple port-port test.  I plan to do some additional
testing with some recursive calls.  That will also help to understand
the limits a bit.

That said, I'm very nervous about the key allocator, especially if it is
possible that it runs out.  We probably will need the limit to be
bigger, but I want to get a worst-case flow from OVN side to understand.

>>
>> I don't see any performance data with the deployments on x86_64 and
>> ppc64le that cause these stack overflows.  Are you able to provide the
>> impact on ppc64le and x86_64?
>
> Don't think it'll be easy but they are not be pushing such rates
> so it wouldn't say much.  If you want to show the worst case, those
> tput and latency microbenchmarks should do it.
>
> It's the same tradeoff and reasons the per-cpu key allocator was
> added in the first place, presumably. Probably more expensive than
> stack, but similar order of magnitude O(cycles) vs slab which is
> probably O(100s cycles).
>
>> I guess the change probably should be tagged as -next since it doesn't
>> really have a specific set of commits it is "fixing."  It's really like
>> a major change and shouldn't really go through stable trees, but I'll
>> let the maintainers tell me off if I got it wrong.
>
> It should go upstream first if anything. I thought it was relatively
> simple and elegant to reuse the per-cpu key allocator though :(
>
> It is a kernel crash, so we need something for stable. But In a case
> like this there's not one single problem. Linux kernel stack use has
> always been pretty dumb - "don't use too much", for some values of
> too much, and just cross fingers config and compiler and worlkoad
> doesn't hit some overflow case.
>
> And powerpc has always used more stack x86, so probably it should stay
> one power-of-two larger to be safe. And that may be the best fix for
> -stable.

Given the reply from David (with msg-id:
<ff6cd12e28894f158d9a6c9f7157487f@AcuMS.aculab.com>), are there other
things we can look at with respect to the compiler as well?

> But also, ovs uses too much stack. Look at the stack sizes in the first
> RFC patch, and ovs takes the 5 largest. That's because it has always
> been the practice to not put large variables on stack, and when you're
> introducing significant recursion that puts extra onus on you to be
> lean. Even if it costs a percent. There are probably lots of places in
> the kernel that could get a few cycles by sticking large structures on
> stack, but unfortunately we can't all do that.

Well, OVS operated this way for at least 6 years, so it isn't a recent
thing.  But we should look at it.

I also wonder if we need to recurse in the internal devices, or if we
shouldn't just push the skb into the packet queue.  That will cut out
1/3 of the stack frame that you reported originally, and then when doing
the xmit, will cut out 2/3rds. I have no idea what the performance
impact hit there might be.  Maybe it looks more like a latency hit
rather than a throughput hit, but just speculating.

> Thanks,
> Nick


  parent reply	other threads:[~2023-10-20 17:04 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-11  3:43 [PATCH 0/7] net: openvswitch: Reduce stack usage Nicholas Piggin
2023-10-11  3:43 ` [PATCH 1/7] net: openvswitch: generalise the per-cpu flow key allocation stack Nicholas Piggin
2023-10-11  3:43 ` [PATCH 2/7] net: openvswitch: Use flow key allocator in ovs_vport_receive Nicholas Piggin
2023-10-11  3:43 ` [PATCH 3/7] openvswitch: reduce stack usage in do_execute_actions Nicholas Piggin
2023-10-11  3:43 ` [PATCH 4/7] net: openvswitch: Reduce push_nsh stack usage Nicholas Piggin
2023-10-11  3:43 ` [PATCH 5/7] net: openvswitch: uninline action execution Nicholas Piggin
2023-10-11  3:43 ` [PATCH 6/7] net: openvswitch: uninline ovs_fragment to control stack usage Nicholas Piggin
2023-10-11  3:43 ` [PATCH 7/7] net: openvswitch: Reduce stack usage in ovs_dp_process_packet Nicholas Piggin
2023-10-11 12:22 ` [PATCH 0/7] net: openvswitch: Reduce stack usage Ilya Maximets
2023-10-12  0:08   ` Nicholas Piggin
2023-10-11 13:23 ` Aaron Conole
2023-10-12  1:19   ` Nicholas Piggin
2023-10-13  8:27     ` David Laight
2023-10-20 17:04     ` Aaron Conole [this message]
2023-10-25  4:06       ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f7til71dy42.fsf@redhat.com \
    --to=aconole@redhat.com \
    --cc=davem@davemloft.net \
    --cc=dev@openvswitch.org \
    --cc=echaudro@redhat.com \
    --cc=edumazet@google.com \
    --cc=fbl@redhat.com \
    --cc=imaximet@redhat.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=npiggin@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=pshelar@ovn.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).