All of lore.kernel.org
 help / color / mirror / Atom feed
From: Aaron Conole <aconole@redhat.com>
To: "Nicholas Piggin" <npiggin@gmail.com>
Cc: <netdev@vger.kernel.org>,  <dev@openvswitch.org>,
	 "Pravin B Shelar" <pshelar@ovn.org>,
	 "Eelco Chaudron" <echaudro@redhat.com>,
	 "Ilya Maximets" <imaximet@redhat.com>,
	 "Flavio Leitner" <fbl@redhat.com>,
	"Paolo Abeni" <pabeni@redhat.com>,
	 "Jakub Kicinski" <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	 "Eric Dumazet" <edumazet@google.com>
Subject: Re: [PATCH 0/7] net: openvswitch: Reduce stack usage
Date: Fri, 20 Oct 2023 13:04:13 -0400	[thread overview]
Message-ID: <f7til71dy42.fsf@redhat.com> (raw)
In-Reply-To: <CW62DEF1LEWB.3KK4CQJNGIRYO@wheely> (Nicholas Piggin's message of "Thu, 12 Oct 2023 11:19:28 +1000")

"Nicholas Piggin" <npiggin@gmail.com> writes:

> On Wed Oct 11, 2023 at 11:23 PM AEST, Aaron Conole wrote:
>> Nicholas Piggin <npiggin@gmail.com> writes:
>>
>> > Hi,
>> >
>> > I'll post this out again to keep discussion going. Thanks all for the
>> > testing and comments so far.
>>
>> Thanks for the update - did you mean for this to be tagged RFC as well?
>
> Yeah, it wasn't intended for merge with no RB or tests of course.
> I intended to tag it RFC v2.

I only did a basic test with this because of some other stuff, and I
only tested 1, 2, and 3.  I didn't see any real performance changes, but
that is only with a simple port-port test.  I plan to do some additional
testing with some recursive calls.  That will also help to understand
the limits a bit.

That said, I'm very nervous about the key allocator, especially if it is
possible that it runs out.  We probably will need the limit to be
bigger, but I want to get a worst-case flow from OVN side to understand.

>>
>> I don't see any performance data with the deployments on x86_64 and
>> ppc64le that cause these stack overflows.  Are you able to provide the
>> impact on ppc64le and x86_64?
>
> Don't think it'll be easy but they are not be pushing such rates
> so it wouldn't say much.  If you want to show the worst case, those
> tput and latency microbenchmarks should do it.
>
> It's the same tradeoff and reasons the per-cpu key allocator was
> added in the first place, presumably. Probably more expensive than
> stack, but similar order of magnitude O(cycles) vs slab which is
> probably O(100s cycles).
>
>> I guess the change probably should be tagged as -next since it doesn't
>> really have a specific set of commits it is "fixing."  It's really like
>> a major change and shouldn't really go through stable trees, but I'll
>> let the maintainers tell me off if I got it wrong.
>
> It should go upstream first if anything. I thought it was relatively
> simple and elegant to reuse the per-cpu key allocator though :(
>
> It is a kernel crash, so we need something for stable. But In a case
> like this there's not one single problem. Linux kernel stack use has
> always been pretty dumb - "don't use too much", for some values of
> too much, and just cross fingers config and compiler and worlkoad
> doesn't hit some overflow case.
>
> And powerpc has always used more stack x86, so probably it should stay
> one power-of-two larger to be safe. And that may be the best fix for
> -stable.

Given the reply from David (with msg-id:
<ff6cd12e28894f158d9a6c9f7157487f@AcuMS.aculab.com>), are there other
things we can look at with respect to the compiler as well?

> But also, ovs uses too much stack. Look at the stack sizes in the first
> RFC patch, and ovs takes the 5 largest. That's because it has always
> been the practice to not put large variables on stack, and when you're
> introducing significant recursion that puts extra onus on you to be
> lean. Even if it costs a percent. There are probably lots of places in
> the kernel that could get a few cycles by sticking large structures on
> stack, but unfortunately we can't all do that.

Well, OVS operated this way for at least 6 years, so it isn't a recent
thing.  But we should look at it.

I also wonder if we need to recurse in the internal devices, or if we
shouldn't just push the skb into the packet queue.  That will cut out
1/3 of the stack frame that you reported originally, and then when doing
the xmit, will cut out 2/3rds. I have no idea what the performance
impact hit there might be.  Maybe it looks more like a latency hit
rather than a throughput hit, but just speculating.

> Thanks,
> Nick


  parent reply	other threads:[~2023-10-20 17:04 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-11  3:43 [PATCH 0/7] net: openvswitch: Reduce stack usage Nicholas Piggin
2023-10-11  3:43 ` [PATCH 1/7] net: openvswitch: generalise the per-cpu flow key allocation stack Nicholas Piggin
2023-10-11  3:43 ` [PATCH 2/7] net: openvswitch: Use flow key allocator in ovs_vport_receive Nicholas Piggin
2023-10-11  3:43 ` [PATCH 3/7] openvswitch: reduce stack usage in do_execute_actions Nicholas Piggin
2023-10-11  3:43 ` [PATCH 4/7] net: openvswitch: Reduce push_nsh stack usage Nicholas Piggin
2023-10-11  3:43 ` [PATCH 5/7] net: openvswitch: uninline action execution Nicholas Piggin
2023-10-11  3:43 ` [PATCH 6/7] net: openvswitch: uninline ovs_fragment to control stack usage Nicholas Piggin
2023-10-11  3:43 ` [PATCH 7/7] net: openvswitch: Reduce stack usage in ovs_dp_process_packet Nicholas Piggin
2023-10-11 12:22 ` [PATCH 0/7] net: openvswitch: Reduce stack usage Ilya Maximets
2023-10-12  0:08   ` Nicholas Piggin
2023-10-11 13:23 ` Aaron Conole
2023-10-12  1:19   ` Nicholas Piggin
2023-10-13  8:27     ` David Laight
2023-10-20 17:04     ` Aaron Conole [this message]
2023-10-25  4:06       ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f7til71dy42.fsf@redhat.com \
    --to=aconole@redhat.com \
    --cc=davem@davemloft.net \
    --cc=dev@openvswitch.org \
    --cc=echaudro@redhat.com \
    --cc=edumazet@google.com \
    --cc=fbl@redhat.com \
    --cc=imaximet@redhat.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=npiggin@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=pshelar@ovn.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.