netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: Paolo Abeni <pabeni@redhat.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Jozsef Kadlecsik <kadlec@netfilter.org>,
	"netfilter-devel@vger.kernel.org"
	<netfilter-devel@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: nft_queues.sh failures
Date: Thu, 22 May 2025 16:35:16 +0200	[thread overview]
Message-ID: <aC82JEehNShMjW8-@strlen.de> (raw)
In-Reply-To: <649e3d9a-48b4-4660-99c5-1609e3cd06cf@redhat.com>

Paolo Abeni <pabeni@redhat.com> wrote:
> On 5/22/25 3:53 PM, Jakub Kicinski wrote:
> > On Thu, 22 May 2025 12:09:01 +0200 Paolo Abeni wrote:
> >> Recently the nipa CI infra went through some tuning, and the mentioned
> >> self-test now often fails.
> >>
> >> As I could not find any applied or pending relevant change, I have a
> >> vague suspect that the timeout applied to the server command now
> >> triggers due to different timing. Could you please have a look?
> > 
> > Oh, I was just staring at:
> > https://lore.kernel.org/all/20250522031835.4395-1-shiming.cheng@mediatek.com/
> > do you think it's not that?

It is, thanks Jakub!

With my updated test case, it does pass, but see for yourself:
# PASS: sctp and nfqueue in forward chain (duration: 118s)
# PASS: sctp and nfqueue in output chain with GSO (duration: 56s)

(the old timeout was 60s, so this would FAIL without the updated test).

plain net-next/main:
# PASS: sctp and nfqueue in forward chain (duration: 42s)
# PASS: sctp and nfqueue in output chain with GSO (duration: 21s)

I haven't debugged yet but i'd guess that some packets get corrupted
when nfqueue segments gso skbs, thus forcing retransmits.

> It's not obvious to me. The failing test case is:
> 
> tcp via loopback and re-queueing
> 
> There should be no S/W segmentation there, as the loopback interface
> exposes TSO.

The nfqueue test also forces software segmentation, even for lo, so that
the userspace listener gets non-aggregated packets (its possible to
disable this so 'large packets' get queued to userspace, this is also
tested for tcp by this selftest).

> @Florian, I'm sorry I should have mentioned explicitly the failing test
> before. Sample failures:
> 
> https://netdev-3.bots.linux.dev/vmksft-nf/results/131921/2-nft-queue-sh/stdout
> https://netdev-3.bots.linux.dev/vmksft-nf/results/131741/2-nft-queue-sh/stdout

both show sctp failing:

# PASS: tcp via loopback and re-queueing

---> tcp loopback passes

# 2025/05/22 05:11:46 socat[32441] E write(7, 0x55ca6b34e000, 8192): Connection reset by peer
# cmp: EOF on /tmp/tmp.1LVNFztWUK after byte 50208768, in line 1
# FAIL: sctp forward: input and output file differ
#  Input file-rw------- 1 root root 209715200 May 22 05:10 /tmp/tmp.teqIUO7Jfh
# Output file-rw------- 1 root root 50208768 May 22 05:11 /tmp/tmp.1LVNFztWUK
# 2025/05/22 05:12:46 socat[32459] E write(7, 0x561110e23000, 8192): Connection reset by peer
# cmp: EOF on /tmp/tmp.1LVNFztWUK after byte 36528128, in line 1
# FAIL: sctp output: input and output file differ

so its sctp+nfqueue thats failing.
And it does seem to be related to the pending patch pointed out by
Jakub.
> > I'll hide both that patch and Florian's fix from the queue for now, 
> > for a test.
> 
> Fine by me.

I'll resend the update tomorrow, keeping the OLD timeout of 60s, I think
keeping track of the 'transmit time' in the test log archives could be
useful in the future.

> I was wondering about this timeout specifically:
> 
> https://elixir.bootlin.com/linux/v6.15-rc7/source/tools/testing/selftests/net/netfilter/nft_queue.sh#L329

5s isn't so short, lo is supposed to be fast (the userspace prog
asks for GSO packets, so no s/w segmentation should happen but even
with GSO segmentation I would not expect it to fail).

I would prefer to keep the 5s for tcp; I don't recall this was a problem
in the past.

  reply	other threads:[~2025-05-22 14:39 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-22 10:09 nft_queues.sh failures Paolo Abeni
2025-05-22 13:53 ` Jakub Kicinski
2025-05-22 14:10   ` Paolo Abeni
2025-05-22 14:35     ` Florian Westphal [this message]
2025-05-22 14:46 ` [PATCH net] selftests: netfilter: nft_queue.sh: double sctp test timeout Florian Westphal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aC82JEehNShMjW8-@strlen.de \
    --to=fw@strlen.de \
    --cc=kadlec@netfilter.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).