Re: [PATCH net v2] ice: fix packet corruption due to extraneous page flip

From: Jacob Keller <jacob.e.keller@intel.com>
To: John Ousterhout <ouster@cs.stanford.edu>
Cc: <anthony.l.nguyen@intel.com>, Jakub Kicinski <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	<intel-wired-lan@lists.osuosl.org>,
	<przemyslaw.kitszel@intel.com>, <netdev@vger.kernel.org>,
	<stable@vger.kernel.org>
Subject: Re: [PATCH net v2] ice: fix packet corruption due to extraneous page flip
Date: Fri, 8 May 2026 15:34:24 -0700
Message-ID: <a2c831d2-6825-4eed-a494-6e254d451667@intel.com>
In-Reply-To: <CAGXJAmxKw-85-=0CX=s33CbfUmJA32=oqpDM=SeV5ZLi04fCOg@mail.gmail.com>

On 5/8/2026 2:59 PM, John Ousterhout wrote:
> On Fri, May 8, 2026 at 2:55 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>>
>> On 5/7/2026 7:37 PM, John Ousterhout wrote:
>>> Correct: this patch only applies to the ice driver before its conversion.
>>>
>>> The patch applies to versions 6.18.27 and 6.12.86. I believe the bug
>>> may also be present in 6.6.137, but the code has a slightly different
>>> structure there (the function ice_put_rx_mbuf doesn't yet exist in
>>> that version) so the patch would need to be reworked a bit.
>>>
>>> This situation isn't all that rare. It isn't a zero-length packet that
>>> triggers it; it seems to happen if a packet uses every available byte
>>> in a buffer, ending precisely at the end of the buffer. When this
>>> happens, the NIC seems to generate an extra zero-length "buffer". This
>>> happens quite frequently (thousands of times per second in some of my
>>> workloads).
>>>
>>> What keeps corruption from happening constantly is that there is only
>>> a problem if the "other half" of the buffer page is still active when
>>> the 0-length buffer is received from the NIC. I suspect that with TCP
>>> this is pretty unlikely: packet buffers get recycled quickly. If the
>>> other half is not in use, then it doesn't matter whether the page gets
>>> "flipped" while processing the 0-length buffer. I ran into this
>>> problem because I was testing Homa under conditions that caused some
>>> packet buffers to stay alive for longer periods of time.
>>>
>>> -John-
>> Right. So I think we need to make sure the patch is cc'd to stable.
>> Technically it doesn't strictly follow any of the 3 rules, but it's
>> closest to 3 with a clarification that there is no upstream equivalent
>> due to the libeth Rx refactor.
> 
> It looks like messages on this chain have been cc-ed to stable since
> your first message. Is that sufficient, or do I need to resubmit (e.g.
> v3) with stable in the cc list?
> 
> -John-

I had added stable to the Cc list to get some visibility, but I suspect
that it won't reach the stable maintainers unless it is sent as a full
patch that can be picked up by patchwork etc. Thus...

It's probably best to send a version to stable along with a comment about
why you can't list an upstream commit ID, following the guidelines from
Documentation/process/stable-rules.rst, specifically the "option 3" rule,
since we can't apply this fix to any main tree, and there is no
equivalent commit already to backport.

It's a bit unorthodox, but I can't see any other solution. It is also
important to be extremely clear in the commit message about why it
deviates from upstream (which was fixed accidentally by the libeth
refactor and page pool conversion) and why a separate commit is necessary.

For now, I would just target the kernels that the patch applies to
easily. Fixing some is better than fixing none. For the 6.6.x series, I can
try to poke someone from Intel to see if we can get something tested.

Thanks,
Jake
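
The failure mode described in the quoted message above can be illustrated
with a small user-space model. This is only a sketch under stated
assumptions, not driver code: struct rx_buf_model and the helpers flip(),
put_buf(), nic_write() and run() are hypothetical names invented here, the
page is assumed to be split into two equal halves, and page reference
counting is left out entirely. It shows why flipping on a zero-length
buffer can point the next receive at a half that an earlier packet still
holds.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SZ 4096u
#define HALF_SZ (PAGE_SZ / 2)

/* Hypothetical model of one Rx buffer backed by a page split in two halves. */
struct rx_buf_model {
	unsigned char page[PAGE_SZ];
	unsigned int page_offset;	/* half offered to the NIC next: 0 or HALF_SZ */
};

/* Switch the buffer to the other half of its page. */
static void flip(struct rx_buf_model *buf)
{
	buf->page_offset ^= HALF_SZ;
}

/*
 * Finish processing one received buffer of `size` bytes.
 * skip_empty == 0: always flip (models the behavior described as buggy).
 * skip_empty == 1: leave the offset alone for zero-length buffers.
 */
static void put_buf(struct rx_buf_model *buf, unsigned int size, int skip_empty)
{
	if (size == 0 && skip_empty)
		return;
	flip(buf);
}

/* Model the NIC writing `len` bytes into the half currently offered. */
static void nic_write(struct rx_buf_model *buf, unsigned char fill,
		      unsigned int len)
{
	assert(len <= HALF_SZ);
	memset(buf->page + buf->page_offset, fill, len);
}

/* Returns 1 if packet A survives intact, 0 if it gets overwritten. */
static int run(int skip_empty)
{
	struct rx_buf_model buf = { .page_offset = 0 };

	/* Packet A exactly fills the first half and stays alive in the stack. */
	nic_write(&buf, 0xAA, HALF_SZ);
	put_buf(&buf, HALF_SZ, skip_empty);	/* offset moves to the second half */

	/* Later the same buffer comes back from the NIC with zero length. */
	put_buf(&buf, 0, skip_empty);

	/* Packet B lands in whichever half the offset now selects. */
	nic_write(&buf, 0xBB, 64);

	return buf.page[0] == 0xAA;
}
```

Under these assumptions, run(0) returns 0 (packet A is overwritten by
packet B) and run(1) returns 1 (packet A survives because the zero-length
buffer did not flip the page).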

Thread overview: 6+ messages
2026-05-07 18:38 [PATCH net v2] ice: fix packet corruption due to extraneous page flip John Ousterhout
2026-05-07 22:11 ` Jacob Keller
2026-05-08  2:37   ` John Ousterhout
2026-05-08 21:55     ` Jacob Keller
2026-05-08 21:59       ` John Ousterhout
2026-05-08 22:34         ` Jacob Keller [this message]
