regressions.lists.linux.dev archive mirror
From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: "Lifshits, Vitaly" <vitaly.lifshits@intel.com>
Cc: Christian Heusel <christian@heusel.eu>,
	Paul Menzel <pmenzel@molgen.mpg.de>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	Przemek Kitszel <przemyslaw.kitszel@intel.com>,
	netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	regressions@lists.linux.dev, stable@vger.kernel.org,
	Sasha Levin <sashal@kernel.org>
Subject: Re: [Intel-wired-lan] [REGRESSION] e1000e heavy packet loss on Meteor Lake - 6.14.2
Date: Thu, 19 Jun 2025 16:49:18 +0200
Message-ID: <aFQjby7mQxvShBm7@mail-itl>
In-Reply-To: <9fb5f018-7333-421b-8e2d-1f6eb98cffaa@intel.com>

[-- Attachment #1: Type: text/plain, Size: 6656 bytes --]

On Thu, Jun 19, 2025 at 03:20:35PM +0300, Lifshits, Vitaly wrote:
> 
> 
> On 6/18/2025 4:41 PM, Christian Heusel wrote:
> > On 25/06/18 03:28PM, Marek Marczykowski-Górecki wrote:
> > > On Fri, May 09, 2025 at 02:17:32AM +0200, Marek Marczykowski-Górecki wrote:
> > > > On Fri, May 09, 2025 at 01:28:36AM +0200, Marek Marczykowski-Górecki wrote:
> > > > > On Fri, May 09, 2025 at 01:13:28AM +0200, Paul Menzel wrote:
> > > > > > Dear Marek, dear Vitaly,
> > > > > > 
> > > > > > 
> > > > > > Am 09.05.25 um 00:41 schrieb Marek Marczykowski-Górecki:
> > > > > > > > On Thu, May 08, 2025 at 09:26:18AM +0300, Lifshits, Vitaly wrote:
> > > > > > > > On 4/21/2025 4:28 PM, Marek Marczykowski-Górecki wrote:
> > > > > > > > > On Mon, Apr 21, 2025 at 03:19:12PM +0200, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > On Mon, Apr 21, 2025 at 03:44:02PM +0300, Lifshits, Vitaly wrote:
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > On 4/16/2025 3:43 PM, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > > > On Wed, Apr 16, 2025 at 03:09:39PM +0300, Lifshits, Vitaly wrote:
> > > > > > > > > > > > > Can you please also share the output of ethtool -i? I would like to know the
> > > > > > > > > > > > > NVM version that you have on your device.
> > > > > > > > > > > > 
> > > > > > > > > > > > driver: e1000e
> > > > > > > > > > > > version: 6.14.1+
> > > > > > > > > > > > firmware-version: 1.1-4
> > > > > > > > > > > > expansion-rom-version:
> > > > > > > > > > > > bus-info: 0000:00:1f.6
> > > > > > > > > > > > supports-statistics: yes
> > > > > > > > > > > > supports-test: yes
> > > > > > > > > > > > supports-eeprom-access: yes
> > > > > > > > > > > > supports-register-dump: yes
> > > > > > > > > > > > supports-priv-flags: yes
> > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Your firmware version is not the latest, can you check with the board
> > > > > > > > > > > manufacturer if there is a BIOS update to your system?
> > > > > > > > > > 
> > > > > > > > > > I can check, but still, it's a regression in the Linux driver - old
> > > > > > > > > > kernel did work perfectly well on this hw. Maybe new driver tries to use
> > > > > > > > > > some feature that is missing (or broken) in the old firmware?
> > > > > > > > > 
> > > > > > > > > A little bit of context: I'm maintaining the kernel package for a Qubes
> > > > > > > > > OS distribution. While I can try to update firmware on my test system, I
> > > > > > > > > have no influence on what hardware users will use this kernel, and
> > > > > > > > > which firmware version they will use (and whether all the vendors
> > > > > > > > > provide newer firmware at all). I cannot ship a kernel that is known
> > > > > > > > > to break network on some devices.
> > > > > > > > > 
> > > > > > > > > > > Also, you mentioned that on another system this issue doesn't reproduce, do
> > > > > > > > > > > they have the same firmware version?
> > > > > > > > > > 
> > > > > > > > > > The other one has also 1.1-4 firmware. And I re-checked, e1000e from
> > > > > > > > > > 6.14.2 works fine there.
> > > > > > 
> > > > > > > > Thank you for your detailed feedback and for providing the requested
> > > > > > > > information.
> > > > > > > > 
> > > > > > > > We have conducted extensive testing of this patch across multiple systems
> > > > > > > > and have not observed any packet loss issues. Upon comparing the mentioned
> > > > > > > > setups, we noted that while the LAN controller is similar, the CPU differs.
> > > > > > > > We believe that the issue may be related to transitions in the CPU's low
> > > > > > > > power states.
> > > > > > > > 
> > > > > > > > Consequently, we kindly request that you disable the CPU low power state
> > > > > > > > transitions in the S0 system state and verify if the issue persists. You can
> > > > > > > > disable this in the kernel parameters on the command line with idle=poll.
> > > > > > > > Please note that this command is intended for debugging purposes only, as it
> > > > > > > > may result in higher power consumption.
> > > > > > > 
> > > > > > > I tried with idle=poll, and it didn't help, I still see a lot of packet
> > > > > > > losses. But I can also confirm that idle=poll makes the system use
> > > > > > > significantly more power (previously at 25-30W, with this option stays
> > > > > > > at about 42W).
> > > > > > > 
> > > > > > > Is there any other info I can provide, enable some debug features or
> > > > > > > something?
> > > > > > > 
> > > > > > > > I see the problem is with receiving packets - in my simple ping test,
> > > > > > > > the ping target sees all the echo requests (and responds to them), but
> > > > > > > > the responses aren't reaching ping back (and are not visible on tcpdump
> > > > > > > > on the problematic system either).
> > > > > > 
> > > > > > > As the cause is still unclear, can the commit please be reverted in the
> > > > > > > master branch to adhere to Linux's no-regression policy, so that it can
> > > > > > > also be reverted from the stable series?
> > > > > > 
> > > > > > Marek, did you also test 6.15 release candidates?
> > > > > 
> > > > > The last test I did was on 6.15-rc3. I can re-test on -rc5.
> > > > 
> > > > Same with 6.15-rc5.
> > > 
> > > > And the same issue still applies to 6.16-rc2. FWIW Qubes OS kernel has
> > > > this buggy patch reverted and nobody complained (contrary to the version
> > > > with the patch included). Should I submit the revert patch?
> 
> It is not a good idea to revert this patch as most of the systems will
> encounter the original issues (PHY access and packet loss). The reason I
> first introduced this patch was because big vendors reported the packet loss
> issue. You can refer to the following sightings:
> https://answers.launchpad.net/ubuntu/+question/816003
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2066064
> https://bugzilla.kernel.org/show_bug.cgi?id=218869

It would be useful to have any of those links in the original commit...

> As an intermediate solution, we can use a private flag to make it
> configurable. I will share with you a patch that might fix the issue
> on your system that I would like you to try.

Yes, that patch works :)

> FYI, we are currently investigating a similar issue that seems to be due to
> a misconfiguration of the system firmware.

Can you share some details? I can forward the info to firmware
developers for this system (it's Dasharo - coreboot-based firmware).

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
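
For anyone reproducing the simple ping test described in the quoted text
above, the receive-side loss can be quantified from ping's summary line.
The following is only a minimal sketch: it assumes the summary format
printed by iputils ping, and the function names and the sample output are
hypothetical illustrations, not something taken from this thread.

```python
import re

# Summary line as printed by iputils ping (assumed format), e.g.
# "10 packets transmitted, 3 received, 70% packet loss, time 9012ms"
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def parse_ping_summary(output: str) -> tuple[int, int]:
    """Return (transmitted, received) from ping's summary line."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2))


def loss_percent(transmitted: int, received: int) -> float:
    """Percentage of echo requests for which no reply was received."""
    if transmitted == 0:
        return 0.0
    return 100.0 * (transmitted - received) / transmitted


# Illustrative sample only, NOT output captured from the affected system.
SAMPLE = """\
--- 192.0.2.1 ping statistics ---
10 packets transmitted, 3 received, 70% packet loss, time 9012ms
"""

if __name__ == "__main__":
    tx, rx = parse_ping_summary(SAMPLE)
    print(f"{tx} sent, {rx} received, {loss_percent(tx, rx):.1f}% loss")
```

Running the same ping command against the same target on a kernel with and
without the patch, and comparing the two percentages, gives a number that is
easier to put in a report than "a lot of packet losses".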

Thread overview: 20+ messages
2025-04-14 12:18 [REGRESSION] e1000e heavy packet loss on Meteor Lake - 6.14.2 Marek Marczykowski-Górecki
2025-04-14 12:38 ` Lifshits, Vitaly
2025-04-14 12:58   ` Marek Marczykowski-Górecki
2025-04-14 13:04     ` Lifshits, Vitaly
2025-04-14 13:28       ` Marek Marczykowski-Górecki
2025-04-16 12:09       ` [Intel-wired-lan] " Lifshits, Vitaly
2025-04-16 12:43         ` Marek Marczykowski-Górecki
2025-04-21 12:44           ` Lifshits, Vitaly
2025-04-21 13:19             ` Marek Marczykowski-Górecki
2025-04-21 13:28               ` Marek Marczykowski-Górecki
2025-05-08  6:26                 ` Lifshits, Vitaly
2025-05-08 22:41                   ` Marek Marczykowski-Górecki
2025-05-08 23:13                     ` Paul Menzel
2025-05-08 23:28                       ` Marek Marczykowski-Górecki
2025-05-09  0:17                         ` Marek Marczykowski-Górecki
2025-06-18 13:28                           ` Marek Marczykowski-Górecki
2025-06-18 13:41                             ` Christian Heusel
2025-06-19 12:20                               ` Lifshits, Vitaly
2025-06-19 14:49                                 ` Marek Marczykowski-Górecki [this message]
2025-04-14 14:27     ` Marek Marczykowski-Górecki
