From: Darren Kenny <darren.kenny@oracle.com>
To: "Linux regression tracking (Thorsten Leemhuis)"
	<regressions@leemhuis.info>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Jason Wang <jasowang@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	virtualization@lists.linux.dev,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Linux kernel regressions list <regressions@lists.linux.dev>
Subject: Re: [PATCH RFC 0/3] Revert "virtio_net: rx enable premapped mode by default"
Date: Thu, 15 Aug 2024 11:22:09 +0100
Message-ID: <m2r0aqrsq6.fsf@oracle.com>
In-Reply-To: <a6ec1c84-428f-41b7-9a57-183f2aeca289@leemhuis.info>


On Thursday, 2024-08-15 at 09:14:27 +02, Linux regression tracking (Thorsten Leemhuis) wrote:
> [side note: the message I have been replying to, at least when downloaded
> from lore, has two message-ids, one of them identical to an older
> message, which is why this looks odd in the lore archives:
> https://lore.kernel.org/all/20240511031404.30903-1-xuanzhuo@linux.alibaba.com/]
>

Yes, I saw that too, hence I responded to patch 1 in the series, rather
than the cover letter.

> On 14.08.24 08:59, Michael S. Tsirkin wrote:
>> Note: Xuan Zhuo, if you have a better idea, pls post an alternative
>> patch.
>> 
>> Note2: untested, posting for Darren to help with testing.
>> 
>> Turns out unconditionally enabling premapped 
>> virtio-net leads to a regression on VM with no ACCESS_PLATFORM, and with
>> sysctl net.core.high_order_alloc_disable=1
>> 
>> where crashes and scp failures were reported (scp a file 100M in size to VM):
>> [...]
>
> TWIMC, there is a regression report on lore and I wonder if this might
> be related or the same problem, as it also mentioned a "get_swap_device:
> Bad swap file entry" error:
> https://bugzilla.kernel.org/show_bug.cgi?id=219154
>

I took a look at the stack traces; they don't look similar to what I was
seeing, but I wasn't running with ASAN enabled in the kernel.

Most of the traces that I was seeing looked like those in the e-mail
from Si-Wei:

  https://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com/

We could trigger it only when the sysctl value was set like:

- net.core.high_order_alloc_disable=1

And it would immediately panic on any relatively large download, e.g.
wget of a few RPMs, or similar.
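
For anyone wanting to confirm that a guest is in this configuration before
testing, here is a minimal sketch. It assumes the usual procfs mapping of
the sysctl name to /proc/sys/net/core/high_order_alloc_disable; the helper
name is purely illustrative and not part of any existing tool.

```python
from pathlib import Path

# Assumed procfs location of the sysctl named above.
SYSCTL_PATH = Path("/proc/sys/net/core/high_order_alloc_disable")


def high_order_alloc_disabled(path: Path = SYSCTL_PATH) -> bool:
    """Return True if the sysctl file at `path` holds a non-zero value."""
    try:
        return int(path.read_text().strip()) != 0
    except (OSError, ValueError):
        # Missing file (e.g. a kernel without this sysctl) or unparsable
        # content: report the condition as not set.
        return False


if __name__ == "__main__":
    state = "set" if high_order_alloc_disabled() else "not set"
    print(f"net.core.high_order_alloc_disable is {state}")
```

This only reports the current value; it does not change it, so it can be
run without root.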

Best I can suggest would be to try reverting the premapped-mode commits in
a custom kernel and see if that fixes this problem too.

Thanks,

Darren.

> To quote:
>
> """
> Hello,
>
> I've encountered repeated crashes or freezes when a KVM VM receives
> large amounts of data over the network while the system is under memory
> load and performing I/O operations. The crashes sometimes occur in the
> filesystem code (ext4 and btrfs, at least), but they also happen in
> other locations.
>
> This issue occurs on my custom builds using kernel versions v6.10 to
> v6.11-rc2, with virtio network and disk drivers, and either Ubuntu 22.04
> or Debian 12 user space.
>
> The same kernel build did not crash on an Azure VM, which does not use
> the virtio network driver. Since this issue only appears when receiving
> data, I suspect there could be an issue related to the virtio interface
> or receive buffer handling.
>
> This issue did not occur on the Debian backport kernel 6.9.7-1~bpo12+1
> amd64.
>
> Steps to Reproduce:
> 1. Setup a small VM on a KVM host.
>    I tested this on an x86_64 KVM VM with 1 CPU, 512 MB RAM, 2 GB SWAP
> (the smallest configuration from Vultr), using a Debian 12 user space,
> virtio disk, and virtio net.
> 2. Induce high memory and I/O load. Run the following command:
>    stress --vm 2 --hdd 1
>    (Adjust --vm to occupy all the RAM)
>    This slows down the system but does not cause a crash.
> 3. Send large data to the VM.
>    I used `iperf3 -s` on the VM and sent data using `iperf3 -c` from
> another host. The system crashes within a few seconds to a few minutes.
> (The reverse direction `iperf3 -c -R` did not cause a crash.)
>
>
> The OOPS messages are mostly general protection faults, but sometimes I
> see "Bad pagetable" or other errors, such as:
> Oops: general protection fault, probably for non-canonical address
> 0x2f9b7fa5e2bde696: 0000 [#1] PREEMPT SMP PTI
> Oops: Oops: 0000 [#1] PREEMPT SMP PTI
> Oops: Bad pagetable: 000d [#1] PREEMPT SMP PTI
>
> In some cases, dmesg contains something like:
> UBSAN: shift-out-of-bounds in lib/xarray.c:158:34
>
> When the system froze without crashing, I sometimes found BUG messages,
> such as:
> get_swap_device: Bad swap file entry 3403b0f5b2584992
> BUG: Bad page map in process stress  pte:c42f93fac0299e1d pmd:0d9b2047
> BUG: Bad rss-counter-state mm:000000004df3dd9a type:MM_ANONPAGES val:2
> BUG: Bad rss-counter-state mm:000000004df3dd9a type:MM_SWAPENTS val:-1
>
> Thanks.
> """
>
> Ciao, Thorsten


Thread overview: 20+ messages
2024-05-11  3:14 [PATCH net-next v5 0/4] virtio_net: rx enable premapped mode by default Xuan Zhuo
2024-08-14  6:59 ` [PATCH RFC 0/3] Revert "virtio_net: rx enable premapped mode by default" Michael S. Tsirkin
2024-05-11  3:14 ` [PATCH net-next v5 1/4] virtio_ring: enable premapped mode whatever use_dma_api Xuan Zhuo
2024-08-13 19:28   ` Si-Wei Liu
2024-08-13 19:46     ` Michael S. Tsirkin
2024-08-14  3:39       ` Si-Wei Liu
2024-08-14  7:00         ` Michael S. Tsirkin
2024-08-17 13:20     ` Xuan Zhuo
2024-08-20  1:06       ` Si-Wei Liu
2024-08-20  6:19         ` Xuan Zhuo
2024-05-11  3:14 ` [PATCH net-next v5 2/4] virtio_net: big mode skip the unmap check Xuan Zhuo
2024-05-11  3:14 ` [PATCH net-next v5 3/4] virtio_net: rx remove premapped failover code Xuan Zhuo
2024-05-11  3:14 ` [PATCH net-next v5 4/4] virtio_net: remove the misleading comment Xuan Zhuo
2024-05-14  0:20 ` [PATCH net-next v5 0/4] virtio_net: rx enable premapped mode by default patchwork-bot+netdevbpf
2024-08-15  7:14 ` [PATCH RFC 0/3] Revert "virtio_net: rx enable premapped mode by default" Linux regression tracking (Thorsten Leemhuis)
2024-08-15 10:22   ` Darren Kenny [this message]
2024-08-16  5:03     ` Linux regression tracking (Thorsten Leemhuis)
2024-08-15 15:23   ` Michael S. Tsirkin
2024-08-15 15:28     ` Michael S. Tsirkin
  -- strict thread matches above, loose matches on Subject: below --
2024-08-15 15:27 Michael S. Tsirkin
2024-08-15 15:27 ` Michael S. Tsirkin
