From: Kuniyuki Iwashima <kuniyu@google.com>
To: i.maximets@ovn.org
Cc: dev@openvswitch.org, edumazet@google.com, kees@kernel.org,
kuba@kernel.org, netdev@vger.kernel.org, pabeni@redhat.com,
write@ownrisk.dk
Subject: Re: [BUG] FORTIFY: memcpy overflow in skb_tunnel_info_unclone() from geneve_xmit()
Date: Wed, 10 Jun 2026 18:21:48 +0000 [thread overview]
Message-ID: <20260610182211.2622443-1-kuniyu@google.com> (raw)
In-Reply-To: <6831f6a7-7321-4657-8255-28252b1a3176@ovn.org>
From: Ilya Maximets <i.maximets@ovn.org>
Date: Wed, 10 Jun 2026 20:00:53 +0200
> On 6/10/26 3:10 PM, Johan Thomsen wrote:
> > Hi Ilya,
> >
> > Sorry for the late follow-up.
> > Your patch has now run without crashes in my setup for 36 hours and
> > I'm unable to re-trigger a panic.
> >
> >> But the operation in the diff above is kind of pointless as the memcpy
> >> itself will copy the value again.
> >
> > Right. So it would probably need at least a comment, if that ends up
> > becoming the final fix.
> > I'm not a kernel dev and I don't know what is the right thing to do here.
> >
> > Happy to test alternative patches if needed.
>
> OK. Thanks for testing.
>
> >
> > BR
> > Johan
> >
> > Den man. 8. jun. 2026 kl. 11.41 skrev Ilya Maximets <i.maximets@ovn.org>:
> >>
> >> On 6/8/26 10:25 AM, Johan Thomsen wrote:
> >>> Hello,
> >>>
> >>> I am seeing what looks like a kernel bug in the Geneve/OVS/vhost
> >>> transmit path on a Talos Linux node running Kube-ovn with Geneve
> >>> overlay and KubeVirt VM traffic.
> >>>
> >>> Environment:
> >>>
> >>> Kernel: 6.18.33-talos
> >>> Distro: Talos v1.13.3
> >>>
> >>> Compiler/config:
> >>>
> >>> CONFIG_CC_VERSION_TEXT="clang version 22.1.2"
> >>> CONFIG_CC_IS_CLANG=y
> >>> CONFIG_LTO=y
> >>> CONFIG_LTO_CLANG=y
> >>> CONFIG_LTO_CLANG_THIN=y
> >>> CONFIG_FORTIFY_SOURCE=y
> >>>
> >>> Hardware: HPE ProLiant DL325 Gen11, AMD EPYC
> >>>
> >>> NIC driver: bnxt_en
> >>>
> >>> Workload/network:
> >>>
> >>> Kube-OVN, Geneve overlay
> >>> Open vSwitch datapath
> >>> KubeVirt/QEMU VM traffic via vhost/tap
> >>>
> >>> Relevant console output:
> >>>
> >>> [ 648.742603] memcpy: detected buffer overflow: 104 byte write of
> >>> buffer size 96
> >>> [ 648.749907] WARNING: CPU: 61 PID: 27020 at
> >>> lib/string_helpers.c:1036 __fortify_report+0x45/0x60
> >>> [ 648.758689] Modules linked in: dm_round_robin dm_multipath lpfc
> >>> nvmet_fc nvmet intel_rapl_msr intel_rapl_common ahci nvme_auth bnxt_en
> >>> nvme hpilo hkdf libahci sp5100_tco watchdog k10temp
> >>> [ 648.775429] CPU: 61 UID: 107 PID: 27020 Comm: vhost-27002 Not
> >>> tainted 6.18.29-talos #1 PREEMPT(none)
> >>> [ 648.784735] Hardware name: HPE ProLiant DL325 Gen11/ProLiant DL325
> >>> Gen11, BIOS 2.84 11/05/2025
> >>> [ 648.890478] skb_tunnel_info_unclone+0x179/0x190
> >>> [ 648.895152] geneve_xmit+0x7fe/0xe00
> >>> [ 648.907240] dev_hard_start_xmit+0xa7/0x1f0
> >>> [ 648.911479] __dev_queue_xmit+0x864/0xf40
> >>> [ 648.919688] do_execute_actions+0x9b9/0x1be0
> >>> [ 648.927727] ovs_execute_actions+0x58/0x170
> >>> [ 648.931960] ovs_dp_process_packet+0xb1/0x1c0
> >>> [ 648.936370] ovs_vport_receive+0x90/0x100
> >>> [ 648.940428] netdev_frame_hook+0x146/0x1a0
> >>> [ 648.954093] __netif_receive_skb+0x3f/0x160
> >>> [ 648.958324] process_backlog+0x10c/0x210
> >>> [ 648.962295] __napi_poll+0x2f/0x190
> >>> [ 648.965832] net_rx_action+0x2e3/0x500
> >>> [ 648.969632] handle_softirqs+0xe7/0x310
> >>> [ 648.985387] tun_get_user+0x137e/0x1510
> >>> [ 649.005878] handle_tx+0x41f/0xd30
> >>> [ 649.029014] vhost_run_work_list+0x52/0x90
> >>> [ 649.033162] vhost_task_fn+0xc2/0x140
> >>> [ 649.064145] ---[ end trace 0000000000000000 ]---
> >>> [ 649.068820] ------------[ cut here ]------------
> >>> [ 649.073489] kernel BUG at lib/string_helpers.c:1043!
> >>>
> >>> I don't know whether this is a real overflow or a FORTIFY false-positive.
> >>
> >> Looks like a false-positive from the __counted_by fortification.
> >>
> >> I'd guess something like this would fit it:
> >>
> >> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> >> index 1fc2fb03ce3f9..e51c3795da474 100644
> >> --- a/include/net/dst_metadata.h
> >> +++ b/include/net/dst_metadata.h
> >> @@ -164,6 +164,7 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> >> if (!new_md)
> >> return ERR_PTR(-ENOMEM);
> >>
> >> + new_md->u.tun_info.options_len = md_size;
> >> memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
> >> sizeof(struct ip_tunnel_info) + md_size);
> >> #ifdef CONFIG_DST_CACHE
> >> ---
> >>
> >> Johan, could you try this in your setup?
> >>
> >> The memory was actually allocated for the options, but the structure
> >> is zeroed out on allocation, so the __counted_by check doesn't work
> >> properly for the initial initialization copy.
> >>
> >> But the operation in the diff above is kind of pointless as the memcpy
> >> itself will copy the value again. So, I'm not sure if that's the right
> >> solution here.
> >>
> >> Alternative might be to revert the kmalloc_flex back to the simple
> >> kmalloc in metadata_dst_alloc.
>
> A little less icky alternative might be to just split the copy in two:
I'd simply use unsafe_memcpy().
>
> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
> index 1fc2fb03ce3f9..996ae8350360a 100644
> --- a/include/net/dst_metadata.h
> +++ b/include/net/dst_metadata.h
> @@ -164,8 +164,12 @@ static inline struct metadata_dst *tun_dst_unclone(struct sk_buff *skb)
> if (!new_md)
> return ERR_PTR(-ENOMEM);
>
> + /* Copy in two stages to keep the __counted_by happy. */
> memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
> - sizeof(struct ip_tunnel_info) + md_size);
> + sizeof(struct ip_tunnel_info));
> + memcpy(ip_tunnel_info_opts(&new_md->u.tun_info),
> + ip_tunnel_info_opts(&md_dst->u.tun_info),
> + md_size);
> #ifdef CONFIG_DST_CACHE
> /* Unclone the dst cache if there is one */
> if (new_md->u.tun_info.dst_cache.cache) {
> ---
>
> Adding netdev maintainers for more opinions.
>
> >>
> >> CC: Kees
> >>
> >> Best regards, Ilya Maximets.
> >>
> >>>
> >>> I cannot reproduce the issue on Talos v1.12.X which uses a gcc built
> >>> kernel, whereas the affected kernel is built with clang. Don't know
> >>> whether this is relevant here.
> >>>
> >>> I am currently trying to make a reliable reproducer, but I can almost
> >>> always trigger the issue when iperf stressing the VM-network.
> >>>
> >>> Please let me know if this should go to a more specific
> >>> maintainer/list or further info is needed. I am able to test candidate
> >>> patches if provided.
> >>>
> >>> Downstream bug reports:
> >>> https://github.com/siderolabs/talos/issues/13440
> >>> https://github.com/kubeovn/kube-ovn/issues/6767
> >>>
> >>> Thanks,
> >>> Johan
next prev parent reply other threads:[~2026-06-10 18:22 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-08 8:25 [BUG] FORTIFY: memcpy overflow in skb_tunnel_info_unclone() from geneve_xmit() Johan Thomsen
2026-06-08 9:41 ` Ilya Maximets
2026-06-10 13:10 ` Johan Thomsen
2026-06-10 18:00 ` Ilya Maximets
2026-06-10 18:21 ` Kuniyuki Iwashima [this message]
2026-06-10 19:41 ` Kees Cook
2026-06-10 19:51 ` Kees Cook
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260610182211.2622443-1-kuniyu@google.com \
--to=kuniyu@google.com \
--cc=dev@openvswitch.org \
--cc=edumazet@google.com \
--cc=i.maximets@ovn.org \
--cc=kees@kernel.org \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=write@ownrisk.dk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox