All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Guenter Roeck <linux@roeck-us.net>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Netdev <netdev@vger.kernel.org>
Subject: Re: Linux 5.12-rc7
Date: Tue, 13 Apr 2021 08:52:27 -0400	[thread overview]
Message-ID: <20210413085218-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CANn89i+X9w=nfE843_2cZzkdhwmcD3gaJmhYGEr_qp-_-Ar2hw@mail.gmail.com>

On Tue, Apr 13, 2021 at 02:45:46PM +0200, Eric Dumazet wrote:
> On Tue, Apr 13, 2021 at 12:43 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Apr 13, 2021 at 11:24 AM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Mon, Apr 12, 2021 at 10:05 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > On 4/12/21 10:38 AM, Eric Dumazet wrote:
> > > > [ ... ]
> > > >
> > > > > Yes, I think this is the real issue here. This smells like some memory
> > > > > corruption.
> > > > >
> > > > > In my traces, packet is correctly received in AF_PACKET queue.
> > > > >
> > > > > I have checked the skb is well formed.
> > > > >
> > > > > But the user space seems to never call poll() and recvmsg() on this
> > > > > af_packet socket.
> > > > >
> > > >
> > > > After sprinkling the kernel with debug messages:
> > > >
> > > > 424   00:01:33.674181 sendto(6, "E\0\1H\0\0\0\0@\21y\246\0\0\0\0\377\377\377\377\0D\0C\00148\346\1\1\6\0\246\336\333\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0RT\0\
> > > > 424   00:01:33.693873 close(6)          = 0
> > > > 424   00:01:33.694652 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
> > > > 424   00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)
> > > > 424   00:01:33.695889 write(2, "udhcpc: clock_gettime(MONOTONIC) failed\n", 40) = -1 EFAULT (Bad address)
> > > > 424   00:01:33.697311 exit_group(1)     = ?
> > > > 424   00:01:33.698346 +++ exited with 1 +++
> > > >
> > > > I only see that after adding debug messages in the kernel, so I guess there must be
> > > > a heisenbug somehere.
> > > >
> > > > Anyway, indeed, I see (another kernel debug message):
> > > >
> > > > __do_sys_clock_gettime: Returning -EFAULT on address 0x7bacc9a8
> > > >
> > > > So udhcpc doesn't even try to read the reply because it crashes after sendto()
> > > > when trying to read the current time. Unless I am missing something, that means
> > > > that the problem happens somewhere on the send side.
> > > >
> > > > To make things even more interesting, it looks like the failing system call
> > > > isn't always clock_gettime().
> > > >
> > > > Guenter
> > >
> > >
> > > I think GRO fast path has never worked on SUPERH. Probably SUPERH has
> > > never used a fast NIC (10Gbit+)
> > >
> > > The following hack fixes the issue.
> > >
> > >
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..91ba89a645ff91d4cd4f3d8dc8a009bcb67da344
> > > 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -5916,13 +5916,16 @@ static struct list_head
> > > *gro_list_prepare(struct napi_struct *napi,
> > >
> > >  static void skb_gro_reset_offset(struct sk_buff *skb)
> > >  {
> > > +#if !defined(CONFIG_SUPERH)
> > >         const struct skb_shared_info *pinfo = skb_shinfo(skb);
> > >         const skb_frag_t *frag0 = &pinfo->frags[0];
> > > +#endif
> > >
> > >         NAPI_GRO_CB(skb)->data_offset = 0;
> > >         NAPI_GRO_CB(skb)->frag0 = NULL;
> > >         NAPI_GRO_CB(skb)->frag0_len = 0;
> > >
> > > +#if !defined(CONFIG_SUPERH)
> > >         if (!skb_headlen(skb) && pinfo->nr_frags &&
> > >             !PageHighMem(skb_frag_page(frag0))) {
> > >                 NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
> > > @@ -5930,6 +5933,7 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
> > >                                                     skb_frag_size(frag0),
> > >                                                     skb->end - skb->tail);
> > >         }
> > > +#endif
> > >  }
> > >
> > >  static void gro_pull_from_frag0(struct sk_buff *skb, int grow)
> >
> > OK ... more sh debugging :
> >
> > diff --git a/arch/sh/mm/alignment.c b/arch/sh/mm/alignment.c
> > index fb517b82a87b1065cf38c06cb3c178ce86587b00..5d18f9f792991105a8aa05cc6231b7d4532d72c9
> > 100644
> > --- a/arch/sh/mm/alignment.c
> > +++ b/arch/sh/mm/alignment.c
> > @@ -27,7 +27,7 @@ static unsigned long se_multi;
> >     valid! */
> >  static int se_usermode = UM_WARN | UM_FIXUP;
> >  /* 0: no warning 1: print a warning message, disabled by default */
> > -static int se_kernmode_warn;
> > +static int se_kernmode_warn = 1;
> >
> >  core_param(alignment, se_usermode, int, 0600);
> >
> > @@ -103,7 +103,7 @@ void unaligned_fixups_notify(struct task_struct
> > *tsk, insn_size_t insn,
> >                           (void *)instruction_pointer(regs), insn);
> >         else if (se_kernmode_warn)
> >                 pr_notice_ratelimited("Fixing up unaligned kernel access "
> > -                         "in \"%s\" pid=%d pc=0x%p ins=0x%04hx\n",
> > +                         "in \"%s\" pid=%d pc=%px ins=0x%04hx\n",
> >                           tsk->comm, task_pid_nr(tsk),
> >                           (void *)instruction_pointer(regs), insn);
> >  }
> >
> > I now see something of interest :
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc2e ins=0x6236
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc2e ins=0x6236
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc30 ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc30 ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> > Fixing up unaligned kernel access in "udhcpc" pid=91 pc=8c43fc3a ins=0x6636
> >
> > So basically the frag0 idea only works if drivers respect NET_IP_ALIGN
> > (So that IP header is 4-byte aligned)
> >
> > It seems either virtio_net or qemu does not respect the contract.
> >
> > A possible generic fix  would then be :
> >
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index af8c1ea040b9364b076e2d72f04dc3de2d7e2f11..1f79b9aa9a3f2392fddd1401f95ad098b5e03204
> > 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -5924,7 +5924,8 @@ static void skb_gro_reset_offset(struct sk_buff *skb)
> >         NAPI_GRO_CB(skb)->frag0_len = 0;
> >
> >         if (!skb_headlen(skb) && pinfo->nr_frags &&
> > -           !PageHighMem(skb_frag_page(frag0))) {
> > +           !PageHighMem(skb_frag_page(frag0)) &&
> > +           (!NET_IP_ALIGN || !(skb_frag_off(frag0) & 3))) {
> >                 NAPI_GRO_CB(skb)->frag0 = skb_frag_address(frag0);
> >                 NAPI_GRO_CB(skb)->frag0_len = min_t(unsigned int,
> >                                                     skb_frag_size(frag0),
> 
> 
> Official submission :
> 
> https://patchwork.kernel.org/project/netdevbpf/patch/20210413124136.2750358-1-eric.dumazet@gmail.com/

Thanks a lot Eric!

-- 
MST


  reply	other threads:[~2021-04-13 12:52 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-11 22:41 Linux 5.12-rc7 Linus Torvalds
2021-04-12  5:14 ` Guenter Roeck
2021-04-12 16:28   ` Linus Torvalds
2021-04-12 16:31     ` Michael S. Tsirkin
2021-04-12 16:31     ` Eric Dumazet
2021-04-12 16:47       ` Eric Dumazet
2021-04-13 12:57         ` Michael S. Tsirkin
2021-04-13 13:27           ` Eric Dumazet
2021-04-13 13:33             ` Eric Dumazet
2021-04-13 13:37               ` Michael S. Tsirkin
2021-04-13 13:42                 ` Eric Dumazet
2021-04-13 13:46                   ` Michael S. Tsirkin
2021-04-13 13:58                   ` Michael S. Tsirkin
2021-04-12 17:31       ` Guenter Roeck
2021-04-12 17:38         ` Eric Dumazet
2021-04-12 20:05           ` Guenter Roeck
2021-04-13  9:24             ` Eric Dumazet
2021-04-13 10:43               ` Eric Dumazet
2021-04-13 12:45                 ` Eric Dumazet
2021-04-13 12:52                   ` Michael S. Tsirkin [this message]
  -- strict thread matches above, loose matches on Subject: below --
2021-04-12  2:03 Ronald Warsow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210413085218-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=netdev@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.