public inbox for iwd@lists.linux.dev
From: James Prestwood <prestwoj@gmail.com>
To: Jules Maselbas <jmaselbas@zdiv.net>, iwd@lists.linux.dev
Subject: Re: iwd 2.22 segfault
Date: Thu, 3 Oct 2024 06:00:31 -0700
Message-ID: <ba4ea86c-7abe-4873-9ccc-44bc339d9ee7@gmail.com>
In-Reply-To: <afdaa1f4-5b24-4d33-962e-3b026b65c626@gmail.com>

Hi Jules,

On 10/3/24 5:26 AM, James Prestwood wrote:
> Hi Jules,
>
> On 10/3/24 5:01 AM, Jules Maselbas wrote:
>> Hi,
>>
>> I am having a segfault in iwd 2.22, running on Alpine Linux (on edge).
>>
>> I can reproduce the segfault by doing `rc-service networking restart`,
>> dmesg gives this information:
>>
>> iwd[4229]: segfault at a ip 00007f027ca94c6b sp 00007fffd6c75858 
>> error 4 in ld-musl-x86_64.so.1[7f027ca44000+57000] likely on CPU 4 
>> (core 2, socket 0)
>> Code: f8 48 83 fa 08 72 14 f7 c7 07 00 00 00 74 0c a4 48 ff ca f7 c7 
>> 07 00 00 00 75 f4 48 89 d1 48 c1 e9 03 f3 48 a5 83 e2 07 74 05 <a4> 
>> ff ca 75 fb c3 48 89 f8 48 29 f0 48 39 d0 0f 83 bf ff ff ff 48
>> ...
>> iwd[24403]: segfault at a ip 00007fa91633ac6b sp 00007ffd1faaa028 
>> error 4 in ld-musl-x86_64.so.1[7fa9162ea000+57000] likely on CPU 6 
>> (core 3, socket 0)
>> Code: f8 48 83 fa 08 72 14 f7 c7 07 00 00 00 74 0c a4 48 ff ca f7 c7 
>> 07 00 00 00 75 f4 48 89 d1 48 c1 e9 03 f3 48 a5 83 e2 07 74 05 <a4> 
>> ff ca 75 fb c3 48 89 f8 48 29 f0 48 39 d0 0f 83 bf ff ff ff 48
>>
>> This is not an issue in musl-libc, but a call to memcpy with a bad
>> address. We can see that the source address is 0xa (10), which is also
>> the offset of the `aa` field in the `netdev->handshake` struct. This
>> makes me think that handshake is NULL when netdev_rssi_poll is called.
>>
>> Here is a backtrace from when iwd segfaults:
>> (gdb) bt
>> #0  memcpy () at src/string/x86_64/memcpy.s:22
>> #1  0x00005555555fc0dc in memcpy (__od=<optimized out>, __os=0xa, 
>> __n=6) at /usr/include/fortify/string.h:55
>> #2  l_netlink_message_append (message=0x7ffff7f34050, 
>> type=type@entry=6, data=0xa, len=len@entry=6) at ell/netlink.c:841
>> #3  0x00005555555fd78f in l_genl_msg_append_attr 
>> (msg=msg@entry=0x7ffff7f34020, type=type@entry=6, len=len@entry=6, 
>> data=<optimized out>) at ell/genl.c:1518
>> #4  0x0000555555559080 in netdev_rssi_poll (timeout=<optimized out>, 
>> user_data=0x7ffff7f38dc0) at src/netdev.c:760
>> #5  0x00005555555f959e in timeout_callback (fd=<optimized out>, 
>> events=<optimized out>, user_data=0x7ffff7f363b0) at ell/timeout.c:69
>> #6  timeout_callback (fd=<optimized out>, events=<optimized out>, 
>> user_data=0x7ffff7f363b0) at ell/timeout.c:58
>> #7  0x00005555555f8a75 in l_main_iterate (timeout=<optimized out>) at 
>> ell/main.c:461
>> #8  0x00005555555f8b4e in l_main_run () at ell/main.c:508
>> #9  l_main_run () at ell/main.c:490
>> #10 0x00005555555f8d7f in l_main_run_with_signal 
>> (callback=callback@entry=0x5555555587f0 <signal_handler>, 
>> user_data=user_data@entry=0x0) at ell/main.c:630
>> #11 0x0000555555557bd0 in main (argc=<optimized out>, argv=<optimized 
>> out>) at src/main.c:614
>>
>>
>> I've also run a git bisect which points to
>> 154a29be0552f5a39e34301ebaf24623d64073da netdev: fall back to RSSI 
>> polling if SET_CQM fails
>> as the first bad commit. I noticed the "rssi" word is also present in 
>> the stacktrace.
>
> Thanks for such detailed info. Do you happen to have debug logs from
> when this happens? I'm just trying to see the code path which leads to
> this. I can't seem to reproduce it, but I suspect musl-libc is just
> different enough that it's exposing the bug.
>
>
>>
>> I am using the following wifi card:
>> 03:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI 
>> Express Wireless Network Adapter
>> driver: mt7921e
>> version: 6.6.53-0-lts
>> firmware-version: ____000000-20240716163327
>> expansion-rom-version:
>> bus-info: 0000:03:00.0
>> supports-statistics: yes
>> supports-test: no
>> supports-eeprom-access: no
>> supports-register-dump: no
>> supports-priv-flags: no
>>
>>
>> Cheers,
>> Jules
>>
>>
> Thanks,
>
> James
>
I was finally able to reproduce it. It's completely timing-dependent: I
think if IWD gets restarted _just_ before the timer fires, it will
crash. The easiest way to reproduce it was to just disconnect with
iwctl. Anyway, I sent a patch to the list which should fix it. If you
are able to try it out and confirm, that would be great!
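
To illustrate the analysis quoted above, here is a minimal sketch of why a
NULL handshake pointer shows up as a memcpy source address of 0xa. The
struct layout and all names (fake_handshake, fake_netdev,
poll_target_addr) are assumptions made up for this example so that `aa`
lands at offset 10; this is not the iwd source and not the patch sent to
the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, chosen only so that `aa` sits at offset 10 as
 * reported in the crash. Not copied from iwd.
 */
struct fake_handshake {
	uint32_t ifindex;	/* offset 0 */
	uint8_t spa[6];		/* offset 4 */
	uint8_t aa[6];		/* offset 10 (0xa) */
};

/* Hypothetical stand-in for the device object holding the handshake. */
struct fake_netdev {
	struct fake_handshake *handshake;
};

/*
 * With handshake == NULL, an unguarded `netdev->handshake->aa` would
 * evaluate to address 0 + 10 = 0xa, matching the faulting address.
 */
static_assert(offsetof(struct fake_handshake, aa) == 0xa,
		"aa expected at offset 0xa");

/*
 * Guarded accessor: return NULL instead of a bogus address when the
 * handshake has already been torn down (for example after a disconnect).
 */
const uint8_t *poll_target_addr(const struct fake_netdev *netdev)
{
	if (!netdev || !netdev->handshake)
		return NULL;

	return netdev->handshake->aa;
}
```

A timer callback written along these lines would check the returned
pointer before appending it to a message. Whether the real fix guards the
callback or cancels the timer on disconnect is not stated in this thread.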

Thanks,

James


Thread overview: 4+ messages
2024-10-03 12:01 iwd 2.22 segfault Jules Maselbas
2024-10-03 12:26 ` James Prestwood
2024-10-03 13:00   ` James Prestwood [this message]
2024-10-03 13:47     ` Jules Maselbas
