From: Manuel Ullmann <manuel.ullmann@posteo.de>
To: Jordan Leppert <jordanleppert@protonmail.com>
Cc: "Igor Russkikh" <irusskikh@marvell.com>,
"Manuel Ullmann" <labre@posteo.de>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
regressions@lists.linux.dev, davem@davemloft.net,
kuba@kernel.org, pabeni@redhat.com,
"Holger Hoffstätte" <holger@applied-asynchrony.com>,
koo5 <kolman.jindrich@gmail.com>,
"Dmitry Bezrukov" <dbezrukov@marvell.com>
Subject: Re: [EXT] [PATCH] net: atlantic: always deep reset on pm op, fixing null deref regression
Date: Thu, 05 May 2022 17:39:02 +0000
Message-ID: <87ee17nakp.fsf@posteo.de>
In-Reply-To: <99KGBavpdWUsYAzz1AIlqoFSVt9JXUAmj3Sbso-671ku1gnhokcfi3D9bbh_2xYS_wWYRQOhGxgUsZKsgqkyIivlelLor9zNvpOLC0I3nxA=@protonmail.com> (Jordan Leppert's message of "Thu, 05 May 2022 10:24:49 +0000")
>>
>>
>> > The impact of this regression is the same for resume that I saw on
>> > thaw: the kernel hangs and nothing except SysRq rebooting can be done.
>> >
>> > The null deref occurs at the same position as on thaw.
>> > BUG: kernel NULL pointer dereference
>> > RIP: aq_ring_rx_fill+0xcf/0x210 [atlantic]
>> >
>> > Fixes regression in cbe6c3a8f8f4 ("net: atlantic: invert deep par in
>> > pm functions, preventing null derefs"), where I disabled deep pm
>> > resets in suspend and resume, trying to make sense of the
>> > atl_resume_common deep parameter in the first place.
>> >
>> > It turns out, that atlantic always has to deep reset on pm operations
>> > and the parameter is useless. Even though I expected that and tested
>> > resume, I screwed up by kexec-rebooting into an unpatched kernel, thus
>> > missing the breakage.
>> >
>> > This fixup obsoletes the deep parameter of atl_resume_common, but I
>> > leave the cleanup for the maintainers to post to mainline.
>> >
>> > PS: I'm very sorry for this regression.
>>
>>
>> Hi Manuel,
>>
>> Unfortunately I've missed to review and comment on previous patch - it was too quickly accepted.
>>
>> I'm still in doubt on your fixes, even after rereading the original problem.
>> Is it possible for you to test this with all the possible combinations?
>> suspend/resume with device up/down,
>> hibernate/restore with device up/down?

I can confirm that suspend, resume, hibernation and thaw keep working in
all cases. Thaw would work even without the original patch if the device
is down before hibernation. I originally described this behaviour on
bugzilla at
https://bugzilla.kernel.org/show_bug.cgi?id=215798
See also Jordan’s confirmation below.

I think the main reason this could break is that the deep parameter had
no real impact until the breaking commit, so it was practically untested
when the allocation/free functions were split.

Another thing I tested was guarding all null pointer dereferences with
null checks, which failed at first because GCC optimized them out. I
think I still have the atlantic tree for this (bad) fix attempt floating
around. I can try to rebase it, create a patch from it and post it to
the GitHub issue, if you are interested:
https://github.com/Aquantia/AQtion/issues/32
I don’t have time for this before the weekend, though.
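
To illustrate the null check remark above: one common way a compiler can
drop such a check is when the pointer is already dereferenced before the
check, because GCC's -fdelete-null-pointer-checks optimization may then
treat the pointer as non-NULL. Whether that is what happened in the fix
attempt described here is an assumption. The following is only a minimal
sketch of the ordering that keeps a check meaningful; struct demo_ring
and demo_ring_fill() are hypothetical names and not part of the atlantic
driver.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver ring; NOT the atlantic driver's struct. */
struct demo_ring {
	int *buff;	/* NULL once the buffers have been freed */
	size_t size;	/* number of entries in buff */
};

/*
 * Check first, dereference second. If ring->buff were read from before
 * the check, the compiler would be allowed to assume the pointer is
 * valid and could remove the check as dead code.
 */
static int demo_ring_fill(struct demo_ring *ring)
{
	if (ring == NULL || ring->buff == NULL)
		return -EINVAL;

	for (size_t i = 0; i < ring->size; i++)
		ring->buff[i] = (int)i;

	return 0;
}
```

Called with a ring whose buff is NULL, or with a NULL ring, the function
returns -EINVAL instead of touching the buffer.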
>> I'll try to do the same on our side, but we don't have much resources for that now unfortunately..
>>
>> > Fixes: cbe6c3a8f8f4315b96e46e1a1c70393c06d95a4c
>>
>>
>> That tag format is incorrect I think..
Thanks for pointing that out. Also, are those stable Cc tags correct?
I ask because I figured that the x in the documentation could also be
the branch name rather than a placeholder. Should I resend the patch
with the tags fixed? I won’t get to it before tomorrow, though.
>> Igor
> With the proposed patch (deep parameter is always true), I've managed to test:
> 1. Hibernate/restore (with device down/up)
> 2. Suspend/resume (with device down/up)
>
> I put the device down with the command:
> sudo ip link set <connection> down
>
> I hope that's correct, if not please let me know correct command.
This should be correct.
Manuel