From: Heiner Kallweit <hkallweit1@gmail.com>
To: Chris Chiu <chiu@endlessm.com>
Cc: nic_swsd <nic_swsd@realtek.com>,
	davem@davemloft.net, netdev@vger.kernel.org,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux Upstreaming Team <linux@endlessm.com>
Subject: Re: A weird problem of Realtek r8168 after resume from S3
Date: Tue, 18 Dec 2018 19:21:54 +0100
Message-ID: <4d2084e5-abec-b046-a304-8ca7db4c269f@gmail.com>
In-Reply-To: <CAB4CAwcr+V+7b2uuaFtkOHNKP3Xep5dNkZQBrVQiEq_eg5PpAw@mail.gmail.com>

On 18.12.2018 14:25, Chris Chiu wrote:
> On Tue, Dec 18, 2018 at 3:08 AM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>
>> On 17.12.2018 14:25, Chris Chiu wrote:
>>> On Fri, Dec 14, 2018 at 3:37 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>>>
>>>> On 14.12.2018 04:33, Chris Chiu wrote:
>>>>> On Thu, Dec 13, 2018 at 10:20 AM Chris Chiu <chiu@endlessm.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>     We have an Acer laptop that has a problem with Ethernet networking after
>>>>>> resuming from S3. The Ethernet controller is the popular Realtek r8168. lspci
>>>>>> shows the following:
>>>>>> 02:00.1 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 12)
>>>>>>
>>>> Helpful would be a "dmesg | grep r8169", especially chip name + XID.
>>>>
>>> [   22.362774] r8169 0000:02:00.1 (unnamed net_device) (uninitialized): mac_version = 0x2b
>>> [   22.365580] libphy: r8169: probed
>>> [   22.365958] r8169 0000:02:00.1 eth0: RTL8411, 00:e0:b8:1f:cb:83, XID 5c800800, IRQ 38
>>> [   22.365961] r8169 0000:02:00.1 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
>>>
>> Thanks for the info.
>>
>>>>>>     The problem is that the Ethernet interface is not reachable after resume.
>>>>>> Pinging via Ethernet always shows the response `Destination Host Unreachable`.
>>>>>> However, the interesting part is that when I run tcpdump to monitor the
>>>>>> problematic Ethernet interface, networking comes back to life. But it's dead
>>>>>> again after I stop tcpdump.
>>>>>> One more thing: if I ping the problematic machine from another machine, it has
>>>>>> the same effect as running tcpdump. Maybe it's about the register settings for
>>>>>> the RX path?
>>>>>>
>>>> You could compare the register dumps (ethtool -d) before and after S3 sleep
>>>> to find out whether there's a difference.
>>>>
>>>
>>> Actually, I just found that I led you in the wrong direction. The S3 suspend
>>> does help to reproduce the problem, but it's not necessary. All I need to do
>>> is ping for around 5 minutes and the network connection fails. I also found
>>> one interesting thing: disabling the MSI-X interrupt, as in commit
>>> [d49c88d7677ba737e9d2759a87db0402d5ab2607], fixes this problem, although I
>>> don't understand the root cause. Is there anything I can do to help?
>>>
>> This is indeed very, very weird. You say switching from MSI-X to MSI fixes
>> the issue, but also pinging the machine from outside brings back the network.
>> Both actions affect totally different corners.
>>
>> The commit and related issue you mention was a workaround in the driver,
>> the root cause was an MSI-X-related issue with certain Intel chipsets deep
>> in the PCI core. After this was fixed we removed the workaround again.
>> This shouldn't be related to your issue.
>>
>> Hard to say for now is whether the issue is:
>> - a driver issue
>> - a hardware issue in the RTL8411
>> - an issue with the chipset on your mainboard
>>
>> According to your description it doesn't take a special scenario to trigger
>> the issue, so most likely also other users of Acer notebooks with RTL8411
>> should be affected (after briefly checking this should be at least Aspire
>> F15, V15, V7). Therefore I wonder why there aren't more reports.
>>
>> This commit added MSI-X support: 6c6aa15fdea5 ("r8169: improve interrupt handling")
>> So you could test this revision and the one before.
>>
>> Eventually, if the issue really should be caused by a side effect of using
>> MSI-X, then the question is whether we need to disable MSI-X for RTL8411
>> in general or just for RTL8411 and a certain subsystem id.
>>
> 
> I tried the kernel with HEAD at 6c6aa15fdea5 ("r8169: improve interrupt
> handling"), and the problem is still there. When I revert to the previous
> revision, the problem goes away. So I think it's pretty much a side effect of
> MSI-X. However, as you mentioned that you didn't hit this problem, I'll ask
> the vendor to verify whether this problem also happens on other machines with
> the same chip. Then we can decide whether to disable MSI-X for a specific MAC
> version or just for a certain subsystem ID.
> 
Thanks a lot for testing. OK, I have one more idea.
AFAICS RTL8411 also has an integrated card reader controller which is driven
by module rtsx_pci. Maybe if both components (card reader controller + ethernet)
use different interrupt types, RTL8411 can't properly handle this.
In case module rtsx_pci is loaded on your system, can you check whether not
loading it (e.g. by blacklisting) or removing it makes a difference?

Can you provide the "lspci -v" output for the card reader part of RTL8411?
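
To make the comparison of interrupt types concrete, here is a minimal Python
sketch (an illustration only, not part of r8169 or rtsx_pci). From saved
"lspci -v" output it summarises which driver is bound to each PCI function and
whether its MSI or MSI-X capability is currently enabled. It assumes the usual
lspci layout: a device line starting with the bus address, followed by indented
"Capabilities: [..] MSI: Enable+" / "MSI-X: Enable-" lines and a
"Kernel driver in use:" line. The function name interrupt_modes is hypothetical.
Note that lspci has to be run as root, otherwise the capabilities are not shown.

```python
import re


def interrupt_modes(lspci_text):
    """Map each PCI address to its bound driver and enabled MSI/MSI-X capabilities.

    Assumption: lspci_text is the output of "lspci -v" run as root.
    """
    devices = {}
    current = None
    for line in lspci_text.splitlines():
        # A device block starts at column 0 with an address such as "02:00.1".
        head = re.match(r"^([0-9a-f:]+\.[0-7]) ", line)
        if head:
            current = devices.setdefault(
                head.group(1), {"driver": None, "enabled": []}
            )
            continue
        if current is None:
            continue
        # "MSI-X" is tried before "MSI" so the longer name is not cut short.
        cap = re.search(r"(MSI-X|MSI): Enable([+-])", line)
        if cap and cap.group(2) == "+":
            current["enabled"].append(cap.group(1))
        drv = re.search(r"Kernel driver in use: (\S+)", line)
        if drv:
            current["driver"] = drv.group(1)
    return devices
```

For example, after saving a dump with "lspci -v > lspci.txt" as root, calling
interrupt_modes(open("lspci.txt").read()) and looking at the entries whose
driver is r8169 and rtsx_pci would show whether one function has MSI-X enabled
while the other uses MSI (or no message-signalled interrupt at all).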

>>>>>>     I tried the latest 4.20 rc version but the problem is still there. I
>>>>>> also tried adding a hw_reset or init in the resume path, but it had no
>>>>>> effect. Any suggestions?
>>>>>> Thanks
>>>>>>
>>>> Did previous kernel versions work? If it's a regression, a bisect would be
>>>> appreciated, because with the chip versions I've got I can't reproduce the issue.
>>>>
>>>>>> Chris
>>>>>
>>>>> Gentle ping. Any additional information required?
>>>>>
>>>>> Chris
>>>>>
>>>> Heiner
>>>
>>
> 

