From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dmitry Osipenko Subject: Re: [PATCH v2 1/2] i2c: tegra: Better handle case where CPU0 is busy for a long time Date: Mon, 4 May 2020 23:55:09 +0300 Message-ID: References: <79f6560e-dbb5-0ae1-49f8-cf1cd95396ec@nvidia.com> <20200427074837.GC3451400@ulmo> <20200427103851.GB24446@kunai> <20200427141922.GD3464906@ulmo> <20200427153106.GA8113@kunai> <20200504154226.GA614153@ulmo> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Return-path: In-Reply-To: <20200504154226.GA614153@ulmo> Content-Language: en-US Sender: linux-tegra-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org To: Thierry Reding Cc: Wolfram Sang , Jon Hunter , Laxman Dewangan , Manikanta Maddireddy , Vidya Sagar , linux-i2c-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org List-Id: linux-tegra@vger.kernel.org 04.05.2020 18:42, Thierry Reding пишет: > On Sat, May 02, 2020 at 05:40:35PM +0300, Dmitry Osipenko wrote: >> 27.04.2020 18:31, Wolfram Sang пишет: >>> >>>> Yes, that bug should be fixed anyway. But that doesn't justify breaking >>>> suspend/resume completely, which *is* a regression. >>>> >>>> Look, I'm not saying that we should drop this patch altogether. All I'm >>>> saying is that we should postpone it so that we can: a) get suspend and >>>> resume working again (and by doing so make sure no other suspend/resume >>>> regressions silently creep in, because that always seems to happen when >>>> you're not looking) and b) fix any preexisting issues without possibly >>>> scrambling the result with this perhaps unrelated fix. >>>> >>>> So, again, I think the safest road forward is to back this one out for >>>> now, fix whatever this other bug is and once suspend/resume is working >>>> properly again we can revisit this patch based on a known-good baseline. >>> >>> I am with you here. I want to add that the proper fix should be >>> developed without thinking too much about stable in the first place. >>> *When* we have a proper working fix, then we can think about making it >>> "more" suitable for backporting. Yet, it may also be a result that older >>> kernels need a different solution. Or have no solution at all, in case >>> they can't do atomic_transfers and this is needed. >>> >>> D'accord? >>> >> >> I saw that you submitted the revert of the patches for 5.7, hopefully it >> won't result in putting the PCIe driver problem into the back burner. >> I'll try not to forget about these patches to resubmit them later on, >> once the problem will be resolved :) > > I can put these two patches into a local development branch to keep > track of them. From what I said earlier, it looks like it would be fine > to apply these if we also make that runtime PM change (i.e. drop force > runtime PM and instead manually invoke runtime PM callbacks, which seems > to be in line with what the PM maintainers suggest, as pointed out > elsewhere in this thread). > > How about if I put all of that into a branch and push it to linux-next > so that we can get some broader testing? I've already run it through our > internal test system, which, while not perfect, is the broadest system I > am aware of, and all tests came back positive. Will be great. > I'm not exactly sure I see a real issue with the PCIe driver after those > patches are applied. The regulator errors are gone (presumably because > the regulators now do get turned off properly) and I don't observe any > other issues. That's probably because this I2C patch removed the "completion done after timeout" message. You may try to re-add the message, it should pop up on the PCIe driver's suspension. The IRQF_NO_SUSPEND flag should fix it. My assumption was that it should be always fine handle interrupt after timeout, and thus, the message isn't really needed. But this wasn't a correct assumption as we see now, so it should be better to keep the message for the debugging purposes, maybe turn it into dev_info_once().