From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jon Hunter
Subject: Re: [PATCH v2 1/2] i2c: tegra: Better handle case where CPU0 is busy for a long time
Date: Tue, 28 Apr 2020 09:01:56 +0100
Message-ID: <4981d7eb-b41e-c597-04ff-3d3295804d5a@nvidia.com>
References: <79f6560e-dbb5-0ae1-49f8-cf1cd95396ec@nvidia.com>
 <20200427074837.GC3451400@ulmo>
 <20200427110033.GC3464906@ulmo>
 <3a06811c-02dc-ce72-ebef-78c3fc3f4f7c@gmail.com>
 <20200427151234.GE3464906@ulmo>
 <1ab276cf-c2b0-e085-49d8-b8ce3dba8fbe@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Return-path:
In-Reply-To: <1ab276cf-c2b0-e085-49d8-b8ce3dba8fbe@gmail.com>
Content-Language: en-US
Sender: linux-i2c-owner@vger.kernel.org
To: Dmitry Osipenko, Thierry Reding
Cc: Wolfram Sang, Laxman Dewangan, Manikanta Maddireddy, Vidya Sagar,
 linux-i2c@vger.kernel.org, linux-tegra@vger.kernel.org,
 linux-kernel@vger.kernel.org
List-Id: linux-tegra@vger.kernel.org

On 27/04/2020 16:18, Dmitry Osipenko wrote:
> 27.04.2020 18:12, Thierry Reding wrote:
>> On Mon, Apr 27, 2020 at 05:21:30PM +0300, Dmitry Osipenko wrote:
>>> 27.04.2020 14:00, Thierry Reding wrote:
>>>> On Mon, Apr 27, 2020 at 12:52:10PM +0300, Dmitry Osipenko wrote:
>>>>> 27.04.2020 10:48, Thierry Reding wrote:
>>>>> ...
>>>>>>> Maybe but all these other problems appear to have existed for some time
>>>>>>> now. We need to fix all, but for the moment we need to figure out what's
>>>>>>> best for v5.7.
>>>>>>
>>>>>> To me it doesn't sound like we have a good handle on what exactly is
>>>>>> going on here and we're mostly just poking around.
>>>>>>
>>>>>> And even if things weren't working quite properly before, it sounds to
>>>>>> me like this patch actually made things worse.
>>>>>
>>>>> There is plenty of time to work on the proper fix now. To me it sounds
>>>>> like you're giving up on fixing the root of the problem, sorry.
>>>>
>>>> We're at -rc3 now and I haven't seen any promising progress in the last
>>>> week. All the while suspend/resume is now broken on at least one board
>>>> and that may end up hiding any other issues that could creep in in the
>>>> meantime.
>>>>
>>>> Furthermore we seem to have a preexisting issue that may very well
>>>> interfere with this patch, so I think the cautious thing is to revert
>>>> for now and then fix the original issue first. We can always come back
>>>> to this once everything is back to normal.
>>>>
>>>> Also, people are now looking at backporting this to v5.6. Unless we
>>>> revert this from v5.7 it may get picked up for backports to other
>>>> kernels and then I have to notify stable kernel maintainers that they
>>>> shouldn't and they have to back things out again. That's going to cause
>>>> a lot of wasted time for a lot of people.
>>>>
>>>> So, sorry, I disagree. I don't think we have "plenty of time".
>>>
>>> There is about a month now before the 5.7 release. It's a bit too early
>>> to start the panic, IMO :)
>>
>> There's no panic. A patch got merged and it broke something, so we
>> revert it and try again. It's very much standard procedure.
>>
>>> Jon already proposed a reasonably simple solution: to keep PCIe
>>> regulators always-ON. In the longer run we may want to have I2C atomic
>>> transfers supported for a late suspend phase.
>>
>> That's not really a solution, though, is it? It's just papering over
>> an issue that this patch introduced or uncovered. I'm much more in
>> favour of fixing problems at the root rather than keep papering over
>> until we lose track of what the actual problems are.
>
> It's not "papering over an issue". The bug can't be fixed properly
> without introducing I2C atomic transfers support for a late suspend
> phase, I don't see any other solutions for now. Stable kernels do not
> support atomic transfers at all, that proper solution won't be backportable.

There are a few issues here, but the issue Thierry and I are referring
to is the regression introduced by this change. Yes, this exposes other
problems, but we first need to understand why this breaks resume in
general, regardless of what the PCIe driver is doing.

I will look at this a bit more later this week.

Cheers
Jon

-- 
nvpublic