From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4E3E8C77B7A for ; Tue, 6 Jun 2023 21:21:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=CSjPKB1kNvub9Xspl3OD6KCCL5+u1rK+tg6GDYoogek=; b=cslkHU8K6/yG1m qPoZBB5SyWwJDEoux6lKilJnMrPhYNpLiKeeC7WF6c/hGGI4JHWAqS1JXQR28oVB9LL/iy3DLdoEi OnG26l7PsuQD5SG3omnuynseL+ECMnlAE+4F2mo95Tp5PHIbO1LNbRg00sAAHEXXKbuwcgnCTZcep xiMZQFVSCVn+DIOuMcQiaL++YfrxFIyXUaYyq/sBcpZnn7H6n7M84yXA8PsyTLTEuW0nrBco3sBTi vfyv9KYKLsz1pzWkBxYNMHG2g8vpdLe65dNusTg5Pnb38uEJq+0xcy7QFTc9g5m9WYzhd/4s1b7yP SBB7G1+an7LV8lZjU7SA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1q6e6v-003HZK-2w; Tue, 06 Jun 2023 21:21:01 +0000 Received: from dfw.source.kernel.org ([139.178.84.217]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1q6e6s-003HYj-0j for linux-arm-kernel@lists.infradead.org; Tue, 06 Jun 2023 21:20:59 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B6F9360EC6; Tue, 6 Jun 2023 21:20:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AEF86C433EF; Tue, 6 Jun 2023 21:20:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1686086457; bh=SMx52lFPf27vu0KyU8uhOx7LbHTgU23iSKhOJg7H1Rw=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=YGFSCUnukUAohbGw/ft5J6qsy1/Luyj5RedqH0y2vfrAC3sLg2kPghe2w3EZXZqsv MF8FXjfov6bljVxcbznrYR8/JghyU3thqVtVtNj8sagMRhDeJVlCt+42d8tZFiyfRC GZw4MwT3bkzVO8xGYHR+U6Bj4Cbq83HXdTTLANYMas758IM+uZmcqKWZQDGRuyHWjI +95YbczGwBCm8tKvF6svkNZgI0AVAY4dsnUt91nkxezSekcivSMvRi4gAY8hKkJBzt MHWk9dAWPCZtYdS+sw1GSBwJ+f5h/S/r7tEtu+RzOjJ7TdBcyfKKcOn3HLHZjFn8rm zOkYgBWdl0YmQ== Date: Tue, 6 Jun 2023 23:20:53 +0200 From: Andi Shyti To: Robert Hancock Cc: "michal.simek@amd.com" , "shubhraj@xilinx.com" , "marex@denx.de" , "linux-arm-kernel@lists.infradead.org" , "linux-i2c@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "wsa@kernel.org" Subject: Re: [PATCH] i2c: xiic: Don't try to handle more interrupt events after error Message-ID: <20230606212053.bwpoxyost4fkpati@intel.intel> References: <20230606182558.1301413-1-robert.hancock@calian.com> <20230606192453.zjzz4kt76kus5hr5@intel.intel> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230606_142058_360977_73F2E7E7 X-CRM114-Status: GOOD ( 49.01 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: quoted-printable Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Hi Robert, On Tue, Jun 06, 2023 at 07:40:15PM +0000, Robert Hancock wrote: > On Tue, 2023-06-06 at 21:24 +0200, Andi Shyti wrote: > > Hi Robert, > > = > > On Tue, Jun 06, 2023 at 12:25:58PM -0600, Robert Hancock wrote: > > > In xiic_process, it is possible that error events such as > > > arbitration > > > lost or TX error can be raised in conjunction with other interrupt > > > flags > > > such as TX FIFO empty or bus not busy. Error events result in the > > > controller being reset and the error returned to the calling > > > request, > > > but the function could potentially try to keep handling the other > > > events, such as by writing more messages into the TX FIFO. Since > > > the > > > transaction has already failed, this is not helpful and will just > > > cause > > > issues. > > = > > what kind of issues? > > = > = > The one I ran into is what I alluded to further down, where that > WARN_ON was triggering repeatedly, which ended up flooding the kernel > log and causing the device's watchdog timer to timeout. I'm not sure > what other forms of havoc might ensue, other than "nothing good".. > = > > > This problem has been present ever since: > > > = > > > commit 7f9906bd7f72 ("i2c: xiic: Service all interrupts in isr") > > > = > > > which allowed non-error events to be handled after errors, but > > > became > > > more obvious after: > > > = > > > commit 743e227a8959 ("i2c: xiic: Defer xiic_wakeup() and > > > __xiic_start_xfer() in xiic_process()") > > > = > > > which reworked the code to add a WARN_ON which triggers if both the > > > xfer_more and wakeup_req flags were set, since this combination is > > > not supposed to happen, but was occurring in this scenario. > > > = > > > Skip further interrupt handling after error flags are detected to > > > avoid > > > this problem. > > > = > > > Fixes: 7f9906bd7f72 ("i2c: xiic: Service all interrupts in isr") > > > Signed-off-by: Robert Hancock > > = > > please add: > > = > > Cc: # v4.3+ > > = > = > Can add for a v2 - although with the Fixes tag it would most likely be > picked up for stable anyway.. It's just a courtesy to stable maintainers :) > > > --- > > > =A0drivers/i2c/busses/i2c-xiic.c | 2 ++ > > > =A01 file changed, 2 insertions(+) > > > = > > > diff --git a/drivers/i2c/busses/i2c-xiic.c > > > b/drivers/i2c/busses/i2c-xiic.c > > > index 8a3d9817cb41..ee6edc963dea 100644 > > > --- a/drivers/i2c/busses/i2c-xiic.c > > > +++ b/drivers/i2c/busses/i2c-xiic.c > > > @@ -721,6 +721,8 @@ static irqreturn_t xiic_process(int irq, void > > > *dev_id) > > > =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 wakeu= p_req =3D 1; > > > =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 wakeu= p_code =3D STATE_ERROR; > > > =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 } > > > +=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 /* don't try to handle other ev= ents */ > > > +=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 goto out; > > = > > why don't we have goto's after every irq evaluation but only > > here? Do the issues you mentioned happen olny in this particular > > error case? > > = > = > As far as I can tell, yes. For example you could legitimately have > XIIC_INTR_TX_EMPTY_MASK and XIIC_INTR_BNB_MASK both being handled. The > error case is special as we end up resetting the controller, so it > doesn't make sense to try to continue with the rest of the transaction > while also completing it with an error. I think the patch is correct and I will ack it: Acked-by: Andi Shyti = I think, though, that this needs a proper fix and testing, in order to cover all the possible combinations. The scenario you highlighted is indeed one, but not only, potential situation that could arise. Can I just ask you to write a bit more in the comment to = highlight the possible failure? Thanks a lot, Andi > > Thanks, > > Andi > > = > > > =A0=A0=A0=A0=A0 } > > > =A0=A0=A0=A0=A0 if (pend & XIIC_INTR_RX_FULL_MASK) { > > > =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 /* Receive register/FIFO is f= ull */ > > > -- > > > 2.40.1 > > > = > = > -- = > Robert Hancock _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel