Date: Wed, 20 Sep 2023 12:03:00 +0100
From: Catalin Marinas
To: Serge Semin
Cc: Jan Bottorff, Yann Sionneau, Wolfram Sang, Yann Sionneau, Will Deacon,
    Jarkko Nikula, Andy Shevchenko, Mika Westerberg, Jan Dabros, Andi Shyti,
    Philipp Zabel, linux-i2c@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] i2c: designware: Fix corrupted memory seen in the ISR
References: <37e10c3d-b5ab-75ec-3c96-76e15eb9bef8@sionneau.net>
    <9de89e14-35bd-415d-97f1-4b6db1258997@os.amperecomputing.com>
X-Mailing-List: linux-i2c@vger.kernel.org

On Wed, Sep 20, 2023 at 12:05:50AM +0300, Serge Semin wrote:
> On Tue, Sep 19, 2023 at 11:54:10AM -0700, Jan Bottorff wrote:
> > On 9/19/2023 7:51 AM, Catalin Marinas wrote:
> > > While smp_* is ok, it really depends on what the regmap_write() does. Is
> > > it a write to a shared peripheral (if not, you may need a DSB)?
> > > Does the regmap_write() caller know this? That's why I think having the
> > > barrier in dw_reg_write() is better.
> > >
> > > If you do want to stick to a fix in i2c_dw_xfer_init(), you could go for
> > > dma_wmb(). While this is not strictly DMA, it's sharing data with
> > > another coherent agent (a different CPU in this instance). The smp_wmb()
> > > is more about communication via memory not involving I/O. But this still
> > > assumes that the caller knows regmap_write() ends up with an I/O
> > > write*() (potentially relaxed).
>
> Catalin, thank you very much for your messages. The situation is much
> clearer now. I should have noted that we are indeed talking about
> different memory types: Normal and Device memories. Anyway, to sum it up,
> AFAICS the following is happening:
> 1. some data is updated,
> 2. IRQ is activated by means of writel_relaxed() to the
> DW_IC_INTR_MASK register.
> 3. IRQ is raised and being handled, but the data updated in 1. looks
> corrupted.
>
> (Jan, correct me if I'm wrong.)
>
> Since ARM doesn't "guarantee ordering between memory accesses to
> different devices, or usually between accesses of different memory
> types", most likely the CPU core occasionally swaps 1. and 2.
> So in that case the IRQ handler just doesn't see the
> updated data. What needs to be done is to make sure that 2. always
> happens after 1. is completed. Outer Shareability domain write-barrier
> (DMB(oshst)) does that. Am I right? That's why it's utilized in the
> __io_bw() and __dma_wmb() macro implementations.

Yes.

> Wolfram, Jan, seeing that the root cause of the problem is in using the
> '_relaxed' accessors for the controller CSRs, I assume that the problem
> might be more generic than just for ARMs, since based on [1] these
> accessors "do not guarantee ordering with respect to locking, _normal_
> memory accesses or delay() loops."
> So theoretically the problem might show up on any other arch that
> allows relaxed normal and device memory accesses to be reordered
> against each other.
>
> [1] "KERNEL I/O BARRIER EFFECTS" Documentation/memory-barriers.txt
>
> If so, we need to give up using the _relaxed accessors for
> the CSRs whose writes may cause a side effect such as raising an IRQ.
> Instead, the normal I/O write should be used, which will "wait for the
> completion of all prior writes to memory either issued by, or propagated
> to, the same thread." [1] Thus I'd suggest the following fix for the problem:
>
> --- a/drivers/i2c/busses/i2c-designware-common.c
> +++ b/drivers/i2c/busses/i2c-designware-common.c
> @@ -72,7 +72,10 @@ static int dw_reg_write(void *context, unsigned int reg, unsigned int val)
>  {
>  	struct dw_i2c_dev *dev = context;
>  
> -	writel_relaxed(val, dev->base + reg);
> +	if (reg == DW_IC_INTR_MASK)
> +		writel(val, dev->base + reg);
> +	else
> +		writel_relaxed(val, dev->base + reg);
>  
>  	return 0;
>  }
>
> (and similar changes for dw_reg_write_swab() and dw_reg_write_word().)

This should work as well.

-- 
Catalin
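
For illustration, the ordering discussed in this thread can be modelled in user space. This is a rough sketch under stated assumptions, not the driver code: every sketch_* name and both register offsets are hypothetical. C11 atomics stand in for MMIO, with a relaxed store playing the role of writel_relaxed() and a release fence followed by that store playing the role of writel(). A C11 release fence is only an analogy for the barrier a real writel() issues (DMB(oshst) on arm64), so this shows the shape of the fix, not its hardware semantics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical register offsets, for illustration only. */
#define SKETCH_REG_OTHER     0x10u
#define SKETCH_REG_INTR_MASK 0x30u
#define SKETCH_NUM_REGS      64u

/* Stand-in for the device register file (the driver uses dev->base). */
static _Atomic uint32_t sketch_regs[SKETCH_NUM_REGS];

/* Ordinary memory that an interrupt handler would read later (step 1 data). */
static uint32_t sketch_msg_len;

/* Models writel_relaxed(): the store carries no ordering of its own. */
void sketch_write_relaxed(uint32_t val, unsigned int reg)
{
	assert(reg / 4 < SKETCH_NUM_REGS);
	atomic_store_explicit(&sketch_regs[reg / 4], val, memory_order_relaxed);
}

/* Models writel(): earlier normal-memory writes are ordered before the store. */
void sketch_write_ordered(uint32_t val, unsigned int reg)
{
	atomic_thread_fence(memory_order_release);
	sketch_write_relaxed(val, reg);
}

/* Same dispatch as the proposed patch: only the IRQ-enabling write is ordered. */
int sketch_reg_write(unsigned int reg, uint32_t val)
{
	if (reg == SKETCH_REG_INTR_MASK)
		sketch_write_ordered(val, reg);
	else
		sketch_write_relaxed(val, reg);

	return 0;
}

/* Step 1 (update data) followed by step 2 (unmask interrupts). */
uint32_t sketch_start_transfer(uint32_t len, uint32_t mask)
{
	sketch_msg_len = len;
	sketch_reg_write(SKETCH_REG_INTR_MASK, mask);

	/* Read the mask back so a caller can check what was written. */
	return atomic_load_explicit(&sketch_regs[SKETCH_REG_INTR_MASK / 4],
				    memory_order_relaxed);
}
```

Calling sketch_start_transfer(len, mask) performs steps 1 and 2 in order. A reader on another thread would need a matching acquire after observing the mask to complete the picture, which the IRQ entry path provides in the real case.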