From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755489AbcIFLoU (ORCPT ); Tue, 6 Sep 2016 07:44:20 -0400 Received: from mga09.intel.com ([134.134.136.24]:8254 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754597AbcIFLoQ (ORCPT ); Tue, 6 Sep 2016 07:44:16 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.30,291,1470726000"; d="asc'?scan'208";a="1052004187" From: Felipe Balbi To: Peter Zijlstra , Alan Stern Cc: "Paul E. McKenney" , Ingo Molnar , USB list , Kernel development list , Will Deacon Subject: Re: Memory barrier needed with wake_up_process()? In-Reply-To: <20160906113605.GL10153@twins.programming.kicks-ass.net> References: <20160905083334.GX10153@twins.programming.kicks-ass.net> <20160906113605.GL10153@twins.programming.kicks-ass.net> User-Agent: Notmuch/0.22.1+63~g994277e (https://notmuchmail.org) Emacs/25.1.3 (x86_64-pc-linux-gnu) Date: Tue, 06 Sep 2016 14:43:39 +0300 Message-ID: <87a8flmexg.fsf@linux.intel.com> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hi, Peter Zijlstra writes: > On Mon, Sep 05, 2016 at 11:29:26AM -0400, Alan Stern wrote: >> On Mon, 5 Sep 2016, Peter Zijlstra wrote: >>=20 >> > > Actually, seeing it written out like this, one realizes that it real= ly=20 >> > > ought to be: >> > >=20 >> > > CPU 0 CPU 1 >> > > ----- ----- >> > > Start DMA Handle DMA-complete irq >> > > Sleep until bh->state smp_mb() >> > > set bh->state >> > > Wake up CPU 0 >> > > smp_mb() >> > > Compute rc based on contents of the DMA buffer >> > >=20 >> > > (Bear in mind also that on some platforms, the I/O operation is carr= ied=20 >> > > out by PIO rather than DMA.) > >> > Also, smp_mb() doesn't necessarily interact with MMIO/DMA at all IIRC. >> > Its only defined to do CPU/CPU interactions. >>=20 >> Suppose the DMA master finishes filling up an input buffer and issues a >> completion irq to CPU1. Presumably the data would then be visible to >> CPU1 if the interrupt handler looked at it. So if CPU1 executes smp_mb >> before setting bh->state and waking up CPU0, and if CPU0 executes >> smp_mb after testing bh->state and before reading the data buffer, >> wouldn't CPU0 then see the correct data in the buffer? Even if CPU0=20 >> never did go to sleep? > > Couple of notes here; I would expect the DMA master to make its stores > _globally_ visible on 'completion'. Because I'm not sure our smp_mb() > would help make it globally visible, since its only defined on CPU/CPU > interactions, not external. > > Note that for example ARM has the distinction where smp_mb() uses > DMB-ISH barrier, dma_[rw]mb() uses DMB-OSH{LD,ST} and mb() uses DSB-SY. > > The ISH domain is the Inner-SHarable (IIRC) and only includes CPUs. The > OSH is Outer-SHarable and adds external agents like DMA (also includes > CPUs). The DSB-SY thing is even heavier and syncs world or something; I > always forget these details. > >> Or would something more be needed? > > The thing is, this is x86 (TSO). Most everything is globally > ordered/visible and full barriers. > > The only reorder allowed by TSO is a later read can happen before a > prior store. That is, only: > > X =3D 1; > smp_mb(); > r =3D Y; > > actually needs a full barrier. All the other variants are no-ops. > > Note that x86's dma_[rw]mb() are no-ops (like the regular smp_[rw]mb()). > Which seems to imply DMA is coherent and globally visible. > >> > I would very much expect the device IO stuff to order things for us in >> > this case. "Start DMA" should very much include sufficient fences to >> > ensure the data under DMA is visible to the DMA engine, this would very >> > much include things like flushing store buffers and maybe even writeba= ck >> > caches, depending on platform needs. >> >=20 >> > At the same time, I would expect "Handle DMA-complete irq", even if it >> > were done with a PIO polling loop, to guarantee ordering against later >> > operations such that 'complete' really means that. >>=20 >> That's what I would expect too. >>=20 >> Back in the original email thread where the problem was first reported,= =20 >> Felipe says that the problem appears to be something else. Here's what= =20 >> it looks like now, in schematic form: >>=20 >> CPU0 >> ---- >> get_next_command(): >> while (bh->state !=3D BUF_STATE_EMPTY) >> sleep_thread(); >> start an input request for bh >> while (bh->state !=3D BUF_STATE_FULL) >> sleep_thread(); >>=20 >> As mentioned above, the input involves DMA and is terminated by an irq. >> The request's completion handler is bulk_out_complete(): >>=20 >> CPU1 >> ---- >> bulk_out_complete(): >> bh->state =3D BUF_STATE_FULL; >> wakeup_thread(); >>=20 >> According to Felipe, when CPU0 wakes up and checks bh->state, it sees a= =20 >> value different from BUF_STATE_FULL. So it goes back to sleep again=20 >> and doesn't make any forward progress. >>=20 >> It's possible that something else is changing bh->state when it=20 >> shouldn't. But if this were the explanation, why would Felipe see that= =20 >> the problem goes away when he changes the memory barriers in=20 >> sleep_thread() and wakeup_thread() to smp_mb()? > > Good question, let me stare at the code more. > > Could you confirm that bulk_{in,out}_complete() work on different > usb_request structures, and they can not, at any time, get called on the > _same_ request? usb_requests are allocated for a specific endpoint and USB Device Controller (UDC) drivers refuse to queue requests allocated for epX to epY, so this really can never happen. My fear now, however, is that changing smp_[rw]mb() to smp_mb() just adds extra overhead which makes the problem much, much less likely to happen. Does that sound plausible to you? =2D-=20 balbi --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIcBAEBCAAGBQJXzqvrAAoJEMy+uJnhGpkGz2EQAK5EzMGJMbKJ2kMeL8e6QrL1 HibPYLeLI/Gb33Qm5WWh4sil+7YT1mZbmSluE7Ofa8vsW20dXDHDi8R0XfGjKEMq Gz/UtepZC1ZKFiaY7pQP4se2E/LDYmcNvZY02hl2BxvYGhpf/0DiOe2B2sXZ8gwa pN0XBtrxIA/QNvetN0RL+BI49Xuc7E307LPcIZgfm09/JUIoGOcBfSwTE5oq7HWB s3qgHS3nS2G6szNl7RTPTNXYdqccLWi5ktavr0Ew1oRPtEsI3bkhPdrwOK5h3yo4 dJAPIMflLAkpIdb+aCojqlHoAnDCO/o0/Y3gbg1K6eyvZdqEW1eytjDYMhWLE8m7 eUDxRl0OWjPgE4DgS3U3NdB5Or/DkrR1HwxSV5Jpi0kFZFlmFpaihcd9eB2zGw1M +IVeHaIl0TfpBY6Y0GZl2JiXPXAR50f8bsMiIKhHU2gRIKMCgLv5O4Z4X5ZFejPB KlRSkQddpNGjFc2hhMINjYUub7kWWEki1DwI7Eaaubhb91AblIpV3VDJX7CAdYEg Ve5t5fxQ2+96Q1eJRGmSYsdro+4f63DUdePubF2e4vHlrXe16IYK9phQQBHoEgZJ nuqgl/ZJkeuf0kl3dspGWDE8UI+DVfNA7ISw9B+942WHSymoh59Gq4pvc4s24V5n OaYcg8RGEAGyRTLz4nS1 =ct+q -----END PGP SIGNATURE----- --=-=-=--