From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-qv1-f47.google.com (mail-qv1-f47.google.com [209.85.219.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 717CE153800 for ; Thu, 9 Jan 2025 16:17:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.47 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736439436; cv=none; b=u7u2PaDZZboeYH2vRYQbIk87CM/V3XXVVrtueFAW51GucUnp7A71A0BAmzqfxS4OrwAkY9nXBLFKnjObDorYG6p1dlfenBirmrMd/z+tvDR21IPKUxM2SNsvt4uslLpnnMTls+eP89GK2ln6HgA0T4NlAIQhH/xVvIIZFBgr5X0= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736439436; c=relaxed/simple; bh=EQ9lTkJj5hVITyZtb/CU7uK04iGBFGO+6cXPUdDZjAo=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=pXtRegxnIwAeOwrkvuQN8pvcJ6rwU+q9vEGjrWkhOrjg5I9fsSrCZXFjzYYaNNDRpNOJzAE4CZlbstTp4Kf2+eVHuqETn5atB/hroOw33TOV/5gii9SuteKf0zi+i12pJg/rBIOttPMTu+BPAdMjwSXYTdFEN5Sn32bzXcQud70= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=rowland.harvard.edu; spf=fail smtp.mailfrom=g.harvard.edu; dkim=pass (2048-bit key) header.d=rowland.harvard.edu header.i=@rowland.harvard.edu header.b=aDN+eq8L; arc=none smtp.client-ip=209.85.219.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=rowland.harvard.edu Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=g.harvard.edu Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=rowland.harvard.edu header.i=@rowland.harvard.edu header.b="aDN+eq8L" Received: by mail-qv1-f47.google.com with SMTP id 6a1803df08f44-6d88cb85987so8720826d6.1 for ; Thu, 09 Jan 2025 08:17:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rowland.harvard.edu; s=google; t=1736439433; x=1737044233; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=qvXzKAdEajSDj7kVBY3z63caF0WRxLoy22sGj/czuHk=; b=aDN+eq8La3NzhtgvpLV50kdkEup7pQ8UxSIGKjFT6D6ag/3ONJ4ttWxMLV8ZIBU0Iq ZS4QGav7EgMsq2eR1++vrZSosNZrk3a9EeUPRPy11un328M9NaluC8vqzpgBpeHgLvJc Y+AG72SYLCFVunjIWlqJwXgK5ib49BqxG+8ERb0c6jdqeWcydFOyOsnfhBJFUrC0ILfE LgXbWdRIeNg/T+8JB7iMdbEVUQQ92Sv6lpSXNMW/2sC+EbDVxf59QUNBQFjVtr/jA/yj tbR2N7bqTjXbjF5CMoZMicNgIIt+gfS25J8ay/5ZoPbgBfrHNsDFMbaeytvULYxgcOkb aMdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736439433; x=1737044233; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=qvXzKAdEajSDj7kVBY3z63caF0WRxLoy22sGj/czuHk=; b=FESszYGq4ZSAMJso+B4EwrfoEhoa2u3cGeMzZY4ZHQUHgs74HiYGeCauSftWPU+Vsy F5zlx+rIqfNwAKe0I7LNxnOChaDV6H01nd5vpk90/A6qN/oGZhuK2MVaegZNsgAhAW8X RDaBkHepd0d5/jvz9WrDMtG2zwDH7fYoLjPmzt+r6DN5MTMV8xLTtPThlWzmhXRILLC5 HRg7ioPzynzo0qGEfFW0l8i/rT4wwrRqPVJ2XEeTXLcNPNDpQBCb7P/dzGVuV9e5QkYm To3WYWnc+QPDY5+0LHB8ip83JRw2HEvqmjYTkknAbaW3y43QCnpV0aoIMwTRGrp3CZ4O M5Wg== X-Forwarded-Encrypted: i=1; AJvYcCUxAirUFrbH7rUwoGz5J0YCXjV47QCTv67G1PRVarbCjfo5y1VkytJBU1bBES0vXP0wV5XtWn7dWeiPp2w=@vger.kernel.org X-Gm-Message-State: AOJu0YwgC6JAIkVmXmc7IbwiKlKUUCtP8VlfCf3EYrjDsyCU6TYXV2Il CxrV/CB6pvw1tJkxr+HY/dCM5QkKvH0wc+OAqn7aZ+2ZO1iS7bbBKKGZrZ+XLw== X-Gm-Gg: ASbGncv02cw1Qi715LI0fQ9MIJs5z2XI3VkAThCAMzznT2Sq/WhZs/noScyUgNM0QAD T2DVIA5iq44900It//8FqiiMSzHMumnUyXF0FF54s+d37ptLig06a6n0lGmLNDSkT5YWVrNi1fI dqBfePiP44pZdQmf4IEAk5dew/DoF4pxB+KgpDNP9nxba05ll6pCacIrXWgcFehDqV4GRq2iuOl gkH+2q3V672eW86DLlOehsaethEEzy9CKwCEA7GUHFgcZFEfOxiAJCWrYCnoqLx+JifjAjq63A= X-Google-Smtp-Source: AGHT+IHvvz+osm2cO8LQlWRP667hSU+yCQkQMj1d8ck8XeDh1CX1nmzrX3OCIPW3T2b+YuR2ht6zQg== X-Received: by 2002:a05:6214:509e:b0:6d8:ab3c:5d7 with SMTP id 6a1803df08f44-6dfa3b2473amr56850436d6.24.1736439433388; Thu, 09 Jan 2025 08:17:13 -0800 (PST) Received: from rowland.harvard.edu ([140.247.181.15]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6dfac3e0a37sm2076676d6.89.2025.01.09.08.17.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 09 Jan 2025 08:17:12 -0800 (PST) Date: Thu, 9 Jan 2025 11:17:09 -0500 From: Alan Stern To: Jonas Oberhauser Cc: paulmck@kernel.org, parri.andrea@gmail.com, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, npiggin@gmail.com, dhowells@redhat.com, j.alglave@ucl.ac.uk, luc.maranget@inria.fr, akiyks@gmail.com, dlustig@nvidia.com, joel@joelfernandes.org, urezki@gmail.com, quic_neeraju@quicinc.com, frederic@kernel.org, linux-kernel@vger.kernel.org, lkmm@lists.linux.dev, hernan.poncedeleon@huaweicloud.com Subject: Re: [RFC] tools/memory-model: Rule out OOTA Message-ID: <83b64af7-03b7-4dd8-900b-c57a65479ba8@rowland.harvard.edu> References: <20250106214003.504664-1-jonas.oberhauser@huaweicloud.com> <3ca45d97-670f-4c9e-8df8-2304a1de32b9@huaweicloud.com> <06854f97-a671-426e-8c73-78214da8fbf5@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <06854f97-a671-426e-8c73-78214da8fbf5@huaweicloud.com> On Wed, Jan 08, 2025 at 08:22:07PM +0100, Jonas Oberhauser wrote: > > > Am 1/8/2025 um 7:47 PM schrieb Alan Stern: > > On Wed, Jan 08, 2025 at 06:33:05PM +0100, Jonas Oberhauser wrote: > > > > > > > > > Am 1/7/2025 um 5:09 PM schrieb Alan Stern: > > > > Is this really valid? In the example above, if there were no other > > > > references to a or b in the rest of the program, the compiler could > > > > eliminate them entirely. > > > > > > In the example above, this is not possible, because the address of a/b have > > > escaped the function and are not deallocated before synchronization happens. > > > Therefore the compiler must assume that a/b are accessed inside the compiler > > > barrier. > > > > I'm not quite sure what you mean by that, but if the compiler has access > > to the entire program and can do a global analysis then it can recognize > > cases where accesses that may seem to be live aren't really. > > Even for trivial enough cases where the compiler has access to all the > source, compiler barriers should be opaque to the compiler. > > Since it is opaque, > > *a = 1; > compiler_barrier(); > > might as well be > > *a = 1; > *d = *a; // *d is in device memory > > and so in my opinion the compiler needs to ensure that the value of *a right > before the compiler barrier is 1. > > Of course, only if the address of *a could be possibly legally known to the > opaque code in the compiler barrier. What do you mean by "opaque code in the compiler barrier"? The compiler_barrier() instruction doesn't generate any code at all; it merely directs the compiler not to carry any knowledge about values stored in memory from one side of the barrier to the other. Note that it does _not_ necessarily prevent the compiler from carrying knowledge that a memory location is unused from one side of the barrier to the other. > > However, I admit this objection doesn't really apply to Linux kernel > > programming. > > > > > > (Whether the result could count as OOTA is > > > > open to question, but that's not the point.) Is it not possible that a > > > > compiler might find other ways to defeat your intentions? > > > > > > The main point (which I already mentioned in the previous e-mail) is if the > > > object is deallocated without synchronization (or never escapes the function > > > in the first place). > > > > > > And indeed, any such case renders the added rule unsound. It is in a sense > > > unrelated to OOTA; cases where the load/store can be elided are never OOTA. > > > > That is a matter of definition. In our paper, Paul and I described > > instances of OOTA in which all the accesses have been optimized away as > > "trivial". > > Yes, by OOTA I mean a rwdep;rfe cycle. > > In the absence of data races, such a cycle can't be optimized away because > it is created with volatile/compiler-barrier-protected accesses. That wasn't true in the C++ context of the paper Paul and I worked on. Of course, C++ is not our current context here. What I was trying to get at above is that compiler-barrier protection does not necessarily guarantee that non-volatile accesses can't be optimized away. (However, it's probably safe for us to make such an assumption here.) > It would look something like this: > > Live = R & rng(po \ po ; [W] ; (po-loc \ w_barrier)) | W & dom(po \ ((po-loc > \ w_barrier) ; [W] ; po)) > > let to-w = (overwrite & int) | (addr | rmb ; [Live])? ; rwdep ; ([Live] ; > wmb)? > > > > In any case, it seems that any approximation we can make to Live will be > > subject to various sorts of errors. > > Probably (this is certainly true for trying to approximate dependencies, for > example), but what I know for certain is that the approximations of Live > inside cat get more ugly the more precise they become. In the above > definition of Live I have not included that the address must escape, nor > that it must not be freed. > > A non-local definition that suffices for OOTA would be so: > > Live = R & rng(rfe) & dom(rwdep ; rfe) | W & dom(rfe) I could live with this (although I would prefer to have more parentheses -- IMO it'smistake-prone to rely on the relative precedence of | and &). Especially if the to-w definition above were rewritten in a way that would be a little easier to parse and understand. > It seems the ideal solution is to let Live be defined by the tools, which > should keep up with or exceed the analysis done by state-of-art compilers. I don't think it works that way in practice. :-) Alan