Date: Mon, 19 Mar 2012 16:56:44 +0000
From: Alan Cox
To: "Tom Cooksey"
Cc: "'Rob Clark'", "Rob Clark"
Subject: Re: [PATCH] RFC: dma-buf: userspace mmap support
Message-ID: <20120319165644.3e4ce4ad@pyx>
In-Reply-To: <000101cd05ef$3eab78c0$bc026a40$@cooksey@arm.com>
References: <1331775148-5001-1-git-send-email-rob.clark@linaro.org> <000001cd0399$9e57db90$db0792b0$@cooksey@arm.com> <20120317201719.121f4104@pyx> <000101cd05ef$3eab78c0$bc026a40$@cooksey@arm.com>
Sender: linux-media-owner@vger.kernel.org

> If the API was to also be used for synchronization it would have to
> include an atomic "prepare multiple" ioctl which blocked until all
> the buffers listed by the application were available. In the same

Too slow already. You are now serializing stuff while what we want to do
really is

	nobody_else_gets_buffers_next([list])
	on available(buffer)
		dispatch_work(buffer)

so that you can maximise parallelism without allowing deadlocks. If
you've got a high memory bandwidth and 8+ cores the 'stop everything'
model isn't great.

> This might be a good argument for keeping synchronization and cache
> maintenance separate, though even ignoring synchronization I would
> think being able to issue cache maintenance operations for multiple
> buffers in a single ioctl might present some small efficiency gains.
> However as Rob points out, CPU access is already in slow/legacy
> territory.

Dangerous assumption. I do think they should be separate. For one it
makes the case of synchronization needed but hardware cache management
much easier to split cleanly.
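
[Illustrative sketch, not part of the original message.] The
reserve-then-dispatch pseudocode earlier in this reply might look roughly
like the following in Python. Everything here is an assumption made for
illustration: `Buffer`, `reserve_next`, `dispatch_on_available` and `demo`
are hypothetical names, the all-or-nothing reservation under one lock is
only one possible reading of "without allowing deadlocks", and a thread
pool stands in for whatever would really dispatch the work. The point it
tries to show is that the reservation never blocks, and each buffer's work
starts as soon as that buffer is ready rather than after all of them are.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Buffer:
    """Hypothetical stand-in for a shared buffer."""

    def __init__(self, name):
        self.name = name
        self.available = threading.Event()  # set once the previous user is done
        self.next_owner = None              # job holding the next-use reservation


_reserve_lock = threading.Lock()


def reserve_next(buffers, owner):
    """Claim the next use of every buffer, all or nothing, without blocking."""
    with _reserve_lock:
        if any(b.next_owner is not None for b in buffers):
            return False
        for b in buffers:
            b.next_owner = owner
        return True


def dispatch_on_available(buffers, work, pool):
    """Queue one task per buffer; each starts when its own buffer is ready."""
    def run(buf):
        buf.available.wait()
        return work(buf)
    return {b.name: pool.submit(run, b) for b in buffers}


def demo():
    bufs = [Buffer(n) for n in ("a", "b", "c")]
    order = []
    with ThreadPoolExecutor(max_workers=len(bufs)) as pool:
        assert reserve_next(bufs, owner="job1")
        assert not reserve_next(bufs[:1], owner="job2")  # already reserved
        futures = dispatch_on_available(bufs, lambda b: order.append(b.name), pool)
        for name in ("c", "a", "b"):  # buffers become ready out of order
            next(b for b in bufs if b.name == name).available.set()
            futures[name].result()    # that buffer's work ran without the others
    return order
```

Calling `demo()` returns the buffer names in the order their work ran,
which follows the order in which the buffers became available rather than
the order in which they were listed.
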

Assuming CPU access is slow/legacy reflects a certain model of relatively
slow CPU and accelerators where falling off the acceleration path is bad.
On a higher end processor falling off the acceleration path isn't a
performance matter so much as a power concern.

> KDS we differentiated jobs which needed "exclusive access" to a
> buffer and jobs which needed "shared access" to a buffer. Multiple
> jobs could access a buffer at the same time if those jobs all

Makes sense as it's a reader/writer lock and it reflects MESI/MOESI
caching and cache policy in some hardware/software assists.

> display controller will be reading the front buffer, but the GPU
> might also need to read that front buffer. So perhaps adding
> "read-only" & "read-write" access flags to prepare could also be
> interpreted as shared & exclusive accesses, if we went down this
> route for synchronization that is. :-)

mmap includes read/write info so probably using that works out. It also
means that you have the stuff mapped in a way that will bus error or
segfault anyone who goofs rather than give them the usual 'deep
weirdness' behaviour you get with mishandling of caching bits.

Alan
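
[Illustrative sketch, not part of the original message.] The point about
mmap carrying read/write information can be seen in miniature with
Python's standard `mmap` module. This is only an analogy under stated
assumptions: it maps an ordinary temporary file rather than a dma-buf, the
function name `write_through_readonly_mapping` is hypothetical, and Python
rejects the write in its wrapper with a `TypeError` instead of letting the
process take the bus error or segfault a native program would get from a
read-only mapping. What carries over is that the mistake fails loudly at
the point of access.

```python
import mmap
import tempfile


def write_through_readonly_mapping():
    """Return the exception type raised by writing to a read-only mapping."""
    with tempfile.TemporaryFile() as f:
        f.write(b"\0" * mmap.PAGESIZE)   # give the file one page of content
        f.flush()
        # ACCESS_READ requests a read-only mapping of the file.
        with mmap.mmap(f.fileno(), mmap.PAGESIZE, access=mmap.ACCESS_READ) as m:
            try:
                m[0:1] = b"x"            # the "goof": writing to a read-only map
            except TypeError as exc:
                return type(exc)
    return None                          # reached only if the write was allowed
```
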