Date: Fri, 10 Nov 2023 10:52:31 +0100
From: Boris Brezillon
To: Robin Murphy
Cc: Joerg Roedel, iommu@lists.linux.dev, Will Deacon, linux-arm-kernel@lists.infradead.org, Rob Clark
Subject: Re: [PATCH 2/2] iommu: Extend LPAE page table format to support custom allocators
Message-ID: <20231110105231.3a8fd2ff@collabora.com>
In-Reply-To: <04ae3d0c-d850-63c7-80bb-f90e26f5b758@arm.com>
References: <20230809121744.2341454-1-boris.brezillon@collabora.com> <20230809121744.2341454-3-boris.brezillon@collabora.com> <04ae3d0c-d850-63c7-80bb-f90e26f5b758@arm.com>

On Wed, 20 Sep 2023 17:42:01 +0100
Robin Murphy wrote:

> On 09/08/2023 1:17 pm, Boris Brezillon wrote:
> > We need that in order to implement the VM_BIND ioctl in the GPU driver
> > targeting new Mali GPUs.
> >
> > VM_BIND is about executing MMU map/unmap requests asynchronously,
> > possibly after waiting for external dependencies encoded as dma_fences.
> > We intend to use the drm_sched framework to automate the dependency
> > tracking and VM job dequeuing logic, but this comes with its own set
> > of constraints, one of them being the fact we are not allowed to
> > allocate memory in the drm_gpu_scheduler_ops::run_job() to avoid this
> > sort of deadlocks:
> >
> > - VM_BIND map job needs to allocate a page table to map some memory
> >   to the VM. No memory available, so kswapd is kicked
> > - GPU driver shrinker backend ends up waiting on the fence attached to
> >   the VM map job or any other job fence depending on this VM operation.
> >
> > With custom allocators, we will be able to pre-reserve enough pages to
> > guarantee the map/unmap operations we queued will take place without
> > going through the system allocator. But we can also optimize
> > allocation/reservation by not free-ing pages immediately, so any
> > upcoming page table allocation requests can be serviced by some free
> > page table pool kept at the driver level.
>
> We should bear in mind it's also potentially valuable for other aspects
> of GPU and similar use-cases, like fine-grained memory accounting and
> resource limiting. That's a significant factor in this approach vs.
> internal caching schemes that could only solve the specific reclaim concern.

I mentioned these other cases in v2. Let me know if that's not detailed
enough.

> > diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
> > index f4caf630638a..e273c18ae22b 100644
> > --- a/drivers/iommu/io-pgtable.c
> > +++ b/drivers/iommu/io-pgtable.c
> > @@ -47,6 +47,18 @@ static int check_custom_allocator(enum io_pgtable_fmt fmt,
> >  	if (!cfg->alloc)
> >  		return 0;
> >
> > +	switch (fmt) {
> > +	case ARM_32_LPAE_S1:
> > +	case ARM_32_LPAE_S2:
> > +	case ARM_64_LPAE_S1:
> > +	case ARM_64_LPAE_S2:
> > +	case ARM_MALI_LPAE:
> > +		return 0;
>
> I remain not entirely convinced by the value of this, but could it at
> least be done in a more scalable manner like some kind of flag provided
> by the format itself?

I added a caps flag to io_pgtable_init_fns in v2. Feels a bit weird to
add a field that's not a function pointer in a struct that's prefixed
with _fns (which I guess stands for _functions), but oh well.
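For anyone following the thread without v2 at hand, here is a minimal userspace sketch of how a format-provided capability flag could replace the per-format switch quoted above. It is not the v2 patch. The flag name IO_PGTABLE_CAP_CUSTOM_ALLOCATOR, the reduced enum, the trimmed-down structs, the ARM_V7S entry without the capability, and the dummy_alloc() helper are all assumptions made for illustration; only check_custom_allocator(), cfg->alloc, io_pgtable_init_fns and the idea of a caps field come from the discussion.

```c
#include <errno.h>
#include <stddef.h>

/* Reduced, illustrative set of formats (not the kernel's full list). */
enum io_pgtable_fmt {
	ARM_32_LPAE_S1,
	ARM_64_LPAE_S1,
	ARM_V7S,
	IO_PGTABLE_NUM_FMTS,
};

/* Hypothetical capability bit: the format honours a custom allocator. */
#define IO_PGTABLE_CAP_CUSTOM_ALLOCATOR (1u << 0)

/* Trimmed-down config: only the field needed for the check. */
struct io_pgtable_cfg {
	void *(*alloc)(void *cookie, size_t size);
};

/* Trimmed-down init_fns: function pointers omitted, caps field added. */
struct io_pgtable_init_fns {
	unsigned int caps;
};

static const struct io_pgtable_init_fns lpae_init_fns = {
	.caps = IO_PGTABLE_CAP_CUSTOM_ALLOCATOR,
};

/* Assumption for the example: this format does not advertise the cap. */
static const struct io_pgtable_init_fns v7s_init_fns = {
	.caps = 0,
};

static const struct io_pgtable_init_fns *const
io_pgtable_init_table[IO_PGTABLE_NUM_FMTS] = {
	[ARM_32_LPAE_S1] = &lpae_init_fns,
	[ARM_64_LPAE_S1] = &lpae_init_fns,
	[ARM_V7S] = &v7s_init_fns,
};

/* Placeholder allocator, only used to make cfg->alloc non-NULL. */
static void *dummy_alloc(void *cookie, size_t size)
{
	(void)cookie;
	(void)size;
	return NULL;
}

static int check_custom_allocator(enum io_pgtable_fmt fmt,
				  const struct io_pgtable_cfg *cfg)
{
	/* No custom allocator requested: nothing to validate. */
	if (!cfg->alloc)
		return 0;

	/* The format itself says whether it supports custom allocators. */
	if (io_pgtable_init_table[fmt]->caps & IO_PGTABLE_CAP_CUSTOM_ALLOCATOR)
		return 0;

	return -EINVAL;
}
```

With this shape, supporting custom allocators in another format means setting one bit in that format's init_fns rather than extending a switch in the core file, which is the scalability concern raised in the review.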
Regards,

Boris