From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?ISO-8859-1?Q?Christian_K=F6nig?= Subject: Re: [PATCH] drm/radeon: add user pointer support v2 Date: Wed, 02 Jul 2014 18:44:30 +0200 Message-ID: <53B436EE.2000607@vodafone.de> References: <1404133456-1790-1-git-send-email-deathsimple@vodafone.de> <20140630173526.GA3169@gmail.com> <53B2841E.1000701@vodafone.de> <20140701153512.GA2529@gmail.com> <53B2FA97.2030206@vodafone.de> <20140701190541.GA3322@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1"; Format="flowed" Content-Transfer-Encoding: quoted-printable Return-path: Received: from pegasos-out.vodafone.de (pegasos-out.vodafone.de [80.84.1.38]) by gabe.freedesktop.org (Postfix) with ESMTP id C8E7C6E5CF for ; Wed, 2 Jul 2014 09:44:47 -0700 (PDT) In-Reply-To: <20140701190541.GA3322@gmail.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Jerome Glisse Cc: dri-devel@lists.freedesktop.org List-Id: dri-devel@lists.freedesktop.org > Yes this could be argued that way, that anyway user than can use the gpu > can already pin large amount of ram. But pining anonymous memory or file > backed memory (ie non regular bo memory) is different as pages are still > on the lru list and thus the vmscan code will still go over them which > might worsen the out of memory case. Assuming we forbid file backed pages and only allow anonymous memory for = this feature, wouldn't it be possible to "steal" the pages in question = from the lru and add them as backing pages to an BO? For userspace it should look like they just created a BO, copied the = content of the memory location in question into it and then mapped the = BO at the original location (obviously without the race condition that = such a thing would imply for doing it in userspace). Just a thought, Christian. Am 01.07.2014 21:05, schrieb Jerome Glisse: > On Tue, Jul 01, 2014 at 08:14:47PM +0200, Christian K=F6nig wrote: >> Am 01.07.2014 17:35, schrieb Jerome Glisse: >>> On Tue, Jul 01, 2014 at 11:49:18AM +0200, Christian K=F6nig wrote: >>>> Am 30.06.2014 19:35, schrieb Jerome Glisse: >>>>> On Mon, Jun 30, 2014 at 03:04:16PM +0200, Christian K=F6nig wrote: >>>>> >>>>> [SNIP] >>>>> @@ -176,8 +181,15 @@ static int radeon_cs_parser_relocs(struct radeon= _cs_parser *p) >>>>> if (p->cs_flags & RADEON_CS_USE_VM) >>>>> p->vm_bos =3D radeon_vm_get_bos(p->rdev, p->ib.vm, >>>>> &p->validated); >>>>> + if (need_mmap_lock) >>>>> + down_read(¤t->mm->mmap_sem); >>>>> + >>>>> + r =3D radeon_bo_list_validate(p->rdev, &p->ticket, &p->validated, p= ->ring); >>>>> - return radeon_bo_list_validate(p->rdev, &p->ticket, &p->validated, = p->ring); >>>>> + if (need_mmap_lock) >>>>> + up_read(¤t->mm->mmap_sem); >>>>> + >>>>> + return r; >>>>> } >>>>> So the mmap lock protect almost nothing here. Once things are validat= ed >>>>> nothing forbid the userspace to unmap the range containing the user b= o. >>>> As far as I understand it (and I'm probably not so deep into the MM >>>> code as you, so correct me if I'm wrong) unmapping the range would >>>> result in an invalidate_range_start callback and this callback in >>>> turn validates the BO into the CPU domain again. So it actually >>>> blocks for the GPU operation to be completed, marks the pages in >>>> question as dirty and then unpins them. >>>> >>>> This should protected us anything nasty to happen in the case of a >>>> unmap(), fork() etc... >>> Thing is nothing forbid from a new cs ioctl to happen right after >>> the radeon range_start callback but before the core mm code that >>> was about to do something. The mmap_sem will protect you from fork >>> or munmap but not from otherthing. >> Any suggestion on how to fix this? >> >> I mean I could take the BO reservation in range_start and release it >> in range_end, but I'm pretty sure the code calling the MMU notifier >> won't like that. > I need to think a bit more about all the case other than munmap/fork > to see what might happen and under what circumstances. > >> [SNIP] >>>>>> + >>>>>> +static const struct mmu_notifier_ops radeon_bo_notifier =3D { >>>>>> + .release =3D radeon_bo_mn_release, >>>>>> + .invalidate_range_start =3D radeon_bo_mn_invalidate_range_start, >>>>>> +}; >>>>> What about invalidate_page callback ? You are breaking write back to >>>>> disk. Which is not a welcome move. >>>> Why? Keep in mind that the pages are pinned down as soon as we make >>>> the first CS which uses them and marked as dirty when we are done >>>> with them. In between those two events the page shouldn't be written >>>> back to disk. >>> Page could be dirty prior to be pin, or dirtied after being (CPU access= ). >>> I am just pointing out that the GPU might be writting to the page while >>> the page is use as source for disk I/O. >> I still don't see the problem here, the worst thing that could >> happen is that we write a halve filled page to disk. But since we >> set the dirty bit again after the GPU is done with the pages it >> should be written back to disk again if necessary. > This break the fsync semantic and expectation. This is the reason why the > cpu mapping is set to read only while disk i/o is in progress. > >> [SNIP] >>> The biggest issue with it is the pining of memory, this violate the mem= cg. >>> Direct-io pin memory but limit itself to a reasonable amount at a time. >>> A GPU buffer object might be several hundred mega byte which obviously >>> can be nasty from memory starvation point of view. >> I thought we would handle this gracefully by accounting the memory >> pinned down to the GTT size as well. E.g. allocating GTT buffers >> pins down memory in the same way we would pin it down through this >> interface. >> >> In the end the maximum amount of pinned down memory is always the GTT si= ze. > Yes this could be argued that way, that anyway user than can use the gpu > can already pin large amount of ram. But pining anonymous memory or file > backed memory (ie non regular bo memory) is different as pages are still > on the lru list and thus the vmscan code will still go over them which > might worsen the out of memory case. > >> Regards, >> Christian. >> >>> Most other issue can be worked out. >>> >>> Cheers, >>> J=E9r=F4me >>>