From: Boris Brezillon <boris.brezillon@collabora.com>
To: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: "David Airlie" <airlied@gmail.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Gurchetan Singh" <gurchetansingh@chromium.org>,
	"Chia-I Wu" <olvaffe@gmail.com>,
	"Daniel Vetter" <daniel@ffwll.ch>,
	"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
	"Maxime Ripard" <mripard@kernel.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"Christian König" <christian.koenig@amd.com>,
	"Qiang Yu" <yuq825@gmail.com>,
	"Steven Price" <steven.price@arm.com>,
	"Emma Anholt" <emma@anholt.net>, "Melissa Wen" <mwen@igalia.com>,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	kernel@collabora.com, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH v16 15/20] drm/shmem-helper: Add memory shrinker
Date: Thu, 14 Sep 2023 15:27:03 +0200
Message-ID: <20230914152703.78b1ac82@collabora.com>
In-Reply-To: <ca7e905b-2809-fce4-1b56-7909efb1a229@collabora.com>

On Thu, 14 Sep 2023 16:01:37 +0300
Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote:

> On 9/14/23 14:58, Boris Brezillon wrote:
> > On Thu, 14 Sep 2023 14:36:23 +0300
> > Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote:
> >   
> >> On 9/14/23 11:27, Boris Brezillon wrote:  
> >>> On Thu, 14 Sep 2023 10:50:32 +0300
> >>> Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote:
> >>>     
> >>>> On 9/14/23 10:36, Boris Brezillon wrote:    
> >>>>> On Thu, 14 Sep 2023 07:02:52 +0300
> >>>>> Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote:
> >>>>>       
> >>>>>> On 9/13/23 10:48, Boris Brezillon wrote:      
> >>>>>>> On Wed, 13 Sep 2023 03:56:14 +0300
> >>>>>>> Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote:
> >>>>>>>         
> >>>>>>>> On 9/5/23 11:03, Boris Brezillon wrote:        
> >>>>>>>>>>                * But
> >>>>>>>>>> +		 * acquiring the obj lock in drm_gem_shmem_release_pages_locked() can
> >>>>>>>>>> +		 * cause a locking order inversion between reservation_ww_class_mutex
> >>>>>>>>>> +		 * and fs_reclaim.
> >>>>>>>>>> +		 *
> >>>>>>>>>> +		 * This deadlock is not actually possible, because no one should
> >>>>>>>>>> +		 * be already holding the lock when drm_gem_shmem_free() is called.
> >>>>>>>>>> +		 * Unfortunately lockdep is not aware of this detail.  So when the
> >>>>>>>>>> +		 * refcount drops to zero, don't touch the reservation lock.
> >>>>>>>>>> +		 */
> >>>>>>>>>> +		if (shmem->got_pages_sgt &&
> >>>>>>>>>> +		    refcount_dec_and_test(&shmem->pages_use_count)) {
> >>>>>>>>>> +			drm_gem_shmem_do_release_pages_locked(shmem);
> >>>>>>>>>> +			shmem->got_pages_sgt = false;
> >>>>>>>>>>  		}          
> >>>>>>>>> Leaking memory is the right thing to do if pages_use_count > 1 (it's
> >>>>>>>>> better to leak than having someone access memory it no longer owns), but
> >>>>>>>>> I think it's worth mentioning in the above comment.          
> >>>>>>>>
> >>>>>>>> It's unlikely that it will be only a leak without a follow-up
> >>>>>>>> use-after-free. Neither is acceptable.
> >>>>>>>
> >>>>>>> Not necessarily, if you have a page leak, it could be that the GPU has
> >>>>>>> access to those pages, but doesn't need the GEM object anymore
> >>>>>>> (pages are mapped by the iommu, which doesn't need shmem->sgt or
> >>>>>>> shmem->pages after the mapping is created). Without a WARN_ON(), this
> >>>>>>> can go unnoticed and lead to memory corruptions/information leaks.
> >>>>>>>         
> >>>>>>>>
> >>>>>>>> The drm_gem_shmem_free() could be changed such that kernel won't blow up
> >>>>>>>> on a refcnt bug, but that's not worthwhile doing because drivers
> >>>>>>>> shouldn't have silly bugs.        
> >>>>>>>
> >>>>>>> We definitely don't want to fix that, but we want to complain loudly
> >>>>>>> (WARN_ON()), and make sure the risk is limited (preventing memory from
> >>>>>>> being re-assigned to someone else by not freeing it).        
> >>>>>>
> >>>>>> That's what the code did and continues to do here. Not exactly sure what
> >>>>>> you're trying to say. I'm going to relocate the comment in v17 to
> >>>>>> put_pages(), we can continue discussing it there if I'm missing your point.
> >>>>>>      
> >>>>>
> >>>>> I'm just saying it would be worth mentioning that we're intentionally
> >>>>> leaking memory if shmem->pages_use_count > 1. Something like:
> >>>>>
> >>>>> 	/**
> >>>>> 	 * shmem->pages_use_count should be 1 when ->sgt != NULL and
> >>>>> 	 * zero otherwise. If some users still hold a pages reference
> >>>>> 	 * that's a bug, and we intentionally leak the pages so they
> >>>>> 	 * can't be re-allocated to someone else while the GPU/CPU
> >>>>> 	 * still have access to it.
> >>>>> 	 */
> >>>>> 	drm_WARN_ON(drm,
> >>>>> 		    refcount_read(&shmem->pages_use_count) != (shmem->sgt ? 1 : 0));
> >>>>> 	if (shmem->sgt && refcount_dec_and_test(&shmem->pages_use_count))
> >>>>> 		drm_gem_shmem_free_pages(shmem);      
> >>>>
> >>>> That may be acceptable, but only once there is a driver using this
> >>>> feature.    
> >>>
> >>> Which feature? That's not related to a specific feature, that's just
> >>> how drm_gem_shmem_get_pages_sgt() works, it takes a pages ref that can
> >>> only be released in drm_gem_shmem_free(), because sgt users are not
> >>> refcounted and the sgt stays around until the GEM object is freed or
> >>> its pages are evicted. The only valid cases we have at the moment are:
> >>>
> >>> - pages_use_count == 1 && sgt != NULL
> >>> - pages_use_count == 0
> >>>
> >>> any other situations are buggy.    
> >>
> >> sgt may belong to dma-buf for which pages_use_count=0, this can't be
> >> done until sgt mess is sorted out  
> > 
> > No it can't, not in that path, because the code you're adding is in the
> > if (!obj->import_attach) branch:
> > 
> > 
> >  	if (obj->import_attach) {
> >  		drm_prime_gem_destroy(obj, shmem->sgt);
> >  	} else {
> > 		...
> > 		// Your changes are here.
> > 		...  
> 
> This branch is taken for the dma-buf in the prime import error code path.

I suggested a fix for this error that didn't involve adding a new flag,
but that's orthogonal to the piece of code we're discussing anyway.

> But yes, pages_use_count=0 for the dma-buf, so it can be
> written as:
> 
> 	if (obj->import_attach) {
> 		drm_prime_gem_destroy(obj, shmem->sgt);
> 	} else {
> 		drm_WARN_ON(obj->dev, refcount_read(&shmem->vmap_use_count));
> 
> 		if (shmem->sgt && refcount_read(&shmem->pages_use_count)) {

You should drop the '&& refcount_read(&shmem->pages_use_count)': for
a non-imported object the check is redundant (the sgt allocation retained
a ref, so pages_use_count > 0 when ->sgt != NULL).

If you added this pages_use_count > 0 check to deal with the
'free-partially-imported-GEM' case, I still think this is not
the right fix. You should just assume that obj->import_attach == NULL
means not-a-prime-buffer, and then make sure
partially-initialized-prime-GEMs have import_attach assigned (see the
one-liner I suggested in my review of
`[PATCH v15 01/23] drm/shmem-helper: Fix UAF in error path when
freeing SGT of imported GEM`).
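
To illustrate the invariant only (this is not the one-liner from that
review, and none of it is kernel code): a minimal userspace sketch,
assuming a hypothetical `struct gem_model` and a hypothetical
`model_pick_free_path()`. If import_attach is always set for prime
buffers, including partially initialized ones, the free path can branch
on it alone and needs no extra pages_use_count test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Which branch of the free path would be taken (hypothetical names). */
enum free_path { FREE_PATH_PRIME, FREE_PATH_NATIVE };

/* Hypothetical stand-in for a GEM shmem object; only the fields discussed. */
struct gem_model {
	const void *import_attach;	/* non-NULL models a prime (imported) buffer */
	bool has_sgt;			/* models shmem->sgt != NULL */
	unsigned int pages_use_count;	/* models the pages refcount */
};

/*
 * Branch selection depends on import_attach only. An imported object
 * that failed half-way through initialization still lands on the prime
 * path as long as import_attach was assigned before the failure.
 */
static enum free_path model_pick_free_path(const struct gem_model *obj)
{
	return obj->import_attach ? FREE_PATH_PRIME : FREE_PATH_NATIVE;
}
```

With this, an object that has an sgt but pages_use_count == 0 is
recognized as imported by its import_attach pointer, not by its refcount.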

> 			dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
> 					  DMA_BIDIRECTIONAL, 0);
> 			sg_free_table(shmem->sgt);
> 			kfree(shmem->sgt);
> 
> 			__drm_gem_shmem_put_pages(shmem);

You need to decrement pages_use_count:

			/*
			 * shmem->pages_use_count should be 1 when ->sgt != NULL and
			 * zero otherwise. If some users still hold a pages reference
			 * that's a bug, and we intentionally leak the pages so they
			 * can't be re-allocated to someone else while the GPU/CPU
			 * still has access to them.
			 */
			if (refcount_dec_and_test(&shmem->pages_use_count))
				__drm_gem_shmem_put_pages(shmem);

> 		}
> 
> 		drm_WARN_ON(obj->dev, refcount_read(&shmem->pages_use_count));

And now this WARN_ON() ^ should catch an unexpected pages leak.
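
For reference, the expected behaviour can be modelled in plain userspace
C. This is only a rough sketch under stated assumptions: `struct
shmem_model`, `model_dec_and_test()` and `model_free()` are hypothetical
stand-ins (a plain counter replaces refcount_t, a flag replaces the sgt
pointer), and it assumes that an object with an sgt holds at least one
pages reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the shmem object; only the fields discussed. */
struct shmem_model {
	unsigned int pages_use_count;	/* models refcount_t pages_use_count */
	bool has_sgt;			/* models shmem->sgt != NULL */
	bool pages_released;		/* set once the backing pages are given back */
};

/* Models refcount_dec_and_test(): drop one ref, report whether it hit zero. */
static bool model_dec_and_test(unsigned int *count)
{
	assert(*count > 0);	/* assumption: never called on a zero count */
	return --(*count) == 0;
}

/*
 * Models the non-imported branch of the free path. Returns true when the
 * object was in a valid state, false when references were still held
 * (the case the final WARN_ON() is meant to catch).
 */
static bool model_free(struct shmem_model *obj)
{
	if (obj->has_sgt) {
		/* The sgt owns exactly one pages reference; drop it here. */
		obj->has_sgt = false;
		if (model_dec_and_test(&obj->pages_use_count))
			obj->pages_released = true;
		/* Otherwise the pages are intentionally leaked. */
	}

	if (obj->pages_use_count != 0) {
		fprintf(stderr, "WARN: %u pages reference(s) still held\n",
			obj->pages_use_count);
		return false;
	}
	return true;
}
```

Starting from pages_use_count == 1 with an sgt, the pages are released
and nothing is reported. Starting from 2, the pages are kept and the
warning fires, which is the "leak rather than free" behaviour described
in the comment above.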

> 
> Alright, I'll check if it works as expected for fixing the error code path bug for v17
> 

