public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: "Christian König" <christian.koenig@amd.com>,
	"Dave Airlie" <airlied@gmail.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	LKML <linux-kernel@vger.kernel.org>,
	 Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>,
	Alex Deucher <alexdeucher@gmail.com>,
	lingshan.zhu@amd.com, Matthew Brost <matthew.brost@intel.com>
Subject: Re: [git pull] drm fixes for 6.11-rc6
Date: Tue, 03 Sep 2024 11:26:20 +0200	[thread overview]
Message-ID: <80ed1e928de230f54098d2a4b7f14b5d3556a687.camel@linux.intel.com> (raw)
In-Reply-To: <5c493bd5-e657-4241-81d7-19ccd380b379@amd.com>

On Mon, 2024-09-02 at 12:33 +0200, Christian König wrote:
> Am 02.09.24 um 11:32 schrieb Thomas Hellström:
> > On Mon, 2024-09-02 at 08:13 +1000, Dave Airlie wrote:
> > > On Fri, 30 Aug 2024 at 12:32, Linus Torvalds
> > > <torvalds@linux-foundation.org> wrote:
> > > > On Fri, 30 Aug 2024 at 14:08, Dave Airlie <airlied@gmail.com>
> > > > wrote:
> > > > > The TTM revert is due to some stuttering graphical apps
> > > > > probably
> > > > > due
> > > > > to longer stalls while prefaulting.
> > > > Yeah, trying to pre-fault a PMD worth of pages in one go is
> > > > just
> > > > crazy talk.
> > > > 
> > > > Now, if it was PMD-aligned and you faulted in a single PMD,
> > > > that
> > > > would
> > > > be different. But just doing prn_insert_page() in a loop is
> > > > insane.
> > > > 
> > > > The code doesn't even stop when it hits a page that already
> > > > existed,
> > > > and it keeps locking and unlocking the last-level page table
> > > > over
> > > > and
> > > > over again.
> > > > 
> > > > Honestly, that code is questionable even for the *small* value,
> > > > much
> > > > less the "a PMD size" case.
> > > > 
> > > > Now, if you have an array of 'struct page *", you can use
> > > > vm_insert_pages(), and that's reasonably efficient.
> > > > 
> > > > And if you have a *contiguous* are of pfns, you can use
> > > > remap_pfn_range().
> > > > 
> > > > But that "insert one pfn at a time" that the drm layer does is
> > > > complete garbage. You're not speeding anything up, you're just
> > > > digging
> > > > deeper.
> > 
> > > I wonder if there is functionality that could be provided in a
> > > common
> > > helper, by the mm layers, or if there would be too many locking
> > > interactions to make it sane,
> > > 
> > > It seems too fraught with danger for drivers or subsystems to be
> > > just
> > > doing this in the simplest way that isn't actually that smart.
> > Hmm. I see even the "Don't error on prefaults" check was broken at
> > some
> > point :/.
> > 
> > There have been numerous ways to try to address this,
> > 
> > The remap_pfn_range was last tried, at least in the context of the
> > i915
> > driver IIRC by Christoph Hellwig but had to be ripped out since it
> > requires the mmap_lock in write mode. Here we have it only in read
> > mode.
> > 
> > Then there's the apply_to_page_range() used by the igfx
> > functionality
> > of the i915 driver. I don't think we should go that route without
> > turning it into something like vm_insert_pfns() with proper
> > checking.
> > This approach populates all entries of a buffer object.
> > 
> > Finally there's the huge fault attempt that had to be ripped out
> > due to
> > lack of pmd_special and pud_special flags and resulting clashes
> > with
> > gup_fast.
> > 
> > Perhaps a combination of the two latter if properly implemented
> > would
> > be the best choice.
> 
> I'm not deep enough into the memory management background to judge
> which 
> approach is the best, just one more data point to provide:
> 
> The pre-faulting was increased because of virtualization. When
> KVM/XEN 
> is mapping a BO into a guest the switching overhead for each fault is
> so 
> high that mapping a lot of PFNs at the same time becomes beneficial.

Since populating at mmap time is not possible due to eviction /
migration, perhaps one way would be to use madvise() to toggle
prefaulting size? MADV_RANDOM vs MADV_SEQUENTIAL vs MADV_NORMAL.

/Thomas

> 
> Regards,
> Christian.
> 
> > 
> > /Thomas
> > 
> > > Dave.
> 


  parent reply	other threads:[~2024-09-03  9:26 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-30  2:08 [git pull] drm fixes for 6.11-rc6 Dave Airlie
2024-08-30  2:32 ` Linus Torvalds
2024-09-01 22:13   ` Dave Airlie
2024-09-02  9:32     ` Thomas Hellström
2024-09-02 10:33       ` Christian König
2024-09-02 19:37         ` Linus Torvalds
2024-09-03  9:26         ` Thomas Hellström [this message]
2024-08-30  2:44 ` pr-tracker-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=80ed1e928de230f54098d2a4b7f14b5d3556a687.camel@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=Arunpravin.PaneerSelvam@amd.com \
    --cc=airlied@gmail.com \
    --cc=alexdeucher@gmail.com \
    --cc=christian.koenig@amd.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=lingshan.zhu@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matthew.brost@intel.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox