From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Nirmoy Das <nirmoy.das@linux.intel.com>,
	Nirmoy Das <nirmoy.das@intel.com>,
	 dri-devel@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org,
	Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>,
	Matthew Brost <matthew.brost@intel.com>,
	Christian Koenig <christian.koenig@amd.com>,
	Matthew Auld <matthew.auld@intel.com>
Subject: Re: [PATCH v6 2/2] drm/xe/lnl: Offload system clear page activity to GPU
Date: Tue, 20 Aug 2024 15:36:45 +0200
Message-ID: <77b3e3994036e4fb6874aff6a1ce39543e7eefea.camel@linux.intel.com>
In-Reply-To: <b393e5ab-d69c-4bde-9ba2-3801ad8d5b48@linux.intel.com>

Hi, Nirmoy,

On Mon, 2024-08-19 at 18:01 +0200, Nirmoy Das wrote:
> 
> On 8/19/2024 1:05 PM, Matthew Auld wrote:
> > On 16/08/2024 14:51, Nirmoy Das wrote:
> > > On LNL, because of flat CCS, the driver creates a migrate job to clear
> > > CCS metadata. Extend that to also clear system pages using the GPU.
> > > Inform TTM to allocate pages without __GFP_ZERO, to avoid double page
> > > clearing, by clearing the TTM_TT_FLAG_ZERO_ALLOC flag, and set
> > > TTM_TT_FLAG_CLEARED_ON_FREE while freeing to skip the ttm pool's clear
> > > on free, as XE now takes care of clearing pages. If a BO is in system
> > > placement, such as a BO created with
> > > DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING, and there is a CPU map, then GPU
> > > clear is avoided for that BO, as there is no DMA mapping for it at
> > > that moment to create migration jobs.
> > > 
> > > Tested this patch with api_overhead_benchmark_l0 from
> > > https://github.com/intel/compute-benchmarks
> > > 
> > > Without the patch:
> > > api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation:
> > > UsmMemoryAllocation(api=l0 type=Host size=4KB) 84.206 us
> > > UsmMemoryAllocation(api=l0 type=Host size=1GB) 105775.56 us
> > > Perf tool top 5 entries:
> > > 71.44% api_overhead_be  [kernel.kallsyms]   [k] clear_page_erms
> > > 6.34%  api_overhead_be  [kernel.kallsyms]   [k] __pageblock_pfn_to_page
> > > 2.24%  api_overhead_be  [kernel.kallsyms]   [k] cpa_flush
> > > 2.15%  api_overhead_be  [kernel.kallsyms]   [k] pages_are_mergeable
> > > 1.94%  api_overhead_be  [kernel.kallsyms]   [k] find_next_iomem_res
> > > 
> > > With the patch:
> > > api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation:
> > > UsmMemoryAllocation(api=l0 type=Host size=4KB) 79.439 us
> > > UsmMemoryAllocation(api=l0 type=Host size=1GB) 98677.75 us
> > > Perf tool top 5 entries:
> > > 11.16% api_overhead_be  [kernel.kallsyms]   [k] __pageblock_pfn_to_page
> > > 7.85%  api_overhead_be  [kernel.kallsyms]   [k] cpa_flush
> > > 7.59%  api_overhead_be  [kernel.kallsyms]   [k] find_next_iomem_res
> > > 7.24%  api_overhead_be  [kernel.kallsyms]   [k] pages_are_mergeable
> > > 5.53%  api_overhead_be  [kernel.kallsyms]   [k] lookup_address_in_pgd_attr
> > > 
> > > Without this patch, clear_page_erms() dominates execution time and is
> > > not pipelined with migration jobs. With this patch, page clearing gets
> > > pipelined with the migration job, which frees the CPU for more work.
> > > 
> > > v2: Handle regression on dgfx (Himal)
> > >      Update commit message as no ttm API changes needed.
> > > v3: Fix Kunit test.
> > > v4: Handle data leak on cpu mmap (Thomas)
> > > v5: s/gpu_page_clear/gpu_page_clear_sys and move setting
> > >      it to xe_ttm_sys_mgr_init() and other nits (Matt Auld)
> > > v6: Disable it when init_on_alloc and/or init_on_free is active (Matt)
> > >      Use compute-benchmarks, as the reporter used it to report this
> > >      allocation latency issue; also a proper test application than mime.
> > >      In v5, the test showed a significant reduction in alloc latency,
> > >      but that is no longer the case. I think this was mostly because
> > >      the previous test was done on an IFWI which had low memory
> > >      bandwidth from the CPU.
> > > 
> > > Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
> > > Cc: Matthew Auld <matthew.auld@intel.com>
> > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
> > > Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
> > 
> > Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> 
> 
> Thanks Matt.
> 
> Pushed this to drm-xe-next. The series contains a ttm pool change which,
> as agreed with Christian, is small enough to not cause any issue, so it
> can be pulled through drm-xe-next.

I have a question that was sent as a reply-to on that patch.

Thanks,
Thomas

> 
> 
> Regards,
> 
> Nirmoy
> 
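
For reference, the size of the improvement in the quoted benchmark results can be worked out with a short script. This is only a sketch: the timings are copied from the commit message above, while the names TIMINGS_US, reduction_pct and REDUCTIONS are made up for illustration and are not part of the patch or of compute-benchmarks.

```python
# Timings in microseconds, copied from the quoted commit message:
# (without the patch, with the patch).
TIMINGS_US = {
    "Host 4KB": (84.206, 79.439),
    "Host 1GB": (105775.56, 98677.75),
}


def reduction_pct(before_us, after_us):
    """Return the relative latency reduction in percent."""
    return (before_us - after_us) / before_us * 100.0


# Hypothetical helper table: allocation size -> percent reduction.
REDUCTIONS = {
    name: reduction_pct(before, after)
    for name, (before, after) in TIMINGS_US.items()
}

if __name__ == "__main__":
    for name, pct in REDUCTIONS.items():
        print(f"{name}: {pct:.1f}% lower allocation latency")
```

Both sizes come out as a single-digit percentage reduction, which matches the v6 note that the gain in allocation latency is modest; the larger change is in the perf profile, where clear_page_erms no longer appears in the top 5 entries.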


Thread overview: 22+ messages
2024-08-16 13:51 [PATCH v6 1/2] drm/ttm: Add a flag to allow drivers to skip clear-on-free Nirmoy Das
2024-08-16 13:51 ` [PATCH v6 2/2] drm/xe/lnl: Offload system clear page activity to GPU Nirmoy Das
2024-08-19 11:05   ` Matthew Auld
2024-08-19 16:01     ` Nirmoy Das
2024-08-20 13:36       ` Thomas Hellström [this message]
2024-08-16 14:25 ` ✓ CI.Patch_applied: success for series starting with [v6,1/2] drm/ttm: Add a flag to allow drivers to skip clear-on-free Patchwork
2024-08-16 14:25 ` ✓ CI.checkpatch: " Patchwork
2024-08-16 14:27 ` ✓ CI.KUnit: " Patchwork
2024-08-16 14:38 ` ✓ CI.Build: " Patchwork
2024-08-16 14:40 ` ✓ CI.Hooks: " Patchwork
2024-08-16 14:42 ` ✗ CI.checksparse: warning " Patchwork
2024-08-16 14:57 ` ✓ CI.BAT: success " Patchwork
2024-08-17  0:39 ` ✓ CI.FULL: " Patchwork
2024-08-20 13:33 ` [PATCH v6 1/2] " Thomas Hellström
2024-08-20 14:06   ` Nirmoy Das
2024-08-20 15:30   ` Christian König
2024-08-20 15:45     ` Thomas Hellström
2024-08-20 15:47       ` Christian König
2024-08-20 16:46         ` Nirmoy Das
2024-08-21  7:47           ` Christian König
2024-08-21  8:08             ` Thomas Hellström
2024-08-21 10:22               ` Nirmoy Das
