* [Intel-xe] [PATCH 0/2] Handle mmap with D3
@ 2023-12-06 13:34 Badal Nilawar
2023-12-06 13:34 ` [Intel-xe] [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings Badal Nilawar
0 siblings, 5 replies; 15+ messages in thread
From: Badal Nilawar @ 2023-12-06 13:34 UTC (permalink / raw)
To: intel-xe; +Cc: matthew.auld, rodrigo.vivi
Handle mmap with D3 for headed and headless cards.
Badal Nilawar (2):
drm/xe/dgfx: Block rpm for active mmap mappings
drm/xe/dgfx: Release mmap mappings on rpm suspend
drivers/gpu/drm/xe/xe_bo.c | 93 +++++++++++++++++++++++++++-
drivers/gpu/drm/xe/xe_bo.h | 2 +
drivers/gpu/drm/xe/xe_bo_types.h | 5 ++
drivers/gpu/drm/xe/xe_device_types.h | 20 ++++++
drivers/gpu/drm/xe/xe_pm.c | 7 +++
5 files changed, 124 insertions(+), 3 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 15+ messages in thread* [Intel-xe] [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings 2023-12-06 13:34 [Intel-xe] [PATCH 0/2] Handle mmap with D3 Badal Nilawar @ 2023-12-06 13:34 ` Badal Nilawar 2023-12-07 11:26 ` Matthew Auld 2023-12-06 13:34 ` [Intel-xe] [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend Badal Nilawar ` (3 subsequent siblings) 4 siblings, 1 reply; 15+ messages in thread From: Badal Nilawar @ 2023-12-06 13:34 UTC (permalink / raw) To: intel-xe; +Cc: matthew.auld, rodrigo.vivi Block rpm for discrete cards when mmap mappings are active. Ideally rpm wake ref should be taken in vm_open call and put in vm_close call but it is seen that vm_open doesn't get called for xe_gem_vm_ops. Therefore rpm wake ref is being get in xe_drm_gem_ttm_mmap and put in vm_close. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> --- drivers/gpu/drm/xe/xe_bo.c | 35 +++++++++++++++++++++++++++++++++-- 1 file changed, 33 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index 72dc4a4eed4e..5741948a2a51 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -15,6 +15,7 @@ #include <drm/ttm/ttm_tt.h> #include <drm/xe_drm.h> +#include "i915_drv.h" #include "xe_device.h" #include "xe_dma_buf.h" #include "xe_drm_client.h" @@ -1158,17 +1159,47 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf) return ret; } +static void xe_ttm_bo_vm_close(struct vm_area_struct *vma) +{ + struct ttm_buffer_object *tbo = vma->vm_private_data; + struct drm_device *ddev = tbo->base.dev; + struct xe_device *xe = to_xe_device(ddev); + + ttm_bo_vm_close(vma); + + if (tbo->resource->bus.is_iomem) + xe_device_mem_access_put(xe); +} + static const struct vm_operations_struct xe_gem_vm_ops = { .fault = xe_gem_fault, .open = ttm_bo_vm_open, - .close = ttm_bo_vm_close, + .close = 
xe_ttm_bo_vm_close, .access = ttm_bo_vm_access }; +int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem, + struct vm_area_struct *vma) +{ + struct ttm_buffer_object *tbo = drm_gem_ttm_of_gem(gem); + struct drm_device *ddev = tbo->base.dev; + struct xe_device *xe = to_xe_device(ddev); + int ret; + + ret = drm_gem_ttm_mmap(gem, vma); + if (ret < 0) + return ret; + + if (tbo->resource->bus.is_iomem) + xe_device_mem_access_get(xe); + + return 0; +} + static const struct drm_gem_object_funcs xe_gem_object_funcs = { .free = xe_gem_object_free, .close = xe_gem_object_close, - .mmap = drm_gem_ttm_mmap, + .mmap = xe_drm_gem_ttm_mmap, .export = xe_gem_prime_export, .vm_ops = &xe_gem_vm_ops, }; -- 2.25.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings 2023-12-06 13:34 ` [Intel-xe] [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings Badal Nilawar @ 2023-12-07 11:26 ` Matthew Auld 2023-12-07 13:06 ` Matthew Auld 2023-12-08 7:17 ` Nilawar, Badal 0 siblings, 2 replies; 15+ messages in thread From: Matthew Auld @ 2023-12-07 11:26 UTC (permalink / raw) To: Badal Nilawar, intel-xe; +Cc: rodrigo.vivi On 06/12/2023 13:34, Badal Nilawar wrote: > Block rpm for discrete cards when mmap mappings are active. > Ideally rpm wake ref should be taken in vm_open call and put in vm_close > call but it is seen that vm_open doesn't get called for xe_gem_vm_ops. > Therefore rpm wake ref is being get in xe_drm_gem_ttm_mmap and put > in vm_close. > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > Cc: Anshuman Gupta <anshuman.gupta@intel.com> > Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> > --- > drivers/gpu/drm/xe/xe_bo.c | 35 +++++++++++++++++++++++++++++++++-- > 1 file changed, 33 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c > index 72dc4a4eed4e..5741948a2a51 100644 > --- a/drivers/gpu/drm/xe/xe_bo.c > +++ b/drivers/gpu/drm/xe/xe_bo.c > @@ -15,6 +15,7 @@ > #include <drm/ttm/ttm_tt.h> > #include <drm/xe_drm.h> > > +#include "i915_drv.h" Do we need this? 
> #include "xe_device.h" > #include "xe_dma_buf.h" > #include "xe_drm_client.h" > @@ -1158,17 +1159,47 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf) > return ret; > } > > +static void xe_ttm_bo_vm_close(struct vm_area_struct *vma) > +{ > + struct ttm_buffer_object *tbo = vma->vm_private_data; > + struct drm_device *ddev = tbo->base.dev; > + struct xe_device *xe = to_xe_device(ddev); > + > + ttm_bo_vm_close(vma); > + > + if (tbo->resource->bus.is_iomem) > + xe_device_mem_access_put(xe); > +} > + > static const struct vm_operations_struct xe_gem_vm_ops = { > .fault = xe_gem_fault, > .open = ttm_bo_vm_open, > - .close = ttm_bo_vm_close, > + .close = xe_ttm_bo_vm_close, > .access = ttm_bo_vm_access > }; > > +int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem, > + struct vm_area_struct *vma) > +{ > + struct ttm_buffer_object *tbo = drm_gem_ttm_of_gem(gem); > + struct drm_device *ddev = tbo->base.dev; > + struct xe_device *xe = to_xe_device(ddev); > + int ret; > + > + ret = drm_gem_ttm_mmap(gem, vma); > + if (ret < 0) > + return ret; > + > + if (tbo->resource->bus.is_iomem) > + xe_device_mem_access_get(xe); Checking is_iomem outside of the usual locking is racy. One issue here is that is_iomem can freely change at any point (like at fault time) so when vm_close is called you can easily get an an unbalanced RPM ref count. For example io_mem is false here but later becomes true in bo_vm_close and then we call mem_access_put even though we never called mem_access_get. Maybe check the possible placements of the object instead since that is immutable? > + > + return 0; > +} > + > static const struct drm_gem_object_funcs xe_gem_object_funcs = { > .free = xe_gem_object_free, > .close = xe_gem_object_close, > - .mmap = drm_gem_ttm_mmap, > + .mmap = xe_drm_gem_ttm_mmap, > .export = xe_gem_prime_export, > .vm_ops = &xe_gem_vm_ops, > }; ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings 2023-12-07 11:26 ` Matthew Auld @ 2023-12-07 13:06 ` Matthew Auld 2023-12-08 7:29 ` Nilawar, Badal 2023-12-08 7:17 ` Nilawar, Badal 1 sibling, 1 reply; 15+ messages in thread From: Matthew Auld @ 2023-12-07 13:06 UTC (permalink / raw) To: Badal Nilawar, intel-xe; +Cc: rodrigo.vivi On 07/12/2023 11:26, Matthew Auld wrote: > On 06/12/2023 13:34, Badal Nilawar wrote: >> Block rpm for discrete cards when mmap mappings are active. >> Ideally rpm wake ref should be taken in vm_open call and put in vm_close >> call but it is seen that vm_open doesn't get called for xe_gem_vm_ops. >> Therefore rpm wake ref is being get in xe_drm_gem_ttm_mmap and put >> in vm_close. >> >> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> >> Cc: Anshuman Gupta <anshuman.gupta@intel.com> >> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> >> --- >> drivers/gpu/drm/xe/xe_bo.c | 35 +++++++++++++++++++++++++++++++++-- >> 1 file changed, 33 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c >> index 72dc4a4eed4e..5741948a2a51 100644 >> --- a/drivers/gpu/drm/xe/xe_bo.c >> +++ b/drivers/gpu/drm/xe/xe_bo.c >> @@ -15,6 +15,7 @@ >> #include <drm/ttm/ttm_tt.h> >> #include <drm/xe_drm.h> >> +#include "i915_drv.h" > > Do we need this? > >> #include "xe_device.h" >> #include "xe_dma_buf.h" >> #include "xe_drm_client.h" >> @@ -1158,17 +1159,47 @@ static vm_fault_t xe_gem_fault(struct vm_fault >> *vmf) >> return ret; >> } >> +static void xe_ttm_bo_vm_close(struct vm_area_struct *vma) >> +{ >> + struct ttm_buffer_object *tbo = vma->vm_private_data; >> + struct drm_device *ddev = tbo->base.dev; >> + struct xe_device *xe = to_xe_device(ddev); >> + >> + ttm_bo_vm_close(vma); >> + >> + if (tbo->resource->bus.is_iomem) >> + xe_device_mem_access_put(xe); Are you sure this works as expected? Say if the user partially unmaps something? 
map = mmap(obj, size);
unmap(map, size/2);
unmap(map, size);

That would be one mmap but multiple vm_close calls leading to an
imbalance in the RPM ref. I think we need the access_get in the vm_open
also?

>> +}
>> +
>> static const struct vm_operations_struct xe_gem_vm_ops = {
>> .fault = xe_gem_fault,
>> .open = ttm_bo_vm_open,
>> - .close = ttm_bo_vm_close,
>> + .close = xe_ttm_bo_vm_close,
>> .access = ttm_bo_vm_access
>> };
>> +int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem,
>> + struct vm_area_struct *vma)
>> +{
>> + struct ttm_buffer_object *tbo = drm_gem_ttm_of_gem(gem);
>> + struct drm_device *ddev = tbo->base.dev;
>> + struct xe_device *xe = to_xe_device(ddev);
>> + int ret;
>> +
>> + ret = drm_gem_ttm_mmap(gem, vma);
>> + if (ret < 0)
>> + return ret;
>> +
>> + if (tbo->resource->bus.is_iomem)
>> + xe_device_mem_access_get(xe);
>
> Checking is_iomem outside of the usual locking is racy. One issue here
> is that is_iomem can freely change at any point (like at fault time) so
> when vm_close is called you can easily get an an unbalanced RPM ref
> count. For example io_mem is false here but later becomes true in
> bo_vm_close and then we call mem_access_put even though we never called
> mem_access_get.
>
> Maybe check the possible placements of the object instead since that is
> immutable?
>
>> +
>> + return 0;
>> +}
>> +
>> static const struct drm_gem_object_funcs xe_gem_object_funcs = {
>> .free = xe_gem_object_free,
>> .close = xe_gem_object_close,
>> - .mmap = drm_gem_ttm_mmap,
>> + .mmap = xe_drm_gem_ttm_mmap,
>> .export = xe_gem_prime_export,
>> .vm_ops = &xe_gem_vm_ops,
>> };

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings 2023-12-07 13:06 ` Matthew Auld @ 2023-12-08 7:29 ` Nilawar, Badal 2023-12-08 9:11 ` Matthew Auld 0 siblings, 1 reply; 15+ messages in thread From: Nilawar, Badal @ 2023-12-08 7:29 UTC (permalink / raw) To: Matthew Auld, intel-xe; +Cc: rodrigo.vivi On 07-12-2023 18:36, Matthew Auld wrote: > On 07/12/2023 11:26, Matthew Auld wrote: >> On 06/12/2023 13:34, Badal Nilawar wrote: >>> Block rpm for discrete cards when mmap mappings are active. >>> Ideally rpm wake ref should be taken in vm_open call and put in vm_close >>> call but it is seen that vm_open doesn't get called for xe_gem_vm_ops. >>> Therefore rpm wake ref is being get in xe_drm_gem_ttm_mmap and put >>> in vm_close. >>> >>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> >>> Cc: Anshuman Gupta <anshuman.gupta@intel.com> >>> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> >>> --- >>> drivers/gpu/drm/xe/xe_bo.c | 35 +++++++++++++++++++++++++++++++++-- >>> 1 file changed, 33 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c >>> index 72dc4a4eed4e..5741948a2a51 100644 >>> --- a/drivers/gpu/drm/xe/xe_bo.c >>> +++ b/drivers/gpu/drm/xe/xe_bo.c >>> @@ -15,6 +15,7 @@ >>> #include <drm/ttm/ttm_tt.h> >>> #include <drm/xe_drm.h> >>> +#include "i915_drv.h" >> >> Do we need this? >> >>> #include "xe_device.h" >>> #include "xe_dma_buf.h" >>> #include "xe_drm_client.h" >>> @@ -1158,17 +1159,47 @@ static vm_fault_t xe_gem_fault(struct >>> vm_fault *vmf) >>> return ret; >>> } >>> +static void xe_ttm_bo_vm_close(struct vm_area_struct *vma) >>> +{ >>> + struct ttm_buffer_object *tbo = vma->vm_private_data; >>> + struct drm_device *ddev = tbo->base.dev; >>> + struct xe_device *xe = to_xe_device(ddev); >>> + >>> + ttm_bo_vm_close(vma); >>> + >>> + if (tbo->resource->bus.is_iomem) >>> + xe_device_mem_access_put(xe); > > Are you sure this works as expected? Say if the user partially unmaps > something? 
>
> map = mmap(obj, size);
> unmap(map, size/2);
> unmap(map, size);
>
> That would be one mmap but multiple vm_close calls leading to an
> imbalance in the RPM ref. I think we need the access_get in the vm_open
> also?

I haven't tried partial mmap, but for a single mmap-unmap I observed an
equal number of xe_drm_gem_ttm_mmap and vm_close calls. Will try
partial mmap.

For mem_access_get in vm_open, initially we were trying the same but
observed that vm_open never gets called. In fact, in i915's
i915_gem_mman.c we found this comment for vm_open:

	/*
	 * When we install vm_ops for mmap we are too late for
	 * the vm_ops->open() which increases the ref_count of
	 * this obj and then it gets decreased by the vm_ops->close().
	 * To balance this increase the obj ref_count here.
	 */

Does a similar reason apply for xe vm_open as well?

Regards,
Badal

>
>>> +}
>>> +
>>> static const struct vm_operations_struct xe_gem_vm_ops = {
>>> .fault = xe_gem_fault,
>>> .open = ttm_bo_vm_open,
>>> - .close = ttm_bo_vm_close,
>>> + .close = xe_ttm_bo_vm_close,
>>> .access = ttm_bo_vm_access
>>> };
>>> +int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem,
>>> + struct vm_area_struct *vma)
>>> +{
>>> + struct ttm_buffer_object *tbo = drm_gem_ttm_of_gem(gem);
>>> + struct drm_device *ddev = tbo->base.dev;
>>> + struct xe_device *xe = to_xe_device(ddev);
>>> + int ret;
>>> +
>>> + ret = drm_gem_ttm_mmap(gem, vma);
>>> + if (ret < 0)
>>> + return ret;
>>> +
>>> + if (tbo->resource->bus.is_iomem)
>>> + xe_device_mem_access_get(xe);
>>
>> Checking is_iomem outside of the usual locking is racy. One issue here
>> is that is_iomem can freely change at any point (like at fault time)
>> so when vm_close is called you can easily get an an unbalanced RPM ref
>> count. For example io_mem is false here but later becomes true in
>> bo_vm_close and then we call mem_access_put even though we never
>> called mem_access_get.
>>
>> Maybe check the possible placements of the object instead since that
>> is immutable?
>> >>> + >>> + return 0; >>> +} >>> + >>> static const struct drm_gem_object_funcs xe_gem_object_funcs = { >>> .free = xe_gem_object_free, >>> .close = xe_gem_object_close, >>> - .mmap = drm_gem_ttm_mmap, >>> + .mmap = xe_drm_gem_ttm_mmap, >>> .export = xe_gem_prime_export, >>> .vm_ops = &xe_gem_vm_ops, >>> }; ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings 2023-12-08 7:29 ` Nilawar, Badal @ 2023-12-08 9:11 ` Matthew Auld 0 siblings, 0 replies; 15+ messages in thread From: Matthew Auld @ 2023-12-08 9:11 UTC (permalink / raw) To: Nilawar, Badal, intel-xe; +Cc: rodrigo.vivi On 08/12/2023 07:29, Nilawar, Badal wrote: > > > On 07-12-2023 18:36, Matthew Auld wrote: >> On 07/12/2023 11:26, Matthew Auld wrote: >>> On 06/12/2023 13:34, Badal Nilawar wrote: >>>> Block rpm for discrete cards when mmap mappings are active. >>>> Ideally rpm wake ref should be taken in vm_open call and put in >>>> vm_close >>>> call but it is seen that vm_open doesn't get called for xe_gem_vm_ops. >>>> Therefore rpm wake ref is being get in xe_drm_gem_ttm_mmap and put >>>> in vm_close. >>>> >>>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> >>>> Cc: Anshuman Gupta <anshuman.gupta@intel.com> >>>> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> >>>> --- >>>> drivers/gpu/drm/xe/xe_bo.c | 35 +++++++++++++++++++++++++++++++++-- >>>> 1 file changed, 33 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c >>>> index 72dc4a4eed4e..5741948a2a51 100644 >>>> --- a/drivers/gpu/drm/xe/xe_bo.c >>>> +++ b/drivers/gpu/drm/xe/xe_bo.c >>>> @@ -15,6 +15,7 @@ >>>> #include <drm/ttm/ttm_tt.h> >>>> #include <drm/xe_drm.h> >>>> +#include "i915_drv.h" >>> >>> Do we need this? 
>>>
>>>> #include "xe_device.h"
>>>> #include "xe_dma_buf.h"
>>>> #include "xe_drm_client.h"
>>>> @@ -1158,17 +1159,47 @@ static vm_fault_t xe_gem_fault(struct
>>>> vm_fault *vmf)
>>>> return ret;
>>>> }
>>>> +static void xe_ttm_bo_vm_close(struct vm_area_struct *vma)
>>>> +{
>>>> + struct ttm_buffer_object *tbo = vma->vm_private_data;
>>>> + struct drm_device *ddev = tbo->base.dev;
>>>> + struct xe_device *xe = to_xe_device(ddev);
>>>> +
>>>> + ttm_bo_vm_close(vma);
>>>> +
>>>> + if (tbo->resource->bus.is_iomem)
>>>> + xe_device_mem_access_put(xe);
>>
>> Are you sure this works as expected? Say if the user partially unmaps
>> something?
>>
>> map = mmap(obj, size);
>> unmap(map, size/2);
>> unmap(map, size);
>>
>> That would be one mmap but multiple vm_close calls leading to an
>> imbalance in the RPM ref. I think we need the access_get in the
>> vm_open also?
>
> I haven't tried partial mmap but for a single mmap-unmap I observed an
> equal number of xe_drm_gem_ttm_mmap and vm_close calls. Will try
> partial mmap.
>
> For mem_access_get in vm_open, initially we were trying the same but
> observed that vm_open never gets called.

Yeah, if you do:

mmap(obj, size)
munmap(obj, size)

That will do mmap and one vm_close, no vm_open AFAICT. But that looks
to be fine here.

> In fact, in i915's i915_gem_mman.c we found this comment for vm_open:
> /*
> * When we install vm_ops for mmap we are too late for
> * the vm_ops->open() which increases the ref_count of
> * this obj and then it gets decreased by the vm_ops->close().
> * To balance this increase the obj ref_count here.
> */
> Does a similar reason apply for xe vm_open as well?

I think so. If you do:

mmap(obj, size)
munmap(obj, size/2)
munmap(obj, size)

That will do one mmap, one vm_open for the newly split vma and finally
two vm_closes, AFAICT.
> > Regards, > Badal >> >>>> +} >>>> + >>>> static const struct vm_operations_struct xe_gem_vm_ops = { >>>> .fault = xe_gem_fault, >>>> .open = ttm_bo_vm_open, >>>> - .close = ttm_bo_vm_close, >>>> + .close = xe_ttm_bo_vm_close, >>>> .access = ttm_bo_vm_access >>>> }; >>>> +int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem, >>>> + struct vm_area_struct *vma) >>>> +{ >>>> + struct ttm_buffer_object *tbo = drm_gem_ttm_of_gem(gem); >>>> + struct drm_device *ddev = tbo->base.dev; >>>> + struct xe_device *xe = to_xe_device(ddev); >>>> + int ret; >>>> + >>>> + ret = drm_gem_ttm_mmap(gem, vma); >>>> + if (ret < 0) >>>> + return ret; >>>> + >>>> + if (tbo->resource->bus.is_iomem) >>>> + xe_device_mem_access_get(xe); >>> >>> Checking is_iomem outside of the usual locking is racy. One issue >>> here is that is_iomem can freely change at any point (like at fault >>> time) so when vm_close is called you can easily get an an unbalanced >>> RPM ref count. For example io_mem is false here but later becomes >>> true in bo_vm_close and then we call mem_access_put even though we >>> never called mem_access_get. >>> >>> Maybe check the possible placements of the object instead since that >>> is immutable? >>> >>>> + >>>> + return 0; >>>> +} >>>> + >>>> static const struct drm_gem_object_funcs xe_gem_object_funcs = { >>>> .free = xe_gem_object_free, >>>> .close = xe_gem_object_close, >>>> - .mmap = drm_gem_ttm_mmap, >>>> + .mmap = xe_drm_gem_ttm_mmap, >>>> .export = xe_gem_prime_export, >>>> .vm_ops = &xe_gem_vm_ops, >>>> }; ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings 2023-12-07 11:26 ` Matthew Auld 2023-12-07 13:06 ` Matthew Auld @ 2023-12-08 7:17 ` Nilawar, Badal 2023-12-08 9:27 ` Matthew Auld 1 sibling, 1 reply; 15+ messages in thread From: Nilawar, Badal @ 2023-12-08 7:17 UTC (permalink / raw) To: Matthew Auld, intel-xe; +Cc: rodrigo.vivi On 07-12-2023 16:56, Matthew Auld wrote: > On 06/12/2023 13:34, Badal Nilawar wrote: >> Block rpm for discrete cards when mmap mappings are active. >> Ideally rpm wake ref should be taken in vm_open call and put in vm_close >> call but it is seen that vm_open doesn't get called for xe_gem_vm_ops. >> Therefore rpm wake ref is being get in xe_drm_gem_ttm_mmap and put >> in vm_close. >> >> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> >> Cc: Anshuman Gupta <anshuman.gupta@intel.com> >> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> >> --- >> drivers/gpu/drm/xe/xe_bo.c | 35 +++++++++++++++++++++++++++++++++-- >> 1 file changed, 33 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c >> index 72dc4a4eed4e..5741948a2a51 100644 >> --- a/drivers/gpu/drm/xe/xe_bo.c >> +++ b/drivers/gpu/drm/xe/xe_bo.c >> @@ -15,6 +15,7 @@ >> #include <drm/ttm/ttm_tt.h> >> #include <drm/xe_drm.h> >> +#include "i915_drv.h" > > Do we need this? This is needed in patch 2 I will remove from this patch. 
> >> #include "xe_device.h" >> #include "xe_dma_buf.h" >> #include "xe_drm_client.h" >> @@ -1158,17 +1159,47 @@ static vm_fault_t xe_gem_fault(struct vm_fault >> *vmf) >> return ret; >> } >> +static void xe_ttm_bo_vm_close(struct vm_area_struct *vma) >> +{ >> + struct ttm_buffer_object *tbo = vma->vm_private_data; >> + struct drm_device *ddev = tbo->base.dev; >> + struct xe_device *xe = to_xe_device(ddev); >> + >> + ttm_bo_vm_close(vma); >> + >> + if (tbo->resource->bus.is_iomem) >> + xe_device_mem_access_put(xe); >> +} >> + >> static const struct vm_operations_struct xe_gem_vm_ops = { >> .fault = xe_gem_fault, >> .open = ttm_bo_vm_open, >> - .close = ttm_bo_vm_close, >> + .close = xe_ttm_bo_vm_close, >> .access = ttm_bo_vm_access >> }; >> +int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem, >> + struct vm_area_struct *vma) >> +{ >> + struct ttm_buffer_object *tbo = drm_gem_ttm_of_gem(gem); >> + struct drm_device *ddev = tbo->base.dev; >> + struct xe_device *xe = to_xe_device(ddev); >> + int ret; >> + >> + ret = drm_gem_ttm_mmap(gem, vma); >> + if (ret < 0) >> + return ret; >> + >> + if (tbo->resource->bus.is_iomem) >> + xe_device_mem_access_get(xe); > > Checking is_iomem outside of the usual locking is racy. One issue here > is that is_iomem can freely change at any point (like at fault time) so > when vm_close is called you can easily get an an unbalanced RPM ref > count. For example io_mem is false here but later becomes true in > bo_vm_close and then we call mem_access_put even though we never called > mem_access_get. > > Maybe check the possible placements of the object instead since that is > immutable? 
Sure, will check bo placements:

	struct ttm_place *place = &(bo->placements[0]);
	if (mem_type_is_vram(place->mem_type))

Or can we check the tbo resource mem_type?

	if (mem_type_is_vram(tbo->resource->mem_type))

Regards,
Badal

>
>> +
>> + return 0;
>> +}
>> +
>> static const struct drm_gem_object_funcs xe_gem_object_funcs = {
>> .free = xe_gem_object_free,
>> .close = xe_gem_object_close,
>> - .mmap = drm_gem_ttm_mmap,
>> + .mmap = xe_drm_gem_ttm_mmap,
>> .export = xe_gem_prime_export,
>> .vm_ops = &xe_gem_vm_ops,
>> };

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings 2023-12-08 7:17 ` Nilawar, Badal @ 2023-12-08 9:27 ` Matthew Auld 0 siblings, 0 replies; 15+ messages in thread From: Matthew Auld @ 2023-12-08 9:27 UTC (permalink / raw) To: Nilawar, Badal, intel-xe; +Cc: rodrigo.vivi On 08/12/2023 07:17, Nilawar, Badal wrote: > > > On 07-12-2023 16:56, Matthew Auld wrote: >> On 06/12/2023 13:34, Badal Nilawar wrote: >>> Block rpm for discrete cards when mmap mappings are active. >>> Ideally rpm wake ref should be taken in vm_open call and put in vm_close >>> call but it is seen that vm_open doesn't get called for xe_gem_vm_ops. >>> Therefore rpm wake ref is being get in xe_drm_gem_ttm_mmap and put >>> in vm_close. >>> >>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> >>> Cc: Anshuman Gupta <anshuman.gupta@intel.com> >>> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> >>> --- >>> drivers/gpu/drm/xe/xe_bo.c | 35 +++++++++++++++++++++++++++++++++-- >>> 1 file changed, 33 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c >>> index 72dc4a4eed4e..5741948a2a51 100644 >>> --- a/drivers/gpu/drm/xe/xe_bo.c >>> +++ b/drivers/gpu/drm/xe/xe_bo.c >>> @@ -15,6 +15,7 @@ >>> #include <drm/ttm/ttm_tt.h> >>> #include <drm/xe_drm.h> >>> +#include "i915_drv.h" >> >> Do we need this? > This is needed in patch 2 I will remove from this patch. 
>> >>> #include "xe_device.h" >>> #include "xe_dma_buf.h" >>> #include "xe_drm_client.h" >>> @@ -1158,17 +1159,47 @@ static vm_fault_t xe_gem_fault(struct >>> vm_fault *vmf) >>> return ret; >>> } >>> +static void xe_ttm_bo_vm_close(struct vm_area_struct *vma) >>> +{ >>> + struct ttm_buffer_object *tbo = vma->vm_private_data; >>> + struct drm_device *ddev = tbo->base.dev; >>> + struct xe_device *xe = to_xe_device(ddev); >>> + >>> + ttm_bo_vm_close(vma); >>> + >>> + if (tbo->resource->bus.is_iomem) >>> + xe_device_mem_access_put(xe); >>> +} >>> + >>> static const struct vm_operations_struct xe_gem_vm_ops = { >>> .fault = xe_gem_fault, >>> .open = ttm_bo_vm_open, >>> - .close = ttm_bo_vm_close, >>> + .close = xe_ttm_bo_vm_close, >>> .access = ttm_bo_vm_access >>> }; >>> +int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem, >>> + struct vm_area_struct *vma) >>> +{ >>> + struct ttm_buffer_object *tbo = drm_gem_ttm_of_gem(gem); >>> + struct drm_device *ddev = tbo->base.dev; >>> + struct xe_device *xe = to_xe_device(ddev); >>> + int ret; >>> + >>> + ret = drm_gem_ttm_mmap(gem, vma); >>> + if (ret < 0) >>> + return ret; >>> + >>> + if (tbo->resource->bus.is_iomem) >>> + xe_device_mem_access_get(xe); >> >> Checking is_iomem outside of the usual locking is racy. One issue here >> is that is_iomem can freely change at any point (like at fault time) >> so when vm_close is called you can easily get an an unbalanced RPM ref >> count. For example io_mem is false here but later becomes true in >> bo_vm_close and then we call mem_access_put even though we never >> called mem_access_get. >> >> Maybe check the possible placements of the object instead since that >> is immutable? > Sure will check bo placements > struct ttm_place *place = &(bo->placements[0]); > if (mem_type_is_vram(place->mem_type)) Yeah, something like that. Although we need to consider all the placements. 
Maybe just bo->flags & XE_BO_CREATE_VRAM_MASK, which would indicate that it can potentially be placed in VRAM at some point. > Or can we check tbo resource memtype > if (mem_type_is_vram(tbo->resource->mem_type)) We can't touch tbo->resource (or any state within it like is_iomem) here without the proper locking. For example it could be NULL or various other scary things if we are unlucky. But even with the lock it can change at any point after you drop it. > > Regards, > Badal >> >>> + >>> + return 0; >>> +} >>> + >>> static const struct drm_gem_object_funcs xe_gem_object_funcs = { >>> .free = xe_gem_object_free, >>> .close = xe_gem_object_close, >>> - .mmap = drm_gem_ttm_mmap, >>> + .mmap = xe_drm_gem_ttm_mmap, >>> .export = xe_gem_prime_export, >>> .vm_ops = &xe_gem_vm_ops, >>> }; ^ permalink raw reply [flat|nested] 15+ messages in thread
* [Intel-xe] [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend 2023-12-06 13:34 [Intel-xe] [PATCH 0/2] Handle mmap with D3 Badal Nilawar 2023-12-06 13:34 ` [Intel-xe] [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings Badal Nilawar @ 2023-12-06 13:34 ` Badal Nilawar 2023-12-07 11:37 ` Matthew Auld 2023-12-08 11:15 ` [Intel-xe] " Thomas Hellström 2023-12-06 16:28 ` [Intel-xe] ✓ CI.Patch_applied: success for Handle mmap with D3 Patchwork ` (2 subsequent siblings) 4 siblings, 2 replies; 15+ messages in thread From: Badal Nilawar @ 2023-12-06 13:34 UTC (permalink / raw) To: intel-xe; +Cc: matthew.auld, rodrigo.vivi Release all mmap mappings for all vram objects which are associated with userfault such that, while pcie function in D3hot, any access to memory mappings will raise a userfault. Upon userfault, in order to access memory mappings, if graphics function is in D3 then runtime resume of dgpu will be triggered to transition to D0. Previous commit has blocked the rpm but let's make sure that rpm does not get blocked for headed cards(client's parts). Above pagefault approach will be useful for headed cards to save the power when display is off and there are active mmap mappings. v2: - Add lock dep assertion before updating vram_userfault_count (Rodrigo) - Avoid iomem check before bo migration check as bo can migrate to system memory (Matthew Auld) v3: - Apply pagefault approach for headed cards. 
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> --- drivers/gpu/drm/xe/xe_bo.c | 62 ++++++++++++++++++++++++++-- drivers/gpu/drm/xe/xe_bo.h | 2 + drivers/gpu/drm/xe/xe_bo_types.h | 5 +++ drivers/gpu/drm/xe/xe_device_types.h | 20 +++++++++ drivers/gpu/drm/xe/xe_pm.c | 7 ++++ 5 files changed, 93 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index 5741948a2a51..419bc5c55aa7 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -797,6 +797,18 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict, dma_fence_put(fence); } + /* + * TTM has already nuked the mmap for us (see ttm_bo_unmap_virtual), + * so if we moved from VRAM make sure to unlink this from the userfault + * tracking. + */ + if (mem_type_is_vram(old_mem_type)) { + spin_lock(&xe->mem_access.vram_userfault_lock); + if (INTEL_DISPLAY_ENABLED(xe) && !list_empty(&bo->vram_userfault_link)) + list_del_init(&bo->vram_userfault_link); + spin_unlock(&xe->mem_access.vram_userfault_lock); + } + xe_device_mem_access_put(xe); out: @@ -1125,6 +1137,8 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf) { struct ttm_buffer_object *tbo = vmf->vma->vm_private_data; struct drm_device *ddev = tbo->base.dev; + struct xe_bo *bo = NULL; + struct xe_device *xe = to_xe_device(ddev); vm_fault_t ret; int idx, r = 0; @@ -1133,7 +1147,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf) return ret; if (drm_dev_enter(ddev, &idx)) { - struct xe_bo *bo = ttm_to_xe_bo(tbo); + bo = ttm_to_xe_bo(tbo); trace_xe_bo_cpu_fault(bo); @@ -1152,10 +1166,31 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf) } else { ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot); } + if (ret == VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) return ret; + /* + * ttm_bo_vm_reserve() already has dma_resv_lock. 
+ * vram_userfault_count is protected by dma_resv lock and rpm wakeref. + */ + if (INTEL_DISPLAY_ENABLED(xe) && ret == VM_FAULT_NOPAGE && bo && !bo->vram_userfault_count) { + if (tbo->resource->bus.is_iomem) { + dma_resv_assert_held(tbo->base.resv); + + xe_device_mem_access_get(xe); + + bo->vram_userfault_count = 1; + + spin_lock(&xe->mem_access.vram_userfault_lock); + list_add(&bo->vram_userfault_link, &xe->mem_access.vram_userfault_list); + spin_unlock(&xe->mem_access.vram_userfault_lock); + + xe_device_mem_access_put(xe); + } + } dma_resv_unlock(tbo->base.resv); + return ret; } @@ -1167,7 +1202,7 @@ static void xe_ttm_bo_vm_close(struct vm_area_struct *vma) ttm_bo_vm_close(vma); - if (tbo->resource->bus.is_iomem) + if (!INTEL_DISPLAY_ENABLED(xe) && tbo->resource->bus.is_iomem) xe_device_mem_access_put(xe); } @@ -1190,7 +1225,7 @@ int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem, if (ret < 0) return ret; - if (tbo->resource->bus.is_iomem) + if (!INTEL_DISPLAY_ENABLED(xe) && tbo->resource->bus.is_iomem) xe_device_mem_access_get(xe); return 0; @@ -2296,6 +2331,27 @@ int xe_bo_dumb_create(struct drm_file *file_priv, return err; } +void xe_bo_runtime_pm_release_mmap_offset(struct xe_bo *bo) +{ + struct ttm_buffer_object *tbo = &bo->ttm; + struct ttm_device *bdev = tbo->bdev; + struct drm_device *ddev = tbo->base.dev; + struct xe_device *xe = to_xe_device(ddev); + + if (!INTEL_DISPLAY_ENABLED(xe)) + return; + + drm_vma_node_unmap(&tbo->base.vma_node, bdev->dev_mapping); + + /* + * We have exclusive access here via runtime suspend. All other callers + * must first grab the rpm wakeref. 
+ */ + XE_WARN_ON(!bo->vram_userfault_count); + list_del(&bo->vram_userfault_link); + bo->vram_userfault_count = 0; +} + #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST) #include "tests/xe_bo.c" #endif diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h index 098ccab7fa1e..fd9ec9fc9152 100644 --- a/drivers/gpu/drm/xe/xe_bo.h +++ b/drivers/gpu/drm/xe/xe_bo.h @@ -239,6 +239,8 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data, struct drm_file *file); int xe_gem_mmap_offset_ioctl(struct drm_device *dev, void *data, struct drm_file *file); +void xe_bo_runtime_pm_release_mmap_offset(struct xe_bo *bo); + int xe_bo_dumb_create(struct drm_file *file_priv, struct drm_device *dev, struct drm_mode_create_dumb *args); diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h index f71dbc518958..72c667dc664e 100644 --- a/drivers/gpu/drm/xe/xe_bo_types.h +++ b/drivers/gpu/drm/xe/xe_bo_types.h @@ -84,6 +84,11 @@ struct xe_bo { * objects. */ u16 cpu_caching; + /** + * Whether the object is currently in fake offset mmap backed by vram. + */ + unsigned int vram_userfault_count; + struct list_head vram_userfault_link; }; #define intel_bo_to_drm_bo(bo) (&(bo)->ttm.base) diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h index 9a212dbdb8a4..1aa303db80c9 100644 --- a/drivers/gpu/drm/xe/xe_device_types.h +++ b/drivers/gpu/drm/xe/xe_device_types.h @@ -363,6 +363,26 @@ struct xe_device { struct { /** @ref: ref count of memory accesses */ atomic_t ref; + /* + * Protects access to the vram userfault list. + * If we are outside of the runtime suspend path, access to + * @vram_userfault_list always requires first grabbing the + * runtime pm wakeref, to ensure we can't race against runtime suspend. + * Once we have that we also need to grab @vram_userfault_lock, + * at which point we have exclusive access.
+ * The runtime suspend path is special since it doesn't really hold any locks, + * but instead has exclusive access by virtue of all other accesses requiring + * holding the runtime pm wakeref. + */ + spinlock_t vram_userfault_lock; + + /* + * Keep list of userfaulted gem obj, which require to release their + * mmap mappings at runtime suspend path. + */ + struct list_head vram_userfault_list; + + bool vram_userfault_ongoing; } mem_access; /** diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c index b429c2876a76..bc1cb081e6e5 100644 --- a/drivers/gpu/drm/xe/xe_pm.c +++ b/drivers/gpu/drm/xe/xe_pm.c @@ -181,6 +181,8 @@ void xe_pm_init(struct xe_device *xe) } xe_pm_runtime_init(xe); + INIT_LIST_HEAD(&xe->mem_access.vram_userfault_list); + spin_lock_init(&xe->mem_access.vram_userfault_lock); } void xe_pm_runtime_fini(struct xe_device *xe) @@ -214,6 +216,7 @@ struct task_struct *xe_pm_read_callback_task(struct xe_device *xe) int xe_pm_runtime_suspend(struct xe_device *xe) { + struct xe_bo *bo, *on; struct xe_gt *gt; u8 id; int err = 0; @@ -247,6 +250,10 @@ int xe_pm_runtime_suspend(struct xe_device *xe) */ lock_map_acquire(&xe_device_mem_access_lockdep_map); + list_for_each_entry_safe(bo, on, + &xe->mem_access.vram_userfault_list, vram_userfault_link) + xe_bo_runtime_pm_release_mmap_offset(bo); + if (xe->d3cold.allowed) { err = xe_bo_evict_all(xe); if (err) -- 2.25.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
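The life cycle this patch sets up, where the first CPU fault on a VRAM object links it into a device-wide list and runtime suspend drains that list and revokes the mappings, can be condensed into a small userspace model. All names below are illustrative stand-ins: the real code uses struct list_head protected by xe->mem_access.vram_userfault_lock, holds the dma_resv lock in the fault path, and revokes mappings with drm_vma_node_unmap(), none of which is modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the vram_userfault tracking above.
 * A singly linked list stands in for list_head + spinlock so the
 * object life cycle is easy to follow. */
struct model_bo {
	unsigned int vram_userfault_count;	/* already on the list? */
	unsigned int mapped;			/* userspace PTEs present? */
	struct model_bo *next;
};

/* Models xe->mem_access.vram_userfault_list. */
static struct model_bo *userfault_list;

/* Models the tail of xe_gem_fault(): the first successful VRAM fault
 * links the BO into the device-wide list. */
static void model_fault(struct model_bo *bo)
{
	bo->mapped = 1;
	if (!bo->vram_userfault_count) {
		bo->vram_userfault_count = 1;
		bo->next = userfault_list;
		userfault_list = bo;
	}
}

/* Models the loop added to xe_pm_runtime_suspend(): drop every mapping
 * so the next CPU access faults again and runtime-resumes the device. */
static void model_runtime_suspend(void)
{
	struct model_bo *bo, *next;

	for (bo = userfault_list; bo; bo = next) {
		next = bo->next;
		bo->mapped = 0;			/* drm_vma_node_unmap() */
		bo->vram_userfault_count = 0;
		bo->next = NULL;
	}
	userfault_list = NULL;
}
```

A second fault on an object that is already tracked is a no-op, which is exactly what the `!bo->vram_userfault_count` check in `xe_gem_fault()` buys: the list holds each faulted object at most once.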
* Re: [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend 2023-12-06 13:34 ` [Intel-xe] [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend Badal Nilawar @ 2023-12-07 11:37 ` Matthew Auld 2023-12-07 17:00 ` Gupta, Anshuman 2023-12-08 11:15 ` [Intel-xe] " Thomas Hellström 1 sibling, 1 reply; 15+ messages in thread From: Matthew Auld @ 2023-12-07 11:37 UTC (permalink / raw) To: Badal Nilawar, intel-xe; +Cc: rodrigo.vivi On 06/12/2023 13:34, Badal Nilawar wrote: > Release all mmap mappings for all vram objects which are associated > with userfault such that, while pcie function in D3hot, any access > to memory mappings will raise a userfault. > > Upon userfault, in order to access memory mappings, if graphics > function is in D3 then runtime resume of dgpu will be triggered to > transition to D0. > > Previous commit has blocked the rpm but let's make sure that rpm > does not get blocked for headed cards(client's parts). > Above pagefault approach will be useful for headed cards to save the > power when display is off and there are active mmap mappings. So why do we want to use this approach for cards with display turned off (client) and not for cards without a display part? I don't see that explained. Also do we have a way to test this new approach in CI? Is the display not mostly always turned on? > > v2: > - Add lock dep assertion before updating vram_userfault_count (Rodrigo) > - Avoid iomem check before bo migration check as bo can migrate > to system memory (Matthew Auld) > v3: > - Apply pagefault approach for headed cards. 
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > Cc: Matthew Auld <matthew.auld@intel.com> > Cc: Anshuman Gupta <anshuman.gupta@intel.com> > Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> > --- > drivers/gpu/drm/xe/xe_bo.c | 62 ++++++++++++++++++++++++++-- > drivers/gpu/drm/xe/xe_bo.h | 2 + > drivers/gpu/drm/xe/xe_bo_types.h | 5 +++ > drivers/gpu/drm/xe/xe_device_types.h | 20 +++++++++ > drivers/gpu/drm/xe/xe_pm.c | 7 ++++ > 5 files changed, 93 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c > index 5741948a2a51..419bc5c55aa7 100644 > --- a/drivers/gpu/drm/xe/xe_bo.c > +++ b/drivers/gpu/drm/xe/xe_bo.c > @@ -797,6 +797,18 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict, > dma_fence_put(fence); > } > > + /* > + * TTM has already nuked the mmap for us (see ttm_bo_unmap_virtual), > + * so if we moved from VRAM make sure to unlink this from the userfault > + * tracking. > + */ > + if (mem_type_is_vram(old_mem_type)) { > + spin_lock(&xe->mem_access.vram_userfault_lock); > + if (INTEL_DISPLAY_ENABLED(xe) && !list_empty(&bo->vram_userfault_link)) > + list_del_init(&bo->vram_userfault_link); > + spin_unlock(&xe->mem_access.vram_userfault_lock); > + } > + > xe_device_mem_access_put(xe); > > out: > @@ -1125,6 +1137,8 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf) > { > struct ttm_buffer_object *tbo = vmf->vma->vm_private_data; > struct drm_device *ddev = tbo->base.dev; > + struct xe_bo *bo = NULL; > + struct xe_device *xe = to_xe_device(ddev); > vm_fault_t ret; > int idx, r = 0; > > @@ -1133,7 +1147,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf) > return ret; > > if (drm_dev_enter(ddev, &idx)) { > - struct xe_bo *bo = ttm_to_xe_bo(tbo); > + bo = ttm_to_xe_bo(tbo); > > trace_xe_bo_cpu_fault(bo); > > @@ -1152,10 +1166,31 @@ static vm_fault_t xe_gem_fault(struct vm_fault *vmf) > } else { > ret = ttm_bo_vm_dummy_page(vmf, vmf->vma->vm_page_prot); > } > + > if (ret == 
VM_FAULT_RETRY && !(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) > return ret; > + /* > + * ttm_bo_vm_reserve() already has dma_resv_lock. > + * vram_userfault_count is protected by dma_resv lock and rpm wakeref. > + */ > + if (INTEL_DISPLAY_ENABLED(xe) && ret == VM_FAULT_NOPAGE && bo && !bo->vram_userfault_count) { > + if (tbo->resource->bus.is_iomem) { > + dma_resv_assert_held(tbo->base.resv); > + > + xe_device_mem_access_get(xe); > + > + bo->vram_userfault_count = 1; > + > + spin_lock(&xe->mem_access.vram_userfault_lock); > + list_add(&bo->vram_userfault_link, &xe->mem_access.vram_userfault_list); > + spin_unlock(&xe->mem_access.vram_userfault_lock); > + > + xe_device_mem_access_put(xe); > + } > + } > > dma_resv_unlock(tbo->base.resv); > + > return ret; > } > > @@ -1167,7 +1202,7 @@ static void xe_ttm_bo_vm_close(struct vm_area_struct *vma) > > ttm_bo_vm_close(vma); > > - if (tbo->resource->bus.is_iomem) > + if (!INTEL_DISPLAY_ENABLED(xe) && tbo->resource->bus.is_iomem) > xe_device_mem_access_put(xe); > } > > @@ -1190,7 +1225,7 @@ int xe_drm_gem_ttm_mmap(struct drm_gem_object *gem, > if (ret < 0) > return ret; > > - if (tbo->resource->bus.is_iomem) > + if (!INTEL_DISPLAY_ENABLED(xe) && tbo->resource->bus.is_iomem) > xe_device_mem_access_get(xe); > > return 0; > @@ -2296,6 +2331,27 @@ int xe_bo_dumb_create(struct drm_file *file_priv, > return err; > } > > +void xe_bo_runtime_pm_release_mmap_offset(struct xe_bo *bo) > +{ > + struct ttm_buffer_object *tbo = &bo->ttm; > + struct ttm_device *bdev = tbo->bdev; > + struct drm_device *ddev = tbo->base.dev; > + struct xe_device *xe = to_xe_device(ddev); > + > + if (!INTEL_DISPLAY_ENABLED(xe)) > + return; > + > + drm_vma_node_unmap(&tbo->base.vma_node, bdev->dev_mapping); > + > + /* > + * We have exclusive access here via runtime suspend. All other callers > + * must first grab the rpm wakeref. 
> + */ > + XE_WARN_ON(!bo->vram_userfault_count); > + list_del(&bo->vram_userfault_link); > + bo->vram_userfault_count = 0; > +} > + > #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST) > #include "tests/xe_bo.c" > #endif > diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h > index 098ccab7fa1e..fd9ec9fc9152 100644 > --- a/drivers/gpu/drm/xe/xe_bo.h > +++ b/drivers/gpu/drm/xe/xe_bo.h > @@ -239,6 +239,8 @@ int xe_gem_create_ioctl(struct drm_device *dev, void *data, > struct drm_file *file); > int xe_gem_mmap_offset_ioctl(struct drm_device *dev, void *data, > struct drm_file *file); > +void xe_bo_runtime_pm_release_mmap_offset(struct xe_bo *bo); > + > int xe_bo_dumb_create(struct drm_file *file_priv, > struct drm_device *dev, > struct drm_mode_create_dumb *args); > diff --git a/drivers/gpu/drm/xe/xe_bo_types.h b/drivers/gpu/drm/xe/xe_bo_types.h > index f71dbc518958..72c667dc664e 100644 > --- a/drivers/gpu/drm/xe/xe_bo_types.h > +++ b/drivers/gpu/drm/xe/xe_bo_types.h > @@ -84,6 +84,11 @@ struct xe_bo { > * objects. > */ > u16 cpu_caching; > + /** > + * Whether the object is currently in fake offset mmap backed by vram. > + */ > + unsigned int vram_userfault_count; > + struct list_head vram_userfault_link; > }; > > #define intel_bo_to_drm_bo(bo) (&(bo)->ttm.base) > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h > index 9a212dbdb8a4..1aa303db80c9 100644 > --- a/drivers/gpu/drm/xe/xe_device_types.h > +++ b/drivers/gpu/drm/xe/xe_device_types.h > @@ -363,6 +363,26 @@ struct xe_device { > struct { > /** @ref: ref count of memory accesses */ > atomic_t ref; > + /* > + * Protects access to vram usefault list. > + * It is required, if we are outside of the runtime suspend path, > + * access to @vram_userfault_list requires always first grabbing the > + * runtime pm, to ensure we can't race against runtime suspend. > + * Once we have that we also need to grab @vram_userfault_lock, > + * at which point we have exclusive access. 
> + * The runtime suspend path is special since it doesn't really hold any locks, > + * but instead has exclusive access by virtue of all other accesses requiring > + * holding the runtime pm wakeref. > + */ > + spinlock_t vram_userfault_lock; > + > + /* > + * Keep list of userfaulted gem obj, which require to release their > + * mmap mappings at runtime suspend path. > + */ > + struct list_head vram_userfault_list; > + > + bool vram_userfault_ongoing; > } mem_access; > > /** > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c > index b429c2876a76..bc1cb081e6e5 100644 > --- a/drivers/gpu/drm/xe/xe_pm.c > +++ b/drivers/gpu/drm/xe/xe_pm.c > @@ -181,6 +181,8 @@ void xe_pm_init(struct xe_device *xe) > } > > xe_pm_runtime_init(xe); > + INIT_LIST_HEAD(&xe->mem_access.vram_userfault_list); > + spin_lock_init(&xe->mem_access.vram_userfault_lock); > } > > void xe_pm_runtime_fini(struct xe_device *xe) > @@ -214,6 +216,7 @@ struct task_struct *xe_pm_read_callback_task(struct xe_device *xe) > > int xe_pm_runtime_suspend(struct xe_device *xe) > { > + struct xe_bo *bo, *on; > struct xe_gt *gt; > u8 id; > int err = 0; > @@ -247,6 +250,10 @@ int xe_pm_runtime_suspend(struct xe_device *xe) > */ > lock_map_acquire(&xe_device_mem_access_lockdep_map); > > + list_for_each_entry_safe(bo, on, > + &xe->mem_access.vram_userfault_list, vram_userfault_link) > + xe_bo_runtime_pm_release_mmap_offset(bo); > + > if (xe->d3cold.allowed) { > err = xe_bo_evict_all(xe); > if (err) ^ permalink raw reply [flat|nested] 15+ messages in thread
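One way to see the difference the review is probing at: in the headless scheme from patch 1, an active iomem mapping pins the device awake from mmap until vm_close, while the headed scheme from this patch only touches the wakeref transiently inside the fault handler. A toy model of that refcount behaviour, with invented names standing in for xe_device_mem_access_get()/put(), so this is a sketch of the semantics rather than the real API:

```c
#include <assert.h>

/* Toy model: the mem_access wakeref as a plain counter, with runtime
 * suspend legal only when the counter is zero.  Names are illustrative. */
static int mem_access_ref;

static void mem_access_get(void) { mem_access_ref++; }
static void mem_access_put(void) { mem_access_ref--; }
static int runtime_suspend_possible(void) { return mem_access_ref == 0; }

/* Patch 1 behaviour (headless): pin across the mapping's lifetime. */
static void mmap_headless(void)     { mem_access_get(); }
static void vm_close_headless(void) { mem_access_put(); }

/* Patch 2 behaviour (headed): the fault handler takes the ref only while
 * it links the BO into the userfault list, then drops it again, so an
 * idle mapping no longer blocks D3. */
static void fault_headed(void)
{
	mem_access_get();
	/* ...add BO to vram_userfault_list under the spinlock... */
	mem_access_put();
}
```

In this model a headless mapping keeps `runtime_suspend_possible()` false until vm_close, whereas a headed fault leaves it true as soon as the handler returns.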
* RE: [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend 2023-12-07 11:37 ` Matthew Auld @ 2023-12-07 17:00 ` Gupta, Anshuman 0 siblings, 0 replies; 15+ messages in thread From: Gupta, Anshuman @ 2023-12-07 17:00 UTC (permalink / raw) To: Auld, Matthew, Nilawar, Badal, intel-xe@lists.freedesktop.org Cc: Vivi, Rodrigo > -----Original Message----- > From: Auld, Matthew <matthew.auld@intel.com> > Sent: Thursday, December 7, 2023 5:08 PM > To: Nilawar, Badal <badal.nilawar@intel.com>; intel-xe@lists.freedesktop.org > Cc: Gupta, Anshuman <anshuman.gupta@intel.com>; Vivi, Rodrigo > <rodrigo.vivi@intel.com>; Ghimiray, Himal Prasad > <himal.prasad.ghimiray@intel.com> > Subject: Re: [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend > > On 06/12/2023 13:34, Badal Nilawar wrote: > > Release all mmap mappings for all vram objects which are associated > > with userfault such that, while pcie function in D3hot, any access to > > memory mappings will raise a userfault. > > > > Upon userfault, in order to access memory mappings, if graphics > > function is in D3 then runtime resume of dgpu will be triggered to > > transition to D0. > > > > Previous commit has blocked the rpm but let's make sure that rpm does > > not get blocked for headed cards(client's parts). > > Above pagefault approach will be useful for headed cards to save the > > power when display is off and there are active mmap mappings. > > So why do we want to use this approach for cards with display turned off > (client) and not for cards without a display part? I don't see that explained. > Also do we have a way to test this new approach in CI? Is the display not > mostly always turned on? This approach is suitable for client cards to save power and battery life; we don't have any check for server vs client, but display is the easiest way to detect a client card. When display is "on", the PCI device can't enter D3 as the KMS CRTC always holds a wakeref.
When the KMS CRTC is inactive, i.e. display "off", there is an opportunity for the PCI device to enter D3, but active mmap mappings may block it. Thanks, Anshuman Gupta. > > > > > v2: > > - Add lock dep assertion before updating vram_userfault_count (Rodrigo) > > - Avoid iomem check before bo migration check as bo can migrate > > to system memory (Matthew Auld) > > v3: > > - Apply pagefault approach for headed cards. > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > > Cc: Matthew Auld <matthew.auld@intel.com> > > Cc: Anshuman Gupta <anshuman.gupta@intel.com> > > Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> > > --- > > drivers/gpu/drm/xe/xe_bo.c | 62 ++++++++++++++++++++++++++-- > > drivers/gpu/drm/xe/xe_bo.h | 2 + > > drivers/gpu/drm/xe/xe_bo_types.h | 5 +++ > > drivers/gpu/drm/xe/xe_device_types.h | 20 +++++++++ > > drivers/gpu/drm/xe/xe_pm.c | 7 ++++ > > 5 files changed, 93 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c > > index 5741948a2a51..419bc5c55aa7 100644 > > --- a/drivers/gpu/drm/xe/xe_bo.c > > +++ b/drivers/gpu/drm/xe/xe_bo.c > > @@ -797,6 +797,18 @@ static int xe_bo_move(struct ttm_buffer_object > *ttm_bo, bool evict, > > dma_fence_put(fence); > > } > > > > + /* > > + * TTM has already nuked the mmap for us (see > ttm_bo_unmap_virtual), > > + * so if we moved from VRAM make sure to unlink this from the > userfault > > + * tracking.
> > + */ > > + if (mem_type_is_vram(old_mem_type)) { > > + spin_lock(&xe->mem_access.vram_userfault_lock); > > + if (INTEL_DISPLAY_ENABLED(xe) && !list_empty(&bo- > >vram_userfault_link)) > > + list_del_init(&bo->vram_userfault_link); > > + spin_unlock(&xe->mem_access.vram_userfault_lock); > > + } > > + > > xe_device_mem_access_put(xe); > > > > out: > > @@ -1125,6 +1137,8 @@ static vm_fault_t xe_gem_fault(struct vm_fault > *vmf) > > { > > struct ttm_buffer_object *tbo = vmf->vma->vm_private_data; > > struct drm_device *ddev = tbo->base.dev; > > + struct xe_bo *bo = NULL; > > + struct xe_device *xe = to_xe_device(ddev); > > vm_fault_t ret; > > int idx, r = 0; > > > > @@ -1133,7 +1147,7 @@ static vm_fault_t xe_gem_fault(struct vm_fault > *vmf) > > return ret; > > > > if (drm_dev_enter(ddev, &idx)) { > > - struct xe_bo *bo = ttm_to_xe_bo(tbo); > > + bo = ttm_to_xe_bo(tbo); > > > > trace_xe_bo_cpu_fault(bo); > > > > @@ -1152,10 +1166,31 @@ static vm_fault_t xe_gem_fault(struct vm_fault > *vmf) > > } else { > > ret = ttm_bo_vm_dummy_page(vmf, vmf->vma- > >vm_page_prot); > > } > > + > > if (ret == VM_FAULT_RETRY && !(vmf->flags & > FAULT_FLAG_RETRY_NOWAIT)) > > return ret; > > + /* > > + * ttm_bo_vm_reserve() already has dma_resv_lock. > > + * vram_userfault_count is protected by dma_resv lock and rpm > wakeref. 
> > + */ > > + if (INTEL_DISPLAY_ENABLED(xe) && ret == VM_FAULT_NOPAGE && bo > && !bo->vram_userfault_count) { > > + if (tbo->resource->bus.is_iomem) { > > + dma_resv_assert_held(tbo->base.resv); > > + > > + xe_device_mem_access_get(xe); > > + > > + bo->vram_userfault_count = 1; > > + > > + spin_lock(&xe->mem_access.vram_userfault_lock); > > + list_add(&bo->vram_userfault_link, &xe- > >mem_access.vram_userfault_list); > > + spin_unlock(&xe->mem_access.vram_userfault_lock); > > + > > + xe_device_mem_access_put(xe); > > + } > > + } > > > > dma_resv_unlock(tbo->base.resv); > > + > > return ret; > > } > > > > @@ -1167,7 +1202,7 @@ static void xe_ttm_bo_vm_close(struct > vm_area_struct *vma) > > > > ttm_bo_vm_close(vma); > > > > - if (tbo->resource->bus.is_iomem) > > + if (!INTEL_DISPLAY_ENABLED(xe) && tbo->resource->bus.is_iomem) > > xe_device_mem_access_put(xe); > > } > > > > @@ -1190,7 +1225,7 @@ int xe_drm_gem_ttm_mmap(struct > drm_gem_object *gem, > > if (ret < 0) > > return ret; > > > > - if (tbo->resource->bus.is_iomem) > > + if (!INTEL_DISPLAY_ENABLED(xe) && tbo->resource->bus.is_iomem) > > xe_device_mem_access_get(xe); > > > > return 0; > > @@ -2296,6 +2331,27 @@ int xe_bo_dumb_create(struct drm_file > *file_priv, > > return err; > > } > > > > +void xe_bo_runtime_pm_release_mmap_offset(struct xe_bo *bo) > > +{ > > + struct ttm_buffer_object *tbo = &bo->ttm; > > + struct ttm_device *bdev = tbo->bdev; > > + struct drm_device *ddev = tbo->base.dev; > > + struct xe_device *xe = to_xe_device(ddev); > > + > > + if (!INTEL_DISPLAY_ENABLED(xe)) > > + return; > > + > > + drm_vma_node_unmap(&tbo->base.vma_node, bdev- > >dev_mapping); > > + > > + /* > > + * We have exclusive access here via runtime suspend. All other callers > > + * must first grab the rpm wakeref. 
> > + */ > > + XE_WARN_ON(!bo->vram_userfault_count); > > + list_del(&bo->vram_userfault_link); > > + bo->vram_userfault_count = 0; > > +} > > + > > #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST) > > #include "tests/xe_bo.c" > > #endif > > diff --git a/drivers/gpu/drm/xe/xe_bo.h b/drivers/gpu/drm/xe/xe_bo.h > > index 098ccab7fa1e..fd9ec9fc9152 100644 > > --- a/drivers/gpu/drm/xe/xe_bo.h > > +++ b/drivers/gpu/drm/xe/xe_bo.h > > @@ -239,6 +239,8 @@ int xe_gem_create_ioctl(struct drm_device *dev, > void *data, > > struct drm_file *file); > > int xe_gem_mmap_offset_ioctl(struct drm_device *dev, void *data, > > struct drm_file *file); > > +void xe_bo_runtime_pm_release_mmap_offset(struct xe_bo *bo); > > + > > int xe_bo_dumb_create(struct drm_file *file_priv, > > struct drm_device *dev, > > struct drm_mode_create_dumb *args); > > diff --git a/drivers/gpu/drm/xe/xe_bo_types.h > b/drivers/gpu/drm/xe/xe_bo_types.h > > index f71dbc518958..72c667dc664e 100644 > > --- a/drivers/gpu/drm/xe/xe_bo_types.h > > +++ b/drivers/gpu/drm/xe/xe_bo_types.h > > @@ -84,6 +84,11 @@ struct xe_bo { > > * objects. > > */ > > u16 cpu_caching; > > + /** > > + * Whether the object is currently in fake offset mmap backed by vram. > > + */ > > + unsigned int vram_userfault_count; > > + struct list_head vram_userfault_link; > > }; > > > > #define intel_bo_to_drm_bo(bo) (&(bo)->ttm.base) > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h > b/drivers/gpu/drm/xe/xe_device_types.h > > index 9a212dbdb8a4..1aa303db80c9 100644 > > --- a/drivers/gpu/drm/xe/xe_device_types.h > > +++ b/drivers/gpu/drm/xe/xe_device_types.h > > @@ -363,6 +363,26 @@ struct xe_device { > > struct { > > /** @ref: ref count of memory accesses */ > > atomic_t ref; > > + /* > > + * Protects access to vram usefault list. > > + * It is required, if we are outside of the runtime suspend > path, > > + * access to @vram_userfault_list requires always first > grabbing the > > + * runtime pm, to ensure we can't race against runtime > suspend. 
> > + * Once we have that we also need to grab > @vram_userfault_lock, > > + * at which point we have exclusive access. > > + * The runtime suspend path is special since it doesn't really > hold any locks, > > + * but instead has exclusive access by virtue of all other > accesses requiring > > + * holding the runtime pm wakeref. > > + */ > > + spinlock_t vram_userfault_lock; > > + > > + /* > > + * Keep list of userfaulted gem obj, which require to release > their > > + * mmap mappings at runtime suspend path. > > + */ > > + struct list_head vram_userfault_list; > > + > > + bool vram_userfault_ongoing; > > } mem_access; > > > > /** > > diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c > > index b429c2876a76..bc1cb081e6e5 100644 > > --- a/drivers/gpu/drm/xe/xe_pm.c > > +++ b/drivers/gpu/drm/xe/xe_pm.c > > @@ -181,6 +181,8 @@ void xe_pm_init(struct xe_device *xe) > > } > > > > xe_pm_runtime_init(xe); > > + INIT_LIST_HEAD(&xe->mem_access.vram_userfault_list); > > + spin_lock_init(&xe->mem_access.vram_userfault_lock); > > } > > > > void xe_pm_runtime_fini(struct xe_device *xe) > > @@ -214,6 +216,7 @@ struct task_struct > *xe_pm_read_callback_task(struct xe_device *xe) > > > > int xe_pm_runtime_suspend(struct xe_device *xe) > > { > > + struct xe_bo *bo, *on; > > struct xe_gt *gt; > > u8 id; > > int err = 0; > > @@ -247,6 +250,10 @@ int xe_pm_runtime_suspend(struct xe_device *xe) > > */ > > lock_map_acquire(&xe_device_mem_access_lockdep_map); > > > > + list_for_each_entry_safe(bo, on, > > + &xe->mem_access.vram_userfault_list, > vram_userfault_link) > > + xe_bo_runtime_pm_release_mmap_offset(bo); > > + > > if (xe->d3cold.allowed) { > > err = xe_bo_evict_all(xe); > > if (err) ^ permalink raw reply [flat|nested] 15+ messages in thread
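Anshuman's headed-vs-headless policy boils down to a single predicate. The sketch below is illustrative only: the driver keys off `INTEL_DISPLAY_ENABLED()`, which this model reduces to a plain bool argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative policy from this thread: should an active iomem mmap
 * hold a runtime-PM wakeref (i.e. block D3)?  Headed (client) cards
 * rely on the userfault + runtime-resume path instead, so D3 stays
 * reachable whenever the display is off. */
static bool mmap_holds_wakeref(bool display_enabled, bool is_iomem)
{
	if (!is_iomem)
		return false;	/* system memory: nothing to pin */
	return !display_enabled;
}
```

So a headless card with a VRAM mapping pins the device awake, while the same mapping on a client card leaves runtime suspend free to run and tear the mapping down.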
* Re: [Intel-xe] [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend 2023-12-06 13:34 ` [Intel-xe] [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend Badal Nilawar 2023-12-07 11:37 ` Matthew Auld @ 2023-12-08 11:15 ` Thomas Hellström 1 sibling, 0 replies; 15+ messages in thread From: Thomas Hellström @ 2023-12-08 11:15 UTC (permalink / raw) To: Badal Nilawar, intel-xe; +Cc: matthew.auld, rodrigo.vivi On 12/6/23 14:34, Badal Nilawar wrote: > Release all mmap mappings for all vram objects which are associated > with userfault such that, while pcie function in D3hot, any access > to memory mappings will raise a userfault. > > Upon userfault, in order to access memory mappings, if graphics > function is in D3 then runtime resume of dgpu will be triggered to > transition to D0. > > Previous commit has blocked the rpm but let's make sure that rpm > does not get blocked for headed cards(client's parts). > Above pagefault approach will be useful for headed cards to save the > power when display is off and there are active mmap mappings. > > v2: > - Add lock dep assertion before updating vram_userfault_count (Rodrigo) > - Avoid iomem check before bo migration check as bo can migrate > to system memory (Matthew Auld) > v3: > - Apply pagefault approach for headed cards. 
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> > Cc: Matthew Auld <matthew.auld@intel.com> > Cc: Anshuman Gupta <anshuman.gupta@intel.com> > Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> > --- > drivers/gpu/drm/xe/xe_bo.c | 62 ++++++++++++++++++++++++++-- > drivers/gpu/drm/xe/xe_bo.h | 2 + > drivers/gpu/drm/xe/xe_bo_types.h | 5 +++ > drivers/gpu/drm/xe/xe_device_types.h | 20 +++++++++ > drivers/gpu/drm/xe/xe_pm.c | 7 ++++ > 5 files changed, 93 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c > index 5741948a2a51..419bc5c55aa7 100644 > --- a/drivers/gpu/drm/xe/xe_bo.c > +++ b/drivers/gpu/drm/xe/xe_bo.c > @@ -797,6 +797,18 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict, > dma_fence_put(fence); > } > > + /* > + * TTM has already nuked the mmap for us (see ttm_bo_unmap_virtual), > + * so if we moved from VRAM make sure to unlink this from the userfault > + * tracking. > + */ > + if (mem_type_is_vram(old_mem_type)) { > + spin_lock(&xe->mem_access.vram_userfault_lock); > + if (INTEL_DISPLAY_ENABLED(xe) && !list_empty(&bo->vram_userfault_link)) > + list_del_init(&bo->vram_userfault_link); > + spin_unlock(&xe->mem_access.vram_userfault_lock); > + } > + Please move this block into move_notify() instead, if at all possible. That function is intended to release whatever various bindings we have set up to the backing memory in the old location. Thanks, Thomas ^ permalink raw reply [flat|nested] 15+ messages in thread
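Thomas's suggestion, reduced to a toy model (hypothetical helpers, not a tested refactor): the unlink lives in a move_notify()-style hook that releases bindings to the old placement, and the move path merely invokes it before switching placement.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the review comment above.  Everything here is a stand-in
 * for the real TTM/xe structures; the point is only the ordering: the
 * notify hook tears down old-placement state, the move path stays thin. */
struct toy_bo {
	bool in_vram;
	bool on_userfault_list;
};

/* move_notify()-style hook: release bindings to the old placement,
 * including the userfault tracking (list_del_init() + unmap in real code). */
static void toy_move_notify(struct toy_bo *bo)
{
	if (bo->in_vram && bo->on_userfault_list)
		bo->on_userfault_list = false;
}

/* The move path only calls the hook, then switches placement. */
static void toy_move(struct toy_bo *bo, bool to_vram)
{
	toy_move_notify(bo);
	bo->in_vram = to_vram;
}
```

Centralising the teardown this way keeps xe_bo_move() from accumulating special cases for every binding that tracks the old placement.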
* [Intel-xe] ✓ CI.Patch_applied: success for Handle mmap with D3 2023-12-06 13:34 [Intel-xe] [PATCH 0/2] Handle mmap with D3 Badal Nilawar 2023-12-06 13:34 ` [Intel-xe] [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings Badal Nilawar 2023-12-06 13:34 ` [Intel-xe] [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend Badal Nilawar @ 2023-12-06 16:28 ` Patchwork 2023-12-06 16:28 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork 2023-12-06 16:28 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork 4 siblings, 0 replies; 15+ messages in thread From: Patchwork @ 2023-12-06 16:28 UTC (permalink / raw) To: Badal Nilawar; +Cc: intel-xe == Series Details == Series: Handle mmap with D3 URL : https://patchwork.freedesktop.org/series/127432/ State : success == Summary == === Applying kernel patches on branch 'drm-xe-next' with base: === Base commit: 9d76164e0 drm/xe/xe2: Add workaround 14019988906 === git am output follows === Applying: drm/xe/dgfx: Block rpm for active mmap mappings Applying: drm/xe/dgfx: Release mmap mappings on rpm suspend ^ permalink raw reply [flat|nested] 15+ messages in thread
* [Intel-xe] ✗ CI.checkpatch: warning for Handle mmap with D3 2023-12-06 13:34 [Intel-xe] [PATCH 0/2] Handle mmap with D3 Badal Nilawar ` (2 preceding siblings ...) 2023-12-06 16:28 ` [Intel-xe] ✓ CI.Patch_applied: success for Handle mmap with D3 Patchwork @ 2023-12-06 16:28 ` Patchwork 2023-12-06 16:28 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork 4 siblings, 0 replies; 15+ messages in thread From: Patchwork @ 2023-12-06 16:28 UTC (permalink / raw) To: Badal Nilawar; +Cc: intel-xe == Series Details == Series: Handle mmap with D3 URL : https://patchwork.freedesktop.org/series/127432/ State : warning == Summary == + KERNEL=/kernel + git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt Cloning into 'mt'... warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/ + git -C mt rev-list -n1 origin/master 6030b24c1386b00de8187b5fb987e283a57b372a + cd /kernel + git config --global --add safe.directory /kernel + git log -n1 commit 14a69294e56e6f2d275e677b7270af5fa5256fd5 Author: Badal Nilawar <badal.nilawar@intel.com> Date: Wed Dec 6 19:04:21 2023 +0530 drm/xe/dgfx: Release mmap mappings on rpm suspend Release all mmap mappings for all vram objects which are associated with userfault such that, while pcie function in D3hot, any access to memory mappings will raise a userfault. Upon userfault, in order to access memory mappings, if graphics function is in D3 then runtime resume of dgpu will be triggered to transition to D0. Previous commit has blocked the rpm but let's make sure that rpm does not get blocked for headed cards(client's parts). Above pagefault approach will be useful for headed cards to save the power when display is off and there are active mmap mappings. v2: - Add lock dep assertion before updating vram_userfault_count (Rodrigo) - Avoid iomem check before bo migration check as bo can migrate to system memory (Matthew Auld) v3: - Apply pagefault approach for headed cards. 
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> + /mt/dim checkpatch 9d76164e08c68a3d5d081ff7b07c15a2cadf741a drm-intel 7377d3711 drm/xe/dgfx: Block rpm for active mmap mappings 14a69294e drm/xe/dgfx: Release mmap mappings on rpm suspend -:83: WARNING:LONG_LINE: line length of 101 exceeds 100 columns #83: FILE: drivers/gpu/drm/xe/xe_bo.c:1176: + if (INTEL_DISPLAY_ENABLED(xe) && ret == VM_FAULT_NOPAGE && bo && !bo->vram_userfault_count) { total: 0 errors, 1 warnings, 0 checks, 178 lines checked ^ permalink raw reply [flat|nested] 15+ messages in thread
* [Intel-xe] ✗ CI.KUnit: failure for Handle mmap with D3
2023-12-06 13:34 [Intel-xe] [PATCH 0/2] Handle mmap with D3 Badal Nilawar ` (3 preceding siblings ...)
2023-12-06 16:28 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
@ 2023-12-06 16:28 ` Patchwork
4 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2023-12-06 16:28 UTC (permalink / raw)
To: Badal Nilawar; +Cc: intel-xe

== Series Details ==

Series: Handle mmap with D3
URL : https://patchwork.freedesktop.org/series/127432/
State : failure

== Summary ==

+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
ERROR:root:../drivers/gpu/drm/xe/xe_bo.c:18:10: fatal error: i915_drv.h: No such file or directory
   18 | #include "i915_drv.h"
      |          ^~~~~~~~~~~~
compilation terminated.
make[7]: *** [../scripts/Makefile.build:243: drivers/gpu/drm/xe/xe_bo.o] Error 1
make[7]: *** Waiting for unfinished jobs....
make[6]: *** [../scripts/Makefile.build:480: drivers/gpu/drm/xe] Error 2
make[5]: *** [../scripts/Makefile.build:480: drivers/gpu/drm] Error 2
make[4]: *** [../scripts/Makefile.build:480: drivers/gpu] Error 2
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [../scripts/Makefile.build:480: drivers] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [/kernel/Makefile:1913: .] Error 2
make[1]: *** [/kernel/Makefile:234: __sub-make] Error 2
make: *** [Makefile:234: __sub-make] Error 2
[16:28:17] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[16:28:21] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make ARCH=um O=.kunit --jobs=48
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2023-12-08 11:15 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-06 13:34 [Intel-xe] [PATCH 0/2] Handle mmap with D3 Badal Nilawar
2023-12-06 13:34 ` [Intel-xe] [PATCH 1/2] drm/xe/dgfx: Block rpm for active mmap mappings Badal Nilawar
2023-12-07 11:26 ` Matthew Auld
2023-12-07 13:06 ` Matthew Auld
2023-12-08  7:29 ` Nilawar, Badal
2023-12-08  9:11 ` Matthew Auld
2023-12-08  7:17 ` Nilawar, Badal
2023-12-08  9:27 ` Matthew Auld
2023-12-06 13:34 ` [Intel-xe] [PATCH 2/2] drm/xe/dgfx: Release mmap mappings on rpm suspend Badal Nilawar
2023-12-07 11:37 ` Matthew Auld
2023-12-07 17:00 ` Gupta, Anshuman
2023-12-08 11:15 ` [Intel-xe] " Thomas Hellström
2023-12-06 16:28 ` [Intel-xe] ✓ CI.Patch_applied: success for Handle mmap with D3 Patchwork
2023-12-06 16:28 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-12-06 16:28 ` [Intel-xe] ✗ CI.KUnit: failure " Patchwork