* [PATCH] mm/hmm: hmm_range_fault handle pages swapped out
@ 2019-08-15 20:52 ` Yang, Philip
0 siblings, 0 replies; 10+ messages in thread
From: Yang, Philip @ 2019-08-15 20:52 UTC (permalink / raw)
To: jglisse@redhat.com, alex.deucher@amd.com,
amd-gfx@lists.freedesktop.org, Kuehling, Felix, jgg@mellanox.com,
linux-mm@kvack.org, dri-devel@lists.freedesktop.org
Cc: Yang, Philip
hmm_range_fault() may return NULL pages because some of the pfns are equal to
HMM_PFN_NONE. This happens randomly under memory pressure. The reason is that,
in the swapped-out page pte path, hmm_vma_handle_pte() doesn't update the fault
variable from cpu_flags, so it fails to call hmm_vma_do_fault() to swap
the page in.
The fix is to call hmm_pte_need_fault() to update the fault variable.
Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
mm/hmm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/mm/hmm.c b/mm/hmm.c
index 9f22562e2c43..7ca4fb39d3d8 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -544,6 +544,9 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 		swp_entry_t entry = pte_to_swp_entry(pte);
 
 		if (!non_swap_entry(entry)) {
+			cpu_flags = pte_to_hmm_pfn_flags(range, pte);
+			hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
+					   &fault, &write_fault);
 			if (fault || write_fault)
 				goto fault;
 			return 0;
--
2.17.1
* Re: [PATCH] mm/hmm: hmm_range_fault handle pages swapped out
2019-08-15 20:52 ` Yang, Philip
@ 2019-08-15 21:02 ` Jerome Glisse
-1 siblings, 0 replies; 10+ messages in thread
From: Jerome Glisse @ 2019-08-15 21:02 UTC (permalink / raw)
To: Yang, Philip
Cc: alex.deucher@amd.com, Kuehling, Felix,
dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
jgg@mellanox.com, amd-gfx@lists.freedesktop.org
On Thu, Aug 15, 2019 at 08:52:56PM +0000, Yang, Philip wrote:
> hmm_range_fault may return NULL pages because some of pfns are equal to
> HMM_PFN_NONE. This happens randomly under memory pressure. The reason is
> for swapped out page pte path, hmm_vma_handle_pte doesn't update fault
> variable from cpu_flags, so it failed to call hmm_vma_do_fault to swap
> the page in.
>
> The fix is to call hmm_pte_need_fault to update fault variable.
>
> Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> mm/hmm.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 9f22562e2c43..7ca4fb39d3d8 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -544,6 +544,9 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
> swp_entry_t entry = pte_to_swp_entry(pte);
>
> if (!non_swap_entry(entry)) {
> + cpu_flags = pte_to_hmm_pfn_flags(range, pte);
> + hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
> + &fault, &write_fault);
> if (fault || write_fault)
> goto fault;
> return 0;
> --
> 2.17.1
>
* Re: [PATCH] mm/hmm: hmm_range_fault handle pages swapped out
2019-08-15 20:52 ` Yang, Philip
@ 2019-08-16 0:54 ` Jason Gunthorpe
-1 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2019-08-16 0:54 UTC (permalink / raw)
To: Yang, Philip, Ralph Campbell
Cc: alex.deucher@amd.com, Kuehling, Felix,
dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
jglisse@redhat.com, amd-gfx@lists.freedesktop.org
On Thu, Aug 15, 2019 at 08:52:56PM +0000, Yang, Philip wrote:
> hmm_range_fault may return NULL pages because some of pfns are equal to
> HMM_PFN_NONE. This happens randomly under memory pressure. The reason is
> for swapped out page pte path, hmm_vma_handle_pte doesn't update fault
> variable from cpu_flags, so it failed to call hmm_vma_do_fault to swap
> the page in.
>
> The fix is to call hmm_pte_need_fault to update fault variable.
> Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee
I'll fix it for you, but please be careful not to send Change-Ids to
the public lists.
Also, what is the Fixes: line for this?
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> mm/hmm.c | 3 +++
> 1 file changed, 3 insertions(+)
Ralph has also been looking at this area, so I'll give him a bit of
time to chime in; otherwise, with Jerome's review, this looks OK to go to
linux-next.
Thanks,
Jason
* Re: [PATCH] mm/hmm: hmm_range_fault handle pages swapped out
2019-08-16 0:54 ` Jason Gunthorpe
@ 2019-08-16 16:02 ` Yang, Philip
-1 siblings, 0 replies; 10+ messages in thread
From: Yang, Philip @ 2019-08-16 16:02 UTC (permalink / raw)
To: Jason Gunthorpe, Ralph Campbell
Cc: alex.deucher@amd.com, Kuehling, Felix,
dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
jglisse@redhat.com, amd-gfx@lists.freedesktop.org
On 2019-08-15 8:54 p.m., Jason Gunthorpe wrote:
> On Thu, Aug 15, 2019 at 08:52:56PM +0000, Yang, Philip wrote:
>> hmm_range_fault may return NULL pages because some of pfns are equal to
>> HMM_PFN_NONE. This happens randomly under memory pressure. The reason is
>> for swapped out page pte path, hmm_vma_handle_pte doesn't update fault
>> variable from cpu_flags, so it failed to call hmm_vma_do_fault to swap
>> the page in.
>>
>> The fix is to call hmm_pte_need_fault to update fault variable.
>
>> Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee
>
> I'll fix it for you but please be careful not to send Change-id's to
> the public lists.
>
Thanks. The Change-Id was added by our Gerrit hook; in future I need to
generate the patch files, remove the Change-Id line, and then send out the
modified patch files.
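
The clean-up step described above (dropping the Change-Id trailer from a
generated patch before sending it) could be sketched as a small C helper.
This is only an illustration under stated assumptions: the function names
are hypothetical, and the sketch assumes the trailer always starts at the
beginning of a line with the exact text "Change-Id:".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True when the line starts with a Gerrit "Change-Id:" trailer. */
static bool is_change_id_line(const char *line)
{
    static const char prefix[] = "Change-Id:";

    assert(line != NULL);
    return strncmp(line, prefix, sizeof(prefix) - 1) == 0;
}

/*
 * Copy patch text from in to out, omitting every Change-Id line.
 * out must have room for strlen(in) + 1 bytes.
 * Returns the number of lines that were dropped.
 */
static int strip_change_id(const char *in, char *out)
{
    int dropped = 0;

    assert(in != NULL && out != NULL);
    while (*in) {
        /* Length of the current line, including its newline if present. */
        const char *eol = strchr(in, '\n');
        size_t len = eol ? (size_t)(eol - in) + 1 : strlen(in);

        if (is_change_id_line(in)) {
            dropped++;
        } else {
            memcpy(out, in, len);
            out += len;
        }
        in += len;
    }
    *out = '\0';
    return dropped;
}
```

Quoted lines such as "> Change-Id: ..." are left alone, because the match is
anchored at the start of the line.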
> Also what is the Fixes line for this?
>
This fixes an issue found by the internal rocrtst: the
rocrtstFunc.Memory_Max_Mem test evicted some user buffers, and the following
test then failed to restore those user buffers because the buffers were
swapped out and the application doesn't touch the buffers to swap them in.
>> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>> mm/hmm.c | 3 +++
>> 1 file changed, 3 insertions(+)
>
> Ralph has also been looking at this area also so I'll give him a bit
> to chime in, otherwise with Jerome's review this looks OK to go to
> linux-next
>
OK, thanks for helping push this to the hmm branch at
https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
> Thanks,
> Jason
>
* Re: [PATCH] mm/hmm: hmm_range_fault handle pages swapped out
2019-08-15 20:52 ` Yang, Philip
@ 2019-08-23 13:39 ` Jason Gunthorpe
-1 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2019-08-23 13:39 UTC (permalink / raw)
To: Yang, Philip
Cc: alex.deucher@amd.com, Kuehling, Felix,
dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
jglisse@redhat.com, amd-gfx@lists.freedesktop.org
On Thu, Aug 15, 2019 at 08:52:56PM +0000, Yang, Philip wrote:
> hmm_range_fault may return NULL pages because some of pfns are equal to
> HMM_PFN_NONE. This happens randomly under memory pressure. The reason is
> for swapped out page pte path, hmm_vma_handle_pte doesn't update fault
> variable from cpu_flags, so it failed to call hmm_vma_do_fault to swap
> the page in.
>
> The fix is to call hmm_pte_need_fault to update fault variable.
>
> Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> mm/hmm.c | 3 +++
> 1 file changed, 3 insertions(+)
Applied to hmm.git, thanks
I fixed the commit message:
Author: Yang, Philip <Philip.Yang@amd.com>
Date: Thu Aug 15 20:52:56 2019 +0000
mm/hmm: fix hmm_range_fault()'s handling of swapped out pages
hmm_range_fault() may return NULL pages because some of the pfns are equal
to HMM_PFN_NONE. This happens randomly under memory pressure. The reason
is during the swapped out page pte path, hmm_vma_handle_pte() doesn't
update the fault variable from cpu_flags, so it failed to call
hmm_vma_do_fault() to swap the page in.
The fix is to call hmm_pte_need_fault() to update fault variable.
Fixes: 74eee180b935 ("mm/hmm/mirror: device page fault handler")
Link: https://lore.kernel.org/r/20190815205227.7949-1-Philip.Yang@amd.com
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>