* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
[not found] <20240825152403.3171682-1-namcao@linutronix.de>
@ 2024-08-25 16:04 ` Lorenzo Stoakes
2024-08-26 7:11 ` Nam Cao
2024-08-26 13:58 ` Liam R. Howlett
1 sibling, 1 reply; 13+ messages in thread
From: Lorenzo Stoakes @ 2024-08-25 16:04 UTC (permalink / raw)
To: Nam Cao
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Andrew Morton,
Liam R. Howlett, Vlastimil Babka, linux-kernel, linux-mm, bigeasy
On Sun, Aug 25, 2024 at 05:24:03PM GMT, Nam Cao wrote:
[snip]
> diff --git a/mm/mmap.c b/mm/mmap.c
> index d0dfc85b209b..64067ddb8382 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2486,6 +2486,12 @@ static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
> if (err)
> goto out_free_mpol;
>
> + if (unlikely(vma->vm_flags & VM_PFNMAP)) {
> + err = track_pfn_split(vma, addr);
> + if (err)
> + goto out_vma_unlink;
> + }
> +
> if (new->vm_file)
> get_file(new->vm_file);
>
> @@ -2515,6 +2521,8 @@ static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
> vma_next(vmi);
> return 0;
>
> +out_vma_unlink:
> + unlink_anon_vmas(vma);
> out_free_mpol:
> mpol_put(vma_policy(new));
> out_free_vmi:
> --
> 2.39.2
>
Right, since the start of the 6.11 rc cycle, mm-unstable (and therefore
-next) has moved this function out to mm/vma.c, so you will need to make
this change there rather than against mm/mmap.c (or whichever tree this is
intended to come through needs to sync up, especially as there's a fairly
substantial amount of change going on right now in VMA handling).
Sorry about that!
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-08-25 16:04 ` [PATCH] x86/mm/pat: Support splitting of virtual memory areas Lorenzo Stoakes
@ 2024-08-26 7:11 ` Nam Cao
0 siblings, 0 replies; 13+ messages in thread
From: Nam Cao @ 2024-08-26 7:11 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Andrew Morton,
Liam R. Howlett, Vlastimil Babka, linux-kernel, linux-mm, bigeasy
On Sun, Aug 25, 2024 at 05:04:44PM +0100, Lorenzo Stoakes wrote:
> On Sun, Aug 25, 2024 at 05:24:03PM GMT, Nam Cao wrote:
>
> [snip]
>
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index d0dfc85b209b..64067ddb8382 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2486,6 +2486,12 @@ static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > if (err)
> > goto out_free_mpol;
> >
> > + if (unlikely(vma->vm_flags & VM_PFNMAP)) {
> > + err = track_pfn_split(vma, addr);
> > + if (err)
> > + goto out_vma_unlink;
> > + }
> > +
> > if (new->vm_file)
> > get_file(new->vm_file);
> >
> > @@ -2515,6 +2521,8 @@ static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > vma_next(vmi);
> > return 0;
> >
> > +out_vma_unlink:
> > + unlink_anon_vmas(vma);
> > out_free_mpol:
> > mpol_put(vma_policy(new));
> > out_free_vmi:
> > --
> > 2.39.2
> >
>
> Right, since the start of the 6.11 rc cycle, mm-unstable (and therefore
> -next) has moved this function out to mm/vma.c, so you will need to make
> this change there rather than against mm/mmap.c (or whichever tree this is
> intended to come through needs to sync up, especially as there's a fairly
> substantial amount of change going on right now in VMA handling).
>
> Sorry about that!
Ah okay, thanks for letting me know.
We could wait for 6.12-rc1 to be out, and then let this patch go through
the x86 tree. Or we could let it go through the mm tree, if the x86
maintainers are okay with that?
Best regards,
Nam
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
[not found] <20240825152403.3171682-1-namcao@linutronix.de>
2024-08-25 16:04 ` [PATCH] x86/mm/pat: Support splitting of virtual memory areas Lorenzo Stoakes
@ 2024-08-26 13:58 ` Liam R. Howlett
2024-08-27 7:58 ` Nam Cao
1 sibling, 1 reply; 13+ messages in thread
From: Liam R. Howlett @ 2024-08-26 13:58 UTC (permalink / raw)
To: Nam Cao
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Andrew Morton,
Vlastimil Babka, Lorenzo Stoakes, linux-kernel, linux-mm, bigeasy
* Nam Cao <namcao@linutronix.de> [240825 11:29]:
> When a virtual memory area (VMA) gets split, memtype_rbroot's entries
> are not updated. This causes confusion later on when the VMAs get
> un-mapped, because the address ranges of the split VMAs do not match the
> address range of the initial VMA.
>
> For example, if user does:
>
> fd = open("/some/pci/bar", O_RDWR);
> addr = mmap(0, 8192, PROT_READ, MAP_SHARED, fd, 0);
> mprotect(addr, 4096, PROT_READ | PROT_WRITE);
> munmap(p, 8192);
>
> with the physical address starting from 0xfd000000, the range
> (0xfd000000-0xfd002000) would be tracked with the mmap() call.
>
> After mprotect(), the initial range gets split into
> (0xfd000000-0xfd001000) and (0xfd001000-0xfd002000).
>
> Then, at munmap(), the first range does not match any entry in
> memtype_rbroot, and a message is seen in dmesg:
>
> x86/PAT: test:177 freeing invalid memtype [mem 0xfd000000-0xfd000fff]
>
> The second range still matches by accident, because matching only the end
> address is acceptable (to handle shrinking VMA, added by 2039e6acaf94
> (x86/mm/pat: Change free_memtype() to support shrinking case)).
Does this need a fixes tag?
>
> Make sure VMA splitting is handled properly, by splitting the entries in
> memtype_rbroot.
>
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
> arch/x86/mm/pat/memtype.c | 59 ++++++++++++++++++++++++++++++
> arch/x86/mm/pat/memtype.h | 3 ++
> arch/x86/mm/pat/memtype_interval.c | 22 +++++++++++
> include/linux/pgtable.h | 6 +++
> mm/mmap.c | 8 ++++
> 5 files changed, 98 insertions(+)
>
> diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
> index bdc2a240c2aa..b60019478a76 100644
> --- a/arch/x86/mm/pat/memtype.c
> +++ b/arch/x86/mm/pat/memtype.c
> @@ -935,6 +935,46 @@ static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t *vma_prot,
> return 0;
> }
>
> +static int split_pfn_range(u64 start, u64 end, u64 addr)
> +{
> + struct memtype *entry_new;
> + int is_range_ram, ret;
> +
> + if (!pat_enabled())
> + return 0;
> +
> + start = sanitize_phys(start);
> + end = sanitize_phys(end - 1) + 1;
> +
> + /* Low ISA region is not tracked, it is always mapped WB */
> + if (x86_platform.is_untracked_pat_range(start, end))
> + return 0;
> +
> + is_range_ram = pat_pagerange_is_ram(start, end);
> + if (is_range_ram == 1)
> + return 0;
> +
> + if (is_range_ram < 0)
> + return -EINVAL;
> +
> + entry_new = kmalloc(sizeof(*entry_new), GFP_KERNEL);
> + if (!entry_new)
> + return -ENOMEM;
> +
> + spin_lock(&memtype_lock);
> + ret = memtype_split(start, end, addr, entry_new);
> + spin_unlock(&memtype_lock);
> +
> + if (ret) {
> + pr_err("x86/PAT: %s:%d splitting invalid memtype [mem %#010Lx-%#010Lx]\n",
> + current->comm, current->pid, start, end - 1);
> + kfree(entry_new);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> /*
> * Internal interface to free a range of physical memory.
> * Frees non RAM regions only.
> @@ -1072,6 +1112,25 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
> return 0;
> }
>
> +int track_pfn_split(struct vm_area_struct *vma, unsigned long addr)
> +{
> + unsigned long vma_size = vma->vm_end - vma->vm_start;
> + resource_size_t start_paddr, split_paddr;
> + int ret;
> +
> + if (vma->vm_flags & VM_PAT) {
> + ret = get_pat_info(vma, &start_paddr, NULL);
> + if (ret)
> + return ret;
> +
> + split_paddr = start_paddr + addr - vma->vm_start;
> +
> + return split_pfn_range(start_paddr, start_paddr + vma_size, split_paddr);
> + }
> +
> + return 0;
> +}
> +
> void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn)
> {
> enum page_cache_mode pcm;
> diff --git a/arch/x86/mm/pat/memtype.h b/arch/x86/mm/pat/memtype.h
> index cacecdbceb55..e01dc2018ab6 100644
> --- a/arch/x86/mm/pat/memtype.h
> +++ b/arch/x86/mm/pat/memtype.h
> @@ -31,6 +31,7 @@ static inline char *cattr_name(enum page_cache_mode pcm)
> #ifdef CONFIG_X86_PAT
> extern int memtype_check_insert(struct memtype *entry_new,
> enum page_cache_mode *new_type);
> +extern int memtype_split(u64 start, u64 end, u64 addr, struct memtype *entry_new);
I think we are dropping unnecessary externs now.
> extern struct memtype *memtype_erase(u64 start, u64 end);
> extern struct memtype *memtype_lookup(u64 addr);
> extern int memtype_copy_nth_element(struct memtype *entry_out, loff_t pos);
> @@ -38,6 +39,8 @@ extern int memtype_copy_nth_element(struct memtype *entry_out, loff_t pos);
> static inline int memtype_check_insert(struct memtype *entry_new,
> enum page_cache_mode *new_type)
> { return 0; }
> +static inline int memtype_split(u64 start, u64 end, u64 addr, struct memtype *entry_new)
> +{ return 0; }
> static inline struct memtype *memtype_erase(u64 start, u64 end)
> { return NULL; }
> static inline struct memtype *memtype_lookup(u64 addr)
> diff --git a/arch/x86/mm/pat/memtype_interval.c b/arch/x86/mm/pat/memtype_interval.c
> index 645613d59942..c75d9ee6b72f 100644
> --- a/arch/x86/mm/pat/memtype_interval.c
> +++ b/arch/x86/mm/pat/memtype_interval.c
> @@ -128,6 +128,28 @@ int memtype_check_insert(struct memtype *entry_new, enum page_cache_mode *ret_ty
> return 0;
> }
>
> +int memtype_split(u64 start, u64 end, u64 addr, struct memtype *entry_new)
> +{
> + struct memtype *entry_old;
> +
> + entry_old = memtype_match(start, end, MEMTYPE_EXACT_MATCH);
> + if (!entry_old)
> + return -EINVAL;
> +
> + interval_remove(entry_old, &memtype_rbroot);
> +
> + entry_new->start = addr;
> + entry_new->end = entry_old->end;
> + entry_new->type = entry_old->type;
> +
> + entry_old->end = addr;
> +
> + interval_insert(entry_old, &memtype_rbroot);
> + interval_insert(entry_new, &memtype_rbroot);
> +
> + return 0;
> +}
> +
> struct memtype *memtype_erase(u64 start, u64 end)
> {
> struct memtype *entry_old;
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 2a6a3cccfc36..8bfc8d0f5dd2 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1502,6 +1502,11 @@ static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
> return 0;
> }
>
> +static inline int track_pfn_split(struct vm_area_struct *vma, unsigned long addr)
> +{
> + return 0;
> +}
> +
> /*
> * track_pfn_insert is called when a _new_ single pfn is established
> * by vmf_insert_pfn().
> @@ -1542,6 +1547,7 @@ static inline void untrack_pfn_clear(struct vm_area_struct *vma)
> extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
> unsigned long pfn, unsigned long addr,
> unsigned long size);
> +extern int track_pfn_split(struct vm_area_struct *vma, unsigned long addr);
Same extern comment here.
> extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
> pfn_t pfn);
> extern int track_pfn_copy(struct vm_area_struct *vma);
> diff --git a/mm/mmap.c b/mm/mmap.c
> index d0dfc85b209b..64067ddb8382 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2486,6 +2486,12 @@ static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
> if (err)
> goto out_free_mpol;
>
> + if (unlikely(vma->vm_flags & VM_PFNMAP)) {
It is also a bit odd that you check VM_PFNMAP() here, then call a
function to check another flag?
> + err = track_pfn_split(vma, addr);
> + if (err)
> + goto out_vma_unlink;
> + }
> +
I don't think the __split_vma() location is the best place to put this.
Can this be done through the vm_ops->may_split() that is called above?
This is arch independent code that now has an x86 specific check, and
I'd like to keep __split_vma() out of the flag checking. The only error
after the vm_ops check is ENOMEM (without any extra GFP restrictions on
the allocations), you don't need the new vma, and use the same arguments
passed to vm_ops->may_split().
> if (new->vm_file)
> get_file(new->vm_file);
>
> @@ -2515,6 +2521,8 @@ static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
> vma_next(vmi);
> return 0;
>
> +out_vma_unlink:
> + unlink_anon_vmas(vma);
> out_free_mpol:
> mpol_put(vma_policy(new));
> out_free_vmi:
> --
> 2.39.2
>
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-08-26 13:58 ` Liam R. Howlett
@ 2024-08-27 7:58 ` Nam Cao
2024-08-27 16:01 ` Liam R. Howlett
0 siblings, 1 reply; 13+ messages in thread
From: Nam Cao @ 2024-08-27 7:58 UTC (permalink / raw)
To: Liam R. Howlett, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H. Peter Anvin, Andrew Morton, Vlastimil Babka, Lorenzo Stoakes,
linux-kernel, linux-mm, bigeasy
On Mon, Aug 26, 2024 at 09:58:11AM -0400, Liam R. Howlett wrote:
> * Nam Cao <namcao@linutronix.de> [240825 11:29]:
> > When a virtual memory area (VMA) gets split, memtype_rbroot's entries
> > are not updated. This causes confusion later on when the VMAs get
> > un-mapped, because the address ranges of the split VMAs do not match the
> > address range of the initial VMA.
> >
> > For example, if user does:
> >
> > fd = open("/some/pci/bar", O_RDWR);
> > addr = mmap(0, 8192, PROT_READ, MAP_SHARED, fd, 0);
> > mprotect(addr, 4096, PROT_READ | PROT_WRITE);
> > munmap(p, 8192);
> >
> > with the physical address starting from 0xfd000000, the range
> > (0xfd000000-0xfd002000) would be tracked with the mmap() call.
> >
> > After mprotect(), the initial range gets split into
> > (0xfd000000-0xfd001000) and (0xfd001000-0xfd002000).
> >
> > Then, at munmap(), the first range does not match any entry in
> > memtype_rbroot, and a message is seen in dmesg:
> >
> > x86/PAT: test:177 freeing invalid memtype [mem 0xfd000000-0xfd000fff]
> >
> > The second range still matches by accident, because matching only the end
> > address is acceptable (to handle shrinking VMA, added by 2039e6acaf94
> > (x86/mm/pat: Change free_memtype() to support shrinking case)).
>
> Does this need a fixes tag?
Yes, it should have
Fixes: 2e5d9c857d4e ("x86: PAT infrastructure patch")
thanks for the reminder.
>
> >
> > Make sure VMA splitting is handled properly, by splitting the entries in
> > memtype_rbroot.
> >
> > Signed-off-by: Nam Cao <namcao@linutronix.de>
> > ---
> > arch/x86/mm/pat/memtype.c | 59 ++++++++++++++++++++++++++++++
> > arch/x86/mm/pat/memtype.h | 3 ++
> > arch/x86/mm/pat/memtype_interval.c | 22 +++++++++++
> > include/linux/pgtable.h | 6 +++
> > mm/mmap.c | 8 ++++
> > 5 files changed, 98 insertions(+)
> >
> > diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
> > index bdc2a240c2aa..b60019478a76 100644
> > --- a/arch/x86/mm/pat/memtype.c
> > +++ b/arch/x86/mm/pat/memtype.c
> > @@ -935,6 +935,46 @@ static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t *vma_prot,
> > return 0;
> > }
> >
> > +static int split_pfn_range(u64 start, u64 end, u64 addr)
> > +{
> > + struct memtype *entry_new;
> > + int is_range_ram, ret;
> > +
> > + if (!pat_enabled())
> > + return 0;
> > +
> > + start = sanitize_phys(start);
> > + end = sanitize_phys(end - 1) + 1;
> > +
> > + /* Low ISA region is not tracked, it is always mapped WB */
> > + if (x86_platform.is_untracked_pat_range(start, end))
> > + return 0;
> > +
> > + is_range_ram = pat_pagerange_is_ram(start, end);
> > + if (is_range_ram == 1)
> > + return 0;
> > +
> > + if (is_range_ram < 0)
> > + return -EINVAL;
> > +
> > + entry_new = kmalloc(sizeof(*entry_new), GFP_KERNEL);
> > + if (!entry_new)
> > + return -ENOMEM;
> > +
> > + spin_lock(&memtype_lock);
> > + ret = memtype_split(start, end, addr, entry_new);
> > + spin_unlock(&memtype_lock);
> > +
> > + if (ret) {
> > + pr_err("x86/PAT: %s:%d splitting invalid memtype [mem %#010Lx-%#010Lx]\n",
> > + current->comm, current->pid, start, end - 1);
> > + kfree(entry_new);
> > + return ret;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > /*
> > * Internal interface to free a range of physical memory.
> > * Frees non RAM regions only.
> > @@ -1072,6 +1112,25 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
> > return 0;
> > }
> >
> > +int track_pfn_split(struct vm_area_struct *vma, unsigned long addr)
> > +{
> > + unsigned long vma_size = vma->vm_end - vma->vm_start;
> > + resource_size_t start_paddr, split_paddr;
> > + int ret;
> > +
> > + if (vma->vm_flags & VM_PAT) {
> > + ret = get_pat_info(vma, &start_paddr, NULL);
> > + if (ret)
> > + return ret;
> > +
> > + split_paddr = start_paddr + addr - vma->vm_start;
> > +
> > + return split_pfn_range(start_paddr, start_paddr + vma_size, split_paddr);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn)
> > {
> > enum page_cache_mode pcm;
> > diff --git a/arch/x86/mm/pat/memtype.h b/arch/x86/mm/pat/memtype.h
> > index cacecdbceb55..e01dc2018ab6 100644
> > --- a/arch/x86/mm/pat/memtype.h
> > +++ b/arch/x86/mm/pat/memtype.h
> > @@ -31,6 +31,7 @@ static inline char *cattr_name(enum page_cache_mode pcm)
> > #ifdef CONFIG_X86_PAT
> > extern int memtype_check_insert(struct memtype *entry_new,
> > enum page_cache_mode *new_type);
> > +extern int memtype_split(u64 start, u64 end, u64 addr, struct memtype *entry_new);
>
> I think we are dropping unnecessary externs now.
It would look a bit odd, since the surrounding declarations all have
"extern". I have no strong preference, so if you prefer it that way, then
sure.
>
> > extern struct memtype *memtype_erase(u64 start, u64 end);
> > extern struct memtype *memtype_lookup(u64 addr);
> > extern int memtype_copy_nth_element(struct memtype *entry_out, loff_t pos);
> > @@ -38,6 +39,8 @@ extern int memtype_copy_nth_element(struct memtype *entry_out, loff_t pos);
> > static inline int memtype_check_insert(struct memtype *entry_new,
> > enum page_cache_mode *new_type)
> > { return 0; }
> > +static inline int memtype_split(u64 start, u64 end, u64 addr, struct memtype *entry_new)
> > +{ return 0; }
> > static inline struct memtype *memtype_erase(u64 start, u64 end)
> > { return NULL; }
> > static inline struct memtype *memtype_lookup(u64 addr)
> > diff --git a/arch/x86/mm/pat/memtype_interval.c b/arch/x86/mm/pat/memtype_interval.c
> > index 645613d59942..c75d9ee6b72f 100644
> > --- a/arch/x86/mm/pat/memtype_interval.c
> > +++ b/arch/x86/mm/pat/memtype_interval.c
> > @@ -128,6 +128,28 @@ int memtype_check_insert(struct memtype *entry_new, enum page_cache_mode *ret_ty
> > return 0;
> > }
> >
> > +int memtype_split(u64 start, u64 end, u64 addr, struct memtype *entry_new)
> > +{
> > + struct memtype *entry_old;
> > +
> > + entry_old = memtype_match(start, end, MEMTYPE_EXACT_MATCH);
> > + if (!entry_old)
> > + return -EINVAL;
> > +
> > + interval_remove(entry_old, &memtype_rbroot);
> > +
> > + entry_new->start = addr;
> > + entry_new->end = entry_old->end;
> > + entry_new->type = entry_old->type;
> > +
> > + entry_old->end = addr;
> > +
> > + interval_insert(entry_old, &memtype_rbroot);
> > + interval_insert(entry_new, &memtype_rbroot);
> > +
> > + return 0;
> > +}
> > +
> > struct memtype *memtype_erase(u64 start, u64 end)
> > {
> > struct memtype *entry_old;
> > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> > index 2a6a3cccfc36..8bfc8d0f5dd2 100644
> > --- a/include/linux/pgtable.h
> > +++ b/include/linux/pgtable.h
> > @@ -1502,6 +1502,11 @@ static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
> > return 0;
> > }
> >
> > +static inline int track_pfn_split(struct vm_area_struct *vma, unsigned long addr)
> > +{
> > + return 0;
> > +}
> > +
> > /*
> > * track_pfn_insert is called when a _new_ single pfn is established
> > * by vmf_insert_pfn().
> > @@ -1542,6 +1547,7 @@ static inline void untrack_pfn_clear(struct vm_area_struct *vma)
> > extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
> > unsigned long pfn, unsigned long addr,
> > unsigned long size);
> > +extern int track_pfn_split(struct vm_area_struct *vma, unsigned long addr);
>
> Same extern comment here.
Same answer as above.
>
> > extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
> > pfn_t pfn);
> > extern int track_pfn_copy(struct vm_area_struct *vma);
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index d0dfc85b209b..64067ddb8382 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2486,6 +2486,12 @@ static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > if (err)
> > goto out_free_mpol;
> >
> > + if (unlikely(vma->vm_flags & VM_PFNMAP)) {
>
> It is also a bit odd that you check VM_PFNMAP() here, then call a
> function to check another flag?
Right, this check is redundant, thanks for pointing it out.
I stole this "style" from unmap_single_vma(), but I think the check is
redundant there as well.
>
> > + err = track_pfn_split(vma, addr);
> > + if (err)
> > + goto out_vma_unlink;
> > + }
> > +
>
> I don't think the __split_vma() location is the best place to put this.
> Can this be done through the vm_ops->may_split() that is called above?
I don't think ->may_split() is a suitable place. Its name gives me the
impression that it only checks whether it is okay to split the VMA, but
does not actually do any splitting work. Also, that function pointer can be
overridden by any driver.
>
> This is arch independent code that now has an x86 specific check, and
> I'd like to keep __split_vma() out of the flag checking.
I think these track_pfn_*() functions are meant to be arch-independent;
it's just that only x86 implements them at the moment. For instance,
untrack_pfn() and track_pfn_remap() are called in mm/ code.
> The only error
> after the vm_ops check is ENOMEM (without any extra GFP restrictions on
> the allocations), you don't need the new vma, and use the same arguments
> passed to vm_ops->may_split().
>
>
> > if (new->vm_file)
> > get_file(new->vm_file);
> >
> > @@ -2515,6 +2521,8 @@ static int __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > vma_next(vmi);
> > return 0;
> >
> > +out_vma_unlink:
> > + unlink_anon_vmas(vma);
> > out_free_mpol:
> > mpol_put(vma_policy(new));
> > out_free_vmi:
> > --
> > 2.39.2
> >
Thanks for the comments,
Nam
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-08-27 7:58 ` Nam Cao
@ 2024-08-27 16:01 ` Liam R. Howlett
2024-09-03 10:36 ` Nam Cao
0 siblings, 1 reply; 13+ messages in thread
From: Liam R. Howlett @ 2024-08-27 16:01 UTC (permalink / raw)
To: Nam Cao
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Andrew Morton,
Vlastimil Babka, Lorenzo Stoakes, linux-kernel, linux-mm, bigeasy
* Nam Cao <namcao@linutronix.de> [240827 03:59]:
> On Mon, Aug 26, 2024 at 09:58:11AM -0400, Liam R. Howlett wrote:
> > * Nam Cao <namcao@linutronix.de> [240825 11:29]:
> > > When a virtual memory area (VMA) gets split, memtype_rbroot's entries
> > > are not updated. This causes confusion later on when the VMAs get
> > > un-mapped, because the address ranges of the split VMAs do not match the
> > > address range of the initial VMA.
> > >
> > > For example, if user does:
> > >
> > > fd = open("/some/pci/bar", O_RDWR);
> > > addr = mmap(0, 8192, PROT_READ, MAP_SHARED, fd, 0);
> > > mprotect(addr, 4096, PROT_READ | PROT_WRITE);
> > > munmap(p, 8192);
What is p? By the comments below, you mean addr here?
> > >
> > > with the physical address starting from 0xfd000000, the range
> > > (0xfd000000-0xfd002000) would be tracked with the mmap() call.
> > >
> > > After mprotect(), the initial range gets split into
> > > (0xfd000000-0xfd001000) and (0xfd001000-0xfd002000).
> > >
> > > Then, at munmap(), the first range does not match any entry in
> > > memtype_rbroot, and a message is seen in dmesg:
> > >
> > > x86/PAT: test:177 freeing invalid memtype [mem 0xfd000000-0xfd000fff]
> > >
> > > The second range still matches by accident, because matching only the end
> > > address is acceptable (to handle shrinking VMA, added by 2039e6acaf94
> > > (x86/mm/pat: Change free_memtype() to support shrinking case)).
> >
> > Does this need a fixes tag?
>
> Yes, it should have
> Fixes: 2e5d9c857d4e ("x86: PAT infrastructure patch")
> thanks for the reminder.
That commit is from 2008; is there a bug report on this issue?
>
> >
> > >
> > > Make sure VMA splitting is handled properly, by splitting the entries in
> > > memtype_rbroot.
> > >
> > > Signed-off-by: Nam Cao <namcao@linutronix.de>
> > > ---
> > > arch/x86/mm/pat/memtype.c | 59 ++++++++++++++++++++++++++++++
> > > arch/x86/mm/pat/memtype.h | 3 ++
> > > arch/x86/mm/pat/memtype_interval.c | 22 +++++++++++
> > > include/linux/pgtable.h | 6 +++
> > > mm/mmap.c | 8 ++++
> > > 5 files changed, 98 insertions(+)
> > >
...
> >
> > It is also a bit odd that you check VM_PFNMAP() here, then call a
> > function to check another flag?
>
> Right, this check is redundant, thanks for pointing it out.
>
> I stole this "style" from unmap_single_vma(), but I think the check is
> redundant there as well.
If you have identified a redundant check, can you please remove it with
a separate patch?
>
> >
> > > + err = track_pfn_split(vma, addr);
> > > + if (err)
> > > + goto out_vma_unlink;
> > > + }
> > > +
> >
> > I don't think the __split_vma() location is the best place to put this.
> > Can this be done through the vm_ops->may_split() that is called above?
>
> I don't think ->may_split() is a suitable place. Its name gives me the
> impression that it only checks whether it is okay to split the VMA, but
> does not actually do any splitting work. Also, that function pointer can be
> overridden by any driver.
It's a callback that takes the arguments you need and is called as long
as it exists. Your function would deny splitting if it failed, so it
may not split in that case.
Also, any driver that overwrites it should do what is necessary for PAT
then. I don't love the idea of using the vm_ops either, I just like it
better than dropping in flag checks and arch-specific code. I can see an
issue with using the callback and drivers that may have their own vma
mapping that also uses PAT, I guess.
> >
> > This is arch independent code that now has an x86 specific check, and
> > I'd like to keep __split_vma() out of the flag checking.
>
> I think these track_pfn_*() functions are meant to be arch-independent;
> it's just that only x86 implements them at the moment. For instance,
> untrack_pfn() and track_pfn_remap() are called in mm/ code.
>
Arch-independent wrappers that are only used by one arch are not
arch-independent. PAT has been around for ages and only exists for x86
and x86_64.
We just went through removing arch_unmap(), which was used just for ppc.
They cause problems for general mm changes and just get in the way. If
we can avoid them, we should.
memtype_interval.c doesn't have any knowledge of the vmas, so you have
this abstraction layer in memtype.c that is being bypassed here for the
memtype_erase(), which ensures the start and end match, or at least that
the end matches.
So your comment about the second range still matching by accident is
misleading - it's not matched at all because you are searching for the
exact match or the end address being the same (which it isn't in your
interval tree).
Taking a step back here, you are splitting a range in an interval tree
to match a vma split, but you aren't splitting the range based on PAT
changing; you are splitting it based on the vma becoming two vmas.
Since VM_PFNMAP is in VM_SPECIAL, the splitting is never undone and will
continue to fragment the interval tree, so even if flags change back to
match each other there will always be two vmas - and what changed may
not even be the PAT.
So the interval split should occur when the PAT changes and needs to be
tracked differently. This does not happen when the vma is split - it
happens when a vma is removed or when the PAT is changed.
And, indeed, for the mremap() shrinking case, you already support
finding a range by just the end and have an abstraction layer. The
problem here is that you don't check by the start - but you could. You
could make the change to memtype_erase() to search for the exact, end,
or start and do what is necessary to shrink off the front of a region as
well.
What I find very strange is that 2039e6acaf94 ("x86/mm/pat: Change
free_memtype() to support shrinking case") enables shrinking of
VM_PFNMAP, but doesn't allow shrinking the end address. Why is one
allowed and the other not allowed?
Thanks,
Liam
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-08-27 16:01 ` Liam R. Howlett
@ 2024-09-03 10:36 ` Nam Cao
2024-09-03 15:56 ` Liam R. Howlett
0 siblings, 1 reply; 13+ messages in thread
From: Nam Cao @ 2024-09-03 10:36 UTC (permalink / raw)
To: Liam R. Howlett, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H. Peter Anvin, Andrew Morton, Vlastimil Babka, Lorenzo Stoakes,
linux-kernel, linux-mm, bigeasy
Sorry for the late reply, I was a bit busy, and needed some time to digest
your email.
On Tue, Aug 27, 2024 at 12:01:28PM -0400, Liam R. Howlett wrote:
> * Nam Cao <namcao@linutronix.de> [240827 03:59]:
> > On Mon, Aug 26, 2024 at 09:58:11AM -0400, Liam R. Howlett wrote:
> > > * Nam Cao <namcao@linutronix.de> [240825 11:29]:
> > > > When a virtual memory area (VMA) gets split, memtype_rbroot's entries
> > > > are not updated. This causes confusion later on when the VMAs get
> > > > un-mapped, because the address ranges of the split VMAs do not match the
> > > > address range of the initial VMA.
> > > >
> > > > For example, if user does:
> > > >
> > > > fd = open("/some/pci/bar", O_RDWR);
> > > > addr = mmap(0, 8192, PROT_READ, MAP_SHARED, fd, 0);
> > > > mprotect(addr, 4096, PROT_READ | PROT_WRITE);
> > > > munmap(p, 8192);
>
> What is p? By the comments below, you mean addr here?
Yes, it should be addr. Sorry about that.
>
> > > >
> > > > with the physical address starting from 0xfd000000, the range
> > > > (0xfd000000-0xfd002000) would be tracked with the mmap() call.
> > > >
> > > > After mprotect(), the initial range gets split into
> > > > (0xfd000000-0xfd001000) and (0xfd001000-0xfd002000).
> > > >
> > > > Then, at munmap(), the first range does not match any entry in
> > > > memtype_rbroot, and a message is seen in dmesg:
> > > >
> > > > x86/PAT: test:177 freeing invalid memtype [mem 0xfd000000-0xfd000fff]
> > > >
> > > > The second range still matches by accident, because matching only the end
> > > > address is acceptable (to handle shrinking VMA, added by 2039e6acaf94
> > > > (x86/mm/pat: Change free_memtype() to support shrinking case)).
> > >
> > > Does this need a fixes tag?
> >
> > Yes, it should have
> > Fixes: 2e5d9c857d4e ("x86: PAT infrastructure patch")
> > thanks for the reminder.
>
> That commit is from 2008, is there a bug report on this issue?
Not that I am aware of. I'm not entirely sure why, but I would guess due to
the combination of:
- This is not an issue for pages in RAM
- This only happens if VMAs are split
- The only user-visible effect is a pr_info(), and people may miss it.
I only encountered this issue while "trying to be smart" with mprotect() on
a portion of mmap()-ed device memory; I guess not many people do that.
>
> >
> > >
> > > >
> > > > Make sure VMA splitting is handled properly, by splitting the entries in
> > > > memtype_rbroot.
> > > >
> > > > Signed-off-by: Nam Cao <namcao@linutronix.de>
> > > > ---
> > > > arch/x86/mm/pat/memtype.c | 59 ++++++++++++++++++++++++++++++
> > > > arch/x86/mm/pat/memtype.h | 3 ++
> > > > arch/x86/mm/pat/memtype_interval.c | 22 +++++++++++
> > > > include/linux/pgtable.h | 6 +++
> > > > mm/mmap.c | 8 ++++
> > > > 5 files changed, 98 insertions(+)
> > > >
> ...
>
> > >
> > > It is also a bit odd that you check VM_PFNMAP() here, then call a
> > > function to check another flag?
> >
> > Right, this check is redundant, thanks for pointing it out.
> >
> > I stole this "style" from unmap_single_vma(), but I think the check is
> > redundant there as well.
>
> If you have identified a redundant check, can you please remove it with
> a separate patch?
Sure.
>
> >
> > >
> > > > + err = track_pfn_split(vma, addr);
> > > > + if (err)
> > > > + goto out_vma_unlink;
> > > > + }
> > > > +
> > >
> > > I don't think the __split_vma() location is the best place to put this.
> > > Can this be done through the vm_ops->may_split() that is called above?
> >
> > I don't think ->may_split() is a suitable place. Its name gives me the
> > impression that it only checks whether it is okay to split the VMA, but not
> > really does any splitting work. Also that function pointer can be
> > overwritten by any driver.
>
> It's a callback that takes the arguments you need and is called as long
> as it exists. Your function would deny splitting if it failed, so it
> may not split in that case.
>
> Also, any driver that overwrites it should do what is necessary for PAT
> then. I don't love the idea of using the vm_ops either, I just like it
> better than dropping in flag checks and arch-specific code. I can see
> issues with using the callback and drivers that may have their own vma
> mapping that also use PAT, I guess.
Yeah I don't love this. You mentioned another approach below, which I
think would be the best (if it's possible). I will attempt that other
approach.
>
> > >
> > > This is arch independent code that now has an x86 specific check, and
> > > I'd like to keep __split_vma() out of the flag checking.
> >
> > I think these track_pfn_*() functions are meant to be arch-independent,
> > it's just that only x86 implements it at the moment. For instance,
> > untrack_pfn() and track_pfn_remap() are called in mm/ code.
> >
>
> Arch-independent wrappers that are only used by one arch are not
> arch-independent. PAT has been around for ages and only exists for x86
> and x86_64.
>
> We just went through removing arch_unmap(), which was used just for ppc.
> They cause problems for general mm changes and just get in the way. If
> we can avoid them, we should.
>
> memtype_interval.c doesn't have any knowledge of the vmas, so you have
> this abstraction layer in memtype.c that is being bypassed here for the
> memtype_erase(); ensuring the start-end match or at least the end
> matches.
>
> So your comment about the second range still matching by accident is
> misleading - it's not matched at all because you are searching for the
> exact match or the end address being the same (which it isn't in your
> interval tree).
But the second range *does* match, because the end addresses match?
The second range is (0xfd001000-0xfd002000), which matches with
(0xfd000000-0xfd002000) in the interval tree.
Perhaps I should be clearer in the description..
>
> Taking a step back here, you are splitting a range in an interval tree
> to match a vma split, but you aren't splitting the range based on PAT
> changing; you are splitting it based on the vma becoming two vmas.
>
> Since VM_PFNMAP is in VM_SPECIAL, the splitting is never undone and will
> continue to fragment the interval tree, so even if flags change back to
> match each other there will always be two vmas - and what changed may
> not even be the PAT.
Right, I did not consider this scenario.
>
> So the interval split should occur when the PAT changes and needs to be
> tracked differently. This does not happen when the vma is split - it
> happens when a vma is removed or when the PAT is changed.
>
> And, indeed, for the mremap() shrinking case, you already support
> finding a range by just the end and have an abstraction layer. The
> problem here is that you don't check by the start - but you could. You
> could make the change to memtype_erase() to search for the exact, end,
> or start and do what is necessary to shrink off the front of a region as
> well.
I thought about this solution initially, but since the interval tree allows
overlapping ranges, it can be tricky to determine the "best match" out
of the overlapping ranges. But I agree that this approach (if possible)
would be better than the current patch.
Let me think about this some more, and I will come back later.
>
> What I find very strange is that 2039e6acaf94 ("x86/mm/pat: Change
> free_memtype() to support shrinking case") enables shrinking of
> VM_PFNMAP, but doesn't allow shrinking the end address. Why is one
> allowed and the other not allowed?
Not really sure what you mean. I think you are ultimately asking why that
commit only matches end address, and not start address? That's because
mremap() may shrink a VMA from [start, end] to [start, new_end] (with
new_end < end). In that case, the range [new_end, end] would be removed
from the interval tree, and that commit wants to match [new_end, end] to
[start, end].
And I don't think mremap() can shrink [start, end] to [new_start, end]?
Thanks for sharing your thoughts.
Best regards,
Nam
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-09-03 10:36 ` Nam Cao
@ 2024-09-03 15:56 ` Liam R. Howlett
2024-09-04 7:59 ` Nam Cao
0 siblings, 1 reply; 13+ messages in thread
From: Liam R. Howlett @ 2024-09-03 15:56 UTC (permalink / raw)
To: Nam Cao
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Andrew Morton,
Vlastimil Babka, Lorenzo Stoakes, linux-kernel, linux-mm, bigeasy
* Nam Cao <namcao@linutronix.de> [240903 06:36]:
> Sorry for the late reply, I was a bit busy, and needed some time to digest
> your email.
No problem.
>
> On Tue, Aug 27, 2024 at 12:01:28PM -0400, Liam R. Howlett wrote:
> > * Nam Cao <namcao@linutronix.de> [240827 03:59]:
> > > On Mon, Aug 26, 2024 at 09:58:11AM -0400, Liam R. Howlett wrote:
> > > > * Nam Cao <namcao@linutronix.de> [240825 11:29]:
> > > > > When a virtual memory area (VMA) gets split, memtype_rbroot's entries
> > > > > are not updated. This causes confusion later on when the VMAs get
> > > > > un-mapped, because the address ranges of the split VMAs do not match the
> > > > > address range of the initial VMA.
> > > > >
> > > > > For example, if user does:
> > > > >
> > > > > fd = open("/some/pci/bar", O_RDWR);
> > > > > addr = mmap(0, 8192, PROT_READ, MAP_SHARED, fd, 0);
> > > > > mprotect(addr, 4096, PROT_READ | PROT_WRITE);
> > > > > munmap(p, 8192);
> >
> > What is p? By the comments below, you mean addr here?
> Yes, it should be addr. Sorry about that.
>
> >
> > > > >
> > > > > with the physical address starting from 0xfd000000, the range
> > > > > (0xfd000000-0xfd002000) would be tracked with the mmap() call.
> > > > >
> > > > > After mprotect(), the initial range gets split into
> > > > > (0xfd000000-0xfd001000) and (0xfd001000-0xfd002000).
> > > > >
> > > > > Then, at munmap(), the first range does not match any entry in
> > > > > memtype_rbroot, and a message is seen in dmesg:
> > > > >
> > > > > x86/PAT: test:177 freeing invalid memtype [mem 0xfd000000-0xfd000fff]
> > > > >
> > > > > The second range still matches by accident, because matching only the end
> > > > > address is acceptable (to handle shrinking VMA, added by 2039e6acaf94
> > > > > (x86/mm/pat: Change free_memtype() to support shrinking case)).
> > > >
> > > > Does this need a fixes tag?
> > >
> > > Yes, it should have
> > > Fixes: 2e5d9c857d4e ("x86: PAT infrastructure patch")
> > > thanks for the reminder.
> >
> > That commit is from 2008, is there a bug report on this issue?
>
> Not that I am aware of. I'm not entirely sure why, but I would guess due to
> the combination of:
> - This is not an issue for pages in RAM
> - This only happens if VMAs are split
> - The only user-visible effect is merely a pr_info(), and people may miss it.
>
> I only encountered this issue while "trying to be smart" with mprotect() on
> a portion of mmap()-ed device memory, I guess probably not many people do
> that.
Or test it. I would have thought some bots would have caught this.
Although the log message is just pr_info()? That seems wrong - we have
an error in the vma tree or the PAT tree and it's just an info printk?
...
> > So your comment about the second range still matching by accident is
> > misleading - it's not matched at all because you are searching for the
> > exact match or the end address being the same (which it isn't in your
> > interval tree).
>
> But the second range *does* match, because the end addresses match?
> The second range is (0xfd001000-0xfd002000), which matches with
> (0xfd000000-0xfd002000) in the interval tree.
>
> Perhaps I should be clearer in the description..
I see, yes. The error is with the first entry not being found.
...
> >
> > So the interval split should occur when the PAT changes and needs to be
> > tracked differently. This does not happen when the vma is split - it
> > happens when a vma is removed or when the PAT is changed.
> >
> > And, indeed, for the mremap() shrinking case, you already support
> > finding a range by just the end and have an abstraction layer. The
> > problem here is that you don't check by the start - but you could. You
> > could make the change to memtype_erase() to search for the exact, end,
> > or start and do what is necessary to shrink off the front of a region as
> > well.
>
> > I thought about this solution initially, but since the interval tree allows
> overlapping ranges, it can be tricky to determine the "best match" out
> of the overlapping ranges. But I agree that this approach (if possible)
> would be better than the current patch.
>
> Let me think about this some more, and I will come back later.
Reading this some more, I believe you can detect the correct address by
matching the start address with the smallest end address (the smallest
interval has to be the entry created by the vma mapping).
>
> >
> > What I find very strange is that 2039e6acaf94 ("x86/mm/pat: Change
> > free_memtype() to support shrinking case") enables shrinking of
> > VM_PFNMAP, but doesn't allow shrinking the end address. Why is one
> > allowed and the other not allowed?
>
> Not really sure what you mean. I think you are ultimately asking why that
> commit only matches end address, and not start address? That's because
> mremap() may shrink a VMA from [start, end] to [start, new_end] (with
> new_end < end). In that case, the range [new_end, end] would be removed
> from the interval tree, and that commit wants to match [new_end, end] to
> [start, end].
> And I don't think mremap() can shrink [start, end] to [new_start, end]?
Even an untrack_pfn() call will only remove the first entry that
matches exactly or the end. Since the tree is sorted by start address,
I guess the smallest (since it's not specified if it's ordered
descending or ascending, and smaller makes more sense) interval will be
deleted? That is, a vma cannot span different attributes but attributes
can span vmas.
Oh wow, this also means if you unmap the end vma first, you will not
have an issue because the memtype_erase() (incorrectly named now) will
resize your PAT entry to match the start vma range.
I wonder what would happen in the "punch a hole" scenario where we
move/MAP_FIXED/unmap the middle of a vma.
My point is that it is unclear how the interval tree tracks the
PAT-to-vma mappings (a clearer comment would be nice). It seems
inconsistent, and the situation you found should be handled in the
translation layer, not the generic code.
Thanks,
Liam
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-09-03 15:56 ` Liam R. Howlett
@ 2024-09-04 7:59 ` Nam Cao
2024-09-04 18:40 ` Liam R. Howlett
0 siblings, 1 reply; 13+ messages in thread
From: Nam Cao @ 2024-09-04 7:59 UTC (permalink / raw)
To: Liam R. Howlett, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H. Peter Anvin, Andrew Morton, Vlastimil Babka, Lorenzo Stoakes,
linux-kernel, linux-mm, bigeasy
On Tue, Sep 03, 2024 at 11:56:57AM -0400, Liam R. Howlett wrote:
> * Nam Cao <namcao@linutronix.de> [240903 06:36]:
...
> > On Tue, Aug 27, 2024 at 12:01:28PM -0400, Liam R. Howlett wrote:
> > > * Nam Cao <namcao@linutronix.de> [240827 03:59]:
> > > > On Mon, Aug 26, 2024 at 09:58:11AM -0400, Liam R. Howlett wrote:
> > > > > * Nam Cao <namcao@linutronix.de> [240825 11:29]:
...
> > > > > >
> > > > > > with the physical address starting from 0xfd000000, the range
> > > > > > (0xfd000000-0xfd002000) would be tracked with the mmap() call.
> > > > > >
> > > > > > After mprotect(), the initial range gets split into
> > > > > > (0xfd000000-0xfd001000) and (0xfd001000-0xfd002000).
> > > > > >
> > > > > > Then, at munmap(), the first range does not match any entry in
> > > > > > memtype_rbroot, and a message is seen in dmesg:
> > > > > >
> > > > > > x86/PAT: test:177 freeing invalid memtype [mem 0xfd000000-0xfd000fff]
> > > > > >
> > > > > > The second range still matches by accident, because matching only the end
> > > > > > address is acceptable (to handle shrinking VMA, added by 2039e6acaf94
> > > > > > (x86/mm/pat: Change free_memtype() to support shrinking case)).
> > > > >
> > > > > Does this need a fixes tag?
> > > >
> > > > Yes, it should have
> > > > Fixes: 2e5d9c857d4e ("x86: PAT infrastructure patch")
> > > > thanks for the reminder.
> > >
> > > That commit is from 2008, is there a bug report on this issue?
> >
> > Not that I am aware of. I'm not entirely sure why, but I would guess due to
> > the combination of:
> > - This is not an issue for pages in RAM
> > - This only happens if VMAs are split
> > - The only user-visible effect is merely a pr_info(), and people may miss it.
> >
> > I only encountered this issue while "trying to be smart" with mprotect() on
> > a portion of mmap()-ed device memory, I guess probably not many people do
> > that.
>
> Or test it. I would have thought some bots would have caught this.
> Although the log message is just pr_info()? That seems wrong - we have
> an error in the vma tree or the PAT tree and it's just an info printk?
Yeah right, I think pr_info() is another issue, it should be pr_warn() or
pr_err(). That is probably another patch.
...
> > >
> > > So the interval split should occur when the PAT changes and needs to be
> > > tracked differently. This does not happen when the vma is split - it
> > > happens when a vma is removed or when the PAT is changed.
> > >
> > > And, indeed, for the mremap() shrinking case, you already support
> > > finding a range by just the end and have an abstraction layer. The
> > > problem here is that you don't check by the start - but you could. You
> > > could make the change to memtype_erase() to search for the exact, end,
> > > or start and do what is necessary to shrink off the front of a region as
> > > well.
> >
> > I thought about this solution initially, but since the interval tree allows
> > overlapping ranges, it can be tricky to determine the "best match" out
> > of the overlapping ranges. But I agree that this approach (if possible)
> > would be better than the current patch.
> >
> > Let me think about this some more, and I will come back later.
>
> Reading this some more, I believe you can detect the correct address by
> matching the start address with the smallest end address (the smallest
> interval has to be the entry created by the vma mapping).
I don't think that would cover all cases. For example, if the tree has 2
intervals: [0x0000-0x2000] and [0x1000-0x3000]. Now, the mm subsystem tells
us that the interval [0x1000-0x2000] needs to be removed (e.g. user does
munmap()), your proposal would match this to the second interval. After the
removal, the tree has [0x0000-0x2000] and [0x2000-0x3000].
Then, mm subsystem says [0x1000-0x3000] should be removed, and that doesn't
match anything. Turns out, the first removal was meant for the first
interval, but we didn't have enough information at the time to determine
that.
Bottom line is, it is not possible to correctly match [0x1000-0x2000] to
[0x0000-0x2000] and [0x1000-0x3000]: both matches can be valid.
>
> >
> > >
> > > What I find very strange is that 2039e6acaf94 ("x86/mm/pat: Change
> > > free_memtype() to support shrinking case") enables shrinking of
> > > VM_PFNMAP, but doesn't allow shrinking the end address. Why is one
> > > allowed and the other not allowed?
> >
> > Not really sure what you mean. I think you are ultimately asking why that
> > commit only matches end address, and not start address? That's because
> > mremap() may shrink a VMA from [start, end] to [start, new_end] (with
> > new_end < end). In that case, the range [new_end, end] would be removed
> > from the interval tree, and that commit wants to match [new_end, end] to
> > [start, end].
> > And I don't think mremap() can shrink [start, end] to [new_start, end]?
>
> Even an untrack_pfn() call will only remove the first entry that
> matches exactly or the end. Since the tree is sorted by start address,
> I guess the smallest (since it's not specified if it's ordered
> descending or ascending, and smaller makes more sense) interval will be
> deleted? That is, a vma cannot span different attributes but attributes
> can span vmas.
>
> Oh wow, this also means if you unmap the end vma first, you will not
> have an issue because the memtype_erase() (incorrectly named now) will
> resize your PAT entry to match the start vma range.
Right. Also funnily, if I run the test program in the description multiple
times, only the first run causes the message in dmesg; the following runs
do not see any problem, because of a happy accident: the first run "leaks"
the 0xfd000000-0xfd000fff interval in the tree, which is accidentally
matched by the following runs.
>
> I wonder what would happen in the "punch a hole" scenario where we
> move/MAP_FIXED/unmap the middle of a vma.
If we think more about it, I'm sure we will come up with more scenarios
that are broken with the current implementation.
>
> My point is that it is unclear as to how the interval tree tracks the
> PAT to vma mappings (a more clean comment would be nice). It seems
> inconsistent and the situation you found should be handled in the
> translation layer, and not the generic code.
One solution I can think of: stop allowing overlapping intervals. Instead,
the overlapping portions would be split into new intervals with some
reference counting. memtype_erase() would need to be modified to:
- assemble the potentially split intervals
- split the intervals if needed
The point is, there wouldn't be any confusion with matching overlapping
intervals.
I will give it a try when I have some time, unless someone sees a problem
with it or has a better idea.
Best regards,
Nam
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-09-04 7:59 ` Nam Cao
@ 2024-09-04 18:40 ` Liam R. Howlett
2024-09-04 21:29 ` Dave Hansen
2024-09-05 20:08 ` Nam Cao
0 siblings, 2 replies; 13+ messages in thread
From: Liam R. Howlett @ 2024-09-04 18:40 UTC (permalink / raw)
To: Nam Cao
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, Andrew Morton,
Vlastimil Babka, Lorenzo Stoakes, linux-kernel, linux-mm, bigeasy
* Nam Cao <namcao@linutronix.de> [240904 03:59]:
> On Tue, Sep 03, 2024 at 11:56:57AM -0400, Liam R. Howlett wrote:
> > * Nam Cao <namcao@linutronix.de> [240903 06:36]:
> ...
> > > On Tue, Aug 27, 2024 at 12:01:28PM -0400, Liam R. Howlett wrote:
> > > > * Nam Cao <namcao@linutronix.de> [240827 03:59]:
> > > > > On Mon, Aug 26, 2024 at 09:58:11AM -0400, Liam R. Howlett wrote:
> > > > > > * Nam Cao <namcao@linutronix.de> [240825 11:29]:
> ...
> > > > > > >
> > > > > > > with the physical address starting from 0xfd000000, the range
> > > > > > > (0xfd000000-0xfd002000) would be tracked with the mmap() call.
> > > > > > >
> > > > > > > After mprotect(), the initial range gets split into
> > > > > > > (0xfd000000-0xfd001000) and (0xfd001000-0xfd002000).
> > > > > > >
> > > > > > > Then, at munmap(), the first range does not match any entry in
> > > > > > > memtype_rbroot, and a message is seen in dmesg:
> > > > > > >
> > > > > > > x86/PAT: test:177 freeing invalid memtype [mem 0xfd000000-0xfd000fff]
> > > > > > >
> > > > > > > The second range still matches by accident, because matching only the end
> > > > > > > address is acceptable (to handle shrinking VMA, added by 2039e6acaf94
> > > > > > > (x86/mm/pat: Change free_memtype() to support shrinking case)).
> > > > > >
> > > > > > Does this need a fixes tag?
> > > > >
> > > > > Yes, it should have
> > > > > Fixes: 2e5d9c857d4e ("x86: PAT infrastructure patch")
> > > > > thanks for the reminder.
> > > >
> > > > That commit is from 2008, is there a bug report on this issue?
> > >
> > > Not that I am aware of. I'm not entirely sure why, but I would guess due to
> > > the combination of:
> > > - This is not an issue for pages in RAM
> > > - This only happens if VMAs are split
> > > - The only user-visible effect is merely a pr_info(), and people may miss it.
> > >
> > > I only encountered this issue while "trying to be smart" with mprotect() on
> > > a portion of mmap()-ed device memory, I guess probably not many people do
> > > that.
> >
> > Or test it. I would have thought some bots would have caught this.
> > Although the log message is just pr_info()? That seems wrong - we have
> > an error in the vma tree or the PAT tree and it's just an info printk?
>
> Yeah right, I think pr_info() is another issue, it should be pr_warn() or
> pr_err(). That is probably another patch.
Agreed.
>
> ...
> > > >
> > > > So the interval split should occur when the PAT changes and needs to be
> > > > tracked differently. This does not happen when the vma is split - it
> > > > happens when a vma is removed or when the PAT is changed.
> > > >
> > > > And, indeed, for the mremap() shrinking case, you already support
> > > > finding a range by just the end and have an abstraction layer. The
> > > > problem here is that you don't check by the start - but you could. You
> > > > could make the change to memtype_erase() to search for the exact, end,
> > > > or start and do what is necessary to shrink off the front of a region as
> > > > well.
> > >
> > > I thought about this solution initially, but since the interval tree allows
> > > overlapping ranges, it can be tricky to determine the "best match" out
> > > of the overlapping ranges. But I agree that this approach (if possible)
> > > would be better than the current patch.
> > >
> > > Let me think about this some more, and I will come back later.
> >
> > Reading this some more, I believe you can detect the correct address by
> > matching the start address with the smallest end address (the smallest
> > interval has to be the entry created by the vma mapping).
>
> I don't think that would cover all cases. For example, if the tree has 2
> intervals: [0x0000-0x2000] and [0x1000-0x3000]. Now, the mm subsystem tells
> us that the interval [0x1000-0x2000] needs to be removed (e.g. user does
> munmap()), your proposal would match this to the second interval. After the
> removal, the tree has [0x0000-0x2000] and [0x2000-0x3000].
>
> Then, mm subsystem says [0x1000-0x3000] should be removed, and that doesn't
> match anything. Turns out, the first removal was meant for the first
> interval, but we didn't have enough information at the time to determine
> that.
>
> Bottom line is, it is not possible to correctly match [0x1000-0x2000] to
> [0x0000-0x2000] and [0x1000-0x3000]: both matches can be valid.
But those ranges won't exist. What appears to be happening in this code
is that there are higher levels of non-overlapping ranges with
memory (cache) types (or none are defined), which are tracked at page
granularity. So we can't have a page that has two memory types.
The overlapping happens later, when the vmas are mapped. And we are
ensuring that the mappings of the vmas match the higher, larger areas.
The vmas are inserted with memtype_check_insert() which calls
memtype_check_conflict() that ensures any overlapping areas have the
same type as the one being added, so either there is no match or the
interval(s) with this page is set to a specific type. I suspect there
can only really be one range.
So I don't think overlapping areas like above could exist. The vma
cache type has to be the same throughout. It has to be the same type as
all overlapping areas.
Also, your ranges are inclusive while the ranges passed in seem to be
exclusive on the end address, so your example would look more like:
[0x0000-0x2000) [0x2000-0x3000).
You can see this documented in memtype_reserve() where sanitize_phys()
is called.
So we could have a VMA of [0x1000-0x2000), but this vma would have to be
in the first range. [0x0000-0x1000) would also be in the first range.
I think that searching for the smallest area containing the entry will
yield the desired entry in the interval tree.
Note that there is debugging support in the Documentation so you can go
look at what is in there with debugfs.
...
> One solution I can think of: stop allowing overlapping intervals. Instead,
> the overlapping portions would be split into new intervals with some
> reference counting. memtype_erase() would need to be modified to:
> - assemble the potentially split intervals
> - split the intervals if needed
> The point is, there wouldn't be any confusion with matching overlapping
> intervals.
>
> I will give it a try when I have some time, unless someone sees a problem
> with it or has a better idea.
I don't think this will work at all. It depends on overlapping
ranges to ensure the vmas match what is allowed in certain areas.
Thanks,
Liam
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-09-04 18:40 ` Liam R. Howlett
@ 2024-09-04 21:29 ` Dave Hansen
2024-09-05 20:09 ` Nam Cao
2024-09-05 20:08 ` Nam Cao
1 sibling, 1 reply; 13+ messages in thread
From: Dave Hansen @ 2024-09-04 21:29 UTC (permalink / raw)
To: Liam R. Howlett, Nam Cao, Dave Hansen, Andy Lutomirski,
Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
x86, H. Peter Anvin, Andrew Morton, Vlastimil Babka,
Lorenzo Stoakes, linux-kernel, linux-mm, bigeasy
On 9/4/24 11:40, Liam R. Howlett wrote:
> But those ranges won't exist. What appears to be happening in this code
> is that there are higher levels of non-overlapping ranges with
> memory (cache) types (or none are defined), which are tracked at page
> granularity. So we can't have a page that has two memory types.
Yeah, that's the key. Each page should be uniquely covered by one and
only one tree leaf.
Nam, I didn't see your original patch in my inbox and I don't see it on
lore either. Is there something funky going on there?
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-09-04 18:40 ` Liam R. Howlett
2024-09-04 21:29 ` Dave Hansen
@ 2024-09-05 20:08 ` Nam Cao
2024-09-05 20:52 ` Dave Hansen
1 sibling, 1 reply; 13+ messages in thread
From: Nam Cao @ 2024-09-05 20:08 UTC (permalink / raw)
To: Liam R. Howlett, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H. Peter Anvin, Andrew Morton, Vlastimil Babka, Lorenzo Stoakes,
linux-kernel, linux-mm, bigeasy
On Wed, Sep 04, 2024 at 02:40:34PM -0400, Liam R. Howlett wrote:
> * Nam Cao <namcao@linutronix.de> [240904 03:59]:
> > On Tue, Sep 03, 2024 at 11:56:57AM -0400, Liam R. Howlett wrote:
> > > * Nam Cao <namcao@linutronix.de> [240903 06:36]:
> > ...
> > > > On Tue, Aug 27, 2024 at 12:01:28PM -0400, Liam R. Howlett wrote:
> > > > > * Nam Cao <namcao@linutronix.de> [240827 03:59]:
> > > > > > On Mon, Aug 26, 2024 at 09:58:11AM -0400, Liam R. Howlett wrote:
> > > > > > > * Nam Cao <namcao@linutronix.de> [240825 11:29]:
> > > > > So the interval split should occur when the PAT changes and needs to be
> > > > > tracked differently. This does not happen when the vma is split - it
> > > > > happens when a vma is removed or when the PAT is changed.
> > > > >
> > > > > And, indeed, for the mremap() shrinking case, you already support
> > > > > finding a range by just the end and have an abstraction layer. The
> > > > > problem here is that you don't check by the start - but you could. You
> > > > > could make the change to memtype_erase() to search for the exact, end,
> > > > > or start and do what is necessary to shrink off the front of a region as
> > > > > well.
> > > >
> > > > I thought about this solution initially, but since the interval tree allows
> > > > overlapping ranges, it can be tricky to determine the "best match" out
> > > > of the overlapping ranges. But I agree that this approach (if possible)
> > > > would be better than the current patch.
> > > >
> > > > Let me think about this some more, and I will come back later.
> > >
> > > Reading this some more, I believe you can detect the correct address by
> > > matching the start address with the smallest end address (the smallest
> > > interval has to be the entry created by the vma mapping).
> >
> > I don't think that would cover all cases. For example, if the tree has 2
> > intervals: [0x0000-0x2000] and [0x1000-0x3000]. Now, the mm subsystem tells
> > us that the interval [0x1000-0x2000] needs to be removed (e.g. user does
> > munmap()), your proposal would match this to the second interval. After the
> > removal, the tree has [0x0000-0x2000] and [0x2000-0x3000].
> >
> > Then, mm subsystem says [0x1000-0x3000] should be removed, and that doesn't
> > match anything. Turns out, the first removal was meant for the first
> > interval, but we didn't have enough information at the time to determine
> > that.
> >
> > Bottom line is, it is not possible to correctly match [0x1000-0x2000] to
> > [0x0000-0x2000] and [0x1000-0x3000]: both matches can be valid.
>
> But those ranges won't exist. What appears to be happening in this code
> is that there are higher levels of non-overlapping ranges with
> memory (cache) types (or none are defined), which are tracked at page
> granularity. So we can't have a page that has two memory types.
>
> The overlapping happens later, when the vmas are mapped. And we are
> ensuring that the mappings of the vmas match the higher, larger areas.
> The vmas are inserted with memtype_check_insert() which calls
> memtype_check_conflict() that ensures any overlapping areas have the
> same type as the one being added, so either there is no match or the
> interval(s) with this page is set to a specific type. I suspect there
> can only really be one range.
>
> So I don't think overlapping areas like above could exist. The vma
> cache type has to be the same throughout. It has to be the same type as
> all overlapping areas.
Dave agreed with you, so I am likely the confused one, but I still think
the overlapping areas as I described do exist. For example, this userspace
code:
#include <stdio.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

#define PCI_BAR "/sys/devices/pci0000:00/0000:00:02.0/resource0"

int main(void)
{
	void *p1, *p2;
	int fd;

	fd = open(PCI_BAR, O_RDWR);

	// track 0xfd000000-0xfd001fff
	p1 = mmap(0, 0x2000, PROT_READ, MAP_SHARED, fd, 0);

	// track 0xfd001000-0xfd002fff
	p2 = mmap(0, 0x2000, PROT_READ, MAP_SHARED, fd, 0x1000);

	// untrack 0xfd001000-0xfd001fff
	munmap(p2, 0x1000);

	// untrack 0xfd000000-0xfd001fff
	munmap(p1, 0x2000);

	// untrack 0xfd002000-0xfd002fff
	munmap(p2 + 0x1000, 0x1000);
}
If I pause this program right after the two mmap(), before any munmap(),
then:
$cat /sys/kernel/debug/x86/pat_memtype_list
PAT memtype list:
PAT: [mem 0x00000000bffe0000-0x00000000bffe2000] write-back
PAT: [mem 0x00000000bffe1000-0x00000000bffe2000] write-back
PAT: [mem 0x00000000fd000000-0x00000000fd002000] uncached-minus <-- what I described
PAT: [mem 0x00000000fd001000-0x00000000fd003000] uncached-minus <-- what I described
PAT: [mem 0x00000000febc0000-0x00000000febe0000] uncached-minus
PAT: [mem 0x00000000fed00000-0x00000000fed01000] uncached-minus
PAT: [mem 0x00000000fed00000-0x00000000fed01000] uncached-minus
The 2 mmap() call would create the overlapping intervals as I described.
Then I let the C program run to completion and see what happens in dmesg:
x86/PAT: memtype_reserve added [mem 0xfd000000-0xfd001fff], track uncached-minus, req uncached-minus, ret uncached-minus
x86/PAT: Overlap at 0xfd000000-0xfd002000
x86/PAT: memtype_reserve added [mem 0xfd001000-0xfd002fff], track uncached-minus, req uncached-minus, ret uncached-minus
x86/PAT: memtype_free request [mem 0xfd001000-0xfd001fff]
x86/PAT: test:178 freeing invalid memtype [mem 0xfd000000-0xfd001fff]
x86/PAT: memtype_free request [mem 0xfd002000-0xfd002fff]
The problem I am raising is the first munmap() call:
[0xfd001000-0xfd001fff] would be untracked, but there is no way to tell for
sure which interval it belongs to. The current implementation matches it to
the first range, but it actually belongs to the second range. This
incorrect matching results in the "freeing invalid memtype" warning later on.
Hopefully I'm not being an idiot and wasting everyone's time..
>
> Also, your ranges are inclusive while the ranges passed in seem to be
> exclusive on the end address, so your example would look more like:
> [0x0000-0x2000) [0x2000-0x3000).
>
> You can see this documented in memtype_reserve() where sanitize_phys()
> is called.
>
> So we could have a VMA of [0x1000-0x2000), but this vma would have to be
> in the first range. [0x0000-0x0FFF) would also be in the first range.
>
> I think that searching for the smallest area containing the entry will
> yield the desired entry in the interval tree.
>
> Note that there is debugging support in the Documentation so you can go
> look at what is in there with debugfs.
>
> ...
>
> > One solution I can think of: stop allowing overlapping intervals. Instead,
> > the overlapping portions would be split into new intervals with some
> > reference counting. memtype_erase() would need to be modified to:
> > - assemble the potentially split intervals
> > - split the intervals if needed
> > The point is, there wouldn't be any confusion with matching overlapping
> > intervals.
> >
> > I will give it a try when I have some time, unless someone sees a problem
> > with it or has a better idea.
>
> I don't think this will work at all. It is dependent on overlapping
> ranges to ensure the vmas match what is allowed in certain areas.
We can ensure that the cache type is the same before splitting, so I think
it can work? But let's clear up the other disagreement first.
Best regards,
Nam
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-09-04 21:29 ` Dave Hansen
@ 2024-09-05 20:09 ` Nam Cao
0 siblings, 0 replies; 13+ messages in thread
From: Nam Cao @ 2024-09-05 20:09 UTC (permalink / raw)
To: Dave Hansen
Cc: Liam R. Howlett, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H. Peter Anvin, Andrew Morton, Vlastimil Babka, Lorenzo Stoakes,
linux-kernel, linux-mm, bigeasy
On Wed, Sep 04, 2024 at 02:29:47PM -0700, Dave Hansen wrote:
...
> Nam, I didn't see your original patch in my inbox and I don't see it on
> lore either. Is there something funky going on there?
Broken email setup on my side :(
My future patches should be received correctly.
Best regards,
Nam
* Re: [PATCH] x86/mm/pat: Support splitting of virtual memory areas
2024-09-05 20:08 ` Nam Cao
@ 2024-09-05 20:52 ` Dave Hansen
0 siblings, 0 replies; 13+ messages in thread
From: Dave Hansen @ 2024-09-05 20:52 UTC (permalink / raw)
To: Nam Cao, Liam R. Howlett, Dave Hansen, Andy Lutomirski,
Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
x86, H. Peter Anvin, Andrew Morton, Vlastimil Babka,
Lorenzo Stoakes, linux-kernel, linux-mm, bigeasy
On 9/5/24 13:08, Nam Cao wrote:
>
> If I pause this program right after the two mmap(), before any munmap(),
> then:
> $cat /sys/kernel/debug/x86/pat_memtype_list
> PAT memtype list:
> PAT: [mem 0x00000000bffe0000-0x00000000bffe2000] write-back
> PAT: [mem 0x00000000bffe1000-0x00000000bffe2000] write-back
> PAT: [mem 0x00000000fd000000-0x00000000fd002000] uncached-minus <-- what I described
> PAT: [mem 0x00000000fd001000-0x00000000fd003000] uncached-minus <-- what I described
> PAT: [mem 0x00000000febc0000-0x00000000febe0000] uncached-minus
Well, that's not what I had in mind, so I'm obviously the confused one.
Let me take a look through your example and see if I can offer any
alternatives.
end of thread, other threads:[~2024-09-05 20:53 UTC | newest]
Thread overview: 13+ messages
[not found] <20240825152403.3171682-1-namcao@linutronix.de>
2024-08-25 16:04 ` [PATCH] x86/mm/pat: Support splitting of virtual memory areas Lorenzo Stoakes
2024-08-26 7:11 ` Nam Cao
2024-08-26 13:58 ` Liam R. Howlett
2024-08-27 7:58 ` Nam Cao
2024-08-27 16:01 ` Liam R. Howlett
2024-09-03 10:36 ` Nam Cao
2024-09-03 15:56 ` Liam R. Howlett
2024-09-04 7:59 ` Nam Cao
2024-09-04 18:40 ` Liam R. Howlett
2024-09-04 21:29 ` Dave Hansen
2024-09-05 20:09 ` Nam Cao
2024-09-05 20:08 ` Nam Cao
2024-09-05 20:52 ` Dave Hansen