linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
@ 2025-08-08  9:21 Charan Teja Kalla
  2025-08-08 12:01 ` David Hildenbrand
  2025-08-11 12:07 ` Lorenzo Stoakes
  0 siblings, 2 replies; 16+ messages in thread
From: Charan Teja Kalla @ 2025-08-08  9:21 UTC (permalink / raw)
  To: akpm, shikemeng, kasong, nphamcs, bhe, baohua, chrisl, david
  Cc: linux-mm, linux-kernel, Charan Teja Kalla

It is possible to hit a zero entry while traversing the vmas in
unuse_mm(), called from the swapoff path. Failing to check for the zero
entry means it is operated on as a vma, which leads to an oops.

The issue manifests from the below race between fork() on a
process and swapoff:
fork(dup_mmap())			swapoff(unuse_mm)
---------------                         -----------------
1) Identical mtree is built using
   __mt_dup().

2) copy_pte_range()-->
	copy_nonpresent_pte():
       The dst mm is added into the
    mmlist to be visible to the
    swapoff operation.

3) A fatal signal is sent to the parent
process (the current task during the
fork), so duplication of the vmas is
skipped and the remaining vma range is
marked with XA_ZERO_ENTRY, a marker for
this process that helps during exit_mmap().

				     4) swapoff is tried on the
					'mm' added to the 'mmlist' as
					part of step 2.

				     5) unuse_mm(), which iterates
					through the vmas of this 'mm',
					will hit the non-NULL zero
					entry; operating on this zero
					entry as a vma results in the
					oops.

Crash reported:
--------------
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000446 --> loading memory at offset 0x40 from the
XA_ZERO_ENTRY value used as an address.
Mem abort info:
  ESR = 0x0000000096000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault

 pc : try_to_unuse+0x188/0x79c
 lr : try_to_unuse+0x440/0x79c
 Call trace:
  try_to_unuse+0x188/0x79c
  __arm64_sys_swapoff+0x248/0x4d0
  invoke_syscall+0x58/0x10c
  el0_svc_common+0xa8/0xdc
  do_el0_svc+0x1c/0x28
  el0_svc+0x38/0x88
  el0t_64_sync_handler+0x70/0xbc
  el0t_64_sync+0x1a8/0x1ac

Fix this by checking whether the vma is a zero entry before operating on
it.

Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
---
 mm/swapfile.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 68ce283e84be..91513830ef9c 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2245,6 +2245,12 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
 
 	mmap_read_lock(mm);
 	for_each_vma(vmi, vma) {
+		/*
+		 * zero entries in mm_struct->mm_mt is a marker to stop
+		 * looking for vma's. see comment in exit_mmap().
+		 */
+		if (xa_is_zero(vma))
+			break;
 		if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
 			ret = unuse_vma(vma, type);
 			if (ret)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-08  9:21 [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path Charan Teja Kalla
@ 2025-08-08 12:01 ` David Hildenbrand
  2025-08-08 12:04   ` David Hildenbrand
  2025-08-11 12:07 ` Lorenzo Stoakes
  1 sibling, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2025-08-08 12:01 UTC (permalink / raw)
  To: Charan Teja Kalla, akpm, shikemeng, kasong, nphamcs, bhe, baohua,
	chrisl
  Cc: linux-mm, linux-kernel, Liam R. Howlett, Lorenzo Stoakes

On 08.08.25 11:21, Charan Teja Kalla wrote:
> It is possible to hit a zero entry while traversing the vmas in
> unuse_mm(), called from the swapoff path. Not checking the zero entry
> can result into operating on it as vma which leads into oops.
> 
> The issue is manifested from the below race between the fork() on a
> process and swapoff:
> fork(dup_mmap())			swapoff(unuse_mm)
> ---------------                         -----------------
> 1) Identical mtree is built using
>     __mt_dup().
> 
> 2) copy_pte_range()-->
> 	copy_nonpresent_pte():
>         The dst mm is added into the
>      mmlist to be visible to the
>      swapoff operation.
> 
> 3) Fatal signal is sent to the parent
> process(which is the current during the
> fork) thus skip the duplication of the
> vmas and mark the vma range with
> XA_ZERO_ENTRY as a marker for this process
> that helps during exit_mmap().
> 
> 				     4) swapoff is tried on the
> 					'mm' added to the 'mmlist' as
> 					part of the 2.
> 
> 				     5) unuse_mm(), that iterates
> 					through the vma's of this 'mm'
> 					will hit the non-NULL zero entry
> 					and operating on this zero entry
> 					as a vma is resulting into the
> 					oops.
> 

That looks like something Liam or Lorenzo could help with reviewing.
I suspect a proper fix would be around not exposing this
partially-valid tree to others when dropping the mmap lock ...

While we dup the mm, the new process MM is write-locked -- see
dup_mmap() -- and unuse_mm() will read-lock the mmap_lock. So
in that period everything is fine.

I guess the culprit is in dup_mmap() when we do on error:

	} else {

		/*
		 * The entire maple tree has already been duplicated. If the
		 * mmap duplication fails, mark the failure point with
		 * XA_ZERO_ENTRY. In exit_mmap(), if this marker is encountered,
		 * stop releasing VMAs that have not been duplicated after this
		 * point.
		 */
		if (mpnt) {
			mas_set_range(&vmi.mas, mpnt->vm_start, mpnt->vm_end - 1);
			mas_store(&vmi.mas, XA_ZERO_ENTRY);
			/* Avoid OOM iterating a broken tree */
			set_bit(MMF_OOM_SKIP, &mm->flags);
		}
		/*
		 * The mm_struct is going to exit, but the locks will be dropped
		 * first.  Set the mm_struct as unstable is advisable as it is
		 * not fully initialised.
		 */
		set_bit(MMF_UNSTABLE, &mm->flags);
	}

Shouldn't we just remove anything from the tree here that was not copied
immediately?

-- 
Cheers,

David / dhildenb



* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-08 12:01 ` David Hildenbrand
@ 2025-08-08 12:04   ` David Hildenbrand
  2025-08-11  9:43     ` Charan Teja Kalla
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2025-08-08 12:04 UTC (permalink / raw)
  To: Charan Teja Kalla, akpm, shikemeng, kasong, nphamcs, bhe, baohua,
	chrisl
  Cc: linux-mm, linux-kernel, Liam R. Howlett, Lorenzo Stoakes

On 08.08.25 14:01, David Hildenbrand wrote:
> On 08.08.25 11:21, Charan Teja Kalla wrote:
>> It is possible to hit a zero entry while traversing the vmas in
>> unuse_mm(), called from the swapoff path. Not checking the zero entry
>> can result into operating on it as vma which leads into oops.
>>
>> The issue is manifested from the below race between the fork() on a
>> process and swapoff:
>> fork(dup_mmap())			swapoff(unuse_mm)
>> ---------------                         -----------------
>> 1) Identical mtree is built using
>>      __mt_dup().
>>
>> 2) copy_pte_range()-->
>> 	copy_nonpresent_pte():
>>          The dst mm is added into the
>>       mmlist to be visible to the
>>       swapoff operation.
>>
>> 3) Fatal signal is sent to the parent
>> process(which is the current during the
>> fork) thus skip the duplication of the
>> vmas and mark the vma range with
>> XA_ZERO_ENTRY as a marker for this process
>> that helps during exit_mmap().
>>
>> 				     4) swapoff is tried on the
>> 					'mm' added to the 'mmlist' as
>> 					part of the 2.
>>
>> 				     5) unuse_mm(), that iterates
>> 					through the vma's of this 'mm'
>> 					will hit the non-NULL zero entry
>> 					and operating on this zero entry
>> 					as a vma is resulting into the
>> 					oops.
>>
> 
> That looks like something Liam or Lorenzo could help with reviewing.
> I suspect a proper fix would be around not exposing this
> partially-valid tree to others when dropping the mmap lock ...
> 
> While we dup the mm, the new process MM is write-locked -- see
> dup_mmap() -- and unuse_mm() will read-lock the mmap_lock. So
> in that period everything is fine.
> 
> I guess the culprit is in dup_mmap() when we do on error:
> 
> 	} else {
> 
> 		/*
> 		 * The entire maple tree has already been duplicated. If the
> 		 * mmap duplication fails, mark the failure point with
> 		 * XA_ZERO_ENTRY. In exit_mmap(), if this marker is encountered,
> 		 * stop releasing VMAs that have not been duplicated after this
> 		 * point.
> 		 */
> 		if (mpnt) {
> 			mas_set_range(&vmi.mas, mpnt->vm_start, mpnt->vm_end - 1);
> 			mas_store(&vmi.mas, XA_ZERO_ENTRY);
> 			/* Avoid OOM iterating a broken tree */
> 			set_bit(MMF_OOM_SKIP, &mm->flags);
> 		}
> 		/*
> 		 * The mm_struct is going to exit, but the locks will be dropped
> 		 * first.  Set the mm_struct as unstable is advisable as it is
> 		 * not fully initialised.
> 		 */
> 		set_bit(MMF_UNSTABLE, &mm->flags);
> 	}
> 
> Shouldn't we just remove anything from the tree here that was not copied
> immediately?

Another fix would be to just check MMF_UNSTABLE in unuse_mm(). But 
having these MMF_UNSTABLE checks all over the place feels a bit like 
whack-a-mole.

Is there anything preventing us from just leaving a proper tree that 
reflects reality in place before we drop the write lock?

-- 
Cheers,

David / dhildenb



* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-08 12:04   ` David Hildenbrand
@ 2025-08-11  9:43     ` Charan Teja Kalla
  2025-08-11 12:14       ` Lorenzo Stoakes
  0 siblings, 1 reply; 16+ messages in thread
From: Charan Teja Kalla @ 2025-08-11  9:43 UTC (permalink / raw)
  To: David Hildenbrand, akpm, shikemeng, kasong, nphamcs, bhe, baohua,
	chrisl
  Cc: linux-mm, linux-kernel, Liam R. Howlett, Lorenzo Stoakes,
	Matthew Wilcox

Thanks David, for the reply!!
On 8/8/2025 5:34 PM, David Hildenbrand wrote:
>>         if (mpnt) {
>>             mas_set_range(&vmi.mas, mpnt->vm_start, mpnt->vm_end - 1);
>>             mas_store(&vmi.mas, XA_ZERO_ENTRY);
>>             /* Avoid OOM iterating a broken tree */
>>             set_bit(MMF_OOM_SKIP, &mm->flags);
>>         }
>>         /*
>>          * The mm_struct is going to exit, but the locks will be dropped
>>          * first.  Set the mm_struct as unstable is advisable as it is
>>          * not fully initialised.
>>          */
>>         set_bit(MMF_UNSTABLE, &mm->flags);
>>     }
>>
>> Shouldn't we just remove anything from the tree here that was not copied
>> immediately?
> 
> Another fix would be to just check MMF_UNSTABLE in unuse_mm(). But
> having these MMF_UNSTABLE checks all over the place feels a bit like
> whack-a-mole.
> 
Seems MMF_UNSTABLE is the expectation per commit
64c37e134b12 ("kernel: be more careful about dup_mmap() failures and
uprobe registering"). Excerpt(s) from the commit message:

This patch sets the MMF_OOM_SKIP to avoid the iteration of the vmas on
the oom side (even though this is extremely unlikely to be selected as
an oom victim in the race window), and __sets MMF_UNSTABLE to avoid
other potential users from using a partially initialised mm_struct.

When registering vmas for uprobe, skip the vmas in an mm that is marked
unstable.  Modifying a vma in an unstable mm may cause issues if the mm
isn't fully initialised.__

> Is there anything preventing us from just leaving a proper tree that
> reflects reality in place before we drop the write lock?

When you say proper tree, is this about your previous question? --
Shouldn't we just remove anything from the tree here that was not copied
immediately?

Anyway, would request Liam/Lorenzo to comment on this.

Thanks,
Charan



* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-08  9:21 [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path Charan Teja Kalla
  2025-08-08 12:01 ` David Hildenbrand
@ 2025-08-11 12:07 ` Lorenzo Stoakes
  2025-08-11 16:29   ` Charan Teja Kalla
  1 sibling, 1 reply; 16+ messages in thread
From: Lorenzo Stoakes @ 2025-08-11 12:07 UTC (permalink / raw)
  To: Charan Teja Kalla
  Cc: akpm, shikemeng, kasong, nphamcs, bhe, baohua, chrisl, david,
	linux-mm, linux-kernel

On Fri, Aug 08, 2025 at 02:51:56PM +0530, Charan Teja Kalla wrote:
> It is possible to hit a zero entry while traversing the vmas in
> unuse_mm(), called from the swapoff path. Not checking the zero entry
> can result into operating on it as vma which leads into oops.
>
> The issue is manifested from the below race between the fork() on a
> process and swapoff:
> fork(dup_mmap())			swapoff(unuse_mm)
> ---------------                         -----------------
> 1) Identical mtree is built using
>    __mt_dup().
>
> 2) copy_pte_range()-->
> 	copy_nonpresent_pte():
>        The dst mm is added into the
>     mmlist to be visible to the
>     swapoff operation.

Yeah that seems really not right.

We should only expose it to swapoff once the fork successfully completes.

This is surely the correct solution?

>
> 3) Fatal signal is sent to the parent
> process(which is the current during the
> fork) thus skip the duplication of the
> vmas and mark the vma range with
> XA_ZERO_ENTRY as a marker for this process
> that helps during exit_mmap().

Maybe we need to think about doing something else in the case of a fatal
signal, as this seems to make this wildly more likely than the
OOM-mid-fork cases we've seen syzbot reports about before?

Just a thought...



>
> 				     4) swapoff is tried on the
> 					'mm' added to the 'mmlist' as
> 					part of the 2.

Yeah again, why are we exposing an invalid mm tree to swapoff? That's crazy.

>
> 				     5) unuse_mm(), that iterates
> 					through the vma's of this 'mm'
> 					will hit the non-NULL zero entry
> 					and operating on this zero entry
> 					as a vma is resulting into the
> 					oops.
>
> Crash reported:
> --------------
> Unable to handle kernel NULL pointer dereference at virtual address
> 0000000000000446--> Loading the memory from offset 0x40 on the
> XA_ZERO_ENTRY as address.
> Mem abort info:
>   ESR = 0x0000000096000005
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
>   FSC = 0x05: level 1 translation fault
>
>  pc : try_to_unuse+0x188/0x79c
>  lr : try_to_unuse+0x440/0x79c
>  Call trace:
>   try_to_unuse+0x188/0x79c
>   __arm64_sys_swapoff+0x248/0x4d0
>   invoke_syscall+0x58/0x10c
>   el0_svc_common+0xa8/0xdc
>   do_el0_svc+0x1c/0x28
>   el0_svc+0x38/0x88
>   el0t_64_sync_handler+0x70/0xbc
>   el0t_64_sync+0x1a8/0x1ac
>
> Fix this by checking if vma is zero entry before operating on it.

Any syzbot link or further information as to where this bug was found? Was this
on your system?

Has this happened in reality or due to fault injection?

>
> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
> ---
>  mm/swapfile.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 68ce283e84be..91513830ef9c 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -2245,6 +2245,12 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
>
>  	mmap_read_lock(mm);
>  	for_each_vma(vmi, vma) {
> +		/*
> +		 * zero entries in mm_struct->mm_mt is a marker to stop
> +		 * looking for vma's. see comment in exit_mmap().
> +		 */

This comment needs to be _WAY_ more specific, and talk about the race.

> +		if (xa_is_zero(vma))
> +			break;

Yeah this seems wrong to me.

The solution is to not expose an incomplete mm to swapoff in the first place.

>  		if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
>  			ret = unuse_vma(vma, type);
>  			if (ret)
> --
> 2.34.1
>

I _wonder_ if we need something like this as a temporary hotfix, since it's
probably not otherwise harmful, even if it sucks horribly.

But the real fix is to not have swapoff get access to an invalid mm tree,
that's just crazy...


* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11  9:43     ` Charan Teja Kalla
@ 2025-08-11 12:14       ` Lorenzo Stoakes
  2025-08-11 13:03         ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: Lorenzo Stoakes @ 2025-08-11 12:14 UTC (permalink / raw)
  To: Charan Teja Kalla
  Cc: David Hildenbrand, akpm, shikemeng, kasong, nphamcs, bhe, baohua,
	chrisl, linux-mm, linux-kernel, Liam R. Howlett, Matthew Wilcox

On Mon, Aug 11, 2025 at 03:13:14PM +0530, Charan Teja Kalla wrote:
> Thanks David, for the reply!!
> On 8/8/2025 5:34 PM, David Hildenbrand wrote:
> >>         if (mpnt) {
> >>             mas_set_range(&vmi.mas, mpnt->vm_start, mpnt->vm_end - 1);
> >>             mas_store(&vmi.mas, XA_ZERO_ENTRY);
> >>             /* Avoid OOM iterating a broken tree */
> >>             set_bit(MMF_OOM_SKIP, &mm->flags);
> >>         }
> >>         /*
> >>          * The mm_struct is going to exit, but the locks will be dropped
> >>          * first.  Set the mm_struct as unstable is advisable as it is
> >>          * not fully initialised.
> >>          */
> >>         set_bit(MMF_UNSTABLE, &mm->flags);
> >>     }
> >>
> >> Shouldn't we just remove anything from the tree here that was not copied
> >> immediately?
> >
> > Another fix would be to just check MMF_UNSTABLE in unuse_mm(). But
> > having these MMF_UNSTABLE checks all over the place feels a bit like
> > whack-a-mole.
> >
> Seems MMF_UNSTABLE is the expectation per the commit,
> 64c37e134b12("kernel: be more careful about dup_mmap() failures and
> uprobe registering"). Excerpt(s) from the commit message:

This really is whack-a-mole yeah.

>
> This patch sets the MMF_OOM_SKIP to avoid the iteration of the vmas on
> the oom side (even though this is extremely unlikely to be selected as
> an oom victim in the race window), and __sets MMF_UNSTABLE to avoid
> other potential users from using a partially initialised mm_struct.
>

But... maybe this is better for the _hotfix_ version as a nicer way of
doing this.

> When registering vmas for uprobe, skip the vmas in an mm that is marked
> unstable.  Modifying a vma in an unstable mm may cause issues if the mm
> isn't fully initialised.__
>
> > Is there anything preventing us from just leaving a proper tree that
> > reflects reality in place before we drop the write lock?
>
> When you mean proper tree, is this about the your previous question? --
> Shouldn't we just remove anything from the tree here that was not copied
> immediately?

Commit d24062914837 ("fork: use __mt_dup() to duplicate maple tree in
dup_mmap()") did this for efficiency, so it'd be a regression to do this.

See
https://lore.kernel.org/linux-mm/20231016032226.59199-1-zhangpeng.00@bytedance.com/

>
> Anyway, would request Liam/Lorenzo to comment on this.

Hi! :)

Really want Liam's input on this too, as he looked at this previously and
was, I believe, at least potentially going to take a look around this whole
logic.

Fork is sadly an area in which things become horrible quick...

Cheers, Lorenzo


* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 12:14       ` Lorenzo Stoakes
@ 2025-08-11 13:03         ` David Hildenbrand
  2025-08-11 13:08           ` Lorenzo Stoakes
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2025-08-11 13:03 UTC (permalink / raw)
  To: Lorenzo Stoakes, Charan Teja Kalla
  Cc: akpm, shikemeng, kasong, nphamcs, bhe, baohua, chrisl, linux-mm,
	linux-kernel, Liam R. Howlett, Matthew Wilcox

On 11.08.25 14:14, Lorenzo Stoakes wrote:
> On Mon, Aug 11, 2025 at 03:13:14PM +0530, Charan Teja Kalla wrote:
>> Thanks David, for the reply!!
>> On 8/8/2025 5:34 PM, David Hildenbrand wrote:
>>>>          if (mpnt) {
>>>>              mas_set_range(&vmi.mas, mpnt->vm_start, mpnt->vm_end - 1);
>>>>              mas_store(&vmi.mas, XA_ZERO_ENTRY);
>>>>              /* Avoid OOM iterating a broken tree */
>>>>              set_bit(MMF_OOM_SKIP, &mm->flags);
>>>>          }
>>>>          /*
>>>>           * The mm_struct is going to exit, but the locks will be dropped
>>>>           * first.  Set the mm_struct as unstable is advisable as it is
>>>>           * not fully initialised.
>>>>           */
>>>>          set_bit(MMF_UNSTABLE, &mm->flags);
>>>>      }
>>>>
>>>> Shouldn't we just remove anything from the tree here that was not copied
>>>> immediately?
>>>
>>> Another fix would be to just check MMF_UNSTABLE in unuse_mm(). But
>>> having these MMF_UNSTABLE checks all over the place feels a bit like
>>> whack-a-mole.
>>>
>> Seems MMF_UNSTABLE is the expectation per the commit,
>> 64c37e134b12("kernel: be more careful about dup_mmap() failures and
>> uprobe registering"). Excerpt(s) from the commit message:
> 
> This really is whack-a-mole yeah.
> 
>>
>> This patch sets the MMF_OOM_SKIP to avoid the iteration of the vmas on
>> the oom side (even though this is extremely unlikely to be selected as
>> an oom victim in the race window), and __sets MMF_UNSTABLE to avoid
>> other potential users from using a partially initialised mm_struct.
>>
> 
> But... maybe this is better for the _hotfix_ version as a nicer way of
> doing this.

I would prefer using MMF_UNSTABLE as a hotfix.

> 
>> When registering vmas for uprobe, skip the vmas in an mm that is marked
>> unstable.  Modifying a vma in an unstable mm may cause issues if the mm
>> isn't fully initialised.__
>>
>>> Is there anything preventing us from just leaving a proper tree that
>>> reflects reality in place before we drop the write lock?
>>
>> When you mean proper tree, is this about the your previous question? --
>> Shouldn't we just remove anything from the tree here that was not copied
>> immediately?
> 
> Commit d24062914837 (" fork: use __mt_dup() to duplicate maple tree in
> dup_mmap()") did this for efficiency, so it'd be a regression to do this.

We're talking about the case where fork *fails*. That cannot possibly be 
relevant for performance, can it? :)

-- 
Cheers,

David / dhildenb



* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 13:03         ` David Hildenbrand
@ 2025-08-11 13:08           ` Lorenzo Stoakes
  2025-08-11 13:19             ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: Lorenzo Stoakes @ 2025-08-11 13:08 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Charan Teja Kalla, akpm, shikemeng, kasong, nphamcs, bhe, baohua,
	chrisl, linux-mm, linux-kernel, Liam R. Howlett, Matthew Wilcox

On Mon, Aug 11, 2025 at 03:03:36PM +0200, David Hildenbrand wrote:
> On 11.08.25 14:14, Lorenzo Stoakes wrote:
> > On Mon, Aug 11, 2025 at 03:13:14PM +0530, Charan Teja Kalla wrote:
> > > Thanks David, for the reply!!
> > > On 8/8/2025 5:34 PM, David Hildenbrand wrote:
> > > > >          if (mpnt) {
> > > > >              mas_set_range(&vmi.mas, mpnt->vm_start, mpnt->vm_end - 1);
> > > > >              mas_store(&vmi.mas, XA_ZERO_ENTRY);
> > > > >              /* Avoid OOM iterating a broken tree */
> > > > >              set_bit(MMF_OOM_SKIP, &mm->flags);
> > > > >          }
> > > > >          /*
> > > > >           * The mm_struct is going to exit, but the locks will be dropped
> > > > >           * first.  Set the mm_struct as unstable is advisable as it is
> > > > >           * not fully initialised.
> > > > >           */
> > > > >          set_bit(MMF_UNSTABLE, &mm->flags);
> > > > >      }
> > > > >
> > > > > Shouldn't we just remove anything from the tree here that was not copied
> > > > > immediately?
> > > >
> > > > Another fix would be to just check MMF_UNSTABLE in unuse_mm(). But
> > > > having these MMF_UNSTABLE checks all over the place feels a bit like
> > > > whack-a-mole.
> > > >
> > > Seems MMF_UNSTABLE is the expectation per the commit,
> > > 64c37e134b12("kernel: be more careful about dup_mmap() failures and
> > > uprobe registering"). Excerpt(s) from the commit message:
> >
> > This really is whack-a-mole yeah.
> >
> > >
> > > This patch sets the MMF_OOM_SKIP to avoid the iteration of the vmas on
> > > the oom side (even though this is extremely unlikely to be selected as
> > > an oom victim in the race window), and __sets MMF_UNSTABLE to avoid
> > > other potential users from using a partially initialised mm_struct.
> > >
> >
> > But... maybe this is better for the _hotfix_ version as a nicer way of
> > doing this.
>
> I would prefer using MMF_UNSTABLE as a hotfix.

Yeah, I think MMF_UNSTABLE is probably the way to go, this is what I was trying to
say :P

>
> >
> > > When registering vmas for uprobe, skip the vmas in an mm that is marked
> > > unstable.  Modifying a vma in an unstable mm may cause issues if the mm
> > > isn't fully initialised.__
> > >
> > > > Is there anything preventing us from just leaving a proper tree that
> > > > reflects reality in place before we drop the write lock?
> > >
> > > When you mean proper tree, is this about the your previous question? --
> > > Shouldn't we just remove anything from the tree here that was not copied
> > > immediately?
> >
> > Commit d24062914837 (" fork: use __mt_dup() to duplicate maple tree in
> > dup_mmap()") did this for efficiency, so it'd be a regression to do this.
>
> We're talking about the case where fork *fails*. That cannot possibly be
> relevant for performance, can it? :)

I think it optimises the overall operation, but as a product of that, has to
handle this edge case, and that necessitated this rather horrible stuff.

Obviously we don't need to optimise a 'we are about to die' case :)

See https://lore.kernel.org/linux-mm/20231016032226.59199-1-zhangpeng.00@bytedance.com/
for details.

Cheers, Lorenzo


* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 13:08           ` Lorenzo Stoakes
@ 2025-08-11 13:19             ` David Hildenbrand
  2025-08-11 13:22               ` Lorenzo Stoakes
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2025-08-11 13:19 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Charan Teja Kalla, akpm, shikemeng, kasong, nphamcs, bhe, baohua,
	chrisl, linux-mm, linux-kernel, Liam R. Howlett, Matthew Wilcox

>>>
>>>> When registering vmas for uprobe, skip the vmas in an mm that is marked
>>>> unstable.  Modifying a vma in an unstable mm may cause issues if the mm
>>>> isn't fully initialised.__
>>>>
>>>>> Is there anything preventing us from just leaving a proper tree that
>>>>> reflects reality in place before we drop the write lock?
>>>>
>>>> When you mean proper tree, is this about the your previous question? --
>>>> Shouldn't we just remove anything from the tree here that was not copied
>>>> immediately?
>>>
>>> Commit d24062914837 (" fork: use __mt_dup() to duplicate maple tree in
>>> dup_mmap()") did this for efficiency, so it'd be a regression to do this.
>>
>> We're talking about the case where fork *fails*. That cannot possibly be
>> relevant for performance, can it? :)
> 
> I think it optimises the overall operation, but as a product of that, has to
> > handle this edge case, and that necessitated this rather horrible stuff.
> 
> Obviously we don't need to optimise a 'we are about to die' case :)

Right, so my original question was whether we could just drop all stale 
stuff from the tree before we lift the mmap write lock, leaving only the 
VMAs in there that we actually processed successfully.

-- 
Cheers,

David / dhildenb



* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 13:19             ` David Hildenbrand
@ 2025-08-11 13:22               ` Lorenzo Stoakes
  2025-08-11 15:17                 ` Liam R. Howlett
  0 siblings, 1 reply; 16+ messages in thread
From: Lorenzo Stoakes @ 2025-08-11 13:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Charan Teja Kalla, akpm, shikemeng, kasong, nphamcs, bhe, baohua,
	chrisl, linux-mm, linux-kernel, Liam R. Howlett, Matthew Wilcox

On Mon, Aug 11, 2025 at 03:19:53PM +0200, David Hildenbrand wrote:
> > > >
> > > > > When registering vmas for uprobe, skip the vmas in an mm that is marked
> > > > > unstable.  Modifying a vma in an unstable mm may cause issues if the mm
> > > > > isn't fully initialised.__
> > > > >
> > > > > > Is there anything preventing us from just leaving a proper tree that
> > > > > > reflects reality in place before we drop the write lock?
> > > > >
> > > > > When you mean proper tree, is this about the your previous question? --
> > > > > Shouldn't we just remove anything from the tree here that was not copied
> > > > > immediately?
> > > >
> > > > Commit d24062914837 (" fork: use __mt_dup() to duplicate maple tree in
> > > > dup_mmap()") did this for efficiency, so it'd be a regression to do this.
> > >
> > > We're talking about the case where fork *fails*. That cannot possibly be
> > > relevant for performance, can it? :)
> >
> > I think it optimises the overall operation, but as a product of that, has to
> > > handle this edge case, and that necessitated this rather horrible stuff.
> >
> > Obviously we don't need to optimise a 'we are about to die' case :)
>
> Right, so my original question was whether we could just drop all stale
> stuff from the tree before we lift the mmap write lock, leaving only the
> VMAs in there that we actually processed successfully.

That'd be better answered by Liam who's more familiar with it.

I think it may actually be difficult to do on some level or there was some
reason we couldn't, but I may be mistaken.

>
> --
> Cheers,
>
> David / dhildenb
>


* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 13:22               ` Lorenzo Stoakes
@ 2025-08-11 15:17                 ` Liam R. Howlett
  2025-08-11 15:39                   ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: Liam R. Howlett @ 2025-08-11 15:17 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: David Hildenbrand, Charan Teja Kalla, akpm, shikemeng, kasong,
	nphamcs, bhe, baohua, chrisl, linux-mm, linux-kernel,
	Matthew Wilcox

* Lorenzo Stoakes <lorenzo.stoakes@oracle.com> [250811 09:22]:
> On Mon, Aug 11, 2025 at 03:19:53PM +0200, David Hildenbrand wrote:
> > > > >
> > > > > > When registering vmas for uprobe, skip the vmas in an mm that is marked
> > > > > > unstable.  Modifying a vma in an unstable mm may cause issues if the mm
> > > > > > isn't fully initialised.__
> > > > > >
> > > > > > > Is there anything preventing us from just leaving a proper tree that
> > > > > > > reflects reality in place before we drop the write lock?
> > > > > >
> > > > > > When you mean proper tree, is this about the your previous question? --
> > > > > > Shouldn't we just remove anything from the tree here that was not copied
> > > > > > immediately?
> > > > >
> > > > > Commit d24062914837 (" fork: use __mt_dup() to duplicate maple tree in
> > > > > dup_mmap()") did this for efficiency, so it'd be a regression to do this.
> > > >
> > > > We're talking about the case where fork *fails*. That cannot possibly be
> > > > relevant for performance, can it? :)
> > >
> > > I think it optimises the overall operation, but as a product of that, has to
> > > handle this edge case, and that necessitated this rather horrible stuff.
> > >
> > > Obviously we don't need to optimise a 'we are about to die' case :)
> >
> > Right, so my original question was whether we could just drop all stale
> > stuff from the tree before we lift the mmap write lock, leaving only the
> > VMAs in there that we actually processed successfully.
> 
> That'd be better answered by Liam who's more familiar with it.

The short answer is no, we cannot.

But some options and questions below..

> 
> I think it may actually be difficult to do on some level or there was some
> reason we couldn't, but I may be mistaken.

Down the rabbit hole we go..

The cloning of the tree happens by copying the tree in DFS and
replacing the old nodes with new nodes.  The tree leaves end up being
copied, which contains all the vmas (unless DONT_COPY is set, so
basically always all of them..).  When the tree is copied, we have a
duplicate of the tree with pointers to all the vmas in the old process.

The way the tree fails is that we've been unable to finish cloning it,
usually for out of memory reasons.  So, this means we have a tree with
new and exciting vmas that have never been used and old but still active
vmas in oldmm.

The failure point is then marked with an XA_ZERO_ENTRY, which will
succeed in storing as it's a direct replacement in the tree so no
allocations necessary.  Thus this is safe even in -ENOMEM scenarios.
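
As an aside on why the resulting oops faults where it does: here's a
userspace sketch of the marker's encoding (my own mimic of the internal
entries in <linux/xarray.h>, for illustration only).  The marker encodes
to 0x406, so a load at offset 0x40 through it faults at exactly the
0x446 address in the report:

```c
#include <assert.h>

/*
 * Userspace mimic of the internal-entry encoding (a sketch, not the
 * kernel's actual <linux/xarray.h>): internal entries are tagged
 * pointers with the low two bits set to 0b10, and XA_ZERO_ENTRY is the
 * internal entry for value 257.
 */
#define xa_mk_internal(v)       ((void *)(((unsigned long)(v) << 2) | 2))
#define XA_ZERO_ENTRY           xa_mk_internal(257)

/* The marker is non-NULL, so iterators must check for it explicitly. */
static int xa_is_zero(const void *entry)
{
        return entry == XA_ZERO_ENTRY;
}
```

A NULL check alone does not catch it, which is why treating the entry
as a vma dereferences a bogus low address instead of oopsing on NULL.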

Clearing out the stale data means we may actually need to allocate to
remove vmas from the new tree, because we use allocated memory in the
maple tree - we'll need to rebalance, new parents, etc, etc.

So, to remove the stale data - we may actually have to allocate memory.

But we're most likely out of memory.. and we don't want to get the
shrinker involved in a broken task teardown, especially since it has
already been run and failed to help..

We could replace all the old vmas with XA_ZERO_ENTRY, but that doesn't
really fix this issue either.

I could make a function that frees all new vmas and destroys the tree
specifically for this failure state?

I'm almost certain this will lead to another whack-a-mole situation, but
those _should_ already be checked or written to work when there are zero
vmas in an mm (going by experience of what the scheduler does with an
empty tree).  Syzbot finds these scenarios sometimes via signals or
other corner cases that can happen..

Then again, I also thought the unstable mm should be checked where
necessary to avoid assumptions on the mm state..?

This is funny because we already have a (probably) benign race with oom
here.  This code may already visit the mm after __oom_reap_task_mm() and
the mm disappearing, but since the anon vmas should be removed,
unuse_mm() will skip them.

Although, I'm not sure what happens when
mmu_notifier_invalidate_range_start_nonblock() fails AND unuse_mm() is
called on the mm after.  Maybe checking the unstable mm is necessary
here anyways?

Thanks,
Liam


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 15:17                 ` Liam R. Howlett
@ 2025-08-11 15:39                   ` David Hildenbrand
  2025-08-11 15:48                     ` Lorenzo Stoakes
  2025-08-11 15:48                     ` Liam R. Howlett
  0 siblings, 2 replies; 16+ messages in thread
From: David Hildenbrand @ 2025-08-11 15:39 UTC (permalink / raw)
  To: Liam R. Howlett, Lorenzo Stoakes, Charan Teja Kalla, akpm,
	shikemeng, kasong, nphamcs, bhe, baohua, chrisl, linux-mm,
	linux-kernel, Matthew Wilcox

>>
>> I think it may actually be difficult to do on some level or there was some
>> reason we couldn't, but I may be mistaken.
> 

Thanks for the information!

> Down the rabbit hole we go..
> 
> The cloning of the tree happens by copying the tree in DFS and
> replacing the old nodes with new nodes.  The tree leaves end up being
> copied, which contains all the vmas (unless DONT_COPY is set, so
> basically always all of them..).  When the tree is copied, we have a
> duplicate of the tree with pointers to all the vmas in the old process.
> 
> The way the tree fails is that we've been unable to finish cloning it,
> usually for out of memory reasons.  So, this means we have a tree with
> new and exciting vmas that have never been used and old but still active
> vmas in oldmm.
> 
> The failure point is then marked with an XA_ZERO_ENTRY, which will
> succeed in storing as it's a direct replacement in the tree so no
> allocations necessary.  Thus this is safe even in -ENOMEM scenarios.
> 
> Clearing out the stale data means we may actually need to allocate to
> remove vmas from the new tree, because we use allocated memory in the
> maple tree - we'll need to rebalance, new parents, etc, etc.
> 
> So, to remove the stale data - we may actually have to allocate memory.

Ah, that sucks. And I assume we cannot preallocate early, because we 
might actually require multiple alloc+free cycles.

> 
> But we're most likely out of memory.. and we don't want to get the
> shrinker involved in a broken task teardown, especially since it has
> already been run and failed to help..
> 
> We could replace all the old vmas with XA_ZERO_ENTRY, but that doesn't
> really fix this issue either.

No.

> 
> I could make a function that frees all new vmas and destroys the tree
> specifically for this failure state?

I think the problem is that some page tables were already copied, so we 
would have to zap them as well.

Maybe just factoring stuff from the exit_mmap() function could be one 
way to do it.

> 
> I'm almost certain this will lead to another whack-a-mole situation, but
> those _should_ already be checked or written to work when there are zero
> vmas in an mm (going by experience of what the scheduler does with an
> empty tree).  Syzbot finds these scenarios sometimes via signals or
> other corner cases that can happen..

Yes, I remember fixing one instance of "0 VMAs". Most code paths just 
don't apply if a process never actually ran, I guess.

> 
> Then again, I also thought the unstable mm should be checked where
> necessary to avoid assumptions on the mm state..?
> 
> This is funny because we already have a (probably) benign race with oom
> here.  This code may already visit the mm after __oom_reap_task_mm() and
> the mm disappearing, but since the anon vmas should be removed,
> unuse_mm() will skip them.
> 
> Although, I'm not sure what happens when
> mmu_notifier_invalidate_range_start_nonblock() fails AND unuse_mm() is
> called on the mm after.  Maybe checking the unstable mm is necessary
> here anyways?

Can we have MMU notifiers active while the process never even ran and we 
are only halfway through duplicating VMAs?

-- 
Cheers,

David / dhildenb



* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 15:39                   ` David Hildenbrand
@ 2025-08-11 15:48                     ` Lorenzo Stoakes
  2025-08-11 15:51                       ` David Hildenbrand
  2025-08-11 15:48                     ` Liam R. Howlett
  1 sibling, 1 reply; 16+ messages in thread
From: Lorenzo Stoakes @ 2025-08-11 15:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Liam R. Howlett, Charan Teja Kalla, akpm, shikemeng, kasong,
	nphamcs, bhe, baohua, chrisl, linux-mm, linux-kernel,
	Matthew Wilcox

On Mon, Aug 11, 2025 at 05:39:32PM +0200, David Hildenbrand wrote:
>
> >
> > I could make a function that frees all new vmas and destroys the tree
> > specifically for this failure state?
>
> I think the problem is that some page tables were already copied, so we
> would have to zap them as well.

This shouldn't be too much more egregious?

The issue arises when we're in an OOM situation, but if it's a fatal signal we
can take the time to clean up.

>
> Maybe just factoring stuff from the exit_mmap() function could be one way to
> do it.

Is exit_mmap() a problem here? Or maybe I don't understand what you're getting
at.

I wonder: can we somehow avoid telling swapoff about mm's before we're
sure the operation has completed?

We are doing:

dup_mmap()
-> copy_page_range()
-> ...
-> copy_nonpresent_pte()

And that's where things get exposed to swapoff.

Could we separate this out until after we're sure the fork has succeeded?
Would it really be that egregious perf-wise to do so?
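
A toy model of the reordering I mean (purely illustrative; the names
and structure are made up, not the real dup_mmap()/mmlist code):

```c
#include <assert.h>

/*
 * Toy model only: today the dst mm becomes visible to swapoff from
 * copy_nonpresent_pte(), i.e. mid-copy.  Here visibility is deferred
 * until the duplicate is known to be consistent.
 */
struct mm_toy {
        int on_mmlist;          /* visible to swapoff? */
        int vmas_complete;      /* vma tree fully duplicated? */
};

static int dup_mmap_toy(struct mm_toy *dst, int fatal_signal)
{
        if (fatal_signal)
                return -1;      /* tree left marked with XA_ZERO_ENTRY */
        dst->vmas_complete = 1;
        /* Only now expose the mm: swapoff never sees a half-built tree. */
        dst->on_mmlist = 1;
        return 0;
}
```

With that ordering, the failed-fork mm is simply never on the list, so
unuse_mm() cannot trip over the marker.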

Anyway - Charan - I think for the hotfix patch, you should respin with a
check for MMF_UNSTABLE, as set when this code path is active.
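
Roughly this shape - a userspace sketch of the check, not the actual
patch; in the kernel it would be something like
test_bit(MMF_UNSTABLE, &mm->flags) early in unuse_mm(), and the bit
number below is just a stand-in:

```c
#include <assert.h>

/*
 * Userspace mimic of the proposed bail-out.  MMF_UNSTABLE_TOY is a
 * stand-in for the real MMF_UNSTABLE flag bit; the exact value is
 * whatever the tree defines.
 */
#define MMF_UNSTABLE_TOY        22

struct mm_toy {
        unsigned long flags;
};

/* unuse_mm() would skip an mm whose dup_mmap() died partway through. */
static int unuse_mm_should_skip(const struct mm_toy *mm)
{
        return (mm->flags >> MMF_UNSTABLE_TOY) & 1UL;
}
```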

Then we can think about going further in untangling this mess...

Cheers, Lorenzo


* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 15:39                   ` David Hildenbrand
  2025-08-11 15:48                     ` Lorenzo Stoakes
@ 2025-08-11 15:48                     ` Liam R. Howlett
  1 sibling, 0 replies; 16+ messages in thread
From: Liam R. Howlett @ 2025-08-11 15:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Lorenzo Stoakes, Charan Teja Kalla, akpm, shikemeng, kasong,
	nphamcs, bhe, baohua, chrisl, linux-mm, linux-kernel,
	Matthew Wilcox

* David Hildenbrand <david@redhat.com> [250811 11:39]:
> > > 
> > > I think it may actually be difficult to do on some level or there was some
> > > reason we couldn't, but I may be mistaken.
> > 
> 
> Thanks for the information!
> 
> > Down the rabbit hole we go..
> > 
> > The cloning of the tree happens by copying the tree in DFS and
> > replacing the old nodes with new nodes.  The tree leaves end up being
> > copied, which contains all the vmas (unless DONT_COPY is set, so
> > basically always all of them..).  When the tree is copied, we have a
> > duplicate of the tree with pointers to all the vmas in the old process.
> > 
> > The way the tree fails is that we've been unable to finish cloning it,
> > usually for out of memory reasons.  So, this means we have a tree with
> > new and exciting vmas that have never been used and old but still active
> > vmas in oldmm.
> > 
> > The failure point is then marked with an XA_ZERO_ENTRY, which will
> > succeed in storing as it's a direct replacement in the tree so no
> > allocations necessary.  Thus this is safe even in -ENOMEM scenarios.
> > 
> > Clearing out the stale data means we may actually need to allocate to
> > remove vmas from the new tree, because we use allocated memory in the
> > maple tree - we'll need to rebalance, new parents, etc, etc.
> > 

...

> > 
> > I could make a function that frees all new vmas and destroys the tree
> > specifically for this failure state?
> 
> I think the problem is that some page tables were already copied, so we
> would have to zap them as well.
> 
> Maybe just factoring stuff from the exit_mmap() function could be one way to
> do it.

Yes, this is much easier now that both are in the same .c file.

..
> > 
> > This is funny because we already have a (probably) benign race with oom
> > here.  This code may already visit the mm after __oom_reap_task_mm() and
> > the mm disappearing, but since the anon vmas should be removed,
> > unuse_mm() will skip them.
> > 
> > Although, I'm not sure what happens when
> > mmu_notifier_invalidate_range_start_nonblock() fails AND unuse_mm() is
> > called on the mm after.  Maybe checking the unstable mm is necessary
> > here anyways?
> 
> Can we have MMU notifiers active while the process never even ran and we are
> only halfway through duplicating VMAs?
> 

I doubt it.  I was thinking of other cases where the MMF_UNSTABLE flag
was set but the oom code failed to free all anon vmas based on the MMU
notifier.  That is, does this code have an existing race that's much
harder to hit?

Thanks,
Liam



* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 15:48                     ` Lorenzo Stoakes
@ 2025-08-11 15:51                       ` David Hildenbrand
  0 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2025-08-11 15:51 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Liam R. Howlett, Charan Teja Kalla, akpm, shikemeng, kasong,
	nphamcs, bhe, baohua, chrisl, linux-mm, linux-kernel,
	Matthew Wilcox

On 11.08.25 17:48, Lorenzo Stoakes wrote:
> On Mon, Aug 11, 2025 at 05:39:32PM +0200, David Hildenbrand wrote:
>>
>>>
>>> I could make a function that frees all new vmas and destroys the tree
>>> specifically for this failure state?
>>
>> I think the problem is that some page tables were already copied, so we
>> would have to zap them as well.
> 
> This shouldn't be too much more egregious?
> 
> The issues arise when it might be an OOM issue, but if it's a fatal signal we
> can take the time to clean up.
> 
>>
>> Maybe just factoring stuff from the exit_mmap() function could be one way to
>> do it.
> 
> Is exit_mmap() a problem here? Or maybe I don't understand what you're getting
> at.

We are dropping the write lock in between, allowing for these races.

-- 
Cheers,

David / dhildenb



* Re: [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path
  2025-08-11 12:07 ` Lorenzo Stoakes
@ 2025-08-11 16:29   ` Charan Teja Kalla
  0 siblings, 0 replies; 16+ messages in thread
From: Charan Teja Kalla @ 2025-08-11 16:29 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: akpm, shikemeng, kasong, nphamcs, bhe, baohua, chrisl, david,
	linux-mm, linux-kernel



On 8/11/2025 5:37 PM, Lorenzo Stoakes wrote:
> Any syzbot link or further information as to where this bug was found? Was this
> on your system?
> 
> Has this happened in reality or due to fault injection?
Sorry, missed to mention this information on the commit message.

Yes, this was really reported on the 6.12 LTS kernel while running
tests on a device farm, though the reproduction rate is very low.

Thanks
Charan


end of thread (newest: 2025-08-11 16:30 UTC)

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-08-08  9:21 [PATCH] mm: swap: check for xa_zero_entry() on vma in swapoff path Charan Teja Kalla
2025-08-08 12:01 ` David Hildenbrand
2025-08-08 12:04   ` David Hildenbrand
2025-08-11  9:43     ` Charan Teja Kalla
2025-08-11 12:14       ` Lorenzo Stoakes
2025-08-11 13:03         ` David Hildenbrand
2025-08-11 13:08           ` Lorenzo Stoakes
2025-08-11 13:19             ` David Hildenbrand
2025-08-11 13:22               ` Lorenzo Stoakes
2025-08-11 15:17                 ` Liam R. Howlett
2025-08-11 15:39                   ` David Hildenbrand
2025-08-11 15:48                     ` Lorenzo Stoakes
2025-08-11 15:51                       ` David Hildenbrand
2025-08-11 15:48                     ` Liam R. Howlett
2025-08-11 12:07 ` Lorenzo Stoakes
2025-08-11 16:29   ` Charan Teja Kalla
