All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <andrea@betterlinux.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>, CAI Qian <caiqian@redhat.com>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
	Mel Gorman <mel@csn.ul.ie>, Izik Eidus <ieidus@redhat.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [BUG 3.0.0-rc1] ksm: NULL pointer dereference in ksm_do_scan()
Date: Thu, 2 Jun 2011 22:15:01 +0200	[thread overview]
Message-ID: <20110602201501.GC4114@thinkpad> (raw)
In-Reply-To: <20110602164458.GG19505@random.random>

On Thu, Jun 02, 2011 at 06:44:58PM +0200, Andrea Arcangeli wrote:
> On Thu, Jun 02, 2011 at 08:36:41AM -0700, Chris Wright wrote:
> > * Andrea Arcangeli (aarcange@redhat.com) wrote:
> > > On Thu, Jun 02, 2011 at 07:31:43AM -0700, Chris Wright wrote:
> > > > * CAI Qian (caiqian@redhat.com) wrote:
> > > > > madvise(0x2210000, 4096, 0xc /* MADV_??? */) = 0
> > > > > --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > > > 
> > > > Right, that's just what the program is trying to do, segfault.
> > > > 
> > > > > +++ killed by SIGSEGV (core dumped) +++
> > > > > Segmentation fault (core dumped)
> > > > > 
> > > > > Did I miss anything?
> > > > 
> > > > I found it works but not 100% of the time.
> > > > 
> > > > So I just run the bug in a loop.
> > > 
> > > echo 0 >scan_millisecs helps.
> > 
> > BTW, here's my stack trace (I dropped back to 2.6.39 just to see if it
> > happened to be recent regression).  It looks like mm_slot is off the list:
> > 
> > R10: dead000000200200 R11: dead000000100100
> 
> Yes it had to be use after free.
> 
> I cooked this patch, still untested but it builds. Will test it soon.

Hi Andrea,

I just tested this patch, but it doesn't seem to fix the problem, at
least not the one I reported. The same bug happens again.

Thanks,
-Andrea

> 
> ===
> Subject: ksm: fix __ksm_exit vs ksm scan SMP race
> 
> From: Andrea Arcangeli <aarcange@redhat.com>
> 
> If the KSM scan releases the ksm_mmlist_lock after the mm_users already dropped
> to zero but before __ksm_exit had a chance runs, both the KSM scan and
> __ksm_exit will free the slot. This fixes the SMP race condition by using
> test_and_bit_set in __ksm_exit to see if __ksm_exit arrived before the KSM
> scan or not.
> 
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index d708b3e..47ef4c1 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -645,10 +645,16 @@ static int unmerge_and_remove_all_rmap_items(void)
>  		if (ksm_test_exit(mm)) {
>  			hlist_del(&mm_slot->link);
>  			list_del(&mm_slot->mm_list);
> +			/*
> +			 * After releasing ksm_mmlist_lock __ksm_exit
> +			 * can run and we already changed mm_slot, so
> +			 * notify it with MMF_VM_MERGEABLE not to free
> +			 * this again.
> +			 */
> +			clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  			spin_unlock(&ksm_mmlist_lock);
>  
>  			free_mm_slot(mm_slot);
> -			clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  			up_read(&mm->mmap_sem);
>  			mmdrop(mm);
>  		} else {
> @@ -1377,10 +1383,15 @@ next_mm:
>  		 */
>  		hlist_del(&slot->link);
>  		list_del(&slot->mm_list);
> +		/*
> +		 * After releasing ksm_mmlist_lock __ksm_exit can run
> +		 * and we already changed mm_slot, so notify it with
> +		 * MMF_VM_MERGEABLE not to free this again.
> +		 */
> +		clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  		spin_unlock(&ksm_mmlist_lock);
>  
>  		free_mm_slot(slot);
> -		clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  		up_read(&mm->mmap_sem);
>  		mmdrop(mm);
>  	} else {
> @@ -1463,6 +1474,11 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
>  				 VM_NONLINEAR | VM_MIXEDMAP | VM_SAO))
>  			return 0;		/* just ignore the advice */
>  
> +		/*
> +		 * It should be safe to test_bit instead of
> +		 * test_and_bit_set because the madvise generic caller
> +		 * holds the mmap_sem write mode.
> +		 */
>  		if (!test_bit(MMF_VM_MERGEABLE, &mm->flags)) {
>  			err = __ksm_enter(mm);
>  			if (err)
> @@ -1511,6 +1527,10 @@ int __ksm_enter(struct mm_struct *mm)
>  	list_add_tail(&mm_slot->mm_list, &ksm_scan.mm_slot->mm_list);
>  	spin_unlock(&ksm_mmlist_lock);
>  
> +	/*
> +	 * It should be safe to set it outside ksm_mmlist_lock because
> +	 * we hold a mm_user pin on the mm so __ksm_exit can't run.
> +	 */
>  	set_bit(MMF_VM_MERGEABLE, &mm->flags);
>  	atomic_inc(&mm->mm_count);
>  
> @@ -1538,9 +1558,28 @@ void __ksm_exit(struct mm_struct *mm)
>  	mm_slot = get_mm_slot(mm);
>  	if (mm_slot && ksm_scan.mm_slot != mm_slot) {
>  		if (!mm_slot->rmap_list) {
> -			hlist_del(&mm_slot->link);
> -			list_del(&mm_slot->mm_list);
> -			easy_to_free = 1;
> +			/*
> +			 * If MMF_VM_MERGEABLE isn't set it was freed
> +			 * by the scan immediately after mm_count
> +			 * reached zero (visible by the scan) but
> +			 * before __ksm_exit() run, so we don't need
> +			 * to do anything here. We don't even need to
> +			 * wait for the KSM scan to release the
> +			 * mmap_sem as it's not working on the mm
> +			 * anymore but it's just releasing it, and it
> +			 * probably already did and dropped its
> +			 * mm_count too (it would however be safe to
> +			 * take mmap_sem here even if MMF_VM_MERGEABLE
> +			 * is already clear, as the actual mm can't be
> +			 * freed until we return and we run mmdrop
> +			 * too, but it's unnecessary).
> +			 */
> +			if (test_and_clear_bit(MMF_VM_MERGEABLE, &mm->flags)) {
> +				hlist_del(&mm_slot->link);
> +				list_del(&mm_slot->mm_list);
> +				easy_to_free = 1;
> +			} else
> +				mm_slot = NULL;
>  		} else {
>  			list_move(&mm_slot->mm_list,
>  				  &ksm_scan.mm_slot->mm_list);
> @@ -1550,7 +1589,6 @@ void __ksm_exit(struct mm_struct *mm)
>  
>  	if (easy_to_free) {
>  		free_mm_slot(mm_slot);
> -		clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  		mmdrop(mm);
>  	} else if (mm_slot) {
>  		down_write(&mm->mmap_sem);

WARNING: multiple messages have this Message-ID (diff)
From: Andrea Righi <andrea@betterlinux.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>, CAI Qian <caiqian@redhat.com>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
	Mel Gorman <mel@csn.ul.ie>, Izik Eidus <ieidus@redhat.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [BUG 3.0.0-rc1] ksm: NULL pointer dereference in ksm_do_scan()
Date: Thu, 2 Jun 2011 22:15:01 +0200	[thread overview]
Message-ID: <20110602201501.GC4114@thinkpad> (raw)
In-Reply-To: <20110602164458.GG19505@random.random>

On Thu, Jun 02, 2011 at 06:44:58PM +0200, Andrea Arcangeli wrote:
> On Thu, Jun 02, 2011 at 08:36:41AM -0700, Chris Wright wrote:
> > * Andrea Arcangeli (aarcange@redhat.com) wrote:
> > > On Thu, Jun 02, 2011 at 07:31:43AM -0700, Chris Wright wrote:
> > > > * CAI Qian (caiqian@redhat.com) wrote:
> > > > > madvise(0x2210000, 4096, 0xc /* MADV_??? */) = 0
> > > > > --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > > > 
> > > > Right, that's just what the program is trying to do, segfault.
> > > > 
> > > > > +++ killed by SIGSEGV (core dumped) +++
> > > > > Segmentation fault (core dumped)
> > > > > 
> > > > > Did I miss anything?
> > > > 
> > > > I found it works but not 100% of the time.
> > > > 
> > > > So I just run the bug in a loop.
> > > 
> > > echo 0 >scan_millisecs helps.
> > 
> > BTW, here's my stack trace (I dropped back to 2.6.39 just to see if it
> > happened to be recent regression).  It looks like mm_slot is off the list:
> > 
> > R10: dead000000200200 R11: dead000000100100
> 
> Yes it had to be use after free.
> 
> I cooked this patch, still untested but it builds. Will test it soon.

Hi Andrea,

I just tested this patch, but it doesn't seem to fix the problem, at
least not the one I reported. The same bug happens again.

Thanks,
-Andrea

> 
> ===
> Subject: ksm: fix __ksm_exit vs ksm scan SMP race
> 
> From: Andrea Arcangeli <aarcange@redhat.com>
> 
> If the KSM scan releases the ksm_mmlist_lock after the mm_users already dropped
> to zero but before __ksm_exit had a chance runs, both the KSM scan and
> __ksm_exit will free the slot. This fixes the SMP race condition by using
> test_and_bit_set in __ksm_exit to see if __ksm_exit arrived before the KSM
> scan or not.
> 
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> ---
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index d708b3e..47ef4c1 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -645,10 +645,16 @@ static int unmerge_and_remove_all_rmap_items(void)
>  		if (ksm_test_exit(mm)) {
>  			hlist_del(&mm_slot->link);
>  			list_del(&mm_slot->mm_list);
> +			/*
> +			 * After releasing ksm_mmlist_lock __ksm_exit
> +			 * can run and we already changed mm_slot, so
> +			 * notify it with MMF_VM_MERGEABLE not to free
> +			 * this again.
> +			 */
> +			clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  			spin_unlock(&ksm_mmlist_lock);
>  
>  			free_mm_slot(mm_slot);
> -			clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  			up_read(&mm->mmap_sem);
>  			mmdrop(mm);
>  		} else {
> @@ -1377,10 +1383,15 @@ next_mm:
>  		 */
>  		hlist_del(&slot->link);
>  		list_del(&slot->mm_list);
> +		/*
> +		 * After releasing ksm_mmlist_lock __ksm_exit can run
> +		 * and we already changed mm_slot, so notify it with
> +		 * MMF_VM_MERGEABLE not to free this again.
> +		 */
> +		clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  		spin_unlock(&ksm_mmlist_lock);
>  
>  		free_mm_slot(slot);
> -		clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  		up_read(&mm->mmap_sem);
>  		mmdrop(mm);
>  	} else {
> @@ -1463,6 +1474,11 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
>  				 VM_NONLINEAR | VM_MIXEDMAP | VM_SAO))
>  			return 0;		/* just ignore the advice */
>  
> +		/*
> +		 * It should be safe to test_bit instead of
> +		 * test_and_bit_set because the madvise generic caller
> +		 * holds the mmap_sem write mode.
> +		 */
>  		if (!test_bit(MMF_VM_MERGEABLE, &mm->flags)) {
>  			err = __ksm_enter(mm);
>  			if (err)
> @@ -1511,6 +1527,10 @@ int __ksm_enter(struct mm_struct *mm)
>  	list_add_tail(&mm_slot->mm_list, &ksm_scan.mm_slot->mm_list);
>  	spin_unlock(&ksm_mmlist_lock);
>  
> +	/*
> +	 * It should be safe to set it outside ksm_mmlist_lock because
> +	 * we hold a mm_user pin on the mm so __ksm_exit can't run.
> +	 */
>  	set_bit(MMF_VM_MERGEABLE, &mm->flags);
>  	atomic_inc(&mm->mm_count);
>  
> @@ -1538,9 +1558,28 @@ void __ksm_exit(struct mm_struct *mm)
>  	mm_slot = get_mm_slot(mm);
>  	if (mm_slot && ksm_scan.mm_slot != mm_slot) {
>  		if (!mm_slot->rmap_list) {
> -			hlist_del(&mm_slot->link);
> -			list_del(&mm_slot->mm_list);
> -			easy_to_free = 1;
> +			/*
> +			 * If MMF_VM_MERGEABLE isn't set it was freed
> +			 * by the scan immediately after mm_count
> +			 * reached zero (visible by the scan) but
> +			 * before __ksm_exit() run, so we don't need
> +			 * to do anything here. We don't even need to
> +			 * wait for the KSM scan to release the
> +			 * mmap_sem as it's not working on the mm
> +			 * anymore but it's just releasing it, and it
> +			 * probably already did and dropped its
> +			 * mm_count too (it would however be safe to
> +			 * take mmap_sem here even if MMF_VM_MERGEABLE
> +			 * is already clear, as the actual mm can't be
> +			 * freed until we return and we run mmdrop
> +			 * too, but it's unnecessary).
> +			 */
> +			if (test_and_clear_bit(MMF_VM_MERGEABLE, &mm->flags)) {
> +				hlist_del(&mm_slot->link);
> +				list_del(&mm_slot->mm_list);
> +				easy_to_free = 1;
> +			} else
> +				mm_slot = NULL;
>  		} else {
>  			list_move(&mm_slot->mm_list,
>  				  &ksm_scan.mm_slot->mm_list);
> @@ -1550,7 +1589,6 @@ void __ksm_exit(struct mm_struct *mm)
>  
>  	if (easy_to_free) {
>  		free_mm_slot(mm_slot);
> -		clear_bit(MMF_VM_MERGEABLE, &mm->flags);
>  		mmdrop(mm);
>  	} else if (mm_slot) {
>  		down_write(&mm->mmap_sem);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2011-06-02 20:15 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-01 22:20 [BUG 3.0.0-rc1] ksm: NULL pointer dereference in ksm_do_scan() Andrea Righi
2011-06-01 22:20 ` Andrea Righi
2011-06-02  1:53 ` Chris Wright
2011-06-02  1:53   ` Chris Wright
2011-06-02  7:09 ` CAI Qian
2011-06-02  7:09   ` CAI Qian
2011-06-02 14:19   ` Andrea Righi
2011-06-02 14:19     ` Andrea Righi
2011-06-02 16:48     ` Chris Wright
2011-06-02 16:48       ` Chris Wright
2011-06-02 17:29       ` Hugh Dickins
2011-06-02 17:29         ` Hugh Dickins
2011-06-02 17:43         ` Andrea Arcangeli
2011-06-02 17:43           ` Andrea Arcangeli
2011-06-03 17:06           ` Hugh Dickins
2011-06-03 17:06             ` Hugh Dickins
2011-06-03 18:13             ` Andrea Arcangeli
2011-06-03 18:13               ` Andrea Arcangeli
2011-06-02 17:35       ` [PATCH] ksm: fix race between ksmd and exiting task Chris Wright
2011-06-02 17:35         ` Chris Wright
2011-06-02 20:12         ` Andrea Righi
2011-06-02 20:12           ` Andrea Righi
2011-06-02 21:23           ` Chris Wright
2011-06-02 21:23             ` Chris Wright
2011-06-03 16:37         ` Hugh Dickins
2011-06-03 16:37           ` Hugh Dickins
2011-06-04  0:54           ` [PATCH] ksm: fix NULL pointer dereference in scan_get_next_rmap_item Chris Wright
2011-06-04  0:54             ` Chris Wright
2011-06-02 20:10       ` [BUG 3.0.0-rc1] ksm: NULL pointer dereference in ksm_do_scan() Andrea Righi
2011-06-02 20:10         ` Andrea Righi
2011-06-03  4:42     ` CAI Qian
2011-06-03  4:42       ` CAI Qian
2011-06-02 14:31   ` Chris Wright
2011-06-02 14:31     ` Chris Wright
2011-06-02 14:36     ` Andrea Arcangeli
2011-06-02 14:36       ` Andrea Arcangeli
2011-06-02 15:36       ` Chris Wright
2011-06-02 15:36         ` Chris Wright
2011-06-02 16:44         ` Andrea Arcangeli
2011-06-02 16:44           ` Andrea Arcangeli
2011-06-02 20:15           ` Andrea Righi [this message]
2011-06-02 20:15             ` Andrea Righi
2011-06-02 21:35             ` Andrea Arcangeli
2011-06-02 21:35               ` Andrea Arcangeli
2011-06-03  4:50       ` CAI Qian
2011-06-03  4:50         ` CAI Qian
2011-06-03  4:44     ` CAI Qian
2011-06-03  4:44       ` CAI Qian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110602201501.GC4114@thinkpad \
    --to=andrea@betterlinux.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=caiqian@redhat.com \
    --cc=chrisw@sous-sol.org \
    --cc=hughd@google.com \
    --cc=ieidus@redhat.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.