public inbox for linux-ia64@vger.kernel.org
 help / color / mirror / Atom feed
From: Robin Holt <holt@sgi.com>
To: linux-ia64@vger.kernel.org
Subject: Re: Fix race in the accessed/dirty bit handlers
Date: Wed, 08 Mar 2006 10:48:35 +0000	[thread overview]
Message-ID: <20060308104835.GC31680@lnx-holt.americas.sgi.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0603071901420.2463@schroedinger.engr.sgi.com>

Are you sure this does not lead to a data integrity issue.  What if we
have a clean writable page.  If we get started on the dirty handler and
the pte gets zapped from under us, the page would not get marked as dirty
(pte had not gotten updated yet) and the process would continue running.
If the page never gets dirtied again until the process terminates, we
would never record a need to flush it to backing store.

I can possibly buy the change to the iaccess_bit handler, but I would need
a better explanation of how a race is prevented for the dirty_bit handler.

I think this might be the wrong direction, but I am still not awake
enough to think all the way through this.

Thanks,
Robin


On Tue, Mar 07, 2006 at 07:05:32PM -0800, Christoph Lameter wrote:
> A pte may be zapped by the swapper, exiting process, unmapping or page
> migration while the accessed or dirty bit handers are about to run. In that
> case the accessed bit or dirty is set on an zeroed pte which leads the VM to
> conclude that this is a swap pte. This may lead to
> 
> - Messages from the vm like
> 
> swap_free: Bad swap file entry 4000000000000000
> 
> - Processes being aborted
> 
> swap_dup: Bad swap file entry 4000000000000000
> VM: killing process ....
> 
> Page migration is particular suitable for the creation of this race since
> it needs to remove and restore page table entries.
> 
> The fix here is to check for the present bit and simply not update
> the pte if the page is not present anymore. If the page is not present
> then the fault handler should run next which will take care of the problem
> by bringing the page back and then mark the page dirty or move it onto the
> active list.
> 
> Signed-off-by: Christoph Lameter <clameter@sgi.com>
> 
> Index: linux-2.6.16-rc5-mm3/arch/ia64/kernel/ivt.S
> =================================> --- linux-2.6.16-rc5-mm3.orig/arch/ia64/kernel/ivt.S	2006-03-07 09:17:22.000000000 -0800
> +++ linux-2.6.16-rc5-mm3/arch/ia64/kernel/ivt.S	2006-03-07 18:53:11.000000000 -0800
> @@ -561,11 +561,12 @@ ENTRY(dirty_bit)
>  	;;					// avoid RAW on r18
>  	mov ar.ccv=r18				// set compare value for cmpxchg
>  	or r25=_PAGE_D|_PAGE_A,r18		// set the dirty and accessed bits
> +	tbit.z p7,p6 = r18,_PAGE_P_BIT		// Check present bit
>  	;;
> -	cmpxchg8.acq r26=[r17],r25,ar.ccv
> +(p6)	cmpxchg8.acq r26=[r17],r25,ar.ccv	// Only update if page is present
>  	mov r24=PAGE_SHIFT<<2
>  	;;
> -	cmp.eq p6,p7=r26,r18
> +(p6)	cmp.eq p6,p7=r26,r18			// Only compare if page is present
>  	;;
>  (p6)	itc.d r25				// install updated PTE
>  	;;
> @@ -626,11 +627,12 @@ ENTRY(iaccess_bit)
>  	;;
>  	mov ar.ccv=r18				// set compare value for cmpxchg
>  	or r25=_PAGE_A,r18			// set the accessed bit
> +	tbit.z p7,p6 = r18,_PAGE_P_BIT	 	// Check present bit
>  	;;
> -	cmpxchg8.acq r26=[r17],r25,ar.ccv
> +(p6)	cmpxchg8.acq r26=[r17],r25,ar.ccv	// Only if page present
>  	mov r24=PAGE_SHIFT<<2
>  	;;
> -	cmp.eq p6,p7=r26,r18
> +(p6)	cmp.eq p6,p7=r26,r18			// Only if page present
>  	;;
>  (p6)	itc.i r25				// install updated PTE
>  	;;
> @@ -680,11 +682,12 @@ ENTRY(daccess_bit)
>  	;;					// avoid RAW on r18
>  	mov ar.ccv=r18				// set compare value for cmpxchg
>  	or r25=_PAGE_A,r18			// set the dirty bit
> +	tbit.z p7,p6 = r18,_PAGE_P_BIT		// Check present bit
>  	;;
> -	cmpxchg8.acq r26=[r17],r25,ar.ccv
> +(p6)	cmpxchg8.acq r26=[r17],r25,ar.ccv	// Only if page is present
>  	mov r24=PAGE_SHIFT<<2
>  	;;
> -	cmp.eq p6,p7=r26,r18
> +(p6)	cmp.eq p6,p7=r26,r18			// Only if page is present
>  	;;
>  (p6)	itc.d r25				// install updated PTE
>  	/*
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2006-03-08 10:48 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-03-08  3:05 Fix race in the accessed/dirty bit handlers Christoph Lameter
2006-03-08 10:48 ` Robin Holt [this message]
2006-03-08 15:33 ` Christoph Lameter
2006-03-08 21:59 ` Chen, Kenneth W
2006-03-08 22:04 ` Christoph Lameter
2006-03-08 22:25 ` Luck, Tony
2006-03-08 22:32 ` Christoph Lameter
2006-03-08 23:56 ` Luck, Tony
2006-03-09  0:21 ` Christoph Lameter
2006-03-09 13:35 ` Zoltan Menyhart
2006-03-09 16:23 ` Christoph Lameter
2006-03-09 18:09 ` Zoltan Menyhart
2006-03-09 18:27 ` Christoph Lameter
2006-03-09 18:33 ` David Mosberger-Tang
2006-03-09 19:44 ` Chen, Kenneth W
2006-03-10  9:47 ` Zoltan Menyhart
2006-03-10  9:54 ` Christian Hildner
2006-03-10 10:40 ` Zoltan Menyhart
2006-03-10 16:47 ` Luck, Tony
2006-03-10 17:11 ` Zoltan Menyhart
2006-03-10 17:22 ` Chen, Kenneth W
2006-03-10 17:28 ` Luck, Tony
2006-03-10 17:29 ` Chen, Kenneth W
2006-03-13  9:13 ` Zoltan Menyhart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060308104835.GC31680@lnx-holt.americas.sgi.com \
    --to=holt@sgi.com \
    --cc=linux-ia64@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox