All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: a.p.zijlstra@chello.nl, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch resend v4] update ctime and mtime for mmaped write
Date: Tue, 27 Mar 2007 09:52:20 -0800	[thread overview]
Message-ID: <20070327095220.4bc76cdc.akpm@linux-foundation.org> (raw)
In-Reply-To: <E1HW7tS-0003em-00@dorka.pomaz.szeredi.hu>

On Tue, 27 Mar 2007 11:23:06 +0200 Miklos Szeredi <miklos@szeredi.hu> wrote:

> > > > > But Peter Staubach says a RH custumer has files written thorugh mmap,
> > > > > which are not being backed up.
> > > > 
> > > > Yes, I expect the backup problem is the major real-world hurt arising from
> > > > this bug.
> > > > 
> > > > But I expect we could adequately plug that problem at munmap()-time.  Or,
> > > > better, do_wp_page().  As I said - half-assed.
> > > > 
> > > > It's a question if whether the backup problem is the only thing which is hurting
> > > > in the real-world, or if people have other problems.
> > > > 
> > > > (In fact, what's wrong with doing it in do_wp_page()?
> > > 
> > > It's rather more expensive, than just toggling a bit.
> > 
> > It shouldn't be, especially for filesystems which have one-second timestamp
> > granularity.
> > 
> > Filesystems which have s_time_gran=1 might hurt a bit, but no more than
> > they will with write().
> > 
> > Actually, no - we'd only update the mctime once per page per writeback
> > period (30 seconds by default) so the load will be small.
> 
> Why?  For each faulted page the times will be updated, no?

Yes, but only at pagefault-time.  And

- the faults are already "slow": we need to pull the page contents in
  from disk, or memset or cow the page

- we need to take the trap

compared to which, the cost of the timestamp update will (we hope) be
relatively low.

> Maybe it's acceptable, I don't really know the cost of
> file_update_time().
> 
> Tried this patch, and it seems to work.  It will even randomly update
> the time for tmpfs files (on initial fault, and on swapins).
> 
> Miklos
> 
> Index: linux/mm/memory.c
> ===================================================================
> --- linux.orig/mm/memory.c	2007-03-27 11:04:40.000000000 +0200
> +++ linux/mm/memory.c	2007-03-27 11:08:19.000000000 +0200
> @@ -1664,6 +1664,8 @@ gotten:
>  unlock:
>  	pte_unmap_unlock(page_table, ptl);
>  	if (dirty_page) {
> +		if (vma->vm_file)
> +			file_update_time(vma->vm_file);
>  		set_page_dirty_balance(dirty_page);
>  		put_page(dirty_page);
>  	}
> @@ -2316,6 +2318,8 @@ retry:
>  unlock:
>  	pte_unmap_unlock(page_table, ptl);
>  	if (dirty_page) {
> +		if (vma->vm_file)
> +			file_update_time(vma->vm_file);
>  		set_page_dirty_balance(dirty_page);
>  		put_page(dirty_page);
>  	}

that's simpler ;) Is it correct enough though?  The place where it will
become inaccurate is for repeated modification via an established map.  ie:

	p = mmap(..., MAP_SHARED);
	for ( ; ; )
		*p = 1;

in which case I think the timestamp will only get updated once per
writeback interval (ie: 30 seconds).


tmpfs files have an s_time_gran of 1, so benchmarking some workload on
tmpfs with this patch will tell us the worst-case overhead of the change. 

I guess we should arrange for multiple CPUs to perform write faults against
multiple pages of the same file in parallel.  Of course, that'd be a pretty
darn short benchmark because it'll run out of RAM.  Which reveals why we
probably won't have a performance problem in there.



WARNING: multiple messages have this Message-ID (diff)
From: Andrew Morton <akpm@linux-foundation.org>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: a.p.zijlstra@chello.nl, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch resend v4] update ctime and mtime for mmaped write
Date: Tue, 27 Mar 2007 09:52:20 -0800	[thread overview]
Message-ID: <20070327095220.4bc76cdc.akpm@linux-foundation.org> (raw)
In-Reply-To: <E1HW7tS-0003em-00@dorka.pomaz.szeredi.hu>

On Tue, 27 Mar 2007 11:23:06 +0200 Miklos Szeredi <miklos@szeredi.hu> wrote:

> > > > > But Peter Staubach says a RH custumer has files written thorugh mmap,
> > > > > which are not being backed up.
> > > > 
> > > > Yes, I expect the backup problem is the major real-world hurt arising from
> > > > this bug.
> > > > 
> > > > But I expect we could adequately plug that problem at munmap()-time.  Or,
> > > > better, do_wp_page().  As I said - half-assed.
> > > > 
> > > > It's a question if whether the backup problem is the only thing which is hurting
> > > > in the real-world, or if people have other problems.
> > > > 
> > > > (In fact, what's wrong with doing it in do_wp_page()?
> > > 
> > > It's rather more expensive, than just toggling a bit.
> > 
> > It shouldn't be, especially for filesystems which have one-second timestamp
> > granularity.
> > 
> > Filesystems which have s_time_gran=1 might hurt a bit, but no more than
> > they will with write().
> > 
> > Actually, no - we'd only update the mctime once per page per writeback
> > period (30 seconds by default) so the load will be small.
> 
> Why?  For each faulted page the times will be updated, no?

Yes, but only at pagefault-time.  And

- the faults are already "slow": we need to pull the page contents in
  from disk, or memset or cow the page

- we need to take the trap

compared to which, the cost of the timestamp update will (we hope) be
relatively low.

> Maybe it's acceptable, I don't really know the cost of
> file_update_time().
> 
> Tried this patch, and it seems to work.  It will even randomly update
> the time for tmpfs files (on initial fault, and on swapins).
> 
> Miklos
> 
> Index: linux/mm/memory.c
> ===================================================================
> --- linux.orig/mm/memory.c	2007-03-27 11:04:40.000000000 +0200
> +++ linux/mm/memory.c	2007-03-27 11:08:19.000000000 +0200
> @@ -1664,6 +1664,8 @@ gotten:
>  unlock:
>  	pte_unmap_unlock(page_table, ptl);
>  	if (dirty_page) {
> +		if (vma->vm_file)
> +			file_update_time(vma->vm_file);
>  		set_page_dirty_balance(dirty_page);
>  		put_page(dirty_page);
>  	}
> @@ -2316,6 +2318,8 @@ retry:
>  unlock:
>  	pte_unmap_unlock(page_table, ptl);
>  	if (dirty_page) {
> +		if (vma->vm_file)
> +			file_update_time(vma->vm_file);
>  		set_page_dirty_balance(dirty_page);
>  		put_page(dirty_page);
>  	}

that's simpler ;) Is it correct enough though?  The place where it will
become inaccurate is for repeated modification via an established map.  ie:

	p = mmap(..., MAP_SHARED);
	for ( ; ; )
		*p = 1;

in which case I think the timestamp will only get updated once per
writeback interval (ie: 30 seconds).


tmpfs files have an s_time_gran of 1, so benchmarking some workload on
tmpfs with this patch will tell us the worst-case overhead of the change. 

I guess we should arrange for multiple CPUs to perform write faults against
multiple pages of the same file in parallel.  Of course, that'd be a pretty
darn short benchmark because it'll run out of RAM.  Which reveals why we
probably won't have a performance problem in there.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-03-27 17:52 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-25 21:10 [patch resend v4] update ctime and mtime for mmaped write Miklos Szeredi
2007-03-25 21:10 ` Miklos Szeredi, Miklos Szeredi
2007-03-26 21:00 ` Andrew Morton
2007-03-26 21:00   ` Andrew Morton
2007-03-26 21:10   ` Matt Mackall
2007-03-26 21:10     ` Matt Mackall
2007-03-26 22:25     ` Andrew Morton
2007-03-26 22:25       ` Andrew Morton
2007-03-26 21:43   ` Miklos Szeredi
2007-03-26 21:43     ` Miklos Szeredi
2007-03-26 22:31     ` Andrew Morton
2007-03-26 22:31       ` Andrew Morton
2007-03-27  6:55       ` Miklos Szeredi
2007-03-27  6:55         ` Miklos Szeredi
2007-03-27  7:22         ` Andrew Morton
2007-03-27  7:22           ` Andrew Morton
2007-03-27  7:36           ` Miklos Szeredi
2007-03-27  7:36             ` Miklos Szeredi
2007-03-27  7:49             ` Andrew Morton
2007-03-27  7:49               ` Andrew Morton
2007-03-27  8:03               ` Miklos Szeredi
2007-03-27  8:03                 ` Miklos Szeredi
2007-03-27  8:18                 ` Andrew Morton
2007-03-27  8:18                   ` Andrew Morton
2007-03-27  8:28                   ` Miklos Szeredi
2007-03-27  8:28                     ` Miklos Szeredi
2007-03-27  8:51                     ` Andrew Morton
2007-03-27  8:51                       ` Andrew Morton
2007-03-27  9:23                       ` Miklos Szeredi
2007-03-27  9:23                         ` Miklos Szeredi
2007-03-27 17:52                         ` Andrew Morton [this message]
2007-03-27 17:52                           ` Andrew Morton
2007-03-27 18:29                           ` Miklos Szeredi
2007-03-27 18:29                             ` Miklos Szeredi
  -- strict thread matches above, loose matches on Subject: below --
2007-03-27 18:42 linux
2007-03-27 18:55 ` Miklos Szeredi
2007-03-27 19:00   ` Miklos Szeredi
2007-03-27 19:05   ` Andrew Morton
2007-03-27 19:24   ` linux
2007-03-27 19:34     ` Andrew Morton
2007-03-27 20:09       ` linux
2007-03-27 20:09         ` linux
2007-03-27 20:31         ` Miklos Szeredi
2007-03-27 20:31           ` Miklos Szeredi
2007-03-28  1:48           ` linux
2007-03-28  1:48             ` linux
2007-03-28  7:58             ` Nick Piggin
2007-03-28  7:58               ` Nick Piggin
2007-03-28  9:50               ` linux
2007-03-28  9:50                 ` linux
2007-03-29  4:59                 ` Nick Piggin
2007-03-29  4:59                   ` Nick Piggin
2007-03-27 20:47         ` Andrew Morton
2007-03-27 20:47           ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070327095220.4bc76cdc.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=miklos@szeredi.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.