From: Andrew Morton <akpm@linux-foundation.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH 2/2] msync: start async writeout when MS_ASYNC
Date: Wed, 13 Jun 2012 14:29:49 -0700
Message-ID: <20120613142949.734818a8.akpm@linux-foundation.org>
In-Reply-To: <1338497035-13014-3-git-send-email-pbonzini@redhat.com>

On Thu, 31 May 2012 22:43:55 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> msync.c says that applications had better use fsync() or fadvise(FADV_DONTNEED)
> instead of MS_ASYNC.  Both advices are really bad:
> 
> * fsync() can be a replacement for MS_SYNC, not for MS_ASYNC;
> 
> * fadvise(FADV_DONTNEED) invalidates the pages completely, which will make
>   later accesses expensive.
> 
> Having the possibility to schedule a writeback immediately is an advantage
> for the applications.  They can do the same thing that fadvise does,
> but without the invalidation part.  The implementation is also similar
> to fadvise, but with tag-and-write enabled.
> 
> One example is if you are implementing a persistent dirty bitmap.
> Whenever you set bits to 1 you need to synchronize it with MS_SYNC, so
> that dirtiness is reported properly after a host crash.  If you have set
> any bits to 0, getting them to disk is not needed for correctness, but
> it is still desirable to save some work after a host crash.  You could
> simply use MS_SYNC in a separate thread, but MS_ASYNC provides exactly
> the desired semantics and is easily done in the kernel.
> 
> If the application does not want to start I/O, it can simply call msync
> with flags equal to MS_INVALIDATE.  This one remains a no-op, as it should
> be on a reasonable implementation.

This means that people will find that their msync(MS_ASYNC) calls newly
start I/O.  This may well be undesirable for some.

Also, it hardwires into the kernel a behaviour which userspace itself
could have initiated with sync_file_range(), i.e. reduced flexibility.

Perhaps we can update the msync.c code comments to direct people to
sync_file_range()?
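
For illustration, a minimal sketch of what that userspace approach might
look like on Linux, assuming the file is mapped MAP_SHARED from offset 0.
The helper names start_async_writeout() and dirty_then_start_writeout()
are hypothetical and not part of the patch or of msync.c; only
sync_file_range() and SYNC_FILE_RANGE_WRITE are the real interface.

```c
#define _GNU_SOURCE            /* sync_file_range() is a Linux-specific extension */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Ask the kernel to start writeout of dirty pages in
 * [offset, offset + nbytes) without waiting for completion.
 * nbytes == 0 means "through end of file".
 * Returns 0 on success, -1 with errno set on failure.
 */
static int start_async_writeout(int fd, off64_t offset, off64_t nbytes)
{
    return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
}

/*
 * Hypothetical demo helper: dirty `len` bytes of `fd` through a shared
 * mapping that starts at file offset 0, then kick off writeout for
 * that range.
 */
static int dirty_then_start_writeout(int fd, size_t len)
{
    unsigned char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    memset(map, 0xff, len);    /* dirties the mapped pages */
    int rc = start_async_writeout(fd, 0, (off64_t)len);

    if (munmap(map, len) != 0)
        return -1;
    return rc;
}
```

Note that SYNC_FILE_RANGE_WRITE only initiates writeout: it does not wait
for the I/O to finish and does not write file metadata, so it is not a
substitute for msync(MS_SYNC) or fsync() where durability is required.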


One wonders how msync() works with nonlinear mappings.  I guess
"badly".  I think this was all discussed when we merged
remap_file_pages() (what a mistake that was) and we decided "too hard".

