From: Andrew Morton <akpm@linux-foundation.org>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: miklos@szeredi.hu, wfg@mail.ustc.edu.cn, a.p.zijlstra@chello.nl,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] remove throttle_vm_writeout()
Date: Thu, 4 Oct 2007 16:48:01 -0700
Message-ID: <20071004164801.d8478727.akpm@linux-foundation.org>
In-Reply-To: <E1Ida56-0002Zz-00@dorka.pomaz.szeredi.hu>

On Fri, 05 Oct 2007 01:26:12 +0200
Miklos Szeredi <miklos@szeredi.hu> wrote:

> > This is a somewhat general problem: a userspace process is in the IO path. 
> > Userspace block drivers, for example - pretty much anything which involves
> > kernel->userspace upcalls for storage applications.
> > 
> > I solved it once in the past by marking the userspace process as
> > PF_MEMALLOC and I believe that others have implemented the same hack.
> > 
> > I suspect that what we need is a general solution, and that the solution
> > will involve explicitly telling the kernel that this process is one which
> > actually cleans memory and needs special treatment.
> > 
> > Because I bet there will be other corner-cases where such a process needs
> > kernel help, and there might be optimisation opportunities as well.
> > 
> > Problem is, any such mark-me-as-special syscall would need to be
> > privileged, and FUSE servers presently don't require special perms (do
> > they?)
> 
> No, and that's a rather important feature, that I'd rather not give
> up.

Can fuse do it?  Perhaps the fs can diddle the server's task_struct at
registration time?

>  But with the dirty limiting, the memory cleaning really shouldn't
> be a problem, as there is plenty of memory _not_ used for dirty file
> data, that the filesystem can use during the writeback.

I don't think I understand that.  Sure, it _shouldn't_ be a problem.  But it
_is_.  That's what we're trying to fix, isn't it?

> So the only thing the kernel should be careful about, is not to block
> on an allocation if not strictly necessary.
> 
> Actually a trivial fix for this problem could be to just tweak the
> thresholds, so as to make the above scenario impossible.  Although I'm
> still not convinced this patch is perfect, because the dirty
> threshold can actually change in time...
> 
> Index: linux/mm/page-writeback.c
> ===================================================================
> --- linux.orig/mm/page-writeback.c      2007-10-05 00:31:01.000000000 +0200
> +++ linux/mm/page-writeback.c   2007-10-05 00:50:11.000000000 +0200
> @@ -515,6 +515,12 @@ void throttle_vm_writeout(gfp_t gfp_mask
>          for ( ; ; ) {
>                 get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL);
> 
> +               /*
> +                * Make sure the threshold is over the hard limit of
> +                * dirty_thresh + ratelimit_pages * nr_cpus
> +                */
> +               dirty_thresh += ratelimit_pages * num_online_cpus();
> +
>                  /*
>                   * Boost the allowable dirty threshold a bit for page
>                   * allocators so they don't get DoS'ed by heavy writers

I can probably kind of guess what you're trying to do here.  But if
ratelimit_pages * num_online_cpus() exceeds the size of the offending zone
then things might go bad.
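
To put rough numbers on that concern, here is a minimal user-space sketch
of the arithmetic.  It is not kernel code: boosted_thresh() and
slack_exceeds_zone() are hypothetical helpers invented for illustration,
and the page counts mentioned below are made-up inputs, not values taken
from any real configuration.

```c
#include <assert.h>

/*
 * Hypothetical user-space model of the adjustment in the patch above:
 *     dirty_thresh += ratelimit_pages * num_online_cpus();
 * All quantities are in pages.
 */
static unsigned long boosted_thresh(unsigned long dirty_thresh,
                                    unsigned long ratelimit_pages,
                                    unsigned int nr_cpus)
{
        assert(nr_cpus > 0);    /* a running system has at least one CPU */
        return dirty_thresh + ratelimit_pages * nr_cpus;
}

/*
 * Returns 1 when the added slack alone is larger than the zone being
 * reclaimed, which is the case flagged above as potentially bad.
 */
static int slack_exceeds_zone(unsigned long ratelimit_pages,
                              unsigned int nr_cpus,
                              unsigned long zone_pages)
{
        assert(nr_cpus > 0);
        return ratelimit_pages * nr_cpus > zone_pages;
}
```

With an assumed ratelimit_pages of 1024 on 64 CPUs the added slack is
65536 pages, so slack_exceeds_zone(1024, 64, 4096) reports a problem for a
4096-page zone, while 32 pages on 4 CPUs (128 pages of slack) does not.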



