Re: [PATCH 03/10] writeback: Do not congestion sleep if there are no congested BDIs or significant writeback

From: Minchan Kim <minchan.kim@gmail.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Linux Kernel List <linux-kernel@vger.kernel.org>,
	Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Wu Fengguang <fengguang.wu@intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Dave Chinner <david@fromorbit.com>,
	Chris Mason <chris.mason@oracle.com>,
	Christoph Hellwig <hch@lst.de>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 03/10] writeback: Do not congestion sleep if there are no congested BDIs or significant writeback
Date: Wed, 8 Sep 2010 23:52:45 +0900
Message-ID: <20100908145245.GG4620@barrios-desktop>
In-Reply-To: <20100908110403.GB29263@csn.ul.ie>

On Wed, Sep 08, 2010 at 12:04:03PM +0100, Mel Gorman wrote:
> On Wed, Sep 08, 2010 at 12:25:33AM +0900, Minchan Kim wrote:
> > > + * @zone: A zone to consider the number of being being written back from
> > > + * @sync: SYNC or ASYNC IO
> > > + * @timeout: timeout in jiffies
> > > + *
> > > + * Waits for up to @timeout jiffies for a backing_dev (any backing_dev) to exit
> > > + * write congestion.  If no backing_devs are congested then the number of
> > > + * writeback pages in the zone are checked and compared to the inactive
> > > + * list. If there is no sigificant writeback or congestion, there is no point
> >                                                 and 
> > 
> 
> Why and? "or" makes sense because we avoid sleeping on either condition.

if (atomic_read(&nr_bdi_congested[sync]) == 0) {
        if (writeback < inactive / 2) {
                cond_resched();
                ..
                goto out;
        }
}

To avoid sleeping, both of the above conditions must be met.
So I thought "and" makes sense.
Am I missing something?
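
For illustration only (not part of the patch): a minimal userspace sketch,
assuming plain integers in place of atomic_read(&nr_bdi_congested[sync]) and
the zone_page_state() counters. The function names are hypothetical. It
suggests that the nested check and the wording "no congestion or significant
writeback" describe the same condition, since "not (A or B)" equals
"not A and not B".

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the early-exit test in the quoted
 * wait_iff_congested(). Plain integers stand in for the kernel counters.
 */
static bool skip_sleep_nested(int nr_congested, unsigned long writeback,
                              unsigned long inactive)
{
        /* Same shape as the patch: the writeback test is only reached
         * when no BDI is congested. */
        if (nr_congested == 0) {
                if (writeback < inactive / 2)
                        return true;    /* cond_resched() path, no sleep */
        }
        return false;                   /* fall through to the sleep */
}

/* "no (congestion OR significant writeback)" written out directly. */
static bool skip_sleep_no_either(int nr_congested, unsigned long writeback,
                                 unsigned long inactive)
{
        bool congested = nr_congested != 0;
        bool significant_writeback = writeback >= inactive / 2;

        return !(congested || significant_writeback);
}

/* Check that both forms agree for a few sample inputs. */
static void check_forms_agree(void)
{
        static const unsigned long samples[] = { 0, 1, 49, 50, 51, 100 };
        const unsigned long inactive = 100;

        for (int congested = 0; congested <= 2; congested++)
                for (unsigned int i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
                        assert(skip_sleep_nested(congested, samples[i], inactive) ==
                               skip_sleep_no_either(congested, samples[i], inactive));
}
```

Calling check_forms_agree() from a test program exercises both forms. If the
two readings are equivalent, wording such as "neither significant writeback
nor congestion" would remove the ambiguity in the comment.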

> 
> > > + * in sleeping but cond_resched() is called in case the current process has
> > > + * consumed its CPU quota.
> > > + */
> > > +long wait_iff_congested(struct zone *zone, int sync, long timeout)
> > > +{
> > > +	long ret;
> > > +	unsigned long start = jiffies;
> > > +	DEFINE_WAIT(wait);
> > > +	wait_queue_head_t *wqh = &congestion_wqh[sync];
> > > +
> > > +	/*
> > > +	 * If there is no congestion, check the amount of writeback. If there
> > > +	 * is no significant writeback and no congestion, just cond_resched
> > > +	 */
> > > +	if (atomic_read(&nr_bdi_congested[sync]) == 0) {
> > > +		unsigned long inactive, writeback;
> > > +
> > > +		inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
> > > +				zone_page_state(zone, NR_INACTIVE_ANON);
> > > +		writeback = zone_page_state(zone, NR_WRITEBACK);
> > > +
> > > +		/*
> > > +		 * If less than half the inactive list is being written back,
> > > +		 * reclaim might as well continue
> > > +		 */
> > > +		if (writeback < inactive / 2) {
> > 
> > I am not sure this is best.
> > 
> 
> I'm not saying it is. The objective is to identify a situation where
> sleeping until the next write or congestion clears is pointless. We have
> already identified that we are not congested so the question is "are we
> writing a lot at the moment?". The assumption is that if there is a lot
> of writing going on, we might as well sleep until one completes rather
> than reclaiming more.
> 
> This is the first effort at identifying pointless sleeps. Better ones
> might be identified in the future but that shouldn't stop us making a
> semi-sensible decision now.

nr_bdi_congested is not a problem since we have used it for a long time.
But you added a new rule about writeback.

The reason I pointed it out is that you added a new rule, and I want others
to know about this change since they may have good ideas or opinions.
I think that is one of the roles of a reviewer.

> 
> > 1. Without considering various speed class storage, could we fix it as half of inactive?
> 
> We don't really have a good means of identifying speed classes of
> storage. Worse, we are considering on a zone-basis here, not a BDI
> basis. The pages being written back in the zone could be backed by
> anything so we cannot make decisions based on BDI speed.

True. That is why I have the question below.
As you said, we don't have enough information in vmscan.
So I am not sure how effective such a semi-sensible decision is.

I think the best approach is to throttle well in page-writeback.
But I am not an expert on that and don't have any ideas. Sorry.
So I can't insist on my nitpick. If others don't have any objections,
I don't mind this either.

Wu, do you have any opinion?

> 
> > 2. Isn't there any writeback throttling on above layer? Do we care of it in here?
> > 
> 
> There are but congestion_wait() and now wait_iff_congested() are part of
> that. We can see from the figures in the leader that congestion_wait()
> is sleeping more than is necessary or smart.
> 
> > Just out of curiosity. 
> > 
> 
> -- 
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab

-- 
Kind regards,
Minchan Kim
