Re: [mm] 8cc621d2f4: fio.write_iops -21.8% regression


From: Chris Goldsworthy <cgoldswo@codeaurora.org>
To: lkp@lists.01.org
Subject: Re: [mm] 8cc621d2f4: fio.write_iops -21.8% regression
Date: Tue, 25 May 2021 09:53:49 -0700
Message-ID: <48d281469120cbed8aa58cd5f108ed47@codeaurora.org>
In-Reply-To: <YK0Us01mBTRWOQIw@google.com>

On 2021-05-25 08:16, Minchan Kim wrote:
> On Mon, May 24, 2021 at 10:37:49AM -0700, Chris Goldsworthy wrote:
>> Hi Minchan,
>> 
>> This looks good to me, I just have some minor feedback.
>> 
>> Thanks,
> 
> Hi Chris,
> 
> Thanks for the review. Please see below.
> 
>> 
>> Chris.
>> 
>> On 2021-05-20 11:36, Minchan Kim wrote:
>> > On Thu, May 20, 2021 at 04:31:44PM +0800, kernel test robot wrote:
>> > >
>> > >
>> > > Greeting,
>> > >
>> > > FYI, we noticed a -21.8% regression of fio.write_iops due to commit:
>> > >
>> > >
>> > > commit: 8cc621d2f45ddd3dc664024a647ee7adf48d79a5 ("mm: fs:
>> > > invalidate BH LRU during page migration")
>> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>> > >
>> > >
>> > > in testcase: fio-basic
>> > > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU
>> > > @ 2.10GHz with 256G memory
>> > > with following parameters:
>> > >
>> > > 	disk: 2pmem
>> > > 	fs: ext4
>> > > 	runtime: 200s
>> > > 	nr_task: 50%
>> > > 	time_based: tb
>> > > 	rw: randwrite
>> > > 	bs: 4k
>> > > 	ioengine: libaio
>> > > 	test_size: 200G
>> > > 	cpufreq_governor: performance
>> > > 	ucode: 0x5003006
>> > >
>> > > test-description: Fio is a tool that will spawn a number of threads
>> > > or processes doing a particular type of I/O action as specified by
>> > > the user.
>> > > test-url: https://github.com/axboe/fio
>> > >
>> > >
>> > >
>> > > If you fix the issue, kindly add following tag
>> > > Reported-by: kernel test robot <oliver.sang@intel.com>
>> > >
>> > >
>> > > Details are as below:
>> > > -------------------------------------------------------------------------------------------------->
>> > >
>> > >
>> > > To reproduce:
>> > >
>> > >         git clone https://github.com/intel/lkp-tests.git
>> > >         cd lkp-tests
>> > >         bin/lkp install                job.yaml  # job file is attached in this email
>> > >         bin/lkp split-job --compatible job.yaml  # generate the yaml file for lkp run
>> > >         bin/lkp run                    generated-yaml-file
>> >
>> > Hi,
>> >
>> > I tried to install lkp-tests on my machine by following the guide above,
>> > but it failed due to package problems (I guess that's my problem, since
>> > I use a somewhat unusual environment). However, I guess the regression
>> > comes from an increased miss ratio of bh_lrus, since the patch caused
>> > more frequent invalidation of the bh_lrus compared to before. For
>> > example, lru_add_drain could be called from several hot places (e.g.,
>> > unmap and pagevec_release from several paths), and it could keep
>> > invalidating bh_lrus.
>> >
>> > IMO, we should move the overhead from such a hot path to a cold one.
>> > How about this?
>> >
>> > From ebf4ede1cf32fb14d85f0015a3693cb8e1b8dbfe Mon Sep 17 00:00:00 2001
>> > From: Minchan Kim <minchan@kernel.org>
>> > Date: Thu, 20 May 2021 11:17:56 -0700
>> > Subject: [PATCH] invalidate bh_lrus only at lru_add_drain_all
>> >
>> > Not-Yet-Signed-off-by: Minchan Kim <minchan@kernel.org>
>> > ---
>> >  mm/swap.c | 15 +++++++++++++--
>> >  1 file changed, 13 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/mm/swap.c b/mm/swap.c
>> > index dfb48cf9c2c9..d6168449e28c 100644
>> > --- a/mm/swap.c
>> > +++ b/mm/swap.c
>> > @@ -642,7 +642,6 @@ void lru_add_drain_cpu(int cpu)
>> >  		pagevec_lru_move_fn(pvec, lru_lazyfree_fn);
>> >
>> >  	activate_page_drain(cpu);
>> > -	invalidate_bh_lrus_cpu(cpu);
>> >  }
>> >
>> >  /**
>> > @@ -725,6 +724,17 @@ void lru_add_drain(void)
>> >  	local_unlock(&lru_pvecs.lock);
>> >  }
>> >
>> > +void lru_and_bh_lrus_drain(void)
>> > +{
>> > +	int cpu;
>> > +
>> > +	local_lock(&lru_pvecs.lock);
>> > +	cpu = smp_processor_id();
>> > +	lru_add_drain_cpu(cpu);
>> > +	local_unlock(&lru_pvecs.lock);
>> > +	invalidate_bh_lrus_cpu(cpu);
>> > +}
>> > +
>> 
>> Nit: drop int cpu?
> 
> Do you mean to suggest using smp_processor_id at both places
> instead of a local variable? Since invalidate_bh_lrus_cpu
> is called outside of the lru_pvecs.lock, I wanted to express
> that the draining happens on the same CPU by storing the CPU.

Ah, got it.
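
To make that point concrete, here is a minimal userspace sketch. It is not
kernel code: every sim_* name is a hypothetical stand-in, and the model
assumes the worst case, namely that the task migrates to another CPU as soon
as the lock is dropped. Under that assumption, re-reading the CPU after the
unlock sends the bh_lru invalidation to a different CPU than the one whose
pagevecs were drained, while the stored value keeps both on the same CPU.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of this is kernel code. */
#define SIM_NR_CPUS 4

static int sim_current_cpu;                 /* CPU the simulated task runs on */
static int sim_lru_drained[SIM_NR_CPUS];    /* pagevec drains seen per CPU */
static int sim_bh_invalidated[SIM_NR_CPUS]; /* bh_lru invalidations seen per CPU */

static int sim_processor_id(void) { return sim_current_cpu; }

static void sim_local_lock(void) { /* the task cannot migrate while held */ }

/* Assumption: model the worst case, where dropping the lock migrates the task. */
static void sim_local_unlock(void)
{
	sim_current_cpu = (sim_current_cpu + 1) % SIM_NR_CPUS;
}

static void sim_lru_add_drain_cpu(int cpu) { sim_lru_drained[cpu]++; }
static void sim_invalidate_bh_lrus_cpu(int cpu) { sim_bh_invalidated[cpu]++; }

/* Shape used in the patch: remember the CPU while the lock is held. */
static void drain_with_saved_cpu(void)
{
	int cpu;

	sim_local_lock();
	cpu = sim_processor_id();
	sim_lru_add_drain_cpu(cpu);
	sim_local_unlock();
	sim_invalidate_bh_lrus_cpu(cpu);
}

/* Alternative without the local: re-read the CPU after unlocking. */
static void drain_rereading_cpu(void)
{
	sim_local_lock();
	sim_lru_add_drain_cpu(sim_processor_id());
	sim_local_unlock();
	sim_invalidate_bh_lrus_cpu(sim_processor_id());
}

/* Clear the counters and place the simulated task on the given CPU. */
static void sim_reset(int cpu)
{
	memset(sim_lru_drained, 0, sizeof(sim_lru_drained));
	memset(sim_bh_invalidated, 0, sizeof(sim_bh_invalidated));
	sim_current_cpu = cpu;
}
```

Calling sim_reset(0) followed by drain_with_saved_cpu() leaves both counters
for CPU 0 at one. The same sequence with drain_rereading_cpu() drains CPU 0
but invalidates on CPU 1 in this model.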

>> 
>> >  void lru_add_drain_cpu_zone(struct zone *zone)
>> >  {
>> >  	local_lock(&lru_pvecs.lock);
>> > @@ -739,7 +749,7 @@ static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);
>> >
>> >  static void lru_add_drain_per_cpu(struct work_struct *dummy)
>> >  {
>> > -	lru_add_drain();
>> > +	lru_and_bh_lrus_drain();
>> >  }
>> >
>> >  /*
>> > @@ -881,6 +891,7 @@ void lru_cache_disable(void)
>> >  	__lru_add_drain_all(true);
>> >  #else
>> >  	lru_add_drain();
>> > +	invalidate_bh_lrus_cpu(smp_processor_id());
>> >  #endif
>> >  }
>> 
>> Can't we replace the call to lru_add_drain() and
>> invalidate_bh_lrus_cpu(smp_processor_id()) with a single call to
>> lru_and_bh_lrus_drain()?
> 
> Good idea.
> 
> Thanks!
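
For reference, the equivalence behind that suggestion can be checked with a
small userspace sketch. The sim_* functions below are hypothetical mocks that
only record the order in which they are called. They ignore locking and the
CPU argument entirely, so this only shows that the two forms of the #else
branch issue the same calls in the same order, not that they are equivalent
under concurrency.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical call recorder; the sim_* mocks only log their own names. */
#define MAX_CALLS 8

static const char *call_log[MAX_CALLS];
static int nr_calls;

static void record(const char *name)
{
	if (nr_calls < MAX_CALLS)
		call_log[nr_calls++] = name;
}

static void sim_lru_add_drain_cpu(void) { record("lru_add_drain_cpu"); }
static void sim_invalidate_bh_lrus_cpu(void) { record("invalidate_bh_lrus_cpu"); }

/* Mock of lru_add_drain(): drains the pagevecs only. */
static void sim_lru_add_drain(void) { sim_lru_add_drain_cpu(); }

/* Mock of the new helper: drain the pagevecs, then invalidate the bh_lrus. */
static void sim_lru_and_bh_lrus_drain(void)
{
	sim_lru_add_drain_cpu();
	sim_invalidate_bh_lrus_cpu();
}

/* #else branch as posted in the patch: two separate calls. */
static void disable_as_posted(void)
{
	sim_lru_add_drain();
	sim_invalidate_bh_lrus_cpu();
}

/* #else branch as suggested in review: one call to the helper. */
static void disable_as_suggested(void)
{
	sim_lru_and_bh_lrus_drain();
}

/* Run both variants and report whether they logged identical call sequences. */
static int traces_match(void (*a)(void), void (*b)(void))
{
	const char *first[MAX_CALLS];
	int n, i;

	nr_calls = 0;
	a();
	n = nr_calls;
	memcpy(first, call_log, sizeof(first));

	nr_calls = 0;
	b();
	if (nr_calls != n)
		return 0;
	for (i = 0; i < n; i++)
		if (strcmp(first[i], call_log[i]) != 0)
			return 0;
	return 1;
}
```

With these mocks, traces_match(disable_as_posted, disable_as_suggested)
returns 1, since both variants log lru_add_drain_cpu followed by
invalidate_bh_lrus_cpu.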

-- 
The Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum,
a Linux Foundation Collaborative Project

WARNING: multiple messages have this Message-ID (diff)

From: Chris Goldsworthy <cgoldswo@codeaurora.org>
To: Minchan Kim <minchan@kernel.org>
Cc: kernel test robot <oliver.sang@intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Laura Abbott <labbott@kernel.org>,
	David Hildenbrand <david@redhat.com>,
	John Dias <joaodias@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@suse.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	lkp@lists.01.org, lkp@intel.com, ying.huang@intel.com,
	feng.tang@intel.com, zhengjun.xing@intel.com,
	Minchan Kim <minchan.kim@gmail.com>
Subject: Re: [mm]  8cc621d2f4:  fio.write_iops -21.8% regression
Date: Tue, 25 May 2021 09:53:49 -0700
Message-ID: <48d281469120cbed8aa58cd5f108ed47@codeaurora.org>
In-Reply-To: <YK0Us01mBTRWOQIw@google.com>
