All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wei Yang <richardw.yang@linux.intel.com>
To: lkp@lists.01.org
Subject: Re: [driver core] 570d020012: will-it-scale.per_thread_ops -12.2% regression
Date: Thu, 21 Feb 2019 14:02:18 +0800	[thread overview]
Message-ID: <20190221060218.GA19466@richard> (raw)
In-Reply-To: <87h8cx21gl.fsf@yhuang-dev.intel.com>

[-- Attachment #1: Type: text/plain, Size: 7216 bytes --]

On Thu, Feb 21, 2019 at 12:46:18PM +0800, Huang, Ying wrote:
>Wei Yang <richardw.yang@linux.intel.com> writes:
>
>> On Thu, Feb 21, 2019 at 11:10:49AM +0800, kernel test robot wrote:
>>>On Tue, Feb 19, 2019 at 01:19:04PM +0100, Greg Kroah-Hartman wrote:
>>>> On Tue, Feb 19, 2019 at 08:59:45AM +0800, Wei Yang wrote:
>>>> > On Mon, Feb 18, 2019 at 03:54:42PM +0800, kernel test robot wrote:
>>>> > >Greeting,
>>>> > >
>>>> > >FYI, we noticed a -12.2% regression of will-it-scale.per_thread_ops due to commit:
>>>> > >
>>>> > >
>>>> > >commit: 570d0200123fb4f809aa2f6226e93a458d664d70 ("driver core: move device->knode_class to device_private")
>>>> > >https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>>>> > >
>>>> > 
>>>> > This is interesting.
>>>> > 
>>>> > I didn't expect the move of this field will impact the performance.
>>>> > 
>>>> > The reason is struct device is a hotter memory than device->device_private?
>>>> > 
>>>> > >in testcase: will-it-scale
>>>> > >on test machine: 288 threads Knights Mill with 80G memory
>>>> > >with following parameters:
>>>> > >
>>>> > >	nr_task: 100%
>>>> > >	mode: thread
>>>> > >	test: unlink2
>>>> > >	cpufreq_governor: performance
>>>> > >
>>>> > >test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
>>>> > >test-url: https://github.com/antonblanchard/will-it-scale
>>>> > >
>>>> > >In addition to that, the commit also has significant impact on the following tests:
>>>> > >
>>>> > >+------------------+---------------------------------------------------------------+
>>>> > >| testcase: change | will-it-scale: will-it-scale.per_thread_ops -29.9% regression |
>>>> > >| test machine     | 288 threads Knights Mill with 80G memory                      |
>>>> > >| test parameters  | cpufreq_governor=performance                                  |
>>>> > >|                  | mode=thread                                                   |
>>>> > >|                  | nr_task=100%                                                  |
>>>> > >|                  | test=signal1                                                  |
>>>> 
>>>> Ok, I'm going to blame your testing system, or something here, and not
>>>> the above patch.
>>>> 
>>>> All this test does is call raise(3).  That does not touch the driver
>>>> core at all.
>>>> 
>>>> > >+------------------+---------------------------------------------------------------+
>>>> > >| testcase: change | will-it-scale: will-it-scale.per_thread_ops -16.5% regression |
>>>> > >| test machine     | 288 threads Knights Mill with 80G memory                      |
>>>> > >| test parameters  | cpufreq_governor=performance                                  |
>>>> > >|                  | mode=thread                                                   |
>>>> > >|                  | nr_task=100%                                                  |
>>>> > >|                  | test=open1                                                    |
>>>> > >+------------------+---------------------------------------------------------------+
>>>> 
>>>> Same here, open1 just calls open/close a lot.  No driver core
>>>> interaction at all there either.
>>>> 
>>>> So are you _sure_ this is the offending patch?
>>>
>>>Hi Greg,
>>>
>>>We did an experiment, recovered the layout of struct device. and we
>>>found the regression is gone. I guess the regession is not from the
>>>patch but related to the struct layout.
>>>
>>>
>>>tests: 1
>>>testcase/path_params/tbox_group/run: will-it-scale/performance-thread-100%-unlink2/lkp-knm01
>>>
>>>570d0200123fb4f8  a36dc70b810afe9183de2ea18f  
>>>----------------  --------------------------  
>>>         %stddev      change         %stddev
>>>             \          |                \  
>>>    237096              14%     270789        will-it-scale.workload
>>>       823              14%        939        will-it-scale.per_thread_ops
>>>
>>
>> Do you have the comparison between a36dc70b810afe9183de2ea18f and the one
>> before 570d020012?
>>
>>>
>>>tests: 1
>>>testcase/path_params/tbox_group/run: will-it-scale/performance-thread-100%-signal1/lkp-knm01
>>>
>>>570d0200123fb4f8  a36dc70b810afe9183de2ea18f  
>>>----------------  --------------------------  
>>>         %stddev      change         %stddev
>>>             \          |                \  
>>>     93.51   3%        48%     138.53   3%  will-it-scale.time.user_time
>>>       186              40%        261        will-it-scale.per_thread_ops
>>>     53909              40%      75507        will-it-scale.workload
>>>
>>>
>>>tests: 1
>>>testcase/path_params/tbox_group/run: will-it-scale/performance-thread-100%-open1/lkp-knm01
>>>
>>>570d0200123fb4f8  a36dc70b810afe9183de2ea18f  
>>>----------------  --------------------------  
>>>         %stddev      change         %stddev
>>>             \          |                \  
>>>    447722              22%     546258  10%  will-it-scale.time.involuntary_context_switches
>>>    226995              19%     269751        will-it-scale.workload
>>>       787              19%        936        will-it-scale.per_thread_ops
>>>
>>>
>>>
>>>commit a36dc70b810afe9183de2ea18faa4c0939c139ac
>>>Author: 0day robot <lkp@intel.com>
>>>Date:   Wed Feb 20 14:21:19 2019 +0800
>>>
>>>    backfile klist_node in struct device for debugging
>>>    
>>>    Signed-off-by: 0day robot <lkp@intel.com>
>>>
>>>diff --git a/include/linux/device.h b/include/linux/device.h
>>>index d0e452fd0bff2..31666cb72b3ba 100644
>>>--- a/include/linux/device.h
>>>+++ b/include/linux/device.h
>>>@@ -1035,6 +1035,7 @@ struct device {
>>> 	spinlock_t		devres_lock;
>>> 	struct list_head	devres_head;
>>> 
>>>+	struct klist_node       knode_class_test_by_rongc;
>>> 	struct class		*class;
>>> 	const struct attribute_group **groups;	/* optional groups */
>>
>> Hmm... because this is not properly aligned?
>>
>> struct klist_node {
>> 	void			*n_klist;	/* never access directly */
>> 	struct list_head	n_node;
>> 	struct kref		n_ref;
>> };
>>
>> Except struct kref has one "int" type, others are pointers.
>>
>> But... I am still confused.
>
>I guess because the size of struct device is changed, it influences some
>alignment changes in the system.  Thus influence the benchmark score.
>

That's interesting.

I wrote a module to see the exact size of these two structure on my x86_64.

    sizeof(struct device) = 736 = 8 * 92
    sizeof(struct device_private) = 160 = 8 * 20
    sizeof(struct klist_node) = 32 = 8 * 4

Even klist_node has one 4 byte field, c complier would pack the structure to
make it aligned. Which system alignment it would affect?

After the patch, size would change like this:

   struct device          736   ->   704
   struce device_private  160   ->   192

Would this size change affect system?

>Best Regards,
>Huang, Ying
>
>>>
>>>Best Regards,
>>>Rong Chen

-- 
Wei Yang
Help you, Help me

WARNING: multiple messages have this Message-ID (diff)
From: Wei Yang <richardw.yang@linux.intel.com>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>,
	kernel test robot <rong.a.chen@intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	lkp@01.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [LKP] [driver core] 570d020012: will-it-scale.per_thread_ops -12.2% regression
Date: Thu, 21 Feb 2019 14:02:18 +0800	[thread overview]
Message-ID: <20190221060218.GA19466@richard> (raw)
In-Reply-To: <87h8cx21gl.fsf@yhuang-dev.intel.com>

On Thu, Feb 21, 2019 at 12:46:18PM +0800, Huang, Ying wrote:
>Wei Yang <richardw.yang@linux.intel.com> writes:
>
>> On Thu, Feb 21, 2019 at 11:10:49AM +0800, kernel test robot wrote:
>>>On Tue, Feb 19, 2019 at 01:19:04PM +0100, Greg Kroah-Hartman wrote:
>>>> On Tue, Feb 19, 2019 at 08:59:45AM +0800, Wei Yang wrote:
>>>> > On Mon, Feb 18, 2019 at 03:54:42PM +0800, kernel test robot wrote:
>>>> > >Greeting,
>>>> > >
>>>> > >FYI, we noticed a -12.2% regression of will-it-scale.per_thread_ops due to commit:
>>>> > >
>>>> > >
>>>> > >commit: 570d0200123fb4f809aa2f6226e93a458d664d70 ("driver core: move device->knode_class to device_private")
>>>> > >https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>>>> > >
>>>> > 
>>>> > This is interesting.
>>>> > 
>>>> > I didn't expect the move of this field will impact the performance.
>>>> > 
>>>> > The reason is struct device is a hotter memory than device->device_private?
>>>> > 
>>>> > >in testcase: will-it-scale
>>>> > >on test machine: 288 threads Knights Mill with 80G memory
>>>> > >with following parameters:
>>>> > >
>>>> > >	nr_task: 100%
>>>> > >	mode: thread
>>>> > >	test: unlink2
>>>> > >	cpufreq_governor: performance
>>>> > >
>>>> > >test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
>>>> > >test-url: https://github.com/antonblanchard/will-it-scale
>>>> > >
>>>> > >In addition to that, the commit also has significant impact on the following tests:
>>>> > >
>>>> > >+------------------+---------------------------------------------------------------+
>>>> > >| testcase: change | will-it-scale: will-it-scale.per_thread_ops -29.9% regression |
>>>> > >| test machine     | 288 threads Knights Mill with 80G memory                      |
>>>> > >| test parameters  | cpufreq_governor=performance                                  |
>>>> > >|                  | mode=thread                                                   |
>>>> > >|                  | nr_task=100%                                                  |
>>>> > >|                  | test=signal1                                                  |
>>>> 
>>>> Ok, I'm going to blame your testing system, or something here, and not
>>>> the above patch.
>>>> 
>>>> All this test does is call raise(3).  That does not touch the driver
>>>> core at all.
>>>> 
>>>> > >+------------------+---------------------------------------------------------------+
>>>> > >| testcase: change | will-it-scale: will-it-scale.per_thread_ops -16.5% regression |
>>>> > >| test machine     | 288 threads Knights Mill with 80G memory                      |
>>>> > >| test parameters  | cpufreq_governor=performance                                  |
>>>> > >|                  | mode=thread                                                   |
>>>> > >|                  | nr_task=100%                                                  |
>>>> > >|                  | test=open1                                                    |
>>>> > >+------------------+---------------------------------------------------------------+
>>>> 
>>>> Same here, open1 just calls open/close a lot.  No driver core
>>>> interaction at all there either.
>>>> 
>>>> So are you _sure_ this is the offending patch?
>>>
>>>Hi Greg,
>>>
>>>We did an experiment, recovered the layout of struct device. and we
>>>found the regression is gone. I guess the regession is not from the
>>>patch but related to the struct layout.
>>>
>>>
>>>tests: 1
>>>testcase/path_params/tbox_group/run: will-it-scale/performance-thread-100%-unlink2/lkp-knm01
>>>
>>>570d0200123fb4f8  a36dc70b810afe9183de2ea18f  
>>>----------------  --------------------------  
>>>         %stddev      change         %stddev
>>>             \          |                \  
>>>    237096              14%     270789        will-it-scale.workload
>>>       823              14%        939        will-it-scale.per_thread_ops
>>>
>>
>> Do you have the comparison between a36dc70b810afe9183de2ea18f and the one
>> before 570d020012?
>>
>>>
>>>tests: 1
>>>testcase/path_params/tbox_group/run: will-it-scale/performance-thread-100%-signal1/lkp-knm01
>>>
>>>570d0200123fb4f8  a36dc70b810afe9183de2ea18f  
>>>----------------  --------------------------  
>>>         %stddev      change         %stddev
>>>             \          |                \  
>>>     93.51   3%        48%     138.53   3%  will-it-scale.time.user_time
>>>       186              40%        261        will-it-scale.per_thread_ops
>>>     53909              40%      75507        will-it-scale.workload
>>>
>>>
>>>tests: 1
>>>testcase/path_params/tbox_group/run: will-it-scale/performance-thread-100%-open1/lkp-knm01
>>>
>>>570d0200123fb4f8  a36dc70b810afe9183de2ea18f  
>>>----------------  --------------------------  
>>>         %stddev      change         %stddev
>>>             \          |                \  
>>>    447722              22%     546258  10%  will-it-scale.time.involuntary_context_switches
>>>    226995              19%     269751        will-it-scale.workload
>>>       787              19%        936        will-it-scale.per_thread_ops
>>>
>>>
>>>
>>>commit a36dc70b810afe9183de2ea18faa4c0939c139ac
>>>Author: 0day robot <lkp@intel.com>
>>>Date:   Wed Feb 20 14:21:19 2019 +0800
>>>
>>>    backfile klist_node in struct device for debugging
>>>    
>>>    Signed-off-by: 0day robot <lkp@intel.com>
>>>
>>>diff --git a/include/linux/device.h b/include/linux/device.h
>>>index d0e452fd0bff2..31666cb72b3ba 100644
>>>--- a/include/linux/device.h
>>>+++ b/include/linux/device.h
>>>@@ -1035,6 +1035,7 @@ struct device {
>>> 	spinlock_t		devres_lock;
>>> 	struct list_head	devres_head;
>>> 
>>>+	struct klist_node       knode_class_test_by_rongc;
>>> 	struct class		*class;
>>> 	const struct attribute_group **groups;	/* optional groups */
>>
>> Hmm... because this is not properly aligned?
>>
>> struct klist_node {
>> 	void			*n_klist;	/* never access directly */
>> 	struct list_head	n_node;
>> 	struct kref		n_ref;
>> };
>>
>> Except struct kref has one "int" type, others are pointers.
>>
>> But... I am still confused.
>
>I guess because the size of struct device is changed, it influences some
>alignment changes in the system.  Thus influence the benchmark score.
>

That's interesting.

I wrote a module to see the exact size of these two structure on my x86_64.

    sizeof(struct device) = 736 = 8 * 92
    sizeof(struct device_private) = 160 = 8 * 20
    sizeof(struct klist_node) = 32 = 8 * 4

Even klist_node has one 4 byte field, c complier would pack the structure to
make it aligned. Which system alignment it would affect?

After the patch, size would change like this:

   struct device          736   ->   704
   struce device_private  160   ->   192

Would this size change affect system?

>Best Regards,
>Huang, Ying
>
>>>
>>>Best Regards,
>>>Rong Chen

-- 
Wei Yang
Help you, Help me

  reply	other threads:[~2019-02-21  6:02 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-18  7:54 [driver core] 570d020012: will-it-scale.per_thread_ops -12.2% regression kernel test robot
2019-02-18  7:54 ` [LKP] " kernel test robot
2019-02-19  0:59 ` Wei Yang
2019-02-19 12:19   ` Greg Kroah-Hartman
2019-02-19 12:19     ` [LKP] " Greg Kroah-Hartman
2019-02-21  3:10     ` kernel test robot
2019-02-21  3:10       ` [LKP] " kernel test robot
2019-02-21  3:46       ` Wei Yang
2019-02-21  3:46         ` [LKP] " Wei Yang
2019-02-21  4:46         ` Huang, Ying
2019-02-21  4:46           ` [LKP] " Huang, Ying
2019-02-21  6:02           ` Wei Yang [this message]
2019-02-21  6:02             ` Wei Yang
2019-02-21  6:29             ` Huang, Ying
2019-02-21  6:29               ` [LKP] " Huang, Ying
2019-02-21  5:46         ` kernel test robot
2019-02-21  5:46           ` [LKP] " kernel test robot
2019-02-21  7:10       ` Greg Kroah-Hartman
2019-02-21  7:10         ` [LKP] " Greg Kroah-Hartman
2019-02-21  7:18         ` Huang, Ying
2019-02-21  7:18           ` [LKP] " Huang, Ying
2019-02-21  7:35           ` Greg Kroah-Hartman
2019-02-21  7:35             ` [LKP] " Greg Kroah-Hartman
2019-02-21  8:30             ` Huang, Ying
2019-02-21  8:30               ` [LKP] " Huang, Ying
2019-02-21  8:39               ` Wei Yang
2019-02-21  9:12                 ` Greg Kroah-Hartman
2019-02-21  9:12                   ` [LKP] " Greg Kroah-Hartman
2019-02-21 21:40                   ` Wei Yang
2019-02-21  7:53           ` Wei Yang
2019-02-21  7:53             ` [LKP] " Wei Yang
2019-02-21 22:31             ` Wei Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190221060218.GA19466@richard \
    --to=richardw.yang@linux.intel.com \
    --cc=lkp@lists.01.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.