From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Huang, Ying" <ying.huang@intel.com>,
Rik van Riel <riel@redhat.com>, Michal Hocko <mhocko@suse.com>,
LKML <linux-kernel@vger.kernel.org>,
Michal Hocko <mhocko@kernel.org>,
Minchan Kim <minchan@kernel.org>,
Vinayak Menon <vinmenon@codeaurora.org>,
Mel Gorman <mgorman@suse.de>,
Andrew Morton <akpm@linux-foundation.org>, LKP <lkp@01.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
linux-s390 <linux-s390@vger.kernel.org>
Subject: Re: [LKP] [lkp] [mm] 5c0a85fad9: unixbench.score -6.3% regression
Date: Tue, 14 Jun 2016 16:03:23 +0200
Message-ID: <57600EAB.9030000@de.ibm.com>
In-Reply-To: <CA+55aFx2TdqHW5VvirF-fAe4rRtSKK6BH06LyN4Ma3Q7ifJkxA@mail.gmail.com>
On 06/14/2016 08:11 AM, Linus Torvalds wrote:
> On Mon, Jun 13, 2016 at 5:52 AM, Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
>> On Sat, Jun 11, 2016 at 06:02:57PM -0700, Linus Torvalds wrote:
>>>
>>> I've timed it at over a thousand cycles on at least some CPU's, but
>>> that's still peanuts compared to a real page fault. It shouldn't be
>>> *that* noticeable, ie no way it's a 6% regression on its own.
>>
>> Looks like setting accessed bit is the problem.
>
> Ok. I've definitely seen it as an issue, but never to the point of
> several percent on a real benchmark that wasn't explicitly testing
> that cost.
>
> I reported the excessive dirty/accessed bit cost to Intel back in the
> P4 days, but it's apparently not been high enough for anybody to care.
>
>> We spend 36% more time in page walk only, about 1% of total userspace time.
>> Combining this with page walk footprint on caches, I guess we can get to
>> this 3.5% score difference I see.
>>
>> I'm not sure if there's anything we can do to solve the issue without
>> screwing reclaim logic again. :(
>
> I think we should say "screw the reclaim logic" for now, and revert
> commit 5c0a85fad949 for now.
>
> Considering how much trouble the accessed bit is on some other
> architectures too, I wonder if we should strive to simply not care
> about it, and always leaving it set. And then rely entirely on just
> unmapping the pages and making the "we took a page fault after
> unmapping" be the real activity tester.
>
> So get rid of the "if the page is young, mark it old but leave it in
> the page tables" logic entirely. When we unmap a page, it will always
> either be in the swap cache or the page cache anyway, so faulting it
> in again should be just a minor fault with no actual IO happening.
>
> That might be less of an impact in the end - yes, the unmap and
> re-fault is much more expensive, but it presumably happens to much
> fewer pages.
FWIW, something like that is what Martin did for s390 3 years ago.
We now use invalidation and page faults to implement the *young
functions in pgtable.h (basically using a SW young bit). This
helped us to get rid of the storage keys (which contain the HW
reference bit). The performance did not seem to suffer.
See commit 0944fe3f4a323f436180d39402cae7f9c46ead17
("s390/mm: implement software referenced bits").
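To make the idea concrete, here is a rough user-space sketch of a software young bit driven by invalidation and faults. It is only a toy model, not the s390 code: the toy_* names, the bit layout and the simulated fault path are all made up for illustration, and it ignores the distinction between "invalid because old" and "invalid because not present" that a real implementation has to make.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for a toy PTE; NOT the real s390 encoding. */
#define TOY_PTE_INVALID  (1u << 0)  /* any access to this entry traps */
#define TOY_PTE_SW_YOUNG (1u << 1)  /* software-maintained referenced bit */

typedef uint32_t toy_pte_t;

/* Counts simulated faults so the cost of the scheme is visible. */
static unsigned long toy_fault_count;

static bool toy_pte_young(toy_pte_t pte)
{
	return (pte & TOY_PTE_SW_YOUNG) != 0;
}

/* Mark old: clear the young bit and invalidate, so the next access traps. */
static toy_pte_t toy_pte_mkold(toy_pte_t pte)
{
	return (pte & ~TOY_PTE_SW_YOUNG) | TOY_PTE_INVALID;
}

/* Mark young: set the young bit and make the entry valid again. */
static toy_pte_t toy_pte_mkyoung(toy_pte_t pte)
{
	return (pte | TOY_PTE_SW_YOUNG) & ~TOY_PTE_INVALID;
}

/* Simulated memory access: an invalid entry "faults" and is made young. */
static void toy_access(toy_pte_t *ptep)
{
	if (*ptep & TOY_PTE_INVALID) {
		toy_fault_count++;
		*ptep = toy_pte_mkyoung(*ptep);
	}
}

/* Toy analogue of a test-and-clear-young helper used by reclaim. */
static bool toy_test_and_clear_young(toy_pte_t *ptep)
{
	bool young = toy_pte_young(*ptep);

	*ptep = toy_pte_mkold(*ptep);
	return young;
}

/* Small walk-through; returns the number of simulated faults taken. */
static unsigned long toy_demo(void)
{
	toy_pte_t pte = toy_pte_mkyoung(0);

	toy_fault_count = 0;
	toy_access(&pte);                        /* valid: no fault */
	assert(toy_test_and_clear_young(&pte));  /* was young, now old + invalid */
	toy_access(&pte);                        /* traps once, young again */
	toy_access(&pte);                        /* valid again: no further fault */
	assert(toy_pte_young(pte));
	return toy_fault_count;
}
```

The point of the sketch is that only the first access after a mkold pays for a fault; later accesses run at full speed until reclaim clears the bit again.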
>
> What do you think?
So your proposal would be to make the software tracking via
invalidation/fault part of the generic mm code instead of hiding it
in the architecture backend. Correct?
>
> Linus
>