From: Dave Chinner <david@fromorbit.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Huang, Ying" <ying.huang@intel.com>,
LKML <linux-kernel@vger.kernel.org>,
Bob Peterson <rpeterso@redhat.com>,
Wu Fengguang <fengguang.wu@intel.com>, LKP <lkp@01.org>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [LKP] [lkp] [xfs] 68a9f5e700: aim7.jobs-per-min -13.6% regression
Date: Thu, 11 Aug 2016 14:46:09 +1000
Message-ID: <20160811044609.GW16044@dastard>
In-Reply-To: <CA+55aFy=xeEKgfHWisfxsZNqrgMJxvKvhrWrPmdpMxu0pnOXig@mail.gmail.com>
On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
> On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying <ying.huang@intel.com> wrote:
> >
> > Here it is,
>
> Thanks.
>
> Appended is a munged "after" list, with the "before" values in
> parenthesis. It actually looks fairly similar.
>
> The biggest difference is that we have "mark_page_accessed()" show up
> after, and not before. There was also a lot of LRU noise in the
> non-profile data. I wonder if that is the reason here: the old model
> of using generic_perform_write/block_page_mkwrite didn't mark the
> pages accessed, and now with iomap_file_buffered_write() they get
> marked as active and that screws up the LRU list, and makes us not
> flush out the dirty pages well (because they are seen as active and
> not good for writeback), and then you get bad memory use.
>
> I'm not seeing anything that looks like locking-related.

Not in that profile. I've been doing some local testing inside a
4-node fake-numa 16p/16GB RAM VM to see what I can find.

I've yet to work out how I can trigger a profile like the one that
was reported (I really need to see the event traces), but in the
meantime I found this....

Doing a large sequential single-threaded buffered write using a 4k
buffer (so a single page per syscall, to make the XFS IO path allocator
behave the same way as in 4.7), I'm seeing a CPU profile that
indicates we have a potential mapping->tree_lock issue:

# xfs_io -f -c "truncate 0" -c "pwrite 0 47g" /mnt/scratch/fooey
wrote 50465865728/50465865728 bytes at offset 0
47.000 GiB, 12320768 ops; 0:01:36.00 (499.418 MiB/sec and 127850.9132 ops/sec)
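
For anyone wanting to approximate this workload without xfs_io, the
following is a minimal Python sketch of the same access pattern:
truncate the file, then issue sequential 4096-byte pwrite() calls from
a single thread. The 4096-byte write size is inferred from the output
above (50465865728 bytes / 12320768 ops = 4096). The function name, the
fill byte and the scaled-down 1 MiB demo size are assumptions made for
illustration; this is not what xfs_io does internally.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # one page per write call, matching the 4k buffer above


def sequential_buffered_write(path, total_bytes, block_size=BLOCK_SIZE):
    """Truncate path, then write total_bytes sequentially in block_size chunks.

    Returns the number of pwrite() calls issued. Hypothetical helper,
    not part of xfs_io or any kernel tooling.
    """
    buf = b"\xcd" * block_size  # fill pattern is arbitrary (assumption)
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        os.ftruncate(fd, 0)  # equivalent in spirit to "truncate 0"
        offset = 0
        ops = 0
        while offset < total_bytes:
            # The last chunk may be shorter than block_size.
            chunk = buf[: min(block_size, total_bytes - offset)]
            # pwrite() may write fewer bytes than asked; advance by what landed.
            offset += os.pwrite(fd, chunk, offset)
            ops += 1
        return ops
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Scaled-down demo: 1 MiB into a temporary file instead of 47 GiB.
    with tempfile.TemporaryDirectory() as tmpdir:
        target = os.path.join(tmpdir, "fooey")
        print(sequential_buffered_write(target, 1 << 20), "ops")
```

To see the reclaim behaviour discussed here, the file has to be much
larger than RAM (47g against 16GB in the test above), so that kswapd
is reclaiming page cache while the write is still running.
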
....
24.15% [kernel] [k] _raw_spin_unlock_irqrestore
9.67% [kernel] [k] copy_user_generic_string
5.64% [kernel] [k] _raw_spin_unlock_irq
3.34% [kernel] [k] get_page_from_freelist
2.57% [kernel] [k] mark_page_accessed
2.45% [kernel] [k] do_raw_spin_lock
1.83% [kernel] [k] shrink_page_list
1.70% [kernel] [k] free_hot_cold_page
1.26% [kernel] [k] xfs_do_writepage
1.21% [kernel] [k] __radix_tree_lookup
1.20% [kernel] [k] __wake_up_bit
0.99% [kernel] [k] __block_write_begin_int
0.95% [kernel] [k] find_get_pages_tag
0.92% [kernel] [k] cancel_dirty_page
0.89% [kernel] [k] unlock_page
0.87% [kernel] [k] clear_page_dirty_for_io
0.85% [kernel] [k] xfs_bmap_worst_indlen
0.84% [kernel] [k] xfs_file_buffered_aio_write
0.81% [kernel] [k] delay_tsc
0.78% [kernel] [k] node_dirty_ok
0.77% [kernel] [k] up_write
0.74% [kernel] [k] ___might_sleep
0.73% [kernel] [k] xfs_bmap_add_extent_hole_delay
0.72% [kernel] [k] __fget_light
0.67% [kernel] [k] add_to_page_cache_lru
0.67% [kernel] [k] __slab_free
0.63% [kernel] [k] drop_buffers
0.59% [kernel] [k] down_write
0.59% [kernel] [k] kmem_cache_alloc
0.58% [kernel] [k] iomap_write_actor
0.53% [kernel] [k] page_mapping
0.52% [kernel] [k] entry_SYSCALL_64_fastpath
0.52% [kernel] [k] __mark_inode_dirty
0.51% [kernel] [k] __block_commit_write.isra.30
0.51% [kernel] [k] xfs_file_write_iter
0.49% [kernel] [k] mark_buffer_async_write
0.47% [kernel] [k] balance_dirty_pages_ratelimited
0.47% [kernel] [k] xfs_count_page_state
0.47% [kernel] [k] page_evictable
0.46% [kernel] [k] xfs_vm_releasepage
0.46% [kernel] [k] xfs_iomap_write_delay
0.46% [kernel] [k] do_raw_spin_unlock
0.44% [kernel] [k] xfs_file_iomap_begin
0.44% [kernel] [k] xfs_map_at_offset
0.42% [kernel] [k] xfs_iext_bno_to_ext

There's very little XFS showing up near the top of the profile;
it's all page cache, writeback and spin lock traffic. This is a
dead giveaway as to the lock being contended:

-   33.30%     0.01%  [kernel]  [k] kswapd
   - 4.67% kswapd
      - 4.69% shrink_node
         - 4.77% shrink_node_memcg.isra.75
            - 7.38% shrink_inactive_list
               - 6.70% shrink_page_list
                  - 20.02% __remove_mapping
                       19.90% _raw_spin_unlock_irqrestore
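
As a rough cross-check of how much of the flat profile above is spin
lock traffic, the self-time percentages of the spinlock symbols can
simply be added up. The short Python sketch below does that; the
values are copied from the flat profile, while the choice of which
symbols count as "spin lock traffic" and the helper name are
assumptions made for illustration.

```python
# Self-time percentages copied from the flat profile above.
# Which symbols to include is a judgment call (assumption).
SPINLOCK_SELF_PCT = {
    "_raw_spin_unlock_irqrestore": 24.15,
    "_raw_spin_unlock_irq": 5.64,
    "do_raw_spin_lock": 2.45,
    "do_raw_spin_unlock": 0.46,
}


def total_pct(samples):
    """Sum the per-symbol percentages, rounded to two decimal places."""
    return round(sum(samples.values()), 2)


if __name__ == "__main__":
    print(f"spinlock symbols: {total_pct(SPINLOCK_SELF_PCT):.2f}% of samples")
```

That comes to roughly a third of all samples (about 32.7%) landing in
spinlock primitives, against under 10% in copy_user_generic_string,
which is where the data is actually copied.
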

I don't think that this is the same as what aim7 is triggering, as
there are no XFS write() path allocation functions near the top of the
profile to speak of. Still, I don't recall seeing this before...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com