All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Dan Streetman <ddstreet@ieee.org>
Cc: Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>,
	Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
	Shaohua Li <shli@fusionio.com>, Weijie Yang <weijieut@gmail.com>,
	Linux-MM <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] swap: change swap_info singly-linked list to list_head
Date: Fri, 25 Apr 2014 09:38:30 +0100	[thread overview]
Message-ID: <20140425083830.GY23991@suse.de> (raw)
In-Reply-To: <CALZtONCa3jLrYkPSFPNnV84zePxFtdkWJBu092ScgUe2AugMxQ@mail.gmail.com>

On Thu, Apr 24, 2014 at 02:48:43PM -0400, Dan Streetman wrote:
> >> <SNIP>
> >> -             }
> >> -
> >> +     list_for_each(tmp, &swap_list_head) {
> >> +             si = list_entry(tmp, typeof(*si), list);
> >>               spin_lock(&si->lock);
> >> -             if (!si->highest_bit) {
> >> -                     spin_unlock(&si->lock);
> >> -                     continue;
> >> -             }
> >> -             if (!(si->flags & SWP_WRITEOK)) {
> >> +             if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
> >>                       spin_unlock(&si->lock);
> >>                       continue;
> >>               }
> >>
> >> -             swap_list.next = next;
> >> +             /*
> >> +              * rotate the current swap_info that we're going to use
> >> +              * to after any other swap_info that have the same prio,
> >> +              * so that all equal-priority swap_info get used equally
> >> +              */
> >> +             next = si;
> >> +             list_for_each_entry_continue(next, &swap_list_head, list) {
> >> +                     if (si->prio != next->prio)
> >> +                             break;
> >> +                     list_rotate_left(&si->list);
> >> +                     next = si;
> >> +             }
> >>
> >
> > The list manipulations will be a lot of cache writes as the list is shuffled
> > around. On slow storage I do not think this will be noticable but it may
> > be noticable on faster swap devices that are SSD based. I've added Shaohua
> > Li to the cc as he has been concerned with the performance of swap in the
> > past. Shaohua, can you run this patchset through any of your test cases
> > with the addition that multiple swap files are used to see if the cache
> > writes are noticable? You'll need multiple swap files, some of which are
> > at equal priority so the list shuffling logic is triggered.
> 
> One performance improvement could be instead of rotating the current
> entry past each following same-prio entry, just scan to the end of the
> same-prio entries and move the current entry there; that would skip
> the extra writes.  Especially since this code will run for each
> get_swap_page(), no need for any unnecessary writes.
> 

Shaohua is the person that would be most sensitive to performance problems
in this area and his tests are in the clear. If he's happy then I don't
think there is justification for changing the patch as-is.

> >
> >>               spin_unlock(&swap_lock);
> >>               /* This is called for allocating swap entry for cache */
> >>               offset = scan_swap_map(si, SWAP_HAS_CACHE);
> >>               spin_unlock(&si->lock);
> >>               if (offset)
> >> -                     return swp_entry(type, offset);
> >> +                     return swp_entry(si->type, offset);
> >>               spin_lock(&swap_lock);
> >> -             next = swap_list.next;
> >> +             /*
> >> +              * shouldn't really have got here, but for some reason the
> >> +              * scan_swap_map came back empty for this swap_info.
> >> +              * Since we dropped the swap_lock, there may now be
> >> +              * non-full higher prio swap_infos; let's start over.
> >> +              */
> >> +             tmp = &swap_list_head;
> >>       }
> >
> > Has this ever triggered? The number of swap pages was examined under the
> > swap lock so no other process should have been iterating through the
> > swap files. Once a candidate was found, the si lock was acquired for the
> > swap scan map so nothing else should have raced with it.
> 
> Well scan_swap_map() does drop the si->lock if it has any trouble at
> all finding an offset to use, so I think it's possible that for a
> nearly-full si multiple concurrent get_swap_page() calls could enter
> scan_swap_map() with the same si, only some of them actually get pages
> from the si and then the si becomes full, and the other threads in
> scan_swap_map() see it's full and exit in failure.  I can update the
> code comment there to better indicate why it was reached, instead of
> just saying "we shouldn't have got here" :-)
> 

With the updates to some comments then feel free to add

Acked-by: Mel Gorman <mgorman@suse.de>

> It may also be worth trying to get a better indicator of "available"
> swap_info_structs for use in get_swap_page(), either by looking at
> something other than si->highest_bit and/or keeping the si out of the
> prio_list until it's actually available for use, not just has a single
> entry free.  However, that probably won't be simple and might be
> better as a separate patch to the rest of these changes.
> 

I agree that it is likely outside the scope of what this series is meant
to accomplish.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@suse.de>
To: Dan Streetman <ddstreet@ieee.org>
Cc: Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>,
	Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
	Shaohua Li <shli@fusionio.com>, Weijie Yang <weijieut@gmail.com>,
	Linux-MM <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] swap: change swap_info singly-linked list to list_head
Date: Fri, 25 Apr 2014 09:38:30 +0100	[thread overview]
Message-ID: <20140425083830.GY23991@suse.de> (raw)
In-Reply-To: <CALZtONCa3jLrYkPSFPNnV84zePxFtdkWJBu092ScgUe2AugMxQ@mail.gmail.com>

On Thu, Apr 24, 2014 at 02:48:43PM -0400, Dan Streetman wrote:
> >> <SNIP>
> >> -             }
> >> -
> >> +     list_for_each(tmp, &swap_list_head) {
> >> +             si = list_entry(tmp, typeof(*si), list);
> >>               spin_lock(&si->lock);
> >> -             if (!si->highest_bit) {
> >> -                     spin_unlock(&si->lock);
> >> -                     continue;
> >> -             }
> >> -             if (!(si->flags & SWP_WRITEOK)) {
> >> +             if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
> >>                       spin_unlock(&si->lock);
> >>                       continue;
> >>               }
> >>
> >> -             swap_list.next = next;
> >> +             /*
> >> +              * rotate the current swap_info that we're going to use
> >> +              * to after any other swap_info that have the same prio,
> >> +              * so that all equal-priority swap_info get used equally
> >> +              */
> >> +             next = si;
> >> +             list_for_each_entry_continue(next, &swap_list_head, list) {
> >> +                     if (si->prio != next->prio)
> >> +                             break;
> >> +                     list_rotate_left(&si->list);
> >> +                     next = si;
> >> +             }
> >>
> >
> > The list manipulations will be a lot of cache writes as the list is shuffled
> > around. On slow storage I do not think this will be noticable but it may
> > be noticable on faster swap devices that are SSD based. I've added Shaohua
> > Li to the cc as he has been concerned with the performance of swap in the
> > past. Shaohua, can you run this patchset through any of your test cases
> > with the addition that multiple swap files are used to see if the cache
> > writes are noticable? You'll need multiple swap files, some of which are
> > at equal priority so the list shuffling logic is triggered.
> 
> One performance improvement could be instead of rotating the current
> entry past each following same-prio entry, just scan to the end of the
> same-prio entries and move the current entry there; that would skip
> the extra writes.  Especially since this code will run for each
> get_swap_page(), no need for any unnecessary writes.
> 

Shaohua is the person that would be most sensitive to performance problems
in this area and his tests are in the clear. If he's happy then I don't
think there is justification for changing the patch as-is.

> >
> >>               spin_unlock(&swap_lock);
> >>               /* This is called for allocating swap entry for cache */
> >>               offset = scan_swap_map(si, SWAP_HAS_CACHE);
> >>               spin_unlock(&si->lock);
> >>               if (offset)
> >> -                     return swp_entry(type, offset);
> >> +                     return swp_entry(si->type, offset);
> >>               spin_lock(&swap_lock);
> >> -             next = swap_list.next;
> >> +             /*
> >> +              * shouldn't really have got here, but for some reason the
> >> +              * scan_swap_map came back empty for this swap_info.
> >> +              * Since we dropped the swap_lock, there may now be
> >> +              * non-full higher prio swap_infos; let's start over.
> >> +              */
> >> +             tmp = &swap_list_head;
> >>       }
> >
> > Has this ever triggered? The number of swap pages was examined under the
> > swap lock so no other process should have been iterating through the
> > swap files. Once a candidate was found, the si lock was acquired for the
> > swap scan map so nothing else should have raced with it.
> 
> Well scan_swap_map() does drop the si->lock if it has any trouble at
> all finding an offset to use, so I think it's possible that for a
> nearly-full si multiple concurrent get_swap_page() calls could enter
> scan_swap_map() with the same si, only some of them actually get pages
> from the si and then the si becomes full, and the other threads in
> scan_swap_map() see it's full and exit in failure.  I can update the
> code comment there to better indicate why it was reached, instead of
> just saying "we shouldn't have got here" :-)
> 

With the updates to some comments then feel free to add

Acked-by: Mel Gorman <mgorman@suse.de>

> It may also be worth trying to get a better indicator of "available"
> swap_info_structs for use in get_swap_page(), either by looking at
> something other than si->highest_bit and/or keeping the si out of the
> prio_list until it's actually available for use, not just has a single
> entry free.  However, that probably won't be simple and might be
> better as a separate patch to the rest of these changes.
> 

I agree that it is likely outside the scope of what this series is meant
to accomplish.

-- 
Mel Gorman
SUSE Labs

  parent reply	other threads:[~2014-04-25  8:38 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-13 10:42 [PATCH] mm: swap: Use swapfiles in priority order Mel Gorman
2014-02-13 10:42 ` Mel Gorman
2014-02-13 15:58 ` Weijie Yang
2014-02-13 15:58   ` Weijie Yang
2014-02-14 10:17   ` Mel Gorman
2014-02-14 10:17     ` Mel Gorman
2014-02-14 13:33     ` Weijie Yang
2014-02-14 13:33       ` Weijie Yang
2014-02-14 13:10   ` Christian Ehrhardt
2014-02-16  2:59     ` Weijie Yang
2014-02-24  8:28       ` Hugh Dickins
2014-02-24  8:28         ` Hugh Dickins
2014-04-12 21:00         ` [PATCH 0/2] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-04-12 21:00           ` Dan Streetman
2014-04-12 21:00           ` [PATCH 1/2] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-04-12 21:00             ` Dan Streetman
2014-04-23 10:34             ` Mel Gorman
2014-04-23 10:34               ` Mel Gorman
2014-04-24  0:17               ` Shaohua Li
2014-04-24  0:17                 ` Shaohua Li
2014-04-24  8:30                 ` Mel Gorman
2014-04-24  8:30                   ` Mel Gorman
2014-04-24 18:48               ` Dan Streetman
2014-04-24 18:48                 ` Dan Streetman
2014-04-25  4:15                 ` Weijie Yang
2014-04-25  4:15                   ` Weijie Yang
2014-05-02 20:00                   ` Dan Streetman
2014-05-02 20:00                   ` Dan Streetman
2014-05-02 20:00                     ` Dan Streetman
2014-05-04  9:39                     ` Bob Liu
2014-05-04  9:39                       ` Bob Liu
2014-05-04 20:16                       ` Dan Streetman
2014-05-04 20:16                         ` Dan Streetman
2014-05-04 20:16                       ` Dan Streetman
2014-05-04  9:39                     ` Bob Liu
2014-04-25  8:38                 ` Mel Gorman [this message]
2014-04-25  8:38                   ` Mel Gorman
2014-04-12 21:00           ` [PATCH 2/2] swap: use separate priority list for available swap_infos Dan Streetman
2014-04-12 21:00             ` Dan Streetman
2014-04-23 13:14             ` Mel Gorman
2014-04-23 13:14               ` Mel Gorman
2014-04-24 17:52               ` Dan Streetman
2014-04-24 17:52                 ` Dan Streetman
2014-04-25  8:49                 ` Mel Gorman
2014-04-25  8:49                   ` Mel Gorman
2014-05-02 19:02           ` [PATCHv2 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-02 19:02             ` Dan Streetman
2014-05-02 19:02             ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-02 19:02               ` Dan Streetman
2014-05-02 19:02             ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-02 19:02               ` Dan Streetman
2014-05-12 10:35               ` Mel Gorman
2014-05-12 10:35                 ` Mel Gorman
2014-05-02 19:02             ` [PATCH 3/4] plist: add plist_rotate Dan Streetman
2014-05-02 19:02               ` Dan Streetman
2014-05-06  2:18               ` Steven Rostedt
2014-05-06  2:18                 ` Steven Rostedt
2014-05-06 20:12                 ` Dan Streetman
2014-05-06 20:12                   ` Dan Streetman
2014-05-06 20:39                   ` Steven Rostedt
2014-05-06 20:39                     ` Steven Rostedt
2014-05-06 21:47                     ` Dan Streetman
2014-05-06 21:47                       ` Dan Streetman
2014-05-06 22:43                       ` Steven Rostedt
2014-05-06 22:43                         ` Steven Rostedt
2014-05-02 19:02             ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-02 19:02               ` Dan Streetman
2014-05-05 15:51               ` Dan Streetman
2014-05-05 15:51                 ` Dan Streetman
2014-05-05 19:13               ` Steven Rostedt
2014-05-05 19:13                 ` Steven Rostedt
2014-05-05 19:38                 ` Peter Zijlstra
2014-05-09 20:42                 ` [PATCH] plist: make CONFIG_DEBUG_PI_LIST selectable Dan Streetman
2014-05-09 20:42                   ` Dan Streetman
2014-05-09 21:17                   ` Steven Rostedt
2014-05-09 21:17                     ` Steven Rostedt
2014-05-12 11:11               ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Mel Gorman
2014-05-12 11:11                 ` Mel Gorman
2014-05-12 13:00                 ` Dan Streetman
2014-05-12 13:00                   ` Dan Streetman
2014-05-12 16:38             ` [PATCHv3 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-12 16:38               ` Dan Streetman
2014-05-12 16:38               ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-12 16:38                 ` Dan Streetman
2014-05-12 16:38               ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-12 16:38                 ` Dan Streetman
2014-05-12 16:38               ` [PATCHv2 3/4] plist: add plist_requeue Dan Streetman
2014-05-12 16:38                 ` Dan Streetman
2014-05-13 10:33                 ` Mel Gorman
2014-05-13 10:33                   ` Mel Gorman
2014-05-12 16:38               ` [PATCHv2 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-12 16:38                 ` Dan Streetman
2014-05-13 10:34                 ` Mel Gorman
2014-05-13 10:34                   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140425083830.GY23991@suse.de \
    --to=mgorman@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=ddstreet@ieee.org \
    --cc=ehrhardt@linux.vnet.ibm.com \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=shli@fusionio.com \
    --cc=weijieut@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.