From: Mel Gorman <mgorman@suse.de>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: trond.myklebust@primarydata.com, linux-nfs@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Jerome Marchand <jmarchan@redhat.com>
Subject: Re: [PATCH 3/4] sunrpc: if we're closing down a socket, clear memalloc on it first
Date: Thu, 4 Jun 2015 14:08:30 +0100 [thread overview]
Message-ID: <20150604130830.GH26425@suse.de> (raw)
In-Reply-To: <20150603103200.4f66bae5@synchrony.poochiereds.net>
On Wed, Jun 03, 2015 at 10:32:00AM -0400, Jeff Layton wrote:
> On Tue, 2 Jun 2015 13:40:26 +0100
> Mel Gorman <mgorman@suse.de> wrote:
>
> > On Sat, May 30, 2015 at 08:03:12AM -0400, Jeff Layton wrote:
> > > We currently increment the memalloc_socks counter if we have a xprt that
> > > is associated with a swapfile. That socket can be replaced however
> > > during a reconnect event, and the memalloc_socks counter is never
> > > decremented if that occurs.
> > >
> > > When tearing down a xprt socket, check to see if the xprt is set up for
> > > swapping and sk_clear_memalloc before releasing the socket if so.
> > >
> > > Cc: Mel Gorman <mgorman@suse.de>
> > > Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> >
> > Acked-by: Mel Gorman <mgorman@suse.de>
> >
>
> Thanks Mel,
>
> I should also mention that I see this warning pop when working with
> swapfiles on NFS. This trace is with this patchset, but I see a similar
> one without it:
>
> [ 74.232485] ------------[ cut here ]------------
> [ 74.233354] WARNING: CPU: 2 PID: 754 at net/core/sock.c:364 sk_clear_memalloc+0x51/0x80()
> [ 74.234790] Modules linked in: cts rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xfs libcrc32c snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device nfsd snd_pcm snd_timer snd e1000 ppdev parport_pc joydev parport pvpanic soundcore floppy serio_raw i2c_piix4 pcspkr nfs_acl lockd virtio_balloon acpi_cpufreq auth_rpcgss grace sunrpc qxl drm_kms_helper ttm drm virtio_console virtio_blk virtio_pci ata_generic virtio_ring pata_acpi virtio
> [ 74.243599] CPU: 2 PID: 754 Comm: swapoff Not tainted 4.1.0-rc6+ #5
> [ 74.244635] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ 74.245546] 0000000000000000 0000000079e69e31 ffff8800d066bde8 ffffffff8179263d
> [ 74.246786] 0000000000000000 0000000000000000 ffff8800d066be28 ffffffff8109e6fa
> [ 74.248175] 0000000000000000 ffff880118d48000 ffff8800d58f5c08 ffff880036e380a8
> [ 74.249483] Call Trace:
> [ 74.249872] [<ffffffff8179263d>] dump_stack+0x45/0x57
> [ 74.250703] [<ffffffff8109e6fa>] warn_slowpath_common+0x8a/0xc0
> [ 74.251655] [<ffffffff8109e82a>] warn_slowpath_null+0x1a/0x20
> [ 74.252585] [<ffffffff81661241>] sk_clear_memalloc+0x51/0x80
> [ 74.253519] [<ffffffffa0116c72>] xs_disable_swap+0x42/0x80 [sunrpc]
> [ 74.254537] [<ffffffffa01109de>] rpc_clnt_swap_deactivate+0x7e/0xc0 [sunrpc]
> [ 74.255610] [<ffffffffa03e4fd7>] nfs_swap_deactivate+0x27/0x30 [nfs]
> [ 74.256582] [<ffffffff811e99d4>] destroy_swap_extents+0x74/0x80
> [ 74.257496] [<ffffffff811ecb52>] SyS_swapoff+0x222/0x5c0
> [ 74.258318] [<ffffffff81023f27>] ? syscall_trace_leave+0xc7/0x140
> [ 74.259253] [<ffffffff81798dae>] system_call_fastpath+0x12/0x71
> [ 74.260158] ---[ end trace 2530722966429f10 ]---
>
> ...that comes from this in sk_clear_memalloc:
>
> /*
> * SOCK_MEMALLOC is allowed to ignore rmem limits to ensure forward
> * progress of swapping. However, if SOCK_MEMALLOC is cleared while
> * it has rmem allocations there is a risk that the user of the
> * socket cannot make forward progress due to exceeding the rmem
> * limits. By rights, sk_clear_memalloc() should only be called
> * on sockets being torn down but warn and reset the accounting if
> * that assumption breaks.
> */
> if (WARN_ON(sk->sk_forward_alloc))
> sk_mem_reclaim(sk);
>
> Is it wrong to call sk_clear_memalloc on swapoff? Should we try to keep
> it set up as a memalloc socket on the last swapoff and just wait until
> the socket is being freed to clear it? If so, then maybe the right
> thing to do is to call sk_clear_memalloc in __sk_free or somewhere
> similar if it's set up for memalloc?
>
I think it is perfectly reasonable to remove the warning after your
series. When I had it in mind, I was primarily thinking of the shutdown
case and a single swap file. With your series applied, the disabling of
swap is called at the correct time. So, something like this to tack on
to the end of your series?
---8<---
net, swap: Remove a warning and clarify why sk_mem_reclaim is required when deactivating swap
Jeff Layton reported the following;
[ 74.232485] ------------[ cut here ]------------
[ 74.233354] WARNING: CPU: 2 PID: 754 at net/core/sock.c:364 sk_clear_memalloc+0x51/0x80()
[ 74.234790] Modules linked in: cts rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xfs libcrc32c snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device nfsd snd_pcm snd_timer snd e1000 ppdev parport_pc joydev parport pvpanic soundcore floppy serio_raw i2c_piix4 pcspkr nfs_acl lockd virtio_balloon acpi_cpufreq auth_rpcgss grace sunrpc qxl drm_kms_helper ttm drm virtio_console virtio_blk virtio_pci ata_generic virtio_ring pata_acpi virtio
[ 74.243599] CPU: 2 PID: 754 Comm: swapoff Not tainted 4.1.0-rc6+ #5
[ 74.244635] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 74.245546] 0000000000000000 0000000079e69e31 ffff8800d066bde8 ffffffff8179263d
[ 74.246786] 0000000000000000 0000000000000000 ffff8800d066be28 ffffffff8109e6fa
[ 74.248175] 0000000000000000 ffff880118d48000 ffff8800d58f5c08 ffff880036e380a8
[ 74.249483] Call Trace:
[ 74.249872] [<ffffffff8179263d>] dump_stack+0x45/0x57
[ 74.250703] [<ffffffff8109e6fa>] warn_slowpath_common+0x8a/0xc0
[ 74.251655] [<ffffffff8109e82a>] warn_slowpath_null+0x1a/0x20
[ 74.252585] [<ffffffff81661241>] sk_clear_memalloc+0x51/0x80
[ 74.253519] [<ffffffffa0116c72>] xs_disable_swap+0x42/0x80 [sunrpc]
[ 74.254537] [<ffffffffa01109de>] rpc_clnt_swap_deactivate+0x7e/0xc0 [sunrpc]
[ 74.255610] [<ffffffffa03e4fd7>] nfs_swap_deactivate+0x27/0x30 [nfs]
[ 74.256582] [<ffffffff811e99d4>] destroy_swap_extents+0x74/0x80
[ 74.257496] [<ffffffff811ecb52>] SyS_swapoff+0x222/0x5c0
[ 74.258318] [<ffffffff81023f27>] ? syscall_trace_leave+0xc7/0x140
[ 74.259253] [<ffffffff81798dae>] system_call_fastpath+0x12/0x71
[ 74.260158] ---[ end trace 2530722966429f10 ]---
The warning in question was unnecessary but with Jeff's series the rules
are also clearer. This patch removes the warning and updates the comment
to explain why sk_mem_reclaim() may still be called.
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
net/core/sock.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 71e3e5f1eaa0..1ebf706b5847 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -354,14 +354,12 @@ void sk_clear_memalloc(struct sock *sk)
/*
* SOCK_MEMALLOC is allowed to ignore rmem limits to ensure forward
- * progress of swapping. However, if SOCK_MEMALLOC is cleared while
- * it has rmem allocations there is a risk that the user of the
- * socket cannot make forward progress due to exceeding the rmem
- * limits. By rights, sk_clear_memalloc() should only be called
- * on sockets being torn down but warn and reset the accounting if
- * that assumption breaks.
+ * progress of swapping. SOCK_MEMALLOC may be cleared while
+ * it has rmem allocations due to the last swapfile being deactivated
+ * but there is a risk that the socket is unusable due to exceeding
+ * the rmem limits. Reclaim the reserves and obey rmem limits again.
*/
- if (WARN_ON(sk->sk_forward_alloc))
+ if (sk->sk_forward_alloc)
sk_mem_reclaim(sk);
}
EXPORT_SYMBOL_GPL(sk_clear_memalloc);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-06-04 13:08 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-30 12:03 [PATCH 0/4] sunrpc: clean up "swapper" xprt handling Jeff Layton
2015-05-30 12:03 ` [PATCH 1/4] sunrpc: keep a count of swapfiles associated with the rpc_clnt Jeff Layton
2015-06-02 12:36 ` Mel Gorman
2015-05-30 12:03 ` [PATCH 2/4] sunrpc: make xprt->swapper an atomic_t Jeff Layton
2015-05-30 17:55 ` Chuck Lever
2015-05-30 19:38 ` Jeff Layton
2015-05-30 12:03 ` [PATCH 3/4] sunrpc: if we're closing down a socket, clear memalloc on it first Jeff Layton
2015-06-02 12:38 ` Mel Gorman
2015-06-02 12:40 ` Mel Gorman
2015-06-03 14:32 ` Jeff Layton
2015-06-04 13:08 ` Mel Gorman [this message]
2015-06-04 14:25 ` Jeff Layton
2015-05-30 12:03 ` [PATCH 4/4] sunrpc: lock xprt before trying to set memalloc on the sockets Jeff Layton
2015-05-30 12:57 ` Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150604130830.GH26425@suse.de \
--to=mgorman@suse.de \
--cc=jlayton@poochiereds.net \
--cc=jmarchan@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@primarydata.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).