From: Mike Christie <michaelc@cs.wisc.edu>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Linus Torvalds <torvalds@osdl.org>,
	Andrew Morton <akpm@osdl.org>, David Miller <davem@davemloft.net>,
	Rik van Riel <riel@redhat.com>,
	Daniel Phillips <phillips@google.com>
Subject: Re: [PATCH 20/20] iscsi: support for swapping over iSCSI.
Date: Thu, 14 Sep 2006 16:00:15 -0500
Message-ID: <4509C2DF.8000007@cs.wisc.edu>
In-Reply-To: <1158266150.30737.92.camel@taijtu>

Peter Zijlstra wrote:
> On Thu, 2006-09-14 at 14:22 -0500, Mike Christie wrote:
>> Peter Zijlstra wrote:
>>> On Wed, 2006-09-13 at 15:50 -0500, Mike Christie wrote:
>>>> Peter Zijlstra wrote:
>>>>> Implement sht->swapdev() for iSCSI. This method takes care of reserving
>>>>> the extra memory needed and marking all relevant sockets with SOCK_VMIO.
>>>>>
>>>>> When used for swapping, TCP socket creation is done under GFP_MEMALLOC and
>>>>> the TCP connect is done with SOCK_VMIO to ensure their success. The
>>>>> netlink userspace interface is also marked SOCK_VMIO; this ensures that
>>>>> even under pressure we can still communicate with the daemon (which runs
>>>>> under mlockall() and needs no additional memory to operate).
>>>>>
>>>>> Netlink requests are handled under the new PF_MEM_NOWAIT when a swapper is
>>>>> present. This ensures that the netlink socket will not block. User-space will
>>>>> need to retry failed requests.
>>>>>
>>>>> The TCP receive path is handled under PF_MEMALLOC for SOCK_VMIO sockets.
>>>>> This makes sure we do not block the critical socket, and that we do not
>>>>> fail to process incoming data.
>>>>>
>>>>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>>>> CC: Mike Christie <michaelc@cs.wisc.edu>
>>>>> ---
>>>>>  drivers/scsi/iscsi_tcp.c            |  103 +++++++++++++++++++++++++++++++-----
>>>>>  drivers/scsi/scsi_transport_iscsi.c |   23 +++++++-
>>>>>  include/scsi/libiscsi.h             |    1 
>>>>>  include/scsi/scsi_transport_iscsi.h |    2 
>>>>>  4 files changed, 113 insertions(+), 16 deletions(-)
>>>>>
>>>>> Index: linux-2.6/drivers/scsi/iscsi_tcp.c
>>>>> ===================================================================
>>>>> --- linux-2.6.orig/drivers/scsi/iscsi_tcp.c
>>>>> +++ linux-2.6/drivers/scsi/iscsi_tcp.c
>>>>> @@ -42,6 +42,7 @@
>>>>>  #include <scsi/scsi_host.h>
>>>>>  #include <scsi/scsi.h>
>>>>>  #include <scsi/scsi_transport_iscsi.h>
>>>>> +#include <scsi/scsi_device.h>
>>>>>  
>>>>>  #include "iscsi_tcp.h"
>>>>>  
>>>>> @@ -845,9 +846,13 @@ iscsi_tcp_data_recv(read_descriptor_t *r
>>>>>  	int rc;
>>>>>  	struct iscsi_conn *conn = rd_desc->arg.data;
>>>>>  	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
>>>>> -	int processed;
>>>>> +	int processed = 0;
>>>>>  	char pad[ISCSI_PAD_LEN];
>>>>>  	struct scatterlist sg;
>>>>> +	unsigned long pflags = current->flags;
>>>>> +
>>>>> +	if (sk_has_vmio(tcp_conn->sock->sk))
>>>>> +		current->flags |= PF_MEMALLOC;
>>>>>  
>>>> Is this too late or not needed or what is it for? This function gets run
>>>> from the network layer's softirq and at this point we have a skbuff with
>>>> data that we want to process. The iscsi layer also does not allocate
>>>> memory for read or write IO in this path.
>>> I thought I found allocations in that path, lemme search...
>>> found this:
>>>
>>> iscsi_tcp_data_recv()
>>>   iscsi_data_recv()
>>>     iscsi_complete_pdu()
>>>       __iscsi_complete_pdu()
>>>         iscsi_recv_pdu()
>>>           alloc_skb( GFP_ATOMIC);
>>>
>> You are right, that is for the netlink interface. Could we move the
>> PF_MEMALLOC setting and clearing to iscsi_recv_pdu and add it to
>> iscsi_conn_error in scsi_transport_iscsi.c, so that iscsi_iser and
>> qla4xxx will have it set when they need it? I will send a patch for this
>> along with a way to have the netlink sock vmio set for all iscsi drivers
>> that need it.
> 
> I already have such a patch, look at:
> http://programming.kicks-ass.net/kernel-patches/vm_deadlock/current/iscsi_vmio.patch
> 

You are drowning me in patches :) I did not see that one. I was still
commenting on this patch :)

The new patch looks ok.


> but what conditional do you want to use for PF_MEMALLOC, an
> unconditional setting will be highly unpopular.
> 
> Hmm, perhaps you could key it off sk_has_vmio(nls)...

Yes.
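
For illustration, the conditional set-and-restore discussed above can be
modelled in plain userspace C. This is only a sketch under stated
assumptions, not the code from either patch: mock_task, mock_sock,
mock_current, mock_sk_has_vmio(), mock_alloc_skb() and mock_recv_pdu() are
hypothetical stand-ins for current, the netlink socket, sk_has_vmio(),
alloc_skb() and iscsi_recv_pdu(), and the PF_MEMALLOC value is used purely
as an example bit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Example bit only; the real definition lives in the kernel headers. */
#define PF_MEMALLOC 0x00000800UL

/* Hypothetical stand-ins for the task and socket state. */
struct mock_task { unsigned long flags; };
struct mock_sock { bool vmio; };

/* Plays the role of "current" in this userspace model. */
static struct mock_task mock_current;

/* Records the task flags that were in effect during the allocation. */
static unsigned long seen_flags;

/* Stand-in for sk_has_vmio(): true if the socket is marked for VM I/O. */
static bool mock_sk_has_vmio(const struct mock_sock *sk)
{
	return sk != NULL && sk->vmio;
}

/* Stand-in for alloc_skb(GFP_ATOMIC): only notes the flags it ran under. */
static void mock_alloc_skb(void)
{
	seen_flags = mock_current.flags;
}

/*
 * Model of the proposal: raise PF_MEMALLOC around the allocation only
 * when the netlink socket is marked VMIO, then put the bit back to
 * whatever state it had on entry.
 */
static void mock_recv_pdu(const struct mock_sock *nls)
{
	unsigned long pflags = mock_current.flags;

	if (mock_sk_has_vmio(nls))
		mock_current.flags |= PF_MEMALLOC;

	mock_alloc_skb();

	/* Restore only the PF_MEMALLOC bit; leave other flags alone. */
	mock_current.flags = (mock_current.flags & ~PF_MEMALLOC) |
			     (pflags & PF_MEMALLOC);
}
```

Restoring just the one bit, rather than writing back the whole saved word,
keeps any other flag that changed during the call intact, and a caller that
already had PF_MEMALLOC set keeps it afterwards.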
