From: Mike Christie <mchristi@redhat.com>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: target-devel <target-devel@vger.kernel.org>,
fw@strlen.de, Linux-netdev <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>
Subject: Re: iscsi target regression due to "tcp: remove prequeue support" patch
Date: Mon, 15 Jan 2018 20:33:22 +0000 [thread overview]
Message-ID: <5A5D1012.9000303@redhat.com> (raw)
In-Reply-To: <1515998504.24576.187.camel@haakon3.daterainc.com>
On 01/15/2018 12:41 AM, Nicholas A. Bellinger wrote:
> Hey MNC & Co,
>
> Ping on the earlier iscsi-target authentication login failures atop
> 4.14 + commit e7942d063 removing tcp prequeue support.
>
> For reference, what is your pre 4.14 environment using for
> sysctl_tcp_low_latency..?
tcp_low_latency=1.
I have tested the current kernel with 1 and 0, and it does not make a
difference.
>
> netdev folks, how would you like to proceed for -rc1..?
>
> On Mon, 2018-01-08 at 22:32 -0800, Nicholas A. Bellinger wrote:
>> Hi MNC & Florian,
>>
>> (Adding net-dev + DaveM CC')
>>
>> Catching up on pre-holiday threads, thanks for the heads up.
>>
>> Comments below.
>>
>> On Wed, 2017-12-13 at 23:56 -0600, Mike Christie wrote:
>>> Hey Nick and Florian,
>>>
>>> Starting in 4.14 iscsi logins will fail around 50% of the time.
>>>
>>> I git bisected the issue down to this commit:
>>>
>>> commit e7942d0633c47c791ece6afa038be9cf977226de
>>> Author: Florian Westphal <fw@strlen.de>
>>> Date: Sun Jul 30 03:57:18 2017 +0200
>>>
>>> tcp: remove prequeue support
>>>
>>> Nick, attached is the iscsi target log info when the login fails.
>>>
>>> You can see at:
>>>
>>> Dec 13 17:55:01 rhel73n1 kernel: Got Login Command, Flags 0x81, ITT:
>>> 0x00000000, CmdSN: 0x00000000, ExpStatSN: 0xf86dc69b, CID: 0, Length: 65
>>>
>>> we have got a login command and we seem to then go into
>>> iscsit_do_rx_data -> sock_recvmsg
>>>
>>> We seem to get stuck in there though, because we stay blocked until:
>>>
>>> Dec 13 17:55:01 rhel73n1 kernel: Entering iscsi_target_sk_data_ready:
>>> conn: ffff88b35cbb3000
>>> Dec 13 17:55:01 rhel73n1 kernel: Got LOGIN_FLAGS_READ_ACTIVE=1, conn:
>>> ffff88b35cbb3000 >>>>
>>>
>>> where initiator side timeout fires 15 seconds later and it disconnects
>>> the tcp connection, and we eventually break out of the recvmsg call:
>>>
>>> Dec 13 17:55:16 rhel73n1 kernel: Entering iscsi_target_sk_state_change
>>> Dec 13 17:55:16 rhel73n1 kernel: __iscsi_target_sk_check_close:
>>> TCP_CLOSE_WAIT|TCP_CLOSE,returning FALSE
>>>
>>> ....
>>>
>>> Dec 13 17:55:16 rhel73n1 kernel: rx_loop: 68, total_rx: 68, data: 68
>>> Dec 13 17:55:16 rhel73n1 kernel: iscsi_target_do_login_rx after
>>> rx_login_io, ffff88b35cbb3000, kworker/2:2:1829
>>>
>>
>> Ok, the 3rd third login request payload (65 + 3 padded to 68 bytes)
>> containing CHAP_N + CHAP_R keys remains blocked on sock_recvmsg(), until
>> TPG login_timeout subsequently fires after 15 seconds of inactivity to
>> terminate this login attempt.
>>
>>> Is the iscsi target doing something incorrect in its use of
>>> sk_data_ready and sock_recvmsg or is the tcp patch at fault?
>>
>> From the logs, sk_data_ready() -> iscsi_target_sk_data_ready() callbacks
>> appear firing as expected.
>>
>> iscsi-target login does iscsit_rx_do_data() -> rx_data() ->
>> sock_recvmsg(..., MSG_WAITALL) from a system_wq kworker process context
>> after iscsi_target_sk_data_ready() callback queues up
>> iscsi_conn->login_work for execution, and sock_recvmsg() uses a single
>> struct kvec iovec for struct msg_hdr.
>>
>> AFAICT, iscsi-target uses blocking kernel socket reads from process
>> context, similar to kernel_recvmsg(..., MSG_WAITALL) with DRBD.
>>
>> Florian + DaveM, any idea why the removal of prequeue support is having
>> an effect here..?
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe target-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
WARNING: multiple messages have this Message-ID (diff)
From: Mike Christie <mchristi@redhat.com>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: target-devel <target-devel@vger.kernel.org>,
fw@strlen.de, Linux-netdev <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>
Subject: Re: iscsi target regression due to "tcp: remove prequeue support" patch
Date: Mon, 15 Jan 2018 14:33:22 -0600 [thread overview]
Message-ID: <5A5D1012.9000303@redhat.com> (raw)
In-Reply-To: <1515998504.24576.187.camel@haakon3.daterainc.com>
On 01/15/2018 12:41 AM, Nicholas A. Bellinger wrote:
> Hey MNC & Co,
>
> Ping on the earlier iscsi-target authentication login failures atop
> 4.14 + commit e7942d063 removing tcp prequeue support.
>
> For reference, what is your pre 4.14 environment using for
> sysctl_tcp_low_latency..?
tcp_low_latency=1.
I have tested the current kernel with 1 and 0, and it does not make a
difference.
>
> netdev folks, how would you like to proceed for -rc1..?
>
> On Mon, 2018-01-08 at 22:32 -0800, Nicholas A. Bellinger wrote:
>> Hi MNC & Florian,
>>
>> (Adding net-dev + DaveM CC')
>>
>> Catching up on pre-holiday threads, thanks for the heads up.
>>
>> Comments below.
>>
>> On Wed, 2017-12-13 at 23:56 -0600, Mike Christie wrote:
>>> Hey Nick and Florian,
>>>
>>> Starting in 4.14 iscsi logins will fail around 50% of the time.
>>>
>>> I git bisected the issue down to this commit:
>>>
>>> commit e7942d0633c47c791ece6afa038be9cf977226de
>>> Author: Florian Westphal <fw@strlen.de>
>>> Date: Sun Jul 30 03:57:18 2017 +0200
>>>
>>> tcp: remove prequeue support
>>>
>>> Nick, attached is the iscsi target log info when the login fails.
>>>
>>> You can see at:
>>>
>>> Dec 13 17:55:01 rhel73n1 kernel: Got Login Command, Flags 0x81, ITT:
>>> 0x00000000, CmdSN: 0x00000000, ExpStatSN: 0xf86dc69b, CID: 0, Length: 65
>>>
>>> we have got a login command and we seem to then go into
>>> iscsit_do_rx_data -> sock_recvmsg
>>>
>>> We seem to get stuck in there though, because we stay blocked until:
>>>
>>> Dec 13 17:55:01 rhel73n1 kernel: Entering iscsi_target_sk_data_ready:
>>> conn: ffff88b35cbb3000
>>> Dec 13 17:55:01 rhel73n1 kernel: Got LOGIN_FLAGS_READ_ACTIVE=1, conn:
>>> ffff88b35cbb3000 >>>>
>>>
>>> where initiator side timeout fires 15 seconds later and it disconnects
>>> the tcp connection, and we eventually break out of the recvmsg call:
>>>
>>> Dec 13 17:55:16 rhel73n1 kernel: Entering iscsi_target_sk_state_change
>>> Dec 13 17:55:16 rhel73n1 kernel: __iscsi_target_sk_check_close:
>>> TCP_CLOSE_WAIT|TCP_CLOSE,returning FALSE
>>>
>>> ....
>>>
>>> Dec 13 17:55:16 rhel73n1 kernel: rx_loop: 68, total_rx: 68, data: 68
>>> Dec 13 17:55:16 rhel73n1 kernel: iscsi_target_do_login_rx after
>>> rx_login_io, ffff88b35cbb3000, kworker/2:2:1829
>>>
>>
>> Ok, the 3rd third login request payload (65 + 3 padded to 68 bytes)
>> containing CHAP_N + CHAP_R keys remains blocked on sock_recvmsg(), until
>> TPG login_timeout subsequently fires after 15 seconds of inactivity to
>> terminate this login attempt.
>>
>>> Is the iscsi target doing something incorrect in its use of
>>> sk_data_ready and sock_recvmsg or is the tcp patch at fault?
>>
>> From the logs, sk_data_ready() -> iscsi_target_sk_data_ready() callbacks
>> appear firing as expected.
>>
>> iscsi-target login does iscsit_rx_do_data() -> rx_data() ->
>> sock_recvmsg(..., MSG_WAITALL) from a system_wq kworker process context
>> after iscsi_target_sk_data_ready() callback queues up
>> iscsi_conn->login_work for execution, and sock_recvmsg() uses a single
>> struct kvec iovec for struct msg_hdr.
>>
>> AFAICT, iscsi-target uses blocking kernel socket reads from process
>> context, similar to kernel_recvmsg(..., MSG_WAITALL) with DRBD.
>>
>> Florian + DaveM, any idea why the removal of prequeue support is having
>> an effect here..?
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe target-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
next prev parent reply other threads:[~2018-01-15 20:33 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-12-14 5:56 iscsi target regression due to "tcp: remove prequeue support" patch Mike Christie
2018-01-09 6:32 ` Nicholas A. Bellinger
2018-01-09 6:32 ` Nicholas A. Bellinger
2018-01-15 6:41 ` Nicholas A. Bellinger
2018-01-15 6:41 ` Nicholas A. Bellinger
2018-01-15 20:33 ` Mike Christie [this message]
2018-01-15 20:33 ` Mike Christie
2018-01-18 8:41 ` Nicholas A. Bellinger
2018-01-18 8:41 ` Nicholas A. Bellinger
2018-01-15 10:41 ` Florian Westphal
2018-01-18 8:38 ` Nicholas A. Bellinger
2018-01-18 8:38 ` Nicholas A. Bellinger
2018-01-18 15:10 ` Florian Westphal
2018-01-18 15:10 ` Florian Westphal
2018-01-19 4:28 ` Mike Christie
2018-01-19 4:28 ` Mike Christie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5A5D1012.9000303@redhat.com \
--to=mchristi@redhat.com \
--cc=davem@davemloft.net \
--cc=fw@strlen.de \
--cc=nab@linux-iscsi.org \
--cc=netdev@vger.kernel.org \
--cc=target-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.