From: Asdo <asdo@shiftmail.org>
To: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Cc: e1000-devel@lists.sourceforge.net, Netdev <netdev@vger.kernel.org>
Subject: Re: TCP sockets stalling - help! (long)
Date: Wed, 25 Nov 2009 16:36:28 +0100
Message-ID: <4B0D4EFC.4050302@shiftmail.org>
In-Reply-To: <4B0D2C0E.9050101@shiftmail.org>
Asdo wrote:
> Ilpo Järvinen wrote:
>
>> ...I'd next try strace the sftp server to see what it was doing
>> during the stall.
>>
>>
> Thanks for your help Ilpo
>
> Isn't the strace equivalent to the stack trace I obtained via cat
> /proc/pid/stack reported previously? That was at the time of the stall
>
> I'm thinking the strace would slow down sftp-server very deeply...
>
I found out that if I attach strace to the running process I can at
least see the last system call.
The SFTP transfer is hung right now, so I did this:
root@mystorage:/root# strace -p 11475
Process 11475 attached - interrupt to quit
select(5, [3], [], NULL, NULL
(stuck here forever... doesn't move)
(it's strange that the first argument of select is 5; going by
man select, shouldn't it be 4? A bug in strace maybe?)
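On the nfds question: select(2) wants the first argument to be the highest-numbered descriptor in any of the sets plus one, and a larger value is also accepted. The sketch below only does that arithmetic. The helper name `min_nfds` is made up for illustration, and the idea that sftp-server sizes nfds from both its input fd (3) and its output fd (4), even while fd 4 is not in the write set, is an assumption and has not been checked against the sftp-server source.

```python
def min_nfds(*fd_sets):
    """Smallest nfds value that covers every descriptor in the given sets."""
    highest = max((fd for fds in fd_sets for fd in fds), default=-1)
    return highest + 1

# Only fd 3 is in the read set that strace shows, so 4 would be enough.
print(min_nfds([3], []))    # 4

# Assumption: the caller computes nfds once from both its input fd (3)
# and its output fd (4); that gives the 5 seen in the strace output.
print(min_nfds([3], [4]))   # 5
```

If that assumption holds, the 5 is harmless and not a strace bug: the kernel simply checks one extra bit that is not set.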
root@mystorage:/root# cat /proc/11475/stack
[<ffffffff8112e644>] poll_schedule_timeout+0x34/0x50
[<ffffffff8112ef4f>] do_select+0x58f/0x6b0
[<ffffffff8112f8b5>] core_sys_select+0x185/0x2b0
[<ffffffff8112fc32>] sys_select+0x42/0x110
[<ffffffff8101225b>] tracesys+0xd9/0xde
[<ffffffffffffffff>] 0xffffffffffffffff
And this is from cat /proc/net/tcp:
2: 0F12A8C0:0016 2512A8C0:0FBD 01 00000000:00000000 02:00009144 00000000 0 0 5326251 2 ffff88085408ce00 26 4 1 9 4
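The /proc/net/tcp record is easier to read once decoded. This is a minimal sketch, assuming the column layout described in the kernel's proc_net_tcp documentation (local address, remote address, state, tx_queue:rx_queue, timer:when, retransmits, uid, timeout, inode) and a little-endian host, which matches the x86-64 addresses in the stack trace. The function names and the `TCP_STATES` subset are illustrative only.

```python
import socket
import struct

# Subset of the kernel's TCP state numbers (include/net/tcp_states.h).
TCP_STATES = {0x01: "ESTABLISHED", 0x06: "TIME_WAIT",
              0x08: "CLOSE_WAIT", 0x0A: "LISTEN"}

def decode_addr(field):
    """Turn 'HEXIP:HEXPORT' into (dotted_quad, port); assumes little-endian."""
    ip_hex, port_hex = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)

def decode_tcp_line(line):
    """Decode the leading columns of one /proc/net/tcp record."""
    f = line.split()
    tx_hex, rx_hex = f[4].split(":")
    timer_hex, _when_hex = f[5].split(":")
    state = int(f[3], 16)
    return {
        "local": decode_addr(f[1]),
        "remote": decode_addr(f[2]),
        "state": TCP_STATES.get(state, hex(state)),
        "tx_queue": int(tx_hex, 16),   # bytes queued for sending
        "rx_queue": int(rx_hex, 16),   # bytes received but not yet read
        "timer_active": int(timer_hex, 16),
        "inode": int(f[9]),
    }

LINE = ("2: 0F12A8C0:0016 2512A8C0:0FBD 01 00000000:00000000 02:00009144 "
        "00000000 0 0 5326251 2 ffff88085408ce00 26 4 1 9 4")

info = decode_tcp_line(LINE)
for key, value in info.items():
    print(f"{key:13} {value}")
```

For the line above this gives local 192.168.18.15:22, remote 192.168.18.37:4029, state ESTABLISHED, and both queues at zero. If this really is the stalled connection, an rx_queue of zero would suggest that, at that instant, no received bytes were waiting in this socket to be read.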
The select refers to open file descriptors, so here they are:
root@mystorage:/proc/11475/fd# ll
total 0
lr-x------ 1 ccosentino wetlab 64 2009-11-25 14:43 0 -> pipe:[5326309]
l-wx------ 1 ccosentino wetlab 64 2009-11-25 14:43 1 -> pipe:[5326310]
l-wx------ 1 ccosentino wetlab 64 2009-11-25 14:43 2 -> pipe:[5326311]
lr-x------ 1 ccosentino wetlab 64 2009-11-25 14:43 3 -> pipe:[5326309]
l-wx------ 1 ccosentino wetlab 64 2009-11-25 14:43 4 -> pipe:[5326310]
l-wx------ 1 ccosentino wetlab 64 2009-11-25 14:43 5 -> /path/to/file_being_saved.filepart
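To see which process holds the other end of one of these pipes, the same fd symlinks can be scanned for every PID. This is a rough sketch: the helper name `find_fd_holders` and its `proc_root` parameter are invented for illustration, and it needs to run as root to read other users' fd directories.

```python
import os

def find_fd_holders(target, proc_root="/proc"):
    """Map pid -> list of fd numbers whose symlink text equals `target`."""
    holders = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return holders
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as "net" or "self"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # fd was closed while we were scanning
            if link == target:
                holders.setdefault(int(entry), []).append(int(fd))
    return holders

if __name__ == "__main__":
    # The pipe that sftp-server is select()ing on as fd 3 in the listing.
    for pid, fds in sorted(find_fd_holders("pipe:[5326309]").items()):
        print(pid, sorted(fds))
```

In the listing above, fds 0 and 3 both point at pipe:[5326309] and fds 1 and 4 at pipe:[5326310], so PID 11475 should show up with fds 0 and 3, along with whichever process holds the write end.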
I tried sending SIGSTOP and then SIGCONT to see if I could force it to
go around its loop once and re-enter the select. I'm not sure it really
did that; what do you think? This is the strace:
root@mystorage:/root# strace -p 11475 2>&1 | tee sftpstrace.dmp
Process 11475 attached - interrupt to quit
select(5, [3], [], NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
select(5, [3], [], NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
--- SIGCONT (Continued) @ 0 (0) ---
select(5, [3], [], NULL, NULL
(hung again here)
Do you think this info is enough, or do I really have to strace it from
the beginning?
If it is a race condition, it might not happen when sftp-server is
heavily slowed down by strace.
If I had a way to make it continue right now we could get the rest of
the strace... But it's not so easy: I tried starting a Samba transfer,
but it did not unblock the SFTP this time. SIGSTOP + SIGCONT also didn't
work.
BTW, people using the Storage also experienced data loss while pushing
files to it: apparently data disappeared from the middle of a file they
were saving to the Storage.
To me this looks like another hint that application-level data which has
been received over the network by the TCP stack is trapped there and is
not being delivered to the application.
Or the data might even be trapped in the anonymous pipes between
sshd and sftp-server.
Thanks for your help