* Re: Killing process with SIGKILL and ncpfs
@ 2001-01-17 17:07 Petr Vandrovec
From: Petr Vandrovec @ 2001-01-17 17:07 UTC
To: Urban Widmark; +Cc: linux-kernel, marteen.deboer
On 17 Jan 01 at 13:41, Urban Widmark wrote:
> SIGKILL or SIGSTOP can be already pending, or perhaps received while
> waiting in socket->ops->recvmsg(). recvmsg will then return -ERESTARTSYS
> because signal_pending() is true and the smbfs code treats that as a
> network problem (causing unnecessary reconnects and sometimes complete
> failures requiring umount/mount).
Yes. I was going to rewrite this code sometime around 2.3.40, when
I wrote an independent NCP socket layer for another project. Unfortunately,
I found that returning -ERESTARTSYS from some procedures (read_inode)
turns the inode into a bad_inode, instead of just dropping/reverting all
the changes which were made :-( So I left it as is.
> Running strace on a multithreaded program causes problems for smbfs.
> Someone was nice enough to post a small testprogram for this (you may want
> to try it on ncpfs, if you want it I'll find it for you).
>
> These problems go away if all signals are blocked. Of course the smbfs
> code would need to be changed to not block on recv, else you may end up
> with a program waiting for network input that can't be killed ... (?)
I'm going to use:
    if (current->flags & PF_EXITING)
        mask = 0;
    else
        mask = sigmask(SIGKILL);
This has the following effects:
(1) you can still kill a bad task stuck in ncpfs with SIGKILL
(2) except when a connectivity problem happens during exit(). In that
    case the timeout should take care of it. There is no way around
    that, except more complicated code:
        if (current->flags & PF_EXITING) {
            lock(...)
            rm_sig_from_queue(SIGKILL, &current->pending);
            unlock(...)
        }
        mask = sigmask(SIGKILL);
    which allows you to 'kill -9' an exiting task again...
    But I cannot convince myself that SIGKILL is correctly delivered
    to PF_EXITING tasks... And rm_sig_from_queue is a signal.c internal
    function.
(3) attaching/detaching a debugger no longer causes problems, as SIGSTOP
    is ignored by ncpfs - I have no idea why the original code included
    SIGSTOP... Now the task just stops after the NCP request is done, so
    the only problem is that you cannot immediately stop a program which
    is waiting for a server reply. But I think that waiting a few
    milliseconds is better than killing the connection for the whole server
(4) so you can happily debug programs from ncpfs volumes
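
For illustration only, the mask selection described above can be modelled in plain userspace C. This is a sketch, not the ncpfs patch itself: `MODEL_PF_EXITING`, `model_sigmask()`, `ncp_interrupt_mask()` and `ncp_should_abort()` are hypothetical names invented here, and the task flags and pending signals are passed in as plain bitmasks instead of being read from `current`.

```c
#include <assert.h>
#include <signal.h>   /* SIGKILL, SIGSTOP */

/* Model value only: stands in for the kernel's PF_EXITING task flag. */
#define MODEL_PF_EXITING 0x00000004UL

/* Same idea as the kernel's sigmask(): signal N maps to bit N-1. */
static unsigned long model_sigmask(int sig)
{
    assert(sig >= 1 && sig <= 32);   /* only classic signals fit in this model */
    return 1UL << (sig - 1);
}

/*
 * Set of signals that may interrupt a pending request.
 * An exiting task gets an empty set, so a SIGKILL that is already pending
 * cannot abort the request; every other task is interruptible by SIGKILL
 * only (SIGSTOP is deliberately left out).
 */
unsigned long ncp_interrupt_mask(unsigned long task_flags)
{
    if (task_flags & MODEL_PF_EXITING)
        return 0UL;
    return model_sigmask(SIGKILL);
}

/* Returns 1 if any signal in `pending` is allowed to interrupt the wait. */
int ncp_should_abort(unsigned long task_flags, unsigned long pending)
{
    return (pending & ncp_interrupt_mask(task_flags)) != 0;
}
```

In this model, `ncp_should_abort(0, model_sigmask(SIGSTOP))` is 0, which corresponds to point (3): a stop request waits until the reply has arrived.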
If it is not too much trouble for you, can you find the multithreaded
test program for me? The current solution, which uses only SIGKILL, and only
when the task is not exiting, looks good to me, and works for my testcases.
There are some corner cases, such as when SIGKILL is already pending when
ncpfs is entered, but I'm not sure whether it is worth adding a check
of signal_pending() before the call to do_ncp_*rpc_call(), as it is then
no longer a four-line patch.
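
The corner case mentioned here (SIGKILL already pending on entry) could be handled by testing the mask before anything is sent. The following is a userspace sketch under stated assumptions, not kernel code: `struct model_task`, `model_ncp_request()` and `model_rpc_count()` are hypothetical, the RPC is a function pointer standing in for do_ncp_*rpc_call(), and returning -EINTR is an assumption, since the thread does not say which error the early exit would use.

```c
#include <assert.h>
#include <errno.h>    /* EINTR */
#include <signal.h>   /* SIGKILL */

#define MODEL_PF_EXITING 0x00000004UL   /* model value for the task flag */

struct model_task {
    unsigned long flags;     /* models current->flags */
    unsigned long pending;   /* models the pending-signal bitmask */
};

/* Signal N maps to bit N-1, as with the kernel's sigmask(). */
static unsigned long model_sigmask(int sig)
{
    assert(sig >= 1 && sig <= 32);
    return 1UL << (sig - 1);
}

/* Hypothetical transport hook standing in for do_ncp_*rpc_call(). */
typedef int (*model_rpc_fn)(void *conn);

/* Stub RPC for experiments: counts calls through an int and succeeds. */
int model_rpc_count(void *conn)
{
    ++*(int *)conn;
    return 0;
}

/*
 * Refuse to start a request if an allowed signal is already pending.
 * Nothing has been sent at that point, so the connection stays in sync.
 */
int model_ncp_request(struct model_task *t, void *conn, model_rpc_fn rpc)
{
    unsigned long mask =
        (t->flags & MODEL_PF_EXITING) ? 0UL : model_sigmask(SIGKILL);

    if (t->pending & mask)
        return -EINTR;       /* assumed error code, see lead-in */
    return rpc(conn);
}
```

An exiting task with SIGKILL pending still issues the request in this model, because its mask is empty; a live task with SIGKILL pending returns early without touching the connection.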
Best regards,
Petr Vandrovec
vandrove@vc.cvut.cz
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
* Killing process with SIGKILL and ncpfs
@ 2001-01-17 13:07 Petr Vandrovec
From: Petr Vandrovec @ 2001-01-17 13:07 UTC
To: linux-kernel; +Cc: marteen.deboer
Hi,
Maarten de Boer pointed out to me that if you load some simple program,
such as 'void main(void) {}', trace into main (break main; run)
and then quit from gdb (Really exit? yes), the child process is then
killed due to INT3 (probably). Then exit_mmap releases the executable
mapping - and ncp_do_request is entered with SIGKILL pending!
Trace; d18e8822 <[ncpfs]ncp_do_request+1e2/1f8>
Trace; d18e88a5 <[ncpfs]ncp_request2+6d/a0>
Trace; d18e7c3c <[ncpfs]ncp_make_closed+9c/c8>
Trace; d18e332e <[ncpfs]ncp_release+a/1c>
Trace; c01349c1 <fput+39/e8>
Trace; c012566e <exit_mmap+da/124>
Trace; c0115e54 <mmput+38/50>
Trace; c011a134 <do_exit+d0/2a8>
Trace; c0108e10 <do_signal+234/28c>
Trace; c011f032 <force_sig_info+9a/a4>
Trace; c011f24d <force_sig+11/18>
Trace; c0109581 <do_int3+35/78>
Trace; c0109088 <error_code+34/3c>
Trace; c0108fa4 <signal_return+14/18>
So my question is:
(1) should ncpfs ignore ALL signals (even SIGKILL/SIGSTOP) when
    the task is in PF_EXITING mode, or
(2) should the kernel clear all pending signals at the beginning of
    do_exit, or
(3) is it a gdb bug that they forget the 'int3' operation in the traced
    program?
Thanks,
Petr Vandrovec
vandrove@vc.cvut.cz
* Re: Killing process with SIGKILL and ncpfs
@ 2001-01-17 12:41 Urban Widmark
From: Urban Widmark @ 2001-01-17 12:41 UTC
To: Petr Vandrovec; +Cc: linux-kernel, marteen.deboer
On Wed, 17 Jan 2001, Petr Vandrovec wrote:
> Hi,
> Maarten de Boer pointed to me, that if you load some simple program,
> such as 'void main(void) {}', trace into main (break main; run)
> and then quit from gdb (Really exit? yes), child process is then
> killed due to INT3 (probably). Then exit_mmap releases executable
> mapping - and ncp_do_request is entered with SIGKILL pending!
smbfs has a signal problem in its fs/smbfs/sock.c, possibly related.
SIGKILL or SIGSTOP can be already pending, or perhaps received while
waiting in socket->ops->recvmsg(). recvmsg will then return -ERESTARTSYS
because signal_pending() is true and the smbfs code treats that as a
network problem (causing unnecessary reconnects and sometimes complete
failures requiring umount/mount).
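
The distinction described above, between a receive that was interrupted by a signal and one that failed because of the network, can be sketched as a small classifier. This is an illustrative userspace model only, not the smbfs code: `classify_recv_result()` and the `recv_action` names are hypothetical, and `MODEL_ERESTARTSYS` repeats the kernel-internal value 512 because userspace `<errno.h>` does not define it.

```c
#include <assert.h>
#include <errno.h>    /* EINTR, ECONNRESET */

/* Kernel-internal restart code; defined here because userspace lacks it. */
#define MODEL_ERESTARTSYS 512

/* Make sure the model value cannot be confused with a real errno. */
static_assert(MODEL_ERESTARTSYS != EINTR, "model value collides with EINTR");

enum recv_action {
    RECV_OK,            /* data arrived, carry on */
    RECV_INTERRUPTED,   /* a signal cut the wait short, link is fine */
    RECV_RECONNECT      /* genuine transport error, connection is suspect */
};

/*
 * Map a recvmsg-style return value (bytes received, or a negative error)
 * to what the caller should do next. The point is that an interrupted
 * wait is not evidence of a broken connection.
 */
enum recv_action classify_recv_result(int result)
{
    if (result >= 0)
        return RECV_OK;
    if (result == -MODEL_ERESTARTSYS || result == -EINTR)
        return RECV_INTERRUPTED;
    return RECV_RECONNECT;
}
```

Treating `RECV_INTERRUPTED` separately avoids the unnecessary reconnects; what to do with a half-received reply in that case is a separate problem that this sketch does not address.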
I don't know what happens with ncpfs, I don't think you wrote that, but I
am interested in how it handles this. (Again, I have looked at the ncpfs
code for useful bits to copy :)
Running strace on a multithreaded program causes problems for smbfs.
Someone was nice enough to post a small testprogram for this (you may want
to try it on ncpfs, if you want it I'll find it for you).
These problems go away if all signals are blocked. Of course the smbfs
code would need to be changed to not block on recv, else you may end up
with a program waiting for network input that can't be killed ... (?)
/Urban