Re: Killing process with SIGKILL and ncpfs

* Re: Killing process with SIGKILL and ncpfs
@ 2001-01-17 17:07 Petr Vandrovec
  0 siblings, 0 replies; 3+ messages in thread
From: Petr Vandrovec @ 2001-01-17 17:07 UTC
  To: Urban Widmark; +Cc: linux-kernel, marteen.deboer

On 17 Jan 01 at 13:41, Urban Widmark wrote:
> SIGKILL or SIGSTOP can be already pending, or perhaps received while
> waiting in socket->ops->recvmsg(). recvmsg will then return -ERESTARTSYS
> because signal_pending() is true and the smbfs code treats that as a
> network problem (causing unnecessary reconnects and sometimes complete
> failures requiring umount/mount).

Yes. I was going to rewrite this code sometime around 2.3.40, when
I wrote an independent NCP socket layer for another project. Unfortunately,
I found that returning -ERESTARTSYS from some procedures (read_inode)
is converted to bad_inode, instead of just dropping/reverting all
changes which were done :-( So I left it as is.

> Running strace on a multithreaded program causes problems for smbfs.
> Someone was nice enough to post a small testprogram for this (you may want
> to try it on ncpfs, if you want it I'll find it for you).
> 
> These problems go away if all signals are blocked. Of course the smbfs
> code would need to be changed to not block on recv, else you may end up
> with a program waiting for network input that can't be killed ... (?)

I'm going to use:

if (current->flags & PF_EXITING)
  mask = 0;
else
  mask = sigmask(SIGKILL);

This has the following effects:
(1) you can still kill a bad task stuck in ncpfs with SIGKILL
(2) except when a connectivity problem happens during exit(). In that
    case the timeout should take care of it. There is no way around
    this, except more complicated code:
    
       if (current->flags & PF_EXITING) {
         lock(...)
         rm_sig_from_queue(SIGKILL, &current->pending);
         unlock(...)
       }
       mask = sigmask(SIGKILL);
       
    which allows you to 'kill -9' an exiting task again...
    But I cannot convince myself that SIGKILL is correctly delivered
    to PF_EXITING tasks... And rm_sig_from_queue is a signal.c internal
    function.
(3) attaching/detaching a debugger no longer causes problems, as SIGSTOP
    is ignored by ncpfs - I have no idea why the original code included
    SIGSTOP... Now the task is simply stopped after the NCP request is
    done, so the only problem is that you cannot immediately stop a
    program which waits for a server reply. But I think that waiting a
    few milliseconds is better than killing the connection for the
    whole server
(4) so you can happily debug programs from ncpfs volumes
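
A rough userspace model of the mask choice above, for illustration only.
It is not the ncpfs code: `model_task`, `MODEL_PF_EXITING`, `MODEL_SIGMASK`,
`ncp_interrupt_mask` and `ncp_wait_interrupted` are made-up names, and the
flag value is just a stand-in for the kernel's PF_EXITING.

```c
#include <signal.h>

/* Stand-in for the kernel's PF_EXITING task flag (value is an assumption). */
#define MODEL_PF_EXITING 0x4UL

/* Same idea as the kernel's sigmask(): one bit per signal number. */
#define MODEL_SIGMASK(sig) (1UL << ((sig) - 1))

/* Hypothetical, heavily reduced picture of a task. */
struct model_task {
    unsigned long flags;    /* task flags, e.g. MODEL_PF_EXITING */
    unsigned long pending;  /* bitmask of pending signals */
};

/* Which signals may interrupt a wait for the server reply. */
unsigned long ncp_interrupt_mask(const struct model_task *t)
{
    if (t->flags & MODEL_PF_EXITING)
        return 0;                       /* exiting task: nothing interrupts */
    return MODEL_SIGMASK(SIGKILL);      /* otherwise only SIGKILL does */
}

/* Returns 1 if a pending signal would abort the wait, 0 otherwise. */
int ncp_wait_interrupted(const struct model_task *t)
{
    return (t->pending & ncp_interrupt_mask(t)) != 0;
}
```

With this model a pending SIGSTOP never aborts the wait, and a pending
SIGKILL aborts it only while the task is not exiting, which matches
points (1) to (3).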

If it is not too much trouble, could you find the multithreaded
test program for me? The current solution, which honors only SIGKILL, and
only when the task is not exiting, looks good to me and works for my
testcases. There are some corner cases, such as when SIGKILL is already
pending when ncpfs is entered, but I'm not sure whether it is worth adding
a check of signal_pending() before the call to do_ncp_*rpc_call(), as it
is no longer a four-line patch then.
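
For illustration, a userspace analogue of such an "already pending on
entry" check. This is only a sketch: it uses POSIX sigpending() instead of
the kernel's signal_pending(), and `sig_already_pending` and
`guarded_request_start` are hypothetical names, not anything in ncpfs.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>

/* Returns 1 if `sig` is pending for the caller, 0 if not, -1 on error. */
int sig_already_pending(int sig)
{
    sigset_t pend;

    if (sigpending(&pend) != 0)
        return -1;
    return sigismember(&pend, sig);
}

/*
 * Hypothetical entry check: refuse to start a request when `sig` is
 * already pending, so the caller never blocks waiting for a reply.
 * Returns 0 if the request may proceed, -EINTR otherwise.
 */
int guarded_request_start(int sig)
{
    if (sig_already_pending(sig) == 1)
        return -EINTR;
    return 0;
}
```

In userspace a signal only stays pending while it is blocked, so trying
this out means blocking e.g. SIGUSR1 with sigprocmask(), raising it, and
then calling guarded_request_start(SIGUSR1).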
                                        Best regards,
                                            Petr Vandrovec
                                            vandrove@vc.cvut.cz
                                            
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Killing process with SIGKILL and ncpfs
@ 2001-01-17 13:07 Petr Vandrovec
  2001-01-17 12:41 ` Urban Widmark
  0 siblings, 1 reply; 3+ messages in thread
From: Petr Vandrovec @ 2001-01-17 13:07 UTC
  To: linux-kernel; +Cc: marteen.deboer

Hi,
  Maarten de Boer pointed out to me that if you load some simple program,
such as 'void main(void) {}', trace into main (break main; run)
and then quit from gdb (Really exit? yes), the child process is then
killed due to INT3 (probably). Then exit_mmap releases the executable
mapping - and ncp_do_request is entered with SIGKILL pending!

Trace; d18e8822 <[ncpfs]ncp_do_request+1e2/1f8>
Trace; d18e88a5 <[ncpfs]ncp_request2+6d/a0>
Trace; d18e7c3c <[ncpfs]ncp_make_closed+9c/c8>
Trace; d18e332e <[ncpfs]ncp_release+a/1c>
Trace; c01349c1 <fput+39/e8>
Trace; c012566e <exit_mmap+da/124>
Trace; c0115e54 <mmput+38/50>
Trace; c011a134 <do_exit+d0/2a8>
Trace; c0108e10 <do_signal+234/28c>
Trace; c011f032 <force_sig_info+9a/a4>
Trace; c011f24d <force_sig+11/18>
Trace; c0109581 <do_int3+35/78>
Trace; c0109088 <error_code+34/3c>
Trace; c0108fa4 <signal_return+14/18>

So my question is:
(1) should ncpfs ignore ALL signals (even SIGKILL/SIGSTOP) when the
    task is in PF_EXITING mode, or
(2) should the kernel clear all pending signals at the beginning of
    do_exit, or
(3) is it a gdb bug that it forgets the 'int3' operation in the traced
    program?

                                    Thanks,
                                            Petr Vandrovec
                                            vandrove@vc.cvut.cz
                                            

* Re: Killing process with SIGKILL and ncpfs
  2001-01-17 13:07 Petr Vandrovec
@ 2001-01-17 12:41 ` Urban Widmark
  0 siblings, 0 replies; 3+ messages in thread
From: Urban Widmark @ 2001-01-17 12:41 UTC
  To: Petr Vandrovec; +Cc: linux-kernel, marteen.deboer

On Wed, 17 Jan 2001, Petr Vandrovec wrote:

> Hi,
>   Maarten de Boer pointed out to me that if you load some simple program,
> such as 'void main(void) {}', trace into main (break main; run)
> and then quit from gdb (Really exit? yes), the child process is then
> killed due to INT3 (probably). Then exit_mmap releases the executable
> mapping - and ncp_do_request is entered with SIGKILL pending!

smbfs has a signal problem in its fs/smbfs/sock.c, possibly related.

SIGKILL or SIGSTOP can be already pending, or perhaps received while
waiting in socket->ops->recvmsg(). recvmsg will then return -ERESTARTSYS
because signal_pending() is true and the smbfs code treats that as a
network problem (causing unnecessary reconnects and sometimes complete
failures requiring umount/mount).
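
A small sketch of the distinction that is missing, assuming the receive
path can see the raw return value. This is not smbfs code:
`classify_recv_result` and the `recv_outcome` enum are hypothetical, and
MODEL_ERESTARTSYS just copies the kernel-internal value 512, which
userspace <errno.h> does not export.

```c
#include <errno.h>

/* Kernel-internal ERESTARTSYS value; not visible in userspace errno.h. */
#define MODEL_ERESTARTSYS 512

enum recv_outcome {
    RECV_OK,             /* data (or EOF) was returned */
    RECV_INTERRUPTED,    /* a signal cut the wait short; link is fine */
    RECV_NETWORK_ERROR   /* anything else: treat as a connection problem */
};

/* Classify a recvmsg-style return value (>= 0 bytes, or -errno). */
enum recv_outcome classify_recv_result(long ret)
{
    if (ret >= 0)
        return RECV_OK;
    if (ret == -MODEL_ERESTARTSYS || ret == -EINTR)
        return RECV_INTERRUPTED;
    return RECV_NETWORK_ERROR;
}
```

Only RECV_NETWORK_ERROR would justify a reconnect; RECV_INTERRUPTED
would be handled as a signal instead.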

I don't know what happens with ncpfs, I don't think you wrote that, but I
am interested in how it handles this. (Again, I have looked at the ncpfs
code for useful bits to copy :)

Running strace on a multithreaded program causes problems for smbfs.
Someone was nice enough to post a small testprogram for this (you may want
to try it on ncpfs, if you want it I'll find it for you).

These problems go away if all signals are blocked. Of course the smbfs
code would need to be changed to not block on recv, else you may end up
with a program waiting for network input that can't be killed ... (?)

/Urban

