linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* May close() return any error code?
@ 2015-07-29 10:46 Takashi Iwai
  2015-07-29 14:45 ` Dmitry Torokhov
  2015-08-02  7:42 ` Pavel Machek
  0 siblings, 2 replies; 7+ messages in thread
From: Takashi Iwai @ 2015-07-29 10:46 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: linux-input, linux-kernel, Dr. Werner Fink

Hi,

while debugging a problem of X and gdm with the old systemd-210, we
encountered a sudden death of systemd-logind, and this turned out to
be an unexpected errno from close().  The close() call for input
devices returns ENODEV error.  The logind in systemd-210 treats this
error code as fatal, triggers assert() and eventually kills itself.
The details are found in an openSUSE bugzilla thread:
  https://bugzilla.opensuse.org/show_bug.cgi?id=939571

This seems coming from evdev_flush().  As there is no fd leak, it's no
big problem per se.  But, now the question is whether returning such
an error code is correct behavior at all.  At least, it doesn't seem
defined in POSIX:
  http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html

I myself have no strong opinion here, so would like just to ask others
suggestions / comments.

thanks,


Takashi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: May close() return any error code?
  2015-07-29 10:46 May close() return any error code? Takashi Iwai
@ 2015-07-29 14:45 ` Dmitry Torokhov
  2015-07-30 13:53   ` Takashi Iwai
  2015-08-02 14:07   ` Al Viro
  2015-08-02  7:42 ` Pavel Machek
  1 sibling, 2 replies; 7+ messages in thread
From: Dmitry Torokhov @ 2015-07-29 14:45 UTC (permalink / raw)
  To: Takashi Iwai, Al Viro; +Cc: linux-input, linux-kernel, Dr. Werner Fink

HI Takashi,

On Wed, Jul 29, 2015 at 12:46:59PM +0200, Takashi Iwai wrote:
> Hi,
> 
> while debugging a problem of X and gdm with the old systemd-210, we
> encountered a sudden death of systemd-logind, and this turned out to
> be an unexpected errno from close().  The close() call for input
> devices returns ENODEV error.  The logind in systemd-210 treats this
> error code as fatal, triggers assert() and eventually kills itself.
> The details are found in an openSUSE bugzilla thread:
>   https://bugzilla.opensuse.org/show_bug.cgi?id=939571
> 
> This seems coming from evdev_flush().  As there is no fd leak, it's no
> big problem per se.  But, now the question is whether returning such
> an error code is correct behavior at all.  At least, it doesn't seem
> defined in POSIX:
>   http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html

Hmm, if I checked the right version of the code close_nointr_nofail()
expects only 0 as the return code so even if we change the kernel to
use more conforming -EIO instead of -ENODEV systemd will still die...

The question is whether we really need to propagate return value from
f_op->flush() up to userspace in filp_close(). Why don't we ask Al?

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: May close() return any error code?
  2015-07-29 14:45 ` Dmitry Torokhov
@ 2015-07-30 13:53   ` Takashi Iwai
  2015-08-02 14:07   ` Al Viro
  1 sibling, 0 replies; 7+ messages in thread
From: Takashi Iwai @ 2015-07-30 13:53 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: Al Viro, linux-input, linux-kernel, Dr. Werner Fink

On Wed, 29 Jul 2015 16:45:11 +0200,
Dmitry Torokhov wrote:
> 
> HI Takashi,
> 
> On Wed, Jul 29, 2015 at 12:46:59PM +0200, Takashi Iwai wrote:
> > Hi,
> > 
> > while debugging a problem of X and gdm with the old systemd-210, we
> > encountered a sudden death of systemd-logind, and this turned out to
> > be an unexpected errno from close().  The close() call for input
> > devices returns ENODEV error.  The logind in systemd-210 treats this
> > error code as fatal, triggers assert() and eventually kills itself.
> > The details are found in an openSUSE bugzilla thread:
> >   https://bugzilla.opensuse.org/show_bug.cgi?id=939571
> > 
> > This seems coming from evdev_flush().  As there is no fd leak, it's no
> > big problem per se.  But, now the question is whether returning such
> > an error code is correct behavior at all.  At least, it doesn't seem
> > defined in POSIX:
> >   http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
> 
> Hmm, if I checked the right version of the code close_nointr_nofail()
> expects only 0 as the return code so even if we change the kernel to
> use more conforming -EIO instead of -ENODEV systemd will still die...

Yes, it can be seen rather as a bug in systemd-210, and the later
systemd version doesn't seem to trigger assert() with that condition.

> The question is whether we really need to propagate return value from
> f_op->flush() up to userspace in filp_close(). Why don't we ask Al?

Oh yes, of course, Al is the best person to ask :)

Looking back at this problem now, I find it even dangerous to
propagate an error, especially -EINTR.  Then user-space may interpret
it as if the close didn't succeed and the fd were alive, although the
kernel performed the close itself.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: May close() return any error code?
  2015-07-29 10:46 May close() return any error code? Takashi Iwai
  2015-07-29 14:45 ` Dmitry Torokhov
@ 2015-08-02  7:42 ` Pavel Machek
  2015-08-02 13:57   ` Al Viro
  1 sibling, 1 reply; 7+ messages in thread
From: Pavel Machek @ 2015-08-02  7:42 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Dmitry Torokhov, linux-input, linux-kernel, Dr. Werner Fink

On Wed 2015-07-29 12:46:59, Takashi Iwai wrote:
> Hi,
> 
> while debugging a problem of X and gdm with the old systemd-210, we
> encountered a sudden death of systemd-logind, and this turned out to
> be an unexpected errno from close().  The close() call for input
> devices returns ENODEV error.  The logind in systemd-210 treats this
> error code as fatal, triggers assert() and eventually kills itself.
> The details are found in an openSUSE bugzilla thread:
>   https://bugzilla.opensuse.org/show_bug.cgi?id=939571
> 
> This seems coming from evdev_flush().  As there is no fd leak, it's no
> big problem per se.  But, now the question is whether returning such
> an error code is correct behavior at all.  At least, it doesn't seem
> defined in POSIX:
>
http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html


Returning an error from close() would imply that file descriptor is
not closed.... seems like bad idea. Just fix the kernel not to do it.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: May close() return any error code?
  2015-08-02  7:42 ` Pavel Machek
@ 2015-08-02 13:57   ` Al Viro
  0 siblings, 0 replies; 7+ messages in thread
From: Al Viro @ 2015-08-02 13:57 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Takashi Iwai, Dmitry Torokhov, linux-input, linux-kernel,
	Dr. Werner Fink

On Sun, Aug 02, 2015 at 09:42:20AM +0200, Pavel Machek wrote:
> > This seems coming from evdev_flush().  As there is no fd leak, it's no
> > big problem per se.  But, now the question is whether returning such
> > an error code is correct behavior at all.  At least, it doesn't seem
> > defined in POSIX:
> >
> http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
> 
> Returning an error from close() would imply that file descriptor is
> not closed.... seems like bad idea. Just fix the kernel not to do it.

The only thing implied here is failure to RTF{M,S}.  To quote close(2):

NOTES
       Not checking the return value of close() is a common  but  nevertheless
       serious  programming error.  It is quite possible that errors on a pre‐
       vious write(2) operation are first reported at the final close().   Not
       checking the return value when closing the file may lead to silent loss
       of data.  This can especially be observed with NFS and with disk quota.
       Note  that  the  return  value should only be used for diagnostics.  In
       particular close() should not be retried after an EINTR since this  may
       cause a reused descriptor from another thread to be closed.

That's Linux one.  FreeBSD one says

     In case of any error except EBADF, the supplied file descriptor is deal-
     located and therefore is no longer valid.

and that matches behaviour of historical BSD variants as well.  POSIX is being
its usual charming self and says "if a kernel shipped by $VALUED_MEMBER does
this and that cretinous thing, far be it from us to call it broken".

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: May close() return any error code?
  2015-07-29 14:45 ` Dmitry Torokhov
  2015-07-30 13:53   ` Takashi Iwai
@ 2015-08-02 14:07   ` Al Viro
  2015-08-04 10:21     ` Takashi Iwai
  1 sibling, 1 reply; 7+ messages in thread
From: Al Viro @ 2015-08-02 14:07 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: Takashi Iwai, linux-input, linux-kernel, Dr. Werner Fink

On Wed, Jul 29, 2015 at 07:45:11AM -0700, Dmitry Torokhov wrote:
> > This seems coming from evdev_flush().  As there is no fd leak, it's no
> > big problem per se.  But, now the question is whether returning such
> > an error code is correct behavior at all.  At least, it doesn't seem
> > defined in POSIX:
> >   http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
> 
> Hmm, if I checked the right version of the code close_nointr_nofail()
> expects only 0 as the return code so even if we change the kernel to
> use more conforming -EIO instead of -ENODEV systemd will still die...
> 
> The question is whether we really need to propagate return value from
> f_op->flush() up to userspace in filp_close(). Why don't we ask Al?

That's the whole damn point of having ->flush().  And yes, we do need that -
things like NFS (not to mention tapes, etc.) do rely on that.

Whether it makes sense to do this kind of "do something that might have
a failure to report on each close()" for evdev is up to driver, obviously.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: May close() return any error code?
  2015-08-02 14:07   ` Al Viro
@ 2015-08-04 10:21     ` Takashi Iwai
  0 siblings, 0 replies; 7+ messages in thread
From: Takashi Iwai @ 2015-08-04 10:21 UTC (permalink / raw)
  To: Al Viro; +Cc: Dmitry Torokhov, linux-input, linux-kernel, Dr. Werner Fink

On Sun, 02 Aug 2015 16:07:06 +0200,
Al Viro wrote:
> 
> On Wed, Jul 29, 2015 at 07:45:11AM -0700, Dmitry Torokhov wrote:
> > > This seems coming from evdev_flush().  As there is no fd leak, it's no
> > > big problem per se.  But, now the question is whether returning such
> > > an error code is correct behavior at all.  At least, it doesn't seem
> > > defined in POSIX:
> > >   http://pubs.opengroup.org/onlinepubs/009695399/functions/close.html
> > 
> > Hmm, if I checked the right version of the code close_nointr_nofail()
> > expects only 0 as the return code so even if we change the kernel to
> > use more conforming -EIO instead of -ENODEV systemd will still die...
> > 
> > The question is whether we really need to propagate return value from
> > f_op->flush() up to userspace in filp_close(). Why don't we ask Al?
> 
> That's the whole damn point of having ->flush().  And yes, we do need that -
> things like NFS (not to mention tapes, etc.) do rely on that.
> 
> Whether it makes sense to do this kind of "do something that might have
> a failure to report on each close()" for evdev is up to driver, obviously.

So, the behavior of VFS layer is as designed.  Then I suppose the fix
should be rather in evdev.c.  Dmitry, could you paper over it?


thanks,

Takashi

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-08-04 10:21 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-07-29 10:46 May close() return any error code? Takashi Iwai
2015-07-29 14:45 ` Dmitry Torokhov
2015-07-30 13:53   ` Takashi Iwai
2015-08-02 14:07   ` Al Viro
2015-08-04 10:21     ` Takashi Iwai
2015-08-02  7:42 ` Pavel Machek
2015-08-02 13:57   ` Al Viro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).