* [RFC v0] Bluetooth: Fix lost socket error code @ 2012-06-19 14:20 Mikel Astiz 2012-06-29 13:24 ` Johan Hedberg 0 siblings, 1 reply; 6+ messages in thread From: Mikel Astiz @ 2012-06-19 14:20 UTC (permalink / raw) To: linux-bluetooth; +Cc: Mikel Astiz From: Mikel Astiz <mikel.astiz@bmw-carit.de> Using sock_error() here to check the status of the socket is wrong because it resets sk->sk_err to zero. For RFCOMM sockets, this means the disconnect reason is not exposed to userland in the socket options. Signed-off-by: Mikel Astiz <mikel.astiz@bmw-carit.de> --- This a second attempt to expose ACL disconnect reason to userland, as proposed first in "[RFC v0] Bluetooth: mgmt: Add device disconnect reason". The motivation behind was explained in the userspace patchset "[RFC BlueZ v0 0/5] ACL disconnect reason". This second approach focuses on RFCOMM sockets given that L2CAP is already exposing such information. net/bluetooth/af_bluetooth.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c index f7db579..4799338 100644 --- a/net/bluetooth/af_bluetooth.c +++ b/net/bluetooth/af_bluetooth.c @@ -313,7 +313,7 @@ int bt_sock_stream_recvmsg(struct kiocb *iocb, struct socket *sock, if (copied >= target) break; - err = sock_error(sk); + err = sk->sk_err; if (err) break; if (sk->sk_shutdown & RCV_SHUTDOWN) -- 1.7.7.6 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [RFC v0] Bluetooth: Fix lost socket error code 2012-06-19 14:20 [RFC v0] Bluetooth: Fix lost socket error code Mikel Astiz @ 2012-06-29 13:24 ` Johan Hedberg 2012-07-03 7:44 ` Mikel Astiz 0 siblings, 1 reply; 6+ messages in thread From: Johan Hedberg @ 2012-06-29 13:24 UTC (permalink / raw) To: Mikel Astiz; +Cc: linux-bluetooth, Mikel Astiz Hi Mikel, On Tue, Jun 19, 2012, Mikel Astiz wrote: > From: Mikel Astiz <mikel.astiz@bmw-carit.de> > > Using sock_error() here to check the status of the socket is wrong > because it resets sk->sk_err to zero. For RFCOMM sockets, this means the > disconnect reason is not exposed to userland in the socket options. > > Signed-off-by: Mikel Astiz <mikel.astiz@bmw-carit.de> > --- > This a second attempt to expose ACL disconnect reason to userland, as proposed first in "[RFC v0] Bluetooth: mgmt: Add device disconnect reason". > > The motivation behind was explained in the userspace patchset "[RFC BlueZ v0 0/5] ACL disconnect reason". > > This second approach focuses on RFCOMM sockets given that L2CAP is already exposing such information. > > net/bluetooth/af_bluetooth.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c > index f7db579..4799338 100644 > --- a/net/bluetooth/af_bluetooth.c > +++ b/net/bluetooth/af_bluetooth.c > @@ -313,7 +313,7 @@ int bt_sock_stream_recvmsg(struct kiocb *iocb, struct socket *sock, > if (copied >= target) > break; > > - err = sock_error(sk); > + err = sk->sk_err; > if (err) > break; > if (sk->sk_shutdown & RCV_SHUTDOWN) Looking at the implementation of sock_error (include/net/sock.h) This should be: err = -sk->sk_err; However, is this really needed if the error is returned for the write() or read() that triggers it (through errno). And in the case of POLLHUP/POLLERR there wont be any write/read that could clear the error so you should be able to read it with getsockopt just fine. Am I missing something? Johan ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC v0] Bluetooth: Fix lost socket error code 2012-06-29 13:24 ` Johan Hedberg @ 2012-07-03 7:44 ` Mikel Astiz 2012-07-03 8:18 ` Johan Hedberg 2012-07-03 13:42 ` Mike 0 siblings, 2 replies; 6+ messages in thread From: Mikel Astiz @ 2012-07-03 7:44 UTC (permalink / raw) To: johan.hedberg, linux-bluetooth Hi Johan, On Fri, Jun 29, 2012 at 3:24 PM, Johan Hedberg <johan.hedberg@gmail.com> wrote: > Hi Mikel, > > On Tue, Jun 19, 2012, Mikel Astiz wrote: >> From: Mikel Astiz <mikel.astiz@bmw-carit.de> >> >> Using sock_error() here to check the status of the socket is wrong >> because it resets sk->sk_err to zero. For RFCOMM sockets, this means the >> disconnect reason is not exposed to userland in the socket options. >> >> Signed-off-by: Mikel Astiz <mikel.astiz@bmw-carit.de> >> --- >> This a second attempt to expose ACL disconnect reason to userland, as proposed first in "[RFC v0] Bluetooth: mgmt: Add device disconnect reason". >> >> The motivation behind was explained in the userspace patchset "[RFC BlueZ v0 0/5] ACL disconnect reason". >> >> This second approach focuses on RFCOMM sockets given that L2CAP is already exposing such information. >> >> net/bluetooth/af_bluetooth.c | 2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c >> index f7db579..4799338 100644 >> --- a/net/bluetooth/af_bluetooth.c >> +++ b/net/bluetooth/af_bluetooth.c >> @@ -313,7 +313,7 @@ int bt_sock_stream_recvmsg(struct kiocb *iocb, struct socket *sock, >> if (copied >= target) >> break; >> >> - err = sock_error(sk); >> + err = sk->sk_err; >> if (err) >> break; >> if (sk->sk_shutdown & RCV_SHUTDOWN) > > Looking at the implementation of sock_error (include/net/sock.h) This > should be: > > err = -sk->sk_err; Right. > However, is this really needed if the error is returned for the write() > or read() that triggers it (through errno). And in the case of > POLLHUP/POLLERR there wont be any write/read that could clear the error > so you should be able to read it with getsockopt just fine. Am I missing > something? I don't actually get why rfcomm_sock_recvmsg() is being called, but that's what I experimentally see: [18985.546436] [73] rfcomm_l2data_ready:197: ffff880205613800 bytes 4 [18985.546535] [21592] rfcomm_process_rx:1853: session ffff880230ccc9c0 state 1 qlen 1 [18985.546546] [21592] rfcomm_recv_disc:1205: session ffff880230ccc9c0 state 1 dlci 26 [18985.546552] [21592] rfcomm_send_ua:773: ffff880230ccc9c0 dlci 26 [18985.546558] [21592] rfcomm_send_cmd:750: ffff880230ccc9c0 cmd 115 [18985.546563] [21592] rfcomm_send_frame:741: session ffff880230ccc9c0 len 4 [18985.546591] [21592] __rfcomm_dlc_close:447: dlc ffff88016fe16500 state 9 dlci 26 err 104 session ffff880230ccc9c0 [18985.546597] [21592] rfcomm_dlc_clear_timer:286: dlc ffff88016fe16500 state 9 [18985.546603] [21592] rfcomm_sk_state_change:71: dlc ffff88016fe16500 state 9 err 104 Here the error has been set. [18985.546628] [21592] rfcomm_dlc_unlink:351: dlc ffff88016fe16500 refcnt 2 session ffff880230ccc9c0 [18985.546635] [21592] rfcomm_session_set_timer:250: session ffff880230ccc9c0 state 1 timeout 2000 [18985.546646] [21592] rfcomm_process_dlcs:1792: session ffff880230ccc9c0 state 1 [18985.546701] [32724] bt_sock_stream_recvmsg:300: sk ffff880205615800 size 7768 And here the error has been cleared. [18985.546780] [32724] rfcomm_sock_getsockopt:806: sk ffff880205615800 [18985.546788] [32724] rfcomm_sock_getsockopt_old:742: sk ffff880205615800 [18985.546791] [21576] rfcomm_sock_shutdown:882: sock ffff880136965180, sk ffff880205615800 [18985.546799] [21576] __rfcomm_sock_close:209: sk ffff880205615800 state 9 socket ffff880136965180 My best guess would be that we don't shutdown the socket immediately after the error, and that's the reason why packets are still flowing around, but I can't really tell. In any case, using sock_error() to exit from bt_sock_stream_recvmsg() seems a bad idea. I would say it's buggy due to race conditions, since the socket option is temporarily set and then later cleared, don't you think? Cheers, Mikel ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC v0] Bluetooth: Fix lost socket error code 2012-07-03 7:44 ` Mikel Astiz @ 2012-07-03 8:18 ` Johan Hedberg 2012-07-03 13:42 ` Mike 1 sibling, 0 replies; 6+ messages in thread From: Johan Hedberg @ 2012-07-03 8:18 UTC (permalink / raw) To: Mikel Astiz; +Cc: linux-bluetooth Hi Mikel, On Tue, Jul 03, 2012, Mikel Astiz wrote: > I don't actually get why rfcomm_sock_recvmsg() is being called, but > that's what I experimentally see: > > [18985.546436] [73] rfcomm_l2data_ready:197: ffff880205613800 bytes 4 > [18985.546535] [21592] rfcomm_process_rx:1853: session > ffff880230ccc9c0 state 1 qlen 1 > [18985.546546] [21592] rfcomm_recv_disc:1205: session ffff880230ccc9c0 > state 1 dlci 26 > [18985.546552] [21592] rfcomm_send_ua:773: ffff880230ccc9c0 dlci 26 > [18985.546558] [21592] rfcomm_send_cmd:750: ffff880230ccc9c0 cmd 115 > [18985.546563] [21592] rfcomm_send_frame:741: session ffff880230ccc9c0 len 4 > [18985.546591] [21592] __rfcomm_dlc_close:447: dlc ffff88016fe16500 > state 9 dlci 26 err 104 session ffff880230ccc9c0 > [18985.546597] [21592] rfcomm_dlc_clear_timer:286: dlc ffff88016fe16500 state 9 > [18985.546603] [21592] rfcomm_sk_state_change:71: dlc ffff88016fe16500 > state 9 err 104 > > Here the error has been set. > > [18985.546628] [21592] rfcomm_dlc_unlink:351: dlc ffff88016fe16500 > refcnt 2 session ffff880230ccc9c0 > [18985.546635] [21592] rfcomm_session_set_timer:250: session > ffff880230ccc9c0 state 1 timeout 2000 > [18985.546646] [21592] rfcomm_process_dlcs:1792: session > ffff880230ccc9c0 state 1 > [18985.546701] [32724] bt_sock_stream_recvmsg:300: sk ffff880205615800 size 7768 > > And here the error has been cleared. recvmsg is the function that gets triggered when user space does a read() system call. So it seems somewhere in user space the error from read() is being ignored. Johan ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC v0] Bluetooth: Fix lost socket error code 2012-07-03 7:44 ` Mikel Astiz 2012-07-03 8:18 ` Johan Hedberg @ 2012-07-03 13:42 ` Mike 2012-07-03 15:46 ` Mikel Astiz 1 sibling, 1 reply; 6+ messages in thread From: Mike @ 2012-07-03 13:42 UTC (permalink / raw) To: Mikel Astiz; +Cc: johan.hedberg, linux-bluetooth Hi Mikel, On Tue, Jul 3, 2012 at 2:44 AM, Mikel Astiz <mikel.astiz.oss@gmail.com> wrote: > Hi Johan, > > On Fri, Jun 29, 2012 at 3:24 PM, Johan Hedberg <johan.hedberg@gmail.com> wrote: >> Hi Mikel, >> >> On Tue, Jun 19, 2012, Mikel Astiz wrote: >>> From: Mikel Astiz <mikel.astiz@bmw-carit.de> >>> >>> Using sock_error() here to check the status of the socket is wrong >>> because it resets sk->sk_err to zero. For RFCOMM sockets, this means the >>> disconnect reason is not exposed to userland in the socket options. >>> >>> Signed-off-by: Mikel Astiz <mikel.astiz@bmw-carit.de> >>> --- >>> This a second attempt to expose ACL disconnect reason to userland, as proposed first in "[RFC v0] Bluetooth: mgmt: Add device disconnect reason". >>> >>> The motivation behind was explained in the userspace patchset "[RFC BlueZ v0 0/5] ACL disconnect reason". >>> >>> This second approach focuses on RFCOMM sockets given that L2CAP is already exposing such information. >>> >>> net/bluetooth/af_bluetooth.c | 2 +- >>> 1 files changed, 1 insertions(+), 1 deletions(-) >>> >>> diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c >>> index f7db579..4799338 100644 >>> --- a/net/bluetooth/af_bluetooth.c >>> +++ b/net/bluetooth/af_bluetooth.c >>> @@ -313,7 +313,7 @@ int bt_sock_stream_recvmsg(struct kiocb *iocb, struct socket *sock, >>> if (copied >= target) >>> break; >>> >>> - err = sock_error(sk); >>> + err = sk->sk_err; >>> if (err) >>> break; >>> if (sk->sk_shutdown & RCV_SHUTDOWN) >> >> Looking at the implementation of sock_error (include/net/sock.h) This >> should be: >> >> err = -sk->sk_err; > > Right. > >> However, is this really needed if the error is returned for the write() >> or read() that triggers it (through errno). And in the case of >> POLLHUP/POLLERR there wont be any write/read that could clear the error >> so you should be able to read it with getsockopt just fine. Am I missing >> something? > > I don't actually get why rfcomm_sock_recvmsg() is being called, but > that's what I experimentally see: > > [18985.546436] [73] rfcomm_l2data_ready:197: ffff880205613800 bytes 4 > [18985.546535] [21592] rfcomm_process_rx:1853: session > ffff880230ccc9c0 state 1 qlen 1 > [18985.546546] [21592] rfcomm_recv_disc:1205: session ffff880230ccc9c0 > state 1 dlci 26 > [18985.546552] [21592] rfcomm_send_ua:773: ffff880230ccc9c0 dlci 26 > [18985.546558] [21592] rfcomm_send_cmd:750: ffff880230ccc9c0 cmd 115 > [18985.546563] [21592] rfcomm_send_frame:741: session ffff880230ccc9c0 len 4 > [18985.546591] [21592] __rfcomm_dlc_close:447: dlc ffff88016fe16500 > state 9 dlci 26 err 104 session ffff880230ccc9c0 > [18985.546597] [21592] rfcomm_dlc_clear_timer:286: dlc ffff88016fe16500 state 9 > [18985.546603] [21592] rfcomm_sk_state_change:71: dlc ffff88016fe16500 > state 9 err 104 > > Here the error has been set. > > [18985.546628] [21592] rfcomm_dlc_unlink:351: dlc ffff88016fe16500 > refcnt 2 session ffff880230ccc9c0 > [18985.546635] [21592] rfcomm_session_set_timer:250: session > ffff880230ccc9c0 state 1 timeout 2000 > [18985.546646] [21592] rfcomm_process_dlcs:1792: session > ffff880230ccc9c0 state 1 > [18985.546701] [32724] bt_sock_stream_recvmsg:300: sk ffff880205615800 size 7768 > > And here the error has been cleared. > > [18985.546780] [32724] rfcomm_sock_getsockopt:806: sk ffff880205615800 > [18985.546788] [32724] rfcomm_sock_getsockopt_old:742: sk ffff880205615800 > [18985.546791] [21576] rfcomm_sock_shutdown:882: sock > ffff880136965180, sk ffff880205615800 > [18985.546799] [21576] __rfcomm_sock_close:209: sk ffff880205615800 > state 9 socket ffff880136965180 > > My best guess would be that we don't shutdown the socket immediately > after the error, and that's the reason why packets are still flowing > around, but I can't really tell. > > In any case, using sock_error() to exit from bt_sock_stream_recvmsg() > seems a bad idea. I would say it's buggy due to race conditions, since > the socket option is temporarily set and then later cleared, don't you > think? Are you seeing this in reference to ofono? I noticed that gatchat/gatio.c, function received_data would try to do a read before checking the condition that caused it to be called. In this case, the socket error would be correct at the beginning of the function, but would no longer be correct by the time the function reaches the condition check for G_IO_HUP as a read attempt occurs. Mike ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC v0] Bluetooth: Fix lost socket error code 2012-07-03 13:42 ` Mike @ 2012-07-03 15:46 ` Mikel Astiz 0 siblings, 0 replies; 6+ messages in thread From: Mikel Astiz @ 2012-07-03 15:46 UTC (permalink / raw) To: Mike; +Cc: johan.hedberg, linux-bluetooth Hi Mike, On Tue, Jul 3, 2012 at 3:42 PM, Mike <puffy.taco@gmail.com> wrote: > Hi Mikel, > > On Tue, Jul 3, 2012 at 2:44 AM, Mikel Astiz <mikel.astiz.oss@gmail.com> wrote: >> Hi Johan, >> >> On Fri, Jun 29, 2012 at 3:24 PM, Johan Hedberg <johan.hedberg@gmail.com> wrote: >>> Hi Mikel, >>> >>> On Tue, Jun 19, 2012, Mikel Astiz wrote: >>>> From: Mikel Astiz <mikel.astiz@bmw-carit.de> >>>> >>>> Using sock_error() here to check the status of the socket is wrong >>>> because it resets sk->sk_err to zero. For RFCOMM sockets, this means the >>>> disconnect reason is not exposed to userland in the socket options. >>>> >>>> Signed-off-by: Mikel Astiz <mikel.astiz@bmw-carit.de> >>>> --- >>>> This a second attempt to expose ACL disconnect reason to userland, as proposed first in "[RFC v0] Bluetooth: mgmt: Add device disconnect reason". >>>> >>>> The motivation behind was explained in the userspace patchset "[RFC BlueZ v0 0/5] ACL disconnect reason". >>>> >>>> This second approach focuses on RFCOMM sockets given that L2CAP is already exposing such information. >>>> >>>> net/bluetooth/af_bluetooth.c | 2 +- >>>> 1 files changed, 1 insertions(+), 1 deletions(-) >>>> >>>> diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c >>>> index f7db579..4799338 100644 >>>> --- a/net/bluetooth/af_bluetooth.c >>>> +++ b/net/bluetooth/af_bluetooth.c >>>> @@ -313,7 +313,7 @@ int bt_sock_stream_recvmsg(struct kiocb *iocb, struct socket *sock, >>>> if (copied >= target) >>>> break; >>>> >>>> - err = sock_error(sk); >>>> + err = sk->sk_err; >>>> if (err) >>>> break; >>>> if (sk->sk_shutdown & RCV_SHUTDOWN) >>> >>> Looking at the implementation of sock_error (include/net/sock.h) This >>> should be: >>> >>> err = -sk->sk_err; >> >> Right. >> >>> However, is this really needed if the error is returned for the write() >>> or read() that triggers it (through errno). And in the case of >>> POLLHUP/POLLERR there wont be any write/read that could clear the error >>> so you should be able to read it with getsockopt just fine. Am I missing >>> something? >> >> I don't actually get why rfcomm_sock_recvmsg() is being called, but >> that's what I experimentally see: >> >> [18985.546436] [73] rfcomm_l2data_ready:197: ffff880205613800 bytes 4 >> [18985.546535] [21592] rfcomm_process_rx:1853: session >> ffff880230ccc9c0 state 1 qlen 1 >> [18985.546546] [21592] rfcomm_recv_disc:1205: session ffff880230ccc9c0 >> state 1 dlci 26 >> [18985.546552] [21592] rfcomm_send_ua:773: ffff880230ccc9c0 dlci 26 >> [18985.546558] [21592] rfcomm_send_cmd:750: ffff880230ccc9c0 cmd 115 >> [18985.546563] [21592] rfcomm_send_frame:741: session ffff880230ccc9c0 len 4 >> [18985.546591] [21592] __rfcomm_dlc_close:447: dlc ffff88016fe16500 >> state 9 dlci 26 err 104 session ffff880230ccc9c0 >> [18985.546597] [21592] rfcomm_dlc_clear_timer:286: dlc ffff88016fe16500 state 9 >> [18985.546603] [21592] rfcomm_sk_state_change:71: dlc ffff88016fe16500 >> state 9 err 104 >> >> Here the error has been set. >> >> [18985.546628] [21592] rfcomm_dlc_unlink:351: dlc ffff88016fe16500 >> refcnt 2 session ffff880230ccc9c0 >> [18985.546635] [21592] rfcomm_session_set_timer:250: session >> ffff880230ccc9c0 state 1 timeout 2000 >> [18985.546646] [21592] rfcomm_process_dlcs:1792: session >> ffff880230ccc9c0 state 1 >> [18985.546701] [32724] bt_sock_stream_recvmsg:300: sk ffff880205615800 size 7768 >> >> And here the error has been cleared. >> >> [18985.546780] [32724] rfcomm_sock_getsockopt:806: sk ffff880205615800 >> [18985.546788] [32724] rfcomm_sock_getsockopt_old:742: sk ffff880205615800 >> [18985.546791] [21576] rfcomm_sock_shutdown:882: sock >> ffff880136965180, sk ffff880205615800 >> [18985.546799] [21576] __rfcomm_sock_close:209: sk ffff880205615800 >> state 9 socket ffff880136965180 >> >> My best guess would be that we don't shutdown the socket immediately >> after the error, and that's the reason why packets are still flowing >> around, but I can't really tell. >> >> In any case, using sock_error() to exit from bt_sock_stream_recvmsg() >> seems a bad idea. I would say it's buggy due to race conditions, since >> the socket option is temporarily set and then later cleared, don't you >> think? > > Are you seeing this in reference to ofono? I noticed that > gatchat/gatio.c, function received_data would try to do a read before > checking the condition that caused it to be called. In this case, the > socket error would be correct at the beginning of the function, but > would no longer be correct by the time the function reaches the > condition check for G_IO_HUP as a read attempt occurs. Yes, you're right. The socket option does exist at the beginning of received_data(), and it's cleared during the body. However, I'm still not sure what the general Kernel policy is regarding socket error reporting, so my question would be: is it a desired behaviour that the SO_ERROR would be cleared after a read attempt? If so, the Kernel would need no patch, and it would become a userland problem to propagate the disconnect reason as appropriate. Cheers, Mikel ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2012-07-03 15:46 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-06-19 14:20 [RFC v0] Bluetooth: Fix lost socket error code Mikel Astiz 2012-06-29 13:24 ` Johan Hedberg 2012-07-03 7:44 ` Mikel Astiz 2012-07-03 8:18 ` Johan Hedberg 2012-07-03 13:42 ` Mike 2012-07-03 15:46 ` Mikel Astiz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox