netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] move smp_rmb() after load of status
@ 2018-07-07 19:23 humbabek
  2018-07-07 22:34 ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: humbabek @ 2018-07-07 19:23 UTC (permalink / raw)
  Cc: humbabek, David S. Miller, Willem de Bruijn, Eric Dumazet,
	Kees Cook, Kirill Tkhai, Mike Maloney, Magnus Karlsson, netdev,
	linux-kernel

Signed-off-by: humbabek <humbabek@gmail.com>
---
 net/packet/af_packet.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 57634bc3da74..91830ef07c48 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -394,25 +394,30 @@ static void __packet_set_status(struct packet_sock *po, void *frame, int status)
 static int __packet_get_status(struct packet_sock *po, void *frame)
 {
 	union tpacket_uhdr h;
-
-	smp_rmb();
+	int status;
 
 	h.raw = frame;
 	switch (po->tp_version) {
 	case TPACKET_V1:
 		flush_dcache_page(pgv_to_page(&h.h1->tp_status));
-		return h.h1->tp_status;
+		status = h.h1->tp_status;
+		break;
 	case TPACKET_V2:
 		flush_dcache_page(pgv_to_page(&h.h2->tp_status));
-		return h.h2->tp_status;
+		status = h.h2->tp_status;
+		break;
 	case TPACKET_V3:
 		flush_dcache_page(pgv_to_page(&h.h3->tp_status));
-		return h.h3->tp_status;
+		status = h.h3->tp_status;
+		break;
 	default:
 		WARN(1, "TPACKET version not supported.\n");
 		BUG();
-		return 0;
+		status = 0;
+		break;
 	}
+	smp_rmb();
+	return status;
 }
 
 static __u32 tpacket_get_timestamp(struct sk_buff *skb, struct timespec *ts,
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] move smp_rmb() after load of status
  2018-07-07 19:23 [PATCH] move smp_rmb() after load of status humbabek
@ 2018-07-07 22:34 ` Eric Dumazet
  2018-07-08  9:27   ` humbabek
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2018-07-07 22:34 UTC (permalink / raw)
  To: humbabek
  Cc: David S. Miller, Willem de Bruijn, Eric Dumazet, Kees Cook,
	Kirill Tkhai, Mike Maloney, Magnus Karlsson, netdev, linux-kernel



On 07/07/2018 12:23 PM, humbabek wrote:
> Signed-off-by: humbabek <humbabek@gmail.com>
> ---
>

No changelog ?

No comment added explaining why this smp_rmb() move is needed ?
(ie where is the opposite smp_wmb() )

Share with us your thoughts and reasoning, please.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] move smp_rmb() after load of status
  2018-07-07 22:34 ` Eric Dumazet
@ 2018-07-08  9:27   ` humbabek
  2018-07-08 21:53     ` Willem de Bruijn
  0 siblings, 1 reply; 4+ messages in thread
From: humbabek @ 2018-07-08  9:27 UTC (permalink / raw)
  Cc: humbabek, David S. Miller, Willem de Bruijn, Eric Dumazet,
	Kirill Tkhai, Kees Cook, Mike Maloney, Magnus Karlsson, netdev,
	linux-kernel

Signed-off-by: humbabek <humbabek@gmail.com>
---
 net/packet/af_packet.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 57634bc3da74..91830ef07c48 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -394,25 +394,30 @@ static void __packet_set_status(struct packet_sock *po, void *frame, int status)
 static int __packet_get_status(struct packet_sock *po, void *frame)
 {
 	union tpacket_uhdr h;
-
-	smp_rmb();
+	int status;
 
 	h.raw = frame;
 	switch (po->tp_version) {
 	case TPACKET_V1:
 		flush_dcache_page(pgv_to_page(&h.h1->tp_status));
-		return h.h1->tp_status;
+		status = h.h1->tp_status;
+		break;
 	case TPACKET_V2:
 		flush_dcache_page(pgv_to_page(&h.h2->tp_status));
-		return h.h2->tp_status;
+		status = h.h2->tp_status;
+		break;
 	case TPACKET_V3:
 		flush_dcache_page(pgv_to_page(&h.h3->tp_status));
-		return h.h3->tp_status;
+		status = h.h3->tp_status;
+		break;
 	default:
 		WARN(1, "TPACKET version not supported.\n");
 		BUG();
-		return 0;
+		status = 0;
+		break;
 	}
+	smp_rmb();
+	return status;
 }
 
 static __u32 tpacket_get_timestamp(struct sk_buff *skb, struct timespec *ts,
-- 
2.17.1

Sorry for a lack of information- it is the first my patch. 

The opposie smp_wmb() is in: https://elixir.bootlin.com/linux/v4.15/source/net/packet/af_packet.c#L415

On my eye smp_rmb() should be moved *after* actual read of status to be ensured that no read after __packet_get_status() happens before actual read of status.

Note, that tpacket_snd() polls a ring [https://elixir.bootlin.com/linux/v4.15/source/net/packet/af_packet.c#L2677], possibly waiting for a published packet (by published I mean a packet with a TP_STATUS_SEND_REQUEST status).

It is important to have a guarantee that no load was reordered before we know the status, isn't it?

Please note that it is possible that I am wrong, so please take this into account. However, if I am wrong please let me know. 

>From the other side, what is the point to place smp_rmb() before read of status? 

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] move smp_rmb() after load of status
  2018-07-08  9:27   ` humbabek
@ 2018-07-08 21:53     ` Willem de Bruijn
  0 siblings, 0 replies; 4+ messages in thread
From: Willem de Bruijn @ 2018-07-08 21:53 UTC (permalink / raw)
  To: humbabek
  Cc: David Miller, Willem de Bruijn, Eric Dumazet, ktkhai, Kees Cook,
	Mike Maloney, Karlsson, Magnus, Network Development, LKML

> @@ -394,25 +394,30 @@ static void __packet_set_status(struct packet_sock *po, void *frame, int status)
>  static int __packet_get_status(struct packet_sock *po, void *frame)
>  {
>         union tpacket_uhdr h;
> -
> -       smp_rmb();
> +       int status;
>
>         h.raw = frame;
>         switch (po->tp_version) {
>         case TPACKET_V1:
>                 flush_dcache_page(pgv_to_page(&h.h1->tp_status));
> -               return h.h1->tp_status;
> +               status = h.h1->tp_status;
> +               break;
>         case TPACKET_V2:
>                 flush_dcache_page(pgv_to_page(&h.h2->tp_status));
> -               return h.h2->tp_status;
> +               status = h.h2->tp_status;
> +               break;
>         case TPACKET_V3:
>                 flush_dcache_page(pgv_to_page(&h.h3->tp_status));
> -               return h.h3->tp_status;
> +               status = h.h3->tp_status;
> +               break;
>         default:
>                 WARN(1, "TPACKET version not supported.\n");
>                 BUG();
> -               return 0;
> +               status = 0;
> +               break;
>         }
> +       smp_rmb();
> +       return status;
>  }
>
>  static __u32 tpacket_get_timestamp(struct sk_buff *skb, struct timespec *ts,
> --
> 2.17.1
>
> Sorry for a lack of information- it is the first my patch.
>
> The opposie smp_wmb() is in: https://elixir.bootlin.com/linux/v4.15/source/net/packet/af_packet.c#L415

__packet_get_status in the common path synchronizes with userspace
code, not with __packet_set_status. See __v2_tx_user_ready in
tools/testing/selftests/net/psock_tpacket.c from an example.

On receive, the smp_wmb in tpacket_rcv ensures that a frame is filled
before the kernel updates the status field to TP_STATUS_USER. So that
is not the purpose of the smp_wmb in __packet_set_status.

Digging through the original PACKET_TX_RING discussion (below)
it appears that the intent of that smp_wmb was intended either to ensure
that the frame was fully written before freeing the skb (when called from
tpacket_destruct_skb) or even to ensure that the frame was fully written
before calling flush_dcache_page on the underlying shared page.

The latter would explain its analogue in __packet_get_status,
though it is not implemented as a barrier between the field access
and page flush in either function in the final patch.

>From http://lists.openwall.net/netdev/2008/10/31/70:

    >> Also, when you set the status back to TP_STATUS_KERNEL in the
    >> destructor, you need to add the following barriers:
    >>
    >> __packet_set_status(po, ph, TP_STATUS_KERNEL);
    >> smp_mb();   // make sure the TP_STATUS_KERNEL was actually written to
    >> memory before this - couldn't this actually be just a smp_wmb?
    >> flush_dcache_page(virt_to_page(ph));  // on non-x86 architectures like
    >> ARM that have a moronic cache (i.e cache by virtual rather than
    >> physical address). on x86 this is a noop.
    >>
    >
    > So, If my understanding of those memory barriers is correct, we should
    > have a smp_rmb() before status reading and smp_wmb() after status
    > writing in skb destructor and send procedure.

> On my eye smp_rmb() should be moved *after* actual read of status to be ensured that no read after __packet_get_status() happens before actual read of status.
>

That sounds reasonable. But we should understand the intent of the
existing code before making changes. That was not the purpose of this
barrier, it appears.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-07-08 21:53 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-07-07 19:23 [PATCH] move smp_rmb() after load of status humbabek
2018-07-07 22:34 ` Eric Dumazet
2018-07-08  9:27   ` humbabek
2018-07-08 21:53     ` Willem de Bruijn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).