netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brenden Blanco <bblanco@plumgrid.com>
To: Tariq Toukan <ttoukan.linux@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, jhs@mojatatu.com,
	saeedm@dev.mellanox.co.il, kafai@fb.com, brouer@redhat.com,
	as754m@att.com, alexei.starovoitov@gmail.com,
	gerlitz.or@gmail.com, john.fastabend@gmail.com,
	hannes@stressinduktion.org, tgraf@suug.ch, tom@herbertland.com,
	daniel@iogearbox.net
Subject: Re: [PATCH v8 06/11] net/mlx4_en: add page recycle to prepare rx ring for tx support
Date: Tue, 19 Jul 2016 10:41:32 -0700	[thread overview]
Message-ID: <20160719174130.GA19547@gmail.com> (raw)
In-Reply-To: <6d638467-eea6-d3e1-6984-88a1198ef303@gmail.com>

On Tue, Jul 19, 2016 at 04:33:28PM +0300, Tariq Toukan wrote:
[...]
> >So, I took Dave's suggestion to heart, and spent the last 2 days seeing
> >what was possible to implement with just xdp as the focus, rather than
> >an overall cleanup which Tariq will be looking at.
> >
> >Unfortunately, this turned out to a be a bit of a rat hole.
> >
> >What I wanted to do was to pre-allocate all the required pages before
> >reaching the point of no return. Doing this isn't all that hard, since
> >it should just be a few loops. However, I ended with a bit more
> >duplicated code than one would like, since I had to tease out the
> >various sections that assume exclusive access to hardware.
> >
> >But, more than that, is that I don't see a way to fill these pages into
> >the rings safely while hardware still has ability to write into the old
> >ones. There was no "pause" API that I could find besides
> >mlx4_en_stop_port(). That function is fairly destructive and requires
> >the resource allocation in mlx4_en_start_port() to succeed to recover
> >the port status.
> >
> >One option that I considered would be to drain buffers from the rx ring,
> >and just let mlx4_en_recover_from_oom() do its job once we update the
> >page template in frag_info[]. This, however, also requires the queues to
> >be paused safely, so we again have to rely on mlx4_en_stop_port().
> >
> >One change I can make is to avoid allocating additional tx rings, which
> >means that we can skip the calls to mlx4_en_free/alloc_resources().
> I think it's more natural to manage the two types of tx rings
> without converting one type to the other dynamically, just to avoid
> re-allocation.
> I don't see the re-allocation of resources as a serious issue here,
> especially after I submitted patches that add resiliency and make it
> more safe (see them in [1]).
Yes, I saw those, it would be helpful for the v8 version but with v9
onwards I guess it is not conflicting, since no reallocation is done.
> We just need to make sure we don't end up with an inoperative port,
> I will take care of it once net and net-next are merged.
Thanks
> 
> I'd like to keep the tx rings separation, also when it comes to
> resource allocation, so that we allocate xdp rings when needed, and
> not borrow them from the other type.
> I have two possible solutions in mind:
> 1) Based on the suggestion you have in v8, in which rsv tx rings
> grow beyond to the regular tx array, while maintaining the
> accounting.
> One improvement to make the code more clear and readable is to
> explicitly define two fields for accounting the number of tx rings:
> - rename rsv_tx_rings to xdp_tx_rings.
I'll incorporate this one, since anyway the code is changing a bit to
accommodate prog-per-ring.
> - have a new priv->netdev_num_tx_rings = priv->num_tx_rings - xdp_tx_rings.
> and then modify driver rings iterators accordingly.
The ring iteration is sprinkled in too many places to incorporate
easily, so this makes sense as its own cleanup, with either (1) or (2)
from your proposal.
> 
> 2) Another solution I have in mind goes as follows.
> Expand the current tx_ring array to be two dimensional, where
> tx_ring[0] points to the regular tx array, and tx_ring[1] points to
> the xdp rings.
> We look to keep using the same napi handlers and not get into code
> duplication, and make minimal adjustments that do not harm
> performance.
> The idea is to mark CQ's types in creation time, so that we know for
> each CQ whether it's for a regular TX ring or an XDP tx ring.
> When we get a completion, we can just read the CQ type, and then
> choose the correct tx_ring according to cq type and cq->ring,
> something like tx_ring[cq->cq_type][cq->ring]; (we can do it so that
> we avoid using an if statement).
> It is possible to use the existing enum cq_type field "cq->is_tx",
> rename it (say to 'type'), and expand the enum cq_type with an
> additional value for TX_XDP.
> In addition, similarly to the previous suggestion, also here we
> would want to use two separate fields for the number of tx rings,
> one for netdev and one for xdp, and modify the rings iterators
> accordingly.
I think (2) makes sense, but the proof will be in the code, depending on
how easy to review it is.
> 
> [1]:
> https://patchwork.ozlabs.org/patch/649620/
> https://patchwork.ozlabs.org/patch/649619/
> 
> Another (not related) nit, as you're already working on the next V,
> I suggest we rename priv->prog to priv->bpf_prog.
I think xdp_prog is better, but point taken. Will update.
> 
> Regards,
> Tariq

  parent reply	other threads:[~2016-07-19 17:41 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-12  7:51 [PATCH v8 00/11] Add driver bpf hook for early packet drop and forwarding Brenden Blanco
2016-07-12  7:51 ` [PATCH v8 01/11] bpf: add XDP prog type for early driver filter Brenden Blanco
2016-07-12 13:14   ` Jesper Dangaard Brouer
2016-07-12 14:52     ` Tom Herbert
2016-07-12 16:08       ` Jakub Kicinski
2016-07-13  4:14       ` Alexei Starovoitov
2016-07-12  7:51 ` [PATCH v8 02/11] net: add ndo to setup/query xdp prog in adapter rx Brenden Blanco
2016-07-12  7:51 ` [PATCH v8 03/11] rtnl: add option for setting link xdp prog Brenden Blanco
2016-07-12  7:51 ` [PATCH v8 04/11] net/mlx4_en: add support for fast rx drop bpf program Brenden Blanco
2016-07-12 12:02   ` Tariq Toukan
2016-07-13 11:27   ` David Laight
2016-07-13 14:08     ` Brenden Blanco
2016-07-14  7:25   ` Jesper Dangaard Brouer
2016-07-15  3:30     ` Alexei Starovoitov
2016-07-15  8:21       ` Jesper Dangaard Brouer
2016-07-15 16:56         ` Alexei Starovoitov
2016-07-15 16:18       ` Tom Herbert
2016-07-15 16:47         ` Alexei Starovoitov
2016-07-15 17:49           ` Tom Herbert
2016-07-18  9:10             ` Thomas Graf
2016-07-18 11:39               ` Tom Herbert
2016-07-18 12:48                 ` Thomas Graf
2016-07-18 13:07                   ` Tom Herbert
2016-07-19  2:45                     ` Alexei Starovoitov
2016-07-18 19:03                 ` Brenden Blanco
2016-07-15 19:09           ` Jesper Dangaard Brouer
2016-07-18  4:01             ` Alexei Starovoitov
2016-07-18  8:35               ` Daniel Borkmann
2016-07-15 18:08     ` Tom Herbert
2016-07-15 18:45       ` Jesper Dangaard Brouer
2016-07-12  7:51 ` [PATCH v8 05/11] Add sample for adding simple drop program to link Brenden Blanco
2016-07-12  7:51 ` [PATCH v8 06/11] net/mlx4_en: add page recycle to prepare rx ring for tx support Brenden Blanco
2016-07-12 12:09   ` Tariq Toukan
2016-07-12 21:18   ` David Miller
2016-07-13  0:54     ` Brenden Blanco
2016-07-13  7:17       ` Tariq Toukan
2016-07-13 15:40         ` Brenden Blanco
2016-07-15 21:52           ` Brenden Blanco
     [not found]             ` <6d638467-eea6-d3e1-6984-88a1198ef303@gmail.com>
2016-07-19 17:41               ` Brenden Blanco [this message]
2016-07-12  7:51 ` [PATCH v8 07/11] bpf: add XDP_TX xdp_action for direct forwarding Brenden Blanco
2016-07-12  7:51 ` [PATCH v8 08/11] net/mlx4_en: break out tx_desc write into separate function Brenden Blanco
2016-07-12 12:16   ` Tariq Toukan
2016-07-12  7:51 ` [PATCH v8 09/11] net/mlx4_en: add xdp forwarding and data write support Brenden Blanco
2016-07-12  7:51 ` [PATCH v8 10/11] bpf: enable direct packet data write for xdp progs Brenden Blanco
2016-07-12  7:51 ` [PATCH v8 11/11] bpf: add sample for xdp forwarding and rewrite Brenden Blanco
2016-07-12 14:38 ` [PATCH v8 00/11] Add driver bpf hook for early packet drop and forwarding Tariq Toukan
2016-07-13 15:00   ` Tariq Toukan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160719174130.GA19547@gmail.com \
    --to=bblanco@plumgrid.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=as754m@att.com \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=gerlitz.or@gmail.com \
    --cc=hannes@stressinduktion.org \
    --cc=jhs@mojatatu.com \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@dev.mellanox.co.il \
    --cc=tgraf@suug.ch \
    --cc=tom@herbertland.com \
    --cc=ttoukan.linux@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).