netdev.vger.kernel.org archive mirror
From: John Fastabend <john.fastabend@gmail.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>,
	John Fastabend <john.r.fastabend@intel.com>,
	Daniel Borkmann <dborkman@redhat.com>,
	Jesper Dangaard Brouer <jbrouer@redhat.com>,
	"John W. Linville" <linville@tuxdriver.com>,
	Florian Westphal <fw@strlen.de>,
	gerlitz.or@gmail.com, netdev@vger.kernel.org,
	john.ronciak@intel.com, amirv@mellanox.com,
	eric.dumazet@gmail.com, danny.zhou@intel.com,
	Willem de Bruijn <willemb@google.com>
Subject: Re: [net-next PATCH v1 1/3] net: sched: af_packet support for direct ring access
Date: Wed, 08 Oct 2014 10:20:14 -0700
Message-ID: <5435724E.5090507@gmail.com>
In-Reply-To: <20141007185940.GE27719@hmsreliant.think-freely.org>

On 10/07/2014 11:59 AM, Neil Horman wrote:
> On Tue, Oct 07, 2014 at 01:26:11AM +0200, Hannes Frederic Sowa wrote:
>> Hi John,
>>
>> On Mon, Oct 6, 2014, at 22:37, John Fastabend wrote:
>>>> I find the six additional ndo ops a bit worrisome as we are adding more
>>>> and more subsystem-specific ndo ops to this struct. I would like to see
>>>> some unification here, but currently cannot make concrete proposals,
>>>> sorry.
>>>
>>> I agree it seems like a bit much. One thought was to split the ndo
>>> ops into categories. Switch ops, MACVLAN ops, basic ops and with this
>>> userspace queue ops. This sort of goes along with some of the switch
>>> offload work which is going to add a handful more ops as best I can
>>> tell.
>>
>> Thanks for your mail, you answered all of my questions.
>>
>> Have you looked at <https://code.google.com/p/kernel/wiki/ProjectUnetq>?
>> Willem (also in Cc) used sysfs files which get mmapped to represent the
>> tx/rx descriptors. The representation was independent of the device and
>> IIRC the prototype used a write(fd, "", 1) to signal the kernel it
>> should proceed with tx. I agree, it would be great to be syscall-free
>> here.
>>
>> For the semantics of the descriptors we could also easily generate files
>> in sysfs. I thought about something like what tracepoints already do to
>> describe the data in the ring buffer for each event:
>>
>> -- >8 --
>> # cat /sys/kernel/debug/tracing/events/net/net_dev_queue/format
>> name: net_dev_queue
>> ID: 1006
>> format:
>> 	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
>> 	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
>> 	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
>> 	field:int common_pid;	offset:4;	size:4;	signed:1;
>>
>> 	field:void * skbaddr;	offset:8;	size:8;	signed:0;
>> 	field:unsigned int len;	offset:16;	size:4;	signed:0;
>> 	field:__data_loc char[] name;	offset:20;	size:4;	signed:1;
>>
>> print fmt: "dev=%s skbaddr=%p len=%u", __get_str(name), REC->skbaddr, REC->len
>> -- >8 --
>>
>> Maybe the macros from tracing are reusable (TP_STRUCT__entry); e.g.
>> endianness would need to be added. Hopefully there is already a user
>> space parser somewhere in the perf sources. An easier-to-parse binary
>> representation could be added easily, and maybe even something vDSO-like
>> if people care about that.
>>
>> Maybe this open/mmap per queue also kills some of the ndo_ops?
>>
>> Bye,
>> Hannes
>>
>
>
> John-
> 	I don't know if it's of use to you here, but a while ago I was experimenting
> with af_packet memory mapping, using the protection bits in the page tables
> as a doorbell mechanism.  I scrapped the work because the performance bottleneck
> for af_packet wasn't in the syscall trap time, but it occurs to me that it might
> be useful for you here: with this mechanism, if you keep the transmit ring
> non-empty, you only incur the cost of a single trap to start the transmit
> process.  Let me know if you want to see it.
>
> Neil
>


Hi Neil,

If you could forward it along I'll take a look. It seems like something
along these lines will be needed.

Thanks,
John


-- 
John Fastabend         Intel Corporation
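
Editorial sketch (not part of the original message): Neil's point is that a
doorbell is only needed to start transmission, so a producer that keeps the
ring non-empty pays for one trap in total. The toy model below illustrates
that accounting in plain Python. It is a userspace simulation only: the
TxRing class, its method names and the "consumer_running" flag are
hypothetical, and nothing here touches page-table protection bits or
af_packet.

```python
from collections import deque


class TxRing:
    """Toy transmit ring; the doorbell is rung only when the consumer is idle."""

    def __init__(self, size):
        self.size = size
        self.slots = deque()
        self.consumer_running = False  # stands in for "kernel is transmitting"
        self.doorbells = 0             # number of simulated traps
        self.sent = []

    def post(self, frame):
        """Queue one frame; return False if the ring is full."""
        if len(self.slots) >= self.size:
            return False
        self.slots.append(frame)
        if not self.consumer_running:
            # Only this path pays for a trap into the kernel.
            self.doorbells += 1
            self.consumer_running = True
        return True

    def consume(self, budget):
        """Simulate the kernel draining up to `budget` frames."""
        for _ in range(budget):
            if not self.slots:
                break
            self.sent.append(self.slots.popleft())
        if not self.slots:
            # Ring ran dry: the next post() must ring the doorbell again.
            self.consumer_running = False


def steady_stream(n):
    """Producer stays one frame ahead, so the ring never empties."""
    ring = TxRing(size=8)
    ring.post(0)
    ring.post(1)
    for i in range(2, n):
        ring.consume(1)
        ring.post(i)
    return ring.doorbells


def one_at_a_time(n):
    """Ring empties after every frame, so every post needs a doorbell."""
    ring = TxRing(size=8)
    for i in range(n):
        ring.post(i)
        ring.consume(1)
    return ring.doorbells
```

In this model the doorbell count depends only on how often the ring runs
dry, not on how many frames are sent, which is the property Neil describes.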

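
Editorial sketch (not part of the original thread): Hannes suggests
describing descriptor layouts the way tracepoint "format" files describe
ring-buffer records. To show how little a userspace consumer would need, the
snippet below parses the field lines of such a file into name, type, offset
and size. It is a minimal illustration that assumes the line layout quoted in
the message; parse_format and SAMPLE are hypothetical names, and this is not
the parser from the perf sources.

```python
import re

# One field entry: "field:<decl>; offset:<n>; size:<n>; signed:<0|1>;"
# \s* also matches newlines, so entries wrapped by a mail client still parse.
FIELD_RE = re.compile(
    r"field:(?P<decl>.+?);\s*offset:(?P<offset>\d+);"
    r"\s*size:(?P<size>\d+);\s*signed:(?P<signed>[01]);"
)


def parse_format(text):
    """Return one dict per field found in a tracepoint-style format file."""
    fields = []
    for m in FIELD_RE.finditer(text):
        decl = m.group("decl").strip()
        # The last space-separated token is the field name; the rest is the type.
        ctype, _, name = decl.rpartition(" ")
        fields.append({
            "type": ctype,
            "name": name,
            "offset": int(m.group("offset")),
            "size": int(m.group("size")),
            "signed": m.group("signed") == "1",
        })
    return fields


# Excerpt of the net_dev_queue format quoted in the message, with one entry
# left wrapped across two lines on purpose.
SAMPLE = """\
name: net_dev_queue
ID: 1006
format:
\tfield:unsigned short common_type;\toffset:0;\tsize:2;
\tsigned:0;
\tfield:void * skbaddr;\toffset:8;\tsize:8;\tsigned:0;
\tfield:unsigned int len;\toffset:16;\tsize:4;\tsigned:0;
"""

if __name__ == "__main__":
    for field in parse_format(SAMPLE):
        print(field)
```

A device-independent descriptor layout exported this way would still need
the extra metadata Hannes mentions, such as endianness, which this format
does not carry.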

Thread overview: 36+ messages
2014-10-06  0:06 [net-next PATCH v1 1/3] net: sched: af_packet support for direct ring access John Fastabend
2014-10-06  0:07 ` [net-next PATCH v1 2/3] net: sched: add direct ring acces via af_packet to ixgbe John Fastabend
2014-10-06  0:07 ` [net-next PATCH v1 3/3] net: packet: Document PACKET_DEV_QPAIR_SPLIT and friends John Fastabend
2014-10-06  0:29 ` [net-next PATCH v1 1/3] net: sched: af_packet support for direct ring access Florian Westphal
2014-10-06  1:09   ` David Miller
2014-10-06  1:18     ` John Fastabend
2014-10-06  1:12   ` John Fastabend
2014-10-06  9:49     ` Daniel Borkmann
2014-10-06 15:01       ` John Fastabend
2014-10-06 16:35         ` Jesper Dangaard Brouer
2014-10-06 17:03         ` Hannes Frederic Sowa
2014-10-06 20:37           ` John Fastabend
2014-10-06 23:26             ` Hannes Frederic Sowa
2014-10-07 18:59               ` Neil Horman
2014-10-08 17:20                 ` John Fastabend [this message]
2014-10-09 13:36                   ` [PATCH] af_packet: Add Doorbell transmit mode to AF_PACKET sockets Neil Horman
2014-10-09 15:01                     ` John Fastabend
2014-10-09 16:05                       ` Neil Horman
2014-10-06 16:55 ` [net-next PATCH v1 1/3] net: sched: af_packet support for direct ring access Stephen Hemminger
2014-10-06 20:42   ` John Fastabend
2014-10-06 21:42 ` David Miller
2014-10-07  4:25   ` John Fastabend
2014-10-07  4:24 ` Willem de Bruijn
2014-10-07  9:27   ` David Laight
2014-10-07 15:43     ` David Miller
2014-10-07 15:59       ` David Laight
2014-10-07 16:08         ` David Miller
2014-10-07 15:21   ` Zhou, Danny
2014-10-07 15:46     ` Willem de Bruijn
2014-10-07 15:55       ` John Fastabend
2014-10-07 16:06         ` Zhou, Danny
2014-10-07 16:05     ` David Miller
2014-10-10  3:49       ` Zhou, Danny
  -- strict thread matches above, loose matches on Subject: below --
2014-10-07 16:33 Alexei Starovoitov
2014-10-07 16:46 ` Zhou, Danny
2014-10-07 17:01 ` David Miller
