All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>,
	John Fastabend <john.r.fastabend@intel.com>,
	Daniel Borkmann <dborkman@redhat.com>,
	Jesper Dangaard Brouer <jbrouer@redhat.com>,
	"John W. Linville" <linville@tuxdriver.com>,
	Florian Westphal <fw@strlen.de>,
	gerlitz.or@gmail.com, netdev@vger.kernel.org,
	john.ronciak@intel.com, amirv@mellanox.com,
	eric.dumazet@gmail.com, danny.zhou@intel.com,
	Willem de Bruijn <willemb@google.com>
Subject: Re: [net-next PATCH v1 1/3] net: sched: af_packet support for direct ring access
Date: Wed, 08 Oct 2014 10:20:14 -0700	[thread overview]
Message-ID: <5435724E.5090507@gmail.com> (raw)
In-Reply-To: <20141007185940.GE27719@hmsreliant.think-freely.org>

On 10/07/2014 11:59 AM, Neil Horman wrote:
> On Tue, Oct 07, 2014 at 01:26:11AM +0200, Hannes Frederic Sowa wrote:
>> Hi John,
>>
>> On Mon, Oct 6, 2014, at 22:37, John Fastabend wrote:
>>>> I find the six additional ndo ops a bit worrisome as we are adding more
>>>> and more subsystem specific ndoops to this struct. I would like to see
>>>> some unification here, but currently cannot make concrete proposals,
>>>> sorry.
>>>
>>> I agree it seems like a bit much. One thought was to split the ndo
>>> ops into categories. Switch ops, MACVLAN ops, basic ops and with this
>>> userspace queue ops. This sort of goes along with some of the switch
>>> offload work which is going to add a handful more ops as best I can
>>> tell.
>>
>> Thanks for your mail, you answered all of my questions.
>>
>> Have you looked at <https://code.google.com/p/kernel/wiki/ProjectUnetq>?
>> Willem (also in Cc) used sysfs files which get mmaped to represent the
>> tx/rx descriptors. The representation was independent of the device and
>> IIRC the prototype used a write(fd, "", 1) to signal the kernel it
>> should proceed with tx. I agree, it would be great to be syscall-free
>> here.
>>
>> For the semantics of the descriptors we could also easily generate files
>> in sysfs. I thought about something like tracepoints already do for
>> representing the data in the ringbuffer depending on the event:
>>
>> -- >8 --
>> # cat /sys/kernel/debug/tracing/events/net/net_dev_queue/format
>> name: net_dev_queue
>> ID: 1006
>> format:
>> 	field:unsigned short common_type;       offset:0;       size:2;
>> 	signed:0;
>> 	field:unsigned char common_flags;       offset:2;       size:1;
>> 	signed:0;
>> 	field:unsigned char common_preempt_count;       offset:3;
>> 	size:1; signed:0;
>> 	field:int common_pid;   offset:4;       size:4; signed:1;
>>
>> 	field:void * skbaddr;   offset:8;       size:8; signed:0;
>> 	field:unsigned int len; offset:16;      size:4; signed:0;
>> 	field:__data_loc char[] name;   offset:20;      size:4;
>> 	signed:1;
>>
>> print fmt: "dev=%s skbaddr=%p len=%u", __get_str(name), REC->skbaddr,
>> REC->len
>> -- >8 --
>>
>> Maybe the macros from tracing are reusable (TP_STRUCT__entry), e.g.
>> endianess would need to be added. Hopefully there is already a user
>> space parser somewhere in the perf sources. An easier to parse binary
>> representation could be added easily and maybe even something vDSO alike
>> if people care about that.
>>
>> Maybe this open/mmap per queue also kills some of the ndo_ops?
>>
>> Bye,
>> Hannes
>>
>
>
> John-
> 	I don't know if its of use to you here, but I was experimenting awhile
> ago with af_packet memory mapping, using the protection bits in the page tables
> as a doorbell mechanism.  I scrapped the work as the performance bottleneck for
> af_packet wasn't found in the syscall trap time, but it occurs to me, it might
> be useful for you here, in that, using this mechanism, if you keep the transmit
> ring non-empty, you only encur the cost of a single trap to start the transmit
> process.  Let me know if you want to see it.
>
> Neil
>


Hi Neil,

If you could forward it along I'll take a look. It seems like something
along these lines will be needed.

Thanks,
John


-- 
John Fastabend         Intel Corporation

  reply	other threads:[~2014-10-08 17:20 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-06  0:06 [net-next PATCH v1 1/3] net: sched: af_packet support for direct ring access John Fastabend
2014-10-06  0:07 ` [net-next PATCH v1 2/3] net: sched: add direct ring acces via af_packet to ixgbe John Fastabend
2014-10-06  0:07 ` [net-next PATCH v1 3/3] net: packet: Document PACKET_DEV_QPAIR_SPLIT and friends John Fastabend
2014-10-06  0:29 ` [net-next PATCH v1 1/3] net: sched: af_packet support for direct ring access Florian Westphal
2014-10-06  1:09   ` David Miller
2014-10-06  1:18     ` John Fastabend
2014-10-06  1:12   ` John Fastabend
2014-10-06  9:49     ` Daniel Borkmann
2014-10-06 15:01       ` John Fastabend
2014-10-06 16:35         ` Jesper Dangaard Brouer
2014-10-06 17:03         ` Hannes Frederic Sowa
2014-10-06 20:37           ` John Fastabend
2014-10-06 23:26             ` Hannes Frederic Sowa
2014-10-07 18:59               ` Neil Horman
2014-10-08 17:20                 ` John Fastabend [this message]
2014-10-09 13:36                   ` [PATCH] af_packet: Add Doorbell transmit mode to AF_PACKET sockets Neil Horman
2014-10-09 15:01                     ` John Fastabend
2014-10-09 16:05                       ` Neil Horman
2014-10-06 16:55 ` [net-next PATCH v1 1/3] net: sched: af_packet support for direct ring access Stephen Hemminger
2014-10-06 20:42   ` John Fastabend
2014-10-06 21:42 ` David Miller
2014-10-07  4:25   ` John Fastabend
2014-10-07  4:24 ` Willem de Bruijn
2014-10-07  9:27   ` David Laight
2014-10-07 15:43     ` David Miller
2014-10-07 15:59       ` David Laight
2014-10-07 16:08         ` David Miller
2014-10-07 15:21   ` Zhou, Danny
2014-10-07 15:46     ` Willem de Bruijn
2014-10-07 15:55       ` John Fastabend
2014-10-07 16:06         ` Zhou, Danny
2014-10-07 16:05     ` David Miller
2014-10-10  3:49       ` Zhou, Danny
  -- strict thread matches above, loose matches on Subject: below --
2014-10-07 16:33 Alexei Starovoitov
2014-10-07 16:46 ` Zhou, Danny
2014-10-07 17:01 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5435724E.5090507@gmail.com \
    --to=john.fastabend@gmail.com \
    --cc=amirv@mellanox.com \
    --cc=danny.zhou@intel.com \
    --cc=dborkman@redhat.com \
    --cc=eric.dumazet@gmail.com \
    --cc=fw@strlen.de \
    --cc=gerlitz.or@gmail.com \
    --cc=hannes@stressinduktion.org \
    --cc=jbrouer@redhat.com \
    --cc=john.r.fastabend@intel.com \
    --cc=john.ronciak@intel.com \
    --cc=linville@tuxdriver.com \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.