* XDP on many-core NPU
From: MD I. Islam @ 2017-11-27 23:33 UTC
To: xdp-newbies
Hi
I was wondering whether XDP can scale to a many-core NPU (such as the
NPS-400, which has 256 cores). I need to develop an XCP/RCP-like application
that can achieve bare-metal performance on each core. The application
will run in a run-to-completion model. I see that DPDK can run a userspace
application on each core. Does XDP have anything like that?
Please let me know if you have any suggestions.
Thanks
Tamim
* Re: XDP on many-core NPU
From: Jesper Dangaard Brouer @ 2017-11-28 11:02 UTC
To: MD I. Islam; +Cc: xdp-newbies, brouer
On Mon, 27 Nov 2017 18:33:10 -0500 "MD I. Islam" <tamim@csebuet.org> wrote:
> I was wondering if XDP can scale to many-core NPU (such as NPS-400
> which has 256 cores)? I need to develop a XCP/RCP like application
> that can achieve bare-metal performance on each core. The application
> will run in a run-to-completion model. I see, DPDK can run userspace
> application on each core. I'm wondering if XDP has anything like that?
> Please let me know any suggestion.
Hi Tamim,
I think you are mixing up things a bit here...
You mention a specific NIC (NPS-400) which has many cores inside the
NIC. You need to understand that XDP is a software solution, where the
programming language is eBPF. XDP does NOT run inside the NIC; instead,
XDP runs as the earliest possible step in the Linux kernel network stack.
The only NIC that does hardware offloading of XDP is Netronome's[1]; see
their white papers[2].
[1] https://www.netronome.com/
[2] https://open-nfp.org/dataplanes-ebpf/technical-papers/
Regarding scaling: XDP scales perfectly with each added CPU core. XDP is
currently (footnote-1) loaded for the entire NIC, but the XDP/eBPF
program is executed separately and independently on each NIC RX-ring
queue (processing up to 64 frames per NAPI poll cycle).
XDP scaling depends on how well the NIC's RSS distributes traffic
across RX-ring queues, which is also true for the normal kernel network
stack. To address bad RSS distribution, I recently implemented cpumap[3]
to allow XDP to scale delivery to the normal kernel network
stack. See the sample code[4][5] for how to use it.
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/cpumap.c
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_kern.c
[5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_user.c
(footnote-1: there are debates regarding loading XDP/eBPF progs on
specific RX-queue numbers, so this might change.)
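To make the RSS point concrete, here is a small userspace model in Python. It is a sketch, not kernel or driver code: CRC32 is only an assumed stand-in for the NIC's real RSS hash, and the names rx_queue_for_flow, frames_per_queue and poll_cycles are made up for this illustration. It shows how every frame of one flow lands on the same RX queue, so only that queue's CPU runs the XDP program, at up to 64 frames per poll cycle.

```python
import zlib

NAPI_BUDGET = 64  # frames per NAPI poll cycle, the figure quoted above


def rx_queue_for_flow(flow, num_queues):
    """Map a flow tuple to an RX queue index.

    Assumption: CRC32 stands in for the NIC's real RSS hash function.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_queues


def frames_per_queue(flows, num_queues):
    """Return {queue: frames}; `flows` maps a flow tuple to its frame count."""
    load = {q: 0 for q in range(num_queues)}
    for flow, frames in flows.items():
        load[rx_queue_for_flow(flow, num_queues)] += frames
    return load


def poll_cycles(frames, budget=NAPI_BUDGET):
    """Poll cycles needed to drain `frames` at `budget` frames per cycle."""
    return -(-frames // budget)  # ceiling division
```

With a single flow of 6400 frames and 4 queues, frames_per_queue puts all 6400 frames on one queue and poll_cycles(6400) gives 100, while the other three queues stay idle. That imbalance is the case cpumap is meant to help with.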
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
* Re: XDP on many-core NPU
From: MD I. Islam @ 2017-11-28 20:00 UTC
To: Jesper Dangaard Brouer; +Cc: xdp-newbies
On Tue, Nov 28, 2017 at 6:02 AM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Mon, 27 Nov 2017 18:33:10 -0500 "MD I. Islam" <tamim@csebuet.org> wrote:
>
>> I was wondering if XDP can scale to many-core NPU (such as NPS-400
>> which has 256 cores)? I need to develop a XCP/RCP like application
>> that can achieve bare-metal performance on each core. The application
>> will run in a run-to-completion model. I see, DPDK can run userspace
>> application on each core. I'm wondering if XDP has anything like that?
>> Please let me know any suggestion.
>
> Hi Tamim,
>
> I think you are mixing up things a bit here...
>
> You mention a specific NIC (NPS-400) which have many cores inside the
> NIC. You need to understand XDP is a software solution, where the
> programming language is eBPF. XDP does NOT run inside the NIC, instead
> XDP runs as the earliest possible step in the Linux kernel network stack.
>
> The only NIC that does hardware offloading of XDP is Netronome[1], see
> their white papers[2].
Hi Jesper
I was looking at
http://events.linuxfoundation.org/sites/events/files/slides/Massively_Multi-Core_LPC_2013.pdf.
It looks like the NPS-400 NIC itself also runs an embedded Linux. The
packets are processed by the embedded ARC processor. Packet
processing, however, is done in userspace. They also use a DPDK-like
framework, OpenNPU/NPS SDK, to bypass the kernel. Is it possible to
achieve something similar using XDP? Please let me know if I'm
getting anything wrong. I'm not sure whether it is possible for me (a
third-party developer/PhD student) to load a customized Linux on their
NIC.
>
> [1] https://www.netronome.com/
> [2] https://open-nfp.org/dataplanes-ebpf/technical-papers/
>
> Regarding scaling: XDP scales perfect for each added CPU core. XDP is
> currently (footnote-1) loaded on for entire NIC, but the XDP/eBPF
> program is executed separate/independent on each NIC RX-ring queue
> (processing up-to 64 frames per NAPI poll cycle).
>
> The XDP scaling depend on how well the NIC RSS distribute traffic
> across RX-ring queues, which is also true for the normal kernel network
> stack. To address bad RSS distribution, I recently implement cpumap[3]
> to allow XDP to scale delivery to the normal kernel network
> stack. See sample code[4][5] on how to use it.
I was not looking to offload an eBPF program from the control plane. I
would rather program the dataplane by modifying the embedded Linux.
I'm wondering if I can create kernel threads, pin one to each core,
and have XDP provide the threads with packets. Please let me know if
you have any suggestions.
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/cpumap.c
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_kern.c
> [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_user.c
>
>
> (footnote-1: there are debates regarding loading XDP/eBPF progs on
> specific RX-queue numbers, so this might change.)
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
Many thanks
Tamim
PhD Candidate
Kent State University
http://web.cs.kent.edu/~mislam4/
* Re: XDP on many-core NPU
From: Jesper Dangaard Brouer @ 2017-11-28 20:38 UTC
To: MD I. Islam; +Cc: xdp-newbies, brouer, Gilad Ben Yossef, Gilad Ben-Yossef
On Tue, 28 Nov 2017 15:00:04 -0500 "MD I. Islam" <tamim@csebuet.org> wrote:
> On Tue, Nov 28, 2017 at 6:02 AM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> > On Mon, 27 Nov 2017 18:33:10 -0500 "MD I. Islam" <tamim@csebuet.org> wrote:
> >
> >> I was wondering if XDP can scale to many-core NPU (such as NPS-400
> >> which has 256 cores)? I need to develop a XCP/RCP like application
> >> that can achieve bare-metal performance on each core. The application
> >> will run in a run-to-completion model. I see, DPDK can run userspace
> >> application on each core. I'm wondering if XDP has anything like that?
> >> Please let me know any suggestion.
> >
> > Hi Tamim,
> >
> > I think you are mixing up things a bit here...
> >
> > You mention a specific NIC (NPS-400) which have many cores inside the
> > NIC. You need to understand XDP is a software solution, where the
> > programming language is eBPF. XDP does NOT run inside the NIC, instead
> > XDP runs as the earliest possible step in the Linux kernel network stack.
> >
> > The only NIC that does hardware offloading of XDP is Netronome[1], see
> > their white papers[2].
>
> Hi Jesper
>
> I was looking at
> http://events.linuxfoundation.org/sites/events/files/slides/Massively_Multi-Core_LPC_2013.pdf.
> It looks like the NPS-400 NIC also runs an embedded Linux itself. The
> packets are processed by the embedded ARC processor. Packets
> processing however is done at userspace. They also use DPDK-like
> framework OpenNPU/NPS SDK to bypass the kernel. Is it possible to
> achieve something similar to using XDP? Please let me know if I'm
> getting anything wrong. I'm not sure if it is possible for me (a
> third party developer/PhD student) to load a customized Linux on the
> their NIC.
You should ask Gilad Ben-Yossef (Cc'ed) whether he can help you get XDP
working on this NIC ;-)
> > [1] https://www.netronome.com/
> > [2] https://open-nfp.org/dataplanes-ebpf/technical-papers/
> >
> > Regarding scaling: XDP scales perfect for each added CPU core. XDP
> > is currently (footnote-1) loaded on for entire NIC, but the XDP/eBPF
> > program is executed separate/independent on each NIC RX-ring queue
> > (processing up-to 64 frames per NAPI poll cycle).
> >
> > The XDP scaling depend on how well the NIC RSS distribute traffic
> > across RX-ring queues, which is also true for the normal kernel
> > network stack. To address bad RSS distribution, I recently
> > implement cpumap[3] to allow XDP to scale delivery to the normal
> > kernel network stack. See sample code[4][5] on how to use it.
>
> I was not looking to offload eBPF program from control plane. I would
> rather like to program the dataplane by modifying the embedded Linux.
I know Broadcom is coming out with a smart-NIC that actually just runs
Linux, and they plan to support and use XDP to redirect packets into
the machine that has the PCI NIC installed. Is that what you are
looking for?
> I'm wondering if I can create kernel thread and pin them on each core
> and having XDP to provide the thread with packets.
Well, what you describe above is exactly what cpumap does: it creates
kthreads and pins them to specific CPUs. See the three links below [3][4][5].
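The cpumap idea described above can be modelled in userspace as one queue per destination CPU, each drained by its own worker thread. This is only a sketch in Python under stated assumptions: run_cpumap_model is a hypothetical name, round-robin is an illustrative choice of destination policy, and the workers are ordinary threads that are not pinned to any CPU. In the kernel, the XDP program selects the destination by redirecting into the cpumap, and the kthread bound to that CPU picks the frames up.

```python
import queue
import threading


def run_cpumap_model(frames, dest_cpus):
    """Spread frames round-robin over per-CPU queues, one worker per queue.

    Returns {cpu: [frames handled by that CPU's worker]}.
    """
    queues = {cpu: queue.Queue() for cpu in dest_cpus}
    handled = {cpu: [] for cpu in dest_cpus}

    def worker(cpu):
        # Stand-in for the per-CPU kthread. A real cpumap kthread is bound
        # to its CPU by the kernel; this model does not attempt pinning.
        while True:
            frame = queues[cpu].get()
            if frame is None:  # sentinel: no more frames for this worker
                break
            handled[cpu].append(frame)

    workers = [threading.Thread(target=worker, args=(cpu,)) for cpu in dest_cpus]
    for t in workers:
        t.start()

    # Stand-in for the XDP program on the RX CPU: pick a destination CPU
    # for each frame (round-robin here) and enqueue the frame to it.
    for i, frame in enumerate(frames):
        queues[dest_cpus[i % len(dest_cpus)]].put(frame)

    # Tell every worker to stop, then wait for the queues to drain.
    for cpu in dest_cpus:
        queues[cpu].put(None)
    for t in workers:
        t.join()
    return handled
```

Calling run_cpumap_model(list(range(10)), [1, 2, 3]) hands frames 0, 3, 6, 9 to the worker for CPU 1, frames 1, 4, 7 to CPU 2, and frames 2, 5, 8 to CPU 3. The point of the design is that the CPU receiving packets only does the cheap steering step, and the heavier per-packet work happens on the destination CPUs.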
> > [3]
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/cpumap.c
> > [4]
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_kern.c
> > [5]
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_user.c
> >
> >
> > (footnote-1: there are debates regarding loading XDP/eBPF progs on
> > specific RX-queue numbers, so this might change.)
>
> Many thanks
> Tamim
> PhD Candidate
> Kent State University
> http://web.cs.kent.edu/~mislam4/
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
* Re: XDP on many-core NPU
From: Andy Gospodarek @ 2017-11-28 21:38 UTC
To: Jesper Dangaard Brouer
Cc: MD I. Islam, xdp-newbies, Gilad Ben Yossef, Gilad Ben-Yossef
On Tue, Nov 28, 2017 at 09:38:49PM +0100, Jesper Dangaard Brouer wrote:
>
> On Tue, 28 Nov 2017 15:00:04 -0500 "MD I. Islam" <tamim@csebuet.org> wrote:
>
> > On Tue, Nov 28, 2017 at 6:02 AM, Jesper Dangaard Brouer
> > <brouer@redhat.com> wrote:
> > >
> > > On Mon, 27 Nov 2017 18:33:10 -0500 "MD I. Islam" <tamim@csebuet.org> wrote:
> > >
> > >> I was wondering if XDP can scale to many-core NPU (such as NPS-400
> > >> which has 256 cores)? I need to develop a XCP/RCP like application
> > >> that can achieve bare-metal performance on each core. The application
> > >> will run in a run-to-completion model. I see, DPDK can run userspace
> > >> application on each core. I'm wondering if XDP has anything like that?
> > >> Please let me know any suggestion.
> > >
> > > Hi Tamim,
> > >
> > > I think you are mixing up things a bit here...
> > >
> > > You mention a specific NIC (NPS-400) which have many cores inside the
> > > NIC. You need to understand XDP is a software solution, where the
> > > programming language is eBPF. XDP does NOT run inside the NIC, instead
> > > XDP runs as the earliest possible step in the Linux kernel network stack.
> > >
> > > The only NIC that does hardware offloading of XDP is Netronome[1], see
> > > their white papers[2].
> >
> > Hi Jesper
> >
> > I was looking at
> > http://events.linuxfoundation.org/sites/events/files/slides/Massively_Multi-Core_LPC_2013.pdf.
> > It looks like the NPS-400 NIC also runs an embedded Linux itself. The
> > packets are processed by the embedded ARC processor. Packets
> > processing however is done at userspace. They also use DPDK-like
> > framework OpenNPU/NPS SDK to bypass the kernel. Is it possible to
> > achieve something similar to using XDP? Please let me know if I'm
> > getting anything wrong. I'm not sure if it is possible for me (a
> > third party developer/PhD student) to load a customized Linux on the
> > their NIC.
>
> You should ask Gilad Ben-Yossef (Cc'ed), if he can help you getting XDP
> working on this NIC? ;-)
>
>
> > > [1] https://www.netronome.com/
> > > [2] https://open-nfp.org/dataplanes-ebpf/technical-papers/
> > >
> > > Regarding scaling: XDP scales perfect for each added CPU core. XDP
> > > is currently (footnote-1) loaded on for entire NIC, but the XDP/eBPF
> > > program is executed separate/independent on each NIC RX-ring queue
> > > (processing up-to 64 frames per NAPI poll cycle).
> > >
> > > The XDP scaling depend on how well the NIC RSS distribute traffic
> > > across RX-ring queues, which is also true for the normal kernel
> > > network stack. To address bad RSS distribution, I recently
> > > implement cpumap[3] to allow XDP to scale delivery to the normal
> > > kernel network stack. See sample code[4][5] on how to use it.
> >
> > I was not looking to offload eBPF program from control plane. I would
> > rather like to program the dataplane by modifying the embedded Linux.
>
> I know Broadcom is coming out with a smart-NIC, that actually just runs
> Linux, and they plan to support and use XDP to redirect packets into
> the machine that have the PCI NIC installed. Is that what you are
> looking for?
>
Did somebody say, Broadcom? :-)
There are options that exist in the world for running a customized
version of Linux in a NIC that can control the traffic (if you like)
before the traffic arrives at the server. Jesper is also correct that
standard XDP programs run directly on this NIC. Feel free to
email me directly if you want to know more or want help determining
whether hardware like this would be good for your research.
>
> > I'm wondering if I can create kernel thread and pin them on each core
> > and having XDP to provide the thread with packets.
>
> Well, what you describe above is exactly what cpumap does, it create
> kthread and pin them to specific CPUs. See below three links [3][4][5].
>
> > > [3]
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/cpumap.c
> > > [4]
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_kern.c
> > > [5]
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_user.c
> > >
> > >
> > > (footnote-1: there are debates regarding loading XDP/eBPF progs on
> > > specific RX-queue numbers, so this might change.)
> >
> > Many thanks
> > Tamim
> > PhD Candidate
> > Kent State University
> > http://web.cs.kent.edu/~mislam4/
>
> --
> Best regards,
> Jesper Dangaard Brouer
> MSc.CS, Principal Kernel Engineer at Red Hat
> LinkedIn: http://www.linkedin.com/in/brouer
* Re: XDP on many-core NPU
From: MD I. Islam @ 2017-11-28 22:50 UTC
To: Andy Gospodarek
Cc: Jesper Dangaard Brouer, xdp-newbies, Gilad Ben Yossef,
Gilad Ben-Yossef
On Tue, Nov 28, 2017 at 4:14 PM, Andy Gospodarek <andy@greyhouse.net> wrote:
> On Tue, Nov 28, 2017 at 3:38 PM, Jesper Dangaard Brouer <brouer@redhat.com>
> wrote:
>>
>>
>> On Tue, 28 Nov 2017 15:00:04 -0500 "MD I. Islam" <tamim@csebuet.org>
>> wrote:
>>
>> > On Tue, Nov 28, 2017 at 6:02 AM, Jesper Dangaard Brouer
>> > <brouer@redhat.com> wrote:
>> > >
>> > > On Mon, 27 Nov 2017 18:33:10 -0500 "MD I. Islam" <tamim@csebuet.org>
>> > > wrote:
>> > >
>> > >> I was wondering if XDP can scale to many-core NPU (such as NPS-400
>> > >> which has 256 cores)? I need to develop a XCP/RCP like application
>> > >> that can achieve bare-metal performance on each core. The application
>> > >> will run in a run-to-completion model. I see, DPDK can run userspace
>> > >> application on each core. I'm wondering if XDP has anything like
>> > >> that?
>> > >> Please let me know any suggestion.
>> > >
>> > > Hi Tamim,
>> > >
>> > > I think you are mixing up things a bit here...
>> > >
>> > > You mention a specific NIC (NPS-400) which have many cores inside the
>> > > NIC. You need to understand XDP is a software solution, where the
>> > > programming language is eBPF. XDP does NOT run inside the NIC,
>> > > instead
>> > > XDP runs as the earliest possible step in the Linux kernel network
>> > > stack.
>> > >
>> > > The only NIC that does hardware offloading of XDP is Netronome[1], see
>> > > their white papers[2].
>> >
>> > Hi Jesper
>> >
>> > I was looking at
>> >
>> > http://events.linuxfoundation.org/sites/events/files/slides/Massively_Multi-Core_LPC_2013.pdf.
>> > It looks like the NPS-400 NIC also runs an embedded Linux itself. The
>> > packets are processed by the embedded ARC processor. Packets
>> > processing however is done at userspace. They also use DPDK-like
>> > framework OpenNPU/NPS SDK to bypass the kernel. Is it possible to
>> > achieve something similar to using XDP? Please let me know if I'm
>> > getting anything wrong. I'm not sure if it is possible for me (a
>> > third party developer/PhD student) to load a customized Linux on the
>> > their NIC.
>>
>> You should ask Gilad Ben-Yossef (Cc'ed), if he can help you getting XDP
>> working on this NIC? ;-)
>>
>>
>> > > [1] https://www.netronome.com/
>> > > [2] https://open-nfp.org/dataplanes-ebpf/technical-papers/
>> > >
>> > > Regarding scaling: XDP scales perfect for each added CPU core. XDP
>> > > is currently (footnote-1) loaded on for entire NIC, but the XDP/eBPF
>> > > program is executed separate/independent on each NIC RX-ring queue
>> > > (processing up-to 64 frames per NAPI poll cycle).
>> > >
>> > > The XDP scaling depend on how well the NIC RSS distribute traffic
>> > > across RX-ring queues, which is also true for the normal kernel
>> > > network stack. To address bad RSS distribution, I recently
>> > > implement cpumap[3] to allow XDP to scale delivery to the normal
>> > > kernel network stack. See sample code[4][5] on how to use it.
>> >
>> > I was not looking to offload eBPF program from control plane. I would
>> > rather like to program the dataplane by modifying the embedded Linux.
>>
>> I know Broadcom is coming out with a smart-NIC, that actually just runs
>> Linux, and they plan to support and use XDP to redirect packets into
>> the machine that have the PCI NIC installed. Is that what you are
>> looking for?
>>
>
> Did somebody say, Broadcom? :-)
>
> There are options that exist in the world for running a customized version
> of Linux in a NIC that can control the traffic (if you like) before the
> traffic arrives at the server. Jesper is also correct that standard XDP
> programs do run directly on this NIC as well. Feel free to email me
> directly if you want to know more and help determine if hardware like this
> would be good for your research.
Hi Andy
That would be very helpful! I will email you directly.
Thanks
>
>>
>> > I'm wondering if I can create kernel thread and pin them on each core
>> > and having XDP to provide the thread with packets.
>>
>> Well, what you describe above is exactly what cpumap does, it create
>> kthread and pin them to specific CPUs. See below three links [3][4][5].
>>
>> > > [3]
>> > >
>> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/cpumap.c
>> > > [4]
>> > >
>> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_kern.c
>> > > [5]
>> > >
>> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_redirect_cpu_user.c
>> > >
>> > >
>> > > (footnote-1: there are debates regarding loading XDP/eBPF progs on
>> > > specific RX-queue numbers, so this might change.)
>> >
>> > Many thanks
>> > Tamim
>> > PhD Candidate
>> > Kent State University
>> > http://web.cs.kent.edu/~mislam4/
>>
>> --
>> Best regards,
>> Jesper Dangaard Brouer
>> MSc.CS, Principal Kernel Engineer at Red Hat
>> LinkedIn: http://www.linkedin.com/in/brouer
* Re: XDP on many-core NPU
From: Gilad Ben-Yossef @ 2017-11-29 5:38 UTC
To: Jesper Dangaard Brouer; +Cc: MD I. Islam, xdp-newbies, Gilad Ben Yossef
Hi,
On Tue, Nov 28, 2017 at 10:38 PM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Tue, 28 Nov 2017 15:00:04 -0500 "MD I. Islam" <tamim@csebuet.org> wrote:
>
>> On Tue, Nov 28, 2017 at 6:02 AM, Jesper Dangaard Brouer
>> <brouer@redhat.com> wrote:
>> >
>> > On Mon, 27 Nov 2017 18:33:10 -0500 "MD I. Islam" <tamim@csebuet.org> wrote:
>> >
>> >> I was wondering if XDP can scale to many-core NPU (such as NPS-400
>> >> which has 256 cores)? I need to develop a XCP/RCP like application
>> >> that can achieve bare-metal performance on each core. The application
>> >> will run in a run-to-completion model. I see, DPDK can run userspace
>> >> application on each core. I'm wondering if XDP has anything like that?
>> >> Please let me know any suggestion.
>> >
>> > Hi Tamim,
>> >
>> > I think you are mixing up things a bit here...
>> >
>> > You mention a specific NIC (NPS-400) which have many cores inside the
>> > NIC. You need to understand XDP is a software solution, where the
>> > programming language is eBPF. XDP does NOT run inside the NIC, instead
>> > XDP runs as the earliest possible step in the Linux kernel network stack.
>> >
>> > The only NIC that does hardware offloading of XDP is Netronome[1], see
>> > their white papers[2].
>>
>> Hi Jesper
>>
>> I was looking at
>> http://events.linuxfoundation.org/sites/events/files/slides/Massively_Multi-Core_LPC_2013.pdf.
>> It looks like the NPS-400 NIC also runs an embedded Linux itself. The
>> packets are processed by the embedded ARC processor. Packets
>> processing however is done at userspace. They also use DPDK-like
>> framework OpenNPU/NPS SDK to bypass the kernel. Is it possible to
>> achieve something similar to using XDP? Please let me know if I'm
>> getting anything wrong. I'm not sure if it is possible for me (a
>> third party developer/PhD student) to load a customized Linux on the
>> their NIC.
>
> You should ask Gilad Ben-Yossef (Cc'ed), if he can help you getting XDP
> working on this NIC? ;-)
The NPS-400 wasn't a NIC but a network processor designed for a router,
and anyway, AFAIK, Mellanox (which purchased EZchip) killed this product
line.
You might find their BlueField-based NIC more suitable for what you want.
I no longer work for Mellanox :-)
Gilad
--
Gilad Ben-Yossef
Chief Coffee Drinker
"If you take a class in large-scale robotics, can you end up in a
situation where the homework eats your dog?"
-- Jean-Baptiste Queru
* Re: XDP on many-core NPU
From: MD I. Islam @ 2017-11-29 8:02 UTC
To: Gilad Ben-Yossef, Andy Gospodarek, Edwin Peer
Cc: Jesper Dangaard Brouer, xdp-newbies, Gilad Ben Yossef
Yeah, I actually need a network processor (NPU) on which I can run Linux,
something like the NPS-400, Ericsson SNP 4000 or Freescale T4240. An
Ericsson research paper [1] also designed a hypothetical 256-core
ARM Cortex-M3-based NPU. Do you know of any many-core NPU-based
development board on which I can install Linux?
1. http://conferences.sigcomm.org/sigcomm/2013/papers/hotsdn/p103.pdf
Many thanks for the help!!
On Wed, Nov 29, 2017 at 12:37 AM, Gilad Ben-Yossef <gilad@benyossef.com> wrote:
> Hi,
>
> On Tue, Nov 28, 2017 at 10:38 PM, Jesper Dangaard Brouer <brouer@redhat.com>
> wrote:
>>
>>
>> On Tue, 28 Nov 2017 15:00:04 -0500 "MD I. Islam" <tamim@csebuet.org>
>> wrote:
>>
>> > On Tue, Nov 28, 2017 at 6:02 AM, Jesper Dangaard Brouer
>> > <brouer@redhat.com> wrote:
>> > >
>> > > On Mon, 27 Nov 2017 18:33:10 -0500 "MD I. Islam" <tamim@csebuet.org>
>> > > wrote:
>> > >
>> > >> I was wondering if XDP can scale to many-core NPU (such as NPS-400
>> > >> which has 256 cores)? I need to develop a XCP/RCP like application
>> > >> that can achieve bare-metal performance on each core. The application
>> > >> will run in a run-to-completion model. I see, DPDK can run userspace
>> > >> application on each core. I'm wondering if XDP has anything like
>> > >> that?
>> > >> Please let me know any suggestion.
>> > >
>> > > Hi Tamim,
>> > >
>> > > I think you are mixing up things a bit here...
>> > >
>> > > You mention a specific NIC (NPS-400) which have many cores inside the
>> > > NIC. You need to understand XDP is a software solution, where the
>> > > programming language is eBPF. XDP does NOT run inside the NIC,
>> > > instead
>> > > XDP runs as the earliest possible step in the Linux kernel network
>> > > stack.
>> > >
>> > > The only NIC that does hardware offloading of XDP is Netronome[1], see
>> > > their white papers[2].
>> >
>> > Hi Jesper
>> >
>> > I was looking at
>> >
>> > http://events.linuxfoundation.org/sites/events/files/slides/Massively_Multi-Core_LPC_2013.pdf.
>> > It looks like the NPS-400 NIC also runs an embedded Linux itself. The
>> > packets are processed by the embedded ARC processor. Packets
>> > processing however is done at userspace. They also use DPDK-like
>> > framework OpenNPU/NPS SDK to bypass the kernel. Is it possible to
>> > achieve something similar to using XDP? Please let me know if I'm
>> > getting anything wrong. I'm not sure if it is possible for me (a
>> > third party developer/PhD student) to load a customized Linux on the
>> > their NIC.
>>
>> You should ask Gilad Ben-Yossef (Cc'ed), if he can help you getting XDP
>> working on this NIC? ;-)
>>
>>
> The NPS-400 wasn't a NIC but a network processor designed for a router and
> anyway AFAIK, Mellanox (who purchased EZchip) killed this product line.
> You might find their BlueField based NIC more suitable for what you want.
>
> I no longer work for Mellanox :-)
>
> Gilad
>
>
> --
> Gilad Ben-Yossef
> Chief Coffee Drinker
>
> "If you take a class in large-scale robotics, can you end up in a situation
> where the homework eats your dog?"
> -- Jean-Baptiste Queru