From: Stephen Hemminger
Subject: Re: DPDK & QPI performance issue in Romley platform.
Date: Mon, 2 Sep 2013 09:10:12 -0700
Message-ID: <20130902091012.2e68b88e@nehalam.linuxnetplumber.net>
References: <52240466.7050907@cas-well.com>
To: Zachary
Cc: dev-VfR2kkLFssw@public.gmane.org, "Yannic.Chou (周哲正) : 6808", "Alan Yu (俞亦偉) : 6632"
List-Id: patches and discussions about DPDK

On Mon, 2 Sep 2013 11:22:14 +0800
Zachary wrote:

> Hi~
>
> I have a question about DPDK & QPI performance issue in Romley platform.
> Recently, I use DPDK example, l2fwd, to test DPDK's performance in my Romley platform.
> When I try to do the test, crossing used CPU, I find the performance dramatically decrease.
> Is it true? Or any method can prove the phenomenon?
>
> In my opinion, there should be no this kind of issue here due to QPI have enough bandwidth to deal the kinds of case.
> Thus, I am so amaze in our results and can not explain it.
> Could someone can help me to solve this problem.
>
> Thank a lot!

Many DPDK APIs take a NUMA socket as one of their parameters. To get good
performance, it is up to the application to be NUMA aware and to use
socket-local resources. One thing we do is keep a packet mbuf pool per
socket and assign each device to the pool on its own socket.

Also, you may want to choose which lcores to assign to which function based
on socket locality. For example, threads that poll a NIC's receive queues
should run on the same socket as that NIC.

Remember that the example applications are demo toys and don't do all the
things a real application would need to do.
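
To make the placement idea above concrete, here is a minimal sketch in plain C with no DPDK dependency. Everything in it (struct port, struct lcore, struct assignment, pick_lcore, plan_numa) is a hypothetical name made up for illustration, not part of DPDK. In a real application the socket ids would presumably come from calls such as rte_eth_dev_socket_id() and rte_lcore_to_socket_id(), and pool_socket would be the socket argument used when creating that port's mbuf pool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the topology; not DPDK types. */
struct port  { int id; int socket; };            /* NIC port and its NUMA socket */
struct lcore { int id; int socket; int busy; };  /* worker core and its socket   */

struct assignment {
    int port_id;
    int lcore_id;      /* lcore chosen to poll this port                  */
    int pool_socket;   /* socket whose mbuf pool this port should use     */
    int cross_socket;  /* 1 if the polling lcore is on a different socket */
};

/* Return the index of a free lcore on `socket`, falling back to the first
 * free lcore on any socket, or -1 if no lcore is free. */
int pick_lcore(const struct lcore *lcores, size_t n, int socket)
{
    int fallback = -1;
    for (size_t i = 0; i < n; i++) {
        if (lcores[i].busy)
            continue;
        if (lcores[i].socket == socket)
            return (int)i;
        if (fallback < 0)
            fallback = (int)i;
    }
    return fallback;
}

/* Give each port a polling lcore, preferring one on the port's own socket,
 * and always point the port at the mbuf pool of its own socket.
 * Returns the number of cross-socket assignments (0 is the goal),
 * or -1 if there are not enough free lcores. */
int plan_numa(const struct port *ports, size_t n_ports,
              struct lcore *lcores, size_t n_lcores,
              struct assignment *out)
{
    assert(ports != NULL && lcores != NULL && out != NULL);
    int remote = 0;
    for (size_t p = 0; p < n_ports; p++) {
        int idx = pick_lcore(lcores, n_lcores, ports[p].socket);
        if (idx < 0)
            return -1;
        lcores[idx].busy = 1;
        out[p].port_id      = ports[p].id;
        out[p].lcore_id     = lcores[idx].id;
        out[p].pool_socket  = ports[p].socket;
        out[p].cross_socket = (lcores[idx].socket != ports[p].socket);
        remote += out[p].cross_socket;
    }
    return remote;
}
```

A nonzero return from plan_numa() flags the situation described in the original question: a port being polled from the other socket, so its traffic has to cross QPI. The assignment is greedy in port order, so an early fallback can take an lcore that a later port would have used locally; a real application would likely pin this mapping by hand or do a local-only pass first.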