From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stephen Hemminger
Subject: Re: Random numbers at line-rate
Date: Mon, 21 Jul 2014 13:43:50 -0700
Message-ID: <20140721134350.70ad9fba@haswell>
References: <20140721195415.GA25740@hmsreliant.think-freely.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: dev-VfR2kkLFssw@public.gmane.org
To: Neil Horman
In-Reply-To: <20140721195415.GA25740-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces-VfR2kkLFssw@public.gmane.org
Sender: "dev"

On Mon, 21 Jul 2014 15:54:15 -0400
Neil Horman wrote:

> On Mon, Jul 21, 2014 at 09:24:36PM +0200, Chris Pappas wrote:
> > Hi,
> >
> > I need to generate a random number per packet and I used the rte_fast_rand
> > function to do so. When I run the code for one port-core I get almost
> > line-rate performance. However, running simultaneously on multiple cores
> > degrades performance significantly. (In all cases I use minimum-sized
> > packets.)
> >
> > Shouldn't the implementation scale for multicore and not degrade
> > performance, or am I missing anything? Also, is there another recommendation
> > for generating randomness at line-rate? (The CPU does not support rdrand.)
> >
> > Best regards,
> > Chris
> >
>
> That's an odd random number generator. I think, without locking, it's likely on a
> multicore system to produce identical values on multiple cores operating in
> parallel (since multiple cores can read rte_red_rand_seed at the same time).
> That may well lead to multiple packets having the same nonce, which might cause
> odd behavior.
>
> If your CPU supports it, I'd suggest writing some inline assembly to use the
> rdrand instruction instead. I'm not sure about its performance relative to the
> current implementation, but IIRC the instruction is handled internal to the
> core, so it should scale with any number of CPUs.
>
> neil
>

Or just keep a per-core seed value (and use RTE_PER_LCORE).
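
A rough sketch of the per-core seed idea, not a tested patch. To build without DPDK headers it uses C11 _Thread_local; in a DPDK application the state would be declared with RTE_DEFINE_PER_LCORE(uint32_t, ...) and read through RTE_PER_LCORE(...) from rte_per_lcore.h. The names lcore_rand_state, lcore_fast_rand_seed and lcore_fast_rand are made up for illustration, and the LCG constants are assumed to match what rte_fast_rand uses.

```c
#include <stdint.h>

/*
 * Per-thread generator state, so no two cores ever write the same
 * variable. DPDK equivalent (assumption: one thread per lcore):
 *   RTE_DEFINE_PER_LCORE(uint32_t, lcore_rand_state);
 *   ... RTE_PER_LCORE(lcore_rand_state) ...
 */
static _Thread_local uint32_t lcore_rand_state;

/*
 * Hypothetical helper: call once on each lcore before its first draw.
 * Multiplying the lcore id by an odd constant maps distinct ids to
 * distinct starting states for the same base seed.
 */
static inline void
lcore_fast_rand_seed(uint32_t base_seed, unsigned int lcore_id)
{
	lcore_rand_state = base_seed + (lcore_id + 1u) * 0x9E3779B9u;
}

/*
 * Same style of 32-bit linear congruential step as the shared-seed
 * version, but it only touches this thread's state. Returns a 22-bit
 * value because the low 10 bits are shifted out.
 */
static inline uint32_t
lcore_fast_rand(void)
{
	lcore_rand_state = 214013u * lcore_rand_state + 2531011u;
	return lcore_rand_state >> 10;
}
```

Each lcore would call lcore_fast_rand_seed(seed, rte_lcore_id()) once at startup and lcore_fast_rand() per packet. This avoids the cache line bouncing between cores that a single global seed causes. Note that every seed of this LCG sits on the same 2^32-long cycle, so the per-core streams are offset copies of one sequence; that is fine for load spreading but not for anything security sensitive.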