From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pl1-f169.google.com (mail-pl1-f169.google.com [209.85.214.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 28EC814831D for ; Tue, 14 Jan 2025 15:43:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.169 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736869441; cv=none; b=OmqVQMHVz5V93PI9QFAAuObbMLbXiOgD9bx2loGD5WKQoI3R7Z5/oD8/yGYKUC4efkkfm22+uJRDw2mkuUHEVEM6Fcge9BlvBJEyoNgK245kvbIl2NMJl/2wyg3z76BXWmkb9t2IuLas0mI1ZAuQ/1WSCGZAavpyncsYDXKMU5c= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736869441; c=relaxed/simple; bh=gb5scqlg5EOwNBXJECxOfOQmPlXNiyx3rEEvEVzDu8w=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=SniRlTV75mvAn0v8QUebnSNasWraWAkpRA+h32OrB0+28HJP0em9XCZFmvm5k3F2gKwXQrPgqcjrJIagUxRXBRkzRHum8cw1JVM+wFOlZdqpY/CV9ykI3Op0gRma0/wHHe7eO/WLNH2WS5NLHgpJg8m3dTaagDMoVEI+LsfXbK0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=YS3zuWmx; arc=none smtp.client-ip=209.85.214.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="YS3zuWmx" Received: by mail-pl1-f169.google.com with SMTP id d9443c01a7336-21634338cfdso90735595ad.2 for ; Tue, 14 Jan 2025 07:43:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1736869439; x=1737474239; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=UJ9b0XQeiCRp/yXZkAH3WnG1X4ZD5WGhB4dCgP6LS1Q=; b=YS3zuWmxCCFMSoQf7dp0J07ECKKWOaHDU02k/4mOVwU16fg1v7/qlpqzSZIkZr5bip zGJ5O5VplenQW328ydYwZSyUyf46pIuAuDAU5cYM+oPt5YiNN0YD1YLJMbREHhWK77eV w/2oEJmHYwmVpgNpc8ebBA8ULo3pO7q2seR8ZGsLqgeAFTTDaNyszCCFdBlLCmmVJNfR ksxgfDzXuhFSnXp3TIlP5N/qeAxlSQFc3lDnC7+k6kQr2akPbqx3gFGeojSd84vZ7xTK FDUpXdKYSXbUZZdMDnfyb6cWignmqQN55fnoyaxKyybELj3UVLXAlsVuHrKvCyHQOylE h3jA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736869439; x=1737474239; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=UJ9b0XQeiCRp/yXZkAH3WnG1X4ZD5WGhB4dCgP6LS1Q=; b=YCdz+CCmS7mBQZmbeoospExOkt8NZ4e4rSAwLCT+3iy67VfeE/YeTa6E5clZdiOv3C dcQN7tq8Ow89/oSlRn6RRYOUyhh8qhSXOcXoKSjxDDehtG1u0JX+D0RwJWGrYvrwCAHO A1Up1D3rXKZWnlEXG1A7oy9Cd6ztb3tJVoCnit82ayQbC3hI9DRQrEWfa8UGhm77yQ3u FaMuuxxQLoT0dX+NQx6YuwlGWjQRuTmFPFhCq3yS0xPp8oKBOwvdZdu6wX8mInc7BA2m 7vnURQQJFLJMc6XNFH74XxfC3MYhBMGKwq+38PYR6+VBc62L70CfPFSq5YbvAcNx+IcG aNyA== X-Forwarded-Encrypted: i=1; AJvYcCWnwdxSPbXhAzJrj/NTuRcdyJbUzlmac22U4dSvZ+XjatOuts9pSAnTQKTNTHBV1MtOT/dU2X/IPgHXyDw=@vger.kernel.org X-Gm-Message-State: AOJu0Yx2ipnv4vBE+N1XD/gj7vH5uQH75mQ57hp+QWRZu4AkqUeFW8i/ ijnjfSX9DM9/Z+pZAfTipRwgdQ808M21SD/kSG7fbPis/VyjflKT X-Gm-Gg: ASbGncsHzV73FtSrqrdN7IcPtdGnV7lQ5GkV5Vo2z6qgnnf6jB9BFl4iydZ5vAXvwDa BTOS1etmkslWO8IdFdoCsj2fdDz+tlNAwZ4wgKTiOwK+879z2O6G3SO59cxgaa3rmsSwAarAygd KrlpIx5/p+p7BUEyyUaaszT5P5mvu2ygVvoTFYdH2YDP26JrZzbx6Ncxc0tPWfFokoi7r2aLaE2 1uVF7twUqfAL7yTALDHTDfeSYKyrJOoiSyaKJckCo5v4Bttn5WXfN9V X-Google-Smtp-Source: AGHT+IGMz2kIAmi2/VjxbKd8rXPzm6AjqXQpzUxQ+GjpRVf6sTN+TzY0hCeAwVTyu/RPaRZkj+tGNg== X-Received: by 2002:a05:6a20:7350:b0:1e0:cc4a:caab with SMTP id adf61e73a8af0-1e88d12c1e2mr39609048637.19.1736869439118; Tue, 14 Jan 2025 07:43:59 -0800 (PST) Received: from localhost ([216.228.125.129]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-72d4059485bsm7446087b3a.83.2025.01.14.07.43.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 14 Jan 2025 07:43:58 -0800 (PST) Date: Tue, 14 Jan 2025 10:43:56 -0500 From: Yury Norov To: I Hsin Cheng Cc: mark.rutland@arm.com, linux@rasmusvillemoes.dk, jserv@ccns.ncku.edu.tw, linux-kernel@vger.kernel.org, visitorckw@gmail.com Subject: Re: [RFC PATCH] cpumask: Implement "random" version of cpumask_any_but() Message-ID: References: <20250113061839.22131-1-richard120310@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: > > > > @@ -401,12 +401,18 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta > > > > static __always_inline > > > > unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu) > > > > { > > > > - unsigned int i; > > > > + unsigned int i, n, weight; > > > > > > > > cpumask_check(cpu); > > > > - for_each_cpu(i, mask) > > > > - if (i != cpu) > > > > - break; > > > > + weight = cpumask_weight(mask); > > > > + n = get_random_u32() % weight; > > > > + > > > > + /* If we accidentally pick "n" equal to "cpu", > > > > + * then simply choose "n + 1"th cpu. > > > > + */ > > > > + if (n == cpu) > > > > + n = (n + 1) % weight; > > > > + i = cpumask_nth(n, mask); > > > > This is an entirely broken thing, and it works only because your CPU mask > > is dense. Imagine cpumask: 0111 1111. Your new cpumask_any_but(mask, 5) > > will return 5 exactly, if the get_random_u32() draws 4. > > > > Oh I'm sorry for this part, "cpumask_nth()" return value should be > checked instead of checking "get_random_u32()". May I ask why does it > only work for dense cpumask? Because for dense mask cpumask_nth(n) == n, and your n = get_random_u32() % weight is the same as n = cpumask_nth(get_random_u32() % weight) > I mean clearly original method will be > faster when cpumask isn't dense. > > > It looks broken even for a dense mask. By probability, your code returns: > > > > P(0-4,7) == 1/8, > > P(5) == 0, > > P(6) == 1/4. > > > > Assuming you are trying to implement a random uniform distribution drawing, > > the correct probabilities should look like: > > > > P(0-4,6-7) == 1/7, > > P(5) == 0, > > > > Thanks, > > Yury > > > > I did noticed this part, I was thinking that we do not actually need to > make the prbability to be uniform distribution, just want to make it > scatters more then picking certain number. On cpumask level you can't speculate whether users need true randomness or not. That's why we pay so much attention to proper function naming. See: - cpumask_any() - 'any' requirement is much simpler than random; - cpumask_any_distribute() - not random at all, just a better (good enough) approximation; - cpumask_random() - a true random drawing. Must pass Chi^2, Kolmogorov-Smirnov and other fancy statistical tests. As you can see, doesn't exist. If in your function you use expensive true-random get_random_u32(), one can expect that the output will not be worse than the original uniform distribution, by the statistical properties means. > Thanks for your detailed review and explanation, also thanks to Mark and > Kuan-Wei ! > > I now realize "random" here is more of a convention for the caller to > states that it doesn't matter which cpu it gets, maybe we should > rephrase the comment to make it less confusing? Because I think "random" > itself does stands for a particular meaning. Feel free to send a patch. Thanks, Yury