From: Roman Gushchin
To: "Lorenzo Stoakes (Oracle)"
Cc: Andrew Morton, David Hildenbrand, Zi Yan, Baolin Wang,
 "Liam R . Howlett", Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
 Lance Yang, Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan,
 Michal Hocko, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/9] mm/huge_memory: refactor zap_huge_pmd()
In-Reply-To: <5cd57c69-7193-422f-b6b5-75bb5234e5f3@lucifer.local>
 (Lorenzo Stoakes's message of "Mon, 23 Mar 2026 11:31:29 +0000")
References: <20260319200917.ce345a369d035050b6329ac5@linux-foundation.org>
 <87tsu9kgv3.fsf@linux.dev>
 <20260320203311.715ed75bcd84c18d24894324@linux-foundation.org>
 <20260321171530.8b3e8207f89d5a7384b9f01f@linux-foundation.org>
 <5cd57c69-7193-422f-b6b5-75bb5234e5f3@lucifer.local>
Date: Mon, 23 Mar 2026 18:08:27 -0700
Message-ID: <87jyv2hw50.fsf@linux.dev>

"Lorenzo Stoakes (Oracle)" writes:

> On Sat, Mar 21, 2026 at 05:15:30PM -0700, Andrew Morton wrote:
>> On Fri, 20 Mar 2026 20:33:11 -0700 Andrew Morton wrote:
>>
>> > A lot of patchsets are "failed to apply". What is Sashiko trying to
>> > apply MM patches to? It would take some smarts to apply the v2
>> > patchset when v1 is presently in mm.git?
>>
>> ?
>>
>> The way things are going at present, I'm just not going to apply a
>
> 50% noise vs.
> signal?... maybe wait until we're in the 9x'%s?
>
>> series which Sashiko "failed to apply". And that's cool, I'll just
>> wait for a version which Sashiko was able to apply. And then not
>> apply unless all Sashiko questions are resolved or convincingly refuted.
>
> Andrew, for crying out loud. Please don't do this.
>
> 2 of the 3 series I respan on Friday, working a 13 hour day to do so, don't
> apply to Sashiko, but do apply to the mm tree.

I'll look into that.

> I haven't the _faintest clue_ how we are supposed to factor a 3rd party
> experimental website applying or not applying series into our work??
>
> And 'not apply unless all Sashiko questions are resolved or convincingly
> refuted.' is seriously concerning.
>
> The workload is already insane, now you're expecting us to answer every bit
> of nonsense Sashiko hallucinates or misunderstands also?
>
> I say that with no disrespect to Roman or his efforts, but as discussed at
> length, it is not ready for prime time yet.
>
> It's clear that Sashiko is not correctly handling applies, and produces a
> lot of noise. Predicating taking series on this is absurd.

Without pretending that Sashiko is perfect in any way, I think a good
mental exercise is to write down our expectations of how a "perfect"
system would work. The more I work on it, the more I realize it's far
from a binary correct/incorrect question. In fact, the same applies to
humans: I'm sure every one of us has at some point felt that someone is
too picky and is just annoying us by finding small nits. At the same
time, some of these people are extremely useful to the community: they
find and fix a lot of issues. In the end, we argue all the time about
questions and issues raised by human reviewers.

Do we prefer a system which finds more real bugs at the cost of being
noisier, or a system which misses more bugs, but when it points at
one, it's certainly real?
I'm sure you're tempted to prefer the latter, but imagine a
hypothetical system which finds _all_ bugs, but has some false positive
rate, e.g. 20%. I think it's pretty attractive.

Also, a lot of the raised issues are real, but subjectively not worth
our time. But this is extremely subjective! It depends on one's
personal level of perfectionism, the amount of time available, the
prior state of the code, further plans, etc. For example, syzkaller
usually has O(100s) of open bugs, which are 100% real, but are not
always high-priority work.

I think that asking to address 100% of the issues raised by any LLM is
not reasonable (especially because its output might be different each
time you run it with the same input), but I also think it's reasonable
to address critical and high severity concerns. And I'm happy to tweak
Sashiko to be more conservative here, but I think it should be based on
specific examples or data, not purely subjective impressions.

tl;dr I increasingly realize the importance of the social context for
providing good reviews, and it can't be easily derived from the code.
What is acceptable in one subsystem is considered bad practice in
another. I guess the only way to get a system we all find acceptable
(and we still might not like it; who likes being pointed at their
bugs?) is to collectively codify our expectations in prompts on a
per-subsystem basis.

Thanks!