From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Theodore Tso <tytso@mit.edu>,
	Greg KH <gregkh@linuxfoundation.org>,
	Krzysztof Kozlowski <krzk@kernel.org>,
	debarbos@redhat.com, Arnaldo Carvalho de Melo <acme@kernel.org>,
	Konstantin Ryabitsev <mricon@kernel.org>,
	Guenter Roeck <linux@roeck-us.net>,
	sashiko-bot@kernel.org, sashiko-reviews@lists.linux.dev,
	sashiko@lists.linux.dev,
	Linux Kernel Workflows <workflows@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	devicetree@vger.kernel.org, kfree@google.com
Subject: Re: Stop false review statements
Date: Mon, 18 May 2026 00:05:45 +0200
Message-ID: <20260518000545.778932fe@foz.lan>
In-Reply-To: <4ECB5626-01E2-4AEB-AD11-524AB224CAA0@linux.dev>

On Sun, 17 May 2026 12:42:12 -0700
Roman Gushchin <roman.gushchin@linux.dev> wrote:

> 
> > On May 17, 2026, at 11:57 AM, Theodore Tso <tytso@mit.edu> wrote:
> > On Sun, May 17, 2026 at 11:17:06AM -0700, Roman Gushchin wrote:  
> >> 
> >> I actually tried to run it with ollama on my
> >> personal Framework 13. Adding nominal support is trivial, but the
> >> whole thing is not really useful: I can get maybe a few hundred
> >> tokens per second using a quantized model with reduced quality; an
> >> average sashiko review consumes 3.5 million tokens (with Gemini
> >> 3.1 Pro; it’s also model-dependent).
> > 
> > I'm curious.  What hardware and LLM model were you using?  A few
> > hundred tokens per second seems surprisingly high.  My initial
> > research[1] shows that an M5 Max MacBook Pro costing 5 or 6 kilobucks
> > can do 31.6 tokens/second on a 27B 4-bit quantized model (Qwen 3.5).
> 
> I have a Framework 13 with an AMD 7840U. I’ve tried several models, both on CPU and GPU.
> Sorry, it was a couple of months ago and I don’t remember all the details, so I won’t
> claim any specific numbers, but as I remember, the best numbers were around
> a hundred tokens per second. In any case, it’s a few orders of magnitude slower than
> what is realistically required.
> 
> If someone has powerful hardware and is willing to benchmark sashiko with open-source
> models, I’m very interested in the results.

If you publish the patch you used with ollama somewhere, I can try
running it here and do some benchmarks, assuming that it won't
try to process 3.5 million tokens.


> 
> > [1] https://www.reddit.com/r/LocalLLaMA/comments/1rzkw4x/m5_max_128g_performance_tests_i_just_got_my_new/
> > 
> > The model matters of course.  With Gemma 3 27B and a 6-bit
> > quantization, it's 21 tokens/s, and with Deepseek R1 8B Q6_K, it's
> > 72.8 tokens/second.  But unless you're using a really low-end model,
> > or a really expensive, splufty hardware platform, I haven't seen
> > reports of hundreds of tokens per second on hardware costing a
> > reasonable amount of money.  (I'll set aside the question of whether
> > spending $6k for a fully spec'ed out M5 Max MacBook Pro, or $15k for a
> > fully spec'ed out M3 Ultra Mac Studio is "reasonable".)
> > 
> > As a result I'm not entirely sure how realistic it is to do reviews
> > using "free" (you still have to pay $$$ for the hardware) local,
> > open-weight LLMs if an average review requires around 3.5 million
> > tokens.
> 
> Fully agree. But it might change in a few years; things are moving quickly.
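
For a sense of scale, here is a back-of-the-envelope sketch of how long a
single review would take at the throughput figures quoted above. It
assumes, pessimistically, that all 3.5 million tokens are processed at the
quoted tokens/second rate; in practice many of them are likely input
(prompt) tokens, which local runtimes usually process faster than they
generate output. The function name and the labels are illustrative only.

```python
# Rough wall-clock estimate for one review on local hardware.
# Assumption: every token is processed at the quoted tokens/second rate.

TOKENS_PER_REVIEW = 3_500_000  # average sashiko review, as quoted in the thread

# Throughput figures (tokens/second) mentioned in the thread.
RATES = {
    "Qwen 3.5 27B, 4-bit (M5 Max)": 31.6,
    "Gemma 3 27B, 6-bit": 21.0,
    "Deepseek R1 8B Q6_K": 72.8,
    "best recalled on the 7840U": 100.0,
}


def review_hours(tokens_per_second, tokens=TOKENS_PER_REVIEW):
    """Return the hours needed to process `tokens` at the given rate."""
    return tokens / tokens_per_second / 3600.0


for label, rate in RATES.items():
    print(f"{label}: {review_hours(rate):.1f} hours per review")
```

Even at a hundred tokens per second this comes out to several hours per
review, and to more than a day at the 31.6 tokens/second figure.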


Thanks,
Mauro
