Devicetree
From: "Theodore Tso" <tytso@mit.edu>
To: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
	Greg KH <gregkh@linuxfoundation.org>,
	Krzysztof Kozlowski <krzk@kernel.org>,
	debarbos@redhat.com, Arnaldo Carvalho de Melo <acme@kernel.org>,
	Konstantin Ryabitsev <mricon@kernel.org>,
	Guenter Roeck <linux@roeck-us.net>,
	sashiko-bot@kernel.org, sashiko-reviews@lists.linux.dev,
	sashiko@lists.linux.dev,
	Linux Kernel Workflows <workflows@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	devicetree@vger.kernel.org, kfree@google.com
Subject: Re: Stop false review statements
Date: Sun, 17 May 2026 14:57:01 -0400
Message-ID: <20260517185701.GB53471@macsyma-wired.lan>
In-Reply-To: <F2FBD939-179D-467B-9FA8-BAA44F6C7524@linux.dev>

On Sun, May 17, 2026 at 11:17:06AM -0700, Roman Gushchin wrote:
> 
> I actually tried to run it with ollama on my
> personal framework 13. Adding nominal support is trivial, but the
> whole thing is not really useful: I can get maybe few hundreds
> tokens per second using a quantified model with reduced quality; an
> average sashiko review is consuming 3.5 millions tokens (with Gemini
> 3.1 pro, it’s also model-dependent).

I'm curious.  What hardware and LLM model were you using?  A few
hundred tokens per second seems surprisingly high.  My initial
research[1] shows that an M5 Max MacBook Pro costing 5 or 6 kilobucks
can do 31.6 tokens/second on a 27B 4-bit quantized model (Qwen 3.5).

[1] https://www.reddit.com/r/LocalLLaMA/comments/1rzkw4x/m5_max_128g_performance_tests_i_just_got_my_new/

The model matters, of course.  With Gemma 3 27B and a 6-bit
quantization, it's 21 tokens/s, and with Deepseek R1 8B Q6_K, it's
72.8 tokens/second.  But unless you're using a really low-end model,
or a really expensive, splufty hardware platform, I haven't seen
reports of hundreds of tokens per second on hardware costing a
reasonable amount of money.  (I'll set aside the question of whether
spending $6k for a fully spec'ed out M5 Max MacBook Pro, or $15k for a
fully spec'ed out M3 Ultra Mac Studio is "reasonable".)

As a result, I'm not entirely sure how realistic it is to do reviews
using "free" (you still have to pay $$$ for the hardware) local,
open-weight LLMs if an average review requires around 3.5 million
tokens.
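
To put the numbers above in perspective, here is a rough
back-of-the-envelope sketch of the wall-clock time per review.  It
assumes all 3.5 million tokens are processed at the quoted tokens/second
rate, which ignores any difference between prompt-processing and
generation speed, so treat it as an order-of-magnitude estimate only.
The function and constant names are made up for illustration.

```python
# Back-of-the-envelope estimate: wall-clock time for one review at a
# constant token rate.  All figures are the ones quoted in this thread.
TOKENS_PER_REVIEW = 3_500_000  # average sashiko review, as quoted above

# tokens/second figures quoted above (labels are just for printing)
RATES = {
    "Qwen 3.5 27B 4-bit": 31.6,
    "Gemma 3 27B 6-bit": 21.0,
    "Deepseek R1 8B Q6_K": 72.8,
}


def review_hours(tokens: int, tokens_per_second: float) -> float:
    """Hours needed to process `tokens` at a constant rate (assumption)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second / 3600.0


if __name__ == "__main__":
    for name, rate in RATES.items():
        hours = review_hours(TOKENS_PER_REVIEW, rate)
        print(f"{name}: {hours:.1f} hours per review")
```

Under that assumption, the three configurations come out at roughly 31,
46, and 13 hours per review, respectively.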

Cheers,

						- Ted
