All of lore.kernel.org
 help / color / mirror / Atom feed
From: Simona Vetter <simona.vetter@ffwll.ch>
To: Thomas Zimmermann <tzimmermann@suse.de>
Cc: "Marek Olšák" <maraeo@gmail.com>,
	"Simona Vetter" <simona.vetter@ffwll.ch>,
	"Daniel Stone" <daniel@fooishbar.org>,
	"James Jones" <jajones@nvidia.com>,
	"Brian Starkey" <brian.starkey@arm.com>,
	"Michel Dänzer" <michel.daenzer@mailbox.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	"amd-gfx mailing list" <amd-gfx@lists.freedesktop.org>,
	"ML Mesa-dev" <mesa-dev@lists.freedesktop.org>,
	nd@arm.com,
	"Laurent Pinchart" <Laurent.pinchart@ideasonboard.com>
Subject: Re: [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment
Date: Mon, 20 Jan 2025 19:41:36 +0100	[thread overview]
Message-ID: <Z46Y4EME7T6yejWP@phenom.ffwll.local> (raw)
In-Reply-To: <0e9aee49-aa69-4fb6-bab8-4624143f5267@suse.de>

On Mon, Jan 20, 2025 at 08:58:20AM +0100, Thomas Zimmermann wrote:
> Hi
> 
> 
> Am 18.01.25 um 03:37 schrieb Marek Olšák:
> [...]
> > 
> > 3) Implementing DRM_FORMAT_MOD_LINEAR as having 256B pitch and offset
> > alignment. This is what we do today. Even if Intel and some AMD chips
> > can do 64B or 128B alignment, they overalign to 256B. With so many
> > AMD+NV laptops out there, NV is probably next, unless they already do
> > this in the closed source driver.

I don't think this works, or at least not any better than the current
linear modifier. There's way too many users of that thing out there that I
think you can realistically redefine it.

Adding new linear modifiers and then preferring those above the old LINEAR
(if there is one left after all the wittling down to a common set) is I
think the only option that really works to fix something.

> The dumb-buffer series currently being discussed on dri-devel also touches
> handling of scanline pitches. THe actual value varies with each driver. 
> Should dumb buffers use a default pitch alignment of 256 on al hardware?

If you go with new modifiers then there could be shared code that dtrt by
just looking at the modifier list.
-Sima

> Best regards
> Thomas
> 
> > 
> > Marek
> > 
> > On Fri, Jan 17, 2025 at 9:18 AM Simona Vetter <simona.vetter@ffwll.ch>
> > wrote:
> > 
> >     On Wed, Jan 15, 2025 at 12:20:07PM +0000, Daniel Stone wrote:
> >     > On Wed, 15 Jan 2025 at 04:05, Marek Olšák <maraeo@gmail.com> wrote:
> >     > > On Tue, Jan 14, 2025 at 12:58 PM Daniel Stone
> >     <daniel@fooishbar.org> wrote:
> >     > >> AMD hardware is the only hardware I know of which doesn't support
> >     > >> overaligning. Say (not hypothetically) we have a GPU and a
> >     display
> >     > >> controller which have a minimum pitch alignment of 32 bytes, no
> >     > >> minimum height alignment, minimum 32-byte offset alignment,
> >     minimum
> >     > >> pitch of 32 bytes, and minimum image size of 32 bytes.
> >     > >>
> >     > >> To be maximally compatible, we'd have to expose 28 (pitch
> >     align) * 32
> >     > >> (height align) * 28 (offset align) * 28 (min pitch) * 28 (min
> >     size) ==
> >     > >> 19668992 individual modifiers when queried, which is 150MB
> >     per format
> >     > >> just to store the list of modifiers.
> >     > >
> >     > > Maximum compatibility is not required nor expected.
> >     > >
> >     > > In your case, only 1 linear modifier would be added for that
> >     driver, which is: [5 / 0 / 5 / 5 / 5]
> >     > >
> >     > > Then if, and only if, compatibility with other devices is
> >     desired, the driver developer could look at drivers of those other
> >     devices and determine which other linear modifiers to add. Ideally
> >     it would be just 1, so there would be a total of 2.
> >     >
> >     > Mali (actually two DRM drivers and sort of three Mesa drivers)
> >     can be
> >     > paired with any one of 11 KMS drivers (really 12 given that one is a
> >     > very independent subdriver), and something like 20 different codecs
> >     > (at least 12 different vendors; I didn't bother counting the actual
> >     > subdrivers which are all quite different). The VeriSilicon Hantro G2
> >     > codec driver is shipped by five (that we know of) vendors who
> >     all have
> >     > their own KMS drivers. One of those is in the Rockchip RK3588, which
> >     > (don't ask me why) ships six different codec blocks, with three
> >     > different drivers, from two different vendors - that's before
> >     you even
> >     > get to things like the ISP and NPU which really need to be sharing
> >     > buffers properly without copies.
> >     >
> >     > So yeah, working widely without having to encode specific knowledge
> >     > everywhere isn't a nice-to-have, it's a hard baseline requirement.
> >     >
> >     > >> > DRM_FORMAT_MOD_LINEAR needs to go because it prevents apps
> >     from detecting whether 2 devices have 0 compatible memory layouts,
> >     which is a useful thing to know.
> >     > >>
> >     > >> I get the point, but again, we have the exact same problem
> >     today with
> >     > >> placement, i.e. some devices require buffers to be in or not
> >     be in
> >     > >> VRAM or GTT or sysram for some uses, and some devices require
> >     physical
> >     > >> contiguity. Solving that problem would require an additional
> >     4 bits,
> >     > >> which brings us to 2.3GB of modifiers per format with the current
> >     > >> scheme. Not super viable.
> >     > >
> >     > > Userspace doesn't determine placement. The kernel memory
> >     management can move buffers between heaps to accommodate sharing
> >     between devices as needed. This is a problem in which userspace
> >     has no say.
> >     >
> >     > It really does though!
> >     >
> >     > None of these devices use TTM with placement moves, and doing that
> >     > isn't a fix either. Embedded systems have so low memory
> >     bandwidth that
> >     > the difference between choosing the wrong placement and moving it
> >     > later vs. having the right placement to begin with is the difference
> >     > between 'this does not work' and 'great, I can ship this'. Which is
> >     > great if you're a consultancy trying to get paid, but tbh I'd rather
> >     > work on more interesting things.
> >     >
> >     > So yeah, userspace does very much choose the placement. On most
> >     > drivers, this is either by 'knowing' which device to allocate
> >     from, or
> >     > passing a flag to your allocation ioctl. For newer drivers though,
> >     > there's the dma-heap allocation mechanism which is now upstream and
> >     > the blessed path, for which userspace needs to explicitly know the
> >     > desired placement (and must, because fixing it up later is a
> >     > non-starter).
> >     >
> >     > Given that we need to keep LINEAR ~forever for ABI reasons, and
> >     > because there's no reasonably workable alternative, let's
> >     abandon the
> >     > idea of abandoning LINEAR, and try to work with out-of-band
> >     signalling
> >     > instead.
> >     >
> >     > One idea is to actually pursue the allocator idea and express this
> >     > properly through constraints. I'd be super in favour of this,
> >     > unsurprisingly, because it allows us to solve a whole pile of other
> >     > problems, rather than the extremely narrow AMD/Intel interop case.
> >     >
> >     > Another idea for the out-of-band signalling would be to add
> >     > information-only modifiers, like
> >     > DRM_FORMAT_MOD_LINEAR_PITCH_ALIGN_EQ(256), or
> >     > DRM_FORMAT_MOD_LINEAR_PITCH_ALIGN_GE(32). But then that doesn't
> >     really
> >     > work at all with how people actually use modifiers: as the doc
> >     > describes, userspace takes and intersects the declared modifier
> >     lists
> >     > and passes the result through. The intersection of LINEAR+EQ256 and
> >     > LINEAR+GE32 is LINEAR, so a userspace that follows the rules
> >     will just
> >     > drop the hints on the floor and pick whatever linear allocation it
> >     > feels like.
> > 
> >     Yeah I think latest when we also take into account logical image
> >     size (not
> >     just pitch) with stuff like it needs to be aligned to 2 pixels in both
> >     directions just using modifiers falls apart.
> > 
> >     And the problem with linear, unlike device modifiers is that we
> >     can't just
> >     throw up our hands and enumerate the handful of formats in actual
> >     use for
> >     interop. There's so many produces and consumers of linera buffers
> >     (Daniel's list above missed camera/image processors) that save
> >     assumption
> >     is that anything really can happen.
> > 
> >     > I think I've just talked myself into the position that passing
> >     > allocator constraints together with modifiers is the only way to
> >     > actually solve this problem, at least without creating the sort of
> >     > technical debt that meant we spent years fixing up implicit/explicit
> >     > modifier interactions when it really should've just been adding a
> >     > !)@*(#$ u64 next to the u32.
> > 
> >     Yeah probably.
> > 
> >     Otoh I know inertia, so I am tempted to go with the oddball
> >     LINEAR_VEDNOR_A_VENDOR_B_INTEROP thing and stretch the runway for
> >     a bit.
> >     And we just assign those as we go as a very special thing, and the
> >     drivers
> >     that support it would prefer it above just LINEAR if there's no other
> >     common format left.
> > 
> >     Also makes it really obvious what all userspace/kernel driver enabling
> >     would be needed to justify such a modifier.
> >     -Sima
> > 
> >     >
> >     > Cheers,
> >     > Daniel
> > 
> >     --     Simona Vetter
> >     Software Engineer, Intel Corporation
> >     http://blog.ffwll.ch
> > 
> 
> -- 
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstrasse 146, 90461 Nuernberg, Germany
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> HRB 36809 (AG Nuernberg)
> 

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

  reply	other threads:[~2025-01-20 18:42 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-15 20:53 [PATCH] drm/fourcc: add LINEAR modifiers with an exact pitch alignment Marek Olšák
2024-12-15 20:54 ` Marek Olšák
2024-12-15 23:22   ` Joshua Ashton
2024-12-15 23:57     ` Marek Olšák
2024-12-16  2:08       ` Joshua Ashton
2024-12-16  5:40         ` Marek Olšák
2024-12-16  9:28           ` Dmitry Baryshkov
2024-12-16 21:49             ` Marek Olšák
2024-12-16  9:27 ` Michel Dänzer
2024-12-16 10:46   ` Lucas Stach
2024-12-16 14:53     ` Simona Vetter
2024-12-16 21:58       ` Marek Olšák
2024-12-18 10:21         ` Simona Vetter
2024-12-16 21:54     ` Marek Olšák
2024-12-17  9:59       ` Michel Dänzer
2024-12-16 21:29   ` Marek Olšák
2024-12-17  9:14     ` Michel Dänzer
2024-12-17  9:14 ` Brian Starkey
2024-12-17 10:13   ` Michel Dänzer
2024-12-17 11:03     ` Brian Starkey
2024-12-18  9:44       ` Michel Dänzer
2024-12-18 10:24         ` Simona Vetter
2024-12-18 10:32           ` Brian Starkey
2024-12-19  2:53             ` Marek Olšák
2024-12-19  9:09               ` Daniel Stone
2024-12-19 10:32               ` Brian Starkey
2024-12-20  0:33                 ` Marek Olšák
2024-12-20 11:30                   ` Brian Starkey
2024-12-20 14:24                     ` Marek Olšák
2024-12-20 15:27                       ` Simona Vetter
2024-12-19  9:02             ` Daniel Stone
2024-12-19 16:09               ` Michel Dänzer
2024-12-20 15:24                 ` Simona Vetter
2024-12-25  7:34                   ` Marek Olšák
2024-12-19 18:03               ` Simona Vetter
2025-01-10 21:23                 ` James Jones
2025-01-14  9:38                   ` Marek Olšák
2025-01-14 17:55                     ` James Jones
2025-01-15  3:49                       ` Marek Olšák
2025-01-14 17:58                     ` Daniel Stone
2025-01-15  4:05                       ` Marek Olšák
2025-01-15 12:20                         ` Daniel Stone
2025-01-17 14:18                           ` Simona Vetter
2025-01-18  2:37                             ` Marek Olšák
2025-01-20  7:58                               ` Thomas Zimmermann
2025-01-20 18:41                                 ` Simona Vetter [this message]
2025-01-21 19:21                                   ` Marek Olšák
2025-01-22 10:47                                     ` Simona Vetter
2025-01-20 21:31                                 ` Laurent Pinchart
2025-01-21  9:02                                 ` Philipp Zabel
2025-01-14 18:33                     ` Faith Ekstrand
2025-01-15  4:27                       ` Marek Olšák
2025-01-15  8:37                       ` Simona Vetter
2025-01-20 22:00                   ` Laurent Pinchart
2025-01-21 22:40                     ` James Jones
2025-01-20 21:48     ` Laurent Pinchart
2025-01-14 13:46 ` Thomas Zimmermann
2025-01-14 13:50   ` Thomas Zimmermann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z46Y4EME7T6yejWP@phenom.ffwll.local \
    --to=simona.vetter@ffwll.ch \
    --cc=Laurent.pinchart@ideasonboard.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=brian.starkey@arm.com \
    --cc=daniel@fooishbar.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jajones@nvidia.com \
    --cc=maraeo@gmail.com \
    --cc=mesa-dev@lists.freedesktop.org \
    --cc=michel.daenzer@mailbox.org \
    --cc=nd@arm.com \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.