* [PATCH v3] drm/plane: Add documentation about software color conversion.
@ 2023-08-25 14:04 Jocelyn Falempe
2023-08-25 14:14 ` Maxime Ripard
` (3 more replies)
0 siblings, 4 replies; 26+ messages in thread
From: Jocelyn Falempe @ 2023-08-25 14:04 UTC (permalink / raw)
To: tzimmermann, airlied, maarten.lankhorst, mripard, daniel,
ppaalanen, javierm, contact
Cc: Jocelyn Falempe, dri-devel
After discussions on IRC, the consensus is that the DRM drivers should
avoid software color conversion, and only advertise the formats supported
by hardware.
Update the doc accordingly so that the rule and exceptions are clear for
everyone.
Acked-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
---
drivers/gpu/drm/drm_plane.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index 24e7998d1731..d05642033202 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -140,6 +140,30 @@
* DRM_FORMAT_MOD_LINEAR. Before linux kernel release v5.1 there have been
* various bugs in this area with inconsistencies between the capability
* flag and per-plane properties.
+ *
+ * All drivers should support XRGB8888, even if the hardware cannot support
+ * it. This has become the de-facto standard and a lot of user-space assume
+ * it will be present. If XRGB8888 is not natively supported, then it
+ * shouldn't be the default for preferred depth or fbdev emulation.
+ *
+ * DRM drivers should not do software color conversion, and
+ * only advertise the formats they support in hardware. This is for
+ * performance reason, and to avoid multiple conversions in userspace and
+ * kernel space. KMS page flips are generally expected to be very cheap
+ * operations.
+ *
+ * But there are two exceptions only for dumb buffers:
+ * * To support XRGB8888 if it's not supported by the hardware.
+ * * Any driver is free to modify its internal representation of the format,
+ * as long as it doesn't alter the visible content in any way, and doesn't
+ * modify the user-provided buffer. An example would be to drop the
+ * padding component from a format to save some memory bandwidth.
+ * On most hardware, VRAM read access are slow, so when doing the software
+ * conversion, the dumb buffer should be allocated in system RAM in order to
+ * have decent performance.
+ * Extra care should be taken when doing software conversion with
+ * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
+ * https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
*/
static unsigned int drm_num_planes(struct drm_device *dev)
base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781
--
2.41.0
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-08-25 14:04 [PATCH v3] drm/plane: Add documentation about software color conversion Jocelyn Falempe
@ 2023-08-25 14:14 ` Maxime Ripard
2023-08-25 14:25 ` Javier Martinez Canillas
` (2 subsequent siblings)
3 siblings, 0 replies; 26+ messages in thread
From: Maxime Ripard @ 2023-08-25 14:14 UTC (permalink / raw)
To: Jocelyn Falempe
Cc: tzimmermann, javierm, dri-devel, ppaalanen, mripard, airlied
On Fri, 25 Aug 2023 16:04:18 +0200, Jocelyn Falempe wrote:
> After discussions on IRC, the consensus is that the DRM drivers should
> avoid software color conversion, and only advertise the formats supported
> by hardware.
> Update the doc accordingly so that the rule and exceptions are clear for
> everyone.
>
> [ ... ]
Acked-by: Maxime Ripard <mripard@kernel.org>
Thanks!
Maxime
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-08-25 14:04 [PATCH v3] drm/plane: Add documentation about software color conversion Jocelyn Falempe
2023-08-25 14:14 ` Maxime Ripard
@ 2023-08-25 14:25 ` Javier Martinez Canillas
2023-08-28 7:35 ` Pekka Paalanen
2023-09-08 9:21 ` Thomas Zimmermann
3 siblings, 0 replies; 26+ messages in thread
From: Javier Martinez Canillas @ 2023-08-25 14:25 UTC (permalink / raw)
To: Jocelyn Falempe, tzimmermann, airlied, maarten.lankhorst, mripard,
daniel, ppaalanen, contact
Cc: Jocelyn Falempe, dri-devel
Jocelyn Falempe <jfalempe@redhat.com> writes:
Hello Jocelyn,
> After discussions on IRC, the consensus is that the DRM drivers should
> avoid software color conversion, and only advertise the formats supported
> by hardware.
> Update the doc accordingly so that the rule and exceptions are clear for
> everyone.
>
> Acked-by: Simon Ser <contact@emersion.fr>
> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
> ---
Thanks a lot for writing this!
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
--
Best regards,
Javier Martinez Canillas
Core Platforms
Red Hat
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-08-25 14:04 [PATCH v3] drm/plane: Add documentation about software color conversion Jocelyn Falempe
2023-08-25 14:14 ` Maxime Ripard
2023-08-25 14:25 ` Javier Martinez Canillas
@ 2023-08-28 7:35 ` Pekka Paalanen
2023-09-08 9:21 ` Thomas Zimmermann
3 siblings, 0 replies; 26+ messages in thread
From: Pekka Paalanen @ 2023-08-28 7:35 UTC (permalink / raw)
To: Jocelyn Falempe; +Cc: tzimmermann, javierm, mripard, dri-devel, airlied
[-- Attachment #1: Type: text/plain, Size: 2629 bytes --]
On Fri, 25 Aug 2023 16:04:18 +0200
Jocelyn Falempe <jfalempe@redhat.com> wrote:
> After discussions on IRC, the consensus is that the DRM drivers should
> avoid software color conversion, and only advertise the formats supported
> by hardware.
> Update the doc accordingly so that the rule and exceptions are clear for
> everyone.
>
> Acked-by: Simon Ser <contact@emersion.fr>
> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
> ---
> drivers/gpu/drm/drm_plane.c | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
> index 24e7998d1731..d05642033202 100644
> --- a/drivers/gpu/drm/drm_plane.c
> +++ b/drivers/gpu/drm/drm_plane.c
> @@ -140,6 +140,30 @@
> * DRM_FORMAT_MOD_LINEAR. Before linux kernel release v5.1 there have been
> * various bugs in this area with inconsistencies between the capability
> * flag and per-plane properties.
> + *
> + * All drivers should support XRGB8888, even if the hardware cannot support
> + * it. This has become the de-facto standard and a lot of user-space assume
> + * it will be present. If XRGB8888 is not natively supported, then it
> + * shouldn't be the default for preferred depth or fbdev emulation.
> + *
> + * DRM drivers should not do software color conversion, and
> + * only advertise the formats they support in hardware. This is for
> + * performance reason, and to avoid multiple conversions in userspace and
> + * kernel space. KMS page flips are generally expected to be very cheap
> + * operations.
> + *
> + * But there are two exceptions only for dumb buffers:
> + * * To support XRGB8888 if it's not supported by the hardware.
> + * * Any driver is free to modify its internal representation of the format,
> + * as long as it doesn't alter the visible content in any way, and doesn't
> + * modify the user-provided buffer. An example would be to drop the
> + * padding component from a format to save some memory bandwidth.
> + * On most hardware, VRAM read access are slow, so when doing the software
> + * conversion, the dumb buffer should be allocated in system RAM in order to
> + * have decent performance.
> + * Extra care should be taken when doing software conversion with
> + * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
> + * https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
> */
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Thanks,
pq
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-08-25 14:04 [PATCH v3] drm/plane: Add documentation about software color conversion Jocelyn Falempe
` (2 preceding siblings ...)
2023-08-28 7:35 ` Pekka Paalanen
@ 2023-09-08 9:21 ` Thomas Zimmermann
2023-09-08 10:58 ` Maxime Ripard
2023-09-08 11:16 ` Pekka Paalanen
3 siblings, 2 replies; 26+ messages in thread
From: Thomas Zimmermann @ 2023-09-08 9:21 UTC (permalink / raw)
To: Jocelyn Falempe, airlied, maarten.lankhorst, mripard, daniel,
ppaalanen, javierm, contact
Cc: dri-devel
[-- Attachment #1.1: Type: text/plain, Size: 1807 bytes --]
Hi
Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
[...]
> + *
> + * But there are two exceptions only for dumb buffers:
> + * * To support XRGB8888 if it's not supported by the hardware.
> + * * Any driver is free to modify its internal representation of the format,
> + * as long as it doesn't alter the visible content in any way, and doesn't
> + * modify the user-provided buffer. An example would be to drop the
> + * padding component from a format to save some memory bandwidth.
I have strong objections to this point, _especially_ as you're
apparently trying to sneak this in after our discussion. NAK on this
part from my side.
If you want userspace to be able to use a certain format, then export
the corresponding 4cc code. Then let userspace decide what to do about
it. If userspace pick a certain format, go with it.
Hence, no implicit conversion from XRGB888 to RGB888, just because it's
possible.
Best regards
Thomas
> + * On most hardware, VRAM read access are slow, so when doing the software
> + * conversion, the dumb buffer should be allocated in system RAM in order to
> + * have decent performance.
> + * Extra care should be taken when doing software conversion with
> + * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
> + * https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
> */
>
> static unsigned int drm_num_planes(struct drm_device *dev)
>
> base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 9:21 ` Thomas Zimmermann
@ 2023-09-08 10:58 ` Maxime Ripard
2023-09-08 13:22 ` Thomas Zimmermann
2023-09-08 11:16 ` Pekka Paalanen
1 sibling, 1 reply; 26+ messages in thread
From: Maxime Ripard @ 2023-09-08 10:58 UTC (permalink / raw)
To: Thomas Zimmermann; +Cc: Jocelyn Falempe, javierm, dri-devel, ppaalanen, airlied
[-- Attachment #1: Type: text/plain, Size: 1784 bytes --]
Hi,
On Fri, Sep 08, 2023 at 11:21:51AM +0200, Thomas Zimmermann wrote:
> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
> [...]
> > + *
> > + * But there are two exceptions only for dumb buffers:
> > + * * To support XRGB8888 if it's not supported by the hardware.
>
>
> > + * * Any driver is free to modify its internal representation of the format,
> > + * as long as it doesn't alter the visible content in any way, and doesn't
> > + * modify the user-provided buffer. An example would be to drop the
> > + * padding component from a format to save some memory bandwidth.
>
> I have strong objections to this point, _especially_ as you're apparently
> trying to sneak this in after our discussion.
I think it's an unfair characterization. This was discussed on
#dri-devel, and went through several rounds over the mailing lists, with
you in Cc for each. How is that sneaking something in?
> NAK on this part from my side.
>
> If you want userspace to be able to use a certain format, then export the
> corresponding 4cc code. Then let userspace decide what to do about it. If
> userspace pick a certain format, go with it.
>
> Hence, no implicit conversion from XRGB888 to RGB888, just because it's
> possible.
I'm not sure what's your argument is, really. Last time we discussed
this, you were saying that you were enforcing that rule because that was
the outcome of that (painful) discussion with Pekka and Javier. It turns
out that both Pekka and Javier are ok with the patch, so it's not clear
to me what your objections are at this point.
Are you really arguing that we shouldn't allow a driver to consume less
resources while not affecting any other component in the system in any
way?
Maxime
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 9:21 ` Thomas Zimmermann
2023-09-08 10:58 ` Maxime Ripard
@ 2023-09-08 11:16 ` Pekka Paalanen
2023-09-08 13:56 ` Thomas Zimmermann
1 sibling, 1 reply; 26+ messages in thread
From: Pekka Paalanen @ 2023-09-08 11:16 UTC (permalink / raw)
To: Thomas Zimmermann; +Cc: Jocelyn Falempe, javierm, mripard, dri-devel, airlied
[-- Attachment #1: Type: text/plain, Size: 2574 bytes --]
On Fri, 8 Sep 2023 11:21:51 +0200
Thomas Zimmermann <tzimmermann@suse.de> wrote:
> Hi
>
> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
> [...]
> > + *
> > + * But there are two exceptions only for dumb buffers:
> > + * * To support XRGB8888 if it's not supported by the hardware.
>
>
> > + * * Any driver is free to modify its internal representation of the format,
> > + * as long as it doesn't alter the visible content in any way, and doesn't
> > + * modify the user-provided buffer. An example would be to drop the
> > + * padding component from a format to save some memory bandwidth.
>
> I have strong objections to this point, _especially_ as you're
> apparently trying to sneak this in after our discussion. NAK on this
> part from my side.
>
> If you want userspace to be able to use a certain format, then export
> the corresponding 4cc code. Then let userspace decide what to do about
> it. If userspace pick a certain format, go with it.
What is the reason for your objection, exactly?
> Hence, no implicit conversion from XRGB888 to RGB888, just because it's
> possible.
For the particular driver in question though, the conversion allows
using a display resolution that is otherwise not possible. I also hear
it improves performance since 25% less data needs to travel across a
slow bus. There is also so little VRAM, than all dumb buffers need to
be allocated from sysram instead anyway, so a copy is always necessary.
Since XRGB8888 is the one format that is recommended to be supported by
all drivers, I don't see a problem here. Did you test on your
incredibly slow g200 test rig if the conversion ends up hurting instead
of helping performance there?
If it hurts, then I see that you have a good reason to NAK this.
It's hard to imagine how it would hurt, since you always need a copy
from sysram dumb buffers to VRAM - or do you?
Thanks,
pq
> > + * On most hardware, VRAM read access are slow, so when doing the software
> > + * conversion, the dumb buffer should be allocated in system RAM in order to
> > + * have decent performance.
> > + * Extra care should be taken when doing software conversion with
> > + * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
> > + * https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
> > */
> >
> > static unsigned int drm_num_planes(struct drm_device *dev)
> >
> > base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 10:58 ` Maxime Ripard
@ 2023-09-08 13:22 ` Thomas Zimmermann
2023-09-08 13:27 ` Simon Ser
2023-09-08 13:46 ` Javier Martinez Canillas
0 siblings, 2 replies; 26+ messages in thread
From: Thomas Zimmermann @ 2023-09-08 13:22 UTC (permalink / raw)
To: Maxime Ripard; +Cc: Jocelyn Falempe, javierm, dri-devel, ppaalanen, airlied
[-- Attachment #1.1: Type: text/plain, Size: 3311 bytes --]
Hi Maxime
Am 08.09.23 um 12:58 schrieb Maxime Ripard:
> Hi,
>
> On Fri, Sep 08, 2023 at 11:21:51AM +0200, Thomas Zimmermann wrote:
>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
>> [...]
>>> + *
>>> + * But there are two exceptions only for dumb buffers:
>>> + * * To support XRGB8888 if it's not supported by the hardware.
>>
>>
>>> + * * Any driver is free to modify its internal representation of the format,
>>> + * as long as it doesn't alter the visible content in any way, and doesn't
>>> + * modify the user-provided buffer. An example would be to drop the
>>> + * padding component from a format to save some memory bandwidth.
>>
>> I have strong objections to this point, _especially_ as you're apparently
>> trying to sneak this in after our discussion.
>
> I think it's an unfair characterization. This was discussed on
> #dri-devel, and went through several rounds over the mailing lists, with
> you in Cc for each. How is that sneaking something in?
A few months ago, we had a flamewar'ish IRC discussion on format
conversion within the kernel. The general sentiment was that the kernel
drivers should use what ever is provided by userspace without further
processing. The short argument was 'userspace knows better'. The only
exception is for supporting XRGB8888 on hardware that would otherwise
not support it. After some consideration, I agree with all that. (Back
then I didn't.)
A few weeks ago I received a patch to do an implicit conversion from
XRGB8888 to RGB888 within mgag200. [1] I don't have a link to the
discussion, but I NAK'ed that patch pretty hard on IRC by following that
other discussion.
And know I find that this patch (even in its v1) contains language that
retroactively legitimizes the mgag200 patch. I wrote 'apparently' I my
reply, as I assume that there's more to it, but how does it not look
like an attempt to sneak in something that is known to be controversial?
It might have been better to discuss the question separately on the
dri-devel ML. Maybe we can do this here.
Best regards
Thomas
[1] https://patchwork.freedesktop.org/patch/531879/?series=116381&rev=1
>
>> NAK on this part from my side.
>>
>> If you want userspace to be able to use a certain format, then export the
>> corresponding 4cc code. Then let userspace decide what to do about it. If
>> userspace pick a certain format, go with it.
>>
>> Hence, no implicit conversion from XRGB888 to RGB888, just because it's
>> possible.
>
> I'm not sure what's your argument is, really. Last time we discussed
> this, you were saying that you were enforcing that rule because that was
> the outcome of that (painful) discussion with Pekka and Javier. It turns
> out that both Pekka and Javier are ok with the patch, so it's not clear
> to me what your objections are at this point.
>
> Are you really arguing that we shouldn't allow a driver to consume less
> resources while not affecting any other component in the system in any
> way?
>
> Maxime
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 13:22 ` Thomas Zimmermann
@ 2023-09-08 13:27 ` Simon Ser
2023-09-08 13:46 ` Javier Martinez Canillas
1 sibling, 0 replies; 26+ messages in thread
From: Simon Ser @ 2023-09-08 13:27 UTC (permalink / raw)
To: Thomas Zimmermann
Cc: Jocelyn Falempe, javierm, Maxime Ripard, ppaalanen, dri-devel,
airlied
On Friday, September 8th, 2023 at 22:22, Thomas Zimmermann <tzimmermann@suse.de> wrote:
> Am 08.09.23 um 12:58 schrieb Maxime Ripard:
>
> > Hi,
> >
> > On Fri, Sep 08, 2023 at 11:21:51AM +0200, Thomas Zimmermann wrote:
> >
> > > Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
> > > [...]
> > >
> > > > + *
> > > > + * But there are two exceptions only for dumb buffers:
> > > > + * * To support XRGB8888 if it's not supported by the hardware.
> > >
> > > > + * * Any driver is free to modify its internal representation of the format,
> > > > + * as long as it doesn't alter the visible content in any way, and doesn't
> > > > + * modify the user-provided buffer. An example would be to drop the
> > > > + * padding component from a format to save some memory bandwidth.
> > >
> > > I have strong objections to this point, especially as you're apparently
> > > trying to sneak this in after our discussion.
> >
> > I think it's an unfair characterization. This was discussed on
> > #dri-devel, and went through several rounds over the mailing lists, with
> > you in Cc for each. How is that sneaking something in?
>
>
> A few months ago, we had a flamewar'ish IRC discussion on format
> conversion within the kernel. The general sentiment was that the kernel
> drivers should use what ever is provided by userspace without further
> processing. The short argument was 'userspace knows better'. The only
> exception is for supporting XRGB8888 on hardware that would otherwise
> not support it. After some consideration, I agree with all that. (Back
> then I didn't.)
(FWIW, as a userspace dev, the above makes perfect sense to me.)
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 13:22 ` Thomas Zimmermann
2023-09-08 13:27 ` Simon Ser
@ 2023-09-08 13:46 ` Javier Martinez Canillas
2023-09-08 14:06 ` Jocelyn Falempe
2023-09-08 14:09 ` Thomas Zimmermann
1 sibling, 2 replies; 26+ messages in thread
From: Javier Martinez Canillas @ 2023-09-08 13:46 UTC (permalink / raw)
To: Thomas Zimmermann, Maxime Ripard
Cc: Jocelyn Falempe, dri-devel, ppaalanen, airlied
Thomas Zimmermann <tzimmermann@suse.de> writes:
Hello Thomas,
> Hi Maxime
>
> Am 08.09.23 um 12:58 schrieb Maxime Ripard:
>> Hi,
>>
>> On Fri, Sep 08, 2023 at 11:21:51AM +0200, Thomas Zimmermann wrote:
>>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
>>> [...]
>>>> + *
>>>> + * But there are two exceptions only for dumb buffers:
>>>> + * * To support XRGB8888 if it's not supported by the hardware.
>>>
>>>
>>>> + * * Any driver is free to modify its internal representation of the format,
>>>> + * as long as it doesn't alter the visible content in any way, and doesn't
>>>> + * modify the user-provided buffer. An example would be to drop the
>>>> + * padding component from a format to save some memory bandwidth.
>>>
>>> I have strong objections to this point, _especially_ as you're apparently
>>> trying to sneak this in after our discussion.
>>
>> I think it's an unfair characterization. This was discussed on
>> #dri-devel, and went through several rounds over the mailing lists, with
>> you in Cc for each. How is that sneaking something in?
>
> A few months ago, we had a flamewar'ish IRC discussion on format
> conversion within the kernel. The general sentiment was that the kernel
> drivers should use what ever is provided by userspace without further
> processing. The short argument was 'userspace knows better'. The only
> exception is for supporting XRGB8888 on hardware that would otherwise
> not support it. After some consideration, I agree with all that. (Back
> then I didn't.)
>
> A few weeks ago I received a patch to do an implicit conversion from
> XRGB8888 to RGB888 within mgag200. [1] I don't have a link to the
> discussion, but I NAK'ed that patch pretty hard on IRC by following that
> other discussion.
>
> And know I find that this patch (even in its v1) contains language that
> retroactively legitimizes the mgag200 patch. I wrote 'apparently' I my
> reply, as I assume that there's more to it, but how does it not look
> like an attempt to sneak in something that is known to be controversial?
>
While is true that the motivation for Jocelyn's patch was to make explicit
what are the rules with regard to drivers emulating formats (other than
"we had a flamewar on IRC a while back" which is quite ambiguous), it was
not attempt to sneak something that is known to be controversial.
In fact, it is an attempt to dispel the controversy and document what is
acceptable and what is not for a driver.
> It might have been better to discuss the question separately on the
> dri-devel ML. Maybe we can do this here.
>
This was discussed in the #dri-devel IRC channel, I believe you were on
PTO at the time and probably that's why you missed. I found the logs here:
https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2023-08-04
As you can see there, most people agreed that what Jocelyn wrote in his
doc patch is the most pragmatic compromise.
--
Best regards,
Javier Martinez Canillas
Core Platforms
Red Hat
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 11:16 ` Pekka Paalanen
@ 2023-09-08 13:56 ` Thomas Zimmermann
2023-09-08 14:41 ` Pekka Paalanen
2023-09-08 14:48 ` Jocelyn Falempe
0 siblings, 2 replies; 26+ messages in thread
From: Thomas Zimmermann @ 2023-09-08 13:56 UTC (permalink / raw)
To: Pekka Paalanen; +Cc: Jocelyn Falempe, javierm, mripard, dri-devel, airlied
[-- Attachment #1.1: Type: text/plain, Size: 4793 bytes --]
Hi
Am 08.09.23 um 13:16 schrieb Pekka Paalanen:
> On Fri, 8 Sep 2023 11:21:51 +0200
> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
>> Hi
>>
>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
>> [...]
>>> + *
>>> + * But there are two exceptions only for dumb buffers:
>>> + * * To support XRGB8888 if it's not supported by the hardware.
>>
>>
>>> + * * Any driver is free to modify its internal representation of the format,
>>> + * as long as it doesn't alter the visible content in any way, and doesn't
>>> + * modify the user-provided buffer. An example would be to drop the
>>> + * padding component from a format to save some memory bandwidth.
>>
>> I have strong objections to this point, _especially_ as you're
>> apparently trying to sneak this in after our discussion. NAK on this
>> part from my side.
>>
>> If you want userspace to be able to use a certain format, then export
>> the corresponding 4cc code. Then let userspace decide what to do about
>> it. If userspace pick a certain format, go with it.
>
> What is the reason for your objection, exactly?
>
>> Hence, no implicit conversion from XRGB888 to RGB888, just because it's
>> possible.
>
> For the particular driver in question though, the conversion allows
> using a display resolution that is otherwise not possible. I also hear
> it improves performance since 25% less data needs to travel across a
> slow bus. There is also so little VRAM, than all dumb buffers need to
> be allocated from sysram instead anyway, so a copy is always necessary.
>
> Since XRGB8888 is the one format that is recommended to be supported by
> all drivers, I don't see a problem here. Did you test on your
> incredibly slow g200 test rig if the conversion ends up hurting instead
> of helping performance there?
>
> If it hurts, then I see that you have a good reason to NAK this.
>
> It's hard to imagine how it would hurt, since you always need a copy
> from sysram dumb buffers to VRAM - or do you?
I have a number of concerns. My point it not that we shouldn't optimize.
I just don't want it in the kernel. Mgag200 can export DRM_FORMAT_RGB888
for userspace to use.
AFAICT the main argument against userspace is that Mesa doesn't like
3-byte pixels. But I don't see how this conversion cannot be a
post-processing step within Mesa: do the rendering in RGB32 and then
convert to a framebuffer in RGB24. Userspace can do that more
efficiently than the kernel. This has all of the upsides of reduced
bandwidth, but none of the downsides of kernel code. Applications and/or
Mesa would be in control of the buffer format and apply the optimization
where it makes sense. And it would be available for all drivers that are
similar to mgag200.
My main point is simplicity of the driver: I prefer the driver to be
simple without unnecessary indirection or overhead. Optimizations like
these my or may not work on a given system with a certain workload. I'd
better leave this heuristic to userspace.
Another point of concern is CPU consumption: Slow I/O buses may stall
the display thread, but the CPU could do something else in the meantime.
Doing format conversion on the CPU prevents that, hence affecting other
parts of the system negatively. Of course, that's more of a gut feeling
than hard data.
Please note that the kernel's conversion code uses memory allocation of
intermediate buffers. We even recently had a discussion about allocation
overhead during display updates. Userspace can surely do a better job at
keeping such buffers around.
And finally a note the hardware itself: on low-end hardware like those
Matrox chips, just switch to RGB16. That will be pretty and fast enough
for these chips' server systems. Anyone who cares about fast and
beautiful should buy a real graphics card.
Best regards
Thomas
>
>
> Thanks,
> pq
>
>>> + * On most hardware, VRAM read access are slow, so when doing the software
>>> + * conversion, the dumb buffer should be allocated in system RAM in order to
>>> + * have decent performance.
>>> + * Extra care should be taken when doing software conversion with
>>> + * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
>>> + * https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
>>> */
>>>
>>> static unsigned int drm_num_planes(struct drm_device *dev)
>>>
>>> base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781
>>
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 13:46 ` Javier Martinez Canillas
@ 2023-09-08 14:06 ` Jocelyn Falempe
2023-09-08 14:13 ` Thomas Zimmermann
2023-09-08 14:09 ` Thomas Zimmermann
1 sibling, 1 reply; 26+ messages in thread
From: Jocelyn Falempe @ 2023-09-08 14:06 UTC (permalink / raw)
To: Javier Martinez Canillas, Thomas Zimmermann, Maxime Ripard
Cc: dri-devel, ppaalanen, airlied
On 08/09/2023 15:46, Javier Martinez Canillas wrote:
> Thomas Zimmermann <tzimmermann@suse.de> writes:
>
> Hello Thomas,
>
>> Hi Maxime
>>
>> Am 08.09.23 um 12:58 schrieb Maxime Ripard:
>>> Hi,
>>>
>>> On Fri, Sep 08, 2023 at 11:21:51AM +0200, Thomas Zimmermann wrote:
>>>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
>>>> [...]
>>>>> + *
>>>>> + * But there are two exceptions only for dumb buffers:
>>>>> + * * To support XRGB8888 if it's not supported by the hardware.
>>>>
>>>>
>>>>> + * * Any driver is free to modify its internal representation of the format,
>>>>> + * as long as it doesn't alter the visible content in any way, and doesn't
>>>>> + * modify the user-provided buffer. An example would be to drop the
>>>>> + * padding component from a format to save some memory bandwidth.
>>>>
>>>> I have strong objections to this point, _especially_ as you're apparently
>>>> trying to sneak this in after our discussion.
>>>
>>> I think it's an unfair characterization. This was discussed on
>>> #dri-devel, and went through several rounds over the mailing lists, with
>>> you in Cc for each. How is that sneaking something in?
>>
>> A few months ago, we had a flamewar'ish IRC discussion on format
>> conversion within the kernel. The general sentiment was that the kernel
>> drivers should use what ever is provided by userspace without further
>> processing. The short argument was 'userspace knows better'. The only
>> exception is for supporting XRGB8888 on hardware that would otherwise
>> not support it. After some consideration, I agree with all that. (Back
>> then I didn't.)
I wasn't part of this "flamewar", and though my patch was a bit
unrelated to this. That's why I started this work to document clearly
what is acceptable in the kernel or not. I discuss it on IRC, and then
proposed the patch on dri-devel to find a compromise, and see if my case
can be acceptable or not.
>>
>> A few weeks ago I received a patch to do an implicit conversion from
>> XRGB8888 to RGB888 within mgag200. [1] I don't have a link to the
>> discussion, but I NAK'ed that patch pretty hard on IRC by following that
>> other discussion.
>>
>> And know I find that this patch (even in its v1) contains language that
>> retroactively legitimizes the mgag200 patch. I wrote 'apparently' I my
>> reply, as I assume that there's more to it, but how does it not look
>> like an attempt to sneak in something that is known to be controversial?
>>
That was not my intention, and I apologize if you feel it this way. My
goal was just to clarify if this optimization is acceptable for other
kernel developers, since I though you were willing to accept it, but
some other developers from the "flamewar" were against.
>
> While is true that the motivation for Jocelyn's patch was to make explicit
> what are the rules with regard to drivers emulating formats (other than
> "we had a flamewar on IRC a while back" which is quite ambiguous), it was
> not attempt to sneak something that is known to be controversial.
>
> In fact, it is an attempt to dispel the controversy and document what is
> acceptable and what is not for a driver.
>
>> It might have been better to discuss the question separately on the
>> dri-devel ML. Maybe we can do this here.
>>
>
> This was discussed in the #dri-devel IRC channel, I believe you were on
> PTO at the time and probably that's why you missed. I found the logs here:
>
> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2023-08-04
>
> As you can see there, most people agreed that what Jocelyn wrote in his
> doc patch is the most pragmatic compromise.
>
Best regards,
--
Jocelyn
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 13:46 ` Javier Martinez Canillas
2023-09-08 14:06 ` Jocelyn Falempe
@ 2023-09-08 14:09 ` Thomas Zimmermann
1 sibling, 0 replies; 26+ messages in thread
From: Thomas Zimmermann @ 2023-09-08 14:09 UTC (permalink / raw)
To: Javier Martinez Canillas, Maxime Ripard
Cc: Jocelyn Falempe, dri-devel, ppaalanen, airlied
[-- Attachment #1.1: Type: text/plain, Size: 4080 bytes --]
Hi Javier
Am 08.09.23 um 15:46 schrieb Javier Martinez Canillas:
> Thomas Zimmermann <tzimmermann@suse.de> writes:
>
> Hello Thomas,
>
>> Hi Maxime
>>
>> Am 08.09.23 um 12:58 schrieb Maxime Ripard:
>>> Hi,
>>>
>>> On Fri, Sep 08, 2023 at 11:21:51AM +0200, Thomas Zimmermann wrote:
>>>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
>>>> [...]
>>>>> + *
>>>>> + * But there are two exceptions only for dumb buffers:
>>>>> + * * To support XRGB8888 if it's not supported by the hardware.
>>>>
>>>>
>>>>> + * * Any driver is free to modify its internal representation of the format,
>>>>> + * as long as it doesn't alter the visible content in any way, and doesn't
>>>>> + * modify the user-provided buffer. An example would be to drop the
>>>>> + * padding component from a format to save some memory bandwidth.
>>>>
>>>> I have strong objections to this point, _especially_ as you're apparently
>>>> trying to sneak this in after our discussion.
>>>
>>> I think it's an unfair characterization. This was discussed on
>>> #dri-devel, and went through several rounds over the mailing lists, with
>>> you in Cc for each. How is that sneaking something in?
>>
>> A few months ago, we had a flamewar'ish IRC discussion on format
>> conversion within the kernel. The general sentiment was that the kernel
>> drivers should use what ever is provided by userspace without further
>> processing. The short argument was 'userspace knows better'. The only
>> exception is for supporting XRGB8888 on hardware that would otherwise
>> not support it. After some consideration, I agree with all that. (Back
>> then I didn't.)
>>
>> A few weeks ago I received a patch to do an implicit conversion from
>> XRGB8888 to RGB888 within mgag200. [1] I don't have a link to the
>> discussion, but I NAK'ed that patch pretty hard on IRC by following that
>> other discussion.
>>
>> And know I find that this patch (even in its v1) contains language that
>> retroactively legitimizes the mgag200 patch. I wrote 'apparently' I my
>> reply, as I assume that there's more to it, but how does it not look
>> like an attempt to sneak in something that is known to be controversial?
>>
>
> While is true that the motivation for Jocelyn's patch was to make explicit
> what are the rules with regard to drivers emulating formats (other than
> "we had a flamewar on IRC a while back" which is quite ambiguous), it was
> not attempt to sneak something that is known to be controversial.
>
> In fact, it is an attempt to dispel the controversy and document what is
> acceptable and what is not for a driver.
>
>> It might have been better to discuss the question separately on the
>> dri-devel ML. Maybe we can do this here.
>>
>
> This was discussed in the #dri-devel IRC channel, I believe you were on
> PTO at the time and probably that's why you missed. I found the logs here:
>
> https://people.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2023-08-04
>
> As you can see there, most people agreed that what Jocelyn wrote in his
> doc patch is the most pragmatic compromise.
Thanks for the pointer to the URL. Quoting Daniel (sima) from that
discussion.
"imo just document that for hw/drivers where XRGB8888 is not support or
has a very bad cost in terms of upload/scanout bw it's ok to emulate it
in the kernel driver, but _only_ for that format "
This seems the overall sentiment. I disagree with the "has very bad
cost" part, though. The cost/benefit ratio should be determined by
userspace IMHO. Please see my reply to Pekka for some details.
I don't have a pointer to that other IRC discussion, but IIRC there
where quite a lot more people involved, including from userspace. Many
of those are absent here.
Best regards
Thomas
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 14:06 ` Jocelyn Falempe
@ 2023-09-08 14:13 ` Thomas Zimmermann
0 siblings, 0 replies; 26+ messages in thread
From: Thomas Zimmermann @ 2023-09-08 14:13 UTC (permalink / raw)
To: Jocelyn Falempe, Javier Martinez Canillas, Maxime Ripard
Cc: airlied, ppaalanen, dri-devel
[-- Attachment #1.1: Type: text/plain, Size: 856 bytes --]
Hi Jocelyn
Am 08.09.23 um 16:06 schrieb Jocelyn Falempe:
[...]
>>> And know I find that this patch (even in its v1) contains language that
>>> retroactively legitimizes the mgag200 patch. I wrote 'apparently' I my
>>> reply, as I assume that there's more to it, but how does it not look
>>> like an attempt to sneak in something that is known to be controversial?
>>>
>
> That was not my intention, and I apologize if you feel it this way. My
> goal was just to clarify if this optimization is acceptable for other
Sorry for throwing accusations around. You certainly act in good faith.
Best regards
Thomas
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 13:56 ` Thomas Zimmermann
@ 2023-09-08 14:41 ` Pekka Paalanen
2023-09-08 15:10 ` Thomas Zimmermann
2023-09-08 14:48 ` Jocelyn Falempe
1 sibling, 1 reply; 26+ messages in thread
From: Pekka Paalanen @ 2023-09-08 14:41 UTC (permalink / raw)
To: Thomas Zimmermann; +Cc: Jocelyn Falempe, javierm, mripard, dri-devel, airlied
[-- Attachment #1: Type: text/plain, Size: 5483 bytes --]
On Fri, 8 Sep 2023 15:56:51 +0200
Thomas Zimmermann <tzimmermann@suse.de> wrote:
> Hi
>
> Am 08.09.23 um 13:16 schrieb Pekka Paalanen:
> > On Fri, 8 Sep 2023 11:21:51 +0200
> > Thomas Zimmermann <tzimmermann@suse.de> wrote:
> >
> >> Hi
> >>
> >> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
> >> [...]
> >>> + *
> >>> + * But there are two exceptions only for dumb buffers:
> >>> + * * To support XRGB8888 if it's not supported by the hardware.
> >>
> >>
> >>> + * * Any driver is free to modify its internal representation of the format,
> >>> + * as long as it doesn't alter the visible content in any way, and doesn't
> >>> + * modify the user-provided buffer. An example would be to drop the
> >>> + * padding component from a format to save some memory bandwidth.
> >>
> >> I have strong objections to this point, _especially_ as you're
> >> apparently trying to sneak this in after our discussion. NAK on this
> >> part from my side.
> >>
> >> If you want userspace to be able to use a certain format, then export
> >> the corresponding 4cc code. Then let userspace decide what to do about
> >> it. If userspace pick a certain format, go with it.
> >
> > What is the reason for your objection, exactly?
> >
> >> Hence, no implicit conversion from XRGB888 to RGB888, just because it's
> >> possible.
> >
> > For the particular driver in question though, the conversion allows
> > using a display resolution that is otherwise not possible. I also hear
> > it improves performance since 25% less data needs to travel across a
> > slow bus. There is also so little VRAM, than all dumb buffers need to
> > be allocated from sysram instead anyway, so a copy is always necessary.
> >
> > Since XRGB8888 is the one format that is recommended to be supported by
> > all drivers, I don't see a problem here. Did you test on your
> > incredibly slow g200 test rig if the conversion ends up hurting instead
> > of helping performance there?
> >
> > If it hurts, then I see that you have a good reason to NAK this.
> >
> > It's hard to imagine how it would hurt, since you always need a copy
> > from sysram dumb buffers to VRAM - or do you?
>
> I have a number of concerns. My point it not that we shouldn't optimize.
> I just don't want it in the kernel. Mgag200 can export DRM_FORMAT_RGB888
> for userspace to use.
>
> AFAICT the main argument against userspace is that Mesa doesn't like
> 3-byte pixels. But I don't see how this conversion cannot be a
> post-processing step within Mesa: do the rendering in RGB32 and then
> convert to a framebuffer in RGB24. Userspace can do that more
> efficiently than the kernel. This has all of the upsides of reduced
> bandwidth, but none of the downsides of kernel code. Applications and/or
> Mesa would be in control of the buffer format and apply the optimization
> where it makes sense. And it would be available for all drivers that are
> similar to mgag200.
>
> My main point is simplicity of the driver: I prefer the driver to be
> simple without unnecessary indirection or overhead. Optimizations like
> these my or may not work on a given system with a certain workload. I'd
> better leave this heuristic to userspace.
>
> Another point of concern is CPU consumption: Slow I/O buses may stall
> the display thread, but the CPU could do something else in the meantime.
> Doing format conversion on the CPU prevents that, hence affecting other
> parts of the system negatively. Of course, that's more of a gut feeling
> than hard data.
>
> Please note that the kernel's conversion code uses memory allocation of
> intermediate buffers. We even recently had a discussion about allocation
> overhead during display updates. Userspace can surely do a better job at
> keeping such buffers around.
>
> And finally a note the hardware itself: on low-end hardware like those
> Matrox chips, just switch to RGB16. That will be pretty and fast enough
> for these chips' server systems. Anyone who cares about fast and
> beautiful should buy a real graphics card.
Fair enough.
Did you consider that every frame will be copied twice: once in
userspace to get RGB888, then again by the driver into VRAM, since dumb
buffers reside in sysram?
RGB565 would probably go the same route I guess.
I suspect the intermediate buffers (dumb buffers in this case) need to
be full frame size rather than a scanline, too. I'm not sure why the
driver would need any scratch buffers beyond a couple dozen bytes while
doing a software conversion, just enough to have the lowest common
multiple of 3 bytes and optimal bus write width.
Thanks,
pq
>
> Best regards
> Thomas
>
> >
> >
> > Thanks,
> > pq
> >
> >>> + * On most hardware, VRAM read access are slow, so when doing the software
> >>> + * conversion, the dumb buffer should be allocated in system RAM in order to
> >>> + * have decent performance.
> >>> + * Extra care should be taken when doing software conversion with
> >>> + * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
> >>> + * https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
> >>> */
> >>>
> >>> static unsigned int drm_num_planes(struct drm_device *dev)
> >>>
> >>> base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781
> >>
> >
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 13:56 ` Thomas Zimmermann
2023-09-08 14:41 ` Pekka Paalanen
@ 2023-09-08 14:48 ` Jocelyn Falempe
2023-09-08 15:37 ` Thomas Zimmermann
1 sibling, 1 reply; 26+ messages in thread
From: Jocelyn Falempe @ 2023-09-08 14:48 UTC (permalink / raw)
To: Thomas Zimmermann, Pekka Paalanen; +Cc: javierm, mripard, dri-devel, airlied
On 08/09/2023 15:56, Thomas Zimmermann wrote:
> Hi
>
> Am 08.09.23 um 13:16 schrieb Pekka Paalanen:
>> On Fri, 8 Sep 2023 11:21:51 +0200
>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>
>>> Hi
>>>
>>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
>>> [...]
>>>> + *
>>>> + * But there are two exceptions only for dumb buffers:
>>>> + * * To support XRGB8888 if it's not supported by the hardware.
>>>
>>>
>>>> + * * Any driver is free to modify its internal representation
>>>> of the format,
>>>> + * as long as it doesn't alter the visible content in any
>>>> way, and doesn't
>>>> + * modify the user-provided buffer. An example would be to
>>>> drop the
>>>> + * padding component from a format to save some memory
>>>> bandwidth.
>>>
>>> I have strong objections to this point, _especially_ as you're
>>> apparently trying to sneak this in after our discussion. NAK on this
>>> part from my side.
>>>
>>> If you want userspace to be able to use a certain format, then export
>>> the corresponding 4cc code. Then let userspace decide what to do about
>>> it. If userspace pick a certain format, go with it.
>>
>> What is the reason for your objection, exactly?
>>
>>> Hence, no implicit conversion from XRGB888 to RGB888, just because it's
>>> possible.
>>
>> For the particular driver in question though, the conversion allows
>> using a display resolution that is otherwise not possible. I also hear
>> it improves performance since 25% less data needs to travel across a
>> slow bus. There is also so little VRAM, than all dumb buffers need to
>> be allocated from sysram instead anyway, so a copy is always necessary.
>>
>> Since XRGB8888 is the one format that is recommended to be supported by
>> all drivers, I don't see a problem here. Did you test on your
>> incredibly slow g200 test rig if the conversion ends up hurting instead
>> of helping performance there?
>>
>> If it hurts, then I see that you have a good reason to NAK this.
>>
>> It's hard to imagine how it would hurt, since you always need a copy
>> from sysram dumb buffers to VRAM - or do you?
>
> I have a number of concerns. My point it not that we shouldn't optimize.
> I just don't want it in the kernel. Mgag200 can export DRM_FORMAT_RGB888
> for userspace to use.
It already does, it's just userspace doesn't want to support it.
>
> AFAICT the main argument against userspace is that Mesa doesn't like
> 3-byte pixels. But I don't see how this conversion cannot be a
> post-processing step within Mesa: do the rendering in RGB32 and then
> convert to a framebuffer in RGB24. Userspace can do that more
> efficiently than the kernel. This has all of the upsides of reduced
> bandwidth, but none of the downsides of kernel code. Applications and/or
> Mesa would be in control of the buffer format and apply the optimization
> where it makes sense. And it would be available for all drivers that are
> similar to mgag200.
For this particular case, user-space would use more memory and CPU,
because the copy to VRAM is done on kernel side, and it is where the
conversion must be done for maximum performances. So there is no way for
userspace to be as efficient as the kernel in this use-case.
For the conversion, the kernel allocate only 1 line, and convert/copy
one line at a time. And because the CPU is out-of-order, it can start
converting the next line using CPU registers while waiting for the bus.
Userspace would need to allocate a whole framebuffer, and can't use the
"wasted" CPU cycle to do the conversion.
>
> My main point is simplicity of the driver: I prefer the driver to be
> simple without unnecessary indirection or overhead. Optimizations like
> these my or may not work on a given system with a certain workload. I'd
> better leave this heuristic to userspace.
I agree that the driver is simpler without optimizing thing. But I think
it's the role of the driver to get the maximum from the hardware, even
if it's old and broken like g200. And userspace don't want to optimize
for such hardware.
>
> Another point of concern is CPU consumption: Slow I/O buses may stall
> the display thread, but the CPU could do something else in the meantime.
> Doing format conversion on the CPU prevents that, hence affecting other
> parts of the system negatively. Of course, that's more of a gut feeling
> than hard data.
I think it's the reverse. Without dropping the X data, the CPU is
actually stalling much longer sending useless bytes to the mgag200's
VRAM. And the CPU can't do anything else while doing memcpy_toio().
Using DMA is the only way to free the CPU during the copy, but it
appears to be either broken or significantly slower on most mgag200
hardware.
I actually have made the work to support DMA, and I'm a bit sad it
didn't work on all g200, so this optimization is really a last resort,
on a really broken hardware.
>
> Please note that the kernel's conversion code uses memory allocation of
> intermediate buffers. We even recently had a discussion about allocation
> overhead during display updates. Userspace can surely do a better job at
> keeping such buffers around.
>
> And finally a note the hardware itself: on low-end hardware like those
> Matrox chips, just switch to RGB16. That will be pretty and fast enough
> for these chips' server systems. Anyone who cares about fast and
> beautiful should buy a real graphics card.
There are still sysadmin users that occasionally have to browse the web
to find answer, or read their mail in a GUI client. It turns out that
rgb16 is pretty ugly for today standard, and buying an external card
would be a bit too much, and won't work for remote control.
Best regards,
--
Jocelyn
>
> Best regards
> Thomas
>
>>
>>
>> Thanks,
>> pq
>>
>>>> + * On most hardware, VRAM read access are slow, so when doing
>>>> the software
>>>> + * conversion, the dumb buffer should be allocated in system
>>>> RAM in order to
>>>> + * have decent performance.
>>>> + * Extra care should be taken when doing software conversion with
>>>> + * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed
>>>> explanations here:
>>>> + *
>>>> https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
>>>> */
>>>> static unsigned int drm_num_planes(struct drm_device *dev)
>>>>
>>>> base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781
>>>
>>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 14:41 ` Pekka Paalanen
@ 2023-09-08 15:10 ` Thomas Zimmermann
2023-09-11 8:38 ` Pekka Paalanen
0 siblings, 1 reply; 26+ messages in thread
From: Thomas Zimmermann @ 2023-09-08 15:10 UTC (permalink / raw)
To: Pekka Paalanen; +Cc: dri-devel, airlied, Jocelyn Falempe, javierm, mripard
[-- Attachment #1.1: Type: text/plain, Size: 6925 bytes --]
Hi
Am 08.09.23 um 16:41 schrieb Pekka Paalanen:
> On Fri, 8 Sep 2023 15:56:51 +0200
> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
>> Hi
>>
>> Am 08.09.23 um 13:16 schrieb Pekka Paalanen:
>>> On Fri, 8 Sep 2023 11:21:51 +0200
>>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>>
>>>> Hi
>>>>
>>>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
>>>> [...]
>>>>> + *
>>>>> + * But there are two exceptions only for dumb buffers:
>>>>> + * * To support XRGB8888 if it's not supported by the hardware.
>>>>
>>>>
>>>>> + * * Any driver is free to modify its internal representation of the format,
>>>>> + * as long as it doesn't alter the visible content in any way, and doesn't
>>>>> + * modify the user-provided buffer. An example would be to drop the
>>>>> + * padding component from a format to save some memory bandwidth.
>>>>
>>>> I have strong objections to this point, _especially_ as you're
>>>> apparently trying to sneak this in after our discussion. NAK on this
>>>> part from my side.
>>>>
>>>> If you want userspace to be able to use a certain format, then export
>>>> the corresponding 4cc code. Then let userspace decide what to do about
>>>> it. If userspace pick a certain format, go with it.
>>>
>>> What is the reason for your objection, exactly?
>>>
>>>> Hence, no implicit conversion from XRGB888 to RGB888, just because it's
>>>> possible.
>>>
>>> For the particular driver in question though, the conversion allows
>>> using a display resolution that is otherwise not possible. I also hear
>>> it improves performance since 25% less data needs to travel across a
>>> slow bus. There is also so little VRAM, than all dumb buffers need to
>>> be allocated from sysram instead anyway, so a copy is always necessary.
>>>
>>> Since XRGB8888 is the one format that is recommended to be supported by
>>> all drivers, I don't see a problem here. Did you test on your
>>> incredibly slow g200 test rig if the conversion ends up hurting instead
>>> of helping performance there?
>>>
>>> If it hurts, then I see that you have a good reason to NAK this.
>>>
>>> It's hard to imagine how it would hurt, since you always need a copy
>>> from sysram dumb buffers to VRAM - or do you?
>>
>> I have a number of concerns. My point it not that we shouldn't optimize.
>> I just don't want it in the kernel. Mgag200 can export DRM_FORMAT_RGB888
>> for userspace to use.
>>
>> AFAICT the main argument against userspace is that Mesa doesn't like
>> 3-byte pixels. But I don't see how this conversion cannot be a
>> post-processing step within Mesa: do the rendering in RGB32 and then
>> convert to a framebuffer in RGB24. Userspace can do that more
>> efficiently than the kernel. This has all of the upsides of reduced
>> bandwidth, but none of the downsides of kernel code. Applications and/or
>> Mesa would be in control of the buffer format and apply the optimization
>> where it makes sense. And it would be available for all drivers that are
>> similar to mgag200.
>>
>> My main point is simplicity of the driver: I prefer the driver to be
>> simple without unnecessary indirection or overhead. Optimizations like
>> these my or may not work on a given system with a certain workload. I'd
>> better leave this heuristic to userspace.
>>
>> Another point of concern is CPU consumption: Slow I/O buses may stall
>> the display thread, but the CPU could do something else in the meantime.
>> Doing format conversion on the CPU prevents that, hence affecting other
>> parts of the system negatively. Of course, that's more of a gut feeling
>> than hard data.
>>
>> Please note that the kernel's conversion code uses memory allocation of
>> intermediate buffers. We even recently had a discussion about allocation
>> overhead during display updates. Userspace can surely do a better job at
>> keeping such buffers around.
>>
>> And finally a note the hardware itself: on low-end hardware like those
>> Matrox chips, just switch to RGB16. That will be pretty and fast enough
>> for these chips' server systems. Anyone who cares about fast and
>> beautiful should buy a real graphics card.
>
> Fair enough.
>
> Did you consider that every frame will be copied twice: once in
> userspace to get RGB888, then again by the driver into VRAM, since dumb
> buffers reside in sysram?
In the kernel, we reduce the copying to the changed parts, if we have
damage information from userspace. IDK Mesa's software renderer, but it
could certainly apply a similar optimization.
>
> RGB565 would probably go the same route I guess.
From what I know userspace still supports RGB565 rendering directly.
Last time I tested, it worked for me.
>
> I suspect the intermediate buffers (dumb buffers in this case) need to
> be full frame size rather than a scanline, too. I'm not sure why the
> driver would need any scratch buffers beyond a couple dozen bytes while
> doing a software conversion, just enough to have the lowest common
> multiple of 3 bytes and optimal bus write width.
For the conversion in the kernel we allocate enough temporary memory to
hold the part of each scanline that changed. Then we go over dirty
scanlines, convert each into the output format and store it in that
temporary buffer. Then copy the temporary buffer to VRAM. The buffer can
be reused for the scanlines of a single conversion. But the nature of
the framebuffer (buffers are possibly shared with concurrent access from
multiple planes) makes it hard to keep that temporary memory around.
Hence it's freed after each conversion. The code is at [1].
I assume that a userspace software renderer could do a much better job
at keeping the temporary memory allocated.
Best regards
Thomas
[1]
https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_format_helper.c#L88
>
>
> Thanks,
> pq
>
>>
>> Best regards
>> Thomas
>>
>>>
>>>
>>> Thanks,
>>> pq
>>>
>>>>> + * On most hardware, VRAM read access are slow, so when doing the software
>>>>> + * conversion, the dumb buffer should be allocated in system RAM in order to
>>>>> + * have decent performance.
>>>>> + * Extra care should be taken when doing software conversion with
>>>>> + * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
>>>>> + * https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
>>>>> */
>>>>>
>>>>> static unsigned int drm_num_planes(struct drm_device *dev)
>>>>>
>>>>> base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781
>>>>
>>>
>>
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 14:48 ` Jocelyn Falempe
@ 2023-09-08 15:37 ` Thomas Zimmermann
2023-09-11 10:05 ` Jocelyn Falempe
2023-09-11 11:14 ` Maxime Ripard
0 siblings, 2 replies; 26+ messages in thread
From: Thomas Zimmermann @ 2023-09-08 15:37 UTC (permalink / raw)
To: Jocelyn Falempe, Pekka Paalanen; +Cc: javierm, mripard, dri-devel, airlied
[-- Attachment #1.1: Type: text/plain, Size: 7165 bytes --]
Hi
Am 08.09.23 um 16:48 schrieb Jocelyn Falempe:
> On 08/09/2023 15:56, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 08.09.23 um 13:16 schrieb Pekka Paalanen:
>>> On Fri, 8 Sep 2023 11:21:51 +0200
>>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>>
>>>> Hi
>>>>
>>>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
>>>> [...]
>>>>> + *
>>>>> + * But there are two exceptions only for dumb buffers:
>>>>> + * * To support XRGB8888 if it's not supported by the hardware.
>>>>
>>>>
>>>>> + * * Any driver is free to modify its internal representation
>>>>> of the format,
>>>>> + * as long as it doesn't alter the visible content in any
>>>>> way, and doesn't
>>>>> + * modify the user-provided buffer. An example would be to
>>>>> drop the
>>>>> + * padding component from a format to save some memory
>>>>> bandwidth.
>>>>
>>>> I have strong objections to this point, _especially_ as you're
>>>> apparently trying to sneak this in after our discussion. NAK on this
>>>> part from my side.
>>>>
>>>> If you want userspace to be able to use a certain format, then export
>>>> the corresponding 4cc code. Then let userspace decide what to do about
>>>> it. If userspace pick a certain format, go with it.
>>>
>>> What is the reason for your objection, exactly?
>>>
>>>> Hence, no implicit conversion from XRGB888 to RGB888, just because it's
>>>> possible.
>>>
>>> For the particular driver in question though, the conversion allows
>>> using a display resolution that is otherwise not possible. I also hear
>>> it improves performance since 25% less data needs to travel across a
>>> slow bus. There is also so little VRAM, than all dumb buffers need to
>>> be allocated from sysram instead anyway, so a copy is always necessary.
>>>
>>> Since XRGB8888 is the one format that is recommended to be supported by
>>> all drivers, I don't see a problem here. Did you test on your
>>> incredibly slow g200 test rig if the conversion ends up hurting instead
>>> of helping performance there?
>>>
>>> If it hurts, then I see that you have a good reason to NAK this.
>>>
>>> It's hard to imagine how it would hurt, since you always need a copy
>>> from sysram dumb buffers to VRAM - or do you?
>>
>> I have a number of concerns. My point it not that we shouldn't
>> optimize. I just don't want it in the kernel. Mgag200 can export
>> DRM_FORMAT_RGB888 for userspace to use.
>
> It already does, it's just userspace doesn't want to support it.
>>
>> AFAICT the main argument against userspace is that Mesa doesn't like
>> 3-byte pixels. But I don't see how this conversion cannot be a
>> post-processing step within Mesa: do the rendering in RGB32 and then
>> convert to a framebuffer in RGB24. Userspace can do that more
>> efficiently than the kernel. This has all of the upsides of reduced
>> bandwidth, but none of the downsides of kernel code. Applications
>> and/or Mesa would be in control of the buffer format and apply the
>> optimization where it makes sense. And it would be available for all
>> drivers that are similar to mgag200.
>
> For this particular case, user-space would use more memory and CPU,
> because the copy to VRAM is done on kernel side, and it is where the
> conversion must be done for maximum performances. So there is no way for
> userspace to be as efficient as the kernel in this use-case.
>
> For the conversion, the kernel allocate only 1 line, and convert/copy
> one line at a time. And because the CPU is out-of-order, it can start
> converting the next line using CPU registers while waiting for the bus.
Access is writecombined, so you cannot throw large amounts of data
towards the bus and do something else meanwhile.
>
> Userspace would need to allocate a whole framebuffer, and can't use the
> "wasted" CPU cycle to do the conversion.
Yes, userspace would probably need a full extra framebuffer to store the
RGB32 data. But just as in the kernel, userspace can do format
conversion of only the damaged areas. No extra CPU overhead here.
>
>>
>> My main point is simplicity of the driver: I prefer the driver to be
>> simple without unnecessary indirection or overhead. Optimizations like
>> these my or may not work on a given system with a certain workload.
>> I'd better leave this heuristic to userspace.
>
> I agree that the driver is simpler without optimizing thing. But I think
> it's the role of the driver to get the maximum from the hardware, even
> if it's old and broken like g200. And userspace don't want to optimize
> for such hardware.
Optimization always depends on the workload; something that the driver
doesn't know. For example, as we mostly move the mouse cursor around the
screen, the damages areas are usually small. Optimizing this might be
pointless in any case.
>
>>
>> Another point of concern is CPU consumption: Slow I/O buses may stall
>> the display thread, but the CPU could do something else in the
>> meantime. Doing format conversion on the CPU prevents that, hence
>> affecting other parts of the system negatively. Of course, that's more
>> of a gut feeling than hard data.
>
> I think it's the reverse. Without dropping the X data, the CPU is
> actually stalling much longer sending useless bytes to the mgag200's
> VRAM. And the CPU can't do anything else while doing memcpy_toio().
Hyperthreading.
You are also arguing against XRGB in general, which is a different topic.
> Using DMA is the only way to free the CPU during the copy, but it
> appears to be either broken or significantly slower on most mgag200
> hardware.
>
> I actually have made the work to support DMA, and I'm a bit sad it
> didn't work on all g200, so this optimization is really a last resort,
> on a really broken hardware.
>
>>
>> Please note that the kernel's conversion code uses memory allocation
>> of intermediate buffers. We even recently had a discussion about
>> allocation overhead during display updates. Userspace can surely do a
>> better job at keeping such buffers around.
>>
>> And finally a note the hardware itself: on low-end hardware like those
>> Matrox chips, just switch to RGB16. That will be pretty and fast
>> enough for these chips' server systems. Anyone who cares about fast
>> and beautiful should buy a real graphics card.
>
> There are still sysadmin users that occasionally have to browse the web
> to find answer, or read their mail in a GUI client. It turns out that
> rgb16 is pretty ugly for today standard, and buying an external card
> would be a bit too much, and won't work for remote control.
I'm sure sysadmins have a computer for work with a decent GPU and don't
need to browse the web on their server systems.
Best regards
Thomas
>
>
> Best regards,
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 15:10 ` Thomas Zimmermann
@ 2023-09-11 8:38 ` Pekka Paalanen
2023-09-11 10:18 ` Thomas Zimmermann
2023-09-12 15:57 ` Michel Dänzer
0 siblings, 2 replies; 26+ messages in thread
From: Pekka Paalanen @ 2023-09-11 8:38 UTC (permalink / raw)
To: Thomas Zimmermann; +Cc: dri-devel, airlied, Jocelyn Falempe, javierm, mripard
[-- Attachment #1: Type: text/plain, Size: 9482 bytes --]
On Fri, 8 Sep 2023 17:10:46 +0200
Thomas Zimmermann <tzimmermann@suse.de> wrote:
> Hi
>
> Am 08.09.23 um 16:41 schrieb Pekka Paalanen:
> > On Fri, 8 Sep 2023 15:56:51 +0200
> > Thomas Zimmermann <tzimmermann@suse.de> wrote:
> >
> >> Hi
> >>
> >> Am 08.09.23 um 13:16 schrieb Pekka Paalanen:
> >>> On Fri, 8 Sep 2023 11:21:51 +0200
> >>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
> >>>
> >>>> Hi
> >>>>
> >>>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
> >>>> [...]
> >>>>> + *
> >>>>> + * But there are two exceptions only for dumb buffers:
> >>>>> + * * To support XRGB8888 if it's not supported by the hardware.
> >>>>
> >>>>
> >>>>> + * * Any driver is free to modify its internal representation of the format,
> >>>>> + * as long as it doesn't alter the visible content in any way, and doesn't
> >>>>> + * modify the user-provided buffer. An example would be to drop the
> >>>>> + * padding component from a format to save some memory bandwidth.
> >>>>
> >>>> I have strong objections to this point, _especially_ as you're
> >>>> apparently trying to sneak this in after our discussion. NAK on this
> >>>> part from my side.
> >>>>
> >>>> If you want userspace to be able to use a certain format, then export
> >>>> the corresponding 4cc code. Then let userspace decide what to do about
> >>>> it. If userspace pick a certain format, go with it.
> >>>
> >>> What is the reason for your objection, exactly?
> >>>
> >>>> Hence, no implicit conversion from XRGB888 to RGB888, just because it's
> >>>> possible.
> >>>
> >>> For the particular driver in question though, the conversion allows
> >>> using a display resolution that is otherwise not possible. I also hear
> >>> it improves performance since 25% less data needs to travel across a
> >>> slow bus. There is also so little VRAM, than all dumb buffers need to
> >>> be allocated from sysram instead anyway, so a copy is always necessary.
> >>>
> >>> Since XRGB8888 is the one format that is recommended to be supported by
> >>> all drivers, I don't see a problem here. Did you test on your
> >>> incredibly slow g200 test rig if the conversion ends up hurting instead
> >>> of helping performance there?
> >>>
> >>> If it hurts, then I see that you have a good reason to NAK this.
> >>>
> >>> It's hard to imagine how it would hurt, since you always need a copy
> >>> from sysram dumb buffers to VRAM - or do you?
> >>
> >> I have a number of concerns. My point it not that we shouldn't optimize.
> >> I just don't want it in the kernel. Mgag200 can export DRM_FORMAT_RGB888
> >> for userspace to use.
> >>
> >> AFAICT the main argument against userspace is that Mesa doesn't like
> >> 3-byte pixels. But I don't see how this conversion cannot be a
> >> post-processing step within Mesa: do the rendering in RGB32 and then
> >> convert to a framebuffer in RGB24. Userspace can do that more
> >> efficiently than the kernel. This has all of the upsides of reduced
> >> bandwidth, but none of the downsides of kernel code. Applications and/or
> >> Mesa would be in control of the buffer format and apply the optimization
> >> where it makes sense. And it would be available for all drivers that are
> >> similar to mgag200.
> >>
> >> My main point is simplicity of the driver: I prefer the driver to be
> >> simple without unnecessary indirection or overhead. Optimizations like
> >> these my or may not work on a given system with a certain workload. I'd
> >> better leave this heuristic to userspace.
> >>
> >> Another point of concern is CPU consumption: Slow I/O buses may stall
> >> the display thread, but the CPU could do something else in the meantime.
> >> Doing format conversion on the CPU prevents that, hence affecting other
> >> parts of the system negatively. Of course, that's more of a gut feeling
> >> than hard data.
> >>
> >> Please note that the kernel's conversion code uses memory allocation of
> >> intermediate buffers. We even recently had a discussion about allocation
> >> overhead during display updates. Userspace can surely do a better job at
> >> keeping such buffers around.
> >>
> >> And finally a note the hardware itself: on low-end hardware like those
> >> Matrox chips, just switch to RGB16. That will be pretty and fast enough
> >> for these chips' server systems. Anyone who cares about fast and
> >> beautiful should buy a real graphics card.
> >
> > Fair enough.
> >
> > Did you consider that every frame will be copied twice: once in
> > userspace to get RGB888, then again by the driver into VRAM, since dumb
> > buffers reside in sysram?
>
> In the kernel, we reduce the copying to the changed parts, if we have
> damage information from userspace. IDK Mesa's software renderer, but it
> could certainly apply a similar optimization.
I have already assumed that everything uses damage information to
optimize the regions to copy. It's still two copies instead of one.
Actually, it's slightly more than two copies.
Due to damage tracking and the driver needing to flip between two VRAM
buffers, it is always copying current + previous damage, not only
current damage.
> > I suspect the intermediate buffers (dumb buffers in this case) need to
> > be full frame size rather than a scanline, too. I'm not sure why the
> > driver would need any scratch buffers beyond a couple dozen bytes while
> > doing a software conversion, just enough to have the lowest common
> > multiple of 3 bytes and optimal bus write width.
>
> For the conversion in the kernel we allocate enough temporary memory to
> hold the part of each scanline that changed. Then we go over dirty
> scanlines, convert each into the output format and store it in that
> temporary buffer. Then copy the temporary buffer to VRAM. The buffer can
> be reused for the scanlines of a single conversion. But the nature of
> the framebuffer (buffers are possibly shared with concurrent access from
> multiple planes) makes it hard to keep that temporary memory around.
> Hence it's freed after each conversion. The code is at [1].
Yes, that's the conversion in the kernel. However, before the kernel
can run that conversion, userspace must have already allocated full
sized dumb buffers to convert its full sized internal framebuffer into.
Since KMS is modernly used with double-buffering, there must always be
two dumb buffers, which means updating a dumb buffer needs to copy not
just current damage but also previous damage. Userspace has no way to
know that single-buffering would be equally good in this case for this
particular driver.
If userspace gave its internal framebuffer to the kernel without doing
the conversion to RGB888, then that internal buffer would be the dumb
buffer, and there would be no need to allocate a second full size
buffer (third, because double-buffering towards KMS).
The driver allocating even full scanlines would be a lot less memory
touched if userspace didn't convert to RGB888, and if the driver used a
tailored conversion function (it literally needs to handle only a single
combination of input and output conditions), it wouldn't need even that
nor to allocate and free on every conversion. I understand you do not
want to pay the cost having such special-case code, and that's ok. It
would just be even further optimization after getting rid of a static
full sized buffer allocation.
> I assume that a userspace software renderer could do a much better job
> at keeping the temporary memory allocated.
I'm not sure why the kernel can't keep the temporary scanline buffer
allocated with the CRTC. It only needs to be reallocated if it's too
small. Sure, people probably don't want to spend time on such code.
All the above discussion is based on the assumption that dumb buffers
are allocated in sysram.
It's fine to say you don't want to optimize, but I want to be clear on
the exact trade-off.
The trade-offs vary greatly depending on each particular use case,
which is why I wouldn't make a hard rule of no in-kernel conversions,
just a rule of thumb since it's *usually* not useful or is actively
hurting. Here we are talking about XRGB8888 which is already exempt
from the rule of thumb, with two more special conditions: dumb buffers
in sysram, and the performance traits of RGB888 on the specific
hardware.
Thanks,
pq
>
> Best regards
> Thomas
>
> [1]
> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_format_helper.c#L88
>
> >
> >
> > Thanks,
> > pq
> >
> >>
> >> Best regards
> >> Thomas
> >>
> >>>
> >>>
> >>> Thanks,
> >>> pq
> >>>
> >>>>> + * On most hardware, VRAM read access are slow, so when doing the software
> >>>>> + * conversion, the dumb buffer should be allocated in system RAM in order to
> >>>>> + * have decent performance.
> >>>>> + * Extra care should be taken when doing software conversion with
> >>>>> + * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
> >>>>> + * https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
> >>>>> */
> >>>>>
> >>>>> static unsigned int drm_num_planes(struct drm_device *dev)
> >>>>>
> >>>>> base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781
> >>>>
> >>>
> >>
> >
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 15:37 ` Thomas Zimmermann
@ 2023-09-11 10:05 ` Jocelyn Falempe
2023-09-11 10:44 ` Thomas Zimmermann
2023-09-11 11:14 ` Maxime Ripard
1 sibling, 1 reply; 26+ messages in thread
From: Jocelyn Falempe @ 2023-09-11 10:05 UTC (permalink / raw)
To: Thomas Zimmermann, Pekka Paalanen; +Cc: javierm, mripard, dri-devel, airlied
On 08/09/2023 17:37, Thomas Zimmermann wrote:
> Hi
>
> Am 08.09.23 um 16:48 schrieb Jocelyn Falempe:
>> On 08/09/2023 15:56, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> Am 08.09.23 um 13:16 schrieb Pekka Paalanen:
>>>> On Fri, 8 Sep 2023 11:21:51 +0200
>>>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
>>>>> [...]
>>>>>> + *
>>>>>> + * But there are two exceptions only for dumb buffers:
>>>>>> + * * To support XRGB8888 if it's not supported by the hardware.
>>>>>
>>>>>
>>>>>> + * * Any driver is free to modify its internal representation
>>>>>> of the format,
>>>>>> + * as long as it doesn't alter the visible content in any
>>>>>> way, and doesn't
>>>>>> + * modify the user-provided buffer. An example would be to
>>>>>> drop the
>>>>>> + * padding component from a format to save some memory
>>>>>> bandwidth.
>>>>>
>>>>> I have strong objections to this point, _especially_ as you're
>>>>> apparently trying to sneak this in after our discussion. NAK on this
>>>>> part from my side.
>>>>>
>>>>> If you want userspace to be able to use a certain format, then export
>>>>> the corresponding 4cc code. Then let userspace decide what to do about
>>>>> it. If userspace pick a certain format, go with it.
>>>>
>>>> What is the reason for your objection, exactly?
>>>>
>>>>> Hence, no implicit conversion from XRGB888 to RGB888, just because
>>>>> it's
>>>>> possible.
>>>>
>>>> For the particular driver in question though, the conversion allows
>>>> using a display resolution that is otherwise not possible. I also hear
>>>> it improves performance since 25% less data needs to travel across a
>>>> slow bus. There is also so little VRAM, than all dumb buffers need to
>>>> be allocated from sysram instead anyway, so a copy is always necessary.
>>>>
>>>> Since XRGB8888 is the one format that is recommended to be supported by
>>>> all drivers, I don't see a problem here. Did you test on your
>>>> incredibly slow g200 test rig if the conversion ends up hurting instead
>>>> of helping performance there?
>>>>
>>>> If it hurts, then I see that you have a good reason to NAK this.
>>>>
>>>> It's hard to imagine how it would hurt, since you always need a copy
>>>> from sysram dumb buffers to VRAM - or do you?
>>>
>>> I have a number of concerns. My point it not that we shouldn't
>>> optimize. I just don't want it in the kernel. Mgag200 can export
>>> DRM_FORMAT_RGB888 for userspace to use.
>>
>> It already does, it's just userspace doesn't want to support it.
>>>
>>> AFAICT the main argument against userspace is that Mesa doesn't like
>>> 3-byte pixels. But I don't see how this conversion cannot be a
>>> post-processing step within Mesa: do the rendering in RGB32 and then
>>> convert to a framebuffer in RGB24. Userspace can do that more
>>> efficiently than the kernel. This has all of the upsides of reduced
>>> bandwidth, but none of the downsides of kernel code. Applications
>>> and/or Mesa would be in control of the buffer format and apply the
>>> optimization where it makes sense. And it would be available for all
>>> drivers that are similar to mgag200.
>>
>> For this particular case, user-space would use more memory and CPU,
>> because the copy to VRAM is done on kernel side, and it is where the
>> conversion must be done for maximum performances. So there is no way
>> for userspace to be as efficient as the kernel in this use-case.
>>
>> For the conversion, the kernel allocate only 1 line, and convert/copy
>> one line at a time. And because the CPU is out-of-order, it can start
>> converting the next line using CPU registers while waiting for the bus.
>
> Access is writecombined, so you cannot throw large amounts of data
> towards the bus and do something else meanwhile.
>
>>
>> Userspace would need to allocate a whole framebuffer, and can't use
>> the "wasted" CPU cycle to do the conversion.
>
> Yes, userspace would probably need a full extra framebuffer to store the
> RGB32 data. But just as in the kernel, userspace can do format
> conversion of only the damaged areas. No extra CPU overhead here.
>
>>
>>>
>>> My main point is simplicity of the driver: I prefer the driver to be
>>> simple without unnecessary indirection or overhead. Optimizations
>>> like these my or may not work on a given system with a certain
>>> workload. I'd better leave this heuristic to userspace.
>>
>> I agree that the driver is simpler without optimizing thing. But I
>> think it's the role of the driver to get the maximum from the
>> hardware, even if it's old and broken like g200. And userspace don't
>> want to optimize for such hardware.
>
> Optimization always depends on the workload; something that the driver
> doesn't know. For example, as we mostly move the mouse cursor around the
> screen, the damages areas are usually small. Optimizing this might be
> pointless in any case.
So your point is we should not optimize because sometime it might not be
necessary ? And even for cursor update, the conversion is still 25% faster.
>
>>
>>>
>>> Another point of concern is CPU consumption: Slow I/O buses may stall
>>> the display thread, but the CPU could do something else in the
>>> meantime. Doing format conversion on the CPU prevents that, hence
>>> affecting other parts of the system negatively. Of course, that's
>>> more of a gut feeling than hard data.
>>
>> I think it's the reverse. Without dropping the X data, the CPU is
>> actually stalling much longer sending useless bytes to the mgag200's
>> VRAM. And the CPU can't do anything else while doing memcpy_toio().
>
> Hyperthreading.
I still doubt a user-space conversion would do a better job than the kernel.
>
> You are also arguing against XRGB in general, which is a different topic.
yes, the issue is human eyes only sees 3 colors, and it's not a power of
two. So compromise have been made, and that Matrox card, is from the era
of the transition from 16bits to 32bits, and works significantly better
in 24bits. And it's probably the only remaining GPU with this problem.
>
>> Using DMA is the only way to free the CPU during the copy, but it
>> appears to be either broken or significantly slower on most mgag200
>> hardware.
>>
>> I actually have made the work to support DMA, and I'm a bit sad it
>> didn't work on all g200, so this optimization is really a last resort,
>> on a really broken hardware.
>>
>>>
>>> Please note that the kernel's conversion code uses memory allocation
>>> of intermediate buffers. We even recently had a discussion about
>>> allocation overhead during display updates. Userspace can surely do a
>>> better job at keeping such buffers around.
>>>
>>> And finally a note the hardware itself: on low-end hardware like
>>> those Matrox chips, just switch to RGB16. That will be pretty and
>>> fast enough for these chips' server systems. Anyone who cares about
>>> fast and beautiful should buy a real graphics card.
>>
>> There are still sysadmin users that occasionally have to browse the
>> web to find answer, or read their mail in a GUI client. It turns out
>> that rgb16 is pretty ugly for today standard, and buying an external
>> card would be a bit too much, and won't work for remote control.
>
> I'm sure sysadmins have a computer for work with a decent GPU and don't
> need to browse the web on their server systems.
The GUI applications also include graphical installer, that obviously
you can't run on other system.
I do have bug reports, and I already fixed a few regressions in the
mgag200 driver from this reports.
But if you think they shouldn't use this GPU, then why maintaining a
driver in the first place ? Simpledrm is enough if you don't use graphics.
>
> Best regards
> Thomas
>
>>
>>
>> Best regards,
>>
>
Best regards,
--
Jocelyn
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-11 8:38 ` Pekka Paalanen
@ 2023-09-11 10:18 ` Thomas Zimmermann
2023-09-12 15:57 ` Michel Dänzer
1 sibling, 0 replies; 26+ messages in thread
From: Thomas Zimmermann @ 2023-09-11 10:18 UTC (permalink / raw)
To: Pekka Paalanen; +Cc: dri-devel, airlied, Jocelyn Falempe, javierm, mripard
[-- Attachment #1.1: Type: text/plain, Size: 6547 bytes --]
Hi
Am 11.09.23 um 10:38 schrieb Pekka Paalanen:
[...]
>> In the kernel, we reduce the copying to the changed parts, if we have
>> damage information from userspace. IDK Mesa's software renderer, but it
>> could certainly apply a similar optimization.
>
> I have already assumed that everything uses damage information to
> optimize the regions to copy. It's still two copies instead of one.
> Actually, it's slightly more than two copies.
>
> Due to damage tracking and the driver needing to flip between two VRAM
> buffers, it is always copying current + previous damage, not only
> current damage.
All dumb buffers are in system memory. From there, we always copy into
the same region of VRAM. So from the kernel's perspective, it's really
just the latest damage.
>
>>> I suspect the intermediate buffers (dumb buffers in this case) need to
>>> be full frame size rather than a scanline, too. I'm not sure why the
>>> driver would need any scratch buffers beyond a couple dozen bytes while
>>> doing a software conversion, just enough to have the lowest common
>>> multiple of 3 bytes and optimal bus write width.
>>
>> For the conversion in the kernel we allocate enough temporary memory to
>> hold the part of each scanline that changed. Then we go over dirty
>> scanlines, convert each into the output format and store it in that
>> temporary buffer. Then copy the temporary buffer to VRAM. The buffer can
>> be reused for the scanlines of a single conversion. But the nature of
>> the framebuffer (buffers are possibly shared with concurrent access from
>> multiple planes) makes it hard to keep that temporary memory around.
>> Hence it's freed after each conversion. The code is at [1].
>
> Yes, that's the conversion in the kernel. However, before the kernel
> can run that conversion, userspace must have already allocated full
> sized dumb buffers to convert its full sized internal framebuffer into.
> Since KMS is modernly used with double-buffering, there must always be
> two dumb buffers, which means updating a dumb buffer needs to copy not
> just current damage but also previous damage. Userspace has no way to
> know that single-buffering would be equally good in this case for this
> particular driver.
I think that this would be a good optimization. If we let userspace know
that a certain GEM BO is a shadow buffer, the compositor could avoid
that second dumb buffer entirely. We have plenty of DRM drivers that
would benefit from this (grep the DRM code for 'shadow_plane').
It would also allow a format conversion in userspace with little memory
overhead compared to the two dumb buffers of the current code.
>
> If userspace gave its internal framebuffer to the kernel without doing
> the conversion to RGB888, then that internal buffer would be the dumb
> buffer, and there would be no need to allocate a second full size
> buffer (third, because double-buffering towards KMS).
>
> The driver allocating even full scanlines would be a lot less memory
> touched if userspace didn't convert to RGB888, and if the driver used a
> tailored conversion function (it literally needs to handle only a single
> combination of input and output conditions), it wouldn't need even that
> nor to allocate and free on every conversion. I understand you do not
> want to pay the cost having such special-case code, and that's ok. It
> would just be even further optimization after getting rid of a static
> full sized buffer allocation.
>
>
>> I assume that a userspace software renderer could do a much better job
>> at keeping the temporary memory allocated.
>
> I'm not sure why the kernel can't keep the temporary scanline buffer
> allocated with the CRTC. It only needs to be reallocated if it's too
> small. Sure, people probably don't want to spend time on such code.
I wanted to make that happen at some point. But it's a bit of a hassle.
Either we pass a temporary buffer to the kernel's format-conversion code
or we store it in one of the involved data structures, preferable
drm_framebuffer. The former solution complicates the caller and is
prone to memory leaks; the latter is complicated as drm_framebuffer
could be used concurrently.
>
> All the above discussion is based on the assumption that dumb buffers
> are allocated in sysram.
>
> It's fine to say you don't want to optimize, but I want to be clear on
> the exact trade-off.
>
> The trade-offs vary greatly depending on each particular use case,
> which is why I wouldn't make a hard rule of no in-kernel conversions,
> just a rule of thumb since it's *usually* not useful or is actively
> hurting. Here we are talking about XRGB8888 which is already exempt
> from the rule of thumb, with two more special conditions: dumb buffers
> in sysram, and the performance traits of RGB888 on the specific
> hardware.
I don't like this optimization from an architectural POV. But I also
haven't seen that it makes a difference in practice. As I said in
another email, the current XRGB-based code is not too slow to be
unusable in the case of a full-screen update. And as we're mostly moving
mouse cursors around the screen, screen updates are small most of the time.
Best regards
Thomas
>
>
> Thanks,
> pq
>
>>
>> Best regards
>> Thomas
>>
>> [1]
>> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_format_helper.c#L88
>>
>>>
>>>
>>> Thanks,
>>> pq
>>>
>>>>
>>>> Best regards
>>>> Thomas
>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> pq
>>>>>
>>>>>>> + * On most hardware, VRAM read access are slow, so when doing the software
>>>>>>> + * conversion, the dumb buffer should be allocated in system RAM in order to
>>>>>>> + * have decent performance.
>>>>>>> + * Extra care should be taken when doing software conversion with
>>>>>>> + * DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
>>>>>>> + * https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
>>>>>>> */
>>>>>>>
>>>>>>> static unsigned int drm_num_planes(struct drm_device *dev)
>>>>>>>
>>>>>>> base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781
>>>>>>
>>>>>
>>>>
>>>
>>
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-11 10:05 ` Jocelyn Falempe
@ 2023-09-11 10:44 ` Thomas Zimmermann
0 siblings, 0 replies; 26+ messages in thread
From: Thomas Zimmermann @ 2023-09-11 10:44 UTC (permalink / raw)
To: Jocelyn Falempe, Pekka Paalanen; +Cc: javierm, mripard, dri-devel, airlied
[-- Attachment #1.1: Type: text/plain, Size: 4571 bytes --]
Hi
Am 11.09.23 um 12:05 schrieb Jocelyn Falempe:
[...]
>> Optimization always depends on the workload; something that the driver
>> doesn't know. For example, as we mostly move the mouse cursor around
>> the screen, the damages areas are usually small. Optimizing this might
>> be pointless in any case.
>
> So your point is we should not optimize because sometime it might not be
> necessary ? And even for cursor update, the conversion is still 25% faster.
Yes, kind of. Those 25% are meaningless in the big picture. And the
current code is fast enough for what these chips do (server consoles).
So there's no pointing micro-optimizing anything here. See my reply to
Pekka for an optimization that I'd find much more useful.
If I have to choose between fast-enough-but-simple and
faster-but-complicated, the former is much preferable. Anyone who wants
fast and beautiful graphics has a better graphics card anyway.
>
>>
>>>
>>>>
>>>> Another point of concern is CPU consumption: Slow I/O buses may
>>>> stall the display thread, but the CPU could do something else in the
>>>> meantime. Doing format conversion on the CPU prevents that, hence
>>>> affecting other parts of the system negatively. Of course, that's
>>>> more of a gut feeling than hard data.
>>>
>>> I think it's the reverse. Without dropping the X data, the CPU is
>>> actually stalling much longer sending useless bytes to the mgag200's
>>> VRAM. And the CPU can't do anything else while doing memcpy_toio().
>>
>> Hyperthreading.
>
> I still doubt a user-space conversion would do a better job than the
> kernel.
>>
>> You are also arguing against XRGB in general, which is a different topic.
>
> yes, the issue is human eyes only sees 3 colors, and it's not a power of
> two. So compromise have been made, and that Matrox card, is from the era
> of the transition from 16bits to 32bits, and works significantly better
> in 24bits. And it's probably the only remaining GPU with this problem.
No. There's at least udl hardware, which does only support RGB16 and
RGB24. And for simpledrm and ofdrm, we have to take whatever the system
provided us with. That could be RGB24.
>
>>
>>> Using DMA is the only way to free the CPU during the copy, but it
>>> appears to be either broken or significantly slower on most mgag200
>>> hardware.
>>>
>>> I actually have made the work to support DMA, and I'm a bit sad it
>>> didn't work on all g200, so this optimization is really a last
>>> resort, on a really broken hardware.
>>>
>>>>
>>>> Please note that the kernel's conversion code uses memory allocation
>>>> of intermediate buffers. We even recently had a discussion about
>>>> allocation overhead during display updates. Userspace can surely do
>>>> a better job at keeping such buffers around.
>>>>
>>>> And finally a note the hardware itself: on low-end hardware like
>>>> those Matrox chips, just switch to RGB16. That will be pretty and
>>>> fast enough for these chips' server systems. Anyone who cares about
>>>> fast and beautiful should buy a real graphics card.
>>>
>>> There are still sysadmin users that occasionally have to browse the
>>> web to find answer, or read their mail in a GUI client. It turns out
>>> that rgb16 is pretty ugly for today standard, and buying an external
>>> card would be a bit too much, and won't work for remote control.
>>
>> I'm sure sysadmins have a computer for work with a decent GPU and
>> don't need to browse the web on their server systems.
>
> The GUI applications also include graphical installer, that obviously
> you can't run on other system.
> I do have bug reports, and I already fixed a few regressions in the
> mgag200 driver from this reports.
This isn't a driver bug.
> But if you think they shouldn't use this GPU, then why maintaining a
> driver in the first place ? Simpledrm is enough if you don't use graphics.
I've done my share of graphic installations on these chips. Neither
low-color modes nor performance has ever been a problem. And if, it is
still better to spend the time on the renderer, so that all drivers can
benefit.
Best regards
Thomas
>
>>
>> Best regards
>> Thomas
>>
>>>
>>>
>>> Best regards,
>>>
>>
>
> Best regards,
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-08 15:37 ` Thomas Zimmermann
2023-09-11 10:05 ` Jocelyn Falempe
@ 2023-09-11 11:14 ` Maxime Ripard
1 sibling, 0 replies; 26+ messages in thread
From: Maxime Ripard @ 2023-09-11 11:14 UTC (permalink / raw)
To: Thomas Zimmermann
Cc: Jocelyn Falempe, javierm, dri-devel, Pekka Paalanen, airlied
[-- Attachment #1: Type: text/plain, Size: 1274 bytes --]
On Fri, Sep 08, 2023 at 05:37:27PM +0200, Thomas Zimmermann wrote:
> > > Please note that the kernel's conversion code uses memory allocation
> > > of intermediate buffers. We even recently had a discussion about
> > > allocation overhead during display updates. Userspace can surely do
> > > a better job at keeping such buffers around.
> > >
> > > And finally a note the hardware itself: on low-end hardware like
> > > those Matrox chips, just switch to RGB16. That will be pretty and
> > > fast enough for these chips' server systems. Anyone who cares about
> > > fast and beautiful should buy a real graphics card.
> >
> > There are still sysadmin users that occasionally have to browse the web
> > to find answer, or read their mail in a GUI client. It turns out that
> > rgb16 is pretty ugly for today standard, and buying an external card
> > would be a bit too much, and won't work for remote control.
>
> I'm sure sysadmins have a computer for work with a decent GPU and don't need
> to browse the web on their server systems.
If your expectation is that there's no capable display stack running on
those systems, what user-space component would you expect to optimize
the format/transfers like you previously hinted at exactly?
Maxime
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-11 8:38 ` Pekka Paalanen
2023-09-11 10:18 ` Thomas Zimmermann
@ 2023-09-12 15:57 ` Michel Dänzer
2023-09-13 8:14 ` Jocelyn Falempe
1 sibling, 1 reply; 26+ messages in thread
From: Michel Dänzer @ 2023-09-12 15:57 UTC (permalink / raw)
To: Pekka Paalanen, Thomas Zimmermann, Jocelyn Falempe
Cc: mripard, airlied, javierm, dri-devel
On 9/11/23 10:38, Pekka Paalanen wrote:
> On Fri, 8 Sep 2023 17:10:46 +0200
> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>> Am 08.09.23 um 16:41 schrieb Pekka Paalanen:
>>> On Fri, 8 Sep 2023 15:56:51 +0200
>>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>>>
>>>> I have a number of concerns. My point it not that we shouldn't optimize.
>>>> I just don't want it in the kernel. Mgag200 can export DRM_FORMAT_RGB888
>>>> for userspace to use.
>>>>
>>>> AFAICT the main argument against userspace is that Mesa doesn't like
>>>> 3-byte pixels. But I don't see how this conversion cannot be a
>>>> post-processing step within Mesa: do the rendering in RGB32 and then
>>>> convert to a framebuffer in RGB24.
Even assuming the conversion could be handled transparently in Mesa, it would still require the KMS client to pick RGB888 instead of XRGB8888. Most (all?) KMS clients support XRGB8888, many of them will realistically never support RGB888. (Or even if they did, they might prefer XRGB8888 anyway, if RGB888 requires a final conversion step)
>>>> Another point of concern is CPU consumption: Slow I/O buses may stall
>>>> the display thread, but the CPU could do something else in the meantime.
>>>> Doing format conversion on the CPU prevents that, hence affecting other
>>>> parts of the system negatively. Of course, that's more of a gut feeling
>>>> than hard data.
Jocelyn, have you measured if the XRGB8888 -> RGB888 conversion copy takes longer than a straight RGB888 -> RGB888 copy in the kernel?
--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and Xwayland developer
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-12 15:57 ` Michel Dänzer
@ 2023-09-13 8:14 ` Jocelyn Falempe
2023-09-13 17:02 ` Michel Dänzer
0 siblings, 1 reply; 26+ messages in thread
From: Jocelyn Falempe @ 2023-09-13 8:14 UTC (permalink / raw)
To: dri-devel
On 12/09/2023 17:57, Michel Dänzer wrote:
> On 9/11/23 10:38, Pekka Paalanen wrote:
>> On Fri, 8 Sep 2023 17:10:46 +0200
>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>> Am 08.09.23 um 16:41 schrieb Pekka Paalanen:
>>>> On Fri, 8 Sep 2023 15:56:51 +0200
>>>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>>>>
>>>>> I have a number of concerns. My point it not that we shouldn't optimize.
>>>>> I just don't want it in the kernel. Mgag200 can export DRM_FORMAT_RGB888
>>>>> for userspace to use.
>>>>>
>>>>> AFAICT the main argument against userspace is that Mesa doesn't like
>>>>> 3-byte pixels. But I don't see how this conversion cannot be a
>>>>> post-processing step within Mesa: do the rendering in RGB32 and then
>>>>> convert to a framebuffer in RGB24.
>
> Even assuming the conversion could be handled transparently in Mesa, it would still require the KMS client to pick RGB888 instead of XRGB8888. Most (all?) KMS clients support XRGB8888, many of them will realistically never support RGB888. (Or even if they did, they might prefer XRGB8888 anyway, if RGB888 requires a final conversion step)
>
>
>>>>> Another point of concern is CPU consumption: Slow I/O buses may stall
>>>>> the display thread, but the CPU could do something else in the meantime.
>>>>> Doing format conversion on the CPU prevents that, hence affecting other
>>>>> parts of the system negatively. Of course, that's more of a gut feeling
>>>>> than hard data.
>
> Jocelyn, have you measured if the XRGB8888 -> RGB888 conversion copy takes longer than a straight RGB888 -> RGB888 copy in the kernel?
>
>
At least on my Matrox system, the conversion is really negligible
compared to the copy, due to slow memory bandwidth. I wasn't able to see
a difference, using ktime_get_ns().
The CPU is an old Intel(R) Core(TM) i3 CPU 540 @ 3.07GHz
in 1280x1024, the RGB888 copy takes 95ms.
and the XRGB8888->RGB888 takes also 95ms.
(and the full XRGB8888 copy takes 125ms)
Also we say "conversion", but when talking about XRGB8888->RGB888, there
is no math involved, only dropping one bytes every 4.
But really performance is not the main concern here as it is obvious
that it's much faster in RGB888. I tried to summarize the other
arguments below, which I still find not convincing:
* The driver is already fast enough, if you want faster, buy another GPU.
* This adds too much complexity in the driver (about ~20 lines of code)
* It breaks an unwritten rule of not changing the color format inside
the kernel (which I tried to write with this patch, while adding an
exception for legitimate use cases, like this one).
But let's admit that this discussion is going nowhere, and that I failed
to reach a consensus, so I'm now focusing on other things.
Best regards,
--
Jocelyn
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v3] drm/plane: Add documentation about software color conversion.
2023-09-13 8:14 ` Jocelyn Falempe
@ 2023-09-13 17:02 ` Michel Dänzer
0 siblings, 0 replies; 26+ messages in thread
From: Michel Dänzer @ 2023-09-13 17:02 UTC (permalink / raw)
To: Jocelyn Falempe, Thomas Zimmermann; +Cc: dri-devel
On 9/13/23 10:14, Jocelyn Falempe wrote:
> On 12/09/2023 17:57, Michel Dänzer wrote:
>> On 9/11/23 10:38, Pekka Paalanen wrote:
>>> On Fri, 8 Sep 2023 17:10:46 +0200
>>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>>> Am 08.09.23 um 16:41 schrieb Pekka Paalanen:
>>>>> On Fri, 8 Sep 2023 15:56:51 +0200
>>>>> Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>>>>>
>>>>>> Another point of concern is CPU consumption: Slow I/O buses may stall
>>>>>> the display thread, but the CPU could do something else in the meantime.
>>>>>> Doing format conversion on the CPU prevents that, hence affecting other
>>>>>> parts of the system negatively. Of course, that's more of a gut feeling
>>>>>> than hard data.
>>
>> Jocelyn, have you measured if the XRGB8888 -> RGB888 conversion copy takes longer than a straight RGB888 -> RGB888 copy in the kernel?
>
> At least on my Matrox system, the conversion is really negligible compared to the copy, due to slow memory bandwidth. I wasn't able to see a difference, using ktime_get_ns().
>
> The CPU is an old Intel(R) Core(TM) i3 CPU 540 @ 3.07GHz
> in 1280x1024, the RGB888 copy takes 95ms.
> and the XRGB8888->RGB888 takes also 95ms.
> (and the full XRGB8888 copy takes 125ms)
Then any XRGB8888 → RGB888 conversion in user space has to result in worse performance.
> But let's admit that this discussion is going nowhere, and that I failed to reach a consensus, so I'm now focusing on other things.
I see consensus, with one person disagreeing.
--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and Xwayland developer
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2023-09-13 17:02 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-25 14:04 [PATCH v3] drm/plane: Add documentation about software color conversion Jocelyn Falempe
2023-08-25 14:14 ` Maxime Ripard
2023-08-25 14:25 ` Javier Martinez Canillas
2023-08-28 7:35 ` Pekka Paalanen
2023-09-08 9:21 ` Thomas Zimmermann
2023-09-08 10:58 ` Maxime Ripard
2023-09-08 13:22 ` Thomas Zimmermann
2023-09-08 13:27 ` Simon Ser
2023-09-08 13:46 ` Javier Martinez Canillas
2023-09-08 14:06 ` Jocelyn Falempe
2023-09-08 14:13 ` Thomas Zimmermann
2023-09-08 14:09 ` Thomas Zimmermann
2023-09-08 11:16 ` Pekka Paalanen
2023-09-08 13:56 ` Thomas Zimmermann
2023-09-08 14:41 ` Pekka Paalanen
2023-09-08 15:10 ` Thomas Zimmermann
2023-09-11 8:38 ` Pekka Paalanen
2023-09-11 10:18 ` Thomas Zimmermann
2023-09-12 15:57 ` Michel Dänzer
2023-09-13 8:14 ` Jocelyn Falempe
2023-09-13 17:02 ` Michel Dänzer
2023-09-08 14:48 ` Jocelyn Falempe
2023-09-08 15:37 ` Thomas Zimmermann
2023-09-11 10:05 ` Jocelyn Falempe
2023-09-11 10:44 ` Thomas Zimmermann
2023-09-11 11:14 ` Maxime Ripard
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.