* Re: fbdev: update outdated help text for CONFIG_FB_NVIDIA
From: robgithub @ 2026-03-13 20:35 UTC (permalink / raw)
To: Randy Dunlap
Cc: robgithub, Helge Deller, Thomas Zimmermann, linux-fbdev,
Prasanna Kumar T S M, Wei Liu, Michael Kelley, Sukrut Heroorkar,
Mukesh Rathor, Uwe Kleine-König, dri-devel, linux-kernel
In-Reply-To: <034989df-cf86-4136-8522-6c48e5523645@infradead.org>
From 688a061ba0db71fc2d5facd8344db7a4d5b1575a Mon Sep 17 00:00:00 2001
From: robgithub <rob.github@jumpstation.co.uk>
Date: Wed, 11 Mar 2026 22:14:43 +0000
Subject: [PATCH] fbdev: update outdated help text for CONFIG_FB_NVIDIA
The help text for CONFIG_FB_NVIDIA refers to obsolete hardware and
incorrect default behaviour. This patch updates the description to
reflect the current state of the driver and supported devices.
Signed-off-by: robgithub <rob.github@jumpstation.co.uk>
---
drivers/video/fbdev/Kconfig | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/video/fbdev/Kconfig b/drivers/video/fbdev/Kconfig
index ac9ac4287c6a..d8e331427443 100644
--- a/drivers/video/fbdev/Kconfig
+++ b/drivers/video/fbdev/Kconfig
@@ -668,10 +668,10 @@ config FB_NVIDIA
select BITREVERSE
select VGASTATE
help
- This driver supports graphics boards with the nVidia chips, TNT
- and newer. For very old chipsets, such as the RIVA128, then use
- the rivafb.
- Say Y if you have such a graphics board.
+ Supports NVIDIA GPUs from TNT through early GeForce generations
+ (NV4–NV2x: Twintor, Twintor2, Celsius, Kelvin).
+ Later architectures (Rankine and newer) are not reliably supported.
+ If unsure, say N.
To compile this driver as a module, choose M here: the
module will be called nvidiafb.
--
2.52.0
Thanks, this my first time submitting a patch.
I have fixed the invisible whitespace inconsistency, wonder why I missed that :)
I really wanted to stop future users reading the "and newer" and enabling it on an unsupported card. It stopped my kernel booting with no errors.
The references I used to confirm the Nvidia devices affected are
List of devices documented in the code between lines 1228-1277
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/video/fbdev/nvidia/nvidia.c?h=v6.19.6
with more human readable extraction available at
https://cateee.net/lkddb/web-lkddb/FB_NVIDIA.html
and the code names for all Nvidia cards are here
https://nouveau.freedesktop.org/CodeNames.html
Regards
Rob
On Thu, 12 Mar 2026 14:02:18 -0700
Randy Dunlap <rdunlap@infradead.org> wrote:
> Hi,
>
> On 3/12/26 1:55 PM, robgithub wrote:
> > The help text for CONFIG_FB_NVIDIA refers to obsolete hardware and
> > incorrect default behaviour. This patch updates the description to
> > reflect the current state of the driver and supported devices.
> >
> > Signed-off-by: robgithub <rob.github@jumpstation.co.uk>
>
> Inline patches are preferred over attachments.
>
> I thought that Claws mail could send inline patches successfully. (?)
>
> Documentation/process/email-clients.rst says that it works (after a little
> configuration setting).
>
> I don't know anything about which products are supported, so I have no
> comment on that.
>
> In the patch, the indentation is incorrect. Kconfig help text should be
> indented with one tab + 2 spaces, not with 4 spaces.
>
> thanks.
^ permalink raw reply related
* [PATCH 1/3] staging: sm750fb: Rename enum sm750_pnltype values to snake_case
From: Shubham Chakraborty @ 2026-03-14 8:09 UTC (permalink / raw)
To: Sudip Mukherjee, Teddy Wang, Greg Kroah-Hartman, linux-fbdev,
linux-staging, linux-kernel
Cc: Shubham Chakraborty
This patch renames the CamelCase enum values in sm750_pnltype to follow the Linux kernel coding style.
Signed-off-by: Shubham Chakraborty <chakrabortyshubham66@gmail.com>
---
drivers/staging/sm750fb/sm750.c | 6 +++---
drivers/staging/sm750fb/sm750.h | 6 +++---
drivers/staging/sm750fb/sm750_hw.c | 6 +++---
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/staging/sm750fb/sm750.c b/drivers/staging/sm750fb/sm750.c
index dec1f6b88a7d..729c34372a1e 100644
--- a/drivers/staging/sm750fb/sm750.c
+++ b/drivers/staging/sm750fb/sm750.c
@@ -942,11 +942,11 @@ static void sm750fb_setup(struct sm750_dev *sm750_dev, char *src)
} else if (!strncmp(opt, "nocrt", strlen("nocrt"))) {
sm750_dev->nocrt = 1;
} else if (!strncmp(opt, "36bit", strlen("36bit"))) {
- sm750_dev->pnltype = sm750_doubleTFT;
+ sm750_dev->pnltype = SM750_DOUBLE_TFT;
} else if (!strncmp(opt, "18bit", strlen("18bit"))) {
- sm750_dev->pnltype = sm750_dualTFT;
+ sm750_dev->pnltype = SM750_DUAL_TFT;
} else if (!strncmp(opt, "24bit", strlen("24bit"))) {
- sm750_dev->pnltype = sm750_24TFT;
+ sm750_dev->pnltype = SM750_24TFT;
} else if (!strncmp(opt, "nohwc0", strlen("nohwc0"))) {
g_hwcursor &= ~0x1;
} else if (!strncmp(opt, "nohwc1", strlen("nohwc1"))) {
diff --git a/drivers/staging/sm750fb/sm750.h b/drivers/staging/sm750fb/sm750.h
index 67b9bfa23f41..49a79d0a8a2e 100644
--- a/drivers/staging/sm750fb/sm750.h
+++ b/drivers/staging/sm750fb/sm750.h
@@ -13,9 +13,9 @@
#endif
enum sm750_pnltype {
- sm750_24TFT = 0, /* 24bit tft */
- sm750_dualTFT = 2, /* dual 18 bit tft */
- sm750_doubleTFT = 1, /* 36 bit double pixel tft */
+ SM750_24TFT = 0, /* 24bit tft */
+ SM750_DUAL_TFT = 2, /* dual 18 bit tft */
+ SM750_DOUBLE_TFT = 1, /* 36 bit double pixel tft */
};
/* vga channel is not concerned */
diff --git a/drivers/staging/sm750fb/sm750_hw.c b/drivers/staging/sm750fb/sm750_hw.c
index a29faee91c78..0e594734a8b9 100644
--- a/drivers/staging/sm750fb/sm750_hw.c
+++ b/drivers/staging/sm750fb/sm750_hw.c
@@ -134,12 +134,12 @@ int hw_sm750_inithw(struct sm750_dev *sm750_dev, struct pci_dev *pdev)
~(PANEL_DISPLAY_CTRL_DUAL_DISPLAY |
PANEL_DISPLAY_CTRL_DOUBLE_PIXEL);
switch (sm750_dev->pnltype) {
- case sm750_24TFT:
+ case SM750_24TFT:
break;
- case sm750_doubleTFT:
+ case SM750_DOUBLE_TFT:
val |= PANEL_DISPLAY_CTRL_DOUBLE_PIXEL;
break;
- case sm750_dualTFT:
+ case SM750_DUAL_TFT:
val |= PANEL_DISPLAY_CTRL_DUAL_DISPLAY;
break;
}
--
2.53.0
^ permalink raw reply related
* [PATCH 2/3] staging: sm750fb: Rename struct init_status members to snake_case
From: Shubham Chakraborty @ 2026-03-14 8:09 UTC (permalink / raw)
To: Sudip Mukherjee, Teddy Wang, Greg Kroah-Hartman, linux-fbdev,
linux-staging, linux-kernel
Cc: Shubham Chakraborty
In-Reply-To: <20260314080934.135457-1-chakrabortyshubham66@gmail.com>
Rename CamelCase members powerMode, setAllEngOff, and resetMemory to follow the Linux kernel coding style.
Signed-off-by: Shubham Chakraborty <chakrabortyshubham66@gmail.com>
---
drivers/staging/sm750fb/sm750.c | 6 +++---
drivers/staging/sm750fb/sm750.h | 6 +++---
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/staging/sm750fb/sm750.c b/drivers/staging/sm750fb/sm750.c
index 729c34372a1e..c30ffab8a5f3 100644
--- a/drivers/staging/sm750fb/sm750.c
+++ b/drivers/staging/sm750fb/sm750.c
@@ -921,9 +921,9 @@ static void sm750fb_setup(struct sm750_dev *sm750_dev, char *src)
sm750_dev->init_parm.chip_clk = 0;
sm750_dev->init_parm.mem_clk = 0;
sm750_dev->init_parm.master_clk = 0;
- sm750_dev->init_parm.powerMode = 0;
- sm750_dev->init_parm.setAllEngOff = 0;
- sm750_dev->init_parm.resetMemory = 1;
+ sm750_dev->init_parm.power_mode = 0;
+ sm750_dev->init_parm.set_all_eng_off = 0;
+ sm750_dev->init_parm.reset_memory = 1;
/* defaultly turn g_hwcursor on for both view */
g_hwcursor = 3;
diff --git a/drivers/staging/sm750fb/sm750.h b/drivers/staging/sm750fb/sm750.h
index 49a79d0a8a2e..b683a82af349 100644
--- a/drivers/staging/sm750fb/sm750.h
+++ b/drivers/staging/sm750fb/sm750.h
@@ -39,13 +39,13 @@ enum sm750_path {
};
struct init_status {
- ushort powerMode;
+ ushort power_mode;
/* below three clocks are in unit of MHZ*/
ushort chip_clk;
ushort mem_clk;
ushort master_clk;
- ushort setAllEngOff;
- ushort resetMemory;
+ ushort set_all_eng_off;
+ ushort reset_memory;
};
struct lynx_accel {
--
2.53.0
^ permalink raw reply related
* [PATCH 3/3] staging: sm750fb: Rename struct sm750_dev members to snake_case
From: Shubham Chakraborty @ 2026-03-14 8:09 UTC (permalink / raw)
To: Sudip Mukherjee, Teddy Wang, Greg Kroah-Hartman, linux-fbdev,
linux-staging, linux-kernel
Cc: Shubham Chakraborty
In-Reply-To: <20260314080934.135457-1-chakrabortyshubham66@gmail.com>
Rename CamelCase members pvReg and pvMem to follow the Linux kernel coding style.
Signed-off-by: Shubham Chakraborty <chakrabortyshubham66@gmail.com>
---
drivers/staging/sm750fb/sm750.c | 22 +++++++++++-----------
drivers/staging/sm750fb/sm750.h | 4 ++--
drivers/staging/sm750fb/sm750_hw.c | 20 ++++++++++----------
3 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/drivers/staging/sm750fb/sm750.c b/drivers/staging/sm750fb/sm750.c
index c30ffab8a5f3..1c60ba056719 100644
--- a/drivers/staging/sm750fb/sm750.c
+++ b/drivers/staging/sm750fb/sm750.c
@@ -624,27 +624,27 @@ static int sm750fb_set_drv(struct lynxfb_par *par)
output->paths = sm750_pnc;
crtc->channel = sm750_primary;
crtc->o_screen = 0;
- crtc->v_screen = sm750_dev->pvMem;
+ crtc->v_screen = sm750_dev->pv_mem;
pr_info("use simul primary mode\n");
break;
case sm750_simul_sec:
output->paths = sm750_pnc;
crtc->channel = sm750_secondary;
crtc->o_screen = 0;
- crtc->v_screen = sm750_dev->pvMem;
+ crtc->v_screen = sm750_dev->pv_mem;
break;
case sm750_dual_normal:
if (par->index == 0) {
output->paths = sm750_panel;
crtc->channel = sm750_primary;
crtc->o_screen = 0;
- crtc->v_screen = sm750_dev->pvMem;
+ crtc->v_screen = sm750_dev->pv_mem;
} else {
output->paths = sm750_crt;
crtc->channel = sm750_secondary;
/* not consider of padding stuffs for o_screen,need fix */
crtc->o_screen = sm750_dev->vidmem_size >> 1;
- crtc->v_screen = sm750_dev->pvMem + crtc->o_screen;
+ crtc->v_screen = sm750_dev->pv_mem + crtc->o_screen;
}
break;
case sm750_dual_swap:
@@ -652,7 +652,7 @@ static int sm750fb_set_drv(struct lynxfb_par *par)
output->paths = sm750_panel;
crtc->channel = sm750_secondary;
crtc->o_screen = 0;
- crtc->v_screen = sm750_dev->pvMem;
+ crtc->v_screen = sm750_dev->pv_mem;
} else {
output->paths = sm750_crt;
crtc->channel = sm750_primary;
@@ -660,7 +660,7 @@ static int sm750fb_set_drv(struct lynxfb_par *par)
* need fix
*/
crtc->o_screen = sm750_dev->vidmem_size >> 1;
- crtc->v_screen = sm750_dev->pvMem + crtc->o_screen;
+ crtc->v_screen = sm750_dev->pv_mem + crtc->o_screen;
}
break;
default:
@@ -764,14 +764,14 @@ static int lynxfb_set_fbinfo(struct fb_info *info, int index)
* must be set after crtc member initialized
*/
crtc->cursor.offset = crtc->o_screen + crtc->vidmem_size - 1024;
- crtc->cursor.mmio = sm750_dev->pvReg +
+ crtc->cursor.mmio = sm750_dev->pv_reg +
0x800f0 + (int)crtc->channel * 0x140;
pr_info("crtc->cursor.mmio = %p\n", crtc->cursor.mmio);
crtc->cursor.max_h = 64;
crtc->cursor.max_w = 64;
crtc->cursor.size = crtc->cursor.max_h * crtc->cursor.max_w * 2 / 8;
- crtc->cursor.vstart = sm750_dev->pvMem + crtc->cursor.offset;
+ crtc->cursor.vstart = sm750_dev->pv_mem + crtc->cursor.offset;
memset_io(crtc->cursor.vstart, 0, crtc->cursor.size);
if (!g_hwcursor)
@@ -1090,7 +1090,7 @@ static int lynxfb_pci_probe(struct pci_dev *pdev,
sm750_dev->mtrr.vram = arch_phys_wc_add(sm750_dev->vidmem_start,
sm750_dev->vidmem_size);
- memset_io(sm750_dev->pvMem, 0, sm750_dev->vidmem_size);
+ memset_io(sm750_dev->pv_mem, 0, sm750_dev->vidmem_size);
pci_set_drvdata(pdev, sm750_dev);
@@ -1121,8 +1121,8 @@ static void lynxfb_pci_remove(struct pci_dev *pdev)
sm750fb_framebuffer_release(sm750_dev);
arch_phys_wc_del(sm750_dev->mtrr.vram);
- iounmap(sm750_dev->pvReg);
- iounmap(sm750_dev->pvMem);
+ iounmap(sm750_dev->pv_reg);
+ iounmap(sm750_dev->pv_mem);
kfree(g_settings);
}
diff --git a/drivers/staging/sm750fb/sm750.h b/drivers/staging/sm750fb/sm750.h
index b683a82af349..3f6570fc8f08 100644
--- a/drivers/staging/sm750fb/sm750.h
+++ b/drivers/staging/sm750fb/sm750.h
@@ -97,8 +97,8 @@ struct sm750_dev {
unsigned long vidreg_start;
__u32 vidmem_size;
__u32 vidreg_size;
- void __iomem *pvReg;
- unsigned char __iomem *pvMem;
+ void __iomem *pv_reg;
+ unsigned char __iomem *pv_mem;
/* locks*/
spinlock_t slock;
diff --git a/drivers/staging/sm750fb/sm750_hw.c b/drivers/staging/sm750fb/sm750_hw.c
index 0e594734a8b9..32b3813d0d8b 100644
--- a/drivers/staging/sm750fb/sm750_hw.c
+++ b/drivers/staging/sm750fb/sm750_hw.c
@@ -49,19 +49,19 @@ int hw_sm750_map(struct sm750_dev *sm750_dev, struct pci_dev *pdev)
}
/* now map mmio and vidmem */
- sm750_dev->pvReg =
+ sm750_dev->pv_reg =
ioremap(sm750_dev->vidreg_start, sm750_dev->vidreg_size);
- if (!sm750_dev->pvReg) {
+ if (!sm750_dev->pv_reg) {
pr_err("mmio failed\n");
ret = -EFAULT;
goto exit;
}
- pr_info("mmio virtual addr = %p\n", sm750_dev->pvReg);
+ pr_info("mmio virtual addr = %p\n", sm750_dev->pv_reg);
- sm750_dev->accel.dpr_base = sm750_dev->pvReg + DE_BASE_ADDR_TYPE1;
- sm750_dev->accel.dp_port_base = sm750_dev->pvReg + DE_PORT_ADDR_TYPE1;
+ sm750_dev->accel.dpr_base = sm750_dev->pv_reg + DE_BASE_ADDR_TYPE1;
+ sm750_dev->accel.dp_port_base = sm750_dev->pv_reg + DE_PORT_ADDR_TYPE1;
- mmio750 = sm750_dev->pvReg;
+ mmio750 = sm750_dev->pv_reg;
sm750_set_chip_type(sm750_dev->devid, sm750_dev->revid);
sm750_dev->vidmem_start = pci_resource_start(pdev, 0);
@@ -76,15 +76,15 @@ int hw_sm750_map(struct sm750_dev *sm750_dev, struct pci_dev *pdev)
sm750_dev->vidmem_start, sm750_dev->vidmem_size);
/* reserve the vidmem space of smi adaptor */
- sm750_dev->pvMem =
+ sm750_dev->pv_mem =
ioremap_wc(sm750_dev->vidmem_start, sm750_dev->vidmem_size);
- if (!sm750_dev->pvMem) {
- iounmap(sm750_dev->pvReg);
+ if (!sm750_dev->pv_mem) {
+ iounmap(sm750_dev->pv_reg);
pr_err("Map video memory failed\n");
ret = -EFAULT;
goto exit;
}
- pr_info("video memory vaddr = %p\n", sm750_dev->pvMem);
+ pr_info("video memory vaddr = %p\n", sm750_dev->pv_mem);
exit:
return ret;
}
--
2.53.0
^ permalink raw reply related
* [PATCH v1] backlight: apple_bl: Convert to a platform driver
From: Rafael J. Wysocki @ 2026-03-14 11:50 UTC (permalink / raw)
To: Lee Jones
Cc: LKML, Linux ACPI, Daniel Thompson, Jingoo Han, Helge Deller,
dri-devel, linux-fbdev
From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
In all cases in which a struct acpi_driver is used for binding a driver
to an ACPI device object, a corresponding platform device is created by
the ACPI core and that device is regarded as a proper representation of
underlying hardware. Accordingly, a struct platform_driver should be
used by driver code to bind to that device. There are multiple reasons
why drivers should not bind directly to ACPI device objects [1].
Overall, it is better to bind drivers to platform devices than to their
ACPI companions, so convert the Apple Backlight ACPI driver to a
platform one.
While this is not expected to alter functionality, it changes sysfs
layout and so it will be visible to user space.
Link: https://lore.kernel.org/all/2396510.ElGaqSPkdT@rafael.j.wysocki/ [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
drivers/video/backlight/apple_bl.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/drivers/video/backlight/apple_bl.c b/drivers/video/backlight/apple_bl.c
index aaa824437a2a..423513d68b5b 100644
--- a/drivers/video/backlight/apple_bl.c
+++ b/drivers/video/backlight/apple_bl.c
@@ -24,6 +24,7 @@
#include <linux/pci.h>
#include <linux/acpi.h>
#include <linux/atomic.h>
+#include <linux/platform_device.h>
#include <acpi/video.h>
static struct backlight_device *apple_backlight_device;
@@ -134,7 +135,7 @@ static const struct hw_data nvidia_chipset_data = {
.set_brightness = nvidia_chipset_set_brightness,
};
-static int apple_bl_add(struct acpi_device *dev)
+static int apple_bl_probe(struct platform_device *pdev)
{
struct backlight_properties props;
struct pci_dev *host;
@@ -193,7 +194,7 @@ static int apple_bl_add(struct acpi_device *dev)
return 0;
}
-static void apple_bl_remove(struct acpi_device *dev)
+static void apple_bl_remove(struct platform_device *pdev)
{
backlight_device_unregister(apple_backlight_device);
@@ -206,12 +207,12 @@ static const struct acpi_device_id apple_bl_ids[] = {
{"", 0},
};
-static struct acpi_driver apple_bl_driver = {
- .name = "Apple backlight",
- .ids = apple_bl_ids,
- .ops = {
- .add = apple_bl_add,
- .remove = apple_bl_remove,
+static struct platform_driver apple_bl_driver = {
+ .probe = apple_bl_probe,
+ .remove = apple_bl_remove,
+ .driver = {
+ .name = "Apple backlight",
+ .acpi_match_table = apple_bl_ids,
},
};
@@ -224,12 +225,12 @@ static int __init apple_bl_init(void)
if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
return -ENODEV;
- return acpi_bus_register_driver(&apple_bl_driver);
+ return platform_driver_register(&apple_bl_driver);
}
static void __exit apple_bl_exit(void)
{
- acpi_bus_unregister_driver(&apple_bl_driver);
+ platform_driver_unregister(&apple_bl_driver);
}
module_init(apple_bl_init);
--
2.51.0
^ permalink raw reply related
* [PATCH 4/5] backlight: cgbc_bl: fix kernel-doc comment for struct cgbc_bl_data
From: Kit Dallege @ 2026-03-15 14:58 UTC (permalink / raw)
To: Lee Jones, Daniel Thompson, Jingoo Han
Cc: dri-devel, linux-fbdev, linux-kernel, Kit Dallege
Add the required 'struct cgbc_bl_data -' prefix to the kernel-doc
comment so it is properly recognized as struct documentation.
This fixes the following warning:
drivers/video/backlight/cgbc_bl.c:29: This comment starts with '/**', but isn't a kernel-doc comment
Signed-off-by: Kit Dallege <xaum.io@gmail.com>
Signed-off-by: kovan <xaum.io@gmail.com>
---
drivers/video/backlight/cgbc_bl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/video/backlight/cgbc_bl.c b/drivers/video/backlight/cgbc_bl.c
index ab27e14338a8..6b70cd2f0ddd 100644
--- a/drivers/video/backlight/cgbc_bl.c
+++ b/drivers/video/backlight/cgbc_bl.c
@@ -27,7 +27,7 @@
#define CGBC_BL_MAX_BRIGHTNESS 100
/**
- * CGBC backlight driver data
+ * struct cgbc_bl_data - CGBC backlight driver data
* @dev: Pointer to the platform device
* @cgbc: Pointer to the parent CGBC device data
* @current_brightness: Current brightness level (0-100)
--
2.53.0
^ permalink raw reply related
* [PATCH] staging: sm750fb: Replace busy-wait loop with udelay()
From: OaroraEtimis @ 2026-03-15 15:05 UTC (permalink / raw)
To: sudipm.mukherjee, teddy.wang, gregkh
Cc: linux-fbdev, linux-staging, linux-kernel, OaroraEtimis
The empty for-loop in sw_i2c_wait() triggers a -Wunused-but-set-variable
warning and can be optimized away by modern compilers.
Fix this by replacing the unreliable loop with a standard udelay(2).
This also cleans up the unused 'tmp' variable and outdated comments.
Include <linux/delay.h> to resolve the implicit function declaration.
Signed-off-by: OaroraEtimis <OaroraEtimis@gmail.com>
---
drivers/staging/sm750fb/ddk750_swi2c.c | 20 ++------------------
1 file changed, 2 insertions(+), 18 deletions(-)
diff --git a/drivers/staging/sm750fb/ddk750_swi2c.c b/drivers/staging/sm750fb/ddk750_swi2c.c
index 0ef8d4ff2ef9..d0aeb917be92 100644
--- a/drivers/staging/sm750fb/ddk750_swi2c.c
+++ b/drivers/staging/sm750fb/ddk750_swi2c.c
@@ -11,6 +11,7 @@
#include "ddk750_reg.h"
#include "ddk750_swi2c.h"
#include "ddk750_power.h"
+#include <linux/delay.h>
/*
* I2C Software Master Driver:
@@ -80,24 +81,7 @@ static unsigned long sw_i2c_data_gpio_data_dir_reg = GPIO_DATA_DIRECTION;
*/
static void sw_i2c_wait(void)
{
- /* find a bug:
- * peekIO method works well before suspend/resume
- * but after suspend, peekIO(0x3ce,0x61) & 0x10
- * always be non-zero,which makes the while loop
- * never finish.
- * use non-ultimate for loop below is safe
- */
-
- /* Change wait algorithm to use PCI bus clock,
- * it's more reliable than counter loop ..
- * write 0x61 to 0x3ce and read from 0x3cf
- */
- int i, tmp;
-
- for (i = 0; i < 600; i++) {
- tmp = i;
- tmp += i;
- }
+ udelay(2);
}
/*
--
2.47.3
^ permalink raw reply related
* [PATCH v2 1/2] staging: sm750fb: Replace busy-wait loop with udelay()
From: Oarora Etimis @ 2026-03-15 23:20 UTC (permalink / raw)
To: sudipm.mukherjee, teddy.wang, gregkh
Cc: linux-fbdev, linux-staging, linux-kernel, OaroraEtimis
From: OaroraEtimis <OaroraEtimis@gmail.com>
The empty for-loop in sw_i2c_wait() triggers a -Wunused-but-set-variable
warning and can be optimized away by modern compilers.
Fix this by replacing the unreliable loop with a standard udelay(2).
This also cleans up the unused 'tmp' variable and outdated comments.
Include <linux/delay.h> to resolve the implicit function declaration.
Signed-off-by: OaroraEtimis <OaroraEtimis@gmail.com>
---
Changes in v2:
- Rebased onto the latest staging-next branch to resolve merge conflicts.
- No logical code changes.
drivers/staging/sm750fb/ddk750_swi2c.c | 20 ++------------------
1 file changed, 2 insertions(+), 18 deletions(-)
diff --git a/drivers/staging/sm750fb/ddk750_swi2c.c b/drivers/staging/sm750fb/ddk750_swi2c.c
index e63f3b00ec4c..d579cb9da79e 100644
--- a/drivers/staging/sm750fb/ddk750_swi2c.c
+++ b/drivers/staging/sm750fb/ddk750_swi2c.c
@@ -11,6 +11,7 @@
#include "ddk750_reg.h"
#include "ddk750_swi2c.h"
#include "ddk750_power.h"
+#include <linux/delay.h>
/*
* I2C Software Master Driver:
@@ -80,24 +81,7 @@ static unsigned long sw_i2c_data_gpio_data_dir_reg = GPIO_DATA_DIRECTION;
*/
static void sw_i2c_wait(void)
{
- /* find a bug:
- * peekIO method works well before suspend/resume
- * but after suspend, peekIO(0x3ce,0x61) & 0x10
- * always be non-zero,which makes the while loop
- * never finish.
- * use non-ultimate for loop below is safe
- */
-
- /* Change wait algorithm to use PCI bus clock,
- * it's more reliable than counter loop ..
- * write 0x61 to 0x3ce and read from 0x3cf
- */
- int i, tmp;
-
- for (i = 0; i < 600; i++) {
- tmp = i;
- tmp += i;
- }
+ udelay(2);
}
/*
--
2.47.3
^ permalink raw reply related
* Re: [PATCH v12 1/1] rust: interop: Add list module for C linked list interface
From: Alexandre Courbot @ 2026-03-16 4:34 UTC (permalink / raw)
To: Joel Fernandes
Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Alex Gaynor, Danilo Krummrich, Dave Airlie,
David Airlie, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Simona Vetter, Daniel Almeida, Koen Koning, Nikola Djukic,
Philipp Stanner, Elle Rhumsaa, Jonathan Corbet, Alex Deucher,
Christian König, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Huang Rui, Matthew Auld, Matthew Brost,
Lucas De Marchi, Thomas Hellström, Helge Deller,
John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Andrea Righi, Andy Ritger, Zhi Wang, Balbir Singh, alexeyi,
Eliot Courtney, dri-devel, nouveau, rust-for-linux, linux-doc,
amd-gfx, intel-gfx, intel-xe, linux-fbdev, Miguel Ojeda
In-Reply-To: <20260306203648.1136554-2-joelagnelf@nvidia.com>
On Sat Mar 7, 2026 at 5:36 AM JST, Joel Fernandes wrote:
> Add a new module `kernel::interop::list` for working with C's doubly
> circular linked lists. Provide low-level iteration over list nodes.
>
> Typed iteration over actual items is provided with a `clist_create`
> macro to assist in creation of the `CList` type.
>
> Cc: Nikola Djukic <ndjukic@nvidia.com>
> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
> Acked-by: Alexandre Courbot <acourbot@nvidia.com>
> Acked-by: Gary Guo <gary@garyguo.net>
> Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> ---
> MAINTAINERS | 8 +
> rust/helpers/helpers.c | 1 +
> rust/helpers/list.c | 17 ++
> rust/kernel/interop/list.rs | 338 ++++++++++++++++++++++++++++++++++++
> rust/kernel/interop/mod.rs | 9 +
I think I have mentioned that in another context, but the standard seems
to be to use `interop.rs` instead of `interop/mod.rs` (Miguel please
correct me if I'm wrong).
Also, and once again, please, please build with CLIPPY=1 before sending.
The buddy series also has clippy warnings. Fixing them before I can
review is tedious.
^ permalink raw reply
* Re: [PATCH v2 1/2] staging: sm750fb: Replace busy-wait loop with udelay()
From: Greg KH @ 2026-03-16 6:11 UTC (permalink / raw)
To: Oarora Etimis
Cc: sudipm.mukherjee, teddy.wang, linux-fbdev, linux-staging,
linux-kernel
In-Reply-To: <20260315232042.231620-1-OaroraEtimis@gmail.com>
On Mon, Mar 16, 2026 at 07:20:42AM +0800, Oarora Etimis wrote:
> From: OaroraEtimis <OaroraEtimis@gmail.com>
>
> The empty for-loop in sw_i2c_wait() triggers a -Wunused-but-set-variable
> warning and can be optimized away by modern compilers.
>
> Fix this by replacing the unreliable loop with a standard udelay(2).
> This also cleans up the unused 'tmp' variable and outdated comments.
> Include <linux/delay.h> to resolve the implicit function declaration.
>
> Signed-off-by: OaroraEtimis <OaroraEtimis@gmail.com>
> ---
> Changes in v2:
> - Rebased onto the latest staging-next branch to resolve merge conflicts.
> - No logical code changes.
>
> drivers/staging/sm750fb/ddk750_swi2c.c | 20 ++------------------
> 1 file changed, 2 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/staging/sm750fb/ddk750_swi2c.c b/drivers/staging/sm750fb/ddk750_swi2c.c
> index e63f3b00ec4c..d579cb9da79e 100644
> --- a/drivers/staging/sm750fb/ddk750_swi2c.c
> +++ b/drivers/staging/sm750fb/ddk750_swi2c.c
> @@ -11,6 +11,7 @@
> #include "ddk750_reg.h"
> #include "ddk750_swi2c.h"
> #include "ddk750_power.h"
> +#include <linux/delay.h>
>
> /*
> * I2C Software Master Driver:
> @@ -80,24 +81,7 @@ static unsigned long sw_i2c_data_gpio_data_dir_reg = GPIO_DATA_DIRECTION;
> */
> static void sw_i2c_wait(void)
> {
> - /* find a bug:
> - * peekIO method works well before suspend/resume
> - * but after suspend, peekIO(0x3ce,0x61) & 0x10
> - * always be non-zero,which makes the while loop
> - * never finish.
> - * use non-ultimate for loop below is safe
> - */
> -
> - /* Change wait algorithm to use PCI bus clock,
> - * it's more reliable than counter loop ..
> - * write 0x61 to 0x3ce and read from 0x3cf
> - */
> - int i, tmp;
> -
> - for (i = 0; i < 600; i++) {
> - tmp = i;
> - tmp += i;
> - }
> + udelay(2);
How is "2" the same as this busy loop?
And why not fix this properly, as the comment states? You just removed
that information so no one knows how to do this in the future :(
thanks,
greg k-h
^ permalink raw reply
* Re: [PATCH v2 1/2] staging: sm750fb: Replace busy-wait loop with udelay()
From: OaroraEtimis @ 2026-03-16 7:42 UTC (permalink / raw)
To: gregkh
Cc: sudipm.mukherjee, teddy.wang, linux-fbdev, linux-staging,
linux-kernel
In-Reply-To: <2026031626-semisoft-attic-8b37@gregkh>
Hi Greg,
Thanks for the review. Sorry for dropping the historical comment in v2.
My only goal was to fix the -Wunused-but-set-variable warning and
prevent the loop from being optimized away by the compiler.
I will definitely restore the comment.
On Mon, Mar 16, 2026 at 14:11, Greg KH wrote:
> How is "2" the same as this busy loop?
It was a rough estimation. A 600-iteration empty loop on older CPUs
(~500MHz) took about 2 to 3 microseconds.
> And why not fix this properly, as the comment states?
The comment suggests writing to VGA ports (0x3ce/0x3cf) to force a
delay. I didn't implement this because I don't have the specific
hardware or datasheets to test it.
I was afraid that introducing direct VGA I/O just to fix a compiler
warning might cause unexpected hardware regressions *or compatibility
issues across different platforms.*
Given that I can't test hardware I/O, how would you prefer I handle this
in v3?
1. Keep the original loop but add cpu_relax() inside to prevent compiler
optimization. (Safest for the hardware)
2. Use udelay(2) (or ndelay) and restore the historical comment.
3. Migrate the driver to the standard i2c-algo-bit framework (a much
heavier refactoring).
I'd appreciate your guidance on the best path forward for this staging
driver.
Thanks,
Oarora
^ permalink raw reply
* Re: [PATCH v3] fbdev/hga: Request memory region before ioremap
From: Thomas Zimmermann @ 2026-03-16 8:13 UTC (permalink / raw)
To: Helge Deller, Hardik Phalet, Ferenc Bakonyi
Cc: Shuah Khan, Brigham Campbell, linux-nvidia, linux-fbdev,
dri-devel, linux-kernel
In-Reply-To: <e9580f2e-19e2-4e82-b041-afe4cf9fb301@gmx.de>
Hi
Am 13.03.26 um 13:50 schrieb Helge Deller:
> Hi Thomas,
>
> On 3/13/26 09:05, Thomas Zimmermann wrote:
>> Am 12.03.26 um 20:47 schrieb Helge Deller:
>>> On 3/12/26 16:10, Thomas Zimmermann wrote:
>>>> Am 12.03.26 um 16:04 schrieb Hardik Phalet:
>>>>> On Tue Mar 10, 2026 at 6:38 PM IST, Thomas Zimmermann wrote:
>>>>>> Hi,
>>>>>>
>>>>>> thanks for the patch. Let's hope there are no conflicts with other
>>>>>> hardware. IDK if anyone still uses this driver.
>>>>> Hi Thomas,
>>>>>
>>>>> Thanks for reviewing this.
>>>>>
>>>>> Since I currently do not have access to the hardware needed to
>>>>> test the
>>>>> change properly, I will drop this patch for now. I may revisit it
>>>>> once I
>>>>> can validate the behavior on real hardware.
>>>>
>>>> Good luck. That's the Hercules framebuffer driver. Finding such
>>>> ancient hardware that can run modern Linux is nigh impossible.
>>>>
>>>> But we can merge the patch. If it breaks anyone's setup, they will
>>>> send a bug report.
>>>>
>>>> Helge will pick up the fix if he's ok with it.
>>>
>>> No, I don't want to merge such patches any longer without any testing
>>> on real hardware. There is no actual problem (else someone would
>>> have reported),
>>> as such I don't see a benefit to apply it. Applying it just brings
>>> the risk
>>> that we break it for someone.
>>> So, NAK.
>>>
>>> I believe I wrote about my opinion already in another patch?
>>
>> Sorry, I wasn't aware.
>>
>>> I think we should rephrase that specific TODO item (which mentions
>>> the memory
>>> region allocation) that only patches which have been tested are
>>> accepted.
>>
>> There will likely no one show up here for testing unless it breaks
>> there system. Which you won't know until you merge the patch.
>
> No-one likes to merge unnecessary patches which highly potentially
> introduce malfunctioning and haven't been tested at all.
>
>> If only pre-tested patches can go in,
>
> You misunderstand.
> I'm still happy to take *any* patches for fbdev.
> Even untested ones if they
> a) seem necessary (e.g. bugfix), or
> b) seem beneficial (code cleanup)
> as long as they don't break the driver. This patch may break the driver.
Any patch falling under a) or b) may break a driver. The patch at hand
isn't even particularly fragile. People keep posting clean-up patches
for some of the fbdev drivers and I doubt that most of them have seen
any testing.
My point here is that if a patch breaks the driver then someone will
show up and report the regression. If you operate under the (implicit)
assumption that the driver breaks without anyone reporting the
regression, the corollary is that the driver is unused. And so should
be removed. I suspect this is the case for a lot of fbdev drivers. Hence
we should make a push to get drivers removed.
Best regards
Thomas
>
> Helge
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)
^ permalink raw reply
* Re: [PATCH v2 1/2] staging: sm750fb: Replace busy-wait loop with udelay()
From: Greg KH @ 2026-03-16 8:16 UTC (permalink / raw)
To: OaroraEtimis
Cc: sudipm.mukherjee, teddy.wang, linux-fbdev, linux-staging,
linux-kernel
In-Reply-To: <495a2eb6-619e-4ad8-b550-678f7334819e@gmail.com>
On Mon, Mar 16, 2026 at 03:42:04PM +0800, OaroraEtimis wrote:
> Hi Greg,
>
> Thanks for the review. Sorry for dropping the historical comment in v2. My
> only goal was to fix the -Wunused-but-set-variable warning and prevent the
> loop from being optimized away by the compiler.
>
> I will definitely restore the comment.
>
>
> On Mon, Mar 16, 2026 at 14:11, Greg KH wrote:
>
> > How is "2" the same as this busy loop?
>
> It was a rough estimation. A 600-iteration empty loop on older CPUs
> (~500MHz) took about 2 to 3 microseconds.
>
>
> > And why not fix this properly, as the comment states?
>
> The comment suggests writing to VGA ports (0x3ce/0x3cf) to force a delay. I
> didn't implement this because I don't have the specific hardware or
> datasheets to test it.
>
> I was afraid that introducing direct VGA I/O just to fix a compiler warning
> might cause unexpected hardware regressions *or compatibility issues across
> different platforms.*
>
> Given that I can't test hardware I/O, how would you prefer I handle this in
> v3?
Don't do any logical change that testing without hardware can not be
verified (i.e. whitespace changes).
> 1. Keep the original loop but add cpu_relax() inside to prevent compiler
> optimization. (Safest for the hardware)
Are you sure? Can you sleep at that point in time? What does the
performance of the device then look like?
> 2. Use udelay(2) (or ndelay) and restore the historical comment.
> 3. Migrate the driver to the standard i2c-algo-bit framework (a much heavier
> refactoring).
Number 3 is the best way.
good luck!
greg k-h
^ permalink raw reply
* Re: [PATCH] staging: sm750fb: Replace busy-wait loop with udelay()
From: Dan Carpenter @ 2026-03-16 8:55 UTC (permalink / raw)
To: OaroraEtimis, Sudip Mukherjee
Cc: sudipm.mukherjee, teddy.wang, gregkh, linux-fbdev, linux-staging,
linux-kernel
In-Reply-To: <20260315150532.87280-1-OaroraEtimis@gmail.com>
Hi Sudip,
What the status on a replacement for the sm750fb driver?
On Sun, Mar 15, 2026 at 11:05:32PM +0800, OaroraEtimis wrote:
> The empty for-loop in sw_i2c_wait() triggers a -Wunused-but-set-variable
> warning and can be optimized away by modern compilers.
>
> Fix this by replacing the unreliable loop with a standard udelay(2).
> This also cleans up the unused 'tmp' variable and outdated comments.
> Include <linux/delay.h> to resolve the implicit function declaration.
>
> Signed-off-by: OaroraEtimis <OaroraEtimis@gmail.com>
> ---
> drivers/staging/sm750fb/ddk750_swi2c.c | 20 ++------------------
> 1 file changed, 2 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/staging/sm750fb/ddk750_swi2c.c b/drivers/staging/sm750fb/ddk750_swi2c.c
> index 0ef8d4ff2ef9..d0aeb917be92 100644
> --- a/drivers/staging/sm750fb/ddk750_swi2c.c
> +++ b/drivers/staging/sm750fb/ddk750_swi2c.c
> @@ -11,6 +11,7 @@
> #include "ddk750_reg.h"
> #include "ddk750_swi2c.h"
> #include "ddk750_power.h"
> +#include <linux/delay.h>
>
> /*
> * I2C Software Master Driver:
> @@ -80,24 +81,7 @@ static unsigned long sw_i2c_data_gpio_data_dir_reg = GPIO_DATA_DIRECTION;
> */
> static void sw_i2c_wait(void)
> {
> - /* find a bug:
> - * peekIO method works well before suspend/resume
> - * but after suspend, peekIO(0x3ce,0x61) & 0x10
> - * always be non-zero,which makes the while loop
> - * never finish.
> - * use non-ultimate for loop below is safe
> - */
> -
> - /* Change wait algorithm to use PCI bus clock,
> - * it's more reliable than counter loop ..
> - * write 0x61 to 0x3ce and read from 0x3cf
> - */
> - int i, tmp;
> -
> - for (i = 0; i < 600; i++) {
> - tmp = i;
> - tmp += i;
> - }
> + udelay(2);
What an interesting thing... This function has always been nonsense
since it was first merged. The comment talks about the old code which
did:
while (peekIO(0x3ce, 0x61) & 0x10)
;
Which apparently "works well" except that after suspend it becomes
a forever loop so instead of that they have this bonkers "count to 600"
loop.
Your patch looks reasonable, but generally we don't merge that sort
of patch without anyone to test it. The problem with merging your code
is that it looks so reasonable, that people might assume it's correct.
Meanwhile if we just leave it as-is, everyone can see it's totally buggy
so when people report bugs we'll know exactly where to look.
Sudip was working on a replacement for this driver. Let's ask him for
advice.
regards,
dan carpenter
^ permalink raw reply
* Re: [PATCH v12.1 1/1] rust: gpu: Add GPU buddy allocator bindings
From: Alexandre Courbot @ 2026-03-16 13:12 UTC (permalink / raw)
To: Joel Fernandes
Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Danilo Krummrich, Dave Airlie, Daniel Almeida,
Koen Koning, dri-devel, nouveau, rust-for-linux, Nikola Djukic,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jonathan Corbet, Alex Deucher,
Christian König, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Huang Rui, Matthew Auld, Matthew Brost,
Lucas De Marchi, Thomas Hellström, Helge Deller, Alex Gaynor,
Boqun Feng, John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Andrea Righi, Andy Ritger, Zhi Wang, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, alexeyi, Eliot Courtney, joel,
linux-doc, amd-gfx, intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <20260309135338.3919996-2-joelagnelf@nvidia.com>
On Mon Mar 9, 2026 at 10:53 PM JST, Joel Fernandes wrote:
> Add safe Rust abstractions over the Linux kernel's GPU buddy
> allocator for physical memory management. The GPU buddy allocator
> implements a binary buddy system useful for GPU physical memory
> allocation. nova-core will use it for physical memory allocation.
>
> Christian Koenig mentioned he'd like to step down from reviewer role for
> GPU buddy so updated accordingly. Arun/Matthew agree on the modified entry.
>
> Cc: Nikola Djukic <ndjukic@nvidia.com>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
This is getting close IMHO. But `make rustdoc` fails on the examples,
and there are still clippy warnings - please make sure to address them.
A few more comments below.
<snip>
> diff --git a/rust/kernel/gpu/buddy.rs b/rust/kernel/gpu/buddy.rs
> new file mode 100644
> index 000000000000..9027c9a7778f
> --- /dev/null
> +++ b/rust/kernel/gpu/buddy.rs
> @@ -0,0 +1,611 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! GPU buddy allocator bindings.
> +//!
> +//! C header: [`include/linux/gpu_buddy.h`](srctree/include/linux/gpu_buddy.h)
> +//!
> +//! This module provides Rust abstractions over the Linux kernel's GPU buddy
> +//! allocator, which implements a binary buddy memory allocator.
> +//!
> +//! The buddy allocator manages a contiguous address space and allocates blocks
> +//! in power-of-two sizes, useful for GPU physical memory management.
> +//!
> +//! # Examples
> +//!
> +//! Create a buddy allocator and perform a basic range allocation:
> +//!
> +//! ```
> +//! use kernel::{
> +//! gpu::buddy::{GpuBuddy, GpuBuddyAllocMode, GpuBuddyAllocFlags, GpuBuddyParams},
nit: should also use kernel formatting style.
> +//! prelude::*,
> +//! ptr::Alignment,
> +//! sizes::*, //
> +//! };
> +//!
> +//! // Create a 1GB buddy allocator with 4KB minimum chunk size.
> +//! let buddy = GpuBuddy::new(GpuBuddyParams {
> +//! base_offset: 0,
> +//! physical_memory_size: SZ_1G as u64,
> +//! chunk_size: SZ_4K,
`chunk_size` is an interesting case. The C API uses a `u64`, but I think
we can reasonably consider that we won't ever need chunks larger than
4GB (or can we :O). I'm actually ok with using a `usize` for this one.
One of the first things the C code does is throwing an error if it is
not a power of 2, so maybe we can even request an `Alignment`?
I'm a bit torn as to whether we should use a `u64` to conform with the C
API, but doing so would mean we cannot use an `Alignment`...
> +//! })?;
> +//!
> +//! assert_eq!(buddy.size(), SZ_1G as u64);
> +//! assert_eq!(buddy.chunk_size(), SZ_4K);
> +//! let initial_free = buddy.free_memory();
> +//!
> +//! // Allocate 16MB, results in a single 16MB block at offset 0.
> +//! let allocated = KBox::pin_init(
> +//! buddy.alloc_blocks(
> +//! GpuBuddyAllocMode::Range { start: 0, end: 0 },
This zero-sized range looks confusing for the example. I understand the
C API allows this, but I don't think we should. Is there a difference
with just using `GpuBuddyAllocMode::Simple`? If not, let's switch to
that, and reject zero-sized ranges in the same spirit as we don't allow
invalid flag combinations.
> +//! SZ_16M,
> +//! Alignment::new::<SZ_16M>(),
> +//! GpuBuddyAllocFlags::default(),
> +//! ),
> +//! GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_16M as u64);
> +//!
> +//! let block = allocated.iter().next().expect("expected one block");
> +//! assert_eq!(block.offset(), 0);
> +//! assert_eq!(block.order(), 12); // 2^12 pages = 16MB
> +//! assert_eq!(block.size(), SZ_16M);
Here we should also check that there is not a second block.
> +//!
> +//! // Dropping the allocation returns the memory to the buddy allocator.
s/memory/range - technically we are not returning physical memory.
> +//! drop(allocated);
> +//! assert_eq!(buddy.free_memory(), initial_free);
> +//! # Ok::<(), Error>(())
> +//! ```
> +//!
> +//! Top-down allocation allocates from the highest addresses:
> +//!
> +//! ```
> +//! # use kernel::{
> +//! # gpu::buddy::{GpuBuddy, GpuBuddyAllocMode, GpuBuddyAllocFlags, GpuBuddyParams},
> +//! # prelude::*,
> +//! # ptr::Alignment,
> +//! # sizes::*, //
`make rustdoc` fails to build:
error[E0433]: failed to resolve: use of undeclared type `GpuBuddyAllocFlags`
--> rust/doctests_kernel_generated.rs:6182:9
|
6182 | GpuBuddyAllocFlags::default(),
| ^^^^^^^^^^^^^^^^^^ use of undeclared type `GpuBuddyAllocFlags`
|
help: an enum with a similar name exists
|
6182 - GpuBuddyAllocFlags::default(),
6182 + GpuBuddyAllocFlag::default(),
|
help: consider importing this struct
|
3 + use kernel::gpu::buddy::GpuBuddyAllocFlags;
|
error[E0433]: failed to resolve: use of undeclared type `GpuBuddyAllocFlags`
--> rust/doctests_kernel_generated.rs:6195:9
|
6195 | GpuBuddyAllocFlags::default(),
| ^^^^^^^^^^^^^^^^^^ use of undeclared type `GpuBuddyAllocFlags`
|
help: an enum with a similar name exists
|
6195 - GpuBuddyAllocFlags::default(),
6195 + GpuBuddyAllocFlag::default(),
|
help: consider importing this struct
|
3 + use kernel::gpu::buddy::GpuBuddyAllocFlags;
> +//! # };
> +//! # let buddy = GpuBuddy::new(GpuBuddyParams {
> +//! # base_offset: 0,
> +//! # physical_memory_size: SZ_1G as u64,
> +//! # chunk_size: SZ_4K,
> +//! # })?;
> +//! # let initial_free = buddy.free_memory();
> +//! let topdown = KBox::pin_init(
> +//! buddy.alloc_blocks(
> +//! GpuBuddyAllocMode::TopDown,
> +//! SZ_16M,
> +//! Alignment::new::<SZ_16M>(),
> +//! GpuBuddyAllocFlags::default(),
> +//! ),
> +//! GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_16M as u64);
> +//!
> +//! let block = topdown.iter().next().expect("expected one block");
> +//! assert_eq!(block.offset(), (SZ_1G - SZ_16M) as u64);
> +//! assert_eq!(block.order(), 12);
> +//! assert_eq!(block.size(), SZ_16M);
> +//!
> +//! // Dropping the allocation returns the memory to the buddy allocator.
> +//! drop(topdown);
> +//! assert_eq!(buddy.free_memory(), initial_free);
> +//! # Ok::<(), Error>(())
> +//! ```
> +//!
> +//! Non-contiguous allocation can fill fragmented memory by returning multiple
> +//! blocks:
> +//!
> +//! ```
> +//! # use kernel::{
> +//! # gpu::buddy::{
> +//! # GpuBuddy, GpuBuddyAllocFlags, GpuBuddyAllocMode, GpuBuddyParams,
> +//! # },
> +//! # prelude::*,
> +//! # ptr::Alignment,
> +//! # sizes::*, //
> +//! # };
> +//! # let buddy = GpuBuddy::new(GpuBuddyParams {
> +//! # base_offset: 0,
> +//! # physical_memory_size: SZ_1G as u64,
> +//! # chunk_size: SZ_4K,
> +//! # })?;
> +//! # let initial_free = buddy.free_memory();
> +//! // Create fragmentation by allocating 4MB blocks at [0,4M) and [8M,12M).
> +//! let frag1 = KBox::pin_init(
> +//! buddy.alloc_blocks(
> +//! GpuBuddyAllocMode::Range { start: 0, end: SZ_4M as u64 },
> +//! SZ_4M,
> +//! Alignment::new::<SZ_4M>(),
> +//! GpuBuddyAllocFlags::default(),
> +//! ),
> +//! GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_4M as u64);
> +//!
> +//! let frag2 = KBox::pin_init(
> +//! buddy.alloc_blocks(
> +//! GpuBuddyAllocMode::Range {
> +//! start: SZ_8M as u64,
> +//! end: (SZ_8M + SZ_4M) as u64,
> +//! },
> +//! SZ_4M,
> +//! Alignment::new::<SZ_4M>(),
> +//! GpuBuddyAllocFlags::default(),
> +//! ),
> +//! GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_8M as u64);
> +//!
> +//! // Allocate 8MB, this returns 2 blocks from the holes.
> +//! let fragmented = KBox::pin_init(
> +//! buddy.alloc_blocks(
> +//! GpuBuddyAllocMode::Range { start: 0, end: SZ_16M as u64 },
> +//! SZ_8M,
> +//! Alignment::new::<SZ_4M>(),
> +//! GpuBuddyAllocFlags::default(),
> +//! ),
> +//! GFP_KERNEL,
> +//! )?;
> +//! assert_eq!(buddy.free_memory(), initial_free - SZ_16M as u64);
> +//!
> +//! let (mut count, mut total) = (0u32, 0usize);
> +//! for block in fragmented.iter() {
> +//! assert_eq!(block.size(), SZ_4M);
> +//! total += block.size();
> +//! count += 1;
> +//! }
Note that we can avoid mutable variables with this:
//! let total_size: usize = fragmented.iter()
//! .inspect(|block| assert_eq!(block.size(), SZ_4M))
//! .map(|block| block.size())
//! .sum();
//! assert_eq!(total_size, SZ_8M);
//! assert_eq!(fragmented.iter().count(), 2);
But your call as to whether this is an improvement.
> +//! assert_eq!(total, SZ_8M);
> +//! assert_eq!(count, 2);
> +//! # Ok::<(), Error>(())
> +//! ```
> +//!
> +//! Contiguous allocation fails when only fragmented space is available:
> +//!
> +//! ```
> +//! # use kernel::{
> +//! # gpu::buddy::{
> +//! # GpuBuddy, GpuBuddyAllocFlag, GpuBuddyAllocMode, GpuBuddyParams,
> +//! # },
> +//! # prelude::*,
> +//! # ptr::Alignment,
> +//! # sizes::*, //
> +//! # };
> +//! // Create a small 16MB buddy allocator with fragmented memory.
> +//! let small = GpuBuddy::new(GpuBuddyParams {
> +//! base_offset: 0,
> +//! physical_memory_size: SZ_16M as u64,
> +//! chunk_size: SZ_4K,
> +//! })?;
> +//!
> +//! let _hole1 = KBox::pin_init(
> +//! small.alloc_blocks(
> +//! GpuBuddyAllocMode::Range { start: 0, end: SZ_4M as u64 },
> +//! SZ_4M,
> +//! Alignment::new::<SZ_4M>(),
> +//! GpuBuddyAllocFlags::default(),
> +//! ),
> +//! GFP_KERNEL,
> +//! )?;
> +//!
> +//! let _hole2 = KBox::pin_init(
> +//! small.alloc_blocks(
> +//! GpuBuddyAllocMode::Range {
> +//! start: SZ_8M as u64,
> +//! end: (SZ_8M + SZ_4M) as u64,
> +//! },
> +//! SZ_4M,
> +//! Alignment::new::<SZ_4M>(),
> +//! GpuBuddyAllocFlags::default(),
> +//! ),
> +//! GFP_KERNEL,
> +//! )?;
> +//!
> +//! // 8MB contiguous should fail, only two non-contiguous 4MB holes exist.
> +//! let result = KBox::pin_init(
> +//! small.alloc_blocks(
> +//! GpuBuddyAllocMode::Simple,
> +//! SZ_8M,
> +//! Alignment::new::<SZ_4M>(),
> +//! GpuBuddyAllocFlag::Contiguous.into(),
> +//! ),
> +//! GFP_KERNEL,
> +//! );
> +//! assert!(result.is_err());
> +//! # Ok::<(), Error>(())
> +//! ```
I think these last two examples are great both as documentation and
tests - the doc has also become much more readable!
> +
> +use crate::{
> + bindings,
> + clist_create,
> + error::to_result,
> + interop::list::CListHead,
> + new_mutex,
> + prelude::*,
> + ptr::Alignment,
> + sync::{
> + lock::mutex::MutexGuard,
> + Arc,
> + Mutex, //
> + },
> + types::Opaque, //
> +};
> +
> +/// Allocation mode for the GPU buddy allocator.
> +///
> +/// The mode determines the primary allocation strategy. Modes are mutually
> +/// exclusive: an allocation is either simple, range-constrained, or top-down.
> +///
> +/// Orthogonal modifier flags (e.g., contiguous, clear) are specified separately
> +/// via [`GpuBuddyAllocFlags`].
> +#[derive(Copy, Clone, Debug, PartialEq, Eq)]
> +pub enum GpuBuddyAllocMode {
> + /// Simple allocation without constraints.
> + Simple,
> + /// Range-based allocation between `start` and `end` addresses.
> + Range {
> + /// Start of the allocation range.
> + start: u64,
> + /// End of the allocation range.
> + end: u64,
> + },
> + /// Allocate from top of address space downward.
> + TopDown,
> +}
> +
> +impl GpuBuddyAllocMode {
> + // Returns the C flags corresponding to the allocation mode.
> + fn into_flags(self) -> usize {
> + match self {
> + Self::Simple => 0,
> + Self::Range { .. } => bindings::GPU_BUDDY_RANGE_ALLOCATION as usize,
> + Self::TopDown => bindings::GPU_BUDDY_TOPDOWN_ALLOCATION as usize,
> + }
> + }
> +
> + // Extracts the range start/end, defaulting to (0, 0) for non-range modes.
> + fn range(self) -> (u64, u64) {
> + match self {
> + Self::Range { start, end } => (start, end),
> + _ => (0, 0),
> + }
> + }
> +}
> +
> +crate::impl_flags!(
> + /// Modifier flags for GPU buddy allocation.
> + ///
> + /// These flags can be combined with any [`GpuBuddyAllocMode`] to control
> + /// additional allocation behavior.
> + #[derive(Clone, Copy, Default, PartialEq, Eq)]
> + pub struct GpuBuddyAllocFlags(u32);
> +
> + /// Individual modifier flag for GPU buddy allocation.
> + #[derive(Clone, Copy, PartialEq, Eq)]
> + pub enum GpuBuddyAllocFlag {
> + /// Allocate physically contiguous blocks.
> + Contiguous = bindings::GPU_BUDDY_CONTIGUOUS_ALLOCATION as u32,
> +
> + /// Request allocation from cleared (zeroed) memory.
> + Clear = bindings::GPU_BUDDY_CLEAR_ALLOCATION as u32,
> +
> + /// Disable trimming of partially used blocks.
> + TrimDisable = bindings::GPU_BUDDY_TRIM_DISABLE as u32,
> + }
> +);
> +
> +/// Parameters for creating a GPU buddy allocator.
> +pub struct GpuBuddyParams {
> + /// Base offset (in bytes) where the managed memory region starts.
> + /// Allocations will be offset by this value.
> + pub base_offset: u64,
> + /// Total physical memory size (in bytes) managed by the allocator.
> + pub physical_memory_size: u64,
> + /// Minimum allocation unit / chunk size (in bytes), must be >= 4KB.
> + pub chunk_size: usize,
As I mentioned above, let's consider if we can store this as an `Alignment`.
> +}
> +
> +/// Inner structure holding the actual buddy allocator.
> +///
> +/// # Synchronization
> +///
> +/// The C `gpu_buddy` API requires synchronization (see `include/linux/gpu_buddy.h`).
> +/// [`GpuBuddyGuard`] ensures that the lock is held for all
> +/// allocator and free operations, preventing races between concurrent allocations
> +/// and the freeing that occurs when [`AllocatedBlocks`] is dropped.
`GpuBuddyGuard` is now private, so we should avoid mentioning it in the
public documentation as it will just confuse users.
Users won't care about such implementation details - we can just say
that internal locking ensures all operations are properly synchronized.
> +///
> +/// # Invariants
> +///
> +/// The inner [`Opaque`] contains an initialized buddy allocator.
> +#[pin_data(PinnedDrop)]
> +struct GpuBuddyInner {
> + #[pin]
> + inner: Opaque<bindings::gpu_buddy>,
> +
> + // TODO: Replace `Mutex<()>` with `Mutex<Opaque<..>>` once `Mutex::new()`
> + // accepts `impl PinInit<T>`.
> + #[pin]
> + lock: Mutex<()>,
> + /// Cached creation parameters (do not change after init).
> + params: GpuBuddyParams,
> +}
> +
> +impl GpuBuddyInner {
> + /// Create a pin-initializer for the buddy allocator.
> + fn new(params: GpuBuddyParams) -> impl PinInit<Self, Error> {
> + let size = params.physical_memory_size;
> + let chunk_size = params.chunk_size;
> +
> + // INVARIANT: `gpu_buddy_init` returns 0 on success, at which point the
> + // `gpu_buddy` structure is initialized and ready for use with all
> + // `gpu_buddy_*` APIs. `try_pin_init!` only completes if all fields succeed,
> + // so the invariant holds when construction finishes.
> + try_pin_init!(Self {
> + inner <- Opaque::try_ffi_init(|ptr| {
> + // SAFETY: `ptr` points to valid uninitialized memory from the pin-init
> + // infrastructure. `gpu_buddy_init` will initialize the structure.
> + to_result(unsafe { bindings::gpu_buddy_init(ptr, size, chunk_size as u64) })
> + }),
> + lock <- new_mutex!(()),
> + params,
> + })
> + }
> +
> + /// Lock the mutex and return a guard for accessing the allocator.
> + fn lock(&self) -> GpuBuddyGuard<'_> {
> + GpuBuddyGuard {
> + inner: self,
> + _guard: self.lock.lock(),
> + }
> + }
> +}
> +
> +#[pinned_drop]
> +impl PinnedDrop for GpuBuddyInner {
> + fn drop(self: Pin<&mut Self>) {
> + let guard = self.lock();
> +
> + // SAFETY: Per the type invariant, `inner` contains an initialized
> + // allocator. `guard` provides exclusive access.
> + unsafe {
> + bindings::gpu_buddy_fini(guard.as_raw());
> + }
> + }
> +}
> +
> +// SAFETY: GpuBuddyInner can be sent between threads.
> +unsafe impl Send for GpuBuddyInner {}
> +
> +// SAFETY: `GpuBuddyInner` is `Sync` because `GpuBuddyInner::lock`
> +// serializes all access to the C allocator, preventing data races.
> +unsafe impl Sync for GpuBuddyInner {}
> +
> +// Guard that proves the lock is held, enabling access to the allocator.
> +// The `_guard` holds the lock for the duration of this guard's lifetime.
> +struct GpuBuddyGuard<'a> {
> + inner: &'a GpuBuddyInner,
> + _guard: MutexGuard<'a, ()>,
> +}
> +
> +impl GpuBuddyGuard<'_> {
> + /// Get a raw pointer to the underlying C `gpu_buddy` structure.
> + fn as_raw(&self) -> *mut bindings::gpu_buddy {
> + self.inner.inner.get()
> + }
> +}
> +
> +/// GPU buddy allocator instance.
> +///
> +/// This structure wraps the C `gpu_buddy` allocator using reference counting.
> +/// The allocator is automatically cleaned up when all references are dropped.
> +///
> +/// Refer to the module-level documentation for usage examples.
> +pub struct GpuBuddy(Arc<GpuBuddyInner>);
> +
> +impl GpuBuddy {
> + /// Create a new buddy allocator.
> + ///
> + /// Creates a buddy allocator that manages a contiguous address space of the given
> + /// size, with the specified minimum allocation unit (chunk_size must be at least 4KB).
> + pub fn new(params: GpuBuddyParams) -> Result<Self> {
> + Ok(Self(Arc::pin_init(GpuBuddyInner::new(params), GFP_KERNEL)?))
Can be written as:
Arc::pin_init(GpuBuddyInner::new(params), GFP_KERNEL).map(Self)
I prefer this form as it avoids the `?` and re-wrapping into `Ok` for
something that is already a `Result`.
>
> + }
> +
> + /// Get the base offset for allocations.
> + pub fn base_offset(&self) -> u64 {
> + self.0.params.base_offset
> + }
> +
> + /// Get the chunk size (minimum allocation unit).
> + pub fn chunk_size(&self) -> usize {
If my suggestion above works this could return an `Alignment`.
> + self.0.params.chunk_size
> + }
> +
> + /// Get the total managed size.
> + pub fn size(&self) -> u64 {
> + self.0.params.physical_memory_size
> + }
> +
> + /// Get the available (free) memory in bytes.
> + pub fn free_memory(&self) -> u64 {
> + let guard = self.0.lock();
> +
> + // SAFETY: Per the type invariant, `inner` contains an initialized allocator.
> + // `guard` provides exclusive access.
> + unsafe { (*guard.as_raw()).avail }
> + }
> +
> + /// Allocate blocks from the buddy allocator.
> + ///
> + /// Returns a pin-initializer for [`AllocatedBlocks`].
> + ///
> + /// Takes `&self` instead of `&mut self` because the internal [`Mutex`] provides
> + /// synchronization - no external `&mut` exclusivity needed.
This is another implementation detail - the fact this takes `&self` and
is not `unsafe` is already proof that synchronization is taken care of.
> + pub fn alloc_blocks(
> + &self,
> + mode: GpuBuddyAllocMode,
> + size: usize,
For this parameter I am pretty sure we want to conform to the C API and
use a `u64` - there is no benefit in not doing so, and buffers larger
than 4GB *are* a reality nowadays, (maybe not for graphics, but this
will also be used in compute scenarios).
> + min_block_size: Alignment,
> + flags: GpuBuddyAllocFlags,
> + ) -> impl PinInit<AllocatedBlocks, Error> {
> + let buddy_arc = Arc::clone(&self.0);
> + let (start, end) = mode.range();
> + let mode_flags = mode.into_flags();
> + let modifier_flags = u32::from(flags) as usize;
> +
> + // Create pin-initializer that initializes list and allocates blocks.
> + try_pin_init!(AllocatedBlocks {
> + buddy: buddy_arc,
> + list <- CListHead::new(),
> + _: {
> + // Lock while allocating to serialize with concurrent frees.
> + let guard = buddy.lock();
> +
> + // SAFETY: Per the type invariant, `inner` contains an initialized
> + // allocator. `guard` provides exclusive access.
> + to_result(unsafe {
> + bindings::gpu_buddy_alloc_blocks(
> + guard.as_raw(),
> + start,
> + end,
> + size as u64,
> + min_block_size.as_usize() as u64,
> + list.as_raw(),
> + mode_flags | modifier_flags,
> + )
> + })?
> + }
> + })
> + }
> +}
> +
> +/// Allocated blocks from the buddy allocator with automatic cleanup.
> +///
> +/// This structure owns a list of allocated blocks and ensures they are
> +/// automatically freed when dropped. Use `iter()` to iterate over all
> +/// allocated blocks.
> +///
> +/// # Invariants
> +///
> +/// - `list` is an initialized, valid list head containing allocated blocks.
> +#[pin_data(PinnedDrop)]
> +pub struct AllocatedBlocks {
> + #[pin]
> + list: CListHead,
> + buddy: Arc<GpuBuddyInner>,
> +}
> +
> +impl AllocatedBlocks {
> + /// Check if the block list is empty.
> + pub fn is_empty(&self) -> bool {
> + // An empty list head points to itself.
> + !self.list.is_linked()
> + }
> +
> + /// Iterate over allocated blocks.
> + ///
> + /// Returns an iterator yielding [`AllocatedBlock`] values. Each [`AllocatedBlock`]
> + /// borrows `self` and is only valid for the duration of that borrow.
> + pub fn iter(&self) -> impl Iterator<Item = AllocatedBlock<'_>> + '_ {
> + // SAFETY:
> + // - Per the type invariant, `list` is an initialized sentinel `list_head`
> + // and is not concurrently modified (we hold a `&self` borrow).
> + // - The list contains `gpu_buddy_block` items linked via
> + // `__bindgen_anon_1.link`.
> + // - `Block` is `#[repr(transparent)]` over `gpu_buddy_block`.
> + let clist = clist_create!(unsafe {
> + self.list.as_raw(),
> + Block,
> + bindings::gpu_buddy_block,
> + __bindgen_anon_1.link
> + });
> +
> + clist
> + .iter()
> + .map(|this| AllocatedBlock { this, blocks: self })
> + }
> +}
> +
> +#[pinned_drop]
> +impl PinnedDrop for AllocatedBlocks {
> + fn drop(self: Pin<&mut Self>) {
> + let guard = self.buddy.lock();
> +
> + // SAFETY:
> + // - list is valid per the type's invariants.
> + // - guard provides exclusive access to the allocator.
> + unsafe {
> + bindings::gpu_buddy_free_list(guard.as_raw(), self.list.as_raw(), 0);
> + }
> + }
> +}
> +
> +/// A GPU buddy block.
> +///
> +/// Transparent wrapper over C `gpu_buddy_block` structure. This type is returned
> +/// as references during iteration over [`AllocatedBlocks`].
> +///
> +/// # Invariants
> +///
> +/// The inner [`Opaque`] contains a valid, allocated `gpu_buddy_block`.
> +#[repr(transparent)]
> +struct Block(Opaque<bindings::gpu_buddy_block>);
> +
> +impl Block {
> + /// Get a raw pointer to the underlying C block.
> + fn as_raw(&self) -> *mut bindings::gpu_buddy_block {
> + self.0.get()
> + }
> +
> + /// Get the block's raw offset in the buddy address space (without base offset).
> + fn offset(&self) -> u64 {
> + // SAFETY: `self.as_raw()` is valid per the type's invariants.
> + unsafe { bindings::gpu_buddy_block_offset(self.as_raw()) }
> + }
> +
> + /// Get the block order.
> + fn order(&self) -> u32 {
> + // SAFETY: `self.as_raw()` is valid per the type's invariants.
> + unsafe { bindings::gpu_buddy_block_order(self.as_raw()) }
> + }
Speaking of synchronization - I only had a quick look at the C API, but
are we sure these methods can all be called without holding the lock?
> +}
> +
> +// SAFETY: `Block` is a wrapper around `gpu_buddy_block` which can be
> +// sent across threads safely.
> +unsafe impl Send for Block {}
> +
> +// SAFETY: `Block` is only accessed through shared references after
> +// allocation, and thus safe to access concurrently across threads.
> +unsafe impl Sync for Block {}
> +
> +/// A buddy block paired with its owning [`AllocatedBlocks`] context.
> +///
> +/// Unlike a raw block, which only knows its offset within the buddy address
> +/// space, an [`AllocatedBlock`] also has access to the allocator's `base_offset`
> +/// and `chunk_size`, enabling it to compute absolute offsets and byte sizes.
> +///
> +/// Returned by [`AllocatedBlocks::iter()`].
> +pub struct AllocatedBlock<'a> {
> + this: &'a Block,
> + blocks: &'a AllocatedBlocks,
> +}
> +
> +impl AllocatedBlock<'_> {
> + /// Get the block's offset in the address space.
> + ///
> + /// Returns the absolute offset including the allocator's base offset.
> + /// This is the actual address to use for accessing the allocated memory.
> + pub fn offset(&self) -> u64 {
> + self.blocks.buddy.params.base_offset + self.this.offset()
> + }
> +
> + /// Get the block order (size = chunk_size << order).
> + pub fn order(&self) -> u32 {
> + self.this.order()
> + }
> +
> + /// Get the block's size in bytes.
> + pub fn size(&self) -> usize {
> + self.blocks.buddy.params.chunk_size << self.this.order()
> + }
> +}
> diff --git a/rust/kernel/gpu/mod.rs b/rust/kernel/gpu/mod.rs
Let's use `gpu.rs` as the file for this module.
> new file mode 100644
> index 000000000000..8f25e6367edc
> --- /dev/null
> +++ b/rust/kernel/gpu/mod.rs
> @@ -0,0 +1,5 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! GPU subsystem abstractions.
> +
> +pub mod buddy;
IMHO we should have a `#[cfg(CONFIG_GPU_BUDDY = "y")]` here for
defensiveness...
>
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index bb741f1e0dfd..63e3f656eb6c 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -98,6 +98,8 @@
> pub mod firmware;
> pub mod fmt;
> pub mod fs;
> +#[cfg(CONFIG_GPU_BUDDY = "y")]
> +pub mod gpu;
... because in the future I suspect the condition for enabling that
module will become broader. I think it's fine to keep it as-is for now
though.
^ permalink raw reply
* Re: [PATCH v9 01/23] gpu: nova-core: Select GPU_BUDDY for VRAM allocation
From: Alexandre Courbot @ 2026-03-16 13:17 UTC (permalink / raw)
To: Joel Fernandes
Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Danilo Krummrich, Dave Airlie, Daniel Almeida,
Koen Koning, dri-devel, nouveau, rust-for-linux, Nikola Djukic,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jonathan Corbet, Alex Deucher,
Christian König, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Huang Rui, Matthew Auld, Matthew Brost,
Lucas De Marchi, Thomas Hellström, Helge Deller, Alex Gaynor,
Boqun Feng, John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Andrea Righi, Andy Ritger, Zhi Wang, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, alexeyi, Eliot Courtney, joel,
linux-doc, amd-gfx, intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <20260311004008.2208806-2-joelagnelf@nvidia.com>
On Wed Mar 11, 2026 at 9:39 AM JST, Joel Fernandes wrote:
> nova-core will use the GPU buddy allocator for physical VRAM management.
> Enable it in Kconfig.
>
> Cc: Nikola Djukic <ndjukic@nvidia.com>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
As I said in v8, let's squash this change with the first commit that
actually makes use of GPU_BUDDY.
^ permalink raw reply
* Re: [PATCH v9 02/23] gpu: nova-core: Kconfig: Sort select statements alphabetically
From: Alexandre Courbot @ 2026-03-16 13:17 UTC (permalink / raw)
To: Joel Fernandes
Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Danilo Krummrich, Dave Airlie, Daniel Almeida,
Koen Koning, dri-devel, nouveau, rust-for-linux, Nikola Djukic,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jonathan Corbet, Alex Deucher,
Christian König, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Huang Rui, Matthew Auld, Matthew Brost,
Lucas De Marchi, Thomas Hellström, Helge Deller, Alex Gaynor,
Boqun Feng, John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Andrea Righi, Andy Ritger, Zhi Wang, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, alexeyi, Eliot Courtney, joel,
linux-doc, amd-gfx, intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <20260311004008.2208806-3-joelagnelf@nvidia.com>
On Wed Mar 11, 2026 at 9:39 AM JST, Joel Fernandes wrote:
> Reorder the select statements in NOVA_CORE Kconfig to be in
> alphabetical order.
>
> Suggested-by: Danilo Krummrich <dakr@kernel.org>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
This one is already in drm-rust-next.
^ permalink raw reply
* Re: [PATCH v9 04/23] gpu: nova-core: gsp: Extract usable FB region from GSP
From: Alexandre Courbot @ 2026-03-16 13:18 UTC (permalink / raw)
To: Joel Fernandes
Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Danilo Krummrich, Dave Airlie, Daniel Almeida,
Koen Koning, dri-devel, nouveau, rust-for-linux, Nikola Djukic,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jonathan Corbet, Alex Deucher,
Christian König, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Huang Rui, Matthew Auld, Matthew Brost,
Lucas De Marchi, Thomas Hellström, Helge Deller, Alex Gaynor,
Boqun Feng, John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Andrea Righi, Andy Ritger, Zhi Wang, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, alexeyi, Eliot Courtney, joel,
linux-doc, amd-gfx, intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <20260311004008.2208806-5-joelagnelf@nvidia.com>
On Wed Mar 11, 2026 at 9:39 AM JST, Joel Fernandes wrote:
> Add first_usable_fb_region() to GspStaticConfigInfo to extract the first
> usable FB region from GSP's fbRegionInfoParams. Usable regions are those
> that are not reserved or protected.
>
> The extracted region is stored in GetGspStaticInfoReply and exposed via
> usable_fb_region() API for use by the memory subsystem.
>
> Cc: Nikola Djukic <ndjukic@nvidia.com>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
This doesn't take into account the feedback I gave in [1]. In
particular, a TODO to handle the remaining regions looks important to
me.
[1] https://lore.kernel.org/all/DGRGDFASWXB7.3NAK8RRTCV88P@nvidia.com/
^ permalink raw reply
* Re: [PATCH v9 05/23] gpu: nova-core: gsp: Expose total physical VRAM end from FB region info
From: Alexandre Courbot @ 2026-03-16 13:19 UTC (permalink / raw)
To: Joel Fernandes
Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Danilo Krummrich, Dave Airlie, Daniel Almeida,
Koen Koning, dri-devel, nouveau, rust-for-linux, Nikola Djukic,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jonathan Corbet, Alex Deucher,
Christian König, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Huang Rui, Matthew Auld, Matthew Brost,
Lucas De Marchi, Thomas Hellström, Helge Deller, Alex Gaynor,
Boqun Feng, John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Andrea Righi, Andy Ritger, Zhi Wang, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, alexeyi, Eliot Courtney, joel,
linux-doc, amd-gfx, intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <20260311004008.2208806-6-joelagnelf@nvidia.com>
On Wed Mar 11, 2026 at 9:39 AM JST, Joel Fernandes wrote:
> Add `total_fb_end()` to `GspStaticConfigInfo` that computes the exclusive end
> address of the highest valid FB region covering both usable and GSP-reserved
> areas.
>
> This allows callers to know the full physical VRAM extent, not just the
> allocatable portion.
>
> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> ---
> drivers/gpu/nova-core/gsp/commands.rs | 6 ++++++
> drivers/gpu/nova-core/gsp/fw/commands.rs | 19 +++++++++++++++++++
> 2 files changed, 25 insertions(+)
>
> diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
> index 8d5780d9cace..389d215098c6 100644
> --- a/drivers/gpu/nova-core/gsp/commands.rs
> +++ b/drivers/gpu/nova-core/gsp/commands.rs
> @@ -193,6 +193,9 @@ pub(crate) struct GetGspStaticInfoReply {
> /// Usable FB (VRAM) region for driver memory allocation.
> #[expect(dead_code)]
> pub(crate) usable_fb_region: Range<u64>,
> + /// End of VRAM.
> + #[expect(dead_code)]
> + pub(crate) total_fb_end: u64,
> }
>
> impl MessageFromGsp for GetGspStaticInfoReply {
> @@ -206,9 +209,12 @@ fn read(
> ) -> Result<Self, Self::InitError> {
> let (base, size) = msg.first_usable_fb_region().ok_or(ENODEV)?;
>
> + let total_fb_end = msg.total_fb_end().ok_or(ENODEV)?;
> +
> Ok(GetGspStaticInfoReply {
> gpu_name: msg.gpu_name_str(),
> usable_fb_region: base..base.saturating_add(size),
> + total_fb_end,
> })
> }
> }
> diff --git a/drivers/gpu/nova-core/gsp/fw/commands.rs b/drivers/gpu/nova-core/gsp/fw/commands.rs
> index cef86cab8a12..acaf92cd6735 100644
> --- a/drivers/gpu/nova-core/gsp/fw/commands.rs
> +++ b/drivers/gpu/nova-core/gsp/fw/commands.rs
> @@ -147,6 +147,25 @@ pub(crate) fn first_usable_fb_region(&self) -> Option<(u64, u64)> {
> }
> None
> }
> +
> + /// Compute the end of physical VRAM from all FB regions.
> + pub(crate) fn total_fb_end(&self) -> Option<u64> {
> + let fb_info = &self.0.fbRegionInfoParams;
> + let mut max_end: Option<u64> = None;
> + for i in 0..fb_info.numFBRegions.into_safe_cast() {
> + if let Some(reg) = fb_info.fbRegion.get(i) {
> + if reg.limit < reg.base {
> + continue;
> + }
This is basically a repeat of the code of the previous patch. Let's
implement an iterator over the FB memory regions (that filters out
invalid regions) that we can leverage in both places so we don't need to
repeat ourselves.
^ permalink raw reply
* Re: [PATCH 4/5] backlight: cgbc_bl: fix kernel-doc comment for struct cgbc_bl_data
From: Lee Jones @ 2026-03-16 14:12 UTC (permalink / raw)
To: Kit Dallege
Cc: Daniel Thompson, Jingoo Han, dri-devel, linux-fbdev, linux-kernel
In-Reply-To: <20260315145802.24224-1-xaum.io@gmail.com>
On Sun, 15 Mar 2026, Kit Dallege wrote:
> Add the required 'struct cgbc_bl_data -' prefix to the kernel-doc
> comment so it is properly recognized as struct documentation.
>
> This fixes the following warning:
>
> drivers/video/backlight/cgbc_bl.c:29: This comment starts with '/**', but isn't a kernel-doc comment
>
> Signed-off-by: Kit Dallege <xaum.io@gmail.com>
> Signed-off-by: kovan <xaum.io@gmail.com>
Real names please.
--
Lee Jones [李琼斯]
^ permalink raw reply
* Re: [PATCH v12.1 1/1] rust: gpu: Add GPU buddy allocator bindings
From: Joel Fernandes @ 2026-03-16 17:29 UTC (permalink / raw)
To: Danilo Krummrich
Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Dave Airlie, Daniel Almeida, Koen Koning, dri-devel,
nouveau, rust-for-linux, Nikola Djukic, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Jonathan Corbet, Alex Deucher, Christian König, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Huang Rui,
Matthew Auld, Matthew Brost, Lucas De Marchi,
Thomas Hellström, Helge Deller, Alex Gaynor, Boqun Feng,
John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Alexandre Courbot, Andrea Righi, Andy Ritger, Zhi Wang,
Balbir Singh, Philipp Stanner, Elle Rhumsaa, alexeyi,
Eliot Courtney, joel, linux-doc, amd-gfx, intel-gfx, intel-xe,
linux-fbdev
In-Reply-To: <DH0ZOYNAP1CN.1NL9E28UC2C95@kernel.org>
On Thu, 12 Mar 2026, Danilo Krummrich wrote:
> -R: Christian Koenig <christian.koenig@amd.com>
>
> This should really be a separate patch as it is unrelated to the addition of the
> Rust GPU buddy code.
>
> +R: Joel Fernandes <joelagnelf@nvidia.com>
Agreed. I'll split the reviewer change into a separate MAINTAINERS patch.
The F: path additions will stay in the buddy bindings patch since those directly
track the new files being introduced.
Thanks for the review!
--
Joel Fernandes
^ permalink raw reply
* Re: [PATCH v12.1 1/1] rust: gpu: Add GPU buddy allocator bindings
From: Joel Fernandes @ 2026-03-16 18:43 UTC (permalink / raw)
To: Alexandre Courbot
Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Danilo Krummrich, Dave Airlie, Daniel Almeida,
Koen Koning, dri-devel, nouveau, rust-for-linux, Nikola Djukic,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jonathan Corbet, Alex Deucher,
Christian König, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Huang Rui, Matthew Auld, Matthew Brost,
Lucas De Marchi, Thomas Hellström, Helge Deller, Alex Gaynor,
Boqun Feng, John Hubbard, Alistair Popple, Timur Tabi, Edwin Peer,
Andrea Righi, Andy Ritger, Zhi Wang, Balbir Singh,
Philipp Stanner, Elle Rhumsaa, alexeyi, Eliot Courtney, joel,
linux-doc, amd-gfx, intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <DH48DNAQCE0Z.2EX23VD27CQVX@nvidia.com>
Hi Alex,
On Mon Mar 16, 2026 at 10:12 PM JST, Alexandre Courbot wrote:
> This is getting close IMHO. But `make rustdoc` fails on the examples,
> and there are still clippy warnings - please make sure to address them.
Fixed.
>> +//! use kernel::{
>> +//! gpu::buddy::{GpuBuddy, GpuBuddyAllocMode, GpuBuddyAllocFlags, GpuBuddyParams},
>
> nit: should also use kernel formatting style.
Fixed.
>> +//! ptr::Alignment,
>> +//! sizes::*, //
>> +//! };
>> +//!
>> +//! // Create a 1GB buddy allocator with 4KB minimum chunk size.
>> +//! let buddy = GpuBuddy::new(GpuBuddyParams {
>> +//! base_offset: 0,
>> +//! physical_memory_size: SZ_1G as u64,
>> +//! chunk_size: SZ_4K,
>
> `chunk_size` is an interesting case. The C API uses a `u64`, but I think
> we can reasonably consider that we won't ever need chunks larger than
> 4GB (or can we :O). I'm actually ok with using a `usize` for this one.
>
> One of the first things the C code does is throwing an error if it is
> not a power of 2, so maybe we can even request an `Alignment`?
>
> I'm a bit torn as to whether we should use a `u64` to conform with the C
> API, but doing so would mean we cannot use an `Alignment`...
I prefer to keep it simple and use `usize` for now. I cannot imagine
chunk_size ever exceeding 4GB, and given our stance on rejecting invalid
inputs, this sounds reasonable. Regarding `Alignment`, I still prefer
`usize` here since it makes the caller-side simpler and as you noted the
C code already does error-checking. Let's revisit if needed once this
lands.
>> +//! GpuBuddyAllocMode::Range { start: 0, end: 0 },
>
> This zero-sized range looks confusing for the example. I understand the
> C API allows this, but I don't think we should. Is there a difference
> with just using `GpuBuddyAllocMode::Simple`? If not, let's switch to
> that, and reject zero-sized ranges in the same spirit as we don't allow
> invalid flag combinations.
Good point. Switched to use `GpuBuddyAllocMode::Simple` and
added validation.
>> +//! assert_eq!(block.offset(), 0);
>> +//! assert_eq!(block.order(), 12); // 2^12 pages = 16MB
>> +//! assert_eq!(block.size(), SZ_16M);
>
> Here we should also check that there is not a second block.
Added.
>> +//! // Dropping the allocation returns the memory to the buddy allocator.
>
> s/memory/range - technically we are not returning physical memory.
Fixed.
>> +//! let (mut count, mut total) = (0u32, 0usize);
>> +//! for block in fragmented.iter() {
>> +//! assert_eq!(block.size(), SZ_4M);
>> +//! total += block.size();
>> +//! count += 1;
>> +//! }
>
> Note that we can avoid mutable variables with this:
>
> //! let total_size: usize = fragmented.iter()
> //! .inspect(|block| assert_eq!(block.size(), SZ_4M))
> //! .map(|block| block.size())
> //! .sum();
> //! assert_eq!(total_size, SZ_8M);
> //! assert_eq!(fragmented.iter().count(), 2);
>
> But your call as to whether this is an improvement.
I feel the current for-loop version is slightly more readable,
especially in a doc example aimed at new users, so I'd like to keep
it as-is.
>> +//! # };
>> +//! # let buddy = GpuBuddy::new(GpuBuddyParams {
>> +//! # base_offset: 0,
>> +//! # physical_memory_size: SZ_1G as u64,
>> +//! # chunk_size: SZ_4K,
>> +//! # })?;
>> +//! # let initial_free = buddy.free_memory();
>
> `make rustdoc` fails to build:
>
> error[E0433]: failed to resolve: use of undeclared type `GpuBuddyAllocFlags`
Fixed. I'll try to run this before submissions henceforth.
>> +/// # Synchronization
>> +///
>> +/// The C `gpu_buddy` API requires synchronization (see `include/linux/gpu_buddy.h`).
>> +/// [`GpuBuddyGuard`] ensures that the lock is held for all
>> +/// allocator and free operations, preventing races between concurrent allocations
>> +/// and the freeing that occurs when [`AllocatedBlocks`] is dropped.
>
> `GpuBuddyGuard` is now private, so we should avoid mentioning it in the
> public documentation as it will just confuse users.
>
> Users won't care about such implementation details - we can just say
> that internal locking ensures all operations are properly synchronized.
Done.
>> + pub fn new(params: GpuBuddyParams) -> Result<Self> {
>> + Ok(Self(Arc::pin_init(GpuBuddyInner::new(params), GFP_KERNEL)?))
>
> Can be written as:
>
> Arc::pin_init(GpuBuddyInner::new(params), GFP_KERNEL).map(Self)
>
> I prefer this form as it avoids the `?` and re-wrapping into `Ok` for
> something that is already a `Result`.
Done.
>> + /// Allocate blocks from the buddy allocator.
>> + ///
>> + /// Returns a pin-initializer for [`AllocatedBlocks`].
>> + ///
>> + /// Takes `&self` instead of `&mut self` because the internal [`Mutex`] provides
>> + /// synchronization - no external `&mut` exclusivity needed.
>
> This is another implementation detail - the fact this takes `&self` and
> is not `unsafe` is already proof that synchronization is taken care of.
Removed the comment.
>> + pub fn alloc_blocks(
>> + &self,
>> + mode: GpuBuddyAllocMode,
>> + size: usize,
>
> For this parameter I am pretty sure we want to conform to the C API and
> use a `u64` - there is no benefit in not doing so, and buffers larger
> than 4GB *are* a reality nowadays, (maybe not for graphics, but this
> will also be used in compute scenarios).
Agreed. Though, note this adds 7 more `as` usages, but I guess there's
nothing we can do till the IntoSafe stuff is moved to core rust, I think.
>> + fn offset(&self) -> u64 {
>> + // SAFETY: `self.as_raw()` is valid per the type's invariants.
>> + unsafe { bindings::gpu_buddy_block_offset(self.as_raw()) }
>> + }
>> +
>> + /// Get the block order.
>> + fn order(&self) -> u32 {
>> + // SAFETY: `self.as_raw()` is valid per the type's invariants.
>> + unsafe { bindings::gpu_buddy_block_order(self.as_raw()) }
>> + }
>
> Speaking of synchronization - I only had a quick look at the C API, but
> are we sure these methods can all be called without holding the lock?
Yes, sure. Locking is only required around alloc/free operations. Additionally,
`AllocatedBlock` borrows a reference to `AllocatedBlocks`, so the borrow checker
prevents `AllocatedBlocks` from being dropped.
>> diff --git a/rust/kernel/gpu/mod.rs b/rust/kernel/gpu/mod.rs
>
> Let's use `gpu.rs` as the file for this module.
Done, renamed and also updated the MAINTAINERS entries.
>> +pub mod buddy;
>
> IMHO we should have a `#[cfg(CONFIG_GPU_BUDDY = "y")]` here for
> defensiveness...
Added.
>> +#[cfg(CONFIG_GPU_BUDDY = "y")]
>> +pub mod gpu;
>
> ... because in the future I suspect the condition for enabling that
> module will become broader. I think it's fine to keep it as-is for now
> though.
Noted, agreed, keeping it as-is for now.
Thanks for the thorough review! Will respin and send likely tomorrow or day after.
thanks,
--
Joel Fernandes
^ permalink raw reply
* Re: [PATCH v12.1 1/1] rust: gpu: Add GPU buddy allocator bindings
From: John Hubbard @ 2026-03-16 18:51 UTC (permalink / raw)
To: Alexandre Courbot, Joel Fernandes
Cc: linux-kernel, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Danilo Krummrich, Dave Airlie, Daniel Almeida,
Koen Koning, dri-devel, nouveau, rust-for-linux, Nikola Djukic,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jonathan Corbet, Alex Deucher,
Christian König, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Huang Rui, Matthew Auld, Matthew Brost,
Lucas De Marchi, Thomas Hellström, Helge Deller, Alex Gaynor,
Boqun Feng, Alistair Popple, Timur Tabi, Edwin Peer, Andrea Righi,
Andy Ritger, Zhi Wang, Balbir Singh, Philipp Stanner,
Elle Rhumsaa, alexeyi, Eliot Courtney, joel, linux-doc, amd-gfx,
intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <DH48DNAQCE0Z.2EX23VD27CQVX@nvidia.com>
On 3/16/26 6:12 AM, Alexandre Courbot wrote:
> On Mon Mar 9, 2026 at 10:53 PM JST, Joel Fernandes wrote:
...
>> +//! // Create a 1GB buddy allocator with 4KB minimum chunk size.
>> +//! let buddy = GpuBuddy::new(GpuBuddyParams {
>> +//! base_offset: 0,
>> +//! physical_memory_size: SZ_1G as u64,
>> +//! chunk_size: SZ_4K,
>
> `chunk_size` is an interesting case. The C API uses a `u64`, but I think
> we can reasonably consider that we won't ever need chunks larger than
> 4GB (or can we :O). I'm actually ok with using a `usize` for this one.
>
> One of the first things the C code does is throwing an error if it is
> not a power of 2, so maybe we can even request an `Alignment`?
>
> I'm a bit torn as to whether we should use a `u64` to conform with the C
> API, but doing so would mean we cannot use an `Alignment`...
Alex, have you seen my Alignment patch [1], for that? It's sitting
around with only Miguel having responded, but seems like exactly
what you're talking about here.
[1] https://lore.kernel.org/20260312031507.216709-3-jhubbard@nvidia.com
thanks,
--
John Hubbard
^ permalink raw reply
* Re: [PATCH v12.1 1/1] rust: gpu: Add GPU buddy allocator bindings
From: Joel Fernandes @ 2026-03-16 21:00 UTC (permalink / raw)
To: John Hubbard
Cc: Alexandre Courbot, linux-kernel, Miguel Ojeda, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Dave Airlie,
Daniel Almeida, Koen Koning, dri-devel, nouveau, rust-for-linux,
Nikola Djukic, Maarten Lankhorst, Maxime Ripard, Simona Vetter,
Jonathan Corbet, Alex Deucher, Christian König, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Huang Rui,
Matthew Auld, Matthew Brost, Lucas De Marchi,
Thomas Hellström, Helge Deller, Alex Gaynor, Boqun Feng,
Alistair Popple, Andrea Righi, Zhi Wang, Philipp Stanner,
Elle Rhumsaa, alexeyi, Eliot Courtney, linux-doc, amd-gfx,
intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <efc10902-2ee9-4cb3-a4cc-442998eef01a@nvidia.com>
On Mon, 16 Mar 2026, John Hubbard wrote:
> Alex, have you seen my Alignment patch [1], for that? It's sitting
> around with only Miguel having responded, but seems like exactly
> what you're talking about here.
>
> [1] https://lore.kernel.org/20260312031507.216709-3-jhubbard@nvidia.com
`Alignment` is already in core Rust for Linux
(`rust/kernel/ptr.rs`) and I'm already using it in my nova-core v9
patches.
--
Joel Fernandes
^ permalink raw reply
* Re: [PATCH v12.1 1/1] rust: gpu: Add GPU buddy allocator bindings
From: John Hubbard @ 2026-03-16 22:08 UTC (permalink / raw)
To: Joel Fernandes
Cc: Alexandre Courbot, linux-kernel, Miguel Ojeda, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Dave Airlie,
Daniel Almeida, Koen Koning, dri-devel, nouveau, rust-for-linux,
Nikola Djukic, Maarten Lankhorst, Maxime Ripard, Simona Vetter,
Jonathan Corbet, Alex Deucher, Christian König, Jani Nikula,
Joonas Lahtinen, Rodrigo Vivi, Tvrtko Ursulin, Huang Rui,
Matthew Auld, Matthew Brost, Lucas De Marchi,
Thomas Hellström, Helge Deller, Alex Gaynor, Boqun Feng,
Alistair Popple, Andrea Righi, Zhi Wang, Philipp Stanner,
Elle Rhumsaa, alexeyi, Eliot Courtney, linux-doc, amd-gfx,
intel-gfx, intel-xe, linux-fbdev
In-Reply-To: <1773694747.5c82fbe5a9875889@nvidia.com>
On 3/16/26 2:00 PM, Joel Fernandes wrote:
> On Mon, 16 Mar 2026, John Hubbard wrote:
>
>> Alex, have you seen my Alignment patch [1], for that? It's sitting
>> around with only Miguel having responded, but seems like exactly
>> what you're talking about here.
>>
>> [1] https://lore.kernel.org/20260312031507.216709-3-jhubbard@nvidia.com
>
> `Alignment` is already in core Rust for Linux
> (`rust/kernel/ptr.rs`) and I'm already using it in my nova-core v9
> patches.
>
Right, but the patch I linked doesn't introduce Alignment. It adds a
new from_u64() constructor to the existing Alignment type. Today the
only constructors take usize, so there's no way to go from a u64
DeviceSize constant to an Alignment without a manual cast.
Alex wanted to use Alignment for chunk_size but said "doing so would
mean we cannot use an Alignment" if the field conforms to the
C API's u64.
So again, I think that patch is worth looking at.
thanks,
--
John Hubbard
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox