linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
@ 2025-06-28  5:25 Abdelrahman Fekry
  2025-06-28 19:52 ` Andy Shevchenko
  2025-07-06 16:22 ` Hans de Goede
  0 siblings, 2 replies; 14+ messages in thread
From: Abdelrahman Fekry @ 2025-06-28  5:25 UTC (permalink / raw)
  To: andy, hdegoede, mchehab, sakari.ailus, gregkh
  Cc: linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter, Abdelrahman Fekry

The HMM_BO_DEVICE_INITED flag was being set in hmm_bo_device_init()
before key initialization steps like kmem_cache_create(),
kmem_cache_alloc(), and __bo_init().

This means that if any of these steps fail, the flag remains set,
misleading other parts of the driver (e.g. hmm_bo_alloc())
into thinking the device is initialized. This could lead
to undefined behavior or invalid memory use.

Additionally, since __bo_init() is called from inside
hmm_bo_device_init() after the flag was already set, its internal
check for HMM_BO_DEVICE_INITED is redundant.

- Move the flag assignment to the end after all allocations succeed.
- Remove redundant check of the flag inside __bo_init()

Signed-off-by: Abdelrahman Fekry <abdelrahmanfekry375@gmail.com>
---
 drivers/staging/media/atomisp/pci/hmm/hmm_bo.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c b/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c
index 224ca8d42721..5d0cd5260d3a 100644
--- a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c
+++ b/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c
@@ -37,8 +37,6 @@ static int __bo_init(struct hmm_bo_device *bdev, struct hmm_buffer_object *bo,
 		     unsigned int pgnr)
 {
 	check_bodev_null_return(bdev, -EINVAL);
-	var_equal_return(hmm_bo_device_inited(bdev), 0, -EINVAL,
-			 "hmm_bo_device not inited yet.\n");
 	/* prevent zero size buffer object */
 	if (pgnr == 0) {
 		dev_err(atomisp_dev, "0 size buffer is not allowed.\n");
@@ -341,7 +339,6 @@ int hmm_bo_device_init(struct hmm_bo_device *bdev,
 	spin_lock_init(&bdev->list_lock);
 	mutex_init(&bdev->rbtree_mutex);

-	bdev->flag = HMM_BO_DEVICE_INITED;

 	INIT_LIST_HEAD(&bdev->entire_bo_list);
 	bdev->allocated_rbtree = RB_ROOT;
@@ -376,6 +373,8 @@ int hmm_bo_device_init(struct hmm_bo_device *bdev,

 	__bo_insert_to_free_rbtree(&bdev->free_rbtree, bo);

+	bdev->flag = HMM_BO_DEVICE_INITED;
+
 	return 0;
 }

--
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-06-28  5:25 [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag Abdelrahman Fekry
@ 2025-06-28 19:52 ` Andy Shevchenko
  2025-06-29  9:51   ` Abdelrahman Fekry
  2025-07-01 11:58   ` Abdelrahman Fekry
  2025-07-06 16:22 ` Hans de Goede
  1 sibling, 2 replies; 14+ messages in thread
From: Andy Shevchenko @ 2025-06-28 19:52 UTC (permalink / raw)
  To: Abdelrahman Fekry
  Cc: andy, hdegoede, mchehab, sakari.ailus, gregkh,
	linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

On Sat, Jun 28, 2025 at 8:26 AM Abdelrahman Fekry
<abdelrahmanfekry375@gmail.com> wrote:
>
> The HMM_BO_DEVICE_INITED flag was being set in hmm_bo_device_init()
> before key initialization steps like kmem_cache_create(),
> kmem_cache_alloc(), and __bo_init().
>
> This means that if any of these steps fail, the flag remains set,
> misleading other parts of the driver (e.g. hmm_bo_alloc())
> into thinking the device is initialized. This could lead
> to undefined behavior or invalid memory use.

Nice. Can you make some fault injection (temporary by modifying the
code to always fail, for example) and actually prove this in practice?
If so, the few (important) lines from the given Oops would be nice to
have here.

> Additionally, since __bo_init() is called from inside
> hmm_bo_device_init() after the flag was already set, its internal
> check for HMM_BO_DEVICE_INITED is redundant.
>
> - Move the flag assignment to the end after all allocations succeed.
> - Remove redundant check of the flag inside __bo_init()


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-06-28 19:52 ` Andy Shevchenko
@ 2025-06-29  9:51   ` Abdelrahman Fekry
  2025-07-01 11:58   ` Abdelrahman Fekry
  1 sibling, 0 replies; 14+ messages in thread
From: Abdelrahman Fekry @ 2025-06-29  9:51 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: andy, hdegoede, mchehab, sakari.ailus, gregkh,
	linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

On Sat, Jun 28, 2025 at 10:52 PM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Sat, Jun 28, 2025 at 8:26 AM Abdelrahman Fekry
> <abdelrahmanfekry375@gmail.com> wrote:
> >
> > The HMM_BO_DEVICE_INITED flag was being set in hmm_bo_device_init()
> > before key initialization steps like kmem_cache_create(),
> > kmem_cache_alloc(), and __bo_init().
> >
> > This means that if any of these steps fail, the flag remains set,
> > misleading other parts of the driver (e.g. hmm_bo_alloc())
> > into thinking the device is initialized. This could lead
> > to undefined behavior or invalid memory use.
>
> Nice. Can you make some fault injection (temporary by modifying the
> code to always fail, for example) and actually prove this in practice?
> If so, the few (important) lines from the given Oops would be nice to
> have here.

I will look out how this can be done. Thanks for the feedback

> --
> With Best Regards,
> Andy Shevchenko

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-06-28 19:52 ` Andy Shevchenko
  2025-06-29  9:51   ` Abdelrahman Fekry
@ 2025-07-01 11:58   ` Abdelrahman Fekry
  2025-07-01 12:45     ` Andy Shevchenko
  1 sibling, 1 reply; 14+ messages in thread
From: Abdelrahman Fekry @ 2025-07-01 11:58 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: andy, hdegoede, mchehab, sakari.ailus, gregkh,
	linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

Hello Andy,
On Sat, Jun 28, 2025 at 10:52 PM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Sat, Jun 28, 2025 at 8:26 AM Abdelrahman Fekry
> <abdelrahmanfekry375@gmail.com> wrote:
> >
> > The HMM_BO_DEVICE_INITED flag was being set in hmm_bo_device_init()
> > before key initialization steps like kmem_cache_create(),
> > kmem_cache_alloc(), and __bo_init().
> >
> > This means that if any of these steps fail, the flag remains set,
> > misleading other parts of the driver (e.g. hmm_bo_alloc())
> > into thinking the device is initialized. This could lead
> > to undefined behavior or invalid memory use.
>
> Nice. Can you make some fault injection (temporary by modifying the
> code to always fail, for example) and actually prove this in practice?
> If so, the few (important) lines from the given Oops would be nice to
> have here.
>
I have been trying to test it without having any intel atomisp
hardware and failed continuously, do you have any tips or maybe some
resources on how i can test this driver.

> > Additionally, since __bo_init() is called from inside
> > hmm_bo_device_init() after the flag was already set, its internal
> > check for HMM_BO_DEVICE_INITED is redundant.
> >
> > - Move the flag assignment to the end after all allocations succeed.
> > - Remove redundant check of the flag inside __bo_init()
>
>
Thanks for your continuous guidance.
> --
> With Best Regards,
> Andy Shevchenko

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-07-01 11:58   ` Abdelrahman Fekry
@ 2025-07-01 12:45     ` Andy Shevchenko
  2025-07-01 13:54       ` Hans de Goede
  2025-07-01 15:33       ` Abdelrahman Fekry
  0 siblings, 2 replies; 14+ messages in thread
From: Andy Shevchenko @ 2025-07-01 12:45 UTC (permalink / raw)
  To: Abdelrahman Fekry
  Cc: Andy Shevchenko, andy, hdegoede, mchehab, sakari.ailus, gregkh,
	linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

On Tue, Jul 01, 2025 at 02:58:43PM +0300, Abdelrahman Fekry wrote:
> Hello Andy,
> On Sat, Jun 28, 2025 at 10:52 PM Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
> > On Sat, Jun 28, 2025 at 8:26 AM Abdelrahman Fekry
> > <abdelrahmanfekry375@gmail.com> wrote:

> > > The HMM_BO_DEVICE_INITED flag was being set in hmm_bo_device_init()
> > > before key initialization steps like kmem_cache_create(),
> > > kmem_cache_alloc(), and __bo_init().
> > >
> > > This means that if any of these steps fail, the flag remains set,
> > > misleading other parts of the driver (e.g. hmm_bo_alloc())
> > > into thinking the device is initialized. This could lead
> > > to undefined behavior or invalid memory use.
> >
> > Nice. Can you make some fault injection (temporary by modifying the
> > code to always fail, for example) and actually prove this in practice?
> > If so, the few (important) lines from the given Oops would be nice to
> > have here.

> I have been trying to test it without having any intel atomisp
> hardware and failed continuously, do you have any tips or maybe some
> resources on how i can test this driver.

So, the easiest way as I can see it is to ask people who possess the HW to
test, but you need to provide a testing patch (which can be applied on top
of this one, for example).

> > > Additionally, since __bo_init() is called from inside
> > > hmm_bo_device_init() after the flag was already set, its internal
> > > check for HMM_BO_DEVICE_INITED is redundant.
> > >
> > > - Move the flag assignment to the end after all allocations succeed.
> > > - Remove redundant check of the flag inside __bo_init()

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-07-01 12:45     ` Andy Shevchenko
@ 2025-07-01 13:54       ` Hans de Goede
  2025-07-01 15:45         ` Abdelrahman Fekry
  2025-07-01 15:33       ` Abdelrahman Fekry
  1 sibling, 1 reply; 14+ messages in thread
From: Hans de Goede @ 2025-07-01 13:54 UTC (permalink / raw)
  To: Andy Shevchenko, Abdelrahman Fekry
  Cc: Andy Shevchenko, andy, hdegoede, mchehab, sakari.ailus, gregkh,
	linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

Hi,

On 1-Jul-25 2:45 PM, Andy Shevchenko wrote:
> On Tue, Jul 01, 2025 at 02:58:43PM +0300, Abdelrahman Fekry wrote:
>> Hello Andy,
>> On Sat, Jun 28, 2025 at 10:52 PM Andy Shevchenko
>> <andy.shevchenko@gmail.com> wrote:
>>> On Sat, Jun 28, 2025 at 8:26 AM Abdelrahman Fekry
>>> <abdelrahmanfekry375@gmail.com> wrote:
> 
>>>> The HMM_BO_DEVICE_INITED flag was being set in hmm_bo_device_init()
>>>> before key initialization steps like kmem_cache_create(),
>>>> kmem_cache_alloc(), and __bo_init().
>>>>
>>>> This means that if any of these steps fail, the flag remains set,
>>>> misleading other parts of the driver (e.g. hmm_bo_alloc())
>>>> into thinking the device is initialized. This could lead
>>>> to undefined behavior or invalid memory use.
>>>
>>> Nice. Can you make some fault injection (temporary by modifying the
>>> code to always fail, for example) and actually prove this in practice?
>>> If so, the few (important) lines from the given Oops would be nice to
>>> have here.
> 
>> I have been trying to test it without having any intel atomisp
>> hardware and failed continuously, do you have any tips or maybe some
>> resources on how i can test this driver.
> 
> So, the easiest way as I can see it is to ask people who possess the HW to
> test, but you need to provide a testing patch (which can be applied on top
> of this one, for example).

I don't think it is worth testing the error path here,
the old code is obviously wrong.

What is interesting here is to see if the extra call to hmm_init()
in __hmm_alloc() is necessary at all.

And obviously hmm_init() needs to propagate the error return
from hmm_bo_init() right away instead of continuing if nothing
was wrong and then only returning the error at the end.

So it seems that there plenty of cleanup to do around this
without needing error injection to test all paths.

Actually I'm pretty sure that there will be quite a few
error-handling paths with bugs in the atomisp code given
its overall quality. But lets clean things up first, that
should make addressing any such cases easier.

Regards,

Hans


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-07-01 12:45     ` Andy Shevchenko
  2025-07-01 13:54       ` Hans de Goede
@ 2025-07-01 15:33       ` Abdelrahman Fekry
  2025-07-01 18:00         ` Andy Shevchenko
  1 sibling, 1 reply; 14+ messages in thread
From: Abdelrahman Fekry @ 2025-07-01 15:33 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Andy Shevchenko, andy, hdegoede, mchehab, sakari.ailus, gregkh,
	linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

Hi andy ,
On Tue, Jul 1, 2025 at 3:45 PM Andy Shevchenko
<andriy.shevchenko@intel.com> wrote:

> > > Nice. Can you make some fault injection (temporary by modifying the
> > > code to always fail, for example) and actually prove this in practice?
> > > If so, the few (important) lines from the given Oops would be nice to
> > > have here.
>
> > I have been trying to test it without having any intel atomisp
> > hardware and failed continuously, do you have any tips or maybe some
> > resources on how i can test this driver.
>
> So, the easiest way as I can see it is to ask people who possess the HW to
> test, but you need to provide a testing patch (which can be applied on top
> of this one, for example).
>

Well, after several hours of trial and error, I finally managed to
find a workaround that allowed me to test the scenario. As expected,
the system crashed exactly at the point we discussed. I was able to
capture the kernel panic log, which is shown below.

To simulate the issue, I injected a failure right after setting the
HMM_BO_DEVICE_INITED flag, this mimics a failure in one of the
subsequent initialization steps. Then, I wrote a test module that
calls the hmm_init() function directly. As anticipated, the kernel
panicked at the hmm_alloc(1) call inside hmm_init().

Here’s the relevant panic log:
[  161.802542] atomisp: loading out-of-tree module taints kernel.
[  161.823358] ===== HMM BO DEVICE TEST =====
[  161.823666] (NULL device *): Simulated failure for testing purposes.
[  161.824064] (NULL device *): invalid L1PT: pte = 0x7fffffff
[  161.824427] (NULL device *): hmm_bo_device_init failed.
[  161.824818] BUG: kernel NULL pointer dereference, address: 0000000000000020
[  161.825309] #PF: supervisor read access in kernel mode
[  161.825693] #PF: error_code(0x0000) - not-present page
[  161.826100] PGD 0 P4D 0
[  161.826237] Oops: Oops: 0000 [#1] SMP PTI
[  161.826482] CPU: 2 UID: 0 PID: 3688 Comm: modprobe Kdump: loaded
Tainted: G           O        6.16.0-rc4+ #2 PREEMPT(voluntary)
[  161.827445] Tainted: [O]=OOT_MODULE
[  161.827650] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[  161.828273] RIP:
0010:__bo_search_and_remove_from_free_rbtree+0xf/0xd0 [atomisp]
[  161.828977] Code: 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90
90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 54 49
89 fc 53 <8b> 47 20 39 f0 74 46 89 f3 76 07 48 83 7f 10 00 74 3b 39 d8
73 1f
[  161.830239] RSP: 0018:ffffb28104a2e970 EFLAGS: 00010246
[  161.830588] RAX: 0000000000000000 RBX: ffffffffc0a868e0 RCX: ffff8d6141e1ce88
[  161.831071] RDX: ffff8d5f47601980 RSI: 0000000000000001 RDI: 0000000000000000
[  161.831524] RBP: ffffb28104a2e980 R08: 0000000000000003 R09: 0000000000000001
[  161.831977] R10: 6369766564204c4c R11: 6564204c4c554e28 R12: 0000000000000000
[  161.832422] R13: 0000000000000000 R14: ffffffffc0a87950 R15: 0000000000000001
[  161.833019] FS:  00007f04fce83740(0000) GS:ffff8d619f0c4000(0000)
knlGS:0000000000000000
[  161.833527] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  161.833868] CR2: 0000000000000020 CR3: 000000010625a003 CR4: 00000000000706f0
[  161.834307] Call Trace:
[  161.834434]  <TASK>
[  161.834545]  hmm_bo_alloc+0x5c/0x2c0 [atomisp]
[  161.834959]  __hmm_alloc+0x48/0xf0 [atomisp]
[  161.835267]  hmm_init+0x98/0xd0 [atomisp]
[  161.835561]  ? __pfx_test_init+0x10/0x10 [atomisp]
[  161.835863]  test_init+0x42/0xff0 [atomisp]
[  161.836174]  do_one_initcall+0x4b/0x320
[  161.836446]  do_init_module+0x6a/0x2b0
[  161.836675]  load_module+0x24f7/0x25c0
[  161.836905]  ? kernel_read_file+0x226/0x2d0
[  161.837160]  init_module_from_file+0x9b/0xe0
[  161.837413]  ? init_module_from_file+0x9b/0xe0
[  161.837687]  idempotent_init_module+0x170/0x270
[  161.837958]  __x64_sys_finit_module+0x6f/0xe0
[  161.838225]  x64_sys_call+0x1b7a/0x2150
[  161.838454]  do_syscall_64+0x74/0x1d0
[  161.838701]  ? ksys_mmap_pgoff+0x1b7/0x240
[  161.838950]  ? __x64_sys_mmap+0x37/0x50
[  161.839176]  ? x64_sys_call+0x2008/0x2150
[  161.839429]  ? do_syscall_64+0xa3/0x1d0
[  161.839640]  ? __x64_sys_read+0x1e/0x30
[  161.839863]  ? x64_sys_call+0x1b90/0x2150
[  161.840098]  ? do_syscall_64+0xa3/0x1d0
[  161.840315]  ? do_syscall_64+0x199/0x1d0
[  161.840538]  ? x64_sys_call+0x1b90/0x2150
[  161.840775]  ? do_syscall_64+0xa3/0x1d0
[  161.841007]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  161.841289] RIP: 0033:0x7f04fc92695d
[  161.841490] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e
fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 03 35 0d 00 f7 d8 64 89
01 48
[  161.842992] RSP: 002b:00007ffd12ffbb88 EFLAGS: 00000246 ORIG_RAX:
0000000000000139
[  161.843500] RAX: ffffffffffffffda RBX: 0000557fdea491a0 RCX: 00007f04fc92695d
[  161.843968] RDX: 0000000000000000 RSI: 0000557fd288c358 RDI: 000000000000000c
[  161.844401] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000
[  161.844857] R10: 000000000000000c R11: 0000000000000246 R12: 0000557fd288c358
[  161.845285] R13: 0000000000000000 R14: 0000557fdea492b0 R15: 0000557fdea491a0
[  161.845740]  </TASK>
[  161.845844] Modules linked in: atomisp(O+) ipu_bridge v4l2_fwnode
videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common
v4l2_async videodev mc isofs vboxsf vboxguest vboxvideo
drm_vram_helper nls_iso8859_1 intel_rapl_msr intel_rapl_common
intel_uncore_frequency_common ghash_clmulni_intel sha512_ssse3
sha1_ssse3 aesni_intel snd_intel8x0 rapl snd_ac97_codec ac97_bus
snd_pcm binfmt_misc joydev snd_seq_midi snd_seq_midi_event snd_rawmidi
snd_seq vga16fb snd_seq_device vgastate input_leds sch_fq_codel
snd_timer snd mac_hid soundcore serio_raw vmwgfx drm_ttm_helper ttm
drm_client_lib drm_kms_helper drm msr parport_pc ppdev lp parport
ramoops pstore_blk reed_solomon efi_pstore pstore_zone ip_tables
x_tables autofs4 hid_generic usbhid hid e1000 video psmouse wmi ahci
libahci i2c_piix4 pata_acpi i2c_smbus [last unloaded: vboxguest]
[  161.851072] CR2: 0000000000000020


> With Best Regards,
> Andy Shevchenko

Best Regards,
Abelrahman Fekry

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-07-01 13:54       ` Hans de Goede
@ 2025-07-01 15:45         ` Abdelrahman Fekry
  2025-07-06 18:25           ` Hans de Goede
  0 siblings, 1 reply; 14+ messages in thread
From: Abdelrahman Fekry @ 2025-07-01 15:45 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Andy Shevchenko, Andy Shevchenko, andy, hdegoede, mchehab,
	sakari.ailus, gregkh, linux-kernel-mentees, linux-kernel,
	linux-media, linux-staging, skhan, dan.carpenter

Hi Hans,

On Tue, Jul 1, 2025 at 4:54 PM Hans de Goede <hansg@kernel.org> wrote:
>
> Hi,
>
> On 1-Jul-25 2:45 PM, Andy Shevchenko wrote:
> > On Tue, Jul 01, 2025 at 02:58:43PM +0300, Abdelrahman Fekry wrote:
> >> Hello Andy,
> >> On Sat, Jun 28, 2025 at 10:52 PM Andy Shevchenko
> >> <andy.shevchenko@gmail.com> wrote:
> >>> On Sat, Jun 28, 2025 at 8:26 AM Abdelrahman Fekry
> >>> <abdelrahmanfekry375@gmail.com> wrote:
> >
> >>>> The HMM_BO_DEVICE_INITED flag was being set in hmm_bo_device_init()
> >>>> before key initialization steps like kmem_cache_create(),
> >>>> kmem_cache_alloc(), and __bo_init().
> >>>>
> >>>> This means that if any of these steps fail, the flag remains set,
> >>>> misleading other parts of the driver (e.g. hmm_bo_alloc())
> >>>> into thinking the device is initialized. This could lead
> >>>> to undefined behavior or invalid memory use.
> >>>
> >>> Nice. Can you make some fault injection (temporary by modifying the
> >>> code to always fail, for example) and actually prove this in practice?
> >>> If so, the few (important) lines from the given Oops would be nice to
> >>> have here.
> >
> >> I have been trying to test it without having any intel atomisp
> >> hardware and failed continuously, do you have any tips or maybe some
> >> resources on how i can test this driver.
> >
> > So, the easiest way as I can see it is to ask people who possess the HW to
> > test, but you need to provide a testing patch (which can be applied on top
> > of this one, for example).
>
> I don't think it is worth testing the error path here,
> the old code is obviously wrong.
>
> What is interesting here is to see if the extra call to hmm_init()
> in __hmm_alloc() is necessary at all.
>
i think its redundant because in all scenarios hmm_init() should
be called before hmm_alloc()

> And obviously hmm_init() needs to propagate the error return
> from hmm_bo_init() right away instead of continuing if nothing
> was wrong and then only returning the error at the end.
>
> So it seems that there plenty of cleanup to do around this
> without needing error injection to test all paths.
>
> Actually I'm pretty sure that there will be quite a few
> error-handling paths with bugs in the atomisp code given
> its overall quality. But lets clean things up first, that
> should make addressing any such cases easier.
>
I totally agree with this , i have submitted a patch that cleans the
custom sysfs atrributes
as you suggested as a beginning , the patch got reviewed by andy and dan
here is the link
https://lore.kernel.org/all/20250627100604.29061-1-abdelrahmanfekry375@gmail.com/

What do you think I should work on next after these two patches, do
you have any suggestions?

> Regards,
>
> Hans
>
Best Regards,
Abdelrahman Fekry

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-07-01 15:33       ` Abdelrahman Fekry
@ 2025-07-01 18:00         ` Andy Shevchenko
  0 siblings, 0 replies; 14+ messages in thread
From: Andy Shevchenko @ 2025-07-01 18:00 UTC (permalink / raw)
  To: Abdelrahman Fekry
  Cc: Andy Shevchenko, andy, hdegoede, mchehab, sakari.ailus, gregkh,
	linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

On Tue, Jul 1, 2025 at 6:34 PM Abdelrahman Fekry
<abdelrahmanfekry375@gmail.com> wrote:
>
> Hi andy ,
> On Tue, Jul 1, 2025 at 3:45 PM Andy Shevchenko
> <andriy.shevchenko@intel.com> wrote:
>
> > > > Nice. Can you make some fault injection (temporary by modifying the
> > > > code to always fail, for example) and actually prove this in practice?
> > > > If so, the few (important) lines from the given Oops would be nice to
> > > > have here.
> >
> > > I have been trying to test it without having any intel atomisp
> > > hardware and failed continuously, do you have any tips or maybe some
> > > resources on how i can test this driver.
> >
> > So, the easiest way as I can see it is to ask people who possess the HW to
> > test, but you need to provide a testing patch (which can be applied on top
> > of this one, for example).
>
> Well, after several hours of trial and error, I finally managed to
> find a workaround that allowed me to test the scenario. As expected,
> the system crashed exactly at the point we discussed. I was able to
> capture the kernel panic log, which is shown below.

Thank you for the effort! I think we are good to go with the change,
maybe Hans can add a summary of this test into the commit message when
applying. Or Link tag might be enough, dunno.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-06-28  5:25 [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag Abdelrahman Fekry
  2025-06-28 19:52 ` Andy Shevchenko
@ 2025-07-06 16:22 ` Hans de Goede
  1 sibling, 0 replies; 14+ messages in thread
From: Hans de Goede @ 2025-07-06 16:22 UTC (permalink / raw)
  To: Abdelrahman Fekry, andy, mchehab, sakari.ailus, gregkh
  Cc: linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

Hi,

On 28-Jun-25 7:25 AM, Abdelrahman Fekry wrote:
> The HMM_BO_DEVICE_INITED flag was being set in hmm_bo_device_init()
> before key initialization steps like kmem_cache_create(),
> kmem_cache_alloc(), and __bo_init().
> 
> This means that if any of these steps fail, the flag remains set,
> misleading other parts of the driver (e.g. hmm_bo_alloc())
> into thinking the device is initialized. This could lead
> to undefined behavior or invalid memory use.
> 
> Additionally, since __bo_init() is called from inside
> hmm_bo_device_init() after the flag was already set, its internal
> check for HMM_BO_DEVICE_INITED is redundant.
> 
> - Move the flag assignment to the end after all allocations succeed.
> - Remove redundant check of the flag inside __bo_init()
> 
> Signed-off-by: Abdelrahman Fekry <abdelrahmanfekry375@gmail.com>

Thanks, patch looks good to me:

Reviewed-by: Hans de Goede <hansg@kernel.org>

I have adjusted the commit message with a link to the backtrace
you generated through fault-injection and I've merged this in my
media-atomisp branch:

https://git.kernel.org/pub/scm/linux/kernel/git/hansg/linux.git/log/?h=media-atomisp

This patch will be included in my next pull-request
to Mauro (to media subsystem maintainer)

Regards,

Hans




> ---
>  drivers/staging/media/atomisp/pci/hmm/hmm_bo.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c b/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c
> index 224ca8d42721..5d0cd5260d3a 100644
> --- a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c
> +++ b/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c
> @@ -37,8 +37,6 @@ static int __bo_init(struct hmm_bo_device *bdev, struct hmm_buffer_object *bo,
>  		     unsigned int pgnr)
>  {
>  	check_bodev_null_return(bdev, -EINVAL);
> -	var_equal_return(hmm_bo_device_inited(bdev), 0, -EINVAL,
> -			 "hmm_bo_device not inited yet.\n");
>  	/* prevent zero size buffer object */
>  	if (pgnr == 0) {
>  		dev_err(atomisp_dev, "0 size buffer is not allowed.\n");
> @@ -341,7 +339,6 @@ int hmm_bo_device_init(struct hmm_bo_device *bdev,
>  	spin_lock_init(&bdev->list_lock);
>  	mutex_init(&bdev->rbtree_mutex);
> 
> -	bdev->flag = HMM_BO_DEVICE_INITED;
> 
>  	INIT_LIST_HEAD(&bdev->entire_bo_list);
>  	bdev->allocated_rbtree = RB_ROOT;
> @@ -376,6 +373,8 @@ int hmm_bo_device_init(struct hmm_bo_device *bdev,
> 
>  	__bo_insert_to_free_rbtree(&bdev->free_rbtree, bo);
> 
> +	bdev->flag = HMM_BO_DEVICE_INITED;
> +
>  	return 0;
>  }
> 
> --
> 2.25.1
> 
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-07-01 15:45         ` Abdelrahman Fekry
@ 2025-07-06 18:25           ` Hans de Goede
  2025-07-07 10:57             ` Abdelrahman Fekry
  2025-07-07 13:26             ` Abdelrahman Fekry
  0 siblings, 2 replies; 14+ messages in thread
From: Hans de Goede @ 2025-07-06 18:25 UTC (permalink / raw)
  To: Abdelrahman Fekry
  Cc: Andy Shevchenko, Andy Shevchenko, andy, mchehab, sakari.ailus,
	gregkh, linux-kernel-mentees, linux-kernel, linux-media,
	linux-staging, skhan, dan.carpenter

Hi, Abdelrahman

On 1-Jul-25 5:45 PM, Abdelrahman Fekry wrote:
> Hi Hans,
> 
> On Tue, Jul 1, 2025 at 4:54 PM Hans de Goede <hansg@kernel.org> wrote:

...

>> Actually I'm pretty sure that there will be quite a few
>> error-handling paths with bugs in the atomisp code given
>> its overall quality. But lets clean things up first, that
>> should make addressing any such cases easier.
>>
> I totally agree with this , i have submitted a patch that cleans the
> custom sysfs atrributes
> as you suggested as a beginning , the patch got reviewed by andy and dan
> here is the link
> https://lore.kernel.org/all/20250627100604.29061-1-abdelrahmanfekry375@gmail.com/
> 
> What do you think I should work on next after these two patches, do
> you have any suggestions?

The hmm_alloc code can use some more cleanups:

* hmm_get_mmu_base_addr() should be moved to drivers/staging/media/atomisp/pci/hmm/hmm.c
  and then the "struct hmm_bo_device bo_device;" in hmm.c can be made static

* hmm_init() sets hmm_initialized = true even on errors. It should
  immediately exit (return ret) on errors instead of continue-ing
  with calling hmm_alloc() even though hmm_bo_device_init() failed.

* I've checked the code and hmm_init() is called before any hmm_alloc()
  calls are made so the extra hmm_init() call in __hmm_alloc() can be
  dropped.

* After dropping the extra hmm_init() call in __hmm_alloc() the
  hmm_initialized flag can be removed since it is now no longer read
  anywhere.

* And maybe you'll find more possible cleanups while working on this

Regards,

Hans



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-07-06 18:25           ` Hans de Goede
@ 2025-07-07 10:57             ` Abdelrahman Fekry
  2025-07-07 13:26             ` Abdelrahman Fekry
  1 sibling, 0 replies; 14+ messages in thread
From: Abdelrahman Fekry @ 2025-07-07 10:57 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Andy Shevchenko, Andy Shevchenko, andy, mchehab, sakari.ailus,
	gregkh, linux-kernel-mentees, linux-kernel, linux-media,
	linux-staging, skhan, dan.carpenter

Hi Hans,
Thank you so much for your support and guidance. It really helps a lot.

> The hmm_alloc code can use some more cleanups:
>
> * hmm_get_mmu_base_addr() should be moved to drivers/staging/media/atomisp/pci/hmm/hmm.c
>   and then the "struct hmm_bo_device bo_device;" in hmm.c can be made static
>
On it .

> * hmm_init() sets hmm_initialized = true even on errors. It should
>   immediately exit (return ret) on errors instead of continue-ing
>   with calling hmm_alloc() even though hmm_bo_device_init() failed.

I have created a patch for this but i was waiting for the other patch on
hmm_init() to be applied because it builds on it.

> * I've checked the code and hmm_init() is called before any hmm_alloc()
>   calls are made so the extra hmm_init() call in __hmm_alloc() can be
>   dropped.

Same for this too , i was just waiting for the prev patch to get
applied to avoid conflicts.

> * After dropping the extra hmm_init() call in __hmm_alloc() the
>   hmm_initialized flag can be removed since it is now no longer read
>   anywhere.
>
I agree with this too , will do it

> Regards,
>
> Hans
Thank you
Abdelrahman Fekry

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-07-06 18:25           ` Hans de Goede
  2025-07-07 10:57             ` Abdelrahman Fekry
@ 2025-07-07 13:26             ` Abdelrahman Fekry
  2025-07-07 13:55               ` Hans de Goede
  1 sibling, 1 reply; 14+ messages in thread
From: Abdelrahman Fekry @ 2025-07-07 13:26 UTC (permalink / raw)
  To: Hans de Goede, hdegoede, andy
  Cc: Andy Shevchenko, Andy Shevchenko, mchehab, sakari.ailus, gregkh,
	linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

Hi Hans, Andy

On Sun, Jul 6, 2025 at 9:25 PM Hans de Goede <hansg@kernel.org> wrote:
>
> Hi, Abdelrahman
>
> On 1-Jul-25 5:45 PM, Abdelrahman Fekry wrote:
> > Hi Hans,
> >
> > On Tue, Jul 1, 2025 at 4:54 PM Hans de Goede <hansg@kernel.org> wrote:

> * And maybe you'll find more possible cleanups while working on this

I was working on some cleanups and i found something that i really
don't understand.
how is the error checking of hmm_alloc is handled , some functions are
checking the
returning value of it using mmgr_NULL and other use mmgr_EXCEPTION
while in fact, when __hm_alloc fails it doesn't return either
mmgr_EXCEPTION or mmgr_NULL in
most cases.

> Regards,
>
> Hans
>
>
Best,
Abdelrhaman Fekry

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag
  2025-07-07 13:26             ` Abdelrahman Fekry
@ 2025-07-07 13:55               ` Hans de Goede
  0 siblings, 0 replies; 14+ messages in thread
From: Hans de Goede @ 2025-07-07 13:55 UTC (permalink / raw)
  To: Abdelrahman Fekry, hdegoede, andy
  Cc: Andy Shevchenko, Andy Shevchenko, mchehab, sakari.ailus, gregkh,
	linux-kernel-mentees, linux-kernel, linux-media, linux-staging,
	skhan, dan.carpenter

Hi Abdelrhaman,

On 7-Jul-25 15:26, Abdelrahman Fekry wrote:
> Hi Hans, Andy
> 
> On Sun, Jul 6, 2025 at 9:25 PM Hans de Goede <hansg@kernel.org> wrote:
>>
>> Hi, Abdelrahman
>>
>> On 1-Jul-25 5:45 PM, Abdelrahman Fekry wrote:
>>> Hi Hans,
>>>
>>> On Tue, Jul 1, 2025 at 4:54 PM Hans de Goede <hansg@kernel.org> wrote:
> 
>> * And maybe you'll find more possible cleanups while working on this
> 
> I was working on some cleanups and i found something that i really
> don't understand.
> how is the error checking of hmm_alloc is handled , some functions are
> checking the
> returning value of it using mmgr_NULL and other use mmgr_EXCEPTION
> while in fact, when __hm_alloc fails it doesn't return either
> mmgr_EXCEPTION or mmgr_NULL in
> most cases.

Sounds like you've found another issue to fix :)

Regards,

Hans


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-07-07 13:55 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-28  5:25 [PATCH] staging: media: atomisp: Fix premature setting of HMM_BO_DEVICE_INITED flag Abdelrahman Fekry
2025-06-28 19:52 ` Andy Shevchenko
2025-06-29  9:51   ` Abdelrahman Fekry
2025-07-01 11:58   ` Abdelrahman Fekry
2025-07-01 12:45     ` Andy Shevchenko
2025-07-01 13:54       ` Hans de Goede
2025-07-01 15:45         ` Abdelrahman Fekry
2025-07-06 18:25           ` Hans de Goede
2025-07-07 10:57             ` Abdelrahman Fekry
2025-07-07 13:26             ` Abdelrahman Fekry
2025-07-07 13:55               ` Hans de Goede
2025-07-01 15:33       ` Abdelrahman Fekry
2025-07-01 18:00         ` Andy Shevchenko
2025-07-06 16:22 ` Hans de Goede

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).