* Possible case of Race in kobject_get_path() @ 2022-03-30 10:11 Mukesh Ojha 2022-03-30 10:16 ` Greg KH 0 siblings, 1 reply; 9+ messages in thread From: Mukesh Ojha @ 2022-03-30 10:11 UTC (permalink / raw) To: Rafael J. Wysocki, gregkh, linux-remoteproc Hi All, We are facing one issue where one driver (p1) is trying to register its device from driver probe and from another path (p2) dev_set_name(new name) done from driver probe of the added device whose new name length can be more than earlier done in (p1 path) which could result in redzone overwritten issue. Can we get your suggestion here ? is this case of a race here ? p1 p2 device_register() kobject_get_path() => get_kobj_path_length (length is calculated from this path) dev_set_name() => fill_kobj_path (path is copied here) Thanks, -Mukesh ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible case of Race in kobject_get_path() 2022-03-30 10:11 Possible case of Race in kobject_get_path() Mukesh Ojha @ 2022-03-30 10:16 ` Greg KH [not found] ` <148f4e9b-0584-ccc1-1443-30d41190dfd3@quicinc.com> 0 siblings, 1 reply; 9+ messages in thread From: Greg KH @ 2022-03-30 10:16 UTC (permalink / raw) To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-remoteproc On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote: > Hi All, > > We are facing one issue where one driver (p1) is trying to register its > device from driver probe > and from another path (p2) dev_set_name(new name) done from driver probe of > the added device whose > new name length can be more than earlier done in (p1 path) which could > result in redzone overwritten issue. I do not understand, what specific driver is doing this so that we can see an example of this problem? > Can we get your suggestion here ? is this case of a race here ? > > p1 p2 > > device_register() > kobject_get_path() > => get_kobj_path_length > (length is calculated from this path) > > dev_set_name() > > => fill_kobj_path > (path is copied here) > I can not understand this example, any specific code you can point me at? thanks, greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <148f4e9b-0584-ccc1-1443-30d41190dfd3@quicinc.com>]
* Re: Possible case of Race in kobject_get_path() [not found] ` <148f4e9b-0584-ccc1-1443-30d41190dfd3@quicinc.com> @ 2022-03-30 11:01 ` Greg KH 2022-03-30 11:14 ` Mukesh Ojha 0 siblings, 1 reply; 9+ messages in thread From: Greg KH @ 2022-03-30 11:01 UTC (permalink / raw) To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-kernel On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote: > > On 3/30/2022 3:46 PM, Greg KH wrote: > > On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote: > > > Hi All, > > > > > > We are facing one issue where one driver (p1) is trying to register its > > > device from driver probe > > > and from another path (p2) dev_set_name(new name) done from driver probe of > > > the added device whose > > > new name length can be more than earlier done in (p1 path) which could > > > result in redzone overwritten issue. > > I do not understand, what specific driver is doing this so that we can > > see an example of this problem? > > >> > > trying to paste some logs of the race. > > [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone > overwritten What kernel version is this? > [ 14.244189][ T503] > ----------------------------------------------------------------------------- > [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f > @offset=3965. First byte 0x2f instead of 0xcc > [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1 > cpu=0 pid=503 > [ 14.273111][ T503] kobject_get_path+0x58/0x100 > [ 14.277747][ T503] kobject_uevent_env+0x120/0x744 > [ 14.282639][ T503] device_add+0x98c/0x1028 > [ 14.286925][ T503] *device_register+0x24/0x38* > [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus] > [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus] > [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl] > [ 14.310007][ T503] platform_drv_probe+0x60/0x180 > [ 14.314812][ T503] really_probe+0x208/0xb64 > [ 14.319184][ T503] driver_probe_device+0x130/0x254 > [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324 > [ 14.329399][ T503] __device_attach+0x170/0x364 > [ 14.334035][ T503] bus_probe_device+0x4c/0x164 > [ 14.338671][ T503] device_add+0xa60/0x1028 > [ 14.342957][ T503] platform_device_add+0x260/0x2c8 > [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804 > [slim_qcom_ngd_ctrl] > [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1 > cpu=4 pid=518 > [ 14.363018][ T503] do_init_module+0xac/0x494 > [ 14.367475][ T503] load_module+0x35c0/0x3e54 > [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8 > [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30 > [ 14.382328][ T503] el0_svc_common+0xdc/0x294 > [ 14.386786][ T503] el0_svc+0x38/0x9c > [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0 > > > [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc > cc cc cc cc 2f 64 65 .............*/de* > [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74 > 66 6f 72 6d 2f 73 vices/platform/s > [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30 > 2e 73 6c 69 6d 2f oc/3340000.slim/ > [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d > 6e 67 64 2e 31 2f qcom,slim-ngd.1/ > [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73 > 6c 61 76 65 00 00 *btfmslim_slave*.. > > > > 499 static int btfm_slim_probe(struct slim_device *slim) I do not see that function in Linus's tree right now. Where does it come from? thanks, greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible case of Race in kobject_get_path() 2022-03-30 11:01 ` Greg KH @ 2022-03-30 11:14 ` Mukesh Ojha 2022-03-30 11:19 ` Greg KH 0 siblings, 1 reply; 9+ messages in thread From: Mukesh Ojha @ 2022-03-30 11:14 UTC (permalink / raw) To: Greg KH; +Cc: Rafael J. Wysocki, linux-kernel On 3/30/2022 4:31 PM, Greg KH wrote: > On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote: >> On 3/30/2022 3:46 PM, Greg KH wrote: >>> On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote: >>>> Hi All, >>>> >>>> We are facing one issue where one driver (p1) is trying to register its >>>> device from driver probe >>>> and from another path (p2) dev_set_name(new name) done from driver probe of >>>> the added device whose >>>> new name length can be more than earlier done in (p1 path) which could >>>> result in redzone overwritten issue. >>> I do not understand, what specific driver is doing this so that we can >>> see an example of this problem? >> trying to paste some logs of the race. >> >> [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone >> overwritten > What kernel version is this? 5.10 > >> [ 14.244189][ T503] >> ----------------------------------------------------------------------------- >> [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f >> @offset=3965. First byte 0x2f instead of 0xcc >> [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1 >> cpu=0 pid=503 >> [ 14.273111][ T503] kobject_get_path+0x58/0x100 >> [ 14.277747][ T503] kobject_uevent_env+0x120/0x744 >> [ 14.282639][ T503] device_add+0x98c/0x1028 >> [ 14.286925][ T503] *device_register+0x24/0x38* >> [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus] >> [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus] >> [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl] >> [ 14.310007][ T503] platform_drv_probe+0x60/0x180 >> [ 14.314812][ T503] really_probe+0x208/0xb64 >> [ 14.319184][ T503] driver_probe_device+0x130/0x254 >> [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324 >> [ 14.329399][ T503] __device_attach+0x170/0x364 >> [ 14.334035][ T503] bus_probe_device+0x4c/0x164 >> [ 14.338671][ T503] device_add+0xa60/0x1028 >> [ 14.342957][ T503] platform_device_add+0x260/0x2c8 >> [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804 >> [slim_qcom_ngd_ctrl] >> [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1 >> cpu=4 pid=518 >> [ 14.363018][ T503] do_init_module+0xac/0x494 >> [ 14.367475][ T503] load_module+0x35c0/0x3e54 >> [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8 >> [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30 >> [ 14.382328][ T503] el0_svc_common+0xdc/0x294 >> [ 14.386786][ T503] el0_svc+0x38/0x9c >> [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0 >> >> >> [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc >> cc cc cc cc 2f 64 65 .............*/de* >> [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74 >> 66 6f 72 6d 2f 73 vices/platform/s >> [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30 >> 2e 73 6c 69 6d 2f oc/3340000.slim/ >> [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d >> 6e 67 64 2e 31 2f qcom,slim-ngd.1/ >> [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73 >> 6c 61 76 65 00 00 *btfmslim_slave*.. >> >> >> >> 499 static int btfm_slim_probe(struct slim_device *slim) > I do not see that function in Linus's tree right now. Where does it > come from? Don't you think, it can be caused by any kernel module. > > thanks, > > greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible case of Race in kobject_get_path() 2022-03-30 11:14 ` Mukesh Ojha @ 2022-03-30 11:19 ` Greg KH 2022-03-30 11:23 ` Mukesh Ojha 0 siblings, 1 reply; 9+ messages in thread From: Greg KH @ 2022-03-30 11:19 UTC (permalink / raw) To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-kernel On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote: > > On 3/30/2022 4:31 PM, Greg KH wrote: > > On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote: > > > On 3/30/2022 3:46 PM, Greg KH wrote: > > > > On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote: > > > > > Hi All, > > > > > > > > > > We are facing one issue where one driver (p1) is trying to register its > > > > > device from driver probe > > > > > and from another path (p2) dev_set_name(new name) done from driver probe of > > > > > the added device whose > > > > > new name length can be more than earlier done in (p1 path) which could > > > > > result in redzone overwritten issue. > > > > I do not understand, what specific driver is doing this so that we can > > > > see an example of this problem? > > > trying to paste some logs of the race. > > > > > > [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone > > > overwritten > > What kernel version is this? > 5.10 That is very old, is this an issue in 5.17? > > > [ 14.244189][ T503] > > > ----------------------------------------------------------------------------- > > > [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f > > > @offset=3965. First byte 0x2f instead of 0xcc > > > [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1 > > > cpu=0 pid=503 > > > [ 14.273111][ T503] kobject_get_path+0x58/0x100 > > > [ 14.277747][ T503] kobject_uevent_env+0x120/0x744 > > > [ 14.282639][ T503] device_add+0x98c/0x1028 > > > [ 14.286925][ T503] *device_register+0x24/0x38* > > > [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus] > > > [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus] > > > [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl] > > > [ 14.310007][ T503] platform_drv_probe+0x60/0x180 > > > [ 14.314812][ T503] really_probe+0x208/0xb64 > > > [ 14.319184][ T503] driver_probe_device+0x130/0x254 > > > [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324 > > > [ 14.329399][ T503] __device_attach+0x170/0x364 > > > [ 14.334035][ T503] bus_probe_device+0x4c/0x164 > > > [ 14.338671][ T503] device_add+0xa60/0x1028 > > > [ 14.342957][ T503] platform_device_add+0x260/0x2c8 > > > [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804 > > > [slim_qcom_ngd_ctrl] > > > [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1 > > > cpu=4 pid=518 > > > [ 14.363018][ T503] do_init_module+0xac/0x494 > > > [ 14.367475][ T503] load_module+0x35c0/0x3e54 > > > [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8 > > > [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30 > > > [ 14.382328][ T503] el0_svc_common+0xdc/0x294 > > > [ 14.386786][ T503] el0_svc+0x38/0x9c > > > [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0 > > > > > > > > > [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc > > > cc cc cc cc 2f 64 65 .............*/de* > > > [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74 > > > 66 6f 72 6d 2f 73 vices/platform/s > > > [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30 > > > 2e 73 6c 69 6d 2f oc/3340000.slim/ > > > [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d > > > 6e 67 64 2e 31 2f qcom,slim-ngd.1/ > > > [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73 > > > 6c 61 76 65 00 00 *btfmslim_slave*.. > > > > > > > > > > > > 499 static int btfm_slim_probe(struct slim_device *slim) > > I do not see that function in Linus's tree right now. Where does it > > come from? > Don't you think, it can be caused by any kernel module. Kernel code do not protect themselves against other kernel code doing abusive things. Perhaps fix your out-of-tree code to not do this if no in-kernel code is doing this? Nothing we can do at all about out-of-tree code that we have never seen. Best of luck! greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible case of Race in kobject_get_path() 2022-03-30 11:19 ` Greg KH @ 2022-03-30 11:23 ` Mukesh Ojha 2022-03-30 11:31 ` Greg KH 0 siblings, 1 reply; 9+ messages in thread From: Mukesh Ojha @ 2022-03-30 11:23 UTC (permalink / raw) To: Greg KH; +Cc: Rafael J. Wysocki, linux-kernel On 3/30/2022 4:49 PM, Greg KH wrote: > On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote: >> On 3/30/2022 4:31 PM, Greg KH wrote: >>> On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote: >>>> On 3/30/2022 3:46 PM, Greg KH wrote: >>>>> On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote: >>>>>> Hi All, >>>>>> >>>>>> We are facing one issue where one driver (p1) is trying to register its >>>>>> device from driver probe >>>>>> and from another path (p2) dev_set_name(new name) done from driver probe of >>>>>> the added device whose >>>>>> new name length can be more than earlier done in (p1 path) which could >>>>>> result in redzone overwritten issue. >>>>> I do not understand, what specific driver is doing this so that we can >>>>> see an example of this problem? >>>> trying to paste some logs of the race. >>>> >>>> [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone >>>> overwritten >>> What kernel version is this? >> 5.10 > That is very old, is this an issue in 5.17? > >>>> [ 14.244189][ T503] >>>> ----------------------------------------------------------------------------- >>>> [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f >>>> @offset=3965. First byte 0x2f instead of 0xcc >>>> [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1 >>>> cpu=0 pid=503 >>>> [ 14.273111][ T503] kobject_get_path+0x58/0x100 >>>> [ 14.277747][ T503] kobject_uevent_env+0x120/0x744 >>>> [ 14.282639][ T503] device_add+0x98c/0x1028 >>>> [ 14.286925][ T503] *device_register+0x24/0x38* >>>> [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus] >>>> [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus] >>>> [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl] >>>> [ 14.310007][ T503] platform_drv_probe+0x60/0x180 >>>> [ 14.314812][ T503] really_probe+0x208/0xb64 >>>> [ 14.319184][ T503] driver_probe_device+0x130/0x254 >>>> [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324 >>>> [ 14.329399][ T503] __device_attach+0x170/0x364 >>>> [ 14.334035][ T503] bus_probe_device+0x4c/0x164 >>>> [ 14.338671][ T503] device_add+0xa60/0x1028 >>>> [ 14.342957][ T503] platform_device_add+0x260/0x2c8 >>>> [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804 >>>> [slim_qcom_ngd_ctrl] >>>> [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1 >>>> cpu=4 pid=518 >>>> [ 14.363018][ T503] do_init_module+0xac/0x494 >>>> [ 14.367475][ T503] load_module+0x35c0/0x3e54 >>>> [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8 >>>> [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30 >>>> [ 14.382328][ T503] el0_svc_common+0xdc/0x294 >>>> [ 14.386786][ T503] el0_svc+0x38/0x9c >>>> [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0 >>>> >>>> >>>> [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc >>>> cc cc cc cc 2f 64 65 .............*/de* >>>> [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74 >>>> 66 6f 72 6d 2f 73 vices/platform/s >>>> [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30 >>>> 2e 73 6c 69 6d 2f oc/3340000.slim/ >>>> [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d >>>> 6e 67 64 2e 31 2f qcom,slim-ngd.1/ >>>> [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73 >>>> 6c 61 76 65 00 00 *btfmslim_slave*.. >>>> >>>> >>>> >>>> 499 static int btfm_slim_probe(struct slim_device *slim) >>> I do not see that function in Linus's tree right now. Where does it >>> come from? >> Don't you think, it can be caused by any kernel module. > Kernel code do not protect themselves against other kernel code doing > abusive things. > > Perhaps fix your out-of-tree code to not do this if no in-kernel code is > doing this? Nothing we can do at all about out-of-tree code that we > have never seen. Thanks for the comment, Greg. Why do we make dev_set_name() make it available to module apart from the core ? -Mukesh > > Best of luck! > > greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible case of Race in kobject_get_path() 2022-03-30 11:23 ` Mukesh Ojha @ 2022-03-30 11:31 ` Greg KH 2022-03-30 12:55 ` Mukesh Ojha 0 siblings, 1 reply; 9+ messages in thread From: Greg KH @ 2022-03-30 11:31 UTC (permalink / raw) To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-kernel On Wed, Mar 30, 2022 at 04:53:41PM +0530, Mukesh Ojha wrote: > > On 3/30/2022 4:49 PM, Greg KH wrote: > > On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote: > > > On 3/30/2022 4:31 PM, Greg KH wrote: > > > > On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote: > > > > > On 3/30/2022 3:46 PM, Greg KH wrote: > > > > > > On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote: > > > > > > > Hi All, > > > > > > > > > > > > > > We are facing one issue where one driver (p1) is trying to register its > > > > > > > device from driver probe > > > > > > > and from another path (p2) dev_set_name(new name) done from driver probe of > > > > > > > the added device whose > > > > > > > new name length can be more than earlier done in (p1 path) which could > > > > > > > result in redzone overwritten issue. > > > > > > I do not understand, what specific driver is doing this so that we can > > > > > > see an example of this problem? > > > > > trying to paste some logs of the race. > > > > > > > > > > [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone > > > > > overwritten > > > > What kernel version is this? > > > 5.10 > > That is very old, is this an issue in 5.17? > > > > > > > [ 14.244189][ T503] > > > > > ----------------------------------------------------------------------------- > > > > > [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f > > > > > @offset=3965. First byte 0x2f instead of 0xcc > > > > > [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1 > > > > > cpu=0 pid=503 > > > > > [ 14.273111][ T503] kobject_get_path+0x58/0x100 > > > > > [ 14.277747][ T503] kobject_uevent_env+0x120/0x744 > > > > > [ 14.282639][ T503] device_add+0x98c/0x1028 > > > > > [ 14.286925][ T503] *device_register+0x24/0x38* > > > > > [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus] > > > > > [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus] > > > > > [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl] > > > > > [ 14.310007][ T503] platform_drv_probe+0x60/0x180 > > > > > [ 14.314812][ T503] really_probe+0x208/0xb64 > > > > > [ 14.319184][ T503] driver_probe_device+0x130/0x254 > > > > > [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324 > > > > > [ 14.329399][ T503] __device_attach+0x170/0x364 > > > > > [ 14.334035][ T503] bus_probe_device+0x4c/0x164 > > > > > [ 14.338671][ T503] device_add+0xa60/0x1028 > > > > > [ 14.342957][ T503] platform_device_add+0x260/0x2c8 > > > > > [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804 > > > > > [slim_qcom_ngd_ctrl] > > > > > [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1 > > > > > cpu=4 pid=518 > > > > > [ 14.363018][ T503] do_init_module+0xac/0x494 > > > > > [ 14.367475][ T503] load_module+0x35c0/0x3e54 > > > > > [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8 > > > > > [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30 > > > > > [ 14.382328][ T503] el0_svc_common+0xdc/0x294 > > > > > [ 14.386786][ T503] el0_svc+0x38/0x9c > > > > > [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0 > > > > > > > > > > > > > > > [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc > > > > > cc cc cc cc 2f 64 65 .............*/de* > > > > > [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74 > > > > > 66 6f 72 6d 2f 73 vices/platform/s > > > > > [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30 > > > > > 2e 73 6c 69 6d 2f oc/3340000.slim/ > > > > > [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d > > > > > 6e 67 64 2e 31 2f qcom,slim-ngd.1/ > > > > > [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73 > > > > > 6c 61 76 65 00 00 *btfmslim_slave*.. > > > > > > > > > > > > > > > > > > > > 499 static int btfm_slim_probe(struct slim_device *slim) > > > > I do not see that function in Linus's tree right now. Where does it > > > > come from? > > > Don't you think, it can be caused by any kernel module. > > Kernel code do not protect themselves against other kernel code doing > > abusive things. > > > > Perhaps fix your out-of-tree code to not do this if no in-kernel code is > > doing this? Nothing we can do at all about out-of-tree code that we > > have never seen. > > > Thanks for the comment, Greg. > Why do we make dev_set_name() make it available to module apart from the > core ? Because busses can be in a module, just like you should be doing :) Please submit your code upstream for review. thanks, greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible case of Race in kobject_get_path() 2022-03-30 11:31 ` Greg KH @ 2022-03-30 12:55 ` Mukesh Ojha 2022-03-30 14:22 ` Greg KH 0 siblings, 1 reply; 9+ messages in thread From: Mukesh Ojha @ 2022-03-30 12:55 UTC (permalink / raw) To: Greg KH; +Cc: Rafael J. Wysocki, linux-kernel On 3/30/2022 5:01 PM, Greg KH wrote: > On Wed, Mar 30, 2022 at 04:53:41PM +0530, Mukesh Ojha wrote: >> On 3/30/2022 4:49 PM, Greg KH wrote: >>> On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote: >>>> On 3/30/2022 4:31 PM, Greg KH wrote: >>>>> On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote: >>>>>> On 3/30/2022 3:46 PM, Greg KH wrote: >>>>>>> On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote: >>>>>>>> Hi All, >>>>>>>> >>>>>>>> We are facing one issue where one driver (p1) is trying to register its >>>>>>>> device from driver probe >>>>>>>> and from another path (p2) dev_set_name(new name) done from driver probe of >>>>>>>> the added device whose >>>>>>>> new name length can be more than earlier done in (p1 path) which could >>>>>>>> result in redzone overwritten issue. >>>>>>> I do not understand, what specific driver is doing this so that we can >>>>>>> see an example of this problem? >>>>>> trying to paste some logs of the race. >>>>>> >>>>>> [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone >>>>>> overwritten >>>>> What kernel version is this? >>>> 5.10 >>> That is very old, is this an issue in 5.17? >>> >>>>>> [ 14.244189][ T503] >>>>>> ----------------------------------------------------------------------------- >>>>>> [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f >>>>>> @offset=3965. First byte 0x2f instead of 0xcc >>>>>> [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1 >>>>>> cpu=0 pid=503 >>>>>> [ 14.273111][ T503] kobject_get_path+0x58/0x100 >>>>>> [ 14.277747][ T503] kobject_uevent_env+0x120/0x744 >>>>>> [ 14.282639][ T503] device_add+0x98c/0x1028 >>>>>> [ 14.286925][ T503] *device_register+0x24/0x38* >>>>>> [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus] >>>>>> [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus] >>>>>> [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl] >>>>>> [ 14.310007][ T503] platform_drv_probe+0x60/0x180 >>>>>> [ 14.314812][ T503] really_probe+0x208/0xb64 >>>>>> [ 14.319184][ T503] driver_probe_device+0x130/0x254 >>>>>> [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324 >>>>>> [ 14.329399][ T503] __device_attach+0x170/0x364 >>>>>> [ 14.334035][ T503] bus_probe_device+0x4c/0x164 >>>>>> [ 14.338671][ T503] device_add+0xa60/0x1028 >>>>>> [ 14.342957][ T503] platform_device_add+0x260/0x2c8 >>>>>> [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804 >>>>>> [slim_qcom_ngd_ctrl] >>>>>> [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1 >>>>>> cpu=4 pid=518 >>>>>> [ 14.363018][ T503] do_init_module+0xac/0x494 >>>>>> [ 14.367475][ T503] load_module+0x35c0/0x3e54 >>>>>> [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8 >>>>>> [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30 >>>>>> [ 14.382328][ T503] el0_svc_common+0xdc/0x294 >>>>>> [ 14.386786][ T503] el0_svc+0x38/0x9c >>>>>> [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0 >>>>>> >>>>>> >>>>>> [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc >>>>>> cc cc cc cc 2f 64 65 .............*/de* >>>>>> [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74 >>>>>> 66 6f 72 6d 2f 73 vices/platform/s >>>>>> [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30 >>>>>> 2e 73 6c 69 6d 2f oc/3340000.slim/ >>>>>> [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d >>>>>> 6e 67 64 2e 31 2f qcom,slim-ngd.1/ >>>>>> [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73 >>>>>> 6c 61 76 65 00 00 *btfmslim_slave*.. >>>>>> >>>>>> >>>>>> >>>>>> 499 static int btfm_slim_probe(struct slim_device *slim) >>>>> I do not see that function in Linus's tree right now. Where does it >>>>> come from? >>>> Don't you think, it can be caused by any kernel module. >>> Kernel code do not protect themselves against other kernel code doing >>> abusive things. >>> >>> Perhaps fix your out-of-tree code to not do this if no in-kernel code is >>> doing this? Nothing we can do at all about out-of-tree code that we >>> have never seen. >> >> Thanks for the comment, Greg. >> Why do we make dev_set_name() make it available to module apart from the >> core ? > Because busses can be in a module, just like you should be doing :) > > Please submit your code upstream for review. I still did not get why one should not do device name change if dev_set_name() is available for anyone to do name change while kobject_get_path() is running? This is possible race, if we know this is the limitation/issue and would be difficult to address , that is fine and should be documented. Thanks, Mukesh > > thanks, > > greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Possible case of Race in kobject_get_path() 2022-03-30 12:55 ` Mukesh Ojha @ 2022-03-30 14:22 ` Greg KH 0 siblings, 0 replies; 9+ messages in thread From: Greg KH @ 2022-03-30 14:22 UTC (permalink / raw) To: Mukesh Ojha; +Cc: Rafael J. Wysocki, linux-kernel On Wed, Mar 30, 2022 at 06:25:29PM +0530, Mukesh Ojha wrote: > > On 3/30/2022 5:01 PM, Greg KH wrote: > > On Wed, Mar 30, 2022 at 04:53:41PM +0530, Mukesh Ojha wrote: > > > On 3/30/2022 4:49 PM, Greg KH wrote: > > > > On Wed, Mar 30, 2022 at 04:44:26PM +0530, Mukesh Ojha wrote: > > > > > On 3/30/2022 4:31 PM, Greg KH wrote: > > > > > > On Wed, Mar 30, 2022 at 04:14:11PM +0530, Mukesh Ojha wrote: > > > > > > > On 3/30/2022 3:46 PM, Greg KH wrote: > > > > > > > > On Wed, Mar 30, 2022 at 03:41:04PM +0530, Mukesh Ojha wrote: > > > > > > > > > Hi All, > > > > > > > > > > > > > > > > > > We are facing one issue where one driver (p1) is trying to register its > > > > > > > > > device from driver probe > > > > > > > > > and from another path (p2) dev_set_name(new name) done from driver probe of > > > > > > > > > the added device whose > > > > > > > > > new name length can be more than earlier done in (p1 path) which could > > > > > > > > > result in redzone overwritten issue. > > > > > > > > I do not understand, what specific driver is doing this so that we can > > > > > > > > see an example of this problem? > > > > > > > trying to paste some logs of the race. > > > > > > > > > > > > > > [ 14.235820][ T503] BUG kmalloc-128 (Tainted: G S O ): Left Redzone > > > > > > > overwritten > > > > > > What kernel version is this? > > > > > 5.10 > > > > That is very old, is this an issue in 5.17? > > > > > > > > > > > [ 14.244189][ T503] > > > > > > > ----------------------------------------------------------------------------- > > > > > > > [ 14.255241][ T503] INFO: 0xffffff87caae0f7d-0xffffff87caae0f7f > > > > > > > @offset=3965. First byte 0x2f instead of 0xcc > > > > > > > [ 14.265176][ T503] INFO: Allocated in kobject_get_path+0x58/0x100 age=1 > > > > > > > cpu=0 pid=503 > > > > > > > [ 14.273111][ T503] kobject_get_path+0x58/0x100 > > > > > > > [ 14.277747][ T503] kobject_uevent_env+0x120/0x744 > > > > > > > [ 14.282639][ T503] device_add+0x98c/0x1028 > > > > > > > [ 14.286925][ T503] *device_register+0x24/0x38* > > > > > > > [ 14.291395][ T503] slim_alloc_device+0xdc/0x108 [slimbus] > > > > > > > [ 14.296992][ T503] slim_register_controller+0x210/0x2ac [slimbus] > > > > > > > [ 14.303291][ T503] qcom_slim_ngd_probe+0x3c/0x348 [slim_qcom_ngd_ctrl] > > > > > > > [ 14.310007][ T503] platform_drv_probe+0x60/0x180 > > > > > > > [ 14.314812][ T503] really_probe+0x208/0xb64 > > > > > > > [ 14.319184][ T503] driver_probe_device+0x130/0x254 > > > > > > > [ 14.324159][ T503] __device_attach_driver+0x1e8/0x324 > > > > > > > [ 14.329399][ T503] __device_attach+0x170/0x364 > > > > > > > [ 14.334035][ T503] bus_probe_device+0x4c/0x164 > > > > > > > [ 14.338671][ T503] device_add+0xa60/0x1028 > > > > > > > [ 14.342957][ T503] platform_device_add+0x260/0x2c8 > > > > > > > [ 14.347941][ T503] qcom_slim_ngd_ctrl_probe+0x71c/0x804 > > > > > > > [slim_qcom_ngd_ctrl] > > > > > > > [ 14.355177][ T503] INFO: Freed in kobject_uevent_env+0x210/0x744 age=1 > > > > > > > cpu=4 pid=518 > > > > > > > [ 14.363018][ T503] do_init_module+0xac/0x494 > > > > > > > [ 14.367475][ T503] load_module+0x35c0/0x3e54 > > > > > > > [ 14.371930][ T503] __se_sys_finit_module+0x188/0x1c8 > > > > > > > [ 14.377086][ T503] __arm64_sys_finit_module+0x20/0x30 > > > > > > > [ 14.382328][ T503] el0_svc_common+0xdc/0x294 > > > > > > > [ 14.386786][ T503] el0_svc+0x38/0x9c > > > > > > > [ 14.390552][ T503] el0_sync_handler+0x8c/0xf0 > > > > > > > > > > > > > > > > > > > > > [ 14.490994*][ T503] Redzone ffffff87caae0f7**0*: cc cc cc cc cc cc cc cc cc > > > > > > > cc cc cc cc 2f 64 65 .............*/de* > > > > > > > [ 14.501185][ T503] Object ffffff87caae0f80: 76 69 63 65 73 2f 70 6c 61 74 > > > > > > > 66 6f 72 6d 2f 73 vices/platform/s > > > > > > > [ 14.511376][ T503] Object ffffff87caae0f90: 6f 63 2f 33 33 34 30 30 30 30 > > > > > > > 2e 73 6c 69 6d 2f oc/3340000.slim/ > > > > > > > [ 14.521566][ T503] Object ffffff87caae0fa0: 71 63 6f 6d 2c 73 6c 69 6d 2d > > > > > > > 6e 67 64 2e 31 2f qcom,slim-ngd.1/ > > > > > > > [ 14.531757][ T503] Object ffffff87caae0fb0: 62 74 66 6d 73 6c 69 6d 5f 73 > > > > > > > 6c 61 76 65 00 00 *btfmslim_slave*.. > > > > > > > > > > > > > > > > > > > > > > > > > > > > 499 static int btfm_slim_probe(struct slim_device *slim) > > > > > > I do not see that function in Linus's tree right now. Where does it > > > > > > come from? > > > > > Don't you think, it can be caused by any kernel module. > > > > Kernel code do not protect themselves against other kernel code doing > > > > abusive things. > > > > > > > > Perhaps fix your out-of-tree code to not do this if no in-kernel code is > > > > doing this? Nothing we can do at all about out-of-tree code that we > > > > have never seen. > > > > > > Thanks for the comment, Greg. > > > Why do we make dev_set_name() make it available to module apart from the > > > core ? > > Because busses can be in a module, just like you should be doing :) > > > > Please submit your code upstream for review. > > I still did not get why one should not do device name change if > dev_set_name() is available for anyone to do name change while > kobject_get_path() is running? I do not see where in the kernel today this is happening. What code changes the name of a device like this? > This is possible race, if we know this is the limitation/issue and would be > difficult to address , that is fine and should be documented. Again, why are you doing this? What needs to change the device name in this way and what is it racing with? Without code examples, speculating about all of this is quite difficult to discuss. thanks, greg k-h ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-03-30 14:23 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-30 10:11 Possible case of Race in kobject_get_path() Mukesh Ojha
2022-03-30 10:16 ` Greg KH
[not found] ` <148f4e9b-0584-ccc1-1443-30d41190dfd3@quicinc.com>
2022-03-30 11:01 ` Greg KH
2022-03-30 11:14 ` Mukesh Ojha
2022-03-30 11:19 ` Greg KH
2022-03-30 11:23 ` Mukesh Ojha
2022-03-30 11:31 ` Greg KH
2022-03-30 12:55 ` Mukesh Ojha
2022-03-30 14:22 ` Greg KH
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.