* [PATCH] habanalabs: print to kernel log when reset is finished @ 2019-08-10 12:38 Oded Gabbay 2019-08-10 12:53 ` Greg KH 0 siblings, 1 reply; 7+ messages in thread From: Oded Gabbay @ 2019-08-10 12:38 UTC (permalink / raw) To: linux-kernel, oshpigelman, ttayar; +Cc: gregkh Now that we don't print the queue testing messages, we need to print when the reset is finished so whoever looks at the kernel log will know the reset process was finished successfully and the driver is not stuck. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> --- drivers/misc/habanalabs/device.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c index 9a5926888b99..1fac808c2546 100644 --- a/drivers/misc/habanalabs/device.c +++ b/drivers/misc/habanalabs/device.c @@ -907,6 +907,8 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset, else hdev->soft_reset_cnt++; + dev_info(hdev->dev, "Successfully finished resetting the device\n"); + return 0; out_err: -- 2.17.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] habanalabs: print to kernel log when reset is finished 2019-08-10 12:38 [PATCH] habanalabs: print to kernel log when reset is finished Oded Gabbay @ 2019-08-10 12:53 ` Greg KH 2019-08-10 15:29 ` Oded Gabbay 0 siblings, 1 reply; 7+ messages in thread From: Greg KH @ 2019-08-10 12:53 UTC (permalink / raw) To: Oded Gabbay; +Cc: linux-kernel, oshpigelman, ttayar On Sat, Aug 10, 2019 at 03:38:08PM +0300, Oded Gabbay wrote: > Now that we don't print the queue testing messages, we need to print when > the reset is finished so whoever looks at the kernel log will know the > reset process was finished successfully and the driver is not stuck. > > Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> > --- > drivers/misc/habanalabs/device.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c > index 9a5926888b99..1fac808c2546 100644 > --- a/drivers/misc/habanalabs/device.c > +++ b/drivers/misc/habanalabs/device.c > @@ -907,6 +907,8 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset, > else > hdev->soft_reset_cnt++; > > + dev_info(hdev->dev, "Successfully finished resetting the device\n"); Really? For doing things "properly" there is no need to spam the kernel log. Only spit stuff out if an error happens. thanks, greg k-h ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] habanalabs: print to kernel log when reset is finished 2019-08-10 12:53 ` Greg KH @ 2019-08-10 15:29 ` Oded Gabbay 2019-08-10 17:23 ` Greg KH 0 siblings, 1 reply; 7+ messages in thread From: Oded Gabbay @ 2019-08-10 15:29 UTC (permalink / raw) To: Greg KH; +Cc: Linux-Kernel@Vger. Kernel. Org, Omer Shpigelman, Tomer Tayar On Sat, Aug 10, 2019 at 3:53 PM Greg KH <gregkh@linuxfoundation.org> wrote: > > On Sat, Aug 10, 2019 at 03:38:08PM +0300, Oded Gabbay wrote: > > Now that we don't print the queue testing messages, we need to print when > > the reset is finished so whoever looks at the kernel log will know the > > reset process was finished successfully and the driver is not stuck. > > > > Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> > > --- > > drivers/misc/habanalabs/device.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c > > index 9a5926888b99..1fac808c2546 100644 > > --- a/drivers/misc/habanalabs/device.c > > +++ b/drivers/misc/habanalabs/device.c > > @@ -907,6 +907,8 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset, > > else > > hdev->soft_reset_cnt++; > > > > + dev_info(hdev->dev, "Successfully finished resetting the device\n"); > > Really? For doing things "properly" there is no need to spam the kernel > log. Only spit stuff out if an error happens. > > thanks, > > greg k-h I beg to differ for two reasons: 1. Reset happens very rarely, if at all. So this message (that get printed after reset is done) will definitely not spam the kernel log. 2. When a reset starts we print an appropriate error message. I think it is expected by the user that we will also print if and when the reset has finished successfully. I really believe that lack of this printing might be deceiving for users. Thanks, Oded ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] habanalabs: print to kernel log when reset is finished 2019-08-10 15:29 ` Oded Gabbay @ 2019-08-10 17:23 ` Greg KH 2019-08-10 17:45 ` Oded Gabbay 0 siblings, 1 reply; 7+ messages in thread From: Greg KH @ 2019-08-10 17:23 UTC (permalink / raw) To: Oded Gabbay; +Cc: Linux-Kernel@Vger. Kernel. Org, Omer Shpigelman, Tomer Tayar On Sat, Aug 10, 2019 at 06:29:16PM +0300, Oded Gabbay wrote: > On Sat, Aug 10, 2019 at 3:53 PM Greg KH <gregkh@linuxfoundation.org> wrote: > > > > On Sat, Aug 10, 2019 at 03:38:08PM +0300, Oded Gabbay wrote: > > > Now that we don't print the queue testing messages, we need to print when > > > the reset is finished so whoever looks at the kernel log will know the > > > reset process was finished successfully and the driver is not stuck. > > > > > > Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> > > > --- > > > drivers/misc/habanalabs/device.c | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c > > > index 9a5926888b99..1fac808c2546 100644 > > > --- a/drivers/misc/habanalabs/device.c > > > +++ b/drivers/misc/habanalabs/device.c > > > @@ -907,6 +907,8 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset, > > > else > > > hdev->soft_reset_cnt++; > > > > > > + dev_info(hdev->dev, "Successfully finished resetting the device\n"); > > > > Really? For doing things "properly" there is no need to spam the kernel > > log. Only spit stuff out if an error happens. > > > > thanks, > > > > greg k-h > > I beg to differ for two reasons: > 1. Reset happens very rarely, if at all. So this message (that get > printed after reset is done) will definitely not spam the kernel log. > 2. When a reset starts we print an appropriate error message. I think > it is expected by the user that we will also print if and when the > reset has finished successfully. I really believe that lack of this > printing might be deceiving for users. How is anyone going to parse the kernel log for anything to know if something happens? How do you trigger a reset? Is it done by userspace? If so, just notify them then. thanks, greg k-h ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] habanalabs: print to kernel log when reset is finished 2019-08-10 17:23 ` Greg KH @ 2019-08-10 17:45 ` Oded Gabbay 2019-08-11 7:38 ` Greg KH 0 siblings, 1 reply; 7+ messages in thread From: Oded Gabbay @ 2019-08-10 17:45 UTC (permalink / raw) To: Greg KH; +Cc: Linux-Kernel@Vger. Kernel. Org, Omer Shpigelman, Tomer Tayar On Sat, Aug 10, 2019 at 8:23 PM Greg KH <gregkh@linuxfoundation.org> wrote: > > On Sat, Aug 10, 2019 at 06:29:16PM +0300, Oded Gabbay wrote: > > On Sat, Aug 10, 2019 at 3:53 PM Greg KH <gregkh@linuxfoundation.org> wrote: > > > > > > On Sat, Aug 10, 2019 at 03:38:08PM +0300, Oded Gabbay wrote: > > > > Now that we don't print the queue testing messages, we need to print when > > > > the reset is finished so whoever looks at the kernel log will know the > > > > reset process was finished successfully and the driver is not stuck. > > > > > > > > Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> > > > > --- > > > > drivers/misc/habanalabs/device.c | 2 ++ > > > > 1 file changed, 2 insertions(+) > > > > > > > > diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c > > > > index 9a5926888b99..1fac808c2546 100644 > > > > --- a/drivers/misc/habanalabs/device.c > > > > +++ b/drivers/misc/habanalabs/device.c > > > > @@ -907,6 +907,8 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset, > > > > else > > > > hdev->soft_reset_cnt++; > > > > > > > > + dev_info(hdev->dev, "Successfully finished resetting the device\n"); > > > > > > Really? For doing things "properly" there is no need to spam the kernel > > > log. Only spit stuff out if an error happens. > > > > > > thanks, > > > > > > greg k-h > > > > I beg to differ for two reasons: > > 1. Reset happens very rarely, if at all. So this message (that get > > printed after reset is done) will definitely not spam the kernel log. > > 2. When a reset starts we print an appropriate error message. I think > > it is expected by the user that we will also print if and when the > > reset has finished successfully. I really believe that lack of this > > printing might be deceiving for users. > > How is anyone going to parse the kernel log for anything to know if > something happens? Actually our jenkins server parse the kernel log and look for bad messages from the driver... But the main reason for this, IMO, is for developers (in userspace) that run with a terminal open on the kernel log to see if something bad happens while they run their applications. > > How do you trigger a reset? Is it done by userspace? If so, just > notify them then. Reset can be triggered by two main reasons, but a userspace application is *not* one of them. It can be due to a command submission not finishing in time or from a major error in the device (e.g. double ECC error). That's why I think it is important to tell them the reset + re-initialization flow was finished successfully. Of course all this information, let's call it operational status, can be acquired by the INFO IOCTL the driver provides, but it is *really* convenient, IMHO, to see these type of messages in the kernel log while they happen. > > thanks, > > greg k-h ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] habanalabs: print to kernel log when reset is finished 2019-08-10 17:45 ` Oded Gabbay @ 2019-08-11 7:38 ` Greg KH 2019-08-11 7:40 ` Oded Gabbay 0 siblings, 1 reply; 7+ messages in thread From: Greg KH @ 2019-08-11 7:38 UTC (permalink / raw) To: Oded Gabbay; +Cc: Linux-Kernel@Vger. Kernel. Org, Omer Shpigelman, Tomer Tayar On Sat, Aug 10, 2019 at 08:45:54PM +0300, Oded Gabbay wrote: > On Sat, Aug 10, 2019 at 8:23 PM Greg KH <gregkh@linuxfoundation.org> wrote: > > > > On Sat, Aug 10, 2019 at 06:29:16PM +0300, Oded Gabbay wrote: > > > On Sat, Aug 10, 2019 at 3:53 PM Greg KH <gregkh@linuxfoundation.org> wrote: > > > > > > > > On Sat, Aug 10, 2019 at 03:38:08PM +0300, Oded Gabbay wrote: > > > > > Now that we don't print the queue testing messages, we need to print when > > > > > the reset is finished so whoever looks at the kernel log will know the > > > > > reset process was finished successfully and the driver is not stuck. > > > > > > > > > > Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> > > > > > --- > > > > > drivers/misc/habanalabs/device.c | 2 ++ > > > > > 1 file changed, 2 insertions(+) > > > > > > > > > > diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c > > > > > index 9a5926888b99..1fac808c2546 100644 > > > > > --- a/drivers/misc/habanalabs/device.c > > > > > +++ b/drivers/misc/habanalabs/device.c > > > > > @@ -907,6 +907,8 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset, > > > > > else > > > > > hdev->soft_reset_cnt++; > > > > > > > > > > + dev_info(hdev->dev, "Successfully finished resetting the device\n"); > > > > > > > > Really? For doing things "properly" there is no need to spam the kernel > > > > log. Only spit stuff out if an error happens. > > > > > > > > thanks, > > > > > > > > greg k-h > > > > > > I beg to differ for two reasons: > > > 1. Reset happens very rarely, if at all. So this message (that get > > > printed after reset is done) will definitely not spam the kernel log. > > > 2. When a reset starts we print an appropriate error message. I think > > > it is expected by the user that we will also print if and when the > > > reset has finished successfully. I really believe that lack of this > > > printing might be deceiving for users. > > > > How is anyone going to parse the kernel log for anything to know if > > something happens? > > Actually our jenkins server parse the kernel log and look for bad > messages from the driver... > But the main reason for this, IMO, is for developers (in userspace) > that run with a terminal open on the kernel log to see if something > bad happens while they run their applications. > > > > > How do you trigger a reset? Is it done by userspace? If so, just > > notify them then. > > Reset can be triggered by two main reasons, but a userspace > application is *not* one of them. > It can be due to a command submission not finishing in time or from a > major error in the device (e.g. double ECC error). > That's why I think it is important to tell them the reset + > re-initialization flow was finished successfully. > Of course all this information, let's call it operational status, can > be acquired by the INFO IOCTL the driver provides, but it is *really* > convenient, IMHO, to see these type of messages in the kernel log > while they happen. Ah, yes, if this is something that userspace can not trigger then it's ok. But you should then make it a 'dev_warn' to match up with the message level when the reset starts. thanks, greg k-h ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] habanalabs: print to kernel log when reset is finished 2019-08-11 7:38 ` Greg KH @ 2019-08-11 7:40 ` Oded Gabbay 0 siblings, 0 replies; 7+ messages in thread From: Oded Gabbay @ 2019-08-11 7:40 UTC (permalink / raw) To: Greg KH; +Cc: Linux-Kernel@Vger. Kernel. Org, Omer Shpigelman, Tomer Tayar On Sun, Aug 11, 2019 at 10:38 AM Greg KH <gregkh@linuxfoundation.org> wrote: > > On Sat, Aug 10, 2019 at 08:45:54PM +0300, Oded Gabbay wrote: > > On Sat, Aug 10, 2019 at 8:23 PM Greg KH <gregkh@linuxfoundation.org> wrote: > > > > > > On Sat, Aug 10, 2019 at 06:29:16PM +0300, Oded Gabbay wrote: > > > > On Sat, Aug 10, 2019 at 3:53 PM Greg KH <gregkh@linuxfoundation.org> wrote: > > > > > > > > > > On Sat, Aug 10, 2019 at 03:38:08PM +0300, Oded Gabbay wrote: > > > > > > Now that we don't print the queue testing messages, we need to print when > > > > > > the reset is finished so whoever looks at the kernel log will know the > > > > > > reset process was finished successfully and the driver is not stuck. > > > > > > > > > > > > Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> > > > > > > --- > > > > > > drivers/misc/habanalabs/device.c | 2 ++ > > > > > > 1 file changed, 2 insertions(+) > > > > > > > > > > > > diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c > > > > > > index 9a5926888b99..1fac808c2546 100644 > > > > > > --- a/drivers/misc/habanalabs/device.c > > > > > > +++ b/drivers/misc/habanalabs/device.c > > > > > > @@ -907,6 +907,8 @@ int hl_device_reset(struct hl_device *hdev, bool hard_reset, > > > > > > else > > > > > > hdev->soft_reset_cnt++; > > > > > > > > > > > > + dev_info(hdev->dev, "Successfully finished resetting the device\n"); > > > > > > > > > > Really? For doing things "properly" there is no need to spam the kernel > > > > > log. Only spit stuff out if an error happens. > > > > > > > > > > thanks, > > > > > > > > > > greg k-h > > > > > > > > I beg to differ for two reasons: > > > > 1. Reset happens very rarely, if at all. So this message (that get > > > > printed after reset is done) will definitely not spam the kernel log. > > > > 2. When a reset starts we print an appropriate error message. I think > > > > it is expected by the user that we will also print if and when the > > > > reset has finished successfully. I really believe that lack of this > > > > printing might be deceiving for users. > > > > > > How is anyone going to parse the kernel log for anything to know if > > > something happens? > > > > Actually our jenkins server parse the kernel log and look for bad > > messages from the driver... > > But the main reason for this, IMO, is for developers (in userspace) > > that run with a terminal open on the kernel log to see if something > > bad happens while they run their applications. > > > > > > > > How do you trigger a reset? Is it done by userspace? If so, just > > > notify them then. > > > > Reset can be triggered by two main reasons, but a userspace > > application is *not* one of them. > > It can be due to a command submission not finishing in time or from a > > major error in the device (e.g. double ECC error). > > That's why I think it is important to tell them the reset + > > re-initialization flow was finished successfully. > > Of course all this information, let's call it operational status, can > > be acquired by the INFO IOCTL the driver provides, but it is *really* > > convenient, IMHO, to see these type of messages in the kernel log > > while they happen. > > Ah, yes, if this is something that userspace can not trigger then it's > ok. But you should then make it a 'dev_warn' to match up with the > message level when the reset starts. > > thanks, > > greg k-h Sure, np. Thanks, Oded ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-08-11 7:40 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-08-10 12:38 [PATCH] habanalabs: print to kernel log when reset is finished Oded Gabbay 2019-08-10 12:53 ` Greg KH 2019-08-10 15:29 ` Oded Gabbay 2019-08-10 17:23 ` Greg KH 2019-08-10 17:45 ` Oded Gabbay 2019-08-11 7:38 ` Greg KH 2019-08-11 7:40 ` Oded Gabbay
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox