All of lore.kernel.org
 help / color / mirror / Atom feed
From: skakit@codeaurora.org
To: Stephen Boyd <swboyd@chromium.org>
Cc: gregkh@linuxfoundation.org, mgautam@codeaurora.org,
	linux-arm-msm@vger.kernel.org, linux-serial@vger.kernel.org,
	akashast@codeaurora.org, rojay@codeaurora.org,
	msavaliy@qti.qualcomm.com
Subject: Re: [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure
Date: Mon, 24 Feb 2020 20:17:23 +0530	[thread overview]
Message-ID: <52dda56a636e96014b0705a2dcdbbf50@codeaurora.org> (raw)
In-Reply-To: <158213461988.184098.7165493520823815160@swboyd.mtv.corp.google.com>

On 2020-02-19 23:20, Stephen Boyd wrote:
> Quoting skakit@codeaurora.org (2020-02-14 05:17:01)
>> On 2020-02-13 04:19, Stephen Boyd wrote:
>> >     driver_probe_device+0x70/0x140
>> >     __driver_attach_async_helper+0x7c/0xa8
>> >     async_run_entry_fn+0x60/0x178
>> >     process_one_work+0x33c/0x640
>> >     worker_thread+0x2a0/0x470
>> >     kthread+0x128/0x138
>> >     ret_from_fork+0x10/0x18
>> >    Code: 1aca096a 911e0129 b940012b 7100054a (b800450b)
>> I think the most probable explanation of the crash is, set_termios 
>> call
>> is starting RX engine and RX engine is sampling some garbage data from
>> line, and by the time startup is called, we have few data to read.
>> How frequently you are able to see this crash? because internally we 
>> are
>> unable to reproduce it.
> 
> How is set_termios involved? Is that starting the RX side before
> uart_startup() is called? Sorry I haven't looked into the code flow 
> very
> deeply here.
> 
> It seems to happen when the bluetooth driver probes so maybe constantly
> adding and removing the hci_uart module will cause this to happen for
> you? I also run the kernel with many debug options enabled, so maybe 
> try
> enabling all the debug stuff? I see it randomly right now so I'm not
> sure.
> 

In general uart_startup() is called before set_termios() ,but as per the 
crash logs shared, it seems RX engine is active(which can only happen 
from set_termios) before startup() is called.

>> >
>> >
>> > This seems to be the problematic line. We didn't call handle_rx() from
>> > the stop_rx() path before. And this qcom_geni_serial_stop_rx() function
>> > is called from qcom_geni_serial_startup(), but most importantly, we
>> > call
>> > into this function from startup before we allocate memory for the
>> > port->rx_fifo member (see the devm_kcalloc() later in
>> > qcom_geni_serial_port_setup() and see how it's after we stop rx).
>> >
>> > Why do we need to flush the rx buffer by reading it into the software
>> > buffer? Can't we simply ack any outstanding RX interrupts in the
>> > hardware when we're stopping receive?
>> We can't simply ack RX_LAST interrupt, there is a sticky bit that get
>> set on HW level(not exposed to SW) with RX last interrupt. The only 
>> way
>> to clear it is flush out RX FIFO HW buffer. The sticky bit can create
>> problem for future transfers if remained uncleared.
>> How about we allocate buffer to port->rx_fifo in probe itself?
> 
> Ok. If we have to read the rx fifo to flush the buffer then perhaps
> write another function that qcom_geni_serial_stop_rx() can use to
> indicate that it wants to throw away whatever is in the rx fifo? Or
> adjust handle_rx() to check for a NULL fifo pointer and throw it away 
> in
> that case? When we're setting up this device I don't understand why we
> would want to read anything out of the rx fifo that was there before 
> the
> driver started.

We cannot adjust handle_rx() to check for a NULL fifo pointer as reading 
from RX FIFO is mandatory to clear the sticky bit set.  We are passing 
"true" to handle_rx function so it will read and discard whatever data 
present in RX FIFO, it won't send to upper layers. We are planning to 
post a patch which has rx_fifo buffer allocated in probe itself, in 
order to avoid the NULL dereference.

      reply	other threads:[~2020-02-24 14:47 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-11 10:13 [PATCH] tty: serial: qcom_geni_serial: Fix RX cancel command failure satya priya
2020-02-12 22:49 ` Stephen Boyd
2020-02-14 13:17   ` skakit
2020-02-19 17:50     ` Stephen Boyd
2020-02-24 14:47       ` skakit [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52dda56a636e96014b0705a2dcdbbf50@codeaurora.org \
    --to=skakit@codeaurora.org \
    --cc=akashast@codeaurora.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-serial@vger.kernel.org \
    --cc=mgautam@codeaurora.org \
    --cc=msavaliy@qti.qualcomm.com \
    --cc=rojay@codeaurora.org \
    --cc=swboyd@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.