From mboxrd@z Thu Jan 1 00:00:00 1970
From: Per Förlin
Subject: Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios
Date: Thu, 25 Oct 2012 17:02:53 +0200
Message-ID: <5089549D.3060507@stericsson.com>
References: <5088206C.7080101@stericsson.com> <50893E88.9000908@codeaurora.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Return-path:
Received: from eu1sys200aog112.obsmtp.com ([207.126.144.133]:45674 "EHLO eu1sys200aog112.obsmtp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933842Ab2JYPDU (ORCPT ); Thu, 25 Oct 2012 11:03:20 -0400
In-Reply-To: <50893E88.9000908@codeaurora.org>
Sender: linux-mmc-owner@vger.kernel.org
List-Id: linux-mmc@vger.kernel.org
To: Konstantin Dorfman
Cc: Per Forlin , "cjb@laptop.org" , "linux-mmc@vger.kernel.org"

On 10/25/2012 03:28 PM, Konstantin Dorfman wrote:
> On 10/24/2012 07:07 PM, Per Förlin wrote:
>> On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
>>> Hello Per,
>>>
>>> On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
>>>>> When mmcqd reports completion of a request there should be
>>>>> a context switch to allow the insertion of the next read ahead BIOs
>>>>> into the block layer. Since mmcqd tries to fetch another request
>>>>> immediately after the completion of the previous request, it gets NULL
>>>>> and starts waiting for the completion of the previous request.
>>>>> This wait on completion gives the FS the opportunity to insert the next
>>>>> request, but the MMC layer is already blocked on the previous request
>>>>> completion and is not aware of the new request waiting to be fetched.
>>>> I thought that I could trigger a context switch in order to give
>>>> execution time for the FS to add the new request to the MMC queue.
>>>> I made a simple hack to call yield() in case the request gets NULL. I
>>>> thought it might give the FS layer enough time to add a new request to
>>>> the MMC queue. This would not delay the MMC transfer since the yield()
>>>> is done in parallel with an ongoing transfer. Anyway, it was just meant
>>>> to be a simple test.
>>>>
>>>> One yield was not enough. Just as a sanity check I added an msleep as
>>>> well, and that was enough to let the FS add a new request.
>>>> Would it be possible to gain throughput by delaying the fetch of a new
>>>> request, to avoid unnecessary NULL requests?
>>>>
>>>> If (ongoing request is read AND size is max read ahead AND new request
>>>> is NULL) yield();
>>>>
>>>> BR
>>>> Per
>>> We did the same experiment and it will not give maximum possible
>>> performance. There is no guarantee that the context switch that was
>>> manually caused by the MMC layer comes just in time: when it comes
>>> early, the next fetch still results in NULL; when it comes late, we
>>> miss the possibility to fetch/prepare a new request.
>>>
>>> Any delay in fetching a new request after it has arrived hurts
>>> throughput and latency.
>>>
>>> The solution we are talking about here will fix not only the situation
>>> with the FS read ahead mechanism, but will also remove the penalty of
>>> the MMC context waiting on completion while a new request arrives.
>>>
>>> Thanks,
>>>
>> It seems strange that the block layer cannot keep up with relatively
>> slow flash media devices. There must be a limitation on the number of
>> outstanding requests towards MMC.
>> I need to make up my mind whether the MMC framework or the block layer
>> is the best place to address this issue. I have started to look into
>> the block layer code, but it will take some time to dig out the
>> relevant parts.
>>
>> BR
>> Per
>>
> The root cause of the issue is that the current design is an incomplete
> solution to the well-known producer-consumer problem (the producer is
> the block layer, the consumer is the MMC layer).
> The classic definition assumes a fixed-size buffer; in our case we have
> a queue, so the producer is always able to put a new request into the
> queue.
> The consumer context is blocked when both buffers (curr and prev) are
> busy (the first has started its execution on the bus, the second is
> fetched and waiting for the first).
This happens, but I thought that the block layer would continue to add
requests to the MMC queue while the consumer is busy.
When the consumer fetches a request from the queue again, there should be
several requests available in the queue, but there is only one.
> The producer context is considered blocked when the FS (or other bio
> sources) has no requests to put into the queue.
Does the block layer ever wait for outstanding requests to finish? Could
this be another reason why the producer doesn't add new requests to the
MMC queue?
> To maximize performance, two notifications should be used:
> 1. The producer notifies the consumer about a new item to process.
> 2. The consumer notifies the producer about free space.
>
> In our case the 2nd notification is not needed since, as I said before,
> there is always free space in the queue.
> The 1st notification does not exist, i.e. the block layer has no way to
> notify the MMC layer that a new request has arrived.
>
> What you are suggesting resolves the specific case where the FS
> READ_AHEAD mechanism causes delays in producing new requests.
> You can probably resolve this specific case, but do you have a guarantee
> that this is the only case that causes delays between new request
> events?
> Flash memory devices are constantly improving on all levels these days:
> NAND, firmware, bus speed and host controller capabilities; this makes
> any yield/sleep/timeout solution only a temporary hack.
I never meant yield or sleep to be a permanent fix.
I was only curious how it would affect the performance, in order to gain a
better understanding of the root cause.
My impression is that even if the SD card is very slow you will see the
same effect. The behavior of the block layer in this case is not related
to the speed of the flash memory.
On a slow card the MMC queue runs empty just like it does for a fast eMMC.
According to your reasoning, the block layer should have a better chance
to feed the MMC queue if the card is slow (more time for the block layer
to prepare the next request).

BR
Per
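To make the missing first notification concrete, here is a minimal
user-space sketch (Python, not kernel code) of the wait-for-either-event
pattern the thread is converging on: instead of blocking solely on
completion of the ongoing transfer, the consumer wakes on either that
completion or the insertion of a new request. All names here (MmcQueue,
wait_for_event, etc.) are illustrative stand-ins, not the actual
mmcqd/mmc_blk API.

```python
import threading

class MmcQueue:
    """Toy model of the mmcqd fetch loop. The consumer blocks until
    either the ongoing transfer completes or the producer (block
    layer) inserts a new request and signals the condition."""

    def __init__(self):
        self.cond = threading.Condition()
        self.transfer_done = False
        self.requests = []

    def insert(self, req):
        # Producer side: enqueue a request and wake the consumer.
        # This is the notification missing from the current design.
        with self.cond:
            self.requests.append(req)
            self.cond.notify()

    def complete_transfer(self):
        # Completion path: the ongoing transfer finished on the bus.
        with self.cond:
            self.transfer_done = True
            self.cond.notify()

    def wait_for_event(self):
        # Consumer side: the fetch returned NULL, so wait for either
        # event instead of only the transfer completion.
        with self.cond:
            while not self.transfer_done and not self.requests:
                self.cond.wait()
            return "new_request" if self.requests else "transfer_done"

# A read-ahead request arrives while the consumer is waiting:
q = MmcQueue()
threading.Timer(0.05, q.insert, args=("read-ahead bio",)).start()
print(q.wait_for_event())  # -> new_request (woken by insertion)
```

The key point of the sketch is the wait loop checking two exit
conditions: the consumer can start preparing the new request as soon as
it arrives, rather than idling until the previous completion.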