public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed
* Few problems in mtd system
@ 2009-12-17 22:08 Maxim Levitsky
  2009-12-19 17:17 ` Maxim Levitsky
  0 siblings, 1 reply; 14+ messages in thread
From: Maxim Levitsky @ 2009-12-17 22:08 UTC (permalink / raw)
  To: linux-mtd

Hi,

I am writing a driver for an xD controller found in my notebook.
xD card is 99.9% nand chip, and the controller fits into the nand
subsystem nicely.

However I face few problems, out of which about half is due to that
'99.9' and rest I believe  are bugs in the nand/mtd system.

First of all I design a two layer driver.
First layer is mtd driver which plugs into nand system.
Second layer is xD specific FTL that already exists (and works) in R/O
form in mtd/ssfdc.c
(Of course you are free to use another FTL or UBI on top of mtd driver,
but you won't be able to read the card in xd readers)

I am using 2.6.32 vanilla, so maybe something was already fixed.

First, the biggest problem is that del_mtd_device fails if the mtd
device is open. 

However the user can pull the card out of the slot anytime, so this has
to be handled gracefully.
Currently I can't even test for that condition, because nand_release
discards the return value.


Secondary, the hardware I deal with supports hardware ecc, and will
return error bit location if correctable error found.
Thus I use the 'syndrome' set of ecc functions.

However, the nand_write_oob_syndrome is broken. It uses NAND_CMD_READ0
with ecc step size offset (this is 512 or minimum 256 in common cases),
but this offset can't even be sent over 8 bit bus.



Another problem I discovered recently is that bad block handling other
that using bad block table is broken.
Maybe I should use it too, but scan of whole device takes some
significant time (~1 sec) and I don't like that.

However, when I attempt to implement custom .block_bad/.block_markbad, I
run into trouble about how to read oob from hardware.
Default functions use lots of static helpers, and only way to to that in
standard way is to use mtd->read_oob, but that leads to deadlock when
doing the erase.
nand_erase_nand does nand_get_device and it tests for bad block which
end in mtd->read_oob (implemented as nand_read_oob and it does
nand_get_device too...)

Or I can use very low level functions, but this skips taking locks, and
could lead to races.

Note that although maybe I could avoid the deadlock by using the bad
block table, I still need these functions to determine/write the bad
status because there is no bad block table on the flash.



Yet another problem I found is inside add_mtd_blktrans_dev.
This function takes mtd_table_mutex but never releases it.
This sounds fishy, and deadlocks when it calls 'add_disk(gd);'
This function (I think in case partitions are detected on device) opens
block device, and this takes that lock too
(blktrans_open->get_mtd_device)



And lastly, here are problems due to slight incompatibilities:

First, its possible to use small page nand (256 bytes/page), but then
unfortunately oob contents of every odd and even page are 'related'
(odd page contains ecc of both pages).
This is very hard to support, and since it only present in SmartMedia,
and to read such old card you need an adapter (which I not sure is
possible to buy), I'll skip it.


Another problem is that chip does report correct ID, but doesn't report
valid settings in extended ID, thus chips > 128 MB aren't detected.
Also there is whole class of mask rom devices, and they have non nand
IDs.
Not sure its valuable to support mask roms (it is also smartmedia only
feature too), but erasesize/pagesize have to be hardcoded. 
I think that best solution it to allow to pass custom list of device IDs
to nand_scan.


Last problem I discovered recently is that the chip always reports via
status command that it is write protected. This is maybe a card bug.
Small workaround for that in the nand system isn't a problem.
I will test if that report is really bogus though.
The hardware has seperate register to check write-protect status, and
card I suspect has a write-protect seal.


Currently, with workarounds to above problems, my driver does read the
xD card (using ssfdc.c)

Best regards,
Maxim Levitsky

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-17 22:08 Few problems in mtd system Maxim Levitsky
@ 2009-12-19 17:17 ` Maxim Levitsky
  2009-12-19 21:06   ` Maxim Levitsky
  0 siblings, 1 reply; 14+ messages in thread
From: Maxim Levitsky @ 2009-12-19 17:17 UTC (permalink / raw)
  To: linux-mtd; +Cc: joern, arnd, Alex Dubov

On Fri, 2009-12-18 at 00:08 +0200, Maxim Levitsky wrote: 
> Hi,
> 
> I am writing a driver for an xD controller found in my notebook.
> xD card is 99.9% nand chip, and the controller fits into the nand
> subsystem nicely.
> 
> However I face few problems, out of which about half is due to that
> '99.9' and rest I believe  are bugs in the nand/mtd system.
> 
> First of all I design a two layer driver.
> First layer is mtd driver which plugs into nand system.
> Second layer is xD specific FTL that already exists (and works) in R/O
> form in mtd/ssfdc.c
> (Of course you are free to use another FTL or UBI on top of mtd driver,
> but you won't be able to read the card in xd readers)
> 
> I am using 2.6.32 vanilla, so maybe something was already fixed.
> 
> First, the biggest problem is that del_mtd_device fails if the mtd
> device is open. 
> 
> However the user can pull the card out of the slot anytime, so this has
> to be handled gracefully.
> Currently I can't even test for that condition, because nand_release
> discards the return value.
> 
> 
> Secondary, the hardware I deal with supports hardware ecc, and will
> return error bit location if correctable error found.
> Thus I use the 'syndrome' set of ecc functions.
> 
> However, the nand_write_oob_syndrome is broken. It uses NAND_CMD_READ0
> with ecc step size offset (this is 512 or minimum 256 in common cases),
> but this offset can't even be sent over 8 bit bus.
> 
> 
> 
> Another problem I discovered recently is that bad block handling other
> that using bad block table is broken.
> Maybe I should use it too, but scan of whole device takes some
> significant time (~1 sec) and I don't like that.
> 
> However, when I attempt to implement custom .block_bad/.block_markbad, I
> run into trouble about how to read oob from hardware.
> Default functions use lots of static helpers, and only way to to that in
> standard way is to use mtd->read_oob, but that leads to deadlock when
> doing the erase.
> nand_erase_nand does nand_get_device and it tests for bad block which
> end in mtd->read_oob (implemented as nand_read_oob and it does
> nand_get_device too...)
> 
> Or I can use very low level functions, but this skips taking locks, and
> could lead to races.
> 
> Note that although maybe I could avoid the deadlock by using the bad
> block table, I still need these functions to determine/write the bad
> status because there is no bad block table on the flash.
> 
> 
> 
> Yet another problem I found is inside add_mtd_blktrans_dev.
> This function takes mtd_table_mutex but never releases it.
> This sounds fishy, and deadlocks when it calls 'add_disk(gd);'
> This function (I think in case partitions are detected on device) opens
> block device, and this takes that lock too
> (blktrans_open->get_mtd_device)
> 
> 
> 
> And lastly, here are problems due to slight incompatibilities:
> 
> First, its possible to use small page nand (256 bytes/page), but then
> unfortunately oob contents of every odd and even page are 'related'
> (odd page contains ecc of both pages).
> This is very hard to support, and since it only present in SmartMedia,
> and to read such old card you need an adapter (which I not sure is
> possible to buy), I'll skip it.
> 
> 
> Another problem is that chip does report correct ID, but doesn't report
> valid settings in extended ID, thus chips > 128 MB aren't detected.
> Also there is whole class of mask rom devices, and they have non nand
> IDs.
> Not sure its valuable to support mask roms (it is also smartmedia only
> feature too), but erasesize/pagesize have to be hardcoded. 
> I think that best solution it to allow to pass custom list of device IDs
> to nand_scan.
> 
> 
> Last problem I discovered recently is that the chip always reports via
> status command that it is write protected. This is maybe a card bug.
> Small workaround for that in the nand system isn't a problem.
> I will test if that report is really bogus though.
> The hardware has seperate register to check write-protect status, and
> card I suspect has a write-protect seal.
> 
> 
> Currently, with workarounds to above problems, my driver does read the
> xD card (using ssfdc.c)
> 
> Best regards,
> Maxim Levitsky


One small and a bit disappointing note.
I found out that ether controller or the card (I strongly suspect the
later) pokes at OOB contents that are written.

For example writing invalid sector number, results in no write, and the
sector reads random data until card is reinserted, at which point read
returns same contents as before.

Also setting 'reserved' field to anything but 0xFFFFFFFF results in it
being written as 0xFFFFFFFF.

I strongly suspect that xD card I have (1 GB) is not 'real' nand card.
Probably inside the card there is same embedded cpu, just as inside all
other cards. 
It just emulates the nand command set, and thus pokes at oob contents
(and doesn't store it fully ether....)

Will be glad to hear from you about rest of the questions I asked.


Best regards,
Maxim Levitsky

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-19 17:17 ` Maxim Levitsky
@ 2009-12-19 21:06   ` Maxim Levitsky
  2009-12-19 21:58     ` Alex Dubov
  0 siblings, 1 reply; 14+ messages in thread
From: Maxim Levitsky @ 2009-12-19 21:06 UTC (permalink / raw)
  To: linux-mtd; +Cc: joern, arnd, Alex Dubov

On Sat, 2009-12-19 at 19:17 +0200, Maxim Levitsky wrote: 
> On Fri, 2009-12-18 at 00:08 +0200, Maxim Levitsky wrote: 
> > Hi,
> > 
> > I am writing a driver for an xD controller found in my notebook.
> > xD card is 99.9% nand chip, and the controller fits into the nand
> > subsystem nicely.
> > 
> > However I face few problems, out of which about half is due to that
> > '99.9' and rest I believe  are bugs in the nand/mtd system.
> > 
> > First of all I design a two layer driver.
> > First layer is mtd driver which plugs into nand system.
> > Second layer is xD specific FTL that already exists (and works) in R/O
> > form in mtd/ssfdc.c
> > (Of course you are free to use another FTL or UBI on top of mtd driver,
> > but you won't be able to read the card in xd readers)
> > 
> > I am using 2.6.32 vanilla, so maybe something was already fixed.
> > 
> > First, the biggest problem is that del_mtd_device fails if the mtd
> > device is open. 
> > 
> > However the user can pull the card out of the slot anytime, so this has
> > to be handled gracefully.
> > Currently I can't even test for that condition, because nand_release
> > discards the return value.
> > 
> > 
> > Secondary, the hardware I deal with supports hardware ecc, and will
> > return error bit location if correctable error found.
> > Thus I use the 'syndrome' set of ecc functions.
> > 
> > However, the nand_write_oob_syndrome is broken. It uses NAND_CMD_READ0
> > with ecc step size offset (this is 512 or minimum 256 in common cases),
> > but this offset can't even be sent over 8 bit bus.
> > 
> > 
> > 
> > Another problem I discovered recently is that bad block handling other
> > that using bad block table is broken.
> > Maybe I should use it too, but scan of whole device takes some
> > significant time (~1 sec) and I don't like that.
> > 
> > However, when I attempt to implement custom .block_bad/.block_markbad, I
> > run into trouble about how to read oob from hardware.
> > Default functions use lots of static helpers, and only way to to that in
> > standard way is to use mtd->read_oob, but that leads to deadlock when
> > doing the erase.
> > nand_erase_nand does nand_get_device and it tests for bad block which
> > end in mtd->read_oob (implemented as nand_read_oob and it does
> > nand_get_device too...)
> > 
> > Or I can use very low level functions, but this skips taking locks, and
> > could lead to races.
> > 
> > Note that although maybe I could avoid the deadlock by using the bad
> > block table, I still need these functions to determine/write the bad
> > status because there is no bad block table on the flash.
> > 
> > 
> > 
> > Yet another problem I found is inside add_mtd_blktrans_dev.
> > This function takes mtd_table_mutex but never releases it.
> > This sounds fishy, and deadlocks when it calls 'add_disk(gd);'
> > This function (I think in case partitions are detected on device) opens
> > block device, and this takes that lock too
> > (blktrans_open->get_mtd_device)
> > 
> > 
> > 
> > And lastly, here are problems due to slight incompatibilities:
> > 
> > First, its possible to use small page nand (256 bytes/page), but then
> > unfortunately oob contents of every odd and even page are 'related'
> > (odd page contains ecc of both pages).
> > This is very hard to support, and since it only present in SmartMedia,
> > and to read such old card you need an adapter (which I not sure is
> > possible to buy), I'll skip it.
> > 
> > 
> > Another problem is that chip does report correct ID, but doesn't report
> > valid settings in extended ID, thus chips > 128 MB aren't detected.
> > Also there is whole class of mask rom devices, and they have non nand
> > IDs.
> > Not sure its valuable to support mask roms (it is also smartmedia only
> > feature too), but erasesize/pagesize have to be hardcoded. 
> > I think that best solution it to allow to pass custom list of device IDs
> > to nand_scan.
> > 
> > 
> > Last problem I discovered recently is that the chip always reports via
> > status command that it is write protected. This is maybe a card bug.
> > Small workaround for that in the nand system isn't a problem.
> > I will test if that report is really bogus though.
> > The hardware has seperate register to check write-protect status, and
> > card I suspect has a write-protect seal.
> > 
> > 
> > Currently, with workarounds to above problems, my driver does read the
> > xD card (using ssfdc.c)
> > 
> > Best regards,
> > Maxim Levitsky
> 
> 
> One small and a bit disappointing note.
> I found out that ether controller or the card (I strongly suspect the
> later) pokes at OOB contents that are written.
> 
> For example writing invalid sector number, results in no write, and the
> sector reads random data until card is reinserted, at which point read
> returns same contents as before.
> 
> Also setting 'reserved' field to anything but 0xFFFFFFFF results in it
> being written as 0xFFFFFFFF
Ah, this is even worse.
All OOB writes (except write of page+oob with valid content) result in
same funny behavier (junk is returned doing read, and after card
reinsert, contents return to original state)

Even erase fails in same way.
This is after a erase, block does became full of 0xFFs, but once you
insert/remove the card, old contents are back...

And yes, its possible to write a page many times, without doing the
erase....


It could be all some error in my driver, but now I am very confident it
executes same commands as original driver, and works correctly.
Card just emulates the nand command set.


Seeing all that things, I guess I drop one of my goals to make a generic
nand driver.

My goals were to write a driver for general use (this is to read the
picture cards, and as an option use the card as raw nand flash for
experiments with flash filesystems, UBI... etc...)

Due to all these problems, I guess special self contained driver
(similar to what alex wrote) is better.

I have type M or M+ card. I suspect that that type has this problems.
Standard XD cards probably are fine.




> .
> 
> I strongly suspect that xD card I have (1 GB) is not 'real' nand card.
> Probably inside the card there is same embedded cpu, just as inside all
> other cards. 
> It just emulates the nand command set, and thus pokes at oob contents
> (and doesn't store it fully ether....)
> 
> Will be glad to hear from you about rest of the questions I asked.
> 
> 
> Best regards,
> Maxim Levitsky
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-19 21:06   ` Maxim Levitsky
@ 2009-12-19 21:58     ` Alex Dubov
  2009-12-20 20:59       ` Maxim Levitsky
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Dubov @ 2009-12-19 21:58 UTC (permalink / raw)
  To: linux-mtd, Maxim Levitsky; +Cc: joern, arnd

This sort of matches my experience.
xD cards do behave in all kinds of interesting ways.
There are issues with internal buffering and
controllers.




      

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-19 21:58     ` Alex Dubov
@ 2009-12-20 20:59       ` Maxim Levitsky
  2009-12-21 14:00         ` Jörn Engel
  2010-01-07  7:11         ` Artem Bityutskiy
  0 siblings, 2 replies; 14+ messages in thread
From: Maxim Levitsky @ 2009-12-20 20:59 UTC (permalink / raw)
  To: Alex Dubov; +Cc: joern, linux-mtd, arnd

Now it all clear.
Here is the stupid emulated nand interface of the olympus 1 GB card:


-> page read returns sector contents + cooked oob which includes LBA and
'always correct ecc'

-> oob read (0x50 ....) returns only LBA. ecc is set 0xFFFFF.
Only LBA in fact is returned.


-> erase is faked. It does put 0xFFs into erase block, but actual flash
isn't touched at all.


-> page write only works if LBA is set to correct number in the oob.
Nothing else matters. If invalid ecc is written, page writes correctly
and returns 'correct' ecc. In other words XD's ecc is not used.
Of course it possible to do many page writes, without doing erases,
writing arbitrary data.


-> oob write doesn't do anything but confuses the card. It returns
garbage on following reads, but flash isn't touched ether.


Writing different physical sector with same LBA makes other one magicly
disappear...
Of course you understand why....

However, individual page writes do work correctly.

I suspect that older non type M cards didn't have such emulation, but
were real nand chips, bacause there are some references on the web about
using XD card as a nand chip replacement.
Such card as I have will really will make very poor nand replacement...


Folks, could you review my other questions about bugs in mtd core, and
tell your opinion?


Best regards,
Maxim Levitsky

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-20 20:59       ` Maxim Levitsky
@ 2009-12-21 14:00         ` Jörn Engel
  2009-12-22 18:03           ` Maxim Levitsky
  2010-01-07  7:11         ` Artem Bityutskiy
  1 sibling, 1 reply; 14+ messages in thread
From: Jörn Engel @ 2009-12-21 14:00 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linux-mtd, arnd, Alex Dubov

On Sun, 20 December 2009 22:59:55 +0200, Maxim Levitsky wrote:
> 
> I suspect that older non type M cards didn't have such emulation, but
> were real nand chips, bacause there are some references on the web about
> using XD card as a nand chip replacement.
> Such card as I have will really will make very poor nand replacement...

If you want to know whether the cards or the readers are to blame, you
can try to buy an old alauda reader on ebay.  It is too slow to be
useful for most purposes, but I believe it did give me full access with
my cards.

> Folks, could you review my other questions about bugs in mtd core, and
> tell your opinion?

I didn't quite understand the 32bit DMA question and am next to clueless
wrt. hotplug.  Sorry.

Jörn

-- 
Joern's library part 6:
http://www.gzip.org/zlib/feldspar.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-21 14:00         ` Jörn Engel
@ 2009-12-22 18:03           ` Maxim Levitsky
  2009-12-22 20:32             ` Michael Trimarchi
  0 siblings, 1 reply; 14+ messages in thread
From: Maxim Levitsky @ 2009-12-22 18:03 UTC (permalink / raw)
  To: Jörn Engel; +Cc: linux-mtd, arnd, Alex Dubov

On Mon, 2009-12-21 at 15:00 +0100, Jörn Engel wrote: 
> On Sun, 20 December 2009 22:59:55 +0200, Maxim Levitsky wrote:
> > 
> > I suspect that older non type M cards didn't have such emulation, but
> > were real nand chips, bacause there are some references on the web about
> > using XD card as a nand chip replacement.
> > Such card as I have will really will make very poor nand replacement...
> 
> If you want to know whether the cards or the readers are to blame, you
> can try to buy an old alauda reader on ebay.  It is too slow to be
> useful for most purposes, but I believe it did give me full access with
> my cards.
> 
> > Folks, could you review my other questions about bugs in mtd core, and
> > tell your opinion?

I have several problems, more correctly bugs in mtd system I have to fix
to make my driver work.

Lets start from  the problem I face now.

Problem is that add_mtd_blktrans_dev is called with mtd_table_mutex
locked, but it calls add_disk which opens the block device if you
specify that you need partitions on the disk. Open routine 'looks' at
mtd table using get_mtd_device, and thus deadlocks.

Do you have a clue how to fix that so it won't break anything?


Best regards,
Maxim Levitsky

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-22 18:03           ` Maxim Levitsky
@ 2009-12-22 20:32             ` Michael Trimarchi
  2009-12-22 21:42               ` Maxim Levitsky
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Trimarchi @ 2009-12-22 20:32 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: Jörn Engel, linux-mtd, Alex Dubov, arnd

Maxim Levitsky wrote:
> On Mon, 2009-12-21 at 15:00 +0100, Jörn Engel wrote: 
>   
>> On Sun, 20 December 2009 22:59:55 +0200, Maxim Levitsky wrote:
>>     
>>> I suspect that older non type M cards didn't have such emulation, but
>>> were real nand chips, bacause there are some references on the web about
>>> using XD card as a nand chip replacement.
>>> Such card as I have will really will make very poor nand replacement...
>>>       
>> If you want to know whether the cards or the readers are to blame, you
>> can try to buy an old alauda reader on ebay.  It is too slow to be
>> useful for most purposes, but I believe it did give me full access with
>> my cards.
>>
>>     
>>> Folks, could you review my other questions about bugs in mtd core, and
>>> tell your opinion?
>>>       
>
> I have several problems, more correctly bugs in mtd system I have to fix
> to make my driver work.
>
> Lets start from  the problem I face now.
>
> Problem is that add_mtd_blktrans_dev is called with mtd_table_mutex
> locked, but it calls add_disk which opens the block device if you
> specify that you need partitions on the disk. Open routine 'looks' at
> mtd table using get_mtd_device, and thus deadlocks.
>
> Do you have a clue how to fix that so it won't break anything?
>   
I try to follow the call, can you send the block stack trace?
just change the lock on get_mtd_device with a try_lock and BUG on
if it is taken

Michael
>
> Best regards,
> Maxim Levitsky
>
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
>   

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-22 20:32             ` Michael Trimarchi
@ 2009-12-22 21:42               ` Maxim Levitsky
  2009-12-22 23:39                 ` Michael Trimarchi
  0 siblings, 1 reply; 14+ messages in thread
From: Maxim Levitsky @ 2009-12-22 21:42 UTC (permalink / raw)
  To: Michael Trimarchi; +Cc: Jörn Engel, linux-mtd, Alex Dubov, arnd

On Tue, 2009-12-22 at 21:32 +0100, Michael Trimarchi wrote: 
> Maxim Levitsky wrote:
> > On Mon, 2009-12-21 at 15:00 +0100, Jörn Engel wrote: 
> >   
> >> On Sun, 20 December 2009 22:59:55 +0200, Maxim Levitsky wrote:
> >>     
> >>> I suspect that older non type M cards didn't have such emulation, but
> >>> were real nand chips, bacause there are some references on the web about
> >>> using XD card as a nand chip replacement.
> >>> Such card as I have will really will make very poor nand replacement...
> >>>       
> >> If you want to know whether the cards or the readers are to blame, you
> >> can try to buy an old alauda reader on ebay.  It is too slow to be
> >> useful for most purposes, but I believe it did give me full access with
> >> my cards.
> >>
> >>     
> >>> Folks, could you review my other questions about bugs in mtd core, and
> >>> tell your opinion?
> >>>       
> >
> > I have several problems, more correctly bugs in mtd system I have to fix
> > to make my driver work.
> >
> > Lets start from  the problem I face now.
> >
> > Problem is that add_mtd_blktrans_dev is called with mtd_table_mutex
> > locked, but it calls add_disk which opens the block device if you
> > specify that you need partitions on the disk. Open routine 'looks' at
> > mtd table using get_mtd_device, and thus deadlocks.
> >
> > Do you have a clue how to fix that so it won't break anything?
> >   
> I try to follow the call, can you send the block stack trace?
> just change the lock on get_mtd_device with a try_lock and BUG on
> if it is taken
I don't want to crash the system now, but I know exactly the trace:
(this is created manually)

get_mtd_device
blktrans_open
__blkdev_get
blkdev_get
register_disk
add_disk
add_mtd_blktrans_dev
ssfdcr_add_mtd
blktrans_notify_add
add_mtd_device - takes the lock
rsc_register_nand_device - my driver function


The point is that that ssfdc specifies:

.part_bits = SSFDCR_PARTN_BITS

This triggers the open of the block device by block core.

Tomorrow I will post full backtrace.

Best regards,
Maxim Levitsky

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-22 21:42               ` Maxim Levitsky
@ 2009-12-22 23:39                 ` Michael Trimarchi
  2009-12-23 16:18                   ` Maxim Levitsky
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Trimarchi @ 2009-12-22 23:39 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Artem.Bityutskiy, Jörn Engel, linux-mtd, Alex Dubov, arnd

Hi

Maxim Levitsky wrote:
> On Tue, 2009-12-22 at 21:32 +0100, Michael Trimarchi wrote: 
>   
>> Maxim Levitsky wrote:
>>     
>>> On Mon, 2009-12-21 at 15:00 +0100, Jörn Engel wrote: 
>>>   
>>>       
>>>> On Sun, 20 December 2009 22:59:55 +0200, Maxim Levitsky wrote:
>>>>     
>>>>         
>>>>> I suspect that older non type M cards didn't have such emulation, but
>>>>> were real nand chips, bacause there are some references on the web about
>>>>> using XD card as a nand chip replacement.
>>>>> Such card as I have will really will make very poor nand replacement...
>>>>>       
>>>>>           
>>>> If you want to know whether the cards or the readers are to blame, you
>>>> can try to buy an old alauda reader on ebay.  It is too slow to be
>>>> useful for most purposes, but I believe it did give me full access with
>>>> my cards.
>>>>
>>>>     
>>>>         
>>>>> Folks, could you review my other questions about bugs in mtd core, and
>>>>> tell your opinion?
>>>>>       
>>>>>           
>>> I have several problems, more correctly bugs in mtd system I have to fix
>>> to make my driver work.
>>>
>>> Lets start from  the problem I face now.
>>>
>>> Problem is that add_mtd_blktrans_dev is called with mtd_table_mutex
>>> locked, but it calls add_disk which opens the block device if you
>>> specify that you need partitions on the disk. Open routine 'looks' at
>>> mtd table using get_mtd_device, and thus deadlocks.
>>>
>>> Do you have a clue how to fix that so it won't break anything?
>>>   
>>>       
>> I try to follow the call, can you send the block stack trace?
>> just change the lock on get_mtd_device with a try_lock and BUG on
>> if it is taken
>>     
> I don't want to crash the system now, but I know exactly the trace:
> (this is created manually)
>
> get_mtd_device
> blktrans_open
> __blkdev_get
> blkdev_get
> register_disk
> add_disk
> add_mtd_blktrans_dev
> ssfdcr_add_mtd
> blktrans_notify_add
>   
The problem can be introduced by this commit
8022c13c27b822cf22f13df10b42aae89cd56bf0

 mtd: blkdevs: do not forget to get MTD devices
   
    Nowadays MTD devices have to be "get" before they can be
    used. This has to be done with 'get_mtd_device()'. The
    'blktrans_open()' function did not do this and instead
    used 'try_module_get()'. Fix this.
   
    Since 'get_mtd_device()' already gets the module, extra
    'try_module_get()' is not needed.
   
    This fixes oops when one tries to use mtdblock on top of
    gluebi.
   
    Reported-by: Holger Brunck <holger.brunck@keymile.com>
    Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

That call the

-       if (!try_module_get(dev->mtd->owner))
+       if (!get_mtd_device(NULL, dev->mtd->index))
                goto out;

Regards Michael
> add_mtd_device - takes the lock
>   
How do you define partition? cmdline partition or pdata?
Look of the probe of other device it calls the add_mtd_partions and no
the add_mtd_device.
> rsc_register_nand_device - my driver function
>
>
> The point is that that ssfdc specifies:
>
> .part_bits = SSFDCR_PARTN_BITS
>
> This triggers the open of the block device by block core.
>
> Tomorrow I will post full backtrace.
>
> Best regards,
> Maxim Levitsky
>
>
>   

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-22 23:39                 ` Michael Trimarchi
@ 2009-12-23 16:18                   ` Maxim Levitsky
  0 siblings, 0 replies; 14+ messages in thread
From: Maxim Levitsky @ 2009-12-23 16:18 UTC (permalink / raw)
  To: Michael Trimarchi
  Cc: Artem.Bityutskiy, Jörn Engel, linux-mtd, Alex Dubov, arnd

On Wed, 2009-12-23 at 00:39 +0100, Michael Trimarchi wrote: 
> Hi
> 
> Maxim Levitsky wrote:
> > On Tue, 2009-12-22 at 21:32 +0100, Michael Trimarchi wrote: 
> >   
> >> Maxim Levitsky wrote:
> >>     
> >>> On Mon, 2009-12-21 at 15:00 +0100, Jörn Engel wrote: 
> >>>   
> >>>       
> >>>> On Sun, 20 December 2009 22:59:55 +0200, Maxim Levitsky wrote:
> >>>>     
> >>>>         
> >>>>> I suspect that older non type M cards didn't have such emulation, but
> >>>>> were real nand chips, bacause there are some references on the web about
> >>>>> using XD card as a nand chip replacement.
> >>>>> Such card as I have will really will make very poor nand replacement...
> >>>>>       
> >>>>>           
> >>>> If you want to know whether the cards or the readers are to blame, you
> >>>> can try to buy an old alauda reader on ebay.  It is too slow to be
> >>>> useful for most purposes, but I believe it did give me full access with
> >>>> my cards.
> >>>>
> >>>>     
> >>>>         
> >>>>> Folks, could you review my other questions about bugs in mtd core, and
> >>>>> tell your opinion?
> >>>>>       
> >>>>>           
> >>> I have several problems, more correctly bugs in mtd system I have to fix
> >>> to make my driver work.
> >>>
> >>> Lets start from  the problem I face now.
> >>>
> >>> Problem is that add_mtd_blktrans_dev is called with mtd_table_mutex
> >>> locked, but it calls add_disk which opens the block device if you
> >>> specify that you need partitions on the disk. Open routine 'looks' at
> >>> mtd table using get_mtd_device, and thus deadlocks.
> >>>
> >>> Do you have a clue how to fix that so it won't break anything?
> >>>   
> >>>       
> >> I try to follow the call, can you send the block stack trace?
> >> just change the lock on get_mtd_device with a try_lock and BUG on
> >> if it is taken
> >>     
> > I don't want to crash the system now, but I know exactly the trace:
> > (this is created manually)
> >
> > get_mtd_device
> > blktrans_open
> > __blkdev_get
> > blkdev_get
> > register_disk
> > add_disk
> > add_mtd_blktrans_dev
> > ssfdcr_add_mtd
> > blktrans_notify_add
> >   
> The problem can be introduced by this commit
> 8022c13c27b822cf22f13df10b42aae89cd56bf0
> 
>  mtd: blkdevs: do not forget to get MTD devices
>    
>     Nowadays MTD devices have to be "get" before they can be
>     used. This has to be done with 'get_mtd_device()'. The
>     'blktrans_open()' function did not do this and instead
>     used 'try_module_get()'. Fix this.
>    
>     Since 'get_mtd_device()' already gets the module, extra
>     'try_module_get()' is not needed.
>    
>     This fixes oops when one tries to use mtdblock on top of
>     gluebi.
>    
>     Reported-by: Holger Brunck <holger.brunck@keymile.com>
>     Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
>     Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
> 
> That call the
> 
> -       if (!try_module_get(dev->mtd->owner))
> +       if (!get_mtd_device(NULL, dev->mtd->index))
>                 goto out;
> 
> Regards Michael
> > add_mtd_device - takes the lock
> >   
> How do you define partition? cmdline partition or pdata?
> Look of the probe of other device it calls the add_mtd_partions and no
> the add_mtd_device.
It a bit different.
mtd device is standard nand device.
On top of that there is an FTL (ssfdc) that uses oob area to map
physical to logical sectors.
ssfdc just like mtdblock creates standard block device.

This block device can have any partition table. Usually there is
standard MSDOS partition table with one partition.



> > rsc_register_nand_device - my driver function
> >
> >
> > The point is that that ssfdc specifies:
> >
> > .part_bits = SSFDCR_PARTN_BITS
> >
> > This triggers the open of the block device by block core.
> >
> > Tomorrow I will post full backtrace.
> >
> > Best regards,
> > Maxim Levitsky
> >
> >
> >   
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2009-12-20 20:59       ` Maxim Levitsky
  2009-12-21 14:00         ` Jörn Engel
@ 2010-01-07  7:11         ` Artem Bityutskiy
  2010-01-07 16:16           ` Maxim Levitsky
  1 sibling, 1 reply; 14+ messages in thread
From: Artem Bityutskiy @ 2010-01-07  7:11 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: joern, linux-mtd, arnd, Alex Dubov

On Sun, 2009-12-20 at 22:59 +0200, Maxim Levitsky wrote:
> Folks, could you review my other questions about bugs in mtd core, and
> tell your opinion?

The MTD subsystem is nor very inhabitant these days, so you are mostly
on your own. I think you should just try fixing the issues you found,
test them, be careful, and send patches.

I believe MTD subsystem have many issues with "hotpluggable" devices
like your card, simply because no one probably worked seriously on this.
Most MTD devices are on-board chips, so few people care about
plugability.

> Yet another problem I found is inside add_mtd_blktrans_dev.
> This function takes mtd_table_mutex but never releases it.
> This sounds fishy, and deadlocks when it calls 'add_disk(gd);'
> This function (I think in case partitions are detected on device)
opens
> block device, and this takes that lock too
> (blktrans_open->get_mtd_device)> 

I vaguely remember there was a fix for this, may be in 2.6.33-rcX ?


-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2010-01-07  7:11         ` Artem Bityutskiy
@ 2010-01-07 16:16           ` Maxim Levitsky
  2010-01-08  6:17             ` Jörn Engel
  0 siblings, 1 reply; 14+ messages in thread
From: Maxim Levitsky @ 2010-01-07 16:16 UTC (permalink / raw)
  To: dedekind1; +Cc: joern, linux-mtd, arnd, Alex Dubov

On Thu, 2010-01-07 at 09:11 +0200, Artem Bityutskiy wrote: 
> On Sun, 2009-12-20 at 22:59 +0200, Maxim Levitsky wrote:
> > Folks, could you review my other questions about bugs in mtd core, and
> > tell your opinion?
> 
> The MTD subsystem is nor very inhabitant these days, so you are mostly
> on your own. I think you should just try fixing the issues you found,
> test them, be careful, and send patches.
> 
> I believe MTD subsystem have many issues with "hotpluggable" devices
> like your card, simply because no one probably worked seriously on this.
> Most MTD devices are on-board chips, so few people care about
> plugability.
> 
> > Yet another problem I found is inside add_mtd_blktrans_dev.
> > This function takes mtd_table_mutex but never releases it.
> > This sounds fishy, and deadlocks when it calls 'add_disk(gd);'
> > This function (I think in case partitions are detected on device)
> opens
> > block device, and this takes that lock too
> > (blktrans_open->get_mtd_device)> 
> 
> I vaguely remember there was a fix for this, may be in 2.6.33-rcX ?


I have recently posted a pathset that fixes everything, especially
hotplug.

Will be glad to hear feedback.


Best regards,
Maxim Levitsky

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Few problems in mtd system
  2010-01-07 16:16           ` Maxim Levitsky
@ 2010-01-08  6:17             ` Jörn Engel
  0 siblings, 0 replies; 14+ messages in thread
From: Jörn Engel @ 2010-01-08  6:17 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linux-mtd, arnd, Alex Dubov, dedekind1

On Thu, 7 January 2010 18:16:03 +0200, Maxim Levitsky wrote:
> 
> I have recently posted a pathset that fixes everything, especially
> hotplug.
> 
> Will be glad to hear feedback.

Sadly, this is a bit naive wrt. mtd.  The current state of affair seems
to be that you either provide a perfect patch that gets merged or you
get ignored.  Feedback has become a rarety.

I'm currently on vacation, so I haven't given you patches more than a
cursory glance.  Certainly useful, but not perfect yet.  Apart from the
obvious checkpatch, sending obviously useful (for everyone else but you)
patches first might help.

Jörn

-- 
Joern's library part 2:
http://www.art.net/~hopkins/Don/unix-haters/tirix/embarrassing-memo.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-01-08  6:18 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-17 22:08 Few problems in mtd system Maxim Levitsky
2009-12-19 17:17 ` Maxim Levitsky
2009-12-19 21:06   ` Maxim Levitsky
2009-12-19 21:58     ` Alex Dubov
2009-12-20 20:59       ` Maxim Levitsky
2009-12-21 14:00         ` Jörn Engel
2009-12-22 18:03           ` Maxim Levitsky
2009-12-22 20:32             ` Michael Trimarchi
2009-12-22 21:42               ` Maxim Levitsky
2009-12-22 23:39                 ` Michael Trimarchi
2009-12-23 16:18                   ` Maxim Levitsky
2010-01-07  7:11         ` Artem Bityutskiy
2010-01-07 16:16           ` Maxim Levitsky
2010-01-08  6:17             ` Jörn Engel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox