* [U-Boot-Users] Redundant environment
From: Tolunay Orkun @ 2006-04-28 16:32 UTC
To: u-boot
Dear Wolfgang,
After exchanging emails with some list members, I have the feeling that
there is some interest in a redundant environment implementation that is
synchronized.
If I am allowed to introduce this enhancement, I pledge that existing
implementations would not be affected unless CONFIG_ENV_REDUND_SYNC is
defined, and that the generated code would be virtually identical.
Of course, I will also document both behaviors properly in the README,
introduce the same functionality in env_flash.c, env_nand.c etc. (i.e.
all media for which the redundant environment is implemented) and fix up
the fw_setenv/fw_getenv utility.
Best regards,
Tolunay
* [U-Boot-Users] Re: Redundant environment
From: Wolfgang Denk @ 2006-04-28 19:31 UTC
To: u-boot
In message <445243A3.3090806@orkun.us> you wrote:
>
> After exchanging emails with some list members, I have the feeling that
More mails than were visible on the list?
> Of course, I will also document both behaviors properly in the README,
> introduce the same functionality in env_flash.c, env_nand.c etc. (i.e.
> all media for which the redundant environment is implemented) and fix up
> the fw_setenv/fw_getenv utility.
You mean, you want to change more than making the saveenv command run
twice? Is this really needed?
Best regards,
Wolfgang Denk
--
Software Engineering: Embedded and Realtime Systems, Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
No user-servicable parts inside. Refer to qualified service personnel.
* [U-Boot-Users] Re: Redundant environment
From: Stefan Roese @ 2006-05-01 6:19 UTC
To: u-boot
Hi Wolfgang,
On Friday, 28. April 2006 21:31, Wolfgang Denk wrote:
> > Of course, I will also document both behaviors properly in the README,
> > introduce the same functionality in env_flash.c, env_nand.c etc. (i.e.
> > all media for which the redundant environment is implemented) and fix up
> > the fw_setenv/fw_getenv utility.
>
> You mean, you want to change more than making the saveenv command run
> twice? Is this really needed?
As I mentioned before, I also tend to forget special steps like running
"saveenv" twice. So I still vote to include Tolunay's patch.
Or do you mean that you would like to see this new behavior implemented in a
patched saveenv command that calls the original _saveenv twice? This would
have the advantage of fewer code changes, but the disadvantage of doing
everything twice (like unprotect, protect).
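The trade-off between the two approaches can be sketched with a small Python model. This is only an illustration and not U-Boot code: the operation names, and the assumption that one save pass consists of unprotect, erase, write and protect on a single copy, are simplifications invented for this example.

```python
from collections import Counter

def save_pass(log, copy):
    # One hypothetical save pass on a single environment copy (simplified).
    log.extend([("unprotect", copy), ("erase", copy),
                ("write", copy), ("protect", copy)])

def saveenv_called_twice():
    # Wrapper approach: run the unchanged save pass once per copy.
    log = []
    save_pass(log, 0)
    save_pass(log, 1)
    return log

def saveenv_integrated():
    # Integrated approach: unprotect once, update both copies, protect once.
    return [("unprotect", "both"), ("erase", 0), ("write", 0),
            ("erase", 1), ("write", 1), ("protect", "both")]

def op_counts(log):
    # Count how often each kind of operation occurs in a log.
    return Counter(op for op, _ in log)

twice = op_counts(saveenv_called_twice())
integrated = op_counts(saveenv_integrated())
for op in ("unprotect", "erase", "write", "protect"):
    print(op, twice[op], integrated[op])
```

In this model both variants erase and write each copy once; only the unprotect/protect bookkeeping is duplicated by the wrapper.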
Best regards,
Stefan
* [U-Boot-Users] Re: Redundant environment
From: Tolunay Orkun @ 2006-05-01 22:55 UTC
To: u-boot
Stefan Roese wrote:
> Hi Wolfgang,
>
> On Friday, 28. April 2006 21:31, Wolfgang Denk wrote:
>
>>> Of course, I will also document both behaviors properly in the README,
>>> introduce the same functionality in env_flash.c, env_nand.c etc. (i.e.
>>> all media for which the redundant environment is implemented) and fix up
>>> the fw_setenv/fw_getenv utility.
>>>
>> You mean, you want to change more than making the saveenv command run
>> twice? Is this really needed?
>>
>
> As I mentioned before, I also tend to forget special steps like running
> "saveenv" twice. So I still vote to include Tolunay's patch.
>
> Or do you mean that you would like to see this new behavior implemented in a
> patched saveenv command that calls the original _saveenv twice? This would
> have the advantage of fewer code changes, but the disadvantage of doing
> everything twice (like unprotect, protect).
>
> Best regards,
> Stefan
>
Yes, I can do it in saveenv code to cycle twice but I would rather avoid
doing unlock/re-lock/over flag byte stuff twice.
Whichever way Wolfgang favors I am ready to work on a patch.
Best regards,
Tolunay
* [U-Boot-Users] Re: Redundant environment
From: Wolfgang Denk @ 2006-05-01 23:13 UTC
To: u-boot
Dear Tolunay,
in message <445691EF.1000401@orkun.us> you wrote:
>
> Yes, I can do it in saveenv code to cycle twice but I would rather avoid
> doing unlock/re-lock/over flag byte stuff twice.
>
> Whichever way Wolfgang favors I am ready to work on a patch.
I think adding another set of N #ifdef's to implement this feature is
not a good idea, when a single one (to duplicate the call to the C
function) does basically the same.
Ummm... sorry for being stubborn, but before you start can you please
re-try to explain to me in which specific situations you expect this
patch to actually improve the reliability of operation of the device?
I am aware that some people interpreted the term "redundant
environment" to mean that two identical copies of the environment were
stored. This was obviously an unlucky choice of name for this feature. Please
let's exclude this "I expected to see this, now change the code to
match my expectations" aspect for a moment. However, I still fail to
see any improvements in the suggested change; actually I only see
disadvantages like doubling the number of flash erase cycles for the
environment sectors.
Best regards,
Wolfgang Denk
--
Software Engineering: Embedded and Realtime Systems, Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
core error - bus dumped
* [U-Boot-Users] Re: Redundant environment
From: Tolunay Orkun @ 2006-05-05 16:42 UTC
To: u-boot
Wolfgang Denk wrote:
> Dear Tolunay,
>
> in message <445691EF.1000401@orkun.us> you wrote:
>
>> Yes, I can do it in saveenv code to cycle twice but I would rather avoid
>> doing unlock/re-lock/over flag byte stuff twice.
>>
>> Whichever way Wolfgang favors I am ready to work on a patch.
>>
>
> I think adding another set of N #ifdef's to implement this feature is
> not a good idea, when a single one (to duplicate the call to the C
> function) does basically the same.
>
OK. That makes the patch simpler.
> Ummm... sorry for being stubborn, but before you start can you please
> re-try to explain to me in which specific situations you expect this
> patch to actually improve the reliability of operation of the device?
>
This patch would solve an issue that exists today: when the
"active" environment is lost/corrupted for some reason, the "redundant"
environment would contain an exact copy of the primary, so the board can
come up without the need to redo the changes that were lost on the
last save. Sometimes these changes could be critical enough to keep
the system from booting the OS properly (like changes to bootcmd,
bootargs etc).
Among the things that can cause one environment to go corrupt would be
charge decays in memory cells in aging flash, supply variations/noise
during erase/write and random memory corruption when power is
interrupted while another section of flash memory is being written/erased.
Sure these could cause other problems as well like if this issue happens
for U-Boot code the system might become un-bootable. But at least we
have full recovery for the case when it happens within U-Boot environment.
> I am aware that some people interpreted the term "redundant
> environment" to mean that two identical copies of the environment were
> stored. This was obviously an unlucky choice of name for this feature. Please
> let's exclude this "I expected to see this, now change the code to
> match my expectations" aspect for a moment. However, I still fail to
> see any improvements in the suggested change; actually I only see
> disadvantages like doubling the number of flash erase cycles for the
> environment sectors.
>
I understand your concern. In our application the environment would not
be updated occasionally so that is not a big concern for us.
Best regards,
Tolunay
* [U-Boot-Users] Re: Redundant environment
From: Tolunay Orkun @ 2006-05-05 16:45 UTC
To: u-boot
Tolunay Orkun wrote:
>
>> I am aware that some people interpreted the term "redundant
>> environment" to mean that two identical copies of the environment were
>> stored. This was obviously an unlucky choice of name for this feature. Please
>> let's exclude this "I expected to see this, now change the code to
>> match my expectations" aspect for a moment. However, I still fail to
>> see any improvements in the suggested change; actually I only see
>> disadvantages like doubling the number of flash erase cycles for the
>> environment sectors.
>>
> I understand your concern. In our application the environment would not
> be updated occasionally so that is not a big concern for us.
I meant to say the environment would be updated occasionally but somehow
inverted the meaning.
Best regards,
Tolunay
* [U-Boot-Users] Re: Redundant environment
From: Wolfgang Denk @ 2006-05-05 23:28 UTC
To: u-boot
Dear Tolunay,
in message <445B8086.9000404@orkun.us> you wrote:
>
> This patch would solve an issue that exists today: when the
> "active" environment is lost/corrupted for some reason, the "redundant"
> environment would contain an exact copy of the primary, so the board can
> come up without the need to redo the changes that were lost on
Actually I think that you will not achieve this with your patch.
This is why I'm concerned. You see, if you feel better having this
patch I would not complain, but I am afraid that a lot of people
might just activate it because they think it would do them any good
when it doesn't (and actually it just hurts).
There is only one occasion when we have any significant likelihood of
losing the environment data: this is when a call to "saveenv" fails
because either a) we have a power loss, b) we have an otherwise
induced reset of the CPU, or c) the flash sector that shall be
erased/written is failing.
So where exactly does your modification improve anything? Let's go
through this step by step.
Case 1: power loss/reset happens during the first "saveenv", i. e.
when writing the first copy of the new environment data.
In this case this first copy contains no valid data; the
second copy of the environment contains valid, but old data.
This is exactly the same as we have with the current imple-
mentation. I don't see any improvement.
Case 2: power loss/reset happens during the second "saveenv", i. e.
when writing the second copy of the new environment data.
In this case this first copy contains valid new data, while
the second copy of the environment does not contain valid
data.
In the current implementation, the first (and only) saveenv
would have completed, too, and the reset would hit after
leaving this part of code, so we had valid new data in the
first copy, and valid (but old) data in the second one.
Again, this is not an improvement. Actually I think the
current implementation is even more useful.
Case 3: A flash sector in the first copy of the environment becomes
defective while we erase or write it. In this case we will
see appropriate error conditions, and the "saveenv" command
will abort.
This is the same as case 1: no valid data in copy 1, valid,
but old data in copy 2; no difference between the existing
and your new implementation.
Case 4: A flash sector in the second copy of the environment becomes
defective while we erase or write it. In this case we will
see appropriate error conditions, and the "saveenv" command
will abort.
This is the same as case 2: valid new data in copy 1, no
valid data in copy 2 with your implementation, but probably
valid old data with the existing code.
I guess I must have missed some cases because there was none yet
where the new implementation would improve the reliability. Please
fill in these missing cases.
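The four cases above can be replayed in a small Python model. This is a sketch under stated assumptions, not U-Boot code: each copy is reduced to its contents plus a "valid" flag standing in for the CRC check, and the scheme names "alternating" (the existing behavior) and "sync" (the proposed one) are labels chosen for this example. Cases 3 and 4 map onto the same outcomes as cases 1 and 2, so only the failure position is modelled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Copy:
    data: Optional[str]   # environment contents, None once erased
    valid: bool           # stands in for "the CRC matches"

class Interrupted(Exception):
    """Power loss, reset, or a failing sector during erase/write."""

def write_copy(copy, data, fail):
    copy.data, copy.valid = None, False   # the erase destroys the old contents
    if fail:
        raise Interrupted
    copy.data, copy.valid = data, True

def save(scheme, fail_at=None):
    """Return the contents of all copies that are still valid afterwards."""
    first = Copy("older", True)    # inactive copy, written first
    second = Copy("old", True)     # currently active copy
    try:
        write_copy(first, "new", fail=(fail_at == 1))
        if scheme == "sync":       # only the proposed scheme touches copy 2
            write_copy(second, "new", fail=(fail_at == 2))
    except Interrupted:
        pass
    return [c.data for c in (first, second) if c.valid]

for fail_at in (1, 2):
    for scheme in ("alternating", "sync"):
        print(fail_at, scheme, save(scheme, fail_at))
```

In this model a failure during the first write leaves only the old data under both schemes, while a failure at the point of the second write leaves new plus old data under the existing scheme and only the new data under the synchronized one.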
But, and I think this is an undisputed fact, the current implementation
needs only half the number of erase/write cycles, so it causes
much less flash wear than your code. [Actually your code will see the
same level of flash wear as you have now without the redundant
environment enabled; it's that enabling the current implementation of
redundancy *improves* flash lifetime by halving the number of
erase/write cycles to the environment.]
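The wear argument can be checked with a few lines of Python. This is a minimal sketch that assumes each environment copy occupies exactly one erase sector and that every save erases the sector it writes; the scheme names are labels invented for this example.

```python
def erase_counts(saves, scheme):
    """Erase cycles per sector, [sector0, sector1], after `saves` saves."""
    counts = [0, 0]
    target = 0
    for _ in range(saves):
        if scheme == "single":          # no redundancy: one sector every time
            counts[0] += 1
        elif scheme == "alternating":   # existing redundant scheme
            counts[target] += 1
            target ^= 1                 # the next save goes to the other copy
        elif scheme == "sync":          # proposed scheme: both copies each time
            counts[0] += 1
            counts[1] += 1
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return counts

for scheme in ("single", "alternating", "sync"):
    print(scheme, erase_counts(1000, scheme))
```

For 1000 saves the model gives 1000 erases on the single sector without redundancy, 500 per sector with the alternating scheme, and 1000 per sector with synchronized copies.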
> Among the things that can cause one environment to go corrupt would be
> charge decays in memory cells in aging flash, supply variations/noise
I think that the likelihood of such a thing happening during read
accesses only is infinitesimal.
> during erase/write and random memory corruption when power is
I agree that erase/write cycles are the critical phase where
corruption may happen, and which we want to try to protect with our
implementation. See above.
> interrupted while another section of flash memory is being written/erased.
I don't see how this could happen to flash. [Well, I've seen flash
corruption before; this was on Intel flash where you could write the
flash control commands to arbitrary addresses, so just copying a
binary image to a flash device could cause random write / erase
actions. But then, such devices should have hardware flash protection
(which you should enable, or you deserve what you get), or if you are
concerned about reliability you would avoid such devices like hell.]
> Sure these could cause other problems as well like if this issue happens
> for U-Boot code the system might become un-bootable. But at least we
> have full recovery for the case when it happens within U-Boot environment.
I'm not sure I can follow that logic. If you have some undetected and
unexpected memory corruption in your flash, and if you care about
reliability, then you must try to recognize such situations and halt
the system. Trying to continue in such an undefined state is too
hazardous.
So, can you please fill in the scenario where your modification would
really help to make the system more reliable?
Best regards,
Wolfgang Denk
--
Software Engineering: Embedded and Realtime Systems, Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
Our business is run on trust. We trust you will pay in advance.
* [U-Boot-Users] Re: Redundant environment
From: Jerry Van Baren @ 2006-05-08 13:09 UTC
To: u-boot
Wolfgang Denk wrote:
> Dear Tolunay,
>
> in message <445B8086.9000404@orkun.us> you wrote:
>> This patch would solve an issue that exists today: when the
>> "active" environment is lost/corrupted for some reason, the "redundant"
>> environment would contain an exact copy of the primary, so the board can
>> come up without the need to redo the changes that were lost on
>
> Actually I think that you will not achieve this with your patch.
> This is why I'm concerned. You see, if you feel better having this
> patch I would not complain, but I am afraid that a lot of people
> might just activate it because they think it would do them any good
> when it doesn't (and actually it just hurts).
>
> There is only one occasion when we have any significant likelihood of
> losing the environment data: this is when a call to "saveenv" fails
> because either a) we have a power loss, b) we have an otherwise
> induced reset of the CPU, or c) the flash sector that shall be
> erased/written is failing.
>
> So where exactly does your modification improve anything? Let's go
> through this step by step.
>
> Case 1: power loss/reset happens during the first "saveenv", i. e.
> when writing the first copy of the new environment data.
>
> In this case this first copy contains no valid data; the
> second copy of the environment contains valid, but old data.
>
> This is exactly the same as we have with the current imple-
> mentation. I don't see any improvement.
>
> Case 2: power loss/reset happens during the second "saveenv", i. e.
> when writing the second copy of the new environment data.
>
> In this case this first copy contains valid new data, while
> the second copy of the environment does not contain valid
> data.
>
> In the current implementation, the first (and only) saveenv
> would have completed, too, and the reset would hit after
> leaving this part of code, so we had valid new data in the
> first copy, and valid (but old) data in the second one.
>
> Again, this is not an improvement. Actually I think the
> current implementation is even more useful.
>
> Case 3: A flash sector in the first copy of the environment becomes
> defective while we erase or write it. In this case we will
> see appropriate error conditions, and the "saveenv" command
> will abort.
>
> This is the same as case 1: no valid data in copy 1, valid,
> but old data in copy 2; no difference between the existing
> and your new implementation.
>
> Case 4: A flash sector in the second copy of the environment becomes
> defective while we erase or write it. In this case we will
> see appropriate error conditions, and the "saveenv" command
> will abort.
>
> This is the same as case 2: valid new data in copy 1, no
> valid data in copy 2 with your implementation, but probably
> valid old data with the existing code.
>
> I guess I must have missed some cases because there was none yet
> where the new implementation would improve the reliability. Please
> fill in these missing cases.
Case 5: Data retention.
If you check the data sheet of your flash device, you should find a
section on data retention. It is probably more than 10 years.
Grabbing a Spansion AM29F800B
<http://www.spansion.com/products/Am29F800B.html> data sheet at random,
I find it is rated at 10 years at 150°C, 20 years at 125°C.
Even if the retention rating of the part given by the manufacturer is
insufficient, a mirror (duplication) of the "redundant" environment that
was created at the same time as the "active" environment could not be
expected to last any longer than the "active" environment since they
were written at the same time.
If (a) you are really paranoid and (b) you expect your gadget to outlive
the retention of the memories, you need to rewrite the environment (and,
likely, your whole program!) every few years in order to reset the
decay. Making a duplicate environment once won't help.
[snip]
> So, can you please fill in the scenario where your modification would
> really help to make the system more reliable?
>
> Best regards,
> Wolfgang Denk
2cents, and worth every penny you paid,
gvb
* [U-Boot-Users] Re: Redundant environment
From: Tolunay Orkun @ 2006-05-22 21:11 UTC
To: u-boot
I am sorry I am responding to this so late as I got so busy recently and
had accumulated over 1000 emails from public lists I am following....
Wolfgang Denk wrote:
> Dear Tolunay,
>
> in message <445B8086.9000404@orkun.us> you wrote:
>
>> This patch would solve an issue that exists today: when the
>> "active" environment is lost/corrupted for some reason, the "redundant"
>> environment would contain an exact copy of the primary, so the board can
>> come up without the need to redo the changes that were lost on
>>
>
> Actually I think that you will not achieve this with your patch.
> This is why I'm concerned. You see, if you feel better having this
> patch I would not complain, but I am afraid that a lot of people
> might just activate it because they think it would do them any good
> when it doesn't (and actually it just hurts).
>
I can only offer a detailed description in the README (and perhaps the
Wiki) of what it does, under what conditions it might be useful, and
under what conditions it can hurt.
> There is only one occasion when we have any significant likelihood of
> losing the environment data: this is when a call to "saveenv" fails
> because either a) we have a power loss, b) we have an otherwise
> induced reset of the CPU, or c) the flash sector that shall be
> erased/written is failing.
>
> So where exactly does your modification improve anything? Let's go
> through this step by step.
>
> Case 1: power loss/reset happens during the first "saveenv", i. e.
> when writing the first copy of the new environment data.
>
> In this case this first copy contains no valid data; the
> second copy of the environment contains valid, but old data.
>
> This is exactly the same as we have with the current imple-
> mentation. I don't see any improvement.
>
This is a tie in terms of functionality between the two implementations.
> Case 2: power loss/reset happens during the second "saveenv", i. e.
> when writing the second copy of the new environment data.
>
> In this case this first copy contains valid new data, while
> the second copy of the environment does not contain valid
> data.
>
> In the current implementation, the first (and only) saveenv
> would have completed, too, and the reset would hit after
> leaving this part of code, so we had valid new data in the
> first copy, and valid (but old) data in the second one.
>
> Again, this is not an improvement. Actually I think the
> current implementation is even more useful.
>
I would call this a tie too.
> Case 3: A flash sector in the first copy of the environment becomes
> defective while we erase or write it. In this case we will
> see appropriate error conditions, and the "saveenv" command
> will abort.
>
> This is the same as case 1: no valid data in copy 1, valid,
> but old data in copy 2; no difference between the existing
> and your new implementation.
>
Tie.
> Case 4: A flash sector in the second copy of the environment becomes
> defective while we erase or write it. In this case we will
> see appropriate error conditions, and the "saveenv" command
> will abort.
>
> This is the same as case 2: valid new data in copy 1, no
> valid data in copy 2 with your implementation, but probably
> valid old data with the existing code.
>
Tie.
> I guess I must have missed some cases because there was none yet
> where the new implementation would improve the reliability. Please
> fill in these missing cases.
>
You are right, there is little difference under these conditions. The
alternate implementation I've proposed takes care of the things that
happen after "saveenv" has completed successfully.
1) Charge loss/fading in flash cells.
The primary environment can be partially lost due to charge loss in flash
cells. It is true that under perfect conditions the cells should retain
charge for a long time, but a positive ripple in the power supply while the
flash was written, combined with a low supply voltage while it is being
read, could reduce that time significantly. Good power supply regulation
and good power distribution on the PCB more or less prevent this, but an
aging flash chip may be more susceptible.
2) If the power supply is lost while flash is being written/erased, the
ongoing write may sometimes affect other cells/blocks that were not
the target. True, when this occurs the environment is not the only thing we
should be concerned about, but if it actually lands in the environment we
can recover from it.
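The scenario I have in mind, corruption that happens after a successful save, can be sketched in Python as follows. This is a simplified model and not U-Boot code: the dictionary layout, the function names and the use of zlib.crc32 as the validity check are assumptions made for this illustration.

```python
import zlib

def make_copy(env: bytes) -> dict:
    # A stored copy: the data plus the CRC computed when it was written.
    return {"data": bytearray(env), "crc": zlib.crc32(env)}

def is_valid(copy: dict) -> bool:
    return zlib.crc32(bytes(copy["data"])) == copy["crc"]

def after_save(old: bytes, new: bytes, scheme: str):
    # State of (active, backup) once a save has completed successfully.
    active = make_copy(new)
    backup = make_copy(new if scheme == "sync" else old)
    return active, backup

def flip_bit(copy: dict, byte_index: int = 0) -> None:
    # Model a decayed or disturbed flash cell.
    copy["data"][byte_index] ^= 0x01

def boot_env(active: dict, backup: dict):
    # Use the first copy whose CRC still matches, else report nothing usable.
    for copy in (active, backup):
        if is_valid(copy):
            return bytes(copy["data"])
    return None

old, new = b"bootcmd=run old\0", b"bootcmd=run new\0"
for scheme in ("alternating", "sync"):
    active, backup = after_save(old, new, scheme)
    flip_bit(active)
    print(scheme, boot_env(active, backup))
```

In this model the board falls back to the old settings with the existing scheme and to the new settings with synchronized copies, which is the only difference the proposal claims.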
> But, and I think this is an undisputed fact, the current implementation
> needs only half the number of erase/write cycles, so it causes
> much less flash wear than your code. [Actually your code will see the
> same level of flash wear as you have now without the redundant
> environment enabled; it's that enabling the current implementation of
> redundancy *improves* flash lifetime by halving the number of
> erase/write cycles to the environment.]
>
As I pointed out earlier, if you are not writing the environment very often
this is not a concern. If you are updating the environment every time
the board boots it might be a concern. The documentation would note that
and let implementors decide for their situation.
>> Among the things that can cause one environment to go corrupt would be
>> charge decays in memory cells in aging flash, supply variations/noise
>>
>
> I think that the likelihood of such a thing happening during read
> accesses only is infinitesimal.
>
I've experienced it. It has been some years and the controllers were
deployed in factory environments (EMI noise issues) ... You might call
me unlucky, or perhaps we had a bad chip to begin with. Perhaps it is
not an issue with more modern/reliable production techniques. Who knows...
Well, I think this has dragged on way too long. If you are not convinced
that it might be useful, I will drop this patch proposal from consideration.
Best regards,
Tolunay