From: Vladislav Bolkhovitin <vst@vlnb.net>
To: Nico Williams <nico@cryptonector.com>
Cc: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>,
	"杨苏立 Yang Su Li" <suli@cs.wisc.edu>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	drh@hwaci.com
Subject: Re: [sqlite] light weight write barriers
Date: Fri, 26 Oct 2012 21:52:46 -0400
Message-ID: <508B3E6E.6060702@vlnb.net>
In-Reply-To: <CAK3OfOjYgTQBeCh1SucYw=Vriw6W3qaygwmiRmude0oAYhcaxg@mail.gmail.com>


Nico Williams, on 10/24/2012 05:17 PM wrote:
>> Yes, SCSI has full support for ordered/simple commands designed exactly for
>> that task: [...]
>>
>> [...]
>>
>> But historically, for some reason, Linux storage developers were stuck with
>> the "barriers" concept, which is obviously not the same as ORDERED commands,
>> and hence had a lot of trouble with its ambiguous semantics. As far as I can
>> tell, the reason was a lack of sufficiently deep SCSI understanding
>> (how to handle errors, the belief that ACA is a legacy of parallel
>> SCSI times, etc.).
>
> Barriers are a very simple abstraction, so there's that.

It isn't simple at all. If you think for some time about barriers from the storage 
point of view, you will soon realize how bad and ambiguous they are.

>> Before that happens, people will keep returning again and again with those
>> simple questions: why must the queue be flushed for any ordered operation?
>> Isn't it obvious overkill?
>
> That [cache flushing]

It isn't cache flushing, it's _queue_ flushing. You can call it queue draining, if 
you like.

It often makes a big difference where it's done: on the system side or on the 
storage side.

Actually, in many cases the performance improvement from NCQ comes not from 
letting the drive reorder requests, as is commonly thought, but from keeping the 
drive's internal processing stages busy at all times, with no idle time. Drives 
often have a long internal pipeline, so every stage of it needs to stay busy, 
which is why using ORDERED commands is important for performance.
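
To put rough numbers on the pipeline argument, here is a toy model in Python. It 
is only a sketch: the function name total_time, the stage count and the unit stage 
time are made-up assumptions, not measurements of any real drive. It compares 
streaming commands through a linear pipeline with draining the queue every few 
commands.

```python
def total_time(num_cmds, stages, stage_time, drain_every=None):
    """Time to push num_cmds through a linear pipeline of `stages` stages.

    Hypothetical model: every stage takes `stage_time` per command and
    stages work in parallel on different commands. If `drain_every` is
    set, the pipeline is emptied after each group of that many commands
    (queue draining), so each group pays the fill latency again.
    """
    if num_cmds == 0:
        return 0
    if drain_every is None:
        batches = [num_cmds]
    else:
        full, rest = divmod(num_cmds, drain_every)
        batches = [drain_every] * full + ([rest] if rest else [])
    # A batch of n commands occupies (stages + n - 1) stage slots:
    # `stages` slots for the first command, one more per extra command.
    return sum((stages + n - 1) * stage_time for n in batches)


if __name__ == "__main__":
    for drain in (None, 10, 1):
        print(drain, total_time(1000, stages=8, stage_time=1, drain_every=drain))
```

In this model, the more often the queue is drained, the more time goes to 
refilling the pipeline instead of doing useful work.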

> is not what's being asked for here. Just a
> light-weight barrier.  My proposal works without having to add new
> system calls: a) use a COW format, b) have background threads doing
> fsync()s, c) in each transaction's root block note the last
> known-committed (from a completed fsync()) transaction's root block,
> d) have an array of well-known uberblocks large enough to accommodate
> as many transactions as possible without having to wait for any one
> fsync() to complete, e) do not reclaim space from any one past
> transaction until at least one subsequent transaction is fully
> committed.  This obtains ACI- transaction semantics (survives power
> failures but without durability for the last N transactions at
> power-failure time) without requiring changes to the OS at all, and
> with support for delayed D (durability) notification.

I believe what you really want is to be able to send to the storage a sequence of 
your favorite operations (FS operations, async IO operations, etc.) like:

Write back caching disabled:

data op11, ..., data op1N, ORDERED data op1, data op21, ..., data op2M, ...

Write back caching enabled:

data op11, ..., data op1N, ORDERED sync cache, ORDERED FUA data op1, data op21, 
..., data op2M, ...

Right?

(ORDERED means the command is guaranteed, under all circumstances, to be executed 
only after every previous command has completed and before any subsequent command 
is executed.)
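
The ORDERED rule above can be written down as a small checker. This is a 
simplified sketch under my own assumptions: commands are treated as executing one 
at a time (no overlap), and the names respects_ordered, SIMPLE and ORDERED are 
illustrative only, not a real SCSI or kernel API.

```python
from typing import Sequence

SIMPLE, ORDERED = "SIMPLE", "ORDERED"


def respects_ordered(attrs: Sequence[str], exec_order: Sequence[int]) -> bool:
    """Check an execution order against the ORDERED rule.

    attrs[i] is the attribute of the i-th submitted command.
    exec_order lists submission indices in the order the (hypothetical)
    device executed them; it must be a permutation of range(len(attrs)).
    """
    # Position of each command in the execution order.
    pos = {cmd: when for when, cmd in enumerate(exec_order)}
    for i, attr in enumerate(attrs):
        if attr != ORDERED:
            continue
        # Every earlier-submitted command must have executed before it.
        if any(pos[j] > pos[i] for j in range(i)):
            return False
        # Every later-submitted command must execute after it.
        if any(pos[j] < pos[i] for j in range(i + 1, len(attrs))):
            return False
    return True


if __name__ == "__main__":
    # op11, op12, ORDERED op1, op21, op22
    attrs = [SIMPLE, SIMPLE, ORDERED, SIMPLE, SIMPLE]
    print(respects_ordered(attrs, [1, 0, 2, 4, 3]))  # reordering within groups
    print(respects_ordered(attrs, [0, 2, 1, 3, 4]))  # op12 slipped past op1
```

SIMPLE commands on either side of the ORDERED one may be reordered freely among 
themselves; only crossing the ORDERED command is forbidden.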

Vlad
