Commit f5b8b29

harshadjs authored and tytso committed
doc: update ext4 and journalling docs to include fast commit feature
This patch adds necessary documentation for fast commits.

Signed-off-by: Harshad Shirwadkar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
1 parent e0770e9 commit f5b8b29

2 files changed: +99 -0 lines changed

Documentation/filesystems/ext4/journal.rst

Lines changed: 66 additions & 0 deletions
@@ -28,6 +28,17 @@ metadata are written to disk through the journal. This is slower but
 safest. If ``data=writeback``, dirty data blocks are not flushed to the
 disk before the metadata are written to disk through the journal.
 
+In ``data=ordered`` mode, Ext4 also supports fast commits, which
+help reduce commit latency significantly. The default ``data=ordered``
+mode works by logging metadata blocks to the journal. In fast commit
+mode, Ext4 only stores the minimal delta needed to recreate the
+affected metadata in fast commit space that is shared with JBD2.
+Once the fast commit area fills up, or if a fast commit is not possible,
+or if the JBD2 commit timer goes off, Ext4 performs a traditional full
+commit. A full commit invalidates all the fast commits that happened
+before it and thus empties the fast commit area for further fast
+commits. This feature needs to be enabled at mkfs time.
+
 The journal inode is typically inode 8. The first 68 bytes of the
 journal inode are replicated in the ext4 superblock. The journal itself
 is normal (but hidden) file within the filesystem. The file usually
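
The fallback rule described in the added paragraph (fast commit while there is room, otherwise a full commit that empties the fast commit area) can be illustrated with a small userspace model. This is only a sketch: ``struct fc_area_model``, ``model_full_commit`` and ``model_commit`` are hypothetical names invented for illustration and do not correspond to Ext4 or JBD2 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the fast commit area; not an Ext4 structure. */
struct fc_area_model {
    size_t capacity;        /* bytes reserved for fast commits */
    size_t used;            /* bytes consumed since the last full commit */
    unsigned full_commits;  /* number of full commits performed */
    unsigned fast_commits;  /* number of fast commits performed */
};

/* A full commit invalidates earlier fast commits, so the area becomes empty. */
static void model_full_commit(struct fc_area_model *a)
{
    a->used = 0;
    a->full_commits++;
}

/*
 * Try to log a delta of delta_len bytes as a fast commit. Falls back to a
 * full commit when the change is not eligible, the commit timer has fired,
 * or the delta does not fit in the remaining space.
 * Returns true if a fast commit was performed, false on fallback.
 */
static bool model_commit(struct fc_area_model *a, size_t delta_len,
                         bool eligible, bool timer_expired)
{
    assert(a->used <= a->capacity);
    if (!eligible || timer_expired || delta_len > a->capacity - a->used) {
        model_full_commit(a);
        return false;
    }
    a->used += delta_len;
    a->fast_commits++;
    return true;
}
```

With a 100-byte area, a 60-byte delta is fast committed, and a second 60-byte delta no longer fits, so the model performs a full commit and resets ``used`` to zero.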
@@ -609,3 +620,58 @@ bytes long (but uses a full block):
      - h\_commit\_nsec
      - Nanoseconds component of the above timestamp.
 
+Fast commits
+~~~~~~~~~~~~
+
+The fast commit area is organized as a log of tag-length-value (TLV)
+entries. Each TLV starts with a ``struct ext4_fc_tl``, which stores the tag
+and the length of the entire field, followed by a variable-length,
+tag-specific value. Here is the list of supported tags and their meanings:
+
+.. list-table::
+   :widths: 8 20 20 32
+   :header-rows: 1
+
+   * - Tag
+     - Meaning
+     - Value struct
+     - Description
+   * - EXT4_FC_TAG_HEAD
+     - Fast commit area header
+     - ``struct ext4_fc_head``
+     - Stores the TID of the transaction after which these fast commits should
+       be applied.
+   * - EXT4_FC_TAG_ADD_RANGE
+     - Add extent to inode
+     - ``struct ext4_fc_add_range``
+     - Stores the inode number and the extent to be added to this inode.
+   * - EXT4_FC_TAG_DEL_RANGE
+     - Remove logical offsets from inode
+     - ``struct ext4_fc_del_range``
+     - Stores the inode number and the logical offset range that needs to be
+       removed.
+   * - EXT4_FC_TAG_CREAT
+     - Create directory entry for a newly created file
+     - ``struct ext4_fc_dentry_info``
+     - Stores the parent inode number, inode number and directory entry of the
+       newly created file.
+   * - EXT4_FC_TAG_LINK
+     - Link a directory entry to an inode
+     - ``struct ext4_fc_dentry_info``
+     - Stores the parent inode number, inode number and directory entry.
+   * - EXT4_FC_TAG_UNLINK
+     - Unlink a directory entry of an inode
+     - ``struct ext4_fc_dentry_info``
+     - Stores the parent inode number, inode number and directory entry.
+
+   * - EXT4_FC_TAG_PAD
+     - Padding (unused area)
+     - None
+     - Unused bytes in the fast commit area.
+
+   * - EXT4_FC_TAG_TAIL
+     - Mark the end of a fast commit
+     - ``struct ext4_fc_tail``
+     - Stores the TID of the commit and the CRC of the fast commit that this
+       tag ends.
+
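
To make the "log of TLVs" layout concrete, here is a minimal userspace sketch that walks such a log until it reaches a tail tag. Several details are assumptions rather than facts taken from this patch: the header is modelled as a 16-bit tag followed by a 16-bit length, both little-endian; the length is taken to count only the value bytes; and the tag numbers in ``enum sk_tag`` are made up. The names ``sk_le16`` and ``sk_walk`` are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag numbers for illustration; not the real EXT4_FC_TAG_* values. */
enum sk_tag { SK_TAG_HEAD = 1, SK_TAG_PAD = 2, SK_TAG_TAIL = 3 };

/* Assumed header layout: 16-bit tag + 16-bit value length, little-endian. */
#define SK_HDR_LEN 4u

/* Decode a little-endian 16-bit integer regardless of host byte order. */
static uint16_t sk_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk a TLV log and count entries up to and including the first TAIL tag.
 * Returns the number of entries, or -1 if the log is truncated or has no TAIL.
 */
static int sk_walk(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int entries = 0;

    assert(buf != NULL || len == 0);
    while (len - off >= SK_HDR_LEN) {
        uint16_t tag = sk_le16(buf + off);
        uint16_t vlen = sk_le16(buf + off + 2);

        off += SK_HDR_LEN;
        if (vlen > len - off)
            return -1;      /* value runs past the end of the buffer */
        off += vlen;        /* skip the tag-specific value */
        entries++;
        if (tag == SK_TAG_TAIL)
            return entries; /* a tail tag closes the fast commit */
    }
    return -1;
}
```

A real replay path would also dispatch on each tag and verify the CRC stored in the tail; the sketch only shows how the tag and length fields let a reader step from one entry to the next.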

Documentation/filesystems/journalling.rst

Lines changed: 33 additions & 0 deletions
@@ -132,6 +132,39 @@ The opportunities for abuse and DOS attacks with this should be obvious,
 if you allow unprivileged userspace to trigger codepaths containing
 these calls.
 
+Fast commits
+~~~~~~~~~~~~
+
+JBD2 also allows you to perform file-system specific delta commits known as
+fast commits. In order to use fast commits, you first need to call
+:c:func:`jbd2_fc_init` and tell it how many blocks at the end of the journal
+area should be reserved for fast commits. Along with that, you will also need
+to set the following callbacks that perform the corresponding work:
+
+``journal->j_fc_cleanup_cb``: Cleanup function called after every full commit
+and fast commit.
+
+``journal->j_fc_replay_cb``: Replay function called for replay of fast commit
+blocks.
+
+The file system is free to perform fast commits as and when it wants, as long
+as it gets permission from JBD2 to do so by calling the function
+:c:func:`jbd2_fc_begin_commit()`. Once a fast commit is done, the client
+file system should tell JBD2 about it by calling
+:c:func:`jbd2_fc_end_commit()`. If the file system wants JBD2 to perform a full
+commit immediately after stopping the fast commit, it can do so by calling
+:c:func:`jbd2_fc_end_commit_fallback()`. This is useful if the fast commit
+operation fails for some reason and the only way to guarantee consistency is
+for JBD2 to perform the full traditional commit.
+
+JBD2 also provides helper functions to manage fast commit buffers. The file
+system can use :c:func:`jbd2_fc_get_buf()` and :c:func:`jbd2_fc_wait_bufs()`
+to allocate and wait on IO completion of fast commit buffers.
+
+Currently, only Ext4 implements fast commits. For details of its implementation
+of fast commits, please refer to the top level comments in
+fs/ext4/fast_commit.c.
+
 Summary
 ~~~~~~~
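
The begin/end/fallback protocol added to journalling.rst can be modelled in userspace to show the intended control flow. This is a sketch under stated assumptions, not kernel code: ``struct sk_journal`` stands in for the journal, and ``sk_fc_begin``, ``sk_fc_end`` and ``sk_commit`` are hypothetical functions that only mimic the roles of the ``jbd2_fc_*`` calls named in the patch. Their signatures are invented and should not be read as the JBD2 API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a journal; not the kernel's journal_t. */
struct sk_journal {
    bool fc_in_progress;     /* a fast commit currently holds permission */
    unsigned fast_commits;
    unsigned full_commits;
    unsigned cleanups;       /* times the cleanup callback ran */
    void (*cleanup_cb)(struct sk_journal *j);
};

/* Example cleanup callback that just counts its invocations. */
static void sk_count_cleanup(struct sk_journal *j)
{
    j->cleanups++;
}

/* Ask for permission to start a fast commit; refused if one is in flight. */
static bool sk_fc_begin(struct sk_journal *j)
{
    if (j->fc_in_progress)
        return false;
    j->fc_in_progress = true;
    return true;
}

/* End a fast commit; with fallback set, a full commit is performed instead. */
static void sk_fc_end(struct sk_journal *j, bool fallback)
{
    assert(j->fc_in_progress);
    j->fc_in_progress = false;
    if (fallback)
        j->full_commits++;
    else
        j->fast_commits++;
    if (j->cleanup_cb)
        j->cleanup_cb(j);    /* cleanup runs after every full and fast commit */
}

/* Client-side flow; write_ok models whether writing the delta succeeded. */
static bool sk_commit(struct sk_journal *j, bool write_ok)
{
    if (!sk_fc_begin(j))
        return false;
    sk_fc_end(j, !write_ok); /* fall back to a full commit on failure */
    return write_ok;
}
```

The point of the model is the shape of the client code: permission is requested first, the delta is written, and a failed write ends in the fallback path so that consistency is restored by a full commit.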
