| 1 | +.. SPDX-License-Identifier: GPL-2.0 |
| 2 | + |
| 3 | +Orphan file |
| 4 | +----------- |
| 5 | + |
| 6 | +In Unix there can be inodes that are unlinked from the directory hierarchy but |
| 7 | +are still alive because they are open. After a crash the filesystem has to |
| 8 | +clean up these inodes, as otherwise they (and the blocks referenced from them) |
| 9 | +would leak. Similarly, if we truncate or extend a file, we may not be able |
| 10 | +to perform the operation in a single journalling transaction. In that case we |
| 11 | +track the inode as an orphan so that after a crash any extra blocks allocated |
| 12 | +to the file get truncated. |
| 13 | + |
| 14 | +Traditionally ext4 tracks orphan inodes in a singly linked list: the |
| 15 | +superblock contains the inode number of the last orphan inode (s\_last\_orphan |
| 16 | +field) and each inode contains the inode number of the previously orphaned |
| 17 | +inode (we overload the i\_dtime inode field for this). However, this |
| 18 | +filesystem-global singly linked list is a scalability bottleneck for workloads |
| 19 | +that create orphan inodes heavily. When the orphan file feature |
| 20 | +(COMPAT\_ORPHAN\_FILE) is enabled, the filesystem has a special inode |
| 21 | +(referenced from the superblock through s\_orphan\_file\_inum) with several |
| 22 | +blocks. Each of these blocks has the following structure: |
| 23 | + |
| 24 | +.. list-table:: |
| 25 | + :widths: 8 8 24 40 |
| 26 | + :header-rows: 1 |
| 27 | + |
| 28 | + * - Offset |
| 29 | + - Type |
| 30 | + - Name |
| 31 | + - Description |
| 32 | + * - 0x0 |
| 33 | + - Array of \_\_le32 entries |
| 34 | + - Orphan inode entries |
| 35 | + - Each \_\_le32 entry is either empty (0) or contains the inode number of |
| 36 | + an orphan inode. |
| 37 | + * - blocksize - 8 |
| 38 | + - \_\_le32 |
| 39 | + - ob\_magic |
| 40 | + - Magic value stored in the orphan block tail (0x0b10ca04). |
| 41 | + * - blocksize - 4 |
| 42 | + - \_\_le32 |
| 43 | + - ob\_checksum |
| 44 | + - Checksum of the orphan block. |
| 45 | + |
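The block layout in the table above can be illustrated with a short user-space sketch. It is not kernel code: the function name `parse_orphan_block` and the constant names are hypothetical, chosen only for this example. It assumes the block is already in memory as raw bytes, and it returns the stored checksum without verifying it, since the checksum algorithm is not described here.

```python
import struct

ORPHAN_BLOCK_MAGIC = 0x0B10CA04  # ob_magic value from the table above
TAIL_SIZE = 8                    # ob_magic + ob_checksum, two __le32 fields


def parse_orphan_block(block: bytes):
    """Return (orphan inode numbers, stored checksum) for one orphan-file block.

    Hypothetical helper for illustration; the checksum is returned as stored
    and is NOT verified.
    """
    blocksize = len(block)
    # The tail occupies the last 8 bytes: magic at blocksize - 8,
    # checksum at blocksize - 4, both little-endian 32-bit values.
    magic, checksum = struct.unpack_from("<II", block, blocksize - TAIL_SIZE)
    if magic != ORPHAN_BLOCK_MAGIC:
        raise ValueError("bad orphan block magic: %#x" % magic)
    # Everything before the tail is an array of __le32 entries.
    n_entries = (blocksize - TAIL_SIZE) // 4
    entries = struct.unpack_from("<%dI" % n_entries, block, 0)
    # An entry of 0 is empty; any other value is an orphan inode number.
    return [ino for ino in entries if ino != 0], checksum
```

With a 1024-byte block, for example, this leaves room for (1024 - 8) / 4 = 254 entries per block.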
| 46 | +When a filesystem with the orphan file feature is mounted writable, we set the |
| 47 | +RO\_COMPAT\_ORPHAN\_PRESENT feature in the superblock to indicate that there may |
| 48 | +be valid orphan entries. If we see this feature when mounting the |
| 49 | +filesystem, we read the whole orphan file and process all orphan inodes found |
| 50 | +there as usual. When cleanly unmounting the filesystem, we remove the |
| 51 | +RO\_COMPAT\_ORPHAN\_PRESENT feature to avoid unnecessary scanning of the orphan |
| 52 | +file and to make the filesystem fully compatible with older kernels. |
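The flag handling described above can be sketched as a toy model. This is an illustration only, not the kernel implementation: the function names are hypothetical, the bit value used for RO\_COMPAT\_ORPHAN\_PRESENT is an assumption made for the example, and block checksums are ignored.

```python
import struct

# Assumed bit value for RO_COMPAT_ORPHAN_PRESENT; illustrative only.
ORPHAN_PRESENT = 0x10000
ORPHAN_BLOCK_MAGIC = 0x0B10CA04  # ob_magic from the block layout


def mount_writable(ro_compat: int) -> int:
    """Set the flag: valid orphan entries may exist from now on."""
    return ro_compat | ORPHAN_PRESENT


def clean_unmount(ro_compat: int) -> int:
    """Clear the flag so the next mount can skip scanning the orphan file."""
    return ro_compat & ~ORPHAN_PRESENT


def collect_orphans(ro_compat: int, orphan_blocks):
    """Return the inode numbers to clean up at mount time.

    If the flag is clear, the orphan file is not scanned at all.
    """
    if not ro_compat & ORPHAN_PRESENT:
        return []
    orphans = []
    for block in orphan_blocks:
        # Check the magic stored at blocksize - 8 before trusting the entries.
        (magic,) = struct.unpack_from("<I", block, len(block) - 8)
        if magic != ORPHAN_BLOCK_MAGIC:
            raise ValueError("bad orphan block magic: %#x" % magic)
        n_entries = (len(block) - 8) // 4
        entries = struct.unpack_from("<%dI" % n_entries, block, 0)
        orphans.extend(ino for ino in entries if ino != 0)
    return orphans
```

The point of the sketch is the early return: after a clean unmount the flag is clear, so mounting never has to read the orphan file.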