Commit deb016c

Author: Andreas Gruenbacher
gfs2: No more self recovery

When a node withdraws and it turns out that it is the only node that has the filesystem mounted, gfs2 currently tries to replay the local journal to bring the filesystem back into a consistent state. Not only is that a very bad idea, it has also never worked because gfs2_recover_func() will refuse to do anything during a withdraw.

However, before even getting to this point, gfs2_recover_func() dereferences sdp->sd_jdesc->jd_inode. This was a use-after-free before commit 04133b6 ("gfs2: Prevent double iput for journal on error") and is a NULL pointer dereference since then.

Simply get rid of self recovery to fix that.

Fixes: 601ef0d ("gfs2: Force withdraw to replay journals and wait for it to finish")
Reported-by: Chunjie Zhu <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>

1 parent 557c024 commit deb016c
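
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `model_*` type, function, and enum value below is a hypothetical stand-in invented for illustration, not the kernel's definition, and the real gfs2_recover_func() does far more than this. The model assumes, as the message states, that the journal inode pointer is touched before the withdraw check and that it is NULL by the time a sole mounter reaches recovery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (not the real layouts). */
struct model_inode { int ino; };
struct model_jdesc { struct model_inode *jd_inode; };
struct model_sbd {
	bool withdrawn;               /* node is in the middle of a withdraw */
	struct model_jdesc *sd_jdesc; /* this node's own journal descriptor */
};

enum model_result {
	MODEL_RECOVERED,      /* journal was replayed */
	MODEL_REFUSED,        /* recovery declined because of the withdraw */
	MODEL_WOULD_OOPS,     /* the kernel would dereference NULL here */
	MODEL_SKIPPED,        /* self recovery not attempted at all */
	MODEL_LEFT_TO_OTHERS  /* another node holds the "live" lock */
};

/* Models the recovery function: jd_inode is touched before the withdraw check. */
static enum model_result model_recover_func(const struct model_sbd *sdp)
{
	assert(sdp != NULL && sdp->sd_jdesc != NULL);
	if (sdp->sd_jdesc->jd_inode == NULL)
		return MODEL_WOULD_OOPS;
	if (sdp->withdrawn)
		return MODEL_REFUSED;
	return MODEL_RECOVERED;
}

/* Old behaviour: a sole mounter (got the "live" lock in EX) tries self recovery. */
static enum model_result model_withdraw_old(const struct model_sbd *sdp,
					    bool got_live_ex)
{
	if (got_live_ex)
		return model_recover_func(sdp);
	return MODEL_LEFT_TO_OTHERS;
}

/* New behaviour: a sole mounter skips recovery and leaves it to the next first mounter. */
static enum model_result model_withdraw_new(const struct model_sbd *sdp,
					    bool got_live_ex)
{
	(void)sdp; /* recovery state is never inspected on this path */
	if (got_live_ex)
		return MODEL_SKIPPED;
	return MODEL_LEFT_TO_OTHERS;
}
```

In this model the old path can never return `MODEL_RECOVERED` during a withdraw: it either hits the NULL journal inode or is refused, which mirrors the claim that self recovery "has also never worked".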

1 file changed, +11 -20 lines changed

fs/gfs2/util.c

Lines changed: 11 additions & 20 deletions
@@ -232,32 +232,23 @@ static void signal_our_withdraw(struct gfs2_sbd *sdp)
 	 */
 	ret = gfs2_glock_nq(&sdp->sd_live_gh);
 
+	gfs2_glock_put(live_gl); /* drop extra reference we acquired */
+	clear_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags);
+
 	/*
 	 * If we actually got the "live" lock in EX mode, there are no other
-	 * nodes available to replay our journal. So we try to replay it
-	 * ourselves. We hold the "live" glock to prevent other mounters
-	 * during recovery, then just dequeue it and reacquire it in our
-	 * normal SH mode. Just in case the problem that caused us to
-	 * withdraw prevents us from recovering our journal (e.g. io errors
-	 * and such) we still check if the journal is clean before proceeding
-	 * but we may wait forever until another mounter does the recovery.
+	 * nodes available to replay our journal.
 	 */
 	if (ret == 0) {
-		fs_warn(sdp, "No other mounters found. Trying to recover our "
-			"own journal jid %d.\n", sdp->sd_lockstruct.ls_jid);
-		if (gfs2_recover_journal(sdp->sd_jdesc, 1))
-			fs_warn(sdp, "Unable to recover our journal jid %d.\n",
-				sdp->sd_lockstruct.ls_jid);
-		gfs2_glock_dq_wait(&sdp->sd_live_gh);
-		gfs2_holder_reinit(LM_ST_SHARED,
-				   LM_FLAG_NOEXP | GL_EXACT | GL_NOPID,
-				   &sdp->sd_live_gh);
-		gfs2_glock_nq(&sdp->sd_live_gh);
+		fs_warn(sdp, "No other mounters found.\n");
+		/*
+		 * We are about to release the lockspace. By keeping live_gl
+		 * locked here, we ensure that the next mounter coming along
+		 * will be a "first" mounter which will perform recovery.
+		 */
+		goto skip_recovery;
 	}
 
-	gfs2_glock_put(live_gl); /* drop extra reference we acquired */
-	clear_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags);
-
 	/*
 	 * At this point our journal is evicted, so we need to get a new inode
 	 * for it. Once done, we need to call gfs2_find_jhead which
