dnode_is_dirty: use dn_dirty_txg to check dirtiness

dn_dirty_txg is always set to the highest txg that has ever dirtied the
dnode. It is set in dbuf_dirty() when a data or metadnode dbuf is
dirtied, and never cleared.

[analysis of bug #15526 and fix #15571 below, for future readers]

The previous dirty check was:

    for (int i = 0; i < TXG_SIZE; i++) {
        if (multilist_link_active(&dn->dn_dirty_link[i]))
            [dnode is dirty]
    }

However, this check is not "is the dnode dirty?" but rather, "is the
dnode on a list?".

There is a gap in dmu_objset_sync_dnodes() where the dnode is moved from
os_dirty_dnodes to os_synced_dnodes, before dnode_sync() is called to
write out the dirty dbufs. So, there is a moment when the dnode is not
on a list, and the check reports it as clean even though it still has
dirty data.
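
The gap can be sketched with a toy model. This is only an illustration of
the ordering described above, not the real ZFS code: struct gap_dnode and
every gap_* function are hypothetical stand-ins, and a single boolean
replaces the real dn_dirty_link list membership.

```c
#include <stdbool.h>

/* Toy stand-in for illustration; this is NOT the real dnode_t. */
struct gap_dnode {
	bool on_list;		/* models an active dn_dirty_link entry */
	int dirty_dbufs;	/* models dirty dbufs not yet written out */
};

/* Old-style check: really answers "is the dnode on a list?" */
static bool
gap_is_on_list(const struct gap_dnode *dn)
{
	return (dn->on_list);
}

/* Step 1 of the modelled sync loop: unlink from the dirty list. */
static void
gap_remove_from_dirty_list(struct gap_dnode *dn)
{
	dn->on_list = false;
}

/* Step 2: link onto the synced list. */
static void
gap_insert_on_synced_list(struct gap_dnode *dn)
{
	dn->on_list = true;
}

/* Step 3: write out the dirty dbufs (models dnode_sync()). */
static void
gap_sync(struct gap_dnode *dn)
{
	dn->dirty_dbufs = 0;
}

/*
 * Walk the steps in order and sample the list check between step 1 and
 * step 2. Returns true if the check said "clean" while dirty data was
 * still pending.
 */
static bool
gap_misreports_clean(void)
{
	struct gap_dnode dn = { .on_list = true, .dirty_dbufs = 3 };

	gap_remove_from_dirty_list(&dn);
	bool seen_clean = !gap_is_on_list(&dn);
	bool really_dirty = (dn.dirty_dbufs > 0);
	gap_insert_on_synced_list(&dn);
	gap_sync(&dn);

	return (seen_clean && really_dirty);
}
```

In this model gap_misreports_clean() returns true: an observer sampling
between the two list operations sees "not on a list" while dirty dbufs
remain.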

It doesn't matter that the dirty check takes dn_mtx, because that lock
isn't used for dn_dirty_link. The os_dirty_dnodes sublist lock is held
in dmu_objset_sync_dnodes(), but trying to take that would mean possibly
waiting until everything on that sublist has been synced.

The correct fix has to check something that positively asserts the dnode
is dirty, rather than an implementation detail. dn_dirty_txg (via
DNODE_IS_DIRTY()) is that - it's a normal bit of dnode state, under the
dn_mtx lock, and unambiguously indicates whether or not there are changes
pending.
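
A minimal sketch of a txg-based check, under stated assumptions: struct
txg_dnode and the txg_* functions are hypothetical, locking is omitted,
and the comparison against a "currently syncing txg" is an assumption
about how DNODE_IS_DIRTY() decides, not something this message spells out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for illustration; this is NOT the real dnode_t. */
struct txg_dnode {
	uint64_t dirty_txg;	/* highest txg that dirtied it; never cleared */
};

/* Record a dirtying event; the stored txg only ever moves forward. */
static void
txg_mark_dirty(struct txg_dnode *dn, uint64_t txg)
{
	if (txg > dn->dirty_txg)
		dn->dirty_txg = txg;
}

/*
 * Assumed rule: the dnode is dirty if it was dirtied in the txg that is
 * currently syncing or in any later one. The real check would run under
 * dn_mtx; this sketch has no locking.
 */
static bool
txg_is_dirty(const struct txg_dnode *dn, uint64_t syncing_txg)
{
	return (dn->dirty_txg >= syncing_txg);
}
```

Because the answer depends only on a stored value, it does not change
while the dnode is being moved between lists.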

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Rob Norris 2023-11-30 22:00:03 +11:00 committed by Rob Norris
parent a03ebd9bee
commit 531af601a1
1 changed files with 3 additions and 19 deletions

@@ -1778,31 +1778,15 @@ dnode_try_claim(objset_t *os, uint64_t object, int slots)
 }
 
 /*
- * Checks if the dnode itself is dirty, or is carrying any uncommitted records.
- * It is important to check both conditions, as some operations (eg appending
- * to a file) can dirty both as a single logical unit, but they are not synced
- * out atomically, so checking one and not the other can result in an object
- * appearing to be clean mid-way through a commit.
- *
- * Do not change this lightly! If you get it wrong, dmu_offset_next() can
- * detect a hole where there is really data, leading to silent corruption.
+ * Check if the dnode (including its data) is dirty on this or any future txg.
  */
 boolean_t
 dnode_is_dirty(dnode_t *dn)
 {
 	mutex_enter(&dn->dn_mtx);
-
-	for (int i = 0; i < TXG_SIZE; i++) {
-		if (multilist_link_active(&dn->dn_dirty_link[i]) ||
-		    !list_is_empty(&dn->dn_dirty_records[i])) {
-			mutex_exit(&dn->dn_mtx);
-			return (B_TRUE);
-		}
-	}
-
+	boolean_t dirty = DNODE_IS_DIRTY(dn);
 	mutex_exit(&dn->dn_mtx);
-
-	return (B_FALSE);
+	return (dirty);
 }
 
 void