Prevent metaslab_sync panic due to spa_final_dirty_txg

If a pool enables the SPACEMAP_HISTOGRAM feature shortly before being
exported, we can enter a situation that causes a kernel panic. Any metaslabs
that are loaded during the final dirty txg and haven't already been condensed
will cause metaslab_sync to proceed after the final dirty txg so that the
condense can be performed, which the existing assertions forbid. Because of
the nature of this issue, there are a number of ways we can enter this
state. Rather than try to prevent each of them one by one, potentially missing
some edge cases, we instead cut it off at the point of intersection:
metaslab_sync no longer proceeds if it would only do so to perform a condense
and we're past the final dirty txg. This preserves the utility of the
existing asserts while preventing this particular issue.

Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #9185
Closes #9186
Closes #9231
Closes #9253
Paul Dagnelie 2019-08-30 09:28:31 -07:00 committed by Brian Behlendorf
parent e2fcfa70e3
commit 475aa97cab
2 changed files with 12 additions and 7 deletions


@@ -3553,12 +3553,19 @@ metaslab_sync(metaslab_t *msp, uint64_t txg)
 	/*
 	 * Normally, we don't want to process a metaslab if there are no
 	 * allocations or frees to perform. However, if the metaslab is being
-	 * forced to condense and it's loaded, we need to let it through.
+	 * forced to condense, it's loaded and we're not beyond the final
+	 * dirty txg, we need to let it through. Not condensing beyond the
+	 * final dirty txg prevents an issue where metaslabs that need to be
+	 * condensed but were loaded for other reasons could cause a panic
+	 * here. By only checking the txg in that branch of the conditional,
+	 * we preserve the utility of the VERIFY statements in all other
+	 * cases.
 	 */
 	if (range_tree_is_empty(alloctree) &&
 	    range_tree_is_empty(msp->ms_freeing) &&
 	    range_tree_is_empty(msp->ms_checkpointing) &&
-	    !(msp->ms_loaded && msp->ms_condense_wanted))
+	    !(msp->ms_loaded && msp->ms_condense_wanted &&
+	    txg <= spa_final_dirty_txg(spa)))
 		return;
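
The early-return condition in this hunk can be modeled in isolation. The sketch below is a hedged illustration only, not the ZFS code: the struct ms_model_t and the function ms_sync_skips are hypothetical names, the three booleans stand in for the range_tree_is_empty() results, and the final dirty txg is passed as a plain integer instead of being read through spa_final_dirty_txg(spa).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the metaslab state that the check reads.
 * The *_empty fields model the range_tree_is_empty() results.
 */
typedef struct {
	bool alloc_empty;		/* alloctree is empty */
	bool freeing_empty;		/* ms_freeing is empty */
	bool checkpointing_empty;	/* ms_checkpointing is empty */
	bool loaded;			/* ms_loaded */
	bool condense_wanted;		/* ms_condense_wanted */
} ms_model_t;

/*
 * Returns true when the modeled metaslab_sync would return early.
 * A forced condense only lets the metaslab through while txg has not
 * passed the final dirty txg; real allocations or frees always fall
 * through to the later checks, whatever the txg.
 */
static bool
ms_sync_skips(const ms_model_t *ms, uint64_t txg, uint64_t final_dirty_txg)
{
	bool condense_allowed = ms->loaded && ms->condense_wanted &&
	    txg <= final_dirty_txg;

	return (ms->alloc_empty && ms->freeing_empty &&
	    ms->checkpointing_empty && !condense_allowed);
}
```

With all three trees empty and a condense wanted on a loaded metaslab, the model lets txg 10 through when the final dirty txg is 10 and skips txg 11. If any tree is non-empty it never skips, so a late txg with real work would still reach the VERIFY statements, which is the property the comment in the patch describes.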


@@ -481,15 +481,13 @@ tests = ['zpool_trim_attach_detach_add_remove',
 tags = ['functional', 'zpool_trim']

 [tests/functional/cli_root/zpool_upgrade]
-tests = ['zpool_upgrade_001_pos',
+tests = ['zpool_upgrade_001_pos', 'zpool_upgrade_002_pos',
+    'zpool_upgrade_003_pos', 'zpool_upgrade_004_pos',
     'zpool_upgrade_005_neg', 'zpool_upgrade_006_neg',
+    'zpool_upgrade_007_pos', 'zpool_upgrade_008_pos',
     'zpool_upgrade_009_neg']
 tags = ['functional', 'cli_root', 'zpool_upgrade']
-# Disabled pending resolution of #9185 and #9186.
-# 'zpool_upgrade_002_pos', 'zpool_upgrade_003_pos', 'zpool_upgrade_004_pos',
-# 'zpool_upgrade_007_pos', 'zpool_upgrade_008_pos',

 [tests/functional/cli_user/misc]
 tests = ['zdb_001_neg', 'zfs_001_neg', 'zfs_allow_001_neg',
     'zfs_clone_001_neg', 'zfs_create_001_neg', 'zfs_destroy_001_neg',