Archive-Team/zfs - zfs - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Brian Behlendorf	e130330a87	Handle /etc/mtab -> /proc/mounts symlink Under Fedora 15 /etc/mtab is now a symlink to /proc/mounts by default. When /etc/mtab is a symlink the mount.zfs helper should not update it. There was code in place to handle this case but it used stat() which traverses the link and then issues the stat on /proc/mounts. We need to use lstat() to prevent the link traversal and instead stat /etc/mtab. Closes #270	2011-06-14 16:48:38 -07:00
Brian Behlendorf	2e08aedba4	Always check -Wno-unused-but-set-variable gcc support The previous commit `8a7e1ceefa` wasn't quite right. This check applies to both the user and kernel space build and as such we must make sure it runs regardless of what the --with-config option is set too. For example, if --with-config=kernel then the autoconf test does not run and we generate build warnings when compiling the kernel packages.	2011-06-14 16:40:35 -07:00
Brian Behlendorf	8a7e1ceefa	Check for -Wno-unused-but-set-variable gcc support Gcc versions 4.3.2 and earlier do not support the compiler flag -Wno-unused-but-set-variable. This can lead to build failures on older Linux platforms such as Debian Lenny. Since this is an optional build argument this changes add a new autoconf check for the option. If it is supported by the installed version of gcc then it is used otherwise it is omited. See commit's `12c1acde76` and `79713039a2` for the reason the -Wno-unused-but-set-variable options was originally added.	2011-06-14 14:43:22 -07:00
Brian Behlendorf	10715a0187	Add default stack checking When your kernel is built with kernel stack tracing enabled and you have the debugfs filesystem mounted. Then the zfs.sh script will clear the worst observed kernel stack depth on module load and check the worst case usage on module removal. If the stack depth ever exceeds 7000 bytes the full stack will be printed for debugging. This is dangerously close to overrunning the default 8k stack. This additional advisory debugging is particularly valuable when running the regression tests on a kernel built with 16k stacks. In this case, almost no matter how bad the stack overrun is you will see be able to get a clean stack trace for debugging. Since the worst case stack usage can be highly variable it's helpful to always check the worst case usage.	2011-06-13 13:50:21 -07:00
Brian Behlendorf	da88a7fbe8	Pass -f option for import If a pool was not cleanly exported passing the -f flag may be required at 'zpool import' time. Since this test is simply validating that the pool can be successfully imported in the absense of the cache file always pass the -f to ensure it succeeds. This failure was observed under RHEL6.1.	2011-06-10 11:21:31 -07:00
Brian Behlendorf	1b9d8c340f	Fix 'zfs send -D' segfault Sending pools with dedup results in a segfault due to a Solaris portability issue. Under Solaris the pipe(2) library call creates a bidirectional data channel. Unfortunately, on Linux pipe(2) call creates unidirection data channel. The fix is to use the socketpair(2) function to create the expected bidirectional channel. Seth Heeren did the original leg work on this issue for zfs-fuse. We finally just rediscovered the same portability issue and dfurphy was able to point me at the original issue for the fix. Closes #268	2011-06-09 13:58:48 -07:00
Brian Behlendorf	cbc6fab65c	Sanatize zpios-sanity.sh environment Just like zconfig.sh the zpios-sanity.sh tests should run in a sanatized environment. This ensures they never conflict with an installed /etc/zfs/zpool.cache file. This commit additionally improves the -c cleanup option. It now removes the modules stack if loaded and destroys relevant md devices. This behavior is now identical to zconfig.sh.	2011-06-03 15:08:49 -07:00
Brian Behlendorf	608860b6d0	Delay before destroying loopback devices Generally I don't approve of just adding an arbitrary delay to avoid a problem but in this case I'm going to let it slide. We may need to delay briefly after 'zpool destroy' returns to ensure the loopback devices are closed. If they aren't closed than losetup -d will not be able to destroy them. Unfortunately, there's no easy state the check so we'll have to make due with a simple delay.	2011-06-03 14:38:25 -07:00
Brian Behlendorf	36391312af	Always unload zpios.ko on exit We should always unload zpios.ko on exit. This ensures that subsequent calls to 'zfs.sh -u' from other utilities will be able to unload the module stack and properly cleanup. This is important for the the --cleanup option which can be passed to zconfig.sh and zfault.sh.	2011-06-02 10:25:35 -07:00
Brian Behlendorf	2ea9dc40f8	Fix zpios-sanity.sh return code The zpios-sanity.sh script should return failure when any of the individual zpios.sh tests fail. The previous code would always return success suppressing real failures.	2011-06-02 10:13:15 -07:00
Brian Behlendorf	e95b3bdcbb	Fix stack ddt_class_contains() Stack usage for ddt_class_contains() reduced from 524 bytes to 68 bytes. This large stack allocation significantly contributed to the likelyhood of a stack overflow when scrubbing/resilvering dedup pools.	2011-05-31 12:17:27 -07:00
Brian Behlendorf	5b8c7bbcea	Fix stack ddt_zap_lookup() Stack usage for ddt_zap_lookup() reduced from 368 bytes to 120 bytes. This large stack allocation significantly contributed to the likelyhood of a stack overflow when scrubbing/resilvering dedup pools.	2011-05-31 12:17:27 -07:00
Brian Behlendorf	c7f8f831a4	Revert "Fix stack traverse_visitbp()" This abomination is no longer required because the zio's issued during this recursive call path will now be handled asynchronously by the taskq thread pool. This reverts commit `6656bf5621`.	2011-05-31 12:17:27 -07:00
Brian Behlendorf	2fac4c2a74	Make tgx_sync_thread zio's async The majority of the recursive operations performed by the dsl are done either in the context of the tgx_sync_thread or during pool import. It is these recursive operations which contribute greatly to the stack depth. When this recursion is coupled with a synchronous I/O in the same context overflow becomes possible. Previously to handle this case I have focused on keeping the individual stack frames as light as possible. This is a good idea as long as it can be done in a way which doesn't overly complicate the code. However, there is a better solution. If we treat all zio's issued by the tgx_sync_thread as async then we can use the tgx_sync_thread stack for the recursive parts, and the zio_* threads for the I/O parts. This effectively doubles our available stack space with the only drawback being a small delay to schedule the I/O. However, in practice the scheduling time is so much smaller than the actual I/O time this isn't an issue. Another benefit of making the zio async is that the zio pipeline is now parallel. That should mean for CPU intensive pipelines such as compression or dedup performance may be improved. With this change in place the worst case stack usage observed so far is 6902 bytes. This is still higher than I'd like but significantly improved. Additional changes to specific functions should improve this further. This change allows us to revent commit `6656bf5` which did some horrible things to the recursive traverse_visitbp() callpath in the name of saving stack.	2011-05-31 12:17:27 -07:00
Brian Behlendorf	f74fae8b30	Fix 4K sector support Yesterday I ran across a 3TB drive which exposed 4K sectors to Linux. While I thought I had gotten this support correct it turns out there were 2 subtle bugs which prevented it from working. sudo ./cmd/zpool/zpool create -f large-sector /dev/sda cannot create 'large-sector': one or more devices is currently unavailable 1) The first issue was that it was possible that bdev_capacity() would return the number of 512 byte sectors rather than the number of 4096 sectors. Internally, certain Linux functions only operate with 512 byte sectors so you need to be careful. To avoid any confusion in the future I've updated bdev_capacity() to simply return the device (or partition) capacity in bytes. The higher levels of ZFS want the value in bytes anyway so this is cleaner. 2) When creating a bio the ->bi_sector count must always be expressed in 512 byte sectors. The existing code would scale the byte offset by the logical sector size. Until now this was always 512 so it never caused problems. Trying a 4K sector drive clearly exposed the issue. The problem has been fixed by hard coding the 512 byte sector which is exactly what the bio code does internally. With these changes I'm now able to create ZFS pools using 4K sector drives. No issues were observed during fairly extensive testing. This is also a low risk change if your using 512b sectors devices because none of the logic changes. Closes #256	2011-05-27 11:38:53 -07:00
Brian Behlendorf	2b8cad6159	Use vmem_alloc() for zfs_ioc_userspace_many() The default buffer size when requesting multiple quota entries is 100 times the zfs_useracct_t size. In practice this works out to exactly 27200 bytes. Since this will be a short lived buffer in a non-performance critical path it is preferable to vmem_alloc() the needed memory.	2011-05-20 14:23:18 -07:00
Brian Behlendorf	4804b739e1	Default to internal 'zfs userspace' implementation We will never bring over the pyzfs.py helper script from Solaris to Linux. Instead the missing functionality will be directly integrated in to the zfs commands and libraries. To avoid confusion remove the warning about the missing pyzfs.py utility and simply use the default internal support. The Illumous developers are of the same mind and have proposed an initial patch to do this which has been integrated in to the 'allow' development branch. After some additional testing this code can be merged in to master as the right long term solution.	2011-05-20 10:25:41 -07:00
Brian Behlendorf	f01b360e67	Pass caller's credential in zfsdev_ioctl() Initially when zfsdev_ioctl() was ported to Linux we didn't have any credential support implemented. So at the time we simply passed NULL which wasn't much of a problem since most of the secpolicy code was disabled. However, one exception is quota handling which does require the credential. Now that proper credentials are supported we can safely start passing the callers credential. This is also an initial step towards fully implemented the zfs secpolicy.	2011-05-20 10:12:25 -07:00
Brian Behlendorf	3fd70ee6b0	Fix 'negative objects to delete' warning Normally when the arc_shrinker_func() function is called the return value should be: >=0 - To indicate the number of freeable objects in the cache, or -1 - To indicate this cache should be skipped However, when the shrinker callback is called with 'nr_to_scan' equal to zero. The caller simply wants the number of freeable objects in the cache and we must never return -1. This patch reorders the first two conditionals in arc_shrinker_func() to ensure this behavior. This patch also now explictly casts arc_size and arc_c_min to signed int64_t types so MAX(x, 0) works as expected. As unsigned types we would never see an negative value which defeated the purpose of the MAX() lower bound and broke the shrinker logic. Finally, when nr_to_scan is non-zero we explictly prevent all reclaim below arc_c_min. This is done to prevent the Linux page cache from completely crowding out the ARC. This limit is tunable and some experimentation is likely going to be required to set it exactly right. For now we're sticking with the OpenSolaris defaults. Closes #218 Closes #243	2011-05-18 10:29:22 -07:00
Alexey Shvetsov	d9bfe0f57a	Fix distribution detection for gentoo Also this may fix other distros because some of them also provide /etc/lsb-release not only ubuntu. Closes #244	2011-05-14 08:54:48 -07:00
Brian Behlendorf	e814770f2e	Update synchronous open zfs_close() comment The comment in zfs_close() pertaining to decrementing the synchronous open count needs to be updated for Linux. The code was already updated to be correct, but the comment was missed and is now misleading. Under Linux the zfs_close() hook is only called once when the final reference is dropped. This differs from Solaris where zfs_close() is called for each close. Closes #237	2011-05-13 08:20:06 -07:00
Alexey Shvetsov	6f582dc708	Remove root 'ls' after mount workaround This workaround was introduced to workaround issue #164. This issue was fixed by commit `5f35b19` so the workaround can be safely dropped from both the zfs.fedora and zfs.gentoo init scripts.	2011-05-12 15:01:35 -07:00
Alexey Shvetsov	06abcdd3f4	Fix zfs.gentoo init script logic * Fix zfs.ko module check * Check 'zfs umount -a' return value	2011-05-12 14:45:57 -07:00
Alexey Shvetsov	04c22478a7	Make zfs.gentoo init script more gentoo style. * Improved compatibility with openrc * Removed LOCKFILE * Improved checksystem() function * Remove /etc/mtab check for / * General cleanup	2011-05-12 14:42:43 -07:00
Brian Behlendorf	c91d229809	Merge pull request #235 from nedbass/rdev Don't store rdev in SA for FIFOs and sockets	2011-05-09 16:41:28 -07:00
Ned A. Bass	aa6d8c1086	Don't store rdev in SA for FIFOs and sockets Update the handling of named pipes and sockets to be consistent with other platforms with regard to the rdev attribute. While all ZFS ipmlementations store the rdev for device files in a system attribute (SA), this is not the case for FIFOs and sockets. Indeed, Linux always passes rdev=0 to mknod() for FIFOs and sockets, so the value is not needed. Add an ASSERT that rdev==0 for FIFOs and sockets to detect if the expected behavior ever changes. Closes #216	2011-05-09 13:35:07 -07:00
Brian Behlendorf	21ade34764	Disable direct reclaim for z_wr_* threads The direct reclaim path in the z_wr_* threads must be disabled to ensure forward progress is always maintained for txg processing. This ensures that a txg will never get stuck waiting on itself because it entered the following memory reclaim callpath. ->prune_icache()->dispose_list()->zpl_clear_inode()->zfs_inactive() ->dmu_tx_assign()->dmu_tx_wait()->tgx_wait_open() It would be preferable to target this exact code path but the kernel offers no way to do this without custom patches. To avoid this we are forced to disable all reclaim for these threads. It should not be necessary to do this for other other z_* threads because they will not hold a txg open. Closes #232	2011-05-06 15:26:26 -07:00
Brian Behlendorf	3117dd0b90	Handle NULL in nfsd .fsync() hook How nfsd handles .fsync() has been changed a couple of times in the recent kernels. But basically there are three cases we need to consider. Linux 2.6.12 - 2.6.33 * The .fsync() hook takes 3 arguments * The nfsd will call .fsync() with a NULL file struct pointer. Linux 2.6.34 * The .fsync() hook takes 3 arguments * The nfsd no longer calls .fsync() but instead used sync_inode() Linux 2.6.35 - 2.6.x * The .fsync() hook takes 2 arguments * The nfsd no longer calls .fsync() but instead used sync_inode() For once it looks like we've gotten lucky. The first two cases can actually be collased in to one if we stop using the file struct pointer entirely. Since the dentry is still passed in both cases this is possible. The last case can then be safely handled by unconditionally using the dentry in the file struct pointer now that we know the nfsd caller has been removed. Closes #230	2011-05-06 12:33:45 -07:00
Brian Behlendorf	6ee44e32be	Fix awk usage The zpool_id and zpool_layout helper scripts have been updated to use the more common /usr/bin/awk symlink. On Fedora/Redhat systems there are both /bin/awk and /usr/bin/awk symlinks to your installed version of awk. On Debian/Ubuntu systems only the /usr/bin/awk symlink exists. Additionally, add the '\<' token to the beginning of the regex pattern to prevent partial matches. This pattern only appears to work with gawk despite the mawk man page claiming to support this extended regex. Thus you will need to have gawk installed to use these optional helper scripts. A comment has been added to the script to reflect this reality.	2011-05-06 10:16:04 -07:00
Brian Behlendorf	34b84cb831	Use vmem_alloc() for zfs_ioc_pool_get_history() The default buffer size when requesting history is 128k. This is far to large for a kmem_alloc() so instead use the slower vmem_alloc(). This path has no performance concerns and the buffer is immediately free'd after its contents are copied to the user space buffer.	2011-05-06 09:59:52 -07:00
Brian Behlendorf	3613204cd7	Allow mounting of read-only snapshots With the addition of the mount helper we accidentally regressed the ability to manually mount snapshots. This commit updates the mount helper to expect the possibility of a ZFS_TYPE_SNAPSHOT. All snapshot will be automatically treated as 'legacy' type mounts so they can be mounted manually.	2011-05-05 10:13:38 -07:00
Brian Behlendorf	c409e4647f	Add missing ZFS tunables This commit adds module options for all existing zfs tunables. Ideally the average user should never need to modify any of these values. However, in practice sometimes you do need to tweak these values for one reason or another. In those cases it's nice not to have to resort to rebuilding from source. All tunables are visable to modinfo and the list is as follows: $ modinfo module/zfs/zfs.ko filename: module/zfs/zfs.ko license: CDDL author: Sun Microsystems/Oracle, Lawrence Livermore National Laboratory description: ZFS srcversion: 8EAB1D71DACE05B5AA61567 depends: spl,znvpair,zcommon,zunicode,zavl vermagic: 2.6.32-131.0.5.el6.x86_64 SMP mod_unload modversions parm: zvol_major:Major number for zvol device (uint) parm: zvol_threads:Number of threads for zvol device (uint) parm: zio_injection_enabled:Enable fault injection (int) parm: zio_bulk_flags:Additional flags to pass to bulk buffers (int) parm: zio_delay_max:Max zio millisec delay before posting event (int) parm: zio_requeue_io_start_cut_in_line:Prioritize requeued I/O (bool) parm: zil_replay_disable:Disable intent logging replay (int) parm: zfs_nocacheflush:Disable cache flushes (bool) parm: zfs_read_chunk_size:Bytes to read per chunk (long) parm: zfs_vdev_max_pending:Max pending per-vdev I/Os (int) parm: zfs_vdev_min_pending:Min pending per-vdev I/Os (int) parm: zfs_vdev_aggregation_limit:Max vdev I/O aggregation size (int) parm: zfs_vdev_time_shift:Deadline time shift for vdev I/O (int) parm: zfs_vdev_ramp_rate:Exponential I/O issue ramp-up rate (int) parm: zfs_vdev_read_gap_limit:Aggregate read I/O over gap (int) parm: zfs_vdev_write_gap_limit:Aggregate write I/O over gap (int) parm: zfs_vdev_scheduler:I/O scheduler (charp) parm: zfs_vdev_cache_max:Inflate reads small than max (int) parm: zfs_vdev_cache_size:Total size of the per-disk cache (int) parm: zfs_vdev_cache_bshift:Shift size to inflate reads too (int) parm: zfs_scrub_limit:Max scrub/resilver I/O per leaf vdev (int) parm: zfs_recover:Set to attempt to recover from fatal errors (int) parm: spa_config_path:SPA config file (/etc/zfs/zpool.cache) (charp) parm: zfs_zevent_len_max:Max event queue length (int) parm: zfs_zevent_cols:Max event column width (int) parm: zfs_zevent_console:Log events to the console (int) parm: zfs_top_maxinflight:Max I/Os per top-level (int) parm: zfs_resilver_delay:Number of ticks to delay resilver (int) parm: zfs_scrub_delay:Number of ticks to delay scrub (int) parm: zfs_scan_idle:Idle window in clock ticks (int) parm: zfs_scan_min_time_ms:Min millisecs to scrub per txg (int) parm: zfs_free_min_time_ms:Min millisecs to free per txg (int) parm: zfs_resilver_min_time_ms:Min millisecs to resilver per txg (int) parm: zfs_no_scrub_io:Set to disable scrub I/O (bool) parm: zfs_no_scrub_prefetch:Set to disable scrub prefetching (bool) parm: zfs_txg_timeout:Max seconds worth of delta per txg (int) parm: zfs_no_write_throttle:Disable write throttling (int) parm: zfs_write_limit_shift:log2(fraction of memory) per txg (int) parm: zfs_txg_synctime_ms:Target milliseconds between tgx sync (int) parm: zfs_write_limit_min:Min tgx write limit (ulong) parm: zfs_write_limit_max:Max tgx write limit (ulong) parm: zfs_write_limit_inflated:Inflated tgx write limit (ulong) parm: zfs_write_limit_override:Override tgx write limit (ulong) parm: zfs_prefetch_disable:Disable all ZFS prefetching (int) parm: zfetch_max_streams:Max number of streams per zfetch (uint) parm: zfetch_min_sec_reap:Min time before stream reclaim (uint) parm: zfetch_block_cap:Max number of blocks to fetch at a time (uint) parm: zfetch_array_rd_sz:Number of bytes in a array_read (ulong) parm: zfs_pd_blks_max:Max number of blocks to prefetch (int) parm: zfs_dedup_prefetch:Enable prefetching dedup-ed blks (int) parm: zfs_arc_min:Min arc size (ulong) parm: zfs_arc_max:Max arc size (ulong) parm: zfs_arc_meta_limit:Meta limit for arc size (ulong) parm: zfs_arc_reduce_dnlc_percent:Meta reclaim percentage (int) parm: zfs_arc_grow_retry:Seconds before growing arc size (int) parm: zfs_arc_shrink_shift:log2(fraction of arc to reclaim) (int) parm: zfs_arc_p_min_shift:arc_c shift to calc min/max arc_p (int)	2011-05-04 10:02:37 -07:00
Brian Behlendorf	8db77dd7ed	Prep zfs-0.6.0-rc4 tag Create the fourth 0.6.0 release candidate tag (rc4).	2011-05-03 10:29:05 -07:00
Brian Behlendorf	712f8bd87b	Add Gentoo/Lunar/Redhat Init Scripts Every distribution has slightly different requirements for their init scripts. Because of this the zfs package contains several init scripts for various distributions. These scripts have been contributed by, and are supported by, the larger zfs community. Init scripts for Gentoo/Lunar/Redhat have been contributed by: Gentoo - devsk <devsku@gmail.com> Lunar - Jean-Michel Bruenn <jean.bruenn@ip-minds.de> Redhat - Fajar A. Nugraha <list@fajar.net>	2011-05-02 15:59:13 -07:00
Brian Behlendorf	5f35b19007	Fully update inode when created When a new znode/inode pair is created both the znode and the inode should be immediately updated to the correct values. This was done for the znode and for most of the values in the inode, but not all of them. This normally wasn't a problem because most subsequent operations would cause the inode to be immediately updated. This change ensures the inode is now fully updated before it is inserted in to the inode hash. Closes #116 Closes #146 Closes #164	2011-05-02 14:04:19 -07:00
Brian Behlendorf	df554c148e	Fix 'zfs set volsize=N pool/dataset' This change fixes a kernel panic which would occur when resizing a dataset which was not open. The objset_t stored in the zvol_state_t will be set to NULL when the block device is closed. To avoid this issue we pass the correct objset_t as the third arg. The code has also been updated to correctly notify the kernel when the block device capacity changes. For 2.6.28 and newer kernels the capacity change will be immediately detected. For earlier kernels the capacity change will be detected when the device is next opened. This is a known limitation of older kernels. Online ext3 resize test case passes on 2.6.28+ kernels: $ dd if=/dev/zero of=/tmp/zvol bs=1M count=1 seek=1023 $ zpool create tank /tmp/zvol $ zfs create -V 500M tank/zd0 $ mkfs.ext3 /dev/zd0 $ mkdir /mnt/zd0 $ mount /dev/zd0 /mnt/zd0 $ df -h /mnt/zd0 $ zfs set volsize=800M tank/zd0 $ resize2fs /dev/zd0 $ df -h /mnt/zd0 Original-patch-by: Fajar A. Nugraha <github@fajar.net> Closes #68 Closes #84	2011-05-02 08:54:40 -07:00
Alejandro R. Sedeño	e90a3de3e8	Add zpl_export.c to the list of targets.	2011-04-29 19:50:35 -07:00
Brian Behlendorf	94e954257a	Correct MAXUID The uid_t on most systems is in fact and unsigned 32-bit value. This is almost always correct, however you could compile your kernel to use an unsigned 16-bit value for uid_t. In practice I've never encountered a distribution which does this so I'm willing to overlook this corner case for now. Closes #165	2011-04-29 14:03:12 -07:00
Gunnar Beutner	055656d4f4	Implemented NFS export_operations. Implemented the required NFS operations for exporting ZFS datasets using the in-kernel NFS daemon.	2011-04-29 12:36:13 -07:00
Brian Behlendorf	5476e6952c	Suppress 'vdev_metaslab_init' memory warning The vdev_metaslab_init() function has been observed to allocate larger than 8k chunks. However, they are not much larger than 8k and it does this infrequently so it is allowed and the warning is supressed.	2011-04-27 09:35:18 -07:00
Brian Behlendorf	40a39e1103	Conserve stack in dsl_scan_visit() The dsl_scan_visit() function is a little heavy weight taking 464 bytes on the stack. This can be easily reduced for little cost by moving zap_cursor_t and zap_attribute_t off the stack and on to the heap. After this change dsl_scan_visit() has been reduced in size by 320 bytes. This change was made to reduce stack usage in the dsl_scan_sync() callpath which is recursive and has been observed to overflow the stack. Issue #174	2011-04-26 15:48:00 -07:00
Brian Behlendorf	b81c4ac9af	Conserve stack in dsl_scan_visitbp() This function is called recursively so everything possible must be done to limit its stack consumption. The dprintf_bp() debugging function adds 30 bytes of local variables to the function we cannot afford. By commenting out this debugging we save 30 bytes per recursion and depths of 13 are not uncommon. This yeilds a total stack saving of 390 bytes on our 8k stack. Issue #174	2011-04-26 15:48:00 -07:00
Brian Behlendorf	7a060636b0	Conserve stack in dsl_scan_visitbp() The recursive call chain dsl_scan_visitbp() -> dsl_scan_recurse() -> dsl_scan_visitdnode() -> dsl_scan_visitbp has been observed to consume considerable stack resulting in a stack overflow (>8k). The cleanest way I see to fix this with minimal impact to the existing flow of code, and with the fewest performance concerns, is to always inline dsl_scan_recurse() and dsl_scan_visitdnode(). While this will increase the function size of dsl_scan_visitbp(), by 4660 bytes, it also reduces the stack requirements by removing the function call overhead. Issue #174	2011-04-26 13:37:35 -07:00
Brian Behlendorf	44e9e34793	Merged pull request #212 from dajhorn/hostid. Use gethostid in the Linux convention.	2011-04-26 13:30:27 -07:00
Brian Behlendorf	701b1f8168	Fix zvol deadlock It's possible for a zvol_write thread to enter direct memory reclaim while holding open a transaction group. This results in the system attempting to write out data to the disk to free memory. Unfortunately, this can't succeed because the the thread doing reclaim is holding open the txg which must be closed to be synced to disk. To prevent this the offending allocation is marked KM_PUSHPAGE which will prevent it from attempting writeback. Closes #191	2011-04-26 12:56:35 -07:00
Darik Horn	492b8e9e7b	Use gethostid in the Linux convention. Disable the gethostid() override for Solaris behavior because Linux systems implement the POSIX standard in a way that allows a negative result. Mask the gethostid() result to the lower four bytes, like coreutils does in /usr/bin/hostid, to prevent junk bits or sign-extension on systems that have an eight byte long type. This can cause a spurious hostid mismatch that prevents zpool import on 64-bit systems.	2011-04-25 10:36:17 -05:00
Brian Behlendorf	a1cc0b3290	Fix 32-bit MAXOFFSET_T definition Having MAXOFFSET_T defined to 0x7fffffffl was artificially limiting the maximum file size on 32-bit systems. In reality MAXOFFSET_T is used when working with 'long long' types and as such we now define it as LLONG_MAX. This resolves the 2GB file size limit for files and additionally allows zvols greater than 2GB on 32-bit systems. Closes #136 Closes #81	2011-04-22 16:21:26 -07:00
Brian Behlendorf	e2448b0e62	Fix spurious -EFAULT when setting I/O scheduler Occasionally we would see an -EFAULT returned when setting the I/O scheduler on a vdev. This was caused an improperly formatted user mode helper command. This commit restructures the command to something simpler, allocates space for it dynamically to save stack, and removes the retry logic which is no longer needed. Closes #169	2011-04-22 14:55:35 -07:00
Brian Behlendorf	6a8f9b6bf0	Enforce ARC meta-data limits This change ensures the ARC meta-data limits are enforced. Without this enforcement meta-data can grow to consume all of the ARC cache pushing out data and hurting performance. The cache is aggressively reclaimed but this is a soft and not a hard limit. The cache may exceed the set limit briefly before being brought under control. By default 25% of the ARC capacity can be used for meta-data. This limit can be tuned by setting the 'zfs_arc_meta_limit' module option. Once this limit is exceeded meta-data reclaim will occur in 3 percent chunks, or may be tuned using 'arc_reduce_dnlc_percent'. Closes #193	2011-04-21 13:49:31 -07:00
Gunnar Beutner	36df284366	Fixed a use-after-free bug in zfs_zget(). Fixed a bug where zfs_zget could access a stale znode pointer when the inode had already been removed from the inode cache via iput -> iput_final -> ... -> zfs_zinactive but the corresponding SA handle was still alive. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #180	2011-04-21 13:48:01 -07:00

1 2 3 4 5 ...

407 Commits All Branches Search

407 Commits

All Branches