zfs/scripts/commitcheck.sh

219 lines
4.9 KiB
Bash
Raw Normal View History

bash scripts: use /usr/bin/env for bash shebangs Not all systems / distros have a `/bin/bash`, and these scripts are more difficult to run at development time. For example, my system is NixOS which doesn't have a /bin/bash. This is not a problem for NixOS building ZFS as a package: the build environment automatically replaces these shebangs with corrected paths. The problem is much more annoying at development time: either the scripts don't run, or I correct them for my local machine and deal with a perpetually dirty work tree. Before committing this patch I confirmed there are existing scripts which use `/usr/bin/env` to locate bash, so I am thinking this is a safe transformation. There are a handful of other shebangs in this repository which don't work on my system. This patch is useful on its own specifically for `commitcheck.sh`, otherwise I can't validate my commits before submission. Here are the remaining shebangs which NixOS systems won't have: 1274 #!/bin/ksh -p 91 #!/bin/ksh 89 #! /bin/ksh -p 2 #!/bin/sed -f 1 #!/usr/bin/perl -w 1 #!/usr/bin/ksh 1 #!/bin/nawk -f plus this which will create an invalid shebang in `tests/zfs-tests/tests/functional/mv_files/mv_files_common.kshlib`: echo "#!/bin/ksh" > $TEST_BASE_DIR/exitsZero.ksh I chose to leave those alone for now, and gauge the interest in this much smaller patch first. The fixes for these are easy enough by simply using `/usr/bin/env ksh`: 91 #!/bin/ksh 1 #!/usr/bin/ksh The fix for the other set is much trickier. Quoting the GNU coreutils manual: Most operating systems (e.g. GNU/Linux, BSDs) treat all text after the first space as a single argument. When using env in a script it is thus not possible to specify multiple arguments. and not all `env`'s support arguments. Mine (GNU Coreutils 8.31) does, though this feature is new since April 2018, GNU Coreutils 8.30: https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=668306ed86c8c79b0af0db8b9c882654ebb66db2 and worse, requires the -S argument: -S, --split-string=S process and split S into separate arguments; used to pass multiple arguments on shebang lines Example: $ seq 1 2 | $(nix-build '<nixpkgs>' -A coreutils)/bin/env "sort -nr" /nix/[...]-coreutils-8.31/bin/env: ‘sort -nr’: No such file or directory /nix/[...]-coreutils-8.31/bin/env: use -[v]S to pass options in shebang lines $ seq 1 2 | $(nix-build '<nixpkgs>' -A coreutils)/bin/env "-S sort -nr" 2 1 GNU Coreutils says FreeBSD's `env` does, though I wonder if FreeBSD's would be unhappy with the `-S`: https://www.gnu.org/software/coreutils/manual/html_node/env-invocation.html#env-invocation BusyBox v1.30.1 does not, and does not have a `-S`-like option: $ seq 1 2 | $(nix-build '<nixpkgs>' -A busybox)/bin/env "sort -nr" env: can't execute 'sort -nr': No such file or directory Toybox 0.8.1 also does not, and also does not have a `-S` option: $ seq 1 2 | $(nix-build '<nixpkgs>' -A toybox)/bin/env "sort -nr" env: exec sort -nr: No such file or directory --- At any rate, if this patch merges and the remaining ~1,500 are updated, the much larger patch should probably include a checkstyle-like test asserting all new shebangs use `/usr/bin/env`. I also don't mind dealing with NixOS weirdness if the project would prefer that. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Signed-off-by: Graham Christensen <graham@grahamc.com> Closes #9893
2020-02-10 21:13:46 +00:00
#!/usr/bin/env bash
REF="HEAD"
# test a url
function test_url()
{
url="$1"
if ! curl --output /dev/null --max-time 60 \
--silent --head --fail "$url" ; then
echo "\"$url\" is unreachable"
return 1
fi
return 0
}
# test commit body for length
Direct IO support Direct IO via the O_DIRECT flag was originally introduced in XFS by IRIX for database workloads. Its purpose was to allow the database to bypass the page and buffer caches to prevent unnecessary IO operations (e.g. readahead) while preventing contention for system memory between the database and kernel caches. On Illumos, there is a library function called directio(3C) that allows user space to provide a hint to the file system that Direct IO is useful, but the file system is free to ignore it. The semantics are also entirely a file system decision. Those that do not implement it return ENOTTY. Since the semantics were never defined in any standard, O_DIRECT is implemented such that it conforms to the behavior described in the Linux open(2) man page as follows. 1. Minimize cache effects of the I/O. By design the ARC is already scan-resistant which helps mitigate the need for special O_DIRECT handling. Data which is only accessed once will be the first to be evicted from the cache. This behavior is in consistent with Illumos and FreeBSD. Future performance work may wish to investigate the benefits of immediately evicting data from the cache which has been read or written with the O_DIRECT flag. Functionally this behavior is very similar to applying the 'primarycache=metadata' property per open file. 2. O_DIRECT _MAY_ impose restrictions on IO alignment and length. No additional alignment or length restrictions are imposed. 3. O_DIRECT _MAY_ perform unbuffered IO operations directly between user memory and block device. No unbuffered IO operations are currently supported. In order to support features such as transparent compression, encryption, and checksumming a copy must be made to transform the data. 4. O_DIRECT _MAY_ imply O_DSYNC (XFS). O_DIRECT does not imply O_DSYNC for ZFS. Callers must provide O_DSYNC to request synchronous semantics. 5. O_DIRECT _MAY_ disable file locking that serializes IO operations. Applications should avoid mixing O_DIRECT and normal IO or mmap(2) IO to the same file. This is particularly true for overlapping regions. All I/O in ZFS is locked for correctness and this locking is not disabled by O_DIRECT. However, concurrently mixing O_DIRECT, mmap(2), and normal I/O on the same file is not recommended. This change is implemented by layering the aops->direct_IO operations on the existing AIO operations. Code already existed in ZFS on Linux for bypassing the page cache when O_DIRECT is specified. References: * http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch02s09.html * https://blogs.oracle.com/roch/entry/zfs_and_directio * https://ext4.wiki.kernel.org/index.php/Clarifying_Direct_IO's_Semantics * https://illumos.org/man/3c/directio Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #224 Closes #7823
2018-08-27 17:04:21 +00:00
# lines containing urls are exempt for the length limit.
function test_commit_bodylength()
{
length="72"
Direct IO support Direct IO via the O_DIRECT flag was originally introduced in XFS by IRIX for database workloads. Its purpose was to allow the database to bypass the page and buffer caches to prevent unnecessary IO operations (e.g. readahead) while preventing contention for system memory between the database and kernel caches. On Illumos, there is a library function called directio(3C) that allows user space to provide a hint to the file system that Direct IO is useful, but the file system is free to ignore it. The semantics are also entirely a file system decision. Those that do not implement it return ENOTTY. Since the semantics were never defined in any standard, O_DIRECT is implemented such that it conforms to the behavior described in the Linux open(2) man page as follows. 1. Minimize cache effects of the I/O. By design the ARC is already scan-resistant which helps mitigate the need for special O_DIRECT handling. Data which is only accessed once will be the first to be evicted from the cache. This behavior is in consistent with Illumos and FreeBSD. Future performance work may wish to investigate the benefits of immediately evicting data from the cache which has been read or written with the O_DIRECT flag. Functionally this behavior is very similar to applying the 'primarycache=metadata' property per open file. 2. O_DIRECT _MAY_ impose restrictions on IO alignment and length. No additional alignment or length restrictions are imposed. 3. O_DIRECT _MAY_ perform unbuffered IO operations directly between user memory and block device. No unbuffered IO operations are currently supported. In order to support features such as transparent compression, encryption, and checksumming a copy must be made to transform the data. 4. O_DIRECT _MAY_ imply O_DSYNC (XFS). O_DIRECT does not imply O_DSYNC for ZFS. Callers must provide O_DSYNC to request synchronous semantics. 5. O_DIRECT _MAY_ disable file locking that serializes IO operations. Applications should avoid mixing O_DIRECT and normal IO or mmap(2) IO to the same file. This is particularly true for overlapping regions. All I/O in ZFS is locked for correctness and this locking is not disabled by O_DIRECT. However, concurrently mixing O_DIRECT, mmap(2), and normal I/O on the same file is not recommended. This change is implemented by layering the aops->direct_IO operations on the existing AIO operations. Code already existed in ZFS on Linux for bypassing the page cache when O_DIRECT is specified. References: * http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch02s09.html * https://blogs.oracle.com/roch/entry/zfs_and_directio * https://ext4.wiki.kernel.org/index.php/Clarifying_Direct_IO's_Semantics * https://illumos.org/man/3c/directio Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Signed-off-by: Richard Yao <ryao@gentoo.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #224 Closes #7823
2018-08-27 17:04:21 +00:00
body=$(git log -n 1 --pretty=%b "$REF" | grep -Ev "http(s)*://" | grep -E -m 1 ".{$((length + 1))}")
if [ -n "$body" ]; then
echo "error: commit message body contains line over ${length} characters"
return 1
fi
return 0
}
# check for a tagged line
function check_tagged_line()
{
regex='^\s*'"$1"':\s[[:print:]]+\s<[[:graph:]]+>$'
foundline=$(git log -n 1 "$REF" | grep -E -m 1 "$regex")
if [ -z "$foundline" ]; then
echo "error: missing \"$1\""
return 1
fi
return 0
}
# check for a tagged line and check that the link is valid
function check_tagged_line_with_url()
{
regex='^\s*'"$1"':\s\K([[:graph:]]+)$'
foundline=$(git log -n 1 "$REF" | grep -Po "$regex")
if [ -z "$foundline" ]; then
echo "error: missing \"$1\""
return 1
fi
OLDIFS=$IFS
IFS=$'\n'
for url in $(echo -e "$foundline"); do
if ! test_url "$url"; then
return 1
fi
done
IFS=$OLDIFS
return 0
}
# check commit message for a normal commit
function new_change_commit()
{
error=0
# subject is not longer than 72 characters
long_subject=$(git log -n 1 --pretty=%s "$REF" | grep -E -m 1 '.{73}')
if [ -n "$long_subject" ]; then
echo "error: commit subject over 72 characters"
error=1
fi
# need a signed off by
if ! check_tagged_line "Signed-off-by" ; then
error=1
fi
# ensure that no lines in the body of the commit are over 72 characters
if ! test_commit_bodylength ; then
error=1
fi
return $error
}
function is_openzfs_port()
{
# subject starts with OpenZFS means it's an openzfs port
subject=$(git log -n 1 --pretty=%s "$REF" | grep -E -m 1 '^OpenZFS')
if [ -n "$subject" ]; then
return 0
fi
return 1
}
function openzfs_port_commit()
{
error=0
# subject starts with OpenZFS dddd
subject=$(git log -n 1 --pretty=%s "$REF" | grep -E -m 1 '^OpenZFS [[:digit:]]+(, [[:digit:]]+)* - ')
if [ -z "$subject" ]; then
echo "error: OpenZFS patch ports must have a subject line that starts with \"OpenZFS dddd - \""
error=1
fi
# need an authored by line
if ! check_tagged_line "Authored by" ; then
error=1
fi
# need a reviewed by line
if ! check_tagged_line "Reviewed by" ; then
error=1
fi
# need ported by line
if ! check_tagged_line "Ported-by" ; then
error=1
fi
# need a url to openzfs commit and it should be valid
if ! check_tagged_line_with_url "OpenZFS-commit" ; then
error=1
fi
# need a url to illumos issue and it should be valid
if ! check_tagged_line_with_url "OpenZFS-issue" ; then
error=1
fi
return $error
}
function is_coverity_fix()
{
# subject starts with Fix coverity defects means it's a coverity fix
subject=$(git log -n 1 --pretty=%s "$REF" | grep -E -m 1 '^Fix coverity defects')
if [ -n "$subject" ]; then
return 0
fi
return 1
}
function coverity_fix_commit()
{
error=0
# subject starts with Fix coverity defects: CID dddd, dddd...
subject=$(git log -n 1 --pretty=%s "$REF" |
grep -E -m 1 'Fix coverity defects: CID [[:digit:]]+(, [[:digit:]]+)*')
if [ -z "$subject" ]; then
echo "error: Coverity defect fixes must have a subject line that starts with \"Fix coverity defects: CID dddd\""
error=1
fi
# need a signed off by
if ! check_tagged_line "Signed-off-by" ; then
error=1
fi
# test each summary line for the proper format
OLDIFS=$IFS
IFS=$'\n'
for line in $(git log -n 1 --pretty=%b "$REF" | grep -E '^CID'); do
echo "$line" | grep -E '^CID [[:digit:]]+: ([[:graph:]]+|[[:space:]])+ \(([[:upper:]]|\_)+\)' > /dev/null
# shellcheck disable=SC2181
if [[ $? -ne 0 ]]; then
echo "error: commit message has an improperly formatted CID defect line"
error=1
fi
done
IFS=$OLDIFS
# ensure that no lines in the body of the commit are over 72 characters
if ! test_commit_bodylength; then
error=1
fi
return $error
}
if [ -n "$1" ]; then
REF="$1"
fi
# if openzfs port, test against that
if is_openzfs_port; then
if ! openzfs_port_commit ; then
exit 1
else
exit 0
fi
fi
# if coverity fix, test against that
if is_coverity_fix; then
if ! coverity_fix_commit; then
exit 1
else
exit 0
fi
fi
# have a normal commit
if ! new_change_commit ; then
exit 1
fi
exit 0