From bbcec73783c552658a4fe54de8aee110109874bc Mon Sep 17 00:00:00 2001
From: Matthew Ahrens
Date: Tue, 6 Apr 2021 12:44:54 -0700
Subject: [PATCH] kmem_alloc(KM_SLEEP) should use kvmalloc()

`kmem_alloc(size>PAGESIZE, KM_SLEEP)` is backed by `kmalloc()`, which
finds contiguous physical memory. If there isn't enough contiguous
physical memory available (e.g. due to physical page fragmentation), the
OOM killer will be invoked to make more memory available. This is not
ideal because processes may be killed when there is still plenty of free
memory (it just happens to be in individual pages, not contiguous runs
of pages). We have observed this when allocating the ~13KB `zfs_cmd_t`,
for example in `zfsdev_ioctl()`.

This commit changes the behavior of `kmem_alloc(size>PAGESIZE,
KM_SLEEP)` when there are insufficient contiguous free pages. In this
case we will find individual pages and stitch them together using
virtual memory. This is accomplished by using `kvmalloc()`, which
implements the described behavior by trying `kmalloc(__GFP_NORETRY)` and
falling back on `vmalloc()`.

The behavior of `kmem_alloc(KM_NOSLEEP)` is not changed; it continues to
use `kmalloc(GFP_ATOMIC | __GFP_NORETRY)`. This is because `vmalloc()`
may sleep.

Reviewed-by: Tony Nguyen
Reviewed-by: Brian Behlendorf
Reviewed-by: George Wilson
Signed-off-by: Matthew Ahrens
Closes #11461
---
 module/os/linux/spl/spl-kmem.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/module/os/linux/spl/spl-kmem.c b/module/os/linux/spl/spl-kmem.c
index 943966cbb1..2b342140d0 100644
--- a/module/os/linux/spl/spl-kmem.c
+++ b/module/os/linux/spl/spl-kmem.c
@@ -245,7 +245,21 @@ spl_kmem_alloc_impl(size_t size, int flags, int node)
 				return (NULL);
 			}
 		} else {
+			/*
+			 * We use kmalloc when doing kmem_alloc(KM_NOSLEEP),
+			 * because kvmalloc/vmalloc may sleep.  We also use
+			 * kmalloc on systems with limited kernel VA space (e.g.
+			 * 32-bit), which have HIGHMEM.  Otherwise we use
+			 * kvmalloc, which tries to get contiguous physical
+			 * memory (fast, like kmalloc) and falls back on using
+			 * virtual memory to stitch together pages (slow, like
+			 * vmalloc).
+			 */
+#ifdef CONFIG_HIGHMEM
 			if (flags & KM_VMEM) {
+#else
+			if ((flags & KM_VMEM) || !(flags & KM_NOSLEEP)) {
+#endif
 				ptr = spl_kvmalloc(size, lflags);
 			} else {
 				ptr = kmalloc_node(size, lflags, node);
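
Reviewer note: the branch condition in the hunk above can be read as a
small decision table. The following is a minimal userspace sketch of that
condition only, not the SPL code itself. The `SKETCH_*` names, the flag
bit values, and the `highmem` parameter (standing in for whether
`CONFIG_HIGHMEM` is defined at build time) are all hypothetical and exist
only for illustration; a sleeping allocation is modeled as the absence of
the no-sleep bit.

```c
#include <stdbool.h>

/*
 * Hypothetical flag bits for illustration only; the real KM_* constants
 * come from the SPL headers and may have different values.
 */
#define	SKETCH_KM_NOSLEEP	0x1
#define	SKETCH_KM_VMEM		0x2

enum sketch_backend {
	SKETCH_KMALLOC,		/* models the kmalloc_node() path */
	SKETCH_KVMALLOC		/* models the spl_kvmalloc() path */
};

/*
 * Models the patched `if` condition. `highmem` is an assumption that
 * replaces the compile-time #ifdef CONFIG_HIGHMEM with a runtime
 * argument so both configurations can be exercised in one program.
 */
static enum sketch_backend
sketch_pick_backend(int flags, bool highmem)
{
	/* An explicit KM_VMEM request selects kvmalloc in both configs. */
	if (flags & SKETCH_KM_VMEM)
		return (SKETCH_KVMALLOC);

	/*
	 * Without HIGHMEM, any allocation that is allowed to sleep also
	 * takes the kvmalloc path, because the vmalloc fallback may sleep.
	 */
	if (!highmem && !(flags & SKETCH_KM_NOSLEEP))
		return (SKETCH_KVMALLOC);

	/* KM_NOSLEEP callers and HIGHMEM systems keep using kmalloc. */
	return (SKETCH_KMALLOC);
}
```

Under these assumptions, `sketch_pick_backend(0, false)` selects the
kvmalloc path, while `sketch_pick_backend(SKETCH_KM_NOSLEEP, false)` and
`sketch_pick_backend(0, true)` both stay on kmalloc.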