Ubuntu-focal-kernel/mm
Huang Ying c6ddfb6c58 mm, clear_huge_page: move order algorithm into a separate function
Patch series "mm, huge page: Copy target sub-page last when copy huge
page", v2.

Huge page helps to reduce TLB miss rate, but it has higher cache
footprint, sometimes this may cause some issue.  For example, when
copying huge page on x86_64 platform, the cache footprint is 4M.  But on
a Xeon E5 v3 2699 CPU, there are 18 cores, 36 threads, and only 45M LLC
(last level cache).  That is, in average, there are 2.5M LLC for each
core and 1.25M LLC for each thread.

If the cache contention is heavy when copying the huge page, and we copy
the huge page from the begin to the end, it is possible that the begin
of huge page is evicted from the cache after we finishing copying the
end of the huge page.  And it is possible for the application to access
the begin of the huge page after copying the huge page.

In c79b57e462 ("mm: hugetlb: clear target sub-page last when clearing
huge page"), to keep the cache lines of the target subpage hot, the
order to clear the subpages in the huge page in clear_huge_page() is
changed to clearing the subpage which is furthest from the target
subpage firstly, and the target subpage last.  The similar order
changing helps huge page copying too.  That is implemented in this
patchset.

The patchset is a generic optimization which should benefit quite some
workloads, not for a specific use case.  To demonstrate the performance
benefit of the patchset, we have tested it with vm-scalability run on
transparent huge page.

With this patchset, the throughput increases ~16.6% in vm-scalability
anon-cow-seq test case with 36 processes on a 2 socket Xeon E5 v3 2699
system (36 cores, 72 threads).  The test case set
/sys/kernel/mm/transparent_hugepage/enabled to be always, mmap() a big
anonymous memory area and populate it, then forked 36 child processes,
each writes to the anonymous memory area from the begin to the end, so
cause copy on write.  For each child process, other child processes
could be seen as other workloads which generate heavy cache pressure.
At the same time, the IPC (instruction per cycle) increased from 0.63 to
0.78, and the time spent in user space is reduced ~7.2%.

This patch (of 4):

In c79b57e462 ("mm: hugetlb: clear target sub-page last when clearing
huge page"), to keep the cache lines of the target subpage hot, the
order to clear the subpages in the huge page in clear_huge_page() is
changed to clearing the subpage which is furthest from the target
subpage firstly, and the target subpage last.  This optimization could
be applied to copying huge page too with the same order algorithm.  To
avoid code duplication and reduce maintenance overhead, in this patch,
the order algorithm is moved out of clear_huge_page() into a separate
function: process_huge_page().  So that we can use it for copying huge
page too.

This will change the direct calls to clear_user_highpage() into the
indirect calls.  But with the proper inline support of the compilers,
the indirect call will be optimized to be the direct call.  Our tests
show no performance change with the patch.

This patch is a code cleanup without functionality change.

Link: http://lkml.kernel.org/r/20180524005851.4079-2-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Shaohua Li <shli@fb.com>
Cc: Christopher Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:29 -07:00
..
kasan kasan: fix shadow_size calculation error in kasan_module_alloc 2018-07-03 17:32:19 -07:00
Kconfig kconfig: add a Memory Management options" menu 2018-08-02 08:06:55 +09:00
Kconfig.debug
Makefile mm: restructure memfd code 2018-06-07 17:34:35 -07:00
backing-dev.c bdi: Fix another oops in wb_workfn() 2018-06-22 12:08:07 -06:00
balloon_compaction.c
bootmem.c docs/mm: bootmem: add overview documentation 2018-08-02 12:17:27 -06:00
cleancache.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
cma.c Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE" 2018-05-24 10:07:50 -07:00
cma.h
cma_debug.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
compaction.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
debug.c mm: teach dump_page() to correctly output poisoned struct pages 2018-07-03 17:32:19 -07:00
debug_page_ref.c
dmapool.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
early_ioremap.c
fadvise.c
failslab.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
filemap.c mm: use new return type vm_fault_t 2018-06-07 17:34:36 -07:00
frame_vector.c
frontswap.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
gup.c mm: do not bug_on on incorrect length in __mm_populate() 2018-07-14 11:11:10 -07:00
gup_benchmark.c treewide: kvzalloc() -> kvcalloc() 2018-06-12 16:19:22 -07:00
highmem.c
hmm.c mm: convert return type of handle_mm_fault() caller to vm_fault_t 2018-08-17 16:20:28 -07:00
huge_memory.c thp: use mm_file_counter to determine update which rss counter 2018-08-17 16:20:28 -07:00
hugetlb.c ipc/shm.c add ->pagesize function to shm_vm_ops 2018-08-02 16:03:40 -07:00
hugetlb_cgroup.c mm: rename page_counter's count/limit into usage/max 2018-06-07 17:34:35 -07:00
hwpoison-inject.c
init-mm.c mm: Allocate the mm_cpumask (mm->cpu_bitmap[]) dynamically based on nr_cpu_ids 2018-07-17 09:35:30 +02:00
internal.h Changes for 4.18: 2018-06-05 13:24:20 -07:00
interval_tree.c
khugepaged.c mm: thp: inc counter for collapsed shmem THP 2018-08-17 16:20:28 -07:00
kmemleak-test.c
kmemleak.c
ksm.c mm: convert return type of handle_mm_fault() caller to vm_fault_t 2018-08-17 16:20:28 -07:00
list_lru.c
maccess.c
madvise.c
memblock.c This was a moderately busy cycle for docs, with the usual collection of 2018-08-14 14:29:31 -07:00
memcontrol.c for-4.19/block-20180812 2018-08-14 10:23:25 -07:00
memfd.c alloc_file(): switch to passing O_... flags instead of FMODE_... mode 2018-07-12 10:02:57 -04:00
memory-failure.c
memory.c mm, clear_huge_page: move order algorithm into a separate function 2018-08-17 16:20:29 -07:00
memory_hotplug.c mm: move is_pageblock_removable_nolock() to mm/memory_hotplug.c 2018-06-07 17:34:36 -07:00
mempolicy.c mm: use vma_init() to initialize VMAs on stack and data segments 2018-07-26 19:38:03 -07:00
mempool.c mm/mempool.c: remove unused argument in kasan_unpoison_element() and remove_element() 2018-08-17 16:20:28 -07:00
memtest.c
migrate.c dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
mincore.c
mlock.c dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
mm_init.c
mmap.c dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings 2018-06-20 19:10:01 +02:00
mremap.c mremap: remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns 2018-06-15 07:55:24 +09:00
msync.c
nobootmem.c mm/memblock: add a name for memblock flags enumeration 2018-08-02 12:17:27 -06:00
nommu.c mm: fix vma_is_anonymous() false-positives 2018-07-26 19:38:03 -07:00
oom_kill.c mm: fix oom_kill event handling 2018-06-15 07:55:25 +09:00
page-writeback.c
page_alloc.c mm, page_alloc: actually ignore mempolicies for high priority allocations 2018-08-17 16:20:28 -07:00
page_counter.c memcg: introduce memory.min 2018-06-07 17:34:36 -07:00
page_ext.c mm/page_ext.c: constify lookup_page_ext() argument 2018-08-17 16:20:28 -07:00
page_idle.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
page_io.c swap,blkcg: issue swap io with the appropriate context 2018-07-09 09:07:54 -06:00
page_isolation.c
page_owner.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
page_poison.c
page_vma_mapped.c
pagewalk.c
percpu-internal.h
percpu-km.c
percpu-stats.c treewide: Use array_size() in vmalloc() 2018-06-12 16:19:22 -07:00
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c readahead: stricter check for bdi io_pages 2018-07-27 09:09:53 -06:00
rmap.c mm: do not drop unused pages when userfaultd is running 2018-07-14 11:11:09 -07:00
rodata_test.c
shmem.c shmem: use monotonic time for i_generation 2018-08-17 16:20:28 -07:00
slab.c treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
slab.h
slab_common.c slub: fix failure when we delete and create a slab cache 2018-06-28 11:16:44 -07:00
slob.c slab: __GFP_ZERO is incompatible with a constructor 2018-06-07 17:34:34 -07:00
slub.c mm, slub: restore the original intention of prefetch_freepointer() 2018-08-17 16:20:28 -07:00
sparse-vmemmap.c
sparse.c mm/sparse.c: pass the __highest_present_section_nr + 1 to alloc_func() 2018-06-07 17:34:35 -07:00
swap.c mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS 2018-05-22 06:59:39 -07:00
swap_cgroup.c
swap_slots.c treewide: kvzalloc() -> kvcalloc() 2018-06-12 16:19:22 -07:00
swap_state.c treewide: kvzalloc() -> kvcalloc() 2018-06-12 16:19:22 -07:00
swapfile.c for-4.19/block-20180812 2018-08-14 10:23:25 -07:00
truncate.c
usercopy.c usercopy: Allow boot cmdline disabling of hardening 2018-07-04 08:04:52 -07:00
userfaultfd.c userfaultfd: prevent non-cooperative events vs mcopy_atomic races 2018-06-07 17:34:38 -07:00
util.c mm: kvmalloc does not fallback to vmalloc for incompatible gfp flags 2018-06-07 17:34:38 -07:00
vmacache.c
vmalloc.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
vmpressure.c mm/vmpressure.c: convert to use match_string() helper 2018-06-07 17:34:36 -07:00
vmscan.c mm/vmscan.c: condense scan_control 2018-08-17 16:20:28 -07:00
vmstat.c Revert mm/vmstat.c: fix vmstat_update() preemption BUG 2018-06-28 11:16:44 -07:00
workingset.c
z3fold.c z3fold: fix reclaim lock-ups 2018-05-11 17:28:45 -07:00
zbud.c
zpool.c
zsmalloc.c mm: use octal not symbolic permissions 2018-06-15 07:55:25 +09:00
zswap.c zswap: re-check zswap_is_full() after do zswap_shrink() 2018-07-26 19:38:03 -07:00