126 Commits

Author SHA1 Message Date
Rafael Aquini 29fff32560 maple_tree: correct tree corruption on spanning store
JIRA: https://issues.redhat.com/browse/RHEL-27745
JIRA: https://issues.redhat.com/browse/RHEL-66950
CVE: CVE-2024-50200

This patch is a backport of the following upstream commit:
commit bea07fd63192b61209d48cbb81ef474cc3ee4c62
Author: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date:   Mon Oct 7 16:28:32 2024 +0100

    maple_tree: correct tree corruption on spanning store

    Patch series "maple_tree: correct tree corruption on spanning store", v3.

    There has been a nasty yet subtle maple tree corruption bug that appears
    to have been in existence since the inception of the algorithm.

    This bug seems far more likely to happen since commit f8d112a4e657
    ("mm/mmap: avoid zeroing vma tree in mmap_region()"), which is the point
    at which reports started to be submitted concerning this bug.

    We were made definitely aware of the bug thanks to the kind efforts of
    Bert Karwatzki who helped enormously in my being able to track this down
    and identify the cause of it.

    The bug arises when an attempt is made to perform a spanning store across
    two leaf nodes, where the right leaf node is the rightmost child of the
    shared parent, AND the store completely consumes the rightmost node.

    This results in mas_wr_spanning_store() mistakenly duplicating the new and
    existing entries at the maximum pivot within the range, and thus maple
    tree corruption.

    The fix patch corrects this by detecting this scenario and disallowing the
    mistaken duplicate copy.

    The fix patch commit message goes into great detail as to how this occurs.

    This series also includes a test which reliably reproduces the issue, and
    asserts that the fix works correctly.

    Bert has kindly tested the fix and confirmed it resolved his issues.  Also
    Mikhail Gavrilov kindly reported what appears to be precisely the same
    bug, which this fix should also resolve.

    This patch (of 2):

    There has been a subtle bug present in the maple tree implementation from
    its inception.

    This arises from how stores are performed - when a store occurs, it will
    overwrite overlapping ranges and adjust the tree as necessary to
    accommodate this.

    A range may always ultimately span two leaf nodes.  In this instance we
    walk the two leaf nodes, determine which elements are not overwritten to
    the left and to the right of the start and end of the ranges respectively
    and then rebalance the tree to contain these entries and the newly
    inserted one.

    This kind of store is dubbed a 'spanning store' and is implemented by
    mas_wr_spanning_store().
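
    As a rough illustration of what such a store must achieve, the sketch
    below works on a flat, sorted list of inclusive ranges rather than on
    real maple nodes.  The function name store_range() and the tuple layout
    are assumptions made for this example only; the kernel operates on nodes,
    pivots and slots instead.

```python
def store_range(ranges, first, last, entry):
    """Store entry over [first, last] (inclusive), overwriting any overlap.

    ranges is a sorted list of non-overlapping (first, last, entry) tuples.
    This is a hypothetical flat model, not the kernel's node layout.
    """
    left, right = [], []
    for lo, hi, val in ranges:
        if hi < first:
            left.append((lo, hi, val))             # entirely left of the store
        elif lo > last:
            right.append((lo, hi, val))            # entirely right of the store
        else:
            if lo < first:
                left.append((lo, first - 1, val))  # surviving left remainder
            if hi > last:
                right.append((last + 1, hi, val))  # surviving right remainder
    # Left survivors, then the new entry, then right survivors.
    return left + [(first, last, entry)] + right


ranges = [(0x0000, 0x3FFF, "a"), (0x4000, 0x4FFF, "b"),
          (0x5000, 0xBFFF, "c"), (0xC000, 0xFFFF, "d")]
result = store_range(ranges, 0x4800, 0xC7FF, "new")
```

    Here "c" is overwritten entirely, while "b" and "d" are trimmed; the
    entries kept on either side play the role of the elements that the
    spanning store copies from the left and right leaf nodes.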

    In order to reach this stage, mas_store_gfp() invokes
    mas_wr_preallocate(), mas_wr_store_type() and mas_wr_walk() in turn to
    walk the tree and update the object (mas) to traverse to the location
    where the write should be performed, determining its store type.

    When a spanning store is required, mas_wr_walk() returns false, stopping at
    the parent node which contains the target range, and mas_wr_store_type()
    marks the mas->store_type as wr_spanning_store to denote this fact.

    When we go to perform the store in mas_wr_spanning_store(), we first
    determine the elements AFTER the END of the range we wish to store (that
    is, to the right of the entry to be inserted) - we do this by walking to
    the NEXT pivot in the tree (i.e.  r_mas.last + 1), starting at the node we
    have just determined contains the range over which we intend to write.

    We then turn our attention to the entries to the left of the entry we are
    inserting, whose state is represented by l_mas, and copy these into a 'big
    node', which is a special node which contains enough slots to contain two
    leaf nodes' worth of data.

    We then copy the entry we wish to store immediately after this - the copy
    and the insertion of the new entry is performed by mas_store_b_node().

    After this we copy the elements to the right of the end of the range which
    we are inserting, if we have not exceeded the length of the node (i.e.
    r_mas.offset <= r_mas.end).

    Herein lies the bug - under very specific circumstances, this logic can
    break and corrupt the maple tree.

    Consider the following tree:

    Height
      0                             Root Node
                                     /      \
                     pivot = 0xffff /        \ pivot = ULONG_MAX
                                   /          \
      1                       A [-----]       ...
                                 /   \
                 pivot = 0x4fff /     \ pivot = 0xffff
                               /       \
      2 (LEAVES)          B [-----]  [-----] C
                                          ^--- Last pivot 0xffff.

    Now imagine we wish to store an entry in the range [0x4000, 0xffff] (note
    that all ranges expressed in maple tree code are inclusive):

    1. mas_store_gfp() descends the tree, finds node A at <=0xffff, then
       determines that this is a spanning store across nodes B and C. The mas
       state is set such that the current node from which we traverse further
       is node A.

    2. In mas_wr_spanning_store() we try to find elements to the right of pivot
       0xffff by searching for an index of 0x10000:

        - mas_wr_walk_index() invokes mas_wr_walk_descend() and
          mas_wr_node_walk() in turn.

            - mas_wr_node_walk() loops over entries in node A until EITHER it
              finds an entry whose pivot equals or exceeds 0x10000 OR it
              reaches the final entry.

            - Since no entry has a pivot equal to or exceeding 0x10000, pivot
              0xffff is selected, leading to node C.

        - mas_wr_walk_traverse() resets the mas state to traverse node C. We
          loop around and invoke mas_wr_walk_descend() and mas_wr_node_walk()
          in turn once again.

             - Again, we reach the last entry in node C, which has a pivot of
               0xffff.

    3. We then copy the elements to the left of 0x4000 in node B to the big
       node via mas_store_b_node(), and insert the new [0x4000, 0xffff] entry
       too.

    4. We determine whether we have any entries to copy from the right of the
       end of the range by checking r_mas.offset <= r_mas.end. With r_mas set
       up at the entry at pivot 0xffff, this check passes, and so we DUPLICATE
       the entry at pivot 0xffff.

    5. BUG! The maple tree is corrupted with a duplicate entry.

    This requires a very specific set of circumstances - we must be spanning
    the last element in a leaf node, which is the last element in the parent
    node. In other words, it takes a spanning store across two leaf nodes with
    a range that ends at that shared pivot.
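
    The walk in step 2 can be modelled with a few lines of Python.  This is
    only a sketch of the loop as described above (stop at the first pivot that
    equals or exceeds the index, or at the final entry); node_walk() and the
    pivot values chosen for node C are assumptions for illustration, not the
    kernel's mas_wr_node_walk().

```python
def node_walk(pivots, index):
    """Return (offset, end) for a search of index within one node.

    The loop stops at the first pivot >= index, or at the final entry
    if no such pivot exists.
    """
    end = len(pivots) - 1
    offset = 0
    while offset < end and pivots[offset] < index:
        offset += 1
    return offset, end


node_a = [0x4FFF, 0xFFFF]   # pivots of node A in the diagram
node_c = [0x7FFF, 0xFFFF]   # assumed pivots for leaf C; only the last matters

# Searching for 0x10000 finds no pivot that large, so the walk settles on
# the final entry in both node A and node C.
a_offset, a_end = node_walk(node_a, 0x10000)
c_offset, c_end = node_walk(node_c, 0x10000)
```

    In this model the walk ends on the last slot of node C, so a test of the
    form offset <= end still succeeds even though nothing lies to the right of
    0xffff.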

    A potential solution to this problem would simply be to reset the walk
    each time we traverse r_mas, however given the rarity of this situation it
    seems that would be rather inefficient.

    Instead, this patch detects if the right hand node is populated, i.e.  has
    anything we need to copy.

    We do so by only copying elements from the right of the entry being
    inserted when the maximum value present exceeds the last, rather than
    basing this on offset position.
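
    The difference between the two conditions can be sketched as follows.  The
    RightState class and both helper functions are hypothetical stand-ins for
    the relevant fields of r_mas, written only to contrast the offset-based
    test with the maximum-based one described above.

```python
from dataclasses import dataclass


@dataclass
class RightState:
    """Hypothetical stand-in for the fields of r_mas used by the check."""
    offset: int  # slot reached by the walk in the right-hand leaf
    end: int     # last occupied slot in that leaf
    max: int     # largest index covered by that leaf
    last: int    # last index of the range being stored


def should_copy_by_offset(r):
    # Old condition: copy if the offset has not run past the node end.
    return r.offset <= r.end


def should_copy_by_max(r):
    # New condition: copy only if the leaf covers indices beyond the store.
    return r.max > r.last


# The store [0x4000, 0xffff] consumes leaf C entirely.
bug_case = RightState(offset=1, end=1, max=0xFFFF, last=0xFFFF)
# A store ending at 0xbfff leaves part of leaf C to be copied.
normal_case = RightState(offset=1, end=2, max=0xFFFF, last=0xBFFF)
```

    For bug_case the offset test still asks for a copy, which is the
    duplicate, whereas the maximum test does not; for normal_case both agree
    that there is something to copy.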

    The patch also updates some comments and eliminates the unused bool return
    value in mas_wr_walk_index().

    The work performed in commit f8d112a4e657 ("mm/mmap: avoid zeroing vma
    tree in mmap_region()") seems to have made the probability of this event
    much more likely, which is the point at which reports started to be
    submitted concerning this bug.

    The motivation for this change arose from Bert Karwatzki's report of
    encountering mm instability after the release of kernel v6.12-rc1 which,
    after the use of CONFIG_DEBUG_VM_MAPLE_TREE and similar configuration
    options, was identified as maple tree corruption.

    After Bert very generously provided his time and ability to reproduce this
    event consistently, I was able to finally identify that the issue
    discussed in this commit message was occurring for him.

    Link: https://lkml.kernel.org/r/cover.1728314402.git.lorenzo.stoakes@oracle.com
    Link: https://lkml.kernel.org/r/48b349a2a0f7c76e18772712d0997a5e12ab0a3b.1728314403.git.lorenzo.stoakes@oracle.com
    Fixes: 54a611b60590 ("Maple Tree: add new data structure")
    Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
    Reported-by: Bert Karwatzki <spasswolf@web.de>
    Closes: https://lore.kernel.org/all/20241001023402.3374-1-spasswolf@web.de/
    Tested-by: Bert Karwatzki <spasswolf@web.de>
    Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
    Closes: https://lore.kernel.org/all/CABXGCsOPwuoNOqSMmAvWO2Fz4TEmPnjFj-b7iF+XFRu1h7-+Dg@mail.gmail.com/
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
    Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
    Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:25:50 -05:00
Rafael Aquini 8284364e21 lib/maple_tree.c: fix build error due to hotfix alteration
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 5143eecd2af2b5424f7b96d53f17bb4718e46bd3
Author: Andrew Morton <akpm@linux-foundation.org>
Date:   Wed Dec 13 12:59:49 2023 -0800

    lib/maple_tree.c: fix build error due to hotfix alteration

    Commit 0de56e38b307 ("maple_tree: use maple state end for write
    operations") was broken by a later patch "maple_tree: do not preallocate
    nodes for slot stores".  But the later patch was scheduled ahead of
    0de56e38b307, for 6.7-rc.

    This fixlet undoes the damage.

    Fixes: 0de56e38b307 ("maple_tree: use maple state end for write operations")
    Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:54 -05:00
Rafael Aquini 72e1647ba4 maple_tree: mtree_range_walk() clean up
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit a3c63c8c5df6406e79490456a1fc41a287676070
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:29 2023 -0400

    maple_tree: mtree_range_walk() clean up

    mtree_range_walk() needed to be updated to avoid checking if there was a
    pivot value.  On closer examination, the code could avoid setting min or
    max in certain scenarios.  The commit removes the extra check for
    pivot[offset] before setting max and only sets max when necessary.  It
    also only sets min if it is necessary by checking offset 0 prior to the
    loop (as it has always done).

    The commit also drops a dead node check since the end of the node will
    return the array size when the last slot is occupied (by a potential reuse
    in a dead node).  The data will be discarded later if the node is marked
    dead.

    Benchmarking these changes results in an increase in performance of 5.45%
    using the BENCH_WALK in the maple tree test code.

    Link: https://lkml.kernel.org/r/20231101171629.3612299-13-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:48 -05:00
Rafael Aquini 879f2cef85 maple_tree: don't find node end in mtree_lookup_walk()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 24662decdd44645e8f027d7912be962dd461d1aa
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:28 2023 -0400

    maple_tree: don't find node end in mtree_lookup_walk()

    Since the pivot being set is now reliable, the optimized loop no longer
    needs to find the node end.  The redundant check for a dead node can also
    be avoided as there is no danger of using the wrong pivot since the
    results will be thrown out in the case of a dead node by the later check.

    This patch also adds a benchmark test for the function to the maple tree
    test framework.  The benchmark shows an average performance increase of
    5.98% over 3 runs with this commit.

    Link: https://lkml.kernel.org/r/20231101171629.3612299-12-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:48 -05:00
Rafael Aquini c5e2c5e92f maple_tree: use maple state end for write operations
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 0de56e38b307b0cb2ac825e8e7cb371a28daf844
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:27 2023 -0400

    maple_tree: use maple state end for write operations

    ma_wr_state was previously tracking the end of the node for writing.
    Since the implementation of the ma_state end tracking, this is duplicated
    work.  This patch removes the maple write state tracking of the end of the
    node and uses the maple state end instead.

    Link: https://lkml.kernel.org/r/20231101171629.3612299-11-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:47 -05:00
Rafael Aquini 7d8cafb71e maple_tree: remove mas_searchable()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 9a40d45c1f2c49273c04938ec3d7849f685eb3c1
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:26 2023 -0400

    maple_tree: remove mas_searchable()

    Now that the status of the maple state is outside of the node, the
    mas_searchable() function can be dropped for easier open-coding of what is
    going on.

    Link: https://lkml.kernel.org/r/20231101171629.3612299-10-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:46 -05:00
Rafael Aquini cf948a15b2 maple_tree: separate ma_state node from status
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 067311d33e650adfe7ae23765959ddcc1ba18510
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:25 2023 -0400

    maple_tree: separate ma_state node from status

    The maple tree node is overloaded to keep status as well as the active
    node.  This, unfortunately, results in a re-walk on underflow or overflow.
    Since the maple state has room, the status can be placed in its own enum
    in the structure.  Once an underflow/overflow is detected, certain modes
    can restore the status to active and others may need to re-walk just that
    one node to see the entry.

    The status being an enum has the benefit of detecting unhandled status in
    switch statements.

    [Liam.Howlett@oracle.com: fix comments about MAS_*]
      Link: https://lkml.kernel.org/r/20231106154124.614247-1-Liam.Howlett@oracle.com
    [Liam.Howlett@oracle.com: update forking to separate maple state and node]
      Link: https://lkml.kernel.org/r/20231106154551.615042-1-Liam.Howlett@oracle.com
    [Liam.Howlett@oracle.com: fix mas_prev() state separation code]
      Link: https://lkml.kernel.org/r/20231207193319.4025462-1-Liam.Howlett@oracle.com
    Link: https://lkml.kernel.org/r/20231101171629.3612299-9-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:46 -05:00
Rafael Aquini 9d232ad136 maple_tree: clean up inlines for some functions
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 271f61a8b41dcd86e1ecc2e0455bcc071bc7dde4
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:24 2023 -0400

    maple_tree: clean up inlines for some functions

    There are a few functions which were inlined but are somewhat too large to
    inline, so remove the inline keyword.

    There are also several very small functions which are used in critical
    code sections which gcc was not inlining, so make this more strict and use
    __always_inline for these functions.

    Link: https://lkml.kernel.org/r/20231101171629.3612299-8-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:44 -05:00
Rafael Aquini 7971aab149 maple_tree: use cached node end in mas_destroy()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 1f41ef12abf8538b3d82cdae14c06aa171cb71ce
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:23 2023 -0400

    maple_tree: use cached node end in mas_destroy()

    The node end is set during the walk, so use the resulting end instead of
    re-fetching it.

    Link: https://lkml.kernel.org/r/20231101171629.3612299-7-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:44 -05:00
Rafael Aquini 503aad06f7 maple_tree: use cached node end in mas_next()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit e9c52d8940cbfd94b36035bbebce7f55954e7728
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:22 2023 -0400

    maple_tree: use cached node end in mas_next()

    When looking for the next entry, don't recalculate the node end as it is
    now tracked in the maple state.

    Link: https://lkml.kernel.org/r/20231101171629.3612299-6-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:43 -05:00
Rafael Aquini a54b368ef8 maple_tree: add end of node tracking to the maple state
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 31c532a8af57513228c2b12d281104198ff412b8
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:21 2023 -0400

    maple_tree: add end of node tracking to the maple state

    Analysis of the mas_for_each() iteration showed that there is a
    significant time spent finding the end of a node.  This time can be
    greatly reduced if the end of the node is cached in the maple state.  Care
    must be taken to update & invalidate as necessary.
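
    The idea of caching the node end can be sketched in a toy model.  The
    Node and State classes, the SLOTS value and the scan-based data_end()
    are all assumptions for illustration; the kernel determines and stores the
    end of a node differently.

```python
SLOTS = 16  # assumed node capacity for this sketch


class Node:
    def __init__(self, entries):
        self.slots = list(entries) + [None] * (SLOTS - len(entries))
        self.scans = 0  # counts how often the end had to be recomputed

    def data_end(self):
        """Scan for the last occupied slot (the cost we want to avoid)."""
        self.scans += 1
        end = -1
        for i, entry in enumerate(self.slots):
            if entry is not None:
                end = i
        return end


class State:
    def __init__(self, node):
        self.node = node
        self.end = node.data_end()   # cached once when the node is entered

    def append(self, entry):
        self.node.slots[self.end + 1] = entry
        self.end += 1                # writes must keep the cache up to date

    def entries(self):
        # Iteration relies on the cached end instead of rescanning.
        return [self.node.slots[i] for i in range(self.end + 1)]


node = Node(["a", "b", "c"])
state = State(node)
before = state.entries()
state.append("d")
after = state.entries()
```

    The node is scanned once, on entry; later iteration and the append both
    use the cached value, which is why the write path has to update it.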

    Link: https://lkml.kernel.org/r/20231101171629.3612299-5-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:42 -05:00
Rafael Aquini b3e1e549ab maple_tree: make mas_erase() more robust
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit f7a59018953910032231c0a019208c4b0a4a8bc3
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:19 2023 -0400

    maple_tree: make mas_erase() more robust

    mas_erase() may not deal correctly with all maple states.  Make the
    function more robust by ensuring the state is in one of the two acceptable
    states.

    Link: https://lkml.kernel.org/r/20231101171629.3612299-3-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:41 -05:00
Rafael Aquini b7929625d5 maple_tree: remove unnecessary default labels from switch statements
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 37a8ab24d3d4c465b070bd704e2ad2fa277df9d7
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Wed Nov 1 13:16:18 2023 -0400

    maple_tree: remove unnecessary default labels from switch statements

    Patch series "maple_tree: iterator state changes".

    These patches have some general cleanup and a change to separate the maple
    state status tracking from the maple state node.

    The maple state status change allows for walks to continue from previous
    places when the status needs to be recorded to make logical sense for the
    next call to the maple state.  For instance, it allows for prev/next to
    function in a way that better resembles the linked list.  It also allows
    switch statements to be used to detect missed states during compile, and
    the addition of fast-path "active" state is cleaner as an enum.

    While making the status change, perf showed some very small (one line)
    functions that were not inlined even with the inline keyword.  Making
    these small functions __always_inline is less expensive according to perf.
    As part of that change, some inlines have been dropped from larger
    functions.

    Perf also showed that the commonly used mas_for_each() iterator was
    spending a lot of time finding the end of the node.  This series
    introduces caching of the end of the node in the maple state (and updating
    it during writes).  This caching along with the inline changes yielded a
    23.25% improvement on the BENCH_MAS_FOR_EACH maple tree test framework
    benchmark.

    I've also included a change to mtree_range_walk and mtree_lookup_walk to
    take advantage of Peng's change [1] to the initial pivot setup.

    mmtests did not produce any significant gains.

    [1] https://lore.kernel.org/all/20230711035444.526-1-zhangpeng.00@bytedance.com/T/#u

    This patch (of 12):

    Removing the default types from the switch statements will cause compile
    warnings on missing cases.

    Link: https://lkml.kernel.org/r/20231101171629.3612299-2-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Suggested-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:40 -05:00
Rafael Aquini ff4f4718c8 maple_tree: preserve the tree attributes when destroying maple tree
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 8e50d32c7a89bde896945e4e572ef28ccd87bbf8
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Fri Oct 27 11:38:44 2023 +0800

    maple_tree: preserve the tree attributes when destroying maple tree

    When destroying a maple tree, preserve its attributes and then turn it into
    an empty tree.  This allows it to be reused without needing to be
    reinitialized.

    Link: https://lkml.kernel.org/r/20231027033845.90608-10-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Christian Brauner <brauner@kernel.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Mateusz Guzik <mjguzik@gmail.com>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Cc: Mike Christie <michael.christie@oracle.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:33 -05:00
Rafael Aquini 76212efaff maple_tree: introduce interfaces __mt_dup() and mtree_dup()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit fd32e4e9b7646510ee9010e0d5f8b8857d48a6f7
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Fri Oct 27 11:38:38 2023 +0800

    maple_tree: introduce interfaces __mt_dup() and mtree_dup()

    Introduce interfaces __mt_dup() and mtree_dup(), which are used to
    duplicate a maple tree.  They duplicate a maple tree in Depth-First Search
    (DFS) pre-order traversal.  They use memcpy() to copy nodes in the source
    tree and allocate new child nodes in non-leaf nodes.  The new node is
    exactly the same as the source node except for all the addresses stored in
    it.  It will be faster than traversing all elements in the source tree and
    inserting them one by one into the new tree.  The time complexity of these
    two functions is O(n).

    The difference between __mt_dup() and mtree_dup() is that mtree_dup()
    handles locks internally.
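
    A minimal sketch of the pre-order duplication follows.  It uses a toy
    Node class rather than maple nodes, and dup_tree() and flatten() are
    hypothetical names; the point is only the order of work: copy a node's
    contents, allocate all of its children at once, then descend.

```python
class Node:
    def __init__(self, pivots=(), children=(), entries=()):
        self.pivots = list(pivots)
        self.children = list(children)   # non-leaf: child nodes
        self.entries = list(entries)     # leaf: stored values
        self.parent = None
        for child in self.children:
            child.parent = self


def dup_tree(src_root):
    """Duplicate a tree iteratively in DFS pre-order."""
    new_root = Node()
    stack = [(src_root, new_root)]
    while stack:
        src, new = stack.pop()
        # Copy the node's own data (stands in for the memcpy() of a node).
        new.pivots = list(src.pivots)
        new.entries = list(src.entries)
        # Allocate every child of a non-leaf node in one go and link it.
        new.children = [Node() for _ in src.children]
        for child in new.children:
            child.parent = new
        # Push in reverse so the leftmost child is visited first.
        stack.extend(reversed(list(zip(src.children, new.children))))
    return new_root


def flatten(node):
    """Pre-order list of (pivots, entries), used to compare two trees."""
    out = [(tuple(node.pivots), tuple(node.entries))]
    for child in node.children:
        out.extend(flatten(child))
    return out


leaf_b = Node(pivots=[0x3FFF, 0x4FFF], entries=["a", "b"])
leaf_c = Node(pivots=[0x7FFF, 0xFFFF], entries=["c", "d"])
root = Node(pivots=[0x4FFF, 0xFFFF], children=[leaf_b, leaf_c])
copy = dup_tree(root)
```

    Each node is visited once and each edge is walked a constant number of
    times, which is what makes the duplication linear in the number of nodes.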

    Analysis of the average time complexity of this algorithm:

    For simplicity, let's assume that the maximum branching factor of all
    non-leaf nodes is 16 (in allocation mode, it is 10), and the tree is a
    full tree.

    Under the given conditions, if there is a maple tree with n elements, the
    number of its leaves is n/16.  From bottom to top, the number of nodes in
    each level is 1/16 of the number of nodes in the level below.  So the
    total number of nodes in the entire tree is given by the sum of n/16 +
    n/16^2 + n/16^3 + ...  + 1.  This is a geometric series, and it has log(n)
    terms with base 16.  According to the formula for the sum of a geometric
    series, the sum of this series can be calculated as (n-1)/15.  Each node
    has only one parent node pointer, which can be considered as an edge.  In
    total, there are (n-1)/15-1 edges.
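
    The node count can be checked numerically.  The sketch below assumes the
    same idealised full tree with a branching factor of 16; count_nodes() is a
    name made up for this example.

```python
def count_nodes(levels, fanout=16):
    """Return (n, nodes) for a full tree with the given number of levels.

    n is the number of elements; the levels hold n/16, n/16^2, ..., 1 nodes.
    """
    n = fanout ** levels
    per_level = [n // fanout ** i for i in range(1, levels + 1)]
    return n, sum(per_level)


for levels in range(1, 6):
    n, nodes = count_nodes(levels)
    # The geometric series sums to (n - 1) / 15 ...
    assert nodes == (n - 1) // 15
    # ... and every node except the root contributes one parent edge.
    edges = nodes - 1
    assert edges == (n - 1) // 15 - 1
```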

    This algorithm consists of two operations:

    1. Traversing all nodes in DFS order.
    2. For each node, making a copy and performing necessary modifications
       to create a new node.

    For the first part, DFS traversal will visit each edge twice.  Let
    T(ascend) represent the cost of taking one step upwards, and T(descend)
    represent the cost of taking one step downwards.  And both of them are
    constants (although mas_ascend() may not be, as it contains a loop, but
    here we ignore it and treat it as a constant).  So the time spent on the
    first part can be represented as ((n-1)/15-1) * (T(ascend) + T(descend)).

    For the second part, each node will be copied, and the cost of copying a
    node is denoted as T(copy_node).  For each non-leaf node, it is necessary
    to reallocate all child nodes, and the cost of this operation is denoted
    as T(dup_alloc).  The behavior behind memory allocation is complex and not
    specific to the maple tree operation.  Here, we assume that the time
    required for a single allocation is constant.  Since the size of a node is
    fixed, both of these symbols are also constants.  We can calculate that
    the time spent on the second part is ((n-1)/15) * T(copy_node) + ((n-1)/15
    - n/16) * T(dup_alloc).

    Adding both parts together, the total time spent by the algorithm can be
    represented as:

    ((n-1)/15) * (T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)) -
    n/16 * T(dup_alloc) - (T(ascend) + T(descend))

    Let C1 = T(ascend) + T(descend) + T(copy_node) + T(dup_alloc)
    Let C2 = T(dup_alloc)
    Let C3 = T(ascend) + T(descend)

    Finally, the expression can be simplified as:
    ((16 * C1 - 15 * C2) / (15 * 16)) * n - (C1 / 15 + C3).

    This is a linear function, so the average time complexity is O(n).
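
    The simplification can be verified with exact arithmetic.  This is a
    sketch under the same assumptions as the analysis (full tree, constant
    costs); the function names and the sample cost values are arbitrary
    choices for the check.

```python
from fractions import Fraction


def total_cost(n, t_ascend, t_descend, t_copy, t_alloc):
    """Sum of the traversal part and the copy/allocation part."""
    nodes = Fraction(n - 1, 15)
    leaves = Fraction(n, 16)
    traverse = (nodes - 1) * (t_ascend + t_descend)
    copy = nodes * t_copy + (nodes - leaves) * t_alloc
    return traverse + copy


def simplified_cost(n, t_ascend, t_descend, t_copy, t_alloc):
    """The closed form, using C1, C2 and C3 as defined in the text."""
    c1 = t_ascend + t_descend + t_copy + t_alloc
    c2 = t_alloc
    c3 = t_ascend + t_descend
    return Fraction(16 * c1 - 15 * c2, 15 * 16) * n - (Fraction(c1, 15) + c3)


costs = (3, 2, 5, 7)  # arbitrary constant costs for the check
for n in (16, 256, 4096, 65536):
    assert total_cost(n, *costs) == simplified_cost(n, *costs)
```

    Both forms agree for every tested n, and the closed form is linear in n
    because the costs are constants.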

    Link: https://lkml.kernel.org/r/20231027033845.90608-4-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Suggested-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Christian Brauner <brauner@kernel.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Mateusz Guzik <mjguzik@gmail.com>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Cc: Mike Christie <michael.christie@oracle.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:29 -05:00
Rafael Aquini 2eb9f4f324 maple_tree: add mt_free_one() and mt_attr() helpers
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 4f2267b58a22d972be98edef8e6b3c7a67c9fb91
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Fri Oct 27 11:38:36 2023 +0800

    maple_tree: add mt_free_one() and mt_attr() helpers

    Patch series "Introduce __mt_dup() to improve the performance of fork()", v7.

    This series introduces __mt_dup() to improve the performance of fork().
    During the duplication process of mmap, all VMAs are traversed and
    inserted one by one into the new maple tree, causing the maple tree to be
    rebalanced multiple times.  Balancing the maple tree is a costly
    operation.  To duplicate VMAs more efficiently, mtree_dup() and __mt_dup()
    are introduced for the maple tree.  They can efficiently duplicate a maple
    tree.

    Here are some algorithmic details about {mtree,__mt}_dup().  We perform a
    DFS pre-order traversal of all nodes in the source maple tree.  During
    this process, we fully copy the nodes from the source tree to the new
    tree.  This involves memory allocation, and when encountering a new node,
    if it is a non-leaf node, all its child nodes are allocated at once.

    This idea was originally from Liam R. Howlett's Maple Tree Work email,
    and I added some of my own ideas to implement it.  Some previous
    discussions can be found in [1].  For a more detailed analysis of the
    algorithm, please refer to the logs for patch [3/10] and patch [10/10].

    There is a "spawn" in byte-unixbench[2], which can be used to test the
    performance of fork().  I modified it slightly to make it work with
    different number of VMAs.

    Below are the test results.  The first row shows the number of VMAs.  The
    second and third rows show the number of fork() calls per ten seconds,
    corresponding to next-20231006 and this patchset, respectively.  The
    test results were obtained with CPU binding to avoid scheduler load
    balancing that could cause unstable results.  There are still some
    fluctuations in the test results, but at least they are better than the
    original performance.

    21     121   221    421    821    1621   3221   6421   12821  25621  51221
    112100 76261 54227  34035  20195  11112  6017   3161   1606   802    393
    114558 83067 65008  45824  28751  16072  8922   4747   2436   1233   599
    2.19%  8.92% 19.88% 34.64% 42.37% 44.64% 48.28% 50.17% 51.68% 53.74% 52.42%
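
    The percentage row can be recomputed from the two throughput rows.  The
    snippet below is only an arithmetic check; the variable and function
    names are illustrative, and the numbers are copied from the table above.

```python
# Number of VMAs, and fork() calls per ten seconds before and after the series.
vmas = [21, 121, 221, 421, 821, 1621, 3221, 6421, 12821, 25621, 51221]
baseline = [112100, 76261, 54227, 34035, 20195, 11112, 6017, 3161, 1606, 802, 393]
patched = [114558, 83067, 65008, 45824, 28751, 16072, 8922, 4747, 2436, 1233, 599]


def improvement_percent(before: int, after: int) -> float:
    """Relative throughput gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


gains = [f"{improvement_percent(b, a):.2f}%" for b, a in zip(baseline, patched)]

for n, g in zip(vmas, gains):
    print(f"{n:>6} VMAs: {g}")
```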

    Thanks to Liam and Matthew for the review.

    This patch (of 10):

    Add two helpers:
    1. mt_free_one(), used to free a maple node.
    2. mt_attr(), used to obtain the attributes of a maple tree.

    Link: https://lkml.kernel.org/r/20231027033845.90608-1-zhangpeng.00@bytedance.com
    Link: https://lkml.kernel.org/r/20231027033845.90608-2-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Christian Brauner <brauner@kernel.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Mateusz Guzik <mjguzik@gmail.com>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Cc: Mike Christie <michael.christie@oracle.com>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:23:28 -05:00
Rafael Aquini 92a933dce3 maple_tree: add GFP_KERNEL to allocations in mas_expected_entries()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 099d7439ce03d0e7bc8f0c3d7878b562f3a48d3d
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Thu Oct 12 11:52:33 2023 -0400

    maple_tree: add GFP_KERNEL to allocations in mas_expected_entries()

    Users complained about OOM errors during fork without triggering
    compaction.  This can be fixed by modifying the flags used in
    mas_expected_entries() so that the compaction will be triggered in low
    memory situations.  Since mas_expected_entries() is only used during fork,
    the extra argument does not need to be passed through.

    Additionally, the two test_maple_tree test cases and one benchmark test
    were altered to use the correct locking type so that allocations would not
    trigger sleeping and thus fail.  Testing was completed with lockdep atomic
    sleep detection.

    The additional locking change requires rwsem support additions to the
    tools/ directory through the use of pthreads pthread_rwlock_t.  With this
    change test_maple_tree works in userspace, as a module, and in-kernel.

    Users may notice that the system gave up early on attempting to start new
    processes instead of attempting to reclaim memory.

    Link: https://lkml.kernel.org/r/20230915093243epcms1p46fa00bbac1ab7b7dca94acb66c44c456@epcms1p4
    Link: https://lkml.kernel.org/r/20231012155233.2272446-1-Liam.Howlett@oracle.com
    Fixes: 54a611b60590 ("Maple Tree: add new data structure")
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Cc: <jason.sim@samsung.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:22:38 -05:00
Rafael Aquini 78da632e69 maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit a8091f039c1ebf5cb0d5261e3613f18eb2a5d8b7
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Thu Sep 21 14:12:36 2023 -0400

    maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states

    When updating the maple tree iterator to avoid rewalks, an issue was
    introduced when shifting beyond the limits.  This can be seen by trying to
    go to the previous address of 0, which would set the maple node to
    MAS_NONE and keep the range as the last entry.

    Subsequent calls to mas_find() would then search upwards from mas->last
    and skip the value at mas->index/mas->last.  This showed up as a bug in
    mprotect which skips the actual VMA at the current range after attempting
    to go to the previous VMA from 0.

    Since MAS_NONE may already be set when searching for a value that isn't
    contained within a node, changing the handling of MAS_NONE in mas_find()
    would make the code more complicated and error prone.  Furthermore, there
    was no way to tell which limit was hit, and thus which action to take
    (next or the entry at the current range).

    This solution is to add two states to track what happened with the
    previous iterator action.  This allows for the expected behaviour of the
    next command to return the correct item (either the item at the range
    requested, or the next/previous).
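
    The idea of remembering which limit was hit can be illustrated with a
    toy cursor over a sorted list.  This is a sketch under stated
    assumptions, not the kernel API: Cursor, State, step_prev() and
    step_next() are hypothetical names, and the list is assumed non-empty.

```python
from enum import Enum, auto


class State(Enum):
    ACTIVE = auto()
    UNDERFLOW = auto()  # the last move tried to go below the first entry
    OVERFLOW = auto()   # the last move tried to go above the last entry


class Cursor:
    def __init__(self, items):
        self.items = items        # sorted, non-empty list of entries
        self.pos = 0
        self.state = State.ACTIVE

    def step_prev(self):
        if self.state is State.OVERFLOW:
            # We ran off the top last time, so the current entry is the answer.
            self.state = State.ACTIVE
            return self.items[self.pos]
        if self.pos == 0:
            self.state = State.UNDERFLOW
            return None
        self.pos -= 1
        self.state = State.ACTIVE
        return self.items[self.pos]

    def step_next(self):
        if self.state is State.UNDERFLOW:
            # We ran off the bottom last time: do not skip the current entry.
            self.state = State.ACTIVE
            return self.items[self.pos]
        if self.pos == len(self.items) - 1:
            self.state = State.OVERFLOW
            return None
        self.pos += 1
        self.state = State.ACTIVE
        return self.items[self.pos]
```

    Without the extra states, a forward step after a failed backward step
    from the first entry would advance past that entry, which is the kind
    of skip described above.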

    Tests are also added and updated accordingly.

    Link: https://lkml.kernel.org/r/20230921181236.509072-3-Liam.Howlett@oracle.com
    Link: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b
    Link: https://lore.kernel.org/linux-mm/20230921181236.509072-1-Liam.Howlett@oracle.com/
    Fixes: 39193685d585 ("maple_tree: try harder to keep active node with mas_prev()")
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reported-by: Pedro Falcato <pedro.falcato@gmail.com>
    Closes: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b
    Closes: https://bugs.archlinux.org/task/79656
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:22:03 -05:00
Rafael Aquini c1d9236336 maple_tree: clean up mas_wr_append()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 432af5c966667f12c7af38fb3b2cd52eef0c47b4
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Fri Aug 18 20:43:56 2023 -0400

    maple_tree: clean up mas_wr_append()

    Avoid setting the variables until necessary, and actually use the
    variables where applicable.  Introducing a variable for the slots array
    avoids spanning multiple lines.

    Add the missing argument to the documentation.

    Use the node type when setting the metadata instead of blindly assuming
    the type.

    Finally, add a trace point to the function for successful store.

    Link: https://lkml.kernel.org/r/20230819004356.1454718-3-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:59 -05:00
Rafael Aquini 3d71676cef maple_tree: replace data before marking dead in split and spanning store
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 530f745c7620af288b71b3d667cb90f10df3defe
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Fri Aug 4 12:59:51 2023 -0400

    maple_tree: replace data before marking dead in split and spanning store

    Reorder the operations for split and spanning stores so that new data is
    placed in the tree prior to marking the old data as dead.  This will limit
    re-walks on dead data to just once instead of a retry loop.

    The order of operations is as follows: Create the new data, put the new
    data in place, mark the top node of the old data as dead.

    Then repair parent links in the reused nodes through all levels of the
    tree, following the new nodes downwards.  Finally walk the top dead node
    looking for nodes that are no longer used, or subtrees that should be
    destroyed (marked dead throughout then freed), follow the partially used
    nodes downwards to discover other dead nodes and subtrees.

    Link: https://lkml.kernel.org/r/20230804165951.2661157-7-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:59 -05:00
Rafael Aquini 59be400cf8 maple_tree: change mas_adopt_children() parent usage
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 068bafcac0b89ee5b1616793231eb4b3dd41e3f0
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Fri Aug 4 12:59:50 2023 -0400

    maple_tree: change mas_adopt_children() parent usage

    All calls to mas_adopt_children() currently pass the parent as the node in
    the maple state.  Allow for the parent pointer that is passed in to be
    used instead.

    Link: https://lkml.kernel.org/r/20230804165951.2661157-6-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:58 -05:00
Rafael Aquini 4eebaf02fb maple_tree: introduce mas_tree_parent() definition
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 4ffc2ee2cf01f3d03977fbeb1b43da2dc22a95f4
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Fri Aug 4 12:59:49 2023 -0400

    maple_tree: introduce mas_tree_parent() definition

    Add a definition to shorten long code lines and clarify what the code is
    doing.  Use the new definition to get the maple tree parent pointer from
    the maple state where possible.

    Link: https://lkml.kernel.org/r/20230804165951.2661157-5-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:57 -05:00
Rafael Aquini 7b143058a2 maple_tree: introduce mas_put_in_tree()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 1238f6a226dc27ec34d229b71b02f0d6c46bbf11
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Fri Aug 4 12:59:48 2023 -0400

    maple_tree: introduce mas_put_in_tree()

    mas_replace() has a single user that takes a flag which is now always
    true.  Replace this function with mas_put_in_tree() to better align with
    mas_replace_node().  Inline the remaining logic into the only caller;
    mas_wmb_replace().

    Link: https://lkml.kernel.org/r/20230804165951.2661157-4-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:56 -05:00
Rafael Aquini 31cdd2e72e maple_tree: reorder replacement of nodes to avoid live lock
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 72bcf4aa86ece2b49fbdc7fe83d3a05c7ebcfc97
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Fri Aug 4 12:59:47 2023 -0400

    maple_tree: reorder replacement of nodes to avoid live lock

    Replacing nodes may cause a live lock-up if CPU resources are saturated by
    write operations on the tree by continuously retrying on dead nodes.  To
    avoid the continuous retry scenario, ensure the new node is inserted into
    the tree prior to marking the old data as dead.  This will define a window
    where old and new data is swapped.

    When reusing lower level nodes, ensure the parent pointer is updated after
    the parent is marked dead.  This ensures that the child is still reachable
    from the top of the tree, but walking up to a dead node will result in a
    single retry that will start a fresh walk from the top down through the
    new node.
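
    The ordering can be illustrated with a single-threaded toy model.  It
    is a sketch only: Tree, Node and read() are hypothetical names, there
    is no real concurrency or RCU here, and the whole tree is reduced to
    one replaceable node.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.dead = False


class Tree:
    def __init__(self, data):
        self.root = Node(data)

    def replace(self, new_data):
        old = self.root
        new = Node(new_data)   # 1. create the new data
        self.root = new        # 2. put the new data in place
        old.dead = True        # 3. only now mark the old data as dead
        return old


def read(tree, node):
    """Read through a possibly stale node reference.

    Returns the data and the number of re-walks from the top that were needed.
    """
    retries = 0
    while node.dead:
        node = tree.root       # restart from the top of the tree
        retries += 1
    return node.data, retries
```

    Since the new node is reachable before the old one is marked dead, a
    reader holding the old node in this model retries at most once per
    replacement instead of spinning until the writer finishes.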

    Link: https://lkml.kernel.org/r/20230804165951.2661157-3-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:56 -05:00
Rafael Aquini 11309d1369 maple_tree: add hex output to maple_arange64 dump
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 83d97f620f611ab3fbf2de585bf34bd9dab513c2
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Fri Aug 4 12:59:46 2023 -0400

    maple_tree: add hex output to maple_arange64 dump

    Patch series "maple_tree: Change replacement strategy".

    The maple tree marks nodes dead as soon as they are going to be replaced.
    This could be problematic when used in the RCU context since the writer
    may be starved of CPU time by the readers.  This patch set addresses the
    issue by switching the data replacement strategy to one that will only
    mark data as dead once the new data is available.

    This series changes the ordering of the node replacement so that the new
    data is live before the old data is marked 'dead'.  When readers hit
    'dead' nodes, they will restart from the top of the tree and end up in the
    new data.

    In more complex scenarios, the replacement strategy means a subtree is
    built and grafted into the tree leaving some nodes to point to the old
    parent.  The view of tasks into the old data will either remain with the
    old data, or see the new data once the old data is marked 'dead'.

    Iterators will see the 'dead' node and restart on their own and switch to
    the new data.  There is no risk of the reader seeing old data in these
    cases.

    The 'dead' subtree of data is then fully marked dead, but reused nodes
    will still point to the dead nodes until the parent pointer is updated.
    Walking up to a 'dead' node will cause a re-walk from the top of the tree
    and enter the new data area where old data is not reachable.

    Once the parent pointers are fully up to date in the active data, the
    'dead' subtree is iterated to collect entirely 'dead' subtrees, and dead
    nodes (nodes that partially contained reused data).

    This patch (of 6):

    When dumping the tree, honour the formatting request to output hex for the
    maple node type arange64.

    Link: https://lkml.kernel.org/r/20230804165951.2661157-1-Liam.Howlett@oracle.com
    Link: https://lkml.kernel.org/r/20230804165951.2661157-2-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Paul E. McKenney <paulmck@kernel.org>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:54 -05:00
Rafael Aquini 5237af5a82 maple_tree: Be more strict about locking
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 19a462f06eb5a78e0c3ebe4fd4fbdc71620b8788
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Fri Jul 14 15:55:51 2023 -0400

    maple_tree: Be more strict about locking

    Use lockdep to check the write path in the maple tree holds the lock in
    write mode.

    Introduce mt_write_lock_is_held() to check if the lock is held for
    writing.  Update the necessary checks for rcu_dereference_protected() to
    use the new write lock check.

    Link: https://lkml.kernel.org/r/20230714195551.894800-5-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Oliver Sang <oliver.sang@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:54 -05:00
Rafael Aquini 930748727a maple_tree: drop mas_first_entry()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 6783bd4b5f72b483cf492dc09500548b495670b5
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Tue Jul 11 11:54:44 2023 +0800

    maple_tree: drop mas_first_entry()

    The internal function mas_first_entry() is no longer used, so drop it.

    Link: https://lkml.kernel.org/r/20230711035444.526-9-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:52 -05:00
Rafael Aquini d54276ebdf maple_tree: replace mas_logical_pivot() with mas_safe_pivot()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 29b2681f1aa95cff6ec0afdeac0b2cab659a5564
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Tue Jul 11 11:54:43 2023 +0800

    maple_tree: replace mas_logical_pivot() with mas_safe_pivot()

    Replace mas_logical_pivot() with mas_safe_pivot() and drop
    mas_logical_pivot() since it won't be used anymore.  We can do this
    because all nodes now have a node limit pivot (unless the node is full).

    Link: https://lkml.kernel.org/r/20230711035444.526-8-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:51 -05:00
Rafael Aquini 069335083c maple_tree: update mt_validate()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit a489539e33c29b469bcd023a32c99078c2597c7c
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Tue Jul 11 11:54:42 2023 +0800

    maple_tree: update mt_validate()

    Instead of using mas_first_entry() to find the leftmost leaf, use a simple
    loop instead.  Remove an unneeded check for root node.  To make the error
    message more accurate, check pivots first and then slots, because checking
    slots depends on the node limit pivot to break the loop.

    Link: https://lkml.kernel.org/r/20230711035444.526-7-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:51 -05:00
Rafael Aquini 243e54d867 maple_tree: make mas_validate_limits() check root node and node limit
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 33af39d0244ce4944ab16728f7b04df9dfc6d365
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Tue Jul 11 11:54:41 2023 +0800

    maple_tree: make mas_validate_limits() check root node and node limit

    Update mas_validate_limits() to check root node, check node limit pivot if
    there is enough room for it to exist and check data_end.  Remove the check
    for child existence as it is done in mas_validate_child_slot().

    Link: https://lkml.kernel.org/r/20230711035444.526-6-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:50 -05:00
Rafael Aquini c089b6c6c0 maple_tree: fix mas_validate_child_slot() to check last missed slot
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit e93fda5a1ab7a0c6143ae8a6f231c9f5f3c417b1
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Tue Jul 11 11:54:40 2023 +0800

    maple_tree: fix mas_validate_child_slot() to check last missed slot

    Don't break the loop before checking the last slot.  Also check here
    whether non-leaf nodes are missing children.

    Link: https://lkml.kernel.org/r/20230711035444.526-5-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:49 -05:00
Rafael Aquini cad446f965 maple_tree: make mas_validate_gaps() to check metadata
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit f8e5eac8abe3d26106e5470c735058f04f60f61e
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Tue Jul 11 11:54:39 2023 +0800

    maple_tree: make mas_validate_gaps() to check metadata

    Make mas_validate_gaps() check whether the offset in the metadata points
    to the largest gap.  Also simplify this function.

    Add the verification that gaps beyond the node limit are zero.

    Link: https://lkml.kernel.org/r/20230711035444.526-4-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:48 -05:00
Rafael Aquini 8b630356fc maple_tree: don't use MAPLE_ARANGE64_META_MAX to indicate no gap
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit d695c30a8ca07ac7e2138435b461b36289d5656e
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Tue Jul 11 11:54:38 2023 +0800

    maple_tree: don't use MAPLE_ARANGE64_META_MAX to indicate no gap

    Patch series "Improve the validation for maple tree and some cleanup", v2.

    This patch (of 7):

    Do not use a special offset to indicate that there is no gap.  When there
    is no gap, the offset can point to any valid slot because its gap is 0.

    Link: https://lkml.kernel.org/r/20230711035444.526-1-zhangpeng.00@bytedance.com
    Link: https://lkml.kernel.org/r/20230711035444.526-3-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:48 -05:00
Rafael Aquini e4748d3963 maple_tree: add a fast path case in mas_wr_slot_store()
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 64891ba3e51fb841b0af70db029038eb93bd5a43
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Wed Jun 28 15:36:57 2023 +0800

    maple_tree: add a fast path case in mas_wr_slot_store()

    When expanding a range in two directions, only partially overwriting the
    previous and next ranges, the number of entries will not be increased, so
    we can just update the pivots as a fast path. However, it may introduce
    potential risks in RCU mode, because it updates two pivots. We only
    enable it in non-RCU mode.
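
    A rough model of this fast path is shown below.  It is a sketch under
    assumptions: a node is modelled as two Python lists, where slot i
    covers the keys above pivots[i - 1] up to and including pivots[i];
    expand_middle() and lookup() are hypothetical helpers, not kernel
    functions.

```python
def lookup(pivots, slots, key):
    """Linear scan: return the slot whose pivot is the first one >= key."""
    for pivot, slot in zip(pivots, slots):
        if key <= pivot:
            return slot
    return None


def expand_middle(pivots, slots, mid, index, last, entry):
    """Store entry over [index, last], replacing slot `mid` entirely and
    only partially overwriting its left and right neighbours.

    Returns False when the store does not fit this fast path.
    """
    if not 0 < mid < len(pivots) - 1:
        return False
    prev_start = pivots[mid - 2] + 1 if mid >= 2 else 0
    if not prev_start < index <= pivots[mid - 1]:
        return False   # must leave part of the previous range intact
    if not pivots[mid] < last < pivots[mid + 1]:
        return False   # must leave part of the next range intact
    # The entry count is unchanged; only two pivots and one slot are updated.
    pivots[mid - 1] = index - 1
    pivots[mid] = last
    slots[mid] = entry
    return True
```

    The two pivot writes are separate steps, so a concurrent reader could
    observe one without the other; that is consistent with the path being
    restricted to non-RCU mode.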

    Link: https://lkml.kernel.org/r/20230628073657.75314-5-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:47 -05:00
Rafael Aquini 0a67da430a maple_tree: optimize mas_wr_append(), also improve duplicating VMAs
JIRA: https://issues.redhat.com/browse/RHEL-27745

This patch is a backport of the following upstream commit:
commit 23e9dde0b246d47e4a1942ea50bf7fef63e2d41a
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Wed Jun 28 15:36:56 2023 +0800

    maple_tree: optimize mas_wr_append(), also improve duplicating VMAs

    When the new range can be completely covered by the original last range
    without touching the boundaries on both sides, two new entries can be
    appended to the end as a fast path. We update the original last pivot at
    the end, and the newly appended two entries will not be accessed before
    this, so it is also safe in RCU mode.
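
    The append case can be modelled in the same spirit.  Again this is a
    sketch under assumptions: the node is two Python lists, slot i covers
    the keys above pivots[i - 1] up to and including pivots[i], and
    append_inside_last() and lookup() are hypothetical names rather than
    kernel functions.

```python
def lookup(pivots, slots, key):
    """Linear scan: return the slot whose pivot is the first one >= key."""
    for pivot, slot in zip(pivots, slots):
        if key <= pivot:
            return slot
    return None


def append_inside_last(pivots, slots, index, last, entry):
    """Store entry over [index, last] when that range lies strictly inside
    the last range of the node, by appending two entries.

    Returns False when the store does not fit this fast path.
    """
    end = len(pivots) - 1
    r_min = pivots[end - 1] + 1 if end > 0 else 0
    r_max = pivots[end]
    if not r_min < index <= last < r_max:
        return False
    old = slots[end]
    # Append the new entry and the remainder of the old range first.  A
    # scanning reader still stops at pivots[end] == r_max and never reaches them.
    pivots.extend([last, r_max])
    slots.extend([entry, old])
    # Updating the original last pivot is the final step that exposes them.
    pivots[end] = index - 1
    return True
```

    With pivots [9, 29] and a store over [15, 19], the node ends up with
    pivots [9, 14, 19, 29]: the old last range is split in three without
    moving any existing entry.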

    This is useful for sequential insertion, which is what we do in
    dup_mmap(). Enabling BENCH_FORK in test_maple_tree and just running
    bench_forking() gives the following timings:

    before:               after:
    17,874.83 msec        15,738.38 msec

    It shows about a 12% performance improvement for duplicating VMAs.

    Link: https://lkml.kernel.org/r/20230628073657.75314-4-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-12-09 12:21:46 -05:00
Rafael Aquini 99c7407966 maple_tree: do not preallocate nodes for slot stores
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 4249f13c11be8b8b7bf93204185e150c3bdc968d
Author: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Date:   Wed Dec 13 12:50:57 2023 -0800

    maple_tree: do not preallocate nodes for slot stores

    mas_preallocate() defaults to requesting 1 node for preallocation and then,
    depending on the type of store, will update the request variable.  There
    isn't a check for a slot store type, so slot stores are preallocating the
    default 1 node.  Slot stores do not require any additional nodes, so add a
    check for the slot store case that will bypass node_count_gfp().  Update
    the tests to reflect that slot stores do not require allocations.

    User visible effects of this bug include increased memory usage from the
    unneeded node that was allocated.

    Link: https://lkml.kernel.org/r/20231213205058.386589-1-sidhartha.kumar@oracle.com
    Fixes: 0b8bb544b1a7 ("maple_tree: update mas_preallocate() testing")
    Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Cc: <stable@vger.kernel.org>    [6.6+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:22:38 -04:00
Rafael Aquini d201512b82 maple_tree: reduce resets during store setup
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit fec29364348fec535c55708b1f4025b321aba572
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Mon Jul 24 14:31:56 2023 -0400

    maple_tree: reduce resets during store setup

    mas_prealloc() may walk partially down the tree before finding that a
    split or spanning store is needed.  When the write occurs, relax the
    logic on resetting the walk so that partial walks will not restart, but
    walks that have gone too far (a store that affects beyond the current
    node) should be restarted.

    Link: https://lkml.kernel.org/r/20230724183157.3939892-15-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:33 -04:00
Rafael Aquini e096f6abaa maple_tree: refine mas_preallocate() node calculations
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit 17983dc617837a588a52848ab4034d8efa6c1fa6
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Mon Jul 24 14:31:55 2023 -0400

    maple_tree: refine mas_preallocate() node calculations

    Calculate the number of nodes based on the pending write action instead
    of assuming the worst case.

    This addresses a performance regression introduced on platforms that
    have longer allocation times.

    Link: https://lkml.kernel.org/r/20230724183157.3939892-14-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:32 -04:00
Rafael Aquini da7b84ace9 maple_tree: move mas_wr_end_piv() below mas_wr_extend_null()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit a7496ad529dfd96e37219849bddcda121d133536
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Mon Jul 24 14:31:53 2023 -0400

    maple_tree: move mas_wr_end_piv() below mas_wr_extend_null()

    Relocate it and call mas_wr_extend_null() from within mas_wr_end_piv().
    Extending the NULL may affect the end pivot value so call
    mas_wr_extend_null() from within mas_wr_end_piv() to keep it all
    together.

    Link: https://lkml.kernel.org/r/20230724183157.3939892-12-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:31 -04:00
Rafael Aquini fa45afdecf maple_tree: adjust node allocation on mas_rebalance()
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit c108df767fb786586274b2435473885151d6f360
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Mon Jul 24 14:31:50 2023 -0400

    maple_tree: adjust node allocation on mas_rebalance()

    mas_rebalance() is called to rebalance an insufficient node into a
    single node or two sufficient nodes.  The preallocation estimate is
    always too many in this case as the height of the tree will never grow
    and there is no possibility to have a three way split in this case, so
    revise the node allocation count.

    Link: https://lkml.kernel.org/r/20230724183157.3939892-9-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:28 -04:00
Rafael Aquini 7067cbe997 maple_tree: re-introduce entry to mas_preallocate() arguments
JIRA: https://issues.redhat.com/browse/RHEL-27743

This patch is a backport of the following upstream commit:
commit da0892547b101df6e13255b378380d077975368d
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Mon Jul 24 14:31:49 2023 -0400

    maple_tree: re-introduce entry to mas_preallocate() arguments

    The current preallocation strategy is to preallocate the absolute
    worst-case allocation for a tree modification.  The entry (or NULL) is
    needed to know how many nodes are needed to write to the tree.  Start by
    adding the argument to the mas_preallocate() definition.

    Link: https://lkml.kernel.org/r/20230724183157.3939892-8-Liam.Howlett@oracle.com
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Peng Zhang <zhangpeng.00@bytedance.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:19:28 -04:00
Aristeu Rozanski 1e3d03e26b maple_tree: fix mas_empty_area_rev() null pointer dereference
JIRA: https://issues.redhat.com/browse/RHEL-39862
CVE: CVE-2024-36891
Tested: sanity

commit 955a923d2809803980ff574270f81510112be9cf
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Mon Apr 22 16:33:49 2024 -0400

    maple_tree: fix mas_empty_area_rev() null pointer dereference

    Currently the code calls mas_start() followed by mas_data_end() if the
    maple state is MA_START, but mas_start() may return with the maple state
    node == NULL.  This will lead to a null pointer dereference when checking
    information in the NULL node, which is done in mas_data_end().

    Avoid setting the offset if there is no node by waiting until after the
    maple state is checked for an empty or single entry state.

    A user could trigger the events to cause a kernel oops by unmapping all
    vmas to produce an empty maple tree, then mapping a vma that would cause
    the scenario described above.

    Link: https://lkml.kernel.org/r/20240422203349.2418465-1-Liam.Howlett@oracle.com
    Fixes: 54a611b60590 ("Maple Tree: add new data structure")
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Reported-by: Marius Fleischer <fleischermarius@gmail.com>
    Closes: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/
    Link: https://lore.kernel.org/lkml/CAJg=8jyuSxDL6XvqEXY_66M20psRK2J53oBTP+fjV5xpW2-R6w@mail.gmail.com/
    Tested-by: Marius Fleischer <fleischermarius@gmail.com>
    Tested-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
2024-07-29 12:11:25 -04:00
Nico Pache c552c2aa2a maple_tree: mtree_insert: fix typo in kernel-doc description of GFP flags
commit 4ae6944d15727c50ff1c0bb3fe38b9b412520d85
Author: Mike Rapoport (IBM) <rppt@kernel.org>
Date:   Sat Jul 15 11:40:38 2023 +0300

    maple_tree: mtree_insert: fix typo in kernel-doc description of GFP flags

    Replace FGP_FLAGS with GFP_FLAGS

    Link: https://lkml.kernel.org/r/20230715084038.987955-1-rppt@kernel.org
    Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5595
Signed-off-by: Nico Pache <npache@redhat.com>
2023-09-26 10:23:43 -06:00
Nico Pache 96a3f8eeee maple_tree: mtree_insert*: fix typo in kernel-doc description
commit 4445e58264aea8ec6bb1287add79606f0e3f3988
Author: Mike Rapoport (IBM) <rppt@kernel.org>
Date:   Sat Jul 15 17:39:20 2023 +0300

    maple_tree: mtree_insert*: fix typo in kernel-doc description

    Replace "Insert and entry at a give index" with "Insert an entry at a
    given index"

    Link: https://lkml.kernel.org/r/20230715143920.994812-1-rppt@kernel.org
    Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5595
Signed-off-by: Nico Pache <npache@redhat.com>
2023-09-26 10:23:43 -06:00
Nico Pache 9d911ab8ce maple_tree: disable mas_wr_append() when other readers are possible
commit cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b
Author: Liam R. Howlett <Liam.Howlett@oracle.com>
Date:   Fri Aug 18 20:43:55 2023 -0400

    maple_tree: disable mas_wr_append() when other readers are possible

    The current implementation of append may cause duplicate data and/or
    incorrect ranges to be returned to a reader during an update.  Although
    this has not been reported or seen, disable the append write operation
    while the tree is in rcu mode out of an abundance of caution.

    During analysis of mas_next_slot(), the following interleaving was
    artificially created by separating the writer and reader code:

    Writer:                                 Reader:
    mas_wr_append
        set end pivot
        updates end metadata
        Detects write to last slot
        last slot write is to start of slot
        store current contents in slot
        overwrite old end pivot
                                            mas_next_slot():
                                                    read end metadata
                                                    read old end pivot
                                                    return with incorrect range
        store new value

    Alternatively:

    Writer:                                 Reader:
    mas_wr_append
        set end pivot
        updates end metadata
        Detects write to last slot
        last slot write is to end of slot
        store value
                                            mas_next_slot():
                                                    read end metadata
                                                    read old end pivot
                                                    read new end pivot
                                                    return with incorrect range
        set old end pivot

    There may be other accesses that are not safe since we are now updating
    both metadata and pointers, so disabling append if there could be rcu
    readers is the safest action.

    Link: https://lkml.kernel.org/r/20230819004356.1454718-2-Liam.Howlett@oracle.com
    Fixes: 54a611b60590 ("Maple Tree: add new data structure")
    Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5595
Signed-off-by: Nico Pache <npache@redhat.com>
2023-09-26 10:23:43 -06:00
Nico Pache 6995d5aa65 maple_tree: set the node limit when creating a new root node
commit 3c769fd88b9742954763a968e84de09f7ad78cfe
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Tue Jul 11 11:54:37 2023 +0800

    maple_tree: set the node limit when creating a new root node

    Set the node limit of the root node so that the last pivot of all nodes is
    the node limit (if the node is not full).

    This patch also fixes a bug in mas_rev_awalk().  Effectively, always
    setting a maximum makes mas_logical_pivot() behave as mas_safe_pivot().
    Without this fix, it is possible that very small tasks would fail to find
    the correct gap.  Although this has not been observed with real tasks, it
    has been reported to happen in m68k nommu running the maple tree tests.

    Link: https://lkml.kernel.org/r/20230711035444.526-1-zhangpeng.00@bytedance.com
    Link: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg53zZX_DnA@mail.gmail.com/
    Link: https://lkml.kernel.org/r/20230711035444.526-2-zhangpeng.00@bytedance.com
    Fixes: 54a611b60590 ("Maple Tree: add new data structure")
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5595
Signed-off-by: Nico Pache <npache@redhat.com>
2023-09-26 10:23:43 -06:00
Nico Pache 7de4d6d5c0 maple_tree: fix a few documentation issues
commit fad9c80e6371ee04a3fa5728efe20b88d8e4cccd
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Tue May 23 22:51:01 2023 +0200

    maple_tree: fix a few documentation issues

    The documentation of mt_next() claims that it starts the search at the
    provided index.  That's incorrect as it starts the search after the
    provided index.

    The documentation of mt_find() is slightly confusing.  "Handles locking"
    is not really helpful as it does not explain how the "locking" works.
    Also the documentation of index talks about a range, while in reality the
    index is updated on a successful search to the index of the found entry
    plus one.

    Fix similar issues for mt_find_after() and mt_prev().

    Reword the confusing "Note: Will not return the zero entry." comment on
    mt_for_each() and document @__index correctly.

    Link: https://lkml.kernel.org/r/87ttw2n556.ffs@tglx
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Shanker Donthineni <sdonthineni@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5595
Signed-off-by: Nico Pache <npache@redhat.com>
2023-09-26 10:23:43 -06:00
Nico Pache b1ff91850e maple_tree: simplify and clean up mas_wr_node_store()
commit 7a03ae39209cf882bb673f724ba723020e1233cc
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Wed May 24 11:12:46 2023 +0800

    maple_tree: simplify and clean up mas_wr_node_store()

    Simplify and clean up mas_wr_node_store(), and remove unnecessary code.

    Link: https://lkml.kernel.org/r/20230524031247.65949-10-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5595
Signed-off-by: Nico Pache <npache@redhat.com>
2023-09-26 10:23:43 -06:00
Nico Pache 95266c182b maple_tree: rework mas_wr_slot_store() to be cleaner and more efficient.
commit e6d1ffd611affcd58b31047324f511d8e1de2a38
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Wed May 24 11:12:45 2023 +0800

    maple_tree: rework mas_wr_slot_store() to be cleaner and more efficient.

    Determine whether the two gaps to be overwritten are empty so that
    mas_update_gap() is not called every time.  Also clean up the code and
    add comments.

    Link: https://lkml.kernel.org/r/20230524031247.65949-9-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5595
Signed-off-by: Nico Pache <npache@redhat.com>
2023-09-26 10:23:43 -06:00
Nico Pache c9d0d400d1 maple_tree: add comments and some minor cleanups to mas_wr_append()
commit 2e1da329b424c693662e7e2afa34654989de3fac
Author: Peng Zhang <zhangpeng.00@bytedance.com>
Date:   Wed May 24 11:12:44 2023 +0800

    maple_tree: add comments and some minor cleanups to mas_wr_append()

    Add comment for mas_wr_append(), move mas_update_gap() into
    mas_wr_append(), and other cleanups to make mas_wr_modify() cleaner.

    Link: https://lkml.kernel.org/r/20230524031247.65949-8-zhangpeng.00@bytedance.com
    Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
    Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

JIRA: https://issues.redhat.com/browse/RHEL-5595
Signed-off-by: Nico Pache <npache@redhat.com>
2023-09-26 10:23:42 -06:00