  1. Aug 30, 2023
  2. Aug 21, 2023
    • mm: enable page walking API to lock vmas during the walk · 49b06385
      Suren Baghdasaryan authored
      walk_page_range() and friends often operate under a write-locked mmap_lock.
      With the introduction of per-vma locks, the vmas have to be locked as well
      during such walks to prevent concurrent page faults in these areas.  Add an
      additional member to mm_walk_ops to indicate the locking requirements of
      the walk.
      
      The change ensures that page walks which prevent concurrent page faults by
      write-locking mmap_lock operate correctly after the introduction of per-vma
      locks.  With per-vma locks, page faults can be handled under the vma lock
      without taking mmap_lock at all, so write-locking mmap_lock alone would not
      stop them.  The change makes sure the vmas are properly locked during such
      walks.
      
      A sample issue this solves is do_mbind() performing queue_pages_range() to
      queue pages for migration.  Without this change a page could be concurrently
      faulted into the area and be left out of the migration.  (A minimal sketch
      of the new mm_walk_ops member follows this entry.)
      
      Link: https://lkml.kernel.org/r/20230804152724.3090321-2-surenb@google.com
      
      
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
      Suggested-by: Jann Horn <jannh@google.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Laurent Dufour <ldufour@linux.ibm.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Michel Lespinasse <michel@lespinasse.org>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
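      For reference, the locking requirement is expressed as a new member of
      struct mm_walk_ops.  A minimal sketch of a walker that asks for its vmas
      to be write-locked during the walk is shown below; the enum and field
      names follow the upstream patch (quoted from memory), while the callback
      name is only a placeholder:
      
          /* roughly as introduced in include/linux/pagewalk.h */
          enum page_walk_lock {
                  PGWALK_RDLOCK,          /* a read-locked mmap_lock is sufficient */
                  PGWALK_WRLOCK,          /* write-lock each vma during the walk */
                  PGWALK_WRLOCK_VERIFY,   /* vmas are expected to be write-locked already */
          };
      
          static const struct mm_walk_ops my_walk_ops = {
                  .pmd_entry = my_pmd_entry,      /* placeholder per-PMD callback */
                  .walk_lock = PGWALK_WRLOCK,     /* exclude concurrent per-vma page faults */
          };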
  3. Aug 16, 2023
  4. Jul 29, 2023
  5. Jul 27, 2023
    • s390/vmem: split pages when debug pagealloc is enabled · edc1e4b6
      Sven Schnelle authored
      
      Since commit bb1520d5 ("s390/mm: start kernel with DAT enabled")
      the kernel crashes early during boot when debug pagealloc is enabled:
      
      mem auto-init: stack:off, heap alloc:off, heap free:off
      addressing exception: 0005 ilc:2 [#1] SMP DEBUG_PAGEALLOC
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0-rc3-09759-gc5666c912155 #630
      [..]
      Krnl Code: 00000000001325f6: ec5600248064 cgrj %r5,%r6,8,000000000013263e
                 00000000001325fc: eb880002000c srlg %r8,%r8,2
                #0000000000132602: b2210051     ipte %r5,%r1,%r0,0
                >0000000000132606: b90400d1     lgr %r13,%r1
                 000000000013260a: 41605008     la %r6,8(%r5)
                 000000000013260e: a7db1000     aghi %r13,4096
                 0000000000132612: b221006d     ipte %r6,%r13,%r0,0
                 0000000000132616: e3d0d0000171 lay %r13,4096(%r13)
      
      Call Trace:
       __kernel_map_pages+0x14e/0x320
       __free_pages_ok+0x23a/0x5a8
       free_low_memory_core_early+0x214/0x2c8
       memblock_free_all+0x28/0x58
       mem_init+0xb6/0x228
       mm_core_init+0xb6/0x3b0
       start_kernel+0x1d2/0x5a8
       startup_continue+0x36/0x40
      Kernel panic - not syncing: Fatal exception: panic_on_oops
      
      This is caused by using large mappings on machines with EDAT1/EDAT2.  Add
      code to split the mappings into 4k pages if debug pagealloc is enabled,
      either via CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT or via the debug_pagealloc
      kernel command line option.
      
      Fixes: bb1520d5 ("s390/mm: start kernel with DAT enabled")
      Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
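      The gist of the fix, as a sketch: EDAT large mappings cannot be validated
      and invalidated with 4k granularity by __kernel_map_pages(), so they must
      not be used while debug pagealloc is in effect.  The helper name below is
      made up for illustration; debug_pagealloc_enabled() and MACHINE_HAS_EDAT1
      are existing kernel interfaces:
      
          /* hypothetical helper: may a 1M segment mapping be used here? */
          static bool may_use_segment_mapping(void)
          {
                  if (debug_pagealloc_enabled())
                          return false;           /* force 4k ptes so pages can be (in)validated */
                  return MACHINE_HAS_EDAT1;       /* otherwise large mappings are fine */
          }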
  6. Jul 24, 2023
    • s390/extmem: improve reporting of -ERANGE error · 9916bf4e
      Alexander Gordeev authored
      
      The segment_warning() interface reports the maximum mappable physical
      address for the -ERANGE error.  Currently that address is the value of the
      VMEM_MAX_PHYS macro, but that may well change.  A better way to obtain that
      address is to call the arch_get_mappable_range() callback - the one that is
      used by vmem_add_mapping() and generates the -ERANGE error in the first
      place.  (A short sketch follows this entry.)
      
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
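      A sketch of the reporting change: instead of printing VMEM_MAX_PHYS, the
      limit is obtained from the same callback that produced the error (the
      message text here is illustrative only):
      
          struct range mappable = arch_get_mappable_range();
      
          pr_err("DCSS segment exceeds the maximum mappable address 0x%llx\n",
                 mappable.end);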
    • s390/mm: rework arch_get_mappable_range() callback · 94fd5220
      Alexander Gordeev authored
      
      As per description in mm/memory_hotplug.c platforms should define
      arch_get_mappable_range() that provides maximum possible addressable
      physical memory range for which the linear mapping could be created.
      
      The current implementation uses the VMEM_MAX_PHYS macro as the maximum
      mappable physical address, and that macro is simply a cast of vmemmap.
      Since the address is in the physical address space, the natural upper limit
      of MAX_PHYSMEM_BITS is honoured:
      
      	vmemmap_start = min(vmemmap_start, 1UL << MAX_PHYSMEM_BITS);
      
      Further, to make sure the identity mapping does not overlap with vmemmap,
      the size of the identity mapping is capped like this:
      
      	ident_map_size = min(ident_map_size, vmemmap_start);
      
      Similarly, any other memory that could be added (e.g. a DCSS segment)
      must not overlap with vmemmap either, and that is prevented by using
      vmemmap (the VMEM_MAX_PHYS macro) as the upper limit.
      
      However, while the use of VMEM_MAX_PHYS brings the desired result, it
      actually poses two issues:
      
      1. As described, vmemmap is handled as a physical address, although
         it is actually a pointer to struct page in the virtual address space.
      
      2. As vmemmap is a virtual address, it could be located anywhere in the
         virtual address space.  However, the need to honour the
         MAX_PHYSMEM_BITS limit prevents that.
      
      Rework the arch_get_mappable_range() callback so that it does not use the
      VMEM_MAX_PHYS macro and, as a result, does not confuse the notions of
      virtual vs physical address spaces.  That paves the way for moving vmemmap
      elsewhere and optimizing the virtual address space layout.
      
      Introduce the preserved boot variable max_mappable and let
      setup_kernel_memory_layout() set it up.  As a result, the rest of the code
      does not need to know the virtual memory layout specifics (see the sketch
      after this entry).
      
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
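      Conceptually, the reworked callback reduces to returning the boot-time
      computed limit.  A sketch, with the exact inclusive/exclusive bounds
      handling glossed over:
      
          struct range arch_get_mappable_range(void)
          {
                  struct range mappable;
      
                  mappable.start = 0;
                  mappable.end = max_mappable;    /* set up by setup_kernel_memory_layout() */
                  return mappable;
          }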
  7. Jul 18, 2023
  8. Jul 04, 2023
  9. Jul 03, 2023
  10. Jun 28, 2023
  11. Jun 27, 2023
    • mm: always expand the stack with the mmap write lock held · 8d7071af
      Linus Torvalds authored
      
      This finishes the job of always holding the mmap write lock when
      extending the user stack vma, and removes the 'write_locked' argument
      from the vm helper functions again.
      
      For some cases, we just avoid expanding the stack at all: drivers and
      page pinning really shouldn't be extending any stacks.  Let's see if any
      strange users really wanted that.
      
      It's worth noting that architectures that weren't converted to the new
      lock_mm_and_find_vma() helper function are left using the legacy
      "expand_stack()" function, but it has been changed to drop the mmap_lock
      and take it for writing while expanding the vma.  This makes it fairly
      straightforward to convert the remaining architectures.
      
      As a result of dropping and re-taking the lock, the calling conventions
      for this function have also changed, since the old vma may no longer be
      valid.  So it will now return the new vma if successful, and NULL - and
      the lock dropped - if the area could not be extended.
      
      Tested-by: Vegard Nossum <vegard.nossum@oracle.com>
      Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> # ia64
      Tested-by: Frank Scheiner <frank.scheiner@web.de> # ia64
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
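      A sketch of the resulting calling convention for a legacy caller: the
      returned vma (or NULL) must be used, because the lock may have been
      dropped and re-taken and the old vma pointer may be stale:
      
          vma = find_vma(mm, addr);
          if (!vma || addr < vma->vm_start) {
                  vma = expand_stack(mm, addr);   /* drops mmap_lock and returns NULL on failure */
                  if (!vma)
                          return -EFAULT;         /* lock is already gone, do not unlock again */
          }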
  12. Jun 20, 2023
  13. Jun 19, 2023
    • s390: gmap use pte_unmap_unlock() not spin_unlock() · b2f58941
      Hugh Dickins authored
      pte_alloc_map_lock() expects to be followed by pte_unmap_unlock(): to
      keep balance in future, pass ptep as well as ptl to gmap_pte_op_end(),
      and use pte_unmap_unlock() instead of direct spin_unlock() (even though
      ptep ends up unused inside the macro).
      
      Link: https://lkml.kernel.org/r/78873af-e1ec-4f9-47ac-483940ac6daa@google.com
      
      
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
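      The balanced pattern this keeps in place, as a sketch:
      
          ptep = pte_alloc_map_lock(mm, pmdp, addr, &ptl);
          if (!ptep)
                  return -ENOMEM;
          /* ... gmap work on *ptep ... */
          pte_unmap_unlock(ptep, ptl);    /* pairs with pte_alloc_map_lock(), not a bare spin_unlock() */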
    • s390: allow pte_offset_map_lock() to fail · 5c7f3bf0
      Hugh Dickins authored
      In rare transient cases, not yet made possible, pte_offset_map() and
      pte_offset_map_lock() may not find a page table: handle appropriately.
      
      Add a comment on mm's contract with s390 above __zap_zero_pages(), and
      fix the old comment there: it must be called after THP was disabled.
      
      Link: https://lkml.kernel.org/r/3ff29363-336a-9733-12a1-5c31a45c8aeb@google.com
      
      
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: John David Anglin <dave.anglin@bell.net>
      Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
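      A sketch of what the adjusted callers look like: the lookup can now come
      back empty and has to be treated as "nothing mapped here":
      
          ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl);
          if (!ptep)
                  return 0;               /* no page table (any more): nothing to do */
          /* ... examine or modify *ptep ... */
          pte_unmap_unlock(ptep, ptl);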
  14. May 17, 2023
  15. May 04, 2023
    • KVM: s390: pv: fix asynchronous teardown for small VMs · 292a7d6f
      Claudio Imbrenda authored
      
      On machines without the Destroy Secure Configuration Fast UVC, the
      topmost level of page tables is set aside and freed asynchronously
      as last step of the asynchronous teardown.
      
      Each gmap has a host_to_guest radix tree mapping host (userspace)
      addresses (with 1M granularity) to gmap segment table entries (pmds).
      
      If a guest is smaller than 2GB, the topmost level of page tables is the
      segment table (i.e. there are only 2 levels). Replacing it means that
      the pointers in the host_to_guest mapping would become stale and cause
      all kinds of nasty issues.
      
      This patch fixes the issue by disallowing asynchronous teardown for
      guests with only 2 levels of page tables. Userspace should (and already
      does) try using the normal destroy if the asynchronous one fails.
      
      Update s390_replace_asce so it refuses to replace segment type ASCEs.
      This is still needed in case the normal destroy VM fails.
      
      Fixes: fb491d55 ("KVM: s390: pv: asynchronous destroy for reboot")
      Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
      Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
      Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
      Message-Id: <20230421085036.52511-2-imbrenda@linux.ibm.com>
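      The added guard in s390_replace_asce(), sketched from the description
      above (constant names as found in arch/s390/include/asm/pgtable.h): an
      ASCE that designates a segment table (2-level guest, smaller than 2GB) is
      refused, so userspace falls back to the normal destroy path:
      
          /* refuse to replace a segment-table ASCE: host_to_guest entries point into it */
          if ((gmap->asce & _ASCE_TYPE_MASK) == _ASCE_TYPE_SEGMENT)
                  return -EINVAL;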
  16. May 03, 2023
  17. Apr 21, 2023
    • mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() · 6b008640
      Linus Torvalds authored
      Instead of having callers care about the mmap_min_addr logic for the
      lowest valid mapping address (and some of them getting it wrong), just
      move the logic into vm_unmapped_area() itself.  One less thing for various
      architecture cases (and generic helpers) to worry about.
      
      We should really try to make much more of this be common code, but baby
      steps..
      
      Without this, vm_unmapped_area() could return an address below
      mmap_min_addr (because some caller forgot about that).  That then causes
      the mmap machinery to think it has found a workable address, but then
      later security_mmap_addr(addr) is unhappy about it and the mmap() returns
      with a nonsensical error (EPERM).
      
      The proper action is to either return ENOMEM (if the virtual address space
      is exhausted), or try to find another address (i.e. do a bottom-up search
      for free addresses after the top-down one failed).
      
      See commit 2afc745f ("mm: ensure get_unmapped_area() returns higher
      address than mmap_min_addr"), which fixed this for one call site (the
      generic arch_get_unmapped_area_topdown() fallback) but left other cases
      alone.
      
      Link: https://lkml.kernel.org/r/20230418214009.1142926-1-Liam.Howlett@oracle.com
      
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
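      The centralized clamp, sketched (the surrounding gap-search logic of
      unmapped_area()/unmapped_area_topdown() is elided):
      
          unsigned long low_limit = info->low_limit;
      
          /* never hand out an address below the configured minimum mapping address */
          if (low_limit < mmap_min_addr)
                  low_limit = mmap_min_addr;
          /* ... gap search then proceeds between low_limit and info->high_limit ... */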
    • mm: add new api to enable ksm per process · d7597f59
      Stefan Roesch authored
      Patch series "mm: process/cgroup ksm support", v9.
      
      So far KSM can only be enabled by calling madvise for memory regions.  To
      be able to use KSM for more workloads, KSM needs to have the ability to be
      enabled / disabled at the process / cgroup level.
      
      Use case 1:
        The madvise call is not available in the programming language.  An
        example of this are programs with forked workloads using a garbage
        collected language without pointers.  In such a language madvise cannot
        be made available.
      
        In addition, the addresses of objects get moved around as they are
        garbage collected.  KSM sharing needs to be enabled "from the outside"
        for these types of workloads.
      
      Use case 2:
        The same interpreter can also be used for workloads where KSM brings
        no benefit or even has overhead.  We'd like to be able to enable KSM on
        a workload by workload basis.
      
      Use case 3:
        With the madvise call, sharing opportunities are only enabled for the
        current process: it is a workload-local decision.  A considerable number
        of sharing opportunities may exist across multiple workloads or jobs (if
        they are part of the same security domain).  Only a higher-level entity
        like a job scheduler or container can know for certain whether it is
        running one or more instances of a job.  That job scheduler however
        doesn't have the necessary internal workload knowledge to make targeted
        madvise calls.
      
      Security concerns:
      
        In previous discussions security concerns have been brought up.  The
        problem is that an individual workload does not have knowledge about
        what else is running on a machine.  Therefore it has to be very
        conservative about which memory areas can be shared.  However, if the
        system is dedicated to running multiple jobs within the same security
        domain, it is the job scheduler that has the knowledge that sharing can
        be safely enabled and is even desirable.
      
      Performance:
      
        Experiments with using UKSM have shown a capacity increase of around 20%.
      
        Here are the metrics from an instagram workload (taken from a machine
        with 64GB main memory):
      
         full_scans: 445
         general_profit: 20158298048
         max_page_sharing: 256
         merge_across_nodes: 1
         pages_shared: 129547
         pages_sharing: 5119146
         pages_to_scan: 4000
         pages_unshared: 1760924
         pages_volatile: 10761341
         run: 1
         sleep_millisecs: 20
         stable_node_chains: 167
         stable_node_chains_prune_millisecs: 2000
         stable_node_dups: 2751
         use_zero_pages: 0
         zero_pages_sharing: 0
      
      After the service is running for 30 minutes to an hour, 4 to 5 million
      shared pages are common for this workload when using KSM.
      
      
      Detailed changes:
      
      1. New options for the prctl system call
         This patch series adds two new options to the prctl system call.
         The first one allows enabling KSM at the process level and the second
         one queries the setting.
      
      The setting will be inherited by child processes.
      
      With the above setting, KSM can be enabled for the seed process of a cgroup
      and all processes in the cgroup will inherit the setting.
      
      2. Changes to KSM processing
         When KSM is enabled at the process level, the KSM code will iterate
         over all the VMAs and enable KSM for the eligible VMAs.
      
         When forking a process that has KSM enabled, the setting will be
         inherited by the new child process.
      
      3. Add general_profit metric
         The general_profit metric of KSM is specified in the documentation,
         but not calculated.  This adds the general profit metric to
         /sys/kernel/debug/mm/ksm.
      
      4. Add more metrics to ksm_stat
         This adds the process profit metric to /proc/<pid>/ksm_stat.
      
      5. Add more tests to ksm_tests and ksm_functional_tests
         This adds an option to specify the merge type to the ksm_tests.
         This allows testing both the madvise and the prctl based KSM setup.
      
         It also adds two new tests to ksm_functional_tests: one tests the
         new prctl options, and the other is a fork test verifying that the
         KSM process setting is inherited by child processes.
      
      
      This patch (of 3):
      
      So far KSM can only be enabled by calling madvise for memory regions.  To
      be able to use KSM for more workloads, KSM needs to have the ability to be
      enabled / disabled at the process / cgroup level.
      
      1. New options for prctl system command
      
         This patch series adds two new options to the prctl system call.
         The first one allows enabling KSM at the process level and the second
         one queries the setting.
      
         The setting will be inherited by child processes.
      
         With the above setting, KSM can be enabled for the seed process of a
         cgroup and all processes in the cgroup will inherit the setting.
      
      2. Changes to KSM processing
      
         When KSM is enabled at the process level, the KSM code will iterate
         over all the VMAs and enable KSM for the eligible VMAs.
      
         When forking a process that has KSM enabled, the setting will be
         inherited by the new child process.
      
        1) Introduce new MMF_VM_MERGE_ANY flag
      
           This introduces the new flag MMF_VM_MERGE_ANY.  When this flag is
           set, kernel samepage merging (KSM) gets enabled for all vmas of a
           process.
      
        2) Setting VM_MERGEABLE on VMA creation
      
           When a VMA is created, if the MMF_VM_MERGE_ANY flag is set, the
           VM_MERGEABLE flag will be set for this VMA.
      
        3) support disabling of ksm for a process
      
           This adds the ability to disable ksm for a process if ksm has been
           enabled for the process with prctl.
      
        4) add new prctl option to get and set ksm for a process
      
           This adds two new options to the prctl system call
           - enable ksm for all vmas of a process (if the vmas support it).
           - query if ksm has been enabled for a process.
      
      3. Disabling MMF_VM_MERGE_ANY for storage keys in s390
      
         In the s390 architecture, when storage keys are used, MMF_VM_MERGE_ANY
         will be disabled.  (A minimal userspace usage sketch of the new prctl
         options follows this entry.)
      
      Link: https://lkml.kernel.org/r/20230418051342.1919757-1-shr@devkernel.io
      Link: https://lkml.kernel.org/r/20230418051342.1919757-2-shr@devkernel.io
      
      
      Signed-off-by: Stefan Roesch <shr@devkernel.io>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Bagas Sanjaya <bagasdotme@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
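      A minimal userspace usage sketch of the new interface.  The
      PR_SET_MEMORY_MERGE/PR_GET_MEMORY_MERGE names are the ones added by this
      series; the numeric fallback values below are an assumption for older
      <linux/prctl.h> headers and should be checked against the installed
      kernel headers:
      
          #include <stdio.h>
          #include <sys/prctl.h>
      
          #ifndef PR_SET_MEMORY_MERGE
          #define PR_SET_MEMORY_MERGE 67
          #define PR_GET_MEMORY_MERGE 68
          #endif
      
          int main(void)
          {
                  /* opt the whole process (and future children) into KSM */
                  if (prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0))
                          perror("PR_SET_MEMORY_MERGE");
                  /* read the setting back */
                  printf("KSM enabled for process: %ld\n",
                         (long)prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0));
                  return 0;
          }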
  18. Apr 20, 2023
    • s390/mm: use VM_FLUSH_RESET_PERMS in module_alloc() · 34e4c79f
      Heiko Carstens authored
      
      Make use of the set_direct_map() calls for module allocations.
      In particular:
      
      - All changes to read-only permissions in kernel VA mappings are also
        applied to the direct mapping. Note that execute permissions are
        intentionally not applied to the direct mapping in order to make
        sure that all allocated pages within the direct mapping stay
        non-executable.
      
      - module_alloc() passes VM_FLUSH_RESET_PERMS to __vmalloc_node_range()
        to make sure that all implicit permission changes made to the direct
        mapping are reset when the allocated vm area is freed again.
      
      Side effects: the direct mapping will be fragmented depending on how many
      vm areas with VM_FLUSH_RESET_PERMS and/or explicit page permission changes
      are allocated and freed again.
      
      For example, just after boot of a system the direct mapping statistics look
      like:
      
      $ cat /proc/meminfo
      ...
      DirectMap4k:      111628 kB
      DirectMap1M:    16665600 kB
      DirectMap2G:           0 kB
      
      Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
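      The relevant call, sketched with abbreviated arguments (the real s390
      module_alloc() has additional details around KASAN and alignment):
      
          p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
                                   GFP_KERNEL, PAGE_KERNEL,
                                   VM_FLUSH_RESET_PERMS,  /* reset direct-map permissions on free */
                                   NUMA_NO_NODE, __builtin_return_address(0));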
    • s390/mm: enable ARCH_HAS_SET_DIRECT_MAP · 0490d6d7
      Heiko Carstens authored
      
      Implement the set_direct_map_*() API, which allows invalidating pages and
      restoring default permissions for pages within the direct mapping (the
      prototypes are quoted after this entry).
      
      Note that kernel_page_present(), which is also supposed to be part of this
      API, is intentionally not implemented. The reason for this is that
      kernel_page_present() is only used (and currently only makes sense) for
      suspend/resume, which isn't supported on s390.
      
      Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
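      For reference, the generic entry points this implements (prototypes as in
      include/linux/set_memory.h):
      
          int set_direct_map_invalid_noflush(struct page *page);   /* unmap the page from the direct map */
          int set_direct_map_default_noflush(struct page *page);   /* restore default (writable) permissions */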
  19. Apr 13, 2023
  20. Apr 06, 2023
  21. Mar 20, 2023