    mm: count zeromap read and set for swapout and swapin · e7ac4dae
    Barry Song authored
    When the proportion of folios from the zeromap is small, missing their
    accounting may not significantly impact profiling.  However, it's easy to
    construct a scenario where this becomes an issue: for example, allocating
    1 GB of memory, writing zeros from userspace, followed by MADV_PAGEOUT,
    and then swapping it back in.  In this case, the swap-out and swap-in
    counts seem to vanish into a black hole, potentially causing semantic
    ambiguity.
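
The scenario above can be approximated from userspace. The following is a minimal sketch, not the test used for this patch: the helper name zero_fill_and_page_out() is hypothetical, the fallback value 21 for MADV_PAGEOUT is the generic Linux UAPI value and is assumed here (it is used only when the mmap module does not export the constant), and whether any page is actually swapped out depends on swap being configured on the running system.

```python
import mmap

# Assumption: 21 is MADV_PAGEOUT in the generic Linux UAPI headers; it is
# used only if this Python build does not export mmap.MADV_PAGEOUT.
MADV_PAGEOUT = getattr(mmap, "MADV_PAGEOUT", 21)

CHUNK = 1 << 20  # write and read in 1 MiB pieces to keep copies small


def zero_fill_and_page_out(size):
    """Write zeros to private anonymous memory, request page-out, read back.

    Returns (paged_out_requested, all_zero). The first value is False when
    the kernel rejects MADV_PAGEOUT; the second confirms the data read back
    after the request is still all zeros.
    """
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        # Explicitly writing zeros populates real pages with zero content.
        for off in range(0, size, CHUNK):
            n = min(CHUNK, size - off)
            buf[off:off + n] = bytes(n)

        try:
            buf.madvise(MADV_PAGEOUT)  # ask the kernel to reclaim the range
            requested = True
        except OSError:
            requested = False  # e.g. kernel without MADV_PAGEOUT support

        # Reading the range back faults in anything that was swapped out.
        all_zero = True
        for off in range(0, size, CHUNK):
            n = min(CHUNK, size - off)
            if buf[off:off + n] != bytes(n):
                all_zero = False
                break
        return requested, all_zero
    finally:
        buf.close()
```

Calling zero_fill_and_page_out(1 << 30) mirrors the 1 GB case described above; comparing swap counters before and after the call shows whether the activity was accounted.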
    
    On the other hand, Usama reported that zero-filled pages can exceed 10% in
    workloads utilizing zswap, while Hailong noted that some apps on Android
    have more than 6% zero-filled pages.  Before commit 0ca0c24e ("mm:
    store zero pages to be swapped out in a bitmap"), both zswap and zRAM
    implemented similar optimizations, leading to these optimized-out pages
    being counted in either zswap or zRAM counters (with pswpin/pswpout also
    increasing for zRAM).  Because the zeromap now intercepts these pages
    before they reach zswap or zRAM, userspace can no longer detect these
    swap-out and swap-in actions.
    
    We have three ways to address this:
    
    1. Introduce a dedicated counter specifically for the zeromap.
    
    2. Use pswpin/pswpout accounting, treating the zeromap as a standard
       backend.  This approach aligns with zRAM's current handling of
       same-page fills at the device level.  However, it would mean losing the
       optimized-out page counters previously available in zRAM and would not
       align with systems using zswap.  Additionally, as noted by Nhat Pham,
       pswpin/pswpout counters apply only to I/O done directly to the backend
       device.
    
    3. Count zeromap pages under zswap, aligning with system behavior when
       zswap is enabled.  However, this would not be consistent with zRAM, nor
       would it align with systems lacking both zswap and zRAM.
    
    Given the complications with options 2 and 3, this patch selects
    option 1.
    
    These counters are exposed in /proc/vmstat (counters for the whole
    system) and in memcg's memory.stat (counters for the memcg of interest).
    
    For example:
    
    $ grep -E 'swpin_zero|swpout_zero' /proc/vmstat
    swpin_zero 1648
    swpout_zero 33536
    
    $ grep -E 'swpin_zero|swpout_zero' /sys/fs/cgroup/system.slice/memory.stat
    swpin_zero 3905
    swpout_zero 3985
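
A user-space agent that tracks swap activity would typically sample these counters twice and look at the difference. The following is a small sketch of that idea; the helper names parse_counters(), read_counters() and counter_delta() are hypothetical, and only the two counter names and the two file locations come from this patch.

```python
ZERO_COUNTERS = ("swpin_zero", "swpout_zero")


def parse_counters(text, names=ZERO_COUNTERS):
    """Extract the requested counters from vmstat/memory.stat style text."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Both files use "name value" lines; ignore anything else.
        if len(fields) == 2 and fields[0] in names:
            counters[fields[0]] = int(fields[1])
    return counters


def read_counters(path="/proc/vmstat", names=ZERO_COUNTERS):
    """Read the counters from a stat file; absent counters are omitted."""
    with open(path) as f:
        return parse_counters(f.read(), names)


def counter_delta(before, after):
    """Per-counter increase between two samples (missing counts as 0)."""
    return {name: after[name] - before.get(name, 0) for name in after}
```

On a kernel without this patch the two names are simply missing from both files, so parse_counters() returns an empty dict rather than raising.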
    
    This patch does not address any specific zeromap bug, but the missing
    swpout and swpin counts for zero-filled pages can be highly confusing and
    may mislead user-space agents that rely on changes in these counters as
    indicators.  Therefore, we add a Fixes tag to encourage the inclusion of
    these counters in any kernel versions with zeromap.
    
    Many thanks to Kanchana for the contribution of changing
    count_objcg_event() to count_objcg_events() to support large folios[1],
    which has now been incorporated into this patch.
    
    [1] https://lkml.kernel.org/r/20241001053222.6944-5-kanchana.p.sridhar@intel.com
    
    Link: https://lkml.kernel.org/r/20241107011246.59137-1-21cnbao@gmail.com
    
    
    Fixes: 0ca0c24e ("mm: store zero pages to be swapped out in a bitmap")
    Co-developed-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
    Signed-off-by: Barry Song <v-songbaohua@oppo.com>
    Reviewed-by: Nhat Pham <nphamcs@gmail.com>
    Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Usama Arif <usamaarif642@gmail.com>
    Cc: Yosry Ahmed <yosryahmed@google.com>
    Cc: Hailong Liu <hailong.liu@oppo.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Shakeel Butt <shakeel.butt@linux.dev>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: Chris Li <chrisl@kernel.org>
    Cc: "Huang, Ying" <ying.huang@intel.com>
    Cc: Kairui Song <kasong@tencent.com>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>