  1. Mar 03, 2025
  2. Jan 14, 2025
    • ubifs: authentication: Fix use-after-free in ubifs_tnc_end_commit · f35f522c
      Waqar Hameed authored and Frieder Schrempf committed
      
      [ Upstream commit 4617fb8fc15effe8eda4dd898d4e33eb537a7140 ]
      
      After an insertion in TNC, the tree might split and cause a node to
      change its `znode->parent`. After a further deletion of other nodes in
      the tree (which could also free those nodes), the aforementioned node's
      `znode->cparent` could still point to a freed node. This
      `znode->cparent` may not be updated when getting nodes to commit in
      `ubifs_tnc_start_commit()`. This could then trigger a use-after-free
      when accessing the `znode->cparent` in `write_index()` in
      `ubifs_tnc_end_commit()`.
      
      This can be triggered by running
      
        rm -f /etc/test-file.bin
        dd if=/dev/urandom of=/etc/test-file.bin bs=1M count=60 conv=fsync
      
      in a loop, and with `CONFIG_UBIFS_FS_AUTHENTICATION`. KASAN then
      reports:
      
        BUG: KASAN: use-after-free in ubifs_tnc_end_commit+0xa5c/0x1950
        Write of size 32 at addr ffffff800a3af86c by task ubifs_bgt0_20/153
      
        Call trace:
         dump_backtrace+0x0/0x340
         show_stack+0x18/0x24
         dump_stack_lvl+0x9c/0xbc
         print_address_description.constprop.0+0x74/0x2b0
         kasan_report+0x1d8/0x1f0
         kasan_check_range+0xf8/0x1a0
         memcpy+0x84/0xf4
         ubifs_tnc_end_commit+0xa5c/0x1950
         do_commit+0x4e0/0x1340
         ubifs_bg_thread+0x234/0x2e0
         kthread+0x36c/0x410
         ret_from_fork+0x10/0x20
      
        Allocated by task 401:
         kasan_save_stack+0x38/0x70
         __kasan_kmalloc+0x8c/0xd0
         __kmalloc+0x34c/0x5bc
         tnc_insert+0x140/0x16a4
         ubifs_tnc_add+0x370/0x52c
         ubifs_jnl_write_data+0x5d8/0x870
         do_writepage+0x36c/0x510
         ubifs_writepage+0x190/0x4dc
         __writepage+0x58/0x154
         write_cache_pages+0x394/0x830
         do_writepages+0x1f0/0x5b0
         filemap_fdatawrite_wbc+0x170/0x25c
         file_write_and_wait_range+0x140/0x190
         ubifs_fsync+0xe8/0x290
         vfs_fsync_range+0xc0/0x1e4
         do_fsync+0x40/0x90
         __arm64_sys_fsync+0x34/0x50
         invoke_syscall.constprop.0+0xa8/0x260
         do_el0_svc+0xc8/0x1f0
         el0_svc+0x34/0x70
         el0t_64_sync_handler+0x108/0x114
         el0t_64_sync+0x1a4/0x1a8
      
        Freed by task 403:
         kasan_save_stack+0x38/0x70
         kasan_set_track+0x28/0x40
         kasan_set_free_info+0x28/0x4c
         __kasan_slab_free+0xd4/0x13c
         kfree+0xc4/0x3a0
         tnc_delete+0x3f4/0xe40
         ubifs_tnc_remove_range+0x368/0x73c
         ubifs_tnc_remove_ino+0x29c/0x2e0
         ubifs_jnl_delete_inode+0x150/0x260
         ubifs_evict_inode+0x1d4/0x2e4
         evict+0x1c8/0x450
         iput+0x2a0/0x3c4
         do_unlinkat+0x2cc/0x490
         __arm64_sys_unlinkat+0x90/0x100
         invoke_syscall.constprop.0+0xa8/0x260
         do_el0_svc+0xc8/0x1f0
         el0_svc+0x34/0x70
         el0t_64_sync_handler+0x108/0x114
         el0t_64_sync+0x1a4/0x1a8
      
      The offending `memcpy()` in `ubifs_copy_hash()` has a use-after-free
      when a node becomes root in TNC but still has a `cparent` to an already
      freed node. More specifically, consider the following TNC:
      
               zroot
               /
              /
            zp1
            /
           /
          zn
      
      Inserting a new node `zn_new` with a key smaller than `zn` will trigger
      a split in `tnc_insert()` if `zp1` is full:
      
               zroot
               /   \
              /     \
            zp1     zp2
            /         \
           /           \
        zn_new          zn
      
      `zn->parent` has now been moved to `zp2`, *but* `zn->cparent` still
      points to `zp1`.
      
      Now, consider a removal of all the nodes _except_ `zn`. Just when
      `tnc_delete()` is about to delete `zroot` and `zp2`:
      
               zroot
                   \
                    \
                    zp2
                      \
                       \
                       zn
      
      `zroot` and `zp2` get freed and the tree collapses:
      
                 zn
      
      `zn` now becomes the new `zroot`.
      
      `get_znodes_to_commit()` will now only find `zn`, the new `zroot`, and
      `write_index()` will check its `znode->cparent` that wrongly points to
      the already freed `zp1`. `ubifs_copy_hash()` thus gets wrongly called
      with `znode->cparent->zbranch[znode->iip].hash` that triggers the
      use-after-free!
      
      Fix this by explicitly setting `znode->cparent` to `NULL` in
      `get_znodes_to_commit()` for the root node. The search for the dirty
      nodes is bottom-up in the tree. Thus, when `find_next_dirty(znode)`
      returns NULL, the current `znode` _is_ the root node. Add an assert for
      this.
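
The sequence above can be replayed with a toy model. This is a minimal Python sketch, not UBIFS code: the `Znode` class, the `freed` flag (standing in for `kfree()`), and the helpers `stale_cparent()` and `prepare_root_for_commit()` are hypothetical names chosen for illustration. Only the `parent` and `cparent` fields mirror names from the commit message.

```python
class Znode:
    """Toy stand-in for a TNC znode; 'freed' models kfree()."""

    def __init__(self, name):
        self.name = name
        self.parent = None    # current parent in the tree
        self.cparent = None   # parent remembered for the commit
        self.freed = False


def stale_cparent(znode):
    """True if a commit would dereference an already freed parent."""
    return znode.cparent is not None and znode.cparent.freed


def prepare_root_for_commit(znode):
    """Model of the fix: the root node has no parent, so clear cparent."""
    assert znode.parent is None
    znode.cparent = None


zroot, zp1, zp2, zn = (Znode(n) for n in ("zroot", "zp1", "zp2", "zn"))
zp1.parent = zroot
zn.parent = zp1
zn.cparent = zp1              # cparent points at zp1 before the split

zp2.parent = zroot            # split: zn moves under zp2 ...
zn.parent = zp2               # ... but zn.cparent is left untouched

for node in (zp1, zp2, zroot):    # deletions free every node except zn
    node.freed = True
zn.parent = None              # the tree collapses, zn is the new root

bug_before_fix = stale_cparent(zn)
prepare_root_for_commit(zn)
bug_after_fix = stale_cparent(zn)
print(bug_before_fix, bug_after_fix)
```

In this model the stale pointer is only detectable because of the explicit `freed` flag; in the kernel the same access is what KASAN reports.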
      
      Fixes: 16a26b20 ("ubifs: authentication: Add hashes to index nodes")
      Tested-by: Waqar Hameed <waqar.hameed@axis.com>
      Co-developed-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Waqar Hameed <waqar.hameed@axis.com>
      Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Correct the total block count by deducting journal reservation · f0d13618
      Zhihao Cheng authored and Frieder Schrempf committed
      
      [ Upstream commit 84a2bee9c49769310efa19601157ef50a1df1267 ]
      
      Since commit e874dcde ("ubifs: Reserve one leb for each journal
      head while doing budget"), available space is calculated by deducting
      the reservation for all journal heads. However, the total block count
      (which is only used by statfs) has not been updated, which causes
      used space (total - available) to be displayed incorrectly.
      Fix it by deducting the reservation for all journal heads from the
      total block count.
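
The effect on the displayed used space can be illustrated with plain arithmetic. This is a sketch with made-up numbers: `LEB_BLOCKS`, `MAIN_LEBS`, and `JHEAD_CNT` are hypothetical values, not taken from UBIFS, and the filesystem is assumed to be empty.

```python
LEB_BLOCKS = 31      # hypothetical: statfs blocks per LEB
MAIN_LEBS = 100      # hypothetical: LEBs in the main area
JHEAD_CNT = 3        # hypothetical: number of journal heads

reservation = JHEAD_CNT * LEB_BLOCKS
raw_total = MAIN_LEBS * LEB_BLOCKS

# Available space already has the journal head reservation deducted.
available = raw_total - reservation

# Used space as a tool like df derives it: total - available.
used_before_fix = raw_total - available
used_after_fix = (raw_total - reservation) - available
print(used_before_fix, used_after_fix)
```

Before the fix, the empty filesystem appears to have the whole reservation in use; deducting the same reservation from the total makes the two figures consistent.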
      
      Fixes: e874dcde ("ubifs: Reserve one leb for each journal head while doing budget")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  3. Apr 11, 2024
  4. Feb 12, 2024
    • ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path · 6fc982b5
      Zhihao Cheng authored and Frieder Schrempf committed
      
      commit 1e022216 upstream.
      
      In the error handling path of ubifs_symlink(), the inode is marked as
      bad first, then iput() is invoked. If inode->i_link was initialized by
      fscrypt_encrypt_symlink() in the encryption scenario, inode->i_link won't
      be freed by the call chain ubifs_free_inode -> fscrypt_free_inode in the
      error handling path, because make_bad_inode() has changed 'inode->i_mode'
      to 'S_IFREG'.
      The following kmemleak report is easy to reproduce by injecting an error
      in ubifs_jnl_update() when creating a symlink in the encryption scenario:
       unreferenced object 0xffff888103da3d98 (size 8):
        comm "ln", pid 1692, jiffies 4294914701 (age 12.045s)
        backtrace:
         kmemdup+0x32/0x70
         __fscrypt_encrypt_symlink+0xed/0x1c0
         ubifs_symlink+0x210/0x300 [ubifs]
         vfs_symlink+0x216/0x360
         do_symlinkat+0x11a/0x190
         do_syscall_64+0x3b/0xe0
      There are two ways to fix it:
       1. Remove make_bad_inode() from the error handling path. We can do that
          because ubifs_evict_inode() does the same processing for good
          symlink inodes and bad symlink inodes, since the inode->i_nlink check
          comes before is_bad_inode().
       2. Free inode->i_link before marking the inode bad.
      Method 2 is picked; I think it has less impact.
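
The ordering problem can be shown with a small model. This is a sketch under stated assumptions, not the kernel code: the `Inode` class, `free_inode()`, and `error_path()` are hypothetical, and the free path is assumed to release `i_link` only when the mode still says "symlink", as the commit message describes.

```python
import stat


class Inode:
    """Toy inode holding a cached symlink target."""

    def __init__(self):
        self.i_mode = stat.S_IFLNK
        self.i_link = "encrypted-target"   # allocated on symlink creation


def make_bad_inode(inode):
    # Per the commit message, marking the inode bad rewrites the mode.
    inode.i_mode = stat.S_IFREG


def free_inode(inode):
    """Release i_link only for symlinks; return True if i_link leaked."""
    if stat.S_ISLNK(inode.i_mode):
        inode.i_link = None
    return inode.i_link is not None


def error_path(free_link_first):
    inode = Inode()
    if free_link_first:
        inode.i_link = None    # method 2: free before marking the inode bad
    make_bad_inode(inode)
    return free_inode(inode)


leaked_before_fix = error_path(free_link_first=False)
leaked_after_fix = error_path(free_link_first=True)
print(leaked_before_fix, leaked_after_fix)
```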
      
      Cc: stable@vger.kernel.org
      Fixes: 2c58d548 ("fscrypt: cache decrypted symlink target in ->i_link")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Suggested-by: Eric Biggers <ebiggers@kernel.org>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  5. Jan 23, 2024
  6. May 11, 2023
  7. Mar 11, 2023
    • ubifs: ubifs_releasepage: Remove ubifs_assert(0) to valid this process · 7750be5d
      Zhihao Cheng authored
      [ Upstream commit 66f4742e ]
      
      There are two states for ubifs writing pages:
      1. Dirty, Private
      2. Not Dirty, Not Private
      
      The normal process cannot reach ubifs_releasepage(), because reaching
      it means there exists a page that is private but not dirty. Reproducer
      [1] shows that it can occur (which may be related to [2]) as follows:
      
           PA                     PB                    PC
      lock(page)[PA]
      ubifs_write_end
        attach_page_private         // set Private
        __set_page_dirty_nobuffers  // set Dirty
      unlock(page)
      
      write_cache_pages[PA]
        lock(page)
        clear_page_dirty_for_io(page)	// clear Dirty
        ubifs_writepage
      
                              do_truncation[PB]
      			  truncate_setsize
      			    i_size_write(inode, newsize) // newsize = 0
      
          i_size = i_size_read(inode)	// i_size = 0
          end_index = i_size >> PAGE_SHIFT
          if (page->index > end_index)
            goto out // jump
      out:
      unlock(page)   // Private, Not Dirty
      
      						generic_fadvise[PC]
      						  lock(page)
      						  invalidate_inode_page
      						    try_to_release_page
      						      ubifs_releasepage
      						        ubifs_assert(c, 0)
      		                                        // bad assertion!
      						  unlock(page)
      			  truncate_pagecache[PB]
      
      Then we may get the following assertion failure:
        UBIFS error (ubi0:0 pid 1683): ubifs_assert_failed [ubifs]:
        UBIFS assert failed: 0, in fs/ubifs/file.c:1513
        UBIFS warning (ubi0:0 pid 1683): ubifs_ro_mode [ubifs]:
        switched to read-only mode, error -22
        CPU: 2 PID: 1683 Comm: aa Not tainted 5.16.0-rc5-00184-g0bca5994cacc-dirty #308
        Call Trace:
          dump_stack+0x13/0x1b
          ubifs_ro_mode+0x54/0x60 [ubifs]
          ubifs_assert_failed+0x4b/0x80 [ubifs]
          ubifs_releasepage+0x67/0x1d0 [ubifs]
          try_to_release_page+0x57/0xe0
          invalidate_inode_page+0xfb/0x130
          __invalidate_mapping_pages+0xb9/0x280
          invalidate_mapping_pagevec+0x12/0x20
          generic_fadvise+0x303/0x3c0
          ksys_fadvise64_64+0x4c/0xb0
      
      [1] https://bugzilla.kernel.org/show_bug.cgi?id=215373
      [2] https://linux-mtd.infradead.narkive.com/NQoBeT1u/patch-rfc-ubifs-fix-assert-failed-in-ubifs-set-page-dirty

      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: ubifs_writepage: Mark page dirty after writing inode failed · 824452d5
      Zhihao Cheng authored
      [ Upstream commit fb8bc4c7 ]
      
      There are two states for ubifs writing pages:
      1. Dirty, Private
      2. Not Dirty, Not Private
      
      There is a third possibility, which may be related to [1]: a page that
      is private but not dirty, caused by the following process:
      
                PA
      lock(page)
      ubifs_write_end
        attach_page_private		// set Private
          __set_page_dirty_nobuffers	// set Dirty
      unlock(page)
      
      write_cache_pages
        lock(page)
        clear_page_dirty_for_io(page)	// clear Dirty
        ubifs_writepage
          write_inode
          // fail, goto out, following codes are not executed
          // do_writepage
          //   set_page_writeback 	// set Writeback
          //   detach_page_private	// clear Private
          //   end_page_writeback 	// clear Writeback
          out:
          unlock(page)		// Private, Not Dirty
      
                                             PB
      				ksys_fadvise64_64
      				  generic_fadvise
      				     invalidate_inode_page
      				     // page is neither Dirty nor Writeback
      				       invalidate_complete_page
      				       // page_has_private is true
      					 try_to_release_page
      					   ubifs_releasepage
      					     ubifs_assert(c, 0) !!!
      
      Then we may get the following assertion failure:
        UBIFS error (ubi0:0 pid 1492): ubifs_assert_failed [ubifs]:
        UBIFS assert failed: 0, in fs/ubifs/file.c:1499
        UBIFS warning (ubi0:0 pid 1492): ubifs_ro_mode [ubifs]:
        switched to read-only mode, error -22
        CPU: 2 PID: 1492 Comm: aa Not tainted 5.16.0-rc2-00012-g7bb767dee0ba-dirty
        Call Trace:
          dump_stack+0x13/0x1b
          ubifs_ro_mode+0x54/0x60 [ubifs]
          ubifs_assert_failed+0x4b/0x80 [ubifs]
          ubifs_releasepage+0x7e/0x1e0 [ubifs]
          try_to_release_page+0x57/0xe0
          invalidate_inode_page+0xfb/0x130
          invalidate_mapping_pagevec+0x12/0x20
          generic_fadvise+0x303/0x3c0
          vfs_fadvise+0x35/0x40
          ksys_fadvise64_64+0x4c/0xb0
      
      See [2] for a reproducer.
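
The private-but-not-dirty state from the PA column can be replayed in a toy model. This is a sketch, not kernel code: the `Page` class and the `redirty_on_failure` switch are hypothetical, and the fix is modelled only as "mark the page dirty again", following the commit title.

```python
class Page:
    """Toy page carrying only the two flags discussed above."""

    def __init__(self):
        self.dirty = False
        self.private = False


def write_end(page):
    page.private = True      # attach_page_private
    page.dirty = True        # __set_page_dirty_nobuffers


def writeback(page, write_inode_ok, redirty_on_failure):
    page.dirty = False       # clear_page_dirty_for_io
    if not write_inode_ok:
        if redirty_on_failure:
            page.dirty = True    # modelled fix: back to (Dirty, Private)
        return
    page.private = False     # do_writepage -> detach_page_private


def inconsistent(page):
    """The two valid states have dirty == private."""
    return page.dirty != page.private


def run(redirty_on_failure):
    page = Page()
    write_end(page)
    writeback(page, write_inode_ok=False, redirty_on_failure=redirty_on_failure)
    return inconsistent(page)


broken_before_fix = run(redirty_on_failure=False)
broken_after_fix = run(redirty_on_failure=True)
print(broken_before_fix, broken_after_fix)
```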
      
      [1] https://linux-mtd.infradead.narkive.com/NQoBeT1u/patch-rfc-ubifs-fix-assert-failed-in-ubifs-set-page-dirty
      [2] https://bugzilla.kernel.org/show_bug.cgi?id=215357

      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: dirty_cow_znode: Fix memleak in error handling path · 76c488e8
      Zhihao Cheng authored
      [ Upstream commit 122deabf ]
      
      The following process will cause a memleak for the copied-up znode:
      
      dirty_cow_znode
        zn = copy_znode(c, znode);
        err = insert_old_idx(c, zbr->lnum, zbr->offs);
        if (unlikely(err))
           return ERR_PTR(err);   // No one refers to zn.
      
      Fix it by adding the copied znode back to the tnc, so that it will be
      freed by ubifs_destroy_tnc_subtree() while closing the tnc.
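
The leak and the fix can be sketched with a small ownership model. Everything here is hypothetical scaffolding: the `live` set stands in for allocated memory, the `tnc` list for nodes reachable from the tree, and the function signatures are not those of the kernel functions they are named after.

```python
def dirty_cow_znode(tnc, live, insert_fails, link_copy_on_error):
    copy = object()              # copy_znode(): allocate the copied-up znode
    live.add(copy)
    if insert_fails:             # insert_old_idx() returned an error
        if link_copy_on_error:
            tnc.append(copy)     # the fix: keep the copy reachable from the tnc
        return None
    tnc.append(copy)
    return copy


def close_tnc(tnc, live):
    """Free everything reachable from the tnc; return the number of leaked znodes."""
    for znode in tnc:
        live.discard(znode)
    tnc.clear()
    return len(live)


def leaked_after_error(link_copy_on_error):
    tnc, live = [], set()
    dirty_cow_znode(tnc, live, insert_fails=True,
                    link_copy_on_error=link_copy_on_error)
    return close_tnc(tnc, live)


leaked_before_fix = leaked_after_error(False)
leaked_after_fix = leaked_after_error(True)
print(leaked_before_fix, leaked_after_fix)
```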
      
      A reproducer is available in [Link].
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216705

      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Re-statistic cleaned znode count if commit failed · 0b8beac8
      Zhihao Cheng authored
      [ Upstream commit 944e096a ]
      
      Dirty znodes are written to flash during the commit process, passing
      through the following states:
      
      	      process A			|  znode state
      ------------------------------------------------------
      do_commit				| DIRTY_ZNODE
        ubifs_tnc_start_commit		| DIRTY_ZNODE
         get_znodes_to_commit			| DIRTY_ZNODE | COW_ZNODE
          layout_commit			| DIRTY_ZNODE | COW_ZNODE
           fill_gap                           | 0
        write master				| 0 or OBSOLETE_ZNODE
      
      	      process B			|  znode state
      ------------------------------------------------------
      do_commit				| DIRTY_ZNODE[1]
        ubifs_tnc_start_commit		| DIRTY_ZNODE
         get_znodes_to_commit			| DIRTY_ZNODE | COW_ZNODE
        ubifs_tnc_end_commit			| DIRTY_ZNODE | COW_ZNODE
         write_index                          | 0
        write master				| 0 or OBSOLETE_ZNODE[2] or
      					| DIRTY_ZNODE[3]
      
      [1] znode is dirtied without concurrent committing process
      [2] znode is copied up (re-dirtied by other process) before cleaned
          up in committing process
      [3] znode is re-dirtied after cleaned up in committing process
      
      Currently, the clean znode count is updated in free_obsolete_znodes(),
      which is called only in the normal path. If do_commit fails, the clean
      znode count won't be updated, which triggers a failed ubifs assertion[4]
      in ubifs_tnc_close():
       ubifs_assert_failed [ubifs]: UBIFS assert failed: freed == n
      
      [4] Commit 380347e9 ("UBIFS: Add an assertion for clean_zn_cnt").
      
      Fix it by recounting the cleaned znodes in tnc_destroy_cnext().

      A reproducer is available in [Link].
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216704

      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Fix memory leak in alloc_wbufs() · 26ec45f1
      Li Zetao authored
      
      [ Upstream commit 4a1ff3c5 ]
      
      kmemleak reported a sequence of memory leaks, shown below:
      
        unreferenced object 0xffff8881575f8400 (size 1024):
          comm "mount", pid 19625, jiffies 4297119604 (age 20.383s)
          hex dump (first 32 bytes):
            00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
            00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          backtrace:
            [<ffffffff8176cecd>] __kmalloc+0x4d/0x150
            [<ffffffffa0406b2b>] ubifs_mount+0x307b/0x7170 [ubifs]
            [<ffffffff819fa8fd>] legacy_get_tree+0xed/0x1d0
            [<ffffffff81936f2d>] vfs_get_tree+0x7d/0x230
            [<ffffffff819b2bd4>] path_mount+0xdd4/0x17b0
            [<ffffffff819b37aa>] __x64_sys_mount+0x1fa/0x270
            [<ffffffff83c14295>] do_syscall_64+0x35/0x80
            [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
        unreferenced object 0xffff8881798a6e00 (size 512):
          comm "mount", pid 19677, jiffies 4297121912 (age 37.816s)
          hex dump (first 32 bytes):
            6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
            6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
          backtrace:
            [<ffffffff8176cecd>] __kmalloc+0x4d/0x150
            [<ffffffffa0418342>] ubifs_wbuf_init+0x52/0x480 [ubifs]
            [<ffffffffa0406ca5>] ubifs_mount+0x31f5/0x7170 [ubifs]
            [<ffffffff819fa8fd>] legacy_get_tree+0xed/0x1d0
            [<ffffffff81936f2d>] vfs_get_tree+0x7d/0x230
            [<ffffffff819b2bd4>] path_mount+0xdd4/0x17b0
            [<ffffffff819b37aa>] __x64_sys_mount+0x1fa/0x270
            [<ffffffff83c14295>] do_syscall_64+0x35/0x80
            [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      The problem is that when ubifs_wbuf_init() returns an error inside the
      loop in alloc_wbufs(), the wbuf->buf and wbuf->inodes that were
      successfully allocated before are not freed.

      Fix it by adding an error handling path in alloc_wbufs() which frees
      the memory allocated before when ubifs_wbuf_init() returns an error.
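
The unwind-on-error pattern described above can be sketched as follows. This is a toy model, not the kernel function: the signature, the `fail_at` and `unwind_on_error` parameters, and the string labels standing in for allocations are all hypothetical.

```python
def alloc_wbufs(jhead_cnt, fail_at, unwind_on_error):
    """Initialise one write buffer per journal head.

    Returns (ok, live), where 'live' lists allocations still held.
    """
    live = []
    for i in range(jhead_cnt):
        if i == fail_at:             # ubifs_wbuf_init() fails for head i
            if unwind_on_error:
                live.clear()         # free what earlier iterations allocated
            return False, live
        live.extend([f"wbuf[{i}].buf", f"wbuf[{i}].inodes"])
    return True, live


ok_before, leaked_before = alloc_wbufs(3, fail_at=2, unwind_on_error=False)
ok_after, leaked_after = alloc_wbufs(3, fail_at=2, unwind_on_error=True)
print(leaked_before, leaked_after)
```

With the failure injected at the third journal head, the first variant still holds the four buffers of the first two heads, matching the pairs of leaked objects in the kmemleak report; the second holds none.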
      
      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Li Zetao <lizetao1@huawei.com>
      Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Reserve one leb for each journal head while doing budget · c17e1ae2
      Zhihao Cheng authored
      [ Upstream commit e874dcde ]
      
      UBIFS calculates available space as c->main_bytes - c->lst.total_used
      (which means non-index lebs' free and dirty space is accounted into
      total available), then index lebs and four lebs (one for gc_lnum, one
      for deletions, two for journal heads) are deducted.
      In the following situation, ubifs may get -ENOSPC from make_reservation():
       LEB 84: DATAHD   free 122880 used 1920  dirty 2176  dark 6144
       LEB 110:DELETION free 126976 used 0     dirty 0     dark 6144 (empty)
       LEB 201:gc_lnum  free 126976 used 0     dirty 0     dark 6144
       LEB 272:GCHD     free 77824  used 47672 dirty 1480  dark 6144
       LEB 356:BASEHD   free 0      used 39776 dirty 87200 dark 6144
       OTHERS: index lebs, zero-available non-index lebs
      
      After doing budget, UBIFS calculates the available bytes as 6888:
      126976 * 5[remain main bytes] - 1920[used] - 47672[used] - 39776[used] -
      126976 * 1[deletions] - 126976 * 1[gc_lnum] - 126976 * 2[journal heads]
      - 6144 * 5[dark] = 6888. However, UBIFS cannot use BASEHD's dirty
      space (87200), because UBIFS cannot find a next BASEHD to reclaim the
      current BASEHD: c->bi.min_idx_lebs equals c->lst.idx_lebs, so the
      empty leb won't be found by ubifs_find_free_space(), and dirty index
      lebs won't be picked as gced lebs. All non-index lebs have dirty space
      less than c->dead_wm, so non-index lebs won't be picked as gced lebs
      either. So no new free lebs will be produced. See more details in Link.
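
The budget arithmetic above can be checked directly. The figures come from the LEB table in this commit message; the variable names are only illustrative.

```python
LEB_SIZE = 126976        # bytes per LEB (free size of the empty LEBs)
DARK_PER_LEB = 6144      # dark space listed for every LEB in the table
MAIN_LEBS = 5            # the five non-index LEBs with space left

used = [1920, 0, 0, 47672, 39776]    # DATAHD, DELETION, gc_lnum, GCHD, BASEHD
reserved_lebs = 1 + 1 + 2            # deletions, gc_lnum, two journal heads

available = (MAIN_LEBS * LEB_SIZE
             - sum(used)
             - reserved_lebs * LEB_SIZE
             - MAIN_LEBS * DARK_PER_LEB)
print(available)
```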
      
      To fix it, reserve one leb for each journal head while doing budget.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216562

      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: do_rename: Fix wrong space budget when target inode's nlink > 1 · 31282bc4
      Zhihao Cheng authored
      [ Upstream commit 25fce616 ]
      
      If the target inode is a special file (e.g. block/char device) with an
      nlink count greater than 1, the inode with ui->data will be re-written
      on disk. However, UBIFS loses the target inode's data_len while doing
      space budget. A bad space budget may let make_reservation() return
      -ENOSPC, which could turn ubifs to read-only mode in the do_writepage()
      process.

      A reproducer is available in [Link].
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216494

      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Fix wrong dirty space budget for dirty inode · b08071c6
      Zhihao Cheng authored
      
      [ Upstream commit b248eaf0 ]
      
      Each dirty inode should reserve 'c->bi.inode_budget' bytes in the space
      budget calculation. Currently, the space budget for a dirty inode
      reports more space than UBIFS actually needs to write.
      
      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Rectify space budget for ubifs_xrename() · f8bd27b6
      Zhihao Cheng authored
      [ Upstream commit 1b2ba090 ]
      
      There is no space budget for ubifs_xrename(). This may let
      make_reservation() return -ENOSPC, which could turn
      ubifs to read-only mode in the do_writepage() process.
      Fix it by adding a space budget for ubifs_xrename().

      A reproducer is available in [Link].
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216569

      Fixes: 9ec64962 ("ubifs: Implement RENAME_EXCHANGE")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Rectify space budget for ubifs_symlink() if symlink is encrypted · f9e07484
      Zhihao Cheng authored
      [ Upstream commit c2c36cc6 ]
      
      Fix the bad space budget when the symlink file is encrypted. A bad space
      budget may let make_reservation() return -ENOSPC, which could turn ubifs
      to read-only mode in the do_writepage() process.

      A reproducer is available in [Link].
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216490

      Fixes: ca7f85be ("ubifs: Add support for encrypted symlinks")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Fix memory leak in ubifs_sysfs_init() · 1c5fdf2d
      Liu Shixin authored
      
      [ Upstream commit 203a55f0 ]
      
      When loading ubifs.ko with insmod, kmemleak reports the following:
      
       unreferenced object 0xffff88817fb1a780 (size 8):
         comm "insmod", pid 25265, jiffies 4295239702 (age 100.130s)
         hex dump (first 8 bytes):
           75 62 69 66 73 00 ff ff                          ubifs...
         backtrace:
           [<ffffffff81b3fc4c>] slab_post_alloc_hook+0x9c/0x3c0
           [<ffffffff81b44bf3>] __kmalloc_track_caller+0x183/0x410
           [<ffffffff8198d3da>] kstrdup+0x3a/0x80
           [<ffffffff8198d486>] kstrdup_const+0x66/0x80
           [<ffffffff83989325>] kvasprintf_const+0x155/0x190
           [<ffffffff83bf55bb>] kobject_set_name_vargs+0x5b/0x150
           [<ffffffff83bf576b>] kobject_set_name+0xbb/0xf0
           [<ffffffff8100204c>] do_one_initcall+0x14c/0x5a0
           [<ffffffff8157e380>] do_init_module+0x1f0/0x660
           [<ffffffff815857be>] load_module+0x6d7e/0x7590
           [<ffffffff8158644f>] __do_sys_finit_module+0x19f/0x230
           [<ffffffff815866b3>] __x64_sys_finit_module+0x73/0xb0
           [<ffffffff88c98e85>] do_syscall_64+0x35/0x80
           [<ffffffff88e00087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      When kset_register() fails, we should call kset_put() to clean it up.
      
      Fixes: 2e3cbf42 ("ubifs: Export filesystem error counters")
      Signed-off-by: Liu Shixin <liushixin2@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Fix build errors as symbol undefined · 7508453e
      Li Hua authored
      
      [ Upstream commit aa6d148e ]
      
      With CONFIG_UBIFS_FS_AUTHENTICATION not set, the compiler can assume that
      ubifs_node_check_hash() is never true and drops the call to ubifs_bad_hash().
      If CONFIG_CC_OPTIMIZE_FOR_SIZE is enabled, this optimization no longer happens.

      So when CONFIG_UBIFS_FS and CONFIG_CC_OPTIMIZE_FOR_SIZE are enabled but
      CONFIG_UBIFS_FS_AUTHENTICATION is not set, the build fails as follows:
          ERROR: modpost: "ubifs_bad_hash" [fs/ubifs/ubifs.ko] undefined!

      Fix it by adding a no-op ubifs_bad_hash() for the CONFIG_UBIFS_FS_AUTHENTICATION=n case.
      
      Fixes: 16a26b20 ("ubifs: authentication: Add hashes to index nodes")
      Signed-off-by: Li Hua <hucool.lihua@huawei.com>
      Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  8. Oct 11, 2022
    • treewide: use get_random_bytes() when possible · 197173db
      Jason A. Donenfeld authored
      
      The prandom_bytes() function has been a deprecated inline wrapper around
      get_random_bytes() for several releases now, and compiles down to the
      exact same code. Replace the deprecated wrapper with a direct call to
      the real function. This was done as a basic find and replace.
      
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Yury Norov <yury.norov@gmail.com>
      Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> # powerpc
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • treewide: use get_random_u32() when possible · a251c17a
      Jason A. Donenfeld authored
      
      The prandom_u32() function has been a deprecated inline wrapper around
      get_random_u32() for several releases now, and compiles down to the
      exact same code. Replace the deprecated wrapper with a direct call to
      the real function. The same also applies to get_random_int(), which is
      just a wrapper around get_random_u32(). This was done as a basic find
      and replace.
      
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Yury Norov <yury.norov@gmail.com>
      Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
      Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
      Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
      Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
      Acked-by: Helge Deller <deller@gmx.de> # for parisc
      Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • treewide: use prandom_u32_max() when possible, part 1 · 81895a65
      Jason A. Donenfeld authored
      
      Rather than incurring a division or requesting too many random bytes for
      the given range, use the prandom_u32_max() function, which only takes
      the minimum required bytes from the RNG and avoids divisions. This was
      done mechanically with this coccinelle script:
      
      @basic@
      expression E;
      type T;
      identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
      typedef u64;
      @@
      (
      - ((T)get_random_u32() % (E))
      + prandom_u32_max(E)
      |
      - ((T)get_random_u32() & ((E) - 1))
      + prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
      |
      - ((u64)(E) * get_random_u32() >> 32)
      + prandom_u32_max(E)
      |
      - ((T)get_random_u32() & ~PAGE_MASK)
      + prandom_u32_max(PAGE_SIZE)
      )
      
      @multi_line@
      identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
      identifier RAND;
      expression E;
      @@
      
      -       RAND = get_random_u32();
              ... when != RAND
      -       RAND %= (E);
      +       RAND = prandom_u32_max(E);
      
      // Find a potential literal
      @literal_mask@
      expression LITERAL;
      type T;
      identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
      position p;
      @@
      
              ((T)get_random_u32()@p & (LITERAL))
      
      // Add one to the literal.
      @script:python add_one@
      literal << literal_mask.LITERAL;
      RESULT;
      @@
      
      value = None
      if literal.startswith('0x'):
              value = int(literal, 16)
      elif literal[0] in '123456789':
              value = int(literal, 10)
      if value is None:
              print("I don't know how to handle %s" % (literal))
              cocci.include_match(False)
      elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
              print("Skipping 0x%x for cleanup elsewhere" % (value))
              cocci.include_match(False)
      elif value & (value + 1) != 0:
              print("Skipping 0x%x because it's not a power of two minus one" % (value))
              cocci.include_match(False)
      elif literal.startswith('0x'):
              coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
      else:
              coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))
      
      // Replace the literal mask with the calculated result.
      @plus_one@
      expression literal_mask.LITERAL;
      position literal_mask.p;
      expression add_one.RESULT;
      identifier FUNC;
      @@
      
      -       (FUNC()@p & (LITERAL))
      +       prandom_u32_max(RESULT)
      
      @collapse_ret@
      type T;
      identifier VAR;
      expression E;
      @@
      
       {
      -       T VAR;
      -       VAR = (E);
      -       return VAR;
      +       return E;
       }
      
      @drop_var@
      type T;
      identifier VAR;
      @@
      
       {
      -       T VAR;
              ... when != VAR
       }
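
The script above treats `((u64)(E) * get_random_u32() >> 32)` as interchangeable with `prandom_u32_max(E)`, which suggests a multiply-and-shift bound. The following is a minimal userspace sketch of that idea, not the kernel implementation: the helper names `bounded_mul_shift`, `bounded_modulo`, and `is_pow2_minus_one` are hypothetical, and the random value is passed in as a parameter so the arithmetic can be checked by hand.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map a uniform 32-bit value into [0, ceil) with one
 * multiplication and one shift, avoiding a division.
 */
static uint32_t bounded_mul_shift(uint32_t rnd, uint32_t ceil)
{
	return (uint32_t)(((uint64_t)rnd * ceil) >> 32);
}

/* The modulo form that the @basic@ rule rewrites; ceil must be non-zero. */
static uint32_t bounded_modulo(uint32_t rnd, uint32_t ceil)
{
	return rnd % ceil;
}

/*
 * C counterpart of the add_one check: a mask can only be turned into a
 * bound of mask + 1 when it has the form 2^k - 1. The parentheses matter
 * in C, where == binds tighter than &.
 */
static int is_pow2_minus_one(uint32_t mask)
{
	return (mask & (mask + 1)) == 0;
}
```

Both forms return a value below `ceil`, but they generally return different values for the same input. For example, `bounded_mul_shift(0x80000000u, 10)` yields 5, while `bounded_modulo(0x80000000u, 10)` yields 8.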
      
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Yury Norov <yury.norov@gmail.com>
      Reviewed-by: KP Singh <kpsingh@kernel.org>
      Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap
      Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
      Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
      Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      81895a65
  9. Sep 24, 2022
    • Miklos Szeredi's avatar
      vfs: open inside ->tmpfile() · 863f144f
      Miklos Szeredi authored
      
      This is in preparation for adding tmpfile support to fuse, which requires
      that the tmpfile creation and opening are done as a single operation.
      
      Replace the 'struct dentry *' argument of i_op->tmpfile with
      'struct file *'.
      
      Call finish_open_simple() as the last thing in ->tmpfile() instances (may
      be omitted in the error case).
      
      Change d_tmpfile() argument to 'struct file *' as well to make callers more
      readable.
      
      Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      863f144f
  10. Sep 21, 2022
    • Zhihao Cheng's avatar
      ubifs: Fix AA deadlock when setting xattr for encrypted file · a0c51565
      Zhihao Cheng authored
      The following call chain acquires host_ui->xattr_sem twice in the same task:
      vfs_setxattr(host)
        ubifs_xattr_set
          down_write(host_ui->xattr_sem)   <- lock first time
            create_xattr
              ubifs_new_inode(host)
                fscrypt_prepare_new_inode(host)
                  fscrypt_policy_to_inherit(host)
                    if (IS_ENCRYPTED(inode))
                      fscrypt_require_key(host)
                        fscrypt_get_encryption_info(host)
                          ubifs_xattr_get(host)
                            down_read(host_ui->xattr_sem) <- AA deadlock
      
      which may trigger an AA deadlock:
      
      [  102.620871] INFO: task setfattr:1599 blocked for more than 10 seconds.
      [  102.625298]       Not tainted 5.19.0-rc7-00001-gb666b6823ce0-dirty #711
      [  102.628732] task:setfattr        state:D stack:    0 pid: 1599
      [  102.628749] Call Trace:
      [  102.628753]  <TASK>
      [  102.628776]  __schedule+0x482/0x1060
      [  102.629964]  schedule+0x92/0x1a0
      [  102.629976]  rwsem_down_read_slowpath+0x287/0x8c0
      [  102.629996]  down_read+0x84/0x170
      [  102.630585]  ubifs_xattr_get+0xd1/0x370 [ubifs]
      [  102.630730]  ubifs_crypt_get_context+0x1f/0x30 [ubifs]
      [  102.630791]  fscrypt_get_encryption_info+0x7d/0x1c0
      [  102.630810]  fscrypt_policy_to_inherit+0x56/0xc0
      [  102.630817]  fscrypt_prepare_new_inode+0x35/0x160
      [  102.630830]  ubifs_new_inode+0xcc/0x4b0 [ubifs]
      [  102.630873]  ubifs_xattr_set+0x591/0x9f0 [ubifs]
      [  102.630961]  xattr_set+0x8c/0x3e0 [ubifs]
      [  102.631003]  __vfs_setxattr+0x71/0xc0
      [  102.631026]  vfs_setxattr+0x105/0x270
      [  102.631034]  do_setxattr+0x6d/0x110
      [  102.631041]  setxattr+0xa0/0xd0
      [  102.631087]  __x64_sys_setxattr+0x2f/0x40
      
      A reproducer is available at [Link].
      
      Stop encrypting xattr inodes in ubifs, just as ext4 skips encryption
      for inodes with the EXT4_EA_INODE_FL flag.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216260
      
      
      Fixes: f4e3634a ("ubifs: Fix races between xattr_{set|get} ...")
      Fixes: d475a507 ("ubifs: Add skeleton for fscrypto")
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      a0c51565
    • ZhaoLong Wang's avatar
      ubifs: Fix UBIFS ro fail due to truncate in the encrypted directory · 713346ca
      ZhaoLong Wang authored
      The ubifs_compress() function does not compress the data when the
      data length is shorter than 128 bytes or when the compressed data
      length is not ideal. As a result, the compressed length of the
      truncated data in the truncate_data_node() function may be greater
      than the length of the raw data read from the flash.
      
      Both lengths are passed to the ubifs_encrypt() function as
      parameters. This may trigger an assertion failure, after which the
      file system becomes read-only.
      
      This patch uses the actual length of the data in memory as the
      input parameter for the assertion comparison, which avoids the problem.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216213
      
      
      Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      713346ca
    • Yang Li's avatar
      ubifs: Fix ubifs_check_dir_empty() kernel-doc comment · 27ef523a
      Yang Li authored
      
      Fix function name in fs/ubifs/dir.c kernel-doc comment
      to remove warning found by running scripts/kernel-doc,
      which is caused by using 'make W=1'.
      
      fs/ubifs/dir.c:883: warning: expecting prototype for check_dir_empty().
      Prototype was for ubifs_check_dir_empty() instead
      
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      27ef523a
  11. Aug 02, 2022
  12. Jul 04, 2022
    • Roman Gushchin's avatar
      mm: shrinkers: provide shrinkers with names · e33c267a
      Roman Gushchin authored
      Currently shrinkers are anonymous objects.  For debugging purposes they
      can be identified by count/scan function names, but that is not always
      enough: e.g.  for superblock shrinkers it helps to know at least which
      superblock the shrinker belongs to.
      
      This commit adds names to shrinkers.  The register_shrinker() and
      prealloc_shrinker() functions are extended to take a format and arguments
      from which the name is built.
      
      In some cases it's not possible to determine a good name at the time when
      a shrinker is allocated.  For such cases shrinker_debugfs_rename() is
      provided.
      
      The expected format is:
          <subsystem>-<shrinker_type>[:<instance>]-<id>
      For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair.
      
      After this change the shrinker debugfs directory looks like:
        $ cd /sys/kernel/debug/shrinker/
        $ ls
          dquota-cache-16     sb-devpts-28     sb-proc-47       sb-tmpfs-42
          mm-shadow-18        sb-devtmpfs-5    sb-proc-48       sb-tmpfs-43
          mm-zspool:zram0-34  sb-hugetlbfs-17  sb-pstore-31     sb-tmpfs-44
          rcu-kfree-0         sb-hugetlbfs-33  sb-rootfs-2      sb-tmpfs-49
          sb-aio-20           sb-iomem-12      sb-securityfs-6  sb-tracefs-13
          sb-anon_inodefs-15  sb-mqueue-21     sb-selinuxfs-22  sb-xfs:vda1-36
          sb-bdev-3           sb-nsfs-4        sb-sockfs-8      sb-zsmalloc-19
          sb-bpf-32           sb-pipefs-14     sb-sysfs-26      thp-deferred_split-10
          sb-btrfs:vda2-24    sb-proc-25       sb-tmpfs-1       thp-zero-9
          sb-cgroup2-30       sb-proc-39       sb-tmpfs-27      xfs-buf:vda1-37
          sb-configfs-23      sb-proc-41       sb-tmpfs-29      xfs-inodegc:vda1-38
          sb-dax-11           sb-proc-45       sb-tmpfs-35
          sb-debugfs-7        sb-proc-46       sb-tmpfs-40
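
The naming format described above can be sketched with a plain `snprintf()` call. This is an illustrative userspace helper, not the kernel code: the function `format_shrinker_name` and its parameters are hypothetical, and the optional instance is modeled as a pointer that may be NULL.

```c
#include <stdio.h>

/*
 * Hypothetical helper that builds a name of the form
 *   <subsystem>-<shrinker_type>[:<instance>]-<id>
 * Pass instance == NULL when the shrinker has no instance part.
 * Returns the value of snprintf(), i.e. the length the full name needs.
 */
static int format_shrinker_name(char *buf, size_t size,
				const char *subsystem, const char *type,
				const char *instance, int id)
{
	if (instance)
		return snprintf(buf, size, "%s-%s:%s-%d",
				subsystem, type, instance, id);
	return snprintf(buf, size, "%s-%s-%d", subsystem, type, id);
}
```

With these assumptions, `format_shrinker_name(buf, sizeof(buf), "sb", "xfs", "vda1", 36)` produces `sb-xfs:vda1-36`, and passing `"mm"`, `"shadow"`, `NULL`, `18` produces `mm-shadow-18`, matching two of the entries in the listing.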
      
      [roman.gushchin@linux.dev: fix build warnings]
        Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle
      
      
      Reported-by: kernel test robot <lkp@intel.com>
      Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev
      
      
      Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Dave Chinner <dchinner@redhat.com>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      e33c267a
  13. May 27, 2022
  14. May 10, 2022
  15. May 09, 2022
  16. May 08, 2022
  17. Apr 13, 2022
    • Eric Biggers's avatar
      fscrypt: split up FS_CRYPTO_BLOCK_SIZE · 63cec138
      Eric Biggers authored
      
      FS_CRYPTO_BLOCK_SIZE is neither the filesystem block size nor the
      granularity of encryption.  Rather, it defines two logically separate
      constraints that both arise from the block size of the AES cipher:
      
      - The alignment required for the lengths of file contents blocks
      - The minimum input/output length for the filenames encryption modes
      
      Since there are way too many things called the "block size", and the
      connection with the AES block size is not easily understood, split
      FS_CRYPTO_BLOCK_SIZE into two constants FSCRYPT_CONTENTS_ALIGNMENT and
      FSCRYPT_FNAME_MIN_MSG_LEN that more clearly describe what they are.
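
A small userspace sketch may help show how the two constraints differ. It assumes both constants are 16, the AES block size mentioned above; the commit message itself does not state the values. The helper functions `contents_len_is_aligned` and `fname_min_msg_len` are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Assumed values: both mirror the 16-byte AES block size. */
#define FSCRYPT_CONTENTS_ALIGNMENT	16
#define FSCRYPT_FNAME_MIN_MSG_LEN	16

/*
 * Contents constraint: the length of a file contents block must be a
 * multiple of the alignment. Returns 1 if it is, 0 otherwise.
 */
static int contents_len_is_aligned(size_t len)
{
	return len % FSCRYPT_CONTENTS_ALIGNMENT == 0;
}

/*
 * Filename constraint: the message handed to the filenames encryption
 * mode must be at least the minimum length, so shorter names are treated
 * as if padded up to it. Longer names keep their length here.
 */
static size_t fname_min_msg_len(size_t name_len)
{
	return name_len < FSCRYPT_FNAME_MIN_MSG_LEN ?
		FSCRYPT_FNAME_MIN_MSG_LEN : name_len;
}
```

The first helper is an alignment check and the second is a lower bound, which is why a single constant named after a "block size" obscured what each use actually meant.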
      
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Link: https://lore.kernel.org/r/20220405010914.18519-1-ebiggers@kernel.org
      63cec138
  18. Mar 22, 2022
  19. Mar 17, 2022