Commit Graph

603 Commits

Author SHA1 Message Date
Konstantin Khlebnikov 47f6ce223d ovl: skip overlayfs superblocks at global sync
BugLink: https://bugs.launchpad.net/bugs/2049084

[ Upstream commit 32b1924b21 ]

Stacked filesystems like overlayfs has no own writeback, but they have to
forward syncfs() requests to backend for keeping data integrity.

During global sync() each overlayfs instance calls method ->sync_fs() for
backend although it itself is in global list of superblocks too.  As a
result one syscall sync() could write one superblock several times and send
multiple disk barriers.

This patch adds flag SB_I_SKIP_SYNC into sb->sb_iflags to avoid that.

Reported-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Stable-dep-of: b836c4d29f27 ("ima: detect changes to the backing overlay file")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Manuel Diewald <manuel.diewald@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
2024-02-02 14:13:17 +01:00
Jeff Layton 2fd59eff16 overlayfs: set ctime when setting mtime and atime
BugLink: https://bugs.launchpad.net/bugs/2043724

[ Upstream commit 03dbab3bba5f009d053635c729d1244f2c8bad38 ]

Nathan reported that he was seeing the new warning in
setattr_copy_mgtime pop when starting podman containers. Overlayfs is
trying to set the atime and mtime via notify_change without also
setting the ctime.

POSIX states that when the atime and mtime are updated via utimes() that
we must also update the ctime to the current time. The situation with
overlayfs copy-up is analogies, so add ATTR_CTIME to the bitmask.
notify_change will fill in the value.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: Amir Goldstein <amir73il@gmail.com>
Message-Id: <20230913-ctime-v1-1-c6bc509cbc27@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Manuel Diewald <manuel.diewald@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2024-01-05 14:29:57 +01:00
Al Viro e9a1126fa2 new helper: lookup_positive_unlocked()
BugLink: https://bugs.launchpad.net/bugs/2040284

[ Upstream commit 6c2d4798a8 ]

Most of the callers of lookup_one_len_unlocked() treat negatives are
ERR_PTR(-ENOENT).  Provide a helper that would do just that.  Note
that a pinned positive dentry remains positive - it's ->d_inode is
stable, etc.; a pinned _negative_ dentry can become positive at any
point as long as you are not holding its parent at least shared.
So using lookup_one_len_unlocked() needs to be careful;
lookup_positive_unlocked() is safer and that's what the callers
end up open-coding anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Stable-dep-of: 0d5a4f8f775f ("fs: Fix error checking for d_hash_and_lookup()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2023-10-30 11:42:07 +01:00
Christian Brauner 0f114b29a8 ovl: check type and offset of struct vfsmount in ovl_entry
BugLink: https://bugs.launchpad.net/bugs/2039440

[ Upstream commit f723edb8a532cd26e1ff0a2b271d73762d48f762 ]

Porting overlayfs to the new amount api I started experiencing random
crashes that couldn't be explained easily. So after much debugging and
reasoning it became clear that struct ovl_entry requires the point to
struct vfsmount to be the first member and of type struct vfsmount.

During the port I added a new member at the beginning of struct
ovl_entry which broke all over the place in the form of random crashes
and cache corruptions. While there's a comment in ovl_free_fs() to the
effect of "Hack! Reuse ofs->layers as a vfsmount array before freeing
it" there's no such comment on struct ovl_entry which makes this easy to
trip over.

Add a comment and two static asserts for both the offset and the type of
pointer in struct ovl_entry.

Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Manuel Diewald <manuel.diewald@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
2023-10-30 11:41:55 +01:00
Andrea Righi 6a8a45f2b7 UBUNTU: SAUCE: overlayfs: fix reference count mismatch
BugLink: https://bugs.launchpad.net/bugs/2016398

Opened files reported in /proc/pid/map_files can be shows with the wrong
mount point using overlayfs with filesystem namspaces.

This incorrect behavior is fixed:

  UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files

However, the fix introduced a new regression, the reference to the
original file stored in vma->vm_prfile is not properly released when
vma->vm_prfile is replaced with a new file.

This can cause a reference counter unbalance, leading errors such as
"target is busy" when trying to unmount overlayfs, even if the
filesystem has not active reference.

Fix by properly releasing the original file stored in vm_prfile.

Fixes: 508fdae3f62dd ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
2023-08-09 12:25:42 +02:00
Kees Cook dc08591ef5 treewide: Remove uninitialized_var() usage
BugLink: https://bugs.launchpad.net/bugs/2028981

commit 3f649ab728 upstream.

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2023-08-09 12:25:41 +02:00
Miklos Szeredi 8106a34253 ovl: adhere to the vfs_ vs. ovl_do_ conventions for xattrs
Call ovl_do_*xattr() when accessing an overlay private xattr, vfs_*xattr()
otherwise.

This has an effect on debug output, which is made more consistent by this
patch.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

CVE-2023-32629
(cherry picked from commit 7109704705)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Luke Nowakowski-Krijger <luke.nowakowskikrijger@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2023-07-10 17:22:07 +02:00
Zhang Tianci b91458a747 ovl: Use ovl mounter's fsuid and fsgid in ovl_link()
BugLink: https://bugs.launchpad.net/bugs/2003914

commit 5b0db51215e895a361bc63132caa7cca36a53d6a upstream.

There is a wrong case of link() on overlay:
  $ mkdir /lower /fuse /merge
  $ mount -t fuse /fuse
  $ mkdir /fuse/upper /fuse/work
  $ mount -t overlay /merge -o lowerdir=/lower,upperdir=/fuse/upper,\
    workdir=work
  $ touch /merge/file
  $ chown bin.bin /merge/file // the file's caller becomes "bin"
  $ ln /merge/file /merge/lnkfile

Then we will get an error(EACCES) because fuse daemon checks the link()'s
caller is "bin", it denied this request.

In the changing history of ovl_link(), there are two key commits:

The first is commit bb0d2b8ad2 ("ovl: fix sgid on directory") which
overrides the cred's fsuid/fsgid using the new inode. The new inode's
owner is initialized by inode_init_owner(), and inode->fsuid is
assigned to the current user. So the override fsuid becomes the
current user. We know link() is actually modifying the directory, so
the caller must have the MAY_WRITE permission on the directory. The
current caller may should have this permission. This is acceptable
to use the caller's fsuid.

The second is commit 51f7e52dc9 ("ovl: share inode for hard link")
which removed the inode creation in ovl_link(). This commit move
inode_init_owner() into ovl_create_object(), so the ovl_link() just
give the old inode to ovl_create_or_link(). Then the override fsuid
becomes the old inode's fsuid, neither the caller nor the overlay's
mounter! So this is incorrect.

Fix this bug by using ovl mounter's fsuid/fsgid to do underlying
fs's link().

Link: https://lore.kernel.org/all/20220817102952.xnvesg3a7rbv576x@wittgenstein/T
Link: https://lore.kernel.org/lkml/20220825130552.29587-1-zhangtianci.1997@bytedance.com/t
Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com>
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Fixes: 51f7e52dc9 ("ovl: share inode for hard link")
Cc: <stable@vger.kernel.org> # v4.8
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2023-02-01 15:23:19 +01:00
Jiachen Zhang 6ceca34cff ovl: drop WARN_ON() dentry is NULL in ovl_encode_fh()
BugLink: https://bugs.launchpad.net/bugs/1990190

commit dd524b7f317de8d31d638cbfdc7be4cf9b770e42 upstream.

Some code paths cannot guarantee the inode have any dentry alias. So
WARN_ON() all !dentry may flood the kernel logs.

For example, when an overlayfs inode is watched by inotifywait (1), and
someone is trying to read the /proc/$(pidof inotifywait)/fdinfo/INOTIFY_FD,
at that time if the dentry has been reclaimed by kernel (such as
echo 2 > /proc/sys/vm/drop_caches), there will be a WARN_ON(). The
printed call stack would be like:

    ? show_mark_fhandle+0xf0/0xf0
    show_mark_fhandle+0x4a/0xf0
    ? show_mark_fhandle+0xf0/0xf0
    ? seq_vprintf+0x30/0x50
    ? seq_printf+0x53/0x70
    ? show_mark_fhandle+0xf0/0xf0
    inotify_fdinfo+0x70/0x90
    show_fdinfo.isra.4+0x53/0x70
    seq_show+0x130/0x170
    seq_read+0x153/0x440
    vfs_read+0x94/0x150
    ksys_read+0x5f/0xe0
    do_syscall_64+0x59/0x1e0
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

So let's drop WARN_ON() to avoid kernel log flooding.

Reported-by: Hongbo Yin <yinhongbo@bytedance.com>
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Signed-off-by: Tianci Zhang <zhangtianci.1997@bytedance.com>
Fixes: 8ed5eec9d6 ("ovl: encode pure upper file handles")
Cc: <stable@vger.kernel.org> # v4.16
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2022-10-17 15:00:15 +02:00
Miklos Szeredi 7a8a00ec7a ovl: fix warning in ovl_create_real()
BugLink: https://bugs.launchpad.net/bugs/1957991

commit 1f5573cfe7a7056e80a92c7a037a3e69f3a13d1c upstream.

Syzbot triggered the following warning in ovl_workdir_create() ->
ovl_create_real():

	if (!err && WARN_ON(!newdentry->d_inode)) {

The reason is that the cgroup2 filesystem returns from mkdir without
instantiating the new dentry.

Weird filesystems such as this will be rejected by overlayfs at a later
stage during setup, but to prevent such a warning, call ovl_mkdir_real()
directly from ovl_workdir_create() and reject this case early.

Reported-and-tested-by: syzbot+75eab84fd0af9e8bf66b@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2022-03-07 16:34:17 +01:00
Miklos Szeredi d7d48fb589 ovl: fix deadlock in splice write
BugLink: https://bugs.launchpad.net/bugs/1953387

commit 9b91b6b019 upstream.

There's possibility of an ABBA deadlock in case of a splice write to an
overlayfs file and a concurrent splice write to a corresponding real file.

The call chain for splice to an overlay file:

 -> do_splice                     [takes sb_writers on overlay file]
   -> do_splice_from
     -> iter_file_splice_write    [takes pipe->mutex]
       -> vfs_iter_write
         ...
         -> ovl_write_iter        [takes sb_writers on real file]

And the call chain for splice to a real file:

 -> do_splice                     [takes sb_writers on real file]
   -> do_splice_from
     -> iter_file_splice_write    [takes pipe->mutex]

Syzbot successfully bisected this to commit 82a763e61e ("ovl: simplify
file splice").

Fix by reverting the write part of the above commit and by adding missing
bits from ovl_write_iter() into ovl_splice_write().

Fixes: 82a763e61e ("ovl: simplify file splice")
Reported-and-tested-by: syzbot+579885d1a9a833336209@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2022-01-13 18:42:44 +01:00
Miklos Szeredi ec0c3157d6 ovl: simplify file splice
BugLink: https://bugs.launchpad.net/bugs/1951291

commit 82a763e61e upstream.

generic_file_splice_read() and iter_file_splice_write() will call back into
f_op->iter_read() and f_op->iter_write() respectively.  These already do
the real file lookup and cred override.  So the code in ovl_splice_read()
and ovl_splice_write() is redundant.

In addition the ovl_file_accessed() call in ovl_splice_write() is
incorrect, though probably harmless.

Fix by calling generic_file_splice_read() and iter_file_splice_write()
directly.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
[reported to resolve issues with 1a980b8cbf ("ovl: add splice file read write helper")]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2021-11-22 14:29:42 +01:00
Zheng Liang eb264fc4be ovl: fix missing negative dentry check in ovl_rename()
BugLink: https://bugs.launchpad.net/bugs/1950014

commit a295aef603 upstream.

The following reproducer

  mkdir lower upper work merge
  touch lower/old
  touch lower/new
  mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merge
  rm merge/new
  mv merge/old merge/new & unlink upper/new

may result in this race:

PROCESS A:
  rename("merge/old", "merge/new");
  overwrite=true,ovl_lower_positive(old)=true,
  ovl_dentry_is_whiteout(new)=true -> flags |= RENAME_EXCHANGE

PROCESS B:
  unlink("upper/new");

PROCESS A:
  lookup newdentry in new_upperdir
  call vfs_rename() with negative newdentry and RENAME_EXCHANGE

Fix by adding the missing check for negative newdentry.

Signed-off-by: Zheng Liang <zhengliang6@huawei.com>
Fixes: e9be9d5e76 ("overlay filesystem")
Cc: <stable@vger.kernel.org> # v3.18
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2021-11-12 14:07:25 +01:00
chenying 40273f0f87 ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup()
BugLink: https://bugs.launchpad.net/bugs/1946802

commit 52d5a0c6bd upstream.

If function ovl_instantiate() returns an error, ovl_cleanup will be called
and try to remove newdentry from wdir, but the newdentry has been moved to
udir at this time.  This will causes BUG_ON(victim->d_parent->d_inode !=
dir) in fs/namei.c:may_delete.

Signed-off-by: chenying <chenying.kernel@bytedance.com>
Fixes: 01b39dcc95 ("ovl: use inode_insert5() to hash a newly created inode")
Link: https://lore.kernel.org/linux-unionfs/e6496a94-a161-dc04-c38a-d2544633acb4@bytedance.com/
Cc: <stable@vger.kernel.org> # v4.18
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2021-10-12 16:31:39 -06:00
Miklos Szeredi 44f7a825a0 ovl: fix uninitialized pointer read in ovl_lookup_real_one()
BugLink: https://bugs.launchpad.net/bugs/1944756

[ Upstream commit 580c610429 ]

One error path can result in release_dentry_name_snapshot() being called
before "name" was initialized by take_dentry_name_snapshot().

Fix by moving the release_dentry_name_snapshot() to immediately after the
only use.

Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2021-10-01 11:31:03 +02:00
Murphy Zhou aae99b8053 ovl: add splice file read write helper
BugLink: https://bugs.launchpad.net/bugs/1944212

[ Upstream commit 1a980b8cbf ]

Now overlayfs falls back to use default file splice read
and write, which is not compatiple with overlayfs, returning
EFAULT. xfstests generic/591 can reproduce part of this.

Tested this patch with xfstests auto group tests.

Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2021-09-24 12:27:50 +02:00
Dan Carpenter b39b3b68f1 ovl: fix missing revert_creds() on error path
BugLink: https://bugs.launchpad.net/bugs/1929615

commit 7b279bbfd2 upstream.

Smatch complains about missing that the ovl_override_creds() doesn't
have a matching revert_creds() if the dentry is disconnected.  Fix this
by moving the ovl_override_creds() until after the disconnected check.

Fixes: aa3ff3c152 ("ovl: copy up of disconnected dentries")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
2021-05-26 15:39:08 +02:00
Miklos Szeredi 1d10b4bc45 ovl: allow upperdir inside lowerdir
BugLink: https://bugs.launchpad.net/bugs/1928823

commit 708fa01597 upstream.

Commit 146d62e5a5 ("ovl: detect overlapping layers") made sure we don't
have overlapping layers, but it also broke the arguably valid use case of

 mount -olowerdir=/,upperdir=/subdir,..

where upperdir overlaps lowerdir on the same filesystem.  This has been
causing regressions.

Revert the check, but only for the specific case where upperdir and/or
workdir are subdirectories of lowerdir.  Any other overlap (e.g. lowerdir
is subdirectory of upperdir, etc) case is crazy, so leave the check in
place for those.

Overlaps are detected at lookup time too, so reverting the mount time check
should be safe.

Fixes: 146d62e5a5 ("ovl: detect overlapping layers")
Cc: <stable@vger.kernel.org> # v5.2
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2021-05-19 10:53:24 +02:00
Alexander Mikhalitsyn 28eab192cf UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files
BugLink: https://bugs.launchpad.net/bugs/1857257

The hack was introduced in ("UBUNTU: SAUCE: overlayfs: allow with
shiftfs as underlay") and it broke checkpoint/restore of docker
contains:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857257

The following script can be used to trigger the issue:
  #!/bin/bash

  cat > test.py << EOF
  import sys

  f = open("/proc/self/maps")

  for l in f.readlines():
    if "python" not in l:
      continue
    print(l)
    s = l.split()
    start, end = s[0].split("-")
    fname = s[-1]
    print(start, end, fname)
    break
  else:
    sys.exit(1)

  test_file1 = open(fname)
  test_file2 = open("/proc/self/map_files/%s-%s" % (start, end))

  fdinfo1 = open("/proc/self/fdinfo/%d" % test_file1.fileno()).read()
  fdinfo2 = open("/proc/self/fdinfo/%d" % test_file2.fileno()).read()

  if fdinfo1 != fdinfo2:
    print("FAIL")
    print(test_file1)
    print(fdinfo1)
    print(test_file2)
    print(fdinfo2)
    sys.exit(1)
  print("PASS")
  EOF
  sudo docker run -it --privileged --rm -v `pwd`:/mnt python python /mnt/test.py

Thanks to Andrei Vagin for the reproducer and investigation of this problem.

Cc: Andrei Vagin <avagin@gmail.com>
Cc: Adrian Reber <areber@redhat.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>

Fixes: d24b8a5 ("UBUNTU: SAUCE: overlayfs: allow with shiftfs as underlay")
Signed-off-by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2021-05-07 18:14:24 -06:00
Miklos Szeredi c6edd27d79 ovl: expand warning in ovl_d_real()
BugLink: https://bugs.launchpad.net/bugs/1918167

commit cef4cbff06 upstream.

There was a syzbot report with this warning but insufficient information...

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2021-03-24 11:14:35 +01:00
Amir Goldstein 65defe693a ovl: skip getxattr of security labels
BugLink: https://bugs.launchpad.net/bugs/1918167

[ Upstream commit 03fedf9359 ]

When inode has no listxattr op of its own (e.g. squashfs) vfs_listxattr
calls the LSM inode_listsecurity hooks to list the xattrs that LSMs will
intercept in inode_getxattr hooks.

When selinux LSM is installed but not initialized, it will list the
security.selinux xattr in inode_listsecurity, but will not intercept it
in inode_getxattr.  This results in -ENODATA for a getxattr call for an
xattr returned by listxattr.

This situation was manifested as overlayfs failure to copy up lower
files from squashfs when selinux is built-in but not initialized,
because ovl_copy_xattr() iterates the lower inode xattrs by
vfs_listxattr() and vfs_getxattr().

ovl_copy_xattr() skips copy up of security labels that are indentified by
inode_copy_up_xattr LSM hooks, but it does that after vfs_getxattr().
Since we are not going to copy them, skip vfs_getxattr() of the security
labels.

Reported-by: Michael Labriola <michael.d.labriola@gmail.com>
Tested-by: Michael Labriola <michael.d.labriola@gmail.com>
Link: https://lore.kernel.org/linux-unionfs/2nv9d47zt7.fsf@aldarion.sourceruckus.org/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2021-03-24 11:14:27 +01:00
Miklos Szeredi c35a47bb69 ovl: perform vfs_getxattr() with mounter creds
BugLink: https://bugs.launchpad.net/bugs/1918167

[ Upstream commit 554677b972 ]

The vfs_getxattr() in ovl_xattr_set() is used to check whether an xattr
exist on a lower layer file that is to be removed.  If the xattr does not
exist, then no need to copy up the file.

This call of vfs_getxattr() wasn't wrapped in credential override, and this
is probably okay.  But for consitency wrap this instance as well.

Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2021-03-24 11:14:26 +01:00
Liangyan 36bce1e3f8 ovl: fix dentry leak in ovl_get_redirect
BugLink: https://bugs.launchpad.net/bugs/1916066

commit e04527fefb upstream.

We need to lock d_parent->d_lock before dget_dlock, or this may
have d_lockref updated parallelly like calltrace below which will
cause dentry->d_lockref leak and risk a crash.

     CPU 0                                CPU 1
ovl_set_redirect                       lookup_fast
  ovl_get_redirect                       __d_lookup
    dget_dlock
      //no lock protection here            spin_lock(&dentry->d_lock)
      dentry->d_lockref.count++            dentry->d_lockref.count++

[   49.799059] PGD 800000061fed7067 P4D 800000061fed7067 PUD 61fec5067 PMD 0
[   49.799689] Oops: 0002 [#1] SMP PTI
[   49.800019] CPU: 2 PID: 2332 Comm: node Not tainted 4.19.24-7.20.al7.x86_64 #1
[   49.800678] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8a46cfe 04/01/2014
[   49.801380] RIP: 0010:_raw_spin_lock+0xc/0x20
[   49.803470] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246
[   49.803949] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000
[   49.804600] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088
[   49.805252] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040
[   49.805898] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000
[   49.806548] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0
[   49.807200] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000
[   49.807935] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   49.808461] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0
[   49.809113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   49.809758] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   49.810410] Call Trace:
[   49.810653]  d_delete+0x2c/0xb0
[   49.810951]  vfs_rmdir+0xfd/0x120
[   49.811264]  do_rmdir+0x14f/0x1a0
[   49.811573]  do_syscall_64+0x5b/0x190
[   49.811917]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   49.812385] RIP: 0033:0x7ffbf505ffd7
[   49.814404] RSP: 002b:00007ffbedffada8 EFLAGS: 00000297 ORIG_RAX: 0000000000000054
[   49.815098] RAX: ffffffffffffffda RBX: 00007ffbedffb640 RCX: 00007ffbf505ffd7
[   49.815744] RDX: 0000000004449700 RSI: 0000000000000000 RDI: 0000000006c8cd50
[   49.816394] RBP: 00007ffbedffaea0 R08: 0000000000000000 R09: 0000000000017d0b
[   49.817038] R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000012
[   49.817687] R13: 00000000072823d8 R14: 00007ffbedffb700 R15: 00000000072823d8
[   49.818338] Modules linked in: pvpanic cirrusfb button qemu_fw_cfg atkbd libps2 i8042
[   49.819052] CR2: 0000000000000088
[   49.819368] ---[ end trace 4e652b8aa299aa2d ]---
[   49.819796] RIP: 0010:_raw_spin_lock+0xc/0x20
[   49.821880] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246
[   49.822363] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000
[   49.823008] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088
[   49.823658] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040
[   49.825404] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000
[   49.827147] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0
[   49.828890] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000
[   49.830725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   49.832359] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0
[   49.834085] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   49.835792] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Cc: <stable@vger.kernel.org>
Fixes: a6c6065511 ("ovl: redirect on rename-dir")
Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
2021-03-24 11:11:41 +01:00
Miklos Szeredi 1a1099363a ovl: do not fail because of O_NOATIME
BugLink: https://bugs.launchpad.net/bugs/1900141

In case the file cannot be opened with O_NOATIME because of lack of
capabilities, then clear O_NOATIME instead of failing.

Remove WARN_ON(), since it would now trigger if O_NOATIME was cleared.
Noticed by Amir Goldstein.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(backported from commit b6650dab40)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: William Breathitt Gray <william.gray@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
2021-01-18 17:26:15 +01:00
Miklos Szeredi 24d019b8de ovl: check permission to open real file
BugLink: https://bugs.launchpad.net/bugs/1894980

Call inode_permission() on real inode before opening regular file on one of
the underlying layers.

In some cases ovl_permission() already checks access to an underlying file,
but it misses the metacopy case, and possibly other ones as well.

Removing the redundant permission check from ovl_permission() should be
considered later.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(backported from commit 05acefb487)
[ saf: resolve conflicts with code added to support mounts over
  shiftfs ]
CVE-2020-16120
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Acked-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
2020-09-30 09:44:10 -03:00
Miklos Szeredi d9e213524a ovl: call secutiry hook in ovl_real_ioctl()
BugLink: https://bugs.launchpad.net/bugs/1894980

Verify LSM permissions for underlying file, since vfs_ioctl() doesn't do
it.

[Stephen Rothwell] export security_file_ioctl

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(backported from commit 292f902a40)
[ saf: trivial conflict resolution ]
CVE-2020-16120
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Acked-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
2020-09-30 09:44:10 -03:00
Miklos Szeredi 87609cff43 ovl: verify permissions in ovl_path_open()
BugLink: https://bugs.launchpad.net/bugs/1894980

Check permission before opening a real file.

ovl_path_open() is used by readdir and copy-up routines.

ovl_permission() theoretically already checked copy up permissions, but it
doesn't hurt to re-do these checks during the actual copy-up.

For directory reading ovl_permission() only checks access to topmost
underlying layer.  Readdir on a merged directory accesses layers below the
topmost one as well.  Permission wasn't checked for these layers.

Note: modifying ovl_permission() to perform this check would be far more
complex and hence more bug prone.  The result is less precise permissions
returned in access(2).  If this turns out to be an issue, we can revisit
this bug.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 56230d9567)
CVE-2020-16120
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Acked-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
2020-09-30 09:44:10 -03:00
Miklos Szeredi 13edb02efd ovl: switch to mounter creds in readdir
BugLink: https://bugs.launchpad.net/bugs/1894980

In preparation for more permission checking, override credentials for
directory operations on the underlying filesystems.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 48bd024b8a)
CVE-2020-16120
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Acked-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
2020-09-30 09:44:10 -03:00
Miklos Szeredi dcdcca9867 ovl: pass correct flags for opening real directory
BugLink: https://bugs.launchpad.net/bugs/1894980

The three instances of ovl_path_open() in overlayfs/readdir.c do three
different things:

 - pass f_flags from overlay file
 - pass O_RDONLY | O_DIRECTORY
 - pass just O_RDONLY

The value of f_flags can be (other than O_RDONLY):

O_WRONLY	- not possible for a directory
O_RDWR		- not possible for a directory
O_CREAT		- masked out by dentry_open()
O_EXCL		- masked out by dentry_open()
O_NOCTTY	- masked out by dentry_open()
O_TRUNC		- masked out by dentry_open()
O_APPEND	- no effect on directory ops
O_NDELAY	- no effect on directory ops
O_NONBLOCK	- no effect on directory ops
__O_SYNC	- no effect on directory ops
O_DSYNC		- no effect on directory ops
FASYNC		- no effect on directory ops
O_DIRECT	- no effect on directory ops
O_LARGEFILE	- ?
O_DIRECTORY	- only affects lookup
O_NOFOLLOW	- only affects lookup
O_NOATIME	- overlay sets this unconditionally in ovl_path_open()
O_CLOEXEC	- only affects fd allocation
O_PATH		- no effect on directory ops
__O_TMPFILE	- not possible for a directory

Fon non-merge directories we use the underlying filesystem's iterate; in
this case honor O_LARGEFILE from the original file to make sure that open
doesn't get rejected.

For merge directories it's safe to pass O_LARGEFILE unconditionally since
userspace will only see the artificial offsets created by overlayfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 130fdbc3d1)
CVE-2020-16120
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Acked-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
2020-09-30 09:44:10 -03:00
Seth Forshee 32fd0cdff7 Revert "UBUNTU: SAUCE: overlayfs: ensure mounter privileges when reading directories"
BugLink: https://bugs.launchpad.net/bugs/1894980

In preparation for backporting upstream patches to add similar
but more expansive permission checks, revert our SAUCE patch
which adds permission checking on directory reads.

CVE-2020-16120
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Acked-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
2020-09-30 09:44:10 -03:00
Amir Goldstein b9947eecb8 ovl: fix unneeded call to ovl_change_flags()
BugLink: https://bugs.launchpad.net/bugs/1888560

commit 81a33c1ee9 upstream.

The check if user has changed the overlay file was wrong, causing unneeded
call to ovl_change_flags() including taking f_lock on every file access.

Fixes: d989903058 ("ovl: do not generate duplicate fsnotify events for "fake" path")
Cc: <stable@vger.kernel.org> # v4.19+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2020-08-08 01:53:12 -04:00
Amir Goldstein ff7331df44 ovl: relax WARN_ON() when decoding lower directory file handle
BugLink: https://bugs.launchpad.net/bugs/1888560

commit 124c2de2c0 upstream.

Decoding a lower directory file handle to overlay path with cold
inode/dentry cache may go as follows:

1. Decode real lower file handle to lower dir path
2. Check if lower dir is indexed (was copied up)
3. If indexed, get the upper dir path from index
4. Lookup upper dir path in overlay
5. If overlay path found, verify that overlay lower is the lower dir
   from step 1

On failure to verify step 5 above, user will get an ESTALE error and a
WARN_ON will be printed.

A mismatch in step 5 could be a result of lower directory that was renamed
while overlay was offline, after that lower directory has been copied up
and indexed.

This is a scripted reproducer based on xfstest overlay/052:

  # Create lower subdir
  create_dirs
  create_test_files $lower/lowertestdir/subdir
  mount_dirs
  # Copy up lower dir and encode lower subdir file handle
  touch $SCRATCH_MNT/lowertestdir
  test_file_handles $SCRATCH_MNT/lowertestdir/subdir -p -o $tmp.fhandle
  # Rename lower dir offline
  unmount_dirs
  mv $lower/lowertestdir $lower/lowertestdir.new/
  mount_dirs
  # Attempt to decode lower subdir file handle
  test_file_handles $SCRATCH_MNT -p -i $tmp.fhandle

Since this WARN_ON() can be triggered by user we need to relax it.

Fixes: 4b91c30a5a ("ovl: lookup connected ancestor of dir in inode cache")
Cc: <stable@vger.kernel.org> # v4.16+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2020-08-08 01:53:12 -04:00
youngjun 54a992fc79 ovl: inode reference leak in ovl_is_inuse true case.
BugLink: https://bugs.launchpad.net/bugs/1888560

commit 24f14009b8 upstream.

When "ovl_is_inuse" true case, trap inode reference not put.  plus adding
the comment explaining sequence of ovl_is_inuse after ovl_setup_trap.

Fixes: 0be0bfd2de ("ovl: fix regression caused by overlapping layers detection")
Cc: <stable@vger.kernel.org> # v4.19+
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: youngjun <her0gyugyu@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2020-08-08 01:53:12 -04:00
Amir Goldstein 2ae79129e6 ovl: fix regression with re-formatted lower squashfs
BugLink: https://bugs.launchpad.net/bugs/1888560

commit a888db3101 upstream.

Commit 9df085f3c9 ("ovl: relax requirement for non null uuid of lower
fs") relaxed the requirement for non null uuid with single lower layer to
allow enabling index and nfs_export features with single lower squashfs.

Fabian reported a regression in a setup when overlay re-uses an existing
upper layer and re-formats the lower squashfs image.  Because squashfs
has no uuid, the origin xattr in upper layer are decoded from the new
lower layer where they may resolve to a wrong origin file and user may
get an ESTALE or EIO error on lookup.

To avoid the reported regression while still allowing the new features
with single lower squashfs, do not allow decoding origin with lower null
uuid unless user opted-in to one of the new features that require
following the lower inode of non-dir upper (index, xino, metacopy).

Reported-by: Fabian <godi.beat@gmx.net>
Link: https://lore.kernel.org/linux-unionfs/32532923.JtPX5UtSzP@fgdesktop/
Fixes: 9df085f3c9 ("ovl: relax requirement for non null uuid of lower fs")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2020-08-08 01:53:12 -04:00
Yuxuan Shui 1fba8146d1 ovl: initialize error in ovl_copy_xattr
BugLink: https://bugs.launchpad.net/bugs/1884089

commit 520da69d26 upstream.

In ovl_copy_xattr, if all the xattrs to be copied are overlayfs private
xattrs, the copy loop will terminate without assigning anything to the
error variable, thus returning an uninitialized value.

If ovl_copy_xattr is called from ovl_clear_empty, this uninitialized error
value is put into a pointer by ERR_PTR(), causing potential invalid memory
accesses down the line.

This commit initialize error with 0. This is the correct value because when
there's no xattr to copy, because all xattrs are private, ovl_copy_xattr
should succeed.

This bug is discovered with the help of INIT_STACK_ALL and clang.

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1050405
Fixes: 0956254a2d ("ovl: don't copy up opaqueness")
Cc: stable@vger.kernel.org # v4.8
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
2020-08-08 01:53:12 -04:00
Kleber Sacilotto de Souza 9ac77e721a Revert "UBUNTU: SAUCE: overlayfs: use shiftfs hacks only with shiftfs as underlay"
BugLink: https://bugs.launchpad.net/bugs/1879690

This reverts commit 6f18a84340.

The change applied for LP: #1857257 and its followup fix LP: #1876645
introduced a regression on overlayfs. Revert these commits for now.

Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
2020-05-21 14:32:18 +02:00
Kleber Sacilotto de Souza e1ecc108eb Revert "UBUNTU: SAUCE: overlayfs: fix shitfs special-casing"
BugLink: https://bugs.launchpad.net/bugs/1879690

This reverts commit b3bdda24f1.

The change applied for LP: #1857257 and its followup fix LP: #1876645
introduced a regression on overlayfs. Revert these commits for now.

Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
2020-05-21 14:31:52 +02:00
Christian Brauner b3bdda24f1 UBUNTU: SAUCE: overlayfs: fix shitfs special-casing
BugLink: https://bugs.launchpad.net/bugs/1876645

When I picked up Andrei's patch I ported it wrong. We need to initialize
realpath before dereferencing it obviously.

Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Kamal Mostafa <kamal@canonical.com>
Cc: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Cc: Andrei Vagin <avagin@gmail.com>
Fixes: 4e1f6efeedae ("UBUNTU: SAUCE: overlayfs: use shiftfs hacks only with shiftfs as underlay")
Link: https://bugs.launchpad.net/bugs/1857257
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Souza <kleber.souza@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
2020-05-07 01:15:39 -04:00
Andrei Vagin 6f18a84340 UBUNTU: SAUCE: overlayfs: use shiftfs hacks only with shiftfs as underlay
BugLink: https://bugs.launchpad.net/bugs/1857257

The hack was introduced in ("UBUNTU: SAUCE: overlayfs: allow with
shiftfs as underlay") and it broke checkpoint/restore of docker
contains:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857257

The following script can be used to trigger the issue:
  #!/bin/bash

  cat > test.py << EOF
  import sys

  f = open("/proc/self/maps")

  for l in f.readlines():
    if "python" not in l:
      continue
    print(l)
    s = l.split()
    start, end = s[0].split("-")
    fname = s[-1]
    print(start, end, fname)
    break
  else:
    sys.exit(1)

  test_file1 = open(fname)
  test_file2 = open("/proc/self/map_files/%s-%s" % (start, end))

  fdinfo1 = open("/proc/self/fdinfo/%d" % test_file1.fileno()).read()
  fdinfo2 = open("/proc/self/fdinfo/%d" % test_file2.fileno()).read()

  if fdinfo1 != fdinfo2:
    print("FAIL")
    print(test_file1)
    print(fdinfo1)
    print(test_file2)
    print(fdinfo2)
    sys.exit(1)
  print("PASS")
  EOF
  sudo docker run -it --privileged --rm -v `pwd`:/mnt python python /mnt/test.py

Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: Connor Kuehl <connor.kuehl@canonical.com>
Cc: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Fixes: 58009298c6bd ("UBUNTU: SAUCE: overlayfs: allow with shiftfs as underlay")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
[ klebers: fixed compilation error by including <uapi/linux/magic.h>.]
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
2020-05-05 12:32:22 +02:00
Amir Goldstein deee57eca6 ovl: fix value of i_ino for lower hardlink corner case
BugLink: https://bugs.launchpad.net/bugs/1874111

commit 300b124fcf upstream.

Commit 6dde1e42f4 ("ovl: make i_ino consistent with st_ino in more
cases"), relaxed the condition nfs_export=on in order to set the value of
i_ino to xino map of real ino.

Specifically, it also relaxed the pre-condition that index=on for
consistent i_ino. This opened the corner case of lower hardlink in
ovl_get_inode(), which calls ovl_fill_inode() with ino=0 and then
ovl_init_inode() is called to set i_ino to lower real ino without the xino
mapping.

Pass the correct values of ino;fsid in this case to ovl_fill_inode(), so it
can initialize i_ino correctly.

Fixes: 6dde1e42f4 ("ovl: make i_ino consistent with st_ino in more ...")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
2020-05-05 12:32:22 +02:00
Miklos Szeredi 1bf9f6e960 ovl: fix lseek overflow on 32bit
BugLink: https://bugs.launchpad.net/bugs/1863588

commit a4ac9d45c0 upstream.

ovl_lseek() is using ssize_t to return the value from vfs_llseek().  On a
32-bit kernel ssize_t is a 32-bit signed int, which overflows above 2 GB.

Assign the return value of vfs_llseek() to loff_t to fix this.

Reported-by: Boris Gjenero <boris.gjenero@gmail.com>
Fixes: 9e46b840c7 ("ovl: support stacked SEEK_HOLE/SEEK_DATA")
Cc: <stable@vger.kernel.org> # v4.19
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
2020-02-17 10:57:49 +01:00
Amir Goldstein 07fa4af9f8 ovl: fix wrong WARN_ON() in ovl_cache_update_ino()
BugLink: https://bugs.launchpad.net/bugs/1863588

commit 4c37e71b71 upstream.

The WARN_ON() that child entry is always on overlay st_dev became wrong
when we allowed this function to update d_ino in non-samefs setup with xino
enabled.

It is not true in case of xino bits overflow on a non-dir inode.  Leave the
WARN_ON() only for directories, where assertion is still true.

Fixes: adbf4f7ea8 ("ovl: consistent d_ino for non-samefs with xino")
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
2020-02-17 10:57:49 +01:00
Amir Goldstein bef69d50e9 ovl: relax WARN_ON() on rename to self
BugLink: https://bugs.launchpad.net/bugs/1858424

commit 6889ee5a53 upstream.

In ovl_rename(), if new upper is hardlinked to old upper underneath
overlayfs before upper dirs are locked, user will get an ESTALE error
and a WARN_ON will be printed.

Changes to underlying layers while overlayfs is mounted may result in
unexpected behavior, but it shouldn't crash the kernel and it shouldn't
trigger WARN_ON() either, so relax this WARN_ON().

Reported-by: syzbot+bb1836a212e69f8e201a@syzkaller.appspotmail.com
Fixes: 804032fabb ("ovl: don't check rename to self")
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
2020-01-06 07:33:22 -06:00
Amir Goldstein 9030d27e70 ovl: fix corner case of non-unique st_dev;st_ino
BugLink: https://bugs.launchpad.net/bugs/1858424

commit 9c6d8f13e9 upstream.

On non-samefs overlay without xino, non pure upper inodes should use a
pseudo_dev assigned to each unique lower fs and pure upper inodes use the
real upper st_dev.

It is fine for an overlay pure upper inode to use the same st_dev;st_ino
values as the real upper inode, because the content of those two different
filesystem objects is always the same.

In this case, however:
 - two filesystems, A and B
 - upper layer is on A
 - lower layer 1 is also on A
 - lower layer 2 is on B

Non pure upper overlay inode, whose origin is in layer 1 will have the same
st_dev;st_ino values as the real lower inode. This may result with a false
positive results of 'diff' between the real lower and copied up overlay
inode.

Fix this by using the upper st_dev;st_ino values in this case.  This breaks
the property of constant st_dev;st_ino across copy up of this case. This
breakage will be fixed by a later patch.

Fixes: 5148626b80 ("ovl: allocate anon bdev per unique lower fs")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
2020-01-06 07:33:21 -06:00
Amir Goldstein c8966fa31e UBUNTU: SAUCE: ovl: fix lookup failure on multi lower squashfs
BugLink: https://bugs.launchpad.net/bugs/1824407

In the past, overlayfs required that lower fs have non null uuid in
order to support nfs export and decode copy up origin file handles.

Commit 9df085f3c9 ("ovl: relax requirement for non null uuid of
lower fs") relaxed this requirement for nfs export support, as long
as uuid (even if null) is unique among all lower fs.

However, said commit unintentionally also relaxed the non null uuid
requirement for decoding copy up origin file handles, regardless of
the unique uuid requirement.

Amend this mistake by disabling decoding of copy up origin file handle
from lower fs with a conflicting uuid.

We still encode copy up origin file handles from those fs, because
file handles like those already exist in the wild and because they
might provide useful information in the future.

There is an unhandled corner case described by Miklos this way:
- two filesystems, A and B, both have null uuid
- upper layer is on A
- lower layer 1 is also on A
- lower layer 2 is on B

In this case bad_uuid won't be set for B, because the check only
involves the list of lower fs.  Hence we'll try to decode a layer 2
origin on layer 1 and fail.

We will deal with this corner case later.

Reported-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/lkml/20191106234301.283006-1-colin.king@canonical.com/
Fixes: 9df085f3c9 ("ovl: relax requirement for non null uuid ...")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
(cherry picked from commit b2d4f0ea5af42e16e154254de99da064f3ac551a
 https://github.com/amir73il/linux)
Acked-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
2019-12-05 16:31:26 -06:00
Seth Forshee bdc22284ef UBUNTU: SAUCE: ovl: Restore vm_file value when lower fs mmap fails
BugLink: https://bugs.launchpad.net/bugs/1850994

ovl_mmap() overwrites vma->vm_file before calling the lower
filesystem mmap but does not restore the original value on
failure. This means it is giving a pointer to the lower fs file
back to the caller with no reference, which is a bad practice.
However, it does not lead to any issues with upstream kernels as
no caller accesses vma->vm_file after call_mmap().

With the aufs patches applied the story is different. Whereas
mmap_region() previously fput a local variable containing the
file it assigned to vm_file, it now calls vma_fput() which will
fput vm_file, for which it has no reference, and the reference
for the original vm_file is not put.

Fix this by restoring vma->vm_file to the original value when the
mmap call into the lower fs fails.

CVE-2019-15794

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
2019-11-25 14:57:01 +01:00
Christian Brauner d24b8a547b UBUNTU: SAUCE: overlayfs: allow with shiftfs as underlay
BugLink: https://bugs.launchpad.net/bugs/1846272

In commit [1] we enabled overlayfs on top of shiftfs. This approach was
buggy since it let to a regression for some standard overlayfs workloads
(cf. [2]).
In our original approach in [1] Seth and I concluded that running
overlayfs on top of shiftfs was not possible because of the way
overlayfs is currently opening files. The fact that it did not pass down
the dentry of shiftfs but rather it's own caused shiftfs to be confused
since it stashes away necessary information in d_fsdata.
Our solution was to modify open_with_fake_path() to also take a dentry
as an argument, then change overlayfs to pass in the shiftfs dentry
which then would override the dentry in the passed in struct path in
open_with_fake_path().
However, this led to a regression for some standard overlayfs workloads
(cf. [2]).
After various discussions involving Seth and myself in Paris we
concluded the reason for the regression was that we effectively created
a struct path that was comprised of the vfsmount of the overlayfs dentry
and the dentry of shiftfs. This is obviously broken.
The fix is to a) not modify open_with_fake_path() and b) change
overlayfs to do what shiftfs is doing, namely correctly setup the struct
path such that vfsmount and dentry match and are both from shiftfs.
Note, that overlayfs already does this for the .open method for
directories. It just did not do it for the .open method for regular
files leading to this issue. The reason why this hasn't been a problem
for overlayfs so far is that it didn't allow running on top of
filesystems that make use of d_fsdata _implicitly_ by disallowing any
filesystem that is itself an overlay, or has revalidate methods for it's
dentries as those usually have d_fsdata set up. Any other filesystem
falling in this category would have suffered from the same problem.

Seth managed to trigger the regression with the following script:
 #!/bin/bash

 utils=(bash cat)

 mkdir -p lower/proc upper work root
 for util in ${utils[@]}; do
 	path="$(which $util)"
 	dir="$(dirname $path)"
 	mkdir -p "lower/$dir"
 	cp -v "$path" "lower/$path"
 	libs="$(ldd $path | egrep -o '(/usr)?/lib.*\.[0-9]')"
 	for lib in $libs; do
 		dir="$(dirname $lib)"
 		mkdir -p "lower/$dir"
 		cp -v "$lib" "lower/$lib"
 	done
 done

 mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work nodev root
 mount -t proc nodev root/proc
 chroot root bash -c "cat /proc/self/maps"
 umount root/proc
 umount root

With the patch here applied the regression is not reproducible.

/* References */
[1]: 37430e430a14 ("UBUNTU: SAUCE: shiftfs: enable overlayfs on shiftfs")
[2]: https://bugs.launchpad.net/bugs/1838677

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
2019-11-25 14:56:57 +01:00
Andy Whitcroft 5593f69b94 UBUNTU: SAUCE: overlayfs: ensure mounter privileges when reading directories
BugLink: https://launchpad.net/bugs/1793458

When reading directory contents ensure the mounter has permissions for
the operation over the constituent parts (lower and upper). Where we are
in a namespace this ensures that the mounter (root in that namespace)
has permissions over the files and directories, preventing exposure of
protected files and directory contents.

CVE-2018-6559

Signed-off-by: Andy Whitcroft <apw@canonical.com>
[tyhicks: make use of new upstream check in ovl_permission() for copy-ups]
[tyhicks: make use of creator (mounter) creds hanging off the super block]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
2019-11-25 14:56:35 +01:00
Seth Forshee 111cd1a984 UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs
The original mounter had CAP_SYS_ADMIN in the user namespace
where the mount happened, and the vfs has validated that the user
has permission to do the requested operation. This is sufficient
for allowing the kernel to write these specific xattrs, so we can
bypass the permission checks for these xattrs.

To support this, export __vfs_setxattr_noperm and add an similar
__vfs_removexattr_noperm which is also exported. Use these when
setting or removing trusted.overlayfs.* xattrs.

BugLink: http://bugs.launchpad.net/bugs/1531747
BugLink: http://bugs.launchpad.net/bugs/1534961
BugLink: http://bugs.launchpad.net/bugs/1535150
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
2019-11-25 14:56:29 +01:00
Seth Forshee f6ad5e07fd UBUNTU: SAUCE: overlayfs: Enable user namespace mounts
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
2019-11-25 14:56:27 +01:00