atomic mkdir+open?

Discussion:

atomic mkdir+open?

Richard Hansen

2014-05-30 07:16:57 UTC

Hi all,

With Issue 7 it's possible to create a regular file and open it at the
~same time via open(O_CREAT). (Does the spec say that the file
creation and open is atomic? If so, it isn't immediately obvious.)

As far as I can tell, it's not possible to create and open a directory
atomically -- you must first call mkdir()/mkdirat() and then make a
separate open() call. Is that correct?

If so, doesn't this open up operations like 'mkdir -p' to race
conditions? (malicious code could sneak in between a mkdirat() and the
subsequent open())

Do any implementations allow you to atomically create and open a
directory as an extension to the spec, perhaps via
open(O_CREAT|O_DIRECTORY)? For example:

#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, const char *argv[])
{
int fd_parent = open(".", O_WRONLY|O_DIRECTORY);
for (int i = 1; i < argc; ++i) {
int fd_new = openat(fd_parent, argv[i],
O_WRONLY|O_CREAT|O_DIRECTORY, 0777);
close(fd_parent);
fd_parent = fd_new;
}
close(fd_parent);
return 0;
}

Thanks,
Richard

Matthew Dempsky

2014-05-30 09:19:58 UTC

Permalink

Post by Richard Hansen
With Issue 7 it's possible to create a regular file and open it at the
~same time via open(O_CREAT). (Does the spec say that the file
creation and open is atomic? If so, it isn't immediately obvious.)

If you look at the error cases, you'll see that the creation and
opening are implicitly guaranteed to be atomic. ENOENT can only be
returned if O_CREAT is not specified, or the path points to a
non-existing directory or empty string.

Post by Richard Hansen
As far as I can tell, it's not possible to create and open a directory
atomically -- you must first call mkdir()/mkdirat() and then make a
separate open() call. Is that correct?

Issue 7 simply says "the file shall be created" for O_CREAT, and
directories are a type of file... so technically it would be within
the wording of the spec for an implementation to create a directory
when open() is called with O_CREAT|O_DIRECTORY. But I expect the
intended wording is that O_CREAT shall create *regular* files, and I
also expect this is what most systems actually provide.

So I believe you're right: a portable POSIX application cannot
atomically create a directory and also open a file descriptor for that
directory.

Post by Richard Hansen
If so, doesn't this open up operations like 'mkdir -p' to race
conditions? (malicious code could sneak in between a mkdirat() and the
subsequent open())

Not sure I understand the concern? Even assuming "mkdir -p" had an
atomic create-directory-and-open system call to use, how do you make
sure malicious code doesn't rename the directories immediately after
mkdir finishes?

Anyway, the main nit I can see from trying to make O_CREAT|O_DIRECTORY
create a directory is that existing filesystem code probably only
supports exclusive mkdir(). In particular, it looks like NFS doesn't
support non-exclusive directory creation. So providing support for
O_CREAT|O_EXCL|O_DIRECTORY would probably be easier for existing
implementations than O_CREAT|O_DIRECTORY.

Richard Hansen

2014-05-30 17:58:51 UTC

Permalink

Post by Matthew Dempsky

I think I found a stronger argument that open(O_CREAT) is atomic. XSH
2.9.7 says:

All of the following functions shall be atomic with respect to each
other in the effects specified in POSIX.1-2008 when they operate on
regular files or symbolic links:

(many functions listed, including open())

If two threads each call one of these functions, each call shall
either see all of the specified effects of the other call, or none
of them.

Thanks,
Richard

Eric Blake

2014-05-30 15:31:00 UTC

Permalink

Post by Richard Hansen
Hi all,
With Issue 7 it's possible to create a regular file and open it at the
~same time via open(O_CREAT). (Does the spec say that the file
creation and open is atomic? If so, it isn't immediately obvious.)
As far as I can tell, it's not possible to create and open a directory
atomically -- you must first call mkdir()/mkdirat() and then make a
separate open() call. Is that correct?

Correct.

Post by Richard Hansen
If so, doesn't this open up operations like 'mkdir -p' to race
conditions? (malicious code could sneak in between a mkdirat() and the
subsequent open())

But what can that malicious code do? Simply create the directory with
sufficiently restricted permissions (perhaps with an intent to fchown it
to be more public afterwards), and you are guaranteed that the contents
of the directory can only be affected by you. An empty directory is
relatively state-free - if a malicious app renames it, you can easily
recreate the same state. And since permissions lock the malicious app
out from changing it to be a non-empty directory, and since you use
O_DIRECTORY to open the directory, you can be fairly sure that the
directory you just created is still empty (you can double-check with
fstat() that the permissions weren't changed in the meantime, if you are
paranoid).

Post by Richard Hansen
Do any implementations allow you to atomically create and open a
directory as an extension to the spec, perhaps via
open(O_CREAT|O_DIRECTORY)?

I know of no implementation with an extension to atomically create and
open a directory, but also no code that needs such an extension.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Richard Hansen

2014-05-30 17:04:17 UTC

Permalink

Post by Eric Blake

Post by Richard Hansen
If so, doesn't this open up operations like 'mkdir -p' to race
conditions? (malicious code could sneak in between a mkdirat() and the
subsequent open())

But what can that malicious code do?

I'm not really sure -- I haven't spent a lot of time thinking about it
and a quick search didn't bring up any discussion about exploiting the
time between mkdir() and open(). So maybe there's nothing. But "proof
by lack of an example" doesn't exactly inspire confidence. :) Lack of
discussion on the topic is also worrisome.

'cp -pr' is probably a better example than 'mkdir -p'. Imagine this
contrived scenario:

* dst is a directory that a malicious user has write access to
* src/foo is an empty directory owned by root with the S_ISUID bit set
* root runs 'cp -pr src/foo dst'

A normal sequence of events from 'cp' might look like this:

1. mkdir(dst/foo)
2. fd = open(dst/foo)
3. fchmod(fd, file mode bits from src/foo)
4. fchown(fd, uid/gid from src/foo)
5. futimens(fd, times from src/foo)
6. close(fd)

Now suppose the malicious user runs some code that manages to do the
following between steps 1 and 2:

1a. rename(dst/foo, dst/foo.moved)
1b. rename(path/to/malicious/executable, dst/foo)

Now the user has root access. The circumstances have to be *just* right
-- the malicious user must have write access to dst yet src/foo is owned
by root and u+s, which doesn't seem likely. The call to open() can be
given O_DIRECTORY|O_NOFOLLOW to ensure that the directory isn't swapped
out for an executable or symlink, but if the platform ascribes special
privileges to a directory's S_ISUID bit then this still may be
exploitable by moving in a directory containing specially-crafted files.

Maybe there's a mitigating factor that I'm not thinking of, but if this
contrived case is actually possible then I wouldn't be surprised if
there exists a reasonably exploitable case.

Post by Eric Blake
Simply create the directory with
sufficiently restricted permissions (perhaps with an intent to fchown it
to be more public afterwards), and you are guaranteed that the contents
of the directory can only be affected by you. An empty directory is
relatively state-free - if a malicious app renames it, you can easily
recreate the same state. And since permissions lock the malicious app
out from changing it to be a non-empty directory, and since you use
O_DIRECTORY to open the directory, you can be fairly sure that the
directory you just created is still empty (you can double-check with
fstat() that the permissions weren't changed in the meantime, if you are
paranoid).

True, fdopendir() and readdir() can be used to ensure that the newly
created directory is still empty, and maybe that's good enough to ensure
security. I doubt any implementations would be this careful. :)

Post by Eric Blake

Post by Richard Hansen
Do any implementations allow you to atomically create and open a
directory as an extension to the spec, perhaps via
open(O_CREAT|O_DIRECTORY)?

I know of no implementation with an extension to atomically create and
open a directory, but also no code that needs such an extension.

OK, thanks for the info!

-Richard

Rich Felker

2014-05-30 18:39:57 UTC

Permalink

Post by Richard Hansen
'cp -pr' is probably a better example than 'mkdir -p'. Imagine this
* dst is a directory that a malicious user has write access to
* src/foo is an empty directory owned by root with the S_ISUID bit set
* root runs 'cp -pr src/foo dst'
1. mkdir(dst/foo)
2. fd = open(dst/foo)
3. fchmod(fd, file mode bits from src/foo)
4. fchown(fd, uid/gid from src/foo)
5. futimens(fd, times from src/foo)
6. close(fd)
Now suppose the malicious user runs some code that manages to do the

You're missing step 2b: fstat(fd) to determine that it's a directory
(and perhaps that it has the right owner, etc.).

Post by Richard Hansen
1a. rename(dst/foo, dst/foo.moved)
1b. rename(path/to/malicious/executable, dst/foo)
Now the user has root access. The circumstances have to be *just* right
-- the malicious user must have write access to dst yet src/foo is owned
by root and u+s, which doesn't seem likely. The call to open() can be
given O_DIRECTORY|O_NOFOLLOW to ensure that the directory isn't swapped
out for an executable or symlink, but if the platform ascribes special
privileges to a directory's S_ISUID bit then this still may be
exploitable by moving in a directory containing specially-crafted files.

Indeed, O_DIRECTORY is another solution. But you probably still want
the fstat to determine that ownership is as expected.

I agree this is more complicated than would be ideal (especially if
you consider atypical permissions systems, which are
implementation-defined, whereby it might be possible to add an ACL to
a new directory giving yourself permissions to it then "gift" the
actual ownership of the directory back to the user who expects to own
it) so perhaps there is need for an improved interface...

Rich

Richard Hansen

2014-05-30 21:30:03 UTC

Permalink

Post by Rich Felker

Post by Richard Hansen
The call to open() can be
given O_DIRECTORY|O_NOFOLLOW to ensure that the directory isn't swapped
out for an executable or symlink, but if the platform ascribes special
privileges to a directory's S_ISUID bit then this still may be
exploitable by moving in a directory containing specially-crafted files.

Indeed, O_DIRECTORY is another solution. But you probably still want
the fstat to determine that ownership is as expected.

Yes, and to check that permissions are still as expected (in case the
directory was swapped out with another one with the same owner).

Post by Rich Felker
I agree this is more complicated than would be ideal (especially if
you consider atypical permissions systems, which are
implementation-defined, whereby it might be possible to add an ACL to
a new directory giving yourself permissions to it then "gift" the
actual ownership of the directory back to the user who expects to own
it)

Good point, I hadn't thought of that.

Post by Rich Felker
so perhaps there is need for an improved interface...

Sadly it sounds like no implementations have an improved interface so
standardization is premature at this point. (Indeed, some
implementations I use on a daily basis are still lagging far behind the
6-year-old Issue 7 interfaces.) Perhaps it would be worthwhile to file
a bug report to add a hint that we may want
open(O_DIRECTORY|O_CREAT|O_EXCL) to mean "do a mkdir() and open() in one
atomic action" in a future version of the standard, and see how many
implementations are willing to get on board.

-Richard