Adding simple string functions to async-signal-safe list?

Discussion:

Matthew Dempsky

2013-05-07 02:26:21 UTC

Is it intentional that trivial methods like memcpy(), strlen(), and
some others from <string.h> are not marked as async-signal-safe? Are
there implementations that would be burdened by requiring that they be
async-signal-safe? Or optimizations that would be precluded by this?

Obviously not all of <string.h> could be async-signal-safe; e.g.,
strcoll(), strdup(), and/or strerror() to name a few obvious ones.

Matthew Dempsky

2013-05-07 03:18:56 UTC

Permalink

Hm, that's an interesting point. Though on the other hand GCC and
Clang on x86 and x86-64 optimize large struct assignments by emitting
a call to memcpy(), and large struct assignments should be
async-signal-safe.

Eric Blake

2013-05-07 03:45:05 UTC

Permalink

Post by Matthew Dempsky

This lkml thread has more details on gcc 4.3 vs. various kernels;
including some highlights I will quote here:

There is an industry-standard ABI document for x86-64 which proscribes
that DF will be clear on function entry (if memcpy() is always
forward-only, then DF is most likely to be set only during a memmove()
of overlapping memory where reverse directionality meets the contract of
memmove()). However, multiple OS's failed to follow the ABI (at least,
at the time this issue was raised; I'm not sure which, if any, of these
OS's have been patched in the meantime):

https://lkml.org/lkml/2008/3/5/343
"And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
they have the same behaviour as Linux, that is they don't clear DF
before calling the signal handler.
"I also tested Hurd, and it causes a kernel crash."

https://lkml.org/lkml/2008/3/5/518
"The issue is that the kernel is entered (due to a trap, interrupt or
whatever) and the state is saved. The kernel decides to revector
userspace to a signal handler. The kernel modifies the userspace state
to do so, but doesn't set DF=0.
"Upon return to userspace, the modified state kicks in. Thus the signal
handler is entered with DF from userspace at trap time, not DF=0.
"So it's an asynchronous state leak from one piece of userspace to another."

Therefore, it is still debatable whether this is a bug in the kernels
that must be fixed for leaking DF state into signal handlers, or a bug
in the compiler for making assumptions that DF=0 will be true on
function entry (even for signal handler functions), or a bug in the ABI
document for not matching existing practice, or merely the evidence that
simple string functions are not async-signal-safe and mandating them to
be safe would be too much of a burden on existing implementations, and
that signal handlers should use open-coded loops instead of any simple
libc string functions. But it is certainly food for thought.

Post by Matthew Dempsky
Hm, that's an interesting point. Though on the other hand GCC and
Clang on x86 and x86-64 optimize large struct assignments by emitting
a call to memcpy(), and large struct assignments should be
async-signal-safe.

Compilers implementing a large struct copy in a signal handler via
memcpy() under the hood are presumably safe, even if the signal handler
is started with DF=1, because they presumably proved the two structs are
non-overlapping, at which point directionality no longer affects the
correctness of the copy. But there are probably other cases where
directionality matters, and where leaking the wrong directionality
either into or out of a signal handler could cause the copying of wrong
data or even reading beyond array bounds. Still, it seems like if the
kernels could be made to obey the ABI document and guarantee a known DF
state on signal handler entry, and restore prior DF state on signal
handler exit, then the simple string functions would behave as if they
are async-signal-safe even if they mess with DF.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Rich Felker

2013-05-07 11:34:22 UTC

Permalink

Post by Eric Blake

This lkml thread has more details on gcc 4.3 vs. various kernels;
There is an industry-standard ABI document for x86-64 which proscribes
that DF will be clear on function entry (if memcpy() is always
forward-only, then DF is most likely to be set only during a memmove()
of overlapping memory where reverse directionality meets the contract of
memmove()). However, multiple OS's failed to follow the ABI (at least,
at the time this issue was raised; I'm not sure which, if any, of these
https://lkml.org/lkml/2008/3/5/343
"And for other kernels. I tested OpenBSD 4.1, FreeBSD 6.3, NetBSD 4.0,
they have the same behaviour as Linux, that is they don't clear DF
before calling the signal handler.
"I also tested Hurd, and it causes a kernel crash."
https://lkml.org/lkml/2008/3/5/518
"The issue is that the kernel is entered (due to a trap, interrupt or
whatever) and the state is saved. The kernel decides to revector
userspace to a signal handler. The kernel modifies the userspace state
to do so, but doesn't set DF=0.
"Upon return to userspace, the modified state kicks in. Thus the signal
handler is entered with DF from userspace at trap time, not DF=0.
"So it's an asynchronous state leak from one piece of userspace to another."
Therefore, it is still debatable whether this is a bug in the kernels
that must be fixed for leaking DF state into signal handlers, or a bug
in the compiler for making assumptions that DF=0 will be true on
function entry (even for signal handler functions), or a bug in the ABI
document for not matching existing practice, or merely the evidence that
simple string functions are not async-signal-safe and mandating them to
be safe would be too much of a burden on existing implementations, and
that signal handlers should use open-coded loops instead of any simple
libc string functions. But it is certainly food for thought.

Whatever this is, it's not evidence that the string functions should
not be specified as async-signal-safe. The fact that GCC automatically
generates either calls to memcpy or direct use of the x86 string
instructions without clearing DF means that the bug, whoever's
responsibility it is, affects programs regardless of whether they call
memcpy or other string functions from a signal handler.

My opinion is strongly in the direction of this being a kernel bug,
but I don't think that's relevant to the discussion at hand.

Rich

Joseph S. Myers

2013-05-07 15:53:02 UTC

Permalink

Post by Eric Blake
Therefore, it is still debatable whether this is a bug in the kernels
that must be fixed for leaking DF state into signal handlers, or a bug

It's a bug that was fixed five years ago for Linux (commit
e40cd10ccff3d9fbffd57b93780bee4b7b9bff51, release 2.6.25).

--
Joseph S. Myers
joseph-***@public.gmane.org

Matthew Dempsky

2013-05-07 17:53:06 UTC

Permalink

Okay, so looking at <string.h>, I think these functions should not be
required to be async-signal-safe:

Requires memory allocation: strdup, strndup
Requires locale: strcoll, strcoll_l, strerror, strerror_l,
strerror_r, strsignal, strxfrm, strxfrm_l
Uses static state: strtok

That leaves these <string.h> functions as candidates for being added
to the async-signal-safe set I think:

memccpy, memchr, memcmp, memcpy, memmove, memset, stpcpy, stpncpy,
strcat, strchr, strcmp, strcpy, strcspn, strlen, strncat, strncmp,
strncpy, strnlen, strpbrk, strchr, strspn, strstr, strtok_r

I think practically applications only care about a subset of these,
but the remaining ones shouldn't be any more difficult to make
async-signal-safe too IMO.

Since there seems to be at least some support for this, I'll file a
formal request in aardvark.

Post by Joseph S. Myers

Post by Eric Blake
Therefore, it is still debatable whether this is a bug in the kernels
that must be fixed for leaking DF state into signal handlers, or a bug

It's a bug that was fixed five years ago for Linux (commit
e40cd10ccff3d9fbffd57b93780bee4b7b9bff51, release 2.6.25).
--
Joseph S. Myers

Eric Blake

2013-05-07 18:02:32 UTC

Permalink

Post by Matthew Dempsky
Okay, so looking at <string.h>, I think these functions should not be
Requires memory allocation: strdup, strndup
Requires locale: strcoll, strcoll_l, strerror, strerror_l,
strerror_r, strsignal, strxfrm, strxfrm_l
Uses static state: strtok
That leaves these <string.h> functions as candidates for being added
memccpy, memchr, memcmp, memcpy, memmove, memset, stpcpy, stpncpy,
strcat, strchr, strcmp, strcpy, strcspn, strlen, strncat, strncmp,
strncpy, strnlen, strpbrk, strchr, strspn, strstr, strtok_r

The second strchr should be strrchr. This list looks fine to me.

Post by Matthew Dempsky
I think practically applications only care about a subset of these,
but the remaining ones shouldn't be any more difficult to make
async-signal-safe too IMO.
Since there seems to be at least some support for this, I'll file a
formal request in aardvark.

Thanks, looking forward to it.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Scott Lurndal

2013-05-07 20:45:10 UTC

Permalink

I think strerror[_l] is quite commonly used in signal handlers, I'd be
tempted to add it as well.

scott

Matthew Dempsky

2013-05-07 21:33:00 UTC

Permalink

Post by Scott Lurndal
I think strerror[_l] is quite commonly used in signal handlers, I'd be
tempted to add it as well.

strerror() depends on the thread's locale state, so it could be bad if
a signal handler interrupted a call to uselocale() or setlocale().

Also, on OpenBSD, strerror() depends on catopen() for native language
support, which I don't think we [OpenBSD] are interested in having to
make async-signal-safe either. It looks like FreeBSD and NetBSD have
similar code.

None of {Free,Net,Open}BSD have strerror_l() currently. But also,
it'd be pretty hard to get a useful locale_t value in a POSIX
conforming async signal handler. :)

Eric Blake

2013-05-07 21:40:10 UTC

Permalink

Post by Matthew Dempsky

Post by Scott Lurndal
I think strerror[_l] is quite commonly used in signal handlers, I'd be
tempted to add it as well.

strerror() depends on the thread's locale state, so it could be bad if
a signal handler interrupted a call to uselocale() or setlocale().

Furthermore, strerror() is permitted to use static storage, so it would
be bad if a signal handler interrupted a call to strerror(). There is
no way I would approve a change requesting any of the strerror (or
strsignal) functions to be made async-signal-safe; that is just asking
too much from implementation developers. Any robust application code is
better off storing the error number of interest, leaving signal handler
context, and then processing the information in a safe state.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Matthew Dempsky

2013-05-07 21:46:59 UTC

Permalink

Post by Eric Blake
Any robust application code is
better off storing the error number of interest, leaving signal handler
context, and then processing the information in a safe state.

Agreed.

Eric Blake

2013-05-07 03:06:52 UTC

Permalink

Post by Matthew Dempsky
Is it intentional that trivial methods like memcpy(), strlen(), and
some others from <string.h> are not marked as async-signal-safe? Are
there implementations that would be burdened by requiring that they be
async-signal-safe? Or optimizations that would be precluded by this?
Obviously not all of <string.h> could be async-signal-safe; e.g.,
strcoll(), strdup(), and/or strerror() to name a few obvious ones.

I _think_ (although I don't have hard evidence) that the reason that
memcpy() is NOT async-signal-safe is to make it possible for x86
implementations that use the rep instruction prefix without having to
always use cld/std, and making it possible to save thread state with
fewer instructions per task swap. For example, see this bug report
about what happens if a signal handler calls a string function that
messes with the direction flag in code outside the signal handler:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=469565

But yes, I would _love_ to see the simple string functions required to
be async-signal-safe instead of having to open-code my hand-rolled
variants for calling within a signal handler, just to ensure that if my
signal handler interrupts a long-running rep instruction, that I'm not
corrupting the directionality when the handler returns back into that
instruction. And failing that, I'd love to see more APPLICATION USAGE
commenting on exactly WHY such seemingly-simple functions are unsafe.

--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org