Discussion:
SA_RESTART specification unclear
Philip Guenther
2012-09-10 19:40:09 UTC
XSH then specifies the effect of the SA_RESTART flag in the
sigaction() page like this:

SA_RESTART This flag affects the behavior of interruptible functions;
    that is, those specified to fail with errno set to [EINTR]. If set,
    and a function specified as interruptible is interrupted by this
    signal, the function shall restart and shall not fail with [EINTR]
    unless otherwise specified. If an interruptible function which uses
    a timeout is restarted, the duration of the timeout following the
    restart is set to an unspecified value that does not exceed the
    original timeout value. If the flag is not set, interruptible
    functions interrupted by this signal shall fail with errno set to
    [EINTR].

Okay, so SA_RESTART handling is "opt out" for functions. How can I
determine whether a given function is opting out?
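
For readers who want to experiment, here is a minimal sketch of installing a handler with or without SA_RESTART. The names on_signal() and install_handler() are made up for illustration; only sigaction() and the flag itself come from the standard.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical no-op handler; it exists only so the signal is "caught". */
static void on_signal(int signo)
{
    (void)signo;
}

/*
 * Install on_signal() for signo. SA_RESTART is set when restart is nonzero.
 * Returns 0 on success, -1 on failure (errno is set by sigaction()).
 */
static int install_handler(int signo, int restart)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sa.sa_flags = restart ? SA_RESTART : 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

Calling install_handler(SIGUSR1, 1) asks for restart semantics; whether a particular blocking function honours that request is the open question here.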

The one clear text is the page for select() and pselect(), which says:

[EINTR] The function was interrupted before any of the selected events
    occurred and before the timeout interval expired.
    If SA_RESTART has been set for the interrupting signal, it is
    implementation-defined whether the function restarts or returns
    with [EINTR].

That's the only page which explicitly references SA_RESTART. Does
that mean it's the only one which is opting out?
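
Since the select() text makes the outcome implementation-defined, the only way to know what a given system does is to probe it. The following is a rough sketch of such a probe, not a conformance test: probe_select(), the enum, and the 100 ms / 1 s timings are all invented for illustration.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

static void on_alarm(int signo) { (void)signo; }

enum probe_result { PROBE_ERROR = -1, PROBE_RESTARTED = 0, PROBE_EINTR = 1 };

/*
 * Wait in select() for up to one second with no descriptors while a
 * SIGALRM, caught by an SA_RESTART handler, arrives after about 100 ms.
 */
static enum probe_result probe_select(void)
{
    struct sigaction sa;
    struct itimerval it;
    struct timeval tv = { 1, 0 };
    int r;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return PROBE_ERROR;

    memset(&it, 0, sizeof(it));
    it.it_value.tv_usec = 100000;       /* one-shot timer, 100 ms */
    if (setitimer(ITIMER_REAL, &it, NULL) == -1)
        return PROBE_ERROR;

    r = select(0, NULL, NULL, NULL, &tv);
    if (r == -1 && errno == EINTR)
        return PROBE_EINTR;             /* SA_RESTART was not honoured */
    if (r == 0)
        return PROBE_RESTARTED;         /* ran on until the timeout */
    return PROBE_ERROR;
}
```

Either PROBE_EINTR or PROBE_RESTARTED is permitted by the quoted text, so the result says something about the platform and nothing about conformance.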


Based on the manpages on at least Solaris, FreeBSD, and OpenBSD, I
would expect several other functions to be unaffected by SA_RESTART:
poll(), sigsuspend(), nanosleep(), and clock_nanosleep() are the
obvious ones. Testing shows that they are indeed unaffected on those
three systems as well as on Linux.
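
The kind of test described above can be sketched roughly as follows for nanosleep(). This is an illustration and not the actual test program used; the function name and the timings are assumptions.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

static void on_alarm(int signo) { (void)signo; }

/*
 * Sleep for two seconds with nanosleep() while a SIGALRM, caught by an
 * SA_RESTART handler, arrives after about 100 ms.
 * Returns 1 if nanosleep() failed with EINTR, 0 if it returned any other
 * way (for example because it was restarted), -1 on setup failure.
 */
static int nanosleep_ignores_sa_restart(void)
{
    struct sigaction sa;
    struct itimerval it;
    struct timespec req = { 2, 0 };
    int r, saved;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    memset(&it, 0, sizeof(it));
    it.it_value.tv_usec = 100000;       /* one-shot timer, 100 ms */
    if (setitimer(ITIMER_REAL, &it, NULL) == -1)
        return -1;

    r = nanosleep(&req, NULL);
    saved = errno;                      /* capture before anything else runs */
    return (r == -1 && saved == EINTR) ? 1 : 0;
}
```

The same skeleton works for poll(), sigsuspend(), and clock_nanosleep() by swapping the blocking call.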

Furthermore, on at least Solaris, FreeBSD, and OpenBSD, connect() and
open("/path/to/fifo") are also unaffected by SA_RESTART and return
EINTR. On Linux 2.6.18, those do appear to restart.

Finally, on Solaris the accept() call is also not restarted, while it
is on FreeBSD, OpenBSD, and Linux.


So, is there something specific in the specifications for poll(),
sigsuspend(), nanosleep(), and clock_nanosleep() that indicates that
they don't restart?

Should there be something in the specifications for open(), connect(),
and accept() that either permits or requires them to not restart?
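
Until the specification settles this, portable code cannot rely on SA_RESTART for these calls and typically retries by hand. A minimal sketch for open() follows; open_retry() is a hypothetical helper name, not a standard function.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>

/*
 * Call open() again for as long as it fails with EINTR, so the caller
 * sees the same result whether or not the system restarts open() itself.
 * Returns the descriptor, or -1 with errno set to a non-EINTR error.
 */
static int open_retry(const char *path, int flags)
{
    int fd;

    do {
        fd = open(path, flags);
    } while (fd == -1 && errno == EINTR);
    return fd;
}
```

The same loop is not a safe drop-in for connect(): an interrupted connect() continues to establish the connection asynchronously, so simply calling it again can fail with a different error.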



Philip Guenther
Philip Guenther
2012-09-11 03:23:29 UTC
On Mon, Sep 10, 2012 at 12:40 PM, Philip Guenther <guenther-***@public.gmane.org> wrote:
...
[The pselect/select page is] the only page which explicitly references SA_RESTART. Does
that mean it's the only one which is opting out?
...
So, is there something specific in the specifications for poll(),
sigsuspend(), nanosleep(), and clock_nanosleep() that indicates that
they don't restart?
For those pondering the above, how about the following wording from
the semop() page:
-------
-- The calling thread receives a signal that is to be caught. When
   this occurs, the value of semncnt associated with the specified
   semaphore shall be decremented, and the calling thread shall resume
   execution in the manner prescribed in sigaction().
-------

So what does "shall resume execution in the manner prescribed in
sigaction()" mean? After all, sigaction() is where SA_RESTART is
defined, with that generic wording about restart of functions that
would otherwise return EINTR! And yet, all platforms I've tried do
_not_ restart semop().

Oh, and that's the _only_ page that uses that "in the manner
prescribed by" wording...


Side information:
I've been poking around in the OpenBSD kernel on this and I currently
see the following standard syscalls as explicitly ignoring the
SA_RESTART flag (by mapping the internal ERESTART error value to
EINTR):
- sigsuspend()
- nanosleep() (clock_nanosleep() is not currently implemented by OpenBSD)
- select()
- poll()
- semop()
- connect()
- open()


Philip Guenther
Terry Lambert
2012-09-11 20:01:17 UTC
Post by Philip Guenther
...
[The pselect/select page is] the only page which explicitly references
SA_RESTART. Does
that mean it's the only one which is opting out?
...
So, is there something specific in the specifications for poll(),
sigsuspend(), nanosleep(), and clock_nanosleep() that indicates that
they don't restart?
For those pondering the above, how about the following wording from
the semop() page:
-------
-- The calling thread receives a signal that is to be caught. When
   this occurs, the value of semncnt associated with the specified
   semaphore shall be decremented, and the calling thread shall resume
   execution in the manner prescribed in sigaction().
-------
So what does "shall resume execution in the manner prescribed in
sigaction()" mean? After all, sigaction() is where SA_RESTART is
defined, with that generic wording about restart of functions that
would otherwise return EINTR! And yet, all platforms I've tried do
_not_ restart semop().
Oh, and that's the _only_ page that uses that "in the manner
prescribed by" wording...
I've been poking around in the OpenBSD kernel on this and I currently
see the following standard syscalls as explicitly ignoring the
SA_RESTART flag (by mapping the internal ERESTART error value to
EINTR):
- sigsuspend()
- nanosleep() (clock_nanosleep() is not currently implemented by OpenBSD)
- select()
- poll()
- semop()
- connect()
- open()
Apologies in advance for length...

I think it's possible to argue that at least two of those are
incorrect. I understand the desire to tighten up behaviors so you can
depend on them without more complicated code; however, this tends to
be a fuzzy area on purpose.

I believe we introduced SA_RESTART to sigaction() in order to be able
to support default BSD signal behavior on System V systems. The
bsd_signal() function (obsolete, but still in Issue 6) gives a user
space implementation for the function. In a BSD compatible libc on a
System V system (for example, Prime UNIX dual universe mode and
Solaris BSD compatibility mode), that function would just be called
signal() in order to keep older binaries happy. The default on BSD
systems was to always restart the system calls, and if you wanted to
actually interrupt something, you had to longjmp() from the handler to
do a stack unwind to avoid the restart.
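
For reference, a user-space implementation along the lines of the one given on the bsd_signal() page looks roughly like this. It is written from memory and renamed my_bsd_signal() so it cannot clash with a libc that still declares bsd_signal(); treat it as a sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/*
 * BSD-style signal(): the handler is installed with SA_RESTART and the
 * signal itself is blocked while its handler runs.
 * Returns the previous handler, or SIG_ERR on failure.
 */
static void (*my_bsd_signal(int sig, void (*func)(int)))(int)
{
    struct sigaction act, oact;

    act.sa_handler = func;
    act.sa_flags = SA_RESTART;
    sigemptyset(&act.sa_mask);
    sigaddset(&act.sa_mask, sig);
    if (sigaction(sig, &act, &oact) == -1)
        return SIG_ERR;
    return oact.sa_handler;
}
```

The point is that "BSD signal behavior" on a System V style system reduces to sigaction() with SA_RESTART, which is why the meaning of that flag matters so much.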

What follows from that is that the BSD select(), unless it's
implemented as a wrapper for poll() or a similar primitive, which
could indicate an interruption of an idempotent but not signal-atomic
operation in a user space implementation, should probably behave in
traditional BSD signal restart fashion (e.g. per BSD 4.1c or later,
following the introduction of select() in the first place).

Mac OS X and Tru64 UNIX, both of which use Mach ASTs to send signals
as delayed events for non-hard signals (i.e. NOT including signals
such as floating point exceptions, illegal instructions, segmentation
violations, or bus errors), tend to have the ability to cancel a signal
after it's been sent, but before it has been trampolined to user
space. For example, an explicit alarm meant to break out of a read()
can be cancelled if the read() actually completes before the signal
trampolines back out. Both OSs have permanent interpretations
permitting this behaviour as a signal system implementation detail.

I don't actually see any good reason for connect() to not be
restarted; at worst it'd give an EISCONN instead of an EINTR. You
could probably make a case for accept(), since it returns an fd, and
open(), at least when applied to devices where there could be a signal
level change or other side effect. The sigsuspend() case is kind of a
weird duck; at least in Tru64 and Mac OS X, there's no guarantee that
a given thread won't be delivered a signal unless you explicitly
mask the signals out of all the threads with sigprocmask() and put
them back in the explicit handler threads with pthread_sigmask().

-- Terry
Geoff Clare
2012-09-12 14:12:35 UTC
Post by Philip Guenther
-------
-- The calling thread receives a signal that is to be caught. When
   this occurs, the value of semncnt associated with the specified
   semaphore shall be decremented, and the calling thread shall resume
   execution in the manner prescribed in sigaction().
-------
So what does "shall resume execution in the manner prescribed in
sigaction()" mean? After all, sigaction() is where SA_RESTART is
defined, with that generic wording about restart of functions that
would otherwise return EINTR! And yet, all platforms I've tried do
_not_ restart semop().
I believe the intention of the quoted text is to require that when
a signal that is to be caught is received during a semop() call,
semop() behaves according to all the usual requirements specified
elsewhere in the standard, except that it decrements semncnt before
doing all that other stuff.
--
Geoff Clare <g.clare-7882/***@public.gmane.org>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England