No pthread_sigaction() in POSIX ?

Discussion:

Xavier Roche

2011-05-17 19:36:33 UTC

Hi folks,

While playing with signals, I realized that threads and signals do not
perfectly fit together. Signal handlers (sigaction(), signal() ..) are
globally defined for a given process, and there is no way to define
locally (per thread) a handler when a thread receives a signal.

We have the ability to send a signal to a specific thread,
however (pthread_kill() ;
<http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_kill.html>),
but not the ability to handle it properly without overriding the global
signal handler.

Worse, if one thread wants to handle a signal cooperatively, overriding
the global handler, it has to do forbidden things (such as locking,
which is not async-signal-safe).

Would a function such as:

int pthread_sigaction(int signum, const struct sigaction *act,
struct sigaction *oldact);

.. be totally absurd ?

[ One big drawback is having to define what happens when both a global
handler and a thread-local handler are defined. A possible answer would be
to handle the case the way sigaction() vs. signal() is handled: the latest
call overrides the previous one; i.e., sigaction() overrides everything,
and pthread_sigaction() then overrides the handler for a specific thread. ]

Don Cragun

2011-05-17 20:28:20 UTC


Post by Xavier Roche
Hi folks,
While playing with signals, I realized that threads and signals do not perfectly fit together. Signal handlers (sigaction(), signal() ..) are globally defined for a given process, and there is no way to define locally (per thread) a handler when a thread receives a signal.
We have the ability to send a signal to a specific thread, however (pthread_kill() ; <http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_kill.html>), but not the ability to handle it properly without overriding the global signal handler.

Hi Xavier,

Note, however, that the APPLICATION USAGE section for pthread_kill()
points out that when the signal handler is called, it runs in the
context of the designated thread. Also note that when sigaction() is
used with SA_SIGINFO set and a signal is caught, the signal-catching
function is invoked as:
    void func(int signo, siginfo_t *info, void *context);
where the third argument refers to the receiving thread's context.

Isn't this sufficient?

- Don


Eric Blake

2011-05-17 20:29:14 UTC


The common solution to this is to use pthread_sigmask to block a signal
from all threads except for a dedicated signal-handler thread, so that
the global signal handler is guaranteed to use the signal-handling
thread's resources, at which point you can then design the signal
handler to hand information back to the handler thread, then use normal
locking within the rest of the handler thread to convey signal
information to other threads in a safe manner.

Post by Xavier Roche
Would a function such as:
int pthread_sigaction(int signum, const struct sigaction *act,
struct sigaction *oldact);
.. be totally absurd ?

POSIX is generally about specifying existing practice; you're better off
convincing an existing implementation to implement this function as an
extension to prove whether it is worthwhile, rather than trying to have
this list invent such an interface with no existing implementation practice.

--
Eric Blake eblake-H+wXaHxf7aLQT0dZR+***@public.gmane.org +1-801-349-2682
Libvirt virtualization library http://libvirt.org

Ersek, Laszlo

2011-05-17 21:01:59 UTC


Post by Xavier Roche
While playing with signals, I realized that threads and signals do not
perfectly fit together. Signal handlers (sigaction(), signal() ..) are
globally defined for a given process, and there is no way to define
locally (per thread) a handler when a thread receives a signal.

(I don't know the historical background.)

Since an asynchronously generated signal is delivered to exactly one,
unspecified, eligible thread, for me it's not straightforward to see how
different handlers could be useful. Say there are three eligible threads,
T1, T2, T3, with (fictional) separate handlers H1, H2, H3,
correspondingly, for a given signal. Since the choice of T1..T3 is
arbitrary at delivery time, the programmer would have to design each of
H1, H2, H3 so that any single one of them can handle the signal correctly
wrt. the application's purposes and internal state.

Multiple delivery is different (as in, when a signal is generated for the
process and then delivered, *all* eligible threads should run some
thread-specific handler code). This case (and, I believe, the full answer
to your question) is described in the SUSv4 sigwaitinfo() rationale:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/sigwaitinfo.html#tag_16_556_08

(In a single-threaded program, signal delivery is (should be?) usually
constrained to narrow portions of the code (with sigprocmask()). In a
multi-threaded program that handles non-RT async SIGTERM, SIGINT etc. as a
"necessary unpleasantness" (as opposed to using sigwaitinfo() with
realtime / queued signals for event processing), I think it is safest (and
least hard to program) to further constrain delivery to a single thread,
with pthread_sigmask() instead of sigprocmask().

This is just my opinion, of course.)

Post by Xavier Roche
We have the ability to send a signal to a specific thread, however
(pthread_kill() ;
<http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_kill.html>),
but not the ability to handle it properly without overriding the global
signal handler.

(The above link points to an older version of the standard, SUSv3.)

See the APPLICATION USAGE section:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_kill.html#tag_16_435_07

lacos

Butenhof, David

2011-05-18 15:05:43 UTC


There are two main factors involved here:

First, per-thread signal actions wreak havoc with the POSIX job control model. When you suspend, resume, ^C... where does the signal go?

It could be defined to always go to the "initial thread"; but there's not supposed to be anything much special about that thread (except the call stack, of course)... you'd be substantially limiting the ability to use that initial thread for normal processing. Furthermore, it means you couldn't terminate that thread and still have the process behave correctly. Or, we can say (as we do), that an external signal goes to some random thread... in which case the job control behavior becomes non-deterministic if threads have distinct signal actions. E.g., ^C might either terminate the process or repeat a prompt, depending on which thread got the signal. And that's presuming that the signal side effect still affects the process as a whole. If you're going for a consistent per-thread model, it won't; and that's even worse as you now need to ^C once for each thread in the process before the process will actually terminate.

Of course there are always ways around problems like this; most of them are complicated and error prone, and provide no real advantages.

Second, there were an enormous number of divergent and strongly held beliefs regarding how the POSIX signal model should behave with threads. We spent more time arguing about that than anything else. There seemed virtually no hope of getting consensus from the working group, never mind the balloting group. The advent of a "grand signal compromise", largely due to Nawaf Bitar, got everyone more or less together. And we discovered (repeatedly) that any attempt to stray from that path would dump us back into the swamp of despair.

The current integration of signals into threads introduces "quirks" only when you want to treat the process as a collection of independent entities; and there are always better ways to do this than with signals. I really don't see how breaking POSIX job control (or vastly complicating it) is a better alternative.

But, aside from all this, what purpose would there be to having distinct signal actions for threads? POSIX thread IDs aren't visible or usable outside the process -- nor, in my opinion, SHOULD they be, as threads are essentially transient and interchangeable engines (e.g., the workers in a thread pool). If you think you want to communicate with a particular thread within the process, you should probably instead be communicating with a particular functional module; and signals are not a good way to accomplish that. A condition variable, a semaphore, a message queue, etc.; those are all vastly more appropriate mechanisms.


Schwarz, Konrad

2011-05-19 15:26:35 UTC


Could this stuff be added to the rationale? The question pops up from time to time.

Thanks,

Konrad Schwarz
