Possible fix for the signal-handling problem in es-0.9-beta1.tar.gz
In article <cWdHfF5Et0@iadd.jivetech.com>,
Paul Haahr <haahr@jivetech.com> writes:
>> I have no idea if anyone besides me still runs es (or anyone cares to
>> maintain their copies),
> Well, I still run it.
Hi Paul,
[Since the es list appears to insert a rather large time delay, I have
explicitly CC'd parties that were known to be interested in this
problem back in January 2000.]
It is always great to hear from the author of the software that they
still run it. ;-)
> Hasn't needed much maintenance lately,
How true: The only reason I found the minor memory-handling bug is
that I upgraded my operating system recently. The new version
(FreeBSD 4.2) checks for that exact type of memory handling error and
many others (by default).
IIRC, I have hit this problem before, but the crash/failure was
not always so predictable. At least, I am quite sure I was never able
to pinpoint the errant code.
> but I've been meaning to fix the damn control-C on Linux bug for a
> long time.
OK, the signal-handling code for interactive use could still use some
minor tuning. ;-)
Of course, I could never do the amount of work on es that you did, but
as a small token of my gratitude, I have finally attempted to debug
this problem in earnest and produce a patch for it. It works for me
on FreeBSD, OSF1 and Solaris (sorry, can't test Linux at the moment).
If you are referring to the same long-standing bug that I am aware of
(and that I can reproduce on Linux, Solaris, FreeBSD and OSF1), then
it is not Linux-specific. I assume that Linux also configures to
HAVE_SIGACTION=1 (even if it doesn't, I have also found an unrelated
regression that affects platforms that configure to HAVE_SIGACTION=0,
SYSV_SIGNALS=1). If so, then you may enjoy this analysis and possible
patch.
The patch below came out of manual inspection tonight, once I had a
handle on the failure mode. I found that the file-scoped variable
``blocked'' in signal.c stayed non-zero forever after the first
interrupt was processed inside a shell-level while loop. Clearly, then,
some code path makes unbalanced calls to blocksignals()/unblocksignals().
I hope I am not just trading the obvious fix of the call pairing for a
race condition in some other case (I figure that you can judge this far
faster than I).
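As a rough illustration of the imbalance described above, here is a
minimal model in C. It is only a sketch: it assumes ``blocked'' behaves
as a plain nesting counter, and handler_exit_depth() and its two
parameters are hypothetical names invented for this example, not code
from es.

```c
#include <stdbool.h>

/* Stand-in for the file-scoped counter in signal.c (assumed to be a
 * simple nesting depth; the real code may do more work here). */
static int blocked = 0;

static void blocksignals(void)   { blocked++; }
static void unblocksignals(void) { blocked--; }

/*
 * Hypothetical model of one pass through a catcher that was entered
 * with signals blocked. Returns the nesting depth left behind.
 *   is_retry: the catcher threw "retry"
 *   patched:  the rethrow path also calls unblocksignals()
 */
static int handler_exit_depth(bool is_retry, bool patched) {
    blocked = 0;
    blocksignals();               /* entering the catcher */
    if (is_retry) {
        unblocksignals();         /* retry path: already balanced */
    } else {
        if (patched)
            unblocksignals();     /* rethrow path: balanced only when patched */
        /* the real code would rethrow the exception at this point */
    }
    return blocked;
}
```

In this model, handler_exit_depth(false, false) leaves the counter at 1,
which corresponds to signals staying blocked after the first interrupt,
while handler_exit_depth(false, true) returns it to 0.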
Although perhaps benign whenever HAVE_SIGACTION=1, I also removed the
gratuitous signal() call in catcher() when SYSV_SIGNALS=0 and,
conversely, added the needed signal() call in catcher() when
SYSV_SIGNALS=1. According to my local CVS archive of es, someone
clearly hosed this between ES-0_9-ALPHA1 and ES-0_9-BETA1.
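To show why the handler has to re-install itself under SysV-style
signals, here is a small standalone sketch. It assumes a POSIX system
and uses sigaction() with SA_RESETHAND to imitate the one-shot SysV
behaviour deterministically; install_oneshot(),
still_caught_after_delivery() and the ``rearm'' flag are hypothetical
names for this example and do not appear in es.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* Set before raising the signal; read inside the handler. */
static volatile sig_atomic_t rearm = 0;

static void catcher(int sig);

/* Install catcher so that the disposition resets to SIG_DFL on delivery,
 * imitating unreliable (SysV-style) signal() semantics. */
static void install_oneshot(int sig) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = catcher;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;
    sigaction(sig, &sa, NULL);    /* sigaction() is async-signal-safe */
}

static void catcher(int sig) {
    if (rearm)
        install_oneshot(sig);     /* the re-arm step needed for one-shot handlers */
}

/* Deliver sig once and report whether catcher is still installed afterwards. */
static bool still_caught_after_delivery(int sig, bool rearm_in_handler) {
    struct sigaction cur, dfl;
    rearm = rearm_in_handler;
    install_oneshot(sig);
    raise(sig);                   /* handler has run by the time raise() returns */
    int rc = sigaction(sig, NULL, &cur);
    assert(rc == 0);
    bool caught = (cur.sa_handler == catcher);
    memset(&dfl, 0, sizeof dfl);
    dfl.sa_handler = SIG_DFL;     /* restore the default disposition */
    sigemptyset(&dfl.sa_mask);
    sigaction(sig, &dfl, NULL);
    return caught;
}
```

Without the re-arm, the second delivery of the same signal would get
the default action, which for SIGINT means the shell is killed instead
of interrupted.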
Regards,
Loren
*** prim-ctl.c.orig Fri Apr 11 15:54:34 1997
--- prim-ctl.c Wed Dec 13 00:55:27 2000
***************
*** 77,84 ****
if (termeq(fromcatcher->term, "retry")) {
retry = TRUE;
unblocksignals();
! } else
throw(fromcatcher);
EndExceptionHandler
EndExceptionHandler
--- 77,86 ----
if (termeq(fromcatcher->term, "retry")) {
retry = TRUE;
unblocksignals();
! } else {
! unblocksignals();
throw(fromcatcher);
+ }
EndExceptionHandler
EndExceptionHandler
*** signal.c.orig Fri Apr 11 15:54:37 1997
--- signal.c Wed Dec 13 00:59:29 2000
***************
*** 68,74 ****
/* catcher -- catch (and defer) a signal from the kernel */
static void catcher(int sig) {
! #if !SYSV_SIGNALS /* only do this for unreliable signals */
signal(sig, catcher);
#endif
if (hasforked)
--- 68,74 ----
/* catcher -- catch (and defer) a signal from the kernel */
static void catcher(int sig) {
! #if SYSV_SIGNALS /* only do this for unreliable signals */
signal(sig, catcher);
#endif
if (hasforked)