While trying to help someone troubleshoot a problem with AOLserver on Linux, I asked the simple question, “Did it drop a core file?” The answer was, “No, it didn’t.” That prompted the next question: “Did you set ulimit -c unlimited?” The affirmative answer surprised me: how is it not dropping a core file? Ahh, that’s when it occurred to me: Linux has had issues dumping core from multithreaded programs for a long time now. So, I decided to do some hacking on my own Debian Linux box running my own hand-compiled 2.6.7 kernel.
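For reference, the shell's ulimit -c setting corresponds to the RLIMIT_CORE resource limit, and a process can raise its own soft limit up to the hard limit. Here is a minimal sketch of that, assuming a POSIX system; the function name raise_core_limit is made up for illustration and is not something AOLserver provides.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft core-file size limit to the hard
 * limit, roughly what "ulimit -c unlimited" asks the shell to do when
 * the hard limit is already unlimited.
 * Returns 0 on success, -1 on error.
 */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit(RLIMIT_CORE)");
        return -1;
    }

    /* An unprivileged process may raise the soft limit only up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;

    if (setrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("setrlimit(RLIMIT_CORE)");
        return -1;
    }
    return 0;
}
```

If the hard limit is itself 0, this succeeds but still leaves core files disabled, so it is worth printing rl.rlim_max when a core file refuses to appear.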
Apparently, there are some kernel patches that implement multithreaded core dumps, and it looks like they were merged into the 2.5.47 kernel. So, why isn’t AOLserver writing a core file when it segfaults?
After making dinner and putting the kids to bed, I thought about it and wrote a little test program that starts a thread and then aborts in that thread. It wrote out a core file just fine! Then it dawned on me: AOLserver does a setuid to drop root privileges, and Linux, by default, won’t write a core file if the process has changed its uid/euid. Figuring I wasn’t the first person to want Linux to behave like every other Unix out there, I found my answer in the unofficial comp.os.linux.development.* FAQ, under “How can I make a suid executable dump core?” After adding the prctl(PR_SET_DUMPABLE) call in nsd/nsmain.c, I can now get AOLserver to dump a core file on Linux. Yay.
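The idea behind the fix can be sketched roughly like this. It is not the actual nsd/nsmain.c change: the function name drop_privs_keep_dumpable is made up for illustration, and the sketch assumes Linux with <sys/prctl.h> available. It drops to the given uid when running as root, then turns the dumpable flag back on and reports its value.

```c
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the real AOLserver code): drop root
 * privileges, then re-enable core dumps for the process.
 * Returns the dumpable flag afterwards (1 when dumps are enabled),
 * or -1 on error.
 */
int drop_privs_keep_dumpable(uid_t uid)
{
    /* Only root can switch uid; skip the call otherwise so the sketch
     * also runs unprivileged. */
    if (geteuid() == 0 && uid != 0 && setuid(uid) != 0) {
        perror("setuid");
        return -1;
    }

    /* Changing uid resets the dumpable flag, so set it again here. */
    if (prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) != 0) {
        perror("prctl(PR_SET_DUMPABLE)");
        return -1;
    }

    /* Read the flag back to confirm the kernel accepted it. */
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```

The order matters: the prctl call has to come after the setuid, since the uid change is what clears the flag in the first place.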
I’ve just filed SF Bug #1031599, attached the diffs, and committed the change.
While this doesn’t solve the original problem of why this person’s AOLserver is crashing, at least now I might be able to get a core file to look at …