Apple’s Leopard Server appears to have a major bug that is causing havoc for those users unlucky enough to see it occur on a daily basis.

Basically, DirectoryService is crashing on users of Leopard Server. This in itself isn’t a big problem – Leopard Server restarts DirectoryService if it fails. The problem is with AppleFileServer. AppleFileServer seems to lose the ability to authenticate users with the new instance of DirectoryService. This means Time Machine backups stop working, and users can’t mount the server using AFP.

Currently, as of 10.5.2, this has still not been fixed. Apple have apparently given users suggestions such as sending a HUP signal to AppleFileServer at regular intervals to get things back on track, but in my limited testing, this doesn’t work – it leaves the user able to mount only certain shares.

The only reliable solution I’ve found is to restart AFP. Obviously we don’t want to do that all the time – it should only be done when DirectoryService crashes. To do this, I’ve built the following launchd daemon. Basically, it works as follows:

The daemon watches the /Library/Logs/CrashReporter directory and waits until a crash occurs. When one does, it runs a script that checks whether the crash was a DirectoryService crash, moves the matching crash files into a sub-directory, and restarts AFP.
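To make those steps concrete, here is a minimal sketch of what such a handler script could look like. It is not the script from the download: the function names and the `CRASH_DIR` and `RESTART_CMD` overrides are made up for illustration, and it assumes crash reports are named `DirectoryService*.crash`, as in the test trigger used in the install steps.

```shell
#!/bin/sh
# Hypothetical sketch only; this is NOT the script from the download.
# CRASH_DIR and RESTART_CMD are made-up overrides for dry runs.

restart_afp() {
    # The serveradmin calls the post mentions for restarting AFP.
    serveradmin stop afp
    serveradmin start afp
}

handle_ds_crashes() {
    crash_dir="${CRASH_DIR:-/Library/Logs/CrashReporter}"
    archive_dir="$crash_dir/DirectoryService"
    found=0

    mkdir -p "$archive_dir"
    for f in "$crash_dir"/DirectoryService*.crash; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        # Move the report away so the same crash is not handled twice.
        mv "$f" "$archive_dir"/ && found=1
    done

    if [ "$found" -eq 1 ]; then
        logger "DirectoryService crash found, restarting AFP"
        ${RESTART_CMD:-restart_afp}
    fi
}
```

A script run by launchd would end with a call to `handle_ds_crashes`. Setting `CRASH_DIR` to a scratch folder and `RESTART_CMD` to something harmless lets you try the logic without touching a live server.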

It’s not a great solution: anyone connected at the time gets booted and has to remount. I’m experimenting with other fixes people have listed that don’t require AFP to be restarted, but so far they don’t seem to work consistently.

You can download the daemon here.

Once downloaded, uncompress the file (double click in Finder). Safari may unzip it automatically for you.

From a Terminal, cd to the unzipped folder (if downloaded from Safari, by default it will be in ~/Downloads).

cd ~/Downloads/restartafp

Now, type the following. Note that you need to be logged in as an administrator, and you will be asked for the administrator password in order to do the first operation.

sudo mkdir \
/Library/Logs/CrashReporter/DirectoryService
sudo cp com.curmi.restartafp.plist \
/Library/LaunchDaemons/
sudo cp restartafpondscrash.sh \
/usr/local/bin/
sudo chmod a+x \
/usr/local/bin/restartafpondscrash.sh
sudo launchctl load \
/Library/LaunchDaemons/com.curmi.restartafp.plist

To test that this is working, run the following, then check /var/log/system.log to see if it mentions the restart.

sudo touch \
/Library/Logs/CrashReporter/DirectoryService_trigger.crash
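If you’d rather not scroll through the whole log, something like the following can pull out the relevant lines. This is only a sketch: the function name is made up, and it assumes the restart leaves a line containing "afp" in the log, which may not match what your server actually writes.

```shell
# Print the last few lines of a log that mention AFP (case-insensitive).
# Assumption: the restart leaves a line containing "afp" in the log.
recent_afp_lines() {
    log_file="${1:-/var/log/system.log}"
    grep -i "afp" "$log_file" | tail -n 5
}
```

Run `recent_afp_lines` a few seconds after the touch command, or pass a different log file as the first argument.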

If for some reason you want to uninstall, run the following commands from the Terminal.

sudo launchctl unload \
/Library/LaunchDaemons/com.curmi.restartafp.plist
sudo rm \
/Library/LaunchDaemons/com.curmi.restartafp.plist
sudo rm /usr/local/bin/restartafpondscrash.sh
sudo mv \
/Library/Logs/CrashReporter/DirectoryService/* \
/Library/Logs/CrashReporter
sudo rmdir \
/Library/Logs/CrashReporter/DirectoryService

I hope this is useful to others out there. I’ve filed this bug with Apple (radar bug report number 5836741), and I hope Apple fixes it soon.

The fix I’m currently trialling was listed here, and suggests that rather than doing:

serveradmin stop afp
serveradmin start afp

in the script, we do:

serveradmin settings afp:authenticationMode = "standard"
serveradmin settings afp:authenticationMode = "standard_and_kerberos"
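Wrapped up as a function, the swap might look like the sketch below. The function name and the `SERVERADMIN` override are made up for illustration (the override lets you point it at a stub for a dry run); the two settings lines are the ones quoted above.

```shell
# Hypothetical sketch: toggle the AFP authentication mode instead of
# restarting AFP. SERVERADMIN is a made-up override for dry runs.
toggle_afp_auth() {
    sa="${SERVERADMIN:-serveradmin}"
    # Only switch back if the first change succeeded.
    "$sa" settings afp:authenticationMode = "standard" &&
    "$sa" settings afp:authenticationMode = "standard_and_kerberos"
}
```

The idea is that connected users are not disconnected, since AFP itself is never stopped.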

I’ve tested this and it didn’t seem to work, but I’ll try it again just in case. I’m sure DirectoryService will crash on us sometime tomorrow to confirm if the fix works or not.

Leopard Server and DirectoryService crashes

22 thoughts on “Leopard Server and DirectoryService crashes”

  • April 12, 2008 at 6:03 am

    Howdy, I’m going to test this too. I’ve had the same problem, and I don’t want to just cron an AFP restart in the wee hours because that’s when Time Machine backups tend to run for my clients. Let us all know what’s up with your workaround.

    -Sky

  • April 12, 2008 at 8:51 am

    Hey Sky. I tested the afp:authenticationMode workaround. The first day I did this, it worked fine; however, my machine locked up. I’m not sure of the cause, and since I hadn’t applied the latest firmware, I applied it and restarted. It has been going now for 2 days with the new workaround, and seems to be working well.

    One thing I do notice though, is with the new workaround, DirectoryService crashes more often. In some cases once an hour, 6-7 times a day. We really desperately need a fix for this from Apple.

  • April 17, 2008 at 2:12 am

    Thanks! This seems to have lessened the screaming here. Here it seems the number of DirectoryService crashes has gone down; sometimes I go the whole day without seeing one. Watched pot never boils? Dead cat in a box? Who knows.

  • April 18, 2008 at 6:18 am

    Just a follow-up…it’s been a couple of days now with the AFP restart script in place. We were going down at least a few times per day. Ironically, since putting in this temporary fix, the DirectoryService crashes have disappeared. The only AFP restarts have been caused by: sudo touch /Library/Logs/CrashReporter/DirectoryService_trigger.crash

    Go figure…

  • April 18, 2008 at 3:59 pm

    The authenticationMode workaround does appear to be working for us. We’ve had no problems all week, even though logs show DirectoryService crashing at least 6 times a day for us.

    I’d love to know what picciano’s secret is though as we still see the crashes as I mentioned.

  • April 19, 2008 at 8:45 am

    Hi curmi,

    Thanks so much for the workaround as I’ve been getting the same AFP disconnects on two Leopard servers.

    I ran the Terminal commands as directed, but they just created several new empty folders in ~/Downloads (a+x, chmod, etc.). Library > LaunchDaemons is still empty.

    Do I need to manually place the com.curmi.restartafp.plist file somewhere?

    Note this is on a 30 min. old virgin install with no attempted AFP connections or crashes.

    Here’s what Terminal said after entering the admin password:

    mkdir: 
/Library/Logs/CrashReporter: No such file or directory
    mkdir: com.curmi.restartafp.plist: File exists
    mkdir: 
/Library/LaunchDaemons: No such file or directory
    mkdir: cp: File exists
    mkdir: restartafpondscrash.sh: File exists
    mkdir: 
/usr/local/bin: No such file or directory
    mkdir: 
/usr/local/bin: No such file or directory
    mkdir: 
/Library/LaunchDaemons: No such file or directory
    domain:restartafp servername$

  • April 19, 2008 at 8:54 am

    Macintosah, you seem to have entered the commands incorrectly.

    You have to type them exactly as I wrote them, one line at a time. The \ can be removed at the end of lines if you join the next line to the first.

    The first command would then be:

    sudo mkdir /Library/Logs/CrashReporter/DirectoryService

    All on one line, then you press return.

    Then you do the next. And so on.

  • April 19, 2008 at 10:57 am

    Thanks, Curmi!

    [Noob exits, pursued by a bear.]

  • April 19, 2008 at 1:40 pm

    I am new to Leopard Server and I reinstalled it 3 times thinking this issue was my fault. This server is useless with this type of issue. This needs to be fixed or I demand a refund :)

  • April 19, 2008 at 3:08 pm

    Hey James, make sure Apple know your dissatisfaction, otherwise we’ll never get them to fix this issue.

  • April 19, 2008 at 10:08 pm

    Love the blog. It’s now added to my RSS reader! What is the best way to complain to Apple? Should I call support and let them have it, or should I submit something on Apple’s site?

  • April 19, 2008 at 11:20 pm

    Thanks James. You can either call support, or actually log a bug in bugreporter (http://bugreport.apple.com/). The second method is good because it seems they look at them eventually, and occasionally give you feedback. However, I’ve logged many bugs over the years, and they ignore a lot that they consider trivial. :-)

  • April 20, 2008 at 1:22 am

    I think this issue might be messing up my VPN access as well, because it seems to hang on authentication a lot and then fail to connect.

    But with a fresh reboot the VPN works without any issues.

  • April 23, 2008 at 6:35 am

    Hi Curmi,

    So sorry to keep spamming your comments with my Terminal noobiness, but I’m still getting errors trying to install your fix. I’m pretty sure I’m typing everything correctly, but I’m getting a “No such file or directory” response on “restartafpondscrash.sh”.

    The one Leopard Server box I’ve tried this fix on is still getting AFP crashes, so I must be doing something wrong. Here is the terminal session:

    Leopard-Server:~ admin$ cd ~/Downloads/restartafp
    Leopard-Server:restartafp admin$ sudo mkdir \
    > /Library/Logs/CrashReporter/DirectoryService
    Password:
    Leopard-Server:restartafp admin$ sudo cp com.curmi.restartafp.plist \
    > /Library/LaunchDaemons
    Leopard-Server:restartafp admin$ sudo cp restartafpondscrash.sh \
    > sudo cp restartafpondscrash.sh \
    > /usr/local/bin
    usage: cp [-R [-H | -L | -P]] [-fi | -n] [-pvX] source_file target_file
    cp [-R [-H | -L | -P]] [-fi | -n] [-pvX] source_file … target_directory
    Leopard-Server:restartafp admin$ sudo chmod a+x \
    > /usr/local/bin/restartafpondscrash.sh
    chmod: /usr/local/bin/restartafpondscrash.sh: No such file or directory
    Leopard-Server:restartafp admin$ sudo launchctl load \
    > /Library/LaunchDaemons/com.curmi.restartafp.plist
    Leopard-Server:restartafp admin$

  • April 23, 2008 at 9:38 pm

    Macintosah, copy and paste my commands one word at a time. You are copying and pasting whole slabs, and somehow that picks up an extra character after the \ character.

    Go through the commands carefully. Do one word at a time.

  • April 24, 2008 at 6:02 am

    Curmi,

    We’re cross posting now at the Apple support forum thread, (I’m TheProfessor there) so I promise this is the last time I’ll post to your blog comments. I’m just really desperate to have a solution to this. My client’s Leopard server is crashing hourly and telling them OS X Server doesn’t work right just isn’t an option.

    I think I found the problem. I retyped everything by hand with absolutely no errors, but I still get an error at the chmod prompt:

    ServerName:restartafp server$ sudo chmod a+x /usr/local/bin/restartafpondscrash.sh
    chmod: /usr/local/bin/restartafpondscrash.sh: Not a directory

    Turns out my bin folder at /usr/local/bin shows as a Unix executable, not as a folder. How do I get it to be a directory?

  • April 24, 2008 at 4:52 pm

    Contact me in email and we’ll resolve it outside the blog.

  • May 12, 2008 at 11:10 am

    Hi
    Thanks for the blog. We call this the period 2 bug, because it arises when 3 or 4 classes log on at period 2. Most have logged on and the stragglers cannot log on. We tell everyone to leave the keyboards alone and restart the server. This fixes the log-on problem, and the existing users can continue working without losing their connection to the server, which is great. Your fix, however, seems to kill all the students’ connections to the server; they lose all their work and have to log back in. I accidentally did this by following your instructions to the letter, and the last one killed all users on the server.
    What is it about restarting the server that is different from restarting AFP in your script?
    Could your script be changed to automatically restart the server, so that maybe the users would not notice a thing?
    Any thoughts?
    Apple must fix this!!!!

  • May 16, 2008 at 3:56 am

    A bit past the point where the issue was raised, but Macintosah’s comment raises an issue that the basic instructions seem to ignore. A straight Leopard Server setup has a /usr/local directory, but /usr/local/bin does not exist until you create it. The install instructions should have the line:

    sudo mkdir /usr/local/bin

    before the line that copies the restart script into /usr/local/bin (or else the script will actually be named /usr/local/bin and not /usr/local/bin/restartafpondscrash.sh), with a short comment that an error message of “/usr/local/bin: File exists” after this command is run can be ignored.

  • May 30, 2008 at 11:31 pm

    Apple has released 10.5.3 for the Client and Server.

    Apple’s support page says:

    “Addresses an issue that could cause the AppleFileServer process to stop accepting connections while consuming most of the available CPU time on the server.

    Addresses an issue that could cause the Apple File Service refuse new connections after DirectoryService becomes unresponsive, and improves stability of DirectoryService.”

    This should resolve the issue.

  • April 6, 2009 at 5:29 pm

    Has this been fixed? I am experiencing this issue on 10.5.6 Server, which is connected to a directory system: a Windows Server 2003 AD environment/domain.

    thx.

  • April 6, 2009 at 5:38 pm

    Yes it seems to be fixed for us.
