[This site will look better in browsers that support web standards, but should be legible in all.]

Sunday, 05/18/08

I left the laptop at home on Friday, where its last normal Time Machine backup was at 5:30 PM. At 6:30 something odd happened; that was around the time I got home but I'm not sure what it was. Friday night the machine kernel-panicked a couple of times, which is unusual. As Google had nothing useful to contribute, here's the panic log, with the marginally interesting bits highlighted:

Fri May 16 22:51:59 2008
panic(cpu 0 caller 0x0031E569): "hfs_getnewvnode: bad ca_blocks (too small)"@/SourceCache/xnu/xnu-1228.4.31/bsd/hfs/hfs_cnode.c:660
Backtrace, Format - Frame : Return Address (4 potential args on stack) 
0x52bef5a8 : 0x12b0f7 (0x4581f4 0x52bef5dc 0x133230 0x0) 
0x52bef5f8 : 0x31e569 (0x484f4c 0x65b52ec 0x14 0xf2d92) 
0x52bef698 : 0x32f7d4 (0x10802804 0x0 0x52bef788 0x52bef7b0) 
0x52bef7e8 : 0x32facf (0x10802804 0x66d10b 0x52bef93c 0x1) 
0x52bef848 : 0x1f66f4 (0x5c15260 0x66d10b 0x0 0x52bef93c) 
0x52bef898 : 0x1d524a (0x5c15260 0x66d10b 0x0 0x52bef93c) 
0x52bef958 : 0x1c3708 (0x52befa00 0x1040ec9c 0x52bef9c8 0x19d8da) 
0x52beff78 : 0x3dcf13 (0x5ae3c50 0x779bc20 0x779bc64 0x6d498c4) 
0x52beffc8 : 0x19f1c3 (0x6456a60 0x0 0x1a20b5 0x851a7d8) 
No mapping exists for frame pointer
Backtrace terminated-invalid frame pointer 0xb0101218

BSD process name corresponding to current thread: backupd

Mac OS version:
9C7010

Kernel version:
Darwin Kernel Version 9.2.2: Tue Mar  4 21:17:34 PST 2008; root:xnu-1228.4.31~1/RELEASE_I386
System model name: MacBookPro2,2 (Mac-F42187C8)

By the next morning it was clear that the machine was hanging on every backup cycle. It wasn't quite hourly, because fsevents had gotten out of sync and it had to perform a full-disk scan each time, which knocked it off schedule. At some point during each backup cycle (whether during the scan or the copy, I don't know) it was invariably panicking with the above signature, which is pretty much the worst thing a frequently-recurring data-saving device can reasonably do, short of actively deleting things.

I had the bright idea to remove, in Time Machine, that most recent 5:30 PM backup, but that also caused a panic. I noticed the following morning that a 6:30 PM backup with the ".inprogress" filename extension was still hanging around in my Time Machine backups directory, which the Finder allowed me to delete, however, doing so had no effect on the panics. I was able to use the Time-Machine tweaked version of GrandPerspective on the 5:30 backup, which ran without complaint and showed it to be wholly unexceptional. Not very interesting.

Sadly, I have no concrete resolution to report to anyone else who's seeing this irritating condition. I renamed the TM sparse image (I'm using a Time Capsule), which smoothly caused TM to recreate it and start from scratch with a new initial backup, which I re-ran over gigabit last night. It's since done five incremental backups, which at least suggests the problem was in the backup, not on my disk, and while that means I won't lose the whole weekend to this, it also means the problem is likely more prevalent.

If anyone has any bright ideas to try, I've still got the afflicted backup image, and next week when I'm over some deadlines I plan to give it another go, using tms and more concerted efforts to delete the trailing backup or two.

Belated update! Matt Deatherage did the responsible thing, located the error in the Darwin source, and suggested running Disk Utility's repair command on the mounted sparseimage. DU cranked away over wi-fi for just over eight hours (the image was about 140 MB at that point), but when it was done, the panics ceased. If it happens to you, I suggest performing that surgery over gigabit, which may shave off some time. Thanks Matt! 09:36AM «


Bits pushed by Movable Type