A Lurking Horror in Debugging

Mark Dalrymple

Jul 30, 2014

…and upon looking into the face of indescribable horror, a bug so unfathomably odd that it shook the foundations of all meager human beings, I was overcome by an indistinct feeling of dread and approximate horror previously unfamiliar to me.

— HP Nerdcraft

Last week I was teaching one of our Advanced iOS bootcamps. One of the really fun things about teaching courses like this is the opportunity for on-the-fly debugging, because our students can come up with really interesting problems. Most of the time, the debugging session turns into a double lesson of “here’s the problem and how to fix it” and (maybe more importantly) “here’s how we figured it out.”

One particular bug seemed innocuous enough. It started with, “I’m getting this crash, and I don’t know why.” The crash was 100% reproducible: “Start the app, do a pinch gesture, and boom.” These kinds of bugs tend to be easy: “Oh yeah, you forgot to conjugate the submarklar before invoking the datasource delegate.”

So we huddled over one computer while the student reproduced the problem. Here is the evidence in the debugger:

[Screenshot: stack trace]

Huh. Uhm. Huh.

Two things immediately jumped out at me. The first is a crash inside of compiler-synthesized code that backs this property:

@property (strong, nonatomic)
    id<UIViewControllerInteractiveTransitioning> interactiveTransition;

If you recall my Hierarchy of Potential Blame from Thoughts on Debugging, the compiler is at the bottom of my to-blame list. But here’s a crash in code emitted by the compiler. Either it’s bad, or something surrounding that value is bad.

And then there’s the address being passed to the method: 1. As in 0x1. As in 0x00000001.
That’s a strange address. It’s not nil, which would be all zeroes. It’s definitely not a valid object address either: a real one would have a much larger value and would be a multiple of 16, because objects are aligned on 16-byte boundaries. Maybe it’s a stray enum getting passed around or something.
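
That sniff test can be written down as a small heuristic. This is a minimal sketch in plain C (which also compiles as Objective-C). The function name plausible_object_address and the 0x1000 cutoff for an unmapped first page are assumptions made for this example, not anything provided by UIKit or the runtime.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical heuristic, not an Apple API: could this value plausibly be
   the address of a heap-allocated object? */
bool plausible_object_address(uintptr_t value) {
    if (value == 0) return false;        /* nil: there is no object at all */
    if (value % 16 != 0) return false;   /* objects sit on 16-byte boundaries */
    if (value < 0x1000) return false;    /* assumption: the first page is never mapped */
    return true;
}
```

A value of 0x1 fails the alignment check immediately, which is the same reasoning applied by eye to the stack trace.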

Hypothesis: -setInteractiveTransition: is being passed bogus values in general. What are the kinds of values that are being passed to this method? Maybe there’s a pattern that’s leading up to 0x00000001 coming through. An easy way to see this is to replace the compiler-generated setter with some caveman debugging:

- (void) setInteractiveTransition: (id<UIViewControllerInteractiveTransitioning>) transition {
    // Log every value that arrives, then store it as the synthesized setter would.
    NSLog (@"Got set a transition of %p", transition);
    _interactiveTransition = transition;
}

It printed out two settings to nil:

2014-07-29 19:05:12.225 FieldTech[22852:60b] Got set a transition of 0x0
2014-07-29 19:05:20.588 FieldTech[22852:60b] Got set a transition of 0x0

And then it died before entering into the function:

[Screenshot: crash at function entry]

Whoa. That’s weird. After some moments of contemplation it made sense, but that initial crash before hitting the log put me in vapor-lock for a bit. For safety, ARC is going to be retaining the pointer as it comes into the method. A quick disassembly around the site of the crash shows some memory management work:

(lldb) disassemble
   0x446a:  movl   %ecx, 0x4(%esp)
   0x446e:  movl   %eax, -0x18(%ebp)
   0x4471:  calll  0x5a8e          ; symbol stub for: objc_storeStrong
-> 0x4476:  movl   -0x18(%ebp), %eax
   0x4479:  leal   0x4c8f(%eax), %ecx

Is this useful data? Not really. It shows that ARC crashes on an insane address, whether it’s a compiler-emitted setter or my own code.
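
To see why a retain chokes on 0x1 but sails through nil, here is a toy model of a strong store in plain C. ToyObject, toy_retain, toy_release and toy_store_strong are hypothetical names for this sketch. The real objc_storeStrong lives in the Objective-C runtime and is more involved, but the general shape (retain the new value, store it, release the old one) is the same.

```c
#include <stddef.h>

/* Toy stand-in for an Objective-C object: nothing but a reference count. */
typedef struct {
    int retain_count;
} ToyObject;

/* Retaining has to touch the object's memory. A nil pointer is skipped,
   but a bogus non-nil address such as 0x1 would fault right here. */
ToyObject *toy_retain(ToyObject *obj) {
    if (obj != NULL) obj->retain_count++;
    return obj;
}

void toy_release(ToyObject *obj) {
    if (obj != NULL) obj->retain_count--;
}

/* Strong store: retain the incoming value, swap it in, release the old one. */
void toy_store_strong(ToyObject **location, ToyObject *new_value) {
    ToyObject *old_value = *location;
    if (new_value == old_value) return;
    toy_retain(new_value);
    *location = new_value;
    toy_release(old_value);
}
```

This matches what the log showed: the two nil values went through without complaint, and the first non-nil value that was not a real object brought everything down.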

So where is this coming from? The stack trace shows the call is coming from -[UINavigationController _startCustomTransition:]. That sounds like a reasonable place to start looking. There’s no available source code for UIKit, so time to take a look at a disassembly. Here I’m using the Hopper Disassembler, which can generate pseudocode.

It’s a pretty big method, but there’s this very interesting construct:

[Screenshot: Hopper pseudocode for the method]

It tests register r5, which was initialized from a call to _interactionController. If that value is non-zero, set register r2 to be 0x1 and then call setInteractiveTransition.

So 0x1 isn’t a bad address. That looks like a boolean value! Why would it be passing a boolean to one of our methods? Actually, why would it be calling one of our methods in the first place? Almost sounds like UINavigationController has its own interactiveTransition property.

Time for Class-dump! We looked at UINavigationController:

@interface UINavigationController : UIViewController
{
    UIView *_containerView;
    BOOL _interactiveTransition;
    // (other instance variables omitted)
}
@property(nonatomic, getter=isInteractiveTransition) BOOL interactiveTransition;
@end

Well, what do you know. There’s an undocumented property called interactiveTransition lurking in this class’ private underbelly, and it’s a BOOL. (That name doesn’t sound very BOOLy.) This is the cause of the problem.

The fact that there already exists a BOOL interactiveTransition is hidden from the compiler, so it couldn’t tell us “Hey, you’re overriding a method of a different type. You sure you want to do that?” Clang happily emitted code for the proper care and feeding of an Objective-C pointer, including memory management via ARC.
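
The collision is easier to see in a stripped-down model of dispatch by name. The sketch below is plain C with an invented method table. Imp, MethodEntry, ToyController and every function in it are hypothetical stand-ins, not the real Objective-C runtime. The point it illustrates is that lookup keys on the selector name alone, so a subclass method with the same name silently replaces the framework's, whatever its argument type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Every implementation receives its argument as one untyped machine word. */
typedef void (*Imp)(void *self, uintptr_t arg);

typedef struct {
    const char *selector;
    Imp imp;
} MethodEntry;

typedef struct {
    uintptr_t last_value_stored;
    int treated_as_pointer;
} ToyController;

/* What the framework wrote: the argument is a boolean flag. */
void framework_set_flag(void *self, uintptr_t arg) {
    ToyController *c = self;
    c->last_value_stored = (arg != 0);
    c->treated_as_pointer = 0;
}

/* What the subclass wrote: the argument is an object pointer. */
void subclass_set_object(void *self, uintptr_t arg) {
    ToyController *c = self;
    c->last_value_stored = arg;   /* real code would try to retain this "object" */
    c->treated_as_pointer = 1;
}

/* Later entries model subclass methods, so search from the end.
   Only the name is compared; argument types play no part. */
Imp toy_lookup(const MethodEntry *table, size_t count, const char *selector) {
    for (size_t i = count; i > 0; i--) {
        if (strcmp(table[i - 1].selector, selector) == 0) return table[i - 1].imp;
    }
    return NULL;
}

/* The framework's call site: it knows the name and sends a boolean. */
void framework_start_transition(ToyController *c, const MethodEntry *table,
                                size_t count, int has_interaction_controller) {
    Imp imp = toy_lookup(table, count, "setInteractiveTransition:");
    if (imp != NULL) imp(c, has_interaction_controller ? 1 : 0);
}
```

With only the framework entry in the table, the flag is stored as a flag. Add the subclass entry and the same call hands 0x1 to code that believes it is holding an object.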

UINavigationController, on the other hand, is happily passing BOOL values around. That nil we saw logged in setInteractiveTransition? That was actually a NO. nil and NO are both zero values, and are indistinguishable at runtime. That’s what led to this small wild-goose chase.
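
A few lines of plain C make the ambiguity concrete. The helper name flag_seen_as_pointer is made up for this sketch, and C's bool stands in for BOOL. It simply reinterprets a boolean the way a pointer-typed setter would receive it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reinterpret a boolean argument as the pointer a pointer-typed setter sees. */
void *flag_seen_as_pointer(bool flag) {
    return (void *)(uintptr_t)flag;
}
```

Passing false yields a null pointer, which is exactly what nil looks like, and passing true yields the address 0x1 from the stack trace.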

Renaming the property to something else fixed the problem.

The Takeaway

The tools from Leveling Up are very powerful, and are useful for more than just hacking the system. They give you information. Information while debugging is power. In particular, the Hopper Disassembler gave us a “huh?” moment that invited some exploratory class-dumping to see what was going on.

This bug also shows how dangerous Objective-C can be at times. There was no way for the compiler to know something was wrong so that it could warn us. Being a C derivative, the language assumes we know what we’re doing, and happily passes BOOL values where pointers are expected.

What about Swift?

After fixing this bug, I posted about it to one of our internal iOS chat channels. A fellow Nerd piped up: “I believe that Swift private would fix this. If you have parent/child classes (in different files) that both define private func foo(), they both continue to exist and cannot see each other. You can’t call super from the child; calls in the parent class go to the parent version, and calls in the child class go to the child one.” Score another one for Swift.

Mark Dalrymple

Author Big Nerd Ranch

MarkD is a long-time Unix and Mac developer, having worked at AOL, Google, and several start-ups over the years.  He’s the author of Advanced Mac OS X Programming: The Big Nerd Ranch Guide, over 100 blog posts for Big Nerd Ranch, and an occasional speaker at conferences. Believing in the power of community, he’s a co-founder of CocoaHeads, an international Mac and iPhone meetup, and runs the Pittsburgh PA chapter. In his spare time, he plays orchestral and swing band music.
