Swift Regex Deep Dive
iOS MacOur introductory guide to Swift Regex. Learn regular expressions in Swift including RegexBuilder examples and strongly-typed captures.
As you’ve seen before, DTrace can give you a huge amount of visibility into the guts of your computer. You saw the syscall
provider which lets you attach probes to system calls, those functions that let you take advantage of services from the kernel. You also saw the pid
provider which lets you look inside individual processes and trace the function activity inside of there.
This time around is the objc
provider, the one that started this whole ride. It lets you trace Objective-C methods inside of an individual process. A typical probe description for the objc provider looks like this:
objc2342:NSImage:-dealloc:entry
Which says, for process id 2342, for objects of the NSImage
class, whenever -dealloc
is entered, evaluate some D code.
The parts of the probe description of the objc
provider are:
probeprov
: The provider – objc + the process ID
probemod
: A class name, such as NSWindow
, or class name and a category, like NSWindow(NSConstraintBasedLayout)
probefunc
: A method name / selector
probename
: entry
, return
, or an instruction offset, like the pid
provider
The same process ID annoyances happen here like they do with the pid
provider. You can provide an explicit process ID in the probe description, pass an argument and refer to it with $1
, or you can use $target
coupled with the -c
option to tell dtrace
to run or examine a program. Don’t forget you can dig into Cocoa application bundles to get to the crunchy executable inside. Also don’t forget that any parts of the probe description that are omitted act as wildcards, along with the globbing characters *
(match zero or more) and ?
(match one).
You saw how to use the pid
provider to list all the function probe points available in a program. You can use the objc
provider to do the the same thing, but with methods:
# dtrace -l -n 'objc$target:::entry' -c /Applications/TextEdit.app/Contents/MacOS/TextEdit > textedit-probes.txt
Feel free to poke around this file with your favorite editor. Search around for your favorite classes.
Like with Objective-C, class methods are prefixed with +, and instance methods prefixed with -. For example, a class method:
# dtrace -l -n 'objc$target:NSView:+initialize:entry' -c /Applications/TextEdit.app/Contents/MacOS/TextEdit
ID PROVIDER MODULE FUNCTION NAME
24101 objc7385 NSView +initialize entry
as well as an instance method:
# dtrace -l -n 'objc$target:NSFontManager:-init:entry' -c /Applications/TextEdit.app/Contents/MacOS/TextEdit
ID PROVIDER MODULE FUNCTION NAME
24101 objc7391 NSFontManager -init entry
There’s one hitch. How do you specify a method name that contains colons? What if you wanted to do stuff whenever someone called NSDicitonary -getObjects:andKeys:
? What happens if you try this this command:
# dtrace -l -n 'objc$target:NSDictionary:-getObjects:andKeys::entry ...
The command will smack you down with an error:
dtrace: invalid probe specifier. neener neener.
DTrace uses colons. Objective-C uses colons. If you need to use them both, who wins? It’s DTrace’s playground, so it wins. You need to finesse your probe descriptions using wildcarding, by replacing the colons in the selector with question marks:
# dtrace -l -n 'objc$target:NSDictionary:-getObjects?andKeys?:entry' -c /Applications/TextEdit.app/Contents/MacOS/TextEdit
ID PROVIDER MODULE FUNCTION NAME
185597 objc7462 NSDictionary -getObjects:andKeys: entry
Two postings and some change later, I can finally tell you about some practical applications (that is, fun stuff) you can do with DTrace in Objective-C land.
I had a problem a couple of weeks ago. My team inherited a large code base and we’re needing to fix bugs, improve performance, and turn around a new release on a short schedule. One of the bugs I discovered (and subsequently needed to fix) related to undo. You could type text in one text field and do all sorts of editing. If you then navigate to another text field, the Undo menu item was still enabled. Choosing Undo at this time would apply the changes from the first text field (insert characters at this location, remove characters in this range) to the second text one, corrupting the text in the process. Subsequent Undos wouldn’t work because everything’s messed up inside. The twist is that this is a Mac executable, but an iOS codebase, using the Chameleon compatibility library.
One option was to just turn off Undo support entirely. That would have been kind of mean, because it’s easy to accidentally do something stupid when editing text and immediately reach for command-Z to undo it. It’s a nice feature. The next thought was to grab the appropriate NSUndoManager
and call -removeAllActions
when the text field loses first responder status. That way the user could Undo while editing, and then we commit it when navigating away. It’s a good compromise between “no undo” and cooking up full-featured undo for the whole app.
Unfortunately, because of the compatibility library, it wasn’t obvious where to get the right undo manager. There’s one you can get from UIDocument, and another from UIResponder, but neither one seemed to be useful. Time to pull out the big guns. Just where are NSUndoManagers
coming from, and who is using them? If I can get this info while typing in one of the text fields, I can figure out the right undo manager to use.
Whenever faced with “who creates this” or “who calls this”, and I have no idea where to begin looking in the code, I reach for DTrace. First off is to see who is creating them. This was the first trial:
# dtrace -n 'objc$target:NSUndoManager:-init:return' -c /path/to/app
2 3721 -init:entry
3 3721 -init:entry
1 3721 -init:entry
3 3721 -init:entry
That’s semi-useful. I’m catching the return from -[NSUndoManager init]
. This is telling me that four undo managers are being created. I could find out the address of these objects by printing out -init’s return value:
# dtrace -n 'objc$target:NSUndoManager:-init:return { printf("%x", arg1); }' -c /path/to/app
2 3721 -init:entry 102871da0
3 3721 -init:entry 10114b4f0
1 3721 -init:entry 101362270
3 3721 -init:entry 10113e2c0
Recall when using the return name of the pid
or objc
provider, arg0
has the instruction offset (which is not terribly useful for day-to-day work), and arg1
has the return value (useful!). -init
returns the address of the object it initialized, so I can see the actual addresses of the four undo managers running around the app. I could attach to the program with the debugger and inspect this memory, send messages to these objects, and unleash all kinds of ad-hoc mayhem.
Next, I wanted to ask was “just who are creating these undo managers?” I want to see stack traces. The ustack()
action comes to the rescue. This command told me who created them:
#dtrace -n 'objc$target:NSUndoManager:-init:return { printf("%x", arg1); ustack(20); }' -c /path/to/app
This gives the top 20 functions on the call stacks. I’ll reduce the clutter here with what I found:
2 3721 -init:entry 102871da0
Foundation`-[NSUndoManager init]
UIKit`-[UIWindow initWithFrame:]+0x9d
AppOfAwesome`-[AppDelegate application:didFinishLaunchingWithOptions:]+0x275
A UIWindow
is creating an undo manager
3 3721 -init:entry 10114b4f0
Foundation`-[NSUndoManager init]
CoreData`-[NSManagedObjectContext(_NSInternalAdditions) _initWithParentObjectStore:]+0x297
CoreData`-[NSManagedObjectContext initWithConcurrencyType:]+0x8a
AppOfAwesome`__35-[AppDelegate managedObjectContext]_block_invoke_0+0x7a
libdispatch.dylib`dispatch_once_f+0x35
AppOfAwesome`-[AppDelegate managedObjectContext]+0xa8
Core Data is making one for the MOC
1 3721 -init:entry 101362270
Foundation`-[NSUndoManager init]
UIKit`-[UIWindow initWithFrame:]+0x9d
UIKit`-[UIInputController init]+0xd2
UIKit`+[UIInputController sharedInputController]+0x44
UIKit`-[UIResponder becomeFirstResponder]+0x13f
UIKit`-[UITextView becomeFirstResponder]+0x5f
UIKit`-[UITextLayer textViewBecomeFirstResponder:]+0x5f
Another window is being created due to the UITextView
gaining focus.
3 3721 -init:entry 10113e2c0
Foundation`-[NSUndoManager init]
AppKit`-[NSWindow _getUndoManager:]+0x1d5
AppKit`-[NSWindow undoManager]+0x17
AppKit`-[NSResponder undoManager]+0x22
AppKit`-[NSResponder undoManager]+0x22
AppKit`-[NSResponder undoManager]+0x22
AppKit`-[NSResponder undoManager]+0x22
AppKit`-[NSTextView undoManager]+0x8d
AppKit`-[NSTextView(NSSharing) shouldChangeTextInRanges:replacementStrings:]+0x136
And then Cocoa is making one for the underlying NSTextView
. Which one of this is my undo manager? Ask DTrace again, this time logging everything that’s being done by NSUndoManager:
# sudo dtrace -n 'objc57447:NSUndoManager*::entry { printf("%x", arg0); ustack(20); }'
This catches the entry into any NSUndoManager
call. It prints out the first argument, which for the entry
probename
is the receiver of the message. It also prints out the stack. The *
wildcard in the class name catches any categories, such as NSUndoManager(NSPrivate)
. How did I know there were categories? Do the “-l -n -c
” trick to dump all the things out to a file and then search around in it.
Running this command emitted a bunch of output. Then I focused a text field and started typing, and saw even more stuff:
3 3777 -beginUndoGrouping:entry 10113e2c0
Foundation`-[NSUndoManager beginUndoGrouping]
Foundation`-[NSUndoManager _registerUndoObject:]+0x7d
AppKit`-[NSTextViewSharedData coalesceInTextView:affectedRange:replacementRange:]+0x1ea
This looks promising. The text view’s data is coalescing stuff (I love the word “coalesce”), which makes sense when typing. There’s also some undo grouping going on, which also makes sense when typing is happening:
3 3737 -_setGroupIdentifier::entry 10113e2c0
Foundation`-[NSUndoManager _setGroupIdentifier:]
AppKit`-[NSTextViewSharedData coalesceInTextView:affectedRange:replacementRange:]+0x22e
These dudes are using the undo manager living at 0x10113e2c0, which is the one that came from inside of Cocoa. Neither of the Chameleon undo managers, nor the Core Data undo manager are involved. Luckily I could get ahold of the NSWindow
underneath everything, get its -undoManager
, and clear all of its actions at the proper time. That fixed the problem.
Another bug we had was with a scrolling container of editable values that didn’t reset its scroll position when used to create a new object in the application. You could be editing an existing object with lots of stuff, scroll to the bottom, and then create a new object. The Panel of Stuff would still be scrolled to the previous bottom, clipping all the entry fields at the top. It was pretty confusing.
Grepping and searching around for UIScrollView
didn’t show anything useful. There was a view controller object hierarchy using a third party library involved too. Maybe it was buried in there? No luck searching through its code. Before going along the route of setting random breakpoints in the app, I put a DTrace probe on UIScrollView
’s setContentOffset*
along with a call to ustack()
to see what changed it. Repro the bug, look at the output, and it turns out it was a UITableView
(which itself is a UIScrollView
subclass). Kind of a “duhhhh” momement, but luckily pretty quick to find. It was a piece of cake to reset the scroll position then.
Back in November, Adam Preble wrote about some weird behavior he was seeing with the new Objective-C literals. To track down the problem he used DTrace to see all of the messages being sent to a particular object.
Just using a probe like
objc$target:::entry
would be pretty worthless – you get an incredible amount of spew. There is a lot of messaging activity happening inside of a Cocoa program. Adam came up with a clever way to limit the tracing to a particular object, and added a way to turn this tracing on and off from within the program. In essence, Adam’s test program told DTrace to start emitting the method trace at a well-defined point in the overall program’s execution, and then told DTrace to stop, because there would be no more interesting activity happening.
He added a function which would be the tracing toggle:
void observeObject (id objectToWatch) {
}
It’s just an empty function. When you want to start tracing the messages sent to an object, call it in your code:
observeObject (self.gronkulator);
When you want to stop tracing, pass nil
:
observeObject (nil);
And then run the program under the auspices of Dtrace with this script (available at this gist)
/* run with
* # dtrace -qs watch-array.d -c ./array
*/
int64_t observedObject;
pid$target::observeObject:entry
{
printf ("observing object: %pn", arg0);
observedObject = arg0;
}
objc$target:::entry
/arg0 == observedObject/
{
printf("%s %sn", probemod, probefunc);
/* uncomment to print stack:
* ustack();
*/
}
So, how does this work? There’s a global variable, observedObject
, that holds the address of the object that you want to trace messages to. It defaults to zero.
When the Objective-C program calls observeObject()
, the first clause gets invoked. It uses the pid
provider to attach some actions that get run when observeObject()
is called. It snarfs the first (and only argument), and stashes it in the observedObject
global. Its work is done.
The second clause says to do stuff on every single Objective-C message send. Every single one. Having the actions run on every message would be overkill, plus would probably overwhelm the DTrace machinery in the kernel and cause it to drop data. The /arg0 == observedObject/
predicate looks at arg0
, the receiver of the message, and checks to see if it’s the same as the object that needs the tracing. If it is, print out the module (class) and function (method name) of the sent message. Otherwise do nothing.
Here’s a program:
#import <Foundation/Foundation.h>
// clang -g -fobjc-arc -framework Foundation -o array array.m
void observeObject (id objectToWatch) {
// Just a stub to give DTrace something to hook in to.
}
int main (void) {
@autoreleasepool {
NSArray *thing1 = @[ @1, @2, @3, @4];
observeObject (thing1);
[thing1 enumerateObjectsUsingBlock: ^(id thing, NSUInteger index, BOOL *whoa) {
NSLog (@"Saw %@", thing);
}];
observeObject (nil);
NSLog (@"it's got %ld elements", thing1.count);
}
return 0;
} // main
It makes an array, starts the observation, and then enumerates. The observation gets turned off, and the array is used again.
Here’s the output:
# dtrace -qs watch-array.d -c ./array
dtrace: script 'watch-array.d' matched 11413 probes
2013-02-20 19:36:26.645 array[7674:803] Saw 1
2013-02-20 19:36:26.648 array[7674:803] Saw 2
2013-02-20 19:36:26.649 array[7674:803] Saw 3
2013-02-20 19:36:26.649 array[7674:803] Saw 4
2013-02-20 19:36:26.650 array[7674:803] it's got 4 elements
dtrace: pid 7674 has exited
observing object: 7fdfeac14890
NSArray -enumerateObjectsUsingBlock:
NSArray -enumerateObjectsWithOptions:usingBlock:
__NSArrayI -count
__NSArrayI -count
__NSArrayI -countByEnumeratingWithState:objects:count:
__NSArrayI -countByEnumeratingWithState:objects:count:
observing object: 0
A lot of interesting stuff happening here. You can see the original enumerateObjectsUsingBlock
call, which turns around and calls enumerateObjectsWithOptions
Those are NSArray
methods. Then there’s some __NSArrayI
classes, most likely the actual implementation of Immutable arrays. It does some work, and then the observation stops.
One interesting feature of Objective-C is that messages to nil
are totally legal. They’re fundamentally no-ops so you don’t have to keep checking for nil
all over your code. You can sometimes eliminate whole classes of if
statements by using an object to do some work, and then put a nil
object in its place to not
do some work. It’d be kind of cool to see how often real code uses message sends to nil.
Objective-C messages are actually implemented by a C function called objc_msgSend
. Its first argument is the receiver of the object. You can tell DTrace to look at that first argument and log a stack trace when it’s zero. The pid
provider lets you do that:
pid$target::objc_msgSend:entry
/arg0 == 0/
{
ustack (5);
}
Run it against TextEdit:
# dtrace -s nilsends.d -c /Applications/TextEdit.app/Contents/MacOS/TextEdit
And you’ll get a lot of stuff, like
0 24327 objc_msgSend:entry
libobjc.A.dylib`objc_msgSend
Foundation`-[NSURL(NSURLPathUtilities) URLByAppendingPathComponent:]+0xdc
AppKit`persistentStateDirectoryURLForBundleID+0x104
AppKit`+[NSPersistentUIManager(NSCarbonMethods) copyPersistentCarbonWindowDictionariesForBundleID:]+0x74
Wouldn’t it be nice you could get a count of who sends messages to nil
most often? Wouldn’t it be nice if we could aggregate some data and print it out when the tracing ends?
Why yes you can. You can’t leave a DTrace discussion without talking about aggregates. An aggregates is a variable, similar to a DTrace associative array. They’re indexed by an arbitrary list of stuff, and the name is preceded by an at-sign. You use an aggregate like this:
@aggregateName[list, of, indexes] = function(expression);
function
is something like count
, min
, max
, avg
, sum
, and so on. Take a look at count
:
@counts[probefunc] = count();
This will count the number of times a function is called. Here’s a one-liner to run against the array program from earlier that counts all the system calls made:
# dtrace -n 'syscall:::entry { @counts[probefunc] = count(); }' -c ./array
When the program ends it prints the a list of system calls and the number of times each one was called. It’s also handily sorted from smallest to largest:
...
mmap 17
lseek 25
write_nocancel 26
getuid 29
workq_kernreturn 110
kevent 190
select 274
That’s a lot of system work for such a small program.
The other aggregate functions take arguments. You can use sum(expression) to accumulate the amount of data asked for in reads, system wide:
# dtrace -n 'syscall::read:entry { @read_amounts[execname] = sum(arg2); }'
Run that command and then interrupt it with control-C. It’ll print out the aggregate:
configd 33152
Scrivener 104393
zip 119508
Mailplane 294912
Activity Monito 305703
mdworker 1095182
Terminal 1772088
Safari 2005992
mds 2658305
WebProcess 3367064
fseventsd 3675047
?Getting back to things Objective-C, you can see who is making the most message sends to nil. Here’s another script (nilsends-2.d):
pid$target::objc_msgSend:entry
/arg0 == 0/
{
@stacks[ustack(5)] = count();
}
END
{
trunc (@stacks, 20);
}
Three cool things: the END
provider, whose full name is dtrace:::END
, has its actions run when the script exits (there’s a corresponding BEGIN
provider too). You can see the @stacks
aggregate – it’s using the call stack as the index. This means that common stacks will have a higher count. When the script ends, the aggregate gets truncated to the highest 20 values, and then is automatically printed out. Here’s a run with TextEdit, letting it launch, and then immediately quitting.
# dtrace -s nilsends-2.d -c /Applications/TextEdit.app/Contents/MacOS/TextEdit
It has these stacks at the end of the output:?
libobjc.A.dylib`objc_msgSend
AppKit`-[NSView convertRect:toView:]+0x3e
AppKit`-[NSWindow _viewIsInContentBorder:]+0x5f
AppKit`-[NSCell(NSCellBackgroundStyleCompatibility) _externalContextSeemsLikelyToBeRaised]+0xf5
AppKit`-[NSCell(NSCellBackgroundStyleCompatibility) _initialBackgroundStyleCompatibilityGuess]+0x48
206
libobjc.A.dylib`objc_msgSend
AppKit`+[NSTextFinder _setFindIndicatorNeedsUpdateForView:]+0x32
AppKit`-[NSView _invalidateGStatesForTree]+0x38a
CoreFoundation`CFArrayApplyFunction+0x44
AppKit`-[NSView _invalidateGStatesForTree]+0x2a2
208
libobjc.A.dylib`objc_msgSend
AppKit`-[NSMenuItem _titlePathForUserKeyEquivalents]+0x1b2
AppKit`-[NSMenuItem _fetchFreshUserKeyEquivalentInfo]+0x66
AppKit`-[NSMenuItem userKeyEquivalent]+0x4f
AppKit`-[NSMenuItem _desiredKeyEquivalent]+0x3a
244
libobjc.A.dylib`objc_msgSend
AppKit`-[NSMenuItem _titlePathForUserKeyEquivalents]+0xed
AppKit`-[NSMenuItem _fetchFreshUserKeyEquivalentInfo]+0x66
AppKit`-[NSMenuItem userKeyEquivalent]+0x4f
AppKit`-[NSMenuItem _desiredKeyEquivalent]+0x3a
244
So NSMenuItem
is doing the most nil
message sends (244 times from two different locations inside of _titlePathForUserKeyEqivalents), and then NSTextFinder
.
Take the listing of all the objc
probes in TextEdit, and search for the term “hack”. See if you can find some cool sounding classes or methods, and then see if they’re documented.
Put probes on mouseDown
/ mouseUp
, and use aggregates to see how many of those events happen in a typical run of your program.
You can index aggregates by more than one value. Adapt the system call counting script to also record the executable name. You’d index @counts by execname
and probefunc
.
Ask Uncle Google about printa()
, which lets you print out DTrace aggregates in a prettier format. Also look around for the quantize()
aggregate function, which will build a histogram of the values it sees, and output a glorious ASCII chart.
So you can see, there’s a lot of power contained in DTrace for investigating things in Objective-C land as there are looking programs with the pid provider or looking at system calls. Hopefully this has demystified enough of DTrace to let you pull it out of your toolbox and amaze your friends.
Our introductory guide to Swift Regex. Learn regular expressions in Swift including RegexBuilder examples and strongly-typed captures.
The Combine framework in Swift is a powerful declarative API for the asynchronous processing of values over time. It takes full advantage of Swift...
SwiftUI has changed a great many things about how developers create applications for iOS, and not just in the way we lay out our...