Swift Regex Deep Dive
iOS MacOur introductory guide to Swift Regex. Learn regular expressions in Swift including RegexBuilder examples and strongly-typed captures.
Fast Enumeration, part 1 covered what fast enumeration is. Part 2 covered the method countByEnumeratingState
, which is the centerpiece of fast enumeration. This time around there’s actual code which implements a collection that can be fast enumerated, without cheating and falling back on one of Apple’s collections.
The entirety of the code can be found at this gist
When in doubt, make bad design decisions. Here is a class to append strings to a collection, and then access those strings by index. Newer strings are appended to the end of the list. The API is pretty simple:
// The collection. Add new strings to the end of the collection,
// and access by index.
@interface LifoList : NSObject <NSFastEnumeration>
@property (readonly, nonatomic) NSUInteger count;
- (void) addString: (NSString *) string;
- (NSString *) stringAtIndex: (NSUInteger) index;
@end // LifoList
Pretty straightforward. It has a count
property you can read, you can add something to the collection, and then get a string at an index.
Remember what I said about bad design decisions? What’s the worst backing data structure you could use for something accessed index-styles? A linked list! But linked lists are easy to write, and they’re not contiguous chunks of memory, so that’s what LifoList is actually implemented as.
For a linked list, you need a node:
@interface Node : NSObject
@property (copy, nonatomic) NSString *string;
@property (strong, nonatomic) Node *next;
@end // Node
The node has to be an objective-C object, otherwise you’ll be fighting ARC to put an NSString
pointer into a structure (see Unsafe at Any Speed for details on this.)
The implementation is dirt-simple, thanks to automatic instance variable creation and accessor synthesis:
@implementation Node
@end // Node
Back to the implementation of the class. There’s three pieces of bookkeeping needed: the head of the list (so you can find the rest of the nodes by chasing next
pointers), the tally of objects, and a mutations pointer. For convenience, I put these instance variables in the class @interface
. If you’re curious where you can declare instance variables, check out Where does it go, George?
@implementation LifoList {
Node *_head;
NSUInteger _count;
unsigned long _mutations;
}
The rest of the methods are pretty standard data structures 101. Add a new node to the head of the list:
- (void) addString: (NSString *) string {
// New node
Node *node = [[Node alloc] init];
// Put it at the start of the list.
node.string = string;
node.next = _head;
_head = node;
// Necessary bookkeeping.
_count++;
_mutations++;
} // addString
You can see the mutations pointer being incremented at the end. That’s fast enumeration bookkeeping so the iterating machinery can catch collections being changed while iteration happens.
There’s no need to implement -count
. Auto synthesis will take care of it.
Getting a string at an index requires scanning through the list:
- (NSString *) stringAtIndex: (NSUInteger) index {
// Holy O(N) algorithm, BatMan
Node *scan = _head;
for (NSUInteger i = 0; i < index; i++) {
scan = scan.next;
}
return scan.string;
} // stringAtIndex
And there you have a working collection class. The exhaustive unit test shows that things works the intended way:
LifoList *lifolist = [[LifoList alloc] init];
[lifolist addString: @"hello"];
[lifolist addString: @"there"];
[lifolist addString: @"greeble"];
Running through the list will print things in reverse order (last-in, first-out):
for (NSUInteger i = 0; i < lifolist.count; i++) {
NSLog (@"%ld : %@", i, [lifolist stringAtIndex: i]);
}
prints
0 : greeble
1 : there
2 : hello
The foundation has been built. Time to add fast enumeration. Recall from last time the fast enumeration method:
- (NSUInteger) countByEnumeratingWithState: (NSFastEnumerationState *) enumerationState
objects: (id __unsafe_unretained []) stackBuffer
count: (NSUInteger) len;
This will be using the stack buffer to return the strings in the collection.
Recall the enumerationState struct that is passed in to the method:
typedef struct {
unsigned long state;
id __unsafe_unretained *itemsPtr;
unsigned long *mutationsPtr;
unsigned long extra[5];
} NSFastEnumerationState;
The fast enumeration implementation will set state
to 1 once iteration has started. Leaving it at zero will cause bad things to happen in the fast enumeration machinery. itemsPtr
will point to the stackBuffer
provided by the caller of countByEnumeratingWithState
. mutationsPtr
will point to the _mutations
instance variable, which you saw getting incremented on every change to the collection.
One last bit of business: how do you preserve the iteration from call to call? The typical way to scan through a linked list is to have a scan pointer. You reach through the pointer to get to the node’s data, and then set the scan pointer to the node’s next pointer. Around and around you go until you hit the end, a terminating NULL
or nil
value. You can’t do this with fast enumeration because local variables go away.
Luckily enumerationState
has that array of extra
values. These are (at least) pointer sized, so you can stick a scan pointer in there before exiting from one call to countByEnumeratingWithState
, and then grab it back on the next one.
Here’s what enumerationState
structure looks like for LifoList:
Yay! Time for the implementation. Just for paranoia’s sake, there’s an assert to make sure that there’s enough room to store a pointer’s bit pattern in the extra slot:
// Sanity check. I know sizeof(unsigned long) >= sizeof(pointer),
// but be a little paranoid.
assert (sizeof(Node *) <= sizeof(unsigned long));
First time through the iteration, set up initial bookkeeping:
// Using the enumerationState's extra[0] as the scan pointer.
if (enumerationState->state == 0) { // first time
enumerationState->state = 1; // Must set to non-zero
enumerationState->mutationsPtr = &_mutations; // Can't be NULL.
enumerationState->extra[0] = (unsigned long)_head;
}
Change state
to say that you accept this enumeration and all of the rights and privileges thereof. The mutations pointer is set too. extra[0]
is pre-populated so the subsequent code is the same whether it’s the first time or not.
Which is, grab the scan pointer:
Node *scan = (__bridge Node *)((void *)enumerationState->extra[0]);
Scan through the list until stackBuffer
fills up, or it runs out of objects:
// Fill up as much of the stack buffer as we can.
NSUInteger i;
for (i = 0; i < len && scan != nil; i++) {
stackBuffer[i] = scan.string;
scan = scan.next;
}
And then stash away the scan pointer:
enumerationState->extra[0] = (unsigned long) scan;
Point titemsPtr
to the stackBuffer
that just got filled in:
enumerationState->itemsPtr = stackBuffer;
Finally, return number of items that were iterated.
return i;
And that’s it. Here it is in all of its glory:
- (NSUInteger) countByEnumeratingWithState: (NSFastEnumerationState *) enumerationState
objects: (id __unsafe_unretained []) stackBuffer
count: (NSUInteger) len {
// Sanity check. I know sizeof(unsigned long) >= sizeof(pointer), but be
// a little paranoid.
assert (sizeof(Node *) <= sizeof(unsigned long));
// Using the enumerationState's extra[0] as the scan pointer.
if (enumerationState->state == 0) { // first time
enumerationState->state = 1; // Must set to non-zero
enumerationState->mutationsPtr = &_mutations; // Can't be NULL.
enumerationState->extra[0] = (unsigned long)_head;
}
Node *scan = (__bridge Node *)((void *)enumerationState->extra[0]);
// Fill up as much of the stack buffer as we can.
NSUInteger i;
for (i = 0; i < len && scan != nil; i++) {
stackBuffer[i] = scan.string;
scan = scan.next;
}
enumerationState->extra[0] = (unsigned long) scan;
// Point the returned items pointer to the stack buffer.
enumerationState->itemsPtr = stackBuffer;
return i;
} // countByEnumeratingWithState
Wow. 4300 words for 32 lines of code. But hopefully everything makes sense by this point.
By picking a poor data structure match, I’ve guaranteed that fast enumeration will be FAST. BNRTimeBlock
is a little utility to time a block of code and return the number of seconds (as a floating point number) that it took. Check out A Timing Utility for details.
I’ll be timing two iterations of the collection. Once getting things by index (which will be an O(N-squared) algorithm), and once using fast enumeration.
#define LIFO_COUNT 10000
10,000 objects. Doesn’t seem like much. Build up a list of them:
LifoList *lifolist = [[LifoList alloc] init];
NSString *value = @"hi";
for (NSUInteger i = 0; i < LIFO_COUNT; i++) {
[lifolist addString: value];
}
First up, using stringAtIndex
:
// See how long it takes to iterate using stringAtIndex. This is
// O(N^2), so will take a long time.
__block NSUInteger count = 0;
CGFloat indexingTime = BNRTimeBlock (^{
<strong> for (NSUInteger i = 0; i < lifolist.count; i++) { </strong>
<strong>(void)[lifolist stringAtIndex: i];</strong>
count++;
}
});
assert (count == LIFO_COUNT); // paranoia
assert (count == lifolist.count);
NSLog (@"lifo list indexing time: %f", indexingTime);
Just for sanity’s sake, a count is incremented for each object enumerated, and at the end it’s compared to the number of objects the collection thinks it has, as well as compared to the predefined constant.
This takes a fair amount of time on my older MacBook Pro:
lifo list indexing time: 3.373175
Now for the fast enumeration:
count = 0;
CGFloat fastEnumerationTime = BNRTimeBlock (^{
<strong>for (NSString *string in lifolist) { </strong>
count++;
}
});
assert (count == LIFO_COUNT);
assert (count == lifolist.count);
NSLog (@"fast enumeration time: %f", fastEnumerationTime);
And it comes in at a much svelter time:
fast enumeration time: 0.001544
That’s actually kind of cool. This collection really isn’t appropriate for indexed access, but with fast enumeration, you can come pretty close, without having to expose any additional API to support scanning through the linked list (especially because it would have to include some kind of state about the current state of the node scan.)
Fast enumeration is pretty cool. Once you pull it apart it’s not very complicated, just a lot of different concepts packed tightly together – stack buffers, preserving state across method invocations, object mutation, arrays of pointers, and so on.
It’s trivially easy to support fast enumeration in your own classes that are already backed by a fast-enumerable collection. It’s also pretty easy to add in direct support for fast enumeration, and participate in the for/in syntax as a first class citizen.
Our introductory guide to Swift Regex. Learn regular expressions in Swift including RegexBuilder examples and strongly-typed captures.
The Combine framework in Swift is a powerful declarative API for the asynchronous processing of values over time. It takes full advantage of Swift...
SwiftUI has changed a great many things about how developers create applications for iOS, and not just in the way we lay out our...