Swift Regex Deep Dive
iOS MacOur introductory guide to Swift Regex. Learn regular expressions in Swift including RegexBuilder examples and strongly-typed captures.
Fast Enumeration, part 1 covered what fast enumeration is, NSEnumeration
a bit, and introduced adopting fast enumeration in your own classes by doing a simple pass-through to Apple’s collections. This time around it’s time to look deeper at the central Fast Enumeration call, which I’ll refer to as countByEnumeratingState
:
- (NSUInteger) countByEnumeratingWithState: (NSFastEnumerationState *) enumerationState
objects: (id __unsafe_unretained []) stackBuffer
count: (NSUInteger) length;
OK. It’s still kind of scary. There’s a lot of stuff packed into that one method. What do you need to do when you implement this yourself?
Ponder what this method needs to do. It needs to communicate back to the fast enumeration machinery how many objects it should iterate over. It needs to communicate back which objects should be fed to the for (thing in collection)
loop. It needs to have some persistent state so this method can be called repeatedly to return more objects. That’s a lot of stuff going on.
countByEnumeratingWithState
’s return value is the number of objects that the for
loop should spin over. Return zero when you’re done iterating. Return a non-zero (positive value, because it’s unsigned) when you have some objects for the loop to spin over. This means that your countByEnumeratingWithState
will be called at least twice in the common case. Once to iterate over some objects. Once to say “here’s a zero because I’m all done.” You’d just return zero the first time through if you have no objects to iterate over.
You don’t have to return all of your objects at one time. It’s fine to return them piecemeal. In fact, it might be more efficient to do it that way.
The first argument to countByEnumeratingWithState
is a pointer to an enumeration state. This is a structure (not an object) with some interesting stuff in it:
typedef struct {
unsigned long state;
id __unsafe_unretained *itemsPtr;
unsigned long *mutationsPtr;
unsigned long extra[5];
} NSFastEnumerationState;
The same bucket of bytes will be passed to every call to countByEnumeratingWithState
in the current for(...)
loop. You can use it to hold on to any stuff that would be convenient to keep around while you’re iterating through your data structures from call to call. You can think of it as a rock to hide your data under. Stick some information under the rock, and return from the method. Then when you get called again, look under that rock for what you need. Concepts like context pointers and reference constants are also rocks you can hide data under to use later.
This structure will be filled with zeros on the very first call. There’s some necessary bookkeeping to take care of first. enumerationState->state
is zero the very first time through the iteration. This is where you do any setup that you’ll be using for the whole iteration. If you have a scanning pointer and a stop address, you could calculate those and store them into the extra
s fields in the enumerationState
. (More on that in a bit.)
You must change this to a non-zero value. Changing this value tells the fast enumeration machinery that “yes, I’m actually starting an iteration”. If you don’t set this to a non-zero value your method will be called over and over again, forever. Fast enumeration doesn’t care what you set the value to, so long as it’s non-zero. You could use it for the index into a C array of pointers to start reading the next batch of object pointers. Or you could have constants, like kIteratingHeaders = 1
if you’re vending individual HTTP headers, and kIteratingFormData = 2
if you’re now vending individual elements of an HTML form.
The other piece of bookkeeping is enumerationState->mutationsPtr
, which is a pointer to a long
. The value of the long being pointed to is checked every time through the loop. If it changes, an exception is thrown and the programmer doesn’t get a cookie.
Whoa. Why is that? One of the invariants to Cocoa collection enumeration via NSEnumeration
and fast enumeration is that the collection should not change during the iteration. The contract has been there forever, but it has never been enforced until fast enumeration came along. If you need to mutate a collection during fast enumeration, you need to make a copy and either change the one you’re iterating over or the copy. Otherwise, results are undefined. (Don’t forget that “seems to be working fine” is a member of the set of undefined results.)
You must set this value, otherwise fast enumeration will dereference NULL
and you’ll crash. (bad programmer, no cookie, blah blah blah)
Typically you will have a long
instance variable that you’ll increment every time the collection’s contents change, and then stash its address into this pointer:
enumerationState->mutationsPtr = &_mutationsCounter;
If your collection is immutable, you can put your object’s address here.
(Aside – how does that work? If you put your object’s address into the mutations pointer, fast enumeration will go and grab sizeof(long)
bytes at the very beginning of your object. These are the bytes for the isa
pointer that’s the first part of every Objective-C object, which points to the object’s class. If your object changes classes during an iteration, you probably have other, deeper design problems to worry about.)
That’s it for the required bookkeeping. You have to set enumerationState->state
to a non-zero value and enumerationState->mutationsPtr
to a legitimate address to not die a horrible nasty death.
Time to actually send some objects back. This is done by aiming enumerationState->itemsPtr
to a C-array of object pointers. The fast enumeration machinery will then use this pointer, along with the return value from countByEnumeratingWithState
to know what to feed back to the loop.
Take a closer look at its declaration:
id __unsafe_unretained *itemsPtr;
Take out the ARCism, and you have
id *itemsPtr;
Don’t forget that id
is a pointer type, so this is actually a pointer to a pointer
(Object **) itemsPtr;
Or, expressed another way
(Object *) itemsPtr[];
Or, a C array of pointers. Which means that it’s a sequential list of addresses in memory.
Non-quick aside: The __unsafe_unretaied
part says that ARC is not going to do anything with memory management. Any objects you pass back in this array need to stick around. Unsafe at Any Speed cogitates on __unsafe_unretained
. Also, you may notice this intriguing phrase in Apple’s docs when talking about this method, and the state structure in particular: “The state structure is assumed to be of stack local memory, so you can recast the passed in state structure to one more suitable for your iteration.” This is for Cocoa garbage collection so you can make assumptions about the life cycle of pointers. Because the state is on the stack any pointer references will be strong pointers under garbage collection. This is completely opposite behavior with ARC, which will not treat these as strong pointers. Also recall that ARC forbids pointers in structs that aren’t decorated with __unsafe_unretained
, like it is here.
This is what things look like when you ship back some objects:
Fast enumeration will feed these five objects through the loop and then call your countByEnumeratingWithState
again to get another batch of objects. And again. And again. And again until you return zero. Then the looping stops.
That’s all well and good. Where does this C array come from? You might have a data structure that uses a slab of pointers you can just return. Array implementations typically work like that. See Advanced Mac OS X Programming : The Big Nerd Ranch Guide – a.k.a. AMOSXP(3) – for an example of that kind of implementation. But what if you have your object stored in some other format, like a tree, or a linked list, or a hash table?
You have three options. Two of which are generally bad, and one which is kind of cool.
The two generally bad options are : 1) allocate your own block of memory and stuff your pointers into it. This kind of defeats the “iterate efficiently” aspect of fast enumeration because you’ve got (at least) one additional memory allocation. 2) The other is to return one object at a time. This will force one method call per object iterated, which also defeats the efficient iteration concept. Might as well subclass NSEnumerator
.
The third option is to use the second and third arguments passed to countByEnumeratingWithState
:
- (NSUInteger) countByEnumeratingWithState: (NSFastEnumerationState *) enumerationState
<strong> objects: (id __unsafe_unretained []) stackBuffer
count: (NSUInteger) length;</strong>
These arguments point to some scratch space you can use to stuff some pointers in to. stackBuffer
points to a row of bytes on the stack (hence the name), enough to hold length
object pointers. You can use as many of these as you wish for the objects you’re returning from countByEnumeratingWithState
. If you have more than length
objects left to iterate over, you’d fill in length
objects, return length
, and adjust your iteration state to pick up at the next uniterated object when countByEnumeratingState
is called again. This way you reduce the number of method calls made per iterated object, yet not have to come up with object storage space yourself. Apple can tune this value over over time if they think the current values are too big or small.
One last bit of stuff to cover in the NSFastEnumerationState
state struct. The extras:
unsigned long extra[5];
This is 5 unsigned long
’s worth of space. Anything that can fit into an unsigned long
can fit into this (cough pointer cough) And there’s five of these to play with. If you’re using a pointer scan, you could store the scanning pointer in one slot and the stop pointer in another slot. (You can see this general technique in Table Scanning if you’re not familiar with it.) If you’re post-order iterating a binary tree, you could store four levels of recursion here.
These five values are yours to play with and are not modified by the fast enumeration machinery from call to call. AMOSXP(3) includes a fast enumeration-based mechanism for generating the Fibonacci numbers by keeping the last two sequence values in extra[0]
and extra[1]
. So you really are only limited by your imagination, or at least what you can get past your code reviewer.
The return value of countByEnumeratingState
and the NSFastEnumerationState
struct are really the heart and soul of fast enumeration. The two extra stack buffer arguments to the method are a nice convenience if you don’t have any C-arrays-of-pointers lying around on the floor. Here’s the state structure again and what the fields are used for:
The third (and thankfully the final) installment will have code that has a working custom collection that lets itself be fast enumerated by using the stack buffer and an extra[]
pointer.
Our introductory guide to Swift Regex. Learn regular expressions in Swift including RegexBuilder examples and strongly-typed captures.
The Combine framework in Swift is a powerful declarative API for the asynchronous processing of values over time. It takes full advantage of Swift...
SwiftUI has changed a great many things about how developers create applications for iOS, and not just in the way we lay out our...