iPhone 5s. iPad Air. The 64-bit era has moved from the desktop and into our hands. Mike Ash has an excellent article covering what 64-bit ARM is. I’m here to explore some of the day-to-day implications of this 64-bit thing.
Pro: You get bigger processor registers, and you get more of them. This lets the compiler optimize your code better by keeping intermediate values very close to the processor. You have larger integers, so you can calculate larger values without having to juggle two 32-bit values. This means fewer instructions. Floating point values have more bits making them more accurate (not to mention true IEEE754 support in the NEON vector processor). ARM64 is a modern instruction set without small-system legacy issues. Much larger address spaces, which are handy for large memory-mapped files. More pointer bits give you room to play with non-pointer addresses, as outlined by Greg Parker’s deconstruction of the new Objective-C non-pointer isa.
Con: Code and data are bigger, possibly blowing out your caches faster and filling memory faster. New exciting classes of bugs. Larger testing matrix. Plus backward compatibility issues with current tooling.
Aggressive early-adopting notwithstanding, the decision to migrate your code to 64 bits and ship it shouldn’t be taken lightly. With the current versions of Xcode 5, you can’t have a compiled executable that targets 64-bit iOS7 (your only choice) and 32-bit iOS6. (Scuttlebutt is that this is actually due to a loader issue on the older versions of iOS, and the Xcode team is needing to devise work-arounds.) This means jettisoning your iOS6 (and earlier) customers when you ship a 64-bit iOS app.
Update – I’ve heard that Xcode 5.0.2 addresses this issue, but I haven’t personally verified it yet.
If you need particular 64-bit features, or get a noticeable performance increase, then by all means go for a 64-bit version. Don’t forget that your code and data will get bigger which could negate the performance increases. Build and measure to justify your decision. But if your app doesn’t need the extra bits or the extra performance, you could be costing yourself time you could better spend elsewhere.
Unless you’re only targeting the iPhone 5s or the iPad Air, your application will have a 32-bit side and a 64-bit side. That doubles your testing matrix because those are entirely different executables. The fact that something runs ok on an iPhone 4 doesn’t imply the same feature will work in 64-bit mode on the 5s. This may be ok – your app may be small enough that individual device testing isn’t an ordeal – or it may turn into a time sink. And please please please actually test on 64-bit devices. Don’t rely on the iOS Simulator to catch all of your 64-bit issues.
That being said, even if you don’t plan on shipping a 64-bit version of your app any time soon, you should go ahead and build and run your app 64-bit style so you can shake out any latent bugs or 32-bit assumptions you might have.
When you build your app for ARM64, everything it uses needs to be 64-bit savvy. All the system frameworks and libraries have to be 64-bit (luckily Apple has that taken care of), and any open source code you use will need to be ported, along with any binary-only libraries you link in, such as third party crash reporters. You cannot mix 32-bit and 64-bit code in a single executable. If your vendor is slow in releasing a 64-bit version of the library (or has gone out of business), you’re stuck until you find a replacement.
This also means that 32-bit-only apps need the 32-bit frameworks. Right now that’s not a big issue because so much iOS software is 32-bit only. As time goes on, though, and all future iOS devices are 64-bit, the 32-bit software will become a drain on the system, having to page-in and keep live the 32-bit framework stack.
I knew we were headed for problems when Apple started calling combined iPhone / iPad apps “universal”. Prior to that, “universal” on the Mac meant an executable that contained Intel and PowerPC code (and later contained 64-bit versions as well). This is one executable with different architectures inside of it.
Contrast that with the iOS “universal” app, which is one executable/architecture, and a bunch of if statements (or polymorphism) to decide how to behave depending on what device you’re running on.
Now we have dual-architecture executable files on iOS, and no good term to describe things. Universal Universal? A friend of mine suggested Multiversal.
64-bit OS X and iOS use a memory model known as LP64. Longs are 64 bits, Pointers are 64 bits. Other things, like ints, are 32 bits. 32-bit OS X and iOS use the ILP32 model – Ints, Longs, and Pointers are all 32 bits. 64-bit Windows uses the LLP64 model: Long-Longs and Pointers are 64 bits, longs are 32 bits. Something to keep in mind if you’re sharing code between Mac/iOS and Windows.
You can use the preprocessor symbol __LP64__ to tell you if you’re compiling for Apple-style 64-bits:
// clang -arch i386 -o lp64 lp64.m
// clang -arch x86_64 -o lp64 lp64.m
#if __LP64__
#warning "64-bit land!"
#else
#warning "32-bit land!"
#endif
int main (void) {
    return 0;
}
% clang -arch x86_64 -o lp64 lp64.m
lp64.m:5:2: warning: "64-bit land!" [-W#warnings]
#warning "64-bit land!"
^
1 warning generated.
% clang -arch i386 -o lp64 lp64.m
lp64.m:7:2: warning: "32-bit land!" [-W#warnings]
#warning "32-bit land!"
^
1 warning generated.
There is no ILP32 equivalent for 32-bit land, so you’ll need to use the #else clause, or look at another predefined symbol, such as __POINTER_WIDTH__. You can see all the predefined macros by running this command:
% clang -E -dM - < /dev/null
#define OBJC_NEW_PROPERTIES 1
#define _LP64 1
#define __APPLE_CC__ 5621
#define __APPLE__ 1
#define __ATOMIC_ACQUIRE 2
#define __ATOMIC_ACQ_REL 4
...
(and here follows 150 more lines of stuff)
The main class of problems you’ll encounter when building your 32-bit code in the 64-bit world is passing a 64-bit value (such as a pointer) through 32-bit storage (such as an int) and then pulling it back out to use as a pointer. When this happens, the top 32 bits of the original value are sliced off and lost forever.
Be sure to use function prototypes (or declare a function before use), otherwise you’ll get a warning. Be careful casting a function pointer to a form that takes a different set of parameters, otherwise the compiler might not put the arguments in the right place in memory where the function is expecting them to be.
Your calculations might end up different, too. You could be overflowing a 32-bit integer today and things happen to work ok, but with 64 bits you’ll now be getting correct (and possibly really large) 64-bit values.
Rather than funneling everything through int storage, you should instead use some of the special types that are designed to vary in size based on the architecture you’re using:
uintptr_t: an unsigned integer type big enough to hold a pointer. Use this for storage that can hold integers or pointers.
ptrdiff_t: a type big enough to hold the difference between any two pointers on the system. Use this when storing the result of pointer subtraction.
size_t: a type big enough to hold the number of bytes in any struct the compiler can create. Use this to store the result of sizeof.
NSInteger / NSUInteger / CGFloat: these will be 64 bits on 64-bit systems, and 32 bits on 32-bit systems. Use these when you’re interacting with Cocoa (which uses NSInteger and CGFloat everywhere).
What about format specifiers? You’ve probably seen this annoyance:
NSInteger ook = 42; // don't panic
NSLog (@"%d", ook);
On 32-bit systems, this compiles fine. On 64-bit systems you’ll get a complaint:
warning: values of type 'NSInteger' should not be used as format arguments;
add an explicit cast to 'long' instead [-Wformat]
You can follow the recommendation by using the long format specifier and casting the value:
NSLog (@"%ld", (long)ook);
If you’re using Cocoa, you can use the Objective-C literals to simplify things:
NSLog (@"%@", @(ook));
(Thanks to Jeremy W. Sherman for that trick.)
Bitmask constants are implicitly unsigned ints, which are 32-bits. If you assign a 32-bit literal mask to a 64-bit long, the top bits will be zeros thanks to the unsignedness. This may, or may not, be what you expect. Here’s a bitmask with all the bits set except for the bottom two:
0xFFFFFFFC : 111.....111100
If you assign this to a long in 32-bit land, the mask doesn’t change – all bits set except the bottom two. If you assign it to a long in 64-bit land, you now have a confused mask:
0x00000000FFFFFFFC
The top 32 bits are clear, the bottom 32 bits are all set except for the bottom-most two bits. If you want the mask to be sign-extended through all 64 bits, you need to either cast the mask to a signed int, so that it gets sign-extended automatically when it is widened to a long:
NSInteger mask = (int)0xfffffffc;
or take the NOT of the NOT of the mask. The NOT of 0xFFFFFFFC is 3, binary 0000...0011. Then if you NOT it again, you’ll get ones in all the upper bits:
NSInteger mask2 = ~0x3;
If you need to know the precise number of bits in a long, you can look at the LONG_BIT macro from <limits.h>.
It should go without saying: please always use sizeof when measuring the size of a structure, especially if you’re using types that don’t have a defined size like longs or pointers. If you hard code the size of your structure, say as 16 bytes (pointers will always be four bytes on phones!), you’ll have a bad day when you have 32 bytes for four pointers. If you need to have precise sizes (such as 16-bit shorts, 32-bit ints), use the explicit types such as uint32_t.
Also be aware of how your structure contents are laid out by the compiler. The change in the size of pointers and longs can introduce padding in the structure. For example, this structure:
typedef struct Flonk {
    int thing1;
    int *thing2;
    int thing3;
    long thing4;
    int thing5;
} Flonk;
is 20 bytes in 32-bit land, where all five members are four bytes each.
It’s 40 bytes in 64-bit land.
An unexpected doubling of your memory consumed could be most distressing.
But wait! The 64-bit members only add up to 28 bytes (three 4-byte ints, an 8-byte pointer, and an 8-byte long)! Where’d the extra byte consumption come from? Padding! Larger data types are accessed more efficiently if they’re aligned on “natural” boundaries. With 32 bits, that boundary is every four bytes. With 64 bits, it’s every 8 bytes. So, to make sure the 64-bit thing2 pointer lies at a good, 8-byte aligned address, there need to be four bytes of padding after thing1, and another four after thing3 so that thing4 is aligned too.
The last bit of padding bytes is so the entire structure is aligned on an 8-byte boundary. This makes arrays of these structures maintain their alignment.
You can remove the padding by using the packed attribute:
typedef struct __attribute__ ((__packed__)) FlonkPacked {
    int thing1;
    int *thing2;
    int thing3;
    long thing4;
    int thing5;
} FlonkPacked;
This structure is 28 bytes, so you have your unused memory back. Just be aware that you’ve lost some performance accessing data on unaligned addresses, and some parts of the system might not like accessing large data at unaligned addresses. As with all things performance, measure and see what impact it’s having.
Update: Rancher Rod Strougo reminded me of a standard C technique regarding structs – put the largest items first in the struct, followed by the smaller items. That way you reduce the padding necessary for alignment. You can have your pointers and longs aligned to 8 bytes, then you could fit in two ints which have 4-byte alignment, and so on.
Be sure to test your interactions between 32-bit and 64-bit versions of your software, especially things like document formats, network protocols, and anything you might transfer via iCloud.
Also be aware that NSNotFound has different values in 32-bit and 64-bit land, since it’s defined in terms of LONG_MAX. This matters especially if you have stored this value in an object archive or a file.
When you first convert your project to 64-bit, the compiler will emit a lot of warnings. Be sure to fix all your warnings.
BOOL in ARM 64-bit is an actual boolean type, making BOOL’s Sharp Corners mostly irrelevant. Be aware this is only for ARM 64-bit and the 64-bit iOS simulator. BOOL still has the same problem on the other Objective-C platforms.
You need Xcode 5 to update your iOS projects. Change the deployment target to 7.0. Change the ARCHS build setting to “Standard Architectures (including 64-bit)”.
Choose the 64-bit simulator, or a 64-bit device, and then build. Fix any warnings and errors. Xcode automatically turns on some nice warning flags (such as -Wconversion) that we had to turn on manually back in the Mac 64-bit migration days.
If you get a build error “No architectures to compile for”, set the “Build Active Architecture Only” setting to “No.”
Be sure to actually test on 64-bit hardware. There is no substitute for running on an actual device. Macs are 64-bit architectures too, but there can be bugs that only manifest themselves on ARM.
In addition to Mike Ash’s and Greg Parker’s articles mentioned earlier, Apple has an iOS transition guide. There is some good generally useful information in Apple’s Mac 64-bit migration guide. Also, if you happen to have Advanced Mac OS X Programming: The Big Nerd Ranch Guide lying around, chapter 2 has a discussion of 64-bittitude.
The Auxiliary Tools download (you’ll have to search for “Auxiliary Tools” once you get there) includes a tool called ConvertCocoa64, which uses a tops script to perform some mechanical operations on your code. This shipped when the Mac moved to 64 bit, and was primarily used to change ints to NSIntegers and floats to CGFloats. It also flags all of your printf-style format strings with warnings so that you’ll reevaluate them, changes some API to use NSInteger forms, and drops warnings for things it finds questionable, such as using INT_MAX or FLT_MAX, which don’t make sense to compare against 64-bit values. It’s not designed for iOS, but it could be illuminating to see what it changes. Because you are using source code control, you can always back out changes you don’t like.
This is an exciting time, getting 64-bit processors in our pockets. Like any change in technology, there are positives and negatives, and you need to weigh the short-term cost against any long-term gain a 64-bit migration would entail. Even if you never ship a 64-bit version of your software, it’s worth the effort to do even a perfunctory 64-bit compile and run to fix the warnings and errors that come up.