Sometimes you stumble across a file. It might be something random in your Documents folder. It might be something a parent or a client sent you. Unfortunately, you have no idea what it might be. Files don’t have to have extensions on the Mac, so there’s not much hint what “Flongnozzle-2012” might contain. But if you’re comfortable in the Terminal, you have some built-in tools to help you identify files.
file is my go-to command for this kind of work. file examines the contents of the file and tries to figure out what it is:
% file launchHandler.m
launchHandler.m: ASCII C++ program text
Well, it’s Objective-C text, but file came pretty close, identifying it as a file of code. “But MarkD, can’t it just look at the file extension?” That’s certainly one of the tools file can use, but it’s not necessary:
% cp launchHandler.m splunge
% file splunge
splunge: ASCII C++ program text
No file extension, but we figured out what the file is. Point file at something that could contain executable code, and it will tell you the included architectures:
% file /bin/ls
/bin/ls: Mach-O 64-bit executable x86_64
You can tell if you have a fat binary (a.k.a. Universal App in its original sense):
% file /Applications/Reason/Reason.app/Contents/MacOS/Reason
Reason.app/Contents/MacOS/Reason: Mach-O universal binary with 2 architectures
Reason.app/Contents/MacOS/Reason (for architecture i386): Mach-O executable i386
Reason.app/Contents/MacOS/Reason (for architecture x86_64): Mach-O 64-bit executable x86_64
Point it at an image file to see the image dimensions:
% file Flongnozzle-2012
Flongnozzle-2012: PNG image data, 1932 x 904, 8-bit/color RGB, non-interlaced
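To get a feel for the kind of content sniffing file does, here is a minimal Python sketch that matches the leading bytes of a file against a handful of well-known signatures. The real file command consults a far larger database and also applies text heuristics, so treat the table and the sniff and sniff_file names below as illustrative assumptions rather than a description of how file is implemented.

```python
# A tiny, hand-picked signature table (an assumption for illustration;
# the real `file` command knows thousands of formats).
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (b"bplist00", "Apple binary property list"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit, little-endian"),
    (b"\xce\xfa\xed\xfe", "Mach-O 32-bit, little-endian"),
    (b"\xca\xfe\xba\xbe", "Mach-O universal binary (or Java class file)"),
    (b"FORM", "IFF-style container"),
]


def sniff(data):
    """Guess a file type from its leading bytes."""
    for magic, description in MAGIC_NUMBERS:
        if data.startswith(magic):
            return description
    return "unknown data"


def sniff_file(path):
    """Read just enough of the file to compare against the table."""
    with open(path, "rb") as handle:
        return sniff(handle.read(16))


print(sniff(b"bplist00\xdc\x01\x02"))  # Apple binary property list
```

Note that none of this looks at the filename, which is why renaming a file does not change the answer.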
OBTW, here’s a handy Terminal tip: You can drag icons from the Finder into a Terminal window. This will paste in the full path to the file or folder you drag in.
Sometimes file lets you down, or maybe you want more information about a file. You can always try Quick Look in the Finder. If that doesn’t work, you can use hexdump to show a file’s bytes. Pass the -C option to show an ASCII translation as well.
For example, back to that image file:
% hexdump -C Flongnozzle-2012 | head
00000000 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52 |.PNG........IHDR|
00000010 00 00 07 8c 00 00 03 88 08 02 00 00 00 a2 e0 9b |................|
00000020 61 00 00 0c 45 69 43 43 50 49 43 43 20 50 72 6f |a...EiCCPICC Pro|
00000030 66 69 6c 65 00 00 48 0d ad 57 77 54 53 c9 17 be |file..H..WwTS...|
00000040 af 24 81 90 84 12 88 80 94 d0 9b 28 bd 4a ef 82 |.$.........(.J..|
Not much useful in the data area, but you can see PNG in all its glory.
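The data area is not entirely opaque, though. A PNG file starts with an 8-byte signature followed by an IHDR chunk that stores the width and height as big-endian 32-bit integers, and that is presumably where the 1932 x 904 reported by file comes from. The following Python sketch decodes those fields from the first 32 bytes of the dump; the png_header_info name is made up for this example.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_header_info(data):
    """Return (width, height, bit_depth, color_type) from PNG bytes, or None."""
    if len(data) < 29 or not data.startswith(PNG_SIGNATURE):
        return None
    # Each chunk begins with a 4-byte big-endian length and a 4-byte type.
    chunk_length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or chunk_length != 13:
        return None
    # IHDR payload: width, height, then one byte each for depth and color type.
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, bit_depth, color_type


# The first 32 bytes from the hexdump output.
header = bytes.fromhex(
    "89504e470d0a1a0a0000000d49484452"
    "0000078c000003880802000000a2e09b"
)
print(png_header_info(header))  # (1932, 904, 8, 2)
```

Color type 2 is truecolor RGB, which matches the "8-bit/color RGB" in the output of file.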
Some files have more string content in them. Here’s a hexdump from a patch file from the Reason digital audio workstation:
% hexdump -C CV-Spy--md.cmb
00000000 46 4f 52 4d 00 00 03 d8 50 54 43 48 43 41 54 20 |FORM....PTCHCAT |
00000010 00 00 00 04 52 45 46 53 43 4f 49 4e 00 00 00 06 |....REFSCOIN....|
00000020 bc 01 00 00 00 01 43 41 54 20 00 00 00 fc 44 45 |......CAT ....DE|
00000030 56 4c 46 4f 52 4d 00 00 00 f0 44 45 56 49 44 45 |VLFORM....DEVIDE|
00000040 53 43 00 00 00 47 bc 02 01 00 00 07 00 00 00 10 |SC...G..........|
00000050 00 00 00 12 43 56 20 56 61 6c 75 65 73 20 28 30 |....CV Values (0|
00000060 2d 3e 32 35 36 29 00 00 00 00 00 00 00 00 00 00 |->256)..........|
00000070 00 00 16 44 44 4c 20 44 69 67 69 74 61 6c 20 44 |...DDL Digital D|
00000080 65 6c 61 79 20 4c 69 6e 65 00 00 00 04 00 50 41 |elay Line.....PA|
...
If you’ve used Reason, the terms “CV Values” and “DDL Digital Delay Line” will be familiar.
The strings command extracts string-like sequences of bytes from a file:
% strings CV-Spy--md.cmb
FORM
PTCHCAT
REFSCOIN
CAT
DEVLFORM
DEVIDESC
CV Values (0->256)
DDL Digital Delay Line
...
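The idea behind strings is simple enough to sketch in a few lines of Python: scan the bytes for runs of printable ASCII characters of at least some minimum length (four is the usual default). This is a simplified approximation for illustration, not the actual implementation, and extract_strings is a hypothetical name.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of printable ASCII at least min_length bytes long."""
    # 0x20 (space) through 0x7e (~) covers the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# The first 32 bytes of the patch file from the hexdump output.
sample = bytes.fromhex(
    "464f524d000003d85054434843415420"
    "000000045245465343 4f494e00000006".replace(" ", "")
)
print(extract_strings(sample))  # ['FORM', 'PTCHCAT ', 'REFSCOIN']
```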
Property lists are a standard Mac and iOS file type, constructed of structured data of predictable types. Most property lists you’ll come across on the system are in a compact binary format that is fast to load. User preferences are stored as plists:
% pwd
/Users/markd/Library/Preferences
% file com.apple.iphonesimulator.plist
com.apple.iphonesimulator.plist: Apple binary property list
Unfortunately, the binary plist file is kind of hard to read:
% hexdump -C com.apple.iphonesimulator.plist
00000000 62 70 6c 69 73 74 30 30 dc 01 02 03 04 05 06 07 |bplist00........|
00000010 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 |................|
00000020 18 5e 53 69 6d 75 6c 61 74 65 44 65 76 69 63 65 |.^SimulateDevice|
00000030 5f 10 2f 4e 53 57 69 6e 64 6f 77 20 46 72 61 6d |_./NSWindow Fram|
00000040 65 20 69 50 68 6f 6e 65 53 69 6d 75 6c 61 74 6f |e iPhoneSimulato|
00000050 72 57 69 6e 64 6f 77 2e 32 2e 30 2e 37 35 30 30 |rWindow.2.0.7500|
00000060 30 30 5f 10 2f 4e 53 57 69 6e 64 6f 77 20 46 72 |00_./NSWindow Fr|
00000070 61 6d 65 20 69 50 68 6f 6e 65 53 69 6d 75 6c 61 |ame iPhoneSimula|
...
Luckily, there is the plutil utility that will convert between this binary format and something more human readable:
% plutil -convert xml1 com.apple.iphonesimulator.plist
% head !$
head com.apple.iphonesimulator.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>CurrentDeviceUDID</key>
<string>66E762FE-C171-481D-8DCF-BE908BB2945B</string>
<key>LocationMode</key>
<integer>3102</integer>
<key>NSWindow Frame iPhoneSimulatorWindow.0.0.500000</key>
<string>1264 312 368 716 0 0 1680 1028 </string>
(The !$ shortcut grabs the last argument from the previous command.) Note that plutil -convert rewrites the file in place, which is why head now shows XML.
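If you would rather inspect a plist without touching the file on disk, Python's standard plistlib module reads both the binary and XML flavors. The sketch below builds a small binary plist in memory (the single LocationMode key is borrowed from the output above purely as sample data) and renders it as XML; plist_to_xml is a hypothetical helper name.

```python
import plistlib


def plist_to_xml(data):
    """Convert plist bytes (binary or XML) to XML bytes, entirely in memory."""
    # plistlib.loads detects the input format on its own.
    return plistlib.dumps(plistlib.loads(data), fmt=plistlib.FMT_XML)


# Build a tiny binary plist to stand in for a real preferences file.
binary = plistlib.dumps({"LocationMode": 3102}, fmt=plistlib.FMT_BINARY)
print(binary[:8])  # b'bplist00'
print(plist_to_xml(binary).decode("utf-8"))
```

For a real file, read its bytes with open(path, "rb") and pass them to the same function.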
The OS may know more about a particular file than you might think. Spotlight’s job is to index files on your disk and make it easy to Find Stuff by querying metadata. There’s command-line access to this metadata via the mdls command, so you can ask Spotlight what goods it has about a file:
% mdls launchHandler.m
kMDItemContentCreationDate = 2014-07-02 19:22:02 +0000
kMDItemContentModificationDate = 2014-07-02 19:23:58 +0000
kMDItemContentType = "public.objective-c-source"
kMDItemContentTypeTree = (
"public.objective-c-source",
"public.source-code",
"public.plain-text",
"public.text",
"public.data",
"public.item",
"public.content"
)
...
kMDItemKind = "Objective-C Source"
kMDItemLastUsedDate = 2014-07-02 19:32:46 +0000
kMDItemLogicalSize = 1443
kMDItemPhysicalSize = 4096
kMDItemUseCount = 2
kMDItemUsedDates = (
"2014-07-02 10:00:00 +0000"
Here mdls tells you that this file is Objective-C source code, along with other UTIs (Uniform Type Identifiers) that describe the data. It is indeed source code, and plain text. There’s also some interesting data, such as how much space on disk it actually consumes versus how many bytes comprise the file.
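The logical versus physical size distinction is also visible without Spotlight. As a rough sketch, the Python snippet below reads both numbers from the file system via os.stat; it assumes a Unix-like system where st_blocks is reported in 512-byte units, and the values should correspond to kMDItemLogicalSize and kMDItemPhysicalSize, though that mapping is an assumption here. The function name is made up for this example.

```python
import os


def logical_and_physical_size(path):
    """Return (bytes of content, bytes allocated on disk) for a file."""
    info = os.stat(path)
    # st_size is the logical length; st_blocks counts 512-byte blocks
    # actually allocated, which is typically rounded up to the file
    # system's allocation unit.
    return info.st_size, info.st_blocks * 512
```

A 1443-byte file on a file system with 4096-byte blocks would report 1443 and 4096, matching the mdls output above.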
Another system database of information is maintained by Launch Services, which has the last word on what program will open which file. Double-click on a file to open it? The Finder asks Launch Services. Use open to open a file from the command line? It too uses Launch Services to figure out who to actually launch.
lsappinfo is a utility that uses Launch Services (as well as Core Application Services) to give you information about currently running applications. This is tangential to figuring out what a file actually is, but you can learn some cool stuff with it. Try lsappinfo sharedmemory to get some shared memory information, or lsappinfo visibleProcessList for a list of visible applications in front-to-back window order.
The other Launch Services features are accessed either through API, or through lsregister, a well-known but fundamentally undocumented utility. lsregister lives in the Support directory of the LaunchServices framework, which in turn lives inside the CoreServices framework, most likely at this path on your machine:
/System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister
lsregister is primarily used by the OS to register the files that’ll be handled by a particular application, but you can get a dump of its database by using:
% lsregister -dump > services-db.txt
(To run this command, you’ll need to extend your PATH with that Support directory.)
This produced about 61,000 lines of output, so it’s a bit unwieldy to use on a day-to-day basis, but it can be fun to poke around.
But some real power comes from a single call: LSCopyApplicationURLsForURL. Give this call a URL to a file, and it will return the list of applications that can deal with it. There are different query modes, like “What are all of the applications that can open this file?” or “What are all of the applications that can edit this file?” Launch Services doesn’t actually introspect the files like file does. Instead, it uses file extensions, creator codes and the like to match files to eligible applications.
Here’s a little utility that takes a filename on the command line, calls LSCopyApplicationURLsForURL and prints out an array of matching applications. You can find the code at this gist.
@import Foundation;
@import CoreServices;

// clang -g -fobjc-arc -fmodules launchHandler.m -o launchHandler

int main (int argc, const char *argv[]) {
    // Rudimentary argument checking.
    if (argc != 2) {
        printf ("usage: %s filename\n", argv[0]);
        return -1;
    }
    const char *filename = argv[1];

    // Get a string of the full path of the file, using realpath() as the workhorse
    char pathbuffer[MAXPATHLEN];
    char *fullpath = realpath (filename, pathbuffer);
    if (fullpath == NULL) {
        fprintf (stderr, "could not find %s\n", filename);
        return -1;
    }
    NSURL *url = [NSURL fileURLWithPath: @( fullpath )];

    // Ask launch services for the different apps that it thinks could edit this file.
    // This is usually a more useful list than what can view the file.
    LSRolesMask roles = kLSRolesEditor;
    CFArrayRef urls = LSCopyApplicationURLsForURL((__bridge CFURLRef)url, roles);
    NSArray *appUrls = CFBridgingRelease(urls);

    // Extract the app names and sort them for prettiness.
    NSMutableArray *appNames = [NSMutableArray arrayWithCapacity: appUrls.count];
    for (NSURL *appUrl in appUrls) {
        [appNames addObject: appUrl.lastPathComponent];
    }
    [appNames sortUsingSelector: @selector(compare:)];

    // Finally emit to the user.
    for (NSString *appName in appNames) {
        printf ("%s\n", appName.UTF8String);
    }
    return 0;
} // main
The main interesting parts are using the realpath() library call to turn the command-line argument into a full path (so you don’t have to worry if the user specified a relative, full, or ~-relative path), and then feeding that into LSCopyApplicationURLsForURL. The kLSRolesEditor role is used because it returns the most reasonable list of applications. Sometimes the candidate applications can give you a clue as to what a file is.
% ./launchHandler launchHandler.m
TextEdit.app
Xcode-4.6.app
Xcode-5.0.2.app
Xcode.app
Xcode6-Beta2.app
% ./launchHandler someGraphic.png
Acorn.app
ColorSync Utility.app
Preview.app
% ./launchHandler ./Flongnozzle-2012
%
Unfortunately, it didn’t help out with the Flongnozzle case because there is no file extension or any other creator / file type information available.
The set of available command-line tools is remarkably vast, so I have probably missed one or two or a dozen other tools that might help you identify random files. If you have a favorite trick, please leave a comment!