Where Servers Are Going
With the release of macOS 10.12 and iOS 10, Apple has given users everywhere access to its Basic Neural Network Subroutines (BNNS, which we at Big Nerd Ranch feel should be pronounced “bananas”). Google open-sourced TensorFlow, its machine learning framework, nearly a year ago. Maybe you think it is time to add some artificial intelligence to your Mac or iOS application and are wondering which framework to use. The answer, for now at least, is that you will probably use both.
The summer after my first year in college, I had a terrible, terrible job on the night shift at a USAir customer service call center. This job mainly involved talking on the phone with people who hated me—a soul-bruising task. I knew someone who had a great job at the Mitre Corporation’s Advanced Signal Processing Lab, and I asked him, “What do I need to know to get a job like yours?” And he replied, “C programming on Unix.”
I went back to school and raised a ruckus in the Electrical Engineering department until they gave me access to a Unix machine, and I taught myself C. I got the job at Mitre, and I spent the rest of my summers in college doing machine learning experiments. In particular, I worked on speech recognition problems using neural networks.
Thanks to 25 years of video gamers who were willing to pay top dollar for good GPUs, a lot has changed since 1989. Neural networks involve huge amounts of floating point operations, so in 1989 we could only train and use the simplest networks. In 1989, if you sprang for a MIPS R3010, you would be delighted with 4 million floating point operations per second. Today, the Nvidia GTX 1080 graphics card (just $650) is 2 million times faster: it does 9 trillion floating point operations per second.
And this brings us to one of the challenges of using Google’s TensorFlow: The engineers who wrote TensorFlow implemented all the code to move the computation onto the graphics processor using CUDA. CUDA is an Nvidia-specific technology and most Apple products do not use Nvidia graphics processors. (There is an effort to rewrite those parts using OpenCL, which is supported on all Apple devices, but if you are using TensorFlow today it will not be GPU-accelerated on most Apple devices.)
Most deep learning techniques are based on neural nets. Neural nets are a rough simulation of how biological neurons work. They are connected in a network and the output of one neuron acts as one input to many other neurons. The network learns by adjusting the weights between the neurons using a technique called Backpropagation. (You can get more details from Bolot’s recent blog post.)
This brings us to one of the challenges of using Apple’s BNNS: There is no support for backpropagation—the networks don’t learn. To use the BNNS, you need to train the network using something else (like TensorFlow) and then import the weights.
Thus, there are two ways to get deep learning into your Mac or iOS application:
Solution 1: Do all the neural net work on a server using TensorFlow. You must be certain that all your users always have a good internet connection and that the data you are sending/receiving is not too voluminous.
Solution 2: Train the neural net using TensorFlow and export the weights. Then, when you write the iOS or Mac application, recreate the neural net using BNNS and import the weights.
Google would love to talk to you about the first solution, so the rest of this post will be about the second. I’ve used TensorFlow’s MNIST example: handwritten digit recognition with just an input and an output layer, fully connected. The input layer has 784 nodes (one for each pixel) and the output layer has 10 nodes (one for each digit). Each output node gets a bias added before it is run through the softmax function. Here is the source, which you get automatically when you install TensorFlow.
My sample code is posted on GitHub.
If you train your neural net using TensorFlow, you will almost certainly write that code in Python. (There is a C++ interface, but it is very limited and poorly documented.) You will create a Variable tensor to hold the weights. Here I’m creating a two-dimensional Variable tensor filled with zeros:
import tensorflow as tf

W = tf.Variable(tf.zeros([784, 10]))
You will give the neural net data and train it. Then write out the weights:
weight_list = W.eval().tolist()
with open('/tmp/weights.data', 'w') as thefile:
    thefile.write(str(weight_list))
This will result in a text file filled with arrays of floating point numbers. In my example, I get an array containing 784 arrays. The inner arrays each contain 10 floating point numbers:
[[0.007492697797715664, -0.0013006168883293867, …, -0.006132100708782673], [0.0033850250765681267, …-5.2658630011137575e-05]]
This is an easy format to read in Cocoa. The sample code has some routines that will read one- and two-dimensional arrays.
Then, using BNNS, recreate the topology of the neural network that you created in TensorFlow and copy the weights in:
// Describe the input and output vectors
BNNSVectorDescriptor inVectorDescriptor =
    { .data_type = BNNSDataTypeFloat32, .size = IN_COUNT };
BNNSVectorDescriptor outVectorDescriptor =
    { .data_type = BNNSDataTypeFloat32, .size = OUT_COUNT };

BNNSFullyConnectedLayerParameters parameters =
    { .in_size = IN_COUNT, .out_size = OUT_COUNT };

float *weightVector = (float *)malloc(sizeof(float) * IN_COUNT * OUT_COUNT);
// Fill 'weightVector' with data from a file here!
parameters.weights.data = weightVector;
parameters.weights.data_type = BNNSDataTypeFloat32;

float *biasVector = (float *)malloc(sizeof(float) * OUT_COUNT);
// Fill 'biasVector' with data from a file here!
parameters.bias.data = biasVector;
parameters.bias.data_type = BNNSDataTypeFloat32;

parameters.activation.function = BNNSActivationFunctionIdentity;

// Create the filter
BNNSFilter filter = BNNSFilterCreateFullyConnectedLayer(&inVectorDescriptor,
                                                        &outVectorDescriptor,
                                                        &parameters,
                                                        NULL);
To use the resulting filter, supply arrays for the input and output:
float inBuffer[IN_COUNT];
// Fill inBuffer with input here
float outBuffer[OUT_COUNT];
int success = BNNSFilterApply(filter, inBuffer, outBuffer);
There is a big shortcoming here: once you dump the weights out of TensorFlow, your application won’t get any smarter. At this time, BNNS doesn’t do backpropagation, so it doesn’t learn. However, BNNS is part of the Accelerate framework and is tuned to the hardware of every iOS and macOS device, which makes inference fast and power-efficient.
Note that TensorFlow is very extensive and has many operations that are not available in BNNS. When you are designing the topology of your neural network, stick to the operations that are available in both toolkits; otherwise you will need to implement the missing operation yourself. In this example, TensorFlow has a built-in softmax operation, but I had to implement softmax myself to use it on the Mac.
This is a pretty poor solution, and eventually two things will happen: TensorFlow will become GPU-accelerated on Apple hardware, and BNNS (or a successor) will gain support for training.
However, for now, you will probably use both if you want to add machine learning to your Mac or iOS app.
Want more info on machine learning? Check out this post on getting started with Core ML, a new framework announced at WWDC 2017.