Data collection is both expensive and difficult. Unless your company already has systems in place when questions about what machine learning can do for you come up, the data roadblock can prevent your team from moving forward.
This is not to say your team doesn't think it would be worth it. However, when moonshot ideas come up, you have to consider whether the data is available and whether the problem is even solvable with that kind of data or model.
This problem of product validation without real data came up in a recent project the team at BNR worked on. We were asked to find small objects on a label using computer vision. The catch was that the objects were very small and the label itself was less than an inch square. We were also going into the project with no previously collected data.
This posed an interesting problem for us. In the everyday world of object detection, you are normally looking for large objects, like people or cars. These complex, many-pixel objects create very distinctive signatures when passed into a neural network and compressed. To make things more difficult, all of the training data would have to be collected at the start of the project.
Before contracts were signed, our machine learning team wanted to make sure we could in fact build brilliance and deliver what the client was asking for. Normally we would look up research papers and form a game plan around which models might perform well and what preprocessing we might need to build. However, small objects only a few pixels wide are hardly covered by any research papers.
So we decided to generate our own synthetic dataset, aimed at harder conditions than we were expecting, with the goal of finding the bounds of a successful model. But we had a big limitation on our research at this point: there was no signed contract, so we needed to make the data and train a model fast in order to feel good about green-lighting the project.
The goal was to build a small image with a barcode in the middle. We were going to find out how small we could make the image while still getting good data. We used a Data Matrix barcode, as they are among the smaller 2D barcodes. We would also lay down flecks and record the locations of all of them. Below is a result from our code.
We only need a few tools to make these images. The main one for this task is the pylibdmtx library, which lets us generate Data Matrix barcodes easily. The Pillow library, or PIL, is used to create and manipulate images in Python. The last important library is pandas, which we used to build the dataset of object locations.
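As a quick illustration of how these pieces fit together, here is a minimal sketch (the payload text and label size are placeholders of ours) that renders a Data Matrix barcode with pylibdmtx and pastes it onto a blank label with Pillow:

```python
from PIL import Image
from pylibdmtx.pylibdmtx import encode

# Encode a placeholder payload as a Data Matrix barcode.
encoded = encode('hello-bnr'.encode('utf8'))

# pylibdmtx returns raw pixel data; wrap it in a PIL image.
barcode = Image.frombytes('RGB', (encoded.width, encoded.height), encoded.pixels)

# Paste the barcode into the middle of a blank white label.
label = Image.new('RGB', (160, 160), 'white')
offset = ((label.width - barcode.width) // 2,
          (label.height - barcode.height) // 2)
label.paste(barcode, offset)
label.save('label.png')
```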
The settings for our label are below. Most notably, our label will be 160px, contain 40 objects, and we will create a total of 4,000 images.
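The original settings block isn't reproduced here, but capturing those values as constants (the names and the two fleck colors are our assumptions) looks roughly like this:

```python
IMAGE_SIZE = 160    # label width and height in pixels
NUM_OBJECTS = 40    # flecks scattered on each label
NUM_IMAGES = 4000   # total images to generate
FLECK_SIZE = 2      # each fleck is 2px square
FLECK_COLORS = [(212, 175, 55), (192, 192, 192)]  # assumed: gold and silver
```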
During the generation process, we randomly pick locations for the objects. The size of each object is fixed at 2px square, and we assign it one of two colors. Using pandas, we add each bounding box to a data frame for training.
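A condensed sketch of that generation loop, reusing the settings above (the file paths are assumptions, and the barcode step is elided for brevity), might look like this:

```python
import random

import pandas as pd
from PIL import Image, ImageDraw

records = []
for i in range(NUM_IMAGES):
    label = Image.new('RGB', (IMAGE_SIZE, IMAGE_SIZE), 'white')
    draw = ImageDraw.Draw(label)

    # ... generate the Data Matrix barcode, paste it in the middle,
    # and record its bounding box as a BARCODE row ...

    for _ in range(NUM_OBJECTS):
        # Pick a random top-left corner that keeps the fleck on the label.
        x = random.randint(0, IMAGE_SIZE - FLECK_SIZE)
        y = random.randint(0, IMAGE_SIZE - FLECK_SIZE)
        draw.rectangle([x, y, x + FLECK_SIZE, y + FLECK_SIZE],
                       fill=random.choice(FLECK_COLORS))
        records.append({'image': f'{i}.png', 'name': 'GOLD_FLAKE',
                        'xMin': x, 'yMin': y,
                        'xMax': x + FLECK_SIZE, 'yMax': y + FLECK_SIZE})

    label.save(f'data/images/{i}.png')

# Collect every bounding box into a data frame for training.
pd.DataFrame(records).to_csv('data/annotations.csv')
```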
What data to record for each bounding box depends on the model you end up using. A Faster R-CNN model, for instance, wants xmin, ymin, xmax, ymax for the bounding box variables. A YOLO model requires x, y, width, height, where x and y are the center of the object. Below we have an example of our output.
```
,image,name,xMin,yMin,xMax,yMax
0,0.png,BARCODE,145,145,655,655
1,0.png,GOLD_FLAKE,346,161,358,173
2,0.png,GOLD_FLAKE,117,734,129,746
```
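To make the difference between the two formats concrete, here is a small helper of our own (not part of the original scripts) that converts corner coordinates into the YOLO-style center format:

```python
def corners_to_yolo(x_min, y_min, x_max, y_max):
    """Convert corner coordinates to a YOLO-style center point plus size."""
    width = x_max - x_min
    height = y_max - y_min
    return x_min + width / 2, y_min + height / 2, width, height

# The first GOLD_FLAKE row above: corners (346, 161, 358, 173)
# become a 12x12 box centered at (352, 167).
print(corners_to_yolo(346, 161, 358, 173))  # (352.0, 167.0, 12, 12)
```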
When picking a model for prototyping, we weren't worried about platform or deployment issues. We really wanted to know whether a generic object detection algorithm could be trained to solve this problem.
One item we did take into consideration was cost. Normally we would use a cloud platform for extra power, as object detection problems can take a very long time to reach convergence. But we couldn't run up compute costs for a prototype, so we trained locally on a laptop. Luckily, we had access to an eGPU enclosure with an AMD Vega GPU.
With our hardware selected, we needed a tool that would let us quickly train an object detector with an AMD GPU on macOS. As of this writing, Create ML, Turi Create, and PlaidML are the tools on macOS that give us access to the eGPU for training. We ruled out PlaidML because we wanted to move quickly, and the other two options already have out-of-the-box object detectors we just need to train.
Today, we would use Create ML to build this prototype; just note that Create ML expects a JSON file with the object annotations. At the time we created this prototype, however, Create ML was not an option, so we went with Turi Create. It is important to note that both Create ML and Turi Create use a YOLO model for object detection, so your annotations will need to be formatted as mentioned above.
Turi Create is a bit hard to pick up at first, since the documentation lacks many details. Once finished, we had a small script of less than 70 lines of code that converts the CSV file into an SFrame, a datatype used by Turi Create much like TensorFlow's TFRecord object.
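Our full converter isn't reproduced here, but a minimal sketch of the CSV-to-SFrame step, reusing the hypothetical paths from the generation sketch above, could look like the following. Turi Create expects each image's annotations as a list of dictionaries holding a label and center-based coordinates:

```python
import os

import pandas as pd
import turicreate as tc

df = pd.read_csv('data/annotations.csv')

def to_annotation(row):
    # Convert our corner-based CSV rows to Turi Create's center-based format.
    width = row['xMax'] - row['xMin']
    height = row['yMax'] - row['yMin']
    return {'label': row['name'],
            'coordinates': {'x': row['xMin'] + width / 2,
                            'y': row['yMin'] + height / 2,
                            'width': width,
                            'height': height}}

# Build one list of annotation dictionaries per image file.
grouped = df.groupby('image').apply(
    lambda g: [to_annotation(r) for _, r in g.iterrows()])
annotations = tc.SFrame({'name': list(grouped.index),
                         'annotations': list(grouped)})

# Load the generated images and join them to their annotations by filename.
images = tc.image_analysis.load_images('data/images', with_path=True)
images['name'] = images['path'].apply(lambda p: os.path.basename(p))
sframe = images.join(annotations, on='name')
sframe.save('data/ig02.sframe')
```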
The actual training code is even smaller than our converter, clocking in at less than 40 lines of code.
```python
import turicreate as tc

# Params
grid_shape = [20, 20]
batch_size = 64
iterations = 20000

# Load the data
data = tc.SFrame('data/ig02.sframe')

# Make a train-test split
train_data, test_data = data.random_split(0.8)

# Create a model
model = tc.object_detector.create(train_data,
                                  grid_shape=grid_shape,
                                  batch_size=batch_size,
                                  max_iterations=iterations)

# Save predictions to a column on the test data
test_data['predictions'] = model.predict(test_data)

# Evaluate the model and save the results into a dictionary
metrics = model.evaluate(test_data)

# Save the model for later use in Turi Create
model.save('models/barcode.model')

# Export for use in Core ML
model.export_coreml('models/barcodeFlakeDetector.mlmodel')

# Show test results
test_data['image_with_predictions'] = tc.object_detector.util.draw_bounding_boxes(
    test_data['image'], test_data['predictions'])
test_data[['image', 'image_with_predictions']].explore()
```
After a considerable amount of training time, we observed the results seen below. Overall, that is not bad for only 20k steps; Google's object detectors are trained for around 200k steps on their initial passes. Again, we were not going for production, we just wanted to validate the project. Additionally, we were excited to see these results from our model because YOLO is built around speed, not accuracy, so another model like Faster R-CNN might be able to provide even better results.
With this result, we were able to feel confident about the job ahead of us. With real data, longer training runs, and better-tuned models, we could easily get better results. The exciting conclusion to the saga was that we were successful: with about a day's worth of work and processing, we were able to squash fears that the project was infeasible, develop important questions about the goals of the project, and help define what success would look like.
Do not let a lack of data be the thing that slows innovation at your company. With some downtime, you could easily create enough data to get leadership buy-in and push machine learning forward.