The audience at the Google I/O 2018 Keynote gasped and applauded at the demonstrations of pervasive and ubiquitous AI in Google’s products. Almost every mention of a new feature highlighted a machine learning application that puts seemingly magical things at our fingertips. But behind the magic is simply sufficiently advanced technology, commoditized into packages such as MLKit and TensorFlow, that makes it possible for all developers, even those outside Google, to add such features to their own applications.
Google Assistant has learned to perform multiple actions in a single request and gained the ability to continue conversations, so we don’t have to keep prompting it with “OK Google” after every request. One of the most impressive demonstrations was a prototype of Google Assistant capable of carrying out telephone conversations to make appointments and check hours of operation. The assistant was so convincing at mimicking a human that it raised concerns, but it also demonstrated state-of-the-art technology that was unimaginable just a few years ago.
Applications of AI and ML to medicine continue to impress and inspire. Various ML models trained with the assistance of medical professionals often match the performance of practitioners and even reach the level of experts. Furthermore, as demonstrated at the keynote, these models sometimes surface additional information that the researchers had not thought to look for. For example, the diabetic retinopathy model was able to determine biological sex, age, and other factors that were not thought to be extractable from images of retinas.
ML models are very good at making predictions. Gmail can now generate smart replies and even help compose emails based on contextual cues such as the subject of the email, the current date, or the recipient. Google Photos can determine the content of a photo and suggest the most likely action, like scanning a document or sharing photos of your friends with them.
AI and ML are also at the core of the next version of Android, bringing features like adaptive battery, adaptive brightness, and predictive actions. The OS adapts to the user’s usage patterns and allocates resources accordingly, which Google says results in a 30% reduction in battery usage.
As impressive as these product demos are, most developers are interested in the tools that enable us to build our own applications. Fortunately, we were not disappointed. TensorFlow continues to evolve and gain new features: eager execution allows easier experimentation, TensorFlow Lite scales models down to run on mobile hardware, and Swift for TensorFlow brings compiler technology to graph extraction, promising to combine the best of both worlds in terms of performance, flexibility, and expressivity.
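To give a sense of what running a scaled-down model on a device involves, here is a minimal sketch using the standalone TensorFlow Lite interpreter in Kotlin. The model file, input shape, and class count are illustrative assumptions, not something shown at the keynote.

// Minimal sketch: run a converted .tflite image classifier on Android.
// Assumes org.tensorflow.lite.Interpreter from the tensorflow-lite dependency.
fun classify(modelFile: File, input: Array<Array<Array<FloatArray>>>): FloatArray {
    val interpreter = Interpreter(modelFile)
    // One probability per class; 1000 classes is just a placeholder here.
    val output = Array(1) { FloatArray(1000) }
    // Input must match the shape the model was trained with, e.g. [1, height, width, 3].
    interpreter.run(input, output)
    interpreter.close()
    return output[0]
}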
Google also introduced MLKit, a new set of machine learning APIs for Firebase. Out of the box, MLKit can label images, detect faces, recognize text, scan barcodes, and identify landmarks, using an appropriate ML model for each task. Some of the APIs work on the device, while others are cloud-only. It is also possible to roll out a custom ML model with TensorFlow Lite. New built-in models are being added to MLKit, such as the smart replies functionality from the Gmail demo.
The goal of MLKit is to make it easier to incorporate advanced capabilities in mobile applications. Here’s an example of using MLKit for identifying image labels:
FirebaseVision.getInstance().visionLabelDetector
    .detectInImage(image)
    .addOnSuccessListener { labels ->
        // Each label comes with a confidence score between 0 and 1.
        val output = labels.map { "${it.label}: ${it.confidence}" }
        Log.d(TAG, "Found labels:\n$output")
    }
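The other detectors follow the same Task-based pattern. As a rough sketch, assuming the same firebase-ml-vision APIs and an image obtained the same way, on-device text recognition looks almost identical:

FirebaseVision.getInstance().visionTextDetector
    .detectInImage(image)
    .addOnSuccessListener { result ->
        // Blocks are paragraph-like regions of recognized text.
        val text = result.blocks.joinToString("\n") { it.text }
        Log.d(TAG, "Found text:\n$text")
    }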
Every year it is getting easier to take advantage of the amazing progress made by AI and ML researchers. Many things considered impossible until recently are becoming commonplace features in mobile applications, and MLKit aims to make ML accessible to a much broader audience of developers. The power of off-the-shelf ML is just a Gradle dependency away.
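Putting it all together, here is the complete activity behind the snippet above: it lets the user pick an image from the gallery and runs the label detector on the result.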
const val REQUEST_GET_IMAGE = 1
const val TAG = "MLKitDemo"

class MainActivity : AppCompatActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        FirebaseApp.initializeApp(baseContext)
        setContentView(R.layout.activity_main)
        findViewById<Button>(R.id.get_image_button).setOnClickListener { getImage() }
    }

    // Let the user pick an image from the gallery.
    private fun getImage() {
        val intent = Intent(Intent.ACTION_PICK)
        intent.type = "image/*"
        startActivityForResult(intent, REQUEST_GET_IMAGE)
    }

    override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
        super.onActivityResult(requestCode, resultCode, data)
        if (resultCode != Activity.RESULT_OK) return
        when (requestCode) {
            REQUEST_GET_IMAGE -> data?.let { processImage(it) }
        }
    }

    private fun processImage(data: Intent) {
        // The picked image comes back as a content Uri.
        val uri = data.data ?: return
        val image = FirebaseVisionImage.fromFilePath(baseContext, uri)
        FirebaseVision.getInstance().visionLabelDetector
            .detectInImage(image)
            .addOnSuccessListener { labels ->
                val output = labels.map { "${it.label}: ${it.confidence}" }
                Log.d(TAG, "Found labels:\n$output")
            }
            .addOnFailureListener {
                Log.e(TAG, "Error", it)
            }
    }
}
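And it really is just a Gradle dependency away. In a module-level build script it looks something like the following (the artifact name is the one announced for Firebase; the version shown is illustrative, so check the current Firebase release notes):

dependencies {
    // MLKit vision APIs, shipped as part of Firebase.
    implementation("com.google.firebase:firebase-ml-vision:16.0.0")
}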