On-device image intelligence
Apple has built powerful image intelligence features into its devices, giving iOS users and developers tools they can use to get things done.
When it ships, iOS 13 will bring sophisticated AI to imaging, such as:
- The Photos app can curate your image library to highlight and organize the best images – it cuts out clutter, screenshots and duplicates, and tries to put visually pleasing collections together.
- Photo editing tools provide more granular effects that can be applied more quickly. Portrait Lighting adjustments can be made inside the Camera app while taking a picture.
- iOS 13 also lets you use Photos image editing tools in video.
- The AI in Photos can now actually ‘look’ at one of your images, figure out the central subject(s) of that image using a technique called ‘Image Saliency’ and recommend adjustments to make that image look better – all on the device.
- Current speculation is that next-generation iPhones will introduce a new 3-lens camera, offering 3D imaging capture and much improved night vision.
- A newly-revealed FaceTime Attention Correction feature uses AI to make it look as if you are looking at the other person when you speak, and not at the display.
State-of-the-art image intelligence
Apple has been working hard on computer vision.
It introduced Core ML 3 at WWDC, which gives developers – including enterprise developers – new imaging tools for use in their apps.
“Core ML 3 supports the acceleration of more types of advanced, real-time machine learning models. With over 100 model layers now supported with Core ML, apps can use state-of-the-art models to deliver experiences that deeply understand vision, natural language and speech like never before. And for the first time, developers can update machine learning models on-device using model personalization.” – Apple
The AI supports face detection, tracking and capture quality assessment, as well as text recognition, image saliency, image classification and image similarity identification.
You also get better landmark detection, rectangle detection, barcode detection and object tracking.
Apple has also figured out how to run all these operations on the device, with the calculations taking place inside the Neural Engine on Apple’s A-series processors.
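Of the capabilities listed above, image saliency is one of the easier ones to try. What follows is a minimal sketch, assuming the Vision framework's `VNGenerateAttentionBasedSaliencyImageRequest` on iOS 13 or macOS 10.15 and later. The helper names `salientRegions` and `suggestedCrop` are hypothetical, and the crop logic is only an illustration, not a description of how Photos chooses its adjustments.

```swift
import Vision
import CoreGraphics

/// Asks Vision where a viewer's attention is likely to land in `image`.
/// Returns bounding boxes in normalized coordinates (0...1, origin bottom-left).
@available(iOS 13.0, macOS 10.15, *)
func salientRegions(in image: CGImage) throws -> [CGRect] {
    let request = VNGenerateAttentionBasedSaliencyImageRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])  // runs synchronously, on device

    guard let observation = request.results?.first as? VNSaliencyImageObservation else {
        return []
    }
    return (observation.salientObjects ?? []).map { $0.boundingBox }
}

/// Hypothetical helper: the smallest rectangle that covers every salient region,
/// or nil when nothing salient was found.
func suggestedCrop(for regions: [CGRect]) -> CGRect? {
    guard let first = regions.first else { return nil }
    return regions.dropFirst().reduce(first) { $0.union($1) }
}
```

An app might pass the result of `salientRegions(in:)` to `suggestedCrop(for:)` and then scale the normalized rectangle to pixel coordinates before cropping.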
Animal detectives and other stories
Practically, Core ML 3 means developers can build apps that host their own built-in AI models to help drive intelligent features such as object recognition in photos.
This is the kind of intelligence behind a recent flood of stories explaining how iPhones running iOS 13 will soon be able to recognize cats and dogs within images, using the new VNAnimalDetector capability in Apple’s Vision framework.
(Photos has been able to recognize both types of animal since 2016, but this is a more granular model, capable of determining which animal is depicted from less visual information.)
This means that you’ll be able to ask Siri to show all the images you have of traffic lights, dogs and horses, for example. Or any image that doesn’t include a cat.
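For developers, cat and dog recognition might look something like the sketch below. It assumes the request type found in the shipping Vision framework, `VNRecognizeAnimalsRequest`, with labels drawn from `VNAnimalIdentifier`, rather than the VNAnimalDetector name used in early reports. The `DetectedAnimal` type and the `detectAnimals` and `hasAnimal` helpers are hypothetical names introduced for illustration.

```swift
import Vision
import CoreGraphics

/// One animal label reported for an image (hypothetical helper type).
struct DetectedAnimal {
    let label: String        // e.g. VNAnimalIdentifier.cat.rawValue
    let confidence: Float    // 0.0 ... 1.0
    let boundingBox: CGRect  // normalized coordinates, origin bottom-left
}

/// Runs Vision's animal recognizer on a single image, entirely on device.
@available(iOS 13.0, macOS 10.15, *)
func detectAnimals(in image: CGImage) throws -> [DetectedAnimal] {
    let request = VNRecognizeAnimalsRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    let observations = request.results as? [VNRecognizedObjectObservation] ?? []
    // Each observation can carry several candidate labels; keep them all.
    return observations.flatMap { observation in
        observation.labels.map { label in
            DetectedAnimal(label: label.identifier,
                           confidence: label.confidence,
                           boundingBox: observation.boundingBox)
        }
    }
}

/// True when any detection matches `animal` at or above `minimumConfidence`.
@available(iOS 13.0, macOS 10.15, *)
func hasAnimal(_ animal: VNAnimalIdentifier,
               in detections: [DetectedAnimal],
               minimumConfidence: Float = 0.5) -> Bool {
    return detections.contains {
        $0.label == animal.rawValue && $0.confidence >= minimumConfidence
    }
}
```

A query such as "any image that doesn’t include a cat" then reduces to keeping the images for which `hasAnimal(.cat, in:)` returns false. The 0.5 threshold is an arbitrary starting point, not a recommended value.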
Dog and cat detection apps, plant recognition solutions and automated shopping systems are merely the spiky bit at the top of the rapidly diminishing iceberg.
Can an iPhone save your life?
Apple surely hopes developers will find ways to use this kind of image intelligence to create hitherto unheard-of solutions, perhaps even including face-based sentiment analysis. It is also possible we will see rapid development of AI-based shopping solutions in which AI uses images to recognize and find the products you want.
There are much bigger consequences, particularly in medicine.
Apple recognizes how computer vision innovation is enabling the development of life-saving technologies, such as Triton Sponge, a 2018 Apple Design Award winner that helps surgeons accurately track patient blood loss.
Similar solutions – based on computer imaging – are being developed to improve X-ray, angiography, ultrasound and other forms of medical assessment and diagnosis.
In typical fashion, Apple is marketing these sophisticated technologies in terms bounded by consumer need, such as image recognition or improved FaceTime chats.
In truth, its machine intelligence teams are working on technologies that developers may be able to use to build far more profound solutions. AI + AR + remote medical diagnosis, perhaps?