You might not know this, but VoiceOver Image Descriptions on your iPhone can describe any photo amazingly well, offline!


VoiceOver on your iPhone is definitely useful for the visually impaired. But did you know that it has an Image Descriptions feature that does amazingly well at working out and describing the contents of any photo?

VoiceOver is a screen reader created by Apple that lives on your iPhone, whose job is to describe the contents of your screen by reading them aloud. It is designed for the visually impaired and lets them interact with their iPhones more easily. You simply touch or swipe on the screen and your iPhone speaks the relevant content.

VoiceOver can read everything from text and hyperlinks to images. It relies on Apple’s machine learning algorithms to work out what’s on the screen, and those algorithms are good enough to interpret a photo and describe it to the user through the Image Descriptions feature. The software isn’t doing all the work, though. Apple’s silicon, such as the A13 Bionic in the iPhone 11, runs machine learning computations much faster, which directly benefits the Image Descriptions feature.

Apple introduced Image Descriptions alongside iOS 11 back in 2017. At the time, the feature was a bit inconsistent at reading images. However, Apple has made improvements to it in iOS 14 that have made it much better.

But wait, doesn’t Google Lens already do that? Well, yes. But you shouldn’t compare the Image Descriptions feature with Google’s search-engine-backed online image recognition service. The interesting part is that Image Descriptions on iPhone works completely offline. This means nothing that VoiceOver sees on your screen is sent to Apple’s servers for processing, which cannot be said for Google Lens. And for a feature that works offline, it’s really good at making out the contents of a photo.

While it’s most helpful for visually impaired individuals, anyone can use VoiceOver on photos, for example to find out the name of the unfamiliar fast-food franchise behind you and your buddy in a picture.

That said, before I show you how to use VoiceOver to make your iPhone speak the content of a photo to you, there are some additional steps you’ll need to go through.

How to use VoiceOver to make your iPhone describe photos aloud?

Although Image Descriptions works offline, you need to download its image recognition package first. Here’s how you can do that:

  • Head over to Settings
  • Go to Accessibility > VoiceOver > VoiceOver Recognition > Image Descriptions
  • Toggle the Image Descriptions option on. The toggle should turn green instead of staying gray.
  • Right below the toggle will be a button that will invite you to download the image recognition package. Tap on it.

The Image Descriptions feature is now set up. All you have to do now is test it out, and you’ll need to enable VoiceOver for that. If you are visually impaired, chances are you already have it turned on and won’t have to do anything else before we begin the test. If you are not visually impaired and not accustomed to how VoiceOver works, you might want to leave it disabled for now.

The trick is to use Siri to enable it within the Photos app, which is where we are going to test the Image Descriptions feature.

Testing out the Image Descriptions feature by making it describe a photo in the Photos app

Note: This works both in the grid view of the Camera Roll album in the Photos app on your iPhone and when you have a photo open in full view.

  • Head over to the Photos app on your iPhone.
  • Tap on any photo you want your iPhone to describe for you.
  • Summon Siri using the “Hey, Siri” voice command.
  • Say “Turn on VoiceOver” and wait until Siri does.
  • To dismiss Siri easily, say “Hey, Siri” again and then say “Cancel.”

Now, simply tap on the photo once to let VoiceOver’s Image Descriptions feature do its job. You’ll be surprised at how accurate it is.

This is the image I used:

When I summoned VoiceOver to do its thing, here’s how it described this photo: “A body of water in front of a group of trees and buildings under a pink and blue sky.”

And this is a photo that one of my friends sent me:

Source: HomeFoods05

Can you guess how Image Descriptions described it? Here you go: “A fried food item topped with chopped scallions on a plate.”

The fact that it was able to pick out an ingredient of the dish above is particularly interesting, to say the least. You might want to try it out for yourself. Let us know how it worked out for you in the comments section below.


Note: This story contains affiliate links that may earn The 8-Bit commissions on successful purchases to help keep the site running.