In the April 2012 AccessWorld review of the VizWiz, Digit-Eyes, and LookTel Recognizer object identification apps for iOS, the reviewer concluded that each of these apps would make a valuable addition to your iPhone's app library. In the intervening months, several new offerings have hit the App Store that can help identify everything from a soup can to the return address on a package you receive in the mail. In this article I'll take a look at three: TapTapSee, CamFind, and Talking Goggles. All three of these apps compare pictures you snap using your rear-facing camera against extensive image databases and return the best match. TapTapSee and CamFind also use human staff to view the images and provide accurate matches. The third app, Talking Goggles, adds a unique twist: optical character recognition (OCR) that can decode small excerpts or labels of text printed on boxes, jars, magazine covers, and more.
An iPhone 5 was used for this review, but these apps should also work on other iOS devices that have a camera.
TapTapSee
TapTapSee from Net Ideas, LLC is a free iOS app boasting an extremely simple interface. Indeed, there are only three buttons, all clearly labeled. At the extreme upper left of the screen is the "Repeat" button, and at the extreme upper right is the "About" button. Tap anywhere else on the screen to locate the button most essential to this app: "Take Picture."
Aim your iPhone's camera at the item you'd like to recognize and double tap the screen. The shutter click sounds, and VoiceOver announces, "Picture One taken." Your photo is automatically uploaded to the company's servers, and when a match is found, VoiceOver speaks the item name.
The item name is not displayed on the screen, but you can press the "Repeat" button to have it re-voiced. The "Repeat" and "About" buttons at the upper left and right of the screen are very small and can be difficult to find. A better way to access them is with a one-finger swipe to the right to reach the "Repeat" button and a second swipe to reach the "About" button.
You don't have to wait until you get the results of one search before you take a second or third picture. In fact, you can take five pictures before the app recycles to the "Picture One taken" notification. This feature is especially handy when you have a number of items you wish to recognize or if you suspect your first attempt may not have been in focus or at the right distance. Consider snapping several and waiting for the results to come back one by one. Since the "Take Picture" button is so easy to find and press, it's a simple matter to hold your phone between your thumb and middle finger and tap with your index finger, leaving you a free hand to arrange each item for its moment in the spotlight.
TapTapSee announces the number of each picture as it is uploaded and the number of each result as it is returned and voiced. Unfortunately, you can only repeat the most recent result, so if TapTapSee has announced the name of item three, you cannot return to the information on item two. Also, since these results are not displayed on the screen, there is no way to review the item name letter by letter. Braille display and Zoom users will need to turn on speech to use the app since it only voices results.
Along with the app's version number, Terms of Use, and Privacy Policy, the About menu includes a setting labeled "Enable Auto-Focus Sound." If you enable this setting, TapTapSee will sound a double beep when your item is in focus. There is no flash setting, so the app defaults to auto-flash mode. I found the auto-focus sound feature extremely useful and not only for photo accuracy. If you do not close TapTapSee after you use it, the app will continue to auto-focus on one item after another until your phone is locked. If you're like me and run your iPhone with auto-lock set to "Never," the alert beep is handy as it will remind you the app is still running in the foreground.
Results
TapTapSee results are swift and accurate, and on those occasions when the results are delayed, the app voices a message saying, "This will take a few seconds," so you aren't left wondering if the photo had, in fact, uploaded. Most cans and boxes from my pantry were identified on the first try, but if I had to turn the can a bit to get a better picture, it was an easy matter to double tap the screen again without having to search for the "Take Picture" icon or find a "Back" button.
The auto-focus sound is a real help in learning to position the camera, and I quickly found myself using it to submit several photos in rapid succession from different angles. As mentioned, TapTapSee uses a combination of a proprietary image database and humans to perform recognition, so the same item often returned slightly different results. Three successive pictures of my keyboard returned these results: "Apple keyboard, wireless," "Apple keyboard," and "computer keyboard." Two successive images of a DVD reported "WKRP in Cincinnati" and "WKRP in Cincinnati: The Complete First Season DVD."
The app had little trouble identifying US currency. It did a fine job identifying most DVDs and CDs in my small disc library, and it also recognized most of the books I tried even if I photographed the wrong side. However, as was the case with the DVDs, the results varied in their completeness.
TapTapSee truly shines when it comes to non-branded, non-labeled items. From the street it reported that I live in a "white stucco house." A second photo revealed that the verbena bush near my front door is covered in red and orange flowers. Submitting a pair of photographs with the camera aimed at myself gave the results "man in blue shirt" and "man with gray beard." I hung a red t-shirt on my home office door, and TapTapSee nailed it with "red shirt hanging on white door." The app frequently included color and background information, such as "bacon in a black pan" or "orange cat sleeping on black couch."
Conclusions
Considering its simplicity and ease of use, TapTapSee is a must-have app for any individual with vision loss who uses VoiceOver with their iDevice. The price is right (free), and it has already become my go-to app for quickly identifying objects at home or out and about. If you are a Zoom or braille display user or just want a more full-featured recognizer app, read on.
Update
Shortly before publication, the developers of TapTapSee released an update that includes some exciting new features. There are two new buttons across the top of the Home screen: "Library" and "Share." Double tapping the "Library" button calls up your photo stream. Select any picture and double tap, and the image is sent to TapTapSee for identification. This is a great way to help clear the clutter of all those pictures of blank walls and wayward thumbs from your stream. Now that you've found the picture of your cat playing the piano, return to the Home screen and double tap the "Share" button to e-mail it to a friend or post to one of your social networks. TapTapSee sends out the most recent identified image, so you can now check to make sure you actually captured that circus clown in the frame before you share it on Facebook or Twitter.
CamFind
The CamFind app, also from Net Ideas, LLC and also free, builds upon the developer's experience with TapTapSee, and responds to a lot of useful feedback they received from the sight-impaired community.
CamFind is geared toward people who want to learn more about an item, and perhaps shop for it online, without having to type in a search. The CamFind interface differs significantly from the simplicity of TapTapSee. The opening screen still has only three buttons: "History," "Take Picture," and "Options," but tapping anywhere on the screen does not call up the "Take Picture" button as it does in TapTapSee. Instead, double tap the bottom center just above the "Home" button.
Also unlike TapTapSee, CamFind displays the item name on the Results page along with a thumbnail image of your picture. There are four additional buttons on the results page: "Shopping," "Related Image Search," "Map Search Results," and "Loading Movies." Snap a movie poster, and the "Loading Movies" button will take you to a site where you can watch a trailer for the film. The WKRP DVD image took me to a web store where I could order that DVD and several other related titles, including Night Court and Murphy Brown. The "Map Search Results" button is supposed to show you where you can purchase the item locally, but mostly I got links to local hotels and restaurants. The "Related Image Search" button does just that, and the "Shopping" button leads you to online merchants who sell the item plus other resources, such as the manufacturer's homepage, Wikipedia articles, and other sites of possible interest.
Unlike with TapTapSee, you can't take successive pictures without first returning to the main screen. Unfortunately, the two-finger scrub gesture does not work in this app. Instead, you have to locate and double tap the "Back" button at the upper left of the screen.
The main screen then lists all of the items you have scanned to date. You can return to the Results menu by tapping any item on the list, which can be handy if you identified an item while you were out and want to wait until you return home to shop or learn more. The History menu has a "Clear" button for when your list starts to get unwieldy, and an "Add Item" button that allows you to search by text or open and identify a picture from your photo library.
Results
Since CamFind and TapTapSee use the same image databases and human staff, the results tend to be similar. According to the developers, TapTapSee has been optimized to give results that individuals who are visually impaired would prefer, and I did experience at least two instances where I believed this to be true. The "red shirt hanging on white door" message that TapTapSee reported came back simply as "red shirt" using CamFind. In general, TapTapSee is far more likely to include background information or to mention more than one item, such as "loaf of bread on a cutting board." Another time when I got usefully different results was when I snapped the same car with both apps. CamFind reported "red Toyota Camry" whereas TapTapSee reported "car with open door," which was potentially the information I wanted to know.
CamFind also offers a voice search feature and language support. You'll find both in the Options menu. The app also uses auto-flash and auto-focus, but currently, there is no beep alert to let you know the object is in focus. Happily, I was informed by one of the developers that this feature will be added to an upcoming release.
Conclusions
This app is an excellent companion to TapTapSee, especially if you want to save your results for later or do additional research on the items you photograph. Braille and Zoom users may also prefer this app to TapTapSee, since it displays results on screen. Since CamFind is also free, there's no reason not to have both apps in your app library.
Talking Goggles
The Talking Goggles iOS app from Sparkling Apps ($0.99) uses the Google Goggles image database, the same one used by the popular Android app of the same name. The app uses its own voice to announce the results. It also features a "Video Camera" mode you can use to identify one item after another without having to pause to take individual photos. Finally (and most interestingly), whenever the app detects text, it uses OCR to try to recognize and announce it.
When you start Talking Goggles, you are presented with four buttons. At the upper left of the screen is the "Flash" button that toggles the device's flash off and on. (This app does not use auto-flash.) At the upper right, there is a "Flag" button. Select this option to change the app's language settings. The screen's bottom left contains the "Gallery" button. Double tap this button to bring forth icons for "Camera Roll," "Photo Stream," and "Goggles Library." Open any of these, select an image, and double tap. Talking Goggles will open the photo and try to identify it.
Move to the bottom right, and you will find a button that toggles the app between "Video Camera" and "Still Camera" modes. In the lower middle is one last button. When you are in "Video Camera" mode, this button toggles between "Record" and "Stop." When you are in "Still Camera" mode, this button has the somewhat baffling label "Camera Copy14."
Double tapping the "Camera Copy14" button does not take a picture. You have to double tap a second time because the first double tap offers you the opportunity to focus the camera at the proper distance from the object you wish to scan. Unfortunately, there is no audible feedback to assist.
The "Video Camera" mode causes Talking Goggles to attempt to identify objects it finds in the video stream. After you point the camera toward an object you wish to identify, Talking Goggles will continue to attempt to identify one item after another on the fly until you press the "Stop" button, the "Still Camera" button, or press your device's "Home" button to exit the app.
After voicing a "Still Camera" result, Talking Goggles displays the name of the last item on the screen. Double tapping the name calls up a web search for the item. There is no list of previously recognized items. You can access a history of your photos from the Goggles Library via the "Gallery" button, but unlike with CamFind, this app does not list the names of the items. Instead, it only lists the time and date stamps, so users who are visually impaired will find locating a previously recognized object frustrating at best.
Talking Goggles places app options in the iOS Settings menu. There you can choose whether or not to save your captured images and if you want the app to self-voice the results or not.
The app offers three scan modes: "Fast," "Balanced," and "Best." I tested all three and identification was in fact strongest using the "Best" setting.
Results
Talking Goggles performed well for most catalog items, including CDs, DVDs, and books. It also recognized pantry items well, but many times the results only contained a few words instead of the full title. I was also pleasantly surprised when I aimed the app at one of my antique poker dog prints. CamFind had recognized this as "a picture of seven dogs playing poker." Talking Goggles also included the print's title, "A Friend in Need." It also recognized my granddaughter's Mickey Mouse shirt and, in both cases, called up a web search for those characters.
The app did an adequate but not stellar job identifying US currency. I often had to take multiple photos or pull the video camera away and then go back for a second or third pass. Also, Talking Goggles often provided more information than necessary, such as "'In God We Trust.' Twenty dollars."
Unfortunately, my red shirt hanging on the door, a pair of reading glasses, and many other everyday items resulted in a fairly high percentage of "No Close Images Found" errors. Unlike for item matches, this message does not self-voice, and sometimes it takes several minutes to appear on the screen. In the meantime I was left wondering if the picture had, indeed, uploaded properly.
Often, instead of properly recognizing an object, Talking Goggles would report back a seemingly random string of words or characters. At first glance this seems like a weakness in the app, but in fact, it is one of its marquee features. Talking Goggles has been programmed to search for text and, when it finds some, use OCR to recognize it. Sometimes this works against the app, like when a screwdriver I scanned came back "swash up." (Perhaps the app was trying to identify a brand name, or the screwdriver's handle ridges appeared to the app as some sort of text.) Other times, this can be a true help to the visually impaired.
Whereas TapTapSee and CamFind described a bubble wrapped package as an orange cell phone, I was able to use Talking Goggles to find the words "refill" and "Febreze" on the package, which was all the information I needed. The OCR is by no means complete, but with practice I was able to catch a few words here and a sentence there, such as magazine titles and product names on the empty boxes stacking up in my closet.
Unfortunately, there are also times when the OCR works against the app. Instead of using an image match to identify a can of Chef Boyardee Ravioli, the app went straight to OCR and spoke small blurbs from the nutrition label. Several quarter turns of the can and about 20 seconds of trying to hold the camera perfectly still finally revealed the product name. TapTapSee nailed it on its first try. I also tried using Talking Goggles to identify a bottle of my wife's nail polish. The "Still Camera" mode gave me several "No Close Images Found" errors in a row. I moved on to the "Video Camera" mode, but after several minutes trying, the only text I could get the app to announce was "no chip" and "one coat." Again, TapTapSee identified the object correctly on its first try as a bottle of Revlon Clear nail polish.
One very useful task Talking Goggles did help me perform was sorting through a pile of mail. Often the brief blurbs of text or labels were all I needed to discern a return address or read "Limited Time Offer!" or "Current Resident," which allowed me to quickly sort out and toss much of the junk mail.
Conclusions
In "Still Camera" mode, Talking Goggles is considerably slower and less accurate than both TapTapSee and CamFind. It also does not perform well in poorly lit areas, even with the flash turned on. The "Video Camera" mode provides some useful text information via the OCR feature, but the learning curve is rather steep. Without audible feedback it's difficult to sense exactly how far away to hold various items. Also, it takes several seconds for Talking Goggles to identify and recognize text, and I often found myself struggling to hold my iPhone perfectly still for any length of time. It is also impossible to tell if the app has found text and needs more time to recognize and announce it or if there is no text in the focus area and the camera needs to be moved one way or another.
Despite these shortcomings, I definitely think Talking Goggles is well worth the $0.99 price even if all I use it for is to help presort my mail.
The Bottom Line
If you haven't already read the April 2012 AccessWorld article "What is This?: A Review of the VizWiz, Digit-Eyes, and LookTel Recognizer Apps for the iPhone," it's definitely worth a look. The free VizWiz app described in the piece is a definite must-have even if you obtain all three of the apps mentioned in this follow-up.
Here's why:
Before you submit a photo to the VizWiz recognition engine and web workers, you are given the opportunity to record a question. Often your question is "What is this?" but just as often there will be a particular aspect (color, expiration date, oven temperature, etc.) which is the information you really want to know. VizWiz's question and answer format is ideally suited to these sorts of tasks. Even so, in my opinion, VoiceOver users should definitely consider adding both TapTapSee and CamFind to their app libraries for quick and easy identification of most products. Braille and Zoom users will probably be satisfied with just CamFind.
If you already use a dedicated OCR app with your iOS device, Talking Goggles will not add a great deal of functionality that is not already available on the previous two apps. Identifying works of art and characters on t-shirts is an amusing novelty, but except for sorting mail, I don't see myself firing up this app more than once or twice a week. At $0.99, I would nonetheless recommend the purchase of this app if it's in your budget. I believe the future potential for this and other similar apps is enormous, and I eagerly await Talking Goggles updates and the accessibility advances they might make possible. For now, as users who are visually impaired, it's in our best interests to offer the developers our encouragement and financial support.