On Photos face tagging and workflowability

Face tagging in Photos frustrates me sometimes.

I know everyone uses apps like Photos differently and cares about different things. My wife & I mainly use Photos to keep track of the family photo library (around 30k pics over 10 years). About half come from iPhones, the rest from Canon or Nikon cameras. We curate out the bad and the redundant, but err on the side of keeping more (not every shot has to be a masterpiece). We crop and tweak, but aren't heavy editors. We make albums for vacations or events (roughly 10 per year) and one photobook per year. And we feel strongly about having everything facetagged and geotagged correctly.

Caring about face tagging is where the frustration comes from.

The machine learning algorithm's suggestions aren't perfect (I'll write a separate post about that), but that doesn't bother me too much. Such algorithms will never tag everything right: you can't expect an AI to know which kid is behind a Batman mask or who that person is with their back to the camera. That info needs to come from us. And I know my 3 kids look alike - heck, I sometimes have trouble telling them apart. I'm fine with manually entering some tags. But the user interface that's supposed to let me do that fights me at every turn. It's as if Apple purposefully designed it to make people not want to actually tag pictures.

Making manual tagging difficult may be Apple's misguided attempt to say "it just works". But tagging faces in pictures isn't like pairing a set of Bluetooth earphones. You can't cut the user out of the process. Machine learning relies on a large enough sample size of labelled training data to get better, so making it hard for us to provide that training data is a bad idea. And even if they push it from 70% (which is what I've experienced so far) to 90%, we as end users will still need to check every photo and correct that last 10%. Doing so needs to be less painful than it is today, but Photos does not seem to be built with any type of workflow for this in mind.
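To put some numbers on that last 10%, here is a back-of-envelope sketch in Swift (we're in Apple land, after all). The ~2 faces per photo average is my own guess, not anything Apple has published; the point is just the order of magnitude.

```swift
// Back-of-envelope math, not Apple's numbers: how many faces still need
// manual fixing at a given suggestion accuracy, for a library of my size.
let photoCount = 30_000      // roughly our family library
let facesPerPhoto = 2.0      // rough average, my own assumption

for accuracy in [0.70, 0.90, 0.99] {
    let toFix = Double(photoCount) * facesPerPhoto * (1.0 - accuracy)
    print("At \(Int(accuracy * 100))% accuracy: ~\(Int(toFix)) faces to fix by hand")
}
```

Even at 90% accuracy that's ~6,000 faces left for a human to fix. Any realistic accuracy level still leaves hours of manual work, which is exactly why the manual workflow matters so much.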

Linked to that is the fact that Photos does not visually distinguish between (1) names that I have entered, (2) names that it has suggested and I have confirmed, and (3) names that it has suggested but I have not yet checked. It constantly re-trains its model as new pictures come in and new faces are tagged, and it re-evaluates its own past suggestions based on new data, which makes sense. But the consequence is that faces I've already verified as correct get changed after I've passed them. Things change in the background, outside of your control. Faces that were correct once are now wrong, making the system feel less reliable than it probably is. All in all, this lacks workflowability. If that's not a word, it should be.
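For what it's worth, the distinction I'm asking for is tiny in data-model terms. A hypothetical sketch (my own toy model, nothing like Photos' actual internals, which I obviously can't see): each tag carries its provenance, the UI renders the three states differently, and only unconfirmed suggestions are fair game for re-evaluation when the model retrains.

```swift
// Hypothetical sketch: track where a face tag came from, so the UI can
// show the difference and retraining never silently flips ground truth.
enum TagProvenance {
    case userEntered              // I typed the name myself
    case suggestionConfirmed      // Photos guessed, I said yes
    case suggestionUnconfirmed    // Photos guessed, nobody checked yet
}

struct FaceTag {
    let personName: String
    var provenance: TagProvenance

    // Only unconfirmed suggestions should ever be re-tagged by the model;
    // anything a user entered or confirmed stays put.
    var mayBeReevaluated: Bool {
        provenance == .suggestionUnconfirmed
    }
}
```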

On iOS, face tagging could have been such an intuitive blast. Go to a picture, see the labelled faces, press the face if it's wrong, select the right name from a list. Instead, Apple makes you swipe up to show the names (which is cute for 1 photo but an annoying waste of time for 20k). It doesn't allow you to force-touch or long-press directly on a face to tag it. It doesn't allow you to tag something that Photos hadn't recognized as a candidate face in the first place. It looks to me like when Apple designed the face tagging user interface, they did not intend for people to start from the pictures.

Instead, there are 2 routes starting from the People album itself. You can go through each individual person, scroll down below all of the pictures, and check if you need to "Confirm additional Photos" for this Person. There is no way to see, across all of your People, which ones have additional Photos waiting for you to confirm - a proactive notification when a batch of Photos is available to confirm would be nice. When a face is wrong, there is no "No, this is X instead" button - only a "No" button, which means you sometimes see the same picture pass by 3 times before you can finally tell Photos who it actually is. This seems like something that was lab-tested on a library with maybe 5 People in it, and I'm sure it demos well, but once you have an extended family and friends selection of 50+ People tagged, and mainly want to process the additions of the week, it doesn't scale. Again, no workflowability.

Confirming additional photos for a Person is done 1-by-1, by clicking Yes / No on individual photos. Occasionally, Photos has granted me the right to use a "Review" screen, which works much more efficiently than the "Confirm additional Photos" screen by asking me to "select all Photos that are person X". It's great! But after a full year of using this, I still have no idea whether I can trigger this screen proactively - I would use it all the time if I could.

The other route through the People album is to "merge" faces, i.e. select multiple faces which the AI has clustered together and teach it that these are the same person. This is a nice way to bootstrap the process when you're starting a new library, but once you've gotten your first few thousand photos tagged, it becomes less and less useful. Frustratingly, once you have a lot of named faces, you no longer see which unnamed faces are still available to be merged.

Alarmingly, both the Confirm Yes / No method and the Merge Faces method have no Undo option. That is extremely dangerous, since a single slip of the finger can irreparably merge persons A and B, causing Photos to wrongly re-tag all photos of A as B, with no way of repairing the damage other than once again going through all the photos with these people in them. This needs fixing.
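Undo wouldn't even require much bookkeeping. A hypothetical sketch (again my own toy data model, not Apple's): record which faces moved during a merge, and reversing it becomes trivial.

```swift
// Hypothetical sketch: a destructive merge that records enough state
// to be undone with a single tap.
struct MergeRecord {
    let absorbedPersonID: String
    let survivingPersonID: String
    let movedFaceIDs: [String]    // faces reassigned during the merge
}

final class PeopleStore {
    private var faceOwner: [String: String] = [:]   // faceID -> personID
    private var undoStack: [MergeRecord] = []

    // Merge person `absorbed` into `survivor`, remembering what moved.
    func merge(_ absorbed: String, into survivor: String) {
        let moved = faceOwner.filter { $0.value == absorbed }.map(\.key)
        for faceID in moved { faceOwner[faceID] = survivor }
        undoStack.append(MergeRecord(absorbedPersonID: absorbed,
                                     survivingPersonID: survivor,
                                     movedFaceIDs: moved))
    }

    // One slip of the finger should cost one tap, not an evening of re-tagging.
    func undoLastMerge() {
        guard let record = undoStack.popLast() else { return }
        for faceID in record.movedFaceIDs {
            faceOwner[faceID] = record.absorbedPersonID
        }
    }
}
```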

On macOS, the situation is a bit better. Names are shown in place on the picture rather than behind a swipe, and you can circle faces that Photos didn't detect as faces in the first place. You can build smart albums on People, but there is still no way to find untagged faces (e.g. you can't make a smart album for "unnamed"), no visual distinction between Photos' own suggestions and names you entered or confirmed, and no way to permanently ignore random background faces.

The syncing between devices seems to work more or less OK. I've found multiple cases where photos were tagged inconsistently, but it's hard to say whether those were temporary inconsistencies that would eventually disappear once all devices had fully synced all photos from the cloud, fully re-trained their models, and fully re-analyzed all photos. Conceptually though, it seems tricky to me that Apple only syncs the ground truth (user input) and has each device train its own model, even though most devices only have a limited subset of the full library. I wish Apple would be more forthcoming with information about how this actually works.

All in all, face tagging in Photos feels like it's still in its infancy, and needs 2-3 more versions to get to something truly user friendly. It seems to me like the Photos team has had to prioritise their efforts on features tied to hardware upgrades: Live Photos, HEIC, Portrait Mode, the Depth API, ... Apple prioritises features which are important to its deep integration across hardware, software, and services. I get that. But when they find a gap between tentpole features, I hope they spend some of it thinking about workflowability.
