Have any of you guys made any local image classifiers?
I had this idea that sprang forth from a problem: I was chatting with a friend about how artists in the 80's leveraged the strangeness of CRT pixels to make better sprites, and I had the perfect image stored to show it off. I won't post it here because it was a pixelated butt.
To find it, I went to my pictures dir and scrolled forever till I found it. There are 1000 files in there.
So I thought, why not make a classifier that looks at every image in there and spits out a guess as to what is in it? I could associate these guesses with the filename and write some kind of command line program that searches them, maybe listing the hits, opening a file manager type thing with the matching files selected, or even firing off a viewer with a list of hits.
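Roughly what I have in mind, as a minimal sketch rather than anything finished. The `classify` argument is a made-up placeholder for whatever model ends up doing the guessing (it takes a path and returns a list of (label, score) pairs), and the list of image extensions is just my assumption about what is in the folder:

```python
from pathlib import Path

# Assumed set of image extensions; adjust for whatever is actually in the folder.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}

def build_index(image_dir, classify, top_k=5):
    """Map each image filename to its top_k (label, score) guesses.

    classify is a hypothetical callable: path -> list of (label, score),
    sorted best guess first.
    """
    index = {}
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() in IMAGE_EXTS:
            index[path.name] = classify(path)[:top_k]
    return index

def search(index, query):
    """Return (filename, score) pairs whose labels contain query, best score first."""
    query = query.lower()
    hits = []
    for name, guesses in index.items():
        # Keep the highest score among the labels that match the query.
        best = max(
            (score for label, score in guesses if query in label.lower()),
            default=None,
        )
        if best is not None:
            hits.append((name, best))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

The index is a plain dict, so it could be dumped to a JSON file once and reloaded for each search instead of re-running the model every time.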
I searched around a bit and decided on resnet50. Found a decent bit of sample code to go with it:
DeepLearningExamples/PyTorch/Classification/ConvNets/resnet50v1.5 at master · NVIDIA/DeepLearningExamples
And grabbed the pretrained model.
And of course, like anything Python or machine learning, it didn't work. There were 3 or 4 formats of the model on Hugging Face and none of them worked. Eventually I scattered prints throughout the Python and figured out that the code was expecting model weights named in the form layers.x.y, while everything I could download used layerx.y. The pretrained models also began counting from 1, so I had to do a bunch of munging and re-arranging, but eventually I got it to work.
I could have done said re-arranging in C in like 5 minutes, but it probably took me an hour in Python. Here's it running on a random funny jpg I had:
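For anyone who hits the same thing, the renaming boils down to something like this. It's a sketch of the idea, not my exact code: the example key names (like layer1.0.conv1.weight) are assumptions about what a typical checkpoint looks like, and `remap_key` and `remap_state_dict` are names I made up for illustration:

```python
import re

# Matches a leading "layerN." where N is the 1-based block number.
_LAYER_RE = re.compile(r"^layer(\d+)\.")

def remap_key(key):
    """Turn 'layerN.rest' (counting from 1) into 'layers.(N-1).rest'.

    Keys that do not start with 'layerN.' are returned unchanged.
    """
    match = _LAYER_RE.match(key)
    if match is None:
        return key
    index = int(match.group(1)) - 1  # downloaded weights count from 1
    return f"layers.{index}." + key[match.end():]

def remap_state_dict(state_dict):
    """Return a new dict with every key renamed; values are left untouched."""
    return {remap_key(key): value for key, value in state_dict.items()}
```

With PyTorch you would presumably run the dict from torch.load through `remap_state_dict` before handing it to the model's load_state_dict.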
View attachment 598160
Anyway, it works, somewhat. It's not super good at anything anime or cartoony; it just tends to label it all "comic book" without distinguishing much of what's in the image. But for a 100 meg model it isn't too bad.
I was wondering if you guys knew of anything better? I could probably handle up to around an 8 gig model.
One thing I discovered: I'm going to need a thesaurus. Any sort of cosplay girl (and I have a few in my pictures) came out with "maillot" as the top guess. I had to look up what that is.
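The cheap version of a thesaurus would be a hand-written alias table that expands what I type into the labels the model actually uses. A tiny sketch, where the table entries are only examples I made up based on the "maillot" result above and `expand_query` is a hypothetical helper:

```python
# Hand-written aliases: word I would type -> labels the model tends to emit.
# These entries are illustrative guesses, not taken from any real thesaurus.
ALIASES = {
    "cosplay": ["maillot"],
    "swimsuit": ["maillot"],
}

def expand_query(query, aliases=ALIASES):
    """Return the lowercased query followed by any model labels aliased to it."""
    query = query.lower()
    return [query] + aliases.get(query, [])
```

Each expanded term would then be searched separately and the hits merged.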
So would it have helped me with my original problem? Not really. I would have searched for sprite, butt, crt, pixels, something like that.
When I ran the classifier on that image, the results were:
Code:
scoreboard: 31.0%
window screen: 7.4%
comic book: 7.2%
fire screen, fireguard: 3.1%
European fire salamander, Salamandra salamandra: 2.4%
So probably not.