Fanfart Trashblog @animunerdery - Tumblr Blog

Remember when I talked about how I wished there was some image-to-text AI instead of just the text-to-image AI? Turns out there is!

This is a screenshot of an image-to-text AI called "clip_prefix_caption," specifically using the model "Coco." And while it's not 100% accurate, it still did a reasonably impressive job with this image. Of course, it kind of makes sense since this photograph is a free-to-use image I pulled off the web, which is almost certainly the kind of stuff this AI was trained on. If we get a type of image very different from what this AI was probably trained on, the results are not nearly as accurate.

But that's okay: Coco isn't designed for Optical Character Recognition (OCR). If you put this same image into an OCR-focused AI program like Image to Text Converter, you'd get:

Night Vale podcast (zioNightValeRadio A mafia guy who has really misunderstood "make it look like an accident" shouting WHOOPSIE every time he fires the gun. 1:06 PM • 2023-02-10 • 72.4K Views 3,487 Likes 776 Retweets 27 Quotes

Still not perfect, but definitely better than "a man is playing a game on the Nintendo Wii."

But what about art? How do these types of things fare with describing art? Since I'm not 100% clear on what (if any) information about the input images is put into a data set for AI to learn from, I did not want to put just any art in here. And if you play with any of these programs, I strongly encourage you not to put anything in there you don't have explicit permission to use for this.

I got specific permission from @animunerdery to use their drawing of Vinsmoke Sanji for some AI tests:

I decided to try a few different models too.

clip_prefix_caption (using coco model): A man wearing a tie and a shirt.

Blip: Caption: a black and white drawing of a man wearing a tie

CLIP Interrogator (using ViT-L model): a man in a shirt and tie smoking a cigarette, sanji, fanart ”, short silver hair, boring, lanky, zero - hour, coal, alp

I should note that the last one, while much more detailed, took a lot longer to generate than the other two.

This is by no means exhaustive. If you take a look at the post this image came from, you will find some even more detailed image-to-text AI outputs.

And this isn't even counting image-to-text AI in less open-source projects. Microsoft Word, for example, generates alt text for almost every image you put in a Word document, assuming you're using the current version. The Accessibility Checker will prompt you to check these though, because their accuracy is iffy at best, especially with images that are very far from what was probably in the data sets Microsoft trained its AI on. You can also contribute to that training data set if you want, because Microsoft gives you the option to "donate" any manually-created alt text you add to an image in your document to their database to improve accuracy. It's a case-by-case opt in though, don't worry.

Some screen readers have built-in image-to-text AI as well. For example, sometimes after reading the alt text on an image (be it properly written alt text or the default word "photo" on every image on a Tumblr post without user-added alt text), the VoiceOver (iOS) screen reader will read an additional description it makes using its image-to-text AI. I can't always get it to do this consistently, but after playing around a bit with a version of the Vinsmoke Sanji image that did not have anything but the default "photo" alt text, I got it to give me this:

Adult. Clothing. Illustrations. One.

Not the most helpful. But this technology is still pretty young, and I think it has a lot of potential if used correctly.

animunerdery

I had a conversation with this accessibility blogger and we dug into various AI tools that can provide fairly accurate and detailed functional descriptions for those who need them.

While it’s true that text to image exploded in the public limelight, the same technology came from image to text ai designed as accessibility tools.

As the blogger points out, one should definitely check with the original artist before running their work through ai. I personally don’t mind if any of you want to use what’s on this blog to either test out or simply be read by ai. The only caveat is to not make any money off my stuff.

Also, the @accessibleaesthetics blog is a valuable resource for anyone interested in learning more. Do check them out if you’re interested in providing more effective accessibility.

#ai tools for accessibility #txt #accessibility

Sanjiweek2023 | 5/6 | Kindness / Straw Hat Crew’s Chef

#sanji week #sanji week 2023 #vinsmoke sanji #tony tony chopper #nami #usopp #nami usopp bffs #roronora zoro #a bit of Robin’s hair if you squint #Brook #monkey d luffy #luffy and brook being nuisances #fun in the kitchen #bentos for everyone #one piece

SanjiWeek2023 1-4

#one piece #sanji week #sanji week 2023 #vinsmoke sanji #zeff patty carne and the rest if you squint

hey do you take commissions?

I don’t, but I know plenty of people who do!

@marashi96art does one piece, resident evil, slam dunk, dbz and others. Her work is fantastic, definitely worth commissioning.

@bukojuiced Jay does one piece among other things. His aesthetic is both clean and soft, really lovely stuff.

Cam is my girl who recently just moved and has been having car issues so could definitely use the help. Her work is also fun as hell.

Ali is a student, but her work is fantastic and you should 100% commission her if you can.

Bao is amazing and you should definitely commission her. She does pretty much everything and does it well

Obviously the big names are around as well, but hopefully this is a start. All of these people are deserving and fantastic artists!

🐊🌸

#wanihana #nico robin #sir crocodile #one piece

🦅🐊🤡

#cross guild #dracule mihawk #buggy the clown #sir crocodile #one piece #a bunch of old men

Happy Valentine’s Day kiddos!

#one piece #bepo #tony tony chopper #i like to draw wholesome choppy bepo vibes #you are all beary deer to them indeed #though I’d imagine them making that for their crews #trafalgar law #nico robin #lawbin #hungry days #sorta #alternate universe human chopper and bepo #dracule mihawk

animunerdery

hi pls don't use the ALT image option as an extra caption, that's meant to describe images for visually impaired users!

Ok, so… some thoughts on alt text and visual impairment.

The original purpose of alt text is indeed to offer the visually impaired an opportunity to experience an image.

However. How does one experience an image? What is the purpose of the image?

So, to reveal a little about myself. I am visually impaired. I have one severely myopic barely functional eye, and the other is an indiscernible soup of color and shape.

From the functional eye, I try to take in whatever minor little detail I can. From the nonfunctional one, I suppose adaptation is in order: as the visual world no longer has depth, the realms of the other senses intersect along the crossroads of imagination in order to see that which you cannot.

We feel and experience through so called trivialities and minutiae. Onomatopoeia of scritching along finely toothed margins.

Description itself, the thick kind that oozes with the flavor of the experience, is in a way a practice of inducing nostalgia.

My purpose, however, is to offer alternate hints for immersion. Did you miss out on this? Here’s a little something else for you to experience.

With technology, AI can already efficiently offer basic descriptions of images.

As a maker of things, immersion goes beyond a mimesis of that which exists. The experience is the tone, the mood, the absurd little notes in the margins.

saltiestgempearl

What AI are you using? The only additional detail my screen reader adds to your image here outside of the "our boy has gotta be partially visually impaired with one eye and all" alt text is "illustrations, people, ALT."

Do you have access to some kind of AI that can tell the image is Vinsmoke Sanji, or even just a man smoking? And if so, can you please share with me?

animunerdery

To respond to @saltiestgempearl and really to anyone who would like more robust AI image to text tools.

Replicate has a series of image to text tools. Your mileage may vary as most of the tools are optimized for photography and photo hybrids. However, in a pinch they will give you information similar to what someone who isn’t familiar with One Piece would get.

For drawings I have found that Clip Interrogator and Clip Interrogator2 give the best results.

The output from clip interrogator 1 at a fast speed with the openAI option:

a man in a shirt and tie smoking a cigarette, sanji, smoker, subject action: smoking a cigar, he is smoking a cigarette, with cigar, holding cigar, short goatee, goatee, smoking, man from uncle, long tie, smoke :6, necktie, with a business suit on, (smoke), johan liebert, oda non

The output from clip interrogator 2 at a fast speed with max flavors set at 4:

a man in a shirt and tie smoking a cigarette, kentaro miura manga art style, kentaro miura manga style, inspired by Sadamichi Hirasawa, wearing a shirt with a tie, tall anime guy with blue eyes, handsome anime pose, manga style of kentaro miura, anime handsome man, sanji, kentaro miura art style
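Both outputs above read as one comma-separated list: a plain caption first, then a pile of style and subject tags, some of them repeated. If you want to tidy that into a starting point for alt text, here is a minimal Python sketch. It assumes (based only on the two samples above, not on anything the tools document) that the first phrase is the caption and everything after it is a tag. The function name `split_interrogator_output` is made up for this example.

```python
from typing import List, Tuple


def split_interrogator_output(text: str) -> Tuple[str, List[str]]:
    """Split a comma-separated description into (caption, unique tags).

    Assumption: the first phrase is the base caption and the rest are tags.
    """
    # Break on commas and drop empty pieces left by stray commas.
    parts = [p.strip() for p in text.split(",")]
    parts = [p for p in parts if p]
    if not parts:
        return "", []

    caption, rest = parts[0], parts[1:]

    # Keep tags in their original order, skipping case-insensitive repeats.
    seen = {caption.lower()}
    tags: List[str] = []
    for tag in rest:
        key = tag.lower()
        if key not in seen:
            seen.add(key)
            tags.append(tag)
    return caption, tags


sample = (
    "a man in a shirt and tie smoking a cigarette, "
    "kentaro miura manga art style, sanji, Sanji"
)
caption, tags = split_interrogator_output(sample)
print(caption)
print(tags)
```

This only removes exact repeats, so near-duplicates like "kentaro miura manga style" and "kentaro miura art style" would both survive. A human still has to decide which tags are actually true of the image.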

They’re not perfect descriptions, but if I had to write a functional description, it would be:

Greyscale anime-esque line drawing of a scrawny, curly-browed blond boy in a skinny black tie and white shirtsleeves rolled up to his elbows. His floppy hair parts to his right (our left), covering his right eye entirely. On his left, his hair tucks behind his ear, leaving the left side of his face as clear as his left eye, which, like his shirtsleeves, betrays a degree of rumpled exhaustion. The swirly part of his eyebrow swirls up and is closer to his nose than his ear. He also sports a scraggly little darker-colored goatee, and a dainty cigarette dangles from his ever so slightly parted lips. Cuz he’s got his right hand in his way too tight black pants pocket and the other one doing who knows what off screen.

(While it would give more information on the image itself, the description would still be meaningless to people unfamiliar with One Piece.)

At any rate…

I’m sure people have a spectrum of feelings towards AI, so it’s probably best to ask for permission if you want to use these tools to “read” other artists’ work. I personally don’t mind what you do with things I make (just don’t make money off it!).

The vast majority of images on the internet have no tags at all, be it alt or title text. However, with these AI tools, hopefully at least the functional description side of things will open up.

#image to text tools #accessibility #ai tools for accessibility #txt #thoughts on visual impairment from a visually impaired person #tldr use the clip interrogator options for drawings

Cross guild daddies mean business

#cross guild #sir crocodile #buggy the clown #dracule mihawk #a ton of cameos in the newspaper articles #like croccoboy’s other avian entanglements #a certain rubber boy #some hints of goth daddy’s kids #and other stuff #like stonks #buggy really wants to corner the crypto market

Birthday present

#lawbin #trafalgar law #nico robin #ship hell lmao #apparently it’s her birthday?

Dilf doodles

#cross guild daddys #dracule mihawk #sir crocodile #one piece

Mugiwara Sportball

#one piece #sports #soccer #football #vinsmoke sanji #roronoa zoro #zosan if you squint #franky if you squint #idk who they’re playing against #maybe kaku for all we know

Doodle dump

#one piece #roronoa zoro #vinsmoke sanji #trafalgar law #nico robin #zolaw #zoro’s kicking his sempai’s ass at kendo #lawbin #Choppy and Bepo are implied but not present