What is your preference between alt text, text IDs, or both? I have seen discussions of it in accessibility-focused spaces (I think people lean towards both, sometimes making the alt text shorter?) but I haven’t heard from anyone who actually uses a screen reader as far as I know. Also, can they pick up on all caps or emojis? What about punctuation like “?!” (question mark, exclamation mark) or looooooong (long) words drawn out for emphasis? Is there a free or inexpensive screen reader I could experiment with to better understand how they work? Thank you for your time!
These are all good questions, thank you for asking. I’m sorry I’m getting back to you so late. Strap in though because this post is going to be a big one.
First, most mainstream screen readers for the blind can handle emojis just fine, so I wouldn’t worry too much about emoji accessibility most of the time. The biggest thing to be aware of is to not use too many of them, especially not more than a couple in a row. The descriptive titles for emojis can be very long, and there is unfortunately a trend on many social media platforms to put half a dozen emojis in your username. It can be mildly infuriating to have to sit through all of them to get to the actual post content, because screen readers cannot skim in the traditional sense. Similarly, those posts that place an emoji between almost every single word are exhausting for us in a lot of cases. So the only real thing to be careful about with emojis is how many of them you cram into a post or username; most typical ordinary emoji use is totally fine.
Most screen readers cannot necessarily tell you if text is in all caps, so when writing descriptions, it is best to specify that posts are written that way. Sometimes we can guess, because a screen reader will sometimes interpret short capitalized words as acronyms. When I randomly hear A-N-D spelled out like an acronym in the middle of a sentence, that does tip me off that at least part of that text must have been capitalized, and I can go back and check manually. But it is generally just simpler if you state “in all caps” or something of that nature when describing a section of text that is written that way.
Screen readers will typically mispronounce drawn-out words, attempting to pronounce them phonetically, so it will not sound like the actual word, but in my opinion this is fine. That mispronunciation tells me that the word has been written a certain way, and then I can arrow specifically to that word and spell it out to see what it is if I couldn’t already guess based on the pronunciation, though usually I can guess. There will probably be some level of differing opinions on doing things like this. Personally, as a lifelong Internet user, I don’t really have an interest in stopping people from using language that’s going to appear all over the Internet anyway. I can check it just fine on my own, so it’s not a big deal to me.
As for punctuation, this entirely depends on the settings the person has configured for their screen reader. By default, a lot of screen readers will read some punctuation which is less common such as colons or brackets, but not speak more ordinary punctuation like periods, commas, and exclamation points. Most of the time if there’s a question, the speech will use a vocal inflection like a person would when asking a question. However, a person can also tweak these settings in every screen reader I know about to read either more or less of it depending on their preference. Some screen readers have slightly different rules on when they pause and do not pause, and some voice profiles within the same screen reader can have different rules as well, but I wouldn’t worry too terribly much about punctuation. We have our settings set the way we prefer to interact with punctuation and we are definitely used to the particular rules that our screen reader and voice profile of choice use.
My one caveat about punctuation is that at least for me personally, I am a bit more of a stickler about using the correct punctuation marks to achieve the correct auditory sentence flow. This mostly comes up in dashes, because the different dashes have different purposes and therefore are pronounced and paused under different rules using a screen reader according to the grammatical uses of that dash. I will always turn into an old man yelling at clouds when someone is clearly using a hyphen as an em dash because hyphens do not indicate pauses but em dashes do, so I can immediately tell when you are using the wrong one because two phrases will just sound run together without a pause where there should have been a pause lol. This is just a personal gripe though and not a big serious thing that anyone should criticize anyone else over, just a tiny pet peeve of mine. We aren’t here to be the grammar police today, but dashes in particular can be a little annoying when used incorrectly.
Important to note though for all of these things is that in the context of writing image descriptions, the goal should still always be to preserve the original text and not to alter it for the sake of better screen reader flow. This is very similar to editing captions in a way that no longer accurately reflects what was actually said. Preserving the original information in its entirety even if it contains misspelled words or incorrect punctuation is really important. Stripping away those things, while it might make it easier for us to read, means that we are no longer experiencing the exact same information as everyone else and fundamentally changes how we will interpret the post or text being described. Certainly feel free to insert clarifications like capitalization notes, but please don’t correct spelling or fix punctuation or correct drawn out words in an image description.
There are indeed a handful of free screen readers you could experiment with if you want to get a very broad feel for how certain things are read, although I would caution you against drawing too many conclusions or making decisions based on your experience with them. The experiences of sighted users who are not proficient in the use of these tools are going to be significantly different from those of us who use them every day and have a lot more skill and know what kinds of commands to use for certain things and how to efficiently move through text and jump to certain sections, so the basic experience you have is likely not super representative of how we actually use them. If you need to make major accessibility decisions, asking an actual user will probably get you much more accurate advice, but I still certainly think playing with them can be a valuable learning experience to some extent. On Windows, you can download the NVDA program, and on Mac you can turn on VoiceOver, which is built into the computer, in the accessibility settings. Windows also has a screen reader built in called Narrator, though it’s generally less robust and most of us don’t use it as a primary tool; it’s more of a backup in case of problems with our main. On mobile, there are very good built-in screen readers under accessibility settings: VoiceOver again on iOS and TalkBack on Android.
Now, as for the debate about alt text versus image descriptions in the body of the post, this is a hard one and I think a lot of us have different opinions on this. Because of that, I don’t want my opinion here to be taken as the new rule, but I can tell you what I personally prefer.
Most of the time, I vastly prefer the description to be in the alt text. For a long time, there was some debate about this because a number of social media platforms including Tumblr did not have visual indicators that alt text was included, which meant that low vision users who did not use a screen reader but still needed alt text did not usually know whether an image contained it or not. However, in just about all of the social media places I frequent these days, it seems that issue has been mostly resolved and most platforms now include an ALT logo in the corner when alt text is present.
The main reason I prefer the alt text is because, especially on a platform like Tumblr, if I am moving through my dash and I land on an undescribed image element, in a lot of cases I am not necessarily going to bother checking the body of the post to see if there is a description. Most of the time there isn’t, and so it is kind of a waste of my time to check every single post body just in case there is a description in a couple of them. It is usually a safer assumption that if the image does not have alt text, that post is not going to be one that I can understand the context of so I will skip it most of the time. This is even more relevant on platforms like Instagram where you usually have to activate the post to open the body of it so there are extra steps if the description is down in the body, and again, most images without alt text do not have a description so I’m not usually going to take the extra step to check on every single post when I don’t even know what the image could possibly be of to decide if it’s relevant to me or worth checking.
The other reason is that, like most blind people, I do have a little tiny bit of residual vision. Especially when I am using my phone with a touchscreen, and especially on a platform like Tumblr where the posts are not read as one single element altogether, I have to individually tap different sections of the post to read them, and visually speaking an image is usually a much larger target to tap. When I see an image on my mobile dash, I can easily find at least somewhere to tap on that image to see if it has alt text or not, but the text portions of the post are usually smaller targets that are harder for me to see and lock onto to find out what they are. In a text post this isn’t really an issue because I know there isn’t an image to worry about and I can just go sequentially down the post. But if the post starts with an image, I am going to assume that the rest of the text in that post is relevant to that image. So, just as I said in the paragraph above, I am going to assume I can’t understand the full context of that post if the image isn’t described, and they are so rarely described that I’m not going to check the text on every single post. I am again much more likely to skip the post entirely if it does not have alt text. I will certainly tap at least one paragraph under the image in some cases, but if I don’t find a description very quickly, I am going to move on.
This by the way is why, if you are going to put a description underneath, it is extra useful to put the description directly below the image instead of saving it for the bottom of the post or something. I am never going to get all the way to the bottom of a post and think hm, after all of those paragraphs that were definitely not descriptions, I wonder how likely it is that they bothered to put a description in the post at all. If it’s not easy to find, I am moving on.
That said, I will also admit that there are some circumstances in which I do actually kind of prefer a description in the body of the post. These are usually cases where the description would be incredibly long or information dense, or where the image is actually a collage of multiple images in one JPEG. Alt text usually gets read by screen readers as one singular paragraph. So if that description is huge and full of information that you really want to concentrate on and sift through, like a complex infographic, a super detailed description of a long comic strip, or a mood board with like 12 images on it, only getting to read that entire paragraph all at once can make it hard to actually take in all of the information. In these cases, I do quite like when the description is in the body of the post, primarily if that description is then broken into distinct paragraphs which I can read one by one and take my time on. If it’s all one big paragraph under the image anyway, I will be just as overwhelmed as I would be if it were alt text, but if it’s broken up, I have a much easier time digesting complex and significant amounts of information in pieces. But I also have pretty severe ADHD, so do with that what you will.
Breaking image descriptions up into multiple paragraphs also seems to be somewhat controversial. I have seen some guide posts advising everyone to do the exact opposite, but personally I have always hated that rule. In the same way that most people reading information visually appreciate when information is broken into distinct paragraphs for easier reading flow instead of one big wall of text, I prefer lengthy complex descriptions to be broken up into separate paragraphs for the same reason. Information is often just more digestible in pieces in my opinion and I wish it was more widely acceptable to do that with descriptions.
Even in the case of these descriptions where I do prefer it broken up into paragraphs in the body of the post, this is a situation where I do think having a shortened overview in the alt text is important, because again I’m not necessarily going to check that post if it doesn’t have alt text.
However, on the general debate about whether we should be using both alt text and body descriptions on every post, I have pretty complicated and possibly controversial feelings. I do not think every post needs to be done this way. There is often some guidance that says alt text should be incredibly short, like only a couple hundred characters at most, and that for any image which has more than that tiny amount of information, the rest should be provided in the body of the post. I personally think this is kind of ridiculous and that alt text does not need to have such an incredibly limited character count on certain platforms. I do not want it to become the norm that most descriptions on social media platforms are broken into two separate places. Again, for incredibly long or incredibly information dense descriptions I do think the double description system is valuable, but I don’t think this should become the standard for most things. At least part of the argument here goes that these character counts are there to prevent overly lengthy, unnecessarily detailed descriptions that bog things down, don’t add anything of value, and just make the description more confusing. That does certainly happen sometimes with inexperienced describers, but there are plenty of images which do necessitate that higher character count and I don’t think we should be artificially capping them so small. I think it should be up to us how much description an image really does warrant.
For example, descriptions of fan art can be pretty lengthy, but I don’t personally think it’s an issue to have a good six or seven sentences of description of your art in the alt text. I don’t think the alt text needs to say “a painting of Superman in the sky” and for all the rest to be put somewhere else. I have no problem listening to all of your thoughtful description of your art in one paragraph, so long as we aren’t reaching the point where it really should get broken into multiple paragraphs, like in the case of massively detailed mood boards or long comics. But a single image of your character? I think it’s fine to have a big paragraph of description of your art in that one place, because that’s not the kind of information that is challenging to follow when it’s in one place.
Like I mentioned further up, there was originally a lot more discussion about putting descriptions in both places because there didn’t use to be consistent visual indicators of alt text for low vision users without screen readers. In the places where those indicators now exist, I don’t think we need to be copy-pasting the exact same description in the alt text and in the body of the post. That said, on platforms that still lack those indicators, I think it is completely reasonable to do that for the benefit of low vision users who aren’t using screen readers. But otherwise, in most places I do not think there is a need for that anymore.
So essentially, I think it somewhat depends on what the image is and what kind of description it really necessitates, which is usually kind of a judgment call. Most of the time I would much rather see it in the alt text. When it’s information dense, I would rather see it split into multiple paragraphs in the body, with shortened alt text directing me to the description below. And if the image is being posted on a platform which does not consistently use ALT logos in the corner across both mobile and desktop, I think it is still fine to use the same description in both places. But again, these are all my personal opinions and feelings and not necessarily intended to be guidelines or directions to follow.
If you are still here, thanks for reading this massive novel of a post. I knew that this one was going to necessitate some real sit-down time and energy because of how much information would be needed to answer it, and it took me quite a while to get the free time I needed to give it the space it deserved. I hope you find these thoughts valuable or at least interesting in some way, and I welcome discussion from the rest of the blind and low vision community on these topics.