Accessibility Metadata Project: Final Report

UPDATE (November 09, 2021)

This project is no longer maintained here. Its work has been moved to the W3C and taken up by a Community Group. For more information, please visit the new Accessibility Discoverability Vocabulary for Schema.org Community Group.

To review or report issues with the Schema.org Accessibility Properties for Discoverability Vocabulary, please refer to the vocabulary's GitHub issue tracker.

Original Report (Outdated)

Submitted by Madeleine Rothberg, NCAM, WGBH

As the Accessibility Metadata Project's funding from the Bill & Melinda Gates Foundation has concluded, we wish to thank the Foundation very much for its support of our work on the discoverability of accessible educational resources, and for its recognition of the importance of accessibility in furthering inclusive education.

Thanks to the Gates funding, we have made tremendous strides toward our goals of developing standards for accessibility metadata and having those standards accepted by Schema.org, the organization that maintains the shared vocabulary of tags that all major search engines use in common so that users can refine their searches to find exactly what they are looking for. Now that Schema.org's standard set of properties for online educational resources includes accessibility metadata, these properties have been picked up by the Internet Archive's Open Library initiative, the HathiTrust Digital Library, and the Learning Registry, a leading metadata aggregation platform for online learning resources. We have also added accessibility metadata tags both to Bookshare and to a payload of metadata submitted to the Learning Registry: Bookshare now automatically submits accessibility metadata for Bookshare titles to the registry. Because of our reference implementation, Bookshare's accessible content is more easily discoverable via online search, and others can better understand how to include accessibility metadata in their own content.

Beyond our grant commitment, we developed additional reference implementations and tools:

  • Searching for videos with closed captions: Before this project, it was not possible to search for captioned videos beyond the YouTube domain. By collaborating with the creator of the “WP YouTube Lyte” plug-in, we contributed code that allows WordPress site administrators to automatically add accessibility properties to videos that have closed captions. Now people who need captioned videos can easily find them on any WordPress site that uses the plug-in, with tools such as Google’s Custom Search Engine.
  • Described video tagging: Smith-Kettlewell Eye Research Institute has developed a web-based video description product called YouDescribe that enables anyone to describe YouTube videos on the web. To help people with visual impairments easily discover videos described with the YouDescribe platform, Smith-Kettlewell is automatically tagging those videos with accessibility properties. Now search engines such as Google Custom Search Engine can index those properties.

The adoption of our project’s proposed set of accessibility metadata tags and the implementation successes I just listed are a tremendous milestone in the collaborative journey towards our vision of a “Born Accessible” world: a world in which all content born digital is made accessible—and discoverable—from the outset. Our implementations demonstrate that a broad adoption of accessibility metadata is possible.

Now that this groundwork has been laid, what is next? We must encourage content management systems, publishers, search engines, and sites like Wikimedia to start using Schema.org metadata in their sites, so that one day everyone will be able to find the great accessible content that is out there. And we have elements that we would still love to add – see the “properties under consideration” section of the specification. If you are interested in seeing this work progress and would like to discuss your ideas for projects and next steps, please contact us by submitting a contact request in the project update sign-up form on this page.

WordPress “WP YouTube Lyte” Plug-in Now Supports Schema.org Accessibility Properties

At Benetech, one of the Seven Truths we live by is, “partnership over going alone.” Our collaboration with Frank Goossens is a fantastic example of the effectiveness of that principle. Recently, Benetech created a patch for Frank’s very popular “WP YouTube Lyte” plug-in that enables WordPress site administrators to automatically add Schema.org accessibility properties to videos that have closed captions. As Frank announced in his February 3rd blog post, that patch has now been incorporated into the plug-in. “If you have microdata enabled, WP YouTube Lyte now will automatically check if captions are available and if so, adds the accessibilityFeature property with value ‘captions’ to the HTML-embedded microdata.”  Translation: users of this plug-in for their WordPress sites can now make their captioned videos more easily discoverable by people who need them.
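To make this concrete, here is a minimal sketch of the kind of microdata markup this produces, assuming a hypothetical video title; the plug-in’s actual output includes more properties and may differ in structure:

<div itemscope itemtype="http://schema.org/VideoObject">
<!-- hypothetical title, for illustration only -->
<meta itemprop="name" content="Example Captioned Video" />
<!-- added automatically when the plug-in detects closed captions -->
<meta itemprop="accessibilityFeature" content="captions" />
</div>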

Compare Results: With and Without Schema.org Accessibility Properties

To see what this looks like in practice, compare the following two search results:

  1. Run a search for closed-captioned videos on the Predictive Analytics Today website using Google’s closed-captions filter, which does not yet leverage Schema.org accessibility properties. The result: “Your search – site:predictiveanalyticstoday.com – did not match any video results.”
  2. Next, run the same search using Google Custom Search Engine (CSE), which does allow you to filter results by any Schema.org properties, specifically in this case the accessibilityFeature property. This time, the search result correctly returns a link to the web site that includes the closed-captioned video you were looking for.

Try It Yourself: Captioned Video Search on a11ymetadata.org

If you would like to experiment with other Google CSE searches for closed-captioned videos, check out the Captioned Video Search link on a11ymetadata.org under the “Implementations” menu. Be sure to include the filter “more:p:videoobject-accessibilityfeature:captions” (without the quotes) in the search box.
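For example, to find captioned videos about a hypothetical keyword such as “DNA”, the complete query entered in the search box would be:

DNA more:p:videoobject-accessibilityfeature:captions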

Encourage Adoption of Schema.org Accessibility Metadata

The good news is, with both an accessibility metadata standard in place, and successful implementations like the WP YouTube Lyte plug-in enhancement, we know how to create a search function for accessible content and we know that it works.  The next step: Encourage other content management systems, publishers, and sites like the Internet Archive and Wikimedia to start using Schema.org metadata in their sites so that one day everyone will be able to find the great accessible content that is out there now but can’t yet be found by those who need it.

Pay It Forward: The Rewards of Collaboration

I encourage everyone to engage in collaborations with their favorite content and service providers. Said Frank in his blog post about the project, “This was the first time someone actively submitted code changes to add functionality to a project of mine, actually. Working with Benetech was a breeze and GitHub is a great platform to share, review and comment code. I for one am looking forward to more high-quality contributions like this one!”

We agree with you, Frank!

Learning Registry Blazes New Trails with Accessibility Metadata

As I announced earlier this month, Schema.org recently adopted our proposal for accessibility metadata tagging that will make it possible for anyone with access to the Internet and a search engine to more easily find accessible content and applications on the Web. (See “Schema.org Accepts Our Proposal!” January 2014.)  Exciting work is already underway that leverages the power that such tagging provides. In the meantime, we are working on reducing the barriers to entry for anyone to tag content with accessibility metadata.

Fostering universal adoption of accessibility metadata

For users to feel confident that everyone can really find all (or even most) accessible resources using a search engine, there are two hurdles to clear:

  1. All digital content and applications with accessibility features must be tagged with accessibility metadata whenever such features are present (e.g., image descriptions, tactile images, video captioning, support for screen readers, and the like).
  2. Major search engines, like Google, Yahoo, Bing, and Yandex, and vertical search products, such as the Federal Registry for Educational Excellence (free.ed.gov), must support accessibility metadata and display the associated information to the user in search results.

This is a chicken-and-egg problem – there is no reason for search engines to support tagging if the tagging isn’t in the content and there is no reason for people to tag content if the search engines don’t support it. How to break that impasse? The first step is to have free, easy tools available that anyone can use to attach metadata to digital content.

EasyPublish: First publicly available tool to support accessibility metadata

The Learning Registry, a leading metadata aggregation platform about online learning resources, has recently enhanced their EasyPublish tool to allow anyone to tag any digital content with accessibility information. This marks the first time that a tool has been made freely available for anyone – including teachers, publishers, content creators, parents, or students – to take advantage of accessibility metadata tagging, which in turn will make it possible for the average person to easily discover accessible materials using online search engines.  Jim Klo’s note to the developers’ Google group for this project includes a link to the sandbox area for this tool. There you can explore how it works and enter sample data without making your entry live. Or, if you have a resource that you would like to tag for real, you can visit the production site to register content and attach accessibility metadata tags to digital content.  One exciting feature of the Learning Registry is that anyone can describe the accessibility of a resource, even if they are not the original publisher of that resource.

Bookshare automatically puts accessibility metadata into Learning Registry-powered sites

Anyone can put metadata into the Learning Registry for any piece of digital content by using the EasyPublish interface described above – it can be done either manually or automatically using the API (Application Programming Interface) that the Learning Registry provides. Using the Learning Registry API, Bookshare now automatically submits accessibility metadata for Bookshare books into the Learning Registry. This lays the groundwork for users of Learning Registry-powered sites, such as the Federal Registry for Educational Excellence (free.ed.gov), to filter search results for Bookshare titles based on specific accessibility features, such as described images or MathML.
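As a rough sketch of what such a record expresses, here are the adopted Schema.org property names applied to a hypothetical Bookshare title, rendered as HTML microdata (the actual payload Bookshare submits through the Learning Registry API is structured differently):

<div itemscope itemtype="http://schema.org/Book">
<!-- hypothetical title, for illustration only -->
<meta itemprop="name" content="Example Bookshare Title" />
<!-- images in the book have long descriptions -->
<meta itemprop="accessibilityFeature" content="longDescription" />
<!-- math content is encoded as MathML -->
<meta itemprop="accessibilityFeature" content="MathML" />
</div>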

Smith-Kettlewell demonstrates described video tagging

Smith-Kettlewell Eye Research Institute has developed a web-based video description product called YouDescribe that enables anyone to describe YouTube videos on the web.  To enable people with visual impairments to easily discover videos described with the YouDescribe platform, they have tagged their videos with Schema.org accessibility properties.  Now those properties can be indexed by search engines such as Google Custom Search Engine.  Click here for an example search for tutorials about description.
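A video described through YouDescribe might carry markup along the following lines; this is a sketch with a hypothetical title, and YouDescribe’s actual markup may differ:

<div itemscope itemtype="http://schema.org/VideoObject">
<!-- hypothetical title, for illustration only -->
<meta itemprop="name" content="Example Described Video" />
<!-- indicates an audio description track is available -->
<meta itemprop="accessibilityFeature" content="audioDescription" />
</div>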

Spread the word

We continue to be in dialogue with organizations such as Google and Archive.org about the importance of supporting accessibility metadata tagging, but we can use all the help we can get. If you feel strongly about the importance of this initiative, please let your favorite search engine, publishers, and websites know how important it is to you to be able to easily find accessible materials on the web, including captioned videos, described images, and more.

Schema.org Accepts our Proposal!

I am delighted to announce that Schema.org has accepted our proposal for a key set of accessibility metadata tags that allow anyone with access to the Internet to more easily locate content and applications with accessible features.  Google’s TV Raman and Yandex’s Charles McCathieNevile announced on Schema.org: “This work draws upon many collaborations and projects including the IMS Global Learning Consortium’s Access For All specification, the work of the Accessibility Metadata Project, alongside many discussions that helped ensure the work integrated well into Schema.org.” See my recent blog post on the Benetech web site about the Accessibility Metadata Project for more details. This is a tremendous milestone in our collaborative journey towards enabling a born accessible future and reaping its benefits. Many thanks and congratulations to all who have contributed to this important work.

Accessibility Metadata in action at Teachers’ Domain

As the Schema.org proposal receives more attention, there have been numerous questions concerning the utility of mediaFeature and accessMode, both for searching for accessible content and for making use of it. These questions are best answered with a demonstration. WGBH built a repository of learning resources tagged with LRMI and Accessibility Metadata tags (the database was built with the original AccessforAll vocabulary). Madeleine Rothberg, Project Director for the WGBH National Center for Accessible Media, recorded a short video (5:19) that shows the accessibility metadata in use. You can watch the video (yes, it is captioned, and there is a transcript at the end of this page) and follow the major points listed below.

Accessibility Metadata at WGBH Teachers' Domain

  • 00:30 An example search, where the accessibility properties (mediaFeature) are displayed.
  • 01:20 A new scenario: the user can’t hear, whether because of deafness or because the computers in the laboratory have no speakers. Preferences are set to reflect this.
  • 01:57 Now that the profile is known, the search results show whether the content is accessible to that user or not.
  • 03:10 Shows that an animation with no audio is flagged as accessible to this user even though it doesn’t have captions.
  • 04:15 Search filters let you search on specific mediaFeatures.

I think you’ll agree that this information makes accessible content easier to find.

Their plans for future implementation would allow similar searches without saved preferences, by offering search filters that replicate the kind of personalized search the preferences allow, for example by finding resources that either have no audio content or have all auditory content adapted to other accessModes; only resources with no audio, or with the auditory content adapted, would display. The magic that makes this happen is the accessibility metadata. I’ve included code snippets below showing what these tags would be if the data were tagged today. This demo was done with encodings that were predecessors to LRMI and Accessibility Metadata; I have translated the syntax and names to our schema.org proposed names.

For http://www.teachersdomain.org/resource/biot11.sci.life.gen.structureofdna

<div itemscope itemtype="http://schema.org/Movie">
<meta itemprop="accessMode" content="visual" />
<meta itemprop="accessMode" content="auditory" />
<meta itemprop="mediaFeature" content="captions" />
<span itemprop="name">The Structure of DNA</span>
<meta itemprop="about" content="DNA" />
<meta itemprop="keywords" content="National K-12 Subject" />
<meta itemprop="keywords" content="Science" />
<meta itemprop="keywords" content="Life Science" />
<meta itemprop="keywords" content="Genetics and Heredity" />
<meta itemprop="keywords" content="Molecular Mechanisms of DNA" />
<meta itemprop="learningResourceType" content="Movie" />
<meta itemprop="inLanguage" content="en-us" />
<meta itemprop="typicalAgeRange" content="14-18+" />
</div>

For the silent video http://www.teachersdomain.org/resource/bb09.res.vid.dna/, you can see that the accessMode is now only visual, and that captions is not an adaptation/mediaFeature, as there was no auditory accessMode to adapt; the third and fourth lines of the sample tags disappear.

<div itemscope itemtype="http://schema.org/Movie">
<meta itemprop="accessMode" content="visual" />
<span itemprop="name">DNA Animation</span>
<meta itemprop="about" content="DNA" />
<meta itemprop="keywords" content="National K-12 Subject" />
<meta itemprop="keywords" content="Science" />
<meta itemprop="keywords" content="Life Science" />
<meta itemprop="keywords" content="Genetics and Heredity" />
<meta itemprop="keywords" content="Molecular Mechanisms of DNA" />
<meta itemprop="learningResourceType" content="Movie" />
<meta itemprop="inLanguage" content="en-us" />
<meta itemprop="typicalAgeRange" content="14-18+" />
</div>

The transcript of the above video can be found below:

This is a demonstration of accessibility features in Teachers’ Domain.  Teachers’ Domain is a digital media library for teachers and students.  It’s being transitioned to a new site, called PBS Learning Media, where the accessibility features may be a little bit different.  So I wanted to show you how these accessibility features work right now in Teachers’ Domain.

If I do a search in Teachers’ Domain on a general topic, like DNA, I get back hundreds of results. Some of them have accessibility information available, and some do not.  So, for example, here is a video that offers captions, so it’s labeled with having the accessibility feature of captions.  And if I play that video, I’d be able to turn the captions on.  Here are a lot of other resources that don’t have any accessibility information. Here’s one. It’s a document that includes the accessibility feature long description.  So that means that images that are part of the document have been described in text thoroughly enough that a person who can’t see the images can make use of the document.

So this amount of information is really useful.  If you’re scanning a bunch of results, you can look and see which accessibility features are offered for which videos.  But it doesn’t give you the full story.  For example, if you are a person who can’t hear, or you’re using video in a classroom without speakers, you might think that this video, called “Insect DNA Lab,” wouldn’t be useful to you because it doesn’t list that it contains captions.  But what you can’t tell from this piece of information is that that video doesn’t have any audio at all, so it’s perfectly suited for use without audio, because there isn’t any.  So in order to extract that kind of detail about how well different resources meet the needs of a particular teacher or students, we can set the features in the profile.  So we go to My Profile, and scroll to the bottom, where there are Accessibility Settings. And right now none of the accessibility settings are set, and that is why we aren’t getting custom information. I set my accessibility preferences to indicate that I need captions for video and audio, and when transcripts are available, that I’d like those too. Now I’ve got a set of accessibility preferences that match the needs of a person who can’t hear, or can’t hear well, and might also match the needs of a school where there are no speakers on the computers in the computer lab.

So now if I repeat that same search for material on DNA, my resources that don’t have any Accessibility Metadata look just the same.  But the resources that have Accessibility Metadata start to pop up with more information.  So this resource, “DNA Evidence” is an audio file, and there are no captions or transcripts available. It’s labeled as inaccessible to me.  This video, which we had already noted has captions, now has a green check mark and says that it’s accessible.  Similarly, this interactive activity that doesn’t have any audio in it is fine. This DNA animation video that doesn’t have any audio is listed as accessible to me.  So now with a combination of the metadata on the resources and my own personal preferences recorded in my profile, I can actually get much better information as I look through the set of search results about which resources will suit me best. And when it works properly in the player, you can also use this feature to serve up the video automatically with the right features turned on.  So for example, if you look at this video about the structure of DNA, we know that it has captions and when we view the video those captions should come on automatically.

Another way that you can use the information in Teachers’ Domain is with the search filters. Here there’s a set of accessibility features that are listed for my search results.  So in all of the search results about DNA, for which there are 202, I can see quickly here that five of those offer audio description, which is an additional audio track for use by people who are blind or visually impaired who can’t see the video, but can learn a lot from the audio. There are 76 resources that have captions, 13 with transcripts, and 8 that offer long description of images, so that those static images can be made accessible to people who can’t see them.  So the faceted search is another way to quickly find resources if you know you are looking for a particular kind of resource, like one that has captions, or one that has descriptions.  But the additional benefit of the accessibility checkmarks is that they alert you to resources that are accessible to you whether because they have a feature you especially need, like captions, or because they don’t have any audio in the first place and they don’t pose a barrier to your use.