Cookbook/GoodRelations and Yahoo SearchMonkey

= News =

2010-02-25: Great news: Yahoo has just turned on the improved rendering of GoodRelations-augmented product pages! Today is the day GoodRelations + RDFa will become a mainstream SEO technique.



2010-02-20: If you use the Magento shop software, there is now a free extension to use GoodRelations for SEO automatically.

2009-11-25: Small bugfix: The value for the gr:has-EAN_UCC-13 property was too short (11 instead of 13 digits). The new value is correct.

2009-10-30: Yahoo SearchMonkey rendering temporarily down: Yahoo broke the improved rendering while updating the Yahoo UI. They are working on a fix, but '''it may take until mid-November for the improved pages to reappear.'''

2009-10-10: GoodRelations & Yahoo confirmed in the wild! Our demo page shows up in Yahoo now, see http://tr.im/yahooproduct2

2009-09-22: We removed the XML shortcuts for empty span/div elements from the examples, since that could cause problems with some HTML agents (browsers, applications).

2009-07-24: Yahoo just confirmed to me that GoodRelations and RDFa can be used for optimizing eBay offers in the new www.yahoo.com (even if the page has no proper XHTML header).

2009-07-23: The Yahoo! Validator is currently down for maintenance, see this note from Yahoo!. I suspect they are currently fixing the false warning that media:image objects cannot be URIs.

= Introduction =

Great news for any business in the world, and any Web of Linked Data developer: As of now, Yahoo will display price and offering details and other meta-data of any e-commerce Web page if the site owner uses the free GoodRelations vocabulary.

Previously, such data was only used within special applications developed in the Yahoo ecosystem. Now, every site owner can enhance the search results of his or her offers in Yahoo!.

On this page we will show how you can augment a Web page describing your business and your products with additional data so that Yahoo! will show those details in the results.

At the same time, this data can be used by many novel services in the Web of Linked Data. That is, you not only improve your search results in Yahoo! but also become visible to customers in novel search and recommendation engines.

We will use the following specifications:


 * 1) The RDFa syntax for embedding RDF meta-data in XHTML Web pages.
 * 2) The GoodRelations Web vocabulary for e-commerce.

[http://developer.search.yahoo.com/start Yahoo! currently supports eight types of augmented search results] in the standard output, of which the following two are most relevant for a business:


 * 1) Local (phone numbers, addresses, opening hours, position, etc. of your shops)
 * 2) Product (prices, images, and product info)

We will not cover the news, video, event, documents, discussion, and games patterns in this recipe.

Important: This document goes beyond the original Yahoo! specifications at


 * http://developer.search.yahoo.com/help/objects/product and
 * http://developer.search.yahoo.com/help/objects/local

because the recipes shown here make your business visible to


 * 1) Yahoo!
 * 2) Yahoo! SearchMonkey and Yahoo! BOSS services, and
 * 3) all other novel Web of Linked Data services unrelated to Yahoo.



= Getting Started =

Note: If you want Yahoo! to display your data in the standard search interface, you are currently limited to one type of additional information per page. That means you need separate HTML pages, e.g.


 * one for your company and
 * one for each individual product or offer.

In the following example, we will use the two Web pages:


 * http://www.heppnetz.de/searchmonkey/company.html
 * http://www.heppnetz.de/searchmonkey/product.html

Let's assume they initially look as follows:

 * http://www.heppnetz.de/searchmonkey/company-raw.html
 * http://www.heppnetz.de/searchmonkey/product-raw.html

Now, the first step is that you change the DOCTYPE in the header to "XHTML+RDFa":

Also, make sure that the "head" element includes the proper content type and encoding for XHTML:

Then, insert the following namespace prefix definitions into the "body" element in both files:

So the complete header should look like

= Describing Your Business =

Then, add the additional "div" and "span" elements plus all attributes as shown below for encoding your contact details and opening hours to the file at http://www.heppnetz.de/searchmonkey/company.html:

http://www.heppnetz.de/searchmonkey/company.html - With meta-data:

= Describing Each Product =

Next, add the additional "div" and "span" elements plus all attributes as shown below for encoding your product details including pricing to the file at http://www.heppnetz.de/searchmonkey/product.html:

http://www.heppnetz.de/searchmonkey/product.html - With meta-data:

= Publishing the Data =

Next, upload the new files to their original locations:


 * company.html to http://www.heppnetz.de/searchmonkey/company.html
 * product.html to http://www.heppnetz.de/searchmonkey/product.html

Check with a standard browser that the layout of the page is fine.

If you want, you can use stylesheets and other standard design techniques for optimizing the rendering. The only important thing is that the file remains a valid XHTML document. You can use the W3C Markup Validation service at http://validator.w3.org/ to check that.
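Before uploading, a quick local check can catch the most common mistakes. The following is a minimal Python sketch and not part of the original recipe. It assumes that your pages carry the W3C public identifier for XHTML+RDFa 1.0 in their DOCTYPE, and it only tests XML well-formedness, which is necessary but not sufficient for validity, so the W3C validator remains the authoritative check. The function name check_page is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Assumption: the public identifier of the W3C XHTML+RDFa 1.0 DOCTYPE.
RDFA_PUBLIC_ID = "-//W3C//DTD XHTML+RDFa 1.0//EN"


def check_page(markup: str) -> List[str]:
    """Return a list of problems found in an XHTML+RDFa page (empty if none)."""
    problems = []

    # The DOCTYPE must appear near the top of the document.
    if RDFA_PUBLIC_ID not in markup[:500]:
        problems.append("XHTML+RDFa DOCTYPE not found")

    # XHTML must be well-formed XML; the parser rejects unclosed or
    # mismatched elements. External DTDs are not fetched, so named
    # entities such as &nbsp; will also be reported as errors.
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        problems.append(f"not well-formed XML: {exc}")

    return problems
```

To use it, read company.html and product.html into strings and pass each to check_page; an empty list means that neither check found a problem.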

Now, validate the pages using the Yahoo! Validation Service at the bottom of


 * http://developer.search.yahoo.com/help/objects/product

You can also invoke it directly by passing the URI of your page, URL-encoded, as the url parameter:


 * http://developer.search.yahoo.com/help/objectfinder?url=http%3A%2F%2Fwww.heppnetz.de%2Fsearchmonkey%2Fcompany.html
 * http://developer.search.yahoo.com/help/objectfinder?url=http%3A%2F%2Fwww.heppnetz.de%2Fsearchmonkey%2Fproduct.html

The validator should report that your mark-up is okay but not yet included in the Yahoo! index.
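If you have many pages, the direct validator links above can be generated rather than typed by hand. This is a small Python sketch under the assumption that the objectfinder service takes the percent-encoded page URI in its url parameter, as the two example links suggest; the function name validator_url is hypothetical.

```python
from urllib.parse import quote

# Validator endpoint, taken from the example links in this recipe.
OBJECTFINDER = "http://developer.search.yahoo.com/help/objectfinder"


def validator_url(page_uri: str) -> str:
    """Build a direct link to the Yahoo! validator for one page."""
    # safe="" makes quote() encode ":" and "/" as well, as in the examples.
    return f"{OBJECTFINDER}?url={quote(page_uri, safe='')}"


for page in (
    "http://www.heppnetz.de/searchmonkey/company.html",
    "http://www.heppnetz.de/searchmonkey/product.html",
):
    print(validator_url(page))
```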



= Telling Yahoo and the World =

Now, the last thing that remains to be done is telling Yahoo! and the world to consider your new data.

As for Yahoo!, use http://siteexplorer.search.yahoo.com/submit to submit your page.

This requires free registration with Yahoo!.



Then, enter the URIs of all pages that you changed. If there are links between the pages, Yahoo! will find all of them, so you won't have to submit hundreds of URIs.

If the page you enhanced following this recipe is not a highly ranked page itself, you should link to it from a prominent page, e.g. your main page, because how quickly the Yahoo! crawler stops by and indexes your new data depends on the popularity of your page. This can take anywhere from a few days to several months.

As a last step, you should inform Semantic Web indexing services. Currently, sindice.com is the only service we know of that considers RDFa data, so this should be your first choice.

Go to http://sindice.com/main/submit and enter the URIs of all of your enhanced pages there. Then press "submit".

Another popular service, Ping The Semantic Web, does not yet support RDFa directly, but you can still make your data known with a trick:


 * Go to http://www.w3.org/2007/08/pyRdfa/
 * Paste the URI of your page in the field "URI of XHTML or SVG file"
 * Select "RDF/XML" as the output format, select "no" for warnings, and "strict" for the parsing type.
 * Click on "Go!"
 * After a few seconds, your browser will display XML content without rendering.
 * Copy the rather long URI from the browser's address field to your clipboard. It will look like http://www.w3.org/2007/08/pyRdfa/extract?uri=http%3A%2F%2Fwww.heppnetz.de%2Fsearchmonkey%2Fcompany.html&format=pretty-xml&warnings=false&parser=lax&space-preserve=true&submit=Go!&text=
 * Paste that long URI in the field "Ping the Semantic Web!" at http://pingthesemanticweb.com/.
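The long distiller URI from the steps above can also be assembled programmatically, which helps if you want to ping many pages. The following Python sketch copies the parameter names and values from the example URI in this recipe (including parser=lax as it appears there); whether the service accepts other combinations is not something this sketch verifies. The function name distiller_url is hypothetical.

```python
from urllib.parse import urlencode

# Distiller endpoint, taken from the example URI in this recipe.
DISTILLER = "http://www.w3.org/2007/08/pyRdfa/extract"


def distiller_url(page_uri: str, output_format: str = "pretty-xml") -> str:
    """Build the RDFa distiller URI that returns RDF/XML for one page."""
    # Parameter names and values mirror the example URI; urlencode()
    # percent-encodes the page URI and writes "!" as "%21", which is
    # an equivalent spelling of the same query string.
    params = {
        "uri": page_uri,
        "format": output_format,
        "warnings": "false",
        "parser": "lax",
        "space-preserve": "true",
        "submit": "Go!",
        "text": "",
    }
    return DISTILLER + "?" + urlencode(params)


print(distiller_url("http://www.heppnetz.de/searchmonkey/company.html"))
```

The resulting URI is what you paste into the "Ping the Semantic Web!" field.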



= Congratulations! =

You are now all set - your additional data will sooner or later appear in


 * the Yahoo! search results,
 * numerous novel services based on Yahoo! SearchMonkey and
 * Web of Linked Data applications.

If you want to check whether your data is already included in the Yahoo! SearchMonkey index, you can use the tool at

http://goodrelations-search.appspot.com/

It is basically a variant of the Yahoo! search service that displays all meta-data that is found in the Yahoo! index.

Note: At the time of writing, there are some problems in Yahoo's RDF export, so even though my page has been crawled by Yahoo!, the SearchMonkey index does not return fully correct RDF. Yahoo is aware of the bug.



If you enter


 * wwwurl:<your uri>

as the search parameter, the tool will list all meta-data that is currently in the Yahoo! SearchMonkey index.

For example,


 * wwwurl:http://www.heppnetz.de/searchmonkey/company.html

will show the meta-data for http://www.heppnetz.de/searchmonkey/company.html.

You can use the following links to check those pages directly:


 * Check company.html in Yahoo!
 * Check company.html in Yahoo! SearchMonkey
 * Check product.html in Yahoo!
 * Check product.html in Yahoo! SearchMonkey

= Information for Web of Linked Data Developers (RDF, OWL, SPARQL, ...) =

If you are familiar with the W3C Semantic Web technologies RDF and RDFS or use the Tabulator browser plug-in, you can look at the meta-data in your pages using the RDFa distiller (and validator) written by Ivan Herman (http://www.w3.org/People/Ivan/), available at

http://www.w3.org/2007/08/pyRdfa/

 * Select "RDF/XML" or "Turtle" as the output format, "yes" for warnings, and "strict" for the parsing type.
 * Click on "Go!"

The result using the Turtle syntax should look as follows:  

 a) http://www.heppnetz.de/searchmonkey/company.html 

b) http://www.heppnetz.de/searchmonkey/product.html

= Things That Took Us A While To Learn =

;-)

1. Yahoo accepts images only when the URI is specified this way:

The semantically equivalent pattern does not work. It triggers an error in the Yahoo Validator.

2. The ordering of elements matters for Yahoo. For example, the gr:includesObject property MUST ENCLOSE the gr:ProductOrServicesSomeInstancesPlaceholder. Otherwise Yahoo complains about "multiple objects" in one page and will not produce the enhanced rendering.

= Additional Resources =


 * Webcast: http://www.heppnetz.de/projects/goodrelations/webcast/ (video, 15 minutes - mainly the business perspective)
 * Talk at the Semantic Technology Conference 2009: "Semantic Web-based E-Commerce: The GoodRelations Ontology", http://tinyurl.com/semtech-hepp
 * Overview article on Semantic Universe: http://tinyurl.com/goodrelations-universe
 * GoodRelations Wiki: http://www.ebusiness-unibw.org/wiki/GoodRelations
 * GoodRelations project page: http://purl.org/goodrelations/
 * Tutorial at ESWC 2009: The Web of Data for E-Commerce in One Day: A Hands-on Introduction to the GoodRelations Ontology, RDFa, and Yahoo! SearchMonkey, http://www.ebusiness-unibw.org/wiki/GoodRelations_Tutorial_ESWC2009
 * Kingsley Idehen's RDFa and Linked Open Commerce resources
 * GoodRelations hashtag #goodrelations on Twitter
 * RDFa Tutorial at ISWC2008
 * RDFa Wiki (I don't agree with some statements made in there, but still the Wiki contains a lot of information for "normal" Web designers.)