Image Sitemap Extraction for Product Mapping

This feature extracts and parses an image name that represents a unique Product ID from the <image:loc> string of an Image XML sitemap, so that the ID can be used to map product pages across different CMSs. If your site(s) have a unique image ID, Hreflang Builder can extract it and add it to the database as a mapping variable.

Requirements:  

  • An Image XML sitemap with <image:loc> entries containing an extractable ID
  • A parsing rule to tell the system which segment of the <image:loc> to extract

Step 1 – Open your image XML sitemap and look for the row that begins with <image:loc>. This is the value that will be imported; in the parameter selector step, select the option <image:loc>.
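
To check which values Step 1 refers to, the <image:loc> entries of a sitemap can be listed with a short script. This is a minimal sketch using only Python's standard library; the sample sitemap, the example.com page URL, and the function name list_image_locs are illustrative assumptions and are not part of Hreflang Builder.

```python
import xml.etree.ElementTree as ET

# Namespaces used by standard sitemaps and the image sitemap extension.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}

# Hypothetical sample; the page URL is made up for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/products/sample-product</loc>
    <image:image>
      <image:loc>https://cdn.shopify.com/s/files/1/0744/products/avyzt01850_abcd.jpg</image:loc>
    </image:image>
  </url>
</urlset>"""


def list_image_locs(xml_text):
    """Return (page URL, image URL) pairs found in an image sitemap."""
    root = ET.fromstring(xml_text)
    pairs = []
    for url in root.findall("sm:url", NS):
        page = url.findtext("sm:loc", default="", namespaces=NS).strip()
        for loc in url.findall("image:image/image:loc", NS):
            pairs.append((page, (loc.text or "").strip()))
    return pairs


for page, image in list_image_locs(SAMPLE):
    print(page, image)
```

Each printed image URL is the full string the system would import if no parsing rule were set.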

Step 2 – Identify which part of the <image:loc> contains the unique ID. If a rule is not set, the system will import the entire string. Note – the full, unparsed string is unlikely to match anything.

Step 3 – Create the system’s parsing rule. Since we can only use the main product ID in the example below (the segment between products/ and .jpg), we must tell the system to remove the elements before and after it.

<image:loc>https://cdn.shopify.com/s/files/1/0744/products/avyzt01850_abcd.jpg</image:loc>

Rule Construction Tips (the rule defines where the Anchor is):

XXXXX – All or part of your Anchor. If there are several parts, they are combined using an underscore.

% – Any alphanumeric characters [A-Za-z0-9].

%% – Removes any text we don’t need, typically the beginning and the end of the string.

Because we want the segment (product ID) between products/ and .jpg, we can write the rule below, and the system will extract the product ID together with its _abcd suffix. Note – if you don’t need the _abcd suffix, remove the _XXXXX from the rule.

Rule: %%/products/XXXXX_XXXXX_%%

Extract Result: avyzt01850_abcd
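
The effect of such a rule can be approximated with a regular expression. The sketch below only illustrates the idea (drop everything up to products/, keep the underscore-joined alphanumeric segments, drop the rest); it is not how Hreflang Builder evaluates its rules, and extract_product_id and its keep_suffix parameter are hypothetical names.

```python
import re

# First alphanumeric segment after "products/", optionally followed by
# "_" and a second alphanumeric segment.
PATTERN = re.compile(r"/products/([A-Za-z0-9]+)(?:_([A-Za-z0-9]+))?")


def extract_product_id(image_loc, keep_suffix=True):
    """Return the ID segment of an <image:loc> URL, or None if absent."""
    match = PATTERN.search(image_loc)
    if match is None:
        return None
    main_id, suffix = match.group(1), match.group(2)
    if keep_suffix and suffix:
        return f"{main_id}_{suffix}"
    return main_id


url = "https://cdn.shopify.com/s/files/1/0744/products/avyzt01850_abcd.jpg"
print(extract_product_id(url))                     # avyzt01850_abcd
print(extract_product_id(url, keep_suffix=False))  # avyzt01850
```

Setting keep_suffix=False mirrors removing the _XXXXX part from the rule.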

Step 5 – Import Initial Image XML Sitemap

For the initial build, you must load the XML sitemap manually through the URL Mapping screen, because this is where you set the parsing rule and indicate that you will use this feature. Once this has been done, the system can extract the ID from future XML sitemaps imported via Auto Update. In your project, go to the main screen and click “URL Mapping” to open the mapping screen.

Step 6 – In URL Mapping, click Upload XML/JSON and open the Image Sitemap Importer.

Step 7 – Once the import window opens, click “Choose File” and select the Image XML sitemap file you wish to import.  Click the green UPLOAD button.

Step 8 – The system will import the XML sitemap and prompt you to add the mapping rule you created earlier.

Step 9 – Paste the rule you created earlier in the box for the mapping rule and click Upload and Run.  The system will import the XML sitemap and extract the product ID.  Once completed, you can confirm the extraction by viewing the mapping matrix.

Step 10 – System Action:  The system will parse the product ID from the image source and add it, together with the associated URL, to the system.  If that product ID already exists as a rule, this URL will be added to its cluster.  If the product ID is new, a new Rule ID will be created, and this URL will be added to the new cluster.  Any other URLs with this product ID, whether extracted from the page or uploaded manually into the system, will be merged to create the hreflang cluster.
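
The merge behaviour described in Step 10 can be pictured as grouping URLs by their extracted product ID. The following is a rough sketch of that idea, not the system's actual logic; add_to_clusters, the dictionary layout, and the example.com URLs are assumptions made for illustration.

```python
def add_to_clusters(clusters, product_id, url):
    """Attach url to the cluster for product_id, creating it if needed.

    Returns True when a new cluster had to be created, which corresponds
    to a new Rule ID in the article's terms.
    """
    is_new = product_id not in clusters
    urls = clusters.setdefault(product_id, [])
    if url not in urls:  # avoid duplicate entries on re-import
        urls.append(url)
    return is_new


clusters = {}
created_first = add_to_clusters(
    clusters, "avyzt01850", "https://example.com/us/products/widget"
)
created_second = add_to_clusters(
    clusters, "avyzt01850", "https://example.com/de/produkte/widget"
)
print(created_first, created_second)
print(clusters)
```

The first call creates the cluster; the second finds the existing product ID and merges the new URL into it.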