Create Sitemap for S3 Bucket

This article provides an approach to create a sitemap for files stored in an Amazon Web Services S3 bucket using Invantive SQL.

Instructions

Perform the following steps to create a sitemap:

  • Install and configure the AWS CLI.
  • Retrieve all objects in a bucket using the command prompt:
aws s3api list-objects --output json --no-cli-pager --bucket www.name.nl >c:\temp\name.json

Or execute directly from Invantive Query Tool or Data Hub using Invantive Script:

local host "aws" "s3api list-objects --output json --no-cli-pager --bucket www.name.nl" ""
  • The output is stored in the local variable ${local:hoststdout}.
  • Log on to Invantive SQL using any provider (use Dummy when you can’t choose).
  • Run the following query:
select xmlformat
       ( '<?xml version="1.0" encoding="UTF-8"?>'
         || '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">'
         || xml
         || '</urlset>'
       )
       sitemap_contents
from   ( select listagg
                ( '<url><loc>https://www.name.nl/' 
                  || xmlencode(dta.name) 
                  || '</loc><lastmod>' 
                  || xmlencode(dta.lastmodified) 
                  || '</lastmod></url>'
                , null
                ) 
                xml
         from   jsontable
               ( 'Contents.[*]'
                 passing '${local:hoststdout}'
                 columns lastmodified       datetime path 'LastModified'
                 ,       name               varchar2 path 'Key'
               ) dta
         where name like '%%' /* Insert filter here to only include specific folders. */
       )
  • Either copy the contents to a sitemap.xml file or use the Invantive Script statement ‘local export documents’ to generate the file such as:
local export results as "${SITE_ROOT_FOLDER}\sitemap.xml" format txt