This topic illustrates how to use standard functions of Invantive UniversalSQL to parse an iCalendar (ICS) file as a list of tables.
First it is explained what ICS is and some samples are given. In the next step, all ICS-files in a specific folder are mapped to XML. In the last step, the XML is parsed into several tables.
ICS
ICS (iCalendar) is a popular format to exchange calendar information with. It is a text based format which can include many calendar items in one file.
More information on ICS is available on:
A file in ICS-format contains structured data with nesting. The format does not use XML or JSON to represent a tree structure but keywords such as BEGIN:word
and END:word
on an individual line.
The payload follows a keyword such as:
DTSTART:16010325T020000
Multi-line text is included by a keyword and start of text on the first line, followed by continuation lines, each starting with the TAB-character (ASCII code 9).
Example
The start of an ICS-file could be:
BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook 16.0 MIMEDIR//EN
VERSION:2.0
METHOD:PUBLISH
X-CALSTART:20180517T123000Z
X-WR-RELCALID:{0000002E-6F07-C199-97F1-D97565A83546}
X-WR-CALNAME:MyCalendar
BEGIN:VTIMEZONE
TZID:W. Europe Standard Time
BEGIN:STANDARD
DTSTART:16011028T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
END:STANDARD
Each calendar item is an event signaled by VEVENT
, such as:
BEGIN:VEVENT
ATTENDEE;someone
CLASS:PUBLIC
CREATED:20240910T142509Z
DESCRIPTION:Line 1
continuation line 2
continuation line 3
DTEND;TZID="W. Europe Standard Time":20241001T150000
DTSTAMP:20240910T142856Z
DTSTART;TZID="W. Europe Standard Time":20241001T140000
LAST-MODIFIED:20240910T142857Z
LOCATION:Amsterdam\, Nederland
ORGANIZER;someone
PRIORITY:5
SEQUENCE:0
SUMMARY;LANGUAGE=nl:Line 1
continuation line 2
TRANSP:OPAQUE
UID:040000008200E00074C5B7101A82E00800000000CC53B73D8D03DB01000000000000000
010000000A070351FB134DE48A12A5F436749365E
X-ALT-DESC;FMTTYPE=text/html:<html>\n<head>\n<meta http-equiv="Content-Type
" content="text/html\; charset=us-ascii">\n</head>\n<body>\n<div>more text...</html>\n
X-MICROSOFT-CDO-BUSYSTATUS:BUSY
X-MICROSOFT-CDO-IMPORTANCE:1
X-MICROSOFT-DISALLOW-COUNTER:FALSE
X-MS-OLK-APPTLASTSEQUENCE:0
X-MS-OLK-APPTSEQTIME:20240910T142856Z
BEGIN:VALARM
TRIGGER:-PT15M
ACTION:DISPLAY
DESCRIPTION:Reminder
END:VALARM
END:VEVENT
Convert ICS to XML
The following Invantive UniversalSQL-statement changes for each ICS-file format in a folder the payload to XML format:
--
-- Rewrite ICS to XML format.
--
create or replace table Ics2Xml@InMemoryStorage
as
select fle.file_path
, xmlformat
( replace
( xmlencode
( regexp_replace
( regexp_replace
( regexp_replace
( regexp_replace
( regexp_replace
( --
-- Enable CRLF to be matched by '$'.
--
-- Rewrite ATTENDEE;VAL to ATTENDEE:VAL.
-- Rewrite ORGANIZER;VAL to ORGANIZER:VAL.
--
-- Remove timezone signals.
--
replace
( rft.file_contents
, chr(13) || chr(10)
, chr(10)
, 'ATTENDEE;'
, 'ATTENDEE:'
, 'ORGANIZER;'
, 'ORGANIZER:'
, '"W. Europe Standard Time"'
, ''
)
--
-- Remove time zones, format type and languages.
--
, ';(TZID|LANGUAGE|FMTTYPE)=' || '(|")[A-Za-z ' || '.' || '/-]*(|")'
, ''
, 1
, 0 /* All appearances. */
, 'm' /* Consider whole file as multiple lines. */
)
--
-- Rewrite BEGIN:XXX to <XXX>.
--
, '^BEGIN:([A-Z]+)$'
, '#OPENTAG#$1#CLOSETAG#'
, 1
, 0 /* All appearances. */
, 'm' /* Consider whole file as multiple lines. */
)
--
-- Rewrite END:XXX to <XXX>.
--
, '^END:([A-Z]+)$'
, '#OPENTAG#/$1#CLOSETAG#'
, 1
, 0 /* All appearances. */
, 'm' /* Consider whole file as multiple lines. */
)
--
-- Step 1 for rewrite multiline values (continuation line starts with TAB) to a single line.
--
, '^([A-Z-]+):(.*)$((\n^\t.*$)*)'
, '#OPENTAG#$1#CLOSETAG#$2#CONTINUEDLINESSTART#$3#CONTINUEDLINESEND##OPENTAG#/$1#CLOSETAG#'
, 1
, 0 /* All appearances. */
, 'm' /* Consider whole file as multiple lines. */
)
--
-- Step 2 for rewrite multiline values (continuation line starts with TAB) to a single line.
--
, '\n\t'
, ''
, 1
, 0
, 'm'
)
)
--
-- After xmlencode payload, rewrite the escaped
-- tags to the XML variant.
--
, '#OPENTAG#'
, '<'
, '#CLOSETAG#'
, '>'
, '#CONTINUEDLINESSTART#'
, ''
, '#CONTINUEDLINESEND#'
, ''
)
)
xml
from files@os('c:\temp', '*.ics', false) fle
join read_file_text@os(path => fle.file_path) rft
The header in XML becomes something like:
<?xml version="1.0" encoding="utf-8"?>
<VCALENDAR>
<PRODID>-//Microsoft Corporation//Outlook 16.0 MIMEDIR//EN</PRODID>
<VERSION>2.0</VERSION>
<METHOD>PUBLISH</METHOD>
<X-CALSTART>20180517T123000Z</X-CALSTART>
<X-WR-RELCALID>{0000002E-6F07-C199-97F1-D97565A83546}</X-WR-RELCALID>
<X-WR-CALNAME>MyCalendar</X-WR-CALNAME>
<VTIMEZONE>
<TZID>W. Europe Standard Time</TZID>
<STANDARD>
<DTSTART>16011028T030000</DTSTART>
<RRULE>FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10</RRULE>
<TZOFFSETFROM>+0200</TZOFFSETFROM>
<TZOFFSETTO>+0100</TZOFFSETTO>
</STANDARD>
whereas the event in XML format becomes:
<VEVENT>
<ATTENDEE>someone</ATTENDEE>
<CLASS>PUBLIC</CLASS>
<CREATED>20240910T142509Z</CREATED>
<DESCRIPTION>Line 1continuation line 2continuation line 3</DESCRIPTION>
<DTEND>20241001T150000</DTEND>
<DTSTAMP>20240910T142856Z</DTSTAMP>
<DTSTART>20241001T140000</DTSTART>
<LAST-MODIFIED>20240910T142857Z</LAST-MODIFIED>
<LOCATION>Amsterdam\, Nederland</LOCATION>
<ORGANIZER>someone</ORGANIZER>
<PRIORITY>5</PRIORITY>
<SEQUENCE>0</SEQUENCE>
<SUMMARY>Line 1continuation line 2</SUMMARY>
<TRANSP>OPAQUE</TRANSP>
<UID>040000008200E00074C5B7101A82E00800000000CC53B73D8D03DB01000000000000000010000000A070351FB134DE48A12A5F436749365E</UID>
<X-ALT-DESC><html>\n<head>\n<meta http-equiv="Content-Type" .../html>\n</X-ALT-DESC>
<X-MICROSOFT-CDO-BUSYSTATUS>BUSY</X-MICROSOFT-CDO-BUSYSTATUS>
<X-MICROSOFT-CDO-IMPORTANCE>1</X-MICROSOFT-CDO-IMPORTANCE>
<X-MICROSOFT-DISALLOW-COUNTER>FALSE</X-MICROSOFT-DISALLOW-COUNTER>
<X-MS-OLK-APPTLASTSEQUENCE>0</X-MS-OLK-APPTLASTSEQUENCE>
<X-MS-OLK-APPTSEQTIME>20240910T142856Z</X-MS-OLK-APPTSEQTIME>
<VALARM>
<TRIGGER>-PT15M</TRIGGER>
<ACTION>DISPLAY</ACTION>
<DESCRIPTION>Reminder</DESCRIPTION>
</VALARM>
</VEVENT>
XML Structure
The resulting XML structure is:
VCALENDAR
+- VTIMEZONE*
+- VEVENT*
XML to Table
The mapping from XML to a table with calendar items is quite simple:
create or replace table CalendarItems@InMemoryStorage
as
select t.file_path
, x.uid
, x.attendee
, to_date(translate(x.created, 'xTZ', 'x'), 'YYYYMMDDHH24MISS') createdUtc
, to_date(translate(x.dtstart, 'xTZ', 'x'), 'YYYYMMDDHH24MISS') startUtc
, to_date(translate(x.dtend, 'xTZ', 'x'), 'YYYYMMDDHH24MISS') endUtc
, x.location
, x.summary
, x.description
from Ics2Xml@InMemoryStorage t
join xmltable
( '/VCALENDAR/VEVENT'
passing t.xml
columns attendee varchar2 path 'ATTENDEE'
, created varchar2 path 'CREATED'
, description varchar2 path 'DESCRIPTION'
, dtstart varchar2 path 'DTSTART'
, dtend varchar2 path 'DTEND'
, location varchar2 path 'LOCATION'
, summary varchar2 path 'SUMMARY'
, uid varchar2 path 'UID'
) x
Export Events to Excel or JSON
Invantive Script can be used to export the calendar events to (for instance) Excel, using for example:
select *
from CalendarItems@InMemoryStorage
local export results as "c:\temp\calendar-items.xlsx" format xlsx include technical headers
But you can also use the SQL-integrated tools such as JSON (see Generate JSON from Invantive UniversalSQL queries):
select *
from CalendarItems@InMemoryStorage
for json auto
More samples for export approaches can be found in: