The following special drivers cannot process text larger than 1 billion characters / 2 billion bytes:
- csvtable: process text as CSV.
- htmltable: process text as HTML tree.
- jsontable: process text as JSON (non-NDJSON).
- xmltable: process text as XML tree.
- exceltable: a little different since it takes binary input, but we have never encountered Excel files larger than 2 GB.
The text is read completely into memory, taking 2 bytes per character (UTF-16 encoding). A 2 GB text file in UTF-8 typically converts into 4 GB of memory consumption. The .NET platform we use cannot support text strings larger than 2 GB, even on 64-bit.
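The arithmetic above can be checked for a given file before loading it. The following is a minimal sketch in Python, not part of the product: the function name `utf16_bytes` and the constant `LIMIT_BYTES` are hypothetical, the file is assumed to be UTF-8, and the 2 billion byte ceiling is taken from the text above rather than from an exact .NET figure.

```python
import codecs

# Assumed ceiling, taken from the text above; the exact .NET limit may differ.
LIMIT_BYTES = 2_000_000_000


def utf16_bytes(path, chunk_size=1 << 20):
    """Return the number of bytes the file's text would occupy as UTF-16.

    The file is decoded in chunks, so the estimate itself never needs
    the whole text in memory.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # The incremental decoder buffers multi-byte characters
            # that are split across chunk boundaries.
            text = decoder.decode(chunk)
            total += len(text.encode("utf-16-le"))
    # Raises UnicodeDecodeError if the file ends in a truncated character.
    decoder.decode(b"", final=True)
    return total


def fits_in_memory_string(path):
    """True when the UTF-16 size stays below the assumed ceiling."""
    return utf16_bytes(path) < LIMIT_BYTES
```

For a file that is mostly ASCII, the result is about twice the file size on disk, which matches the 2 GB to 4 GB example.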
For another example, see "System.OutOfMemoryException thrown on JSONTABLE with Twinfield BI-ON-Focus" (post 8 by anon44580209).
In the future, the processing could be made streaming, consuming the text character by character instead of as one large text, but there are currently not enough use cases to justify it.
As a workaround, pre-process the files with a tool like the UNIX cut.
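Where cut is not available, or the CSV contains quoted fields that cut does not handle, the same kind of pre-processing could be sketched in Python. This is only an illustration under assumptions: `keep_columns` is a hypothetical helper, the input is assumed to be UTF-8 CSV, and the column positions are zero-based. It reads and writes one row at a time, so memory use does not grow with file size.

```python
import csv


def keep_columns(src_path, dst_path, columns, delimiter=","):
    """Copy only the given zero-based columns from src to dst, row by row."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
        for row in reader:
            # Skip positions that a short row does not have.
            writer.writerow([row[i] for i in columns if i < len(row)])
```

For example, `keep_columns("large.csv", "small.csv", [0, 1])` would write only the first two columns to `small.csv`; both file names are placeholders. The reduced file can then be checked against the size limit before it is passed to csvtable.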