Parsing a big CSV file raises itgenclr007: OutOfMemory

The following special drivers have a size limitation: they cannot process texts larger than approximately 1 billion characters (2 billion bytes):

  • csvtable: process text as CSV.
  • htmltable: process text as HTML tree.
  • jsontable: process text as JSON (non-NDJSON).
  • xmltable: process text as XML tree.
  • exceltable: slightly different since its input is binary, but we have never encountered Excel files larger than 2 GB.

The text is read completely into memory, taking 2 bytes per character (UTF-16 encoding). A 2 GB text file in UTF-8 therefore typically consumes 4 GB of memory. The .NET platform we use cannot support text strings larger than 2 GB, even on 64-bit.
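The doubling from UTF-8 to UTF-16 can be observed with a quick check (a sketch using the common iconv utility; the sample string is illustrative):

```shell
# Each ASCII character is 1 byte in UTF-8 but 2 bytes in UTF-16,
# so a 3-character string grows from 3 to 6 bytes after conversion.
printf 'abc' | wc -c                                # 3 bytes in UTF-8
printf 'abc' | iconv -f UTF-8 -t UTF-16LE | wc -c   # 6 bytes in UTF-16
```

The same factor of two applies to a multi-gigabyte file once it is loaded as an in-memory .NET string.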

For another example, see the topic "System.OutOfMemoryException thrown on JSONTABLE with Twinfield BI-ON-Focus" by anon44580209.

A future option is to make the processing streaming, consuming the text character by character instead of loading it as one large string, but there are currently not enough use cases to justify this.

As a workaround, pre-process the files with a tool such as the UNIX cut utility to reduce their size before loading.
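Such pre-processing can be sketched as follows (a minimal example; big.csv and its column layout are hypothetical stand-ins for the real export):

```shell
# Create a small sample CSV standing in for a multi-gigabyte export.
printf 'id,name,notes\n1,alpha,long text\n2,beta,more text\n' > big.csv

# Keep only the columns actually needed, shrinking every row
# before the file is handed to the csvtable driver.
cut -d ',' -f 1,2 big.csv > reduced.csv

cat reduced.csv
```

Note that cut is not quote-aware, so this only works when field values contain no embedded commas. Another option is the UNIX split utility, which chops the file into row-based chunks (for example, split -l 5000000 big.csv part_) that each stay below the 2 GB limit.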