Parquet Convert
The Parquet Convert Node transforms structured data from either JSON or XML files into the highly efficient, column-oriented Parquet format. It automatically analyses the input data to build a schema and allows you to apply one of several compression methods to the output file. To learn more about Parquet files and their uses, see the Apache Parquet website.
Revision History
1.0.0.0 Initial release.
1.0.0.1 Fixed empty field handling.
Properties
Action
Type: List Input
Selects the input data format to convert from.
JsonToParquet - Converts JSON data into a Parquet file.
XmlToParquet - Converts XML data into a Parquet file.
Input
Type: File Input
The source JSON or XML file you want to convert. This property expects a file provided as a byte array.
CompressionMethod
Type: List Input
Specifies the compression algorithm to use for the output Parquet file.
None (Default)
Snappy
Gzip
Lzo
Brotli
LZ4
Zstd
Lz4Raw
ParquetOutput
Type: File Output
The resulting Parquet file, provided as a byte array.
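Because the output arrives as a byte array, a quick sanity check can be useful before passing it downstream. The sketch below relies only on the Parquet file format itself, in which an unencrypted file begins and ends with the 4-byte marker PAR1. The function name looks_like_parquet is hypothetical and is not part of the Node.

```python
PARQUET_MAGIC = b"PAR1"  # marker at the start and end of an unencrypted Parquet file


def looks_like_parquet(data: bytes) -> bool:
    """Cheap structural check: leading magic, trailing magic, minimum length.

    12 bytes is the smallest possible layout: 4 (magic) + 4 (footer length)
    + 4 (magic). This does not validate the schema or the data pages.
    """
    return (
        len(data) >= 12
        and data.startswith(PARQUET_MAGIC)
        and data.endswith(PARQUET_MAGIC)
    )
```

A passing check only shows that the bytes are framed like a Parquet file; it says nothing about whether the contents are readable.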
Remarks
Schema Inference
The Node automatically determines the data types for your Parquet file by scanning the values in your source Input. Note the following behaviours:
If a column in your data contains mixed data types (e.g. both numbers and text), the Node will treat the entire column as a string to ensure no data is lost.
For the Node to correctly interpret nested data as a Struct Field (a nested object), every record must contain an object at that position, and all of those objects must have the exact same field names. If the fields differ between records, the Node will attempt to create a Map Field, which is currently unsupported.
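The inference behaviours described above can be illustrated with a minimal Python sketch. This is not the Node's actual implementation: the function names classify_column and infer_schema and the type labels ("number", "struct", "map" and so on) are hypothetical, and the treatment of empty values and of records that lack a nested object is an assumption.

```python
def classify_column(values):
    """Classify one column from every value seen at that position."""
    if any(isinstance(v, list) for v in values):
        return "list"  # arrays are reported separately (unsupported)
    if any(isinstance(v, dict) for v in values):
        # Struct: every record holds an object with identical field names.
        all_objects = all(isinstance(v, dict) for v in values)
        if all_objects and len({frozenset(v) for v in values}) == 1:
            return "struct"
        # Assumption: anything else involving objects ends up as a map.
        return "map"
    kinds = set()
    for v in values:
        if v is None:
            continue  # assumption: empty values do not influence the type
        if isinstance(v, bool):
            kinds.add("boolean")
        elif isinstance(v, (int, float)):
            kinds.add("number")
        else:
            kinds.add("string")
    # Mixed scalar types fall back to string so that no data is lost.
    return kinds.pop() if len(kinds) == 1 else "string"


def infer_schema(records):
    """Build {field name: inferred type} across a list of parsed records."""
    names = []
    for record in records:
        for key in record:
            if key not in names:
                names.append(key)
    return {n: classify_column([r.get(n) for r in records]) for n in names}


records = [
    {"id": 1, "code": "A1", "address": {"city": "Leeds", "zip": "LS1"}, "extra": {"a": 1}},
    {"id": 2, "code": 7, "address": {"city": "York", "zip": "YO1"}, "extra": {"b": 2}},
]
print(infer_schema(records))
# {'id': 'number', 'code': 'string', 'address': 'struct', 'extra': 'map'}
```

Here code mixes text and a number, so it becomes a string column; address has the same field names in every record, so it qualifies as a struct; extra has different field names per record, so it would fall into the unsupported map case.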
Known Issues
Unsupported Features
Currently, the Node does not support List Fields (arrays) or Map Fields (objects with dynamic keys). If your JSON or XML input contains these structures, the Node will show an error message. If you would like support for these data formats in a future update, please contact support to let us know.
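To find the offending fields before running the conversion, a JSON record can be scanned for arrays ahead of time. The following is a rough sketch using only the Python standard library; the helper name find_arrays is hypothetical, and it assumes the record has already been parsed from JSON.

```python
import json


def find_arrays(value, path=""):
    """Return the dotted path of every array inside a parsed JSON value."""
    found = []
    if isinstance(value, list):
        found.append(path or "<root>")
        # Keep walking so that arrays nested inside arrays are reported too.
        for index, item in enumerate(value):
            found.extend(find_arrays(item, f"{path}[{index}]"))
    elif isinstance(value, dict):
        for key, item in value.items():
            child = f"{path}.{key}" if path else key
            found.extend(find_arrays(item, child))
    return found


record = json.loads(
    '{"id": 1, "tags": ["a", "b"], "owner": {"name": "Ann", "roles": ["admin"]}}'
)
print(find_arrays(record))  # ['tags', 'owner.roles']
```

An empty result means the record contains no arrays; any paths returned point to fields that would need to be removed or restructured before conversion.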