ploadText

swordfish.function.ploadText()

Load a text data file in parallel as an in-memory partitioned table. When the file is larger than 16 MB, it returns an in-memory table with sequential partitions; otherwise, it returns a regular in-memory table.

Note

  • The partitioned table returned by ploadText distributes data evenly across all partitions. Each partition holds between 8 and 16 MB of data.

  • ploadText loads data concurrently, so it is faster than loadText.
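
As a rough illustration of the sizing rules above, the following sketch estimates from the file size alone whether a partitioned table would be returned and how many partitions to expect. The helper estimate_partitions is hypothetical and not part of swordfish, and treating 1 MB as 1024 * 1024 bytes is an assumption; the actual partitioning is decided by the engine.

```python
import math
from typing import Tuple

MB = 1024 * 1024  # assumption: "MB" in the text means 1024 * 1024 bytes


def estimate_partitions(size_bytes: int) -> Tuple[int, int]:
    """Return (fewest, most) partitions implied by the 8-16 MB rule.

    Hypothetical helper for illustration only. Files of 16 MB or less
    are described as loading into a regular table, reported as (1, 1).
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= 16 * MB:
        return (1, 1)
    # Fewest partitions: every partition is filled to the 16 MB maximum.
    fewest = math.ceil(size_bytes / (16 * MB))
    # Most partitions: every partition holds only the 8 MB minimum.
    most = max(fewest, size_bytes // (8 * MB))
    return (fewest, most)
```

Passing os.path.getsize(path) for a given file yields the estimate; a 100 MB file, for instance, falls in the range of 7 to 12 partitions under these assumptions.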

Parameters:
  • filename (Constant) – The input text file name with its absolute path. Currently only .csv files are supported.

  • delimiter (Constant, optional) – A STRING scalar indicating the table column separator. It can consist of one or more characters, with the default being a comma (‘,’).

  • schema (Constant, optional) –

    A table. It can have the following columns, among which “name” and “type” columns are required.

    Column    Data Type               Description
    name      STRING scalar           column name
    type      STRING scalar           data type
    format    STRING scalar           the format of temporal columns
    col       INT scalar or vector    the columns to be loaded

    Note

    If “type” specifies a temporal data type, the format of the source data must match a DolphinDB temporal data type. If the format of the source data and the DolphinDB temporal data types are incompatible, you can specify the column type as STRING when loading the data and convert it to a DolphinDB temporal data type using the temporalParse function afterwards.

  • skipRows (Constant, optional) – An integer between 0 and 1024 indicating the number of rows at the beginning of the text file to skip. The default value is 0.

  • arrayDelimiter (Constant, optional) – A single character indicating the delimiter between elements in columns that hold array vectors. Before importing, use the schema parameter to set the "type" of each such column to the corresponding array vector data type.

  • containHeader (Constant, optional) – A Boolean value indicating whether the file contains a header row. The default value is null.

  • arrayMarker (Constant, optional) –

    A string containing 2 characters or a CHAR pair. These two characters mark the left and right boundaries of an array vector. The default identifiers are double quotes (").

    • It cannot contain spaces, tabs (\t), or newline characters (\r or \n).

    • It cannot contain digits or letters.

    • If one is a double quote ("), the other must also be a double quote.

    • If the identifier is ', ", or \, escape it with a backslash (\) as appropriate. For example, arrayMarker="\"\"".

    • If delimiter specifies a single character, arrayMarker cannot contain the same character.

    • If delimiter specifies multiple characters, the left boundary of arrayMarker cannot be the same as the first character of delimiter.
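
The argument constraints listed above can be checked before a load is attempted. The sketch below restates them in plain Python; the helper names (check_skip_rows, check_schema_columns, check_array_marker) are hypothetical and not part of swordfish, and the library may apply these rules differently or report different errors.

```python
def check_skip_rows(skip_rows):
    """skipRows must be an integer between 0 and 1024."""
    if isinstance(skip_rows, bool) or not isinstance(skip_rows, int):
        raise ValueError("skipRows must be an integer")
    if not 0 <= skip_rows <= 1024:
        raise ValueError("skipRows must be between 0 and 1024")


def check_schema_columns(columns):
    """The schema table must include the 'name' and 'type' columns."""
    missing = {"name", "type"} - set(columns)
    if missing:
        raise ValueError(f"schema is missing columns: {sorted(missing)}")


def check_array_marker(marker, delimiter=","):
    """Apply the documented arrayMarker rules to a 2-character string."""
    if not delimiter:
        raise ValueError("delimiter must have at least one character")
    if len(marker) != 2:
        raise ValueError("arrayMarker must contain exactly 2 characters")
    for ch in marker:
        if ch in " \t\r\n":
            raise ValueError("arrayMarker cannot contain whitespace")
        if ch.isdigit() or ch.isalpha():
            raise ValueError("arrayMarker cannot contain digits or letters")
    # A double quote on one side requires a double quote on the other.
    if '"' in marker and marker != '""':
        raise ValueError("both boundaries must be double quotes")
    if len(delimiter) == 1 and delimiter in marker:
        raise ValueError("arrayMarker cannot contain the delimiter")
    if len(delimiter) > 1 and marker[0] == delimiter[0]:
        raise ValueError("left boundary cannot equal delimiter's first character")
```

For example, check_array_marker("[]") passes with the default comma delimiter, while check_array_marker(",]") raises ValueError because the marker contains the delimiter.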