Data Import Method
DolphinDB offers a comprehensive suite of tools for handling data transfers across different file formats and sources. These tools enable you to load, clean, and transfer data either immediately or on a scheduled basis, streamlining the process of data management and integration.
Built-in Functions
DolphinDB provides built-in functions for importing and exporting data in text, binary, and JSON files. During import, users can set parameters to control data type matching and other preprocessing steps.
Importing Text Files
Function | Description |
---|---|
loadText | Imports a text file as an in-memory table. |
ploadText | Imports a text file as a partitioned in-memory table in parallel, which is faster than loadText. ploadText leverages multi-core CPUs for parallel loading, making it suitable for quickly loading larger files (16 MB or more). |
loadTextEx | Imports text files directly into either a distributed database or in-memory database. loadTextEx loads a text file into a database in batches to prevent OOM issues. It enables data cleaning and preprocessing during data import. |
textChunkDS | Divides the text file into multiple data sources. It can be used with function mr to load data. |
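The batch-oriented loading described for loadTextEx can be pictured with a short Python sketch. This is a conceptual illustration only, not DolphinDB's implementation or API: the function names (load_in_batches, drop_missing_prices), the sample columns, and the batch size are all hypothetical. It reads a CSV stream a few rows at a time, cleans each batch, and appends the result to a target, so memory use is bounded by the batch size rather than by the file size.

```python
import csv
import io
from itertools import islice

def load_in_batches(text_stream, batch_size, transform, sink):
    """Read CSV rows in fixed-size batches, clean each batch, append it to sink."""
    reader = csv.DictReader(text_stream)
    total = 0
    while True:
        # Pull at most batch_size rows; an empty list means end of input.
        batch = list(islice(reader, batch_size))
        if not batch:
            break
        cleaned = transform(batch)
        sink.extend(cleaned)
        total += len(cleaned)
    return total

def drop_missing_prices(rows):
    # Hypothetical cleaning step: skip rows with no price, convert price to float.
    return [
        {"sym": r["sym"], "price": float(r["price"])}
        for r in rows
        if r["price"] != ""
    ]

# Hypothetical in-memory sample standing in for a text file on disk.
sample = io.StringIO("sym,price\nA,1.5\nB,\nC,3.0\nD,4.25\nE,\n")
table = []
rows_loaded = load_in_batches(
    sample, batch_size=2, transform=drop_missing_prices, sink=table
)
print(rows_loaded)
```

Only one batch is held in memory at a time, and the cleaning function runs before each batch reaches the target, which mirrors the two properties the table attributes to loadTextEx.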
Importing Binary and JSON Files
File Types | Function | Description |
---|---|---|
Binary Files | readRecord! | Imports binary files into memory. Does not support strings. |
Binary Files | loadRecord | Imports binary files with fixed-length rows into memory. Supports strings. |
JSON Files | fromJson, fromStdJson | Converts a JSON string into a DolphinDB object. |
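To make "fixed-length rows" concrete, the following Python sketch packs and unpacks records that all occupy the same number of bytes, with strings padded to a fixed width. The row layout (a 4-byte integer, an 8-byte float, and an 8-byte symbol) and the helper names are assumptions chosen for illustration. They are not a layout defined by DolphinDB.

```python
import struct

# Hypothetical row layout (little-endian, no padding):
# int32 id, float64 price, symbol stored in exactly 8 bytes.
ROW = struct.Struct("<id8s")

def pack_rows(rows):
    """Serialize (id, price, symbol) tuples into fixed-length binary rows."""
    return b"".join(
        ROW.pack(row_id, price, symbol.encode("ascii"))
        for row_id, price, symbol in rows
    )

def unpack_rows(blob):
    """Decode fixed-length rows back into (id, price, symbol) tuples."""
    rows = []
    for row_id, price, raw_symbol in ROW.iter_unpack(blob):
        # Strip the null bytes that padded the symbol to 8 bytes.
        rows.append((row_id, price, raw_symbol.rstrip(b"\x00").decode("ascii")))
    return rows

records = [(1, 10.5, "IBM"), (2, 20.25, "MSFT")]
blob = pack_rows(records)
decoded = unpack_rows(blob)
```

Because every row has the same size, a reader can find row n at byte offset n * ROW.size without scanning the file, which is why string columns in such files need a fixed width.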
Data Export
File Types | Function | Description |
---|---|---|
Text Files | saveText | Saves any variable or table as a text file on disk. |
Text Files | saveTextFile | Saves strings to a file by appending or overwriting. |
Binary Files | writeRecord | Converts a DolphinDB object (e.g., table or tuple) into a binary file and returns the number of rows written to the file. |
Binary Files | saveAsNpy | Saves a DolphinDB vector or matrix as a .npy binary file supported by NumPy. |
JSON Files | toJson, toStdJson | Converts a DolphinDB object into a JSON string. |
Plugins and Tools
DolphinDB also provides a variety of specialized plugins for seamless data integration.
Plugins for Database Migration
With ready-to-use interfaces, database migration plugins simplify the process of transferring data between systems. Dedicated plugins are available for specific databases like MySQL, kdb+, and MongoDB. Additionally, the ODBC plugin serves as a universal connector, enabling integration with other database systems including SQL Server, Oracle, ClickHouse, and SQLite.
Plugin | Description |
---|---|
odbc | Reads data from other data sources via ODBC, including data from MySQL, Oracle, SQL Server, and other databases. |
mysql | Connects to MySQL and reads data. |
HBase | Connects to HBase via Thrift to read data. |
kdb | Connects to the kdb+ database or directly reads kdb+ data files from disk to import data. |
mongodb | Connects to MongoDB and reads data. |
Plugins for File-Based Data Import
File-based data import plugins excel at handling large volumes of data.
Plugin | Description |
---|---|
Arrow | Serializes data in Apache Arrow format. |
aws | Reads from and writes to AWS S3 network files. |
feather | Reads from and writes to Apache Feather files. |
hdfs | Reads from and writes to Hadoop HDFS files. |
hdf5 | Reads from and writes to HDF5 files. |
mat | Reads from and writes to MATLAB files. |
mseed | Reads from and writes to miniSEED files. |
orc | Reads from and writes to ORC files. |
parquet | Reads from and writes to Apache Parquet files. |
zip | Extracts ZIP files. |
zlib | Compresses and decompresses gz files. |
Plugins for Message Middleware Integration
Message middleware plugins enable real-time data ingestion by establishing connections with various messaging systems. These plugins serve as bridges, allowing seamless streaming of live data directly into DolphinDB's environment.
Plugin | Description |
---|---|
zmq | Publishes and subscribes to ZeroMQ messages. |
mqtt | Publishes and subscribes to MQTT messages. |
kafka | Publishes or subscribes to Kafka messages. |
Other Data Import Tools
DolphinDB offers the dolphindbwriter plugin, built on the DataX offline data synchronization tool, to import data from various sources and synchronize incremental updates with high extensibility and versatility.
Users can also import data through DolphinDB's C++ API, Python API, Java API, and other supported interfaces. For details, refer to the corresponding API documentation.