# DStream::windowJoinEngine {#DStream_windowJoinEngine}

## Syntax {#syntax}

`DStream::windowJoinEngine(rightStream, window, metrics, matchingColumn, [timeColumn], [useSystemTime=false], [garbageSize], [maxDelayedTime], [nullFill], [sortByTime], [closed])`

## Details {#details}

Creates a window join streaming engine. For details, see [createWindowJoinEngine](../c/createWindowJoinEngine.md).

**Return value**: A DStream object.

## Arguments {#arguments}

**rightStream** is a DStream object indicating the input data source of the right table.

**window** is a pair of integers or duration values, indicating the range of a sliding window, including both left and right bounds.
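
To make the inclusive bounds concrete, the batch function `wj` can serve as a rough, non-streaming analogue. The sketch below uses made-up tables `t1` and `t2` and assumes the engine applies the same window convention as `wj`; it is an illustration, not a description of the engine's internals.

```
// Illustrative data only: t1 plays the role of the left table, t2 the right table.
t1 = table(`A`A`B as sym, 2012.01.01T00:00:00.000 + 5 8 5 as time, 1.0 2.0 3.0 as price)
t2 = table(take(`A, 10) join take(`B, 10) as sym, take(2012.01.01T00:00:00.000 + 1..10, 20) as time, double(1..20) as val)

// For each row of t1 with time t, sum val over the rows of t2 that have the same sym
// and a time in [t-2, t+2]. Both bounds are included.
wj(t1, t2, -2:2, <sum(val)>, `sym`time)
```

With *window* = -2:2, a right-table record whose time is exactly two units before or after the left record still falls inside the window.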

**metrics** is metacode \(which can be a tuple\) specifying the calculation formulas. For more information about metacode, please refer to [Metaprogramming](../c/../../Programming/Metaprogramming/functional_meta.md).

-   *metrics* can use one or more expressions, built-in or user-defined functions \(both aggregate functions and non-aggregate functions are accepted\).

-   *metrics* can include functions that return multiple values, in which case the output columns that hold the return values must be named. For example, &lt;func\(price\) as \`col1\`col2&gt;.

-   Column names in *metrics* are not case-sensitive, so their case need not match that of the column names in the input tables.


If you want to specify a column that exists in both the left and the right tables, use the format *tableName.colName*. By default, the column from the left table is used.

The following functions are optimized in the engine when they are applied only to the columns from the right table: `sum`, `sum2`, `avg`, `std`, `var`, `corr`, `covar`, `wavg`, `wsum`, `beta`, `max`, `min`, `last`, `first`, `med`, `percentile`.
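
As a sketch of the forms *metrics* can take, the fragments below reuse the column names `price` (left table) and `val` (right table) from the Examples section. `func` is a placeholder, not a real function.

```
// A tuple of expressions: two plain columns and one aggregate over a right-table column
metrics = <[price, val, sum(val)]>

// Several aggregates from the optimized list, all applied to the right-table column val
metrics = <[sum(val), avg(val), max(val), min(val)]>

// A function returning two values needs two output column names, for example:
//   <func(price) as `col1`col2>
// where func stands for any built-in or user-defined function with two return values.
```

Metacode is not evaluated when it is assigned, so these fragments only describe the calculation; the columns are resolved once the engine receives data.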

**matchingColumn** is a STRING scalar/vector/tuple indicating the column\(s\) on which the tables are joined. It supports integral, temporal or literal \(except UUID\) types.

-   One matching column: if the column has the same name in both tables, *matchingColumn* is a STRING scalar; otherwise it is a tuple of two elements. For example, if the column is named "sym" in the left table and "sym1" in the right table, then *matchingColumn* = \[\[\`sym\],\[\`sym1\]\].
-   Multiple matching columns: if all the columns have the same names in both tables, *matchingColumn* is a STRING vector; otherwise it is a tuple of two elements. For example, if the columns are named "timestamp" and "sym" in the left table and "timestamp" and "sym1" in the right table, then *matchingColumn* = \[\[\`timestamp, \`sym\], \[\`timestamp,\`sym1\]\].
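
The four cases above can be written out as follows. This is a sketch of the argument values only, using the column names from the bullet points.

```
// One matching column with the same name in both tables
matchingColumn = `sym

// One matching column named "sym" in the left table and "sym1" in the right table
matchingColumn = [[`sym], [`sym1]]

// Multiple matching columns with the same names in both tables
matchingColumn = `timestamp`sym

// Multiple matching columns where one name differs between the tables
matchingColumn = [[`timestamp, `sym], [`timestamp, `sym1]]
```

In the tuple forms, the first element lists the left-table columns and the second lists the right-table columns.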

**timeColumn** \(optional\) is a STRING scalar or vector. It must be specified when *useSystemTime* = false, and it indicates the name\(s\) of the time column in the left table and the right table. The time columns must have the same data type. If the time column has the same name in both tables, *timeColumn* is a string. Otherwise, it is a vector of 2 strings indicating the time column in each table.
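
A sketch of the two accepted forms follows. The name `time1` is hypothetical and only illustrates a right-table column whose name differs from the left table's.

```
// The time column is named "time" in both tables
timeColumn = `time

// The time column is named "time" in one table and "time1" (hypothetical) in the other
timeColumn = `time`time1
```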

**useSystemTime** \(optional\) indicates whether the left table and the right table are joined on the system time, instead of on the *timeColumn*.

-   *useSystemTime* = true: join records based on the system time \(timestamp with millisecond precision\) when they are ingested into the engine.

-   *useSystemTime* = false \(default\): join records based on the specified *timeColumn* from the left table and the right table.


**garbageSize** \(optional\) is a positive integer with the default value of 5,000 \(rows\). As the subscribed data is ingested into the engine, it keeps occupying memory. Within the left/right table, the records are grouped by *matchingColumn* values. When the number of records in a group exceeds *garbageSize*, the system removes from memory the records that have already been calculated.

**maxDelayedTime** \(optional\) is a positive integer. *maxDelayedTime* only takes effect when *timeColumn* is specified, and the two arguments must have the same time precision. Use *maxDelayedTime* to trigger windows that remain uncalculated long past their end. The default *maxDelayedTime* is 3 seconds. For more information about this parameter, see "Window triggering rules" in the Details section of [createWindowJoinEngine](../c/createWindowJoinEngine.md).

**nullFill** \(optional\) is a tuple of the same size as the number of output columns. It is used to fill in the null values in the output table. The data type of each element corresponds to each output column.

**sortByTime** \(optional\) is a Boolean value that indicates whether the output data is globally sorted by time. The default value is false, meaning the output data is sorted only within groups. Note that if *sortByTime* is set to true, the data input to the left and right tables must be globally sorted, and the parameter *maxDelayedTime* cannot be specified \(i.e., no delayed triggering allowed\).

**closed** \(optional\) is a string that indicates whether the left or the right boundary is included. It only takes effect when *window*=0:0.

## Examples {#examples}

``` {#codeblock_zq1_tms_c2c}
if (!existsCatalog("orca")) {
	createCatalog("orca")
}
go
use catalog orca

// If a stream graph with the same name already exists, destroy it first.
// dropStreamGraph('joinEngine')
g = createStreamGraph('joinEngine')

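// Define the right and left stream tables. Each left record is joined with the
// right records that have the same sym and a time inside the window -2:2.
// nullFill supplies one fill value per output column (time, sym, price, val, sum(val)).
r = g.source("right", 1024:0, `time`sym`val, [TIMESTAMP, SYMBOL, DOUBLE])
g.source("left", 1024:0, `time`sym`price, [TIMESTAMP, SYMBOL, DOUBLE])
    .windowJoinEngine(r, window=-2:2, metrics=<[price,val,sum(val)]>, matchingColumn=`sym, timeColumn=`time, useSystemTime=false, nullFill=[2012.01.01T00:00:00.000, `NONE, 0.0, 0.0, 0.0])
    .sink("output")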
g.submit()

go

n=10
tp1=table(take(2012.01.01T00:00:00.000+0..10, 2*n) as time, take(`A, n) join take(`B, n) as sym, take(NULL join rand(10.0, n-1),2*n) as price)
tp1.sortBy!(`time)
appendOrcaStreamTable("left", tp1)

tp2=table(take(2012.01.01T00:00:00.000+0..10, 2*n) as time, take(`A, n) join take(`B, n) as sym, take(double(1..n),2*n) as val)
tp2.sortBy!(`time)
appendOrcaStreamTable("right", tp2)


select * from orca_table.output
```

