qcut

Syntax

qcut(X, q, [labels], [dropDuplicates=false])

Details

Assigns each element of a numeric vector to a quantile bin based on its rank within the vector. For example, given 1,000 values and q = 10, the function divides the values into 10 quantile bins and returns the bin that each element belongs to.
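The binning idea described above can be sketched in Python. This is an illustration only, not the function's actual implementation: the helper names quantile and qcut_sketch are hypothetical, and the sketch assumes linear interpolation between ranks and right-closed bins. It does not handle NULLs or duplicate boundaries.

```python
from bisect import bisect_left

def quantile(sorted_vals, p):
    """Quantile of pre-sorted values by linear interpolation (assumed convention)."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def qcut_sketch(x, q, labels=None):
    # An integer q means q equal-probability bins;
    # otherwise q is taken as the list of quantile breakpoints.
    probs = [i / q for i in range(q + 1)] if isinstance(q, int) else list(q)
    s = sorted(x)
    edges = [quantile(s, p) for p in probs]
    inner = edges[1:-1]  # interior boundaries between adjacent bins
    # Right-closed bins (assumption): a value equal to a boundary
    # falls into the lower bin.
    idx = [bisect_left(inner, v) for v in x]
    return idx if labels is None else [labels[i] for i in idx]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(qcut_sketch(data, 4))
print(qcut_sketch(data, [0, 0.3, 0.7, 1.0]))
```

Under these assumptions, the interior boundaries for q = 4 are 3.25, 5.5, and 7.75, which yields the same bin indices as the first example in the Examples section.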

Parameters

X: A numeric vector.

q: An INT scalar or a FLOATING vector.

  • An INT scalar specifies the number of quantile bins (e.g., 10 for deciles, 4 for quartiles).
  • A FLOATING vector specifies the quantile breakpoints. It must contain at least two elements, with values in the range [0, 1].

labels (optional): A vector of labels for each quantile bin.

  • It defaults to NULL, in which case the function returns an integer vector holding the 0-based bin index of each element.
  • If q is a scalar, the length of labels must equal q.
  • If q is a vector, the length of labels must be len(q) - 1.

dropDuplicates (optional): A Boolean value specifying whether to drop duplicate bin boundaries.

  • It defaults to false, in which case an error is raised if duplicate boundaries exist.
  • If it is set to true, duplicate boundaries are removed, so the result may contain fewer bins than requested.

Returns

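A vector of the same length as X indicating the quantile bin to which each element belongs: integer bin indices if labels is not specified, or the corresponding labels otherwise.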

Examples

// Divide the data into 4 quantile bins
qcut([1,2,3,4,5,6,7,8,9,10], 4)
// Output: [0 0 0 1 1 2 2 3 3 3]

// Divide using custom quantile breakpoints: 0–30%, 30–70%, 70–100%
qcut([1,2,3,4,5,6,7,8,9,10], [0, 0.3, 0.7, 1.0])
// Output: [0 0 0 1 1 1 1 2 2 2]

// Divide the data into 4 quantile bins and use custom labels
qcut([1,2,3,4,5,6,7,8,9,10], 4, ["Q1", "Q2", "Q3", "Q4"])
// Output: [Q1 Q1 Q1 Q2 Q2 Q3 Q3 Q4 Q4 Q4]

/* Because the data contains many duplicate values,
   the quantile boundaries are not unique.
   With dropDuplicates enabled, the duplicate boundaries are removed,
   resulting in fewer than 4 quantile bins.
*/
qcut(X=[1, 1, 1, 1, 2, 3], q=4, dropDuplicates=true)
// Output: [0 0 0 0 2 2]