In PostgreSQL, a window function (also known as an analytical function) is a type of SQL function that performs a calculation across a set of rows related to the current row, without reducing the result set. Window functions operate on a group of rows called a “window” that is defined by a set of rows specified by the OVER clause.

I. Key Features of Window Functions

  1. Partitioning: Window functions can be partitioned into groups of rows based on one or more columns. Each partition forms a separate window for calculation.

  2. Ordering: Within each partition, rows can be ordered using the ORDER BY clause, which defines the sequence of rows used in the calculation.

  3. Frame: The frame defines the subset of rows within the partition used in the calculation. It specifies the rows preceding or following the current row, as well as the range or number of rows to include in the frame.

  4. Aggregation: Window functions can perform aggregation operations such as sum, average, count, min, and max over the rows in the window.

II. Common Use Cases of Window Functions:

  1. Ranking and Percentiles: Window functions can be used to assign ranks, calculate percentiles, and identify top or bottom performing rows within a group.

  2. Moving Averages and Cumulative Sums: Window functions are commonly used to calculate moving averages, cumulative sums, and other rolling calculations over a specified window of rows.

  3. Comparative Analysis: Window functions facilitate comparative analysis by allowing calculations to be performed across different rows or groups of rows within a partition.

  4. Aggregating Data without Grouping: Window functions provide a way to aggregate data without collapsing the result set into a single row, allowing for more granular analysis.

III .Methods of Window Function Calculation:

Input data:

order_idcustomer_idorder_datetotal_amount
112024-01-01100
212024-01-05150
322024-01-10200
422024-01-15120
  1. ROW_NUMBER(): Assigns a unique sequential integer to each row within a partition.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS row_number
FROM
    orders;

Result:

order_idorder_datetotal_amountrow_number
12024-01-011001
22024-01-051501
32024-01-102001
42024-01-151202
  1. RANK(): Assigns a unique rank to each row within a partition, with gaps in the ranking for duplicate values.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    RANK() OVER (PARTITION BY customer_id ORDER BY total_amount DESC) AS rank
FROM
    orders;

Result:

order_idorder_datetotal_amountrank
12024-01-011002
22024-01-051501
32024-01-102001
42024-01-151203
  1. DENSE_RANK(): Assigns a unique rank to each row within a partition, without gaps in the ranking for duplicate values.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY total_amount DESC) AS dense_rank
FROM
    orders;

Result:

order_idorder_datetotal_amountdense_rank
12024-01-011002
22024-01-051501
32024-01-102001
42024-01-151203
  1. NTILE(): Divides the rows within a partition into a specified number of buckets or tiles.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    NTILE(2) OVER (PARTITION BY customer_id ORDER BY total_amount DESC) AS ntile
FROM
    orders;

Result:

order_idorder_datetotal_amountntile
12024-01-011002
22024-01-051501
32024-01-102001
42024-01-151202
  1. LAG() and LEAD(): Accesses the value of a column from a previous or subsequent row within the partition.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    LAG(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_amount,
    LEAD(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS next_amount
FROM
    orders;

Result:

order_idorder_datetotal_amountprev_amountnext_amount
12024-01-01100NULL150
22024-01-05150100200
32024-01-10200150120
42024-01-15120200NULL
  1. FIRST_VALUE() and LAST_VALUE(): Returns the first or last value of a column within the partition.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    FIRST_VALUE(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS first_amount,
    LAST_VALUE(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS last_amount
FROM
    orders;

Result:

order_idorder_datetotal_amountfirst_amountlast_amount
12024-01-01100100120
22024-01-05150100120
32024-01-10200100120
42024-01-15120100120
  1. SUM(), AVG(), MIN(), MAX(): Perform aggregation operations over the rows in the window.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS cumulative_amount
FROM
    orders;

Result:

order_idorder_datetotal_amountcumulative_amount
12024-01-01100100
22024-01-05150250
32024-01-10200450
42024-01-15120570
  1. CUME_DIST(): Calculates the cumulative distribution of a value within a partition.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    CUME_DIST() OVER (PARTITION BY customer_id ORDER BY total_amount) AS cume_dist
FROM
    orders;

Result:

order_idorder_datetotal_amountcume_dist
12024-01-011000.333
22024-01-051500.666
32024-01-102001.0
42024-01-151201.0
  1. PERCENT_RANK(): Calculates the relative rank of a row within a partition as a percentage.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    PERCENT_RANK() OVER (PARTITION BY customer_id ORDER BY total_amount) AS percent_rank
FROM
    orders;

Result:

order_idorder_datetotal_amountpercent_rank
12024-01-011000.0
22024-01-051500.5
32024-01-102001.0
42024-01-151200.0
  1. PERCENTILE_CONT() and PERCENTILE_DISC(): Calculate the percentile value of a column within a partition.

Example:

SELECT
    order_id,
    order_date,
    total_amount,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_amount) OVER (PARTITION BY customer_id) AS median_amount
FROM

    orders;

Result:

order_idorder_datetotal_amountmedian_amount
12024-01-01100125
22024-01-05150125
32024-01-10200160
42024-01-15120160

IV. Conclusion

Window functions in PostgreSQL offer powerful capabilities for performing complex analytical calculations within SQL queries. By leveraging window functions, you can perform sophisticated analysis and gain valuable insights into your data without the need for complex procedural code.