Ben Nour

Using SQL to calculate how often and how many days a store's inventory is out of stock

Recently a Reddit user posted in r/SQL wanting to know how to calculate
how often and how many days an item was out of stock.

I enjoyed coming up with a solution and so I’ve decided to write about it in more detail.

Note that I’ll be using Postgres.

The data

First, the dataset we’ll be operating on:

iteminventory_check_dtinventory_level
Bicycle2024-01-1010
Bicycle2024-01-155
Bicycle2024-01-160
Bicycle2024-01-200
Bicycle2024-01-230
Bicycle2024-01-250
Bicycle2024-01-2820
Bicycle2024-01-2912
Bicycle2024-01-310

Each record in the dataset is when an inventory check is conducted.

Here’s the DDL:

CREATE TABLE stock (
    item VARCHAR(50),
    transactioninventory_check_dt_date DATE,
    inventory_level INT
);

INSERT INTO stock (item, inventory_check_dt, inventory_level) VALUES
('Bicycle', '2024-01-10', 10),
('Bicycle', '2024-01-15', 5),
('Bicycle', '2024-01-16', 0),
('Bicycle', '2024-01-20', 0),
('Bicycle', '2024-01-23', 0),
('Bicycle', '2024-01-25', 0),
('Bicycle', '2024-01-28', 20),
('Bicycle', '2024-01-29', 12),
('Bicycle', '2024-01-31', 0)
;

The problem

We want to find out how often our store was out of stock of bicycles and for how many days.

We’re going to operate on the assumption that when new stock comes in an inventory check is recorded in our table.

Calculating how many times bicycles were sold out

It’s simple enough to verify when an item goes out of stock. There’s 0 inventory on hand and when the last inventory check was conducted there was stock.

Verifying this nicely demonstrates the power of both CASE statements and window functions and how they can be combined to analyse and transform data. In this case we use them to verify both the aformentioned criteria.

Specifically by utilising the LAG() window function we can look to see how much inventory there was on the previous inventory check date:

SELECT
item
, inventory_check_dt
, inventory_level
, CASE 
	WHEN inventory_level != 0 THEN 'IN STOCK'
    WHEN inventory_level = 0 AND LAG(inventory_level, 1) OVER (PARTITION BY ITEM ORDER BY inventory_check_dt) != 0 
		THEN 'OUT_OF_STOCK' 
  END AS stock_status
FROM stock
;
iteminventory_check_dtinventory_levelstock_status
Bicycle2024-01-1010IN STOCK
Bicycle2024-01-155IN STOCK
Bicycle2024-01-160OUT_OF_STOCK
Bicycle2024-01-200
Bicycle2024-01-230
Bicycle2024-01-250
Bicycle2024-01-2820IN STOCK
Bicycle2024-01-2912IN STOCK
Bicycle2024-01-310OUT_OF_STOCK

Perfect! We can see that twice bicycles have gone out of stock.

Calculating how long bicycles were sold out

Next let’s calculate how long bicycles were sold out for.

All we need to do is calculate the difference in days between when an item was flagged as having gone out of stock and when it next came into stock (or more accurately, when an inventory check was recorded that revealed there was stock).

Again we’ll use a CASE statement but this time with the LEAD() window function to return the next transcation date.

We’ll filter out rows with NULL for the stock_status column so that what will be returned, if it exists, is the next date in which an inventory count is conducted that reveals bicycles are back in stock.

I’ve also wrapped COALESCE() around LEAD() so if there is no next row (meaning bicycles are still out of stock) what’s returned is the current date:

WITH stock_status AS 
(
SELECT
*
, CASE 
	WHEN inventory_level != 0 THEN 'IN STOCK'
    WHEN inventory_level = 0 AND LAG(inventory_level, 1) OVER (PARTITION BY item ORDER BY inventory_check_dt) != 0 
    	THEN 'OUT_OF_STOCK' 
  END AS stock_status
FROM stock
)
SELECT
item
, inventory_check_dt
, stock_status
, CASE 
	WHEN stock_status = 'OUT_OF_STOCK' 
		THEN COALESCE(LEAD(inventory_check_dt, 1) OVER (PARTITION BY ITEM ORDER BY inventory_check_dt), CURRENT_DATE)
		- inventory_check_dt 
  END AS days_out_of_stock
FROM stock_status
WHERE stock_status IS NOT NULL
;
iteminventory_check_dtstock_statusdays_out_of_stock
Bicycle2024-01-10IN STOCK
Bicycle2024-01-15IN STOCK
Bicycle2024-01-16OUT_OF_STOCK12
Bicycle2024-01-28IN STOCK
Bicycle2024-01-29IN STOCK
Bicycle2024-01-31OUT_OF_STOCK246

Perfect!

We have our answer.

The Redditor who I helped asked for a single row per item which broke out how many
times the item was out of stock and for how long (in a comma-seperated string) so let’s do that:

WITH stock_status AS 
(
SELECT
*
, CASE 
	WHEN inventory_level != 0 THEN 'IN STOCK'
    WHEN inventory_level = 0 AND LAG(inventory_level, 1) OVER (PARTITION BY item ORDER BY inventory_check_dt) != 0 
    	THEN 'OUT_OF_STOCK' 
  END AS stock_status
FROM stock
), time_calculation AS 
(
SELECT
item
, stock_status
, COALESCE(LEAD(inventory_check_dt, 1) OVER (PARTITION BY ITEM ORDER BY inventory_check_dt), CURRENT_DATE) 
- inventory_check_dt AS days_out_of_stock
FROM stock_status
WHERE stock_status IS NOT NULL 
)
SELECT 
item 
, COUNT(stock_status) AS times_out_of_stock
, STRING_AGG(days_out_of_stock::VARCHAR, ',') AS days_out_of_stock
FROM time_calculation 
WHERE stock_status = 'OUT_OF_STOCK'
GROUP BY item 
;

Our final result set:

itemtimes_out_of_stockdays_out_of_stock
Bicycle212,153

#Data-Analysis #Sql