Building A Complete Time-Series Data with SQL Date Spines and Cross Joins

September 8, 2025
5 min read

Learn how to build SQL date spines and use CROSS JOINs to create complete time series data for SaaS cohort analysis and LTV calculations.

Excerpt

During a recent team meeting, my analysts and I were discussing the challenges of cohorting and LTV analysis. Analysis that is crucial to the SaaS businesses we work with and informs our strategic and tactical marketing optimization. While the business logic behind these metrics is relatively straightforward (See my article on LTV), the SQL required to build the underlying data structures is complex. These analyses require a “date spine” table combined with a CROSS JOIN, a join clause that often puzzles analysts. In this article, I will guide you through building this foundation using a Cartesian product approach.

The Missing Date Problem

Transactional databases excel at recording events like sign-ups, purchases, and form submissions, but are not equipped to handle time series analysis. This naturally creates a problem for cohort analysis, where you need to track user behavior across consistent time periods. 

For example, a user who signs up on the first of a month and subscribes on the 9th of the month will have two records with both of these dates, but not the dates in between that could contain valuable information and be crucial to cohort analysis.

A date spine solves this by generating a calendar with a record for every date in your analysis period. When it is joined with your transactional data using a CROSS JOIN, there is now a date for every user for every day, regardless of whether they performed a recorded action or not.

To better understand this process, let's walk through building this foundation for a SaaS cohort analysis. Provided below is a sample transactional table of 15 records with:

  • signup_date (cohort date): When the user first registered.
  • event_date (n periods after): When a conversion event occurred.
  • subscription_type: The plan they selected. 

This business in this example has a SaaS offering and a subscription funnel of:

  • Users sign up for free accounts
  • Some convert to paid trials after a week
  • Some of those trial users convert to basic or premium subscriptions
  • The trial is for 7 days
signup_date event_date subscription_tier signups trials subscriptions
2025-01-13 2025-01-13 free 12 0 0
2025-01-13 2025-01-20 basic 0 8 5
2025-01-13 2025-01-27 premium 0 0 2
2025-01-14 2025-01-14 free 15 0 0
2025-01-14 2025-01-21 basic 0 11 7
2025-01-14 2025-01-28 premium 0 0 3
2025-01-15 2025-01-15 free 18 0 0
2025-01-15 2025-01-22 basic 0 14 9
2025-01-15 2025-01-29 premium 0 0 4
2025-01-16 2025-01-16 free 22 0 0
2025-01-16 2025-01-23 basic 0 17 11
2025-01-16 2025-01-30 premium 0 0 5
2025-01-17 2025-01-17 free 19 0 0
2025-01-17 2025-01-24 basic 0 15 10
2025-01-17 2025-01-31 premium 0 0 4

You likely recognize that this transactional table doesn’t have a continuous time series of signup_date and event_date records for each subscription type. This demonstrates the scenario in which we need to fill in the time series gaps to observe cohort maturation for a given signup date and subscription type.

Building a Date Spine with SQL

The solution is to create a date spine, a table containing every single date in your analysis period. This serves as the foundation for your analysis to ensure that you aren’t missing dates in your final analysis:

SELECT
   date_series.calendar_date
FROM (
   SELECT generate_series(
   DATE('2025-01-15'), -- Start date
   DATE('2025-04-30'), -- End date
   INTERVAL '1 day'
)::date AS calendar_date
) AS date_series;

This SQL query generates a date for every date from January 1st, 2025 to April 30th, 2025.

The Catesian Product with CROSS JOIN’s

Having a date spine alone isn’t enough. For cohort analysis, we need a grid for every combination of user cohort, date, and dimension, where every user has a record for every day and segment you want to analyze.

This is where CROSS JOIN becomes essential. Unlike typical joins, a CROSS JOIN creates a cartesian product; every row from the first table is combined with every row from the joining table, creating the completed grid.

To do this, we use the following SQL:

SELECT 
    sd.signup_date,
    ds.calendar_date,
    st.subscription_tier
  FROM (SELECT DISTINCT signup_date FROM your_table_name) sd
  CROSS JOIN date_spine ds  
  CROSS JOIN (SELECT DISTINCT subscription_tier FROM your_table_name) st
  WHERE ds.calendar_date >= sd.signup_date

Final Table (LEFT JOIN)

With the Cartesian product table generated, it is time to join the transactional data to complete the grid. The base table is the cartesian product, and you LEFT OUTER JOIN your original data to it. This ensures every date, cohort, and dimension combination has a row, even when no events occurred. The key is matching the three dimensions:

  • signup_date = signup_date (cohort date matching)
  • calendar_date = event_date (n periods matching)
  • subscription_tier = subscription_tier (dimension matching)

When no events occurred on a given day, the LEFT OUTER JOIN returns NULL values. We use COALESCE to convert those NULLs to zeros for the aggregated columns, giving us clean aggregations for analysis. 

The completed SQL described is here:

-- Step 1: Create the date spine (fixed syntax)
WITH date_spine AS (
  SELECT 
    date_series.calendar_date
  FROM (
    SELECT generate_series(
      DATE('2025-01-15'),  -- Start date
      DATE('2025-04-30'),  -- End date
      INTERVAL '1 day'
    )::date AS calendar_date
  ) AS date_series
),

-- Step 2: Create the complete scaffolding
complete_grid AS (
  SELECT 
    sd.signup_date,
    ds.calendar_date,
    st.subscription_tier
  FROM (SELECT DISTINCT signup_date FROM your_table_name) sd
  CROSS JOIN date_spine ds  
  CROSS JOIN (SELECT DISTINCT subscription_tier FROM your_table_name) st
  WHERE ds.calendar_date >= sd.signup_date
),

-- Step 3: LEFT JOIN with actual data
final_result AS (
  SELECT 
    cg.signup_date,
    cg.calendar_date,
    cg.subscription_tier,
    COALESCE(bt."Count(signups)", 0) AS signups,
    COALESCE(bt."Count(trials)", 0) AS trials,
    COALESCE(bt."Count(subscriptions)", 0) AS subscriptions
  FROM complete_grid cg
  LEFT JOIN your_table_name bt ON cg.signup_date = bt.signup_date 
                               AND cg.calendar_date = bt.event_date 
                               AND cg.subscription_tier = bt.subscription_tier
)

SELECT * FROM final_result
ORDER BY signup_date, subscription_tier, calendar_date;

The final output (Limited to the first 25 records here):

signup_date calendar_date subscription_tier signups trials subscriptions
2025-01-15 2025-01-15 basic 0 0 0
2025-01-15 2025-01-16 basic 0 0 0
2025-01-15 2025-01-17 basic 0 0 0
2025-01-15 2025-01-18 basic 0 0 0
2025-01-15 2025-01-19 basic 0 0 0
2025-01-15 2025-01-20 basic 0 0 0
2025-01-15 2025-01-21 basic 0 0 0
2025-01-15 2025-01-22 basic 0 12 8
2025-01-15 2025-01-23 basic 0 0 0
2025-01-15 2025-01-24 basic 0 0 0

Adding Date Spines to Your Analysis Toolkit

Date spines solve one of the most common challenges in data analysis: ensuring time series completion for any metric that changes over time. Once mastered, you will find an increase in utilization beyond cohort analysis for things like user activity analysis, marketing attribution, and a/b testing. It's a SQL pattern that serves as a template that can be adapted to various data structures, saving you time and energy on future projects.

📬 Subscribe to My Newsletter

Get the latest articles and exclusive content straight to your inbox.