SQL Azure – Creating a numbers table using a Cartesian Product and then expanding dates between two field values for individual records

The following code uses a Common Table Expression to create a numbers table ,which is then used to expand all dates between two date values of an individual record. This structure could be used in a variety of circumstances where you are needing to charge costs to a product over a period of time or if you are needing to record consecutive dated values between two dates.

I am using Whisky Casks as an example here. Imagine a cask came in on a day and needs to be matured for a period and there will be a charge for every day that the cask is being stored to an account. Clearly going forward you might want to set up an overnight process to create the daily charges. But what happens where you wish to retrospectively create charges. Charge records will build up quickly and it will be difficult to manually keep on top of the required record correction.

Firstly I create a table that will hold a single record for each date a cask is in bond between the two tables. There are additional columns here as I need them for my use case (casks in bond).

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[t066tempexpand](
	[pkid] [int] IDENTITY(1,1) NOT NULL,
	[chargedate] [datetime] NULL,
	[accountcode] [nvarchar](16) NULL,
	[pkidt002] [int] NULL,
	[pkidt053dailystoragecharge] [int] NULL,
	[locationwarehouse] [nvarchar](50) NULL,
 CONSTRAINT [PK_0000temp] PRIMARY KEY CLUSTERED 
(
	[pkid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

Next we create a CTE numbers table and use this to explode a single record for each dated day between the two dates in this case bondindate and a disgorge date. I have an extra filter here that filters what casks are being put into the table (here pkid>2000). Other filters could be where disgorgedate is not null. The following line will be used to identify the charge date.
dateadd(day,v.number,d.bondindate)
and
join Numbers v on v.number between 0 and datediff(day, bondindate, disgorgedate)
The first line shows the date that will be charged. The join is particularly elegant as it filters or restricts the records within the numbers table stored as a common table expression to multiple records of an individual cask based on all dates between two field values. The starting date identifies where to explode from and the second identifies the ending date (in this case the disgorge date) identifies where it must end. Another example if you were wanting to bring it up to todays date or the day before today , if you have an overnight process running at midnight getdate()-1.

The cross join in the common table expression definition is used to create a cartesian join on itself, from my investigations there is no particular significance to the sys.columns table used other than this is a system table that holds all columns in all tables in the database and therefore is likely to be somewhat large. It so happens in the database that I am using there are a lot of tables and a lot of columns so a single self cross join creates more than enough numbers than I will ever need. If you wish to create further records you can further cross join the sys.columns table on itself by repeating cross join line with subsequent aliases eg (cross join sys.columns c). After experimentation I discovered for instance if I left out the cross join there were only some 4,000 records in my numbers table. This is suspect reflects the fact that I have 4,000 columns in the tables of my database. In my case 4,000 X 4,000 or the cartesian product of the sys.columns table on itself results in 16 million records more than enough numbers for me. It should be noted that I am restricting the CTE numbers table to the top 20,000 anyway which in our case relates to over 54 years which is more back dating than I think I will ever need. It should be noted the original code I discovered had three cross joins presumably because the number of tables and so columns in the database was much smaller.

;WITH Numbers(number) AS (
  select top(20000) row_number() over (order by (select 0))-1
  from sys.columns a
  cross join sys.columns b
)

INSERT into t066tempexpand
(chargedate,
accountcode,
pkidt002,
pkidt053dailystoragecharge,
locationwarehouse)
select 
dateadd(day,v.number,d.bondindate) 
chargedate,
accountcode, 
pkid,
pkidt053dailystoragecharge,
locationwarehouse
  from t002caskfill d
  join Numbers v on v.number between 0 and datediff(day, bondindate, disgorgedate)
  where pkid > 2000

Here I am joining with an extra table to find out the daily storage charge and this is just a test.

select
a.chargedate,
a.accountcode,
a.pkidt002,
b.dailycharge,
a.pkidt053dailystoragecharge,
a.locationwarehouse,
b.chargecode
from dbo.t0000temp a
join dbo.v128dailycaskstoragelk023 b on a.pkidt053dailystoragecharge=b.pkid 
order by pkidt002,chargedate;

And lastly I insert all this information into a Charge history table.

INSERT INTO dbo.t0032chargehistory
(chargedate,
accountcode,
pkidt002cask,
chargedamount,
pkidt053chargecode,
locationwarehouse,
chargecode)
select
a.chargedate,
a.accountcode,
a.pkidt002,
b.dailycharge,
a.pkidt053dailystoragecharge,
a.locationwarehouse,
b.chargecode
from dbo.t066tempexpand a
join dbo.v128dailycaskstoragelk023 b on a.pkidt053dailystoragecharge=b.pkid 
order by pkidt002,chargedate;