Following on from my previous post, which can be found here, where I presented one solution for implementing process add functionality, I’ll try to cover off some functional code in order to implement this. Clearly this code can be enhanced and/or converted to your specific needs but hopefully this is a start.
First let’s setup a dummy set of tables to represent a customer dimension. (Apologies for the formatting, will fix once I can edit my templates again.)
Dummy Tables
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET NOCOUNT ON;
GO
/**
* Delete objects.
*/
IF EXISTS ( SELECT *
FROM sys.objects
WHERE [object_id] = OBJECT_ID(N'[dbo].[customer]') AND
[type] in (N'U') )
BEGIN
DROP TABLE [dbo].[customer];
END;
IF EXISTS ( SELECT *
FROM sys.objects
WHERE [object_id] = OBJECT_ID(N'[dbo].[address]') AND
[type] in (N'U') )
BEGIN
DROP TABLE [dbo].[address];
END;
IF EXISTS ( SELECT *
FROM sys.objects
WHERE [object_id] = OBJECT_ID(N'[dbo].[customer_group]') AND
[type] in (N'U') )
BEGIN
DROP TABLE [dbo].[customer_group];
END;
IF EXISTS ( SELECT *
FROM sys.objects
WHERE [object_id] = OBJECT_ID(N'[dbo].[customer_class]') AND
[type] in (N'U') )
BEGIN
DROP TABLE [dbo].[customer_class];
END;
GO
/**
* CUSTOMER CLASS table.
*/
CREATE TABLE [dbo].[customer_class]
(
[customer_class_id] INT IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
[customer_class_name] VARCHAR(50) NOT NULL
) ON [PRIMARY];
GO
SET IDENTITY_INSERT [dbo].[customer_class] ON;
INSERT INTO [dbo].[customer_class] ([customer_class_id], [customer_class_name])
VALUES(1, 'Class1');
SET IDENTITY_INSERT [dbo].[customer_class] OFF;
GO
/**
* CUSTOMER GROUP table.
*/
CREATE TABLE [dbo].[customer_group]
(
[customer_group_id] INT IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
[customer_group_name] VARCHAR(50) NOT NULL,
[customer_class_id] INT NOT NULL
CONSTRAINT [fk_customer_group_customer_class]
REFERENCES [dbo].[customer_class] ([customer_class_id])
) ON [PRIMARY];
GO
SET IDENTITY_INSERT [dbo].[customer_group] ON;
INSERT INTO [dbo].[customer_group] ( [customer_group_id], [customer_group_name],
[customer_class_id] )
VALUES(1, 'Group1', 1);
INSERT INTO [dbo].[customer_group] ( [customer_group_id], [customer_group_name],
[customer_class_id] )
VALUES(2, 'Group2', 1);
SET IDENTITY_INSERT [dbo].[customer_group] OFF;
GO
/**
* ADDRESS table.
*/
CREATE TABLE [dbo].[address]
(
[address_id] INT IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
[address_line_1] VARCHAR(50) NOT NULL,
[address_line_2] VARCHAR(50) NULL
) ON [PRIMARY];
GO
SET IDENTITY_INSERT [dbo].[address] ON;
INSERT INTO [dbo].[address] ([address_id], [address_line_1], [address_line_2])
VALUES(-1, 'Unknown', NULL);
INSERT INTO [dbo].[address] ([address_id], [address_line_1], [address_line_2])
VALUES(1, 'House 1', 'Street 1');
INSERT INTO [dbo].[address] ([address_id], [address_line_1], [address_line_2])
VALUES(2, 'House 2', NULL);
SET IDENTITY_INSERT [dbo].[address] OFF;
GO
/**
* CUSTOMER table.
*/
CREATE TABLE [dbo].[customer]
(
[customer_id] INT IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED,
[customer_name] VARCHAR(50) NOT NULL,
[address_id] INT NOT NULL
CONSTRAINT [fk_customer_address]
REFERENCES [dbo].[address]([address_id]),
[customer_group_id] INT NOT NULL
CONSTRAINT [fk_customer_customer_group]
REFERENCES [dbo].[customer_group]([customer_group_id])
) ON [PRIMARY];
GO
SET IDENTITY_INSERT [dbo].[customer] ON;
INSERT INTO [dbo].[customer] ([customer_id], [customer_name], [address_id], [customer_group_id])
VALUES(1, 'Customer 1', 1, 1);
INSERT INTO [dbo].[customer] ([customer_id], [customer_name], [address_id], [customer_group_id])
VALUES(2, 'Customer 2', -1, 1);
INSERT INTO [dbo].[customer] ([customer_id], [customer_name], [address_id], [customer_group_id])
VALUES(3, 'Customer 3', 2, 2);
SET IDENTITY_INSERT [dbo].[customer] OFF;
GO
Now, taking each of the steps in turn…
The Seven-And-A-Half Steps to Process Add
Step 1
First we need to capture the list of dimensions we are interested in and the associated key table for that dimension. Note that by “key table”, I’m referring to the table that contains the attribute to be used as the key in the dimension. The following code will create a table that captures this information.
/**
* LIST OF DIMENSIONS
*/
IF EXISTS ( SELECT *
FROM sys.objects
WHERE [object_id] = OBJECT_ID(N'[dbo].[dimension]') AND
[type] in (N'U') )
BEGIN
DROP TABLE [dbo].[dimension];
END;
GO
CREATE TABLE [dbo].[dimension](
[dimension_id] INT IDENTITY(1,1) NOT NULL
CONSTRAINT [pk_dimension] PRIMARY KEY CLUSTERED,
[olap_dimension_id] SYSNAME NOT NULL
CONSTRAINT [uk_dimension] UNIQUE NONCLUSTERED ,
[dimension_name] SYSNAME NOT NULL
) ON [PRIMARY];
GO
Step 1a
For performance reasons we will create a table in which we will capture the names of all the tables used by a given. (i.e. The key table and recursively all related tables.)
/**
* DIMENSION TABLE RELATIONSHIP
*/
IF EXISTS ( SELECT *
FROM sys.objects
WHERE [object_id] = OBJECT_ID(N'[dbo].[dimension_table_relationship]') AND
[type] in (N'U') )
BEGIN
DROP TABLE [dbo].[dimension_table_relationship];
END;
GO
CREATE TABLE [dbo].[dimension_table_relationship](
[dimension_table_relationship_id] INT IDENTITY(1,1) NOT NULL
CONSTRAINT [pk_dimension_table_relationship] PRIMARY KEY CLUSTERED,
[dimension_id] INT NOT NULL
CONSTRAINT [fk_dimension_table_relationship_dimension_dimension]
REFERENCES [dbo].[dimension]([dimension_id]),
[olap_dimension_id] SYSNAME NOT NULL
CONSTRAINT [fk_dimension_table_relationship_dimension_olap]
REFERENCES [dbo].[dimension]([olap_dimension_id]),
[dimension_key_table_schema] SYSNAME NOT NULL,
[dimension_key_table_name] SYSNAME NOT NULL,
[attribute_table_schema] SYSNAME NOT NULL,
[attribute_table_name] SYSNAME NOT NULL,
[first_referencing_table_schema] SYSNAME NOT NULL,
[first_referencing_table_name] SYSNAME NOT NULL
) ON [PRIMARY];
GO
Step 2
Next we need to create a table in which we will capture the names of all the tables used by a given dimension on rows. (i.e. The key table and all related tables.) as well as the previous and current identity values for each table.
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[dimension_process_attribute_key_range]') AND type in (N'U'))
BEGIN
DROP TABLE [dbo].[dimension_process_attribute_key_range];
END;
GO
CREATE TABLE [dbo].[dimension_process_attribute_key_range](
[dimension_process_attribute_key_range_key] INT IDENTITY(1,1) NOT NULL
CONSTRAINT [pk_dimension_process_attribute_key_range] PRIMARY KEY CLUSTERED ,
[olap_dimension_id] SYSNAME NOT NULL,
[dimension_key_table_schema] SYSNAME NOT NULL,
[dimension_key_table_name] SYSNAME NOT NULL,
[attribute_table_schema] SYSNAME NOT NULL,
[attribute_table_name] SYSNAME NOT NULL,
[min_key_value] BIGINT NOT NULL,
[max_key_value] BIGINT NULL,
[new_data_flag] BIT NOT NULL,
[process_flag] BIT NOT NULL,
[created_date] DATETIME NOT NULL,
[created_by] NVARCHAR(255) NOT NULL,
[updated_date] DATETIME NULL,
[updated_by] NVARCHAR(255) NULL
) ON [PRIMARY];
GO
Step 3
Populate the tables created in point 1 and 1a above with the dimensions for which you wish to use process add and where the underlying relational schema has been configured as described in the previous posts.
SET IDENTITY_INSERT [dbo].[dimension] ON;
INSERT INTO [dbo].[dimension] ([dimension_id], [olap_dimension_id], [dimension_name] )
VALUES(1, 'Customer', 'Customer');
SET IDENTITY_INSERT [dbo].[dimension] OFF;
GO
INSERT INTO [dbo].[dimension_table_relationship]
( [dimension_id], [olap_dimension_id],
[dimension_key_table_schema], [dimension_key_table_name],
[attribute_table_schema], [attribute_table_name],
[first_referencing_table_schema], [first_referencing_table_name] )
VALUES(1, 'Customer', 'dbo', 'customer', 'dbo', 'customer', 'dbo', 'customer');
GO
Step 4
Create a stored procedure that accepts a given dimension name and key table name and, using the information schema views etc, identify all related tables by means of foreign key relationships. If you did not create foreign keys between all tables, you will need statically capture all tables used by a given dimension and maintain this over time as tables are added or dropped.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_reseed_dimension_table_relationship_by_dimension]') AND type in (N'P', N'PC'))
BEGIN
DROP PROCEDURE [dbo].[usp_reseed_dimension_table_relationship_by_dimension];
END;
GO
CREATE PROCEDURE [dbo].[usp_reseed_dimension_table_relationship_by_dimension]
@i_DimensionID INT = NULL,
@i_DimensionKeyTableSchema SYSNAME = NULL,
@i_DimensionKeyTableName SYSNAME = NULL
AS
/**
* Recursively captures all tables related to the table passed into the
* stored procedure. Essentially it recursively captures the attribute
* relationships from a given "dimension key" table.
*
* Input Parameters:
* ----------------
* @i_DimensionID The relational identifier for the dimension for
* which attribute relationships are required.
* @i_DimensionKeyTableSchema The schema in which the dimension key table
* resides.
* @i_DimensionKeyTableName The name of the table from which all related
* tables should be derived.
*
* Ver Author Date Comments
* --- ------------------- ----------- ------------------------------
* 1 Philip Stephenson 11-Mar-2011 Original version.
*/
SET NOCOUNT ON;
DECLARE -- Define Error Codes.
@ERR_MISSING_PARAM INT,
@ERR_GENERAL INT,
-- Define user variables.
@ErrCode INT,
@ErrMsg NVARCHAR(MAX),
@StepNbr SMALLINT,
@FQProcedureName NVARCHAR(392),
@RetVal INT,
@NewLine NVARCHAR(MAX),
@OlapDimensionID SYSNAME,
@RowsAffected BIGINT;
BEGIN
/**
* STEP 1
* Set variables.
*/
SET @StepNbr = @StepNbr + 1;
SET @ERR_MISSING_PARAM = 1;
SET @ERR_GENERAL = 128;
SET @ErrCode = 0;
SET @FQProcedureName = QUOTENAME(DB_NAME()) + '.' + QUOTENAME(OBJECT_SCHEMA_NAME(@@PROCID)) + '.' + QUOTENAME(OBJECT_NAME(@@PROCID));
SET @ErrMsg = N'ERROR - Stored procedure ' + @FQProcedureName + N' ';
SET @NewLine = NCHAR(13) + NCHAR(10);
-- Capture the OLAP dimension ID for the dimension ID provided.
SELECT @OlapDimensionID = [olap_dimension_id]
FROM [dbo].[dimension]
WHERE [dimension_id] = @i_DimensionID;
-- Set the default rows affected count.
SET @RowsAffected = -1;
/**
* STEP 2
* Ensure that the dimension ID parameter is not NULL and that a non-null olap
* dimension ID was found.
*/
SET @StepNbr = @StepNbr + 1;
IF (@i_DimensionID IS NULL) OR (@OlapDimensionID IS NULL)
BEGIN
SET @ErrMsg = @ErrMsg + N'identified that either the given @i_DimensionID parameter (i.e. ' + ISNULL(@i_DimensionID, '{null}') + N') was NULL or the associated [olap_dimension_id] was NULL. This is not allowed!';
SET @ErrCode = @ERR_MISSING_PARAM;
GOTO FAIL;
END;
/**
* STEP 3
* Ensure that the dimension key table schema parameter is not NULL.
*/
SET @StepNbr = @StepNbr + 1;
IF (@i_DimensionKeyTableSchema IS NULL)
BEGIN
SET @ErrMsg = @ErrMsg + N'identified that the @i_DimensionKeyTableSchema parameter was NULL. This is not allowed!';
SET @ErrCode = @ERR_MISSING_PARAM;
GOTO FAIL;
END;
/**
* STEP 4
* Ensure that the dimension key table name parameter is not NULL.
*/
SET @StepNbr = @StepNbr + 1;
IF (@i_DimensionKeyTableName IS NULL)
BEGIN
SET @ErrMsg = @ErrMsg + N'identified that the @i_DimensionKeyTableName parameter was NULL. This is not allowed!';
SET @ErrCode = @ERR_MISSING_PARAM;
GOTO FAIL;
END;
BEGIN TRY
/**
* STEP 5
* Create temporary table, else truncate it if it already exists.
*/
SET @StepNbr = @StepNbr + 1;
IF (OBJECT_ID('[tempdb].[dbo].[#foreign_key_tree]') IS NULL)
BEGIN
CREATE TABLE [#foreign_key_tree]
(
[base_table_schema] SYSNAME COLLATE DATABASE_DEFAULT NOT NULL,
[base_table_name] SYSNAME COLLATE DATABASE_DEFAULT NOT NULL,
[referenced_table_schema] SYSNAME COLLATE DATABASE_DEFAULT NOT NULL,
[referenced_table_name] SYSNAME COLLATE DATABASE_DEFAULT NOT NULL,
[referencing_table_schema] SYSNAME COLLATE DATABASE_DEFAULT NOT NULL,
[referencing_table_name] SYSNAME COLLATE DATABASE_DEFAULT NOT NULL
)
END;
ELSE
BEGIN
TRUNCATE TABLE [#foreign_key_tree];
END;
/**
* STEP 6
* Capture the starting point from which to recursively derive all related
* tables.
*/
SET @StepNbr = @StepNbr + 1;
INSERT INTO [#foreign_key_tree] ( [base_table_schema], [base_table_name],
[referenced_table_schema],
[referenced_table_name],
[referencing_table_schema],
[referencing_table_name] )
VALUES(@i_DimensionKeyTableSchema, @i_DimensionKeyTableName,
@i_DimensionKeyTableSchema, @i_DimensionKeyTableName,
@i_DimensionKeyTableSchema, @i_DimensionKeyTableName);
/**
* STEP 7
* Recursively capture all tables related to the set of tables already in
* the temporary table. The idea here is to obtain the set of all tables
* that could be used by a dimension in Analysis Services.
*/
SET @StepNbr = @StepNbr + 1;
WHILE (@RowsAffected <> 0)
BEGIN
INSERT INTO [#foreign_key_tree] ( [base_table_schema], [base_table_name],
[referenced_table_schema],
[referenced_table_name],
[referencing_table_schema],
[referencing_table_name] )
SELECT inline.[base_table_schema] AS "base_table_schema",
inline.[base_table_name] AS "base_table_name",
inline.[TABLE_SCHEMA] AS "referenced_table_schema",
inline.[TABLE_NAME] AS "referenced_table_name",
inline.[referenced_table_schema] AS "referencing_table_schema",
inline.[referenced_table_name] AS "referencing_table_name"
FROM
(
SELECT DISTINCT tmp.[base_table_schema],
tmp.[base_table_name],
reftab.[TABLE_SCHEMA],
reftab.[TABLE_NAME],
tmp.[referenced_table_schema],
tmp.[referenced_table_name]
FROM INFORMATION_SCHEMA.TABLES tab
INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS con
ON tab.[TABLE_SCHEMA] = con.[TABLE_SCHEMA] AND
tab.[TABLE_NAME] = con.[TABLE_NAME]
INNER JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE fctcol
ON tab.[TABLE_SCHEMA] = fctcol.[TABLE_SCHEMA] AND
tab.[TABLE_NAME] = fctcol.[TABLE_NAME] AND
con.[CONSTRAINT_NAME] = fctcol.[CONSTRAINT_NAME]
INNER JOIN INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS rfc
ON con.[CONSTRAINT_SCHEMA] = rfc.[CONSTRAINT_SCHEMA] AND
con.[CONSTRAINT_NAME] = rfc.[CONSTRAINT_NAME]
INNER JOIN ( SELECT [CONSTRAINT_SCHEMA],
[CONSTRAINT_NAME],
[TABLE_SCHEMA],
[TABLE_NAME],
[COLUMN_NAME]
FROM INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE
UNION
-- Include unique indexes
SELECT SCHEMA_NAME(obj.[schema_id]) AS "CONSTRAINT_SCHEMA",
idx.[name] AS "CONSTRAINT_NAME",
SCHEMA_NAME(obj.[schema_id]) AS "TABLE_SCHEMA",
obj.[name] AS "TABLE_NAME",
col.[name] AS "COLUMN_NAME"
FROM sys.index_columns icl
INNER JOIN sys.indexes idx
ON icl.[index_id] = idx.[index_id]
INNER JOIN sys.objects obj
ON idx.[object_id] = obj.[object_id] AND
icl.[object_id] = obj.[object_id]
INNER JOIN sys.columns col
ON obj.[object_id] = col.[object_id] AND
icl.[column_id] = col.[column_id]
WHERE idx.[is_unique] = 1 AND
idx.[is_primary_key] = 0 AND
obj.[is_ms_shipped] = 0 ) refcol
ON rfc.[UNIQUE_CONSTRAINT_SCHEMA] = refcol.[CONSTRAINT_SCHEMA] AND
rfc.[UNIQUE_CONSTRAINT_NAME] = refcol.[CONSTRAINT_NAME]
INNER JOIN INFORMATION_SCHEMA.TABLES reftab
ON refcol.[TABLE_SCHEMA] = reftab.[TABLE_SCHEMA] AND
refcol.[TABLE_NAME] = reftab.[TABLE_NAME]
INNER JOIN [#foreign_key_tree] tmp
ON tab.[TABLE_SCHEMA] = tmp.[referenced_table_schema] AND
tab.[TABLE_NAME] = tmp.[referenced_table_name]
WHERE tab.[TABLE_TYPE] = 'BASE TABLE' AND
con.[CONSTRAINT_TYPE] = 'FOREIGN KEY'
) inline LEFT OUTER JOIN [#foreign_key_tree] nxt
ON inline.[TABLE_SCHEMA] = nxt.[referenced_table_schema] AND
inline.[TABLE_NAME] = nxt.[referenced_table_name]
WHERE nxt.[base_table_name] IS NULL;
-- Capture the number of rows inserted.
SET @RowsAffected = @@ROWCOUNT;
END;
/**
* STEP 9
* Update current information in the static table.
*/
SET @StepNbr = @StepNbr + 1;
UPDATE tgt
SET [first_referencing_table_schema] = src.[referencing_table_schema],
[first_referencing_table_name] = src.[referencing_table_name]
FROM [dbo].[dimension_table_relationship] tgt
INNER JOIN [#foreign_key_tree] src
ON tgt.[dimension_id] = @i_DimensionID AND
tgt.[olap_dimension_id] = @OlapDimensionID AND
tgt.[dimension_key_table_schema] = src.[base_table_schema] AND
tgt.[dimension_key_table_name] = src.[base_table_name] AND
tgt.[attribute_table_schema] = src.[referenced_table_schema] AND
tgt.[attribute_table_name] = src.[referenced_table_name]
WHERE tgt.[first_referencing_table_schema] <> src.[referencing_table_schema] OR
tgt.[first_referencing_table_name] <> src.[referencing_table_name];
/**
* STEP 9
* Insert new information into the static table.
*/
SET @StepNbr = @StepNbr + 1;
INSERT INTO [dbo].[dimension_table_relationship]
( [dimension_id], [olap_dimension_id],
[dimension_key_table_schema], [dimension_key_table_name],
[attribute_table_schema], [attribute_table_name],
[first_referencing_table_schema], [first_referencing_table_name] )
SELECT @i_DimensionID AS "dimension_id",
@OlapDimensionID AS "olap_dimension_id",
[base_table_schema],
[base_table_name],
[referenced_table_schema],
[referenced_table_name],
[referencing_table_schema],
[referencing_table_name]
FROM [#foreign_key_tree] src
LEFT OUTER JOIN [dbo].[dimension_table_relationship] tgt
ON tgt.[dimension_id] = @i_DimensionID AND
tgt.[olap_dimension_id] = @OlapDimensionID AND
tgt.[dimension_key_table_schema] = src.[base_table_schema] AND
tgt.[dimension_key_table_name] = src.[base_table_name] AND
tgt.[attribute_table_schema] = src.[referenced_table_schema] AND
tgt.[attribute_table_name] = src.[referenced_table_name]
WHERE tgt.[dimension_table_relationship_id] IS NULL;
/**
* STEP 11
* Clean up.
*/
SET @StepNbr = @StepNbr + 1;
IF (OBJECT_ID('[tempdb].[dbo].[#foreign_key_tree]') IS NOT NULL)
BEGIN
DROP TABLE [#foreign_key_tree];
END;
END TRY
BEGIN CATCH
SET @ErrCode = @ERR_GENERAL;
SET @ErrMsg = @ErrMsg + N'raised error (' + CAST(ERROR_NUMBER() AS NVARCHAR) + N') in step number (' + CAST(@StepNbr AS NVARCHAR) + N') at line number (' + CAST(ERROR_LINE() AS NVARCHAR) + N') with message: ' + ERROR_MESSAGE();
GOTO FAIL;
END CATCH;
GOTO SUCCESS;
FAIL:
-- Cater for max exception string length.
SET @ErrMsg = CASE WHEN (LEN(@ErrMsg) > 2044) THEN N'...' + RIGHT(@ErrMsg, 2041) ELSE RIGHT(@ErrMsg, 2044) END;
RAISERROR(@ErrMsg, 16, 1);
IF (OBJECT_ID('[tempdb].[dbo].[#foreign_key_tree]') IS NOT NULL)
BEGIN
DROP TABLE [#foreign_key_tree];
END;
RETURN @ErrCode;
SUCCESS:
RETURN 0;
END;
GO
Step 5
Create a stored procedure that uses the information captured by the SP created in step 4 above, to now populate the table created in point 2 above with an arbitrary minimum value and, using the IDENT_CURRENT(‘’) function, populate the max value with the current identity value for the table. Note that you must ensure that you retrieve the values for the key table before any of the related tables. This is what provides protection against the race condition discussed in my previous post.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_derive_new_attribute_keys_for_processing]') AND type in (N'P', N'PC'))
BEGIN
DROP PROCEDURE [dbo].[usp_derive_new_attribute_keys_for_processing];
END;
GO
CREATE PROCEDURE [dbo].[usp_derive_new_attribute_keys_for_processing]
AS
/**
* For each dimension configured to be processed using "process add", capture the
* set of new attribute keys for every attribute in the dimension.
*
* Ver Author Date Comments
* --- ------------------- ----------- ------------------------------
* 1 Philip Stephenson 01-Jul-2011 Original version.
*/
SET NOCOUNT ON;
DECLARE -- Define Error Codes.
@ERR_GENERAL INT,
@ERR_LOOKUP_FAIL INT,
-- Define user variables.
@ErrCode INT ,--= 0,
@ErrMsg NVARCHAR(MAX),
@StepNbr SMALLINT,-- = 0,
@FQProcedureName NVARCHAR(392),
@RetVal INT,
@AbsoluteMinKeyValue BIGINT,
@AbsoluteMaxKeyValue BIGINT,
@MinKeyValue BIGINT,
@RowsAffected BIGINT,
@ProcessAddFlag BIT;
BEGIN
/**
* STEP 1
* Set variables.
*/
SET @StepNbr = @StepNbr + 1;
SET @ERR_GENERAL = 128
SET @ERR_LOOKUP_FAIL = 512;
SET @FQProcedureName = QUOTENAME(DB_NAME()) + '.' + QUOTENAME(OBJECT_SCHEMA_NAME(@@PROCID)) + '.' + QUOTENAME(OBJECT_NAME(@@PROCID));
SET @ErrMsg = N'ERROR - Stored procedure ' + @FQProcedureName + N' ';
SET @AbsoluteMinKeyValue = -32768;
SET @AbsoluteMaxKeyValue = ((POWER(CAST(2 AS BIGINT), 62) - 1) * 2) + 1;
SET @MinKeyValue = @AbsoluteMinKeyValue;
SET @RowsAffected = -1;
BEGIN TRY
/**
* STEP 2
* Create temporary table, else truncate it if it already exists.
*/
SET @StepNbr = @StepNbr + 1;
IF (OBJECT_ID('[tempdb].[dbo].[#foreign_key_tree]') IS NULL)
BEGIN
CREATE TABLE [#foreign_key_tree]
(
[olap_dimension_id] SYSNAME COLLATE DATABASE_DEFAULT NOT NULL,
[base_table_schema] SYSNAME COLLATE DATABASE_DEFAULT NOT NULL,
[base_table_name] SYSNAME COLLATE DATABASE_DEFAULT NOT NULL,
[referenced_table_schema] SYSNAME COLLATE DATABASE_DEFAULT NULL,
[referenced_table_name] SYSNAME COLLATE DATABASE_DEFAULT NULL,
[new_min_key_value] BIGINT NULL,
[min_key_value] BIGINT NULL,
[max_key_value] BIGINT NULL,
[process_add_flag] BIT NOT NULL
)
END
ELSE
BEGIN
TRUNCATE TABLE [#foreign_key_tree];
END;
/**
* STEP 3
* Capture all tables (and their current identity column value) for all
* dimensions set to be processed. The idea here is to obtain the set of
* all tables that could be used by a dimension in Analysis Services and
* to capture the available keys.
*/
SET @StepNbr = @StepNbr + 1;
INSERT INTO [#foreign_key_tree] ( [olap_dimension_id], [base_table_schema],
[base_table_name], [referenced_table_schema],
[referenced_table_name], [new_min_key_value],
[min_key_value], [max_key_value],
[process_add_flag] )
SELECT rel.[olap_dimension_id],
rel.[dimension_key_table_schema] AS "base_table_schema",
rel.[dimension_key_table_name] AS "base_table_name",
rel.[attribute_table_schema] AS "referenced_table_schema",
rel.[attribute_table_name] AS "referenced_table_name",
atr.[max_key_value] + 1 AS "new_min_key_value",
@AbsoluteMinKeyValue AS "min_key_value",
IDENT_CURRENT(QUOTENAME(rel.[attribute_table_schema]) + N'.' + QUOTENAME(rel.[attribute_table_name])) AS "max_key_value",
1 AS "process_add_flag"
FROM [dbo].[dimension_table_relationship] rel
LEFT OUTER JOIN [dbo].[dimension_process_attribute_key_range] atr
ON rel.[olap_dimension_id] = atr.[olap_dimension_id] AND
rel.[dimension_key_table_schema] = atr.[dimension_key_table_schema] AND
rel.[dimension_key_table_name] = atr.[dimension_key_table_name] AND
rel.[attribute_table_schema] = atr.[attribute_table_schema] AND
rel.[attribute_table_name] = atr.[attribute_table_name] AND
atr.[process_flag] = 1;
/**
* STEP 4
* Invert current (process flag = 1) and non-current (process flag = 0) rows,
* updating related attributes appropriately.
*/
SET @StepNbr = @StepNbr + 1;
UPDATE tgt
SET [process_flag] = CASE WHEN tgt.[process_flag] = 1 THEN
0
ELSE
1
END,
[min_key_value] = CASE
-- Update soon to be non-current record for ProcessAdd.
WHEN (tmp.[process_add_flag] = 1) AND (tgt.[process_flag] = 1) THEN tmp.[max_key_value] + 1
-- Update soon to be current record for ProcessAdd when flipped from "non process add".
WHEN (tmp.[process_add_flag] = 1) AND (tgt.[process_flag] = 0) AND (tgt.[min_key_value] = @AbsoluteMinKeyValue) THEN non.[max_key_value] + 1
-- Update soon to be current record for ProcessUpdate/ProcessFull
WHEN (tmp.[process_add_flag] = 0) AND (tgt.[process_flag] = 0) THEN @AbsoluteMinKeyValue
ELSE
tgt.[min_key_value]
END,
[max_key_value] = tmp.[max_key_value],
[new_data_flag] = CASE
WHEN ( (tmp.[process_add_flag] = 1) AND (tgt.[process_flag] = 1) AND ((tmp.[max_key_value] + 1) <= ISNULL(tmp.[max_key_value], @AbsoluteMaxKeyValue)) ) THEN 1
WHEN ( (tmp.[process_add_flag] = 0) AND (tgt.[process_flag] = 0) AND (@AbsoluteMinKeyValue <= ISNULL(tmp.[max_key_value], @AbsoluteMaxKeyValue)) ) THEN 1
WHEN ( (tmp.[process_add_flag] = 1) AND (tgt.[process_flag] = 0) AND (tgt.[min_key_value] = @AbsoluteMinKeyValue) AND ((non.[max_key_value] + 1) <= ISNULL(tmp.[max_key_value], @AbsoluteMaxKeyValue)) ) THEN 1
WHEN ( (tmp.[process_add_flag] = 1) AND (tgt.[process_flag] = 0) AND (tgt.[min_key_value] <> @AbsoluteMinKeyValue) AND (tgt.[min_key_value] <= ISNULL(tmp.[max_key_value], @AbsoluteMaxKeyValue)) ) THEN 1
WHEN ( (tmp.[process_add_flag] = 0) AND (tgt.[process_flag] = 1) AND (tgt.[min_key_value] <= ISNULL(tmp.[max_key_value], @AbsoluteMaxKeyValue)) ) THEN 1
ELSE
0
END,
[updated_date] = GETDATE(),
[updated_by] = ORIGINAL_LOGIN()
FROM [dbo].[dimension_process_attribute_key_range] tgt
LEFT OUTER JOIN [dbo].[dimension_process_attribute_key_range] non
ON tgt.[olap_dimension_id] = non.[olap_dimension_id] AND
tgt.[dimension_key_table_schema] = non.[dimension_key_table_schema] AND
tgt.[dimension_key_table_name] = non.[dimension_key_table_name] AND
tgt.[attribute_table_schema] = non.[attribute_table_schema] AND
tgt.[attribute_table_name] = non.[attribute_table_name] AND
tgt.[process_flag] <> non.[process_flag]
INNER JOIN [#foreign_key_tree] tmp
ON tgt.[dimension_key_table_schema] = tmp.[base_table_schema] AND
tgt.[dimension_key_table_name] = tmp.[base_table_name] AND
tgt.[attribute_table_schema] = tmp.[referenced_table_schema] AND
tgt.[attribute_table_name] = tmp.[referenced_table_name]
WHERE tgt.[olap_dimension_id] = tmp.[olap_dimension_id] AND
( (non.[process_flag] IS NULL) OR ((non.[process_flag] IS NOT NULL) AND
(tgt.[process_flag] <> non.[process_flag])) );
/**
* STEP 5
* Capture information for use by OLAP views during dimension processing.
*/
SET @StepNbr = @StepNbr + 1;
INSERT INTO [dbo].[dimension_process_attribute_key_range]
( [olap_dimension_id], [dimension_key_table_schema],
[dimension_key_table_name], [attribute_table_schema],
[attribute_table_name], [min_key_value],
[max_key_value], [new_data_flag], [process_flag],
[created_date], [created_by] )
SELECT tmp.[olap_dimension_id] AS "olap_dimension_id",
tmp.[base_table_schema],
tmp.[base_table_name],
tmp.[referenced_table_schema],
tmp.[referenced_table_name],
-- In the case of ProcessAdd, set the Min Key value appropriately else it must be the AbsoluteMinKeyValue.
CASE WHEN (tmp.[process_add_flag] = 1) THEN
-- Note that the prev min key value was bumped by the update statement in the prior step so we can just carry this forward.
COALESCE(tmp.[new_min_key_value], @AbsoluteMinKeyValue)
ELSE
@AbsoluteMinKeyValue
END AS "min_key_value",
tmp.[max_key_value],
CASE WHEN ISNULL(tmp.[max_key_value], @AbsoluteMaxKeyValue) >= COALESCE(tmp.[new_min_key_value], @AbsoluteMinKeyValue) THEN
1
ELSE
0
END AS "new_data_flag" ,
1 AS "process_flag",
GETDATE() AS "created_date",
ORIGINAL_LOGIN() AS "created_by"
FROM [#foreign_key_tree] tmp
LEFT OUTER JOIN [dbo].[dimension_process_attribute_key_range] tgt
ON tmp.[olap_dimension_id] = tgt.[olap_dimension_id] AND
tmp.[base_table_schema] = tgt.[dimension_key_table_schema] AND
tmp.[base_table_name] = tgt.[dimension_key_table_name] AND
tmp.[referenced_table_schema] = tgt.[attribute_table_schema] AND
tmp.[referenced_table_name] = tgt.[attribute_table_name] AND
tgt.[process_flag] = 1
WHERE tgt.[dimension_process_attribute_key_range_key] IS NULL;
END TRY
BEGIN CATCH
SET @ErrCode = @ERR_GENERAL;
SET @ErrMsg = @ErrMsg + N'raised error (' + CAST(ERROR_NUMBER() AS NVARCHAR) + N') in step number (' + CAST(@StepNbr AS NVARCHAR) + N') at line number (' + CAST(ERROR_LINE() AS NVARCHAR) + N') with message: ' + ERROR_MESSAGE();
GOTO FAIL;
END CATCH;
GOTO SUCCESS;
FAIL:
-- Cater for max exception string length.
SET @ErrMsg = CASE WHEN (LEN(@ErrMsg) > 2044) THEN N'...' + RIGHT(@ErrMsg, 2041) ELSE RIGHT(@ErrMsg, 2044) END;
RAISERROR(@ErrMsg, 16, 1);
RETURN @ErrCode;
SUCCESS:
RETURN 0;
END;
GO
Step 6
Create a user defined table function that accepts a dimension name, schema name and table name and returns the minimum and maximum values from the table created in point 2 above for a given dimension, schema and table.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF EXISTS ( SELECT *
FROM sys.objects
WHERE [object_id] = OBJECT_ID(N'[dbo].[udf_attribute_keys_for_process_add]') AND
[type] IN (N'FN', N'IF', N'TF', N'FS', N'FT') )
BEGIN
DROP FUNCTION [dbo].[udf_attribute_keys_for_process_add];
END;
GO
CREATE FUNCTION [dbo].[udf_attribute_keys_for_process_add] (
@i_OlapDimensionID SYSNAME,
@i_DimensionTableSchema SYSNAME,
@i_DimensionTableName SYSNAME,
@i_AttributeTableSchema SYSNAME,
@i_AttributeTableName SYSNAME )
RETURNS TABLE
AS
/*
* Returns the minimum and maximum keys for a given dimension's attribute
* table that need to be processed by Analysis Services.
*
*
* Input Parameters:
* ----------------
* @i_OlapDimensionID The Analysis Services identifier for the dimension
* for which attribute key ranges are required.
* @i_DimensionTableSchema The schema for the table that contains the key
* attribute for the dimension.
* @i_DimensionTableName The name of the table that contains the key
* attribute for the dimension.
* @i_AttributeTableSchema The schema for the table for the attribute to be
* processed.
* @i_AttributeTableName The name of the table for the attribute to be
* processed.
*
* Ver Author Date Comments
* --- ------------------- ----------- ---------------------------------
* 1 Philip Stephenson 01-Jul-2011 Original version.
*/
-- Return results.
RETURN SELECT [min_key_value], [max_key_value]
FROM [dbo].[dimension_process_attribute_key_range]
WHERE [olap_dimension_id] = @i_OlapDimensionID AND
[dimension_key_table_schema] = @i_DimensionTableSchema AND
[dimension_key_table_name] = @i_DimensionTableName AND
[attribute_table_schema] = @i_AttributeTableSchema AND
[attribute_table_name] = @i_AttributeTableName AND
[process_flag] = 1 AND
[new_data_flag] = 1;
GO
Step 7
Update the current views used to present data to Analysis Services to join to the function created in point 6 above, limiting the output to be where the surrogate key value is between the minimum and maximum values.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF EXISTS (SELECT * FROM sys.views WHERE [object_id] = OBJECT_ID(N'[dbo].[v_process_add_customer]'))
BEGIN
DROP VIEW [dbo].[v_process_add_customer];
END;
GO
CREATE VIEW [dbo].[v_process_add_customer]
AS
SELECT [customer_id],
[customer_name],
[address_id],
[customer_group_id]
FROM [dbo].[customer] att
INNER JOIN [dbo].[udf_attribute_keys_for_process_add]('Customer',
'dbo',
'customer',
'dbo',
'customer') prc
ON att.[customer_id] BETWEEN prc.[min_key_value] AND prc.[max_key_value];
GO
Test Drive
Now that we have everything setup, let’s give it a whirl. First of all we need to identify all of the relationships for a given dimension, customer in this case, starting from the key table.
EXECUTE [dbo].[usp_reseed_dimension_table_relationship_by_dimension]
@i_DimensionID = 1,
@i_DimensionKeyTableSchema = 'dbo',
@i_DimensionKeyTableName = 'customer';
We can see what this did be executing the following:
SELECT * FROM [dbo].[dimension_table_relationship];
Next we need to identify the minimum and maximum surrogate key values in each of the tables, which we do as follows:
EXECUTE [dbo].[usp_derive_new_attribute_keys_for_processing];
We can see what this did by executing the following:
SELECT * FROM [dbo].[dimension_process_attribute_key_range];
All we need to do now is select from the view to see all new entries:
SELECT *
FROM [dbo].[v_process_add_customer];
To see how this works, we can select from the function we created earlier as follows:
SELECT *
FROM [dbo].[udf_attribute_keys_for_process_add]('Customer',
'dbo',
'customer',
'dbo',
'customer');
To actually test the concept of “process add”, let’s now add a new member:
INSERT INTO [dbo].[customer] ([customer_name], [address_id], [customer_group_id])
VALUES('Customer 4', 2, 2);
In order to capture the metadata around new key values etc. we need to execute:
EXECUTE [dbo].[usp_derive_new_attribute_keys_for_processing];
Which, once again, has captured the metadata as follows:
SELECT * FROM [dbo].[dimension_process_attribute_key_range];
To see what Analysis Services would see, run the following:
SELECT *
FROM [dbo].[v_process_add_customer];
Conclusion
Now I will admit that I haven’t fully test driven all of this myself to see if it all works in all situations etc. so please let me know if anything is broken and I’ll aim to fix it. Either way I hope the concept is clear, such that you can make any modifications as required.
Hopefully you will have noted that in the stored procedure that determines the min and max key values, that it can be easily updated to accommodate ProcessUpdate and ProcessAdd situations by not hard-coding the value for the “process_add_flag” but I’ll leave that as an exercise for the reader.
Nice example! I'm trying to understand it all the way through. Will you please show us the XMLA to make this work?
ReplyDeleteSure, I appreciate that there's a lot to go through and digest there. :-) The XMLA should be no different to normal processing. As you are changing the views backing the DSV such that they present the data as determined by the "snapshot", you only need to set the process option to ProcessAdd if you are using a snapshot or ProcessUpdate if not. Does that make sense?
ReplyDelete