SQL Azure – TSQL User Defined Function – Pull out money from NVARCHAR(MAX)

I think this one is pretty dirty but here goes.

Again it was adapted from StackOverflow thread
How to get part of string that matches with regular expression in SQL Server

Continuing the theme of identifying substrings in NVARCHAR(MAX) field here is a set of Functions that can be used identify and separate out money in an email body which has been uploaded into a NVARCHAR(MAX) field. looks for the pattern of 2 numbers after a dot and certain numerals before a dot and steps down from hundreds of thousands of pounds to tens of pounds. Note the order is important of the final query as each shorter pattern is a subset the next longer pattern.

First important to strip out the commas in the body as this will skip any currencies which have been formatted with commas.

UPDATE T032email
SET body = REPLACE(body,',','');

WARNING It should be noted that this is very unoptimised code and even on small sets combining the functions in a nested ISNULL SQL really takes time to run.

Firstly a function that looks for the pattern 100000.00 ie 999k to 100k.

CREATE FUNCTION [dbo].[ReturnMoneyDecimal6](@fieldtosearch as nvarchar(max))
 Returns varchar(9)
 as
 begin
 declare @length as int
 declare @loop as int
 declare @retemoney6 as decimal(9,2)
 set @loop =1
 set @length = len(@fieldtosearch)
 
 while @loop<=@length-8
    begin
    if exists(select 1 where substring(@fieldtosearch,@loop,9) like '[0-9][0-9][0-9][0-9][0-9][0-9][.][0-9][0-9]')
        set @retemoney6 = substring(@fieldtosearch,@loop,9)
    set @loop=@loop+1
    end
Return @retemoney6
END

Next a function that looks for the pattern 10000.00 ie 99k to 10k.

CREATE function [dbo].[ReturnMoneyDecimal5](@fieldtosearch as nvarchar(max))
 Returns varchar(8)
 as
 begin
 declare @length as int
 declare @loop as int
 declare @retemoney5 as decimal(8,2)
 set @loop =1
 set @length = len(@fieldtosearch)
 
 while @loop<=@length-7
    begin
    if exists(select 1 where substring(@fieldtosearch,@loop,8) like '[0-9][0-9][0-9][0-9][0-9][.][0-9][0-9]')
        set @retemoney5 = substring(@fieldtosearch,@loop,8)
    set @loop=@loop+1
    end
Return @retemoney5
END

Next a function that looks for the pattern 1000.00 ie 9k to 1k.

CREATE function [dbo].[ReturnMoneyDecimal4](@fieldtosearch as nvarchar(max))
 Returns varchar(7)
 as
 begin
 declare @length as int
 declare @loop as int
 declare @retemoney4 as decimal(7,2)
 set @loop =1
 set @length = len(@fieldtosearch)
 
 while @loop<=@length-6
    begin
    if exists(select 1 where substring(@fieldtosearch,@loop,7) like '[0-9][0-9][0-9][0-9][.][0-9][0-9]')
        set @retemoney4 = substring(@fieldtosearch,@loop,7)
    set @loop=@loop+1
    end
Return @retemoney4
END

Next a function that looks for the pattern 100.00 ie 999 to 100.

CREATE function [dbo].[ReturnMoneyDecimal3](@fieldtosearch as nvarchar(max))
 Returns varchar(6)
 as
 begin
 declare @length as int
 declare @loop as int
 declare @retemoney3 as decimal(6,2)
 set @loop =1
 set @length = len(@fieldtosearch)
 
 while @loop<=@length-5
    begin
    if exists(select 1 where substring(@fieldtosearch,@loop,6) like '[0-9][0-9][0-9][.][0-9][0-9]')
        set @retemoney3 = substring(@fieldtosearch,@loop,6)
    set @loop=@loop+1
    end
Return @retemoney3
END

Lastly a function that looks for the pattern 10.00 ie 99 to 10.

CREATE function [dbo].[ReturnMoneyDecimal2](@fieldtosearch as nvarchar(max))
 Returns varchar(5)
 as
 begin
 declare @length as int
 declare @loop as int
 declare @retemoney2 as decimal(5,2)
 set @loop =1
 set @length = len(@fieldtosearch)
 
 while @loop<=@length-4
    begin
    if exists(select 1 where substring(@fieldtosearch,@loop,5) like '[0-9][0-9][.][0-9][0-9]')
        set @retemoney2 = substring(@fieldtosearch,@loop,5)
    set @loop=@loop+1
    end
Return @retemoney2
END

And then here is the quite horribly inefficient query that can be used to run all functions and seems to do the job.

You know your searchstrings better than me but any text with multiple decimal substrings will be an issue!!!

Which is still a few million times quicker than any person!

SELECT dbo.T032email.pkid,
ISNULL(dbo.ReturnMoneyDecimal6(dbo.T032email.body),
	ISNULL(dbo.ReturnMoneyDecimal5(dbo.T032email.body),
		ISNULL(dbo.ReturnMoneyDecimal4(dbo.T032email.body),
			ISNull(dbo.ReturnMoneyDecimal3(dbo.T032email.body), dbo.ReturnMoneyDecimal2(dbo.T032email.body)
				)))) as money4 from T032email;