I think this one is pretty dirty but here goes.
Again it was adapted from StackOverflow thread
How to get part of string that matches with regular expression in SQL Server
Continuing the theme of identifying substrings in NVARCHAR(MAX) field here is a set of Functions that can be used identify and separate out money in an email body which has been uploaded into a NVARCHAR(MAX) field. looks for the pattern of 2 numbers after a dot and certain numerals before a dot and steps down from hundreds of thousands of pounds to tens of pounds. Note the order is important of the final query as each shorter pattern is a subset the next longer pattern.
First important to strip out the commas in the body as this will skip any currencies which have been formatted with commas.
UPDATE T032email SET body = REPLACE(body,',','');
WARNING It should be noted that this is very unoptimised code and even on small sets combining the functions in a nested ISNULL SQL really takes time to run.
Firstly a function that looks for the pattern 100000.00 ie 999k to 100k.
CREATE FUNCTION [dbo].[ReturnMoneyDecimal6](@fieldtosearch as nvarchar(max)) Returns varchar(9) as begin declare @length as int declare @loop as int declare @retemoney6 as decimal(9,2) set @loop =1 set @length = len(@fieldtosearch) while @loop<=@length-8 begin if exists(select 1 where substring(@fieldtosearch,@loop,9) like '[0-9][0-9][0-9][0-9][0-9][0-9][.][0-9][0-9]') set @retemoney6 = substring(@fieldtosearch,@loop,9) set @loop=@loop+1 end Return @retemoney6 END
Next a function that looks for the pattern 10000.00 ie 99k to 10k.
CREATE function [dbo].[ReturnMoneyDecimal5](@fieldtosearch as nvarchar(max)) Returns varchar(8) as begin declare @length as int declare @loop as int declare @retemoney5 as decimal(8,2) set @loop =1 set @length = len(@fieldtosearch) while @loop<=@length-7 begin if exists(select 1 where substring(@fieldtosearch,@loop,8) like '[0-9][0-9][0-9][0-9][0-9][.][0-9][0-9]') set @retemoney5 = substring(@fieldtosearch,@loop,8) set @loop=@loop+1 end Return @retemoney5 END
Next a function that looks for the pattern 1000.00 ie 9k to 1k.
CREATE function [dbo].[ReturnMoneyDecimal4](@fieldtosearch as nvarchar(max)) Returns varchar(7) as begin declare @length as int declare @loop as int declare @retemoney4 as decimal(7,2) set @loop =1 set @length = len(@fieldtosearch) while @loop<=@length-6 begin if exists(select 1 where substring(@fieldtosearch,@loop,7) like '[0-9][0-9][0-9][0-9][.][0-9][0-9]') set @retemoney4 = substring(@fieldtosearch,@loop,7) set @loop=@loop+1 end Return @retemoney4 END
Next a function that looks for the pattern 100.00 ie 999 to 100.
CREATE function [dbo].[ReturnMoneyDecimal3](@fieldtosearch as nvarchar(max)) Returns varchar(6) as begin declare @length as int declare @loop as int declare @retemoney3 as decimal(6,2) set @loop =1 set @length = len(@fieldtosearch) while @loop<=@length-5 begin if exists(select 1 where substring(@fieldtosearch,@loop,6) like '[0-9][0-9][0-9][.][0-9][0-9]') set @retemoney3 = substring(@fieldtosearch,@loop,6) set @loop=@loop+1 end Return @retemoney3 END
Lastly a function that looks for the pattern 10.00 ie 99 to 10.
CREATE function [dbo].[ReturnMoneyDecimal2](@fieldtosearch as nvarchar(max)) Returns varchar(5) as begin declare @length as int declare @loop as int declare @retemoney2 as decimal(5,2) set @loop =1 set @length = len(@fieldtosearch) while @loop<=@length-4 begin if exists(select 1 where substring(@fieldtosearch,@loop,5) like '[0-9][0-9][.][0-9][0-9]') set @retemoney2 = substring(@fieldtosearch,@loop,5) set @loop=@loop+1 end Return @retemoney2 END
And then here is the quite horribly inefficient query that can be used to run all functions and seems to do the job.
You know your searchstrings better than me but any text with multiple decimal substrings will be an issue!!!
Which is still a few million times quicker than any person!
SELECT dbo.T032email.pkid, ISNULL(dbo.ReturnMoneyDecimal6(dbo.T032email.body), ISNULL(dbo.ReturnMoneyDecimal5(dbo.T032email.body), ISNULL(dbo.ReturnMoneyDecimal4(dbo.T032email.body), ISNull(dbo.ReturnMoneyDecimal3(dbo.T032email.body), dbo.ReturnMoneyDecimal2(dbo.T032email.body) )))) as money4 from T032email;