Monday, March 26, 2012

More efficient than LEFT JOIN


I have a table with data that is refreshed regularly but I still need to
store the old data. I have created a seperate table with a foreign key
to the table and the date on which it was replaced. I'm looking for an
efficient way to select only the active data.

Currently I use:

SELECT ...
FROM DataTable AS D
LEFT OUTER JOIN InactiveTable AS I ON I.Key = D.Key
WHERE D.Key IS NULL

However I am not convinced that this is the most efficient, or the most
intuitive method of acheiving this.

Can anyone suggest a more efficient way of getting this information
please.

Many thanks.

*** Sent via Developersdex http://www.developersdex.com ***Hi, Brian

I think that you wanted to write "WHERE I.Key IS NULL" (instead of
"WHERE D.Key IS NULL"). In this case, you can use something like this:

SELECT ...
FROM DataTable WHERE Key NOT IN (
SELECT Key FROM InactiveTable
)

Razvan

PS. I assume that the "Key" column does not allow NULL-s.|||On Wed, 15 Feb 2006 12:41:09 GMT, Brian Wotherspoon wrote:

>
>I have a table with data that is refreshed regularly but I still need to
>store the old data. I have created a seperate table with a foreign key
>to the table and the date on which it was replaced. I'm looking for an
>efficient way to select only the active data.
>Currently I use:
>SELECT ...
>FROM DataTable AS D
>LEFT OUTER JOIN InactiveTable AS I ON I.Key = D.Key
>WHERE D.Key IS NULL
>However I am not convinced that this is the most efficient, or the most
>intuitive method of acheiving this.
>Can anyone suggest a more efficient way of getting this information
>please.
>Many thanks.

Hi Brian,

The most intuitive way, IMO, is

SELECT ...
FROM DataTable AS D
WHERE NOT EXISTS
(SELECT *
FROM InActiveTable AS I
WHERE I.Key = D.Key)

The most efficient is either the above or your own LEFT OUTER JOIN query
(but do change D.Key to I.Key in the IS NOT NULL check!!) - but it'll be
only efficient if the I.Key column is indexed.

--
Hugo Kornelis, SQL Server MVP|||try to bring it up a level.

for instance, you are probably creating a temp table? Perhaps create
the temp table with closer to the data you really need.
if you are looking for only one cusotmer, then only pull that one
customer. or, for a specfiic time period, then only that time period.

also, make sure you have an index on inactivetable.key.

if you knew that inactivetable started at some timeframe for all
records, then you could create a composite index on
inactivedata.timestamp plus key.

would it be worth putting an "inactive data datestamp" at the customer
level? perhaps if you have enough rows.

No comments:

Post a Comment