Searching Across Tables In SQL Server
I recently had to search many tables for a list of values in SQL Server. Here's how I used dynamic SQL to achieve the result I needed.
I was recently tasked with a data cleanup assignment in Microsoft SQL Server. We had a group of records to delete from a master table, but there was related data on over twenty other different tables. Before the records could be deleted from the master table, the related tables all needed to be queried to see if any linked rows needed to be reviewed.
Querying these 20+ tables individually would be a great deal of manual effort, of course. Fortunately, SQL Server gives us a better way, using the INFORMATION_SCHEMA
.
What is INFORMATION_SCHEMA?
SQL Server's INFORMATION_SCHEMA
gives us a way to view metadata about a database. It provides information about tables, columns, views, schemata, etc. Click here to view the documentation on INFORMATION_SCHEMA
from Microsoft.
As an example of how we can utilize INFORMATION_SCHEMA
, let's say we want to get a list of all tables that contain the word "PERSON" in them. You could do so with the following query.
SELECT
TABLE_NAME
FROM
INFORMATION_SCHEMA.TABLES
WHERE
TABLE_NAME LIKE '%PERSON%';
If you have multiple databases on your SQL Server instance, you could also include TABLE_CATALOG
in your where clause to restrict the database(s) you want to look at.
By using INFORMATION_SCHEMA
, I was able to programmatically generate the list of related tables I needed to go query. Note that if the tables you are looking at do not have any kind of standard naming convention, this may not be beneficial in your case, and you'll likely have to generate the list of tables to go check manually. But, if you are querying tables that have a common pattern in their naming convention, this can save you a good deal of time and effort.
Putting it into Practice
Here is the full code, and then I'll go piece by piece through it:
DROP TABLE IF EXISTS #DataToReview
CREATE TABLE #DataToReview
(
src_table varchar(100),
user_id numeric(10,0)
)
DECLARE @table VARCHAR(100);
DECLARE @sqlCommand VARCHAR(2000);
DECLARE src_cursor CURSOR FOR
SELECT
TABLE_NAME
FROM
INFORMATION_SCHEMA.TABLES
WHERE
TABLE_CATALOG = 'CORE'
AND TABLE_NAME LIKE '%USER_CORE%'
OPEN src_cursor
FETCH NEXT FROM src_cursor INTO @table
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sqlCommand = 'INSERT INTO #DataToReview (src_table, user_id)
SELECT DISTINCT
''' + @table + ''',
T.USER_ID
FROM
[CORE].[dbo].[ITEMS_TO_DELETE] T
JOIN [CORE].[dbo].' + @table + ' TBL
ON TBL.USER_ID = T.USER_ID'
EXEC (@sqlCommand)
FETCH NEXT FROM src_cursor INTO @table
END
CLOSE src_cursor
DEALLOCATE src_cursor
SELECT * FROM #DataToReview
GO
Let's look at this, piece by piece:
DROP TABLE IF EXISTS #DataToReview
CREATE TABLE #DataToReview
(
src_table varchar(100),
user_id numeric(10,0)
)
At the outset, I'm creating a temporary table to hold my final results (data from my source set of records to remove from the parent table, that also exists in any of the child tables). I'm dropping the table in case it already exists, and then recreating it fresh. I'm populating it with two columns. In addition to user_id
, which will hold the user's id that matches the data in the source set, I'm also including the src_table
to hold the table name that the matching record appears in.
DECLARE @table VARCHAR(100);
DECLARE @sqlCommand VARCHAR(2000);
DECLARE src_cursor CURSOR FOR
SELECT
TABLE_NAME
FROM
INFORMATION_SCHEMA.TABLES
WHERE
TABLE_CATALOG = 'CORE'
AND TABLE_NAME LIKE '%USER_CORE%'
OPEN src_cursor
Next, I'm declaring two variables and a SQL cursor. The variable @table
will be used the process of iterating through the tables we need to query for matches, and the @sqlCommand
variable will hold dynamically generated SQL queries to check for matching data.
Finally, src_cursor
creates a cursor, which is a way to iterate through the rows of a result set in SQL Server. Using a cursor, we can go through the query results one row at a time (in addition to other advanced options such as moving backwards through the results as well). For more on cursors in SQL Server, click here. In our case, we are running a query against the previously explained INFORMATION_SCHEMA.TABLES
- I'm looking in the CORE
database and trying to locate all tables that contain USER_CORE
in the name. Note that so far, we have not obtained any final results yet, we've only gotten the list of tables we need to go and query.
FETCH NEXT FROM src_cursor INTO @table
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sqlCommand = 'INSERT INTO #DataToReview (src_table, user_id)
SELECT DISTINCT
''' + @table + ''',
T.USER_ID
FROM
[CORE].[dbo].[ITEMS_TO_DELETE] T
JOIN [CORE].[dbo].' + @table + ' TBL
ON TBL.USER_ID = T.USER_ID'
EXEC (@sqlCommand)
FETCH NEXT FROM src_cursor INTO @table
END
CLOSE src_cursor
DEALLOCATE src_cursor
Now we are into the thick of it. The first statement will fetch the next row of the cursor query's result (in this case, the first row), storing it in the @table
variable. Next, it checks the global SQL Server function @@FETCH_STATUS
, which will return the status of the last cursor fetch. I will say that @@FETCH_STATUS
is a bit strange; while you might expect it to return a value of 1
if the last fetch returned a row, 0
is, in fact, the result we are looking for. If there is an error or we've reached the end of the result set, a negative value is returned. Click here for more information on @@FETCH_STATUS
, how it's used, and what the function returns.
In our code, then if we get that 0
value back for @@FETCH_STATUS
, the code between the BEGIN
and END
block is executed. We are dynamically generating a SQL query that is going to insert its results into the temporary table we created back at the beginning. This query is going to take the ITEMS_TO_DELETE
table, which is a table we created only holding the user_id
values to delete from the parent table. It then joins to whichever table we are currently looking at in the list the cursor is iterating through, using the user_id
as the common field. Once the inserting query has been dynamically generated and stored in @sqlCommand
, the EXEC
statement is used to carry out the command. Just prior to the end of the block, the FETCH NEXT
status grabs the next row, and the block will repeat as long as there is a next row in the result set. Once the end is reached, the cursor is closed and deallocated.
SELECT * FROM #DataToReview
GO
Finally, we conclude our code by selecting the results from our DataToReview
temp table. This will give us all of the user_id's that appear in any of the tables we searched, along with the table that they showed up in.
Wrapping Up
The above might not necessarily be a specific scenario you face in SQL Server, but knowing how to dynamically search INFORMATION_SCHEMA
for tables, columns, and more is a very useful thing to have in your skillset. Likewise, being able to iterate through results with cursors and build SQL statements dynamically can help save you a great deal of time.
Have questions or suggestions on other ways to solve this problem? Feel free to comment below or reach out on Twitter!
Enjoyed this post? Consider buying me a coffee to fuel the creation of more content.