Here I adapted my old answer from a similar SO question How to find all connected subgraphs of an undirected graph.
This variant doesn't use a cursor or explicit loop, but uses a single recursive query.
Essentially, it treats the data as edges in a graph and traverses recursively all edges of the graph, stopping when the loop is detected. Then it puts all found loops in groups and gives each group a number.
See the detailed explanations of how it works below. I recommend you to run the query CTE-by-CTE and examine each intermediate result to understand what it does.
Sample data
DECLARE @T TABLE (ID1 int NOT NULL, ID2 int NOT NULL);
INSERT INTO @T (ID1, ID2) VALUES
(10, 20),
(10, 30),
(30, 30),
(10, 40),
(50, 70),
(60, 50),
(70, 70),
(4457745, 255714),
(4457745,2540222),
(2540222,4457745),
( 255714,4457745);
Query
WITH
CTE_Ids
AS
(
SELECT ID1 AS ID
FROM @T
UNION
SELECT ID2 AS ID
FROM @T
)
,CTE_Pairs
AS
(
SELECT ID1, ID2
FROM @T
WHERE ID1 <> ID2
UNION
SELECT ID2 AS ID1, ID1 AS ID2
FROM @T
WHERE ID1 <> ID2
)
,CTE_Recursive
AS
(
SELECT
CAST(CTE_Ids.ID AS varchar(8000)) AS AnchorID
,ID1
,ID2
,CAST(
',' + CAST(ID1 AS varchar(8000)) +
',' + CAST(ID2 AS varchar(8000)) +
',' AS varchar(8000)) AS IdPath
,1 AS Lvl
FROM
CTE_Pairs
INNER JOIN CTE_Ids ON CTE_Ids.ID = CTE_Pairs.ID1
UNION ALL
SELECT
CTE_Recursive.AnchorID
,CTE_Pairs.ID1
,CTE_Pairs.ID2
,CAST(
CTE_Recursive.IdPath +
CAST(CTE_Pairs.ID2 AS varchar(8000)) +
',' AS varchar(8000)) AS IdPath
,CTE_Recursive.Lvl + 1 AS Lvl
FROM
CTE_Pairs
INNER JOIN CTE_Recursive ON CTE_Recursive.ID2 = CTE_Pairs.ID1
WHERE
CTE_Recursive.IdPath NOT LIKE
CAST('%,' + CAST(CTE_Pairs.ID2 AS varchar(8000)) + ',%' AS varchar(8000))
)
,CTE_RecursionResult
AS
(
SELECT AnchorID, ID1, ID2
FROM CTE_Recursive
)
,CTE_CleanResult
AS
(
SELECT AnchorID, ID1 AS ID
FROM CTE_RecursionResult
UNION
SELECT AnchorID, ID2 AS ID
FROM CTE_RecursionResult
)
SELECT
DENSE_RANK() OVER (ORDER BY CA_Data.XML_Value) AS [NewID]
,CTE_Ids.ID AS [OldID]
,CA_Data.XML_Value AS GroupMembers
FROM
CTE_Ids
CROSS APPLY
(
SELECT CAST(CTE_CleanResult.ID AS nvarchar(max)) + ','
FROM CTE_CleanResult
WHERE CTE_CleanResult.AnchorID = CTE_Ids.ID
ORDER BY CTE_CleanResult.ID FOR XML PATH(''), TYPE
) AS CA_XML(XML_Value)
CROSS APPLY
(
SELECT CA_XML.XML_Value.value('.', 'NVARCHAR(MAX)')
) AS CA_Data(XML_Value)
ORDER BY [NewID], [OldID];
Result
+-------+---------+-------------------------+
| NewID | OldID | GroupMembers |
+-------+---------+-------------------------+
| 1 | 10 | 10,20,30,40, |
| 1 | 20 | 10,20,30,40, |
| 1 | 30 | 10,20,30,40, |
| 1 | 40 | 10,20,30,40, |
| 2 | 255714 | 255714,2540222,4457745, |
| 2 | 2540222 | 255714,2540222,4457745, |
| 2 | 4457745 | 255714,2540222,4457745, |
| 3 | 50 | 50,60,70, |
| 3 | 60 | 50,60,70, |
| 3 | 70 | 50,60,70, |
+-------+---------+-------------------------+
How it works
CTE_Ids
CTE_Ids gives the list of all Identifiers that appear in both ID1 and ID2 columns.
Since they can appear in any order we UNION both columns together. UNION also removes any duplicates.
+---------+
| ID |
+---------+
| 10 |
| 20 |
| 30 |
| 40 |
| 50 |
| 60 |
| 70 |
| 255714 |
| 2540222 |
| 4457745 |
+---------+
CTE_Pairs
CTE_Pairs gives the list of all edges of the graph in both directions. Again, UNION is used to remove any duplicates.
+---------+---------+
| ID1 | ID2 |
+---------+---------+
| 10 | 20 |
| 10 | 30 |
| 10 | 40 |
| 20 | 10 |
| 30 | 10 |
| 40 | 10 |
| 50 | 60 |
| 50 | 70 |
| 60 | 50 |
| 70 | 50 |
| 255714 | 4457745 |
| 2540222 | 4457745 |
| 4457745 | 255714 |
| 4457745 | 2540222 |
+---------+---------+
CTE_Recursive
CTE_Recursive is the main part of the query that recursively traverses the graph starting from each unique Identifier.
These starting rows are produced by the first part of UNION ALL.
The second part of UNION ALL recursively joins to itself linking ID2 to ID1.
Since we pre-made CTE_Pairs with all edges written in both directions, we can always link only ID2 to ID1 and we'll get all paths in the graph.
At the same time the query builds IdPath - a string of comma-delimited Identifiers that have been traversed so far.
It is used in the WHERE filter:
CTE_Recursive.IdPath NOT LIKE
CAST('%,' + CAST(CTE_Pairs.ID2 AS varchar(8000)) + ',%' AS varchar(8000))
As soon as we come across the Identifier that had been included in the Path before, the recursion stops as the list of connected nodes is exhausted.
AnchorID is the starting Identifier for the recursion, it will be used later to group results.
Lvl is not really used, I included it for better understanding of what is going on.
+----------+---------+---------+--------------------------+-----+
| AnchorID | ID1 | ID2 | IdPath | Lvl |
+----------+---------+---------+--------------------------+-----+
| 10 | 10 | 20 | ,10,20, | 1 |
| 10 | 10 | 30 | ,10,30, | 1 |
| 10 | 10 | 40 | ,10,40, | 1 |
| 20 | 20 | 10 | ,20,10, | 1 |
| 30 | 30 | 10 | ,30,10, | 1 |
| 40 | 40 | 10 | ,40,10, | 1 |
| 50 | 50 | 60 | ,50,60, | 1 |
| 50 | 50 | 70 | ,50,70, | 1 |
| 60 | 60 | 50 | ,60,50, | 1 |
| 70 | 70 | 50 | ,70,50, | 1 |
| 255714 | 255714 | 4457745 | ,255714,4457745, | 1 |
| 2540222 | 2540222 | 4457745 | ,2540222,4457745, | 1 |
| 4457745 | 4457745 | 255714 | ,4457745,255714, | 1 |
| 4457745 | 4457745 | 2540222 | ,4457745,2540222, | 1 |
| 2540222 | 4457745 | 255714 | ,2540222,4457745,255714, | 2 |
| 255714 | 4457745 | 2540222 | ,255714,4457745,2540222, | 2 |
| 70 | 50 | 60 | ,70,50,60, | 2 |
| 60 | 50 | 70 | ,60,50,70, | 2 |
| 40 | 10 | 20 | ,40,10,20, | 2 |
| 40 | 10 | 30 | ,40,10,30, | 2 |
| 30 | 10 | 20 | ,30,10,20, | 2 |
| 30 | 10 | 40 | ,30,10,40, | 2 |
| 20 | 10 | 30 | ,20,10,30, | 2 |
| 20 | 10 | 40 | ,20,10,40, | 2 |
+----------+---------+---------+--------------------------+-----+
CTE_CleanResult
CTE_CleanResult leaves only relevant parts from CTE_Recursive and again merges both ID1 and ID2 using UNION.
+----------+---------+
| AnchorID | ID |
+----------+---------+
| 10 | 10 |
| 10 | 20 |
| 10 | 30 |
| 10 | 40 |
| 20 | 10 |
| 20 | 20 |
| 20 | 30 |
| 20 | 40 |
| 2540222 | 255714 |
| 2540222 | 2540222 |
| 2540222 | 4457745 |
| 255714 | 255714 |
| 255714 | 2540222 |
| 255714 | 4457745 |
| 30 | 10 |
| 30 | 20 |
| 30 | 30 |
| 30 | 40 |
| 40 | 10 |
| 40 | 20 |
| 40 | 30 |
| 40 | 40 |
| 4457745 | 255714 |
| 4457745 | 2540222 |
| 4457745 | 4457745 |
| 50 | 50 |
| 50 | 60 |
| 50 | 70 |
| 60 | 50 |
| 60 | 60 |
| 60 | 70 |
| 70 | 50 |
| 70 | 60 |
| 70 | 70 |
+----------+---------+
Final SELECT
Now we need to build a string of comma-separated ID values for each AnchorID.
CROSS APPLY with FOR XML does it.
DENSE_RANK() calculates the NewID numbers for each AnchorID.
This query builds a full path for each identifier, which is not really necessary.
It is possible to make the overall solution more efficient by running this query once for each group/subgraph, in a loop.
Pick a single starting Identifier (add WHERE ID = 10 to CTE_Ids and CTE_Pairs), then delete from the source table all identifiers that were appended to this group. Repeat until the big table is empty.