Search This Blog

December 01, 2008

How to eliminate duplicate rows in a table



/*Create table DupRecords*/
IF NOT EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.TABLES t
WHERE t.TABLE_NAME='DUPRECORDS' AND t.TABLE_SCHEMA='dbo')
BEGIN
Create Table dbo.DupRecords
(
id INT not null
,Name varchar(40)
,Salary money
)
END

/*Insert duplicate data*/
insert into DupRecords values
(1,'A', 1000)
,(1,'A', 1000)
,(2,'B',2000)
,(3,'C',3000)
,(2,'B',2000)
,(2,'B',2000)
,(3,'C',3000)
,(4,'D',4000)


If we want to eliminate duplicate rows in a table use the following syntax


/*Display Non duplicate data*/
SELECT distinct * FROM dbo.duprecords


The following code is used to eliminate duplicate rows based on selected columns by displaying all the columns. The columns on which duplicate is decided need to be mentioned after Partition by in the Over() function


SELECT Id,Name,Salary FROM
(
select Id,Name,Salary, ROW_NUMBER()over(partition by id,name,salary order by id) as [RowNumber] from dbo.DupRecords
) t
WHERE RowNumber=1


If we want to display what are the records we need to delete to eliminate duplicate rows in a table we can use the following sql statement.

SELECT Id,Name,Salary FROM
(
select *, ROW_NUMBER()over(partition by id,name,salary order by id) as [RowNumber] from dbo.DupRecords
) t
WHERE RowNumber>1

No comments: