Data cleansing is the process of removing or correcting incorrect data in a dataset.
Questions tagged [data-cleansing]
7 questions
5
votes
2 answers
Find rows with similar string values
I have a Microsoft SQL Server 2012 database table with around 7 million crowd-sourced records, primarily containing a string name value with some related details. For nearly every record it seems there are a dozen similar typo records and I am…
kscott
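One way to explore the near-duplicate names described in this question, outside the database, is pairwise fuzzy matching. The sketch below uses Python's standard `difflib` rather than anything built into SQL Server. The function name `similar_pairs` and the 0.85 threshold are illustrative assumptions, and the all-pairs loop is quadratic, so it only suits a small sample and not 7 million rows.

```python
from difflib import SequenceMatcher


def similar_pairs(names, threshold=0.85):
    """Return (a, b, ratio) for each pair of names at or above the threshold.

    Hypothetical helper: compares every name with every later name,
    ignoring case. Cost grows quadratically with the number of names.
    """
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if ratio >= threshold:
                pairs.append((a, b, ratio))
    return pairs


if __name__ == "__main__":
    # Made-up sample values, not data from the question.
    for a, b, ratio in similar_pairs(["Starbucks", "Starbuks", "Walmart"]):
        print(f"{a!r} ~ {b!r} ({ratio:.2f})")
```

For a full table, some form of blocking (for example, only comparing names that share a first letter or a phonetic code) would likely be needed before any pairwise comparison.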
2
votes
1 answer
How to compare two tables that have no primary key?
I have two sets of tables at work whose data I have to compare.
The fields are identical, but there is no column that has unique entries.
(Employee ID, Assignment ID, Employee's last name, Employee's First Name,
Dependent Last name, Dependent…
vpxoxo
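When no column is unique, one possible approach is to treat each table as a multiset of whole rows and compare row counts. The sketch below does this in Python with `collections.Counter`, assuming the rows have already been exported as tuples. `diff_rows` is a hypothetical name, and the sample rows are invented.

```python
from collections import Counter


def diff_rows(table_a, table_b):
    """Compare two tables as multisets of rows.

    Each table is an iterable of hashable row tuples. Returns two Counters:
    rows (with surplus counts) found only in A, and rows found only in B.
    """
    count_a = Counter(table_a)
    count_b = Counter(table_b)
    # Counter subtraction keeps only positive counts, so a row that
    # appears twice in A and once in B shows up once in only_in_a.
    only_in_a = count_a - count_b
    only_in_b = count_b - count_a
    return only_in_a, only_in_b


if __name__ == "__main__":
    # Invented rows: (employee id, assignment id, last name)
    a = [(1, 10, "Smith"), (1, 10, "Smith"), (2, 20, "Jones")]
    b = [(1, 10, "Smith"), (3, 30, "Brown")]
    only_a, only_b = diff_rows(a, b)
    print("Only in A:", dict(only_a))
    print("Only in B:", dict(only_b))
```

Because the counts are compared, a row duplicated in one table but not the other is reported, which a plain set difference would miss.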
2
votes
2 answers
How should I handle measurement errors in a timeseries database?
I have a table used to record measurements sampled at regular intervals on different sensors. Each row records the time, the identifier of the quantity being measured, and the value itself.
Now and again measurement errors occur and garbage is being…
lindelof
1
vote
1 answer
Find duplicate values by joining tables in SQL Server
Finding the duplicate values in the 'Item_Sales_Detail' table as NULL rows in the 'Sales' and 'Item' tables by joining three tables.
'Sales' table (ID is primary…
Mohammad Bastan
1
vote
0 answers
Selecting Duplicates on All Fields
I have an MS Access (no laughing at the back) database I've used to import a bunch of IIS logs into.
Having looked at the Excel files I pulled these in from, I'm worried going by the dates that some of the IIS files might have full duplicates (i.e.…
user788561
0
votes
0 answers
How to filter extraneous Unicode values from a column?
I am cleaning up / priming our PostgreSQL 12 database for future data-related activities (e.g. data encryption). I have tried the following methods to delete non-basic Latin / basic accented Latin / punctuation values from one of our…
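As a rough illustration of this kind of filtering done outside the database, the Python sketch below strips every character that falls outside two Unicode ranges. The choice of ranges (printable Basic Latin, U+0020 to U+007E, plus Latin-1 Supplement, U+00A0 to U+00FF) is an assumption about what "basic Latin / basic accented Latin / punctuation" means here, and `strip_extraneous` is a hypothetical name.

```python
import re

# Assumed whitelist: printable ASCII plus the Latin-1 Supplement block.
# Everything else is removed, including tabs, newlines and zero-width
# characters.
_DISALLOWED = re.compile(r"[^\u0020-\u007E\u00A0-\u00FF]")


def strip_extraneous(text):
    """Return text with all characters outside the assumed whitelist removed."""
    return _DISALLOWED.sub("", text)


if __name__ == "__main__":
    # Contains a zero-width space (U+200B) and a snowman (U+2603).
    print(strip_extraneous("Caf\u00e9\u200b \u2603ok"))
```

Accented letters outside Latin-1 (for example those in Latin Extended-A) would also be dropped by this pattern, so the ranges would need widening if such values count as legitimate.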
0
votes
1 answer
Why do tables get cleaned and copied in a SQL Server DB?
I'm working on an SQL-Server database.
Regularly, entries from one table get moved to another one (from entries to Log_Entries) in order not to flood the database. (The Log_Entries get cleaned afterwards too)
I would like to know how this works, but…
Dominique