For questions about the usage of hashing in a database context
Questions tagged [hashing]
67 questions
31
votes
4 answers
What is a scalable way to simulate HASHBYTES using a SQL CLR scalar function?
As part of our ETL process, we compare rows from staging against the reporting database to figure out if any of the columns have actually changed since the data was last loaded.
The comparison is based on the unique key of the table and some kind of…
Joe Obbish
- 32,976
- 4
- 74
- 153
26
votes
4 answers
Choosing the right algorithm in HashBytes function
We need to create hash value of nvarchar data for comparison purposes. There are multiple hash algorithms available in T-SQL, but which one the best to choose from in this scenario?
We want to ensure the risk of having duplicate hash value for two…
Sky
- 3,744
- 18
- 53
- 68
20
votes
3 answers
How is it possible for Hash Index not to be faster than Btree for equality lookups?
For every version of Postgres that supported hash indexing, there is a warning or note that hash indexes are "similar or slower" or "not better" than btree indexes, at least up to version 8.3. From the docs:
Version 7.2:
Note: Because of the…
Sampson Crowley
- 315
- 1
- 3
- 9
18
votes
2 answers
redis newbie - how to create hash within a hash?
I want to create this type of a structure in redis: (its basically json data)
{
"id": "0001",
"name":"widget ABC",
"model": "model123",
"service":"standard",
"admin_password": 82616416,
"r1":
{
…
Happydevdays
- 355
- 1
- 2
- 9
11
votes
1 answer
What is the algorithm behind the EXCEPT operator?
What is the internal algorithm of how the Except operator works under the covers in SQL Server? Does it internally take a hash of each row and compare?
David Lozinksi ran a study, SQL: Fastest way to insert new records where one doesn’t already…
user162241
10
votes
1 answer
Hash aggregate bailout
A question that arose in a chat discussion:
I know hash join bailout switches internally to a sort of nested loops thing.
What does SQL Server do for a hash aggregate bailout (if it can happen at all)?
Paul White
- 94,921
- 30
- 437
- 687
10
votes
1 answer
Hash Join vs Hash Semi Join
PostgreSQL 9.2
I'm trying to understand the difference between Hash Semi Join and just Hash Join.
Here're two queries:
I
EXPLAIN ANALYZE SELECT * FROM orders WHERE customerid IN (SELECT
customerid FROM customers WHERE state='MD');
Hash Semi Join …
St.Antario
- 1,285
- 1
- 10
- 15
9
votes
1 answer
Best Data Type to Store Result of HASHBYTES('MD5', ...)
What would be the best data type to store the results of the HASHBYTES('MD5', ...)?
It outputs 16 bytes of binary as follows: e.g.
0x5CFCD77F9FF836189D2F647EBCEA183E
I could store it in the following data types:
char(34)
binary(16) (I think - I…
GWR
- 2,847
- 9
- 35
- 42
8
votes
3 answers
Is left hash join always better than left outer join?
I have a query that runs quite slow (see below). While searching for a way to improve it we found that the query runs about ten times faster if we replaced the LEFT OUTER JOIN with LEFT HASH JOIN
The result seems to be the same. Is it? In what cases…
merger
- 199
- 1
- 1
- 8
8
votes
3 answers
Does Microsoft release its SQL Query Hash Algorithm..?
We use SQL Spotlight in our environment. it's pretty handy.
we use particularly the output of sys.dm_exec_requests and sys.dm_exec_query_stats.
Spotlight pulls the query plan from the plan cache for you using the hash which is nice.
Problem is,…
g0pher
- 83
- 4
7
votes
2 answers
Can't run digest() on PosrgreSQL 9.3 on RDS
Following the advice given on another answer, I tried using the digest() function on PostgreSQL 9.3 on AWS RDS:
devdb=> SELECT digest('blah', 'sha1');
ERROR: function digest(unknown, unknown) does not exist
LINE 1: SELECT digest('blah', 'sha1');
…
Adam Matan
- 12,079
- 30
- 82
- 96
7
votes
1 answer
Using HASHBYTES() yields different results for nvarchar and a variable
I'm using server-side hashing to transmit passwords, then running PBKDF2 in the database to store the hashed password + salt combination.
Hashing nvarchar(max) and a @variable holding the same value yielded different results with the HASHBYTES()…
Nate Anderson
- 315
- 1
- 2
- 10
6
votes
2 answers
Byte ordering for multibyte characters in SQL Server versus Oracle
I am currently in the process of migrating data from Oracle to SQL Server and I'm encountering an issue trying to validate the data post-migration.
Environment Details:
Oracle 12 - AL32UTF8 character set
Client - NLS_LANG - WE8MSWIN1252
VARCHAR2…
HandyD
- 10,432
- 1
- 13
- 27
4
votes
1 answer
Create Postgres hash aggregate function
Postgres: 15.1
My goal is simple: I want to create an aggregated hash value for x rows from column c in table z.
Example Input:
SELECT hash_agg(c) AS checksum FROM z;
Example Output:
checksum
51jj5l1jl55
Unfortunately, this seems to be such an…
Yuki
- 41
- 1
- 4
4
votes
1 answer
In PostgreSQL, what area in memory does HashSetOp use, work_mem or shared_buffer?
I want to know the difference between the hash table in a hash join and the hash table in HashSetOp Except.
Here is the PostgreSQL script I wrote to explain my question.
drop table if exists t1 cascade;
drop table if exists t2 cascade;
create table…
JAEGEUN YU
- 51
- 4