Questions tagged [hashing]

For questions about the usage of hashing in a database context

67 questions
31
votes
4 answers

What is a scalable way to simulate HASHBYTES using a SQL CLR scalar function?

As part of our ETL process, we compare rows from staging against the reporting database to figure out if any of the columns have actually changed since the data was last loaded. The comparison is based on the unique key of the table and some kind of…
Joe Obbish
  • 32,976
  • 4
  • 74
  • 153
26
votes
4 answers

Choosing the right algorithm in HashBytes function

We need to create hash value of nvarchar data for comparison purposes. There are multiple hash algorithms available in T-SQL, but which one the best to choose from in this scenario? We want to ensure the risk of having duplicate hash value for two…
Sky
  • 3,744
  • 18
  • 53
  • 68
20
votes
3 answers

How is it possible for Hash Index not to be faster than Btree for equality lookups?

For every version of Postgres that supported hash indexing, there is a warning or note that hash indexes are "similar or slower" or "not better" than btree indexes, at least up to version 8.3. From the docs: Version 7.2: Note: Because of the…
Sampson Crowley
  • 315
  • 1
  • 3
  • 9
18
votes
2 answers

redis newbie - how to create hash within a hash?

I want to create this type of a structure in redis: (its basically json data) { "id": "0001", "name":"widget ABC", "model": "model123", "service":"standard", "admin_password": 82616416, "r1": { …
Happydevdays
  • 355
  • 1
  • 2
  • 9
11
votes
1 answer

What is the algorithm behind the EXCEPT operator?

What is the internal algorithm of how the Except operator works under the covers in SQL Server? Does it internally take a hash of each row and compare? David Lozinksi ran a study, SQL: Fastest way to insert new records where one doesn’t already…
10
votes
1 answer

Hash aggregate bailout

A question that arose in a chat discussion: I know hash join bailout switches internally to a sort of nested loops thing. What does SQL Server do for a hash aggregate bailout (if it can happen at all)?
Paul White
  • 94,921
  • 30
  • 437
  • 687
10
votes
1 answer

Hash Join vs Hash Semi Join

PostgreSQL 9.2 I'm trying to understand the difference between Hash Semi Join and just Hash Join. Here're two queries: I EXPLAIN ANALYZE SELECT * FROM orders WHERE customerid IN (SELECT customerid FROM customers WHERE state='MD'); Hash Semi Join …
St.Antario
  • 1,285
  • 1
  • 10
  • 15
9
votes
1 answer

Best Data Type to Store Result of HASHBYTES('MD5', ...)

What would be the best data type to store the results of the HASHBYTES('MD5', ...)? It outputs 16 bytes of binary as follows: e.g. 0x5CFCD77F9FF836189D2F647EBCEA183E I could store it in the following data types: char(34) binary(16) (I think - I…
GWR
  • 2,847
  • 9
  • 35
  • 42
8
votes
3 answers

Is left hash join always better than left outer join?

I have a query that runs quite slow (see below). While searching for a way to improve it we found that the query runs about ten times faster if we replaced the LEFT OUTER JOIN with LEFT HASH JOIN The result seems to be the same. Is it? In what cases…
merger
  • 199
  • 1
  • 1
  • 8
8
votes
3 answers

Does Microsoft release its SQL Query Hash Algorithm..?

We use SQL Spotlight in our environment. it's pretty handy. we use particularly the output of sys.dm_exec_requests and sys.dm_exec_query_stats. Spotlight pulls the query plan from the plan cache for you using the hash which is nice. Problem is,…
g0pher
  • 83
  • 4
7
votes
2 answers

Can't run digest() on PosrgreSQL 9.3 on RDS

Following the advice given on another answer, I tried using the digest() function on PostgreSQL 9.3 on AWS RDS: devdb=> SELECT digest('blah', 'sha1'); ERROR: function digest(unknown, unknown) does not exist LINE 1: SELECT digest('blah', 'sha1'); …
Adam Matan
  • 12,079
  • 30
  • 82
  • 96
7
votes
1 answer

Using HASHBYTES() yields different results for nvarchar and a variable

I'm using server-side hashing to transmit passwords, then running PBKDF2 in the database to store the hashed password + salt combination. Hashing nvarchar(max) and a @variable holding the same value yielded different results with the HASHBYTES()…
Nate Anderson
  • 315
  • 1
  • 2
  • 10
6
votes
2 answers

Byte ordering for multibyte characters in SQL Server versus Oracle

I am currently in the process of migrating data from Oracle to SQL Server and I'm encountering an issue trying to validate the data post-migration. Environment Details: Oracle 12 - AL32UTF8 character set Client - NLS_LANG - WE8MSWIN1252 VARCHAR2…
HandyD
  • 10,432
  • 1
  • 13
  • 27
4
votes
1 answer

Create Postgres hash aggregate function

Postgres: 15.1 My goal is simple: I want to create an aggregated hash value for x rows from column c in table z. Example Input: SELECT hash_agg(c) AS checksum FROM z; Example Output: checksum 51jj5l1jl55 Unfortunately, this seems to be such an…
Yuki
  • 41
  • 1
  • 4
4
votes
1 answer

In PostgreSQL, what area in memory does HashSetOp use, work_mem or shared_buffer?

I want to know the difference between the hash table in a hash join and the hash table in HashSetOp Except. Here is the PostgreSQL script I wrote to explain my question. drop table if exists t1 cascade; drop table if exists t2 cascade; create table…
JAEGEUN YU
  • 51
  • 4
1
2 3 4 5