This page explains how to use the hash function in APL.
Use the `hash` scalar function to transform a value of any data type, treated as a string of bytes, into a signed integer. The result is deterministic: the same input data always produces the same value.
Use the `hash` function to:
Don’t use `hash` to generate values for long-term use. `hash` is generic and the underlying hashing algorithm may change. For long-term stability, use one of the hash functions tied to a specific algorithm, such as `hash_sha1`.
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
Splunk SPL users
Splunk’s `hash` (or `md5`, `sha1`, etc.) returns a hexadecimal string and lets you pick an algorithm. In APL, `hash` always returns a 64-bit integer that trades cryptographic strength for speed and compactness. Use `hash_sha256` if you need a cryptographically secure digest.
ANSI SQL users
SQL dialects often expose vendor-specific functions such as `SHA256` (BigQuery), `HASHBYTES` (SQL Server), or `MD5`. These return either bytes or hex strings. In APL, `hash` always yields an `int64`. To emulate SQL’s modulo bucketing, apply the arithmetic operator you need, such as modulo, to the result.
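As a rough illustration of modulo bucketing, the sketch below spreads requesters across ten buckets. The dataset name `['sample-http-logs']` and the `id` field are assumptions made for this example; substitute your own dataset and field.

```kusto
// Assumed dataset and field names; adjust to your data.
['sample-http-logs']
// hash() returns a signed 64-bit integer; % takes the remainder.
| extend bucket = hash(id) % 10
// Count how many requests fall into each bucket.
| summarize requests = count() by bucket
```

Because `hash` returns a signed integer, remainders may be negative depending on how `%` treats negative operands, so check the bucket range before relying on it.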
Name | Type | Description |
---|---|---|
source | scalar | Any scalar expression except `real`. |
salt | int | (Optional) Salt that lets you derive a different 64-bit domain while keeping determinism. |
The signed integer hash of `source` (and `salt` if supplied).
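A minimal sketch of the optional `salt` parameter, assuming a dataset named `['sample-http-logs']` with an `id` field (both hypothetical here). The salt value `42` is arbitrary.

```kusto
// Assumed dataset and field names; adjust to your data.
['sample-http-logs']
// Hash the same field without and with a salt.
| extend id_hash = hash(id), id_hash_salted = hash(id, 42)
| project id, id_hash, id_hash_salted
| take 5
```

Going by the parameter description, both columns should stay stable across runs for the same `id`, while the salted column is expected to differ from the unsalted one.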
Hash requesters to see your busiest anonymous users.
Query
Output
anon_id | requests |
---|---|
-5872831405421830129 | 128 |
902175364502087611 | 97 |
-354879610945237854 | 85 |
6423087105927348713 | 74 |
-919087345721004317 | 69 |
The query replaces raw IDs with hashed surrogates, counts requests per surrogate, then lists the five most active requesters without exposing PII.
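One way to write a query matching this description is sketched below. The dataset name `['sample-http-logs']` and the `id` field are assumptions, so the results may not match the table above.

```kusto
// Assumed dataset and field names; adjust to your data.
['sample-http-logs']
// Replace the raw requester ID with a hashed surrogate.
| extend anon_id = hash(id)
// Count requests per surrogate.
| summarize requests = count() by anon_id
// Keep the five most active requesters.
| top 5 by requests
```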
Hash trace IDs to see which anonymous trace has the most spans.
Query
Output
trace_bucket | spans |
---|---|
8,858,860,617,655,667,000 | 62 |
4,193,515,424,067,409,000 | 62 |
1,779,014,838,419,064,000 | 62 |
5,399,024,001,804,211,000 | 62 |
-2,480,347,067,347,939,000 | 62 |
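A query of roughly this shape could produce such a table. The dataset name `['otel-demo-traces']` and the `trace_id` field are assumptions for illustration only.

```kusto
// Assumed dataset and field names; adjust to your data.
['otel-demo-traces']
// Replace each trace ID with its hash.
| extend trace_bucket = hash(trace_id)
// Count spans per hashed trace.
| summarize spans = count() by trace_bucket
// Keep the five traces with the most spans.
| top 5 by spans
```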
Group suspicious endpoints without leaking the exact URI.
Query
Output
uri_hash | status | requests |
---|---|---|
-123640987553821047 | 404 | 230 |
4385902145098764321 | 403 | 145 |
-85439034872109873 | 401 | 132 |
493820743209857311 | 404 | 129 |
-90348122345872001 | 500 | 118 |
The query hides sensitive path information yet still lets you see which hashed endpoints return the most errors.
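A sketch of a query along these lines follows. The dataset name `['sample-http-logs']`, the `uri` and `status` fields, and the choice to treat status codes of 400 and above as errors are all assumptions; `toint()` is included in case `status` is stored as a string.

```kusto
// Assumed dataset and field names; adjust to your data.
['sample-http-logs']
// Keep only error responses (assumed here to be status >= 400).
| where toint(status) >= 400
// Hide the exact path behind its hash.
| extend uri_hash = hash(uri)
// Count requests per hashed endpoint and status code.
| summarize requests = count() by uri_hash, status
| top 5 by requests
```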