ID, Ego, Super-Ego

Bartek „Koziołek“ Kuczyński

ID

Instinct without morality

How does the JVM identify objects?

Ordinary Object Pointers – OOPs

This is internal stuff!
Mark Word
Klass Word

Mark Word

Identity hashCode
Locks
GC metadata
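The identity hash is the one piece of the mark word that plain Java code can observe. A minimal sketch, assuming a hypothetical `Point` class that overrides `hashCode()`: `System.identityHashCode` still returns the JVM-assigned value, which does not depend on the object's fields.

```java
// Sketch: the identity hash belongs to the object itself, not to your hashCode().
final class IdentityHashDemo {

    // Hypothetical value class whose hashCode() depends only on its content.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Point
                    && ((Point) other).x == x
                    && ((Point) other).y == y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);

        // Content-based hash: equal objects report the same value.
        System.out.println(a.hashCode() == b.hashCode());

        // Identity hash: assigned by the JVM and stable for the object's lifetime.
        System.out.println(System.identityHashCode(a) == System.identityHashCode(a));
    }
}
```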

Klass Word

Klass pointer
Compressed class pointers

In most cases you should not care

When should you care?

You use an object as a monitor
CPU cache fit - Old code
High performance
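The first case is the most common one: every `synchronized` block locks on some object, and the lock state is what ends up in that object's header. A minimal sketch, assuming a hypothetical `MonitorCounter` class that keeps a private lock object so nothing outside the class can synchronize on it:

```java
// Sketch: a dedicated, private object used only as a monitor.
final class MonitorCounter {
    // No other code can reach this object, so no other code can contend on it.
    private final Object lock = new Object();
    private long value;

    long increment() {
        // Entering the block acquires the monitor of `lock`; leaving releases it.
        synchronized (lock) {
            return ++value;
        }
    }

    long current() {
        synchronized (lock) {
            return value;
        }
    }
}
```

Locking on `this` or on a shared object works too, but then any caller holding a reference can take the same lock.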

Ego

How do I identify myself?

What can be there?

int(eger) or long

Sequence

Pros

It is simple
It is fast
It has "business value"

Cons

It is not secure
It is slow
The "missed value" problem
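A minimal sketch of a sequence, assuming a hypothetical in-memory `InMemorySequence` class rather than a database sequence. It also shows one reading of the missed-value problem: a value that was handed out but never saved leaves a gap.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: an in-memory sequence; a database sequence behaves similarly.
final class InMemorySequence {
    private final AtomicLong next = new AtomicLong(1);

    // Hands out 1, 2, 3, ... and is safe to call from many threads.
    long nextId() {
        return next.getAndIncrement();
    }

    public static void main(String[] args) {
        InMemorySequence sequence = new InMemorySequence();
        long saved = sequence.nextId();     // stored successfully
        long lost = sequence.nextId();      // taken by an operation that later fails
        long alsoSaved = sequence.nextId(); // stored successfully
        // The failed operation does not give its value back, so the stored IDs have a gap.
        System.out.println(saved + ", (" + lost + " missing), " + alsoSaved);
    }
}
```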

Ideal Identifier

Technical

Unique
Efficient & Fast
Easy to implement

Business

Unique
Easy to use
Has business meaning
Sortable/Gap-free

Is it unique? Yes, but no
Is it fast? Yes
Is it easy to use? Yes
Is it sortable? Yes
Is it gap-free? Yes, but no

Other solution

UUID

Is it unique? Yes
Is it fast? Not so
Is it easy to use? No
Is it sortable? No
Is it gap-free? Not applicable (sparse)
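For reference, generating a UUID in Java needs only the standard library. A minimal sketch, assuming a hypothetical `UuidDemo` wrapper class; `UUID.randomUUID()` produces a version 4 (random) UUID from a cryptographically strong generator, which is part of why it is not the fastest option.

```java
import java.util.UUID;

// Sketch: random (version 4) UUIDs from the standard library.
final class UuidDemo {

    static UUID newId() {
        return UUID.randomUUID();
    }

    public static void main(String[] args) {
        UUID id = newId();
        System.out.println(id);           // canonical 36-character text form
        System.out.println(id.version()); // version field of the generated UUID
    }
}
```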

Other solution

ULID

What is ULID?

Universally Unique Lexicographically Sortable Identifier
UUID with time
48 bits of timestamp + 80 bits of randomness

Is it unique? Yes
Is it fast? Not so
Is it easy to use? No
Is it sortable? Yes
Is it gap-free? Not applicable
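The 48 + 80 bit layout can be sketched in a few lines. This is an illustration under assumptions, not a reference implementation: the class name `UlidSketch` is hypothetical, the text form is the 26-character Crockford Base32 encoding, and IDs created in the same millisecond are not ordered among themselves.

```java
import java.security.SecureRandom;

// Sketch: 48-bit millisecond timestamp followed by 80 random bits.
final class UlidSketch {
    // Crockford Base32 alphabet (no I, L, O, U); ASCII order matches value order.
    private static final char[] ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ".toCharArray();
    private static final SecureRandom RANDOM = new SecureRandom();

    static String generate(long timestampMillis) {
        char[] out = new char[26];

        // Characters 0..9 carry the 48-bit timestamp, most significant first.
        long time = timestampMillis & 0xFFFF_FFFF_FFFFL;
        for (int i = 9; i >= 0; i--) {
            out[i] = ALPHABET[(int) (time & 31)];
            time >>>= 5;
        }

        // Characters 10..25 carry 80 random bits, encoded as two 40-bit halves.
        byte[] random = new byte[10];
        RANDOM.nextBytes(random);
        for (int half = 0; half < 2; half++) {
            long bits = 0;
            for (int i = 0; i < 5; i++) {
                bits = (bits << 8) | (random[half * 5 + i] & 0xFF);
            }
            for (int i = 7; i >= 0; i--) {
                out[10 + half * 8 + i] = ALPHABET[(int) (bits & 31)];
                bits >>>= 5;
            }
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(generate(System.currentTimeMillis()));
    }
}
```

Because the timestamp comes first and the alphabet is in ASCII order, sorting the strings sorts the IDs by creation time at millisecond resolution.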

Super-Ego

How society identifies us

Natural identifiers

Email

Pros

It is simple
Easy to maintain

Cons

GDPR

National ID

Pros

It is simple
Easy to maintain

Cons

GDPR
Design flaws

Business-value ID

Invoice ID
Customer ID
Account number

Business-value IDs

Have some specific requirements
Could be hard to implement
Exist in different contexts

What about users?

UX
Links
Randomness
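An ID that ends up in a link should be short, URL-safe and hard to guess. A minimal sketch, assuming a hypothetical `RandomIdSketch` class and a 62-character alphanumeric alphabet; the caller chooses the random source, which trades speed against unpredictability.

```java
import java.security.SecureRandom;
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

// Sketch: a short alphanumeric ID suitable for links.
final class RandomIdSketch {
    // 62 symbols that need no escaping in a URL path.
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    static String generate(Random random, int length) {
        StringBuilder id = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            id.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        return id.toString();
    }

    public static void main(String[] args) {
        // Hard to predict, slower: appropriate when the ID must not be guessable.
        System.out.println(generate(new SecureRandom(), 11));
        // Fast, but predictable: only for IDs that carry no security meaning.
        System.out.println(generate(ThreadLocalRandom.current(), 11));
    }
}
```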

How many unique IDs can we generate?

24986644000165537791

24 quintillion 986 quadrillion 644 trillion 165 million 537 thousand 791

Alternatives

Custom Random ID for UUID
Snowflake ID for ULID
Business ID for Sequence
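A Snowflake-style ID packs time, node and a per-millisecond counter into one `long`. A minimal sketch, assuming the commonly described layout of 41 bits of timestamp, 10 bits of node ID and 12 bits of sequence; the class name, the custom epoch and the handling of a clock that moves backwards are illustrative assumptions.

```java
// Sketch: Snowflake-style 64-bit IDs (41-bit time, 10-bit node, 12-bit sequence).
final class SnowflakeSketch {
    private static final int NODE_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
    // Assumed custom epoch: 2020-01-01T00:00:00Z.
    private static final long EPOCH_MILLIS = 1_577_836_800_000L;

    private final long nodeId;
    private long lastMillis = -1;
    private long sequence;

    SnowflakeSketch(long nodeId) {
        if (nodeId < 0 || nodeId >= (1L << NODE_BITS)) {
            throw new IllegalArgumentException("nodeId must fit in " + NODE_BITS + " bits");
        }
        this.nodeId = nodeId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastMillis) {
            // Assumption: if the clock steps back, keep using the last seen time.
            now = lastMillis;
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Counter exhausted for this millisecond: wait for the clock to advance.
                while ((now = System.currentTimeMillis()) <= lastMillis) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return ((now - EPOCH_MILLIS) << (NODE_BITS + SEQUENCE_BITS))
                | (nodeId << SEQUENCE_BITS)
                | sequence;
    }
}
```

IDs from one node are strictly increasing; IDs from different nodes are ordered only as well as the nodes' clocks agree.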

What does "efficient & fast" mean?

Easy to use/maintain
Works well with many nodes
Index friendly
Number of values per time unit

Stats time

OS: Ubuntu 20.04.6 LTS x86_64
Kernel: 5.8.0-43-generic
CPU: AMD Ryzen Threadripper 3960X (48) @ 3.800GHz
Memory: 128746MiB
Java: OpenJDK Runtime Environment (build 20+36-2344)

Generate ID ops/s (higher is better)

	1	2	24	48
Seq	1943.8 ± 5.5	1076.6 ± 43.6	604.6 ± 72.7	682.3 ± 117.3
UUID	20.9 ± 0.2	9.7 ± 2	9.9 ± 0.1	9.9 ± 0.1
ULID	15.0 ± 0.2	5.0 ± 1.5	4.3 ± 0.1	4.0 ± 0.1
Custom	196.0 ± 1.0	89.0 ± 5.2	69.0 ± 2.1	66.0 ± 0.3
SecCustom	25.4 ± 0.2	12.7 ± 2.6	11.6 ± 0.2	10.8 ± 0.1
UnSecCustom	206.7 ± 4.1	106.6 ± 3.2	93.3 ± 2.4	67.0 ± 1.2

Databases…

Blocking and parallel inserts
UUID type in databases
Indexes
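When the database has no native UUID type, the usual choice is between a 36-character string and 16 raw bytes. A minimal sketch of the conversion, assuming a hypothetical `UuidBytes` helper and a 16-byte binary column on the database side:

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Sketch: store a UUID as 16 bytes instead of a 36-character string.
final class UuidBytes {

    static byte[] toBytes(UUID id) {
        // Big-endian: most significant 64 bits first, then the least significant.
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long mostSignificant = buffer.getLong();
        long leastSignificant = buffer.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }
}
```

The byte form keeps the index entries less than half the size of the text form.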

So…

The ID is an important part of an entity
Sequences are almost never the best solution
UUID and ULID are better
Think about the structure of your data

What is the Aggregate Root?
Do you need a gap-free ID?
Where will you use that ID?
Can you use natural IDs?

Feedback

Thanks!