Why haven't "offline-first" apps taken off?
The big problem with the web is it does not work offline. Even if you are online, if you're on slow network connection, your app feels terrible. That sucks. That's a big limitation.
Local-first is a solution. But writing full local app is a complete re-architecture The main reason why it (local-first app) hasn't taken off, it's because syncing is a very hard problem.
Local-first apps are a distributed system. Distributed system is hard to build. Correctly syncing different states on different computers is so hard.
Why is syncing so hard?
Two main problems
- Unreliable ordering
- Conflicts
Unreliable ordering
In a situation where two people making changes to a single document, we want to think about this as if there's one timeline. But they are distributed systems. As the changes are propagated and synced each other, it's going to get them different orders. We can't just naively mutate the data as we see changes.
Eventual Consistency
Even if we get the changes in different orders, is there a way to still produce the same state on each device? We need a special kind of clock and assign timestamps to each change.
Time is relative; its only worth depends upon what we do as it is passing - Albert Einstein
Hybrid Logical Clock (HLC) Paper
- Vector clock
- Hybrid Logical Clock (HLC)
- Per-device
HLC generates timestamps that we can rely on. All that we care about is, when you assign a time into a change, the clock is accurate enough to give sync system some data so that we can compare changes and say, did this change that happened on this machine come after the all things that the machine has already seen; Relative Ordering.
Reliability
Simplicity is a prerequisite for reliability. - Edsger W. Dijkstra
This is complex. Distributed apps/systems are a complex problem. The way to have it reliable is simple solution. We want this to be reliable. For this to happen, we have to have the least amount of code possible. HLC (Hybrid Logical Clock)'s implemented in 256 lines of code with no dependencies.
Conflicts
How do we deal with conflicts? Manual conflict resolution does not work. As a developer, you are not going to write correct manual conflict resolvers. Way too hard, way too many edge cases.
CRDTs
Partially ordered monoid in the category of endofunctors with a least upper bound.
Data Types
- G-Counter
- LWW-Element-Set
- PN-Counter
- OR-Set
- G-Set
- ORSWOT
- 2P-Set
- and more...
Two CRDTs properties
- Commutative (The order doesn't matter in which you apply the changes)
2 + 3 = 3 + 2 = 5
- Idempotent (It doesn't matter how many times you apply the changes)
f(x) = f(x)f(x) = f(x)f(x)f(x)
How to take basic relational data and turn it into CRDTs?
- SQLite table -> G-Set of LWW-Maps
- One new table:
messages_crdt
- This table saves all the messages I have ever seen whether I generated myself or I got them from other devices.
- It has
- timestamp
- dataset (table that the change is applying to)
- row (item id)
- column
- value
- To implement this
- server: 132 lines of JS
- client: 639 lines of JS
- only dependencies:
uuid
andmurmurhash
- Source
Conclusion
- Local apps have superior UX. They are super fast, no latency, and work offline.
- We've got to start simplifying our solutions
- Clocks (particularly HLCs) and CRDTs are an elegant solution to distributed apps.
Links
- Actual
- Hybrid logical clocks
- CRDTs
- Demo apps