Multi Tenancy in Dgraph

I recognize the desire to keep the model simple as @paras suggests, but I have some concerns regarding multi-tenant applications that have dgraph behind privileged APIs, riffing on what @iluminae brings up.

Writes

I find it to be a common pattern to “hydrate” read-efficient datastores like dgraph and elasticsearch via other sources of truth (cold storage, old RDBMSs, event logs etc). Often there exists an “all seeing, all knowing” process that can stitch this data and pipe it into the right “tenant”.

Now, if that ETL process (or worse, dozens of stateless replicas of that process) needs to call login for every request and batch data into discrete per-tenant requests, I worry that performance and ergonomics will suffer, especially when reading the “firehose” (see: Dgraph can't idle without being oomkilled after large data ingestion, where we explored various techniques and improvements for high-speed ingestion).

For these batch ingestion uses, an RDF 4-tuple would be more ergonomic since we could include inserts for multiple namespaces with one request. I think this quote was in regards to storage representation (which I have no preference on), but I wanted to highlight 4-tuple as an idea for the API.
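To make the 4-tuple idea concrete, here is a minimal Python sketch. It assumes, hypothetically, that the fourth term of each N-Quad (the graph label) names the target namespace; that is the API shape I am suggesting, not something Dgraph is known to support. The record layout, tenant names, and function names are made up for illustration.

```python
from collections import defaultdict

def escape_literal(value):
    # Minimal escaping for an N-Quads string literal (backslash, quote, newline).
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def to_nquad(namespace, subject, predicate, value):
    # Hypothetical convention: the 4th term (graph label) carries the tenant namespace.
    return f'_:{subject} <{predicate}> "{escape_literal(value)}" <{namespace}> .'

def mixed_batch(records):
    # One payload for all tenants: a privileged ETL process sends a single request.
    return "\n".join(to_nquad(*r) for r in records)

def per_tenant_batches(records):
    # The alternative: one payload (and one login plus request) per tenant.
    batches = defaultdict(list)
    for namespace, subject, predicate, value in records:
        batches[namespace].append(
            f'_:{subject} <{predicate}> "{escape_literal(value)}" .'
        )
    return {ns: "\n".join(lines) for ns, lines in batches.items()}

records = [
    ("tenant-a", "u1", "name", "Alice"),
    ("tenant-b", "u1", "name", "Bob"),
    ("tenant-a", "u2", "name", "Carol"),
]

print(mixed_batch(records))
print(len(per_tenant_batches(records)), "requests needed without the 4th term")
```

With the fourth term, the number of requests is independent of the number of tenants in a batch; without it, the ETL process has to split every batch and authenticate once per tenant.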

Reads

Similarly, when dgraph lives as a small piece of a “greater” multi-tenant platform, lots of nuanced, domain-specific access control happens at the application layer, usually fed by APIs that have permissive access to the upstream databases/caches/services. We’d like to be able to isolate data in dgraph by various criteria, without the overhead of managing users/tokens/state etc.

Mental model

@paras calls out this class of cross-cutting permissions as being “exotic” (which I do agree with), however they are foundational in most databases. Databases evolve to include these pathways because they are not directly exposed to the end-user; intermediary APIs and services do a lot of stitching/filtering.

Conflating ACL and Isolation

The concern/critique I’ve seen so far relates to the coupling of data-isolation and access-control. I believe they are independent features that build on each other.

Data isolation is quite important, and I think for instances where only privileged users (APIs) interact with dgraph, ACL is a bit overkill and can in fact make things more painful. I see the features serving two different purposes:

Data Isolation

  • prevents dgraph schema collisions
  • avoids silly footguns (rm -rfing the whole prod database)
  • helps common DB-isms like hot/cold schema migrations/swap-over
  • provides no guarantees of security (if the header says namespaceX, they get namespaceX)
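The properties in this list can be sketched with a toy in-memory store. Everything here is hypothetical: the `IsolatedStore` class, the `X-Namespace` header name, and the tenant names are illustrations of the idea, not Dgraph APIs.

```python
class IsolatedStore:
    """Toy in-memory store; each namespace has its own schema and data."""

    def __init__(self):
        self._namespaces = {}

    def _ns(self, name):
        return self._namespaces.setdefault(name, {"schema": {}, "data": []})

    def set_schema(self, namespace, predicate, type_name):
        self._ns(namespace)["schema"][predicate] = type_name

    def schema(self, namespace):
        return dict(self._ns(namespace)["schema"])

    def insert(self, namespace, triple):
        self._ns(namespace)["data"].append(triple)

    def count(self, namespace):
        return len(self._ns(namespace)["data"])

    def drop_all(self, namespace):
        # Scoped to one namespace, so a mistake cannot wipe every tenant.
        self._namespaces.pop(namespace, None)


def resolve_namespace(headers, default="default"):
    # No authentication: whoever names a namespace in the header gets it.
    return headers.get("X-Namespace", default)


store = IsolatedStore()
store.set_schema("tenant-a", "id", "int")
store.set_schema("tenant-b", "id", "string")  # same predicate, no collision
store.insert("tenant-a", ("u1", "id", 1))
store.insert("tenant-b", ("u1", "id", "abc"))

# A drop issued against tenant-a leaves tenant-b untouched.
store.drop_all(resolve_namespace({"X-Namespace": "tenant-a"}))
print(store.count("tenant-a"), store.count("tenant-b"))
```

Note that `resolve_namespace` trusts the caller completely, which is the point: isolation of this kind protects against collisions and accidents, not against a malicious client.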

Access Control

  • Verifiably “correct”/strict data-isolation and compliance
  • User provisioning/mgmt/audit trails

I see in the meeting notes there was a comment about this approach. Was it ruled out?

Prior art

Elasticsearch is an apt comparison. In their model, I can make many isolated indices, and even bulk ingest to them in one big request… however native access control on top of their indices is provided under their enterprise license.
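For reference, this is roughly what a multi-index bulk request body looks like. The sketch only builds the newline-delimited JSON payload that Elasticsearch's `_bulk` endpoint accepts (sent with `Content-Type: application/x-ndjson`); it makes no HTTP call, and the index names and documents are made up.

```python
import json

def bulk_body(docs):
    # Each document becomes two NDJSON lines: an action line naming the target
    # index, then the document source. The body must end with a newline.
    lines = []
    for index, doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"

docs = [
    ("tenant-a", "1", {"name": "Alice"}),
    ("tenant-b", "1", {"name": "Bob"}),
]
body = bulk_body(docs)
print(body, end="")
```

The target index travels with each document rather than with the connection or the credentials, which is the same property I would like from a namespace-aware ingestion API.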

I suspect most people are familiar with a model like this, so ACL on top of namespaces seems like a good way to distinguish dgraph-enterprise from OSS… however that is drifting into a business discussion which I want to veer away from while assessing the RFC.

That being said, we would definitely purchase enterprise for isolation capabilities, but the user-per-namespace design would require us to share/track/provision more state than we’d like to, leaving a bit of a sour taste.

I realize I’m promoting a seemingly less safe system :stuck_out_tongue: - but if we conclude isolation != security, the line seems clearer.

Also, FWIW, I think physical separation (via multiple badgers) would be a sweet enterprise upsell for those that have extremely strict compliance needs.