How to prevent RAM usage of Alpha node from growing?

Our setup:

  • Dgraph setup with about 111,000 graph nodes connected by various edges
  • The Alpha container uses 3.8 GB RAM (per docker stats)

Scenario:

  • sending 1 mutation per second to Dgraph
  • each mutation writes the same structure/type: 36 edges with primitive values plus two dgraph.type entries
  • none of the 36 edges has an index

Observation (using docker stats):

  • after 88,000 writes (about one day later) the Alpha container uses 21 GB RAM
  • after stopping and restarting all containers, the Alpha container's RAM usage shows 10 GB
  • RAM usage is growing by about 6.2 GB per 88,000 mutations (the 10 GB after restart minus the initial 3.8 GB), i.e. roughly 70 KB per mutation
  • at that rate the RAM increase over one month is projected to be 186 GB

Expected behavior:

  • The Alpha container's RAM usage should not increase that drastically; we are planning to run this for many months.

Questions:

  • Is this the expected behavior?
  • If the mutation data are written to the hard drive, why does the RAM usage keep increasing?
  • If the Alpha node starts with the option '--lru_mb=4096', why does the RAM usage go so far above 4096 MB?
  • What is the recommended way to keep this running for a year on a 32 GB RAM system without running out of memory?

docker-compose.yaml

version: '3.5'

services:
  zero:
    image: dgraph/dgraph:v20.03.3
    volumes:
      - dgraph-volume:/dgraph
    restart: always
    command: dgraph zero --my=zero:5080
  alpha:
    image: dgraph/dgraph:v20.03.3
    volumes:
      - dgraph-volume:/dgraph
    ports:
      - 8080:8080   # http for ratel
      - 9080:9080   # gRPC
    restart: always
    command: dgraph alpha --my=alpha:7080 --lru_mb=4096 --zero=zero:5080
  ratel:
    image: dgraph/dgraph:v20.03.3
    ports:
      - 8000:8000
    restart: always
    command: dgraph-ratel

Example mutation data:

     {
       "AreaName": "OTS",
       "Category": "PROCESS",
       "AlarmSet": "Standard",
       "ModuleDescription": "E-103",
       "Nalm": 0,
       "OperatorSuppressed": false,
       "RecordType": 2,
       "SequenceNumber": 85,
       "SuppressionReason": 255,
       "UnitName": "HEAT_EXCHANGERS",
       "Laalm": 0,
       "ZoneId": 0,
       "CualmWord": "",
       "FunctionalClassificationWord": "Not classified",
       "LaalmWord": ":HEAT",
       "Priority": 24,
       "ZoneName": "",
       "AlarmMessage": "Inputs Transfer Failure",
       "AlarmStateWord": "NotValid",
       "Attribute": "",
       "EventClass": 1,
       "ModuleName": "E-103",
       "NodeName": "USAUST-DEV797",
       "OutOfService": false,
       "AlarmId": ":HEAT_EXCHANGERS:E-103",
       "Importance": "1729431211091013888",
       "EventType": 73,
       "EORType": 1,
       "FunctionalClassification": 0,
       "PriorityWord": "",
       "AlarmState": "Unknown",
       "AreaNumber": 16,
       "Cualm": 0,
       "Message": "E-103/HX_E-103/XFR2/SELECTOR",
       "OwnedBy": "?",
       "SuppressionReasonWord": "",
       "dgraph.type": [
         "Core/Obj",
         "Types/Event"
       ]
     }

Welcome @peter-hartmann-emrsn to the Community!

While we investigate the memory increase per mutation, I also wanted to suggest importing all of your existing data into Dgraph via the bulk loader or live loader.
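For example, once the existing records are in an RDF file, the live loader can be pointed at the running cluster with something along these lines (the file name is illustrative, and the flags should be double-checked against dgraph live --help for your version):

    dgraph live -f existing-data.rdf.gz --alpha localhost:9080 --zero localhost:5080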

Re: mutation, how are you sending these mutations? Are you using curl or a Dgraph client?


Hi @peter-hartmann-emrsn, we are facing the same issue. As far as I know, the Dgraph team is aware of it and trying to solve the problem.

We are not using Dgraph in production mainly because of that memory issue (see the test I shared in a different post: GitHub - igormiletic/dgraphtest: Simple performance and load test for DGraph).

Another big problem is querying a graph that has many nodes. For example, in our case, once the number of nodes exceeds 20,000,000, running any kind of query is almost impossible.

Looking forward to getting this solved.

Thank you @Paras - it means a lot to us. If we can get to production with Dgraph that would be a paradigm shift for us.
We are using the C# client NuGet package Dgraph (20.3.0), sending N-Quads that get chunked into sets of 1,000 N-Quads per mutation, with up to 10 mutations running in parallel. The mutations are sent using this code:

    // nQuads holds one chunk of up to 1,000 N-Quads; query is the upsert query for unknown uids
    var mutation = new MutationBuilder { SetNquads = nQuads };
    var req = new RequestBuilder { Query = query, CommitNow = true }.WithMutations(mutation);
    var r1 = await txn.Mutate(req); // the response includes the assigned uids, which we cache
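
For reference, a few of those N-Quads might look roughly like this (the blank-node label is illustrative; predicates and values are taken from the example record above):

    _:event <AreaName> "OTS" .
    _:event <Priority> "24" .
    _:event <OutOfService> "false" .
    _:event <dgraph.type> "Core/Obj" .
    _:event <dgraph.type> "Types/Event" .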

Upsert queries are built for unknown uids. Uids returned by the mutation are cached and used to build new mutations.
The event records are live events that need to be written to Dgraph as they occur/arrive, so they can be consumed by Dgraph queries with little delay. The bulk loader or live loader does not seem right for this. Just for testing, perhaps I could write all records to an RDF file and see whether a live-loader import ends up with a similar memory increase.
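
As a rough sketch of that upsert flow in plain DQL (assuming an identifier predicate such as AlarmId carries an index so it can be used in the query block; the values are taken from the example record, but the exact query is hypothetical):

    upsert {
      query {
        q(func: eq(AlarmId, ":HEAT_EXCHANGERS:E-103")) {
          v as uid
        }
      }
      mutation {
        set {
          uid(v) <SequenceNumber> "86" .
          uid(v) <AlarmState> "Unknown" .
        }
      }
    }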

Hope this helps get to the bottom of this. Our use cases benefit greatly from keeping Event records in Dgraph.

@pjolep thanks for sharing this info, I'll check out your test. We tried using Dgraph for time-series data but had no luck running any kind of query while creating 25,000 new nodes per second. Keeping only the last value looks OK so far - so I still have hope! Also, may I ask what you use in production if not Dgraph?

We were testing Dgraph to support our identity graph (connected users). Unfortunately, mainly because of the problems mentioned above, it is still just an experiment.

We are hacking around this at the moment with Postgres, though it is not truly a graph. We tried a couple of open-source graph databases and Dgraph was the best of them for experiments, but when it came to reliability we started facing serious problems which are still open. I am looking forward to the Dgraph team solving them soon.

Until then, or until a new graph database shows up, good old Postgres works as expected :)


I was thinking of using Dgraph to handle some time-series data that only goes back a short time, maybe two days. I'd love to hear if you get anything like that working, and to see a simple example of your schema if you don't mind. :smiley:


Hey @peter-hartmann-emrsn, we had a long thread over here with similar memory-related issues under high-throughput ingestion: Dgraph can't idle without being oomkilled after large data ingestion. Have you been able to take a heap snapshot of the memory? In that thread @JimWen found the root cause was actually etcd not reading large messages in chunks, which he outlines in this post → Dgraph can't idle without being oomkilled after large data ingestion - #60 by JimWen
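
If it helps, Dgraph serves Go's pprof endpoints on the Alpha's HTTP port (8080 in the compose file above), so a heap profile can be grabbed while the memory is growing, e.g.:

    # interactive heap profile straight from the running Alpha
    go tool pprof http://localhost:8080/debug/pprof/heap

    # or save the profile to share/inspect later
    curl -o alpha_heap.pb.gz http://localhost:8080/debug/pprof/heap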

Additionally, as part of that thread we learned that there's a setting in Dgraph for keeping the L0 cache in memory (the default) vs on disk. I don't believe that setting has been exposed as a flag, but if you compile Dgraph yourself it's just a bool that can be flipped. For us, the system would OOM very quickly under heavy ingestion workloads unless L0 was set to disk.

As a quick aside @Paras, the asynchronous ingestion pipeline Peter describes is similar to ours (and what I was alluding to in our discussion of the multi-tenancy RFC :slight_smile: ). In cases like these there's real value in one ingestor being able to write to many tenants.

Not to derail too much, but as a general critique, the upsert flow is a little painful (we had to implement our own cache for the uids, like Peter did). As I understand it, though, those incrementing uids provisioned by the oracle are integral to the design of the system, so I'm not sure how to make it more ergonomic.


This probably deserves a separate forum post. My general approach is described here: dgraph for timeseries data.md · GitHub.

The problem is that you will likely want to query by TimeStamp, which means indexing TimeStamp in your schema. However, when sending lots of mutations that add new values at high frequency, the index keeps rebuilding, and any attempt to query the time-series data will either time out or fail with something like “try again later”.
Recording a few values per minute should work fine, but for anything more aggressive you may want to run more tests before building a whole product that relies on it.
Also, my approach puts the timestamp on the object; I'm not sure how Dgraph behaves when putting timestamps on facets.
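
For anyone trying this, a minimal schema along these lines is what produces the indexing load described above (the predicate names are only examples; hour is one of Dgraph's standard dateTime tokenizers):

    TimeStamp: dateTime @index(hour) .
    Value: float .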


@peter-hartmann-emrsn Another thing we learned is that Dgraph uses one goroutine per predicate (to provide its transactional guarantees), so if you are hammering value and timestamp, that will likely lead to high contention (probably on just one of your nodes). I did experiments where I “namespaced” predicates and got my throughput to increase by orders of magnitude (this fanned the load out across my nodes). There's ongoing work to remove this limitation in ludicrous mode: Run mutations concurrently per predicate in Ludicrous mode · Issue #5403 · dgraph-io/dgraph · GitHub

@seanlaff - your links are helping me understand the problem a lot better, thanks very much!
I may try time-series again when the ludicrous fix is in. Hopefully this also fixes the read errors caused by ongoing re-indexing.

@peter-hartmann-emrsn Sure thing - your use case probably stands to gain the most performance from the upcoming “shard predicates across nodes” feature: Split predicates into multiple groups · Issue #4585 · dgraph-io/dgraph · GitHub

As it stands right now, Dgraph scales beautifully horizontally with “wide” data (i.e. many different predicates). This is because each predicate is assigned one node to live on, so it's easy to keep scaling out to increase throughput.

In your case, however, those very few predicates (timestamp, value) all get bunched up, and you enter a bit of a vertical-scaling situation. There is a point of diminishing returns with scaling vertically, though, since that one goroutine per predicate will eventually become the bottleneck.

When doing performance tests with “narrow” predicates as you have, it may make sense to manually reassign those shards to different nodes via the Ratel UI. Dgraph will do this for you intelligently in time, but a certain watermark has to be hit before it starts shifting things around - and if you're starting your performance test from a clean slate each time, you may get unlucky and have those few predicates all assigned to just one of your Alphas (skewing the results at the beginning of your test).
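
Besides Ratel, the same move can be triggered against Zero's HTTP admin port (6080 by default; note it isn't published in the compose file above) - a hedged example, assuming the /moveTablet endpoint in this release:

    # ask Zero to move the TimeStamp predicate (tablet) to group 2
    curl "localhost:6080/moveTablet?tablet=TimeStamp&group=2"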

In my case I was giving everything I ingested an xid, but I hit a performance wall quite fast. Since I had many different “things” I was ingesting, I “namespaced” my predicates by type, i.e. user.xid, virtualmachine.xid, etc., which allowed me to scale significantly. (FWIW, this is how the native GraphQL API handles the naming of GraphQL types.)
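
In schema terms, that namespacing looks something like the following (predicate names are just examples); each xid predicate then becomes its own tablet that can be placed on a different group:

    user.xid: string @index(exact) @upsert .
    virtualmachine.xid: string @index(exact) @upsert .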

In my discussions with the team, they acknowledged that Dgraph should be more ergonomic than this and said they are committed to solving the sharded-predicates problem - but I imagine it's a significant piece of work.

1 Like