Dgraph takes all disk space on Windows

I Want to Do

I want to use Dgraph on Windows. I am using the latest master:

.\dgraph.exe version
[Decoder]: Using assembly version of decoder
Page Size: 4096

Dgraph version   : v20.11.0-g28b75cf26
Dgraph codename  : unnamed-mod
Dgraph SHA-256   : 276a70655e4c56f996a080590645d200808c7d5b71677d7f5bc3e55d68dca6fe
Commit SHA-1     : 28b75cf26
Commit timestamp : 2021-01-27 16:54:35 +0530
Branch           : master
Go version       : go1.15.6
jemalloc enabled : false

I compiled Dgraph for Windows by adding the following to the Makefile and then running make windows:

windows:
	@GOOS=windows $(MAKE) dgraph
	@mkdir windows
	@mv ./dgraph/dgraph ./windows/dgraph

However, when I start dgraph alpha and zero on the Windows system and post the schema, the hard drive fills up quickly (>100GB in the “t” folder, with no data, just the schema posted), resulting in the following Dgraph error:

(important log lines …)

I0127 18:05:50.622405    3604 log.go:34] Rebuilding index for predicate Denkbox.dockerId (1/2): Streaming about 0 B of uncompressed data (0 B on disk)
I0127 18:05:50.639406    3604 log.go:34] Rebuilding index for predicate ResearchBoxTag.name (1/2): Streaming about 0 B of uncompressed data (0 B on disk)
I0127 18:05:50.716405    3604 log.go:34] Rebuilding index for predicate User.lastSyncedAt (1/2): Streaming about 0 B of uncompressed data (0 B on disk)
fatal error: out of memory allocating heap arena metadata
fatal error: runtime: cannot allocate memory


What I Did

I don’t know what to do. On Linux the same setup results in a 0 GB “t” folder. Maybe it has something to do with jemalloc, which is not enabled on Windows?

Thanks in advance!


Out of curiosity, could you share your schema?

So if I understand this right, you used Ratel, edited the schema, and then Dgraph Alpha started taking up 100GiB of data?

I am not able to share the GraphQL schema; however, I built one with a similar structure.

This one creates a “t” folder with ~10MB and a lot of dgraph_index folders and DISCARD files in it. Our original schema is way bigger, ~1000 lines.

I hope you get some insight from the schema. I have no clue why the same schema produces a 0 MB “t” folder on Linux.

############################################################
##################### CUSTOM LOGIC #########################
############################################################

type TokenPayload @remote {
  token: String!
}

type Query {
  getUserToken(username: String!, role: Role): TokenPayload!
    @custom(
      http: {
        url: "http://localhost:8085/custom"
        method: POST
        graphql: "query($username: String!, $role: Role) {getUserToken(username: $username, role: $role)}"
        forwardHeaders: ["X-Auth"]
        skipIntrospection: true
      }
    )
}

type Mutation {

  login(username: String!, password: String!): TokenPayload!
    @custom(
      http: {
        url: "http://localhost:8085/custom"
        method: POST
        graphql: "mutation($username: String!, $password: String!) {login(username: $username, password: $password)}"
        skipIntrospection: true
      }
    )

}

############################################################
####################### DATABASE ###########################
############################################################

enum Role {
  TENANT
  ADMIN
}

interface Cuid {
  id: String! @id
}


interface Ownable
{
  owner: User!
}

interface Timestamped {
  createdAt: DateTime! @search
  updatedAt: DateTime! @search
}

type User implements Timestamped
 {
  username: String! @id

  password: String!
  email: String!

  permissions: UserPermissions @hasInverse(field: owner)
  projects: [Project!] @hasInverse(field: owner)

}

type UserPermissions implements Cuid & Timestamped & Ownable {
  username: String! @search(by: [hash])

  role: Role!
}

type Project implements Cuid & Timestamped & Ownable @withSubscription {
  name: String! @search(by: [hash])
  description: String! @search(by: [fulltext])

  pinnedProjects: [Project!]
}
# Dgraph.Authorization {"VerificationKey":"thisisthesecretkey","Header":"X-Auth","Namespace":"https://thisisatest.com/jwt/claims","Algo":"HS256"}

I start dgraph alpha and zero via

.\dgraph alpha
.\dgraph zero

and then I post the schema via

curl -X POST localhost:8080/admin/schema --data-binary '@dgraph.graphql'

OK this looks like a bug. Tagging @pawan

Maybe there is a problem with the os.RemoveAll line here?:

I created a Go test program:

package main

import (
	"bufio"
	"fmt"
	"io/ioutil"
	"log"
	"math/rand"
	"os"
	"path"
	"time"
)

func check(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	fmt.Println("Checking RemoveAll with subfiles")

	if _, err := os.Stat("./tmp"); os.IsNotExist(err) {
		os.Mkdir("./tmp", 0700)
	}

	for i := 0; i < 20; i++ {

		dir, err := ioutil.TempDir("./tmp", "dgraph_index_")
		if err != nil {
			fmt.Println("err: ", err)
			log.Fatal(err)
		}
		fmt.Println("dir: ", dir)

		f, err := os.Create(path.Join(dir, "DISCARD"))
		check(err)
		defer f.Close()
		w := bufio.NewWriter(f)
		//choose random number for recipe
		r := rand.New(rand.NewSource(time.Now().UnixNano()))
		i := r.Perm(5)

		_, err = fmt.Fprintf(w, "%v\n", i)
		check(err)
		_, err = fmt.Fprintf(w, "%d\n", i[0])
		check(err)
		_, err = fmt.Fprintf(w, "%d\n", i[1])
		check(err)
		w.Flush()

		defer os.RemoveAll(dir)
	}

}


On Linux everything in tmp gets removed; on Windows nothing gets removed.

EDIT:

Just realized an error in my test program:

defer os.RemoveAll should be at line 30, directly after ioutil.TempDir. With that change it works on Windows.

Is there any news on this? I got the same error on Windows.


Oh right. I managed to reproduce this error. We’re gonna fix it. @ibrahim has set up some Windows test pipelines, which should catch this.


Hey there @chewxy @ibrahim

is there any update on this issue? Can I help to fix it?

Thanks!

Hey @marcown, unfortunately, we haven’t been able to work on this. Please feel free to send a PR if you know what needs to be done.

Quick Fix here:
dgraph → posting/index.go:590

Change to:

dbOpts := badger.DefaultOptions(tmpIndexDir).
		WithSyncWrites(false).
		WithNumVersionsToKeep(math.MaxInt32).
		WithLogger(&x.ToGlog{}).
		WithCompression(options.None).
		WithEncryptionKey(x.WorkerConfig.EncryptionKey).
		WithLoggingLevel(badger.WARNING).
		WithInMemory(true).WithDir("").WithValueDir("")

Note the last line: it makes the temporary Badger instance run in memory (InMemory), and the error no longer appears.

To reproduce the bug:

package main

import (
	"os"
	"path/filepath"

	"github.com/dgraph-io/badger/v3"
)

func main() {

	tmpIndexDir := filepath.Join(".", "test")
	b, err := badger.Open(badger.DefaultOptions(tmpIndexDir))
	if err != nil {
		panic(err)
	}

	err = b.Close()
	if err != nil {
		panic(err)
	}

	err = os.RemoveAll(tmpIndexDir)

	if err != nil {
		panic(err)
	}

}

@Naman maybe you have an idea? I saw you’ve fixed a possibly related bug.


Sorry for asking again, but is there any update here, or any plan for when it can be done? At the moment it’s not usable at all.

Should we maybe post this somewhere else where the topic is more badger related?

We’re currently using docker on windows but that’s not a long-term solution.

Hey @maaft, sorry for the delays.
I see the issue: we were not closing the DISCARD file, and Windows complains about deleting an open file.
Can you please try https://github.com/dgraph-io/badger/tree/naman/test-windows branch and confirm if this fixes the issue for you? I will get it reviewed and merged.


@Naman

Thanks for the reply and the fix. I will test it tomorrow morning ( GMT+1 :wink: ) and report back.


Hey again,
@Naman

I just tested it (could not wait^^):

I did a
go get github.com/dgraph-io/badger/v3@bfb6a35bad430f6b946cceb9a30e69fa0a88ee81
in the latest Dgraph master and then built for Windows as described in the first post.

Sadly the error is still there. But now not only the DISCARD file remains; there are also
00001.vlog
00001.mem
KEYREGISTRY and
MANIFEST

files in the temp folder.

EDIT: The files do get deleted, but quite slowly. During schema posting, the disk space gets used up to the maximum (>65GB on my system).

EDIT2: It seems that they only get deleted after schema posting. Furthermore, RAM usage is really high during schema posting → I cannot run it on my second system.

EDIT3: With an empty db, RAM usage remains high (7.9GB of 8GB) after schema posting on my first system.

Hey @marcown, does this test work fine and delete the directory?
For the performance issues, please have a look at Dropping support for Windows and Mac.

This works.
