What I Want to Do
I want to use Dgraph on Windows. I am using the latest master:
.\dgraph.exe version
[Decoder]: Using assembly version of decoder
Page Size: 4096
Dgraph version : v20.11.0-g28b75cf26
Dgraph codename : unnamed-mod
Dgraph SHA-256 : 276a70655e4c56f996a080590645d200808c7d5b71677d7f5bc3e55d68dca6fe
Commit SHA-1 : 28b75cf26
Commit timestamp : 2021-01-27 16:54:35 +0530
Branch : master
Go version : go1.15.6
jemalloc enabled : false
I compiled Dgraph for Windows by adding the following target to the Makefile and then running make windows:
windows:
	@GOOS=windows $(MAKE) dgraph
	@mkdir windows
	@mv ./dgraph/dgraph ./windows/dgraph
However, when I start Dgraph Alpha and Zero on the Windows system and post the schema, the hard drive fills up quickly (>100 GB in the “t” folder, with no data, only the schema posted), and Dgraph fails with the following error:
(important log lines …)
I0127 18:05:50.622405 3604 log.go:34] Rebuilding index for predicate Denkbox.dockerId (1/2): Streaming about 0 B of uncompressed data (0 B on disk)
I0127 18:05:50.639406 3604 log.go:34] Rebuilding index for predicate ResearchBoxTag.name (1/2): Streaming about 0 B of uncompressed data (0 B on disk)
I0127 18:05:50.716405 3604 log.go:34] Rebuilding index for predicate User.lastSyncedAt (1/2): Streaming about 0 B of uncompressed data (0 B on disk)
fatal error: out of memory allocating heap arena metadata
fatal error: runtime: cannot allocate memory
What I Did
I don’t know what to do. On Linux the same setup results in a 0 GB “t” folder. Maybe it has something to do with jemalloc not being enabled on Windows?
Thanks in advance!
chewxy
(chewxy)
January 28, 2021, 12:00am
2
Out of curiosity, could you share your schema?
So if I understand this right, you used Ratel, edited the schema, and then Dgraph Alpha started taking up 100GiB of data?
I am not able to share the GraphQL schema; however, I built one with a similar structure.
This one creates a “t” folder of ~10 MB with a lot of dgraph_index folders and DISCARD files in it. Our original schema is much bigger, ~1000 lines.
I hope you get some insight from the schema. I have no clue why the same schema produces a 0 MB “t” folder on Linux.
############################################################
##################### CUSTOM LOGIC #########################
############################################################
type TokenPayload @remote {
  token: String!
}

type Query {
  getUserToken(username: String!, role: Role): TokenPayload!
    @custom(
      http: {
        url: "http://localhost:8085/custom"
        method: POST
        graphql: "query($username: String!, $role: Role) {getUserToken(username: $username, role: $role)}"
        forwardHeaders: ["X-Auth"]
        skipIntrospection: true
      }
    )
}

type Mutation {
  login(username: String!, password: String!): TokenPayload!
    @custom(
      http: {
        url: "http://localhost:8085/custom"
        method: POST
        graphql: "mutation($username: String!, $password: String!) {login(username: $username, password: $password)}"
        skipIntrospection: true
      }
    )
}
############################################################
####################### DATABASE ###########################
############################################################
enum Role {
  TENANT
  ADMIN
}

interface Cuid {
  id: String! @id
}

interface Ownable {
  owner: User!
}

interface Timestamped {
  createdAt: DateTime! @search
  updatedAt: DateTime! @search
}

type User implements Timestamped {
  username: String! @id
  password: String!
  email: String!
  permissions: UserPermissions @hasInverse(field: owner)
  projects: [Project!] @hasInverse(field: owner)
}

type UserPermissions implements Cuid & Timestamped & Ownable {
  username: String! @search(by: [hash])
  role: Role!
}

type Project implements Cuid & Timestamped & Ownable @withSubscription {
  name: String! @search(by: [hash])
  description: String! @search(by: [fulltext])
  pinnedProjects: [Project!]
}
# Dgraph.Authorization {"VerificationKey":"thisisthesecretkey","Header":"X-Auth","Namespace":"https://thisisatest.com/jwt/claims","Algo":"HS256"}
I start dgraph alpha and zero via
.\dgraph alpha
.\dgraph zero
and then I post the schema via
curl -X POST localhost:8080/admin/schema --data-binary '@dgraph.graphql'
chewxy
(chewxy)
January 28, 2021, 11:20pm
4
OK this looks like a bug. Tagging @pawan
Maybe there is a problem with the os.RemoveAll line here?
		glog.Infof("maxassigned is 0, no indexing work for predicate %s", r.attr)
		return nil
	}
	// We write the index in a temporary badger first and then,
	// merge entries before writing them to p directory.
	tmpIndexDir, err := ioutil.TempDir(x.WorkerConfig.TmpDir, "dgraph_index_")
	if err != nil {
		return errors.Wrap(err, "error creating temp dir for reindexing")
	}
	defer os.RemoveAll(tmpIndexDir)
	glog.V(1).Infof("Rebuilding indexes using the temp folder %s\n", tmpIndexDir)
	dbOpts := badger.DefaultOptions(tmpIndexDir).
		WithSyncWrites(false).
		WithNumVersionsToKeep(math.MaxInt32).
		WithLogger(&x.ToGlog{}).
		WithCompression(options.None).
		WithEncryptionKey(x.WorkerConfig.EncryptionKey).
		WithLoggingLevel(badger.WARNING)
I created a Go test program:
package main

import (
	"bufio"
	"fmt"
	"io/ioutil"
	"log"
	"math/rand"
	"os"
	"path"
	"time"
)

func check(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	fmt.Println("Checking RemoveAll with subfiles")
	if _, err := os.Stat("./tmp"); os.IsNotExist(err) {
		os.Mkdir("./tmp", 0700)
	}
	for i := 0; i < 20; i++ {
		dir, err := ioutil.TempDir("./tmp", "dgraph_index_")
		if err != nil {
			fmt.Println("err: ", err)
			log.Fatal(err)
		}
		fmt.Println("dir: ", dir)
		f, err := os.Create(path.Join(dir, "DISCARD"))
		check(err)
		defer f.Close()
		w := bufio.NewWriter(f)
		//choose random number for recipe
		r := rand.New(rand.NewSource(time.Now().UnixNano()))
		i := r.Perm(5)
		_, err = fmt.Fprintf(w, "%v\n", i)
		check(err)
		_, err = fmt.Fprintf(w, "%d\n", i[0])
		check(err)
		_, err = fmt.Fprintf(w, "%d\n", i[1])
		check(err)
		w.Flush()
		defer os.RemoveAll(dir)
	}
}
On Linux everything in tmp gets removed; on Windows nothing gets removed.
EDIT:
Just realized an error in my test program:
defer os.RemoveAll should be at line 30, directly after ioutil.TempDir. With that change it works on Windows.
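The EDIT above comes down to the order in which deferred calls run: Go executes them last-in, first-out, so a RemoveAll deferred after f.Close() runs while the file is still open. Here is a minimal sketch of that ordering. The function deferOrder and its parameter are hypothetical names for illustration, and the strings only stand in for the real f.Close() and os.RemoveAll calls.

```go
package main

import "fmt"

// deferOrder records the order in which two deferred calls actually run.
// "close" stands in for f.Close() and "remove" for os.RemoveAll(dir).
// If removeDeferredFirst is true, the removal is deferred before the close,
// which mirrors moving defer os.RemoveAll directly after ioutil.TempDir.
func deferOrder(removeDeferredFirst bool) []string {
	var order []string
	func() {
		if removeDeferredFirst {
			// Deferred first, therefore executed last.
			defer func() { order = append(order, "remove") }()
			defer func() { order = append(order, "close") }()
		} else {
			// Same order as the original test program: close, then remove.
			defer func() { order = append(order, "close") }()
			defer func() { order = append(order, "remove") }()
		}
	}()
	return order
}

func main() {
	fmt.Println(deferOrder(false)) // [remove close]: removal runs while the file is open
	fmt.Println(deferOrder(true))  // [close remove]: file is closed before removal
}
```

With the original ordering the removal is attempted before the close, which Linux tolerates but Windows does not.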
maaft
January 29, 2021, 2:31pm
7
Is there any news on this? I got the same error on Windows.
chewxy
(chewxy)
January 31, 2021, 3:05am
8
Oh right. I managed to reproduce this error. We’re gonna fix it. @ibrahim has set up some Windows test pipelines, which should catch this.
marcown
February 10, 2021, 11:45am
10
Hey there @chewxy @ibrahim
Is there any update on this issue? Can I help to fix it?
Thanks!
ibrahim
(Ibrahim Jarif)
February 10, 2021, 5:16pm
11
Hey @marcown, unfortunately, we haven’t been able to work on this. Please feel free to send a PR if you know what needs to be done.
maaft
February 11, 2021, 3:25pm
14
Quick fix:
dgraph → posting/index.go:590
Change to:
	dbOpts := badger.DefaultOptions(tmpIndexDir).
		WithSyncWrites(false).
		WithNumVersionsToKeep(math.MaxInt32).
		WithLogger(&x.ToGlog{}).
		WithCompression(options.None).
		WithEncryptionKey(x.WorkerConfig.EncryptionKey).
		WithLoggingLevel(badger.WARNING).
		WithInMemory(true).WithDir("").WithValueDir("")
Note the last line: it makes the temporary Badger instance run in memory, and with that change the error no longer appears.
To reproduce the bug:
package main

import (
	"os"
	"path/filepath"

	"github.com/dgraph-io/badger/v3"
)

func main() {
	tmpIndexDir := filepath.Join(".", "test")
	b, err := badger.Open(badger.DefaultOptions(tmpIndexDir))
	if err != nil {
		panic(err)
	}
	err = b.Close()
	if err != nil {
		panic(err)
	}
	err = os.RemoveAll(tmpIndexDir)
	if err != nil {
		panic(err)
	}
}
@Naman, maybe you have an idea? I saw you’ve fixed a possibly related bug.
marcown
February 17, 2021, 7:33am
15
Sorry for asking again, but is there any update here, or any plan for when this can be done? At the moment it’s not usable at all.
maaft
February 23, 2021, 2:55pm
16
Should we maybe post this somewhere else, where the topic is more Badger-related?
We’re currently using docker on windows but that’s not a long-term solution.
Naman
(Naman Jain)
February 23, 2021, 6:56pm
17
Hey @maaft, sorry for the delays.
I see the issue is that we were not closing the DISCARD file, and Windows complains about deleting an open file.
Can you please try the https://github.com/dgraph-io/badger/tree/naman/test-windows branch and confirm whether this fixes the issue for you? I will get it reviewed and merged.
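The rule behind this diagnosis (close every handle inside a directory before removing it) can be sketched with the standard library alone. This is only an illustration, not the Badger fix itself: closeThenRemove is a hypothetical helper, the DISCARD file it creates is an empty stand-in, and os.MkdirTemp assumes Go 1.16 or newer (on older toolchains ioutil.TempDir does the same job).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// closeThenRemove creates a temporary directory containing an open DISCARD
// file, closes the file, and only then removes the directory. It reports
// whether the directory is gone afterwards.
func closeThenRemove() (bool, error) {
	dir, err := os.MkdirTemp("", "dgraph_index_")
	if err != nil {
		return false, err
	}
	f, err := os.Create(filepath.Join(dir, "DISCARD"))
	if err != nil {
		os.RemoveAll(dir)
		return false, err
	}
	// Close the handle first: as described above, Windows refuses to
	// delete a file that is still open.
	if err := f.Close(); err != nil {
		return false, err
	}
	if err := os.RemoveAll(dir); err != nil {
		return false, err
	}
	_, statErr := os.Stat(dir)
	return os.IsNotExist(statErr), nil
}

func main() {
	removed, err := closeThenRemove()
	fmt.Println("removed:", removed, "err:", err)
}
```

Checking the error returned by os.RemoveAll, rather than discarding it in a defer, also makes a leftover temp folder visible instead of silent.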
marcown
February 23, 2021, 7:01pm
18
@Naman
Thanks for the reply and the fix. I will test it tomorrow morning (GMT+1) and report back.
marcown
February 23, 2021, 9:40pm
19
Hey again,
@Naman
I just tested it (could not wait^^):
I did a
go get github.com/dgraph-io/badger/v3@bfb6a35bad430f6b946cceb9a30e69fa0a88ee81
in the latest dgraph master and then built for Windows as described in the first post.
Sadly the error is still there. But now not only the DISCARD file remains; the following files are also left in the temp folder:
00001.vlog
00001.mem
KEYREGISTRY
MANIFEST
EDIT: The files do get deleted, but quite slowly. While the schema is being posted, disk usage grows to the maximum (>65 GB on my system).
EDIT2: It seems they only get deleted after the schema has been posted. Furthermore, RAM usage is really high while posting the schema, so I cannot run it on my second system.
EDIT3: With an empty DB, RAM usage remains high (7.9 GB of 8 GB) after posting the schema on my first system.
Naman
(Naman Jain)
February 24, 2021, 6:40am
20
maaft:
To reproduce the bug:
Hey @marcown, does this test work and delete the directory?
For the performance issues, please have a look at “Dropping support for Windows and Mac”.