
Is there a db-backup related speed bottleneck? #313

Open
sb10 opened this issue Jul 1, 2020 · 0 comments
sb10 commented Jul 1, 2020

rm ~/.wr_production/db && s3cmd -c ~/.s3cfg.sb10 rm s3://sb10/wr_backups/db_bk
wr manager start -s lsf
cat longmult_inputs | perl -ne '($am, $bm) = split("\t", $_); print "perl ./longmult_step1.pl $am $bm\n"' | wr add -i longmult_step1 --cwd_matters -o 2

Got up to ~150 jobs running simultaneously (without fluctuating down), but it was still very slow overall. Is it much faster with backups disabled? Things to try: the new coreos version of boltdb (bbolt); putting the db on a ram disk.
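
As a way to sanity-check the backup/fsync hypothesis in isolation, the sketch below (standalone code, not wr's actual persistence layer; the "jobs" bucket and key names are made up) times bbolt writes with per-commit fsync on and off. NoSync mode should roughly approximate what a ram-disk db would do:

package main

import (
	"fmt"
	"os"
	"time"

	bolt "go.etcd.io/bbolt"
)

// timeWrites creates a fresh db at path and times n single-key Update
// transactions; with noSync true, bolt skips the fsync on every commit.
func timeWrites(path string, noSync bool, n int) time.Duration {
	os.Remove(path)
	db, err := bolt.Open(path, 0600, nil)
	if err != nil {
		panic(err)
	}
	defer db.Close()
	db.NoSync = noSync
	start := time.Now()
	for i := 0; i < n; i++ {
		err := db.Update(func(tx *bolt.Tx) error {
			b, err := tx.CreateBucketIfNotExists([]byte("jobs"))
			if err != nil {
				return err
			}
			return b.Put([]byte(fmt.Sprintf("job%d", i)), []byte("state"))
		})
		if err != nil {
			panic(err)
		}
	}
	return time.Since(start)
}

func main() {
	fmt.Println("synced commits:", timeWrites("/tmp/bolt_bench.db", false, 1000))
	fmt.Println("NoSync commits:", timeWrites("/tmp/bolt_bench.db", true, 1000))
}

If the two times differ by an order of magnitude, per-commit fsync would be a plausible place to look for the ceiling.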

It currently takes 22s (with or without backups), which looks bottlenecked: going from 24 to 80..150 simultaneous runners only brought the time down from 33s to 22s, whereas roughly linear scaling would predict something closer to 10s.
Updating to bbolt made no difference to performance.

wr manager stop && rm ~/.wr_production/db && s3cmd -c ~/.s3cfg.sb10 rm s3://sb10/wr_backups/db_bk
wr manager start
cat longmult_inputs | perl -ne '($am, $bm) = split("\t", $_); print "perl ./longmult_step1.pl $am $bm\n"' | wr add -i longmult_step1 --cwd_matters -o 2 -r 0

Seemed to run fine, but then it abruptly hung with 24 runners going but not completing. wr status said it "could not reach the server" and it never recovered: the manager couldn't be stopped, and ~/.wr_production/.db_bk_mount/sb10/ couldn't even be listed with ls without that hanging too. Did fuse just wedge?
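
For diagnosing this kind of wedge without hanging yet another shell, a small probe like the one below (an illustrative sketch; the 5s deadline and the use of stat are arbitrary choices) checks the mount point under a timeout, so a dead fuse mount shows up as a timeout instead of a hung ls:

package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"time"
)

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		panic(err)
	}
	// The mount point from this issue.
	mount := filepath.Join(home, ".wr_production", ".db_bk_mount", "sb10")

	// Run stat under a deadline: a healthy mount answers immediately,
	// while a wedged fuse mount leaves stat blocked until the context
	// expires and the process is killed.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	err = exec.CommandContext(ctx, "stat", mount).Run()
	switch {
	case ctx.Err() == context.DeadlineExceeded:
		fmt.Println("mount appears wedged: stat did not return within 5s")
	case err != nil:
		fmt.Println("stat failed:", err)
	default:
		fmt.Println("mount responded")
	}
}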

sb10 added the question label Jul 1, 2020