s3-mounting jobs were causing millions of 500 errors from ceph. From the user's end, nothing seems wrong; jobs completed fine. After completing 32k irods upload jobs (the ones that actually read the cram from S3), the mean walltime was only 26s.
slowread.pl:
#!/usr/bin/env perl
# Read a file in small, randomly sized chunks with short random pauses,
# to mimic a slow consumer of a cram on the S3 mount.
use warnings;
use strict;
use Time::HiRes qw(usleep);

open my $f, "<", shift or die($!);
binmode($f);

my $buf;
# read between 10 bytes and ~100KB per call
while (my $len = read($f, $buf, 100000 * rand() + 10)) {
    usleep 1000 * rand(5);    # pause up to 5ms between reads
}
close $f;
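To reproduce, slowread.pl would be pointed at a cram on the S3 mount; the path here is only an example, not a real location:

perl slowread.pl /path/to/s3mount/sample.cram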
irods_startup.sh:

Confirmed the issue! Apparently, even though baton is normally single-threaded and does a simple streaming read, if the file is large it uses the irods put code (a call into its C API) to do the putting, and that code is multi-threaded and reads chunks in parallel.
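For illustration only, the access pattern that triggers this looks roughly like the sketch below: several threads each open the same file and read their own byte range. This is not baton's actual code; the thread count and read size are made up.

#!/usr/bin/env perl
# Illustration of parallel ranged reads (not baton's real code): each worker
# seeks to its own offset and reads its slice, so a FUSE/S3 mount that maps
# reads onto HTTP range requests sees several concurrent ranges per file.
use warnings;
use strict;
use threads;

my $path     = shift or die "usage: $0 <file>\n";
my $size     = -s $path or die "cannot stat $path\n";
my $nthreads = 4;                              # made-up thread count
my $slice    = int($size / $nthreads) + 1;

my @workers;
for my $i (0 .. $nthreads - 1) {
    push @workers, threads->create(sub {
        open my $fh, "<", $path or die $!;
        binmode($fh);
        seek($fh, $i * $slice, 0) or die $!;
        my ($buf, $left) = (undef, $slice);
        while ($left > 0) {
            my $want = $left < 65536 ? $left : 65536;
            my $len = read($fh, $buf, $want) or last;    # stop at EOF
            $left -= $len;
        }
        close $fh;
    });
}
$_->join for @workers;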
Confirm by forcing normal single-threaded behaviour:
No problems. Double check slow reads don't cause it on uncached crams:

No problems.

Try turning on caching for the S3 mount?

Problem remains. Would have to have a cached mode option that meant "if a file is read at all, read and cache all of it; don't do any range requests".
Or better, instead of doing range requests to EOF, do them in 4MB chunks behind the scenes (or start with small chunks, learn how much is actually being read, and use that size), as in the sketch below.
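A minimal sketch of what that adaptive chunking could look like, assuming the mount can remember how much of the previously fetched chunk the reader actually consumed. The 128KB floor, the doubling policy and the function name are all illustrative, not existing mount code; only the 4MB cap comes from the suggestion above.

#!/usr/bin/env perl
# Sketch of adaptive read-ahead chunking (illustrative only): instead of one
# range request from the read offset to EOF, fetch bounded chunks and let the
# chunk size grow only while the reader keeps consuming what was fetched.
use warnings;
use strict;

use constant MIN_CHUNK => 128 * 1024;         # illustrative floor
use constant MAX_CHUNK => 4 * 1024 * 1024;    # the 4MB cap suggested above

# Decide the next backend range request given how the last one was used.
sub next_chunk {
    my ($offset, $last_size, $consumed) = @_;
    my $size;
    if (!$last_size) {
        $size = MIN_CHUNK;                    # first request: start small
    }
    elsif ($consumed >= $last_size) {
        $size = $last_size * 2;               # fully consumed: grow
        $size = MAX_CHUNK if $size > MAX_CHUNK;
    }
    else {
        # reader used only part of the chunk: settle on roughly that much
        $size = $consumed > MIN_CHUNK ? $consumed : MIN_CHUNK;
    }
    return ($offset, $size);
}

# Toy demonstration: a reader that consumes everything ramps 128KB up to 4MB
# and then stays there, rather than ever requesting offset..EOF in one go.
my ($off, $last, $used) = (0, 0, 0);
for (1 .. 8) {
    my ($start, $size) = next_chunk($off, $last, $used);
    printf "range request: offset=%d length=%d\n", $start, $size;
    ($off, $last, $used) = ($start + $size, $size, $size);
}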