Hello! I've found a performance issue in utils.py: .batch(MODEL_PARAMS['batch_size']) (line 72) should be called before .map(parse_example_helper_csv, num_parallel_calls=8) (line 46), which could make your program more efficient. The TensorFlow tf.data performance documentation supports this.
Besides, you need to check whether the function parse_example_helper_csv called in .map(parse_example_helper_csv, num_parallel_calls=8) is affected, so that the changed code still works properly. For example, if parse_example_helper_csv expects data with shape (x, y, z) as its input before the fix, it would have to handle data with shape (batch_size, x, y, z) after the fix.
Looking forward to your reply. By the way, I would be glad to create a PR to fix it if you are too busy.
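A minimal sketch of the proposed reordering, assuming a plain CSV-based input_fn; the column names, record defaults, and batch size below are placeholders, not the repo's actual utils.py:

```python
import tensorflow as tf

MODEL_PARAMS = {'batch_size': 512}  # placeholder value

def parse_example_helper_csv(lines):
    # After the reordering, this helper receives a whole batch of CSV lines
    # (shape [batch_size]) instead of one scalar line, so it must parse in a
    # vectorized way; tf.io.decode_csv accepts batched string input.
    record_defaults = [[0.0]] * 10                       # placeholder schema
    columns = tf.io.decode_csv(lines, record_defaults=record_defaults)
    features = dict(zip(['f{}'.format(i) for i in range(9)], columns[:-1]))
    return features, columns[-1]

def input_fn(csv_path):
    dataset = tf.data.TextLineDataset(csv_path).skip(1)  # skip header row
    # Before: .map(parse_example_helper_csv, num_parallel_calls=8).batch(...)
    # After:  batch first, then run the parser once per batch (vectorized map)
    dataset = (dataset
               .batch(MODEL_PARAMS['batch_size'])
               .map(parse_example_helper_csv, num_parallel_calls=8))
    return dataset
```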
@DLPerf Thanks for pointing this out! Honestly I haven't paid much attention to performance before >< I just took a look at that performance doc and found there are actually multiple ways to speed up the tf.data pipeline ^O^ !
speed up data transformation:
sequential mapping -> parallel mapping, by using the num_parallel_calls argument that I already use in the code
scalar mapping -> vectorized mapping, by calling batch before map as you proposed
speed up data extraction: sequential extraction -> parallel extraction, by using interleave. But to use this, I think we would need to split the training samples into multiple TFRecord files in advance?
parallelize the above ops with training, by using prefetch.
Maybe we can add the other 2 also (a rough sketch is below)? Currently I am indeed not available to manage this repo, could you please help me fix this? Much appreciated!
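A hedged sketch of how interleave and prefetch could fit together, assuming the training data has been split into TFRecord shards; the file pattern, feature spec, and parse_example_helper_tfrecord helper below are illustrative assumptions, not the repo's actual code:

```python
import tensorflow as tf

def parse_example_helper_tfrecord(serialized_batch):
    # Assumed batched tf.Example parser; the real feature spec would differ.
    feature_spec = {'feature': tf.io.FixedLenFeature([10], tf.float32),
                    'label': tf.io.FixedLenFeature([1], tf.float32)}
    parsed = tf.io.parse_example(serialized_batch, feature_spec)
    return {'feature': parsed['feature']}, parsed['label']

def input_fn(file_pattern, batch_size):
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    # Parallel extraction: read several TFRecord shards concurrently.
    dataset = files.interleave(
        tf.data.TFRecordDataset,
        cycle_length=4,
        num_parallel_calls=tf.data.experimental.AUTOTUNE)
    dataset = (dataset
               .shuffle(10000)
               .batch(batch_size)                            # vectorized mapping
               .map(parse_example_helper_tfrecord,
                    num_parallel_calls=tf.data.experimental.AUTOTUNE)
               .prefetch(tf.data.experimental.AUTOTUNE))     # overlap with training
    return dataset
```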