Potential bug in Synchronizer.get_failed_() #15

Open
eamartin opened this issue Dec 7, 2016 · 0 comments
eamartin commented Dec 7, 2016

I'm considering adding TensorFlow bindings to persistent-rnn, and I've been reading the codebase for my own understanding.

In https://github.com/baidu-research/persistent-rnn/blob/master/include/prnn/detail/rnn/synchronizer.h#L86

    bool get_failed_() const {
        index_t failed = 0;

        prnn::parallel::CudaRuntimeLibrary::cudaMemcpyAsync(&failed,
            get_barrier_failed_flag(), sizeof(index_t),
            prnn::parallel::CudaRuntimeLibrary::cudaMemcpyDefault, stream_);

        return failed != 0;
    }

It seems to me that this has a race condition between the function returning (failed going out of scope) and the cudaMemcpyAsync completing. Is this the case? If not, what causes the memcpy to synchronize before the function returns?
There are several other similar patterns in Synchronizer.
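For reference, here is a rough sketch of what I would have expected if the intent is to read the flag synchronously on the host. I'm assuming the CudaRuntimeLibrary wrapper also exposes cudaStreamSynchronize; if not, a plain synchronous copy would serve the same purpose:

    bool get_failed_() const {
        index_t failed = 0;

        prnn::parallel::CudaRuntimeLibrary::cudaMemcpyAsync(&failed,
            get_barrier_failed_flag(), sizeof(index_t),
            prnn::parallel::CudaRuntimeLibrary::cudaMemcpyDefault, stream_);

        // Wait for the async copy to land in 'failed' before reading it;
        // otherwise the host may read (and pop) the stack slot before the
        // copy completes. (Assumes the wrapper exposes cudaStreamSynchronize;
        // the raw CUDA runtime call would do the same.)
        prnn::parallel::CudaRuntimeLibrary::cudaStreamSynchronize(stream_);

        return failed != 0;
    }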

Thanks for any comment on this "maybe not an issue but just a question about code" issue :)
