Once you have compiled your code, you can enable profiling by putting nvprof in front of the call to the executable. This gives you the time spent in each kernel.
Built-in timing is demonstrated in the MBody1 project. To switch it on, changes need to be made to both the model definition in MBody1.cc (model.setTiming(TRUE);) and the classol_sim.h header file (#define TIMING).
GeNN's part of the timing is to keep the variables neuron_tme etc. filled at all times with the accurate cumulative time spent in the neuron kernel etc. What is done with these values (in the MBody1 example they are written to a file) is the user's choice. At the moment (14-01-2014) the timing only works correctly in the development branch.
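As a rough illustration of the "user choice" part, the sketch below writes a cumulative timer to an output stream. It is not the MBody1 code: the local definition of neuron_tme is only a stand-in so the example compiles on its own, and writeTiming and timingReport are hypothetical helper names.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Stand-in for the cumulative timer mentioned in the text. In a real project
// this variable comes from the generated code; it is defined here only so
// that the sketch is self-contained (assumption).
double neuron_tme = 0.0;

// Hypothetical helper: write one "label value" line to any output stream,
// for example a std::ofstream opened by the user.
void writeTiming(std::ostream &os, const std::string &label, double seconds) {
    os << label << " " << seconds << "\n";
}

// Hypothetical helper: collect the current timer values into a string.
std::string timingReport() {
    std::ostringstream os;
    writeTiming(os, "neuron_tme", neuron_tme);
    return os.str();
}
```

Because the timers are cumulative, calling a function like this once at the end of the run is enough to get the totals. Calling it periodically gives a running total instead.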
GPU simulation results may differ from CPU simulation results, and also from other runs of the same GPU simulation, because the order in which threads execute on the GPU is nondeterministic. This is expected: floating-point addition is not associative, so summing the same values in a different order can round differently.
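The effect can be reproduced on the CPU with three single-precision values. The following is a minimal sketch with hypothetical function names: it sums the same numbers in two different orders, much as GPU threads may do from one run to the next.

```cpp
#include <iostream>

// Sum three floats as (a + b) + c.
float sumLeftToRight(float a, float b, float c) {
    float partial = a + b;
    return partial + c;
}

// Sum the same three floats as a + (b + c).
float sumRightToLeft(float a, float b, float c) {
    float partial = b + c;
    return a + partial;
}
```

With a = 1e20f, b = -1e20f and c = 1.0f, the first order gives 1 and the second gives 0, because 1.0f is lost to rounding when it is added to -1e20f. This assumes the compiler is not allowed to reorder floating-point operations (for example, no fast-math option).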
Values close to zero may also accumulate and produce unwanted results when dense connectivity is used to represent a sparse array, with the non-existent connections set to 0 or to epsilon. If precision is an issue, the user-defined model can check for close-to-zero values before calling the update code in the code snippet, such as:
synapseModel.simCode = tS(" $(addtoinSyn) = $(g);\n \
if($(addtoinSyn) > 1e-15){\n \
$(updatelinsyn);\n \
}\n");
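Note that a one-sided comparison like the one above is false for every negative value, so it would also skip inhibitory (negative) contributions. If a model has those, a magnitude test is one possible variant. The sketch below shows the comparison as plain C++ rather than as a GeNN code string. The helper name isSignificant and the default threshold are assumptions for illustration.

```cpp
#include <cmath>

// Hypothetical helper: true if the value is far enough from zero, in either
// direction, to be worth passing on to the update code.
bool isSignificant(double value, double threshold = 1e-15) {
    return std::fabs(value) > threshold;
}
```

In a code snippet the same test would be written inline, with fabs applied to the value being checked.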
The other day I (TN) was creating a new project to study associative learning in honeybees. I struggled a little to get everything to work properly because I kept making the following mistakes:
- One should always take care with the sizes of data arrays. Most of my errors were generated on the "user side" by using wrong sizes when filling input patterns or direct input arrays.
- When initializing variables like direct input or input patterns for Poisson neurons etc, it is essential to remember to copy them to the GPU memory. CopyStateToDevice() does not cover these variables.
- Be aware that on some GPUs the numerical behaviour of functions may differ from IEEE standards or naively expected values. I was running into trouble with exp(-10000.0) being returned as NaN instead of the expected 0.
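One way to sidestep the exp problem in the last point is to avoid passing very negative arguments to exp at all. The sketch below is a possible guard, not something GeNN provides: safeExp and its default cut-off of -700 are assumptions. The cut-off is chosen so that anything below it would be vanishingly small in double precision anyway; for single precision a cut-off around -87 would play the same role.

```cpp
#include <cmath>

// Hypothetical guard around exp: arguments below minArg are treated as
// giving exactly 0, so the library (or GPU) exp is never called with them.
double safeExp(double x, double minArg = -700.0) {
    if (x < minArg) {
        return 0.0;
    }
    return std::exp(x);
}
```

With this guard, safeExp(-10000.0) returns 0 regardless of how the underlying exp implementation handles extreme arguments.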