Support multiple file formats for the raw data #5
I don't really know the best way to proceed here. On one hand, there is neo, a Python package meant to read/write various file formats in a fast and efficient way. On the other hand, we have also recoded numerous wrappers on our side, similar to the ones you'll find in neo but lighter, for the internal needs of SpyKING CIRCUS. Since neo is more structured, maybe it is the right way to go? It would be amazing if phy could display several native/proprietary file formats, as numerous users struggle simply to export their data into raw binary...
Could you point me to the code of your wrappers?
The code is here: https://github.com/spyking-circus/spyking-circus/tree/master/circus/files
@rossant: There are two levels for reading in neo, neo.io and neo.rawio:
https://github.com/NeuralEnsemble/python-neo https://neo.readthedocs.io/en/latest/rawio.html Reading ephys formats has been done in many places (circus, neo, spikeextractors, and many individual wrappers for particular formats). It is a total dispersion of energy. Neo has a strong API with two levels that supports multiple blocks, multiple segments, signals with multiple sample rates, events, epochs, spikes and waveforms. I really think that Pierre should move all the wrappers into neo, and Cyril, you should use neo.rawio. Also note that lazy reading has recently been incorporated into neo.io, so that is also a solution you could use.
I think moving to the Neo wrappers would be the solution for us. I just never managed to take the time, but this is an open issue with SC :-) One day, it will happen.
Thanks @yger and @samuelgarcia, I'll have a look at this soon. I agree that we should reuse the same code as much as possible. For phy, what I'll need is a function with the following signature: read_raw_data(data_files, n_channels_dat=None, dtype=None, offset=None, sample_rate=None) which returns a single memmap NumPy array with (virtual) shape I already have this function for raw binary files. Parameters like Can neo be used to write such a wrapper function?
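For the raw binary case, the function described above could look roughly like the sketch below. It is not phy's actual implementation: `memmap_raw_binary`, `VirtualConcat` and `read_raw_data_sketch` are hypothetical names, and the sketch assumes a flat, headerless, sample-major layout with the same channel count, dtype and offset in every file.

```python
import os
import numpy as np


def memmap_raw_binary(path, n_channels_dat, dtype, offset=0):
    """Memory-map one flat binary file as (n_samples, n_channels_dat)."""
    dtype = np.dtype(dtype)
    n_bytes = os.path.getsize(path) - offset
    n_samples = n_bytes // (dtype.itemsize * n_channels_dat)
    return np.memmap(path, dtype=dtype, mode="r", offset=offset,
                     shape=(n_samples, n_channels_dat))


class VirtualConcat:
    """Hypothetical stand-in for a virtual concatenation along the time axis."""

    def __init__(self, arrays):
        self.arrays = list(arrays)
        lengths = [a.shape[0] for a in self.arrays]
        # starts[i] is the global index of the first sample of file i.
        self.starts = np.concatenate(([0], np.cumsum(lengths)))
        self.shape = (int(self.starts[-1]), self.arrays[0].shape[1])
        self.dtype = self.arrays[0].dtype

    def read(self, start, stop):
        """Return samples [start, stop) gathered from the underlying files."""
        chunks = []
        for arr, lo in zip(self.arrays, self.starts[:-1]):
            a, b = max(start, lo), min(stop, lo + arr.shape[0])
            if a < b:
                chunks.append(arr[a - lo:b - lo])
        if not chunks:
            return np.empty((0, self.shape[1]), dtype=self.dtype)
        return np.concatenate(chunks, axis=0)


def read_raw_data_sketch(data_files, n_channels_dat=None, dtype=None,
                         offset=None, sample_rate=None):
    """Raw binary case only; sample_rate is accepted but not needed here."""
    maps = [memmap_raw_binary(f, n_channels_dat, dtype, offset or 0)
            for f in data_files]
    return VirtualConcat(maps)
```

Only the requested time window is copied into memory; the files themselves stay memory-mapped.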
Yes. So I think we should all converge on neo as a dependency and centralize all the individual wrappers. Neo can do what you want, I think, and expose such functions. The problem is that different file formats require different inputs (sampling rate, number of channels, ...). Some have everything in the header, some only partial information. Not a big deal: in phy, you just need to know how to handle this in params.py, I guess.
I think it should be easy to make a proxy object between neo.rawio and this function. But why don't you directly use the neo.rawio API, which explicitly supports multiple segments and lazy reading, instead of this virtual object inside phy? Note that in your case the offset can change from one file to another, so this function can lead to problems.
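A proxy of the kind mentioned above could be sketched as an array-like object that forwards each time slice to a chunk-reading callable. Everything here is hypothetical: `LazyArrayProxy` and `read_chunk` are illustrative names, and no neo call is used, so nothing below should be read as neo's API.

```python
import numpy as np


class LazyArrayProxy:
    """Array-like object that fetches samples only when it is sliced."""

    def __init__(self, n_samples, n_channels, read_chunk):
        self.shape = (n_samples, n_channels)
        # Hypothetical backend: callable(i_start, i_stop) -> (n, n_channels) array.
        self._read_chunk = read_chunk

    def __len__(self):
        return self.shape[0]

    def __getitem__(self, item):
        rows, cols = item if isinstance(item, tuple) else (item, slice(None))
        if not isinstance(rows, slice):
            raise TypeError("this sketch only supports slices along time")
        start, stop, step = rows.indices(self.shape[0])
        if step != 1:
            raise ValueError("this sketch only supports contiguous time slices")
        return self._read_chunk(start, stop)[:, cols]


# Stand-in backend for demonstration: an in-memory array plus a call log.
backing = np.arange(20).reshape(10, 2)
calls = []


def fake_read_chunk(i_start, i_stop):
    calls.append((i_start, i_stop))
    return backing[i_start:i_stop]


proxy = LazyArrayProxy(10, 2, fake_read_chunk)
window = proxy[3:7, 1]  # triggers a single backend read of samples 3 to 6
```

Because the proxy only exposes `shape` and slicing, code that indexes a memmap along time could use it in the same way, while the backend decides how the bytes are actually fetched.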
Few formats need parameters as input, except raw binary.
These parameters would only be used for the raw binary format, which is what we use at the moment. If the file format is different, these parameters would be None and simply discarded, since they would be parsed from the files themselves. The course of action I gave is the path of least effort for me, since I wouldn't have to change anything in phy. The virtual concatenation object we have already works very well for us and ideally, we'd use it in the same way for all file formats. What's the difference between lazy read and memmap?
Some files don't store the data as one contiguous block in the file.
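To illustrate the point, here is a sketch with a made-up file layout in which every block of samples is preceded by a small header. A single plain memmap of shape (n_samples, n_channels) cannot describe such a file directly, because header bytes are interleaved with the samples, whereas a lazy reader can seek to each block on demand. The block layout and all names and constants below are assumptions for the example, not a real format.

```python
import numpy as np

HEADER_BYTES = 16    # hypothetical per-block header size
BLOCK_SAMPLES = 4    # hypothetical number of samples per block
N_CHANNELS = 2
DTYPE = np.dtype("<i2")
BLOCK_DATA_BYTES = BLOCK_SAMPLES * N_CHANNELS * DTYPE.itemsize
BLOCK_BYTES = HEADER_BYTES + BLOCK_DATA_BYTES


def write_blocked_file(path, data):
    """Write data of shape (n_samples, N_CHANNELS) as header + payload blocks."""
    assert data.shape[0] % BLOCK_SAMPLES == 0
    with open(path, "wb") as f:
        for start in range(0, data.shape[0], BLOCK_SAMPLES):
            f.write(b"\x00" * HEADER_BYTES)
            f.write(data[start:start + BLOCK_SAMPLES].astype(DTYPE).tobytes())


def lazy_read(path, start, stop):
    """Read samples [start, stop) by seeking to each block that overlaps them."""
    if stop <= start:
        return np.empty((0, N_CHANNELS), dtype=DTYPE)
    chunks = []
    with open(path, "rb") as f:
        first_block = start // BLOCK_SAMPLES
        last_block = (stop - 1) // BLOCK_SAMPLES
        for block in range(first_block, last_block + 1):
            # Skip the header, then read only this block's payload.
            f.seek(block * BLOCK_BYTES + HEADER_BYTES)
            payload = np.frombuffer(f.read(BLOCK_DATA_BYTES), dtype=DTYPE)
            payload = payload.reshape(BLOCK_SAMPLES, N_CHANNELS)
            lo = max(start - block * BLOCK_SAMPLES, 0)
            hi = min(stop - block * BLOCK_SAMPLES, BLOCK_SAMPLES)
            chunks.append(payload[lo:hi])
    return np.concatenate(chunks, axis=0)
```

The caller sees the same (n_samples, n_channels) indexing in both cases; the difference is whether the operating system maps the bytes directly or the reader assembles each requested window itself.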