
Memory efficient implementation for the update() in OutputLagFeatureProcessor #14

Open
garyfanhku opened this issue Jul 20, 2021 · 1 comment

Comments

@garyfanhku

Hi Jinyu,

Regarding update() in OutputLagFeatureProcessor: this is perhaps a little off-topic, but I wonder what would be more memory-efficient than the deque used here?

Thanks!

@jxx123
Owner

jxx123 commented Jul 20, 2021

Hi,

My memory of this is a bit vague now. I originally thought a better way would be to keep track of the lags without duplicating the original data, and to generate the final feature matrix only at the end. For example, what we have in the beginning is just a vector, [x1, x2, x3]. Once we add lag features, we have a matrix [[x1, x2, x3], [None, x1, x2], [None, None, x1]]. All we actually need to construct this matrix is the vector [x1, x2, x3] and the lags 0, 1, 2. Doing this reduces the memory footprint while the data is being accumulated, but it might increase the computation time, since you need to reconstruct the matrix at the end, and at that point you still need enough memory to store the full matrix. It is a trade-off. These lag processors were implemented in a messy way, so any refactoring work is welcome. Feel free to send me PRs :).
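
A minimal sketch of the idea, assuming plain Python lists; `build_lag_matrix` is a hypothetical helper for illustration and not something that exists in the repository. Only the raw vector and the list of lags are stored, and the lag matrix is materialized on demand:

```python
def build_lag_matrix(values, lags):
    """Hypothetical helper: build one row per lag from a single stored vector.

    Row i is `values` shifted right by lags[i], padded with None at the front,
    so every row has the same length as `values`.
    """
    n = len(values)
    matrix = []
    for lag in lags:
        if lag < 0:
            raise ValueError("lags must be non-negative")
        pad = min(lag, n)  # a lag longer than the vector yields an all-None row
        matrix.append([None] * pad + list(values[: n - pad]))
    return matrix


# Only these two small objects need to be kept while data accumulates.
series = [1, 2, 3]
lags = [0, 1, 2]

# The full matrix is built only when it is actually needed.
print(build_lag_matrix(series, lags))
# [[1, 2, 3], [None, 1, 2], [None, None, 1]]
```

This shows the trade-off described above: storage stays at the size of the vector plus the lags until the matrix is requested, and the cost of building it (and the memory for the result) is paid at that point.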
