The function `get_noisy_distribution_of_attributes` only gets a partial distribution. This bug was introduced in commit 1abe702. Here is the relevant code as it appears in master (currently commit be8b65a):
In particular, `full_space.append` does not modify `full_space`; instead, it returns a new object. (This seems to be true for all versions of pandas.) As a result, `full_space` does not store all of the intended rows but, rather, at most the first 1000000.
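To illustrate the failure mode, here is a minimal sketch (not the project's actual code). The helper names `collect_rows_buggy` and `collect_rows_fixed` and the toy chunks are hypothetical. It uses `pd.concat` rather than `DataFrame.append`, since `append` was removed in pandas 2.0, but the behavior is the same: the call returns a new frame and leaves its inputs untouched.

```python
import pandas as pd


def collect_rows_buggy(chunks):
    """Mimic the reported pattern: the combined frame is built but discarded."""
    full_space = pd.DataFrame()
    for i, chunk in enumerate(chunks):
        if i == 0:
            full_space = chunk
        else:
            # Returns a NEW DataFrame; full_space itself is not modified.
            pd.concat([full_space, chunk], ignore_index=True)
    return full_space


def collect_rows_fixed(chunks):
    """Collect every chunk, then concatenate once and keep the result."""
    return pd.concat(list(chunks), ignore_index=True)


# Four toy chunks of three rows each (stand-ins for the 1000000-row chunks).
chunks = [pd.DataFrame({"a": range(i * 3, i * 3 + 3)}) for i in range(4)]

buggy = collect_rows_buggy(chunks)
fixed = collect_rows_fixed(chunks)
print(len(buggy), len(fixed))  # 3 12
```

Only the first chunk survives in the buggy version. Assigning the result back (or concatenating once at the end, as above) keeps all rows, at the cost of holding the whole frame in memory.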
Thanks for the quick response and fix. It's worth noting that the bug was introduced in an attempt to fix memory issues (commit 1abe702). It's clear why this initial "fix" would have reduced memory consumption: No more than two million rows would ever be loaded at the same time. However, if I'm reading the code correctly, that is no longer the case with this new commit (1ced27c). As such, the attempt to reduce memory consumption may need to be revisited. (To be clear, this is not currently an issue for me.)
I just wanted to mention this in case a new issue needs to be created to resolve potential memory issues.