Package review PR #1

KarthikSubbarao · 2024-04-29T19:40:11Z

This PR merges the current changes in unstable into an empty branch to enable a code review of the entire codebase.

hpatro · 2024-05-01T22:03:26Z

src/commands/bloom.rs

+    let mut result = Vec::new();
+    match value {
+        Some(bf) => {
+            for item in input_args.iter().take(argc).skip(idx) {
+                result.push(RedisValue::Integer(bf.add_item(item.as_slice())));
+            }
+            Ok(RedisValue::Array(result))
+        }
+        None => {
+            if nocreate {
+                return Err(RedisError::Str("ERR not found"));
+            }
+            let mut bf = BloomFilterType::new_reserved(fp_rate, capacity, expansion);
+            for item in input_args.iter().take(argc).skip(idx) {
+                result.push(RedisValue::Integer(bf.add_item(item.as_slice())));
+            }
+            match filter_key.set_value(&BLOOM_FILTER_TYPE, bf) {
+                Ok(_) => Ok(RedisValue::Array(result)),
+                Err(_) => Err(RedisError::Str(ERROR)),
+            }
+        }


Could we first check whether the bloom filter exists and then insert the data in a single flow? I don't like the code duplication around insertion.

I can create a separate function for multi adds and call it from both flows. I wanted to handle both through the same flow; however, the value would be out of the reference's scope and result in a use-of-moved-value error.
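The helper-function approach proposed in the reply could look roughly like the sketch below. It is not the module's code: `BloomFilterType` here is a simplified stand-in that stores items in a `Vec`, the keyspace is a plain `HashMap`, and the names `multi_add` and `insert` are hypothetical. The point is only that the insertion loop is written once and borrowed mutably from both the "key exists" and "key created" paths.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the module's BloomFilterType. A real filter
// would set bits in a bit array instead of storing the items themselves.
#[derive(Default)]
struct BloomFilterType {
    items: Vec<Vec<u8>>,
}

impl BloomFilterType {
    // Returns 1 if the item was newly added, 0 if it was already present.
    fn add_item(&mut self, item: &[u8]) -> i64 {
        if self.items.iter().any(|existing| existing.as_slice() == item) {
            return 0;
        }
        self.items.push(item.to_vec());
        1
    }
}

// Shared insertion loop (hypothetical name): both paths below call this,
// so the loop over the items exists in one place only.
fn multi_add(bf: &mut BloomFilterType, items: &[Vec<u8>]) -> Vec<i64> {
    items.iter().map(|item| bf.add_item(item)).collect()
}

// Hypothetical command flow over an in-memory keyspace (assumption: the
// real module goes through the key handle and set_value instead).
fn insert(
    keyspace: &mut HashMap<String, BloomFilterType>,
    key: &str,
    items: &[Vec<u8>],
    nocreate: bool,
) -> Result<Vec<i64>, &'static str> {
    match keyspace.get_mut(key) {
        // Existing filter: borrow it mutably and reuse the helper.
        Some(bf) => Ok(multi_add(bf, items)),
        None => {
            if nocreate {
                return Err("ERR not found");
            }
            // New filter: fill it through the same helper, then move it
            // into the keyspace only after the borrow has ended.
            let mut bf = BloomFilterType::default();
            let result = multi_add(&mut bf, items);
            keyspace.insert(key.to_string(), bf);
            Ok(result)
        }
    }
}
```

Because the helper takes `&mut BloomFilterType` and returns an owned `Vec<i64>`, no reference outlives the call, which sidesteps the moved-value problem described in the reply.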

hpatro · 2024-05-01T22:08:06Z

src/commands/bloom_util.rs

+pub struct BloomFilterType {
+    pub expansion: u32,
+    pub fp_rate: f32,
+    pub filters: Vec<BloomFilter>,


Does the underlying library support scalable bloom filters, or do we need to implement this mechanism ourselves?

This particular library does not have auto scaling

https://docs.rs/bloomfilter/1.0.13/bloomfilter/struct.Bloom.html

hpatro · 2024-05-01T22:12:18Z

src/commands/bloom_util.rs

+                let new_capacity = filter.capacity * self.expansion;
+                let mut new_filter = BloomFilter::new(self.fp_rate, new_capacity);
+                // Add item.
+                new_filter.bloom.set(item);
+                new_filter.num_items += 1;
+                self.filters.push(new_filter);


Shouldn't we fail the request if the expansion is much higher than we can support?

expansion really means expansion_rate. I can rename this variable if that would help clarify this.

I don't think there is a limit on the number of sub filters an object can have. But (if we want) we can define a config for this to either silently fail expansion (by allowing a set) beyond a limit of X sub filters per object OR explicitly return an error (without a set).

One other aspect we should handle is checking, and rejecting based on memory overhead, every operation that creates a new BloomFilter object (BF.ADD, BF.MADD, BF.RESERVE, BF.INSERT, RDB load). Before any of these operations, we should probably check the memory usage and reject the operation if there is not sufficient space. We have a mechanism to compute the estimated additional memory overhead per creation.
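The checks discussed in this thread (capacity overflow when multiplying by the expansion rate, a cap on sub filters per object, and a memory check before creating a new filter) can be sketched as follows. Everything here is an assumption for illustration: the limits `MAX_FILTERS_PER_OBJECT` and `MAX_MEMORY_BYTES`, the `ScaleError` enum, the `scale_out` method, and the size estimate of roughly 10 bits per item (close to what a 1% false positive rate needs) are not taken from the module.

```rust
// Hypothetical stand-ins; field names loosely follow the diff above.
struct BloomFilter {
    capacity: u32,
}

struct BloomFilterType {
    expansion: u32, // expansion rate applied to the last filter's capacity
    filters: Vec<BloomFilter>,
}

// Assumed limits for illustration; the thread suggests a config for these.
const MAX_FILTERS_PER_OBJECT: usize = 4;
const MAX_MEMORY_BYTES: u64 = 1 << 20;

#[derive(Debug, PartialEq)]
enum ScaleError {
    CapacityOverflow,
    TooManyFilters,
    InsufficientMemory,
}

// Rough size estimate: about 10 bits per item, rounded up to whole bytes.
// Assumption: the real overhead depends on the configured fp_rate.
fn estimated_bytes(capacity: u32) -> u64 {
    (capacity as u64 * 10 + 7) / 8
}

impl BloomFilterType {
    // Tries to append a larger sub filter; returns its capacity on success
    // and leaves the object untouched on any error.
    fn scale_out(&mut self, used_memory: u64) -> Result<u32, ScaleError> {
        if self.filters.len() >= MAX_FILTERS_PER_OBJECT {
            return Err(ScaleError::TooManyFilters);
        }
        let last_capacity = self.filters.last().map_or(1, |f| f.capacity);
        // checked_mul rejects the request instead of silently wrapping.
        let new_capacity = last_capacity
            .checked_mul(self.expansion)
            .ok_or(ScaleError::CapacityOverflow)?;
        let needed = used_memory.saturating_add(estimated_bytes(new_capacity));
        if needed > MAX_MEMORY_BYTES {
            return Err(ScaleError::InsufficientMemory);
        }
        self.filters.push(BloomFilter { capacity: new_capacity });
        Ok(new_capacity)
    }
}
```

Returning an explicit error corresponds to the "explicitly return an error (without a set)" option in the reply; the alternative of silently skipping expansion would return success without pushing a new filter.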

…ity (#7)
Add test for bloom command arity, behavior and basic error
Set up a base case for valkey bloom filter module
Optimize BF.INSERT arguments handling and BF.INFO response when NONSCALING set
Add test for basic valkey command
Signed-off-by: Vanessa Tang <[email protected]>

…lters and other core functionality validation (#15)
* Add Integration Testing for correctness of scaling and non scaling filters, and maxmemory, memory usage, type, encoding, etc
* Refactor correctness integration tests
* Fix comment
* Refactor
* Fix import
Signed-off-by: KarthikSubbarao <[email protected]> Signed-off-by: Karthik Subbarao <[email protected]>

parthpatel

Mostly just questions - trying to understand this code base.

parthpatel · 2024-09-25T19:34:04Z

src/wrapper/bloom_callback.rs

+// "unsafe extern C" based on the Rust module API definition
+
+/// # Safety
+pub unsafe extern "C" fn bloom_rdb_save(rdb: *mut raw::RedisModuleIO, value: *mut c_void) {


Why are we not using `#[no_mangle]` (https://docs.rust-embedded.org/book/interoperability/rust-with-c.html#no_mangle), as recommended by the Rust documentation?
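One possible answer, offered as a sketch rather than the authors' stated reasoning: `#[no_mangle]` matters when a function is looked up by its symbol name, whereas a callback handed over as a function pointer inside a methods table is never resolved by name. The example below assumes the callbacks are registered that way; `TypeMethods`, `demo_rdb_save`, and `call_registered` are hypothetical names, and the signature is simplified compared to `bloom_rdb_save`.

```rust
use std::os::raw::c_void;

// Hypothetical stand-in for a C-style table of type callbacks.
struct TypeMethods {
    rdb_save: Option<unsafe extern "C" fn(*mut c_void) -> i32>,
}

// No #[no_mangle] here: the function is reached through a pointer stored
// in the table, so its (mangled) symbol name is irrelevant to the caller.
unsafe extern "C" fn demo_rdb_save(value: *mut c_void) -> i32 {
    // Safety: the caller guarantees `value` points to a valid i32.
    unsafe { *(value as *mut i32) + 1 }
}

// Plays the role of the host that invokes whatever callback was registered.
fn call_registered(methods: &TypeMethods, value: &mut i32) -> Option<i32> {
    let callback = methods.rdb_save?;
    // Safety: `value` is a live, exclusive reference for the whole call.
    Some(unsafe { callback(value as *mut i32 as *mut c_void) })
}
```

The `extern "C"` part still matters, because it fixes the calling convention the C side expects; only the symbol name does not.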

parthpatel · 2024-10-02T18:32:38Z

.github/workflows/ci.yml

@@ -0,0 +1,103 @@
+name: ci


If I am a new developer looking at this file, how do I understand the intent of this file without documentation? Are these configurations obvious? What documentation did you read to write this file?

parthpatel · 2024-10-02T18:34:37Z

Cargo.toml

+valkey-module = "0.1.2"
+bloomfilter = "1.0.13"
+lazy_static = "1.4.0"
+libc = "0.2"


Why do we need libc?
Why do we not depend on "valkey-module-macros" package? (valkey-module-rs github repo)

parthpatel · 2024-10-02T18:37:54Z

Cargo.toml

+debug-assertions = true
+
+[features]
+enable-system-alloc = ["valkey-module/enable-system-alloc"]


What is the intention with this configuration?

parthpatel · 2024-10-02T18:41:37Z

requirements.txt

@@ -0,0 +1,2 @@
+valkey


What does this file do?

parthpatel · 2024-10-02T18:45:42Z

src/lib.rs

+}
+
+/// Command handler for BF.EXISTS <key> <item>
+fn bloom_exists_command(ctx: &Context, args: Vec<ValkeyString>) -> ValkeyResult {


Why do we need these wrapper functions? Can't we feed the commands from command_handler module directly in the macro below?

parthpatel · 2024-10-02T19:10:46Z

src/lib.rs

+    data_types: [
+        BLOOM_FILTER_TYPE,
+    ],
+    init: initialize,


Does the valkey module not have a default? What happens if we don't specify this?

parthpatel · 2024-10-02T19:13:44Z

src/bloom/data_type.rs

+use valkey_module::native_types::ValkeyType;
+use valkey_module::{logging, raw};
+
+const BLOOM_FILTER_TYPE_ENCODING_VERSION: i32 = 0;


parthpatel · 2024-10-02T19:14:15Z

src/bloom/data_type.rs

+    "bloomfltr",
+    BLOOM_FILTER_TYPE_ENCODING_VERSION,
+    raw::RedisModuleTypeMethods {
+        version: raw::REDISMODULE_TYPE_METHOD_VERSION as u64,


u64? That seems like overkill.

* Handle bloom object max allowed size limit, switch fp rate to f64, update defrag exemption logic and free effort logic
* Update error message and add unit test
* Rename config
* Rename configs
Signed-off-by: Karthik Subbarao <[email protected]>

* add bf.load and bf.dump to support bloom filter aofrewrite. add version in BloomFilterType. change bf.load behavior: bf.load can't override exists keys.
* remove bf.dump. add bloom filter encode decode comment. add unit test for decode error case.
* 1. Currently, the version field is added to the array only encode. 2. Add metrics when decode bloomFilterType. 3. Add metrics test when load bloom filter from aof.
* 1. Add filter bytes check and unit test. 2. Fixed some comments and returned error messages.
* fix `test_bf_decode_when_bytes_is_exceed_limit_should_failed` unit test.
* Add expansion maximum value check to filter decode.
Signed-off-by: wuranxx <[email protected]>

KarthikSubbarao force-pushed the unstable branch 3 times, most recently from 61aa1fb to 0e4d4d1, on April 30, 2024 18:14

KarthikSubbarao changed the title ~~Package review~~ Package review PR May 1, 2024

KarthikSubbarao self-assigned this May 1, 2024

hpatro reviewed May 1, 2024

KarthikSubbarao force-pushed the unstable branch from 055728d to 7027032 on July 16, 2024 03:25

KarthikSubbarao force-pushed the unstable branch from c87fd97 to 869a254 on August 23, 2024 17:20

KarthikSubbarao pushed a commit that referenced this pull request Aug 23, 2024

Add information about RFC process (#1)

657f83c

Signed-off-by: Viktor Söderqvist <[email protected]> Co-authored-by: Madelyn Olson <[email protected]>

KarthikSubbarao added 21 commits September 10, 2024 23:06

Initial commit

5237a10

Signed-off-by: Karthik Subbarao <[email protected]>

Add support for BF.ADD, BF.EXISTS, BF.CARD

a96c08a

Signed-off-by: Karthik Subbarao <[email protected]>

Update README

be97a25

Signed-off-by: Karthik Subbarao <[email protected]>

Optimize bloom operations to use a reference to already created Bloom…

5ebc1cb

… objects. Update RDB load/save Signed-off-by: Karthik Subbarao <[email protected]>

Remove older Bloom APIs using prev serialization apporach

8557b55

Signed-off-by: Karthik Subbarao <[email protected]>

Add Dev Profile in Cargo.toml for debugging

1729520

Signed-off-by: Karthik Subbarao <[email protected]>

Create Module Datatype and support RDB load, save, free

be80d9d

Signed-off-by: Karthik Subbarao <[email protected]>

Update README.md

ed4d0f9

Signed-off-by: Karthik Subbarao <[email protected]>

Add support for BF.RESERVE and expansion config

2db0e4b

Signed-off-by: Karthik Subbarao <[email protected]>

Add support for BF.INFO

778c219

Signed-off-by: Karthik Subbarao <[email protected]>

Add support for BF.MEXISTS

b11970e

Signed-off-by: Karthik Subbarao <[email protected]>

Add support for BF.MADD

bdf5f0d

Signed-off-by: Karthik Subbarao <[email protected]>

Update error handling / messages and update expansion logic

8f5e3fc

Signed-off-by: Karthik Subbarao <[email protected]>

Add auto scaling support for bloom filters

5545680

Signed-off-by: Karthik Subbarao <[email protected]>

Fix RDB Save for scaled filters

42d015c

Signed-off-by: Karthik Subbarao <[email protected]>

Refactoring

481ecac

Signed-off-by: Karthik Subbarao <[email protected]>

Update TODOs

ff60817

Signed-off-by: Karthik Subbarao <[email protected]>

Fix mem_usage calculation

1c1b6ab

Signed-off-by: Karthik Subbarao <[email protected]>

Update Cargo.toml

d00fa99

Signed-off-by: Karthik Subbarao <[email protected]>

minor refactoring

053d5e9

Signed-off-by: Karthik Subbarao <[email protected]>

Add support for BF.INSERT and fix multi add logic

fbdfb5b

Signed-off-by: Karthik Subbarao <[email protected]>

KarthikSubbarao added 4 commits September 10, 2024 23:06

Replication support + Update Module/Datatype name + Refactor

5b0df15

Signed-off-by: KarthikSubbarao <[email protected]> Signed-off-by: Karthik Subbarao <[email protected]>

Update data type name and use static str for errors

c66243d

Signed-off-by: Karthik Subbarao <[email protected]>

Support keyspace notifications for write operations

0d2ad8a

Signed-off-by: Karthik Subbarao <[email protected]>

Add Python testing framework to support Integration testing of the bl…

85e20fd

…oom module + Add basic sanity tests Signed-off-by: Karthik Subbarao <[email protected]>

KarthikSubbarao force-pushed the unstable branch from 3f8b29e to 85e20fd on September 10, 2024 23:07

KarthikSubbarao and others added 8 commits September 14, 2024 00:26

Types, Ranges, limit updates and overflow handling

0b9a90d

Signed-off-by: Karthik Subbarao <[email protected]>

Add unit testing support using the valkeymodule-rs enable-system-allo…

a180f42

…c feature Signed-off-by: Karthik Subbarao <[email protected]>

Add unit testing for scaling & non scaling filters for behavior and f…

736a48a

…alse positive rate correctness Signed-off-by: Karthik Subbarao <[email protected]>

Merge pull request #3 from KarthikSubbarao/unstable

6d0a4b6

Add unit testing for scaling & non scaling filters

Adding github workflow for building, running format checks, unit test…

dee8a7e

…s, integration tests and ASAN testing Signed-off-by: zackcam <[email protected]>

Merge pull request #5 from zackcam/unstable

145f9b2

Adding github workflow for building, running format checks, unit tests, integration tests and ASAN testing

RDB format optimization: Using a fixed seed for bloom filters (#2)

282599d

Signed-off-by: Vanessa Tang <[email protected]>

Update build.sh and fix import in save/restore pytest

687bec7

Signed-off-by: Karthik Subbarao <[email protected]>

KarthikSubbarao force-pushed the unstable branch from 7e77f7c to 687bec7 on September 19, 2024 22:34

zackcam and others added 2 commits October 1, 2024 17:03

Adding replication ability to valkey test case, changing waiter funct…

f299216

…ionality for more flexibility in the future (#6) Signed-off-by: zackcam <[email protected]>

KarthikSubbarao force-pushed the unstable branch from d9e8f1b to 4d38649 on October 9, 2024 23:17

zackcam and others added 2 commits October 10, 2024 12:10

Updating _ to -in lib.rs. Also updating loading from rdb to use a tra…

fde4d37

…it instead (#14) Signed-off-by: zackcam <[email protected]>

parthpatel reviewed Oct 15, 2024

Add integration tests for maxmemory scenarios and replication correct…

57a7901

…ness. Fixed others tests and updated the build.sh script for running single integration tests (#16) Signed-off-by: Karthik Subbarao <[email protected]>

KarthikSubbarao force-pushed the unstable branch from 9ad711e to 57a7901 on October 17, 2024 21:55

KarthikSubbarao force-pushed the unstable branch from 75fc3cc to 12031b2 on October 29, 2024 16:53

zackcam and others added 2 commits November 5, 2024 12:01

Adding info handler, that contains three fields, num_objects, num_fil…

31d21c0

…ters and memory_bytes. Additionally updated drop and created tests around this new info handler (#19) Signed-off-by: zackcam <[email protected]>

KarthikSubbarao force-pushed the unstable branch 2 times, most recently from e74b825 to 188835b, on November 21, 2024 21:50

Add new metrics to show capacity and items across objects (#20)

a33e0e3

Signed-off-by: Vanessa Tang <[email protected]>

KarthikSubbarao force-pushed the unstable branch from 188835b to a33e0e3 on November 21, 2024 21:52
