Support serializing non-hash/sort keys as String/Binary #49

Open
davidbarsky opened this issue Mar 24, 2019 · 3 comments

@davidbarsky

💡 Feature description

Hi! I wanted to see if you'd be interested in a PR that supports serializing all fields other than the hash and sort keys as a string or binary type. I think the primary reason someone would prefer a complex DynamoDB datatype is to support some limited operations through the DynamoDB Update API, such as adding to a list or updating elements of a map. In my experience, this is not a common operation. Here's why I'd like a feature like this:

  • Any complex attributes that are not stored as top-level attributes in a DynamoDB item could be stored as a (compressed) string or binary. This makes items smaller, which consumes fewer WCUs/RCUs when accessed.
  • DynamoDB doesn't really have a schema beyond the partition and sort key, so applications are forced to validate the data on storage and retrieval anyways.
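
To make the capacity argument concrete, here is a rough sketch of the arithmetic. It assumes DynamoDB's provisioned-mode rules as I understand them (one WCU per 1 KB written, one RCU per 4 KB for a strongly consistent read, both rounded up). The function names and item sizes are made up for illustration and are not part of dynomite.

```rust
// Assumed DynamoDB billing units: writes are metered per 1 KB,
// strongly consistent reads per 4 KB, each rounded up.
const WCU_BYTES: u64 = 1024;
const RCU_BYTES: u64 = 4 * 1024;

// Ceiling division; an item is treated as at least one byte so that
// every access costs at least one unit.
fn ceil_div(bytes: u64, unit: u64) -> u64 {
    (bytes.max(1) + unit - 1) / unit
}

/// Write capacity units consumed by one standard write of an item.
/// `item_bytes` is the full item size (attribute names plus values).
pub fn write_capacity_units(item_bytes: u64) -> u64 {
    ceil_div(item_bytes, WCU_BYTES)
}

/// Read capacity units consumed by one strongly consistent read of an item.
pub fn read_capacity_units(item_bytes: u64) -> u64 {
    ceil_div(item_bytes, RCU_BYTES)
}
```

Under these assumptions, a hypothetical 9,000-byte item costs 9 WCUs to write and 3 RCUs to read. If storing the nested fields as a compressed blob shrank it to 3,500 bytes, the same operations would cost 4 WCUs and 1 RCU.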

A potential downside I can think of is that the items in DynamoDB won't be easily introspectable in the console and that it's tricky to change data representation down the line, but I think this is an issue with DynamoDB specifically.

💻 Basic example

Here's what I imagine the derive interface would be like:

use dynomite::Item;
use chrono::{DateTime, Utc};
use uuid::Uuid;
    
#[derive(Item, Debug, Clone)]
pub struct Book {
    #[dynomite(hash)]
    id: Uuid,
    #[dynomite(sort)]
    timestamp: DateTime<Utc>,
    // other fields could be serialized as:
    // - JSON, using serde_json, or
    // - a binary blob, using msgpack or protobuf.
    #[dynomite(serialize_as = "JSON")]
    metadata: Metadata,
}
@softprops
Owner

Always happy to take prs. I want to dive into the motivation a bit more.

Sounds neat. I need to grok this a bit more but I think this is actually doable today by implementing Attribute for a custom type https://docs.rs/dynomite/0.3.0/dynomite/trait.Attribute.html

In the particular case of JSON above I could see an impl of Attribute for serde_json::Value behind a feature flag that (de)serializes to a byte array or string ddb attr value type. That would of course need to live in this crate because of the foreign trait impl rules. I'd be happy to start with this.

I like where your head is at for generalizing this to avoid repetition of custom serialization formats. You couldn't really impl Attribute for serde Serialize/Deserialize types because there are already impls provided for the same primitives (Strings, numerics, ...) that serde supports.
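
The overlap problem can be sketched without either crate. In the snippet below, `Attr` and `Encode` are hypothetical, cut-down stand-ins for dynomite's `Attribute` and serde's `Serialize` (not the real traits), and `Encoded` is an invented wrapper name. It only illustrates why a blanket impl collides with the existing primitive impls, and how a newtype could sidestep that.

```rust
// Hypothetical stand-in for dynomite's `Attribute`, reduced to the
// serializing half and returning a plain String instead of an AttributeValue.
trait Attr {
    fn into_attr(self) -> String;
}

// Hypothetical stand-in for `serde::Serialize`.
trait Encode {
    fn encode(&self) -> String;
}

// Like serde, the stand-in covers primitives such as `String`.
// Debug formatting is used here only to get a quoted, JSON-like string.
impl Encode for String {
    fn encode(&self) -> String {
        format!("{:?}", self)
    }
}

// Like dynomite, the stand-in already has a direct impl for `String`.
impl Attr for String {
    fn into_attr(self) -> String {
        self
    }
}

// A blanket impl over every `Encode` type would overlap with the `String`
// impl above, so the compiler rejects it (E0119, conflicting implementations):
//
// impl<T: Encode> Attr for T {
//     fn into_attr(self) -> String { self.encode() }
// }

// A newtype wrapper is a distinct type, so a blanket impl over the wrapper
// does not overlap with the primitive impls.
struct Encoded<T>(T);

impl<T: Encode> Attr for Encoded<T> {
    fn into_attr(self) -> String {
        self.0.encode()
    }
}
```

With this shape, a plain `String` keeps its native representation, while `Encoded(value)` opts a field into the serialized form. Whether a wrapper type or a derive attribute is the nicer interface is still an open question.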

I think the drawback for not seeing something in the console isn't a big deal. I'm used to these kinds of cases where you store protobuf bytes into some storage like mysql blob field or in this case dynamodb byte array attr. I'm not a huge console user so I may be biased :)

@davidbarsky
Author

> I want to dive into the motivation a bit more.

Sure! I think if I'm to summarize my motivation, it'd be the following:

  • At higher read/write rates on DynamoDB, some form of compression on larger data types could drive down costs for customers.
  • This change makes it easier to define document-like structures using an enum. I haven't thought through alternate mechanisms, however.

> Sounds neat. I need to grok this a bit more but I think this is actually doable today by implementing Attribute for a custom type https://docs.rs/dynomite/0.3.0/dynomite/trait.Attribute.html

That's interesting—I'll try that out/explore that option.

> I think the drawback for not seeing something in the console isn't a big deal. I'm used to these kinds of cases where you store protobuf bytes into some storage like mysql blob field or in this case dynamodb byte array attr. I'm not a huge console user so I may be biased :)

I mean, neither am I, but I can see this being somewhat useful for people!

@softprops
Owner

softprops commented Mar 25, 2019

Wanted to clarify the impl Attribute for xxx suggestion. Since serde's traits are foreign to your crate, you wouldn't be able to impl Attribute for them locally, but you can do something like the following pseudo-ish code for a type you own:

use chrono::{DateTime, Utc};
use dynomite::{Attribute, AttributeError, Item};
// assumes AttributeValue comes from rusoto_dynamodb, which dynomite builds on
use rusoto_dynamodb::AttributeValue;
use serde::{Deserialize, Serialize};
use uuid::Uuid;

#[derive(Serialize, Deserialize, Debug, Clone)]
struct Metadata {
    some_str_field: String,
    some_other_field: usize,
}

#[derive(Item, Debug, Clone)]
pub struct Book {
    #[dynomite(hash)]
    id: Uuid,
    #[dynomite(sort)]
    timestamp: DateTime<Utc>,
    metadata: Metadata,
}

impl Attribute for Metadata {
    fn into_attr(self) -> AttributeValue {
        // use whatever serialization scheme you like; here's JSON stored in a ddb string
        // note: unwrap_or_default stores an empty string if serialization fails
        AttributeValue {
            s: Some(serde_json::to_string(&self).unwrap_or_default()),
            ..AttributeValue::default()
        }
    }

    fn from_attr(value: AttributeValue) -> Result<Self, AttributeError> {
        value
            .s
            .ok_or(AttributeError::InvalidType)
            .and_then(|txt| {
                // likewise, this is the reverse: JSON string back to your type.
                // if that fails, just return an AttributeError, for lack of a
                // better current way to communicate that deserialization failed
                serde_json::from_str::<Metadata>(&txt).map_err(|_| AttributeError::InvalidType)
            })
    }
}
