grogers
4 days ago
I would strongly recommend against using a fixed key 'user' for the hash key in DynamoDB, with the range key being used to select the actual record. DDB does not handle splitting by range key very well, so you will run into load-balancing and throttling issues even with the sharding scheme (i.e. 'user!2') mentioned later.
It will save you a lot of headaches to make the hash key the actual userid (e.g. 'user!abcdef123456'). This will make it more expensive if you occasionally need to scan all users, but not drastically so. You can either do a scan and ignore the items you don't care about, or maintain an index that contains just the userids (using a hash/range key layout similar to the article's) and then do point gets for each userid to fetch the actual data. This spreads the load of these scans out better, because the range scan touches little data compared to storing all user data under the range key.
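To make the second option concrete, here is a minimal sketch of the "small index plus point gets" layout. It uses an in-memory dict as a stand-in for a table, not the real DynamoDB API, and the names `userindex`, `profile`, `put`, `get`, `query` and the sample users are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a table: hash key -> {range key -> item}.
table = defaultdict(dict)

def put(hash_key, range_key, item):
    table[hash_key][range_key] = item

def get(hash_key, range_key):
    """Point get: one hash key plus one range key."""
    return table[hash_key].get(range_key)

def query(hash_key):
    """Range query: every item under one hash key, ordered by range key."""
    return [table[hash_key][rk] for rk in sorted(table[hash_key])]

# Sample data (assumed); the large 'bio' field stands in for bulky user data.
users = {
    "abcdef123456": {"name": "Ann", "bio": "x" * 1000},
    "fedcba654321": {"name": "Bob", "bio": "y" * 1000},
}

for user_id, data in users.items():
    # The full record lives under its own hash key, so writes are spread out.
    put(f"user!{user_id}", "profile", data)
    # The index entry holds only the userid, under one shared hash key.
    put("userindex", user_id, {"userid": user_id})

def all_users():
    # Step 1: one range query over the small index items.
    ids = [entry["userid"] for entry in query("userindex")]
    # Step 2: one point get per userid for the actual data.
    return [get(f"user!{uid}", "profile") for uid in ids]
```

The shared `userindex` key is still a single hash key, but each item under it is tiny, which is the point made above: the range scan reads little data and the heavy reads land on per-user keys.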
karmaniverous
4 days ago
This approach is not really compatible with the single-table design pattern, which has some significant advantages. The point where performance degrades due to the issues you mentioned would be a good point to start applying sharding.
wmfiv
3 days ago
With due respect, I think you've misunderstood the single-table design pattern.
Because you've introduced static hash keys ("user", "email", etc.), you've had to partition manually, which DDB should do for you automatically. And while you covered the partition size limit, you're also likely to have write performance issues, because every write lands on the "user" and "email" hash keys instead of being distributed across many keys.
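A rough way to see the write-distribution point is to count how many writes each hash key would receive under the two layouts. This is only a counting sketch with made-up user ids; it does not model how DynamoDB actually places keys on partitions.

```python
from collections import Counter

# 1000 hypothetical user ids, one write each.
user_ids = [f"{n:05d}" for n in range(1000)]

# Static layout: every user write targets the same hash key "user".
static_layout = Counter("user" for _ in user_ids)

# Per-user layout: each write targets its own hash key.
per_user_layout = Counter(f"user#{uid}" for uid in user_ids)

# Writes received by the busiest hash key in each layout.
hottest_static = max(static_layout.values())
hottest_per_user = max(per_user_layout.values())
```

With the static key, all 1000 writes hit one hash key; with per-user keys, no single key receives more than one.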
Single-table design should distribute writes and minimize roundtrips to the database. Using 'user#12345' as a hash key with range keys of 'User', 'Email#jo@email.com', 'Email#joe@email.com', etc. achieves those goals. If you need to query and/or sort on a large number of attributes, it's going to be easier, faster, and probably cheaper to stream data into Elasticsearch or similar to support those queries.
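As a sketch of that layout, the items for one user share a hash key, and a single query by hash key (optionally with a range key prefix) returns them together. The list below is an in-memory stand-in rather than a real table, and the attribute names `pk`, `sk`, `name` and the second user are assumptions for illustration. In DynamoDB itself the prefix filter would correspond to a `begins_with` condition on the range key in a Query.

```python
# Hypothetical items: one hash key per user, range key identifies the record type.
items = [
    {"pk": "user#12345", "sk": "User", "name": "Joe"},
    {"pk": "user#12345", "sk": "Email#jo@email.com"},
    {"pk": "user#12345", "sk": "Email#joe@email.com"},
    {"pk": "user#67890", "sk": "User", "name": "Sam"},
]

def query(pk, sk_prefix=""):
    """Return all items under one hash key whose range key starts with sk_prefix,
    ordered by range key."""
    matches = [i for i in items if i["pk"] == pk and i["sk"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["sk"])

# One roundtrip fetches the user record and all of its emails.
everything = query("user#12345")

# The same hash key, narrowed to just the email records.
emails = query("user#12345", "Email#")
```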
kellengreen
3 days ago
Agreed, this approach will not scale very well in DynamoDB.