20 October 2016

Let's assume our database manages a collection of users. Each belongs to certain region and we have two data centers, located in one of the regions respectively. To avoid high network latency between regions, we shall store user information in the nearest data center. Following demo showcases how tag aware sharding can help in such kind of cases.

NOTE: In the examples, the order of keys matters.

  1. Set up a two-shard cluster one for each data center: "shard01" and "shard02".
  2. Tag those two shards
    sh.addShardTag('shard01', 'regionA');
    sh.addShardTag('shard02', 'regionB');
  3. Shard collection "users" in database "demo"
    sh.shardCollection('demo.users', {home_location:1, user_id: 1});
  4. Attach home location to corresponding shard tags
    // NOTE: the order of members in shard key matters
                   {home_location:'regionA', user_id:MinKey},
                   {home_location:'regionA', user_id:MaxKey},
                   {home_location:'regionB', user_id:MinKey},
                   {home_location:'regionB', user_id:MaxKey},
  5. Insert documents of different regions
    db.users.insert({home_location:'regionA', user_id:1, name:'adam'});
    db.users.insert({home_location:'regionB', user_id:2, name:'bob'});
  6. Verify that documents are stored in shards as epxected using either way:
    • Run the following commands and check output.
    • Or, connect to and query on each shard directly.

    NOTE: If there is already some data before sharding/tagging, we may need to wait for some time to allow chunks migrate to designated shards respectively.

  7. Move one user to another region

    Since values of shard keys cannot be changed, we have to delete the user and re-insert it with new location info.

    var adam = db.users.findOneAndDelete({home_location:'regionA', user_id:1});
    adam.home_location = 'regionB';

One more thing to bear in mind is that shard configure servers and mongos shall also accessible locally.

blog comments powered by Disqus