Comments on: Sharding and Time Base Partitioning https://www.percona.com/blog/sharding-and-time-base-partitioning/ Sun, 16 Mar 2008 20:43:20 +0000 hourly 1 https://wordpress.org/?v=6.5.2 By: Peter Zaitsev https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-253452 Sun, 16 Mar 2008 20:43:20 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-253452 Howa,

First I should say the problem is often exaggerated – indeed some users can be 100 times more than others, so what ? With big numbers of say 300000 users per box it all evens out pretty well anyway.

There is however a different problem which is about user “groups” not users individual users. Consider for example you decided to put out box one by one and keep all registrations going to them until it gets 300.000 registrations.

This may be suboptimal because you may find out users are more active first days after registration, it may be even more complicated like users which came through some marketing action may have sustained interest while from other may mostly leave after short period of time.

It is also quite natural for some users to stop using service over time so proportion of active users goes down.

So you need to think and see what kind of allocation policy makes sense in your case. There are many ways to organize it.

]]>
By: howa https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-253423 Sun, 16 Mar 2008 17:53:47 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-253423 Let say, partitioning by user_id, the main difficulty I think is not partitioning, but load balacing, e.g. 80% of users are inactive, how can do spread the 20% of active users across all the database servers for max. performance?

]]>
By: Wes Mahler https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-253307 Sun, 16 Mar 2008 11:53:18 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-253307 Interesting way to partition data!

]]>
By: Peter Zaitsev https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-253175 Sat, 15 Mar 2008 22:24:25 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-253175 Frank,

Databases are at large extent created to ease the development process and the more database can do to help it the better. Sure frameworks can have their cache but this cache is not good enough in all cases plus you may have different application accessing same data.

Database may not need to guess how you’re using its data but at least follow your recommendation. If you think about it indexes are nothing more but hints to the database – I’m going to use this column for lookups and I want these to be fast. Clustering, partitioning or physical sorting is another ways to speedup certain access paths.

]]>
By: Frank https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-253163 Sat, 15 Mar 2008 20:24:34 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-253163 I honestly think some things are out of the scope of mysql, I think you’re application should be smart enough to pull from a cache. In fact frameworks (.net, cakephp, rails, etc) have built in cache (file, memory, mcache, etc) support you just need to use it. MySQL is very powerful and while I think it needs to do more beyond master/slave replicationing, determining how you use it’s data isn’t exactly it’s job. Being able to treat multiple servers, db’s and tables as a cluster and provide easy to use methods for moving data (tables and databases) for optimization of access might be within it’s scope. Knowing when and why to move your data rests firmly with you and your app.

]]>
By: Peter Zaitsev https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-253136 Sat, 15 Mar 2008 16:51:02 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-253136 Of course there are layers of caching – squid, memcache etc but finally some load hits the database and you need to deal with it.

]]>
By: howa https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-253030 Sat, 15 Mar 2008 08:09:26 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-253030 I think the golden rule for large site should be don’t let your visitor hit your MySQL directly?

Such as using Squid?

]]>
By: mike https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-253000 Sat, 15 Mar 2008 05:43:40 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-253000 Hmm. Even having the option would be nice. A configuration option (just like InnoDB recovery style) would probably be used by a lot of people.

Trying to determine how to partition my data and all that myself seems a bit daunting. I’d be happy with an LRU-style thing. I mean, I will deploy memcached and use that for caching prior to the database anyway. But any further optimizations to MySQL are welcome.

]]>
By: Peter Zaitsev https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-252999 Sat, 15 Mar 2008 05:37:57 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-252999 Well it would. Though I would like it to get online defragmentation first.

Also note it is not as easy as last access time is the only important factor for data organization – on the contrary there is bunch of other things such as access paths which should be considered.

]]>
By: mike https://www.percona.com/blog/sharding-and-time-base-partitioning/#comment-252998 Sat, 15 Mar 2008 05:30:49 +0000 https://www.percona.com/blog/2008/03/14/sharding-and-time-base-partitioning/#comment-252998 It would be cool if MySQL itself had some sort of LRU type query cache method for re-organizing how data is physically stored. That would be cool.

]]>