by Justin Swanhart | Mar 10, 2015 | MySQL
by Justin Swanhart | Feb 17, 2015 | MySQL
by Justin Swanhart | Sep 23, 2014 | MySQL, Webinars
This Wednesday I’ll be discussing two common types of big data: machine-generated data and user-generated content. These types of big data are amenable to sharding, a commonly used technique for spreading data over more than one database server. I’ll be...
by Justin Swanhart | Sep 10, 2014 | Benchmarks, Insight for DBAs, MySQL
There are a lot of tools that generate test data. Many of them have complex XML scripts or GUI interfaces that let you identify characteristics about the data. For testing query performance and many other applications, however, a simple quick and dirty data...
by Justin Swanhart | Aug 27, 2014 | MySQL
by Justin Swanhart | May 1, 2014 | Insight for DBAs, MySQL
While Shard-Query can work over multiple nodes, this blog post focuses on using Shard-Query with a single node. Shard-Query can add parallelism to queries which use partitioned tables. Very large tables can often be partitioned fairly easily....
by Justin Swanhart | Sep 12, 2013 | MySQL, Webinars
Join me Wednesday, September 18 at 10 a.m. PDT for an hour-long webinar where I will introduce the basic concepts behind column store technology. The webinar’s title is: “Introduction to open source column stores.” What will be discussed? This...
by Justin Swanhart | May 22, 2013 | Benchmarks, MySQL
This blog post is part two in what is now a continuing series on the Star Schema Benchmark. In my previous blog post I compared MySQL 5.5.30 to MySQL 5.6.10, both with default settings using only the InnoDB storage engine. In my testing I discovered...
by Justin Swanhart | Mar 11, 2013 | MySQL
So far most of the benchmarks posted about MySQL 5.6 use the sysbench OLTP workload. I wanted to test a set of queries which, unlike sysbench, utilize joins. I also wanted an easily reproducible set of data which is more rich than the simple sysbench...
by Justin Swanhart | Feb 6, 2013 | MySQL, Percona Events
On Friday, February 15, 2013, at 10:00am Pacific Standard Time, I will be delivering a webinar entitled “Building a highly scaleable distributed row, document or column store with MySQL and Shard-Query”. The first part of this webinar will focus on why...
by Justin Swanhart | Nov 28, 2012 | Insight for DBAs, MySQL
Notice the result of the NOW() function in the following query. The query was run on a real database server and I didn’t change the clock of the server or change anything in the database configuration settings. mysql> SELECT NOW(),SYSDATE();...
by Justin Swanhart | Aug 28, 2012 | Insight for DBAs, MySQL
As an instructor with Percona, I’m sometimes asked about the differences between the REPEATABLE-READ and READ-COMMITTED transaction isolation levels. There are a few differences between them, and they are all related to locking. Extra locking (not gap...
by Justin Swanhart | May 19, 2011 | MySQL
http://Flexvie.ws fully implements a method for creating materialized views for MySQL data sets. The tool is for MySQL, but the methods are database agnostic. A materialized view is an analogue of software transactional memory. You can think of this as database...
by Justin Swanhart | May 17, 2011 | MySQL
The most useful feature of the relational database is that it allows us to easily process data in sets, which can be much faster than processing it serially. When the relational database was first implemented, write-ahead-logging and other technologies did not exist....
by Justin Swanhart | May 16, 2011 | MySQL
Hi, here is an easy way to run the subset sum check from SQL, which you can then distribute with Shard-Query: CREATE TABLE `the list` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `val` bigint(20) NOT NULL DEFAULT '0', PRIMARY KEY (`id`), KEY `id` (`id`) ) ENGINE=MyISAM;...
by Justin Swanhart | May 16, 2011 | MySQL
Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set...
by Justin Swanhart | May 14, 2011 | MySQL
Can Shard-Query scale to 20 nodes? Peter asked this question in comments to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could do due to EC2 and time limits. I think the results at 20 nodes are very useful...
by Justin Swanhart | May 14, 2011 | MySQL
Demonstrating distributed set processing performance
Shard-Query + ICE scales very well up to at least 20 nodes
This post is a detailed performance analysis of what I’ve coined “distributed set processing”. Please also read this post’s...
by Justin Swanhart | May 11, 2011 | MySQL
Infobright and InnoDB AMI images are now available
There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in each image is split into 20 “shards”. This blog...
by Justin Swanhart | May 6, 2011 | MySQL
Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with...