In the very early days of Percona, Vadim wrote very nice post about MySQL GROUP_CONCAT (GROUP_CONCAT).
But I want to show you a bit more about it.
When is MySQL GROUP_CONCAT useful? Usually while working with Support customers I recommend it when you have aggregation of many-to-many info. It makes the view simpler and more beautiful and it doesn’t need much effort to make it work.
The following are some simple examples.
This is a test table:
1 2 3 4 5 6 7 | CREATE TABLE `group_c` ( `parent_id` int(11) DEFAULT NULL, `child_id` int(11) DEFAULT NULL ) ENGINE=InnoDB; INSERT INTO group_c(parent_id, child_id) VALUES (1,1),(1,2),(1,3),(2,1),(2,4),(1,4),(2,6),(3,1),(3,2),(4,1),(4,1),(1,1),(5,0); |
Without grouping info the only way you can check things is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | mysql> SELECT DISTINCT -> parent_id, child_id -> FROM group_c -> ORDER BY parent_id; +-----------+----------+ | parent_id | child_id | +-----------+----------+ | 1 | 1 | | 1 | 2 | | 1 | 3 | | 1 | 4 | | 2 | 1 | | 2 | 3 | | 2 | 4 | | 2 | 6 | | 3 | 1 | | 3 | 2 | | 4 | 1 | | 5 | 0 | +-----------+----------+ 12 rows in set (0.00 sec) |
But it looks much better and easier to read with GROUP_CONCAT:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | mysql> SELECT DISTINCT -> parent_id, GROUP_CONCAT(DISTINCT child_id ORDER BY child_id) AS child_id_list -> FROM group_c -> group by parent_id -> ORDER BY parent_id; +-----------+---------------+ | parent_id | child_id_list | +-----------+---------------+ | 1 | 1,2,3,4 | | 2 | 1,3,4,6 | | 3 | 1,2 | | 4 | 1 | | 5 | 0 | +-----------+---------------+ 5 rows in set (0.00 sec) |
Easy? Let’s go to production usage and some “real” examples 🙂
Assume you have 4 Support Engineers who were working with 6 Customers this week on 15 issues.
As it usually happens: everyone (sure, except those who are on vacation :)) worked on everything with everybody.
How you would represent it?
Here is my way:
Create test tables:
- engineers (id, name, surname, URL) – list of engineers
- customers (id, company name, URL) – list of customers
- issues (id, customer_id, description) – list of issues assigned to customers
- workflow (id, engineer_id, issue_id) – list of actions: issues and engineers who worked on them
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 | -- Engineers CREATE TABLE engineers ( id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT, e_name VARCHAR(30) NOT NULL, e_surname VARCHAR(30) NOT NULL, url VARCHAR(255) NOT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB; -- Customers CREATE TABLE customers ( id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT, company_name VARCHAR(30) NOT NULL, url VARCHAR(255) NOT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB; -- Issues (Issue-Customer) CREATE TABLE issues ( id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT, customer_id VARCHAR(30) NOT NULL, description TEXT, PRIMARY KEY (id) ) ENGINE=InnoDB; -- Workflow (Action: Engineer-Issue(Customer)) CREATE TABLE workflow ( action_id INT UNSIGNED NOT NULL AUTO_INCREMENT, engineer_id SMALLINT UNSIGNED NOT NULL, issue_id SMALLINT UNSIGNED NOT NULL, PRIMARY KEY (action_id) ) ENGINE=InnoDB; INSERT INTO engineers (e_name, e_surname, url) VALUES ('Miguel', 'Nieto', 'https://www.percona.com/about-us/our-team/miguel-angel-nieto'), ('Marcos', 'Albe', 'https://www.percona.com/about-us/our-team/marcos-albe'), ('Valerii', 'Kravchuk', 'https://www.percona.com/about-us/our-team/valerii-kravchuk'), ('Michael', 'Rikmas', 'https://www.percona.com/about-us/our-team/michael-rikmas'); INSERT INTO customers (company_name, url) VALUES ('OT','http://www.ovaistariq.net/'), ('PZ','http://www.peterzaitsev.com/'), ('VK','http://mysqlentomologist.blogspot.com/'), ('FD','http://www.lefred.be/'), ('AS','http://mysqlunlimited.blogspot.com/'), ('SS','https://www.flamingspork.com/blog/'); INSERT INTO issues(customer_id, description) VALUES (1,'Fix replication'), (2,'Help with installation of Percona Cluster'), (3,'Hardware suggestions'), (4,'Error: no space left'), (5,'Help with setup daily backup by Xtrabackup'), (6,'Poke sales about Support agreement renewal'), (4,'Add more accounts for customer'), (2,'Create Hot Fix of Bug 1040735'), (1,'Query optimisation'), (1,'Prepare custom build for Solaris'), (2,'explain about Percona Monitoring plugins'), (6,'Prepare access for customer servers for future work'), (5,'Decribe load balancing for pt-online-schema-change'), (4,'Managing deadlocks'), (1,'Suggestions about buffer pool size'); INSERT INTO workflow (engineer_id, issue_id) VALUES (1,1),(4,2),(2,3),(1,4),(3,5),(2,6),(3,7),(2,8),(2,9),(1,10),(3,11),(2,12),(2,13),(3,14),(1,15),(1,9),(4,14),(2,9),(1,15),(3,10),(4,2),(2,15),(4,8),(4,4),(3,11),(1,7),(3,7),(1,1),(1,9),(3,4),(4,3),(1,5),(1,7),(1,4),(2,4),(2,5); |
Examples:
List of issues for each engineer (GROUP_CONCAT):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | mysql> SELECT -> CONCAT (e_name, ' ', e_surname) AS engineer, -> GROUP_CONCAT(DISTINCT issue_id, ' (', c.company_name,')' ORDER BY issue_id SEPARATOR ', ' ) AS 'issue (customer)' -> FROM -> workflow w, -> engineers e, -> customers c, -> issues i -> WHERE -> w.engineer_id = e.id -> AND w.issue_id = i.id -> AND i.customer_id = c.id -> GROUP BY -> e.id -> ORDER BY -> e_name, e_surname; +------------------+---------------------------------------------------------------------------+ | engineer | issue (customer) | +------------------+---------------------------------------------------------------------------+ | Marcos Albe | 3 (VK), 4 (FD), 5 (AS), 6 (SS), 8 (PZ), 9 (OT), 12 (SS), 13 (AS), 15 (OT) | | Michael Rikmas | 2 (PZ), 3 (VK), 4 (FD), 8 (PZ), 14 (FD) | | Miguel Nieto | 1 (OT), 4 (FD), 5 (AS), 7 (FD), 9 (OT), 10 (OT), 15 (OT) | | Valerii Kravchuk | 4 (FD), 5 (AS), 7 (FD), 10 (OT), 11 (PZ), 14 (FD) | +------------------+---------------------------------------------------------------------------+ 4 rows in set (0.00 sec) |
List of engineers for each customer (GROUP_CONCAT inside of GROUP_CONCAT):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 | mysql> SELECT -> c.company_name AS company, -> GROUP_CONCAT(DISTINCT issue_id, ' (', engineer_list, ')' ORDER BY issue_id SEPARATOR ', ' ) AS issue -> FROM -> workflow w, -> engineers e, -> customers c, -> issues i, -> (SELECT -> i.id AS i_id, -> GROUP_CONCAT(DISTINCT CONCAT(e_name, ' ', e_surname) ORDER BY e_name SEPARATOR ', ') AS engineer_list -> FROM -> workflow w, -> engineers e, -> issues i -> WHERE -> w.engineer_id = e.id -> AND w.issue_id = i.id -> GROUP BY -> i.id) AS e_list -> WHERE -> w.engineer_id = e.id -> AND w.issue_id = i.id -> AND i.customer_id = c.id -> AND w.issue_id = e_list.i_id -> GROUP BY -> c.id -> ORDER BY -> c.company_name; +---------+--------------------------------------------------------------------------------------------------------------------------------------------+ | company | issue (engineer) | +---------+--------------------------------------------------------------------------------------------------------------------------------------------+ | AS | 5 (Marcos Albe, Miguel Nieto, Valerii Kravchuk), 13 (Marcos Albe) | | FD | 4 (Marcos Albe, Michael Rikmas, Miguel Nieto, Valerii Kravchuk), 7 (Miguel Nieto, Valerii Kravchuk), 14 (Michael Rikmas, Valerii Kravchuk) | | OT | 1 (Miguel Nieto), 9 (Marcos Albe, Miguel Nieto), 10 (Miguel Nieto, Valerii Kravchuk), 15 (Marcos Albe, Miguel Nieto) | | PZ | 2 (Michael Rikmas), 8 (Marcos Albe, Michael Rikmas), 11 (Valerii Kravchuk) | | SS | 6 (Marcos Albe), 12 (Marcos Albe) | | VK | 3 (Marcos Albe, Michael Rikmas) | +---------+--------------------------------------------------------------------------------------------------------------------------------------------+ 6 rows in set (0.00 sec) |
PHP/HTML? Why not? It’s easy 🙂
Source Code:
“; foreach($rows as $row) { echo “‘.$row[“company”].”.$row[“description”].”.$row[“engineer_list”].’ ‘; } echo ”
“.$row[“id”].’ |
“; $result->close(); $mysqli->close(); ?>
Result:
1 | OT | Fix replication | Miguel Nieto |
2 | PZ | Help with installation of Percona Cluster | Michael Rikmas |
3 | VK | Hardware suggestions | Marcos Albe, Michael Rikmas |
4 | FD | Error: no space left | Marcos Albe, Michael Rikmas, Miguel Nieto, Valerii Kravchuk |
5 | AS | Help with setup daily backup by Xtrabackup | Marcos Albe, Miguel Nieto, Valerii Kravchuk |
6 | SS | Poke sales about Support agreement renewal | Marcos Albe |
7 | FD | Add more accounts for customer | Miguel Nieto, Valerii Kravchuk |
8 | PZ | Create Hot Fix of Bug 1040735 | Marcos Albe, Michael Rikmas |
9 | OT | Query optimisation | Marcos Albe, Miguel Nieto |
10 | OT | Prepare custom build for Solaris | Miguel Nieto, Valerii Kravchuk |
11 | PZ | explain about Percona Monitoring plugins | Valerii Kravchuk |
12 | SS | Prepare access for customer servers for future work | Marcos Albe |
13 | AS | Decribe load balancing for pt-online-schema-change | Marcos Albe |
14 | FD | Managing deadlocks | Michael Rikmas, Valerii Kravchuk |
15 | OT | Suggestions about buffer pool size | Marcos Albe, Miguel Nieto |
That’s a power of MySQL GROUP_CONCAT!
Good call on this article, I almost never hear anything about this really useful SQL aggregate function.
Always worth remembering to set the session variable group_concat_max_len to a higher number if you are grouping excessively long lists
One thing to watch out for with GROUP_CONCAT (and ORDER BY) is how it can result in tmp tables on disk, the example query to list issues for each engineer above causes a tmp table on disk to be created:
show session status like ‘Created_tmp_disk_tables’;
+————————-+——-+
| Variable_name | Value |
+————————-+——-+
| Created_tmp_disk_tables | 1 |
+————————-+——-+
set session group_concat_max_len=512;
Stops the tmp table being created on disk. So does using ORDER BY NULL or removing the ORDER BY. Tested with mariadb 5.3.12, 10.0.4 and 5.6.13-rel61.0 Percona Server with XtraDB.
http://dev.mysql.com/doc/refman/5.5/en/group-by-functions.html#function_group-concat
The result type is TEXT or BLOB unless group_concat_max_len is less than or equal to 512, in which case the result type is VARCHAR or VARBINARY.
http://dev.mysql.com/doc/refman/5.5/en/internal-temporary-tables.html
Some conditions prevent the use of an in-memory temporary table, in which case the server uses an on-disk table instead:
Presence of a BLOB or TEXT column in the table
Presence of any string column in a GROUP BY or DISTINCT clause larger than 512 bytes
Presence of any string column with a maximum length larger than 512 (bytes for binary strings, characters for nonbinary strings) in the SELECT list, if UNION or UNION ALL is used
Also see:
http://bugs.mysql.com/bug.php?id=14169 – When using GROUP_CONCAT() function with group_concat_max_len > 512 then the field type will be BLOB if ORDER BY is used, otherwise it will be VARCHAR.
http://www.mysqlperformanceblog.com/2007/08/16/how-much-overhead-is-caused-by-on-disk-temporary-tables/
Thanks Peter for the comment. This is very interesting info which should be taken into attention.
Great topic, thanks!
whats the max length i can set for ” SET SESSION group_concat_max_len=15000000; ” is there any limit for the variable group_concat_max_len ?
thanks in advance.
Shamin,
Manual says it’s:
for 32-bit systems: 4294967295
for 64-bit systems: 18446744073709547520
You can see details here:
https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_group_concat_max_len
Thank you sooooooooooooooooooooooooooooo many much, It Rocks
Thank you very helpful article .
Thank you for the article.
I ‘ve got a question. Why are you using multiple table names in FROM and conditions in WHERE, instead of JOINs?
What about this scenario?
JobNum OprSeq PartNum
10000 10 N277070
10000 10 N277070-MM
10000 14 N277070
10000 14 N277070-MM
10000 14 N277070-001
10000 16 N277070-MM
10000 16 N277070
10000 16 N277070-001
10000 16 N277070-00024
10000 20 N277070
–EXPECTED RESULT–
JobNum OprSeq PartNum PartNumItemTwo PartNumItemThree PartNumItemFour
10000 10 N277070 N272770-MM NULL NULL
10000 14 N277070 N272770-MM N277070-001 NULL
10000 16 N277070 N272770-MM N277070-001 N277070-00024
10000 20 N277070 NULL NULL NULL
Is there a possible way to achieve this? I mean, is it even logical a cliente is asking for this?
Kudos to you..!!!
Great Article!! GROUP_CONCAT really interesting feature
Very helpful. Thank you Michael.
great article with full detail document
great tutorial explaining the power of concat and group concat.
thanks and kudos to you!
Hi,
If i want get all parents who have 2 AND 1 child id (or 2 only)
How do it ?
Thanks
How about if you want to do something a bit smarter so that the results that did look like this
5 (Marcos Albe, Miguel Nieto, Valerii Kravchuk), 13 (Marcos Albe)
turn into
5 (Marcos Albe, Miguel Nieto, Valerii Kravchuk) and 13 (Marcos Albe)
Great article! Being a Java guy learning to use MySQL, you really contributed in me building faith in this environment! Thank you!
Great topic, thanks!
Thanks man!! Great post
would it be possible to add a AND a to the same group_concat ??
I tried something like your example:
GROUP_CONCAT(DISTINCT ‘‘ ‘‘, CONCAT(e_name, ‘ ‘, e_surname), ‘‘ ‘ ORDER BY e_name SEPARATOR ‘, ‘) AS engineer_list
FROM
But that gives weird results. adding a , between the >’, ‘<span will not do better and '‘, Concat ….. does not work either.
Is it possible at all and how to write it?