Recently, I got access to the list of MySQL bug reports from bugs.mysql.com, which someone had crawled and stored in a MySQL database. I thought it would be interesting to see who the heroes of MySQL bug reporting are!
Top MySQL Bug Reporters Ever
    mysql> select rank() over(order by count(*) desc) my_rank, count(*) cnt, reporter from bugs where reporter != "OCA Admin" and reporter != "[ name withheld ]" group by reporter order by cnt desc limit 20;
    +---------+------+--------------------+
    | my_rank | cnt  | reporter           |
    +---------+------+--------------------+
    |       1 | 1234 | Shane Bester       |
    |       2 |  869 | Peter Gulutzan     |
    |       3 |  818 | Daniël van Eeden   |
    |       4 |  587 | Joerg Bruehe       |
    |       5 |  572 | Philip Stoev       |
    |       6 |  568 | Peter Laursen      |
    |       7 |  564 | Roel Van de Paar   |
    |       8 |  526 | Guilhem Bichot     |
    |       9 |  524 | Jonathan Miller    |
    |      10 |  476 | Hartmut Holzgraefe |
    |      11 |  431 | Simon Mudd         |
    |      12 |  389 | Matthias Leich     |
    |      13 |  388 | Todd Farmer        |
    |      14 |  377 | Alexander Nozdrin  |
    |      15 |  375 | Jonas Oreland      |
    |      16 |  372 | Sven Sandberg      |
    |      17 |  354 | Sveta Smirnova     |
    |      18 |  337 | Paul DuBois        |
    |      19 |  336 | Mark Callaghan     |
    |      20 |  335 | Laurynas Biveinis  |
    +---------+------+--------------------+
    20 rows in set (0.19 sec)
Congrats Shane Bester, you kick ass!
I thought this was also a good way to show the SQL rank() window function in action: if two people have reported the same number of bugs, they should share the same spot. Not that it happened in this case.
Let’s look at the most persistent bug reporters, that is, anyone who reported bugs in every one of the 10 years of the 2010s.
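To see what a tie looks like, here is a minimal sketch on made-up data. It uses Python's built-in sqlite3 module in place of MySQL, which is an assumption made for runnability and needs SQLite 3.25 or later for window functions. The reporter names are hypothetical, and dense_rank() is added only for comparison.

```python
import sqlite3

# Hypothetical toy data, one row per bug report. Not taken from bugs.mysql.com.
rows = [("alice",), ("alice",), ("alice",),
        ("bob",), ("bob",),
        ("carol",), ("carol",),
        ("dave",)]

conn = sqlite3.connect(":memory:")
conn.execute("create table bugs (reporter text)")
conn.executemany("insert into bugs values (?)", rows)

# Same shape as the query in the post, with dense_rank() next to rank().
ranked = conn.execute("""
    select rank()       over (order by count(*) desc) as my_rank,
           dense_rank() over (order by count(*) desc) as my_dense_rank,
           count(*) as cnt,
           reporter
    from bugs
    group by reporter
    order by cnt desc, reporter
""").fetchall()

for row in ranked:
    print(row)
```

In this toy data, bob and carol both have two reports, so both get rank 2. rank() then skips to 4 for dave, while dense_rank() would give him 3.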
    mysql> select count(distinct year(submitted_date)) yr, count(*) cnt, reporter from bugs where year(submitted_date) between 2010 and 2019 group by reporter having yr=10 order by cnt desc limit 100;
    +----+-----+-------------------+
    | yr | cnt | reporter          |
    +----+-----+-------------------+
    | 10 | 808 | Daniël van Eeden  |
    | 10 | 543 | Shane Bester      |
    | 10 | 381 | Simon Mudd        |
    | 10 | 347 | Peter Laursen     |
    | 10 | 235 | Valeriy Kravchuk  |
    | 10 | 218 | Sveta Smirnova    |
    | 10 | 137 | Alexey Kopytov    |
    | 10 |  87 | Morgan Tocker     |
    | 10 |  73 | Domas Mituzas     |
    | 10 |  43 | Federico Razzoli  |
    | 10 |  34 | Peter Brawley     |
    +----+-----+-------------------+
    11 rows in set (0.20 sec)
Wow! In the last 10 years, Shane Bester is being edged out by Daniël van Eeden. Congrats Daniël!
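One caveat when reading this result: the `having yr=10` filter keeps only reporters who were active in every year, so someone who filed more bugs overall but skipped a year would not appear. The sketch below shows that effect on made-up data. It uses Python's sqlite3 module rather than MySQL, a three-year window instead of ten, and a precomputed `yr` column instead of `year(submitted_date)`; all of these are assumptions for illustration.

```python
import sqlite3

# Hypothetical toy data: (reporter, year submitted). Not real bug counts.
rows = ([("steady", y) for y in (2017, 2018, 2019)]   # 3 bugs, one in every year
        + [("prolific", 2017)] * 5                     # 7 bugs in total...
        + [("prolific", 2019)] * 2)                    # ...but none in 2018

conn = sqlite3.connect(":memory:")
conn.execute("create table bugs (reporter text, yr integer)")
conn.executemany("insert into bugs values (?, ?)", rows)

base = """
    select count(distinct yr) as yrs, count(*) as cnt, reporter
    from bugs
    where yr between 2017 and 2019
    group by reporter
    {having}
    order by cnt desc
"""
# With the filter: only reporters active in all three years.
every_year = conn.execute(base.format(having="having count(distinct yr) = 3")).fetchall()
# Without the filter: everyone, ordered by total count over the period.
whole_period = conn.execute(base.format(having="")).fetchall()

print(every_year)
print(whole_period)
```

Dropping the `having` clause answers a different question, namely who reported the most over the whole period, regardless of gaps.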
What if we look at the most recent full year, 2019?
    mysql> select rank() over(order by count(*) desc) my_rank, count(*) cnt, reporter from bugs where year(submitted_date) = 2019 and reporter != "OCA Admin" and reporter!= "[name withheld]" group by reporter order by cnt desc limit 20;
    +---------+-----+-------------------------+
    | my_rank | cnt | reporter                |
    +---------+-----+-------------------------+
    |       1 |  34 | Tor Didriksen           |
    |       2 |  30 | Manuel Rigger           |
    |       3 |  19 | Laurynas Biveinis       |
    |       3 |  19 | Hrvoje Matijakovic      |
    |       5 |  15 | Georgi Sotirov          |
    |       5 |  15 | Fungo Wang              |
    |       5 |  15 | Simon Mudd              |
    |       5 |  15 | Sivert Sørumgård        |
    |       5 |  15 | chen zongzhi            |
    |       5 |  15 | Meiji Kimura            |
    |       5 |  15 | Paul Weiss              |
    |      12 |  13 | JianJun Shi             |
    |      13 |  12 | Przemysław Skibiński    |
    |      13 |  12 | Bradley Grainger        |
    |      15 |  11 | Przemyslaw Malkowski    |
    |      15 |  11 | Jean-François Gagné     |
    |      15 |  11 | Cai Yibo                |
    |      18 |  10 | Manuel Ung              |
    |      18 |  10 | tsubasa tanaka          |
    |      18 |  10 | Yoshiaki Yamasaki       |
    +---------+-----+-------------------------+
    20 rows in set (0.04 sec)
Tor Didriksen gets the thorny bug crown for 2019, and what happened to Shane? He is not even in our Top 20 list!
Now let’s look not at individual reporters, but at overall bug reporting activity by year:
    mysql> select count(*) cnt, count(distinct reporter) reporters, year(submitted_date) from bugs where year(submitted_date) != 0 group by Year(submitte
    +------+-----------+----------------------+
    | cnt  | reporters | year(submitted_date) |
    +------+-----------+----------------------+
    |   14 |         8 |                 2002 |
    | 2226 |       973 |                 2003 |
    | 5353 |      2363 |                 2004 |
    | 8281 |      3524 |                 2005 |
    | 8291 |      3188 |                 2006 |
    | 7556 |      2616 |                 2007 |
    | 7593 |      2581 |                 2008 |
    | 7388 |      2479 |                 2009 |
    | 8062 |      3342 |                 2010 |
    | 4397 |      2960 |                 2011 |
    | 3973 |      2821 |                 2012 |
    | 3128 |      1670 |                 2013 |
    | 3538 |      1600 |                 2014 |
    | 3499 |      1676 |                 2015 |
    | 3085 |      1525 |                 2016 |
    | 2611 |      1386 |                 2017 |
    | 2519 |      1439 |                 2018 |
    | 1993 |      1215 |                 2019 |
    |  261 |       203 |                 2020 |
    +------+-----------+----------------------+
    19 rows in set (0.14 sec)
It is much more interesting to look at this in graph form:
We can see that bug reporting activity peaked in 2005-2006 at roughly 8,300 reports a year, dropped drastically after 2010 (from 8,062 reports in 2010 to 4,397 in 2011), and has continued to slide since.
We can speculate on the reasons for the slide, but one thing to consider: Oracle’s purchase of Sun Microsystems was completed in 2010 and, if I remember correctly, many bugs started to be reported in the internal bugs database, which has not been open to the public since then.
I hope you find these stats interesting 🙂
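To put rough numbers on that slide, here is a small sketch that works directly from the yearly counts in the query output above. It leaves out 2002 and 2020, on the assumption that they are partial years in this dataset; the helper name `pct_change` is made up for illustration.

```python
# Yearly bug counts copied from the query output above.
# 2002 and 2020 are left out, assuming they are partial years.
bugs_per_year = {
    2003: 2226, 2004: 5353, 2005: 8281, 2006: 8291, 2007: 7556,
    2008: 7593, 2009: 7388, 2010: 8062, 2011: 4397, 2012: 3973,
    2013: 3128, 2014: 3538, 2015: 3499, 2016: 3085, 2017: 2611,
    2018: 2519, 2019: 1993,
}

# Year with the highest number of reports.
peak_year = max(bugs_per_year, key=bugs_per_year.get)


def pct_change(start, end):
    """Percentage change in bug count from year `start` to year `end`."""
    return 100.0 * (bugs_per_year[end] - bugs_per_year[start]) / bugs_per_year[start]


print(peak_year)
print(round(pct_change(2010, 2011), 1))
print(round(pct_change(peak_year, 2019), 1))
```

With these figures, the count falls by roughly 45% from 2010 to 2011, and by roughly 76% from the 2006 peak to 2019.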
Very interesting, thanks for sharing, Peter. Is the dataset available?
Do you have similar stats for Percona Server? 🙂
As for Daniël van Eeden being the top reporter of the last 10 years (“Shane Bester is being edged out by Daniël van Eeden”): without wanting to take anything away from Daniël, someone may have filed more reports over that period but none in one of those years, since your query only considers people who reported bugs in every one of the last 10 years. 😉
> “and what happened to Shane?”
My reports have had to become mostly internal over the last few years 😉
Why? Don’t you love us (the community) any more? 🙂
Thanks for documenting this. It is a great statement about the value of the community.