20 January 2006

Think I found a bug ....

A couple days ago, a mistake caused a key-value table to generate new keys for duplicate values when there should have been a unique constraint on key. This affected the stat gathering table (stats for one value are now associated with many keys) and I have to clean up the data -- while preserving the stats that have already been generated. In trying to do this, I think I found a bug in replication .... an UPDATE statement that aggregates the data produces different results on my replication slave. I've submitted a formal bug report along with SQL to duplicate the error here. I'm curious if anyone else has encountered this bug, or knows of a way to achieve the result that I want without breaking replication.

Here's a description of the test case:

Two tables store key-value pairs (in the example, `color` and `animal`), and `a_to_c` stores a relationship between each key along with a third column, `count`. However, in the two key-value tables, the values are not unique. The overall purpose is to sum the `count` column in the third table for each unique animal+color combination and remove all duplicate keys from the `color` and `animal` tables. In other words -- because there is no unique constraint on `color` or `animal`, the color "grey" is given more than one key, as is the animal "dog". The goal is to rebuild the `color` and `animal` tables, preserving the lowest key for each distinct value and -- most importantly -- to rebuild the `a_to_c` table, preserving the total sum for each unique combination of animal+color. In the example, I stop short of finishing this process when the first error appears.

Here is the demonstration SQL:

-----------------
-- creating the tables and initializing data
-----------------

DROP TABLE IF EXISTS `animals`;
CREATE TABLE `animals` (
`k` int(11) NOT NULL auto_increment,
`v` varchar(10) default NULL,
PRIMARY KEY (`k`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `animals` VALUES (1,'dog'),(2,'cat'),(3,'dog');

DROP TABLE IF EXISTS `colors`;
CREATE TABLE `colors` (
`k` int(11) NOT NULL auto_increment,
`v` varchar(10) default NULL,
PRIMARY KEY (`k`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `colors` VALUES (1,'grey'),(2,'beige'),(3,'grey'),(4,'grey');

DROP TABLE IF EXISTS `a_to_c`;
CREATE TABLE `a_to_c` (
`animal` int(11) default NULL,
`color` int(11) default NULL,
`count` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `a_to_c` VALUES (1,2,1),(3,2,1),(1,1,1),(1,3,1),(2,3,2),(2,4,2);

DROP TABLE IF EXISTS sums;
CREATE TABLE sums SELECT animals.v AS animal, colors.v AS color, SUM(count)
FROM animals JOIN a_to_c ON animals.k=a_to_c.animal
JOIN colors ON colors.k=a_to_c.color
GROUP BY animal, color;

-----------------------------------
-- Everything up to here works fine.
-----------------------------------

mysql> select * from sums;
+--------+-------+------------+
| animal | color | SUM(count) |
+--------+-------+------------+
| cat | grey | 4 |
| dog | beige | 2 |
| dog | grey | 2 |
+--------+-------+------------+
3 rows in set (0.00 sec)

----------------------------------
-- It is this next query that produces unexpected results on the slave!
-- It does work as expected on the master.
-----------------------------------

UPDATE sums LEFT JOIN
(
SELECT v AS animal, MIN(k) AS k FROM animals GROUP BY animal
) AS a USING(animal)
SET sums.animal=a.k;

-------------------------------------
-- Here are the results of this UPDATE, on master and slave
---------------------
-- master
---------------------
mysql> select * from sums;
+--------+-------+------------+
| animal | color | SUM(count) |
+--------+-------+------------+
| 2 | grey | 4 |
| 1 | beige | 2 |
| 1 | grey | 2 |
+--------+-------+------------+
3 rows in set (0.00 sec)

----------------------
-- slave
----------------------
mysql> select * from sums;
+--------+-------+------------+
| animal | color | SUM(count) |
+--------+-------+------------+
| NULL | grey | 4 |
| 1 | beige | 2 |
| 1 | grey | 2 |
+--------+-------+------------+
3 rows in set (0.00 sec)

2 comments:

Devananda vdv said...

http://bugs.mysql.com/bug.php?id=16688

It has been fixed in 4.1.16. Too bad it was already reported; I'm still trying to get my first "real" bug report!

manasi said...
This comment has been removed by a blog administrator.