Category: developers journal

Mysql GET_LOCK() “problem” … and how to debug it with help of mysql performance_schema

Author: damir August 4, 2017

After I recently switched back to Linux with php7 and mysql 5.7 I run into “bug” with MySQL GET_LOCK().
I executed php script which uses GET_LOCK(). Scripts were executed parallel (1-2s one after another) from two tabs of the same browser.
According to the output, locking was not working!!!

My first thought was that improvements which were introduced in MySQL 5.7. regarding GET_LOCK() broke something.

Test script:

<?php

$host = 'DB_HOST';
$db   = 'DB_NAME';
$user = 'DB_USER';
$pass = 'DB_PASSWD';
$charset = 'utf8mb4';

$dsn = "mysql:host=$host;dbname=$db;charset=$charset";
$opt = [
    PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION,
    PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
    PDO::ATTR_EMULATE_PREPARES   => false,
	PDO::ATTR_PERSISTENT => false
];
$pdo = new PDO($dsn, $user, $pass, $opt);

echo "<pre>";
echo "pid=(".getmypid().")\n";

$stmt = $pdo->query('SELECT GET_LOCK("foobar", 15)');
$row = $stmt->fetch();

var_dump($row);
echo "\n";

echo "Sleep for 50s\n";
sleep(50);

echo "</pre>";
?>

Lock timeout is 15s and script executes (and hold lock) for 50s so I could have time to start both scripts in parallel and monitor MySQL performance_schema.

Result from the first tab:

pid=(14769)
array(1) {
  ["GET_LOCK("foobar", 15)"]=>
  int(1)
}

Sleep for 50s

Result from the second tab:

pid=(14769)
array(1) {
  ["GET_LOCK("foobar", 15)"]=>
  int(1)
}

Sleep for 50s

Output is not as expected. GET_LOCK() from the second tab should return 0, not 1.

I executed the script again, but from the command line, again in parallel. Second one was executed 1-2s after the first one.

Result from the first (cmd line):

$ php -q locktest_1.php 

<pre>pid=(16328)
array(1) {
  ["GET_LOCK("foobar", 15)"]=>
  int(1)
}

Sleep for 50s

Result from the second (cmd line):

$ php -q locktest_1.php 

<pre>pid=(16382)
array(1) {
  ["GET_LOCK("foobar", 15)"]=>
  int(0)
}

Sleep for 50s

And everything was working as expected. First script obtained the lock, and second didn't.

With helpful comments from StackOverflow community I continued with my investigation.

Put the performance_schema to good use

In order to see what is happening with my locks it is necessary to monitor performance_schema.threads and performance_schema.metadata_locks closely. To learn more about it check the MySQL documentation.

I added connection_id() in php test script in order to match running php script with entries in the performance_schema.threads and performance_schema.metadata_locks tables.

Second round of tests

Now, armed with all this data, second round of test can begin.

After running the test scripts I quickly executed these sql queries:

select * from performance_schema.threads;
select * from performance_schema.metadata_locks;

You will see active threads and locks and their status (GRANTED, PENDING).

First one returned:

pid=(14773)
CONNECTION_ID()=(71)
array(1) {
  ["GET_LOCK("foobar", 15)"]=>
  int(1)
}
Sleep for 50s

Second one returned:

pid=(14773)
CONNECTION_ID()=(72)
array(1) {
  ["GET_LOCK("foobar", 15)"]=>
  int(1)
}
Sleep for 50s

Also in performance_schema.threads there was at any point in time only one entry related to the test script.
First it showed:

Please note row with PROCESSLIST_ID = 71 (matching the output from the first script)

I kept refreshing performance_schema.threads and noticed that one thread related to the test script was replaced with another one. They were not running in parallel like they do when I start test script from cmd line.

Please note row with PROCESSLIST_ID = 72 (matching the output from the second script)
Also performance_schema.metadata_locks showed only one lock at any point in time (not one GRANTED and one PENDING):

For example, when script was executed in parallel from cmd line performance_schema.metadata_locks showed:

For now, it seems that locking is in fact working correctly, but the scripts executed from two tabs of the same browser are not being executed in parallel.

Third round of tests

I upgraded the test script to show start timestamp and end timestamp. This also proved that although I started the scripts with 1-2s delay, that they in fact were not running at the same time.

Final test script:

<?php

$host = 'DB_HOST';
$db   = 'DB_NAME';
$user = 'DB_USER';
$pass = 'DB_PASSWD';
$charset = 'utf8mb4';

$dsn = "mysql:host=$host;dbname=$db;charset=$charset";
$opt = [
    PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION,
    PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
    PDO::ATTR_EMULATE_PREPARES   => false,
	PDO::ATTR_PERSISTENT => false
];
$pdo = new PDO($dsn, $user, $pass, $opt);

echo "<pre>";

echo "start timestamp=(".date("c").")\n";

echo "pid=(".getmypid().")\n";

$stmt = $pdo->query('SELECT CONNECTION_ID() as connid');
$row = $stmt->fetch();

echo "CONNECTION_ID()=(".$row['connid'].")\n";

$stmt = $pdo->query('SELECT GET_LOCK("foobar", 15)');
$row = $stmt->fetch();

var_dump($row);
echo "\n";

echo "Sleep for 50s\n";
sleep(50);

echo "end timestamp=(".date("c").")\n";
echo "</pre>";
?>

First one returned:

start timestamp=(2017-08-03T14:30:48+00:00)
pid=(14768)
CONNECTION_ID()=(73)
array(1) {
  ["GET_LOCK("foobar", 15)"]=>
  int(1)
}
Sleep for 50s
end timestamp=(2017-08-03T14:31:38+00:00)

Second one returned:

start timestamp=(2017-08-03T14:31:38+00:00)
pid=(14768)
CONNECTION_ID()=(74)
array(1) {
  ["GET_LOCK("foobar", 15)"]=>
  int(1)
}
Sleep for 50s
end timestamp=(2017-08-03T14:32:28+00:00)

Please note the end timestamp of the first and start timestamp of the second script.

Conclusion

Execution of the scripts was serialized. They definitely didn't run in parallel and that was the reason why both succeeded to acquire the lock.

GET_LOCK() in MySQL 5.7 is still working as it should (plus the improvements). But even more important, I realized what a powerful tool Performance Schema is in debugging what at first appeared to be really weird bug.

Author
damir

    Leave a Reply

    Your email address will not be published. Required fields are marked *