MySQL replication connection error on non-standard ports

or, “Why SELinux was the worst thing to happen to Linux since LinuxThreads.”

Setting up MySQL master-slave replication is pretty straightforward. I’ve done it plenty of times. Similarly, running multiple MySQL server instances on the same host using mysqld_multi, really painless. Give each instance its own server-id and port to listen on, and that’s it.

In this case, the server that was being replicated was listening on port 3307. I could connect to it just fine using the mysql client using “-P 3307” as the replication user. It was clearly possible for the two machines to communicate over TCP port 3307. But, then, why was the MySQL slave I/O thread failing to connect to the master?

mysql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
               Slave_IO_State: Connecting to master
                  Master_Host: x.x.x.x
                  Master_User: repl
                  Master_Port: 3307
                Connect_Retry: 60
...
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
...
                Last_IO_Errno: 2003
                Last_IO_Error: error connecting to master 'repl@x.x.x.x:3307' - retry-time: 60  retries: 86400
...
1 row in set (0.00 sec)

Makes no sense, right? Right! Unless, of course, you’re on a machine–in this case, Red Hat Enterprise Linux 5.7–that has ever-so-helpfully screwed you by installing SELinux policies that restrict what ports mysqld can use, and has SELinux enabled by default.

How can you tell if SELinux is enabled and its policies are being enforced?

# grep 1 /selinux/enforce
1

If you get no output, then SELinux isn’t being enforced. But, if it is enabled, here’s how you can tell if mysqld‘s ability to make network connections is being restricted by a policy:

# semanage port -l | grep 3306
mysqld_port_t                  tcp      1186, 3306

You can tell if this is happening if you look in /var/log/audit/audit.log and see an entry like this:

type=AVC msg=audit(1320869122.773:56478): avc: denied { name_connect } for pid=2990 comm=”mysqld” dest=3307 scontext=user_u:system_r:mysqld_t:s0 tcontext=system_u:object_r:port_t:s0 tclass=tcp_socket

(I bolded the elements to keep an eye out for.)

So, how do we remedy the situation? Personally, my first recommendation is to turn SELinux off. It is always more trouble than it’s worth. I’ve never heard of a single anecdote where it improved security or otherwise did anything useful.

Disable SELinux by editing /etc/selinux/config and change the line that reads “SELINUX=enforcing” to “SELINUX=disabled” and then reboot the system. There’s some caveats around disabling SELinux with respect to re-enabling it in the future, but you’re never going to do that if you have a choice, so who cares.

If you absolutely must leave SELinux enabled, because you’re actually burning in Hell and being tortured by some unreasonable and irrational organizational policy that requires it, then this is how you can add ports to the list of ports that mysqld will be allowed to use:

# semanage port -a -t mysqld_port_t -p tcp 3307

This command might take a few seconds to complete, and won’t generate any output if successful. You can check to see if it did what you wanted using the same command we used before, but this time the newly added port should be in the list, as well:

# semanage port -l | grep 3306
mysqld_port_t                  tcp      3307, 1186, 3306

To make sure this really did fix the problem, lets look at the MySQL replication status, again:

mysql> SHOW SLAVE STATUS \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: x.x.x.x
                  Master_User: repl
                  Master_Port: 3307
                Connect_Retry: 60
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...
                Last_IO_Errno: 0
                Last_IO_Error: 
...
1 row in set (0.00 sec)

And there you have it, replication configured on TCP port 3307, connected successfully, waiting for the master to send it updates, like it should be.

Hopefully, this has helped save you time, aggravation and much head-scratching as you tried to figure out why MySQL replication wasn’t working on a port other than 3306.

Got any other good gotcha stories like this? Share them in the comments below.

MySQL Geo Distance Code and Samples

Over the years, I’ve implemented and re-implemented the haversine formula whenever I’ve needed to compute great-circle distances, which is pretty common when you’re doing geospatial proximity searches, e.g., “What’s near me?” or “How far are these two locations from each other?”

For an implementation of the formula in MySQL, Alexander Rubin’s presentation is quite useful, and is what I’ve based the following code on this time around.

Hopefully, rather than re-implementing this yet again the next time I need it, I’ll find my own blog entry on it and just reuse this code. I keep forgetting where I’ve stashed the most recent copy, which is why I keep ending up re-implementing this every time I’ve needed it.

Here’s the code, tested on MySQL 5.1.41:

DELIMITER $$

DROP FUNCTION IF EXISTS geodist $$
CREATE FUNCTION geodist (
  src_lat DECIMAL(9,6), src_lon DECIMAL(9,6),
  dst_lat DECIMAL(9,6), dst_lon DECIMAL(9,6)
) RETURNS DECIMAL(6,2) DETERMINISTIC
BEGIN
  SET @dist := 3956 * 2 * ASIN(SQRT(
      POWER(SIN((src_lat - ABS(dst_lat)) * PI()/180 / 2), 2) +
      COS(src_lat * PI()/180) *
      COS(ABS(dst_lat) * PI()/180) *
      POWER(SIN((src_lon - dst_lon) * PI()/180 / 2), 2)
    ));
  RETURN @dist;
END $$

DROP FUNCTION IF EXISTS geodist_pt $$
CREATE FUNCTION geodist_pt (src POINT, dst POINT) 
RETURNS DECIMAL(6,2) DETERMINISTIC
BEGIN
  RETURN geodist(Y(src), X(src), Y(dst), X(dst));
END $$

DROP PROCEDURE IF EXISTS geobox $$
CREATE PROCEDURE geobox (
  IN src_lat DECIMAL(9,6), IN src_lon DECIMAL(9,6), IN dist DECIMAL(6,2),
  OUT lat_top DECIMAL(9,6), OUT lon_lft DECIMAL(9,6),
  OUT lat_bot DECIMAL(9,6), OUT lon_rgt DECIMAL(9,6)
) DETERMINISTIC
BEGIN
  /*
   * src_lat, src_lon -> center point of bounding box
   * dist -> distance from center in miles
   * lat_top, lon_lft -> top left corner of bounding box
   * lat_bot, lon_rgt -> bottom right corner of bounding box
   */
  SET lat_top := src_lat - (dist / 69);
  SET lon_lft := src_lon - (dist / ABS(COS(RADIANS(src_lat)) * 69));
  SET lat_bot := src_lat + (dist / 69);
  SET lon_rgt := src_lon + (dist / ABS(COS(RADIANS(src_lat)) * 69));
END $$

DROP PROCEDURE IF EXISTS geobox_pt $$
CREATE PROCEDURE geobox_pt (
  IN pt POINT, IN dist DECIMAL(6,2),
  OUT top_lft POINT, OUT bot_rgt POINT
) DETERMINISTIC
BEGIN
  /*
   * pt -> center point of bounding box
   * dist -> distance from center in miles
   * top_lft -> top left corner of bounding box
   * bot_rgt -> bottom right corner of bounding box
   */
  CALL geobox(Y(pt), X(pt), dist, @lat_top, @lon_lft, @lat_bot, @lon_rgt);
  SET top_lft := POINT(@lon_lft, @lat_top);
  SET bot_rgt := POINT(@lon_rgt, @lat_bot);
END $$

DROP PROCEDURE IF EXISTS geobox_pt_bbox $$
CREATE PROCEDURE geobox_pt_bbox (
  IN pt POINT, IN dist DECIMAL(6,2), OUT bbox POLYGON
) DETERMINISTIC
BEGIN
  /*
   * pt -> center point of bounding box
   * dist -> distance from center in miles
   * bbox -> bounding box, dist miles from pt
   */
  CALL geobox_pt(pt, dist, @top_lft, @bot_rgt);
  SET bbox := Envelope(LineString(@top_lft, @bot_rgt));
END $$

DELIMITER ;

And, here’s examples on how to use the above function and procedures:

SELECT @src := pt FROM exp_geo
WHERE postal_code = '07405' AND city = 'Butler';

CALL GEOBOX_PT(@src, 10.0, @top_lft, @bot_rgt);
SELECT Y(@top_lft) AS lat_top, X(@top_lft) AS lon_lft,
       Y(@bot_rgt) AS lat_bot, X(@bot_rgt) AS lon_rgt;

CALL GEOBOX_PT_BBOX(@src, 10.0, @bbox);
SELECT AsWKT(@bbox);

-- Assuming INDEX on (lat, lon) and SPATIAL INDEX on (pt) ...

-- Fast:
SELECT g.postal_code, g.city, g.state,
       GEODIST(Y(@src), X(@src), lat, lon) AS dist
FROM exp_geo g
WHERE lat BETWEEN Y(@top_lft) AND Y(@bot_rgt)
AND lon BETWEEN X(@top_lft) AND X(@bot_rgt)
HAVING dist < 5.0
ORDER BY dist;

-- Also fast:
SELECT g.postal_code, g.city, g.state, GEODIST_PT(@src, g.pt) AS dist
FROM exp_geo g
WHERE lat BETWEEN Y(@top_lft) AND Y(@bot_rgt)
AND lon BETWEEN X(@top_lft) AND X(@bot_rgt)
HAVING dist < 5.0
ORDER BY dist;

-- Slow:
SELECT g.postal_code, g.city, g.state, GEODIST_PT(@src, g.pt) AS dist
FROM exp_geo g
WHERE Y(pt) BETWEEN Y(@top_lft) AND Y(@bot_rgt)
AND X(pt) BETWEEN X(@top_lft) AND X(@bot_rgt)
HAVING dist < 5.0
ORDER BY dist;

-- Also slow:
SELECT g.postal_code, g.city, g.state, GEODIST_PT(@src, g.pt) AS dist
FROM exp_geo g
WHERE MBRContains(@bbox, pt) = 1
HAVING dist < 5.0
ORDER BY dist;

Normally, I try to explain the stuff I post, but this time I’m doing it for somewhat selfish reasons — so I don’t lose this work yet again. I’m just throwing this up here so Google will index it, so I can find it the next time I need it.

I suppose if you read this and have questions, please ask them in the comments below. I’ll do my best to answer, if I can.

Using MySQL Meta Data Effectively at ODTUG Kaleidoscope 2010

Since Oracle owns MySQL through its acquisition of Sun, more Oracle conferences are providing MySQL content. This past Oracle Development Tools User Group (ODTUG) Kaleidoscope 2010 conference from June 27 through July 1 had a whole dedicated track for MySQL.

Using MySQL Meta Data Effectively - title slide thumbnail

I was fortunate enough to have the opportunity to speak on MySQL metadata, titled “Using MySQL Meta Data Effectively“. Here’s the abstract:

This presentation discusses what MySQL meta data is available including the ‘mysql’ meta schema, the INFORMATION_SCHEMA (I_S) tables first introduced in MySQL 5.0 and extended in MySQL 5.1, storage engine specific INFORMATION_SCHEMA tables, as well as techniques for writing your own INFORMATION_SCHEMA plug-ins. MySQL also provides a number of SHOW commands that provide easily formatted presentation of MySQL meta data. Dossy Shiobara will also discuss some of the limitations and performance implications of the INFORMATION_SCHEMA.

Paper first page thumbnail

You can download the materials from my session here: