EC2 cannot reach RDS anymore
Posted by: qoppp
Posted on: Mar 21, 2017 12:58 AM
Hi all,

Since last night, all my servers have been getting a database connection error, but my own computer can still connect to the database server.

SQLSTATE[HY000] [2002] php_network_getaddresses: getaddrinfo failed: Name or service not known
Replies
Re: EC2 cannot reach RDS anymore
Posted on: Mar 21, 2017 5:10 AM
in response to: qoppp
Same here. Three RDS instances in Oregon, and one in Virginia. All four of the sites went down. I can't figure out why all of a sudden the EC2 instances won't connect to them.

Someone help, please!

[EDIT] I might add that I can SSH into the EC2 instance and connect to MySQL from there no problem. [/EDIT]

Edited by: Steven Scott Koontz on Mar 21, 2017 5:10 AM

Edited by: Steven Scott Koontz on Mar 21, 2017 5:40 AM
Re: EC2 cannot reach RDS anymore
Posted on: Mar 21, 2017 6:07 AM
in response to: Steven Scott Koontz
Rebooting the EC2 instances seems to have worked, the sites are back up again. What could have happened?
Re: EC2 cannot reach RDS anymore
Posted by: sea-you
Posted on: Mar 21, 2017 6:59 AM
in response to: qoppp
Experienced the same here in ap-southeast-1 and some other regions. Not only RDS was affected, but also ElastiCache and external DNS lookups. This most probably has something to do with either Route 53 or the connection between EC2 and Route 53.
Re: EC2 cannot reach RDS anymore
Posted by: rsf
Posted on: Mar 21, 2017 7:39 AM
in response to: qoppp
We experienced this issue last night as well. It started at approximately 9:55 PM EDT on 3/20/2017 and affected multiple instances in us-east-1, which couldn't reach RDS or some external services. It looks like PHP cached a bad DNS response and Apache was unable to connect to RDS for an extended period of time. To fix it, we had to restart the Apache services.

Some posts where others reported being affected as well:

https://stackoverflow.com/questions/42926075/pdoexception-sqlstatehy000-2002-php-network-getaddresses-getaddrinfo-faile

https://stackoverflow.com/questions/42925765/pdo-exception-php-network-getaddresses-getaddrinfo-failed-after-changing-dns

@AWS please let us know what happened. There's nothing on your status page related to this incident and clearly multiple customers were affected.
Re: EC2 cannot reach RDS anymore
Posted by: petejlawrence1
Posted on: Mar 21, 2017 7:54 AM
in response to: qoppp
We had 3 EC2 instances in eu-west-1 do the same thing whilst trying to connect to two different RDS endpoints at 8:08GMT, all running Ubuntu 16.04 and PHP 7. Other servers running Ubuntu 14.04 and PHP 5 weren't affected (may just be a coincidence though).

An external customer using non AWS services experienced the same issue a little later around 8:30GMT when connecting to our API (an endpoint hosted on Route 53), again they were running Ubuntu 16.04 and PHP 7.

In both cases it appeared that an invalid response had been cached somewhere, because at the same time we could resolve the same hostnames using nslookup/dig.

Subsequently, a colleague has just experienced the same error (14:30GMT), on his laptop running at home, whilst attempting to connect to a Route 53 hosted endpoint.

In all cases, restarting Apache/the instance sorted the issue.
Re: EC2 cannot reach RDS anymore
Posted by: rsf
Posted on: Mar 21, 2017 7:58 AM
in response to: petejlawrence1
Hi @petejlawrence1,

Thanks for your example. We're on 16.04 but running PHP 5.6. Some of our colleagues in the office who connect to RDS from Apache/PHP also experienced the same problem. In their case, restarting Apache also fixed it.
Re: EC2 cannot reach RDS anymore
Posted by: ChrisPinnacle
Posted on: Mar 21, 2017 8:54 AM
in response to: qoppp
From what I can tell this is not an AWS-specific issue. It is pretty likely to have been caused by a recent libc6 security update on Ubuntu 16.04 as there was a patch to the DNS resolution code: https://launchpad.net/ubuntu/+source/glibc/2.23-0ubuntu6. It also affected my development environment which does not use any AWS infrastructure.
Re: EC2 cannot reach RDS anymore
Posted by: sddot
Posted on: Mar 21, 2017 8:58 AM
in response to: qoppp
Also experienced this. us-east-1 on both Ubuntu 14.04 and 16.04.
Re: EC2 cannot reach RDS anymore
Posted by: rsf
Posted on: Mar 21, 2017 9:10 AM
in response to: qoppp
In our case, we confirmed that our instances started reporting errors around the time the update to https://launchpad.net/ubuntu/+source/glibc/2.23-0ubuntu6 was installed. Each instance installed the update at a vastly different time, and errors started immediately afterwards.
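
One way to check that timing on an instance is to pull the libc6 upgrade entries out of dpkg's log and compare them with when the errors began. This is a minimal sketch, assuming the default Debian/Ubuntu log at /var/log/dpkg.log and its usual "date time action package old-version new-version" line layout. The function name is made up for illustration.

```shell
# List libc6 upgrade events from dpkg log lines read on stdin.
# Assumes lines shaped like:
#   2017-03-21 06:25:01 upgrade libc6:amd64 <old-version> <new-version>
# Prints: date, time, old version, "->", new version.
libc6_upgrades() {
    awk '$3 == "upgrade" && ($4 == "libc6" || index($4, "libc6:") == 1) {
        print $1, $2, $5, "->", $6
    }'
}

# Typical use on an instance (rotated logs would have to be read separately):
#   libc6_upgrades < /var/log/dpkg.log
```

The timestamps printed are in the instance's local time zone, so they may need converting before comparing them with application error logs.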
Re: EC2 cannot reach RDS anymore
Posted by: deskpro-cn
Posted on: Mar 21, 2017 9:56 AM
in response to: rsf
I noticed the same thing after looking at unattended update logs. Do you know if there is a submitted bug report?

Restarting php-fpm fixed it temporarily, but the problem has recurred three times.

So I've rolled back the update to hopefully fix it until the package can be fixed.

apt-get -y --allow-downgrades install libc6=2.23-0ubuntu3


Edited by: deskpro-cn on Mar 21, 2017 4:57 PM
Re: EC2 cannot reach RDS anymore
Posted by: rsf
Posted on: Mar 21, 2017 10:12 AM
in response to: deskpro-cn
We've not had the issue recur, but the update has since been applied to new instances and caused the issue there. Perhaps you need to update the package on all instances yourself, so that the unattended upgrade doesn't do it unexpectedly.
Re: EC2 cannot reach RDS anymore
Posted by: evan-por
Posted on: Mar 21, 2017 11:05 AM
in response to: qoppp
We can confirm that this happened to all of our instances this morning as well. Confirmed in Australia and US-West-1, issues resolving domains in US-East-1 and externally. Issues getting to S3 APIs.

A reboot seems to have fixed it.

Would like to get an official response from Amazon on this one as to why it happened and what, if anything, we should do to avoid it in the future.
Re: EC2 cannot reach RDS anymore
Posted by: Antwan
Posted on: Mar 21, 2017 11:11 AM
in response to: qoppp
Same here, we had connection issues since 5.30am UTC on eu-west.

OperationalError: could not translate host name "our-rds.our-instance.eu-west-1.rds.amazonaws.com" to address: Name or service not known

We keep experiencing DNS resolution problems when our EC2 instances connect to our RDS DB. Restarting our web server processes sometimes helps, but resolution keeps failing again at some point, resulting in HTTP 500s.

Can you please confirm this is a known issue and that you are working on it?
Re: EC2 cannot reach RDS anymore
Posted by: RossCampbell3
Posted on: Mar 21, 2017 12:01 PM
in response to: qoppp
I saw the same thing -- errors connecting to redis because of DNS, and it happened not too long after the libc update. Running ubuntu 16.04, php7, php-fpm, nginx

But... some of the timings of my servers updating to the latest libc don't correlate exactly with these outages. I wonder if some Amazon services were hit with the same libc bug?

Here is the Ubuntu bug report: https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1674733
Re: EC2 cannot reach RDS anymore
Posted by: kcromwell
Posted on: Mar 21, 2017 12:37 PM
in response to: qoppp
Indeed the problem has been linked to the libc released to servers set to perform unattended updates.

It has an active bug on the Ubuntu launchpad and according to a post about 40 minutes ago, packages are being rebuilt after downgrading that lib.

https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1674733

It sounds like restarting php5-fpm process solves the problem temporarily, but restarting your server fixes it permanently (for now). When the new lib is deployed, a server restart will be required again.
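
To see which long-running processes would still need a restart, one option is to look for processes that map a libc file that has since been replaced on disk. This is only a sketch: it assumes the failures are confined to processes started before the libc6 upgrade (which is what the restart fix suggests), and it relies on Linux marking replaced files as "(deleted)" in /proc/<pid>/maps. Both function names are illustrative.

```shell
# Succeeds if the /proc/<pid>/maps text on stdin contains a libc mapping
# whose backing file was replaced on disk (the kernel appends " (deleted)").
maps_show_stale_libc() {
    grep -Eq '/libc[-.][^/ ]* \(deleted\)$'
}

# Print "PID command" for every process still mapping the old libc.
# Run as root to be able to inspect other users' processes.
list_stale_libc_processes() {
    for maps in /proc/[0-9]*/maps; do
        if cat "$maps" 2>/dev/null | maps_show_stale_libc; then
            pid=${maps#/proc/}
            pid=${pid%/maps}
            printf '%s %s\n' "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)"
        fi
    done
}

# Typical use after the package upgrade has been installed:
#   list_stale_libc_processes
```

Anything this lists (php-fpm, Apache, and so on) was started before the library changed and is a candidate for a restart. A full reboot clears all of them at once.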
Re: EC2 cannot reach RDS anymore
Posted by: esono_heinrich
Posted on: Mar 22, 2017 1:36 AM
in response to: qoppp
We have been experiencing DNS resolution errors in our web application since the 21st of March. The errors persist even when the same names resolve successfully from a shell. We have to restart the affected service to get DNS resolution working again. So if this is the same issue, it's definitely not limited to RDS. It happened on the 21st at 07:30 CET and again today, the 22nd, at 07:30 CET.
Re: EC2 cannot reach RDS anymore
Posted by: rsf
Posted on: Mar 22, 2017 5:18 AM
in response to: esono_heinrich
Another Ubuntu update (2.23-0ubuntu7) is being pushed out today to fix a regression caused by the first update. This triggered the same issue for us again.

The fix for us is to force the upgrade on all instances (so it isn't installed at random times), followed by a rolling restart of long-running services.

See https://www.ubuntu.com/usn/usn-3239-2/ for details.
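
Before and after forcing the upgrade, it can help to check which of the two libc6 builds an instance actually has. A rough sketch that knows only the two versions named in this thread (2.23-0ubuntu6 for the update that introduced the regression, 2.23-0ubuntu7 for the follow-up) and reports anything else as unknown. The function name and the labels it prints are illustrative.

```shell
# Classify a libc6 package version string using only the two versions
# mentioned in this thread; any other version is reported as "unknown".
classify_libc6_version() {
    case "$1" in
        2.23-0ubuntu6) echo "regressed" ;;
        2.23-0ubuntu7) echo "fixed" ;;
        *)             echo "unknown" ;;
    esac
}

# Typical use on an instance (assumes libc6 is installed for one architecture):
#   classify_libc6_version "$(dpkg-query -W -f='${Version}' libc6)"
```

A "fixed" result only describes the package on disk. Services started before the upgrade still need the rolling restart described above.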
Re: EC2 cannot reach RDS anymore
Posted by: ewgraf
Posted on: Mar 22, 2017 6:01 AM
in response to: qoppp
We faced the same problem today: "php_network_getaddresses: getaddrinfo failed: Name or service not known".

It went away after restarting php-fpm. Shortly before this, our developer had enabled the xdebug extension.

It looks like xdebug may leave some artifacts behind that could also be a cause of this problem.

So, to solve it, try restarting php-fpm (a reload is not enough). If that does not help, or the problem comes back, consider disabling xdebug and restarting php-fpm.
It looks like something is cached inside the php-fpm process or xdebug.
Re: EC2 cannot reach RDS anymore
Posted by: jp7jp
Posted on: Mar 22, 2017 10:33 AM
in response to: qoppp
We had the same problem here using both RDS and EC2 in Virginia (US-EAST-1). The instability lasted for about 12 hours. Now it seems OK. We restarted PHP-FPM and Nginx to make sure any cached DNS results were cleared.

Our error log: PDOException: PDO::__construct(): php_network_getaddresses: getaddrinfo failed

None of our servers were updated and no configuration was changed, so it looks like it was a problem within the AWS network.

We hope to see some interaction from AWS team explaining the problem.
Re: EC2 cannot reach RDS anymore
Posted by: rsf
Posted on: Mar 23, 2017 2:24 PM
in response to: jp7jp
See above for information about Ubuntu security update that caused this issue. Cause was not AWS specific.

http://www.ubuntu.com/usn/usn-3239-2