Redshift losing connection and disconnecting mid-query


#1

I’m trying to run a fairly large query to build a derived table to perform some product analysis. However, when I try to run the query after about 20 minutes the connection drops out, and subsequently try to run the query again, the connection to Redshift has been cut.

I’m really not sure why this is happening. I’ve tried running this in a couple of different SQL clients with the same results. The error message I’m getting is below:

“setting/closing connection connection reset by peer”

Any help would be greatly appreciated.


#2

Do you initiate this query from outside the AWS infrastructure or inside? If from the outside, do you connect from a network segment that is under your company’s network and security administration? What is their stale connection timeout setting? Is it 20 minutes?

Connection reset by peer error happens when an application is trying to send something on a port, but that port has been ripped away. It is an RST as opposed to a normal TCP disconnect handshake. This is a good indication that someone other that AWS or a common ISP performing a network management between the client ( the computer that issued a query) and the server (your AWS Redshift cluster). Since Postgres database drivers do not implement any kind of background heartbeat/noise that keeps sending any traffic between the client and the server while your query is running, there’s no way for the network team to tell if you’re doing something productive or not while you are waiting for the results of the running query. Network equipment registers a set period of silence on your connection and rips it.

  1. Look for ways to shift the client into AWS infrastructure
  2. Look for ways to send an equivalent to “SELECT 1” query on the SAME connection that is waiting for a long running query
  3. Look for ways to negotiate with your security team for a longer period of inactivity on the network connections between your client IP address and AWS redshift IP address and PORT number.
  4. Look for ways to make the query return faster <— this is probably where you can start, rip the benefits and improve your life as well as lives of others .