Your cluster may be short on free disk space. We had the same issue a couple of months ago when doing a deep copy. In the graph below you can see that we failed twice (around Tuesday 21st and Wednesday 22nd) before finding the right cluster size and completing the deep copy (Thursday 23rd).
Try upscaling your cluster temporarily. Ours holds 11 billion rows and took ~6 hours to upscale/downscale.
AWS Support says:
When doing a deep copy of a highly unsorted table, Redshift needs to sort the table before inserting it into the new one.
This sort takes place in an intermediate temporary table, which is first placed in memory but, because the data set is too big, eventually spills over to disk.
However, those temporary tables are not compressed, and Redshift allocates temporary disk space for the operation, which results in a disk full error if there is not enough space for the temporary data.
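For reference, here is a rough sketch of the deep copy steps and a disk usage check, written as a small Python helper that only builds the SQL text (it does not connect to anything). The helper name, the `_copy` and `_old` suffixes, and the choice to rename the original table instead of dropping it are my own assumptions, not what AWS Support or Snowplow prescribe. The disk check assumes the `used` and `capacity` columns of the `stv_partitions` system table.

```python
from typing import List

# Rough percentage of cluster disk in use; run it before and during the copy.
DISK_USAGE_SQL = (
    "SELECT SUM(used)::numeric / SUM(capacity) * 100 AS pct_used "
    "FROM stv_partitions;"
)


def deep_copy_statements(schema: str, table: str) -> List[str]:
    """Hypothetical helper: return the SQL for a deep copy of schema.table.

    The original table is renamed to <table>_old rather than dropped,
    so it still has to be dropped by hand once the copy is verified.
    """
    src = f"{schema}.{table}"
    tmp = f"{schema}.{table}_copy"
    return [
        # New empty table with the same columns, dist key and sort key.
        f"CREATE TABLE {tmp} (LIKE {src});",
        # The bulk insert is where the sort (and the temp disk usage) happens.
        f"INSERT INTO {tmp} (SELECT * FROM {src});",
        # RENAME TO takes the bare table name, without the schema.
        f"ALTER TABLE {src} RENAME TO {table}_old;",
        f"ALTER TABLE {tmp} RENAME TO {table};",
    ]


if __name__ == "__main__":
    print(DISK_USAGE_SQL)
    for stmt in deep_copy_statements("atomic", "events"):
        print(stmt)
```

Keep in mind that until the old table is dropped, the cluster holds both copies plus the uncompressed temporary data, which is why the temporary upscale was needed in our case.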
Once your tables are sorted, vacuum them often (VACUUM your_table TO 100 PERCENT).
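A small sketch of what that routine could look like, again as Python that only builds the SQL. The function name and the threshold check are my own; the query assumes the `unsorted` and `tbl_rows` columns of the `svv_table_info` system view, and `your_table` is a placeholder.

```python
# Tables with unsorted rows, worst first (unsorted is a percentage).
UNSORTED_SQL = (
    'SELECT "schema", "table", unsorted, tbl_rows '
    "FROM svv_table_info "
    "WHERE unsorted > 0 "
    "ORDER BY unsorted DESC;"
)


def vacuum_statement(table: str, threshold: int = 100) -> str:
    """Hypothetical helper: build a VACUUM FULL statement for one table."""
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    return f"VACUUM FULL {table} TO {threshold} PERCENT;"


if __name__ == "__main__":
    print(UNSORTED_SQL)
    print(vacuum_statement("your_table"))
```

Note that VACUUM cannot run inside a transaction block, so if you send it from a client library you will probably need autocommit enabled on the connection.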
Another best practice we learnt from the Snowplow database is that the default atomic.events definition is not a 100% fit for every case, which is understandable, since no single column encoding fits all usages. After the deep copy above and before downscaling our cluster, we ran the AWS Redshift Column Encoding utility. This reduced atomic.events from 66% of our cluster to 20%, and overall cluster usage from 94% (beginning of the chart below) to <50% (end of the chart, after downscaling to the original cluster size). Just be careful to keep a backup of the original events table before dropping it.
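If you want to see what encodings Redshift would suggest before running the utility, the built-in ANALYZE COMPRESSION command reports them per column. This is only a sketch of building that statement in Python; the helper name is my own, and it is not a substitute for the utility, which also produces the migration for you.

```python
from typing import Optional


def analyze_compression_statement(table: str, comprows: Optional[int] = None) -> str:
    """Hypothetical helper: build an ANALYZE COMPRESSION statement.

    comprows is the optional sample size (COMPROWS); leave it as None
    to let Redshift use its default sample.
    """
    stmt = f"ANALYZE COMPRESSION {table}"
    if comprows is not None:
        if comprows <= 0:
            raise ValueError("comprows must be a positive integer")
        stmt += f" COMPROWS {comprows}"
    return stmt + ";"


if __name__ == "__main__":
    print(analyze_compression_statement("atomic.events"))
    print(analyze_compression_statement("atomic.events", comprows=1000000))
```

Be aware that ANALYZE COMPRESSION takes an exclusive lock on the table while it runs, so it is better done outside of load windows.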