I'm experiencing an intermittent issue with my GitHub Actions workflow, where one of the deployment jobs (deploy_standalone) randomly fails with the following error:
Run which ssh-agent || ( apk update && apk add openssh-client )
/usr/bin/ssh-agent
Agent pid 1698
Identity added: (stdin) (***@***)
# github:22 SSH-2.0-fddafdc3a
# github:22 SSH-2.0-fddafdc3a
# github:22 SSH-2.0-fddafdc3a
# github:22 SSH-2.0-fddafdc3a
# github:22 SSH-2.0-fddafdc3a
Error: Process completed with exit code 1.
The failure does not happen consistently. Restarting the failed job often makes it succeed. However, sometimes it fails multiple times in a row before eventually succeeding.
Workflow Details
The workflow sets up SSH using ssh-agent, adds the private key, and establishes a connection to the remote server for deploying a Docker container. Below is the relevant section of the workflow:
- name: Deploy standalone
run: |
which ssh-agent || ( apk update && apk add openssh-client )
eval $(ssh-agent -s)
echo "$SSH_PRIVATE" | tr -d '\r' | ssh-add - > /dev/null
mkdir -p ~/.ssh
chmod 700 ~/.ssh
ssh-keyscan github >> ~/.ssh/known_hosts
ssh-keyscan $HOST_STAGING >> ~/.ssh/known_hosts
chmod 644 ~/.ssh/known_hosts
export CURRENT_DROPLET_IP=$(ifconfig | awk 'f{print;f=0} /eth0/{f=1}' | awk 'END{print $2}') || true
ssh -vvv -tt root@$HOST_STAGING bash -c "
# Deployment steps...
"
shell: bash
The GitHub Actions workflow consists of three jobs: build, deploy_staging and deploy_standalone. Both deployment jobs starts right after the build. And both of them have this code that starts SSH agent. If anyone can explain to me what's the issue might be here and what is possible fix, I would really appreciate this. Because I couldn't find any useful information for me