I would like to check whether a particular agent is available before running any jobs. The problem I have encountered is that if the agent is unavailable, the run just hangs. I have tried setting a timeout on the job, but it is ignored.
This is my pipeline:
trigger:
- none
jobs:
- job: Build
  pool: testAgents
  timeoutInMinutes: 2
  steps:
  - task: CmdLine@2
    inputs:
      script: |
        echo Write your commands here
        echo Hello world
OK, some more background: I have a pipeline that runs when a pull request is created. Recently I was away from the office for a couple of weeks, and during that time several pull requests were created. However, while I was away the build machine running the agent was powered down, so no builds were happening and users had no idea what was going on. I was therefore tasked with finding out whether we could give users some feedback when the agent is not available.
Asked Nov 20, 2024 at 9:56 by RoachFIsher; edited Nov 20, 2024 at 11:02.
- The run doesn't "hang"; jobs are added to the agent pool queue and wait until a build agent is available. The timeout on the job limits the amount of time used to run that job after it has been picked up by an agent. – Rui Jarimba Commented Nov 20, 2024 at 10:52
- What exactly is your scenario? Are you just trying to reduce the waiting time of jobs in the queue or do you have other requirements? – Rui Jarimba Commented Nov 20, 2024 at 10:54
- I have updated to give more information – RoachFIsher Commented Nov 20, 2024 at 11:03
- This is an issue with your organization. There should be multiple people capable of observing, troubleshooting, and correcting issues with production systems. – Daniel Mann Commented Nov 21, 2024 at 5:02
4 Answers
You may use the InvokeRESTAPI@1 task in an agentless job to call the Agents - Get REST API and check the status of AParticularAgent in the self-hosted agent pool testAgents. If the agent is not online or not enabled, the pipeline then sends a custom notification to a Teams channel via a Power Automate webhook (see the example). Here is a sample YAML pipeline for your reference.
trigger: none
variables:
  poolId: 68 # testAgents
  orgName: ${{ split(variables['System.CollectionUri'], 'https://dev.azure.com/')[1] }} # extract orgName from $(System.CollectionUri) - https://dev.azure.com/orgName
  system.debug: true
jobs:
- job: AgentlessJob
  strategy:
    matrix:
      agent01:
        agentId: 731 # AParticularAgent
      # agent02:
      #   agentId: 732
  pool: server # note: the value 'server' is a reserved keyword which indicates this is an agentless job
  steps:
  - task: InvokeRESTAPI@1
    displayName: Check if agent-$(agentId) is enabled and online
    inputs:
      connectionType: 'connectedServiceName'
      serviceConnection: 'AzureDevOpsServices' # https://dev.azure.com
      method: 'GET'
      headers: |
        {
        "Content-Type":"application/json",
        "Authorization": "Bearer $(system.AccessToken)"
        }
      urlSuffix: '/$(orgName)/_apis/distributedtask/pools/$(poolId)/agents/$(agentId)?includeAssignedRequest=true&api-version=7.1-preview.1'
      waitForCompletion: 'false'
      # The task (and therefore the job) fails unless the agent is both enabled and online
      successCriteria: and( eq(root.enabled, 'true'), eq(root.status, 'online'))
  - task: InvokeRESTAPI@1
    condition: failed() # only runs when the status check above failed
    displayName: Send notification to Teams
    inputs:
      connectionType: 'connectedServiceName'
      serviceConnection: 'PowerAutomateWebHook'
      method: 'POST'
      headers: |
        {
        "Content-Type":"application/json"
        }
      body: |
        {
        "title": "Agent Status Check - FAILED",
        "text": "<b>View the build results:</b><br /><a href=\"$(System.CollectionUri)$(System.TeamProjectId)/_build/results?buildId=$(Build.BuildId)&view=results\">$(Build.DefinitionName) - $(Build.BuildId)</a><br /><b>View the agent $(agentId) in:</b><br /><a href=\"$(System.CollectionUri)_settings/agentpools?poolId=$(poolId)&view=agents\">testAgents</a>"
        }
      urlSuffix: '&sig=$(sigSecret)'
      waitForCompletion: 'false'
- job: AgentJobSelf
  dependsOn: AgentlessJob # skipped when the agentless check fails
  pool:
    name: testAgents
    demands: Agent.Name -equals AParticularAgent
  steps:
  - powershell: |
      Write-Host "Detected the Particular Agent is enabled and online - Queued the agent job"
    displayName: Run agent job in self-hosted agent pool testAgents
As has been said, the run doesn't hang; it waits for an agent to become free.
When a run is stuck for too long, there is usually a problem on the agent machine, so check the agent logs.
You can use tags (in Azure Pipelines, agent capabilities matched by job demands) to match a run with an agent capable of running it.
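In Azure Pipelines, this kind of matching is expressed with demands on the pool. The following is a minimal sketch, assuming the pool name testAgents from the question and the agent name AParticularAgent used in another answer; both are placeholders for your own values. Note that a demand only selects the agent; if that agent is offline, the job still waits in the queue.

```yaml
# Sketch only: pool and agent names are taken from elsewhere in this thread
# and must be replaced with your own.
trigger: none
jobs:
- job: Build
  pool:
    name: testAgents
    demands:
    - Agent.Name -equals AParticularAgent   # only this agent may pick up the job
  steps:
  - script: echo Hello world
    displayName: Run on the matched agent
```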
I have a pipeline that runs when a pull request is created. Recently I was away from the office for a couple of weeks. During that time several pull requests were created. However, whilst I was away the build machine running the agent was powered down so no builds were happening and users had no idea what was going on. So I was tasked to see if we could give some feedback to users if the agent was not available.
There are many issues that can prevent build agents from running, so as a starting point I'd create a document with some basic troubleshooting steps including (but not limited to):
- Check if build agents are both online and enabled in the agent pools
- Check build agents logs (if running)
- Check number of parallel jobs available
- Check if the Personal Access Tokens (PAT) are valid (e.g. not expired)
- Check for any issues/outages on the Azure DevOps status page
- Check the infrastructure where the build agents are running (VMs, Kubernetes clusters, etc.)
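The first check in the list (agents online and enabled) could also be scripted rather than done by hand. The sketch below is one possible way, not a definitive implementation: it assumes the pool ID 68 used as an example elsewhere in this thread, assumes the pipeline's build identity is allowed to read that pool, and runs on a Microsoft-hosted agent so it does not depend on the self-hosted machine. The job name ListAgentStatus is hypothetical.

```yaml
# Sketch only: poolId is an example value; replace it with your own pool ID.
trigger: none
variables:
  poolId: 68

jobs:
- job: ListAgentStatus
  pool:
    vmImage: windows-latest   # Microsoft-hosted, independent of the self-hosted agent
  steps:
  - powershell: |
      $headers = @{ Authorization = "Bearer $env:SYSTEM_ACCESSTOKEN" }
      # List every agent registered in the pool
      $uri = "$(System.CollectionUri)_apis/distributedtask/pools/$(poolId)/agents?api-version=7.0"
      $agents = (Invoke-RestMethod -Uri $uri -Method Get -Headers $headers).value
      # Print one line per agent, then flag those that cannot pick up jobs
      $agents | ForEach-Object { Write-Host ("{0}: status={1}, enabled={2}" -f $_.name, $_.status, $_.enabled) }
      $unavailable = @($agents | Where-Object { $_.status -ne 'online' -or -not $_.enabled })
      if ($unavailable.Count -gt 0) {
        Write-Host ("##vso[task.logissue type=warning]{0} agent(s) offline or disabled." -f $unavailable.Count)
      }
    displayName: List agent status in the pool
    env:
      SYSTEM_ACCESSTOKEN: $(System.AccessToken)
```

Run on a schedule, a job like this would surface a powered-down build machine as a warning before pull request builds start to queue up.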
You can check the status (online or offline) of an agent in your own agent pool from a job that runs on an agent from the Azure Pipelines pool, which is provided by Microsoft and is always online.
The Agents - Get REST API helps to check the agent status.
Write-Host "##vso[task.logissue type=error]agent offline, check and re-deploy."
helps to customize the error message.
Write-Host "##vso[task.complete result=Failed]"
helps to set the job state to Failed if the agent status is offline.
Below is some sample code; I hope it helps. If the agent check job succeeds, the second job continues to run on the self-hosted agent; otherwise the second job is skipped.
variables:
  selfAgentName: 'xxxxxx'
  defaultPoolId: '1'
  selfAgentId: '12'
  ## you can find your pool id and agent id first.
stages:
- stage: development
  displayName: Deploy to development
  jobs:
  - job: SelfAgentStatusCheck
    pool:
      vmImage: windows-latest
    steps:
    - task: PowerShell@2
      name: checkAgent
      env:
        SYSTEM_ACCESSTOKEN: $(System.AccessToken)
      inputs:
        targetType: 'inline'
        script: |
          $organizationUri = "$(System.CollectionUri)"
          $project = "$(System.TeamProject)"
          # Build a Basic auth header from the job access token
          $basicAuth = ("{0}:{1}" -f '', $env:SYSTEM_ACCESSTOKEN)
          $basicAuth = [System.Text.Encoding]::UTF8.GetBytes($basicAuth)
          $basicAuth = [System.Convert]::ToBase64String($basicAuth)
          $headers = @{Authorization = ("Basic {0}" -f $basicAuth) }
          # Query the status of the self-hosted agent
          $uri = "{0}_apis/distributedtask/pools/{1}/agents/{2}?api-version=7.0" -f $organizationUri, $(defaultPoolId), $(selfAgentId)
          $res = Invoke-RestMethod -Uri $uri -Method 'Get' -Headers $headers
          if($res.status -eq 'offline'){
            Write-Host "##vso[task.logissue type=error]agent offline, check and re-deploy."
            Write-Host "##vso[task.complete result=Failed]"
          }else{
            Write-Host "agent online"
          }
  - job: SelfAgentStatusRunCode
    dependsOn: SelfAgentStatusCheck
    condition: succeeded()
    pool:
      name: default
      demands:
      - agent.name -equals $(selfAgentName)
    steps:
    - script: echo 'hello, world'
      displayName: 'Run script code in self-host agent'