KBEA-00036 - Fixing builds that appear to hang

Summary

Your build appears to be stuck, or jobs time out.

Solution

First, it is important to note that the lack of output on the console does not necessarily mean that Electric Make (emake) is stuck. Due to the parallel execution of commands, there may be times when emake does not print anything while collecting results in the background.

If the build really is in trouble, first examine the state of the agents participating in the build. Use cmtool to get this information:


cmtool --cm YOUR_CM login USER PASSWD
cmtool --cm YOUR_CM runAgentCmd "session state"

The output looks like this:


<responses>
<response requestId="1">
<agent>
<agentName>agent name</agentName>
<result>Waiting for command from emake
Seconds in current state: 123940.39
Last request received from emake: E2A_SYNC
144442 emake requests, 8582668 EFS requests processed
</result>
</agent>
</response>
</responses>

The most important piece of information here is the first line of the result, "Waiting for command from emake."
This indicates that this agent is currently idle, waiting for emake to give it work to do.
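
When many agents report at once, reading the XML by eye gets tedious. The following is a minimal Python sketch, not part of the product, that extracts the agent name, the first line of the result, and the "Seconds in current state" value from output shaped like the sample above. It assumes the output is well-formed XML with exactly these element names; the function name parse_session_state and the SAMPLE constant are illustrative only.

```python
import xml.etree.ElementTree as ET

# The only state string documented in this article.
IDLE_STATE = "Waiting for command from emake"


def parse_session_state(xml_text):
    """Return a list of (agent_name, state, seconds_in_state) tuples.

    Assumes the layout shown in this article: <agent> elements holding
    <agentName> and a multi-line <result> whose first line is the state.
    """
    agents = []
    root = ET.fromstring(xml_text)
    for agent in root.iter("agent"):
        name = (agent.findtext("agentName") or "").strip()
        raw = agent.findtext("result") or ""
        lines = [ln.strip() for ln in raw.splitlines() if ln.strip()]
        state = lines[0] if lines else ""
        seconds = None
        for ln in lines[1:]:
            if ln.startswith("Seconds in current state:"):
                seconds = float(ln.split(":", 1)[1])
        agents.append((name, state, seconds))
    return agents


# Sample copied from the output shown above.
SAMPLE = """<responses>
<response requestId="1">
<agent>
<agentName>agent name</agentName>
<result>Waiting for command from emake
Seconds in current state: 123940.39
Last request received from emake: E2A_SYNC
144442 emake requests, 8582668 EFS requests processed
</result>
</agent>
</response>
</responses>"""

if __name__ == "__main__":
    for name, state, seconds in parse_session_state(SAMPLE):
        label = "idle" if state == IDLE_STATE else "busy"
        print(f"{name}: {state} ({seconds} s, {label})")
```

To use it on a live cluster, save the output of the runAgentCmd call to a file and pass the file contents to parse_session_state. If your cmtool version prints anything before or after the XML, strip it first.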

  1. If all agents are "Waiting for command from emake," this indicates that emake is hanging. If this happens, capture the stack for the emake process:
    • Windows: Use Process Explorer to view the stack information for emake. Start Process Explorer, double-click the emake process (to display the Properties window), select the Threads tab, then click the Stack button.
    • Unix: Run pstack and/or strace on the emake process.
  2. If one or more agents are in a state of running a command and remain in that state for a long time, the process on the agent may be stuck.
    • Windows: This can occur, for example, with the pdbserver process used by VS 2005. Eventually the agent's inactive process monitor should terminate the job, but check for the following common causes:
      • A popup that is part of a regular application invocation. This is unlikely, because it generally means the local build requires user interaction as well. If this is the case, you must eliminate that popup from the build, because there is no user interaction during an emake build.
      • An application crashed and put up Microsoft's exception/debug dialog. Refer to this page for more information about this issue. You can configure the agent's method for dealing with stalled jobs.
    • Unix: There can also be daemon processes on Unix that do not exit, but are tracked by the agent. Do not allow daemonization of processes, or ensure that the daemon is started by the agent startup code so it's not started by individual jobs.
    • General: A process can stall for several reasons. Gather the following information:
      • What process is stalled?
      • Does that process require any external resources to function properly (such as a license server) that may not be available on the agent node?
      • Is this happening all the time, sometimes, or only once?
      • Is this happening on all agents, some agents, or only one agent?
      • Get the stack for the running process (see above). You may also want to do an strace on the process.
      • Could this be just a very long running job?
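
The two cases above can be sketched as a small decision function. This is an illustration under stated assumptions, not a product feature: it takes (agent name, state, seconds in state) tuples that you have already extracted from the "session state" output, the 1800-second threshold is an arbitrary placeholder, and the "Running command" state string in the example is hypothetical because this article does not show the exact text a busy agent reports.

```python
# The only state string documented in this article.
IDLE_STATE = "Waiting for command from emake"

# Hypothetical threshold; choose one longer than your slowest normal job.
LONG_RUNNING_SECONDS = 1800.0


def triage(agents, threshold=LONG_RUNNING_SECONDS):
    """Classify a list of (name, state, seconds_in_state) tuples.

    Returns (verdict, names_of_suspect_agents).
    """
    if not agents:
        return "no agents reported", []

    busy = [a for a in agents if a[1] != IDLE_STATE]
    if not busy:
        # Case 1: every agent is waiting, so emake itself is the suspect.
        return "all agents idle: suspect emake, capture its stack", []

    # Case 2: agents that have stayed in a non-idle state for a long time.
    stalled = [
        name
        for name, _state, secs in busy
        if secs is not None and secs >= threshold
    ]
    if not stalled:
        return "agents busy but under threshold: may still be progressing", []

    if len(stalled) == 1:
        scope = "one agent"
    elif len(stalled) == len(agents):
        scope = "all agents"
    else:
        scope = "some agents"
    return f"possible stuck job on {scope}", stalled


if __name__ == "__main__":
    example = [
        ("agent-a", IDLE_STATE, 12.0),
        ("agent-b", "Running command", 7200.0),  # hypothetical state text
    ]
    print(triage(example))
```

A long time in a non-idle state does not prove a hang, since the job may simply be slow. Treat the result as a pointer to which agents to inspect, then answer the questions above for the process running there.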

Applies to

  • Product versions: All
  • OS versions: All