Wednesday, August 09, 2006 2:58 PM
Chris
The evils of busy loops.
The other day I was at a customer site when the customer asked me to help resolve an issue with a legacy component. The component appeared to be failing under load, so I set up some performance counters and could see that CPU utilization was at 100%.
The component in question was written in VB6 (ugh!) and was part of a COM+ package. We examined the COM+ package and found a couple of things that needed changing in order to improve the performance of the component.
I was still curious, so I asked some additional questions about the component. The developer mentioned that it contains retry logic that runs when the component detects certain types of problems, such as an inability to connect to the mainframe. The mention of retry logic certainly piqued my interest, so I asked if I could review the code. When I did, my attention stayed fixed on the retry logic.
In an effort to wait for a period of time before retrying a failed operation, the developer coded a busy loop:
    ' Spin until WaitTime seconds have passed since StartTime
    Do While Timer < StartTime + WaitTime
    Loop
Upon further examination I discovered that the busy loop ran for 2 seconds at a time, up to a maximum of 120 seconds!
This code is evil! I explained to everyone that this code does not help the situation at all; it makes the problem significantly worse. It raises processor utilization to 100% for the entire wait period. The loop never yields the processor voluntarily, so every other thread on that processor has to compete with a thread that is always ready to run.
The developer’s defense was that the code would wait for only 2 seconds.
The ill effects of this type of loop are easy to demonstrate. Since the component was written in VB6, I wrote the following sample VB.NET application to demonstrate the effects of a busy loop.
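Imports Microsoft.VisualBasic
Module BusyLoopDemo
    Sub Main()
        Dim timeToSleep As Double = 2   ' length of each busy wait, in seconds
        Dim timeLimit As Double = 120   ' total run time, in seconds
        Dim startTime As Double = Timer ' Timer returns seconds since midnight
        ' Outer loop: keep "retrying" until the time limit has elapsed.
        Do While Timer < startTime + timeLimit
            Dim startTime2 As Double = Timer
            ' Inner loop: the busy wait under test; it spins until 2 seconds pass.
            Do While Timer < startTime2 + timeToSleep
            Loop
            ' Simulate the other processing that happens between waits.
            System.Threading.Thread.Sleep(200)
        Loop
    End Sub
End Module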
What follows is a graph of the effects of this loop on a single processor machine:
As can be seen, this single process, running for a total of 2 minutes, almost completely dominated the CPU.
I also ran the test on a dual processor machine with the following results:
On the dual processor machine the process stayed between 41% and 50% processor utilization. This may be misleading: what it means is that the process dominated one of the two processors. The second processor was still available for other work. Even so, the results are not good for a production machine.
In my code sample I added a 200 ms sleep to simulate the other processing that occurs between each loop. The number 200 is arbitrary. In the actual code, it was unlikely that the processing between loops would take that long, which means that the effect on the processor is even worse.
Another problem that I have with this retry logic is that it keeps retrying the failed operation for a total of 2 minutes! That is a very long time to dedicate to one single operation.
The recommendations that I made to the customer were:
If it is ever necessary to program a delay, use the Win32 API function called Sleep. This function tells the OS to suspend the thread for a period of time. The suspended thread consumes no processor time for the period requested.
In the sample program that I wrote, I used Sleep to simulate work being done. .NET provides a wrapper for the Win32 API function, Thread.Sleep, in the System.Threading namespace.
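To make the difference concrete, here is a rough VB.NET sketch (not the customer's code) that compares the CPU time the process consumes during a 2-second busy loop with a 2-second Thread.Sleep. The module and helper names (BusyWaitVersusSleep, BusyWait, SleepWait, CpuSecondsUsed) are my own for illustration, and I am assuming that Stopwatch and Process.TotalProcessorTime are acceptable substitutes for Timer and the performance counters.

```vb
Imports System
Imports System.Diagnostics
Imports System.Threading

Module BusyWaitVersusSleep
    ' The pattern under discussion: spin on the clock until waitSeconds have elapsed.
    Sub BusyWait(ByVal waitSeconds As Double)
        Dim watch As Stopwatch = Stopwatch.StartNew()
        Do While watch.Elapsed.TotalSeconds < waitSeconds
        Loop
    End Sub

    ' The recommended pattern: ask the OS to suspend the thread for the same period.
    Sub SleepWait(ByVal waitSeconds As Double)
        Thread.Sleep(CInt(waitSeconds * 1000))
    End Sub

    ' Hypothetical helper: returns the CPU seconds this process used while the wait ran.
    Function CpuSecondsUsed(ByVal wait As Action(Of Double), ByVal waitSeconds As Double) As Double
        Dim proc As Process = Process.GetCurrentProcess()
        Dim before As TimeSpan = proc.TotalProcessorTime
        wait(waitSeconds)
        proc.Refresh()
        Return (proc.TotalProcessorTime - before).TotalSeconds
    End Function

    Sub Main()
        Console.WriteLine("Busy loop CPU seconds: {0:F2}", CpuSecondsUsed(AddressOf BusyWait, 2))
        Console.WriteLine("Sleep CPU seconds:     {0:F2}", CpuSecondsUsed(AddressOf SleepWait, 2))
    End Sub
End Module
```

On an otherwise idle machine I would expect the busy loop to report close to the full wait time as CPU time and the sleep to report close to none, but the exact figures will vary with the machine and its load.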
When a retry loop is necessary, it is generally better to loop a maximum number of times rather than for a time period. I suggested a maximum of 3 times with a 500 ms delay.
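A minimal VB.NET sketch of that kind of retry loop might look like the following. It is only an illustration under a few assumptions: TryOperation is a hypothetical stand-in for the real operation (here it simply fails twice and then succeeds), RunWithRetry is a name I made up, and I am reading "3 times" as 3 attempts in total.

```vb
Imports System
Imports System.Threading

Module RetryDemo
    ' Counts calls so the hypothetical operation can fail a couple of times first.
    Private attemptsSoFar As Integer = 0

    ' Hypothetical stand-in for the real work (for example, connecting to the mainframe).
    Function TryOperation() As Boolean
        attemptsSoFar += 1
        Return attemptsSoFar >= 3
    End Function

    ' Tries the operation up to maxAttempts times, sleeping between failed attempts.
    Function RunWithRetry(ByVal maxAttempts As Integer, ByVal delayMs As Integer) As Boolean
        For attempt As Integer = 1 To maxAttempts
            If TryOperation() Then
                Return True
            End If
            ' Sleep only when another attempt will follow; the thread is suspended, not spinning.
            If attempt < maxAttempts Then
                Thread.Sleep(delayMs)
            End If
        Next
        Return False
    End Function

    Sub Main()
        Dim succeeded As Boolean = RunWithRetry(3, 500)
        Console.WriteLine("Succeeded: " & succeeded.ToString())
    End Sub
End Module
```

With these numbers the worst case is three attempts plus about one second of waiting, after which the caller gets a clear failure instead of a thread that is tied up for 2 minutes.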