start-profiler

Function

Package: prof

ToC

DocOverview

CGDoc

RelNotes

FAQ

Index

PermutedIndex

Allegro CL version 10.1
Minimal update since the initial 10.1 release.
10.0 version

start-profiler

Arguments: &key type start-sampling-p count count-list except-processes interpret-closures verbose

This function starts the collection of runtime analyzer data. Data collection is stopped with a call to stop-profiler. The macro with-profiling wraps its body in a call to start-profiler at the start and a call to stop-profiler at the end.

If there is an unsaved, unanalyzed profile, you are asked by start-profiler if you want to save or discard that profile before proceeding with the new profile. You can cancel (delete) the current profile with cancel-current-profile, save it with save-named-profile, or analyze it with show-flat-profile or show-call-graph. Once any of these actions are taken, start-profiler will start without interruption.

When started, the runtime analyzer creates two new files in a temporary location for saving sample data and initializes the runtime analyzer data structures. These files are removed after analysis. See Further points, including information on temporary files in runtime-analyzer.htm.

The type keyword argument specifies the type of profile data to be collected. The choices are

Value	Meaning
`:time`	collect CPU time data (but not space data)
`:real-time`	collect real time data (but not space data)
`:space`	collect space data (but not time data)
`:count-only`	collect call-count data but no time or space data

You can collect either time data or space data but not both in a single run, but you can have call-counting data collected while collecting either time or space data. Or you can have call-counting data only collected. So the choices are:

time only: type should be :time or :real-time and both count and count-list should be nil or unspecified.
time and call-counting: type should be :time or :real-time and either count or count-list (or both) should be non-nil. (Appropriate non-nil values of count and count-list are given below.)
space only: type should be :space and both count and count-list should be nil or unspecified.
space and call-counting: type should be :space and either count or count-list (or both) should be non-nil. (Appropriate non-nil values of count and count-list are given below.)
call-counting only: type should be :count-only and either count or count-list (or both) should be non-nil. (Appropriate non-nil values of count and count-list are given below.)

If no value is specified for type, it defaults to :count-only if either count or count-list (or both) are non-nil and it defaults to :time if both count or count-list are nil or unspecified.

The count and count-list arguments control whether call-counting information should be collected during the run. Call count information is collected if either argument is true. Both default to nil.

count can be a positive integer or t. If it is a positive integer, a hash table with that many entries is created. If it is t, a hash table of 1000 entries is created. The table is filled with function objects specified by count-list, if any. New function objects (beyond what is specified in count-list) are added to the table as the associated function is called until the table is filled. No information is collected on function objects called for the first time after the table is filled.

count-list, if specified, should be a list of function objects (not symbols -- thus elements should be #'foo rather than foo). The hash table mentioned in the description of count just above is filled with the elements of count-list, in order, until the hash table is filled or the list is exhausted. Specifying a value for this variable ensures that functions you are interested in are call-counted. Note that if the length of the list is greater than the value of count, only the first count elements are entered in the hash table; the rest are ignored. If count-list has a true value but no value is specified for count, count defaults to 1000 (even if the length of count-list is greater than 1000).

If the start-sampling-p keyword argument is true or is not supplied then sampling is started immediately. If it is specified nil, sampling is not started until start-sampling is called. The functions start-sampling and stop-sampling can be used to control the gathering of samples while the runtime analyzer is running. The default value of this argument is t.

The except-processes keyword argument should be a list of processes or nil. This argument causes the runtime analyzer to not record data about the processes listed in this argument. Use of this argument is a good way to narrow down the extent of an analysis, if multiprocessing is being used. (In earlier releases, there was an except-stack-groups argument, which applied to certain platforms. That argument is no longer supported and a value provided for it is assumed to be a value for except-processes. A warning is signaled if except-stack-groups is specified.)

The interpret-closures argument controls the handling of closures. When interpret-closures is nil (the default) the closures are not saved and analyzed separately. When true, closures are saved and analyzed as separate profile entries, even though they may use the same template function. A function specialization called prof:closure has been added. This is valid whenever an analyzed profile exists and identifies the closures which the runtime analyzer has seen. The specialization is (prof:closure n) where n is a small integer enumerating the closures in the profile. This allows the user to grab the closure object and to do a disassemble on it without having to know what symbol holds its fdefinition (which is sometimes very hard). The Allegro CL implementation of disassemble is described in Extensions to cl:disassemble in implementation.htm. The profile argument, when specified, causes profiling information to be included in the disassembly.

When flat analyses and call-graphs are reported, closures are identified by (prof:closure n), followed by the printed representation of the closure object itself.

Note that if you use the Emacs-Lisp interface or the IDE (on Windows), you are using multiprocessing. The :processes top-level command lists active processes. The Process Browser in Allegro Composer (on UNIX) and the one displayed with View | Processes in the IDE (on Windows) also list processes and provide a graphic interface to include or exclude a process from profiling.

The verbose argument, if true (default is nil), causes messages about what the profiler is doing to be printed to *standard-output* during data collection.

Data can be collected for only one analysis at a time. Calling this function while data is already being collected will signal an error.

CPU time versus real time profiles

A CPU time profile takes samples while the program is running and bases its results on the CPU time used during the profile. A real time profile takes samples whether or not the program is consuming CPU time. The difference can be seen by profiling a call to sleep. The CPU time profile records nothing (no hits -- some runs may have a hit or two) since almost no CPU time was used. The real time profiler recorded the full 20 seconds.

cl-user(2): (prof:with-profiling (:type :time)
               (dotimes (i 10) (sleep 2)))
nil
cl-user(3): (prof:show-flat-profile)
No samples have been collected for the specified time profiler.
cl-user(4): (prof:with-profiling (:type :real-time)
              (dotimes (i 10) (sleep 2)))
nil
cl-user(5): (prof:show-flat-profile)
Sampling stopped after 2000 samples taken for the real-time profiler.

Sample represents 20.0 seconds of real time (out of a total of .1).

Real Times below 1.0% will be suppressed.

      %     %   self total         self   total  Function                                  
Real Time  Cum. secs  secs calls ms/call ms/call   name                                    
    100.0 100.0 20.0  20.0                       "__select"
cl-user(6):

Real time in SMP Lisps

The real time results in non-SMP Lisps are reasonably intuitive. In the example about, sleep is called 10 time for 2 seconds each and the result is roughly 20 seconds total time. In SMP Lisps, the results are more complicated because the total time combines time for all running threads. Consider the following example:

;; This example is run in an SMP Lisp
;;
cl-user(1): (require :smp-utilities)
; Fast loading /net/penn/acl/duane/sacl91/src/code/smp-utilities.fasl
t
cl-user(2): (defvar *bar* (mp:make-barrier 11))
*bar*
cl-user(3): (compile 
              (defun sleepit () 
                  (dotimes (i 5) (sleep 2))
                  (mp:barrier-pass-through *bar*)))
sleepit
nil
nil
cl-user(4): (progn (prof::with-profiling (:type :real-time)
                     (progn
		       (dotimes (i 10)
		         (mp::process-run-function 
                             (format nil "test~d" i) #'sleepit))
			 (mp::barrier-wait *bar*)))
		       (prof:show-flat-profile))

Sampling stopped after 10878 samples taken for the real-time profiler.
Sample represents 108.8 seconds of real time (out of a total of 10.0).

Real Times below 1.0% will be suppressed.

      %    %    self total         self   total  Function                       Real Time Cum.  secs  secs calls ms/call ms/call   name                                    
     99.9 99.9 108.7 108.7                       "sem_wait"
cl-user(5):

In this case the real-time shown is for the almost 110 seconds spent in the 10 created threads plus the 1 master thread. This is a feature. To be honest, we are not sure this is the best way to do things but the issue of what real time means when multiple processors are being used is a complicated one and this is our current method.

The macro with-profiling starts and stops a profing run while executing the code specified as the body of the macro calling form. See runtime-analyzer.htm for general information on the runtime analyzer.