| Allegro CL version 10.0 Significantly revised from 9.0. 9.0 version |
Arguments: &key type start-sampling-p count count-list except-processes interpret-closures verbose
This function starts the collection of runtime analyzer data. Data collection is stopped with a call to stop-profiler. The macro with-profiling wraps its body in a call to start-profiler at the start and a call to stop-profiler at the end.
When started, the runtime analyzer creates two new files in a temporary location for saving sample data and initializes the runtime analyzer data structures. These files are removed after analysis. See Further points, including information on temporary files in runtime-analyzer.htm.
The type keyword argument specifies the type of profile data to be collected. The choices are
Value | Meaning |
:time |
collect CPU time data (but not space data) |
:real-time |
collect real time data (but not space data) |
:space |
collect space data (but not time data) |
:count-only |
collect call-count data but no time or space data |
You can collect either time data or space data but not both in a single run, but you can have call-counting data collected while collecting either time or space data. Or you can have call-counting data only collected. So the choices are:
:time
or :real-time
and both count and
count-list should be nil
or unspecified.
:time
or :real-time
and either count or
count-list (or both) should be non-nil
. (Appropriate
non-nil
values of count and
count-list are given below.)
:space
and both count and
count-list should be nil
or unspecified.
:space
and either count or
count-list (or both) should be non-nil
. (Appropriate non-nil
values of count and
count-list are given below.)
:count-only
and either count
or count-list (or both) should be non-nil
. (Appropriate non-nil
values of
count and count-list are
given below.)
If no value is specified for type, it defaults to
:count-only
if either count or
count-list (or both) are non-nil
and it defaults to :time
if
both count or count-list are
nil
or unspecified.
The count and
count-list arguments control whether
call-counting information should be collected during the run. Call
count information is collected if either argument is true. Both
default to nil
.
count can be a positive integer or t. If it is a positive integer, a hash table with that many entries is created. If it is t, a hash table of 1000 entries is created. The table is filled with function objects specified by count-list, if any. New function objects (beyond what is specified in count-list) are added to the table as the associated function is called until the table is filled. No information is collected on function objects called for the first time after the table is filled.
count-list, if specified, should be a list of function objects (not symbols -- thus elements should be #'foo rather than foo). The hash table mentioned in the description of count just above is filled with the elements of count-list, in order, until the hash table is filled or the list is exhausted. Specifying a value for this variable ensures that functions you are interested in are call-counted. Note that if the length of the list is greater than the value of count, only the first count elements are entered in the hash table; the rest are ignored. If count-list has a true value but no value is specified for count, count defaults to 1000 (even if the length of count-list is greater than 1000).
If the start-sampling-p keyword
argument is true or is not supplied then sampling is started
immediately. If it is specified nil
, sampling
is not started until start-sampling is called. The functions
start-sampling and
stop-sampling can be
used to control the gathering of samples while the runtime analyzer is
running. The default value of this argument is t.
The except-processes keyword argument
should be a list of processes or nil
. This
argument causes the runtime analyzer to not record data about the
processes listed in this argument. Use of this argument is a good way
to narrow down the extent of an analysis, if multiprocessing is being
used. (In earlier releases, there was an
except-stack-groups argument, which applied to
certain platforms. That argument is no longer supported and a value
provided for it is assumed to be a value for
except-processes. A warning is signaled if
except-stack-groups is specified.)
The interpret-closures argument controls
the handling of closures. When interpret-closures
is
nil
(the default) the closures are not saved
and analyzed separately. When true,
closures are saved and analyzed as separate profile entries, even
though they may use the same template function. A function
specialization called prof:closure
has been added. This
is valid whenever an analyzed profile exists and identifies the
closures which the runtime analyzer has seen.
The specialization is
(prof:closure n)
where n is a small integer enumerating
the closures in the profile. This allows the user to grab the closure
object and to do a prof:disassemble-profile on it without
having to know what symbol holds its fdefinition
(which
is sometimes very hard).
When flat analyses and call-graphs are reported, closures are
identified by (prof:closure n)
, followed by the printed
representation of the closure object itself.
Note that if you use the Emacs-Lisp interface or the IDE (on Windows), you are using multiprocessing. The :processes top-level command lists active processes. The Process Browser in Allegro Composer (on UNIX) and the one displayed with View | Processes in the IDE (on Windows) also list processes and provide a graphic interface to include or exclude a process from profiling.
The verbose argument, if true (default
is nil
), causes messages about what the
profiler is doing to be printed to
*standard-output*
during data collection.
Data can be collected for only one analysis at a time. Calling this function while data is already being collected will signal an error.
A CPU time profile takes samples while the program is running and bases its results on the CPU time used during the profile. A real time profile takes samples whether or not the program is consuming CPU time. The difference can be seen by profiling a call to sleep. The CPU time profile records nothing (no hits -- some runs may have a hit or two) since almost no CPU time was used. The real time profiler recorded the full 20 seconds.
cl-user(2): (prof:with-profiling (:type :time) (dotimes (i 10) (sleep 2))) nil cl-user(3): (prof:show-flat-profile) No samples have been collected for the specified time profiler. cl-user(4): (prof:with-profiling (:type :real-time) (dotimes (i 10) (sleep 2))) nil cl-user(5): (prof:show-flat-profile) Sampling stopped after 2000 samples taken for the real-time profiler. Sample represents 20.0 seconds of real time (out of a total of .1). Real Times below 1.0% will be suppressed. % % self total self total Function Real Time Cum. secs secs calls ms/call ms/call name 100.0 100.0 20.0 20.0 "__select" cl-user(6):
The real time results in non-SMP Lisps are reasonably intuitive. In the example about, sleep is called 10 time for 2 seconds each and the result is roughly 20 seconds total time. In SMP Lisps, the results are more complicated because the total time combines time for all running threads. Consider the following example:
;; This example is run in an SMP Lisp ;; cl-user(1): (require :smp-utilities) ; Fast loading /net/penn/acl/duane/sacl91/src/code/smp-utilities.fasl t cl-user(2): (defvar *bar* (mp:make-barrier 11)) *bar* cl-user(3): (compile (defun sleepit () (dotimes (i 5) (sleep 2)) (mp:barrier-pass-through *bar*))) sleepit nil nil cl-user(4): (progn (prof::with-profiling (:type :real-time) (progn (dotimes (i 10) (mp::process-run-function (format nil "test~d" i) #'sleepit)) (mp::barrier-wait *bar*))) (prof:show-flat-profile)) Sampling stopped after 10878 samples taken for the real-time profiler. Sample represents 108.8 seconds of real time (out of a total of 10.0). Real Times below 1.0% will be suppressed. % % self total self total Function Real Time Cum. secs secs calls ms/call ms/call name 99.9 99.9 108.7 108.7 "sem_wait" cl-user(5):
In this case the real-time shown is for the almost 110 seconds spent in the 10 created threads plus the 1 master thread. This is a feature. To be honest, we are not sure this is the best way to do things but the issue of what real time means when multiple processors are being used is a complicated one and this is our current method.
The macro with-profiling starts and stops a profing run while executing the code specified as the body of the macro calling form. See runtime-analyzer.htm for general information on the runtime analyzer.
Copyright (c) 1998-2019, Franz Inc. Oakland, CA., USA. All rights reserved.
This page has had significant revisions compared to the 9.0 page.
Created 2019.8.20.
| Allegro CL version 10.0 Significantly revised from 9.0. 9.0 version |