=head1 NAME
|
|
|
|
perldebug - Perl debugging
|
|
|
|
=head1 DESCRIPTION
|
|
|
|
First of all, have you tried using the B<-w> switch?
|
|
|
|
=head1 The Perl Debugger
|
|
|
|
"As soon as we started programming, we found to our
|
|
surprise that it wasn't as easy to get programs right
|
|
as we had thought. Debugging had to be discovered.
|
|
I can remember the exact instant when I realized that
|
|
a large part of my life from then on was going to be
|
|
spent in finding mistakes in my own programs."
|
|
|
|
I< --Maurice Wilkes, 1949>
|
|
|
|
If you invoke Perl with the B<-d> switch, your script runs under the
|
|
Perl source debugger. This works like an interactive Perl
|
|
environment, prompting for debugger commands that let you examine
|
|
source code, set breakpoints, get stack backtraces, change the values of
|
|
variables, etc. This is so convenient that you often fire up
|
|
the debugger all by itself just to test out Perl constructs
|
|
interactively to see what they do. For example:
|
|
|
|
perl -d -e 42
|
|
|
|
In Perl, the debugger is not a separate program as it usually is in the
|
|
typical compiled environment. Instead, the B<-d> flag tells the compiler
|
|
to insert source information into the parse trees it's about to hand off
|
|
to the interpreter. That means your code must first compile correctly
|
|
for the debugger to work on it. Then when the interpreter starts up, it
|
|
preloads a Perl library file containing the debugger itself.
|
|
|
|
The program will halt I<right before> the first run-time executable
|
|
statement (but see below regarding compile-time statements) and ask you
|
|
to enter a debugger command. Contrary to popular expectations, whenever
|
|
the debugger halts and shows you a line of code, it always displays the
|
|
line it's I<about> to execute, rather than the one it has just executed.
|
|
|
|
Any command not recognized by the debugger is directly executed
|
|
(C<eval>'d) as Perl code in the current package. (The debugger uses the
|
|
DB package for its own state information.)
|
|
|
|
Do not put leading white space before a debugger command: the debugger
would then treat the line as Perl code to evaluate, I<not> as a debugger
command.
|
|
|
|
=head2 Debugger Commands
|
|
|
|
The debugger understands the following commands:
|
|
|
|
=over 12
|
|
|
|
=item h [command]
|
|
|
|
Prints out a help message.
|
|
|
|
If you supply another debugger command as an argument to the C<h> command,
|
|
it prints out the description for just that command. The special
|
|
argument of C<h h> produces a more compact help listing, designed to fit
|
|
together on one screen.
|
|
|
|
If the output of the C<h> command (or any command, for that matter) scrolls
past your screen, precede the command with a leading pipe symbol so
it's run through your pager, as in
|
|
|
|
DB> |h
|
|
|
|
You may change the pager that is used via the C<O pager=...> command.
|
|
|
|
=item p expr
|
|
|
|
Same as C<print {$DB::OUT} expr> in the current package. In particular,
|
|
because this is just Perl's own B<print> function, this means that nested
|
|
data structures and objects are not dumped, unlike with the C<x> command.
|
|
|
|
The C<DB::OUT> filehandle is opened to F</dev/tty>, regardless of
|
|
where STDOUT may be redirected to.
|
|
|
|
=item x expr
|
|
|
|
Evaluates its expression in list context and dumps out the result
|
|
in a pretty-printed fashion. Nested data structures are printed out
|
|
recursively, unlike the C<print> function.
|
|
|
|
The details of printout are governed by multiple C<O>ptions.
|
|
|
|
=item V [pkg [vars]]
|
|
|
|
Display all (or some) variables in package (defaulting to the C<main>
|
|
package) using a data pretty-printer (hashes show their keys and values so
|
|
you see what's what, control characters are made printable, etc.). Make
|
|
sure you don't put the type specifier (like C<$>) there, just the symbol
|
|
names, like this:
|
|
|
|
V DB filename line
|
|
|
|
Use C<~pattern> and C<!pattern> for positive and negative regexps.
|
|
|
|
Nested data structures are printed out in a legible fashion, unlike
|
|
the C<print> function.
|
|
|
|
The details of printout are governed by multiple C<O>ptions.
|
|
|
|
=item X [vars]
|
|
|
|
Same as C<V currentpackage [vars]>.
|
|
|
|
=item T
|
|
|
|
Produce a stack backtrace. See below for details on its output.
|
|
|
|
=item s [expr]
|
|
|
|
Single step. Executes until it reaches the beginning of another
|
|
statement, descending into subroutine calls. If an expression is
|
|
supplied that includes function calls, it too will be single-stepped.
|
|
|
|
=item n [expr]
|
|
|
|
Next. Executes over subroutine calls, until it reaches the beginning
|
|
of the next statement. If an expression is supplied that includes
|
|
function calls, those functions will be executed with stops before
|
|
each statement.
|
|
|
|
=item E<lt>CRE<gt>
|
|
|
|
Repeat last C<n> or C<s> command.
|
|
|
|
=item c [line|sub]
|
|
|
|
Continue, optionally inserting a one-time-only breakpoint
|
|
at the specified line or subroutine.
|
|
|
|
=item l
|
|
|
|
List next window of lines.
|
|
|
|
=item l min+incr
|
|
|
|
List C<incr+1> lines starting at C<min>.
|
|
|
|
=item l min-max
|
|
|
|
List lines C<min> through C<max>. C<l -> is synonymous to C<->.
|
|
|
|
=item l line
|
|
|
|
List a single line.
|
|
|
|
=item l subname
|
|
|
|
List first window of lines from subroutine.
|
|
|
|
=item -
|
|
|
|
List previous window of lines.
|
|
|
|
=item w [line]
|
|
|
|
List window (a few lines) around the current line.
|
|
|
|
=item .
|
|
|
|
Return debugger pointer to the last-executed line and
|
|
print it out.
|
|
|
|
=item f filename
|
|
|
|
Switch to viewing a different file or eval statement. If C<filename>
|
|
is not a full filename as found in values of %INC, it is considered as
|
|
a regexp.
|
|
|
|
=item /pattern/
|
|
|
|
Search forwards for pattern; final / is optional.
|
|
|
|
=item ?pattern?
|
|
|
|
Search backwards for pattern; final ? is optional.
|
|
|
|
=item L
|
|
|
|
List all breakpoints and actions.
|
|
|
|
=item S [[!]pattern]
|
|
|
|
List subroutine names [not] matching pattern.
|
|
|
|
=item t
|
|
|
|
Toggle trace mode (see also C<AutoTrace> C<O>ption).
|
|
|
|
=item t expr
|
|
|
|
Trace through execution of expr. For example:
|
|
|
|
$ perl -de 42
|
|
Stack dump during die enabled outside of evals.
|
|
|
|
Loading DB routines from perl5db.pl patch level 0.94
|
|
Emacs support available.
|
|
|
|
Enter h or `h h' for help.
|
|
|
|
main::(-e:1): 0
|
|
DB<1> sub foo { 14 }
|
|
|
|
DB<2> sub bar { 3 }
|
|
|
|
DB<3> t print foo() * bar()
|
|
main::((eval 172):3): print foo() * bar();
|
|
main::foo((eval 168):2):
|
|
main::bar((eval 170):2):
|
|
42
|
|
|
|
or, with the C<O>ption C<frame=2> set,
|
|
|
|
DB<4> O f=2
|
|
frame = '2'
|
|
DB<5> t print foo() * bar()
|
|
3: foo() * bar()
|
|
entering main::foo
|
|
2: sub foo { 14 };
|
|
exited main::foo
|
|
entering main::bar
|
|
2: sub bar { 3 };
|
|
exited main::bar
|
|
42
|
|
|
|
=item b [line] [condition]
|
|
|
|
Set a breakpoint. If line is omitted, sets a breakpoint on the line
|
|
that is about to be executed. If a condition is specified, it's
|
|
evaluated each time the statement is reached and a breakpoint is taken
|
|
only if the condition is true. Breakpoints may be set on only lines
|
|
that begin an executable statement. Conditions don't use B<if>:
|
|
|
|
b 237 $x > 30
|
|
b 237 ++$count237 < 11
|
|
b 33 /pattern/i
|
|
|
|
=item b subname [condition]
|
|
|
|
Set a breakpoint at the first line of the named subroutine.
|
|
|
|
=item b postpone subname [condition]
|
|
|
|
Set breakpoint at first line of subroutine after it is compiled.
|
|
|
|
=item b load filename
|
|
|
|
Set breakpoint at the first executed line of the file. Filename should
|
|
be a full name as found in values of %INC.
|
|
|
|
=item b compile subname
|
|
|
|
Sets breakpoint at the first statement executed after the subroutine
|
|
is compiled.
|
|
|
|
=item d [line]
|
|
|
|
Delete a breakpoint at the specified line. If line is omitted, deletes
|
|
the breakpoint on the line that is about to be executed.
|
|
|
|
=item D
|
|
|
|
Delete all installed breakpoints.
|
|
|
|
=item a [line] command
|
|
|
|
Set an action to be done before the line is executed.
|
|
The sequence of steps taken by the debugger is
|
|
|
|
1. check for a breakpoint at this line
|
|
2. print the line if necessary (tracing)
|
|
3. do any actions associated with that line
|
|
4. prompt user if at a breakpoint or in single-step
|
|
5. evaluate line
|
|
|
|
For example, this will print out $foo every time line
|
|
53 is passed:
|
|
|
|
a 53 print "DB FOUND $foo\n"
|
|
|
|
=item A
|
|
|
|
Delete all installed actions.
|
|
|
|
=item W expr
|
|
|
|
Add a global watch-expression.
|
|
|
|
=item W
|
|
|
|
Delete all watch-expressions.
|
|
|
|
=item O [opt[=val]] [opt"val"] [opt?]...
|
|
|
|
Set or query values of options. val defaults to 1. opt can
|
|
be abbreviated. Several options can be listed.
|
|
|
|
=over 12
|
|
|
|
=item C<recallCommand>, C<ShellBang>
|
|
|
|
The characters used to recall command or spawn shell. By
|
|
default, these are both set to C<!>.
|
|
|
|
=item C<pager>
|
|
|
|
Program to use for output of pager-piped commands (those
|
|
beginning with a C<|> character.) By default,
|
|
C<$ENV{PAGER}> will be used.
|
|
|
|
=item C<tkRunning>
|
|
|
|
Run Tk while prompting (with ReadLine).
|
|
|
|
=item C<signalLevel>, C<warnLevel>, C<dieLevel>
|
|
|
|
Level of verbosity. By default the debugger is in a sane verbose mode,
|
|
thus it will print backtraces on all the warnings and die-messages
|
|
which are going to be printed out, and will print a message when
|
|
interesting uncaught signals arrive.
|
|
|
|
To disable this behaviour, set these values to 0. If C<dieLevel> is 2,
|
|
then the messages which will be caught by surrounding C<eval> are also
|
|
printed.
|
|
|
|
=item C<AutoTrace>
|
|
|
|
Trace mode (similar to C<t> command, but can be put into
|
|
C<PERLDB_OPTS>).
|
|
|
|
=item C<LineInfo>
|
|
|
|
File or pipe to print line number info to. If it is a pipe (say,
|
|
C<|visual_perl_db>), then a short, "emacs like" message is used.
|
|
|
|
=item C<inhibit_exit>
|
|
|
|
If 0, allows I<stepping off> the end of the script.
|
|
|
|
=item C<PrintRet>
|
|
|
|
Affects printing of the return value after the C<r> command.
|
|
|
|
=item C<ornaments>
|
|
|
|
Affects screen appearance of the command line (see L<Term::ReadLine>).
|
|
|
|
=item C<frame>
|
|
|
|
Affects the printing of messages on entry to and exit from subroutines. If
|
|
C<frame & 2> is false, messages are printed on entry only. (Printing
|
|
on exit may be useful if inter(di)spersed with other messages.)
|
|
|
|
If C<frame & 4>, arguments to functions are printed as well as the
|
|
context and caller info. If C<frame & 8>, overloaded C<stringify> and
|
|
C<tie>d C<FETCH> are enabled on the printed arguments. If C<frame &
|
|
16>, the return value from the subroutine is printed as well.
|
|
|
|
The length at which the argument list is truncated is governed by the
|
|
next option:
|
|
|
|
=item C<maxTraceLen>
|
|
|
|
Length at which the argument list is truncated when the C<frame> option's
bit 4 is set.
|
|
|
|
=back
|
|
|
|
The following options affect what happens with C<V>, C<X>, and C<x>
|
|
commands:
|
|
|
|
=over 12
|
|
|
|
=item C<arrayDepth>, C<hashDepth>
|
|
|
|
Print only first N elements ('' for all).
|
|
|
|
=item C<compactDump>, C<veryCompact>
|
|
|
|
Change style of array and hash dump. If C<compactDump>, short array
|
|
may be printed on one line.
|
|
|
|
=item C<globPrint>
|
|
|
|
Whether to print contents of globs.
|
|
|
|
=item C<DumpDBFiles>
|
|
|
|
Dump arrays holding debugged files.
|
|
|
|
=item C<DumpPackages>
|
|
|
|
Dump symbol tables of packages.
|
|
|
|
=item C<DumpReused>
|
|
|
|
Dump contents of "reused" addresses.
|
|
|
|
=item C<quote>, C<HighBit>, C<undefPrint>
|
|
|
|
Change style of string dump. Default value of C<quote> is C<auto>, one
|
|
can enable either double-quotish dump, or single-quotish by setting it
|
|
to C<"> or C<'>. By default, characters with high bit set are printed
|
|
I<as is>.
|
|
|
|
=item C<UsageOnly>
|
|
|
|
I<very> rudimentary per-package memory usage dump. Calculates total
|
|
size of strings in variables in the package.
|
|
|
|
=back
|
|
|
|
During startup options are initialized from C<$ENV{PERLDB_OPTS}>.
|
|
You can put additional initialization options C<TTY>, C<noTTY>,
|
|
C<ReadLine>, and C<NonStop> there.
|
|
|
|
Example rc file:
|
|
|
|
&parse_options("NonStop=1 LineInfo=db.out AutoTrace");
|
|
|
|
The script will run without human intervention, putting trace information
|
|
into the file I<db.out>. (If you interrupt it, you had better reset
C<LineInfo> to something "interactive"!)
|
|
|
|
=over 12
|
|
|
|
=item C<TTY>
|
|
|
|
The TTY to use for debugging I/O.
|
|
|
|
=item C<noTTY>
|
|
|
|
If set, goes into C<NonStop> mode and does not connect to a TTY. If
interrupted (or if control goes to the debugger via explicit setting of
$DB::signal or $DB::single from the Perl script), it connects to a TTY
specified by the C<TTY> option at startup, or to a TTY found at
runtime using the C<Term::Rendezvous> module of your choice.
|
|
|
|
This module should implement a method C<new> which returns an object
|
|
with two methods: C<IN> and C<OUT>, returning two filehandles to use
|
|
for debugging input and output correspondingly. Method C<new> may
|
|
inspect an argument which is a value of C<$ENV{PERLDB_NOTTY}> at
|
|
startup, or is C<"/tmp/perldbtty$$"> otherwise.
|
|
|
|
=item C<ReadLine>
|
|
|
|
If false, readline support in debugger is disabled, so you can debug
|
|
ReadLine applications.
|
|
|
|
=item C<NonStop>
|
|
|
|
If set, debugger goes into noninteractive mode until interrupted, or
|
|
programmatically by setting $DB::signal or $DB::single.
|
|
|
|
=back
|
|
|
|
Here's an example of using the C<$ENV{PERLDB_OPTS}> variable:
|
|
|
|
$ PERLDB_OPTS="N f=2" perl -d myprogram
|
|
|
|
will run the script C<myprogram> without human intervention, printing
|
|
out the call tree with entry and exit points. Note that C<N f=2> is
|
|
equivalent to C<NonStop=1 frame=2>. Note also that at the moment when
|
|
this documentation was written all the options to the debugger could
|
|
be uniquely abbreviated by the first letter (with exception of
|
|
C<Dump*> options).
|
|
|
|
Other examples may include
|
|
|
|
$ PERLDB_OPTS="N f A L=listing" perl -d myprogram
|
|
|
|
- runs the script noninteractively, printing info on each entry into a
subroutine and each executed line into the file F<listing>. (If you
interrupt it, you had better reset C<LineInfo> to something
"interactive"!)
|
|
|
|
|
|
$ env "PERLDB_OPTS=R=0 TTY=/dev/ttyc" perl -d myprogram
|
|
|
|
may be useful for debugging a program which uses C<Term::ReadLine>
|
|
itself. Do not forget to detach the shell from the TTY in the window which
corresponds to F</dev/ttyc>, say, by issuing a command like
|
|
|
|
$ sleep 1000000
|
|
|
|
See L<"Debugger Internals"> below for more details.
|
|
|
|
=item E<lt> [ command ]
|
|
|
|
Set an action (Perl command) to happen before every debugger prompt.
|
|
A multi-line command may be entered by backslashing the newlines. If
|
|
C<command> is missing, resets the list of actions.
|
|
|
|
=item E<lt>E<lt> command
|
|
|
|
Add an action (Perl command) to happen before every debugger prompt.
|
|
A multi-line command may be entered by backslashing the newlines.
|
|
|
|
=item E<gt> command
|
|
|
|
Set an action (Perl command) to happen after the prompt when you've
|
|
just given a command to return to executing the script. A multi-line
|
|
command may be entered by backslashing the newlines. If C<command> is
|
|
missing, resets the list of actions.
|
|
|
|
=item E<gt>E<gt> command
|
|
|
|
Adds an action (Perl command) to happen after the prompt when you've
|
|
just given a command to return to executing the script. A multi-line
|
|
command may be entered by backslashing the newlines.
|
|
|
|
=item { [ command ]
|
|
|
|
Set an action (debugger command) to happen before every debugger prompt.
|
|
A multi-line command may be entered by backslashing the newlines. If
|
|
C<command> is missing, resets the list of actions.
|
|
|
|
=item {{ command
|
|
|
|
Add an action (debugger command) to happen before every debugger prompt.
|
|
A multi-line command may be entered by backslashing the newlines.
|
|
|
|
=item ! number
|
|
|
|
Redo a previous command (default previous command).
|
|
|
|
=item ! -number
|
|
|
|
Redo number'th-to-last command.
|
|
|
|
=item ! pattern
|
|
|
|
Redo last command that started with pattern.
|
|
See C<O recallCommand>, too.
|
|
|
|
=item !! cmd
|
|
|
|
Run cmd in a subprocess (reads from DB::IN, writes to DB::OUT)
|
|
See C<O shellBang> too.
|
|
|
|
=item H -number
|
|
|
|
Display last n commands. Only commands longer than one character are
|
|
listed. If number is omitted, lists them all.
|
|
|
|
=item q or ^D
|
|
|
|
Quit. ("quit" doesn't work for this.) This is the only supported way
|
|
to exit the debugger, though typing C<exit> twice may do it too.
|
|
|
|
Set the C<O>ption C<inhibit_exit> to 0 if you want to be able to I<step
off> the end of the script. You may also need to set C<$finished> to 0 at
some moment if you want to step through global destruction.
|
|
|
|
=item R
|
|
|
|
Restart the debugger by B<exec>ing a new session. It tries to maintain
|
|
your history across this, but internal settings and command line options
|
|
may be lost.
|
|
|
|
Currently the following settings are preserved: history, breakpoints,
|
|
actions, debugger C<O>ptions, and the following command line
|
|
options: B<-w>, B<-I>, and B<-e>.
|
|
|
|
=item |dbcmd
|
|
|
|
Run debugger command, piping DB::OUT to current pager.
|
|
|
|
=item ||dbcmd
|
|
|
|
Same as C<|dbcmd> but DB::OUT is temporarily B<select>ed as well.
|
|
Often used with commands that would otherwise produce long
|
|
output, such as
|
|
|
|
|V main
|
|
|
|
=item = [alias value]
|
|
|
|
Define a command alias, like
|
|
|
|
= quit q
|
|
|
|
or list current aliases.
|
|
|
|
=item command
|
|
|
|
Execute command as a Perl statement. A missing semicolon will be
|
|
supplied.
|
|
|
|
=item m expr
|
|
|
|
The expression is evaluated, and the methods which may be applied to
|
|
the result are listed.
|
|
|
|
=item m package
|
|
|
|
The methods which may be applied to objects in the C<package> are listed.
|
|
|
|
=back
|
|
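The difference between the C<p> and C<x> commands described above can be
reproduced outside the debugger. The following is only a sketch:
C<Data::Dumper> stands in for the debugger's own pretty-printer (an
assumption made for illustration, the output format of C<x> differs), and
the C<$config> structure is hypothetical.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical nested structure to inspect.
my $config = { name => 'camel', humps => [ 1, 2 ] };

# Roughly what "p $config" shows: plain print only stringifies the
# reference, giving something like HASH(0x...).
my $printed = "$config";

# Rough analogue of "x $config": a recursive dump of the whole structure.
# Data::Dumper is a stand-in here, not what the debugger itself calls.
local $Data::Dumper::Sortkeys = 1;
my $dumped = Dumper($config);

print "$printed\n$dumped";
```

The first line of output names only the reference, while the dump shows
the nested array as well, which is why C<x> is usually the better choice
for data structures and objects.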
|
|
=head2 Debugger input/output
|
|
|
|
=over 8
|
|
|
|
=item Prompt
|
|
|
|
The debugger prompt is something like
|
|
|
|
DB<8>
|
|
|
|
or even
|
|
|
|
DB<<17>>
|
|
|
|
where that number is the command number, which you can use with
the builtin B<csh>-like history mechanism, e.g., C<!17> would repeat
command number 17. The number of angle brackets indicates the depth of
the debugger. You could get more than one set of brackets, for example, if
you're already at a breakpoint and then print out the result of a
function call that itself also has a breakpoint, or you step into an
expression via the C<s/n/t expression> command.
|
|
|
|
=item Multiline commands
|
|
|
|
If you want to enter a multi-line command, such as a subroutine
|
|
definition with several statements, or a format, you may escape the
|
|
newline that would normally end the debugger command with a backslash.
|
|
Here's an example:
|
|
|
|
DB<1> for (1..4) { \
|
|
cont: print "ok\n"; \
|
|
cont: }
|
|
ok
|
|
ok
|
|
ok
|
|
ok
|
|
|
|
Note that this business of escaping a newline is specific to interactive
|
|
commands typed into the debugger.
|
|
|
|
=item Stack backtrace
|
|
|
|
Here's an example of what a stack backtrace via C<T> command might
|
|
look like:
|
|
|
|
$ = main::infested called from file `Ambulation.pm' line 10
|
|
@ = Ambulation::legs(1, 2, 3, 4) called from file `camel_flea' line 7
|
|
$ = main::pests('bactrian', 4) called from file `camel_flea' line 4
|
|
|
|
The left-hand character up there tells whether the function was called
|
|
in a scalar or list context (we bet you can tell which is which). What
|
|
that says is that you were in the function C<main::infested> when you ran
|
|
the stack dump, and that it was called in a scalar context from line 10
|
|
of the file I<Ambulation.pm>, but without any arguments at all, meaning
|
|
it was called as C<&infested>. The next stack frame shows that the
|
|
function C<Ambulation::legs> was called in a list context from the
|
|
I<camel_flea> file with four arguments. The last stack frame shows that
|
|
C<main::pests> was called in a scalar context, also from I<camel_flea>,
|
|
but from line 4.
|
|
|
|
Note that if you execute the C<T> command from inside an active C<use>
statement, the backtrace will contain both a C<require>
frame and an C<eval> frame.
|
|
|
|
=item Listing
|
|
|
|
Listing given via different flavors of C<l> command looks like this:
|
|
|
|
DB<<13>> l
|
|
101: @i{@i} = ();
|
|
102:b @isa{@i,$pack} = ()
|
|
103 if(exists $i{$prevpack} || exists $isa{$pack});
|
|
104 }
|
|
105
|
|
106 next
|
|
107==> if(exists $isa{$pack});
|
|
108
|
|
109:a if ($extra-- > 0) {
|
|
110: %isa = ($pack,1);
|
|
|
|
Note that the breakable lines are marked with C<:>, lines with
|
|
breakpoints are marked by C<b>, with actions by C<a>, and the
|
|
next executed line is marked by C<==E<gt>>.
|
|
|
|
=item Frame listing
|
|
|
|
When the C<frame> option is set, the debugger prints entered (and
optionally exited) subroutines in different styles.
|
|
|
|
What follows is the start of the listing of
|
|
|
|
env "PERLDB_OPTS=f=n N" perl -d -V
|
|
|
|
for different values of C<n>:
|
|
|
|
=over 4
|
|
|
|
=item 1
|
|
|
|
entering main::BEGIN
|
|
entering Config::BEGIN
|
|
Package lib/Exporter.pm.
|
|
Package lib/Carp.pm.
|
|
Package lib/Config.pm.
|
|
entering Config::TIEHASH
|
|
entering Exporter::import
|
|
entering Exporter::export
|
|
entering Config::myconfig
|
|
entering Config::FETCH
|
|
entering Config::FETCH
|
|
entering Config::FETCH
|
|
entering Config::FETCH
|
|
|
|
=item 2
|
|
|
|
entering main::BEGIN
|
|
entering Config::BEGIN
|
|
Package lib/Exporter.pm.
|
|
Package lib/Carp.pm.
|
|
exited Config::BEGIN
|
|
Package lib/Config.pm.
|
|
entering Config::TIEHASH
|
|
exited Config::TIEHASH
|
|
entering Exporter::import
|
|
entering Exporter::export
|
|
exited Exporter::export
|
|
exited Exporter::import
|
|
exited main::BEGIN
|
|
entering Config::myconfig
|
|
entering Config::FETCH
|
|
exited Config::FETCH
|
|
entering Config::FETCH
|
|
exited Config::FETCH
|
|
entering Config::FETCH
|
|
|
|
=item 4
|
|
|
|
in $=main::BEGIN() from /dev/nul:0
|
|
in $=Config::BEGIN() from lib/Config.pm:2
|
|
Package lib/Exporter.pm.
|
|
Package lib/Carp.pm.
|
|
Package lib/Config.pm.
|
|
in $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|
in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0
|
|
in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
|
|
in @=Config::myconfig() from /dev/nul:0
|
|
in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
|
|
in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
|
|
in $=Config::FETCH(ref(Config), 'PATCHLEVEL') from lib/Config.pm:574
|
|
in $=Config::FETCH(ref(Config), 'SUBVERSION') from lib/Config.pm:574
|
|
in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574
|
|
in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574
|
|
|
|
=item 6
|
|
|
|
in $=main::BEGIN() from /dev/nul:0
|
|
in $=Config::BEGIN() from lib/Config.pm:2
|
|
Package lib/Exporter.pm.
|
|
Package lib/Carp.pm.
|
|
out $=Config::BEGIN() from lib/Config.pm:0
|
|
Package lib/Config.pm.
|
|
in $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|
out $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|
in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0
|
|
in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
|
|
out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
|
|
out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0
|
|
out $=main::BEGIN() from /dev/nul:0
|
|
in @=Config::myconfig() from /dev/nul:0
|
|
in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
|
|
out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
|
|
in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
|
|
out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
|
|
in $=Config::FETCH(ref(Config), 'PATCHLEVEL') from lib/Config.pm:574
|
|
out $=Config::FETCH(ref(Config), 'PATCHLEVEL') from lib/Config.pm:574
|
|
in $=Config::FETCH(ref(Config), 'SUBVERSION') from lib/Config.pm:574
|
|
|
|
=item 14
|
|
|
|
in $=main::BEGIN() from /dev/nul:0
|
|
in $=Config::BEGIN() from lib/Config.pm:2
|
|
Package lib/Exporter.pm.
|
|
Package lib/Carp.pm.
|
|
out $=Config::BEGIN() from lib/Config.pm:0
|
|
Package lib/Config.pm.
|
|
in $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|
out $=Config::TIEHASH('Config') from lib/Config.pm:644
|
|
in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0
|
|
in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
|
|
out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
|
|
out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/nul:0
|
|
out $=main::BEGIN() from /dev/nul:0
|
|
in @=Config::myconfig() from /dev/nul:0
|
|
in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
|
|
out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
|
|
in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
|
|
out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
|
|
|
|
=item 30
|
|
|
|
in $=CODE(0x15eca4)() from /dev/null:0
|
|
in $=CODE(0x182528)() from lib/Config.pm:2
|
|
Package lib/Exporter.pm.
|
|
out $=CODE(0x182528)() from lib/Config.pm:0
|
|
scalar context return from CODE(0x182528): undef
|
|
Package lib/Config.pm.
|
|
in $=Config::TIEHASH('Config') from lib/Config.pm:628
|
|
out $=Config::TIEHASH('Config') from lib/Config.pm:628
|
|
scalar context return from Config::TIEHASH: empty hash
|
|
in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
|
|
in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
|
|
out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
|
|
scalar context return from Exporter::export: ''
|
|
out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
|
|
scalar context return from Exporter::import: ''
|
|
|
|
|
|
=back
|
|
|
|
In all cases the indentation of lines shows the call tree. If bit 2 of
C<frame> is set, a line is printed on exit from a subroutine as
well. If bit 4 is set, the arguments are printed as well as the
caller info. If bit 8 is set, the arguments are printed even if they
are tied or references. If bit 16 is set, the return value is printed
as well.
|
|
|
|
When a package is compiled, a line like this
|
|
|
|
Package lib/Carp.pm.
|
|
|
|
is printed with proper indentation.
|
|
|
|
=back
|
|
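The C<frame> bit values discussed above can be decoded mechanically. The
sketch below is not part of the debugger; the helper name
C<describe_frame> and the wording of the descriptions are invented for
illustration, and only the bit values themselves come from the text.

```perl
use strict;
use warnings;

# Bit values as described for the frame option; the descriptions are
# paraphrases, not debugger output.
my @frame_bits = (
    [  2 => 'print a line on subroutine exit' ],
    [  4 => 'print arguments and caller info' ],
    [  8 => 'stringify tied or reference arguments' ],
    [ 16 => 'print the return value' ],
);

# Hypothetical helper: list what a given frame value turns on.
sub describe_frame {
    my ($frame) = @_;
    return ('no frame messages') unless $frame;

    # Any non-zero value prints a message on subroutine entry.
    my @features = ('print a line on subroutine entry');
    push @features, map { $_->[1] } grep { $frame & $_->[0] } @frame_bits;
    return @features;
}

print "frame=$_: ", join('; ', describe_frame($_)), "\n" for 1, 2, 6, 14, 30;
```

The values 1, 2, 6, 14 and 30 are the ones used in the listings above, so
the output can be compared against each sample.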
|
|
=head2 Debugging compile-time statements
|
|
|
|
If you have any compile-time executable statements (code within a BEGIN
block or a C<use> statement), these will I<not> be stopped by the debugger,
although C<require>s will (and compile-time statements can be traced
with the C<AutoTrace> option set in C<PERLDB_OPTS>). From your own Perl
code, however, you can
transfer control back to the debugger using the following statement,
which is harmless if the debugger is not running:
|
|
|
|
$DB::single = 1;
|
|
|
|
If you set C<$DB::single> to the value 2, it's equivalent to having
|
|
just typed the C<n> command, whereas a value of 1 means the C<s>
|
|
command. The C<$DB::trace> variable should be set to 1 to simulate
|
|
having typed the C<t> command.
|
|
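Because C<$DB::single> is an ordinary variable, it can be set
conditionally, which gives the effect of a breakpoint written into the
program itself. A minimal sketch, assuming a hypothetical routine
C<process_record> and a hypothetical record layout:

```perl
use strict;
use warnings;

# Hypothetical routine we want to inspect under the debugger.
sub process_record {
    my ($record) = @_;

    # Quiets a possible "used only once" warning when not run under -d.
    no warnings 'once';

    # Hand control to the debugger only for the interesting record.
    # Without -d this just assigns to an otherwise unused variable.
    $DB::single = 1 if $record->{id} == 42;

    return uc $record->{name};
}

print process_record({ id => 42, name => 'camel' }), "\n";
```

Run normally, the script just prints its result; run with B<-d>, it
should stop at the statement following the assignment when the condition
holds.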
|
|
Another way to debug compile-time code is to start the debugger, set a
breakpoint on I<load> of some module thusly
|
|
|
|
DB<7> b load f:/perllib/lib/Carp.pm
|
|
Will stop on load of `f:/perllib/lib/Carp.pm'.
|
|
|
|
and restart the debugger with the C<R> command (if possible). One can use C<b
compile subname> for the same purpose.
|
|
|
|
=head2 Debugger Customization
|
|
|
|
Most probably you do not want to modify the debugger: it contains enough
hooks to satisfy most needs. You may change the behaviour of the debugger
from the debugger itself, using C<O>ptions, from the command line via the
C<PERLDB_OPTS> environment variable, and from I<customization files>.
|
|
|
|
You can do some customization by setting up a F<.perldb> file which
|
|
contains initialization code. For instance, you could make aliases
|
|
like these (the last one is one people expect to be there):
|
|
|
|
$DB::alias{'len'} = 's/^len(.*)/p length($1)/';
|
|
$DB::alias{'stop'} = 's/^stop (at|in)/b/';
|
|
$DB::alias{'ps'} = 's/^ps\b/p scalar /';
|
|
$DB::alias{'quit'} = 's/^quit(\s*)/exit/';
|
|
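Each alias value is the source text of a substitution that is applied to
the command you typed. The sketch below imitates that step outside the
debugger so the effect of an alias can be checked in isolation. The
C<%alias> hash and the C<expand_alias> helper are hypothetical stand-ins,
not the debugger's own code, and the lookup by first word is an
assumption made for the example.

```perl
use strict;
use warnings;

# Alias table shaped like %DB::alias: name => substitution source text.
my %alias = (
    len => 's/^len(.*)/p length($1)/',
    ps  => 's/^ps\b/p scalar /',
);

# Hypothetical helper: rewrite a typed command using the alias named by
# its first word, leaving other commands untouched.
sub expand_alias {
    my ($cmd) = @_;
    my ($word) = $cmd =~ /^(\w+)/;
    if (defined $word && exists $alias{$word}) {
        # The alias text is Perl source, so it is eval'd against $cmd.
        eval "\$cmd =~ $alias{$word}; 1" or die "bad alias '$word': $@";
    }
    return $cmd;
}

print expand_alias('len $foo'), "\n";    # p length( $foo)
print expand_alias('ps @list'), "\n";
```

Testing an alias this way before putting it in F<.perldb> makes quoting
mistakes in the substitution text easier to spot.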
|
|
One changes options from the F<.perldb> file via calls like this one:
|
|
|
|
parse_options("NonStop=1 LineInfo=db.out AutoTrace=1 frame=2");
|
|
|
|
(the code is executed in the package C<DB>). Note that F<.perldb> is
|
|
processed before processing C<PERLDB_OPTS>. If F<.perldb> defines the
|
|
subroutine C<afterinit>, it is called after all the debugger
|
|
initialization ends. F<.perldb> may be contained in the current
|
|
directory, or in the C<LOGDIR>/C<HOME> directory.
|
|
|
|
If you want to modify the debugger, copy F<perl5db.pl> from the Perl
|
|
library to another name and modify it as necessary. You'll also want
|
|
to set your C<PERL5DB> environment variable to say something like this:
|
|
|
|
BEGIN { require "myperl5db.pl" }
|
|
|
|
As the last resort, one can use C<PERL5DB> to customize debugger by
|
|
directly setting internal variables or calling debugger functions.
|
|
|
|
=head2 Readline Support
|
|
|
|
As shipped, the only command line history supplied is a simplistic one
|
|
that checks for leading exclamation points. However, if you install
|
|
the Term::ReadKey and Term::ReadLine modules from CPAN, you will
|
|
have full editing capabilities much like GNU I<readline>(3) provides.
|
|
Look for these in the F<modules/by-module/Term> directory on CPAN.
|
|
|
|
A rudimentary command line completion is also available.
|
|
Unfortunately, the names of lexical variables are not available for
|
|
completion.
|
|
|
|
=head2 Editor Support for Debugging
|
|
|
|
If you have GNU B<emacs> installed on your system, it can interact with
|
|
the Perl debugger to provide an integrated software development
|
|
environment reminiscent of its interactions with C debuggers.
|
|
|
|
Perl is also delivered with a start file for making B<emacs> act like a
|
|
syntax-directed editor that understands (some of) Perl's syntax. Look in
|
|
the I<emacs> directory of the Perl source distribution.
|
|
|
|
(Historically, a similar setup for interacting with B<vi> and the
|
|
X11 window system had also been available, but at the time of this
|
|
writing, no debugger support for B<vi> currently exists.)
|
|
|
|
=head2 The Perl Profiler
|
|
|
|
If you wish to supply an alternative debugger for Perl to run, just
|
|
invoke your script with a colon and a package argument given to the B<-d>
|
|
flag. One of the most popular alternative debuggers for Perl is
|
|
B<DProf>, the Perl profiler. As of this writing, B<DProf> is not
|
|
included with the standard Perl distribution, but it is expected to
|
|
be included soon, for certain values of "soon".
|
|
|
|
Meanwhile, you can fetch the Devel::DProf module from CPAN. Assuming
|
|
it's properly installed on your system, to profile your Perl program in
|
|
the file F<mycode.pl>, just type:
|
|
|
|
perl -d:DProf mycode.pl
|
|
|
|
When the script terminates the profiler will dump the profile information
|
|
to a file called F<tmon.out>. A tool like B<dprofpp> (also supplied with
|
|
the Devel::DProf package) can be used to interpret the information which is
|
|
in that profile.
|
|
|
|
=head2 Debugger support in perl
|
|
|
|
When you call the B<caller> function (see L<perlfunc/caller>) from the
|
|
package DB, Perl sets the array @DB::args to contain the arguments the
|
|
corresponding stack frame was called with.
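
As a minimal sketch of the C<@DB::args> behaviour described above, the
following stand-alone script reads the arguments of a calling subroutine.
The names C<args_of_caller> and C<greet> are made up for this illustration;
only C<caller> and C<@DB::args> come from Perl itself. The script does not
need the B<-d> switch.

```perl
use strict;
use warnings;

# Hypothetical helper: return the arguments that the calling sub received.
sub args_of_caller {
    package DB;              # caller() fills @DB::args only when called from package DB
    no warnings 'once';      # @DB::args is mentioned only once in this file
    my @frame = caller(1);   # frame 1 is the sub that called args_of_caller()
    return @DB::args;
}

sub greet {
    return join ',', args_of_caller();
}

print greet('hello', 42), "\n";   # prints hello,42
```

Because C<@DB::args> aliases the frame's current C<@_>, a subroutine that
has already C<shift>ed its arguments will report only what is left.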
|
|
|
|
If perl is run with the B<-d> option, the following additional features
|
|
are enabled (cf. L<perlvar/$^P>):
|
|
|
|
=over
|
|
|
|
=item *
|
|
|
|
Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require
|
|
'perl5db.pl'}> if not present) before the first line of the
|
|
application.
|
|
|
|
=item *
|
|
|
|
The array C<@{"_E<lt>$filename"}> is the line-by-line contents of
|
|
$filename for all the compiled files. Same for C<eval>ed strings which
|
|
contain subroutines, or which are currently executed. The C<$filename>
|
|
for C<eval>ed strings looks like C<(eval 34)>.
|
|
|
|
=item *
|
|
|
|
The hash C<%{"_E<lt>$filename"}> contains breakpoints and actions (it is
|
|
keyed by line number), and individual entries are settable (as opposed
|
|
to the whole hash). Only true/false is important to Perl, though the
|
|
values used by F<perl5db.pl> have the form
|
|
C<"$break_condition\0$action">. Values are magical in numeric context:
|
|
they are zeros if the line is not breakable.
|
|
|
|
Same for evaluated strings which contain subroutines, or which are
|
|
currently executed. The $filename for C<eval>ed strings looks like
|
|
C<(eval 34)>.
|
|
|
|
=item *
|
|
|
|
The scalar C<${"_E<lt>$filename"}> contains C<"_E<lt>$filename">. Same for
|
|
evaluated strings which contain subroutines, or which are currently
|
|
executed. The $filename for C<eval>ed strings looks like C<(eval
|
|
34)>.
|
|
|
|
=item *
|
|
|
|
After each C<require>d file is compiled, but before it is executed,
|
|
C<DB::postponed(*{"_E<lt>$filename"})> is called (if subroutine
|
|
C<DB::postponed> exists). Here the $filename is the expanded name of
|
|
the C<require>d file (as found in values of %INC).
|
|
|
|
=item *
|
|
|
|
After each subroutine C<subname> is compiled, the existence of
|
|
C<$DB::postponed{subname}> is checked. If this key exists,
|
|
C<DB::postponed(subname)> is called (if subroutine C<DB::postponed>
|
|
exists).
|
|
|
|
=item *
|
|
|
|
A hash C<%DB::sub> is maintained, with keys being subroutine names,
|
|
values having the form C<filename:startline-endline>. C<filename> has
|
|
the form C<(eval 31)> for subroutines defined inside C<eval>s.
|
|
|
|
=item *
|
|
|
|
When execution of the application reaches a place that can have
|
|
a breakpoint, a call to C<DB::DB()> is performed if any one of
|
|
variables $DB::trace, $DB::single, or $DB::signal is true. (Note that
|
|
these variables are not C<local>izable.) This feature is disabled when
|
|
the control is inside C<DB::DB()> or functions called from it (unless
|
|
C<$^D & (1E<lt>E<lt>30)>).
|
|
|
|
=item *
|
|
|
|
When execution of the application reaches a subroutine call, a call
|
|
to C<&DB::sub>(I<args>) is performed instead, with C<$DB::sub> being
|
|
the name of the called subroutine. (Unless the subroutine is compiled
|
|
in the package C<DB>.)
|
|
|
|
=back
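
Two of the value formats listed above are plain strings that a custom
debugger has to take apart: the C<filename:startline-endline> values of
C<%DB::sub>, and the C<"$break_condition\0$action"> entries that
F<perl5db.pl> stores in the breakpoint hash. The following is a sketch only.
The helper names C<parse_sub_location> and C<parse_break_entry> and the
sample values are assumptions made for this example, not part of any
debugger API.

```perl
use strict;
use warnings;

# Split a %DB::sub value of the form "filename:startline-endline".
# The greedy (.*) keeps any colons that belong to the file name itself.
sub parse_sub_location {
    my ($value) = @_;
    my ($file, $start, $end) = $value =~ /^(.*):(\d+)-(\d+)$/s
        or return;
    return ($file, $start, $end);
}

# Split a breakpoint entry of the form "$break_condition\0$action".
# The action is undef when the entry holds no NUL separator.
sub parse_break_entry {
    my ($value) = @_;
    my ($condition, $action) = split /\0/, $value, 2;
    return ($condition, $action);
}

my ($file, $start, $end) = parse_sub_location('(eval 31):2-5');
print "$file lines $start to $end\n";    # (eval 31) lines 2 to 5

# A made-up entry: break when $x > 3, and print $x as the action.
my ($condition, $action) = parse_break_entry("\$x > 3\0print \$x");
print "break if [$condition], then [$action]\n";
```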
|
|
|
|
Note that if C<&DB::sub> needs some external data to be set up for it
|
|
to work, no subroutine call is possible until this is done. For the
|
|
standard debugger C<$DB::deep> (how many levels of recursion deep into
|
|
the debugger you can go before a mandatory break) gives an example of
|
|
such a dependency.
|
|
|
|
The minimal working debugger consists of one line
|
|
|
|
sub DB::DB {}
|
|
|
|
which is quite handy as the contents of the C<PERL5DB> environment
|
|
variable:
|
|
|
|
env "PERL5DB=sub DB::DB {}" perl -d your-script
|
|
|
|
Another (a little bit more useful) minimal debugger can be created
|
|
with the only line being
|
|
|
|
sub DB::DB {print ++$i; scalar <STDIN>}
|
|
|
|
This debugger would print the sequential number of each encountered
|
|
statement, and would wait for your C<CR> to continue.
|
|
|
|
The following debugger is quite functional:
|
|
|
|
{
|
|
package DB;
|
|
sub DB {}
|
|
sub sub {print ++$i, " $sub\n"; &$sub}
|
|
}
|
|
|
|
It prints the sequential number of each subroutine call and the name of the
|
|
called subroutine. Note that C<&DB::sub> should be compiled into the
|
|
package C<DB>.
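
One way to try such a debugger without editing any files is to start a
child perl from a driver script, passing the debugger source in
C<PERL5DB>. This is a sketch under a few assumptions: the list form of a
pipe C<open> is available on your platform, the traced script is a
throwaway one-liner invented here, and the counter from the example above
is dropped to keep the output simple.

```perl
use strict;
use warnings;

# Debugger source handed to the child perl through PERL5DB. It is wrapped
# in a block so that "package DB" does not leak into the traced script.
local $ENV{PERL5DB} =
    '{ package DB; sub DB {} sub sub { print "$sub\n"; &$sub } }';

# A throwaway script for the child to run (made up for this example).
my $script = 'sub double { return 2 * $_[0] } double(21);';

# $^X is the path of the running perl; the list form of open avoids the shell.
open(my $child, '-|', $^X, '-d', '-e', $script)
    or die "cannot start $^X: $!";
my @calls = <$child>;
close $child;

# Each line is the name of a subroutine the child called.
print @calls;
```

Here the child's standard output should carry the fully qualified name of
each called subroutine, one per line.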
|
|
|
|
=head2 Debugger Internals
|
|
|
|
At the start, the debugger reads your rc file (F<./.perldb> or
|
|
F<~/.perldb> under Unix), which can set important options. This file may
|
|
define a subroutine C<&afterinit> to be executed after the debugger is
|
|
initialized.
|
|
|
|
After the rc file is read, the debugger reads the environment variable
|
|
PERLDB_OPTS and parses it as the rest of an C<O ...> line at the debugger prompt.
|
|
|
|
It also maintains magical internal variables, such as C<@DB::dbline>,
|
|
C<%DB::dbline>, which are aliases for C<@{"::_<current_file"}> and
|
|
C<%{"::_<current_file"}>. Here C<current_file> is the currently
|
|
selected (with the debugger's C<f> command, or by flow of execution)
|
|
file.
|
|
|
|
Some functions are provided to simplify customization. See L<"Debugger
|
|
Customization"> for description of C<DB::parse_options(string)>. The
|
|
function C<DB::dump_trace(skip[, count])> skips the specified number
|
|
of frames, and returns a list containing info about the caller
|
|
frames (all if C<count> is missing). Each entry is a hash with keys
|
|
C<context> (C<$> or C<@>), C<sub> (subroutine name, or info about
|
|
eval), C<args> (C<undef> or a reference to an array), C<file>, and
|
|
C<line>.
|
|
|
|
The function C<DB::print_trace(FH, skip[, count[, short]])> prints
|
|
formatted info about caller frames. The last two functions may be
|
|
convenient as arguments to C<E<lt>>, C<E<lt>E<lt>> commands.
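
To illustrate the shape of the records that C<DB::dump_trace> is described
as returning, here is a small stand-alone formatter. It is a sketch only:
the function name C<format_frame>, the output layout, and the sample record
are invented for this example, and the layout is not meant to match what
C<DB::print_trace> prints.

```perl
use strict;
use warnings;

# Hypothetical formatter for one frame record with the keys
# context, sub, args, file, and line.
sub format_frame {
    my ($frame) = @_;
    # args is either undef or a reference to an array.
    my $args = defined $frame->{args}
        ? '(' . join(', ', @{ $frame->{args} }) . ')'
        : '';
    return "$frame->{context} = $frame->{sub}$args"
         . " at $frame->{file} line $frame->{line}";
}

# Made-up sample record, for illustration only.
my %sample = (
    context => '$',
    sub     => 'main::fetch',
    args    => [ 'users', 42 ],
    file    => 'app.pl',
    line    => 17,
);

print format_frame(\%sample), "\n";
# prints: $ = main::fetch(users, 42) at app.pl line 17
```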
|
|
|
|
=head2 Other resources
|
|
|
|
You did try the B<-w> switch, didn't you?
|
|
|
|
=head2 BUGS
|
|
|
|
You cannot get the stack frame information or otherwise debug functions
|
|
that were not compiled by Perl, such as C or C++ extensions.
|
|
|
|
If you alter your @_ arguments in a subroutine (such as with B<shift>
|
|
or B<pop>), the stack backtrace will not show the original values.
|
|
|
|
=head1 Debugging Perl memory usage
|
|
|
|
Perl is I<very> frivolous with memory. There is a saying that to
|
|
estimate memory usage of Perl, assume a reasonable algorithm of
|
|
allocation, and multiply your estimates by 10. This is not absolutely
|
|
true, but may give you a good grasp of what happens.
|
|
|
|
Say, an integer cannot take less than 20 bytes of memory, a float
|
|
cannot take less than 24 bytes, a string cannot take less than 32
|
|
bytes (all these examples assume 32-bit architectures; the results are
|
|
much worse on 64-bit architectures). If a variable is accessed in two
|
|
of three different ways (which require an integer, a float, or a
|
|
string), the memory footprint may increase by another 20 bytes. A
|
|
sloppy malloc() implementation will make these numbers even larger.
|
|
|
|
On the opposite end of the scale, a declaration like
|
|
|
|
sub foo;
|
|
|
|
may take (on some versions of perl) up to 500 bytes of memory.
|
|
|
|
Off-the-cuff anecdotal estimates of code bloat give a factor around
|
|
8. This means that the compiled form of reasonable (commented,
indented, etc.) code will take approximately 8 times as much space in
memory as the code takes on disk.
|
|
|
|
There are two Perl-specific ways to analyze the memory usage:
|
|
$ENV{PERL_DEBUG_MSTATS} and the B<-DL> switch. The first is available
only if perl is compiled with Perl's malloc(), the second only if
Perl is compiled with C<-DDEBUGGING> (as when giving the C<-Doptimize=-g>
option to F<Configure>).
|
|
|
|
=head2 Using C<$ENV{PERL_DEBUG_MSTATS}>
|
|
|
|
If your perl is using Perl's malloc(), and compiled with correct
|
|
switches (this is the default), then it will print memory usage
|
|
statistics after compiling your code (if C<$ENV{PERL_DEBUG_MSTATS}> >
|
|
1), and before termination of the script (if
|
|
C<$ENV{PERL_DEBUG_MSTATS}> >= 1). The report format is similar to the one
|
|
in the following example:
|
|
|
|
env PERL_DEBUG_MSTATS=2 perl -e "require Carp"
|
|
Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
|
|
14216 free: 130 117 28 7 9 0 2 2 1 0 0
|
|
437 61 36 0 5
|
|
60924 used: 125 137 161 55 7 8 6 16 2 0 1
|
|
74 109 304 84 20
|
|
Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
|
|
Memory allocation statistics after execution: (buckets 4(4)..8188(8192)
|
|
30888 free: 245 78 85 13 6 2 1 3 2 0 1
|
|
315 162 39 42 11
|
|
175816 used: 265 176 1112 111 26 22 11 27 2 1 1
|
|
196 178 1066 798 39
|
|
Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
|
|
|
|
It is possible to ask for such a statistic at an arbitrary moment by
|
|
using Devel::Peek::mstats() (module Devel::Peek is available on CPAN).
|
|
|
|
Here is the explanation of different parts of the format:
|
|
|
|
=over
|
|
|
|
=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>
|
|
|
|
Perl's malloc() uses bucketed allocations. Every request is rounded
|
|
up to the closest bucket size available, and a bucket of this size is
|
|
taken from the pool of the buckets of this size.
|
|
|
|
The above line describes limits of buckets currently in use. Each
|
|
bucket has two sizes: memory footprint, and the maximal size of user
|
|
data which may be put into this bucket. Say, in the above example the
|
|
smallest bucket has both sizes equal to 4. The biggest bucket has usable size
|
|
8188, and the memory footprint 8192.
|
|
|
|
With a debugging Perl, some buckets may have negative usable size. This
|
|
means that these buckets cannot (and will not) be used. For greater
|
|
buckets the memory footprint may be one page greater than a power of
|
|
2. In such a case the corresponding power of two is printed instead
|
|
in the C<APPROX> field above.
|
|
|
|
=item Free/Used
|
|
|
|
The following 1 or 2 rows of numbers correspond to the number of
|
|
buckets of each size between C<SMALLEST> and C<GREATEST>. In the
|
|
first row the sizes (memory footprints) of buckets are powers of two
|
|
(or possibly one page greater). In the second row (if present) the
|
|
memory footprints of the buckets are between memory footprints of two
|
|
buckets "above".
|
|
|
|
Say, with the above example the memory footprints are (with current
|
|
algorithm)
|
|
|
|
free: 8 16 32 64 128 256 512 1024 2048 4096 8192
|
|
4 12 24 48 80
|
|
|
|
With a non-C<DEBUGGING> perl, buckets of 128 bytes and larger have a
4-byte overhead, so an 8192-byte bucket can hold allocations of up to
8188 bytes.
|
|
|
|
=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>
|
|
|
|
The first two fields give the total amount of memory perl sbrk()ed,
|
|
and number of sbrk()s used. The third number is what perl thinks
|
|
about continuity of returned chunks. As long as this number is
|
|
positive, malloc() will assume that it is probable that sbrk() will
|
|
provide continuous memory.
|
|
|
|
The amounts sbrk()ed by external libraries are not counted.
|
|
|
|
=item C<pad: 0>
|
|
|
|
The amount of sbrk()ed memory needed to keep buckets aligned.
|
|
|
|
=item C<heads: 2192>
|
|
|
|
While memory overhead of bigger buckets is kept inside the bucket, for
|
|
smaller buckets it is kept in separate areas. This field gives the
|
|
total size of these areas.
|
|
|
|
=item C<chain: 0>
|
|
|
|
malloc() may want to subdivide a bigger bucket into smaller buckets.
|
|
If only a part of the deceased-bucket is left non-subdivided, the rest
|
|
is kept as an element of a linked list. This field gives the total
|
|
size of these chunks.
|
|
|
|
=item C<tail: 6144>
|
|
|
|
To minimize the number of sbrk()s, malloc() asks for more memory. This
|
|
field gives the size of the yet-unused part, which is sbrk()ed, but
|
|
never touched.
|
|
|
|
=back
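
The fields explained above can be checked against the sample report with a
little arithmetic. The sketch below recomputes the C<free> and C<used>
totals of the "after compilation" report from the bucket counts, using the
memory footprints listed earlier and the 4-byte overhead of buckets of 128
bytes and up. The helper names C<usable> and C<total_bytes> are invented
for this example. That the totals plus the "odd ends" add up to the
sbrk()ed amount is an observation about this sample report, not a
documented guarantee.

```perl
use strict;
use warnings;

# Memory footprints of the buckets in the sample report, row by row.
my @row1 = (8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192);
my @row2 = (4, 12, 24, 48, 80);

# Usable size: the footprint minus the 4-byte overhead that buckets of
# 128 bytes and up carry on a non-DEBUGGING perl.
sub usable {
    my ($footprint) = @_;
    return $footprint >= 128 ? $footprint - 4 : $footprint;
}

# Sum of count * usable size over both rows of a "free" or "used" entry.
sub total_bytes {
    my ($counts1, $counts2) = @_;
    my $sum = 0;
    $sum += $counts1->[$_] * usable($row1[$_]) for 0 .. $#row1;
    $sum += $counts2->[$_] * usable($row2[$_]) for 0 .. $#row2;
    return $sum;
}

my $free = total_bytes([130, 117, 28, 7, 9, 0, 2, 2, 1, 0, 0],
                       [437, 61, 36, 0, 5]);
my $used = total_bytes([125, 137, 161, 55, 7, 8, 6, 16, 2, 0, 1],
                       [74, 109, 304, 84, 20]);

# Odd ends of the same report: pad+heads+chain+tail = 0+636+0+2048.
my ($pad, $heads, $chain, $tail) = (0, 636, 0, 2048);
my $total = $free + $used + $pad + $heads + $chain + $tail;

print "free=$free used=$used total=$total\n";
# prints: free=14216 used=60924 total=77824
```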
|
|
|
|
=head2 Example of using B<-DL> switch
|
|
|
|
Below we show how to analyse memory usage by
|
|
|
|
do 'lib/auto/POSIX/autosplit.ix';
|
|
|
|
The file in question contains a header and 146 lines similar to
|
|
|
|
sub getcwd ;
|
|
|
|
B<Note:> I<the discussion below supposes 32-bit architecture. In the
|
|
newer versions of perl the memory usage of the constructs discussed
|
|
here is much improved, but the story discussed below is a real-life
|
|
story. This story is very terse, and assumes more than cursory
|
|
knowledge of Perl internals.>
|
|
|
|
Here is the itemized list of Perl allocations performed during parsing
|
|
of this file:
|
|
|
|
!!! "after" at test.pl line 3.
|
|
Id subtot 4 8 12 16 20 24 28 32 36 40 48 56 64 72 80 80+
|
|
0 02 13752 . . . . 294 . . . . . . . . . . 4
|
|
0 54 5545 . . 8 124 16 . . . 1 1 . . . . . 3
|
|
5 05 32 . . . . . . . 1 . . . . . . . .
|
|
6 02 7152 . . . . . . . . . . 149 . . . . .
|
|
7 02 3600 . . . . . 150 . . . . . . . . . .
|
|
7 03 64 . -1 . 1 . . 2 . . . . . . . . .
|
|
7 04 7056 . . . . . . . . . . . . . . . 7
|
|
7 17 38404 . . . . . . . 1 . . 442 149 . . 147 .
|
|
9 03 2078 17 249 32 . . . . 2 . . . . . . . .
|
|
|
|
|
|
To see this list, insert two C<warn('!...')> statements around the call:
|
|
|
|
warn('!');
|
|
do 'lib/auto/POSIX/autosplit.ix';
|
|
warn('!!! "after"');
|
|
|
|
and run it with the B<-DL> option. The first warn() will print memory
|
|
allocation info before the parsing of the file, and will memorize the
|
|
statistics at this point (we ignore what it prints). The second warn()
|
|
will print increments relative to these memorized statistics. This is the
|
|
above printout.
|
|
|
|
Different I<Id>s on the left correspond to different subsystems of
the perl interpreter; they are just the first argument given to the perl
memory allocation API New(). To find what C<9 03> means, C<grep> the perl
|
|
source for C<903>. You will see that it is F<util.c>, function
|
|
savepvn(). This function is used to store a copy of existing chunk of
|
|
memory. Using a C debugger, one can see that it is called either
|
|
directly from gv_init(), or via sv_magic(), and gv_init() is called
|
|
from gv_fetchpv() - which is called from newSUB().
|
|
|
|
B<Note:> to reach this place in the debugger and skip all the calls to
|
|
savepvn during the compilation of the main script, set a C breakpoint
|
|
in Perl_warn(), C<continue> until this point is reached, I<then> set
|
|
breakpoint in Perl_savepvn(). Note that you may need to skip a
|
|
handful of Perl_savepvn() which do not correspond to mass production
|
|
of CVs (there are more C<903> allocations than 146 similar lines of
|
|
F<lib/auto/POSIX/autosplit.ix>). Note also that C<Perl_> prefixes are
|
|
added by macroization code in perl header files to avoid conflicts
|
|
with external libraries.
|
|
|
|
Anyway, we see that C<903> ids correspond to creation of globs, twice
|
|
per glob - for glob name, and glob stringification magic.
|
|
|
|
Here are explanations for other I<Id>s above:
|
|
|
|
=over
|
|
|
|
=item C<717>
|
|
|
|
is for creation of bigger C<XPV*> structures. In the above case it
|
|
creates 3 C<AV> per subroutine, one for a list of lexical variable
|
|
names, one for a scratchpad (which contains lexical variables and
|
|
C<targets>), and one for the array of scratchpads needed for
|
|
recursion.
|
|
|
|
It also creates a C<GV> and a C<CV> per subroutine (all called from
|
|
start_subparse()).
|
|
|
|
=item C<002>
|
|
|
|
Creates C array corresponding to the C<AV> of scratchpads, and the
|
|
scratchpad itself (the first fake entry of this scratchpad is created
|
|
though the subroutine itself is not defined yet).
|
|
|
|
It also creates C arrays to keep data for the stash (this is one HV,
|
|
but it grows, thus there are 4 big allocations: the big chunks are not
|
|
freed, but are kept as additional arenas for C<SV> allocations).
|
|
|
|
=item C<054>
|
|
|
|
creates a C<HEK> for the name of the glob for the subroutine (this
|
|
name is a key in a I<stash>).
|
|
|
|
Big allocations with this I<Id> correspond to allocations of new
|
|
arenas to keep C<HE>.
|
|
|
|
=item C<602>
|
|
|
|
creates a C<GP> for the glob for the subroutine.
|
|
|
|
=item C<702>
|
|
|
|
creates the C<MAGIC> for the glob for the subroutine.
|
|
|
|
=item C<704>
|
|
|
|
creates I<arenas> which keep SVs.
|
|
|
|
=back
|
|
|
|
=head2 B<-DL> details
|
|
|
|
If Perl is run with the B<-DL> option, then warn()s which start with `!'
|
|
behave specially. They print a list of I<categories> of memory
|
|
allocations, and statistics of allocations of different sizes for
|
|
these categories.
|
|
|
|
If the warn() string starts with
|
|
|
|
=over
|
|
|
|
=item C<!!!>
|
|
|
|
print changed categories only, print the differences in counts of allocations;
|
|
|
|
=item C<!!>
|
|
|
|
print grown categories only; print the absolute values of counts, and totals;
|
|
|
|
=item C<!>
|
|
|
|
print nonempty categories, print the absolute values of counts and totals.
|
|
|
|
=back
|
|
|
|
=head2 Limitations of B<-DL> statistics
|
|
|
|
If an extension or an external library does not use Perl API to
|
|
allocate memory, these allocations are not counted.
|
|
|
|
=head1 Debugging regular expressions
|
|
|
|
There are two ways to enable debugging output for regular expressions.
|
|
|
|
If your perl is compiled with C<-DDEBUGGING>, you may use the
|
|
B<-Dr> flag on the command line.
|
|
|
|
Otherwise, one can C<use re 'debug'>, which has effects both at
|
|
compile time, and at run time (and is I<not> lexically scoped).
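
A minimal way to produce such output is sketched below, using the pattern
and target string from the examples in this section. The debugging text
goes to STDERR, and its exact format differs between perl versions, so it
need not match the listings shown here line for line.

```perl
use strict;
use warnings;
use re 'debug';   # regex debugging output goes to STDERR

# The pattern and the string are the ones used in the examples below.
my $matched = 'abcdefg__gh__' =~ /[bc]d(ef*g)+h[ij]k$/;

# The string has no "h" followed by "i" or "j" and a final "k",
# so the match fails.
print $matched ? "matched\n" : "no match\n";   # prints: no match
```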
|
|
|
|
=head2 Compile-time output
|
|
|
|
The debugging output for the compile time looks like this:
|
|
|
|
compiling RE `[bc]d(ef*g)+h[ij]k$'
|
|
size 43 first at 1
|
|
1: ANYOF(11)
|
|
11: EXACT <d>(13)
|
|
13: CURLYX {1,32767}(27)
|
|
15: OPEN1(17)
|
|
17: EXACT <e>(19)
|
|
19: STAR(22)
|
|
20: EXACT <f>(0)
|
|
22: EXACT <g>(24)
|
|
24: CLOSE1(26)
|
|
26: WHILEM(0)
|
|
27: NOTHING(28)
|
|
28: EXACT <h>(30)
|
|
30: ANYOF(40)
|
|
40: EXACT <k>(42)
|
|
42: EOL(43)
|
|
43: END(0)
|
|
anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating)
|
|
stclass `ANYOF' minlen 7
|
|
|
|
The first line shows the pre-compiled form of the regexp, and the
|
|
second shows the size of the compiled form (in arbitrary units,
|
|
usually 4-byte words) and the label I<id> of the first node which
|
|
does a match.
|
|
|
|
The last line (split into two lines in the above) contains the optimizer
|
|
info. In the example shown, the optimizer found that the match
|
|
should contain a substring C<de> at the offset 1, and substring C<gh>
|
|
at some offset between 3 and infinity. Moreover, when checking for
|
|
these substrings (to abandon impossible matches quickly) it will check
|
|
for the substring C<gh> before checking for the substring C<de>. The
|
|
optimizer may also use the knowledge that the match starts (at the
|
|
C<first> I<id>) with a character class, and the match cannot be
|
|
shorter than 7 chars.
|
|
|
|
The fields of interest which may appear in the last line are
|
|
|
|
=over
|
|
|
|
=item C<anchored> I<STRING> C<at> I<POS>
|
|
|
|
=item C<floating> I<STRING> C<at> I<POS1..POS2>
|
|
|
|
see above;
|
|
|
|
=item C<checking floating/anchored>
|
|
|
|
which substring to check first;
|
|
|
|
=item C<minlen>
|
|
|
|
the minimal length of the match;
|
|
|
|
=item C<stclass> I<TYPE>
|
|
|
|
The type of the first matching node.
|
|
|
|
=item C<noscan>
|
|
|
|
which advises to not scan for the found substrings;
|
|
|
|
=item C<isall>
|
|
|
|
which says that the optimizer info is in fact all that the regular
|
|
expression contains (thus one does not need to enter the RE engine at
|
|
all);
|
|
|
|
=item C<GPOS>
|
|
|
|
if the pattern contains C<\G>;
|
|
|
|
=item C<plus>
|
|
|
|
if the pattern starts with a repeated char (as in C<x+y>);
|
|
|
|
=item C<implicit>
|
|
|
|
if the pattern starts with C<.*>;
|
|
|
|
=item C<with eval>
|
|
|
|
if the pattern contains eval-groups (see L<perlre/(?{ code })>);
|
|
|
|
=item C<anchored(TYPE)>
|
|
|
|
if the pattern may match only at a handful of places (with C<TYPE> being
C<BOL>, C<MBOL>, or C<GPOS>, see the table below).
|
|
|
|
=back
|
|
|
|
If a substring is known to match at end-of-line only, it may be
|
|
followed by C<$>, as in C<floating `k'$>.
|
|
|
|
The optimizer-specific info is used to avoid entering the (slow) RE
engine on strings which will definitely not match. If the C<isall> flag
is set, a call to the RE engine may be avoided even when the optimizer
has found an appropriate place for the match.
|
|
|
|
The rest of the output contains the list of I<nodes> of the compiled
|
|
form of the RE. Each line has the format
|
|
|
|
C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>)
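
As an illustration of this line format, the following sketch splits node
lines from the compile-time dump above into their fields. The helper name
C<parse_node_line> and the regular expression are assumptions made for this
example; they cover the lines shown in this document and are not a parser
for every perl version's output.

```perl
use strict;
use warnings;

# Hypothetical helper: split one node line of the compile-time dump into
# its id, type, optional info, and next-id fields.
sub parse_node_line {
    my ($line) = @_;
    my ($id, $type, $info, $next) =
        $line =~ /^\s*(\d+):\s+(\w+)\s*(.*?)\s*\((\d+)\)\s*$/
        or return;
    return { id => $id, type => $type, info => $info, next => $next };
}

# Three lines taken from the compile-time output shown earlier.
for my $line ('13: CURLYX {1,32767}(27)', '20: EXACT <f>(0)', '42: EOL(43)') {
    my $node = parse_node_line($line) or next;
    printf "%-6s id=%d next=%d info=%s\n", @{$node}{qw(type id next info)};
}
```

Following the C<next> field from node to node gives the order in which the
engine tries the nodes when nothing fails; a C<next> of 0 marks a node with
no successor.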
|
|
|
|
=head2 Types of nodes
|
|
|
|
Here is the list of possible types with short descriptions:
|
|
|
|
# TYPE arg-description [num-args] [longjump-len] DESCRIPTION
|
|
|
|
# Exit points
|
|
END no End of program.
|
|
SUCCEED no Return from a subroutine, basically.
|
|
|
|
# Anchors:
|
|
BOL no Match "" at beginning of line.
|
|
MBOL no Same, assuming multiline.
|
|
SBOL no Same, assuming singleline.
|
|
EOS no Match "" at end of string.
|
|
EOL no Match "" at end of line.
|
|
MEOL no Same, assuming multiline.
|
|
SEOL no Same, assuming singleline.
|
|
BOUND no Match "" at any word boundary
|
|
BOUNDL no Match "" at any word boundary
|
|
NBOUND no Match "" at any word non-boundary
|
|
NBOUNDL no Match "" at any word non-boundary
|
|
GPOS no Matches where last m//g left off.
|
|
|
|
# [Special] alternatives
|
|
ANY no Match any one character (except newline).
|
|
SANY no Match any one character.
|
|
ANYOF sv Match character in (or not in) this class.
|
|
ALNUM no Match any alphanumeric character
|
|
ALNUML no Match any alphanumeric char in locale
|
|
NALNUM no Match any non-alphanumeric character
|
|
NALNUML no Match any non-alphanumeric char in locale
|
|
SPACE no Match any whitespace character
|
|
SPACEL no Match any whitespace char in locale
|
|
NSPACE no Match any non-whitespace character
|
|
NSPACEL no Match any non-whitespace char in locale
|
|
DIGIT no Match any numeric character
|
|
NDIGIT no Match any non-numeric character
|
|
|
|
# BRANCH The set of branches constituting a single choice are hooked
|
|
# together with their "next" pointers, since precedence prevents
|
|
# anything being concatenated to any individual branch. The
|
|
# "next" pointer of the last BRANCH in a choice points to the
|
|
# thing following the whole choice. This is also where the
|
|
# final "next" pointer of each individual branch points; each
|
|
# branch starts with the operand node of a BRANCH node.
|
|
#
|
|
BRANCH node Match this alternative, or the next...
|
|
|
|
# BACK Normal "next" pointers all implicitly point forward; BACK
|
|
# exists to make loop structures possible.
|
|
# not used
|
|
BACK no Match "", "next" ptr points backward.
|
|
|
|
# Literals
|
|
EXACT sv Match this string (preceded by length).
|
|
EXACTF sv Match this string, folded (prec. by length).
|
|
EXACTFL sv Match this string, folded in locale (w/len).
|
|
|
|
# Do nothing
|
|
NOTHING no Match empty string.
|
|
# A variant of above which delimits a group, thus stops optimizations
|
|
TAIL no Match empty string. Can jump here from outside.
|
|
|
|
# STAR,PLUS '?', and complex '*' and '+', are implemented as circular
|
|
# BRANCH structures using BACK. Simple cases (one character
|
|
# per match) are implemented with STAR and PLUS for speed
|
|
# and to minimize recursive plunges.
|
|
#
|
|
STAR node Match this (simple) thing 0 or more times.
|
|
PLUS node Match this (simple) thing 1 or more times.
|
|
|
|
CURLY sv 2 Match this simple thing {n,m} times.
|
|
CURLYN no 2 Match next-after-this simple thing
|
|
# {n,m} times, set parenths.
|
|
CURLYM no 2 Match this medium-complex thing {n,m} times.
|
|
CURLYX sv 2 Match this complex thing {n,m} times.
|
|
|
|
# This terminator creates a loop structure for CURLYX
|
|
WHILEM no Do curly processing and see if rest matches.
|
|
|
|
# OPEN,CLOSE,GROUPP ...are numbered at compile time.
|
|
OPEN num 1 Mark this point in input as start of #n.
|
|
CLOSE num 1 Analogous to OPEN.
|
|
|
|
REF num 1 Match some already matched string
|
|
REFF num 1 Match already matched string, folded
|
|
REFFL num 1 Match already matched string, folded in loc.
|
|
|
|
# grouping assertions
|
|
IFMATCH off 1 2 Succeeds if the following matches.
|
|
UNLESSM off 1 2 Fails if the following matches.
|
|
SUSPEND off 1 1 "Independent" sub-RE.
|
|
IFTHEN off 1 1 Switch, should be preceded by switcher.
|
|
GROUPP num 1 Whether the group matched.
|
|
|
|
# Support for long RE
|
|
LONGJMP off 1 1 Jump far away.
|
|
BRANCHJ off 1 1 BRANCH with long offset.
|
|
|
|
# The heavy worker
|
|
EVAL evl 1 Execute some Perl code.
|
|
|
|
# Modifiers
|
|
MINMOD no Next operator is not greedy.
|
|
LOGICAL no Next opcode should set the flag only.
|
|
|
|
# This is not used yet
|
|
RENUM off 1 1 Group with independently numbered parens.
|
|
|
|
# This is not really a node, but an optimized away piece of a "long" node.
|
|
# To simplify debugging output, we mark it as if it were a node
|
|
OPTIMIZED off Placeholder for dump.
|
|
|
|
=head2 Run-time output
|
|
|
|
First of all, when doing a match, one may get no run-time output even
|
|
if debugging is enabled. This means that the RE engine was never
entered; all of the job was done by the optimizer.
|
|
|
|
If the RE engine was entered, the output may look like this:
|
|
|
|
Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__'
|
|
Setting an EVAL scope, savestack=3
|
|
2 <ab> <cdefg__gh_> | 1: ANYOF
|
|
3 <abc> <defg__gh_> | 11: EXACT <d>
|
|
4 <abcd> <efg__gh_> | 13: CURLYX {1,32767}
|
|
4 <abcd> <efg__gh_> | 26: WHILEM
|
|
0 out of 1..32767 cc=effff31c
|
|
4 <abcd> <efg__gh_> | 15: OPEN1
|
|
4 <abcd> <efg__gh_> | 17: EXACT <e>
|
|
5 <abcde> <fg__gh_> | 19: STAR
|
|
EXACT <f> can match 1 times out of 32767...
|
|
Setting an EVAL scope, savestack=3
|
|
6 <bcdef> <g__gh__> | 22: EXACT <g>
|
|
7 <bcdefg> <__gh__> | 24: CLOSE1
|
|
7 <bcdefg> <__gh__> | 26: WHILEM
|
|
1 out of 1..32767 cc=effff31c
|
|
Setting an EVAL scope, savestack=12
|
|
7 <bcdefg> <__gh__> | 15: OPEN1
|
|
7 <bcdefg> <__gh__> | 17: EXACT <e>
|
|
restoring \1 to 4(4)..7
|
|
failed, try continuation...
|
|
7 <bcdefg> <__gh__> | 27: NOTHING
|
|
7 <bcdefg> <__gh__> | 28: EXACT <h>
|
|
failed...
|
|
failed...
|
|
|
|
The most significant information in the output is about the particular I<node>
|
|
of the compiled RE which is currently being tested against the target string.
|
|
The format of these lines is
|
|
|
|
C< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE>
|
|
|
|
The I<TYPE> info is indented with respect to the backtracking level.
|
|
Other incidental information appears interspersed within.
|
|
|
|
=cut
|