Visible Workings > Trace.java User's Guide

The Trace.Java User's Guide
Brian Marick

Introduction

Definitions

System Setup

Adding Tracing

Examples

Definitions

What does trace output look like?

How, in general, is trace output produced?

What is a trace level?

What is a threshold?

What is a subsystem?

What is a trace destination?


What does trace output look like?

=== 1997/09/22 13:52:09.342 (Incoming.Parcel:Parcel.java:126) EVN
comm: PROBE message detected from IP address 123.44.53.12

Let me explain this output piece by piece.

=== 1997/09/22 13:52:09.342 (Incoming.Parcel:Parcel.java:126) EVN
comm: PROBE message detected from IP address 123.44.53.12

Each trace message is timestamped in a YYYY/MM/DD HH:MM:SS.mmm format. Dates are in local time. Note that the format makes logs sortable by date and avoids confusion about whether 07/08/97 is a date in July or August.

=== 1997/09/22 13:52:09.342 (Incoming.Parcel:Parcel.java:126) EVN
comm: PROBE message detected from IP address 123.44.53.12

This message came from the Parcel method in package Incoming. Specifically, it came from line 126 of file Parcel.java. Only the last component of a package name is printed to reduce clutter. (That is, "com.visibleworkings.trace.Trace.myMethod" is printed as "Trace.myMethod".)

In some cases (optimizing compilers), Trace.java can't find location information. If it can't find the line number, it prints it as "line?". If it can't find the filename, it prints it as "file?". If it can't find the method name, it prints it as "method?".

=== 1997/09/22 13:52:09.342 (Incoming.Parcel:Parcel.java:126) EVN
comm: PROBE message detected from IP address 123.44.53.12

The next tag shows the level of the message. This message is an Event message. You can control at runtime or launch time which message levels are printed.

=== 1997/09/22 13:52:09.342 (Incoming.Parcel:Parcel.java:126) EVN
comm: PROBE message detected from IP address 123.44.53.12

The first item on the second line is the subsystem the message came from. Subsystems are user-defined groupings of code, typically larger than a class. You can control tracing by subsystem. For example, you might want great detail from the "comm" subsystem, but less detail from all others.

=== 1997/09/22 13:52:09.342 (Incoming.Parcel:Parcel.java:126) EVN
comm: PROBE message detected from IP address 123.44.53.12

And, finally, the message is printed.

 

How, in general, is trace output produced?

Each class that uses Trace.java will have a static private Trace object. This trace object is shared with other classes in the same subsystem. The Trace object has several different methods; each posts a trace message at a different level. Here is an example:

subsystemTrace.eventm("message");

That method posts a Event message, but only if the threshold allows. For more details, see the programmer's guide.

 

What is a trace level?

Trace messages have several levels. Lower levels provide more detail. You can control at launch time or runtime which level of messages you want to see. Different subsystems can be controlled independently.

Error
Error messages report on some internal program error.
Warning
Warning messages are not as serious as errors, but they're signs of something odd. A user seeing them might be inspired to save buffers or otherwise protect herself from the risk of something bad happening soon. A programmer debugging a problem from a log might take a Warning as being the the point where things started to go wrong.
World
World messages record major milestones in the life of the subsystem. A subsystem might post a World message when it starts, or when it shuts down. These messages are normally stored in a permanent log, so they should record events of interest at some time arbitrarily distant. They should occur much less often than once per second, else the log will have too much detail to be useful.
Usage
Usage messages are used to answer the question "what did the user do up to the point the bug appeared?" They record significant user interface events. "Significant" is a fuzzy term. Generally, Usage messages should be used to record user actions that cause calls between subsystems. Thus, for example, the GUI subsystem would not record keypresses as Usage messages if those keypresses are buffered up until an "OK" button is pressed. It would record the application of a dialog if that caused calls into other subsystems.

Infrastructure code should use the Usage level for messages that allow the programmer to figure out roughly how the program got to where it broke when someone else - probably an end user - ran it. The reason? Usage messages are by default accessible in a transient buffer, so customer service can ask for them. These messages might make the difference between a bug you can't reproduce and one you can.
Event
The key difference between the Event level and the Usage level is that the Usage messages are by default stored in the transient buffer, whereas Event messages are only stored if you ask for them in advance.

As a rule of thumb, log significant calls into your subsystem as Event messages. "Significant" calls are, for example, those that change subsystem state in a way that matters to later execution of the program.
Debug
Debug messages are typically added to code to help with specific bugs. They may or may not be left in to help with future bugs.
Verbose
Verbose messages provide even more detail than debug. They're probably mainly used when first getting the code to work.

These level abbreviations are used in the trace output:

ERR Error
WRN Warning
WLD World
USE Usage
EVN Event
DBG Debug
VRB Verbose

 

What is a threshold?

If every single Debug level message were printed for a whole system, you'd be swamped with output. Therefore, you can adjust the threshold at launch time or runtime. If you set the threshold to Usage, you will only see Error, Warning, World, and Usage messages. Messages at a lower level will be discarded.

Each subsystem can have a different threshold. You might want to look at Debug messages for the "comm" subsystem, but only Error messages for the rest of the system.

Trace.java provides two different destinations for trace messages. They have separate thresholds that can be controlled independently. The threshold can never be set so high that Error messages don't appear.

Here are some examples of how changing the threshold might help:

  1. By default, Event level and lower messages are discarded. But when person acting as the first line of support gets a big nasty data set associated with a bug, she can change the default and run the data set with Event level logging turned on. Those log messages may help her funnel the bug report to the right programmer.
  2. That programmer might be able to understand the problem by looking at only the data set and the Event-level log, or she may need to crank the level up to Debug or even Verbose to get more detail. She may need to add more debugging trace messages.
  3. A tester might run tests with Event-level logging turned on. While working through the user interface, she might look at the log scrolling past in another window. In my experience, this is a good way to notice oddities to exploit.

 

What is a subsystem?

Trace.java assumes that the whole system can be roughly partitioned into independent subsystems. A subsystem is usually bigger than a class. For example, all the GUI code might be part of one subsystem, as might be the network interface code, or the database interface code, or the resource management code, etc. Subsystems will often have relatively stable APIs.

Trace.java doesn't assume that there's some central authority doling out subsystem names. Any programmer can create her own. However, a central authority may have predefined some for you. Check with the person who installed Trace.java.

 

What is a trace destination?

Tracing messages have two possible destinations: the log or the transient buffer. The log is usually a disk file. It can also be standard (terminal) output. The transient buffer is a fixed size ring buffer. Its contents are normally invisible. However, it can be dumped into the log.

The same message can go to both the log and the transient buffer. Typically, the transient buffer contains more detail than the log. The log by default has a World threshold, so it gets Error, Warning, and World messages. The transient buffer gets all those by default, plus Usage messages.

Unless the transient buffer is dumped into the log, its extra detail never bothers anyone and goes away when the buffer fills up. Unlike the log, older messages in the transient buffer are thrown away to make room for new ones.

However, suppose an error happens. The system's error handling code can arrange to dump the transient buffer into the log. That records the last 500 (or so) user actions. That can be of great help to customer service.


Introduction

Definitions

System Setup

Adding Tracing

Examples

Copyright © 2000 by Brian Marick. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, draft v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub).

Please send comments on this document to marick@visibleworkings.com.