<= Previous | Next => | Table of Contents DataVision User's Manual

11 Understanding Report XML Files

This section describes some of the details of the DataVision XML file format.

There are a few general parts to each XML file: the report, the data source description, parameter definitions, formula definitions, and the report section areas and sections that contain the fields and strings to output.

The DTD file report.dtd is a mostly accurate representation of a legal report file. The DTD snippets below may not be completely up-to-date. When in doubt, see the DTD file or---even better---the source code.

All sizes used in a report file are in points (72 points = 1 inch). All coordinates and lengths are floating point, but may be truncated by layout engines. Colors are stored as strings of the form "red;green;blue;alpha". Each component is an integer from 0 to 255.

For an example report XML file that uses a JDBC database connection, see postgresql.xml in the examples directory. For an example that uses a character-separated file as a data source, see charsep.xml in the same directory.

11.1 Elements

11.1.1 report

The "report" element is the root element of the XML. The attribute "dtd-version" defines the version of the report XML format used to create this report XML file. If this attribute is missing, its value is assumed to be "0.1".

The report element's remaining attributes ("name", "title", and "author") are values that may be edited and displayed in the report by using Special Fields. See also the "description" element.

11.1.2 bean-scripting-framework

This element and its contents describe the Bean Scripting Framework (BSF) scripting languages available to the report. The "default-language" attribute is applied to all formulas that don't specify a language. (This happens when the formula is read in. Changing the default language when designing a report does not change the language of those formulas.)

Inner "language" elements have two attributes: "name" and "class". The name may be any string; it is the string displayed to report developers in drop-down scripting language lists. The class name is used by DataVision to load the BSF language.

The default language must have a "language" element that describes it.

Since DataVision ships with JRuby, it also pre-defines the language "Ruby". See Scripting.java for the class name. Or, since it is always written out when you save a report XML file, see any report XML file.

11.1.3 paper

This element's "name" attribute describes the paper size used by this report. The name attribute's value must be one of the strings found in the paper.properties file contained in the DataVision.jar JAR file.

jimm/properties/paper.properties is copied into DataVision.jar; that is the version used by DataVision when it is running. If you need to change the paper size values, edit the file paper.properties and then replace it in the JAR file by using the commands

> cd jimm/properties
> jar uf ../../lib/DataVision.jar paper.properties

11.1.4 description

This element contains the descriptive text entered in the Report Summary dialog.

11.1.5 source

The "source" element contains elements describing the data source this report will use to read data. Just like a report's output can be sent to any one of a number of layout engines, a report's input can come from one of a number of data sources. The "source" element contains the element that defines the data source and the element that defines the query. The data sources implemented are database, a JDBC database connection, and charsep, a text file whose lines are rows and columns are separated by a character such as comma or tab.

The data source can be overridden on the command line. See Running DataVision from the Command Line.

11.1.5.1 database

The "database" element describes the database that is used in the report. See Connecting to the Database, for how to set these values.

driverClassName is the JDBC driver class name.

connInfo is the magic connection info string used by the database driver to connect to your database. It probably includes the database name (for example, "jobs" or "salesdata").

name is the name of the database.

username is your database user name.

11.1.5.2 charsep

The "charsep" element describes the contents of a text file whose lines contain rows of data and whose columns are separated by a character (typically a comma or a tab). The "sep-char" attribute determines the character. The separator character can be overridden on the command line. See Running DataVision from the Command Line.

The report needs to know the name and type of each column in a character-separated file. The "charsep" element must contain either a list of "column" elements or a "metadata-url" element.

When using a charsep data source, the "Select Records" text which is normally added to a SQL query's WHERE clause is instead evaluated as a Bean Scripting Framework (BSF) script. The result is used to determine if a record is used or ignored.

11.1.5.2.1 column

The "column" element's attributes include "name", "type", and optionally "date-format". The type must be one of "string", "number", or "date".

A date column may optionally specify the "date-format" string which describes how the dates in that column are formatted. The default format is "yyyy-MM-dd". The format string follows the same rules as the date formatting string for a report field. See The Format Dialog for an explanation of the date formatting string.

Here is an example charsep data source:

<!-- The column names happen to be the same as database -->
<!-- table.column names, but that's just a coincidence. -->
<!-- OK, I'm lazy. -->
<charsep sep-char=",">
  <!-- We need to define column names and types. For date -->
  <!-- columns, we need to define a string that will be -->
  <!-- used to parse the string so we can turn it into a -->
  <!-- java.lang.Date object. In this example, you could -->
  <!-- skip the date-format string; the default parse -->
  <!-- string (yyyy-MM-dd) is the same string. -->
  <column name="office.name" type="string" />
  <column name="jobs.id" type="number" />
  <column name="jobs.title" type="string" />
  <column name="jobs.hourly rate" type="number" />
  <column name="jobs.post_date" type="date"
          date-format="yyyy-MM-dd" />
</charsep>
11.1.5.2.2 metadata-url

If the "charsep" element contains a "metadata-url" element, that means the metadata is stored in an external XML file. The "metadata-url" element looks like this:

<!-- A charsep data source whose metadata is described -->
<!-- by an external XML file. -->
<charsep sep-char=",">
  <metadata-url>http://localhost/foo/metadata.xml</metadata-url>
</charsep>

Note that the URL can be a file URL or any other legal URL.

The metadata XML file (in this example, metadata.xml) must conform to the DTD found in metadata.dtd, which is quite simple. In that file, the root "metadata" element contains a list of "column" elements. Here is a complete example of a metadata XML file:

<!-- See metadata.dtd -->
<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <!-- We need to define column names and types. For date -->
  <!-- columns, we need to define a string that will be -->
  <!-- used to parse the string so we can turn it into a -->
  <!-- java.lang.Date object. In this example, you could -->
  <!-- skip the date-format string; the default parse -->
  <!-- string (yyyy-MM-dd) is the same string. -->
  <column name="office.name" type="string" />
  <column name="jobs.ID" type="number" />
  <column name="jobs.title" type="string" />
  <column name="jobs.hourly rate" type="number" />
  <column name="jobs.post_date" type="date"
          date-format="yyyy-MM-dd" />
</metadata>

11.1.5.3 query

The "query" element must come after the "database" or "charsep" element. It contains up to three elements: the joins, the sorts, and the where clause. The report uses these and the list of fields it contains to create a SQL query.

Each "join" element represents one of the links between tables that the user specifies.

There is a "sort" element for each group's field and for any other sorts the user specifies.

Finally, the "where" where clause element specifies an additional user-specified clause to use. The text of the where clause is contained in a CDATA element. When generating SQL, the text becomes part of the SQL WHERE clause. When used with a charsep data source, the text is evaluated as a BSF script and the result is used to determine if a record is used or ignored.

11.1.6 parameters

Parameters are defined and appear in the XML before they are used. Each parameter has an id, a type (string, numeric, or date), a name, a question to display when asking the user for a value, an arity, and an optional default value or values.

Types include "string", "numeric", "boolean", and "date".

The arity attribute describes how many values are allowed in the answer. The arity may be "single", "range", "list-single", or "list-multiple".

Default values for each parameter are contained as text within "default" elements.

11.1.7 formulas

Formulas are defined and appear in the XML before they are used. Each formula has an id, a name, and formula text. The formula text is a BSF script.

11.2 Report Sections

A report may have zero or more parameters, formulas, report headers, report footers, page headers, page footers, and groups. It will have one or more detail sections. Each section may have zero or more fields and lines.

This DTD snippet says it best:

<!ELEMENT report      (description, formula?, paper, field, usercols?,
		       source, subreports?, parameters?, formulas?,
		       headers?, footers?, page?, groups?,
		       details)

>
<!ELEMENT headers     section*>
<!ELEMENT footers     section*>
<!ELEMENT page        headers? footers?>
<!ELEMENT group       headers? footers?>
<!-- column is deprecated, replaced by groupable-id and -->
<!--  groupable-type. For now, both will work. -->
<!ATTLIST group       column CDATA #REQUIRED>
<!ATTLIST group       groupable-id CDATA #REQUIRED>
<!ATTLIST group       groupable-type CDATA #REQUIRED>
<!ELEMENT details     section*>
<!ELEMENT section     (field|line)*>

11.2.1 Fields

Fields display text, including database values. Fields must have a bounds rectangle, and can have borders and formatting descriptions.

<!ELEMENT field        (bounds|format|border)*>
<!ATTLIST field type   (column
                           |formula
                           |parameter
                           |text
                           |special
                           |image
                           |sum
                           |average
                           |count
                           |min
                           |max
                           |stdev)
                                        #REQUIRED
                   value  CDATA         #REQUIRED
>
<!ATTLIST field        visible       (true|false)
>

The order of the bounds, format, and border elements is not important. I don't know how to describe that with DTD, though.

There are six types of fields that a report can display: text, database columns, parameters, formulas, aggregates, and special fields. The "value" attribute determines what is displayed.

text Text fields display the value attribute verbatim.

column Column fields display the value of the database column named by the value attribute (for example, "member.member id"). Yes, table and column names may contain spaces.

parameter Parameter fields display the value obtained by substituting the parameter value whose id is contained in this field's value attribute. (Remember, parameters are defined separately, and each one has an id.) The underlying parameter values are retrieved from the user each time the report is run.

formula Formula fields display the value obtained by executing the formula whose id is contained in this field's value attribute. (Remember, formulas are defined separately, and each one has an id.)

special Special fields display one of the following values, named in the value attribute. These values are described in Special Values.

sum, min, max, average, mean, median, count, stdev Aggregate fields display the aggregate (sum, min, max, etc.) for the numeric database column, numeric parameter field, or formula field whose id is contained in this field's value attribute.

11.2.2 Lines

Lines, though not yet supported in the Design Window GUI, have been part of the report XML since the beginning of DataVision. They are simple, with a thickness, color, and visibility flag. Lines contain two end points.

<!ELEMENT line     point point>
<!-- Color is stored as "red;green;blue;alpha" -->
<!ATTLIST line  thickness  CDATA
                   color      CDATA
                   visible    (true|false)
>

11.2.3 Suppression Procs

A suppression proc determines if a section should be displayed or not. See Section Suppression Procs for an explanation.

The "suppression-proc" element is contained within a "section" element. It has an optional "hide" attribute and optionally contains a formula containing a BSF script. When the "hide" attribute is "true", the section is always hidden. The default value of "hide" is "false".

The formula contains a BSF script that must return a boolean (true/false) value. For every row, the code is run to determine if the entire section should be hidden for that row only. When the code returns true, the section is hidden.

Here is an example "suppression-proc" element that uses the "hide" attribute to always hide the section:

<suppression-proc hide="true" />

Here is an example that uses a formula to determine when to hide a section:

<suppression-proc>
  <formula id="3" name=""><![CDATA[({jobs.ID} & 1) == 0]]></formula>
</suppression-proc>

11.3 Field Formats

A field's format describes how it will be displayed. Optional format attributes elements specify a field's font family name, alignment, font size, color, attributes (bold, italic, or underline), wrapping, and formatting (for numeric and date fields). The attributes bold, italic, underline, and wrap are boolean and may optionally take the values "true" or "false". The default for all but wrap is false; wrap defaults to true (as of version 0.8.0).

Color is stored as a string of the form "red;green;blue;alpha". Each value is an integer in the range 0 - 255.

The format string (an attribute of the format element; confusing, eh?) determines the print format of numeric and date fields. For a complete description of the format string, see The Format Dialog.

Here is an example field that displays some text, right justified.

<field id="42" type="text" value="My Very Own String">
  <bounds x="0" y="0" width="100" height="20" />
  <format bold italic underline size="24" align="right" />
</field>

<= Previous | Next => | Table of Contents Valid XHTML 1.0! Valid CSS! DataVision User's Manual