When creating a Subset, each new file link is defined as either a
"one-to-one" or a "one-to-many" relationship. For
"one-to-many" relationships, the anticipated number of records
must be specified. This number is used to optimize the code generated
for the link. This number is not a maximum - occurrences of the link
that exceed this number will be correctly processed. You should take
some care to choose an appropriate value as it has the potential to
affect performance.
This is an important field that can have a significant impact on
performance if inappropriately specified. It represents an optimum
number of records on the "many" side of the relationship for
each record on the "one" side.
The reason for the significance is the way that OPEN/DB performs its
processing. For each source record read, OPEN/DB will read all target
records and construct an array of the selected ones. The number of
records field determines how big this array will be (and consequently
how big associated reports programs will be).
The number is used to allocate working storage in the generated
programs to keep track of all the (selected) linked records in the
target file for any one record in the source file.
As a rule of thumb and taking into account your knowledge of the
application and the way it is used in your organisation, you should aim
to set this number high enough to cater for 95% or more of the cases.
Note that: Unless the distribution of records is almost perfectly even,
the number you specify should be somewhat larger than the
"average".
Taken to extremes, too large a value can lead to performance
degradation arising from excessive paging. In very extreme cases,
particularly where multiple nested groups are used, compilation failure
is possible. This should not arise providing you give proper
consideration to setting an appropriate value. Certainly, you should
avoid settings that use the maximum (9,999) except in the most simple 2
level hierarchies. If the value set is too small, the program will run
out of room and will have to store additional records externally to the
program. This can result in performance degradation.
Remember that this number relates to the records selected by a given
report during execution. In some applications, typical record selection
tests can be predicted (eg. "Customer Type" or "Product
Group" or a Transaction Date range etc.). An understanding of the
business need for a given Subset can be helpful in determining the best
value for "number of records". Keep in mind that the overall
number of records is multiplied at each hierarchical relationship. Where
Group 1 goes to Group 2 in a 1:100 relationship, the anticipated number
of records is 101 (ie. 1 + 1 * 100). If Group 2 then goes to Group 3
also in a 1:100 relationship, then the equation becomes (1 + 1 * 100
*100 = 10,001).
The possible effects on performance can be summarized as
follows:
- If the number specified is too small, then excess records cause
additional processing. If this occurs too frequently in execution,
then performance degradation will result.
- If the number specified is larger than required, then the
generated program will be larger and will use more storage than is
necessary. However this is usually preferable providing the excess
represents no more than a reasonable "margin for error"
because the impact upon performance is less significant.
In exceptional cases of imbalance (for example there are 200 records
on the source file and 20,000 records on the target file, but 17,000 of
the target records are related to a single source record), consider
developing the Subset using the current target file as the source file
and linking the current source file as a "one-to-one" target
file in the new Subset.