The Jacana Collectionwww.jacana.com (Home)
About us
Products
Services
Support
Partners
jumpzone
Sitemap

 

 

 

Top of page
11. Where two similar fields exist in the Subset, does the choice of which field to use have any effect on performance?

In most cases, the answer is yes. The most common reason for apparently identical fields to exist in a Subset is where the field is a key in one or more files. For example, customer number might be a field in the order header, but it is also the key to the customer master file. Consequently, there are two fields defined in the Subset called customer number, and the choice of which field to use can be significant.

Consider a report that lists some fields from the order header file. No customer details are required, only the customer number. If the customer number from the customer master is chosen instead of the customer number from the order header, then OPEN/DB thinks that the customer master file is required and will include it in the report - normally, it would be omitted.

Note that this problem is not functional - whichever customer number is used, the resulting report will be correct. However, there may be an effect on relative performance.

In most cases, this type of problem can be easily avoided by carefully choosing the fields included in the userviews. If the second customer number is not in the userview, then (in general), it will not be used and the problem is avoided. Indeed, when you first build your Subset, OPEN/DB automatically omits such fields from userviews but if you create or amend userviews, be careful with the choice of fields where duplicates exist.

 

Top of page
12. What aspects of the definition of one-to-one relationships in a Subset have an effect on performance?

Most of the factors affecting performance, at least in Subset design, are connected with the definition of one-to-many relationships or the use of OPNQRYF. 

The main performance-related factor in the definition of one-to-one relationships is the choice of parameters controlling OPEN/DB's use of "in-core" logic. Choosing in-core logic instructs OPEN/DB to generate code that will retain the last n used records in program variables, and retrieve the required record from the program variables if possible. By default, this option is selected for every one-to-one relationship that you define, and the default value of n is 20. 

In the majority of cases, this option considerably speeds up data retrieval by reducing the disk I/O that a report needs to perform. In most cases, it will be sufficient to accept the defaults that OPEN/DB proposes. Only in a couple of situations might you wish to vary from the defaults, and you should consider carefully before doing so to ensure that your changes will have the desired effect:

  • There may be some cases where the nature of the relationship and of the data means that the chances of scoring a "hit" using the in-core logic are extremely small. In these cases, you may wish to turn the option off and eliminate the overhead of having OPEN/DB search its 20 last-used records before performing the I/O. This benefit is likely to be marginal in most cases, since the search is a high speed CPU-bound process. The potential performance loss of turning it off inappropriately is likely to be much larger. Therefore, you should be very sure that your reasoning is sound before taking this action.
  • More commonly, you may wish to fine-tune the value of n to suit particular situations. For example, if your company only sells 25 products or services, then you may wish to alter the value for a link to your product master file to say 30 (allowing for a little growth in your product range). By doing this, you can be sure that OPEN/DB will only ever read each product master file record once for an entire report run.

You should not increase the value of n to excessive levels. There will come a point, probably somewhere around 200-300, when it takes longer for OPEN/DB to search the retained records than it would have taken to perform the I/O, and the benefit will have been lost.

 

Top of page
13. In what order should I define my files on the retrieval map?

Obviously the "based on" file must be defined first. Subsequently they may be defined in any order. However the retrieval map will be more meaningful if for a particular file you define the subsequent files with a one to one relationship before the one to many relationships.

 

Top of page
14. When deciding what the based-on file should be, normally a non-keyed physical file is specified. We have had the situation where a customer has decided to use a keyed logical file instead, are there any performance implications with this?

Choosing a based-on-file that has a keyed access path will not provide any benefit in performance. Regardless of the file definition, it will be defined within OPEN/DB report programs as non-keyed. The output sequence of OPEN/DB reports is achieved by the product's internal processing and is not the result of the sequence in which the data base is read.

 

Top of page
15. How many subsets do I need?

This will depend on the number and complexity of your applications, but here are a few guidelines:

  • Because a given Subset is only one way of retrieving data, it is not usually possible, in other than the smallest application, to have a single Subset that will cater for all possible reporting requirements. For example, a Subset based on a Customer Master file may access a Product Master file by linking via an Order Detail file. However, this Subset design would not be well suited to the task of producing a complete list of products. 
  • Most applications of moderate complexity will require at least two or three Subsets based on the most significant master files. 
  • In very complex applications, the size of the Subsets (the number of file links) may be the limiting factor. A Subset cannot contain more than 99 file links but a Subset this large is difficult to manage and even more difficult to use. Subset design should be approached with a view to your organization's practical reporting requirements. 
  • You should aim to include as much related information in the Subset as is practical and necessary to meet your reporting needs. However, there is no merit in idea of linking extra files into your Subset simply because you can. Often, a second Subset, which approaches the data base from a different starting point, will provide better overall access to information.

 

www.jacana.com (Home)Top of page footspace.gif (46 bytes) Copyright Momentum Utilities Pty Ltd - July 2016
Legal Information - Privacy - Contact Us