On Java Development

All things related to Java development, from the perspective of a caveman.

General Rules of Thumb for Creating Files


Introduction

This post presents a few general rules of thumb to keep in mind when creating a new file or table. There does not appear to be an industry standard for naming files and fields, which leaves the task of defining standards to the individual organization.

Naming Systems for DB2, MySQL, Oracle, MS-SQL

This section focuses on file and field names for the aforementioned databases.
 
File Names
Opinions on file naming conventions for these databases vary, and much of the debate centers on plural versus singular names. Most appear to fall on the side of the singular, where the file name reflects the entity stored in each record rather than the collection of records. Below are two sources supporting the singular naming convention for tables.

From the Illinois State Board of Education:

Tables are patterns for storing an entity as a record – they are analogous to Classes serving up class instances. And if for no other reason than readability, you avoid errors due to the pluralization of English nouns in the process of database development. For instance, activity becomes activities, ox becomes oxen, person becomes people or persons, alumnus becomes alumni, while data remains data.

From a Microsoft proponent:

The AdventureWorks (Microsoft) sample uses a very clear and consistent naming convention that uses schema names for the organization of database objects.

  • Singular names for tables
  • Singular names for columns
  • Schema name for tables prefix (E.g.: SchemaName.TableName)
  • Pascal casing

It is easy to argue that files should be named in the plural because they represent collections of records. However, the prevailing argument for the singular appears to win out: the table name should reflect the single entity represented by each record.

Examples of Table Names:

  • CUSTOMER
  • POLICY
  • INVENTORY
  • VENDOR
  • ADDRESS
  • ORDER_HEADER
  • ORDER_DETAIL
  • PHONE_NUMBER

Rule #1A. For DB2, MySQL, Oracle, MS-SQL: File names must be singular and upper case, with underscores separating the words.
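
The mechanical half of Rule #1A (upper case, underscore-separated words) is easy to check in code. Below is a minimal sketch in Java; the class name TableNameRule and the method isUpperSnakeCase are hypothetical and not part of any library. It makes no attempt to verify that a name is singular, since English plurals are too irregular for a simple pattern, so that part is still left to review.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not from any library): checks the mechanical part of Rule #1A. */
final class TableNameRule {

    // Upper-case words (letters and digits, starting with a letter) joined by single underscores.
    private static final Pattern UPPER_SNAKE =
            Pattern.compile("[A-Z][A-Z0-9]*(_[A-Z0-9]+)*");

    private TableNameRule() { }

    /** Returns true when the whole name is upper case with underscore-separated words. */
    static boolean isUpperSnakeCase(String name) {
        return name != null && UPPER_SNAKE.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"CUSTOMER", "ORDER_HEADER", "OrderHeader", "ORDER__DETAIL"};
        for (String name : candidates) {
            System.out.println(name + " -> " + isUpperSnakeCase(name));
        }
    }
}
```

A check like this could run against a list of proposed table names before any DDL is written, which is cheaper than renaming a table later.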
 

Naming Systems for DB2 on the iSeries

This section focuses on file and field names for the iSeries database.
 
File Names

When considering naming standards for files on the iSeries, keep in mind there are only 10 bytes to work with, and consider the limitations of closed naming systems. For example, there is a system designed for a Fortune 500 company by an internationally known consulting firm, now defunct. The consulting firm was charged with designing a Big System to be used by all of its client’s operating companies.

The naming system employed for the Big System’s files followed this logic. File names begin with BS for “Big System”. The next 2-byte designator was assigned to a specific area of responsibility. For example, the designator for the first Item-oriented physical file was “IA”, indicating “Item file ‘A’”, and it was assigned to the Item Master file. The next file would be assigned “IB” and so on. Field name prefixes were under the control of another incrementing system, and the fields of this file were prefixed by JF, so JF became the third designator of the file name. The last 2 positions of the file name acquired “PF” for physical file. The name of the file then became “BSIAJFPF”. Here’s a breakdown of the naming convention.

  • Bytes 1-2 : BS = Big System
  • Bytes 3-4 : IA = Item File A. 2nd file for Item becomes IB. 3rd is IC, etc.
  • Bytes 5-6 : JF = Field prefix. These were sequenced. By the time they got to the Item Master, it was “JF”.
  • Bytes 7-8 : PF = Physical File.
  • Bytes 7-10 : L000-L999 = Logical file designator.

The first logical for BSIAJFPF became BSIAJFL0 and could go out to BSIAJFL999, so there were no worries about running out of numbers for logical file names. The problem with this naming convention is that it did not take very long before 26 files were created for this area of the system. When that happened, the Item designation became JA after using IZ. With the introduction of the “J”, the file names started to become truly arbitrary. As more time passed, the file names became rather meaningless, which best describes what a closed system eventually becomes. Also, keep in mind that a couple of people were assigned the responsibility of administering this naming system so that it would be strictly followed and chaos would not occur. The irony? After all that time, effort and money spent to keep everything under control, chaos occurred anyway.
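
To see how quickly the meaning drains out of a closed scheme like this, here is a small sketch in Java of the two-letter designator being incremented. The class ClosedDesignator and its next method are hypothetical, written only to illustrate the rollover described above; they are not the consulting firm's actual tooling, which the story does not describe.

```java
/** Hypothetical illustration of the incrementing two-letter designator. */
final class ClosedDesignator {

    private ClosedDesignator() { }

    /** Returns the two-letter designator that follows the given one, e.g. IA -> IB. */
    static String next(String code) {
        if (code == null || !code.matches("[A-Z]{2}")) {
            throw new IllegalArgumentException("expected two upper-case letters: " + code);
        }
        char first = code.charAt(0);
        char second = code.charAt(1);
        if (second < 'Z') {
            return String.valueOf(new char[] {first, (char) (second + 1)});
        }
        if (first == 'Z') {
            throw new IllegalStateException("no designators left after ZZ");
        }
        // The second letter is exhausted, so the "meaningful" first letter has to change.
        return String.valueOf(new char[] {(char) (first + 1), 'A'});
    }

    public static void main(String[] args) {
        String code = "IA";
        for (int file = 1; file <= 27; file++) {
            System.out.println("Item file " + file + " uses designator " + code);
            code = next(code);
        }
    }
}
```

The first 26 Item files keep the “I”, and the 27th is forced onto JA, at which point the first letter no longer says anything about Item.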

Instead of pursuing such elaborate naming conventions, consider using a system that is as open as possible. For example, consider using a number oriented naming convention.

  • Bytes 1-2 : IN=Inventory, PR=Payroll, GL=General Ledger, AP=Accounts Payable, AR=Accounts Receivable, OE=Order Entry, etc.
  • Bytes 3-6 : 0000 – 9999
  • Bytes 7-8 : PF=Physical File
  • Bytes 7-10 : L=Logical File e.g. L000-L999

This recognizes that with 10 bytes, any naming system that tries to embed significance into the name will eventually fail, and the names will become meaningless. So why not begin with that understanding and limit the significance? The first 2 bytes are assigned to the area of responsibility, the next 4 are the numerical assignment, with the first file of that area being 0000, and bytes 7-8 are PF for a physical file. When the file is a logical file, byte 7 becomes “L” and bytes 8-10 become 000 to 999. This allows 10,000 physical files for any one area of responsibility (IN, AP, GL, etc.), each with up to 1,000 logical files, or 10 million logical file names per area, with very little worry about eventually rendering the convention meaningless.
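
The number-oriented convention above can be sketched in a few lines of Java. The class OpenFileName and its methods are hypothetical names chosen for this illustration; the layout (2-byte area, 4-digit sequence, then PF or L000-L999) comes straight from the list above.

```java
import java.util.Locale;

/** Hypothetical builder for the open, number-oriented file names described above. */
final class OpenFileName {

    static final int PHYSICAL_PER_AREA = 10_000;    // sequences 0000-9999
    static final int LOGICAL_PER_PHYSICAL = 1_000;  // designators L000-L999

    private OpenFileName() { }

    /** Builds an 8-byte physical file name, e.g. area "IN" and sequence 0. */
    static String physical(String area, int sequence) {
        check(area, sequence);
        return String.format(Locale.ROOT, "%s%04dPF", area, sequence);
    }

    /** Builds a 10-byte logical file name over the physical file with the same sequence. */
    static String logical(String area, int sequence, int logicalNumber) {
        check(area, sequence);
        if (logicalNumber < 0 || logicalNumber >= LOGICAL_PER_PHYSICAL) {
            throw new IllegalArgumentException("logical number must be 0-999: " + logicalNumber);
        }
        return String.format(Locale.ROOT, "%s%04dL%03d", area, sequence, logicalNumber);
    }

    private static void check(String area, int sequence) {
        if (area == null || !area.matches("[A-Z]{2}")) {
            throw new IllegalArgumentException("area must be two upper-case letters: " + area);
        }
        if (sequence < 0 || sequence >= PHYSICAL_PER_AREA) {
            throw new IllegalArgumentException("sequence must be 0-9999: " + sequence);
        }
    }

    public static void main(String[] args) {
        System.out.println(physical("IN", 0));
        System.out.println(logical("IN", 0, 0));
        System.out.println(logical("GL", 42, 7));
        System.out.println("Logical names per area: " + PHYSICAL_PER_AREA * LOGICAL_PER_PHYSICAL);
    }
}
```

Nothing in the name beyond the area and the PF/L marker carries meaning, which is the point: there is nothing to run out of and nothing to reinterpret later.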

The only conceivable criticism of this system is that it’s too cryptic. The solution to this perceived problem lies in the file’s description, which can be given when the file is created. The description can be as long as 50 bytes and is always displayed by the system’s Display File Description command. To my knowledge, this feature does not exist on the other systems, which enjoy 30-byte file names. It is as though IBM knew that file names limited to 10 bytes deserved a description feature to offset their cryptic quality.

Rule #1B. For files on the iSeries DB2, the rule is more of a guideline. File names are always in upper case. Keep in mind there are only 10 bytes. Do not attempt to build too much significance into the name itself. Keep it simple. Try to communicate the area of responsibility to which the file is assigned and indicate whether the file is physical or logical.

Written by admin

January 24th, 2014 at 11:09 am
