# # ------------------------------------------------------------------- # Database read/write package. # by Mark Manning # # This program is freeware. Enjoy! :-) # # This is a generic database manipulation set of routines which # use the SYSOPEN, SYSREAD, and SYSWRITE commands to get and # put records to a database. # # It is a very simple database program, isn't very smart, but it # works fairly quickly and is why I use it a lot. # # How it works is that there is an index list kept in memory. This # list's name varies depending upon the usage. For example, these # routines are copied to, say, dbfUSER.pl when I am going to use # them to handle a user database. Each of the routines are re-named # to be openUSER, closeUSER, etc.... If I were going to use this # as the routines to handle a city database I'd rename the file # dbfCITY and everything would be openCITY, closeCITY, readCITY, # etc.... Get it? :-) # # ------------------------------------------------------------------- # # Ok. Once the index list is in memory, you are ready to use the # database routines. I always make the index list contain the key # to the database itself as well as one or two other items which # will be contained within the database which I might also want to # use as a key. So, for a user database you might want your index # list to contain the user's name and their password. Or maybe # the user's name and their age. Whatever. # # The index list is always packed via the PACK command to store it # into the index file. The UNPACK command (of course) has to then # be used to extract the various parts to the index. So, for example, # if I was (again) using the user's database, my PACK command might # look like this: # # $userIndex[$i] = # pack( "a40a20LL", "username", "password", # startingLocation, recordLength ); # # Where the "username" is the user's name. The "password" is their # password. The "startingLocation" is where in the database we # would have to go to find the start of this record. And the # "recordLength" is the length of the record we wrote (or are going # to write) out. # # Since the index file is a fixed length record database itself, it # can be written to and/or read from easily. Also, since we mainly # just suck everything into memory - this file is opened only for # a very brief amount of time. # # ------------------------------------------------------------------- # # So now we turn our attention to the database itself. Unlike the # index file which is fixed length records, the database is a variable # length record database. This means a record can be any length we # want it to be. So now we run into some problems. Mainly, what # if we have a record which is X length and then it grows to be Y # length. Can we write it back out to the same location again? # No. And what if we have a record which is X length, it shrinks # to size Y and then grows to size Z? Can we easily keep track of # all of that? Again - no. So what do we do? We say we don't care # what size the records are and always just stick the new record # onto the end of the database! That's what we do! Sorry! My # immediate gratification syndrome has kicked in! :-) # # Anyway, if we are always just sticking the records onto the end of # the database you can't go wrong. You don't have to check for # whether or not the record will fit because as long as you have # disk space - it will fit. # # Ok - but now you are moaning "But the database will get too big # with all of these older, deleted records." That's true. That's # why I've written a compaction routine. So when the database # reaches about two times it's size you just call the compaction # routine to squash it back down. And of course - you will have # to modify the routine so it works on your system just as I have # had to do with mine. :-) # # But those things aside - The way in which the database works # is that we do not use the PACK and UNPACK commands. Instead, # we use JOIN and SPLIT to put together and break apart the # records. Why. Ok, the reason is that this is a variable length # database and so each record will grow and contract as needed. # If we used PACK and UNPACK we'd have to figure out the alphanumeric # string to use each time we packed and unpacked everything and # that would take longer than just joining and splitting everything. # # You also gain an extra benefit from using JOIN and SPLIT. This # is that you can have two dimensional arrays. It is done like # this: # # If I have an array MYARRAY and I place various bits of information # into the array on each line of the array. Like so: # # $MYARRAY[0] = join( "\003", 1,2,3,4,5,6,7,8 ); # # (Ok so it's not the most original data in the world - who cares?) # Then I can further join the lines together to store them into # the database by doing the following: # # $theLine = join( "\002", @MYARRAY ); # # So now you have a two dimensional array all stored within the # variable $theLine. This is also where the variable part really # comes in handy. By using an array, you can expand and # contract the array as needed and your records will also expand # and contract along with it. # # ------------------------------------------------------------------- # # So there you have it. A simple database program which uses # variable length records. It always stores the new records at the # end of the database, thus insuring that we always have enough room # to store a record. The process uses two databases: an index file, # and the actual database. The index file is kept in memory and should # always be kept small. # # ------------------------------------------------------------------- # # Ok! Did I hear someone say: "But how do you prevent two users from # accessing the database at once?" Obviously you haven't # read the code. Because of course(!) I provided a way to prevent two # users from hammering away at the database at the same time! :-) # # When the openXXXX is called, it looks for a lock file with the same # name as the database and index file. So on a user's database, the # file would be named: user.lock # # This file is checked to see if it is around. If it is, then someone # else is using the file. If not, then it is created. All it contains # is the userid of the person who is presently using the database. # (Gotten by basically doing a whoami in another program so since it's # not that hard to do it's not included in the source code. Sorry. :-/ ) # # The program sets up a counter and begins checking the name of the # person who is using the database. If the name changes (like someone # slips in before the user's program can get the file) then the counter # is reset and waiting continues. But if the person's name in the lock # file doesn't change for 10 seconds, then the program chunks them in # favor of itself. No program should keep the database open for more # than 10 seconds - except the compaction routine and it should always # be run in standalone mode anyway. # # ------------------------------------------------------------------- # # In order to post this set of routines I have generalized the # contents. But you should be able to modify and/or use the routines # as you see fit. Have fun. # # All places where you should substitute a word have been converted # to "user". Thus, you would change "user" to whatever database you # are attempting to create. Like let's say you wanted to create a # database to keep track of dogs. Then you would change all occurrances # of "user" to "dog", all "USER"'s to "DOG"'s, etc.... # # Q&A can be answered via e-mail at markem@flash.net. Please do not # attempt to contact me at the NASA e-mail address because we are # behind a firewall and you might not be able to get me at that address. # Thanks. :-) # # ------------------------------------------------------------------- # use FileHandle; # # ------------------------------------------------------------------- # Routine to open the database. # # As you will see, there is always a current directory ($curDir) # being passed to the routines. The newer routines do not # employ this. Instead, there is now a generic routine called # "pp" for Proper Path which figures out the correct path # for the files. The manner presented though provides a more # generic method of dealing with multiple databases. Other than # just changing the "/" to ":" in the file's pathname, the routines # work fine with MacPerl et al. # ------------------------------------------------------------------- # sub openUSER { local( $curDir ) = @_; local( $theFlag, $whoLocked, $theName, $i, $numRecords ); local( @fileStats, @indexInfo ); # # Check the lock file. Is it there already? If so, we # have to wait until the other person is through. # $theFlag = 0; $whoLocked = ""; while( -e "$curDir/user.lock" ){ $theFlag++; if( $theFlag > 10 ){ open( LOCK, "$curDir/user.lock" ) || die "Couldn't open the lock file $curDir/user.lock - reading.\n"; $theName = <LOCK>; close( LOCK ); if( $theName eq $whoLocked ){ unlink "$curDir/user.lock"; } else { $theFlag = 0; $whoLocked = $theName; } } sleep( 1 ); } open( LOCK, ">$curDir/user.lock" ) || die "Couldn't open the lock file $curDir/user.lock - writing\n"; print LOCK "$userID\n"; close( LOCK ); sysopen( INDEX, "$curDir/user.index", O_RDWR ) || die "Couldn't open the index file $curDir/user.index - reading\n"; @userIndex = (); while( sysread(INDEX, $userIndex[++$#userIndex], 48) ){ } close( INDEX ); $#userIndex--; return; } # # ------------------------------------------------------------------- # Routine to close the database. # ------------------------------------------------------------------- # sub closeUSER { local( $curDir ) = @_; unlink "$curDir/user.lock"; return; } # # ------------------------------------------------------------------- # Routine to read the database. # ------------------------------------------------------------------- # sub readUSER { local( $curDir, $userNumber ) = @_; local( $i, $curLine, $newLine, $oldLine ); local( @theLine, @indexInfo, @newList ); # # See if the user number is within range. # if( $userNumber <= $#userIndex ){ @indexInfo = unpack( "a40LL", $userIndex[$userNumber] ); $indexInfo[0] =~ s/\000+//; } else { $theReply = <<HTML_CODE; Content-type: text/html <html> <head> <title> User Database Error Screen </title> </head> <body text=#CCCC00 link=#00DD00 vlink=#CCCC00 bgcolor=#220000> <font size=+2>Could not find the given record...please try again.</font> </body> </html> HTML_CODE print $theReply; exit( 0 ); } # # Using the information from the user index, we index into the # database and load the proper record. # sysopen( USER, "$curDir/user.dbf", O_RDWR ) || die "Couldn't open the database file $curDir/user.dbf - aborting.\n"; seek( USER, $indexInfo[1], 0 ); sysread( USER, $newLine, $indexInfo[2] ); close( USER ); # # ------------------------------------------------------------------- # # Ok, we have a record from the database. Let's split it up. # # In order to split the incoming information up, we have to create # a new array. This is done with the following command: # # ------------------------------------------------------------------- # @theLine = split( /\001/, $newLine ); # # ------------------------------------------------------------------- # # Now that the line is split up, we want to split it up even more. # # The following is just an example of what might be in the file and # how to extract it. # # Notice how we either just move the separated out lines of information # from the "$theLine" array or we split the two dimensional array # up via the SPLIT command. # # ------------------------------------------------------------------- # $curLine = 0; $userName = $theLine[$curLine]; $curLine++; @userStats = split( /\002/, $theLine[$curLine] ); $curLine++; # # ------------------------------------------------------------------- # # Next be sure to go through and strip out uneccesary characters # which might be hanging around. # # It is important to remember that even though the lines contained # within the file have already been stripped of unusable characters # when it is stored in the database, it never hurts to double check. # You might hit a bunch of null characters which stops your program # dead otherwise. :-/ # # ------------------------------------------------------------------- # for( $i=0; $i<=$#userStats; $i++ ){ $userStats[$i] = &stripUSER( $userStats[$i] ); } # # ------------------------------------------------------------------- # # Ok! We've gotten our records - let's return. Record information # is globally defined. So it is always available! The reason for # this is because otherwise the call line would be really rather long. # :-) # # ------------------------------------------------------------------- # return; } # # ------------------------------------------------------------------- # Routine to write to the database. # # You should always put your record layout somewhere. I always # put it in the write routine so I don't forget it. In this case # it would look like this: # # User Database # ------------- # The User's Name ( $userName ) # The User's Stats # Program Name # Program Version # Privileges (a=admin, p=programmer, u=user) # # ------------------------------------------------------------------- # sub writeUSER { local( $curDir, $userNumber ) = @_; local( $newLine, $recLength, $i, $oldLine ); local( @theLine, @fileStats, @indexInfo, @newList ); # # ------------------------------------------------------------------- # # Ok, this is outgoing information. First strip off any strange # characters it might have before sending it to the database. # # ------------------------------------------------------------------- # for( $i=0; $i<=$#userStat; $i++ ){ $userStat[$i] = &stripUSER( $userStat[$i] ); } # # ------------------------------------------------------------------- # # Now put everything into the array @theLine. # # (Notice the totally original names I'm using!) # (Actually, I'm trying to be consistent so this is as easy to # work with as possible.) # # Ok - time out here! # # The records are joined in the following manner: # # Single entries, such as the user's name below, are kept in their # own variables. Arrays have each field joined together with \003 # between each entry. The entire array is joined together via # a join command with \002 separating each record. # # Finally, each section of information contained within each # entry in the database is separated with \001. So all of this # would look like this if you did a hex dump on the database: # # Dump: /disk4/crdb/Scripts/user.dbf # # Offset: 0 1 2 3 4 5 6 7 8 9 a b c d e f 0123456789abcdef # # 00000000: 42 61 72 72 79 02 52 6f 67 65 72 73 02 65 72 36 | Berry.Rogets.er6 | # 00000010: 02 34 38 33 2d 38 31 30 32 02 4c 69 6e 43 6f 6d | .999-9999.Likcos | # 00000020: 02 72 6f 67 65 72 73 40 63 68 65 65 72 73 2e 6a | .peters@chairs.j | # 00000030: 73 63 2e 6e 61 73 61 2e 67 6f 76 02 50 61 73 73 | sc.nuss.com.Pass | # 00000040: 77 6f 72 64 01 63 68 65 65 72 73 2e 6a 73 63 2e | word.chairs.jsc. | # 00000050: 6e 61 73 61 2e 67 6f 76 02 6a 65 64 69 2e 6a 73 | nuss.com.jaho.js | # 00000060: 63 2e 6e 61 73 61 2e 67 6f 76 02 6e 6f 77 68 65 | c.nuss.com.nowhe | # 00000070: 72 65 2e 6a 73 63 2e 6e 61 73 61 2e 67 6f 76 02 | re.jsc.nuss.com. | # 00000080: 73 61 6d 2e 6a 73 63 2e 6e 61 73 61 2e 67 6f 76 | sum.jsc.nuss.com | # 00000090: 02 73 75 6e 2e 6a 73 63 2e 6e 61 73 61 2e 67 6f | .sut.jsc.nuss.co | # 000000a0: 76 01 48 69 67 68 02 4c 6f 77 02 4d 65 64 69 75 | m.High.Low.Mediu | # 000000b0: 6d 01 43 52 44 42 03 76 35 2e 30 61 03 75 02 45 | m.CRDB.v5.0a.u.E | # 000000c0: 6e 69 67 6d 61 03 76 33 2e 30 03 75 02 4d 41 47 | nigma.v3.0.u.MAG | # 000000d0: 49 4b 03 76 36 2e 30 35 03 61 02 4d 41 47 49 4b | IK.v6.05.a.MAGIK | # 000000e0: 03 76 36 2e 30 35 03 75 02 4d 41 47 49 4b 03 76 | .v6.05.u.MAGIK.v | # 000000f0: 36 2e 31 03 61 02 4d 41 47 49 4b 03 76 36 2e 31 | 6.1.a.MAGIK.v6.1 | # 00000100: 03 75 01 43 6c 6f 73 65 64 02 4f 6e 20 48 6f 6c | .u.Closed.On Hol | # # The above is from the USER database. Notice the 01's and 02's # scattered around in the hex dump. The 01's means the start of # a new field in the database and the 02's mean the start of a # new line of information in an array. The 03's mean the start of # a sub-field inside of the array's line of information. Or to put # this another way: # # Berry is the user's first name and it is separated from Rogets # by a 02. Thus, Rogets is another field of information inside of # the array. Or, the array would look like this if it wasn't stored # in the database: # # The array = 1)Berry # 2)Rogets # 3)er6 # 4)999-9999 # 5)Likcos # 6)peter@chairs.jsc.nuss.com # 7)Password # # So the array is seven entries long. Down below Berry's name you will # see a line which reads: CRDB.v5.0a.u. This is the start of a two # dimensional array. Each of the parts ("CRDB", "v5.0a", and "u") # are broken up with 03's which would mean that these are all separate # entries. When more information is wanted about a particular line in # the array, it is split apart via SPLIT( /\003/, $theArray[$i] );. # # So back to the program! Let's join together the information into # a single record. # # ------------------------------------------------------------------- # @theLine = (); $theLine[++$#theLine] = $userName $theLine[++$#theLine] = join( "\002", @userStat ); $newLine = join( "\001", @theLine ); # # ------------------------------------------------------------------- # # Now we find out where this information is to be placed. We do this # by first extracting the key(s) from our database. In this case,we # only have the user's name. Remember though that packing information # can lead to having trailing null characters or spaces. We use # NULLs in this database. So strip them out afterwards. # # ------------------------------------------------------------------- # @indexInfo = unpack( "a40LL", $userIndex[$userNumber] ); $indexInfo[0] =~ s/\000+//; # # ------------------------------------------------------------------- # # Find out where the end of the database is presently. # # ------------------------------------------------------------------- # @fileStats = stat( "$curDir/user.dbf" ); $recLength = length( $newLine ); # # ------------------------------------------------------------------- # # And write everything back out. The index file is updated first # and then the database file. # # ------------------------------------------------------------------- # $userIndex[$userNumber] = pack( "a40LL", $indexInfo[0], $fileStats[7], $recLength ); sysopen( INDEX, "$curDir/user.index", O_RDWR ) || die "Couldn't open the index file $curDir/user.index - writing\n"; seek( INDEX, $userNumber*48, 0 ); syswrite( INDEX, $userIndex[$userNumber], 48 ); close( INDEX ); sysopen( USER, "$curDir/user.dbf", O_RDWR ) || die "Couldn't open the database file $curDir/user.dbf - writing.\n"; seek( USER, 0, 2 ); syswrite( USER, $newLine, $recLength ); close( USER ); return; } # # ------------------------------------------------------------------- # Routine to find the userid's number (userNumber). # # This routine actually is very essential to the whole workings of # these routines. When you are going to work on an entry you # should always call this routine FIRST! This will return either # a minus one (-1) to show it could not find the record or the # actual record number of where in the database the information # resides. # # So why is this important? BECAUSE THE NUMBER OF THE RECORD IS # THE NUMBER USED IN READUSER AND WRITEUSER!!! That's why! :-) # ------------------------------------------------------------------- # sub findUSER { local( $findWho ) = @_; local( $i ); local( @indexInfo ); $findWho =~ s/\000+//; $findWho =~ y/A-Z/a-z/; for( $i=0; $i<=$#userIndex; $i++ ){ @indexInfo = unpack( "a40LL", $userIndex[$i] ); $indexInfo[0] = &stripUSER( $indexInfo[0] ); $indexInfo[0] =~ y/A-Z/a-z/; if( $indexInfo[0] eq $findWho ){ return( $i ); } } return( -1 ); } # # ------------------------------------------------------------------- # Routine to add a new user's record to the list. # # This routine is used to add in a new record. Many times this # routine just returns whatever the index array's last number is # plus one. With the user's database though I expanded the routine # to first check and make sure someone with that username hadn't # already been entered. If they had, it creates a unique name # to store into the file. :-) # ------------------------------------------------------------------- # sub newUSER { local( $curDir, $userName ) = @_; local( $theCount, $theName ); $theCount = 0; $theName = $userName; &openUSER( $curDir ); while( &findUSER($theName) > -1 ){ $theCount++; $theName = "$userName$theCount"; } $userIndex[++$#userIndex] = pack( "a40LL", $theName, 0, 0 ); &writeUSER( $curDir, $#userIndex ); &closeUSER( $curDir ); return( $theName ); } # # ------------------------------------------------------------------- # Routine to strip off extraneous characters. # # This is the routine used to remove funky characters in a record. # ------------------------------------------------------------------- # sub stripUSER { local( $theString ) = @_; $theString =~ s/^\s+//; $theString =~ s/\s+$//; $theString =~ s/\000+//; return( $theString ); } # # ------------------------------------------------------------------- # # Compact the database # # This is the routine to compact down the database. It is run only # about once every couple of months or until the database increases # to about twice what it's current size is. A 90,000 byte file # dropped to around 40,000 bytes when used. # # ------------------------------------------------------------------- # sub compactUSER { local( $curDir ) = @_; local( $i, $newLine, $recLength, $recNumber, $newIndex ); local( @indexInfo, @theLine, @fileStats, @newList ); # # Create the new database and index. # unlink "$curDir/new.index"; unlink "$curDir/new.dbf"; sysopen( INDEX, "$curDir/new.index", O_RDWR|O_CREAT, 0777 ) || die "Couldn't open the index file - writing\n"; close( INDEX ); sysopen( DBF, "$curDir/new.dbf", O_RDWR|O_CREAT, 0777 ) || die "Couldn't open the dbf file - writing\n"; close( DBF ); # # Get the old database's index information. # &openUSER( $curDir ); @newList = sort @userIndex; # # Remove duplicates. # $oldLine = ""; @userIndex = (); for( @newList ){ @theLine = unpack( "a40LL", $_ ); $theLine[0] = &stripUSER( $theLine[0] ); next if length( $theLine[0] ) < 4; next if $oldLine eq $theLine[0]; $oldLine = $theLine[0]; $userIndex[++$#userIndex] = $_; } # # Move over each record # $recNumber = -1; for( $i=0; $i<=$#userIndex; $i++ ){ @indexInfo = unpack( "a40LL", $userIndex[$i] ); $indexInfo[0] = &stripUSER( $indexInfo[0] ); print "Working on USER: $indexInfo[0]...please wait.\n"; if( length($indexInfo[0]) > 4 ){ print "Writing USER: $indexInfo[0] to the new file...please wait.\n"; &readUSER( $curDir, $i ); @theLine = (); $theLine[++$#theLine] = $userName; $theLine[++$#theLine] = join( "\002", @userStat ); $newLine = join( "\001", @theLine ); $recNumber++; @indexInfo = unpack( "a40LL", $userIndex[$i] ); @fileStats = stat( "$curDir/new.dbf" ); $recLength = length( $newLine ); $newIndex = pack( "a40LL", $indexInfo[0], $fileStats[7], $recLength ); sysopen( INDEX, "$curDir/new.index", O_RDWR ) || die "Couldn't open the index file - writing\n"; seek( INDEX, 0, 2 ); syswrite( INDEX, $newIndex, 48 ); close( INDEX ); sysopen( DBF, "$curDir/new.dbf", O_RDWR ) || die "Couldn't open the $curDir/new.dbf file - aborting.\n"; seek( DBF, 0, 2 ); syswrite( DBF, $newLine, $recLength ); close( DBF ); } } &closeUSER( $curDir ); system( "chmod 777 new.dbf" ); system( "chmod 777 new.index" ); system( "mv new.dbf user.dbf" ); system( "mv new.index user.index" ); } # # ------------------------------------------------------------------- # # Some final notes: # # The above is a complete database package in that it will read, # write, and find a record. It will not delete a record. I leave # that to you to do since it is very easy to do. Also, this is # not the latest and greatest work I've done with this but an earlier # implementation of this database scheme. However, the newer version # is 1)Not yet ready to be released, and 2)Probably will be retained # as a needed piece of software for NASA (and thus will probably not # be made publicly available). # # Be that as it may - the above software does work and should be # usable by you with very little modification. These changes being # mainly changing the word "user" to whatever word best describes # your need and by changing how many items are placed into the # database via the "$theLine" variable as given in the read and # write parts. # # Have fun. Again, Q&A's can be answered by me at markem@flash.net. # Later! # # ------------------------------------------------------------------- #