
Re: [MacPerl] DBM files get big and slow



Berndt, try this. -fred

#!/usr/bin/perl -w
use DB_File;

$a = new DB_File::HASHINFO ;

# The elements of this structure are described below.
# Note: once a file exists, you must delete it to change its structure,
# so experiment on a scratch file first.
# bsize - defines the hash table bucket size (page size), and is,
#    by default, 256 bytes. It may be preferable to increase
#    the page size for disk-resident tables
#    and tables with large data items.
# ffactor - indicates a desired density within the hash table.
#    It is an approximation of the number of keys allowed to accumulate in any
#    one bucket, determining when the hash table grows or shrinks.
#    The default value is 8.
#    Table is automatically expanded if average number of keys per
#    bucket exceeds the fill factor.
#
# nelem - is an estimate of the final size of the hash table.
#   If not set or set too low, hash tables will expand gracefully as keys
#   are entered, although a slight performance degradation may be noticed.
#   The default value is 1.
#   Defines the initial size of the file to guarantee you disk space.
#
# cachesize - A suggested maximum size, in bytes, of the memory cache.
#   This value is only advisory, and the access method will allocate more
#   memory rather than fail.
#
# hash - is a user defined hash function.
#   Since no hash function performs equally well on all possible data, the
#   user may find that the built-in hash function does poorly on a particular
#   data set.
#   User specified hash functions must take two arguments (a pointer to a byte
#   string and a length) and return a 32-bit quantity to be used as the hash
#   value.
#
# lorder -
#   The byte order for integers in the stored database metadata.
#   The number should represent the order as an integer; for example,
#   big endian order would be the number 4321.
#   If lorder is 0 (no order is specified) the current host order is used.
#
#   If the file already exists (and the O_TRUNC flag is not specified), the
#   values specified for the parameters bsize, ffactor, lorder and nelem are
#   ignored and the values specified when the file was created are used.
#
#   If a hash function is specified, hash_open
#   will attempt to determine if the hash function specified is the same as
#   the one with which the database was created, and will fail if it is not.

$a->{'bsize'} = 0x0200;  # store with 512-byte page size
                         # because I have itsy-bitsy tags and values
$a->{'ffactor'} = 16;    # allow pages to fill up with 16 tags on average
$a->{'nelem'} = 10000;   # hog some disk space for the future; that's a lot of
                         # 512-byte pages, and they'll all be mostly empty

print "bucketsize= $a->{'bsize'}\n";
print "fillfactor= $a->{'ffactor'}\n";
print "initial number of elements= $a->{'nelem'}\n";
print "bytes of cache= $a->{'cachesize'}\n";
print "user supplied hash function= $a->{'hash'}\n";
print "lorder= $a->{'lorder'}\n";

tie (%hash, "DB_File", "dummy_db", O_RDWR|O_CREAT, 0644, $a)
  or die "Can't tie \"dummy_db\": $!\n";
for($i = 101; $i < 201; $i++)
{
  $hash{$i} = "hello" . ($i + 1000);
  print "$hash{$i}:$i\n";
}
untie(%hash);
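
To get a feel for how much space a given combination of bsize, ffactor and nelem might reserve, here is a rough back-of-the-envelope sketch. It is only an estimate under two assumptions that are mine, not from the man page text above: that the library rounds the initial bucket count up to a power of two, and that each bucket occupies one page of bsize bytes. The function name estimate_initial_bytes is hypothetical, and the real file also carries header and bitmap pages that are not counted here.

```perl
#!/usr/bin/perl -w
use strict;

# Rough estimate of the initial bucket count and the bytes they would occupy.
# Assumptions (not confirmed by the documentation quoted above):
#   - the bucket count is nelem/ffactor, rounded up to a power of two
#   - every bucket takes exactly one page of $bsize bytes
sub estimate_initial_bytes {
    my ($bsize, $ffactor, $nelem) = @_;

    # Number of buckets needed to hold $nelem keys at $ffactor keys per bucket.
    my $needed = int(($nelem + $ffactor - 1) / $ffactor);
    $needed = 2 if $needed < 2;

    # Round up to the next power of two.
    my $buckets = 1;
    $buckets *= 2 while $buckets < $needed;

    return ($buckets, $buckets * $bsize);
}

# The settings used in the script above: 512-byte pages, 16 keys per page,
# room for 10000 keys.
my ($buckets, $bytes) = estimate_initial_bytes(512, 16, 10000);
print "buckets=$buckets bytes=$bytes\n";   # prints "buckets=1024 bytes=524288"
```

Under those assumptions the settings above come to roughly half a megabyte up front, which is worth knowing before you raise nelem much further.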


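If you would rather measure than estimate, a small sketch like the following builds the same 100-key table with several nelem values and reports the resulting file size. The helper name db_size_for_nelem and the scratch file name nelem_test_db are made up for illustration. The file is deleted before each run because, as noted above, the parameters are ignored for an existing file. The sizes you see will depend on your Berkeley DB version and file system, so no expected output is shown.

```perl
#!/usr/bin/perl -w
use strict;
use Fcntl;
use DB_File;

# Build a small hash database with the given nelem hint and
# return the size of the resulting file in bytes.
sub db_size_for_nelem {
    my ($file, $nelem) = @_;

    unlink $file;   # start fresh; settings are ignored for an existing file

    my $info = DB_File::HASHINFO->new;
    $info->{'bsize'}   = 512;
    $info->{'ffactor'} = 16;
    $info->{'nelem'}   = $nelem;

    my %h;
    tie(%h, 'DB_File', $file, O_RDWR|O_CREAT, 0644, $info)
        or die "Can't tie \"$file\": $!\n";
    for my $i (101 .. 200) {
        $h{$i} = "hello" . ($i + 1000);
    }
    untie %h;

    my $size = -s $file;
    unlink $file;   # clean up the scratch file
    return $size;
}

for my $nelem (1, 1000, 10000) {
    printf "nelem=%-6d size=%d bytes\n",
        $nelem, db_size_for_nelem("nelem_test_db", $nelem);
}
```

Running it on both machines would show whether the 20.6 MB file comes from the table settings or from something the file system is doing.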
>Hi ,
>
>I tried the following script on my 4.5 GB hard disk (with HFS+)
>and it ran for about a minute on my G3-card-equipped Mac.
>The result was a dummy_db file of 20.6 MB (not bad for 100 hash entries).
>On an older machine with a 500 MB hard drive, it runs in a few seconds
>with a resulting dummy_db of about 30 KB. Is it something related to sector
>size, or is HFS+ not recommended for DBM routines? Any idea what happened?
>
>#!/usr/bin/perl
>use DB_File;
>
>tie (%hash, "DB_File", "dummy_db", O_RDWR|O_CREAT, 0644, $DB_HASH)
>  or die "Can't tie \"dummy_db\"\n";
>for($i = 101; $i < 201; $i++)
>{
>  $hash{$i} = $i + 1000;
>  print "$i\n";
>}
>untie(%hash);
>
>
>
>
>Thanks
>Berndt Wischnewski
>
>
>
>===== Want to unsubscribe from this list?
>===== Send mail with body "unsubscribe" to macperl-request@macperl.org


--
Fred Giorgi <mailto:fgiorg@atl.com>



