
[MacPerl] Tie::SubstrHash fails for very large hashes



I checked the archives and didn't find anything about this issue.

I have been working recently on a project involving merging monthly 
update files into a large flat-file database. Both the updates and 
the original db are stored as fixed-length records in text files, so 
I read them into two large hashes and then reconcile the data.
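
For context, the load-and-reconcile step looks roughly like the sketch below. The record layout (one record per line, 13-character key, 86-character value), the sample data, and the names load_records and merge_records are all assumptions made for illustration, not the actual program. It uses lexical and in-memory filehandles, so it needs Perl 5.8 or later.

```perl
use strict;
use warnings;

# Assumed layout (not stated in the post): one record per line, a
# 13-character key followed by an 86-character value.
my ($KEY_LEN, $VAL_LEN) = (13, 86);

# Read fixed-length records from an open filehandle into a hash.
sub load_records {
    my ($fh) = @_;
    my %records;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length($line) == $KEY_LEN + $VAL_LEN;    # skip malformed lines
        my ($key, $value) = unpack "a$KEY_LEN a$VAL_LEN", $line;
        $records{$key} = $value;
    }
    return \%records;
}

# Reconcile: an update replaces the record with the same key, or adds a new one.
sub merge_records {
    my ($db, $updates) = @_;
    my %merged = (%$db, %$updates);
    return \%merged;
}

# Demo with in-memory "files" (hypothetical sample data).
my $db_text  = join '', map { sprintf("key%010d", $_) . ('x' x $VAL_LEN) . "\n" } 1 .. 3;
my $upd_text = join '', map { sprintf("key%010d", $_) . ('y' x $VAL_LEN) . "\n" } 2, 4;

open my $db_fh,  '<', \$db_text  or die "cannot open db: $!";
open my $upd_fh, '<', \$upd_text or die "cannot open updates: $!";

my $merged = merge_records(load_records($db_fh), load_records($upd_fh));
print scalar(keys %$merged), " records after merge\n";
```

With real data the two filehandles would be opened on the database and update files instead of on strings.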

I discovered, though, that fewer than 20% of the records were coming 
back out of the hashes.

The following snippet demonstrates the error:

#------------------------------
#! /usr/local/bin/perl5

require Tie::SubstrHash;

print "Here we go!\n";
$hashsize = 114_862;		# arbitrary values from my data set
tie %test, "Tie::SubstrHash", 13, 86, $hashsize;

for ($i = 1; $i <= $hashsize; $i++) {
	$key1 = $i + 100_000;		# fix to uniform 6-digit numbers
	$key2 = "abcdefg$key1";
	$test{$key2} = ("abcdefgh" x 10) . "$key1";
}

print scalar(keys %test), "\n";

print %test, "\n";
print $test{"abcdefg207250"}, "\n";
print((keys %test), "\n");	# outer parens so the "\n" is printed too
print "\nDone.\n";
#-----------------------------

If you remove the "require" and "tie" statements, everything works 
fine, provided you've given MacPerl a huge enough memory allocation. 
(The hash alone is over 10 MB, not counting any "bookkeeping 
overhead.") Of course, you'll get a spew of garbled output from the 
"print %test" and "print (keys %test)" commands.

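The "over 10 MB" figure is just the record count times the combined key and value length. A quick check, using only the numbers from the test script (the variable names are illustrative, and per-entry hash overhead is deliberately left out):

```perl
use strict;
use warnings;

my $records   = 114_862;    # $hashsize in the test script
my $key_len   = 13;
my $value_len = 86;

# Raw payload only: no per-entry hash overhead is included.
my $bytes = $records * ($key_len + $value_len);
printf "%d bytes (%.1f MB)\n", $bytes, $bytes / (1024 * 1024);
```
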
However, if you leave the code as is, on my system (PowerMac G4, 
single 500 MHz, MacOS 9.0.4, MacPerl 5.2.0r4), you get exactly ONE 
key/value pair back for the whole hash. It is specifically the one 
with key "abcdefg207251". I tried pulling out the previous element by 
name, and that worked fine, but both (%test) and (keys %test) come 
out with only a single line of data.
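
One way to put numbers on the discrepancy is to compare what iteration returns with what direct lookups can still find. The sketch below is a minimal version of that check, not a diagnosis. check_substr_hash is a hypothetical helper name, the record shapes are copied from the test script, and it tests values with defined() rather than exists() because I am not sure the module supports exists(). It is written for a current Perl, not MacPerl 5.2.0r4.

```perl
use strict;
use warnings;
use Tie::SubstrHash;

# Hypothetical helper: fill a tied hash with $count records shaped like
# the test data, then compare what iteration returns with what direct
# lookups can still find.
sub check_substr_hash {
    my ($count) = @_;
    tie my %h, 'Tie::SubstrHash', 13, 86, $count;

    my @expected;
    for my $i (1 .. $count) {
        my $num = $i + 100_000;                  # six-digit suffix
        my $key = "abcdefg$num";                 # 13 characters
        $h{$key} = ('abcdefgh' x 10) . $num;     # 86 characters
        push @expected, $key;
    }

    my @iterated  = keys %h;                              # walks FIRSTKEY/NEXTKEY
    my $by_lookup = grep { defined $h{$_} } @expected;    # uses FETCH only

    return (scalar @iterated, $by_lookup);
}

my $count = 1_000;    # raise to 114_862 to match the original report
my ($iterated, $by_lookup) = check_substr_hash($count);
print "stored:              $count\n";
print "found by iteration:  $iterated\n";
print "found by lookup:     $by_lookup\n";
```

If the lookup count matches the stored count while the iteration count falls short, the records are in the table and only the key traversal is losing them.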

In my actual program, there's another hash with a key length of 6 
instead of 13, but still a value length of 86; that one ends up with 
only 18% of its keys appearing in (keys %hash).

My program works fine without using the Tie::SubstrHash module, but 
it would still be nice to have it available for other programs.

-- 
Linc Madison  *  San Francisco, CA  *  LincPerl@LincMad.com
NO SPAM: California Bus & Prof Code Section 17538.45 applies!
