
Re: [MacPerl-Modules] More about Tie::SubstrHash in MacPerl



>I do not have access to the module right now, but have a thought.  Does
>Tie::SubstrHash use DB_File?  DB_File is broken in some cases on MacPerl.
>I did a test case where I tried to write 100,000 keys, and I could retrieve
>only two.  It is a disk issue.  If I changed volumes, it worked fine.
>
>Anyway, look to see if DB_File is being used.  I can try to look more into
>it at a later date.  Not this weekend, though.
>
>Also, I think this kind of thing is inappropriate for perlbug until more is
>discovered ... it might not be a bug having anything to do with perl
>itself, or even MacPerl itself.

I don't think it uses DB_File, and I'm not tying the hash to a file, 
only to a more organized memory structure with fixed-length keys and 
values. Since I'm not using the disk at all, the disk volume 
shouldn't come into play.

The only "use" statement that I see in the SubstrHash.pm file is "use 
Carp;", which is only for error messages.

As to whether or not it's a bug: there is clearly a bug of some kind
in Tie::SubstrHash as implemented in MacPerl 5.20r4. I consider that
established, so I don't see why this would be in any way inappropriate
for perlbug.

This sample code fails:
#----------------
#! /usr/local/bin/perl5

use Tie::SubstrHash;

tie %test, "Tie::SubstrHash", 13, 86, 1;

$test{abcdefg000001} = ("abcde" x 17) . "1";

$j = scalar(keys %test);
print "hashsize = $j\n";	# should print:  hashsize = 1
print %test;			# should print:  abcdefg000001abcde...abcde1
#----------------

How can that be anything other than a bug?

Better yet:
#--------------
#! /usr/local/bin/perl5

use Tie::SubstrHash;

tie %test, "Tie::SubstrHash", 1, 5, 1;

$test{a} = "12345";

$j = scalar(keys %test);
print "hashsize = $j\n";
print %test, "\n";
print "Doing it by hand:\na => $test{a}\n";
print "Done.\n\n";
#--------------

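(With a key length of 1 and a table size of 1, it works for value
lengths 1 through 4 but fails for a value length of 5; that is, the
arguments 1,1,1 through 1,4,1 work and 1,5,1 fails.)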

I don't know whether or not it makes any difference in the operation
of SubstrHash as a whole, but the subroutine "findprime" has a couple
of bugs. First, and of relatively little worry, it never returns 2,
even though 2 is prime. Of greater worry, it computes its trial-divisor
limit $max only once, from the starting value, so it can return a
non-prime if you give it a number that is just less than the square of
a prime. For example, findprime(119) returns 121 = 11 * 11, because
$max is 10 and 11 is never tried as a divisor.
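
To make the second bug concrete, here is a minimal standalone sketch
that copies the loop from the stock findprime (quoted in full further
down in this message). The name findprime_orig and the added "my"
declarations are assumptions made for illustration only; the logic is
meant to be unchanged.

```perl
#! /usr/local/bin/perl5
use strict;
use warnings;

# Copy of the stock findprime logic, renamed findprime_orig (a
# hypothetical name) and with "my" added so it runs under "use strict".
sub findprime_orig {
    use integer;

    my $num = shift;
    $num++ unless $num % 2;

    my $max = int sqrt $num;    # computed once, from the starting value only

  NUM:
    for (;; $num += 2) {
        for (my $i = 3; $i <= $max; $i += 2) {
            next NUM unless $num % $i;    # divisible by $i: try next odd number
        }
        return $num;    # no odd divisor up to $max was found
    }
}

# 119 = 7 * 17 is rejected, but 121 = 11 * 11 slips through because
# $max stays at 10 and 11 is never tried.
print "findprime_orig(119) = ", findprime_orig(119), "\n";
```

Run as written, this should print 121 for the starting value 119.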

The subroutine findprime needs to be modified, something like
$max = (int sqrt $num) + 1;
# the parentheses are necessary because "+" has higher precedence than "sqrt"

because hash values may be lost if $tsize in TIEHASH is not actually prime.
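
The precedence point in the comment above can be checked in isolation.
This is only a small sketch; the variable names are hypothetical and
119 is just the sample value used elsewhere in this message.

```perl
use strict;
use warnings;

my $num = 119;    # sample value just below 11 ** 2

# Named unary operators such as "int" and "sqrt" bind less tightly than
# "+", so without parentheses the "+ 1" becomes part of sqrt's argument.
my $unparenthesized = int sqrt $num + 1;      # parsed as int(sqrt($num + 1))
my $parenthesized   = (int sqrt $num) + 1;    # parsed as int(sqrt($num)) + 1

print "int sqrt \$num + 1   = $unparenthesized\n";
print "(int sqrt \$num) + 1 = $parenthesized\n";
```

For 119 the first form gives 10 and the second gives 11, so only the
parenthesized form raises the divisor limit far enough to catch 121.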

For example, the following script stores 119 entries and then tries to
read them all back:
#----------------
#! /usr/local/bin/perl5

require Tie::SubstrHash;

print "Here we go!\n";
$hashsize = 119;		# arbitrary values from my data set
tie %test, "Tie::SubstrHash", 13, 86, $hashsize;

for ($i = 1; $i <= $hashsize; $i++) {
	$key1 = $i + 100_000;		# fix to uniform 6-digit numbers
	$key2 = "abcdefg$key1";
	$test{$key2} = ("abcdefgh" x 10) . "$key1";
}

print ("scalar keys = ", scalar(keys %test), "\n");

print %test, "\n";
print ((keys %test), "\nDoing it by hand:\n");
for ($i = 1; $i <= $hashsize; $i++) {
	$key1 = $i + 100_000;
	$key2 = "abcdefg$key1";
	print ("$key2 => ", ( (defined $test{$key2})
			? $test{$key2} : "undefined"), "\n");
}
print "\nDone.\n";
#---------------

Only two key/value pairs are returned by the normal methods (keys %test
and print %test), and even when I fetch each key by hand, 38 of the 119
stored values have been lost. That doesn't seem to be directly related
to the "findprime" problem, though, because with a table size of 119,
findprime is given the argument 119 * 1.1 = 130.9.

Still, something is badly amiss, and I say it's a bug.

I've been testing my sample code with the following modification to 
the Tie::SubstrHash module:

#----------------------
#sub findprime {
#    use integer;
#
#    my $num = shift;
#    $num++ unless $num % 2;
#
#    $max = int sqrt $num;
#
#  NUM:
#    for (;; $num += 2) {
#	for ($i = 3; $i <= $max; $i += 2) {
#	    next NUM unless $num % $i;
#	}
#	return $num;
#    }
#}

sub findprime {
    use integer;

    my $num = int shift;        # this is NOT assumed by "use integer"
    return 2 if ($num <= 2);    # special case: 2 is prime
    $num++ unless $num % 2;

    $max = int sqrt($num) + 1;  # in case $num is just less than (prime)**2

  NUM:
    for (;; $num += 2) {
        for ($i = 3; $i <= $max; $i += 2) {
            next NUM unless $num % $i;
        }
        return $num;
    }
}
#----------------------

So far, all of my sample code blocks work just fine with that modification.
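
As a rough sanity check of the modified findprime on its own, the
sketch below runs it over a small range of starting values and counts
any non-prime results. The subroutine is a copy of the one above with
"my" added for "use strict"; the helper is_prime and the range 0..2000
are assumptions chosen for illustration, not part of the module.

```perl
use strict;
use warnings;

# Copy of the modified findprime from this message, with "my" added.
sub findprime {
    use integer;

    my $num = int shift;
    return 2 if $num <= 2;
    $num++ unless $num % 2;

    my $max = int sqrt($num) + 1;

  NUM:
    for (;; $num += 2) {
        for (my $i = 3; $i <= $max; $i += 2) {
            next NUM unless $num % $i;
        }
        return $num;
    }
}

# Hypothetical helper: naive trial-division primality test.
sub is_prime {
    my $n = shift;
    return 0 if $n < 2;
    for (my $d = 2; $d * $d <= $n; $d++) {
        return 0 unless $n % $d;
    }
    return 1;
}

# Collect every starting value in the range whose result is not prime.
my @bad = grep { !is_prime(findprime($_)) } 0 .. 2000;

print "findprime(119)       = ", findprime(119), "\n";
print "findprime(119 * 1.1) = ", findprime(119 * 1.1), "\n";
print "non-prime results for 0..2000: ", scalar(@bad), "\n";
```

With these inputs the two calls should give 127 and 131, and the count
of non-prime results should be 0. Note that $max is still computed only
once, so this check covers only the range tested; moving the limit test
into the inner loop (for instance $i * $i <= $num) would be one way to
avoid relying on the starting value at all.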

As to whether this bug lies with Perl or MacPerl, I leave that to 
others to determine.
-- 
Linc Madison  *  San Francisco, CA  *  LincPerl@LincMad.com
NO SPAM: California Bus & Prof Code Section 17538.45 applies!
