[MacPerl] Memory requirements for data structures



Greetings!

Has anyone noticed that data structures seem to take up an awful lot of 
memory in (Mac)Perl?

Consider the following:

If you use a hash to set the value of 1024 8-byte keys to 1, then the 
'visible payload' is 9 bytes per entry (or 9 KB for the whole thing).  
Real-world tests suggest that the memory actually consumed is one (or 
_two_) orders of magnitude higher than that!

While I appreciate that there is likely to be a 32-bit pointer (for each 
entry) in there somewhere, that would only account for 44% of a 
1,000..10,000% overhead.  What's taking up the rest?
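
For anyone who wants to check the arithmetic behind those figures, here is a minimal sketch.  The variable names are mine, and the 4-byte pointer is just the assumption made above, not a statement about how Perl really lays out a hash entry.

```perl
use strict;
use warnings;

my $entries       = 1024;   # number of hash entries in the example
my $key_bytes     = 8;      # each key is 8 bytes long
my $value_bytes   = 1;      # the value "1" counted as a single byte
my $pointer_bytes = 4;      # assumed 32-bit pointer per entry

# Visible payload: key plus value, for every entry.
my $per_entry = $key_bytes + $value_bytes;
my $payload   = $entries * $per_entry;

# The pointer expressed as a percentage of the per-entry payload.
my $pointer_pct = 100 * $pointer_bytes / $per_entry;

printf "payload: %d bytes (%d KB)\n", $payload, $payload / 1024;
printf "pointer: %.0f%% of payload\n", $pointer_pct;
```

This only restates the numbers in the text (9216 bytes, and a pointer worth about 44% of the payload); it measures nothing about real memory use.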

In the struggle for memory efficiency, one often turns to the array.  If 
one's keys happen to be integers, then $exists{$id} = 1; can be just as 
easily accomplished using $exists[$id] = 1; and, because the numbers are 
actually _indices_ they shouldn't need to be stored, and hence the 
'visible payload' should be reduced to a single byte per entry.

If we continue with the above example, one could hope for a best-case 
scenario of an 89% decrease in the size of the data structure.  But that 
doesn't happen (not even with small keys).
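
To make the comparison concrete, here is a rough sketch of the two forms side by side.  The 1024 sequential IDs and the lookup of ID 512 are assumptions chosen purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical test data: 1024 integer IDs, 0..1023.
my @ids = (0 .. 1023);

# Hash form: each ID is stored as a key alongside its value.
my %exists;
$exists{$_} = 1 for @ids;

# Array form: the ID is the index, so only the value is stored.
my @exists;
$exists[$_] = 1 for @ids;

# Both structures answer the same membership question.
my $id = 512;
print "hash:  ", ($exists{$id} ? "yes" : "no"), "\n";
print "array: ", ($exists[$id] ? "yes" : "no"), "\n";
```

Functionally the two are interchangeable for this purpose; the sketch shows only the idiom, not the memory each one ends up consuming.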

In extreme cases, one can even use vectors to store data, with 
vec($exists,$id,1) = 1; accomplishing much the same thing.  Bit-level 
manipulation of a _single_ string would seem to have 88% less payload 
than even a byte (character) array.  In real-world tests, the difference 
in memory use is hardly perceptible.
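
A minimal sketch of the bit-vector form, again assuming 1024 sequential IDs (my choice, for illustration only):

```perl
use strict;
use warnings;

# One string holds every flag, one bit per ID.
my $exists = '';
vec($exists, $_, 1) = 1 for (0 .. 1023);    # set the bit for each ID

# Read a single bit back to test membership.
my $id = 512;
print "present\n" if vec($exists, $id, 1);

# 1024 one-bit flags pack into 1024 / 8 = 128 bytes of string data.
print length($exists), " bytes of payload\n";
```

The length reported here is only the string's own data, which is the 'visible payload' I am talking about, not whatever Perl allocates around it.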

So, my question to those of you interested in this sort of thing, is 
"Why?"

By being 'clever' with how you store certain types of data, it is easily 
possible to achieve a 98.6% decrease (one bit in place of nine bytes) in 
the amount of information that needs to be manipulated.  The trickle-down 
benefits of such approaches (lower storage requirements, faster 
processing, etc.) should be obvious, and do materialise, but the one 
thing that seems to go against the grain is the memory requirement.

Is this a MacPerl thing or just a Perl thing in general?  Are there any 
references/articles out there which discuss the overheads of Perl data 
structures, and memory optimisation strategies?  Are there any tools out 
there which allow memory usage to be accurately monitored?  Is this the 
price we pay for automated garbage collection?  Is it crippling 
object-oriented approaches to problem-solving?  Will Perl 6 address the 
issue?  Is it even an issue that needs to be addressed?  Do all languages 
suffer from similar 'overheads'?

Ladies and gentlemen, the chair is yours...

Henry.
