Imagine there is a population/database/dictionary and we would like to distinguish its elements.

So for each element, let us somehow encode its individual features (e.g. using a hash function) as a bit sequence - the most dense way is to use sequence of uncorrelated P(0)=P(1)=1/2 bits.

We can now create minimal prefix tree required to distinguish these sequences, like in the figure below.

For such ensemble of random trees of given number of leaves (n), we can calculate Shannon entropy (H_n) - the amount of information it contains.

It turns out that it asymptotically grows with at average 2.77544 bits per element (1/2+(1+\gamma)lg(e)).

The calculations can be found here:http://arxiv.org/abs/1206.4555

Is it the minimal cost of distinguishability/individuality?

How to understand it?

ps. related question: can anyone find D(n)=\sum_{i=0}^{\infty} 1-(1-2^{-i})^n ?

The first thought about this distinctiveness is probably n! combinatorial term while increasing the size of the system, but n! is about the order and its logarithm grows faster than linearly. Distinctiveness is some much more subtle.

It is well seen in equation (10) from the paper (which should like):

H_n+lg(n!)=nD_n

where D_n is the average depth of leaves of such n leaf tree - average amount of bits distinguishing given element.

So this equation says:

distinctiveness/individualism + order = total amount of information

distinctiveness grows linearly with n (2.77544 asymptotic linear coefficient) - kind of chemical potential...

information about their ordering grows faster: lg(n!)\approx n lg(n)-n lg(e).