Motivated by the spread of on-line information in general and on-line
petitions in particular, recent research has raised the following
combinatorial estimation problem. There is a tree T that we cannot
observe directly (representing the structure along which the
information has spread), and certain nodes randomly decide to make
their copy of the information public. In the case of a petition, the
list of names on each public copy of the petition also reveals a path
leading back to the root of the tree. What can we conclude about the
properties of the tree we observe from these revealed paths, and can
we use the structure of the observed tree to estimate the size of the
full unobserved tree T?
Here we provide the first algorithm for this size estimation task,
together with provable guarantees on its performance. We also
establish structural properties of the observed tree, providing the
first rigorous explanation for some of the unusual structural
phenomena present in the spread of real chain-letter petitions on the
Internet.