Modelling and searching web-based document collections by