Incremental crawl of the Portuguese web, performed between 23 September 2014 and 24 October 2014, mainly from the .PT domain. The AWP16 crawl is incremental because it was performed using DeDuplicator (http://landsbokasafn.github.io/DeDuplicator/), taking the content of AWP15 as the baseline. Thus, files that remained unchanged since the AWP15 complete crawl were not archived again (duplicated) in the AWP16 incremental crawl.
All the items of the AWP16 incremental crawl are identified by the custom field pwacrawlid:AWP16. The previous complete crawl, performed without DeDuplicator, has the pwacrawlid:AWP15. Files in AWP16 that were duplicates of files in AWP15 were not archived. To see the complete content of AWP16 (e.g. pages containing duplicate images from AWP15), it must be combined with AWP15.
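
As a minimal sketch of what "combining" the two crawls means in practice, the lookup below checks the incremental crawl first and falls back to the baseline when a resource was skipped as a duplicate. The function name, the dictionary-based indexes, and the example URLs are hypothetical and only for illustration; they are not part of the archive's tooling or of DeDuplicator.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory indexes: URL -> archived payload.
# In a real setting these would be built from the files of each crawl.
Index = Dict[str, bytes]


def resolve_capture(
    url: str, incremental: Index, baseline: Index
) -> Optional[Tuple[str, bytes]]:
    """Return (crawl_id, payload) for a URL, preferring the incremental crawl.

    A URL missing from the incremental crawl (AWP16) may have been skipped
    as an unchanged duplicate, so the baseline crawl (AWP15) is consulted next.
    """
    if url in incremental:
        return ("AWP16", incremental[url])
    if url in baseline:
        return ("AWP15", baseline[url])
    return None


# Illustrative data: the page changed and was re-archived in AWP16,
# while the unchanged image exists only in AWP15.
awp15 = {
    "http://example.pt/": b"<html>old page</html>",
    "http://example.pt/logo.png": b"PNG-bytes",
}
awp16 = {
    "http://example.pt/": b"<html>new page</html>",
}

page = resolve_capture("http://example.pt/", awp16, awp15)
image = resolve_capture("http://example.pt/logo.png", awp16, awp15)
```

With only the AWP16 index, the image lookup would fail even though the page that embeds it was archived, which is why both crawls are needed to replay AWP16 pages completely.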