On Landing and Internal Web Pages: The Strange Case of Jekyll and Hyde in Web Performance Measurement

Waqar Aqeel, Balakrishnan Chandrasekaran, Anja Feldmann, Bruce M. Maggs

Research output: Chapter in Book / Report / Conference proceeding, Conference contribution, Academic, peer-review


Abstract

There is a rich body of literature on measuring and optimizing nearly every aspect of the web, including characterizing the structure and content of web pages, devising new techniques to load pages quickly, and evaluating such techniques. Virtually all of this prior work used a single page, namely the landing page (i.e., root document, "/"), of each web site as the representative of all pages on that site. In this paper, we characterize the differences between landing and internal (i.e., non-root) pages of 1000 web sites to demonstrate that the structure and content of internal pages differ substantially from those of landing pages, as well as from one another. We review more than a hundred studies published at top-tier networking conferences between 2015 and 2019, and highlight how, in light of these differences, the insights and claims of nearly two-thirds of the relevant studies would need to be revised for them to apply to internal pages. Going forward, we urge the networking community to include internal pages for measuring and optimizing the web. This recommendation, however, poses a non-trivial challenge: How do we select a set of representative internal web pages from a web site? To address the challenge, we have developed Hispar, a "top list" of 100,000 pages updated weekly comprising both the landing pages and internal pages of around 2000 web sites. We make Hispar and the tools to recreate or customize it publicly available.
Original language: English
Title of host publication: IMC '20
Subtitle of host publication: Proceedings of the ACM Internet Measurement Conference
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery
Pages: 680–695
Number of pages: 16
ISBN (Electronic): 9781450381383
Publication status: Published - Oct 2020

Publication series

Name: Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC

Funding

Funder: National Science Foundation
Funder numbers: 1901047, 1763742

    Keywords

    • Web page performance
    • PLT
    • Speed Index
    • top lists
    • QoE
