Popular web sites are expected to handle huge number of requests concurrently within a reasonable time frame. The performance of these web sites is largely dependent on effective thread management of their web servers. Although the implementation of static and dynamic thread policies is common practice, remarkably little is known about the implications on performance. Moreover, the commonly used policies do not take into account the complex interaction between the threads that compete for access to a shared resource. We propose new dynamic thread-assignment policies that minimize the average response time of web servers. The web server is modeled as a two-layered tandem of multi-threading queues, where the active threads compete for access to a common resource. This type of two-layered queueing model, which occurs naturally in the performance modeling of systems with intensive software-hardware interaction, are on the one hand appealing from an application point of view, but on the other hand are challenging from a methodological point of view. Our results show that the optimal dynamic thread-assignment policies yield strong reductions in the response times. Validation on an Apache web server shows that our dynamic thread policies confirm our analytical results. © 2008 Elsevier B.V. All rights reserved.