Policy scorings of political actors are crucial for operationalising rational choice models and other major theories in political science. Producing them more cheaply and quickly by computer would advance the discipline. But computerised scorings can hardly be substituted for hand-coding, or even used in new fields, unless we are sure of their validity and reliability. We check both by comparing the mappings produced by word frequency methods with the policy series available from the work of the Manifesto Research Group/Comparative Manifesto Project (MRG/CMP). Using an aggregate calibrating/reference 'document set' for the time period in question evades the reliability problems of pairwise comparisons and provides an authoritative text against which individual party platforms can be scored and mapped over long time periods. Comparisons of the techniques for two countries (US and UK) are not encouraging. Wordscores in their current operationalisation flatten out party movement, just as previous computerised approaches have done. Sensitivity testing with British party manifestos 1979-1997, using an expert scoring, does not reveal any improvement in performance. The reliability problems which arise with policy series are also likely to recur with cross-sectional applications of the word frequency approach. © 2006 Elsevier Ltd. All rights reserved.