Background: Insomnia disorder is the second most prevalent mental disorder and a primary risk factor for depression. Inconsistent clinical and biomarker findings in patients with insomnia disorder suggest that the disorder is heterogeneous and that its subtypes remain unrecognised. Subtypes previously proposed top-down in nosologies have had insufficient validity. In this large-scale study, we aimed to reveal robust subtypes of insomnia disorder by use of data-driven analyses of a multidimensional set of biologically based traits.
Methods: In this series of studies, we recruited participants from the Netherlands Sleep Registry, a database of volunteers aged 18 years or older, whom we followed up online to survey traits, sleep, life events, and health history with 34 selected questionnaires, of which participants completed at least one. We identified insomnia disorder subtypes by use of latent class analyses. We evaluated the value of the identified subtypes in a second, non-overlapping cohort, recruited through a newsletter that was emailed to a new sample of Netherlands Sleep Registry participants, and by assessing within-subject stability over several years of follow-up. We extensively tested the clinical validity of these subtypes for the development of sleep complaints, comorbidities (including depression), and response to benzodiazepines; in two subtypes of insomnia disorder, we also assessed clinical relevance by use of an electroencephalogram biomarker and the effectiveness of cognitive behavioural therapy. To facilitate implementation, we subsequently constructed a concise subtype questionnaire and validated it in the second, non-overlapping cohort.
Findings: 4322 Netherlands Sleep Registry participants completed at least one of the selected questionnaires, a demographic questionnaire, and an assessment of their Insomnia Severity Index (ISI) between March 2, 2010, and Oct 28, 2016. 2224 (51%) participants had probable insomnia disorder, defined as an ISI score of at least 10, and 2098 (49%) participants with a lower ISI score served as a control group. With a latent class analysis of the questionnaire responses of the 2224 participants with probable insomnia disorder, we identified five novel insomnia disorder subtypes: highly distressed, moderately distressed but reward sensitive (ie, with intact responses to pleasurable emotions), moderately distressed and reward insensitive, slightly distressed with high reactivity (to their environment and life events), and slightly distressed with low reactivity. In a second, non-overlapping replication sample of 251 new participants who were assessed between June 12, 2017, and Nov 26, 2017, a five-subtype solution was again found to be optimal. In both the development sample and the replication sample, each participant was classified as having only one subtype with high posterior probability (0·91–1·00). In the 215 of the original 2224 participants with insomnia who were reassessed 4·8 (SD 1·6) years later (between April 13, 2017, and June 21, 2017), the probability of maintaining their original subtype was 0·87, indicating high stability of the classification. The identified subtypes differed in developmental trajectories, response to treatment, the presence of an electroencephalogram biomarker, and the risk of depression, which differed by up to five times between groups; these differences indicate the clinical relevance of the subtypes.
Interpretation: High-dimensional data-driven subtyping of people with insomnia has addressed an unmet need to reduce the heterogeneity of insomnia disorder. Subtyping facilitates identification of the underlying causes of insomnia, development of personalised treatments, and selection of patients with the highest risk of depression for inclusion in trials regarding prevention of depression.
Funding: European Research Council and Netherlands Organization for Scientific Research.