Temporal trends in sperm count: a systematic review and meta-regression analysis
Shanna H. Swan
Hum Reprod Update 1-14.
25 July 2017
Reported declines in sperm counts remain controversial today and recent trends are unknown. A definitive meta-analysis is critical given the predictive value of sperm count for fertility, morbidity and mortality.
To provide a systematic review and meta-regression analysis of recent trends in sperm counts as measured by sperm concentration (SC) and total sperm count (TSC), and their modification by fertility and geographic group.
PubMed/MEDLINE and EMBASE were searched for English language studies of human SC published in 1981–2013. Following a predefined protocol 7518 abstracts were screened and 2510 full articles reporting primary data on SC were reviewed. A total of 244 estimates of SC and TSC from 185 studies of 42 935 men who provided semen samples in 1973–2011 were extracted for meta-regression analysis, as well as information on years of sample collection and covariates [fertility group (‘Unselected by fertility’ versus ‘Fertile’), geographic group (‘Western’, including North America, Europe, Australia and New Zealand versus ‘Other’, including South America, Asia and Africa), age, ejaculation abstinence time, semen collection method, method of measuring SC and semen volume, exclusion criteria and indicators of completeness of covariate data]. The slopes of SC and TSC were estimated as functions of sample collection year using both simple linear regression and weighted meta-regression models and the latter were adjusted for pre-determined covariates and modification by fertility and geographic group. Assumptions were examined using multiple sensitivity analyses and nonlinear models.
SC declined significantly between 1973 and 2011 (slope in unadjusted simple regression models −0.70 million/ml/year; 95% CI: −0.72 to −0.69; P < 0.001; slope in adjusted meta-regression models = −0.64; −1.06 to −0.22; P = 0.003). The slopes in the meta-regression model were modified by fertility (P for interaction = 0.064) and geographic group (P for interaction = 0.027). There was a significant decline in SC between 1973 and 2011 among Unselected Western (−1.38; −2.02 to −0.74; P < 0.001) and among Fertile Western (−0.68; −1.31 to −0.05; P = 0.033), while no significant trends were seen among Unselected Other and Fertile Other. Among Unselected Western studies, the mean SC declined, on average, 1.4% per year with an overall decline of 52.4% between 1973 and 2011. Trends for TSC and SC were similar, with a steep decline among Unselected Western (−5.33 million/year, −7.56 to −3.11; P < 0.001), corresponding to an average decline in mean TSC of 1.6% per year and overall decline of 59.3%. Results changed minimally in multiple sensitivity analyses, and there was no statistical support for the use of a nonlinear model. In a model restricted to data post-1995, the slope both for SC and TSC among Unselected Western was similar to that for the entire period (−2.06 million/ml, −3.38 to −0.74; P = 0.004 and −8.12 million, −13.73 to −2.51, P = 0.006, respectively).
This comprehensive meta-regression analysis reports a significant decline in sperm counts (as measured by SC and TSC) between 1973 and 2011, driven by a 50–60% decline among men unselected by fertility from North America, Europe, Australia and New Zealand. Because of the significant public health implications of these results, research on the causes of this continuing decline is urgently needed.
Have sperm counts declined? This question remains as controversial today as in 1992 when Carlsen et al. (1992) wrote that: ‘There has been a genuine decline in semen quality over the past 50 years’. This controversy has continued unabated both because of the importance of the question and limitations in studies that have attempted to address it (Swan et al., 2000; Safe, 2013; Te Velde and Bonde, 2013).
Sperm count is of considerable public health importance for several reasons. First, sperm count is closely linked to male fecundity and is a crucial component of semen analysis, the first step to identify male factor infertility (World Health Organization, 2010; Wang and Swerdloff, 2014). The economic and societal burden of male infertility is high and increasing (Winters and Walsh, 2014; Hauser et al., 2015; Skakkebaek et al., 2016). Second, reduced sperm count predicts increased all-cause mortality and morbidity (Jensen et al., 2009; Eisenberg et al., 2014b, 2016). Third, reduced sperm count is associated with cryptorchidism, hypospadias and testicular cancer, suggesting a shared prenatal etiology (Skakkebaek et al., 2016). Fourth, sperm count and other semen parameters have been plausibly associated with multiple environmental influences, including endocrine disrupting chemicals (Bloom et al., 2015; Gore et al., 2015), pesticides (Chiu et al., 2016), heat (Zhang et al., 2015) and lifestyle factors, including diet (Afeiche et al., 2013; Jensen et al., 2013), stress (Gollenberg et al., 2010; Nordkap et al., 2016), smoking (Sharma et al., 2016) and BMI (Sermondade et al., 2013; Eisenberg et al., 2014a). Therefore, sperm count may sensitively reflect the impacts of the modern environment on male health throughout the life course (Nordkap et al., 2012).
Given this background, we conducted a rigorous and complete systematic review and meta-regression analysis of recent trends in sperm count as measured by sperm concentration (SC) and total sperm count (TSC), and their modification by fertility and geographic group.
This systematic review and meta-regression analysis was conducted and the results reported in accordance with MOOSE (Meta-analysis in Observational Studies in Epidemiology) (Stroup et al., 2000) and PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analysis) guidelines (Liberati et al., 2009; Moher et al., 2009) [checklists available upon request from the corresponding author]. Our research team included epidemiologists, andrologists and a qualified medical librarian, with consultation with an expert in meta-analysis. Our predefined protocol, detailed in Supplementary Information, was developed following best practices (Borenstein et al., 2009; Higgins and Green, 2011; Program NT, 2015), and informed by two pilot studies, the first using all 1996 publications and the second all 1981 and 2013 publications.
The goal of the search was to identify all articles that reported primary data on human sperm count. We searched MEDLINE on November 21, 2014 and Embase (Excerpta Medica database) on December 10, 2014 for peer-reviewed, English-language publications. Following the recommendation of the Cochrane Handbook for Systematic Reviews, we searched in title and abstract for both index (MeSH) terms and keywords and filtered out animal-only studies. We used the MeSH term ‘sperm count’, which includes seven additional terms, and to increase sensitivity we added 13 related keywords (e.g. ‘sperm density’ and ‘sperm concentration’). We included all publications between January 1, 1981 (the first full year after the term ‘Sperm Count’ was added to MEDLINE as a MeSH term) and December 31, 2013 (the last full year at the time we began our MEDLINE search).
All studies that reported primary data on human SC were considered eligible for abstract screening. We evaluated the eligibility of all subgroups within a study. For example, in a case-control study, the control group might have been eligible for inclusion even though, based on our exclusion criteria, the case group was not.
We divided eligible studies into two fertility-defined groups: men unselected by fertility status, hereafter ‘Unselected’ (e.g. young men unlikely to be aware of their fertility such as young men screened for military service or college students); and fertile men, hereafter ‘Fertile’ (e.g. men who were known to have conceived a pregnancy, such as fathers or partners of pregnant women regardless of pregnancy outcome).
A study was excluded if study participants were selected based on: infertility or sub-fertility; range of semen parameters (e.g. studies selecting normospermic men); genital abnormalities, other diseases or medication. We also excluded studies limited to men with exposures that may affect fertility such as occupational exposure, post-intervention or smoking. Studies of candidates for vasectomy or semen donation were included only if semen quality was not a criterion for men’s study participation. Studies with fewer than 10 men and those that used non-standard methods to collect or count sperm (e.g. methods other than masturbation for collection, or methods other than hemocytometer for counting) were also excluded.
First, based on the title and abstract, the publication was either excluded or advanced to full text screening. Any publication without an abstract was automatically referred for full text screening. Second, we reviewed the full text and assigned the publication either to exclusion under a specific category or to data extraction. We then confirmed study eligibility and identified multiple publications from the same study to ensure that estimates from the same population were not used more than once.
We extracted summary statistics on SC and TSC (mean, SD, SE, minimum, maximum, median, geometric mean and percentiles), mean or additional data on semen volume, sample size (for SC and for TSC), sample collection years and covariates: fertility group, country, age, ejaculation abstinence time, methods of semen collection, methods of assessing SC and semen volume, selection of population and study exclusion criteria as well as number of samples per man. The range of permissible values, both for categorical and numerical variables, and information on data completeness were recorded. Data were extracted on all eligible subgroups separately as well as for the total population, if relevant. We attempted to extract data on additional potential confounders such as BMI, smoking and other lifestyle factors (e.g. alcohol and stress). However, except for smoking (which was examined in sensitivity analysis), data were available for such variables in only a minority of studies so these were not included in meta-regression analyses.
The study was conducted following a predefined protocol (Supplementary Information). Screening for this extensive systematic review was conducted by a team of eight reviewers (H.L., N.J., A.M.A., J.M., D.W.D., I.M., J.D.M., S.H.S.). The screening protocol was piloted by screening of 50 abstracts by all reviewers followed by a comparison of results, resolution of any inconsistencies and clarification of the protocol as needed. The same quality control process was followed for full text screening (35 studies reviewed by all reviewers) and data extraction (data extracted from three studies by all reviewers). All data were entered into digital spreadsheets with explicit permissible values (no open-ended entries) to increase consistency. After data extraction, an additional round of data editing and quality control of all studies was conducted by H.L. The process ensured that each study was evaluated by at least two different reviewers.
We used point estimates of mean SC or mean TSC from individual studies to model time trends during the study period, as measured by slope of SC or TSC per calendar year. The midpoint of the sample collection period was the independent variable in all analyses. Units were million/ml for SC and million for TSC (defined as SC × sample volume) and all slopes denote unit change per calendar year.
We first used simple linear regression models to estimate SC and TSC as functions of year of sample collection, with each study weighted by sample size. We then used random-effects meta-regression to model both SC and TSC as linear functions of time, weighting studies by the SE. In all meta-regression analyses, we included indicator variables to denote studies with more than one SC estimate. We controlled for a pre-determined set of potential confounders: fertility group, geographic group, age, abstinence time, whether semen collection and counting methods were reported, number of samples per man and indicators for exclusion criteria (Supplementary Table S1).
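As an illustration of the first step only, the simple linear regression weighted by sample size can be sketched as a weighted least-squares fit. This is a minimal sketch, not the authors' STATA code: the function name `weighted_slope` and the four data points are hypothetical, and the random-effects meta-regression with covariates is not reproduced here.

```python
import numpy as np

def weighted_slope(years, means, weights):
    """Weighted least-squares fit of study means on collection year.

    Returns (slope, intercept). The weights play the role of the
    per-study sample sizes described in the text.
    """
    x = np.asarray(years, dtype=float)
    y = np.asarray(means, dtype=float)
    w = np.asarray(weights, dtype=float)
    x_bar = np.average(x, weights=w)  # weighted mean year
    y_bar = np.average(y, weights=w)  # weighted mean SC
    slope = np.sum(w * (x - x_bar) * (y - y_bar)) / np.sum(w * (x - x_bar) ** 2)
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical study-level data (NOT taken from the paper):
# midpoint year of collection, mean SC in million/ml, number of men.
years = [1975, 1985, 1995, 2005]
mean_sc = [100.0, 90.0, 80.0, 70.0]
n_men = [50, 200, 120, 300]

slope, intercept = weighted_slope(years, mean_sc, n_men)
print(f"slope = {slope:.2f} million/ml per year")
```

Because the hypothetical points lie exactly on a line, the weights do not change the fitted slope here; with real study estimates, larger studies would pull the line toward their values.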
For several key variables missing values were estimated and a variable was included in meta-regression analyses to denote that the value had been estimated. For example, for studies that reported median (not mean) SC or TSC, we estimated the mean by adding the average difference between the mean and median in studies for which both were reported. For studies that did not report the range or midpoint year of sample collection, the midpoint was estimated by subtracting from the publication year the average difference between year of publication and midpoint year of sample collection in studies for which both were reported. When SD but not SE of SC or TSC was reported, the SE was calculated by dividing the SD by the square root of sample size for each estimate. For studies that did not report SD or SE, we estimated SE by dividing the mean SD of studies that reported SD by the square root of sample size for this estimate. If mean TSC was not reported it was calculated by multiplying mean SC by mean semen volume (Supplementary Information).
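The estimation rules in the preceding paragraph can be written out as small helper functions. This is a minimal sketch under stated assumptions: all function names are hypothetical, the inputs in the usage lines are made-up numbers, and the averages are assumed to be plain unweighted means over the studies that reported both quantities.

```python
import math

def se_from_sd(sd, n):
    """SE when a study reports SD: SD divided by sqrt(sample size)."""
    return sd / math.sqrt(n)

def impute_se(n, reported_sds):
    """SE when neither SD nor SE is reported: mean SD of the
    reporting studies divided by sqrt(this study's sample size)."""
    mean_sd = sum(reported_sds) / len(reported_sds)
    return mean_sd / math.sqrt(n)

def impute_mean_from_median(median, mean_minus_median):
    """Mean estimated as median plus the average (mean - median)
    difference from studies that reported both."""
    return median + sum(mean_minus_median) / len(mean_minus_median)

def impute_midpoint_year(pub_year, pub_minus_midpoint):
    """Midpoint year estimated as publication year minus the average
    (publication year - midpoint year) lag from reporting studies."""
    return pub_year - sum(pub_minus_midpoint) / len(pub_minus_midpoint)

def total_sperm_count(mean_sc, mean_volume):
    """TSC (million) = mean SC (million/ml) x mean semen volume (ml)."""
    return mean_sc * mean_volume

# Hypothetical inputs, for illustration only.
print(se_from_sd(40.0, 100))
print(impute_se(400, [30.0, 50.0]))
print(impute_mean_from_median(60.0, [10.0, 14.0]))
print(impute_midpoint_year(2000, [2, 4]))
print(total_sperm_count(80.0, 3.0))
```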
Our final analyses included two groups of countries. One group (referred to here as ‘Western’) includes studies from North America, Europe, Australia and New Zealand. The second (‘Other’/‘Non-Western’) includes studies from all other countries (from South America, Asia and Africa). We initially examined studies from North America separately from Europe/Australia but combined these because trends were similar and only 16% of estimates were from North America. We assessed modification of slope by fertility group (Unselected versus Fertile) and geographic group (Western versus Other). Because of significant modification by fertility and geography, results of models with interaction terms are presented for four categories: Unselected Western; Fertile Western; Unselected Other; and Fertile Other. Overall percentage declines were calculated by estimating the sperm count (SC or TSC) in the first and last year of data collection, and dividing the difference by the estimate in the first year. The percentage decline per year was calculated by dividing the overall percentage declines by the number of years.
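The percentage-decline calculation described above reduces to a few lines of arithmetic on the fitted line. The sketch below assumes a linear model of the form estimate = intercept + slope × year; the function name `percent_decline` and the slope and intercept values are hypothetical and are not the fitted values from this analysis.

```python
def percent_decline(slope, intercept, first_year, last_year):
    """Overall and per-year percentage decline from a fitted line.

    Overall decline: (estimate in first year - estimate in last year)
    divided by the estimate in the first year, as a percentage.
    Per-year decline: overall decline divided by the number of years.
    """
    first = intercept + slope * first_year
    last = intercept + slope * last_year
    overall = (first - last) / first * 100.0
    per_year = overall / (last_year - first_year)
    return overall, per_year

# Hypothetical line: 100 million/ml in 1973, falling 1 million/ml per year.
overall, per_year = percent_decline(-1.0, 2073.0, 1973, 2011)
print(f"overall decline = {overall:.1f}%, per year = {per_year:.2f}%")
```

The same division links the figures reported for Unselected Western studies: an overall decline of 52.4% over the 38 years from 1973 to 2011 corresponds to roughly 1.4% per year.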
We ran all analyses for TSC weighting by SE of TSC and adjusted for method used to assess semen volume: weighing, read from pipette, read from tube or other.
We conducted several sensitivity analyses: adding cubic and quadratic terms for year of sample collection in meta-regression analyses to assess non-linearity; excluding a specific group for each covariate, such as a group with incomplete information; removing covariates one at a time from the model; removing studies with SEs > 20 million/ml; replacing age group by mean age, excluding studies that did not report mean age; adding a covariate for high smoking prevalence (>30%); excluding countries that contributed the greatest number of estimates in order to examine the influence of these countries; and restricting analyses to studies with data collected after 1985 and after 1995 to examine recent trends.
All analyses were conducted using STATA version 14.1 (StataCorp, TX, USA). A value of P < 0.05 was considered significant for main effect and P < 0.10 for interaction.
Systematic review and summary statistics
Using PubMed and Embase searches we identified 7518 publications meeting our criteria for abstract screening (Fig. 1). Of these, 14 duplicate records were removed and 4994 were excluded based on title or abstract screening. Full texts of the remaining 2510 articles were reviewed for eligibility and 2179 studies were excluded. Of the remaining 331 articles, 146 were excluded during data extraction and the second round of full text screening (mainly due to multiple publications). The meta-regression analysis is based on the remaining 185 studies, which included 244 unique mean SC estimates based on samples collected between 1973 and 2011 from 42 935 men. Data were available from 6 continents and 50 countries. The mean SC was 81 million/ml, the mean TSC was 260 million and the mean year of sample collection was 1995. Of the 244 estimates, 110 (45%) were Unselected Western, 65 (27%) Fertile Western, 30 (12%) Unselected Other and 39 (16%) Fertile Other. Data from the 185 publications included in the meta-analysis are available upon request from the corresponding author (Abyholm, 1981; Fariss et al., 1981; Leto and Frensilli, 1981; Wyrobek et al., 1981a,b; Aitken et al., 1982; Nieschlag et al., 1982; Obwaka et al., 1982; Albertsen et al., 1983; Fowler and Mariano, 1983; Sultan Sheriff, 1983; Wickings et al., 1983; Asch et al., 1984; de Castro and Mastrorocco, 1984; Fredricsson and Sennerstam, 1984; Freischem et al., 1984; Ward et al., 1984; Ayers et al., 1985; Heussner et al., 1985; Rosenberg et al., 1985; Aribarg et al., 1986; Comhaire et al., 1987; Kirei, 1987; Giblin et al., 1988; Kjaergaard et al., 1988; Mieusset et al., 1988, 1995; Jockenhovel et al., 1989; Sobowale and Akiwumi, 1989; Svanborg et al., 1989; Zhong et al., 1990; Culasso et al., 1991; Dunphy et al., 1991; Gottlieb et al., 1991; Nnatu et al., 1991; Pangkahila, 1991; Weidner et al., 1991; Levine et al., 1992; Sheriff and Legnain, 1992; Ali et al., 1993; Arce et al., 1993; Bartoov et al., 1993; Fedder et al., 1993; Noack-Fuller et al., 1993; World Health Organization, 1993; Hill et al., 1994; Rehan, 1994; Rendon et al., 1994; Taneja et al., 1994; Vanhoorne et al., 1994; Auger et al., 1995; Cottell and Harrison, 1995; Figa-Talamanca et al., 1996; Fisch et al., 1996; Irvine et al., 1996; Van Waeleghem et al., 1996; Vierula et al., 1996; Vine et al., 1996; Auger and Jouannet, 1997; Jensen et al., 1997; Lemcke et al., 1997; Handelsman, 1997a,b; Chia et al., 1998; Muller et al., 1998; Naz et al., 1998; Gyllenborg et al., 1999; Kolstad et al., 1999; Kuroki et al., 1999; Larsen et al., 1999; Purakayastha et al., 1999; Reddy and Bordekar, 1999; De Celis et al., 2000; Glazier et al., 2000; Mak et al., 2000; Selevan et al., 2000; Wiltshire et al., 2000; Zhang et al., 2000; Foppiani et al., 2001; Guzick et al., 2001; Hammadeh et al., 2001; Jorgensen et al., 2001, 2002, 2011, 2012; Kelleher et al., 2001; Lee and Coughlin, 2001; Patankar et al., 2001; Tambe et al., 2001; Xiao et al., 2001; Costello et al., 2002; Junqing et al., 2002; Kukuvitis et al., 2002; Luetjens et al., 2002; Punab et al., 2002; Richthoff et al., 2002; Danadevi et al., 2003; de Gouveia Brazao et al., 2003; Firman et al., 2003; Liu et al., 2003; Lundwall et al., 2003; Roste et al., 2003; Serra-Majem et al., 2003; Uhler et al., 2003; Xu et al., 2003; Ebesunun et al., 2004; Rintala et al., 2004; Toft et al., 2004, 2005; Bang et al., 2005; Mahmoud et al., 2005; Muthusami and Chinnaswamy, 2005; O’Donovan, 2005; Tsarev et al., 2005, 2009; Durazzo et al., 2006; Fetic et al., 2006; Giagulli and Carbone, 2006; Haugen et al., 2006; Iwamoto et al., 2006, 2013a,b; Pal et al., 2006; Yucra et al., 2006; Aneck-Hahn et al., 2007; Garcia et al., 2007; Multigner et al., 2007; Plastira et al., 2007; Rignell-Hydbom et al., 2007; Wu et al., 2007; Akutsu et al., 2008; Bhattacharya, 2008; Gallegos et al., 2008; Goulis et al., 2008; Jedrzejczak et al., 2008; Kobayashi et al., 2008; Korrovits et al., 2008; Li and Gu, 2008; Lopez-Teijon et al., 2008; Paasch et al., 2008; Peters et al., 2008; Recabarren et al., 2008; Recio-Vega et al., 2008; Saxena et al., 2008; Shine et al., 2008; Andrade-Rocha, 2009; Kumar et al., 2009, 2011; Rylander et al., 2009; Stewart et al., 2009; Vani et al., 2009, 2012; Verit et al., 2009; Engelbertz et al., 2010; Hossain et al., 2010; Ortiz et al., 2010; Rubes et al., 2010; Tirumala Vani et al., 2010; Al Momani et al., 2011; Auger and Eustache, 2011; Axelsson et al., 2011; Brahem et al., 2011; Jacobsen et al., 2011; Khan et al., 2011; Linschooten et al., 2011; Venkatesh et al., 2011; Vested et al., 2011; Absalan et al., 2012; Al-Janabi et al., 2012; Katukam et al., 2012; Mostafa et al., 2012; Nikoobakht et al., 2012; Rabelo-Junior et al., 2012; Splingart et al., 2012; Bujan et al., 2013; Girela et al., 2013; Halling et al., 2013; Ji et al., 2013; Mendiola et al., 2013; Redmon et al., 2013; Thilagavathi et al., 2013; Valsa et al., 2013; Zalata et al., 2013; Zareba et al., 2013; Huang et al., 2014).
Simple linear models
Combining results from all four groups of men, SC declined significantly (slope per year −0.70 million/ml; 95% CI: −0.72 to −0.69; P < 0.001) over the study period when using simple linear models (unadjusted, weighted by sample size) (Fig. 2a). SC declined by 0.75% per year (95% CI: 0.73–0.77%) and overall by 28.5% between 1973 and 2011. A similar trend was seen for TSC (slope per year = −2.23 million; 95% CI: −2.31 to −2.16; P < 0.001) (Fig. 2b), corresponding to a decline in TSC of 0.75% per year (95% CI: 0.72–0.78%), and 28.5% overall. Semen volume (156 estimates) did not change significantly over the study period (slope per year = 0.0003 ml; 95% CI: −0.0003 to 0.0008; P = 0.382).