Testosterone supplementation in men with sexual dysfunction

This is not the most recent version

Abstract

This is a protocol for a Cochrane Review (Intervention). The objectives are as follows:

To assess the effects of testosterone compared to placebo or other medical treatments in men with sexual dysfunction.

Background

Description of the condition

Sexual dysfunction is any physical or psychological problem that prevents an individual or a couple from getting sexual satisfaction (McCabe 2016). Erectile dysfunction is the most prevalent complaint in men with sexual dysfunction, and its incidence increases with age (Feldman 1994; Hatzimouratidis 2010; Lue 2004). The aging process in men is accompanied by a progressive decline in serum testosterone levels. Serum concentration of testosterone declines by an average of 3.1 to 3.5 ng/dL per year in middle‐aged and older men. Low testosterone concentrations (value of less than 325 ng/dL) are seen in 19% of men in their 60s, 28% of men in their 70s, and 49% of men in their 80s (Harman 2001; Kang 2013). Erectile dysfunction is also associated with reduced or borderline circulating androgen levels (Feldman 2002). Male hypogonadism is traditionally defined as the failure of the testes to produce testosterone, because of a local (testicular; primary) or a distant (pituitary/hypothalamus; secondary) deficiency (Corona 2011). A new classification of male hypogonadism was introduced essentially based on age of onset of the hypogonadal symptoms and signs, thus distinguishing very early (in fetal life), early (pre‐ or peri‐pubertally), and late‐onset hypogonadism, which presents after puberty with aging (Corona 2011). Testosterone hormone should be prescribed for men with sexual dysfunction and testosterone deficiency (Khera 2016; Lunenfeld 2013; Wang 2009). Given the controversies in the threshold for diagnosing late‐onset hypogonadism as described in the following section (Seftel 2006), we will consider individuals with sexual dysfunction regardless of serum testosterone level as the target population through which to define the disease condition and intervention in this review.

Diagnosis

Late‐onset hypogonadism is a clinical syndrome defined as failure to produce physiological levels of testosterone with several symptoms and signs suggestive of androgen deficiency (Khera 2016). Late‐onset hypogonadism is a relatively common medical condition affecting the aging male, but is difficult to diagnose since there are no clear‐cut differences in relation to the physiological aging process. However, it has been recognised that diagnosis of late‐onset hypogonadism should be made only in men with consistent symptoms and signs and unequivocally low serum testosterone levels (Khera 2016; Lunenfeld 2013; Wang 2009).

The main symptoms of late‐onset hypogonadism are low libido and decreased nocturnal and morning erections. Other symptoms of late‐onset hypogonadism are several vague and non‐specific clinical features such as decreased muscle mass and strength, increased body fat, decreased bone mineral density and osteoporosis, and depressed mood (Khera 2016; Lunenfeld 2013; Wang 2009). Questionnaires such as the Aging Male Symptom Score (AMS) and the Androgen Deficiency in Aging Men that have good sensitivity are widely used in clinical practice to evaluate symptoms related to late‐onset hypogonadism, but are not recommended for the diagnosis of hypogonadism because of low specificity (Heinemann 1999; Morley 2000).

The biochemical work‐up to establish late‐onset hypogonadism is the measurement of serum testosterone. A number of methods are available to assess serum testosterone. Most clinical laboratories estimate the serum concentration of total testosterone by immunoassays which are relatively inexpensive and readily available (Diver 2009). There are no consistent and clinically relevant reference ranges, but total testosterone levels below 12 nmol/L are used as the lower limits of normal total testosterone (Khera 2016; Lunenfeld 2013; Wang 2009). However, approximately 50% to 70% of testosterone in the blood is tightly bound to sex hormone‐binding globulin (SHBG); 20% to 30% is loosely bound to albumin; 4% is bound to other proteins; and 1% to 3% is the free, non‐bound form in men (Diver 2009). The SHBG‐bound form is not readily physiologically available, whereas albumin‐bound testosterone and free testosterone (bioavailable testosterone) are able to cross cell membranes and bind to androgen receptors (Diver 2009). Recently, measurement of free or bioavailable testosterone has become the best marker of late‐onset hypogonadism and may confirm the diagnosis when the serum total testosterone concentration is not diagnostic of late‐onset hypogonadism. Equilibrium dialysis is the gold standard for free testosterone measurement, and free testosterone levels below 225 pmol/L are generally accepted as lower limits of free testosterone (Wang 2009). Alternatively, calculated free testosterone by measuring serum SHBG levels together with total testosterone can be used (Diver 2009; Khera 2016; Lunenfeld 2013; Wang 2009). It has been recommended that blood samples for diagnosing late‐onset hypogonadism be obtained in the morning, usually from 7:00 to 11:00 AM, because of the diurnal variation of serum testosterone in which values are highest in the early morning (Diver 2009; Khera 2016; Lunenfeld 2013; Wang 2009). 
Sampling later in the day can lead to overdiagnosis of late‐onset hypogonadism (Khera 2016).

Treatment

In 1849, Adolph Berthold demonstrated a connection between "testicular secretions", male behaviour, and sexual characteristics (Nieschlag 2014). These testicular secretions have been elucidated as testosterone (Nieschlag 2014; Nieschlag 2015). Since exogenous testosterone administration, known as testosterone supplementation therapy, became available for the Klinefelter and the Kallmann syndrome (Nieschlag 2014; Zitzmann 2000), World Health Organization (WHO) consensus has stated that the major goal of exogenous testosterone administration is to restore testosterone levels at as close to physiologic concentrations associated with the patient's age as possible (Nieschlag 1992).

Testosterone supplementation therapy has been recommended in men who have low testosterone levels associated with certain medical conditions such as failure of the testicles to produce testosterone because of genetic problems or exposures from chemotherapy (primary hypogonadism), or problems with the pituitary gland or hypothalamus that control the production of testosterone by the testis (secondary hypogonadism) (Khera 2016; Lunenfeld 2013; Wang 2009). In addition, testosterone supplementation therapy is also recommended in the treatment of erectile dysfunction and problems with libido arising from low testosterone levels in aging men without other endocrinological causes of low testosterone (Khera 2016; Lunenfeld 2013; Wang 2009).

Based on current guidelines by major professional societies in andrology and endocrinology, there is general agreement that testosterone supplementation therapy may be offered to symptomatic individuals whose presentation indicates late‐onset hypogonadism when circulating total testosterone is below 8 nmol/L (Khera 2016; Lunenfeld 2013; Wang 2009). When the total testosterone level is above 12 nmol/L, testosterone supplementation therapy should not be recommended. If the serum total testosterone level is between 8 and 12 nmol/L with typical hypogonadal symptoms, measurement of free testosterone may be helpful to exclude conditions in which SHBG could be decreased (obesity, acromegaly, hypothyroidism) or increased (aging, hepatic illness, hyperthyroidism, use of anticonvulsants).
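The thresholds in the preceding paragraph can be written as a small decision rule. The following is an illustrative sketch only, not a clinical tool and not part of the protocol's methods. The function name and the returned labels are hypothetical, and two points are assumptions: values of exactly 8 or 12 nmol/L are placed in the intermediate band, and men without symptoms are given a separate label because the text only addresses symptomatic individuals.

```python
def triage_total_testosterone(total_t_nmol_l: float, symptomatic: bool) -> str:
    """Map a serum total testosterone value (nmol/L) to the guideline band
    described in the text. Labels are hypothetical and for illustration."""
    if total_t_nmol_l < 0:
        raise ValueError("testosterone concentration cannot be negative")
    if total_t_nmol_l > 12:
        # Above 12 nmol/L: therapy should not be recommended.
        return "therapy_not_recommended"
    if not symptomatic:
        # Assumption: the text covers symptomatic men only, so a low value
        # without symptoms is flagged separately rather than treated.
        return "no_indication_without_symptoms"
    if total_t_nmol_l < 8:
        # Below 8 nmol/L with symptoms: therapy may be offered.
        return "therapy_may_be_offered"
    # 8 to 12 nmol/L (inclusive, by assumption): free testosterone may help.
    return "measure_free_testosterone"


if __name__ == "__main__":
    for value in (6.5, 10.0, 14.0):
        print(value, triage_total_testosterone(value, symptomatic=True))
```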

Description of the intervention

Testosterone has been available for more than 70 years, and the number of new testosterone preparations has increased dramatically over the last two decades (Nieschlag 2014). Recent US data showed that 2.2 million men were prescribed testosterone in 2013 compared to 1.2 million in 2010 (Nguyen 2015). Aggressive marketing strategies by American pharmaceutical companies may explain the remarkable increase in testosterone use, even in the absence of an appropriate diagnosis (Khera 2016; Lunenfeld 2013; Wang 2009).

Currently available testosterone supplementation therapy options include intramuscular injections, transdermal, subcutaneous pellets, oral, and buccal formulations (Giagulli 2011; Shoskes 2016).

  • Intramuscular injection is the longest‐available therapy option. While relatively inexpensive, it requires frequent administration, at two‐ to three‐week intervals, to maintain therapeutic levels owing to the short half‐life of these preparations. Recently, longer‐acting testosterone injections have become available, with improved compliance compared with shorter‐acting injections.

  • Transdermal formulations (such as gels, creams, and solutions) maintain stable physiological testosterone levels and require daily application. However, there are concerns regarding local adverse events such as skin irritation and rash.

  • Subdermal administration (subcutaneous pellet implantation) provides sustained normal testosterone levels for three to four months, but implantation is invasive requiring surgical intervention.

  • Other therapies (oral tablets and buccal patch) are also available. These formulations are convenient for daily use, but have disadvantages owing to frequent fluctuations in plasma testosterone levels.

Testosterone supplementation therapy can be initiated with any of the approved formulations in accordance with patient preference, predicted adverse effects, cost, and insurance coverage (Shoskes 2016).

Adverse effects of the intervention

The most important concerns about testosterone administration relate to cardiovascular risks in older men with age‐related decline in testosterone levels (US Food and Drug Administration). Two observational studies that examined the risk of cardiovascular events associated with testosterone supplementation therapy found an increased risk of cardiovascular harm (Finkle 2014; Vigen 2013). Similarly, a recent meta‐analysis of 27 randomised controlled trials reported that testosterone therapy was associated with an increased risk of adverse cardiovascular events (odds ratio 1.5, 95% confidence interval 1.1 to 2.1) (Xu 2013). The US Food and Drug Administration (FDA) has warned that testosterone supplementation therapy increases cardiovascular risk and has cautioned that prescription testosterone products are approved only for men who have low testosterone levels caused by certain medical conditions (US Food and Drug Administration).

Additional adverse events of testosterone supplementation therapy are polycythaemia and an increase in prostate‐related events. Testosterone supplementation therapy commonly increases haemoglobin concentration and haematocrit, a condition described as erythrocytosis or polycythaemia, which is related to the risk of thromboembolism (Jones 2015b). Another health risk of testosterone supplementation therapy is an increased incidence of prostate‐related events (i.e. combined incidence of prostate biopsy, prostate enlargement, elevated prostate‐specific antigen (PSA), and prostate cancer) (Calof 2005). There are concerns that testosterone supplementation therapy may increase prostate cancer risk, but it is unclear whether there is in fact a causal relation (Coward 2009; Shabsigh 2009).

In addition, exogenous testosterone also produces dose‐dependent depression of gonadotropin (follicle‐stimulating hormone (FSH) and luteinising hormone (LH)) release either by direct action on the pituitary gland or by suppression of the hypothalamic gonadotropin‐releasing hormone (GnRH) release. Reduced gonadotropin secretion results in decreased intratesticular and peripheral testosterone levels, which can cause adverse effects such as testicular atrophy, oligospermia, azoospermia, and other sperm abnormalities (Fronczak 2012). A lack of libido, erectile dysfunction, or even gynecomastia may also be observed (Nieschlag 2015).

How the intervention might work

Physiologic action of testosterone depends on the presence of an intact hypothalamic pituitary gonadal axis. The hypothalamus produces GnRH, which stimulates FSH and LH production in the anterior pituitary gland. Follicle‐stimulating hormone is responsible for sperm production, and LH is responsible for testosterone secretion in the testis. Testosterone is secreted primarily by the testes, and to a lesser extent by the adrenal glands. It exerts its action through binding to and activation of the androgen receptor in most organ systems, including the central and peripheral nervous system, musculoskeletal system, reproductive system, and cardiovascular system (Corona 2011).

Testosterone plays a critical role in the development of the male reproductive organs such as the testes and prostate, as well as promoting secondary sexual characteristics during puberty (Gannon 2016; Ohlander 2016). In addition, testosterone modulates male sexual desire, spontaneous sexual thoughts, motivation, attentiveness to erotic stimuli, and sexual activity (Gannon 2016). It also stimulates muscle protein synthesis and may increase bone mineral density by a direct effect of testosterone or by an indirect action requiring conversion to estradiol (Borst 2015). Finally, testosterone is thought to exert a variety of beneficial effects on blood vessels and the heart through improvement of lipid profiles, insulin sensitivity, and exercise tolerance (Kloner 2016).

Why it is important to do this review

The benefit and safety of testosterone supplementation therapy for the treatment of men with low testosterone are not well established (Garnick 2015). Testosterone supplementation therapy has been performed in younger androgen deficient men for many years. However, testosterone supplementation therapy for older men with low testosterone levels has long been a topic of discussion (Zitzmann 2000). While there are existing systematic reviews for testosterone supplementation therapy to treat males with sexual dysfunction, none so far have used the same rigorous methodology as Cochrane Reviews, which include the GRADE approach for rating the certainty of evidence (Guo 2016; Tsertsvadze 2009). Given the controversial and important nature of this question, we propose a methodologically rigorous assessment of the benefits and harms of testosterone supplementation therapy in men with sexual dysfunction.

Objectives

To assess the effects of testosterone compared to placebo or other medical treatments in men with sexual dysfunction.

Methods

Criteria for considering studies for this review

Types of studies

We will include randomised controlled trials.

Types of participants

We will include adult men (aged 40 years and over) with sexual dysfunction. The age limitation is based on observations from the Massachusetts Male Aging Study, which noted that the prevalence of complete impotence triples from 5% to 15% between 40 and 70 years (Feldman 1994).

Diagnostic criteria for sexual dysfunction

We will define sexual dysfunction as acquired symptoms of decreased libido (or desire to engage in sexual activity) and/or erectile dysfunction referring to inability to attain or maintain an erection sufficient to permit satisfactory sexual performance (Khera 2016; Lunenfeld 2013; Wang 2009). We will include trials that use validated questionnaires related to sexual dysfunction such as AMS or International Index of Erectile Function (IIEF) to establish the diagnosis of sexual dysfunction (Heinemann 1999; Rosen 1997).

Types of interventions

We plan to investigate the following comparisons of intervention versus control/comparator.

Intervention

  • Testosterone supplementation therapy regardless of formulation

  • Testosterone supplementation therapy + phosphodiesterase 5 inhibitor (PDE5I)

Comparator

  • Placebo compared with testosterone supplementation therapy or testosterone supplementation therapy + PDE5I

  • PDE5I compared with testosterone supplementation therapy or testosterone supplementation therapy + PDE5I

Concomitant interventions will have to be the same in both the intervention and comparator groups to establish fair comparisons.

Minimum duration of intervention

We define a clinically meaningful minimal duration of the intervention as one month (four weeks).

Minimum duration of follow‐up

We define a clinically meaningful minimal duration of follow‐up as three months (12 weeks).

Summary of specific exclusion criteria

We will exclude trials of men with the following.

  • Primary hypogonadism such as disorders of the testicles.

  • Secondary hypogonadism such as disorders of pituitary gland or brain that cause a hypogonadism.

  • Disorders that cause hormonal disturbances such as Cushing disease or syndrome.

  • Relative or absolute contraindications of testosterone supplementation therapy such as haematocrit > 55%, known prostate cancer or PSA levels greater than 4 ng/mL, and untreated sleep apnoea.

Types of outcome measures

We will not exclude a trial if it fails to report one or several of our primary or secondary outcome measures. If a trial reports none of our primary or secondary outcomes, we will not include the trial but provide some basic information in an additional table.

We will investigate the following outcomes using the methods and time points specified below.

Primary outcomes

  • Erectile function

  • Sexual quality of life

  • Cardiovascular mortality

Secondary outcomes

  • Treatment withdrawal due to adverse events

  • Prostate‐related events

  • Lower urinary tract symptoms

  • Socioeconomic effects

Method of outcome measurement

We will consider clinically important difference for the review outcomes to rate the certainty of the evidence for imprecision in the 'Summary of findings' table (Johnston 2010). We will use minimal clinically important difference (MCID) for participant‐reported outcomes and clinically relevant absolute or relative effect measures for dichotomous outcomes to define clinically important difference.

Erectile function

  • Evaluated by validated questionnaires such as sexual domain of AMS, erection domain of IIEF or IIEF‐5, or evaluated by IIEF question 15.

  • We will consider the MCID in the erectile function domain score of IIEF to be four points (Rosen 2011). We will also consider an improvement of more than five points in IIEF‐5 as the MCID (Spaliviero 2010). There is no established threshold in the sexual domain of the AMS. We will use an MCID of 25% improvement from baseline in AMS (Nickel 2015). For IIEF question 15 “In the last 4 weeks, how would you rate your confidence that you get and keep your erection?”, we will classify those who answered very low or low as having erectile dysfunction (Wessells 2011). We will use a relative risk reduction of at least 25% as the MCID for IIEF question 15 (Guyatt 2011).
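As a minimal sketch of how the continuous MCID thresholds above could be applied to a single participant's scores, the hypothetical helper below checks a baseline and follow‐up pair against each rule. The function name and instrument labels are illustrative. It assumes that higher IIEF scores indicate better function, that higher AMS scores indicate worse symptoms (so improvement is a decrease), and that "four points" means a change of at least four.

```python
def meets_mcid(instrument: str, baseline: float, follow_up: float) -> bool:
    """Return True if the change from baseline reaches the MCID stated in
    the text for the given instrument (labels are hypothetical)."""
    if instrument == "IIEF-EF":
        # Erectile function domain of IIEF: MCID of four points (Rosen 2011).
        return follow_up - baseline >= 4
    if instrument == "IIEF-5":
        # IIEF-5: improvement of more than five points (Spaliviero 2010).
        return follow_up - baseline > 5
    if instrument == "AMS":
        # AMS: 25% improvement from baseline (Nickel 2015).
        # Assumption: a lower AMS score means fewer symptoms.
        if baseline <= 0:
            raise ValueError("AMS baseline must be positive")
        return (baseline - follow_up) / baseline >= 0.25
    raise ValueError(f"unknown instrument: {instrument}")


if __name__ == "__main__":
    print(meets_mcid("IIEF-EF", baseline=12, follow_up=16))
    print(meets_mcid("IIEF-5", baseline=10, follow_up=15))
    print(meets_mcid("AMS", baseline=40, follow_up=30))
```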

Sexual quality of life

  • Evaluated by validated questionnaires such as total score of AMS and overall satisfaction domain of IIEF.

  • There is no reported threshold in total score of AMS and overall satisfaction domain of IIEF. We will use an MCID of 25% improvement from baseline to assess sexual quality of life (Nickel 2015).

Cardiovascular mortality

  • Such as mortality related to myocardial infarction, heart failure, constrictive pericarditis, stroke, and any type of embolism or thrombosis after participants were randomised to intervention/comparator groups (Basaria 2010).

Treatment withdrawal due to adverse events

  • Defined as treatment discontinuation due to adverse events at any time after participants were randomised to intervention/comparator groups.

Prostate‐related events

  • Such as incidence of prostate‐biopsy, prostate enlargement, elevated PSA, and prostate cancer.

  • If the authors of eligible studies reported prostate‐related events separately, we will use the available information described in the studies.

Lower urinary tract symptoms

  • Evaluated by validated questionnaires such as International Prostate Symptom Score (IPSS).

  • We will consider the MCID in the IPSS to be three points (Barry 1995).

Socioeconomic effects

  • Such as average length of hospital stay due to adverse events (measured in days from admission to discharge) and direct costs defined as visits to general practitioner.

Timing of outcome measurement

We will consider outcomes measured up to and including 12 months after randomisation as short term, and later than 12 months as long term.

Search methods for identification of studies

Electronic searches

We will search the following sources from the inception of each database to the date of search and will place no restrictions on language of publication or publication status.

  • Cochrane Central Register of Controlled Trials (CENTRAL) via the Cochrane Register of Studies Online (CRSO)

  • MEDLINE Ovid (Epub Ahead of Print, In‐Process & Other Non‐Indexed Citations, Ovid MEDLINE(R) Daily and Ovid MEDLINE(R); from 1946 onwards)

  • Embase (via Elsevier) (from 1974 onwards)

  • LILACS (Latin American and the Caribbean Health Sciences Literature; www.bireme.br/; from 1982)

  • ClinicalTrials.gov (www.clinicaltrials.gov)

  • World Health Organization International Clinical Trials Registry Platform (WHO ICTRP) (www.who.int/trialsearch/)

We will continuously apply a MEDLINE (via Ovid SP) email alert service established by the Cochrane Metabolic and Endocrine Disorders (CMED) group to identify newly published trials using the same search strategy as described for MEDLINE (Appendix 1). After we submit the final review draft for editorial approval, the CMED group will perform a complete search update on all databases available at the editorial office and will send the results to the review authors. Should we identify new trials for inclusion, we will evaluate these and incorporate the findings into our review draft (Beller 2013).

Searching other resources

We will attempt to identify other potentially eligible trials or ancillary publications by searching the reference lists of included trials, systematic reviews, meta‐analyses, and health technology assessment reports. In addition, we will contact authors of included trials to identify any additional information on the retrieved trials or any further trials we may have missed.

We will not use abstracts or conference proceedings for data extraction unless full data are available from the trial authors because this information source does not fulfil the CONSORT requirements, which consist of "an evidence‐based, minimum set of recommendations for reporting randomized trials" (CONSORT; Scherer 2007). We will list key data from abstracts in an appendix. We will present information on abstracts or conference proceedings in the 'Characteristics of studies awaiting classification' table.

Data collection and analysis

Selection of studies

Two review authors (JHJ, HWK, BR, JSL, or HSY) will independently screen the abstract, title, or both of every record retrieved by the literature searches to determine which trials should be assessed further. We will obtain the full text of all potentially relevant records. Any disagreements will be resolved through consensus or by recourse to a third review author (PD). In the event that a disagreement cannot be resolved, we will categorise the trial as a 'study awaiting classification' and will contact the trial authors for clarification. We will present an adapted PRISMA flow diagram to show the process of trial selection (Liberati 2009). We will list all articles excluded after full‐text assessment in a 'Characteristics of excluded studies' table and will provide the reasons for their exclusion.

Data extraction and management

Two review authors (JHJ, HWK, BR, JSL, or HSY) will independently extract key participant and intervention characteristics for the included trials. We will describe interventions according to the 'template for intervention description and replication' (TIDieR) checklist (Hoffmann 2014; Hoffmann 2017).

We will report data on efficacy outcomes and adverse events using standardised data extraction sheets from the CMED group. Any disagreements will be resolved by discussion or by consulting a third review author (PD) if necessary.

We will provide information including trial identifier for potentially relevant ongoing trials in the 'Characteristics of ongoing studies' table and in a joint appendix 'Matrix of trial endpoint (publications and trial documents)'. We will attempt to locate the protocol for each included trial and will report primary, secondary, and other outcomes in comparison with data in publications in a joint appendix.

We will email authors of all included trials to ask if they would be willing to answer questions regarding their trials. We will present the results of this survey in an appendix. We will thereafter seek relevant missing information on the trial from the primary trial author(s), if required.

Dealing with duplicate and companion publications

In the event of duplicate publications, companion documents, or multiple reports of a primary trial, we will maximise the information yield by collating all available data and will use the most complete data set aggregated across all known publications. We will list duplicate publications, companion documents, multiple reports of a primary trial, and trial documents of included trials (such as trial registry information) as secondary references under the study ID of the included trial. Furthermore, we will also list duplicate publications, companion documents, multiple reports of a trial, and trial documents of excluded trials (such as trial registry information) as secondary references under the study ID of the excluded trial.

Data from clinical trials registers

If data from included trials are available as study results in clinical trials registers such as ClinicalTrials.gov or similar sources, we will make full use of this information and extract the data. If there is also a full publication of the trial, we will collate and critically appraise all available data. If an included trial is marked as a completed study in a clinical trial register but no additional information (study results, publication, or both) is available, we will add this trial to the 'Characteristics of studies awaiting classification' table.

Assessment of risk of bias in included studies

Two review authors (JHJ, HWK, BR, JSL, or HSY) will independently assess the risk of bias for each included trial. Any disagreements will be resolved by consensus or by consulting a third review author (PD) when necessary. If a disagreement persists, we will consult the remainder of the review author team and make a judgement based on consensus. If adequate information is not available from the publications, trial protocols, or other sources, we will contact the trial authors to request further detail or missing data on 'Risk of bias' items.

We will use the Cochrane 'Risk of bias' assessment tool (Higgins 2017), assigning assessments of low, high or unclear risk of bias (for details see Appendix 2). We will evaluate individual bias items as described in the Cochrane Handbook for Systematic Reviews of Interventions according to the criteria and associated categorisations contained therein (Higgins 2017).

Summary assessment of risk of bias

We will present a 'Risk of bias' graph and a 'Risk of bias' summary figure.

We will distinguish between self‐reported and investigator‐assessed and adjudicated outcome measures.

We will consider the following self‐reported outcomes.

  • Erectile function

  • Sexual quality of life

  • Lower urinary tract symptoms

We will consider the following outcomes to be investigator assessed.

  • Cardiovascular mortality

  • Treatment withdrawal due to adverse events

  • Prostate‐related events

  • Socioeconomic effects

Risk of bias for a trial across outcomes

Some 'Risk of bias' domains, such as selection bias (sequence generation and allocation sequence concealment), affect the risk of bias across all outcome measures in a trial. In case of high risk of selection bias, we will mark all endpoints investigated in the associated trial as being at high risk. Otherwise, we will not perform a summary assessment of the risk of bias across all outcomes for a trial.

Risk of bias for an outcome within a trial and across domains

We will assess the risk of bias for an outcome measure by including all entries relevant to that outcome (i.e. both trial‐level entries and outcome‐specific entries). We will consider low risk of bias to denote a low risk of bias for all key domains; unclear risk to denote an unclear risk of bias for one or more key domains; and high risk to denote a high risk of bias for one or more key domains.
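The aggregation rule in the preceding paragraph can be sketched as follows. This is an illustration under stated assumptions, not the review's software: the function name is hypothetical, and where an outcome has both a high‐risk and an unclear‐risk key domain the sketch assumes that high risk takes precedence, which the text does not state explicitly.

```python
def outcome_risk_of_bias(key_domain_judgements: list[str]) -> str:
    """Combine per-domain judgements ('low', 'unclear', 'high') for one
    outcome in one trial into a single summary judgement."""
    allowed = {"low", "unclear", "high"}
    if not key_domain_judgements:
        raise ValueError("at least one key domain is required")
    if not set(key_domain_judgements) <= allowed:
        raise ValueError("judgements must be 'low', 'unclear' or 'high'")
    if "high" in key_domain_judgements:
        # High risk for one or more key domains.
        # Assumption: 'high' overrides 'unclear' when both occur.
        return "high"
    if "unclear" in key_domain_judgements:
        # Unclear risk for one or more key domains.
        return "unclear"
    # Low risk for all key domains.
    return "low"


if __name__ == "__main__":
    print(outcome_risk_of_bias(["low", "low", "unclear"]))
```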

Risk of bias for an outcome across trials and across domains

These are the main summary assessments that we will incorporate into our judgements about the certainty of the evidence in the 'Summary of findings' tables. We will define outcomes as at low risk of bias when most information comes from trials at low risk of bias; unclear risk when most information comes from trials at low or unclear risk of bias; and high risk when a sufficient proportion of information comes from trials at high risk of bias.

Measures of treatment effect

When at least two included trials are available for a comparison and a given outcome, we will attempt to express dichotomous data as a risk ratio (RR) or odds ratio (OR) with 95% confidence intervals (CIs). For continuous outcomes measured on the same scale (e.g. weight loss in kg), we will estimate the intervention effect using the mean difference (MD) with 95% CIs. For continuous outcomes that measure the same underlying concept (e.g. sexual quality of life) but use different measurement scales, we will calculate the standardised mean difference (SMD). We will express time‐to‐event data as a hazard ratio (HR) with 95% CIs.
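For illustration, the dichotomous effect measures named above can be computed from a single trial's 2×2 table as sketched below. This uses the standard large‐sample formulas on the log scale and is not taken from the protocol itself; the function names and the example counts are hypothetical, and no continuity correction is applied, so every cell is assumed to be non‐zero.

```python
import math
from statistics import NormalDist


def _ci(log_estimate: float, se: float, level: float) -> tuple[float, float]:
    # Wald confidence interval built on the log scale, then back-transformed.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(log_estimate - z * se), math.exp(log_estimate + z * se)


def risk_ratio(events_int: int, n_int: int, events_ctl: int, n_ctl: int,
               level: float = 0.95) -> tuple[float, float, float]:
    """Risk ratio with confidence interval (assumes non-zero event counts)."""
    rr = (events_int / n_int) / (events_ctl / n_ctl)
    se = math.sqrt(1 / events_int - 1 / n_int + 1 / events_ctl - 1 / n_ctl)
    lower, upper = _ci(math.log(rr), se, level)
    return rr, lower, upper


def odds_ratio(events_int: int, n_int: int, events_ctl: int, n_ctl: int,
               level: float = 0.95) -> tuple[float, float, float]:
    """Odds ratio with confidence interval (assumes all four cells non-zero)."""
    a, b = events_int, n_int - events_int
    c, d = events_ctl, n_ctl - events_ctl
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower, upper = _ci(math.log(or_), se, level)
    return or_, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 10/100 events versus 20/100 events.
    print(risk_ratio(10, 100, 20, 100))
    print(odds_ratio(10, 100, 20, 100))
```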

Unit of analysis issues

We will take into account the level at which randomisation occurred, such as cross‐over trials, cluster‐randomised trials, and multiple observations for the same outcome. If more than one comparison from the same trial is eligible for inclusion in the same meta‐analysis, we will either combine groups to create a single pair‐wise comparison or appropriately reduce the sample size so that the same participants do not contribute data to the meta‐analysis more than once (splitting the 'shared' group into two or more groups). While the latter approach offers some solution to adjusting the precision of the comparison, it does not account for correlation arising from the same set of participants being in multiple comparisons (Higgins 2011a).

We will attempt to reanalyse cluster‐randomised trials that have not appropriately adjusted for potential clustering of participants within clusters in their analyses. The variance of the intervention effects will be inflated by a design effect. Calculation of a design effect involves estimation of an intracluster correlation coefficient (ICC). We will obtain estimates of ICCs by contacting trial authors or imputing the ICC values by using either estimates from other included trials that report ICCs or external estimates from empirical research (e.g. Bell 2013). We plan to examine the impact of clustering using sensitivity analyses.
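The design‐effect adjustment mentioned above is commonly computed as 1 + (m − 1) × ICC, where m is the average cluster size, and the trial's sample size is divided by this value to obtain an effective sample size. The sketch below shows that approximate approach; the function names and the example values are hypothetical, and the protocol does not specify which reanalysis method will be used.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect for a cluster-randomised trial: 1 + (m - 1) * ICC."""
    if mean_cluster_size < 1:
        raise ValueError("mean cluster size must be at least 1")
    if not 0 <= icc <= 1:
        raise ValueError("ICC is assumed here to lie between 0 and 1")
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n: float, mean_cluster_size: float, icc: float) -> float:
    """Sample size after deflating by the design effect."""
    return n / design_effect(mean_cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical trial: 200 participants, clusters of 20, ICC of 0.05.
    print(design_effect(20, 0.05))
    print(effective_sample_size(200, 20, 0.05))
```

With an ICC of zero the design effect is 1 and the sample size is unchanged, which is a quick sanity check when imputing ICC values in sensitivity analyses.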

Dealing with missing data

If possible, we will obtain missing data from the authors of the included trials. We will carefully evaluate important numerical data such as screened, randomly assigned participants as well as intention‐to‐treat, and as‐treated and per‐protocol populations. We will investigate attrition rates (e.g. dropouts, losses to follow‐up, withdrawals) and will critically appraise issues concerning missing data and use of imputation methods (e.g. last observation carried forward).

In trials where the standard deviation (SD) of the outcome is not available at follow‐up or we cannot recreate it, we will standardise by the mean of the pooled baseline SD from those trials that reported this information.

Where included trials do not report means and SDs for outcomes and we are unable to obtain the necessary information from trial authors, we will impute these values by estimating the mean and variance from the median, range, and the size of the sample (Hozo 2005).
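A minimal sketch of this imputation is given below. It uses the approximations as they are commonly summarised from Hozo 2005; the sample-size thresholds (25 for the mean; 15 and 70 for the SD) and the function name are assumptions that should be checked against the original paper before use.

```python
import math


def estimate_mean_sd(minimum, median, maximum, n):
    """Approximate the mean and SD from the median, range, and sample size."""
    if n <= 25:
        # Small samples: weighted combination of minimum, median, and maximum.
        mean = (minimum + 2 * median + maximum) / 4
    else:
        # Larger samples: the median itself is used as the estimate of the mean.
        mean = median

    value_range = maximum - minimum
    if n <= 15:
        variance = ((minimum - 2 * median + maximum) ** 2 / 4 + value_range ** 2) / 12
        sd = math.sqrt(variance)
    elif n <= 70:
        sd = value_range / 4
    else:
        sd = value_range / 6
    return mean, sd


# Hypothetical trial reporting a median of 5 (range 0 to 10) in 50 participants.
mean, sd = estimate_mean_sd(0, 5, 10, 50)
```

These approximations assume roughly symmetric data, so trials with imputed values are natural candidates for the sensitivity analyses.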

We will investigate the impact of imputation on meta‐analyses by performing sensitivity analyses, and we will report for every outcome which trials had imputed SDs.

Assessment of heterogeneity

In the event of substantial clinical or methodological heterogeneity, we will not report trial results as the pooled effect estimate in a meta‐analysis.

We will identify heterogeneity (inconsistency) by visually inspecting the forest plots and by using a standard Chi² test with a significance level of α = 0.1 (Deeks 2017). In view of the low power of this test, we will also consider the I² statistic, which quantifies inconsistency across trials, to assess the impact of heterogeneity on the meta‐analysis (Higgins 2002; Higgins 2003).
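To illustrate how the two statistics relate, the sketch below computes Cochran's Q (the Chi² statistic) from inverse-variance weights and derives I² = max(0, (Q − df) / Q) × 100%. The function name and the example data are hypothetical, and the P value for Q, which would be compared with α = 0.1, is not computed here.

```python
def cochran_q_and_i2(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I² (as a percentage).

    effects   -- trial effect estimates (e.g. log risk ratios or mean differences)
    variances -- the corresponding within-trial variances
    """
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Three hypothetical trials with equal precision.
q, df, i2 = cochran_q_and_i2([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
```

In this made-up example Q is 8 on 2 degrees of freedom, so I² is 75%.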

When we find heterogeneity, we will attempt to determine the possible reasons for it by examining individual trial and subgroup characteristics.

Assessment of reporting biases

If 10 or more trials are included that investigate a particular outcome, we will use funnel plots to assess small‐trial effects. Several explanations may account for funnel plot asymmetry, including true heterogeneity of effect with respect to trial size, poor methodological design (and hence bias of small trials), and publication bias (Sterne 2017). We will therefore interpret the results carefully (Sterne 2011).

Data synthesis

We plan to undertake (or display) a meta‐analysis only if we judge participants, interventions, comparisons, and outcomes to be sufficiently similar to ensure an answer that is clinically meaningful. Unless good evidence shows homogeneous effects across trials of different methodological quality, we will primarily summarise low risk of bias data using a random‐effects model (Wood 2008). We will interpret random‐effects meta‐analyses with due consideration to the whole distribution of effects and present a prediction interval (Borenstein 2017a; Borenstein 2017b; Higgins 2009). A prediction interval needs at least three trials to be calculated and specifies a predicted range for the true treatment effect in an individual trial (Riley 2011). For rare events such as event rates below 1%, we will use Peto's odds ratio method, provided there is no substantial imbalance between intervention and comparator group sizes and intervention effects are not exceptionally large. In addition, we will perform statistical analyses according to the statistical guidelines presented in the Cochrane Handbook for Systematic Reviews of Interventions (Deeks 2017).
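As a hedged sketch of the random-effects model and the prediction interval, the code below uses the DerSimonian and Laird moment estimator of the between-trial variance, which is an assumption because the protocol does not name an estimator. The prediction interval is the pooled estimate ± t × √(τ² + SE²), with t taken from a t distribution with k − 2 degrees of freedom; this is why at least three trials are required. The critical value is passed in by the caller, and all names are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, and tau² (DL estimator)."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two trials are needed")
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau² to each within-trial variance.
    w_re = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_mu = math.sqrt(1 / sum(w_re))
    return mu, se_mu, tau2


def prediction_interval(mu, se_mu, tau2, t_crit):
    """95% prediction interval; t_crit is the 0.975 t quantile with k - 2 df."""
    half_width = t_crit * math.sqrt(tau2 + se_mu ** 2)
    return mu - half_width, mu + half_width


# Three hypothetical trials: k - 2 = 1 degree of freedom, so t_crit = 12.706.
mu, se_mu, tau2 = dersimonian_laird([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
low, high = prediction_interval(mu, se_mu, tau2, t_crit=12.706)
```

With only three trials the interval is very wide, because the t quantile with one degree of freedom is large.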

Subgroup analysis and investigation of heterogeneity

We expect the following characteristics to introduce clinical heterogeneity, and plan to carry out the following subgroup analyses limited to the primary outcomes including investigation of interactions (Altman 2003).

  • Participant age (less than 65 years versus ≥ 65 years).

  • Baseline testosterone serum level (less than 8 nmol/L versus ≥ 8 nmol/L) obtained between 7:00 and 11:00 AM.

  • Presence or absence of at least one of the metabolic syndrome components defined as the standard criteria at the time of the beginning of the trial (Grundy 2005).

These subgroup analyses are based on the following observations.

  • Sexual dysfunction is strongly associated with age (Feldman 1994; O'Leary 2003). Tolerability of testosterone in elderly men remains controversial, but may differ by age as testosterone treatment might be associated with increased cardiovascular risks (Basaria 2010; Snyder 2016). The age cut‐off is based on the WHO definition of old age (WHO 2002).

  • In clinical practice, testosterone supplementation therapy can be administered to people with borderline and low testosterone levels, but appears most meaningful in the latter group of individuals (Khera 2016; Lunenfeld 2013; Wang 2009).

  • The prevalence of erectile dysfunction was positively associated with the metabolic syndrome, and serum testosterone levels were also significantly lower in men with the metabolic syndrome (Garcia‐Cruz 2013). The magnitude of the effect of testosterone supplementation therapy may therefore differ in men with metabolic syndrome (Isidori 2005).
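The investigation of interactions between two subgroups can be sketched as a comparison of the two subgroup estimates: their difference is divided by the square root of the sum of their squared standard errors and referred to a standard normal distribution. This is an illustrative sketch with hypothetical names and numbers; ratio measures such as the RR would be compared on the log scale.

```python
import math
from statistics import NormalDist


def interaction_test(est_a, se_a, est_b, se_b):
    """Compare two independent subgroup estimates.

    Returns the difference, its z statistic, and the two-sided P value.
    """
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p


# Hypothetical pooled estimates for men aged under 65 years versus 65 years or older.
diff, z, p = interaction_test(0.5, 0.1, 0.2, 0.1)
```

Comparing the subgroup estimates directly in this way avoids drawing conclusions from one subgroup reaching statistical significance while the other does not.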

Sensitivity analysis

We plan to perform sensitivity analyses to explore the influence of the following factors (when applicable) on effect sizes by restricting analysis to the following.

  • Published trials.

  • Effect of risk of bias, as specified in the Assessment of risk of bias in included studies section.

  • Very long or large trials, in order to establish the extent to which they dominate the results.

  • Trials selected using the following filters: diagnostic criteria, imputation, language of publication, source of funding (industry versus other), or country.

We will also test the robustness of results by repeating the analyses using different measures of effect size (RR, OR, etc.) and different statistical models (fixed‐effect and random‐effects models).

Certainty of the evidence

We will present the overall certainty of the evidence for each outcome specified below according to the GRADE approach, which takes into account issues related not only to internal validity (risk of bias, inconsistency, imprecision, publication bias) but also to external validity, such as directness of results. Two review authors (JHJ, HWK, BR, JSL, or HSY) will independently rate the certainty of the evidence for each outcome. Any differences in assessment will be resolved by discussion or by consulting a third review author (PD).

We will include an appendix entitled 'Checklist to aid consistency and reproducibility of GRADE assessments' to help with standardisation of the 'Summary of findings' tables (Meader 2014). Alternatively, we will use GRADEpro Guideline Development Tool (GDT) software and will present evidence profile tables as an appendix (GRADEpro GDT 2015). We will present results for the outcomes as described in the Types of outcome measures section. If meta‐analysis is not possible, we will present the results in a narrative format in the 'Summary of findings' table. We will justify all decisions to downgrade the certainty of the evidence using footnotes, and will make comments to aid the reader's understanding of the Cochrane Review where necessary.

'Summary of findings' table

We will present a summary of the evidence in a 'Summary of findings' table. This will provide key information about the best estimate of the magnitude of the effect, in relative terms and as absolute differences, for each relevant comparison of alternative management strategies, numbers of participants and trials addressing each important outcome, and a rating of overall confidence in effect estimates for each outcome. We will create the 'Summary of findings' table based on the methods described in the Cochrane Handbook for Systematic Reviews of Interventions (Schünemann 2017), using the Review Manager 5 table editor (RevMan 2014). We will report the following outcomes, listed according to priority.

  1. Erectile function

  2. Sexual quality of life

  3. Cardiovascular mortality

  4. Treatment withdrawal for any reason

  5. Prostate‐related events

  6. Lower urinary tract symptoms

  7. Socioeconomic effects