## What is a rare event in flow cytometry?

Rare event analysis started more than thirty years ago when Cupp aimed to count fetal red blood cells in maternal circulation [1]. Nowadays, rare event analysis has assumed growing importance for the diagnosis and monitoring of immunological and hematological disorders. For instance, rare circulating tumor cells (CTC) or a rise in rare circulating endothelial cells (CEC) are useful as disease indicators. The presence and function of rare cells can give relevant information about the status and stage of disease, but these cells may be difficult to detect given their low frequency. A cell population is considered rare when it has a frequency of 0.01% or less [2, 3].

To perform rare event analysis in research settings using flow cytometry, the quantity of biological material, the number of events to acquire, and the markers that identify the population of interest should all be considered. Moreover, “next-generation” instruments such as the Invitrogen Attune NxT, which combine high acquisition speed with high sensitivity, can allow easy detection as well as phenotypic and functional characterization of rare cells.

This guide covers the three phases of rare cell analysis by flow cytometry: the pre-analytical phase, the analytical phase, and data analysis.

## Pre-analytical phase

The pre-analytical phase involves the decision of which kind of biological matrix should be used to identify rare cells on the basis of the expected frequency of the rare population and the number of events that need to be acquired. Figure 1 describes how Poisson statistics defines the probability that a number of events will occur in a fixed interval of time/space/volume [4].

The proportion of events that satisfy a given criterion (e.g., they are positive for a marker), $P$, is defined as
$P = R/N$
where
$N$ = total events
$R$ = events that meet the given criterion
and
$0 \le P \le 1$
The variance ($V$) of $R$ is defined as
$V(R) = NP(1-P)$
The standard deviation (SD) is
$SD = \sqrt{V} = \sqrt{NP(1-P)}$
and the coefficient of variation is
$CV = 1/\sqrt{V}$
For a rare population, $V \approx R$, so this is equivalent to $CV = SD/R$.

Figure 1. Poisson statistics defines the probability that a number of events will occur in a fixed interval of time/space/volume.
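The formulas above can be evaluated directly. The following is a minimal Python sketch, not part of the original guide, and the function name `rare_event_stats` is an illustrative assumption. It computes $R$, $V$, SD, and CV (expressed as a percentage) for a given number of acquired events and an expected proportion.

```python
import math


def rare_event_stats(total_events: int, proportion: float) -> dict:
    """Counting statistics for a population at a given proportion.

    Uses the definitions given in the text:
    R = N * P, V = N * P * (1 - P), SD = sqrt(V), CV = 1 / sqrt(V).
    """
    positives = total_events * proportion                      # R
    variance = total_events * proportion * (1 - proportion)    # V
    sd = math.sqrt(variance)                                   # SD
    cv_percent = 100 / sd                                      # CV, in percent
    return {"R": positives, "V": variance, "SD": sd, "CV%": cv_percent}


if __name__ == "__main__":
    # Example: a population expected at 0.01% (P = 0.0001)
    for n in (100_000, 1_000_000, 4_010_000, 10_000_000):
        s = rare_event_stats(n, 0.0001)
        print(f"N={n:>10,}  R={s['R']:8.2f}  V={s['V']:8.2f}  "
              f"SD={s['SD']:6.2f}  CV={s['CV%']:5.2f}%")
```

The printed CV shows how slowly precision improves: the CV falls with the square root of the number of positive events, so halving it requires four times as many acquired events.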

Good experimental practice suggests keeping the CV below 5%, so the number of events to acquire should be chosen to keep the CV as low as possible [6]. For instance, to identify a cell population that represents 0.01% of events, at least four million events are needed to bring the CV below 5% (Table 1), so acquiring about five million events is a sensible target.

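| Acquired events (N) | 100,000 | 500,000 | 1,000,000 | 4,010,000 | 10,000,000 | 20,000,000 |
|---|---|---|---|---|---|---|
| Positive (R) | 10.00 | 50.00 | 100.00 | 401.00 | 1000.00 | 2000.00 |
| Proportion (P) | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 |
| Variance (Var) | 10.00 | 50.00 | 99.99 | 400.96 | 999.90 | 1999.80 |
| Standard deviation (SD) | 3.16 | 7.07 | 10.00 | 20.02 | 31.62 | 44.72 |
| Coefficient of variation (CV, %) | 31.62 | 14.14 | 10.00 | 4.99 | 3.16 | 2.24 |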

Table 1. Number of events needed to obtain a CV below 5% in order to detect a population that is 0.01%. Use the rare events calculator tool.
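The same relationship can be inverted to ask how many events are needed for a target CV. Rearranging $CV = 1/\sqrt{NP(1-P)}$ gives $N \ge 1/(CV^2 P(1-P))$. The sketch below is an illustration only; the function name `min_events_for_cv` is an assumption, and it is not the calculator tool mentioned above.

```python
import math


def min_events_for_cv(proportion: float, target_cv: float) -> int:
    """Smallest N such that 1 / sqrt(N * P * (1 - P)) <= target_cv.

    `target_cv` is a fraction (0.05 means 5%).
    """
    if not 0 < proportion < 1:
        raise ValueError("proportion must be strictly between 0 and 1")
    if target_cv <= 0:
        raise ValueError("target_cv must be positive")
    # Rearranged from CV = 1 / sqrt(N * P * (1 - P))
    return math.ceil(1 / (target_cv ** 2 * proportion * (1 - proportion)))


if __name__ == "__main__":
    # Population at 0.01%, CV kept at or below 5%
    print(min_events_for_cv(0.0001, 0.05))
```

For a population at 0.01% and a 5% CV, the result is roughly four million events, in line with the 4,010,000 column of Table 1.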

Sometimes, because of a paucity of blood or different experimental settings, the final number of events is not sufficient to fulfill the numeric criteria given by Poisson statistics. In this case, “positivity” can be determined by comparing the experimental samples against a set of control samples, including adequate negative controls, using standard statistical tools to compare the frequencies [7]. For this reason, maximizing the signal-to-noise ratio is fundamental to distinguish the signal of the population of interest from the background.

In addition, given that it is crucial to acquire a high number of events, sample concentration and flow rate are critical parameters and should be adjusted to shorten acquisition time while avoiding an increase in coincidence (see BioProbes article 71: Tools and Strategies for Rare-Event Detection Using Flow Cytometry). Coincident events are indeterminate and carry no usable information; moreover, coincidence injects ‘polluted’ events into the experiment. A singlet gate on a bivariate plot combining height, area, and width removes most doublets, but no method removes them completely.

Thus, to keep data integrity as high as possible, it is best to keep coincident events to a minimum. Coincidence also affects absolute counts, because an event that contains two or more particles is counted as a single particle. To understand how coincidence is calculated, some definitions that describe how well an instrument performs need to be clarified:

1. The maximum event rate is the number of events the instrument can record per second before it starts counting two cells at once;
2. The flow rate defines how fast the sample flows through the machine; and
3. The sample concentration is the number of cells in a given volume.
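The three quantities above are linked: the theoretical event rate is the sample concentration multiplied by the volumetric flow rate. The following Python sketch illustrates the unit conversion; the function names and the example values (4 million cells/mL, 500 µL/min) are assumptions chosen for illustration, not instrument settings taken from this guide.

```python
def event_rate_per_second(cells_per_ml: float, flow_rate_ul_per_min: float) -> float:
    """Theoretical event rate (events/s) before any coincidence losses."""
    cells_per_ul = cells_per_ml / 1000.0              # 1 mL = 1000 uL
    return cells_per_ul * flow_rate_ul_per_min / 60.0  # per minute -> per second


def max_concentration_per_ml(target_rate_per_s: float, flow_rate_ul_per_min: float) -> float:
    """Highest concentration (cells/mL) that stays at or below a target event rate."""
    return target_rate_per_s * 60.0 / flow_rate_ul_per_min * 1000.0


if __name__ == "__main__":
    # Hypothetical example values
    print(event_rate_per_second(4e6, 500))      # events per second
    print(max_concentration_per_ml(30_000, 500))  # cells per mL
```

A calculation of this kind helps decide whether to dilute the sample or lower the flow rate so that the event rate stays below the instrument's specified maximum.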

There are two methods to calculate the maximum event rates of a flow cytometer. A standard, more accurate method is to define the maximum event rate as the analysis rate where 10% of all events are coincident (according to Poisson statistics). This is the method of design for the Attune NxT flow cytometer. Another method is to cite the maximum event rate of the electronic data collection system. A system designed with this method is agnostic to coincidence and thus, can result in a very high rate of coincidence at the instrument’s specified maximum event rate.

Given this information, an instrument designed using the 10% coincidence method might have a specified maximum event rate of 30,000 events/s. The expected event rate is lower than the theoretical event rate because coincident events count only as a single event: at 30,000 events/s there is, by design, a 10% difference between the two rates. Note that coincidence increases as a function of analysis rate. Abort rates are typically higher on slow flow analyzers; however, high abort rates are not relevant on analyzers, because analyzers measure the frequencies of populations and not their absolute yield (as cell sorters do). So a 10% abort rate at over 35,000 events/s might be reasonable, whereas a 10% abort rate at over 10,000 events/s probably is not.
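To illustrate how coincidence grows with analysis rate, the sketch below uses a simple Poisson arrival model: if events arrive at rate $\lambda$ and two events closer than a time window $\tau$ cannot be resolved, the fraction of events that overlap with at least one other is $1 - e^{-\lambda\tau}$. The window $\tau$ here is a hypothetical parameter back-calculated from the "10% coincidence at the maximum event rate" definition; it is an assumption of this sketch, not a published instrument specification.

```python
import math


def window_from_spec(max_rate_per_s: float, coincidence_at_max: float = 0.10) -> float:
    """Hypothetical unresolved time window (s) implied by a coincidence spec."""
    return -math.log(1.0 - coincidence_at_max) / max_rate_per_s


def coincidence_fraction(rate_per_s: float, window_s: float) -> float:
    """Poisson probability that an event overlaps with at least one other."""
    return 1.0 - math.exp(-rate_per_s * window_s)


if __name__ == "__main__":
    window = window_from_spec(30_000)  # assumed: 10% coincidence at 30,000 events/s
    for rate in (5_000, 10_000, 20_000, 30_000, 60_000):
        print(f"{rate:>6} events/s -> {100 * coincidence_fraction(rate, window):.1f}% coincident")
```

Under this model, running well below the specified maximum rate reduces coincidence roughly in proportion, which is the trade-off between acquisition time and data integrity described above.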

Once the number of events that need to be acquired and the purpose of the experiment have been established, the quantity of biological material needs to be accounted for. The amount of blood required should therefore be based on how many events need to be analyzed. To perform a deep rare event analysis, in which the phenotype and function of, for example, invariant Natural Killer T (iNKT) cells or antigen-specific T cells are analyzed, at least 30 mL of blood should be drawn from patients.

Moreover, during this step the markers needed for cellular identification must be chosen, and the multicolor panel should be designed according to best practices in panel design [8]. In particular, a channel should be reserved for the viability marker and/or DUMP channel (see the Commentary section).

## Analytical phase

The rare population of interest may or may not be enriched, depending on the experimental endpoint. Enrichment can be performed by Ficoll-Paque isolation to obtain PBMC or by buffy coat preparation to isolate peripheral blood cells. In addition, quantitative pre-enrichment of target cells via magnetic cell separation, which allows rapid processing of large samples (10⁶ to 10⁹ cells), is an excellent approach to increase the relative frequency of rare antigen-specific T cells, iNKT cells, or circulating endothelial cells [9]. The enrichment can be based on known markers that characterize the rare cell population, on tetramers, or on specific enrichment of cytokine-secreting cells [10-13].

Rare antigen-specific CD4 T cells could be pre-enriched by using anti-CD154 mAb. By including a fluorescently conjugated CD154-specific antibody during stimulation, the assay is fully compatible with intracellular cytokine staining and can be used for stimulations for up to 24 hours. Finally, another molecule can be used for enrichment of activated CD4+, CD8+, or CD1d+ T cells, namely CD137 (4-1BB), a member of the TNFR superfamily, which has been shown to be expressed following 16–24 hours of stimulation [6]. The combined analysis of CD137 and CD154 following a short-term (6 hours) stimulation might be optimal to detect in parallel conventional T cells and regulatory T cells (Tregs) reacting against the same antigen [14].

Enrichment can have negative effects if rare cells are lost; the goal is to remove unwanted cells while retaining the rare cells of interest [15]. Another strategy to facilitate the detection of rare cells is to increase the number of antigen-specific T cells by in vitro expansion. However, the expansion of a single T cell is affected not only by its functional status (e.g., naïve, memory, anergic) but also by the presence of other reactive or accessory cells. It is therefore difficult to infer the frequency of a given cell population in the original sample from the frequencies obtained after prolonged in vitro culture. Similarly, the phenotype and function of the expanded cells may be significantly altered by culture conditions [16, 17] (Table 2).

### Direct enrichment method

MHC-multimers
T cells detected
• Naive
• Memory
• Treg
Pros
• Activation independent
• High analytical specificity
Cons
• Knowledge/availability of MHC/epitope
• Restricted to single epitope specificities
• No functional status of the cells
• Detection of low-affinity cells difficult

### Indirect enrichment methods

Cytokine secretion assay (CSA), CD154 enrichment (6h), CD137 enrichment (16h)
T cells detected
• Memory
• Naive
• Memory
Pros
• Isolation of cytokine producing subsets
• Detection of the whole antigen-specific CD4+ T cell response
• Fast
• Detection of the whole antigen-specific T cell response
Cons
• Restricted to few selected cytokine producers
• Mainly restricted to CD4+ T cells
• Not compatible with cytokine analysis
• No differentiation between Treg and Tconv cells

Table 2. Enrichment methods for the detection of rare antigen-specific T cells. The table presents approaches to enrich antigen-specific T cells either directly, by MHC-multimers (top), or indirectly, via cytokine secretion assay, CD154 and/or CD137 enrichment (bottom).

Two protocols for the detection of rare-event populations of interest to immunologists and oncologists are described below.

### Protocol 1. Detection of iNKT cells by flow cytometry

iNKT cells are innate-like lymphocytes uniquely identified by the expression of an invariant Vα24Jα18 TCR; they recognize as cognate antigens self and foreign lipids presented by CD1d. iNKT cells are characterized by the expression of markers typical of T lymphocytes (CD3, CD4, CD8) and of NK cells (CD161, CD56), and they represent 0.001-0.1% of T cells in humans [18]. On the basis of CD4 and CD8 expression, mature iNKT cells can be divided into functionally distinct subsets, i.e., CD4+CD8−, CD4−CD8−, and CD4−CD8+ [19, 20].

#### Prepare cells for the identification of iNKT and Mucosal Associated Invariant T (MAIT) cells among PBMCs

1. Collect at least 30 mL of blood in anticoagulant (EDTA or heparin) and proceed within 4 hours to the isolation of peripheral blood mononuclear cells (PBMC) according to standard procedures.
WARNING: Use freshly collected blood. Blood collected the day before could give unreliable results due to receptor shedding, receptor downregulation, or cell death.
2. Dilute blood 1:1 with DPBS and carefully layer 35 mL of diluted cell suspension over 15 mL of Ficoll-Paque in a 50 mL conical tube.
WARNING: Do this step very slowly so as not to break the Ficoll surface.
3. Centrifuge at 400×g for 30 minutes at 20°C in a swinging-bucket rotor without brake.
4. Aspirate the upper layer leaving the mononuclear cell layer (lymphocytes, monocytes, and thrombocytes) undisturbed at the interphase.
5. Carefully transfer the mononuclear cell layer to a new 50 mL conical tube.
6. Fill the conical tube with DPBS, mix, and centrifuge at 300×g for 5 minutes at 20°C. Carefully remove supernatant completely.
WARNING: During this step, cell clumps may form. To avoid clumps, add 0.5% bovine serum albumin (BSA) or fetal bovine serum (FBS) to the DPBS and filter the collected PBMC.
7. Repeat step 6.
8. Count cells using Türk's solution or trypan blue.

#### Stain with mAbs identifying iNKT cells

1. Prepare tubes for each single stained control, tubes for FMO controls and tubes containing all the monoclonal antibodies (mAbs) needed to identify cells of interest.
2. Put the same number of cells in all the tubes (at least 5 million cells in each tube).
WARNING: Resuspend the cells well with a pipette and filter them if necessary.
3. Wash cells with DPBS+0.5% FBS. Centrifuge at 300×g for 5 minutes at 20°C and discard the supernatant.
4. Add the mAbs to the tubes (6B11, TCR 7.2, CD3, CD4, CD8, CD161), each at a previously titrated concentration; choose the fluorochromes according to step 8. Vortex the tubes and incubate for 20 minutes at 20°C in the dark.
5. Wash cells with DPBS+0.5% FBS. Centrifuge at 300×g for 5 minutes at 20°C and discard the supernatant.
6. Resuspend cells to a concentration of 4-10 × 10⁶ cells/mL and acquire immediately.
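The resuspension volume for the final step follows from the cell count and the target concentration (volume = cells / concentration). The short Python sketch below shows the arithmetic for the 4-10 × 10⁶ cells/mL range given in the protocol; the function names are illustrative assumptions.

```python
def resuspension_volume_ml(total_cells: float, target_cells_per_ml: float) -> float:
    """Volume (mL) needed to bring `total_cells` to the target concentration."""
    if total_cells <= 0 or target_cells_per_ml <= 0:
        raise ValueError("cell number and concentration must be positive")
    return total_cells / target_cells_per_ml


def volume_range_ml(total_cells: float, low_conc: float = 4e6, high_conc: float = 10e6):
    """(minimum, maximum) volume in mL for the protocol's concentration range."""
    return (resuspension_volume_ml(total_cells, high_conc),
            resuspension_volume_ml(total_cells, low_conc))


if __name__ == "__main__":
    # A tube containing 5 million stained cells
    low_ml, high_ml = volume_range_ml(5e6)
    print(f"Resuspend in {low_ml:.2f} to {high_ml:.2f} mL")
```

For a tube of 5 million cells, this gives a volume between 0.5 and 1.25 mL.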

### Detection of circulating endothelial cells by flow cytometry

CEC are mature endothelial cells that detach from the intima monolayer in response to endothelial damage [21]. Endothelial dysfunction can take place during the development and progression of different cardiovascular disorders. A major problem in finding these cells is that a unique marker, or combination of markers, that identifies circulating endothelial cells and their progenitors has not yet been identified. At present, the most common markers used for this purpose are DNA, CD34, CD45, CD133, CD31, CD146, and CD309. In particular, circulating endothelial cells are defined as events that are DNA+, CD34+, CD45-, CD31+, CD133-, CD309+. Endothelial progenitor cells (EPC) are defined as DNA+, CD34+, CD45dim, CD31+, CD133+, CD309+/- [22-24].
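The two phenotype definitions above can be written as a simple Boolean rule. The Python sketch below assumes that each event has already been assigned a per-marker call of `"neg"`, `"dim"`, or `"pos"` by upstream gating; that thresholding step, the call labels, and the function names are assumptions of this example, not part of the protocol.

```python
from typing import Mapping

# Phenotypes as stated in the text (marker -> required call)
CEC_PHENOTYPE = {"DNA": "pos", "CD34": "pos", "CD45": "neg",
                 "CD31": "pos", "CD133": "neg", "CD309": "pos"}
# CD309 may be positive or negative on EPC, so it is not constrained here
EPC_PHENOTYPE = {"DNA": "pos", "CD34": "pos", "CD45": "dim",
                 "CD31": "pos", "CD133": "pos"}


def matches(calls: Mapping[str, str], phenotype: Mapping[str, str]) -> bool:
    """True if every marker required by the phenotype has the required call."""
    return all(calls.get(marker) == level for marker, level in phenotype.items())


def classify_event(calls: Mapping[str, str]) -> str:
    """Return 'CEC', 'EPC', or 'other' for one event's marker calls."""
    if matches(calls, CEC_PHENOTYPE):
        return "CEC"
    if matches(calls, EPC_PHENOTYPE):
        return "EPC"
    return "other"


if __name__ == "__main__":
    event = {"DNA": "pos", "CD34": "pos", "CD45": "dim",
             "CD31": "pos", "CD133": "pos", "CD309": "neg"}
    print(classify_event(event))
```

Writing the definitions as data makes it easy to adjust them if a reduced panel is used or if the marker combination is revised.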

### Protocol 2. Identification of circulating endothelial cells among peripheral blood cells

1. Collect 30 mL of blood in anticoagulant (EDTA or heparin) and proceed within 4 hours to the isolation of peripheral blood cells (buffy coat) according to standard procedures.
2. Centrifuge at 200×g for 20 minutes at 20°C in a swinging-bucket rotor.
3. Aspirate the upper layer (plasma) leaving the peripheral cell layer (lymphocytes, monocytes, and thrombocytes) undisturbed at the interphase.
4. Carefully transfer the peripheral cell layer to a new 50 mL conical tube.
WARNING: This should be done very slowly and with a small volume pipette in order to get the majority of the cells and to avoid collecting too many red blood cells.
5. Fill the conical tube with red blood cell lysing solution, mix, and incubate for 30 minutes.
6. Centrifuge at 300×g for 5 minutes at 20°C. Carefully remove supernatant completely and repeat.
7. Count cells using Türk's solution or trypan blue.

#### Stain cells with mAbs identifying CEC and EPC cells

1. Prepare tubes for each single stained control, tubes for FMO controls and tubes containing all the monoclonal antibodies (mAbs).
2. Put at least 5 million cells in all the tubes.
3. Wash cells with DPBS+0.5% FBS. Centrifuge at 300×g for 5 minutes at 20°C and discard the supernatant.
4. Add the titrated mAbs to the tubes (use at least CD45, CD34, SYTO16, and CD133). Vortex the tubes and incubate for 20 minutes at 20°C in the dark.
5. Wash cells with DPBS+0.5% FBS. Centrifuge at 300×g for 5 minutes at 20°C and discard the supernatant.
6. Resuspend cells to a concentration of 4-10 × 10⁶ cells/mL and acquire immediately.

#### Set up flow cytometer

1. Select all the channels needed for the analysis and switch off all the others in order to save computer memory.
2. Set up all the dot plots and histograms needed for the analysis and to check fluorescence spillover.
3. Set the PMT voltages for each channel using single-stained controls, and create a compensation matrix.
4. Acquire single-stained controls and FMO tubes, and check gates and gating strategy.
5. Acquire the fully stained tube and perform the analysis. The gating strategy used to analyze the phenotype of iNKT and MAIT cells is shown in Figure 3, while that used to analyze CEC and EPC is shown in Figure 4.

Figure 3. Gating strategy used to identify iNKT cells and MAIT cells in the same panel. Lymphocytes were selected on the basis of physical parameters (FSC-H vs SSC-H), doublets were removed using a bivariate plot (FSC-A vs FSC-H), and T lymphocytes were selected on the basis of CD3. Within this population, TCR Vα7.2 was used to identify MAIT cells, while TCR Vα24JQ (detected with the 6B11 mAb) identified iNKT cells. MAIT cells were defined as CD3+, TCR Vα7.2+, CD161++, CD8+. Among iNKT cells, the CD4+, CD8+, and double-negative populations were identified, and the expression of CD161 was analyzed in each subpopulation. At least 5 million cells were acquired.

Figure 4. Gating strategy used to identify circulating endothelial cells (CEC) and endothelial progenitor cells (EPC) in the same panel. Debris, monocytes and dead cells were excluded by the use of an electronic gate and the dump channel, containing cells identified by mAbs against CD14 and a viability marker, i.e. LIVE/DEAD. CEC and EPC were identified on the basis of the expression of CD34, CD45 and CD133: CEC were defined as CD45dim, CD34+ and CD133− while EPC were defined as CD45−, CD34+ and CD133+. The expression of CD309 (VEGFR-2, KDR) was detected among EPC and CEC.

## Data analysis

Even if unused parameters have been turned off during acquisition, the data files tend to be huge because of the number of events and parameters acquired. For this reason, data analysis benefits from powerful and fast hardware (at least 8 GB of RAM and at least 500 GB of hard disk space). Data compensation and data analysis can be performed either with the acquisition software itself or after acquisition using FlowJo or FCS Express. However, given the high-throughput nature of flow cytometry and the capability to acquire millions of events while measuring many parameters at once, it is no longer practical to analyze the data using only classic, manual analysis techniques. Computational flow cytometry therefore provides a set of tools to analyze, visualize, and interpret large amounts of cellular data in a more automated and unbiased way [25, 26].

There are several software programs (based on principal component analysis, PCA) that use visualization techniques as alternatives to the traditional two-dimensional dot plots. The first paper on the use of PCA for analyzing flow cytometry data was published in 2007; this approach was applied to an eight-color cytofluorimetric analysis of the virgin and memory T-cell compartments in donors of different ages (young, middle-aged, and centenarians) [27]. Nowadays, several tools such as SPADE, FlowMap, FlowSOM, viSNE, PhenoGraph, Scaffold map, and DREMI-DREVI are available on the most commonly used data analysis platforms. These approaches are mainly dimensionality reduction- or clustering-based techniques (reviewed in [26]). An example of t-SNE analysis applied to CEC detection is shown in Figure 5.

Very rare cell types can be difficult to identify because many clustering algorithms may mistake them for noise. To identify all relevant populations, it may be necessary to perform exhaustive gating, which results in strong over-clustering, and then select only those features related to a phenotype. With traditional clustering algorithms, it is recommended to ensure that only relevant markers are used for clustering. Markers that vary little or that indicate properties not relevant for cell-type identification (for example, activation markers) are best left out, as these only contribute noise to the similarity calculation [26].
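The marker-selection advice above can be sketched as a small pre-processing step run before any clustering tool. The Python example below keeps only markers whose variance exceeds a threshold and drops markers that the analyst lists explicitly (for example, activation markers). The function name, the threshold value, and the assumption that intensities are already compensated and transformed to a comparable scale are all illustrative; this is not the procedure of any specific tool named above.

```python
from statistics import pvariance
from typing import Iterable, Mapping, Sequence


def select_clustering_markers(
    events: Mapping[str, Sequence[float]],
    min_variance: float,
    exclude: Iterable[str] = (),
) -> list:
    """Return marker names worth passing to a clustering algorithm.

    `events` maps each marker name to its per-event (transformed) intensities.
    Markers in `exclude` or with population variance below `min_variance`
    are left out.
    """
    excluded = set(exclude)
    selected = []
    for marker, values in events.items():
        if marker in excluded:
            continue  # not relevant for cell-type identification
        if pvariance(values) < min_variance:
            continue  # varies too little to separate populations
        selected.append(marker)
    return selected


if __name__ == "__main__":
    # Tiny made-up data set, for illustration only
    data = {
        "CD3": [0.1, 2.0, 3.5, 0.2],
        "CD69": [1.0, 2.5, 0.3, 3.0],        # activation marker, excluded by choice
        "FlatMarker": [1.0, 1.01, 0.99, 1.0],  # hypothetical marker with almost no spread
    }
    print(select_clustering_markers(data, min_variance=0.1, exclude=["CD69"]))
```

The variance threshold depends on the data transformation used, so it should be chosen by inspecting the marker distributions rather than fixed in advance.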

## Commentary – tips and considerations

### Troubleshooting - Cells cannot be distinguished from the background

Maximizing the signal-to-noise ratio is fundamental in distinguishing the signal of the population of interest from the background. Setting a threshold to exclude debris can help, along with a gating strategy that removes dead cells (identified by a viability marker, such as an amine-reactive dye) and excludes doublets, aggregates, and debris, while also using a “DUMP” channel containing antibodies that identify antigens on cells that are of no interest. Furthermore, the parameter “time of acquisition” should be monitored to remove the event bursts caused by clogs or other transient problems during the acquisition (Figure 6).

Of note, two other factors to consider in order to optimize the sensitivity of an assay are the cleanliness of the instrument and the integrity of the sample. It is important to make sure that the instrument and fluids used are clean and free of particles that could contribute falsely to the rare population.
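The time-based cleanup mentioned above can be sketched as follows: events are binned by acquisition time, and bins whose event count deviates strongly from the typical (median) count are discarded. This is a minimal Python illustration; the bin width, the tolerance, and the function name are assumptions, and dedicated quality-control tools may use different criteria.

```python
from collections import Counter
from statistics import median
from typing import List, Sequence


def stable_time_mask(
    times_s: Sequence[float],
    bin_width_s: float = 1.0,
    tolerance: float = 0.5,
) -> List[bool]:
    """Flag events acquired while the event rate was stable.

    Returns one boolean per event: True if the event falls in a time bin whose
    count is within `tolerance` (as a fraction) of the median bin count.
    """
    if not times_s:
        return []
    bins = [int(t // bin_width_s) for t in times_s]
    counts = Counter(bins)                     # events per time bin
    typical = median(counts.values())          # typical events per bin
    good_bins = {b for b, c in counts.items()
                 if abs(c - typical) <= tolerance * typical}
    return [b in good_bins for b in bins]


if __name__ == "__main__":
    # Made-up acquisition: 10 events/s, with a burst of 50 events in second 2
    times = []
    for second in range(5):
        n = 50 if second == 2 else 10
        times += [second + i / n for i in range(n)]
    mask = stable_time_mask(times)
    print(f"kept {sum(mask)} of {len(mask)} events")
```

Note that the last bin of an acquisition is often only partially filled and may be discarded by this rule, which is usually acceptable when millions of events are acquired.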

### Time considerations

It takes one and a half hours to obtain PBMC from whole blood, and the same time to obtain peripheral blood cells starting from buffy coat. Staining with mAbs takes 20 minutes, followed by 5 minutes for washing. Hence, the time from blood draw to cells ready for acquisition is around two and a half hours, considering the number of tubes that need to be stained and the required rounds of washing. The acquisition time depends on the instrument, the rate and volume of acquisition, and the carryover between samples (additional washes may be required between two samples).

Because it is crucial to acquire a high number of events to detect rare-cell populations, sample concentration and flow rate are critical parameters that can shorten acquisition time; the Attune NxT can acquire at up to 35,000 events/s. For example, to measure with a 1% CV a cell type that makes up 1% of the sample (e.g., basophils) at a sample concentration of about 4 million, the Attune NxT takes only 33 seconds to acquire the sample. To measure with a 1% CV a cell type that makes up 0.1% of the sample (e.g., NKT cells, iNKT cells, dendritic cells, circulating endothelial cells) at a sample concentration of about 4 million, the Attune NxT takes only 4 minutes, 10 times faster than a hydrodynamic flow cytometer. The time taken to acquire 10 iNKT events varies between 3 and 14 minutes. This implies that a number of samples can be analyzed in the same day, resulting in significant savings in time, labor, and associated costs.
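Acquisition times of this kind can be estimated by combining the number of events needed for a target CV with the event rate. The Python sketch below uses $N = 1/(CV^2 P(1-P))$ and divides by an assumed sustained event rate; the function name is illustrative, and because the result depends on the rate and rounding assumed, it will not necessarily reproduce the figures quoted above exactly.

```python
def acquisition_time_s(proportion: float, target_cv: float, events_per_second: float) -> float:
    """Estimated time (s) to acquire enough events for the target CV.

    `target_cv` is a fraction (0.01 means 1%); `events_per_second` is the
    sustained event rate assumed for the run.
    """
    if not 0 < proportion < 1:
        raise ValueError("proportion must be strictly between 0 and 1")
    if target_cv <= 0 or events_per_second <= 0:
        raise ValueError("target_cv and events_per_second must be positive")
    events_needed = 1.0 / (target_cv ** 2 * proportion * (1.0 - proportion))
    return events_needed / events_per_second


if __name__ == "__main__":
    rate = 35_000  # events/s, the upper acquisition speed mentioned in the text
    for label, p in (("1% population", 0.01), ("0.1% population", 0.001)):
        t = acquisition_time_s(p, 0.01, rate)
        print(f"{label}: about {t:.0f} s ({t / 60:.1f} min)")
```

A tenfold rarer population needs roughly ten times as many events, and therefore ten times the acquisition time at the same event rate.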

Figure 6. Gating strategy - Useful guidelines. Step by step guidelines to perform an accurate gating strategy.

## Acknowledgement

We would like to express our sincere thanks to Sara De Biasi and Andrea Cossarizza, University of Modena and Reggio Emilia School of Medicine, Italy, for their valued contributions to this guide.