**Question:** How is the "Log2 fold change" value calculated in Cell Ranger and Loupe Cell Browser?

**Answer: **The "Log2 fold change" value reported in Cell Ranger and in the gene table in Loupe Cell Browser is the ratio of the normalized mean gene UMI counts in each cluster/group relative to all other clusters/groups for comparison.

In Loupe Cell Browser, the comparison is between a cluster/group versus the rest of the dataset (globally distinguishing) or versus all other checked groups in a selected category (locally distinguishing).

There are 5 main steps in calculating the Log2 fold change:

Assume n total cells

* Calculate the total number of UMIs in each cell

counts_per_cell: n values

* Calculate a size factor for each cell by dividing the cell's total UMI count by the median of those n counts_per_cell

counts_per_cell / median(counts_per_cell): n values

* Calculate a size factor for each cluster by summing the size factors of each cell in that cluster.

* Normalize the UMI counts for each gene in each cluster by dividing by the size factor for that cluster

* Calculate fold change per gene by dividing the normalized total UMI counts for that gene in cluster1 by cluster2

For more specifics on how the value is calculated, please see the Cell Ranger source code for the differential expression analysis module:

# Introduce a pseudocount into log2(fold_change) 'log2_fold_change': np.log2((1+gene_sums_a)/(1+size_factor_a)) - np.log2((1+gene_sums_b)/(1+size_factor_b))

If you are comfortable working with R here are is an example command line for computing the fold change:

library(Matrix)

# Load the matrix in R

matrix_dir = "~/Downloads/pbmc_filtered_feature_bc_matrix/"

barcode.path <- paste0(matrix_dir, "barcodes.tsv.gz")

features.path <- paste0(matrix_dir, "features.tsv.gz")

matrix.path <- paste0(matrix_dir, "matrix.mtx.gz")

mat <- readMM(file = matrix.path)

feature.names = read.delim(features.path,

+ header = FALSE,

+ stringsAsFactors = FALSE)

barcode.names = read.delim(barcode.path,

+ header = FALSE,

+ stringsAsFactors = FALSE)

colnames(mat) = barcode.names$V1

rownames(mat) = feature.names$V1

# Calculate total UMI counts per cell and size factor per cell

totalUMIcounts = apply(mat,2,sum)

sizefactor = totalUMIcounts/median(totalUMIcounts)

# Read the csv file with cluster assignments for barcodes downloaded from Loupe Cell Browser

# In this example, I have downloaded Graph-based clusters

graphbased = read.table("~/Downloads/Graph-based.csv",header=TRUE,sep=",")

# Extract subset of UMI counts for gene tcf7l2 in a separate data table

tcf7l2 = mat["ENSG00000148737",]

# Extract barcodes in cluster1 and cluster 5 for comparison

c1 = subset(graphbased,Graph.based=="Cluster 1")

c5 = subset(graphbased,Graph.based=="Cluster 5")

c1bc = c1$Barcode

c5bc = c5$Barcode

# Get the UMI counts for tcf7l2 for barcodes in cluster1 and cluster5

tcf7l2_c1 = tcf7l2[c1bc]

tcf7l2_c5 = tcf7l2[c5bc]

# Sum the UMI counts for gene for cells in cluster1 and cluster5

c1sumtcf7l2 = sum(tcf7l2[c1bc])

c5sumtcf7l2 = sum(tcf7l2[c5bc])

# Calculate size factor for each cluster

c1_sf = sizefactor[c1bc]

c5_sf = sizefactor[c5bc]

# Calculate fold change

log2((1+c1sumtcf7l2)/(1+sum(c1_sf)))-log2((1+c5sumtcf7l2)/(1+sum(c5_sf)))