std
Compute the standard deviation along the specified axis.
numpy.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False, *, where=True, mean=None, correction=None)
Parameters
a : array_like
Calculate the standard deviation of these values.
axis : None or int or tuple of ints, optional
Axis or axes along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array.
dtype : dtype, optional
Type to use in computing the standard deviation. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.
out : ndarray, optional
Alternative output array in which to place the result.
ddof : int, optional
Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.
keepdims : bool, optional
If True, the axes which are reduced are left in the result as dimensions with size one.
where : array_like of bool, optional
Elements to include in the standard deviation.
mean : array_like, optional
Provide the mean to prevent its recalculation. The mean should have a shape as if it was calculated with keepdims=True.
correction : int or float, optional
Array API compatible name for the ddof parameter. Only one of them can be provided at the same time.
Returns
standard_deviation : ndarray
Array containing the standard deviation values.
What It Represents
The standard deviation measures how spread out data is from the mean:
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}
Where:
σ (sigma) is the standard deviation
μ (mu) is the mean
N is the number of elements
A low standard deviation means data points are close to the mean (consistent).
A high standard deviation means data points are spread out (variable).
The ddof parameter:
ddof=0: population standard deviation (divide by N)
ddof=1: sample standard deviation (divide by N-1, Bessel’s correction)
Examples
import numpy as np
# Basic standard deviation
data = np.array([2, 4, 4, 4, 5, 5, 7, 9])
np.std(data)
# 2.0
# Compare consistent vs variable data
consistent = np.array([5, 5, 5, 5, 5])
variable = np.array([1, 3, 5, 7, 9])
np.std(consistent)  # 0.0 (no variation)
np.std(variable)    # 2.828... (high variation)
# Population vs sample standard deviation
data = np.array([1, 2, 3, 4, 5])
np.std(data, ddof=0)  # 1.414... (population)
np.std(data, ddof=1)  # 1.581... (sample, larger)
# Standard deviation along axis
scores = np.array([
    [85, 90, 78],  # Student 1
    [92, 88, 91],  # Student 2
    [78, 82, 75]   # Student 3
])
# Consistency per student (across tests)
np.std(scores, axis=1)
# array([4.92, 1.70, 2.87])
# Student 2 is most consistent
# Variation per test (across students)
np.std(scores, axis=0)
# array([5.72, 3.40, 6.94])
# Test 3 has highest variation
# Temperature data analysis
temps = np.array([72, 75, 71, 78, 73, 76, 74])
mean_temp = np.mean(temps)
std_temp = np.std(temps)
print(f"Average: {mean_temp:.1f}°F ± {std_temp:.1f}°F")
# Average: 74.1°F ± 2.2°F
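As an aside, keepdims=True leaves the reduced axis in place so the result broadcasts back against the original array, which is handy for z-scoring each row (a sketch reusing the scores array from above):

```python
import numpy as np

scores = np.array([[85, 90, 78],
                   [92, 88, 91],
                   [78, 82, 75]])
# keepdims=True gives shape (3, 1), which broadcasts against (3, 3)
mu = np.mean(scores, axis=1, keepdims=True)
sigma = np.std(scores, axis=1, keepdims=True)
z = (scores - mu) / sigma
```

After this, every row of z has mean 0 and standard deviation 1, so students can be compared on a common scale.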
var
Compute the variance along the specified axis.
numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False, *, where=True, mean=None, correction=None)
Parameters
a : array_like
Array containing numbers whose variance is desired.
axis : None or int or tuple of ints, optional
Axis or axes along which the variance is computed. The default is to compute the variance of the flattened array.
dtype : dtype, optional
Type to use in computing the variance. For arrays of integer type the default is float64.
out : ndarray, optional
Alternative output array in which to place the result.
ddof : int, optional
Delta Degrees of Freedom. The divisor used in calculations is N - ddof.
keepdims : bool, optional
If True, the axes which are reduced are left in the result as dimensions with size one.
where : array_like of bool, optional
Elements to include in the variance.
mean : array_like, optional
Provide the mean to prevent its recalculation.
correction : int or float, optional
Array API compatible name for the ddof parameter.
Returns
variance : ndarray
Array containing the variance values.
What It Represents
The variance is the average of squared deviations from the mean:
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
Variance is the square of the standard deviation: var = std².
Variance measures spread in squared units of the original data:
If data is in meters, variance is in meters²
If data is in dollars, variance is in dollars²
Standard deviation is often preferred because it’s in the same units as the data.
Examples
import numpy as np
# Basic variance
data = np.array([2, 4, 4, 4, 5, 5, 7, 9])
np.var(data)
# 4.0
# Relationship between variance and standard deviation
var = np.var(data)
std = np.std(data)
var       # 4.0
std       # 2.0
std ** 2  # 4.0 (equals var)
# Why variance is useful: additivity property
# var(X + Y) = var(X) + var(Y) for independent variables
# Portfolio variance example
stock_a_returns = np.array([0.05, 0.02, 0.07, 0.03, 0.06])
stock_b_returns = np.array([0.04, 0.08, 0.02, 0.05, 0.06])
var_a = np.var(stock_a_returns, ddof=1)
var_b = np.var(stock_b_returns, ddof=1)
print(f"Stock A variance: {var_a:.6f}")
print(f"Stock B variance: {var_b:.6f}")
# Stock A variance: 0.000430
# Stock B variance: 0.000500
# Stock B is riskier (higher variance)
# Compare groups
group1 = np.array([10, 12, 11, 13, 12, 11])
group2 = np.array([8, 15, 9, 16, 7, 17])
np.mean(group1)  # 11.5
np.mean(group2)  # 12.0 (similar means)
np.var(group1)   # 0.92 (low variance, consistent)
np.var(group2)   # 16.67 (high variance, variable)
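The additivity property mentioned above can be checked numerically. A sketch with simulated independent samples; with finite samples the equality is only approximate, since the sample covariance is near but not exactly zero:

```python
import numpy as np

# For independent X and Y, var(X + Y) ≈ var(X) + var(Y)
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = rng.normal(size=100_000)
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y)
# lhs and rhs agree to roughly two decimal places
```

Note that no such additivity holds for standard deviations, which is one reason variance is the more convenient quantity in probability calculations.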
nanstd
Compute the standard deviation along the specified axis, ignoring NaNs.
numpy.nanstd(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False, *, where=True)
Parameters
a : array_like
Calculate the standard deviation of the non-NaN values.
axis : None or int or tuple of ints, optional
Axis or axes along which the standard deviation is computed.
dtype : dtype, optional
Type to use in computing the standard deviation.
out : ndarray, optional
Alternative output array in which to place the result.
ddof : int, optional
Delta Degrees of Freedom.
keepdims : bool, optional
If True, the axes which are reduced are left in the result as dimensions with size one.
where : array_like of bool, optional
Elements to include in the standard deviation.
Returns
standard_deviation : ndarray
Standard deviation with NaN values ignored.
What It Represents
Same as std, but automatically excludes NaN (Not a Number) values from the calculation. This is essential when working with real-world data that has missing values.
Using regular std on data with NaNs returns NaN. Using nanstd ignores the NaN values and computes the statistic on valid data only.
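One way to see this: nanstd matches a plain std taken over only the non-NaN entries, selected here with a boolean mask (a quick check):

```python
import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
# Drop the NaNs explicitly with a boolean mask, then take a regular std
manual = np.std(data[~np.isnan(data)])
auto = np.nanstd(data)
# manual and auto are equal
```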
Examples
import numpy as np
# Data with missing values
data = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
# Regular std returns NaN
np.std(data)
# nan
# nanstd ignores NaN values
np.nanstd(data)
# 1.58...  # std of [1, 2, 4, 5]
# Real-world example: sensor data with failures
temperature_readings = np.array([
    [72.0, 73.0, np.nan, 74.0],
    [71.0, np.nan, 73.0, 75.0],
    [73.0, 74.0, 72.0, 73.0]
])
# Standard deviation per sensor (axis=0)
np.nanstd(temperature_readings, axis=0)
# array([0.81..., 0.5, 0.5, 0.81...])
# Standard deviation per time step (axis=1)
np.nanstd(temperature_readings, axis=1)
# array([0.81..., 1.63..., 0.70...])
# Financial data with missing days
returns = np.array([0.02, np.nan, 0.01, 0.03, np.nan, -0.01])
volatility = np.nanstd(returns, ddof=1)
print(f"Volatility: {volatility:.4f}")
# Volatility: 0.0171
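When many values are NaN, it is worth checking how many valid points actually back each statistic, since a column with only one or two readings gives a far less reliable estimate. A sketch using the sensor array from above:

```python
import numpy as np

readings = np.array([[72.0, 73.0, np.nan, 74.0],
                     [71.0, np.nan, 73.0, 75.0],
                     [73.0, 74.0, 72.0, 73.0]])
# Number of valid (non-NaN) readings behind each per-sensor statistic
counts = np.sum(~np.isnan(readings), axis=0)
stds = np.nanstd(readings, axis=0)
# counts -> [3, 2, 2, 3]: sensors 2 and 3 rest on only two readings each
```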
nanvar
Compute the variance along the specified axis, ignoring NaNs.
numpy.nanvar(a, axis=None, dtype=None, out=None, ddof=0, keepdims=False, *, where=True)
Parameters
a : array_like
Array containing numbers whose variance is desired, possibly with NaN values.
axis : None or int or tuple of ints, optional
Axis or axes along which the variance is computed.
dtype : dtype, optional
Type to use in computing the variance.
out : ndarray, optional
Alternative output array in which to place the result.
ddof : int, optional
Delta Degrees of Freedom.
keepdims : bool, optional
If True, the axes which are reduced are left in the result as dimensions with size one.
where : array_like of bool, optional
Elements to include in the variance.
Returns
variance : ndarray
Variance with NaN values ignored.
What It Represents
Same as var, but automatically excludes NaN values from the calculation. Essential for computing variance on datasets with missing or invalid data.
The relationship nanvar = (nanstd)² holds, just like var = std².
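One subtlety worth noting: the N in the N - ddof divisor is the count of non-NaN elements, not the array length. A quick sketch:

```python
import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
# N is the count of non-NaN elements (4), not the array length (5)
n = np.count_nonzero(~np.isnan(data))
v0 = np.nanvar(data)          # divides by n     -> 2.5
v1 = np.nanvar(data, ddof=1)  # divides by n - 1 -> 3.33...
```

This matters for ddof=1: with heavy missingness, the effective sample size behind each value can be much smaller than the array suggests.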
Examples
import numpy as np
# Data with missing values
data = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
# Regular var returns NaN
np.var(data)
# nan
# nanvar ignores NaN values
np.nanvar(data)
# 2.5  # var of [1, 2, 4, 5]
# Verify relationship with nanstd
var = np.nanvar(data)
std = np.nanstd(data)
var       # 2.5
std ** 2  # 2.5 (equals var)
# Experimental measurements with equipment failures
measurements = np.array([
    [10.2, 10.5, np.nan, 10.1],
    [10.3, 10.4, 10.6, np.nan],
    [np.nan, 10.3, 10.4, 10.5]
])
# Variance across all measurements
np.nanvar(measurements)
# 0.0222...
# Variance per measurement position (axis=0)
np.nanvar(measurements, axis=0)
# array([0.0025, 0.0067, 0.01, 0.04])
# Quality control: detect high-variance batches
batch_data = np.array([
    [50.1, 50.2, np.nan, 50.0],
    [50.5, 49.8, 51.2, np.nan],  # High variance
    [50.0, 50.1, 50.0, 50.1]
])
variances = np.nanvar(batch_data, axis=1, ddof=1)
for i, var in enumerate(variances):
    status = "REJECT" if var > 0.05 else "ACCEPT"
    print(f"Batch {i + 1}: var={var:.4f} - {status}")
# Batch 1: var=0.0100 - ACCEPT
# Batch 2: var=0.4900 - REJECT
# Batch 3: var=0.0033 - ACCEPT
See Also
Averages Mean, median, and central tendency measures
Correlating Correlation and covariance functions