Biology fact of the day #9
Some marine sponges can filter around 20,000 times their own volume of water in 2 hours.
Master post
Statistics - Measures of Dispersion in Data Science
In data science, measures of dispersion (also called measures of variability) describe how spread out, scattered, or concentrated a dataset is. While measures of central tendency (mean, median, mode) tell us where the data is centered, dispersion tells us how much the data varies around that center.
Below is a deep and structured explanation, from intuition → formulas → interpretation → use cases.
1. Why Measures of Dispersion Matter
Two datasets can have the same mean but behave very differently.
Example:
Dataset A: 48, 49, 50, 51, 52
Dataset B: 10, 20, 50, 80, 90
Both have a mean of 50, but Dataset B is far more spread out.
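As a quick check of the two datasets above, here is a minimal Python sketch using only the standard library. The variable names are illustrative, and the population standard deviation is used as the spread measure purely for demonstration.

```python
from statistics import mean, pstdev

dataset_a = [48, 49, 50, 51, 52]
dataset_b = [10, 20, 50, 80, 90]

for name, data in (("A", dataset_a), ("B", dataset_b)):
    # Same center, very different spread
    spread = max(data) - min(data)
    print(f"Dataset {name}: mean={mean(data)}, range={spread}, SD={pstdev(data):.2f}")
```

Both datasets report the same mean, while the range and standard deviation of Dataset B are many times larger.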
👉 Measures of dispersion help us:
Understand data consistency
Detect risk and uncertainty
Identify outliers
Compare distributions
Improve model reliability
2. Range
Definition
The range is the simplest measure of dispersion. It shows the difference between the maximum and minimum values.
Formula
$$\text{Range} = \text{Max} - \text{Min}$$
Example
Data: 2, 4, 6, 8, 10
Range = 10 − 2 = 8
Interpretation
Large range → data is widely spread
Small range → data is tightly grouped
Limitations
❌ Uses only two values
❌ Extremely sensitive to outliers
❌ Ignores distribution shape
📌 Rarely used alone in data science
3. Interquartile Range (IQR)
Definition
IQR measures the spread of the middle 50% of the data.
Quartiles
Q1 (25th percentile)
Q2 (50th percentile / median)
Q3 (75th percentile)
Formula
$$\text{IQR} = Q3 - Q1$$
Example
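Data: 1, 3, 5, 7, 9, 11, 13
Q1 = 3, Q3 = 11 (taking the medians of the lower and upper halves, excluding the overall median; other quartile conventions can give slightly different values)
IQR = 11 − 3 = 8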
Interpretation
Focuses on the core data
Ignores extreme values
Advantages
✅ Robust to outliers
✅ Very useful for skewed data
✅ Used in box plots and anomaly detection
📌 Common in exploratory data analysis (EDA)
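The IQR example can be reproduced with Python's standard library. This is a sketch, not the only way to compute quartiles: `statistics.quantiles` supports two conventions, and only the "exclusive" one reproduces Q1 = 3 and Q3 = 11 for this data. The 1.5 × IQR fences at the end are the usual box-plot convention for flagging outliers, added here as an assumption since the post does not spell out a rule.

```python
from statistics import quantiles

data = [1, 3, 5, 7, 9, 11, 13]

# "exclusive" convention: reproduces Q1 = 3, Q3 = 11 for this data
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# "inclusive" convention: different quartiles for the same data
q1_inc, _, q3_inc = quantiles(data, n=4, method="inclusive")
iqr_inc = q3_inc - q1_inc

# Box-plot style fences (assumed convention: 1.5 * IQR beyond the quartiles)
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"exclusive: Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"inclusive: Q1={q1_inc}, Q3={q3_inc}, IQR={iqr_inc}")
print(f"fences: [{lower_fence}, {upper_fence}]")
```

Because libraries default to different quartile conventions, it is worth checking which one is in use before comparing IQR values across tools.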
4. Variance
Definition
Variance measures the average squared distance of each data point from the mean.
Why Squared?
Prevents negative values from canceling out
Penalizes larger deviations more
Population Variance
$$\sigma^2 = \frac{1}{N}\sum (x - \mu)^2$$
Sample Variance
$$s^2 = \frac{1}{n-1}\sum (x - \bar{x})^2$$
(The n−1 correction is called Bessel’s correction)
Interpretation
Higher variance → more spread
Lower variance → data clustered near mean
Limitations
❌ Units are squared (e.g., meters²)
❌ Hard to interpret directly
📌 Variance is the foundation of many ML algorithms
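To make the difference between the two variance formulas concrete, here is a small sketch that computes both by hand and cross-checks them against the standard library. The dataset is reused from the range example; the variable names are illustrative.

```python
from statistics import pvariance, variance

data = [2, 4, 6, 8, 10]
n = len(data)
mean = sum(data) / n

# Squared distance of each point from the mean
squared_devs = [(x - mean) ** 2 for x in data]

population_var = sum(squared_devs) / n        # divide by N
sample_var = sum(squared_devs) / (n - 1)      # divide by n - 1 (Bessel's correction)

print(f"population variance: {population_var}")   # library equivalent: pvariance(data)
print(f"sample variance:     {sample_var}")       # library equivalent: variance(data)
```

The sample variance is always the larger of the two, and the gap shrinks as n grows.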
5. Standard Deviation
Definition
The square root of variance. It expresses spread in the same units as the data.
Formula
$$\sigma = \sqrt{\sigma^2}$$
Example
If variance = 16
Standard deviation = √16 = 4
Interpretation
Small SD → data points close to mean
Large SD → data points far from mean
Empirical Rule (Normal Distribution)
~68% within ±1 SD
~95% within ±2 SD
~99.7% within ±3 SD
Advantages
✅ Easy to interpret
✅ Widely used in statistics & ML
✅ Essential for normalization and z-scores
📌 Most important dispersion measure in data science
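The empirical rule can be checked numerically. The sketch below draws simulated values from a standard normal distribution (the seed and sample size are arbitrary choices) and measures what share falls within 1, 2, and 3 standard deviations of the mean. The helper name `share_within` is made up for this example.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable
samples = [random.gauss(0, 1) for _ in range(50_000)]

mu = fmean(samples)
sigma = pstdev(samples)

def share_within(k):
    """Fraction of samples within k standard deviations of the mean."""
    return sum(abs(x - mu) <= k * sigma for x in samples) / len(samples)

within1, within2, within3 = share_within(1), share_within(2), share_within(3)
print(f"±1 SD: {within1:.3f}, ±2 SD: {within2:.3f}, ±3 SD: {within3:.3f}")
```

The shares should land close to 68%, 95%, and 99.7%. The rule only holds for approximately normal data, so skewed or heavy-tailed features will not follow it.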
6. Mean Absolute Deviation (MAD)
Definition
Average of the absolute distances from the mean.
Formula
$$\text{MAD} = \frac{1}{n}\sum |x - \bar{x}|$$
Characteristics
Uses absolute values instead of squares
Less sensitive to outliers than variance
Limitations
❌ Less mathematically convenient
❌ Less common in advanced models
📌 Used when robustness is needed
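The formula translates directly into a few lines of Python. This is a minimal sketch; the function name `mean_absolute_deviation` is illustrative, and the data is reused from the range example.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    center = sum(values) / len(values)
    return sum(abs(x - center) for x in values) / len(values)

data = [2, 4, 6, 8, 10]
mad = mean_absolute_deviation(data)
print(f"MAD = {mad}")
```

Note that "MAD" is also a common abbreviation for the median absolute deviation, which is a different statistic; this sketch implements the mean-based version defined above.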
7. Coefficient of Variation (CV)
Definition
CV measures relative dispersion by comparing standard deviation to the mean.
Formula
$$\text{CV} = \frac{\sigma}{\mu}$$
Interpretation
Unitless measure
Useful for comparing datasets with different units or scales
Example
Dataset A: mean = 100, SD = 10 → CV = 0.1
Dataset B: mean = 20, SD = 10 → CV = 0.5
Dataset B is more variable
📌 Common in finance, economics, and model evaluation
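The two CV examples can be reproduced with a small helper. This is a sketch with an illustrative function name; the zero-mean guard is an added assumption, since the ratio is undefined when the mean is 0.

```python
def coefficient_of_variation(sd, mean):
    """Relative dispersion: standard deviation divided by the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is 0")
    return sd / mean

cv_a = coefficient_of_variation(sd=10, mean=100)
cv_b = coefficient_of_variation(sd=10, mean=20)
print(f"CV(A) = {cv_a}, CV(B) = {cv_b}")
```

Both datasets have the same standard deviation, yet Dataset B varies five times more relative to its mean.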
8. Dispersion and Outliers
Measures respond differently to outliers:

| Measure | Sensitive to Outliers |
|---|---|
| Range | Very High |
| Variance | Very High |
| Standard Deviation | High |
| IQR | Low |
| MAD | Low |
👉 Choose measures based on data quality and distribution
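To see how differently these measures react, the sketch below adds a single extreme value to a small, made-up dataset and compares range, standard deviation, and IQR before and after. The data, the helper name `spread_summary`, and the choice of the "inclusive" quartile convention are all assumptions for illustration.

```python
from statistics import pstdev, quantiles

def spread_summary(values):
    """Range, population SD, and IQR for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "range": max(values) - min(values),
        "sd": pstdev(values),
        "iqr": q3 - q1,
    }

clean = [46, 47, 48, 49, 50, 51, 52, 53, 54]
with_outlier = clean + [500]  # one extreme value

before = spread_summary(clean)
after = spread_summary(with_outlier)

for key in before:
    ratio = after[key] / before[key]
    print(f"{key:>5}: {before[key]:.2f} -> {after[key]:.2f} (x{ratio:.1f})")
```

In this run the range and standard deviation grow by more than a factor of 50, while the IQR barely moves.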
9. Role in Data Science & Machine Learning
Measures of dispersion are used in:
Feature scaling (standardization)
Anomaly detection
Risk analysis
Model stability checks
Bias–variance tradeoff
Confidence intervals
PCA and clustering algorithms
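As one example from this list, feature scaling by standardization uses the mean and standard deviation directly: each value is converted to a z-score. The sketch below is a plain-Python illustration with made-up data and an illustrative function name, not a replacement for a library scaler.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Convert values to z-scores: (x - mean) / standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(x - mu) / sigma for x in values]

heights_cm = [150, 160, 170, 180, 190]  # hypothetical feature
z_scores = standardize(heights_cm)
print([round(z, 2) for z in z_scores])
```

After standardization the feature has mean 0 and standard deviation 1, which puts features measured in different units on a comparable scale.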
10. Summary Table
| Measure | Purpose |
|---|---|
| Range | Overall spread |
| IQR | Middle spread |
| Variance | Squared spread |
| Standard Deviation | Interpretable spread |
| MAD | Robust spread |
| CV | Relative spread |
Minimal cells are synthetic cells with streamlined genomes. A new study finds that these cells are still able to grow and evolve.
In the fall of 2020, three nations launched robotic explorers to Mars. The United States sent its fifth rover, Perseverance, the latest in an impressive line of successful spacecraft missions. Chin…
TIL that the garter snake (not "garden snake") is so called because its stripes resemble old-fashioned garters.
Epistemic Differentials Are Better Than Intuition & Training
The most common mistake laypeople make is believing it is rational to form presumptions before seeing evidence (based on narratives, common beliefs, biases, fallacious reasoning, etc.) and then engaging in confirmation bias in favor of that presumptuous method, instead of acknowledging that the rational conclusion was actually reached a different way.
People can accidentally reach the correct conclusions with incorrect and nonsensical methodology; there's an entire field of Epistemology dedicated to this with specific terms debunking intuitive confirmation bias.
The problem is that too many people start to believe in their flawed methodologies because they pick and choose when they randomly coincide with rational conclusions, and thus shirk rational methodology for their flawed intuitive methodology. This is what confirmation bias is based on.
The more sound focus is to only speak on what has been demonstrated to be evidential through objective testing while weeding out false positives.
Anytime someone says their "paperwork, training, or intuition" supersedes the requirement for direct evidence and fallacy-free logic, they are using flawed methodology. No amount of training, paperwork, or intuition should skip the scientific method and the epistemic differential method.
A mitotic PtK2 cell at metaphase, immunostained for microtubules (red) and kinetochores (green), with DNA stained blue. The image was acquired using structured illumination microscopy (SIM, Deltavision OMX system), which provides "super-resolution" beyond the diffraction limit set by the wavelength of the illuminating light. The micrograph was the 2012 winner in the high-resolution and super-resolution microscopy category of the GE Healthcare life sciences imaging competition, and was featured in the NIGMS Biomedical Beat, the monthly digest of noteworthy research sponsored by NIGMS.
Credit: Jane Stout, Indiana University; Claire Walczak, Indiana University
#scienec #biologie #microbiology #microbe #bacteria #virology #virus #yeast #mold #agar #biology #science #nature #DNA #life #lab #scientist #researh #picoftheday #bestpic #discovery https://www.instagram.com/p/Bo-FYQJgiWm/?utm_source=ig_tumblr_share&igshid=1rfrqbr3jg11g
Out of every screenshot i’ve ever taken, this is most definitely my favorite.