@@ -467,6 +467,201 @@ A single exception is defined:
467467
468468 Subclass of :exc:`ValueError` for statistics-related exceptions.
469469
470+
471+ :class:`NormalDist` objects
472+ ===========================
473+
474+ A :class:`NormalDist` is a composite class that treats the mean and standard
475+ deviation of data measurements as a single entity. It is a tool for creating
476+ and manipulating normal distributions of a random variable.
477+
478+ Normal distributions arise from the `Central Limit Theorem
479+ <https://en.wikipedia.org/wiki/Central_limit_theorem>`_ and have a wide range
480+ of applications in statistics, including simulations and hypothesis testing.
481+
482+ .. class:: NormalDist(mu=0.0, sigma=1.0)
483+
484+     Returns a new *NormalDist* object where *mu* represents the `arithmetic
485+     mean <https://en.wikipedia.org/wiki/Arithmetic_mean>`_ of data and *sigma*
486+     represents the `standard deviation
487+     <https://en.wikipedia.org/wiki/Standard_deviation>`_ of the data.
488+
489+     If *sigma* is negative, raises :exc:`StatisticsError`.
490+
491+     .. attribute:: mu
492+
493+        The mean of a normal distribution.
494+
495+     .. attribute:: sigma
496+
497+        The standard deviation of a normal distribution.
498+
499+     .. attribute:: variance
500+
501+        A read-only property representing the `variance
502+        <https://en.wikipedia.org/wiki/Variance>`_ of a normal
503+        distribution. Equal to the square of the standard deviation.
504+
505+     .. classmethod:: NormalDist.from_samples(data)
506+
507+        Class method that makes a normal distribution instance
508+        from sample data. The *data* can be any :term:`iterable`
509+        and should consist of values that can be converted to type
510+        :class:`float`.
511+
512+        If *data* does not contain at least two elements, raises
513+        :exc:`StatisticsError` because it takes at least one point to estimate
514+        a central value and at least two points to estimate dispersion.
515+
516+     .. method:: NormalDist.samples(n, seed=None)
517+
518+        Generates *n* random samples for a given mean and standard deviation.
519+        Returns a :class:`list` of :class:`float` values.
520+
521+        If *seed* is given, creates a new instance of the underlying random
522+        number generator. This is useful for creating reproducible results,
523+        even in a multi-threading context.
524+
525+     .. method:: NormalDist.pdf(x)
526+
527+        Using a `probability density function (pdf)
528+        <https://en.wikipedia.org/wiki/Probability_density_function>`_,
529+        compute the relative likelihood that a random sample *X* will be near
530+        the given value *x*. Mathematically, it is the ratio
531+        ``P(x <= X < x+dx) / dx``.
532+
533+        Note that the relative likelihood of *x* can be greater than ``1.0``.
534+        The probability of any specific point on a continuous distribution is
535+        ``0.0``, so the :func:`pdf` is used instead. It is the probability of a
536+        sample occurring in a narrow range around *x*, divided by the
537+        width of that range (hence the word "density").
538+
539+     .. method:: NormalDist.cdf(x)
540+
541+        Using a `cumulative distribution function (cdf)
542+        <https://en.wikipedia.org/wiki/Cumulative_distribution_function>`_,
543+        compute the probability that a random sample *X* will be less than or
544+        equal to *x*. Mathematically, it is written ``P(X <= x)``.
545+
546+     Instances of :class:`NormalDist` support addition, subtraction,
547+     multiplication and division by a constant. These operations
548+     are used for translation and scaling. For example:
549+
550+     .. doctest::
551+
552+         >>> temperature_february = NormalDist(5, 2.5)  # Celsius
553+         >>> temperature_february * (9/5) + 32          # Fahrenheit
554+         NormalDist(mu=41.0, sigma=4.5)
555+
556+     Dividing a constant by an instance of :class:`NormalDist` is not supported.
557+
558+     Since normal distributions arise from additive effects of independent
559+     variables, it is possible to `add and subtract two normally distributed
560+     random variables
561+     <https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables>`_
562+     represented as instances of :class:`NormalDist`. For example:
563+
564+     .. doctest::
565+
566+         >>> birth_weights = NormalDist.from_samples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5])
567+         >>> drug_effects = NormalDist(0.4, 0.15)
568+         >>> combined = birth_weights + drug_effects
569+         >>> f'mu={combined.mu:.1f} sigma={combined.sigma:.1f}'
570+         'mu=3.1 sigma=0.5'
571+
572+     .. versionadded:: 3.8
573+
574+
575+ :class:`NormalDist` Examples and Recipes
576+ ----------------------------------------
577+
578+ A :class:`NormalDist` readily solves classic probability problems.
579+
580+ For example, given `historical data for SAT exams
581+ <https://blog.prepscholar.com/sat-standard-deviation>`_ showing that scores
582+ are normally distributed with a mean of 1060 and a standard deviation of 195,
583+ determine the percentage of students with scores between 1100 and 1200:
584+
585+ .. doctest::
586+
587+     >>> sat = NormalDist(1060, 195)
588+     >>> fraction = sat.cdf(1200) - sat.cdf(1100)
589+     >>> f'{fraction * 100:.1f}% score between 1100 and 1200'
590+     '18.2% score between 1100 and 1200'
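
As a sanity check on the calculation above, the same fraction can be approximated by integrating the pdf numerically, since the cdf is the integral of the pdf. This is only a sketch: the midpoint rule and the strip width of one point are arbitrary choices made for illustration, not part of the module.

```python
from statistics import NormalDist

sat = NormalDist(1060, 195)

# Exact answer from the cumulative distribution function.
exact = sat.cdf(1200) - sat.cdf(1100)

# Midpoint Riemann sum of the pdf over [1100, 1200] using 100 strips
# of width 1.0 (an arbitrary choice for this illustration).
step = 1.0
approx = sum(sat.pdf(1100 + (i + 0.5) * step) * step for i in range(100))

print(f'cdf difference: {exact:.4f}')
print(f'integrated pdf: {approx:.4f}')
```

The two values should agree to several decimal places because the pdf is smooth over this interval.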
591+
592+ To estimate the distribution for a model that isn't easy to solve
593+ analytically, :class:`NormalDist` can generate input samples for a `Monte
594+ Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_ of the
595+ model:
596+
597+ .. doctest::
598+
599+     >>> n = 100_000
600+     >>> X = NormalDist(350, 15).samples(n)
601+     >>> Y = NormalDist(47, 17).samples(n)
602+     >>> Z = NormalDist(62, 6).samples(n)
603+     >>> model_simulation = [x * y / z for x, y, z in zip(X, Y, Z)]
604+     >>> NormalDist.from_samples(model_simulation)           # doctest: +SKIP
605+     NormalDist(mu=267.6516398754636, sigma=101.357284306067)
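
The simulation output above varies from run to run, which is why the doctest is skipped. When repeatable results are needed, the *seed* argument of ``samples()`` can be used. A minimal sketch, where the seed value ``42`` and the sample count are arbitrary:

```python
from statistics import NormalDist

dist = NormalDist(350, 15)

# The same seed yields the same sequence of samples on every call.
first = dist.samples(5, seed=42)
second = dist.samples(5, seed=42)

# Without a seed, the shared module-level generator is used instead,
# so the values generally differ between calls.
unseeded = dist.samples(5)

print(first == second)
```

Giving each input variable of a simulation its own seed keeps the whole run reproducible.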
606+
607+ Normal distributions commonly arise in machine learning problems.
608+
609+ Wikipedia has a `nice example with a Naive Bayesian Classifier
610+ <https://en.wikipedia.org/wiki/Naive_Bayes_classifier>`_. The challenge
611+ is to guess a person's gender from measurements of normally distributed
612+ features including height, weight, and foot size.
613+
614+ The `prior probability <https://en.wikipedia.org/wiki/Prior_probability>`_ of
615+ being male or female is 50%:
616+
617+ .. doctest::
618+
619+     >>> prior_male = 0.5
620+     >>> prior_female = 0.5
621+
622+ We also have a training dataset with measurements for eight people. These
623+ measurements are assumed to be normally distributed, so we summarize the data
624+ with :class:`NormalDist`:
625+
626+ .. doctest::
627+
628+     >>> height_male = NormalDist.from_samples([6, 5.92, 5.58, 5.92])
629+     >>> height_female = NormalDist.from_samples([5, 5.5, 5.42, 5.75])
630+     >>> weight_male = NormalDist.from_samples([180, 190, 170, 165])
631+     >>> weight_female = NormalDist.from_samples([100, 150, 130, 150])
632+     >>> foot_size_male = NormalDist.from_samples([12, 11, 12, 10])
633+     >>> foot_size_female = NormalDist.from_samples([6, 8, 7, 9])
634+
635+ We observe a new person whose feature measurements are known but whose gender
636+ is unknown:
637+
638+ .. doctest::
639+
640+     >>> ht = 6.0        # height
641+     >>> wt = 130        # weight
642+     >>> fs = 8          # foot size
643+
644+ The posterior for each gender is the prior multiplied by the likelihood of
645+ each feature measurement given that gender:
646+
647+ .. doctest::
648+
649+     >>> posterior_male = (prior_male * height_male.pdf(ht) *
650+     ...                   weight_male.pdf(wt) * foot_size_male.pdf(fs))
651+
652+     >>> posterior_female = (prior_female * height_female.pdf(ht) *
653+     ...                     weight_female.pdf(wt) * foot_size_female.pdf(fs))
654+
655+ The final prediction goes to the largest posterior. This is known as
656+ the `maximum a posteriori
657+ <https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation>`_ or MAP:
658+
659+ .. doctest::
660+
661+     >>> 'male' if posterior_male > posterior_female else 'female'
662+     'female'
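
The two posteriors computed above are unnormalized scores, so they do not sum to one. If probabilities are wanted, one common approach is to divide each score by their total. The sketch below repeats the training data from the example so that it runs on its own; the names ``prob_male`` and ``prob_female`` are introduced here for illustration only.

```python
from statistics import NormalDist

# Training data and observation repeated from the example above.
height_male = NormalDist.from_samples([6, 5.92, 5.58, 5.92])
height_female = NormalDist.from_samples([5, 5.5, 5.42, 5.75])
weight_male = NormalDist.from_samples([180, 190, 170, 165])
weight_female = NormalDist.from_samples([100, 150, 130, 150])
foot_size_male = NormalDist.from_samples([12, 11, 12, 10])
foot_size_female = NormalDist.from_samples([6, 8, 7, 9])
prior_male = prior_female = 0.5
ht, wt, fs = 6.0, 130, 8

posterior_male = (prior_male * height_male.pdf(ht) *
                  weight_male.pdf(wt) * foot_size_male.pdf(fs))
posterior_female = (prior_female * height_female.pdf(ht) *
                    weight_female.pdf(wt) * foot_size_female.pdf(fs))

# Normalize the scores so that they sum to 1.0.
evidence = posterior_male + posterior_female
prob_male = posterior_male / evidence
prob_female = posterior_female / evidence

print(f'male: {prob_male:.6f}  female: {prob_female:.6f}')
```

Normalizing does not change which class wins; it only rescales the scores so they can be read as probabilities under the model's assumptions.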
663+
664+
470665..
471666 # This modelines must appear within the last ten lines of the file.
472667 kate: indent-width 3; remove-trailing-space on; replace-tabs on; encoding utf-8;