Most ebook files are delivered in PDF format, so you can easily read them using software such as Foxit Reader or directly in the Google Chrome browser.
Some ebook files are released by publishers in other formats such as .azw, .mobi, .epub, or .fb2. You may need to install specific software, such as Calibre, to read these formats on mobile or PC.
Please read the tutorial at https://ebooknice.com/page/post?id=faq
We offer FREE conversion to the popular format you request; however, this may take some time. Right after payment, please email us, and we will provide the converted file as quickly as possible.
If you receive an unusual file format or a broken link, please do not open a dispute. Email us first, and we will assist within a maximum of 6 hours.
EbookNice Team
Status: Available
Rating: 5.0 (29 reviews)
(Ebook) Introduction to Statistical Data Analysis for the Life Sciences 2nd Edition by Claus Thorn Ekstrøm, Helle Sørensen - Ebook PDF Instant Download/Delivery: 9781482238938, 1482238934
Full download of (Ebook) Introduction to Statistical Data Analysis for the Life Sciences 2nd Edition is available immediately after payment.
Product details:
ISBN 10: 1482238934
ISBN 13: 9781482238938
Author: Claus Thorn Ekstrøm, Helle Sørensen
(Ebook) Introduction to Statistical Data Analysis for the Life Sciences 2nd Edition Table of contents:
Chapter 1 Description of samples and populations
Figure 1.1: Population and sample. In statistics we sample subjects from a large population and use the information obtained from the sample to infer characteristics about the general population. Thus the upper arrow can be viewed as “sampling” while the lower arrow is “statistical inference”.
1.1 Data types
1.1.1 Categorical data
1.1.2 Quantitative data
Example 1.1. Laminitis in cattle.
Table 1.1: Data on acute laminitis for eight heifers
1.2 Visualizing categorical data
Example 1.2. Tibial dyschondroplasia.
Figure 1.2: Relative frequency plot (left) for broiler chickens with and without presence of tibial dyschondroplasia (dark and light bars, respectively). The segmented bar plot (right) shows stacked relative frequencies of broiler chickens with and without tibial dyschondroplasia for the four groups.
1.3 Visualizing quantitative data
Example 1.3. Tenderness of pork.
Table 1.2: Tenderness of pork from different cooling methods and pH levels
1.4 Statistical summaries
Figure 1.3: Histograms (top row) and relative frequency histograms (bottom row) for tunnel cooling of pork for low- and high-pH groups.
Figure 1.4: Scatter plot of tenderness for rapid cooling and tunnel cooling.
1.4.1 Median and inter-quartile range
1.4.2 Boxplot
Example 1.4. Tenderness of pork
1.4.3 The mean and standard deviation
Example 1.5. Tenderness of pork
Infobox 1.1: Sample mean and standard deviation of linearly transformed data
1.4.4 Mean or median?
Figure 1.5: Histograms, boxplots, and means (▴) for 4 different datasets. The lower right dataset contains 100 observations — the remaining three datasets all contain 1000 observations.
1.5 What is a probability?
Example 1.6. Throwing thumbtacks.
Figure 1.6: Thumbtack throwing. Relative frequency of the event “pin points down” as the number of throws increases.
Table 1.3: Thumbtacks: 100 throws with a brass thumbtack. 1 = pin points down, 0 = pin points up
1.6 R
1.6.1 Visualizing data
1.6.2 Statistical summaries
1.7 Exercises
Table 1.4: Characteristics of the study population when comparing German Shepherd dogs following diets with and without chocolate. Values are mean ± SD.
Chapter 2 Linear regression
Figure 2.1: The straight line.
Example 2.1. Stearic acid and digestibility of fat.
Figure 2.2: Digestibility of fat for different proportions of stearic acid in the fat. The line is y = −0.9337 · x + 96.5334.
2.1 Fitting a regression line
Example 2.2. Stearic acid and digestibility of fat
Figure 2.3: Residuals for the dataset on digestibility and stearic acid. The vertical lines between the model (the straight line) and the observations are the residuals.
Figure 2.4: Two regression lines for the digestibility data. The solid line is defined by y = −0.9337 · x + 96.5334 while the dashed line is defined by y = 0.6 · x + 74.15. Both regression lines have residual sum zero.
2.1.1 Least squares estimation
Figure 2.5: Squared residuals for the dataset on digestibility and stearic acid. Gray areas represent the squared residuals for the proposed regression line.
Table 2.1: Calculations for the stearic acid data
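The least squares calculations laid out in Table 2.1 reduce to two closed-form expressions: the slope is Sxy/Sxx and the intercept is ȳ − slope · x̄. A minimal Python sketch of this computation (the data here are made up so that they lie exactly on a line; they are not the stearic acid measurements):

```python
def least_squares(x, y):
    """Closed-form least squares estimates for simple linear regression:
    slope = sum((x_i - xbar)(y_i - ybar)) / sum((x_i - xbar)^2)
    intercept = ybar - slope * xbar
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return slope, intercept

# Illustrative data lying exactly on y = -0.9*x + 96 (not the book's data)
x = [10, 20, 30, 40]
y = [-0.9 * xi + 96 for xi in x]
slope, intercept = least_squares(x, y)
```

This is the same fit that R's `lm()` produces for simple linear regression; the sketch just makes the sums of Table 2.1 explicit.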
Example 2.3. Stearic acid and digestibility of fat
2.2 When is linear regression appropriate?
Figure 2.6: The effect of influential points on linear regression slopes. If we add a single extra point at (35, 75) to the stearic acid data we will change the regression slope from −0.9337 (solid line) to −0.706 (dashed line).
Figure 2.7: Interchanging x and y. Regression estimates of x on y cannot be determined from the regression estimates of y on x as the vertical residuals are used to fit the model for the latter while the “horizontal” residuals are needed for the former. The solid line corresponds to the regression of stearic acid percentage on digestibility while the dashed line is the ordinary regression of digestibility on stearic acid.
2.2.1 Transformation
Example 2.4. Growth of duckweed.
Figure 2.8: Top panel shows the original duckweed data. Bottom left shows the data and fitted regression line after logarithmic transformation and bottom right shows the fitted line transformed back to the original scale.
2.3 The correlation coefficient
Example 2.5. Tenderness of pork and sarcomere length.
Figure 2.9: Correlation coefficients for different datasets. Note from the second row of graphs that the slope has no influence on the correlation coefficient except for the middle case where the variance of y is 0 so the correlation is not well-defined. The last row of graphs shows that the correlation may be zero even though the data are highly structured.
Figure 2.10: Graph of tenderness of pork and sarcomere length for 24 pigs.
Table 2.2: Data on sarcomere length and pork tenderness
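The correlation coefficient of Section 2.3 is built from the same sums of squares as the regression slope: r = Sxy / √(Sxx · Syy). A stdlib-only sketch with toy data (not the sarcomere measurements of Table 2.2):

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# A perfectly linear toy dataset has correlation exactly 1,
# regardless of the slope (as Figure 2.9 illustrates).
r = pearson_r([1, 2, 3, 4], [2.0, 4.0, 6.0, 8.0])
```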
2.3.1 When is the correlation coefficient relevant?
2.4 Perspective
2.4.1 Modeling the residuals
2.4.2 More complicated models
2.5 R
2.6 Exercises
Chapter 3 Comparison of groups
3.1 Graphical and simple numerical comparison
Example 3.1. Parasite counts for salmon.
Figure 3.1: Boxplot for the two salmon samples.
Example 3.2. Antibiotics and dung decomposition.
Figure 3.2: Graphical presentation of the antibiotics data. Left panel: data points with group sample means (solid line segments) and the total mean of all observations (dashed line). Right panel: parallel boxplots.
Table 3.1: Group means and group standard deviations for the antibiotics data
3.2 Between-group variation and within-group variation
3.3 Populations, samples, and expected values
3.4 Least squares estimation and residuals
Example 3.3. Antibiotics and dung decomposition
Example 3.4. Antibiotics and dung decomposition
3.5 Paired and unpaired samples
Example 3.5. Equine lameness.
3.6 Perspective
3.7 R
3.7.1 Means and standard deviations
3.7.2 Factors
3.8 Exercises
Figure 3.3: Boxplots for Exercise 3.1.
Chapter 4 The normal distribution
4.1 Properties
Example 4.1. Crab weights.
Figure 4.1: Histogram for the crab weight data together with the density for the normal distribution with mean 12.76 and standard deviation 2.25.
4.1.1 Density, mean, and standard deviation
Figure 4.2: Illustration of the relation (4.2). The gray area corresponds to P(a ≤ Y ≤ b) and is interpreted as the probability of a random observation falling between a and b.
Example 4.2. Crab weights
Figure 4.3: Densities for four different normal distributions.
Infobox 4.1: Properties of the density for the normal distribution
4.1.2 Transformations of normally distributed variables
Infobox 4.2: Transformation of normally distributed variables
Example 4.3. Crab weights
Infobox 4.3: Distribution of sample mean
4.1.3 Probability calculations
Figure 4.4: The density function ϕ (left) and the cumulative distribution function Φ (right) for N(0, 1). The dashed lines correspond to the 5% and 95% quantiles. Each gray-shaded region has area (probability) 0.05, whereas the dashed-shaded region has area (probability) 0.90.
Table 4.1: Selected values of Φ, the cumulative distribution function for the standard normal distribution
4.1.4 Central part of distribution
Table 4.2: Intervals and corresponding probabilities for N(μ, σ²)
Figure 4.5: Density for N(μ, σ²) with probabilities for intervals μ ± σ, μ ± 2σ, and μ ± 3σ.
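The probabilities attached to the intervals μ ± σ, μ ± 2σ, and μ ± 3σ in Figure 4.5 follow from the standard normal CDF: P(|Y − μ| < kσ) = P(|Z| < k) = erf(k/√2) for Z ~ N(0, 1). A quick check using only the standard library (the helper name is mine):

```python
import math

def prob_within_k_sd(k):
    """P(|Y - mu| < k*sigma) for a normal variable, via the error
    function: P(|Z| < k) = erf(k / sqrt(2)) for Z ~ N(0, 1)."""
    return math.erf(k / math.sqrt(2))

# Reproduces the familiar 68-95-99.7 rule
probs = {k: round(prob_within_k_sd(k), 3) for k in (1, 2, 3)}
```

Running this gives 0.683, 0.954, and 0.997 for k = 1, 2, 3.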
Example 4.4. Crab weights
4.2 One sample
4.2.1 Independence
Example 4.5. Sampling of apple trees.
4.2.2 Estimation
Infobox 4.4: Statistical properties of sample means
Figure 4.6: Distribution of the sample mean for sample size 10 (left) and sample size 25 (right).
4.3 Are the data (approximately) normally distributed?
4.3.1 Histograms and QQ-plots
Figure 4.7: QQ-plot of the crab weight data together with the line with intercept equal to the sample mean 12.76 and slope equal to the sample standard deviation s = 2.25.
4.3.2 Transformations
Example 4.6. Vitamin A intake and BMR.
Figure 4.8: Histograms and QQ-plots of the intake of vitamin A for 1079 men: original values to the left, log-transformed values to the right.
Figure 4.9: Histograms of the BMR variable for 1079 men (upper left), 1145 women (upper right), and all 2224 persons (bottom left), and QQ-plot for all 2224 persons (bottom right).
4.3.3 The exponential distribution
Example 4.7. Interspike intervals for neurons.
Figure 4.10: Histograms for 312 interspike intervals with different resolutions together with the density for the exponential distribution with rate 1.147.
4.4 The central limit theorem
Infobox 4.5: Central limit theorem (CLT)
Example 4.8. Central limit theorem for binary variables.
Example 4.9. Central limit theorem for a bimodal distribution.
Figure 4.11: Illustration of the central limit theorem for 0/1-variables with p = 0.5. Histograms of the sample means for 1000 simulated samples of size 10 (left) and 100 (right) compared to the corresponding normal density curve.
Figure 4.12: Sampling distribution of the sample mean of n observations from a bimodal distribution. The bimodal distribution is illustrated in the upper left panel (corresponding to n = 1). The curves are normal densities.
4.5 R
4.5.1 Computations with normal distributions
4.5.2 Random numbers
4.5.3 QQ-plots
4.6 Exercises
Chapter 5 Statistical models, estimation, and confidence intervals
5.1 Statistical models
Example 5.1. Stearic acid and digestibility of fat
Example 5.2. Antibiotics and dung decomposition
5.1.1 Model assumptions
Figure 5.1: Illustration of the linear regression model (left) and the one-way ANOVA model for k = 5 (right).
Infobox 5.1: Linear model (version 1)
Infobox 5.2: Linear model (version 2)
5.1.2 Model formulas
5.1.3 Linear regression
5.1.4 Comparison of groups
5.1.5 One sample
5.2 Estimation
5.2.1 Least squares estimation of the mean parameters
5.2.2 Estimation of the standard deviation σ
5.2.3 Standard errors and distribution of least squares estimates
5.2.4 Linear regression
Example 5.3. Stearic acid and digestibility of fat
5.2.5 Comparison of groups
Example 5.4. Antibiotics and dung decomposition
Table 5.1: Estimates for group means and comparison to the control group for the antibiotics data
5.2.6 One sample
Example 5.5. Crab weights
5.2.7 Bias and precision. Maximum likelihood estimation
Figure 5.2: Bias and precision for four imaginary estimation methods. The centers of the circles represent the true value of the parameter whereas the points correspond to estimates based on 60 different datasets.
5.3 Confidence intervals
5.3.1 The t distribution
Figure 5.3: Left: density for the tr distribution with r = 1 degree of freedom (solid) and r = 4 degrees of freedom (dashed) as well as for N(0, 1) (dotted). Right: the probability of an interval is the area under the density curve, illustrated by the t4 distribution.
Table 5.2: 95% and 97.5% quantiles for selected t distributions
5.3.2 Confidence interval for the mean for one sample
Figure 5.4: Density for the t10 distribution. The 95% quantile is 1.812, as illustrated by the gray region which has area 0.95. The 97.5% quantile is 2.228, illustrated by the dashed region with area 0.975.
Example 5.6. Crab weights
5.3.3 Interpretation of the confidence interval
Figure 5.5: Confidence intervals for 50 simulated datasets generated from N(0, 1). The sample size and confidence level vary (see the top of each panel).
5.3.4 Confidence intervals for linear models
Infobox 5.3: Confidence intervals for parameters in linear models
Example 5.7. Stearic acid and digestibility of fat
Figure 5.6: Pointwise 95% confidence intervals for the regression line.
Example 5.8. Antibiotics and dung decomposition
Example 5.9. Parasite counts for salmon
Table 5.3: Mixture proportions and optical densities for 10 dilutions of a standard dissolution with Ubiquitin antibody
Figure 5.7: Scatter plot of the optical density against the mixture proportions (left) and against the logarithmic mixture proportions with the fitted regression line (right).
Example 5.10. ELISA experiment.
5.4 Unpaired samples with different standard deviations
Example 5.11. Parasite counts for salmon
Example 5.12. Vitamin A intake and BMR
5.5 R
5.5.1 Linear regression
5.5.2 One-way ANOVA
Table 5.4: Estimates for group means and comparison to the control group for the antibiotics data (identical to Table 5.1)
5.5.3 One sample and two samples
5.5.4 Probabilities and quantiles in the t distribution
5.6 Exercises
Figure 5.8: Illustration of the 95% confidence intervals for the mean weight gain for two diets.
Chapter 6 Hypothesis tests
Example 6.1. Hormone concentration in cattle.
Figure 6.1: Illustration of the t-test for the cattle data. Left panel: the p-value is equal to the area of the gray regions. Right panel: the critical values for the test are those outside the interval from −2.31 to 2.31, because they give a p-value smaller than 0.05.
Figure 6.2: Histogram of the T-values from 1000 simulated datasets of size nine from the N(0, 15²) distribution. The density for the t8 distribution is superimposed, and the dashed lines correspond to ±2.71 (2.71 being the observed value from the cattle dataset).
6.1 Null hypotheses
Example 6.2. Stearic acid and digestibility of fat
Example 6.3. Lifespan and length of gestation period.
Table 6.1: Lifespan and length of gestation period and age for seven horses
Figure 6.3: Lifespan and length of gestation period and age for seven horses.
Example 6.4. Antibiotics and dung decomposition
6.2 t-tests
Infobox 6.1: t-test
Example 6.5. Parasite counts for salmon
Infobox 6.2: Relationship between t-tests and confidence intervals
Example 6.6. Stearic acid and digestibility of fat
Example 6.7. Production control.
Example 6.8. Lifespan and length of gestation period
6.3 Tests in a one-way ANOVA
6.3.1 The F-test for comparison of groups
Figure 6.4: Left: Densities for the F(5, 28) distribution (solid), the F(2, 27) distribution (dashed), and the F(4, 19) distribution (dotted). Right: The density for the F(5, 28) distribution with the 95% quantile F0.95(5, 28) = 2.56 and an imaginary value Fobs. The corresponding p-value is equal to the area of the gray region (including the dashed gray), whereas the dashed gray region has area 0.05.
Table 6.2: Analysis of variance table
Infobox 6.3: F-test for comparison of groups
Example 6.9. Antibiotics and dung decomposition
6.3.2 Pairwise comparisons and LSD-values
Example 6.10. Antibiotics and dung decomposition
Table 6.3: Binding rates for three types of antibiotics
Example 6.11. Binding of antibiotics.
Figure 6.5: Binding rates for three types of antibiotics.
6.4 Hypothesis tests as comparison of nested models
Example 6.12. Stearic acid and digestibility of fat
6.5 Type I and type II errors
Table 6.4: The four possible outcomes when testing a hypothesis
6.5.1 Multiple testing. Bonferroni correction
Figure 6.6: The risk of making at least one type I error out of m independent tests on the 5% significance level.
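The curve in Figure 6.6 is 1 − 0.95^m: the probability of at least one type I error among m independent tests, each at the 5% level. The Bonferroni correction counters this by testing each hypothesis at level α/m. A quick sketch of both quantities:

```python
def familywise_risk(m, alpha=0.05):
    """P(at least one type I error) among m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

def bonferroni_level(m, alpha=0.05):
    """Per-test significance level that keeps the familywise rate <= alpha."""
    return alpha / m

# With 10 independent tests at the 5% level, the risk of a false
# positive is already about 40%; Bonferroni tests each at 0.5% instead.
risk10 = familywise_risk(10)
level10 = bonferroni_level(10)
```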
6.5.2 Summary of hypothesis testing
Infobox 6.4: General procedure for hypothesis tests
6.6 R
6.6.1 t-tests
6.6.2 The F-test for comparing groups
6.6.3 Hypothesis tests as comparison of nested models
6.6.4 Tests for one and two samples
6.6.5 Probabilities and quantiles in the F distribution
6.7 Exercises
Chapter 7 Model validation and prediction
7.1 Model validation
7.1.1 Residual analysis
Infobox 7.1: Model assumptions
Example 7.1. Stearic acid and digestibility of fat
Figure 7.1: Residual analysis for the digestibility data: residual plot (left) and QQ-plot (right) of the standardized residuals. The straight line has intercept zero and slope one.
Example 7.2. Growth of duckweed
Figure 7.2: Residual plots for the duckweed data. Left panel: linear regression with the leaf counts as response. Right panel: linear regression with the logarithmic leaf counts as response.
Example 7.3. Chlorophyll concentration.
Figure 7.3: The chlorophyll data. Upper left panel: scatter plot of the data. Remaining panels: residual plots for the regression of N on C (upper right), for the regression of log(N) on C (lower left), and for the regression of a transformed N on C (lower right).
Infobox 7.2: Model validation based on residuals
Example 7.4. Antibiotics and dung decomposition
Figure 7.4: Residual analysis for the antibiotics data: residual plot (left) and QQ-plot (right) of the standardized residuals. The straight line has intercept zero and slope one.
7.1.2 Conclusions from analysis of transformed data
Example 7.5. Growth of duckweed
Example 7.6. Growth prohibition.
7.2 Prediction
Example 7.7. Blood pressure.
Example 7.8. Beer content in cans.
7.2.1 Prediction in the linear regression model
7.2.2 Confidence intervals versus prediction intervals
Infobox 7.3: Confidence intervals and prediction intervals
Example 7.9. Stearic acid and digestibility of fat
Figure 7.5: Predicted values (solid line), pointwise 95% prediction intervals (dashed lines), and pointwise 95% confidence intervals (dotted lines) for the digestibility data.
7.2.3 Prediction in the one-sample case and in one-way ANOVA
Example 7.10. Beer content in cans
Example 7.11. Vitamin A intake and BMR
7.3 R
7.3.1 Residual analysis
7.3.2 Prediction
7.4 Exercises
Figure 7.6: Four different residual plots.
Chapter 8 Linear normal models
8.1 Multiple linear regression
Example 8.1. Volume of cherry trees.
Table 8.1: Dataset on diameter, volume, and height for 31 cherry trees
Figure 8.1: Scatter plot of volume against height (left panel) and volume against diameter (right panel) for 31 cherry trees.
Figure 8.2: Residual plots for cherry tree data (left panel) and log-transformed cherry tree data (right panel).
Example 8.2. Nutritional composition.
Example 8.3. Tensile strength of Kraft paper.
Figure 8.3: Left panel shows paper strength of Kraft paper as a function of hardwood contents in the pulp with the fitted quadratic regression line superimposed. Right panel is the residual plot for the quadratic regression model.
8.2 Additive two-way analysis of variance
Example 8.4. Cucumber disease.
Table 8.2: Two-way table showing infection rate in cucumbers for different combinations of climate and fertilizer dose
Example 8.5. Cucumber disease
Table 8.3: Analysis of variance table for the additive two-way model of the cucumber data
Table 8.4: Tristimulus brightness measurements of pork chops from 10 pigs at 1, 4, and 6 days after storage
Example 8.6. Pork color over time.
Figure 8.4: Interaction plot of the change in meat brightness for 10 pigs measured at days 1, 4, and 6 after storage. Right panel shows the residual plot for the two-way analysis of variance of the pork data.
Table 8.5: Analysis of variance table for the additive two-way model for the pig brightness data
8.2.1 The additive multi-way analysis of variance
8.2.2 Analysis of variance as linear regression
Example 8.7. Pork color over time
8.3 Linear models
8.3.1 Model formulas
8.3.2 Estimation and parameterization
Figure 8.5: Different parameterizations for comparison of means in three groups. In the left panel we have three parameters — α1, α2, and α3 — that each describe the average level in groups 1–3, respectively. In the right panel we have one parameter that describes the average level in group 1 and two parameters — the difference α2 − α1 and the difference α3 − α1 — that describe contrasts relative to the mean value of group 1.
8.3.3 Hypothesis testing in linear models
Infobox 8.1: Model reduction steps
Example 8.8. Model parameterizations.
Figure 8.6: Graphical illustration of four different statistical models. The points show the expected values for different combinations of two categorical variables A and B (the change along the x-axis). Upper left has an additive effect of both A and B (y = A + B). In the upper right panel there is only an effect of A, y = A, while the lower left figure corresponds to the model y = B. The lower right panel is the model with no effect of A or B, y = 1.
8.4 Interactions between variables
8.4.1 Interactions between categorical variables
Example 8.9. Model parameterizations
Figure 8.7: Graphical example of the expected values from an interaction between two categorical variables A and B, y = A + B + A*B. The interaction model shown here can be compared to the additive models shown in Figure 8.6.
Example 8.10. Pork color over time
8.4.2 Hypothesis tests
Infobox 8.2: Hierarchical principle
Example 8.11. Cucumber disease
Figure 8.8: Interaction plot, with climate “A” represented by filled circles and climate “B” by open squares (left panel), and standardized residual plot for the interaction model (right panel) for the cucumber disease data.
Table 8.6: Analysis of variance table for the cucumber data where we include an interaction between dose and climate
8.4.3 Interactions between categorical and quantitative variables
Example 8.12. Birth weight of boys and girls.
Figure 8.9: Illustration of the possible types of models we can achieve when we have both a categorical variable, A, and a quantitative variable, x. The upper left figure shows an interaction between A and x (i.e., the model y = A + x + A*x), where the interaction allows for different slopes and intercepts according to the level of A. The upper right panel shows three parallel lines (i.e., they have the same slope) but with different intercepts, which corresponds to the model y = A + x. The lines in the lower left panel have identical intercepts but different slopes (i.e., y = A*x), while the lines coincide in the lower right figure, so y = x.
Figure 8.10: Scatter plot of birth weight against age for baby boys (solid dots) and girls (circles). The two lines show the fitted regression lines for boys (solid line) and girls (dashed line). The right panel shows the residual plot for a model with an interaction between sex of the baby and age.
Table 8.7: Mixture proportions and optical densities for 10 dilutions of serum from mice
Example 8.13. ELISA experiment
Figure 8.11: Scatter plot of the optical density against the mixture proportions for the standard dissolution (solid dots) and for the mice serum (circles). The regression lines are the fitted lines in the model with equal slopes for the dissolution types. The right panel shows the residual plot for the model where the slopes are allowed to differ.
8.5 R
8.5.1 Interactions
8.6 Exercises
Figure 8.12: Data on dimensions for jellyfish. The measurements from Dangar Island are the solid circles while the open circles are the measurements from Salamander Bay.
Chapter 9 Non-linear regression
Example 9.1. Reaction rates.
Figure 9.1: Scatter plot of the puromycin data.
9.1 Non-linear regression models
9.2 Estimation, confidence intervals, and hypothesis tests
9.2.1 Non-linear least squares
Example 9.2. Reaction rates
9.2.2 Confidence intervals
Figure 9.2: The puromycin data. Observed data together with the fitted non-linear Michaelis-Menten regression model (left) and corresponding residual plot (right).
Example 9.3. Reaction rates
9.2.3 Hypothesis tests
9.3 Model validation
Example 9.4. Reaction rates
Figure 9.3: Left: Scatter plot of the reciprocal reaction rate, 1/V, against the reciprocal concentration, 1/C, and the corresponding regression line. Right: Two Michaelis-Menten functions. The solid curve is the curve fitted by non-linear regression, whereas the dashed curve is the transformation of the fitted reciprocal regression.
9.3.1 Transform-both-sides
Example 9.5. Growth of lettuce plants.
Figure 9.4: The lettuce data (untransformed).
Figure 9.5: Residual plot for the untransformed Brain-Cousens model (left) and for the square root transformed Brain-Cousens model (right).
Figure 9.6: Fitted Brain-Cousens regressions for the lettuce data. The data points are shown together with the fitted curve from the raw data (dashed) and the fitted curve for the square root transformed (solid).
Table 9.1: Results from the analysis of the square root transformed Brain-Cousens model for the lettuce data. The confidence intervals are the symmetric ones.
9.4 R
9.4.1 Puromycin data
9.4.2 Lettuce data
9.5 Exercises
Figure 9.7: Scatter plot of the data for Exercise 9.4.
Chapter 10 Probabilities
10.1 Outcomes, events, and probabilities
Figure 10.1: Relationships between two events A and B.
Example 10.1. Die throwing.
Infobox 10.1: Definition of probability
Example 10.2. Die throwing
Infobox 10.2: Probability rules
Example 10.3. Die throwing
10.2 Conditional probabilities
Infobox 10.3: Conditional probability
Example 10.4. Specificity and sensitivity.
Table 10.1: Presence of E. coli O157: number of positive and negative test results from samples with and without the bacteria
Figure 10.2: Partition of the sample space U into disjoint events A1, ..., Ak. The event B consists of the disjoint events A1 ∩ B, ..., Ak ∩ B.
Infobox 10.4: Bayes’ theorem
Infobox 10.5: Law of total probability
10.3 Independence
Infobox 10.6: Independence
Example 10.5. Two dice.
Example 10.6. Card games and independent events.
Example 10.7. Cocaine users in the USA.
10.4 Exercises
Chapter 11 The binomial distribution
11.1 The independent trials model
Example 11.1. Independent trials.
11.2 The binomial distribution
Figure 11.1: Probability tree for an independent trials experiment with n = 3 and probability of success p. S and F correspond to success and failure, respectively.
Example 11.2. Germination.
Figure 11.2: Probability distributions for four binomial distributions all with n = 20 and with probability of success p = 0.1,0.25,0.50, and 0.80.
11.2.1 Mean, variance, and standard deviation
Figure 11.3: Variance of a Bernoulli variable for different values of the parameter p.
Example 11.3. Germination
Example 11.4. Germination
11.2.2 Normal approximation
Example 11.5. Blood donors.
Figure 11.4: Approximation of the binomial probability P(2 ≤ Y ≤ 5) with a normal distribution. To get the best possible approximation, we use the interval from 1.5 to 5.5 when we calculate the area under the normal density curve.
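The continuity correction illustrated in Figure 11.4 evaluates the normal CDF at 1.5 and 5.5 rather than at 2 and 5. A stdlib sketch comparing the exact binomial probability with its corrected normal approximation; the values of n and p below are made up for illustration and are not the blood donor numbers from the example:

```python
import math

def binom_pmf(n, p, k):
    """Exact binomial probability P(Y = k) for Y ~ bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def norm_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

n, p = 20, 0.15                     # illustrative values only
mu = n * p                          # mean of bin(n, p)
sigma = math.sqrt(n * p * (1 - p))  # standard deviation of bin(n, p)

exact = sum(binom_pmf(n, p, k) for k in range(2, 6))          # P(2 <= Y <= 5)
approx = norm_cdf(5.5, mu, sigma) - norm_cdf(1.5, mu, sigma)  # corrected
```

With these (assumed) values the corrected approximation lands within a few hundredths of the exact probability, which is the point of the figure.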
Example 11.6. Blood donors
Figure 11.5: Probability distributions for four binomial distributions, all with n = 20 and with probability of success p = 0.1, 0.25, 0.50, and 0.80 and corresponding normal distribution approximations (dashed curves).
11.3 Estimation, confidence intervals, and hypothesis tests
Example 11.7. Apple scab.
Figure 11.6: Probability distribution under the null hypothesis Y ~ bin(8, 0.35). The gray triangle shows the observation, the dashed horizontal line is the probability of the observation (under the null hypothesis), and the solid vertical lines represent the probabilities of the outcomes that are used for calculating the p-value. The dashed lines are the probabilities of the outcomes that do not contradict the null hypothesis.
Example 11.8. Apple scab
11.3.1 Improved confidence interval
Example 11.9. Apple scab
11.4 Differences between proportions
Example 11.10. Smelly pets.
11.5 R
11.6 Exercises
Chapter 12 Analysis of count data
12.1 The chi-square test for goodness-of-fit
Table 12.1: Leg injuries of coyotes caught by two different types of traps. Injury category I is little or no leg damage, category II is torn skin, and category III is broken or amputated leg
Example 12.1. Mendelian inheritance.
Table 12.2: Observed and expected values for Mendel's experiment with pea plants
Example 12.2. Mendelian inheritance
Figure 12.1: The left panel shows the density for the χ²(r) distribution with r = 1 (solid), r = 5 (dashed), and r = 10 (dotted). The right panel illustrates the density for the χ²(5) distribution. The 95% quantile is 11.07, as illustrated by the gray region, which has area 0.95.
12.2 2 × 2 contingency table
12.2.1 Test for homogeneity
Table 12.3: A generic 2 × 2 table
Example 12.3. Avadex.
Example 12.4. Avadex
12.2.2 Test for independence
Table 12.4: A generic 2 × 2 table when data are from a single sample measured for two categorical variables (category 1 and category 2 are the two possible categories for variable 1, while category A and category B are the two possible categories for variable 2)
Example 12.5. Mendelian inheritance
12.2.3 Directional hypotheses for 2 × 2 tables
Example 12.6. Neutering and diabetes.
12.2.4 Fisher's exact test
Example 12.7. Avadex
12.3 Two-sided contingency tables
Table 12.5: A generic r × k table
Example 12.8. Cat behavior.
12.4 R
12.5 Exercises
Chapter 13 Logistic regression
13.1 Odds and odds ratios
Example 13.1. Avadex
13.2 Logistic regression models
Figure 13.1: The logit transformation for different values of p (left panel) and the inverse function: p as a function of the logit value (right panel).
Example 13.2. Moths.
Figure 13.2: The log odds (left) and the probability (right) of male moths dying based on the estimated logistic regression model. Points represent the observed relative frequencies for the six different doses.
Example 13.3. Feline urological syndrome.
Table 13.1: Data on urinary tract disease in cats
13.3 Estimation and confidence intervals
13.3.1 Complete and quasi-complete separation
Example 13.4. Moths
13.3.2 Confidence intervals
Example 13.5. Moths
13.4 Hypothesis tests
13.4.1 Wald tests
Example 13.6. Moths
13.4.2 Likelihood ratio tests
Example 13.7. Moths
13.5 Model validation and prediction
Example 13.8. Nematodes in mackerel.
Figure 13.3: Residual plot for the mackerel data.
Example 13.9. Moths
Example 13.10. Nematodes in mackerel
13.6 R
13.6.1 Model validation
13.7 Exercises
Table 13.2: Degree of pneumoconiosis in coalface workers
Table 13.3: Data regarding willingness to pay a mandatory fee for Danish breeders
Chapter 14 Statistical analysis examples
Figure 14.1: The process of investigating a biological hypothesis in order to reach a biological conclusion through the use of statistics.
14.1 Water temperature and frequency of electric signals from electric eels
Figure 14.2: Plot of the relationship between the frequency of the electric signal and water temperature for 21 electric eels.
14.1.1 Modeling and model validation
Figure 14.3: Left panel shows residual plot of a linear regression model for the eels data. The right panel is the corresponding residual plot for the quadratic regression model.
14.1.2 Model reduction and estimation
Figure 14.4: Plot of the relationship between the frequency of the electric signal and water temperature for 21 electric eels and the fitted regression line.
14.1.3 Conclusion
14.2 Association between listeria growth and RIP2 protein
14.2.1 Modeling and model validation
Figure 14.5: The listeria data. Untransformed listeria growth (left) and log-transformed listeria growth (right). On the horizontal axes we have the groups representing the four different combinations of organ and type of mouse.
Figure 14.6: Residual plot (left) for the two-way ANOVA model with interaction, i.e., the model fitted in twowayWithInt, and interaction plot (right).
14.2.2 Model reduction
14.2.3 Estimation of population group means
Table 14.1: Estimated log-growth of bacteria for the four combinations of mouse type and organ.
14.2.4 Estimation of the RIP2 effect
Table 14.2: Estimated differences in log-growth and estimated ratios in growth between RIP2-deficient and wild type mice.
14.2.5 Conclusion
14.3 Degradation of dioxin
Figure 14.7: Plot of the average total equivalent dose (TEQ) observed in crab liver at two different locations (black circles for site “a”, white circles for site “b”) over the period from 1990 to 2003.
14.3.1 Modeling and model validation
Figure 14.8: Left panel shows the standardized residual plot of a linear model for the dioxin data. The right panel is the corresponding residual plot when the response (TEQ) has been log transformed.
14.3.2 Model reduction
14.3.3 Estimation
14.3.4 Conclusion
14.4 Effect of an inhibitor on the chemical reaction rate
Figure 14.9: Scatter plot of substrate concentration and reaction rate for three different inhibitor concentrations. Squares, circles, and triangles are used for inhibitor concentrations 0, 50 μM, and 100 μM, respectively.
14.4.1 Three separate Michaelis-Menten relationships
Figure 14.10: Left: Data points together with the three independent fitted curves (squares and solid lines for inhibitor concentration 0; circles and dashed lines for concentration 50; triangles and dotted lines for inhibitor concentration 100). Right: Residual plot with the same plotting symbols as in the left plot.
14.4.2 A model for the effect of the inhibitor
Figure 14.11: Data points together with the fitted curves from model (14.1) for the substrate/inhibitor/reaction rate relationship (squares and solid lines for inhibitor concentration 0; circles and dashed lines for concentration 50; triangles and dotted lines for inhibitor concentration 100).
14.4.3 Conclusion
14.5 Birthday bulge on the Danish soccer team
14.5.1 Modeling and goodness-of-fit test
14.5.2 Robustness of results
Figure 14.12: Difference between observed and expected number of players on the Danish national soccer team depending on the birth month of the player.
14.5.3 Conclusion
14.6 Animal welfare
14.6.1 Modeling and testing
14.6.2 Conclusion
14.7 Monitoring herbicide efficacy
14.7.1 Modeling and model validation
Figure 14.13: Plot of the observed relative frequencies of dead plants for varying doses. The different symbols correspond to the three different locations/replicates.
14.7.2 Model reduction and estimation
Figure 14.14: Plot of the observed relative frequencies of dead plants for varying doses overlaid with the fitted logistic regression model. The different symbols correspond to the three different locations/replicates.
14.7.3 Conclusion
Chapter 15 Case exercises
Table 15.1: Distribution of gender in families with 12 children
Table 15.2: Data from mass spectrometry experiment
Back Matter
Appendix A Summary of inference methods
A.1 Statistical concepts
A.2 Statistical analysis
A.3 Model selection
A.4 Statistical formulas
A.4.1 Descriptive and summary statistics
A.4.2 Quantitative variables: one sample
A.4.3 Quantitative variables: two paired samples
A.4.4 Quantitative variables: one-way ANOVA
A.4.5 Quantitative variables: two independent samples
A.4.6 Quantitative variables: linear regression
A.4.7 Binary variables: one sample (one proportion)
A.4.8 Binary variables: two samples (two proportions)
Appendix B Introduction to R
B.1 Working with R
B.1.1 Using R as a pocket calculator
B.1.2 Vectors and matrices
B.2 Data frames and reading data into R
B.2.1 Data frames
B.2.2 Using datasets from R packages
B.2.3 Reading text files
B.2.4 Reading spreadsheet files
B.2.5 Reading SAS, SPSS, and Stata files
B.3 Manipulating data
B.4 Graphics with R
Figure B.1: Plotting symbols (pch=) and line types (lty=) that can be used with the R plotting functions.
B.5 Reproducible research
B.5.1 Writing R-scripts
B.5.2 Saving the complete history
B.6 Installing R
B.6.1 R packages
B.7 Exercises
Appendix C Statistical tables
C.1 The χ2 distribution
C.2 The normal distribution
C.3 The t distribution
C.4 The F distribution
Appendix D List of examples used throughout the book
Bibliography
Index