
EbookNice.com

Most ebook files are in PDF format, so you can easily read them with software such as Foxit Reader or directly in the Google Chrome browser.
Some ebook files are released by publishers in other formats such as .azw, .mobi, .epub, and .fb2. You may need specific software, such as Calibre, to read these formats on mobile or PC.

Please read the tutorial at this link: https://ebooknice.com/page/post?id=faq


We offer FREE conversion to the popular formats you request, though this may take some time. Please email us right after payment, and we will provide the converted file as quickly as possible.


If you receive an unusual file format or a broken link, please do not open a dispute. Email us first, and we will assist within a maximum of 6 hours.

EbookNice Team

(Ebook) Introduction to Statistical Data Analysis for the Life Sciences 2nd Edition by Claus Thorn Ekstrøm, Helle Sørensen ISBN 9781482238938 1482238934

SKU: EBN-5066660
$32 (was $40, -20%)

Status: Available

Rating: 5.0 (29 reviews)
Instant download of (Ebook) Introduction to Statistical Data Analysis for the Life Sciences after payment.
Authors: Claus Thorn Ekstrøm, Helle Sørensen
Pages: 526
Year: 2014
Edition: 2
Publisher: CRC Press
Language: English
File Size: 3.27 MB
Format: PDF
ISBNs: 9781482238938, 9781482238945, 9781482238952, 9781482238969, 1482238934, 1482238942, 1482238950, 1482238969
Categories: Ebooks

Product description

(Ebook) Introduction to Statistical Data Analysis for the Life Sciences 2nd Edition by Claus Thorn Ekstrøm, Helle Sørensen ISBN 9781482238938 1482238934



Product details:

ISBN 10: 1482238934
ISBN 13: 9781482238938
Authors: Claus Thorn Ekstrøm, Helle Sørensen

A Hands-On Approach to Teaching Introductory Statistics

Expanded with over 100 more pages, Introduction to Statistical Data Analysis for the Life Sciences, Second Edition presents the right balance of data examples, statistical theory, and computing to teach introductory statistics to students in the life sciences. This popular textbook covers the m…

(Ebook) Introduction to Statistical Data Analysis for the Life Sciences 2nd Edition Table of contents:

Chapter 1 Description of samples and populations

Figure 1.1: Population and sample. In statistics we sample subjects from a large population and use the information obtained from the sample to infer characteristics about the general population. Thus the upper arrow can be viewed as “sampling” while the lower arrow is “statistical inference”.

1.1 Data types

1.1.1 Categorical data

1.1.2 Quantitative data

Example 1.1. Laminitis in cattle.

Table 1.1: Data on acute laminitis for eight heifers

1.2 Visualizing categorical data

Example 1.2. Tibial dyschondroplasia.

Figure 1.2: Relative frequency plot (left) for broiler chickens with and without presence of tibial dyschondroplasia (dark and light bars, respectively). The segmented bar plot (right) shows stacked relative frequencies of broiler chickens with and without tibial dyschondroplasia for the four groups.

1.3 Visualizing quantitative data

Example 1.3. Tenderness of pork.

Table 1.2: Tenderness of pork from different cooling methods and pH levels

1.4 Statistical summaries

Figure 1.3: Histograms (top row) and relative frequency histograms (bottom row) for tunnel cooling of pork for low- and high-pH groups.

Figure 1.4: Scatter plot of tenderness for rapid cooling and tunnel cooling.

1.4.1 Median and inter-quartile range

1.4.2 Boxplot

Example 1.4. Tenderness of pork

1.4.3 The mean and standard deviation

Example 1.5. Tenderness of pork

Infobox 1.1: Sample mean and standard deviation of linearly transformed data

1.4.4 Mean or median?

Figure 1.5: Histograms, boxplots, and means (▴) for 4 different datasets. The lower right dataset contains 100 observations — the remaining three datasets all contain 1000 observations.

1.5 What is a probability?

Example 1.6. Throwing thumbtacks.

Figure 1.6: Thumbtack throwing. Relative frequency of the event “pin points down” as the number of throws increases.

Table 1.3: Thumbtacks: 100 throws with a brass thumbtack. 1 = pin points down, 0 = pin points up

1.6 R

1.6.1 Visualizing data

1.6.2 Statistical summaries
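
Sections 1.6.1 and 1.6.2 describe how Chapter 1's plots and summaries are produced in R. A rough sketch of the kind of base-R calls involved (the vector below is hypothetical, not data from the book):

    # Hypothetical sample of tenderness scores (cf. Example 1.3)
    tenderness <- c(5.5, 6.1, 7.3, 4.8, 6.6, 5.9, 7.0, 6.2)

    mean(tenderness)       # sample mean (Section 1.4.3)
    median(tenderness)     # median (Section 1.4.1)
    sd(tenderness)         # standard deviation
    quantile(tenderness)   # quartiles, for the inter-quartile range

    hist(tenderness)       # histogram (Section 1.3)
    boxplot(tenderness)    # boxplot (Section 1.4.2)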

1.7 Exercises

Table 1.4: Characteristics of the study population when comparing German Shepherd dogs following diets with and without chocolate. Values are mean ± SD.

Chapter 2 Linear regression

Figure 2.1: The straight line.

Example 2.1. Stearic acid and digestibility of fat.

Figure 2.2: Digestibility of fat for different proportions of stearic acid in the fat. The line is y = −0.9337 · x + 96.5334.

2.1 Fitting a regression line

Example 2.2. Stearic acid and digestibility of fat

Figure 2.3: Residuals for the dataset on digestibility and stearic acid. The vertical lines between the model (the straight line) and the observations are the residuals.

Figure 2.4: Two regression lines for the digestibility data. The solid line is defined by y = −0.9337 · x + 96.5334 while the dashed line is defined by y = 0.6 · x + 74.15. Both regression lines have residual sum zero.

2.1.1 Least squares estimation

Figure 2.5: Squared residuals for the dataset on digestibility and stearic acid. Gray areas represent the squared residuals for the proposed regression line.

Table 2.1: Calculations for the stearic acid data

Example 2.3. Stearic acid and digestibility of fat

2.2 When is linear regression appropriate?

Figure 2.6: The effect of influential points on linear regression slopes. If we add a single extra point at (35, 75) to the stearic acid data we will change the regression slope from −0.9337 (solid line) to −0.706 (dashed line).

Figure 2.7: Interchanging x and y. Regression estimates of x on y cannot be determined from the regression estimates of y on x as the vertical residuals are used to fit the model for the latter while the “horizontal” residuals are needed for the former. The solid line corresponds to the regression of stearic acid percentage on digestibility while the dashed line is the ordinary regression of digestibility on stearic acid.

2.2.1 Transformation

Example 2.4. Growth of duckweed.

Figure 2.8: Top panel shows the original duckweed data. Bottom left shows the data and fitted regression line after logarithmic transformation and bottom right shows the fitted line transformed back to the original scale.

2.3 The correlation coefficient

Example 2.5. Tenderness of pork and sarcomere length.

Figure 2.9: Correlation coefficients for different datasets. Note from the second row of graphs that the slope has no influence on the correlation coefficient except for the middle case where the variance of y is 0 so the correlation is not well-defined. The last row of graphs shows that the correlation may be zero even though the data are highly structured.

Figure 2.10: Graph of tenderness of pork and sarcomere length for 24 pigs.

Table 2.2: Data on sarcomere length and pork tenderness

2.3.1 When is the correlation coefficient relevant?

2.4 Perspective

2.4.1 Modeling the residuals

2.4.2 More complicated models

2.5 R
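
Section 2.5 covers the R side of Chapter 2. A minimal sketch of least squares fitting with lm(), using hypothetical stand-ins for the stearic acid/digestibility variables:

    # Hypothetical stand-ins for the digestibility data
    stearic <- c(29.8, 30.3, 22.6, 18.7, 14.8, 4.1, 5.5, 31.7, 28.7)
    digest  <- c(67.5, 70.6, 72.0, 78.2, 87.0, 89.9, 91.2, 71.5, 65.7)

    fit <- lm(digest ~ stearic)         # least squares regression line
    coef(fit)                           # intercept and slope estimates
    cor(stearic, digest)                # correlation coefficient (Section 2.3)
    plot(stearic, digest); abline(fit)  # scatter plot with fitted line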

2.6 Exercises

Chapter 3 Comparison of groups

3.1 Graphical and simple numerical comparison

Example 3.1. Parasite counts for salmon.

Figure 3.1: Boxplot for the two salmon samples.

Example 3.2. Antibiotics and dung decomposition.

Figure 3.2: Graphical presentation of the antibiotics data. Left panel: data points with group sample means (solid line segments) and the total mean of all observations (dashed line). Right panel: parallel boxplots.

Table 3.1: Group means and group standard deviations for the antibiotics data

3.2 Between-group variation and within-group variation

3.3 Populations, samples, and expected values

3.4 Least squares estimation and residuals

Example 3.3. Antibiotics and dung decomposition

Example 3.4. Antibiotics and dung decomposition

3.5 Paired and unpaired samples

Example 3.5. Equine lameness.

3.6 Perspective

3.7 R

3.7.1 Means and standard deviations

3.7.2 Factors

3.8 Exercises

Figure 3.3: Boxplots for Exercise 3.1.

Chapter 4 The normal distribution

4.1 Properties

Example 4.1. Crab weights.

Figure 4.1: Histogram for the crab weight data together with the density for the normal distribution with mean 12.76 and standard deviation 2.25.

4.1.1 Density, mean, and standard deviation

Figure 4.2: Illustration of the relation (4.2). The gray area under the density between a and b corresponds to, and is interpreted as, the probability of a random observation falling between a and b.

Example 4.2. Crab weights

Figure 4.3: Densities for four different normal distributions.

Infobox 4.1: Properties of the density for the normal distribution

4.1.2 Transformations of normally distributed variables

Infobox 4.2: Transformation of normally distributed variables

Example 4.3. Crab weights

Infobox 4.3: Distribution of sample mean

4.1.3 Probability calculations

Figure 4.4: The density function ϕ (left) and the cumulative distribution function Φ (right) for N(0, 1). The dashed lines mark the corresponding quantiles. Each gray-shaded region has area (probability) 0.05, whereas the dash-shaded region has area (probability) 0.90.

Table 4.1: Selected values of Φ, the cumulative distribution function for the standard normal distribution

4.1.4 Central part of distribution

Table 4.2: Intervals and corresponding probabilities for N(μ, σ²)

Figure 4.5: Density for N(μ, σ²) with probabilities for intervals μ ± σ, μ ± 2σ, and μ ± 3σ.

Example 4.4. Crab weights

4.2 One sample

4.2.1 Independence

Example 4.5. Sampling of apple trees.

4.2.2 Estimation

Infobox 4.4: Statistical properties of sample means

Figure 4.6: Distribution of the sample mean for sample size 10 (left) and sample size 25 (right).

4.3 Are the data (approximately) normally distributed?

4.3.1 Histograms and QQ-plots

Figure 4.7: QQ-plot of the crab weight data together with the line with intercept equal to the sample mean ȳ = 12.76 and slope equal to s = 2.25.

4.3.2 Transformations

Example 4.6. Vitamin A intake and BMR.

Figure 4.8: Histograms and QQ-plots of the intake of vitamin A for 1079 men: original values to the left, log-transformed values to the right.

Figure 4.9: Histograms of the BMR variable for 1079 men (upper left), 1145 women (upper right), and all 2224 persons (bottom left), and QQ-plot for all 2224 persons (bottom right).

4.3.3 The exponential distribution

Example 4.7. Interspike intervals for neurons.

Figure 4.10: Histograms for 312 interspike intervals with different resolutions together with the density for the exponential distribution with rate 1.147.

4.4 The central limit theorem

Infobox 4.5: Central limit theorem (CLT)

Example 4.8. Central limit theorem for binary variables.

Example 4.9. Central limit theorem for a bimodal distribution.

Figure 4.11: Illustration of the central limit theorem for 0/1-variables with p = 0.5. Histograms of the sample means for 1000 simulated samples of size 10 (left) and 100 (right) compared to the corresponding normal density curve.

Figure 4.12: Sample distribution of the sample mean of n observations from a bimodal distribution. The bimodal distribution is illustrated in the upper left panel (corresponding to n = 1). The curves are normal densities.

4.5 R

4.5.1 Computations with normal distributions

4.5.2 Random numbers

4.5.3 QQ-plots
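
Sections 4.5.1–4.5.3 rest on R's pnorm/qnorm/rnorm family. A minimal sketch, reusing the mean 12.76 and standard deviation 2.25 quoted for the crab weights:

    pnorm(1) - pnorm(-1)                 # P(-1 < Z < 1) for Z ~ N(0, 1), about 0.68
    pnorm(14, mean = 12.76, sd = 2.25)   # P(Y <= 14) under the fitted crab-weight normal
    qnorm(0.975)                         # 97.5% quantile, about 1.96

    y <- rnorm(50, mean = 12.76, sd = 2.25)  # 50 simulated observations
    qqnorm(y); qqline(y)                     # QQ-plot to assess normality (Section 4.3.1)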

4.6 Exercises

Chapter 5 Statistical models, estimation, and confidence intervals

5.1 Statistical models

Example 5.1. Stearic acid and digestibility of fat

Example 5.2. Antibiotics and dung decomposition

5.1.1 Model assumptions

Figure 5.1: Illustration of the linear regression model (left) and the one-way ANOVA model for k = 5 (right).

Infobox 5.1: Linear model (version 1)

Infobox 5.2: Linear model (version 2)

5.1.2 Model formulas

5.1.3 Linear regression

5.1.4 Comparison of groups

5.1.5 One sample

5.2 Estimation

5.2.1 Least squares estimation of the mean parameters

5.2.2 Estimation of the standard deviation σ

5.2.3 Standard errors and distribution of least squares estimates

5.2.4 Linear regression

Example 5.3. Stearic acid and digestibility of fat

5.2.5 Comparison of groups

Example 5.4. Antibiotics and dung decomposition

Table 5.1: Estimates for group means and comparison to the control group for the antibiotics data

5.2.6 One sample

Example 5.5. Crab weights

5.2.7 Bias and precision. Maximum likelihood estimation

Figure 5.2: Bias and precision for four imaginary estimation methods. The centers of the circles represent the true value of the parameter whereas the points correspond to estimates based on 60 different datasets.

5.3 Confidence intervals

5.3.1 The t distribution

Figure 5.3: Left: density for the t distribution with r = 1 degree of freedom (solid) and r = 4 degrees of freedom (dashed), as well as for N(0, 1) (dotted). Right: the probability of an interval is the area under the density curve, illustrated by the t4 distribution.

Table 5.2: 95% and 97.5% quantiles for selected t distributions

5.3.2 Confidence interval for the mean for one sample

Figure 5.4: Density for the t10 distribution. The 95% quantile is 1.812, as illustrated by the gray region which has area 0.95. The 97.5% quantile is 2.228, illustrated by the dashed region with area 0.975.

Example 5.6. Crab weights

5.3.3 Interpretation of the confidence interval

Figure 5.5: Confidence intervals for 50 simulated datasets generated from N(0, 1). The sample size and confidence level vary (see the top of each panel).

5.3.4 Confidence intervals for linear models

Infobox 5.3: Confidence intervals for parameters in linear models

Example 5.7. Stearic acid and digestibility of fat

Figure 5.6: Pointwise 95% confidence intervals for the regression line.

Example 5.8. Antibiotics and dung decomposition

Example 5.9. Parasite counts for salmon

Table 5.3: Mixture proportions and optical densities for 10 dilutions of a standard dissolution with Ubiquitin antibody

Figure 5.7: Scatter plot of the optical density against the mixture proportions (left) and against the logarithmic mixture proportions with the fitted regression line (right).

Example 5.10. ELISA experiment.

5.4 Unpaired samples with different standard deviations

Example 5.11. Parasite counts for salmon

Example 5.12. Vitamin A intake and BMR

5.5 R

5.5.1 Linear regression

5.5.2 One-way ANOVA

Table 5.4: Estimates for group means and comparison to the control group for the antibiotics data (identical to Table 5.1)

5.5.3 One sample and two samples

5.5.4 Probabilities and quantiles in the t distribution
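
Section 5.5.4 boils down to qt() and pt(); the quantiles quoted in Figure 5.4 can be reproduced directly, and a one-sample confidence interval follows the usual t formula. A sketch with a hypothetical sample:

    qt(0.975, df = 10)   # 97.5% quantile of t10, about 2.228 (Figure 5.4)
    qt(0.95,  df = 10)   # 95% quantile, about 1.812

    # Hypothetical sample; 95% confidence interval for the mean
    y <- c(12.1, 13.4, 11.8, 14.0, 12.9, 13.1, 12.5, 13.8, 12.2, 11.9, 13.0)
    mean(y) + c(-1, 1) * qt(0.975, df = length(y) - 1) * sd(y) / sqrt(length(y))
    t.test(y)$conf.int   # the same interval via t.test()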

5.6 Exercises

Figure 5.8: Illustration of the 95% confidence intervals for the mean weight gain for two diets.

Chapter 6 Hypothesis tests

Example 6.1. Hormone concentration in cattle.

Figure 6.1: Illustration of the t-test for the cattle data. Left panel: the p-value is equal to the area of the gray regions, i.e., the probability of a t-value at least as extreme as the one observed. Right panel: the critical values for the test are those outside the interval from −2.31 to 2.31, because they give a p-value smaller than 0.05.

Figure 6.2: Histogram of the T-values from 1000 simulated datasets of size nine from the N(0, 15²) distribution. The density for the t8 distribution is superimposed, and the dashed lines correspond to ±2.71 (2.71 being the observed value from the cattle dataset).

6.1 Null hypotheses

Example 6.2. Stearic acid and digestibility of fat

Example 6.3. Lifespan and length of gestation period.

Table 6.1: Lifespan and length of gestation period and age for seven horses

Figure 6.3: Lifespan and length of gestation period and age for seven horses.

Example 6.4. Antibiotics and dung decomposition

6.2 t-tests

Infobox 6.1: t-test

Example 6.5. Parasite counts for salmon

Infobox 6.2: Relationship between t-tests and confidence intervals

Example 6.6. Stearic acid and digestibility of fat

Example 6.7. Production control.

Example 6.8. Lifespan and length of gestation period

6.3 Tests in a one-way ANOVA

6.3.1 The F-test for comparison of groups

Figure 6.4: Left: Densities for the F(5, 28) distribution (solid), the F(2, 27) distribution (dashed), and the F(4, 19) distribution (dotted). Right: The density for the F(5, 28) distribution with the 95% quantile F0.95(5, 28) = 2.56 and an imaginary value Fobs. The corresponding p-value is equal to the area of the gray region (including the dashed gray), whereas the dashed gray region has area 0.05.

Table 6.2: Analysis of variance table

Infobox 6.3: F-test for comparison of groups

Example 6.9. Antibiotics and dung decomposition

6.3.2 Pairwise comparisons and LSD-values

Example 6.10. Antibiotics and dung decomposition

Table 6.3: Binding rates for three types of antibiotics

Example 6.11. Binding of antibiotics.

Figure 6.5: Binding rates for three types of antibiotics.

6.4 Hypothesis tests as comparison of nested models

Example 6.12. Stearic acid and digestibility of fat

6.5 Type I and type II errors

Table 6.4: The four possible outcomes when testing a hypothesis

6.5.1 Multiple testing. Bonferroni correction

Figure 6.6: The risk of making at least one type I error out of m independent tests at the 5% significance level.
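
The curve in Figure 6.6 is just 1 − 0.95^m, the probability of at least one false rejection among m independent tests at the 5% level. A quick sketch of the computation, together with the Bonferroni correction of Section 6.5.1:

    m <- c(1, 5, 10, 20)
    1 - (1 - 0.05)^m   # risk of at least one type I error: 0.05, 0.23, 0.40, 0.64
    0.05 / m           # Bonferroni-corrected per-test significance levels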

6.5.2 Summary of hypothesis testing

Infobox 6.4: General procedure for hypothesis tests

6.6 R

6.6.1 t-tests

6.6.2 The F-test for comparing groups

6.6.3 Hypothesis tests as comparison of nested models

6.6.4 Tests for one and two samples

6.6.5 Probabilities and quantiles in the F distribution
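
Sections 6.6.1–6.6.5 use base-R testing functions. A sketch with hypothetical group measurements:

    # Hypothetical measurements from two groups
    a <- c(5.1, 6.0, 5.5, 5.8, 6.2, 5.4)
    b <- c(6.4, 6.9, 6.1, 7.0, 6.6, 6.8)
    t.test(a, b)                    # two-sample t-test (Welch by default)

    g <- factor(rep(c("A", "B"), each = 6))
    anova(lm(c(a, b) ~ g))          # F-test for comparison of groups (Section 6.3.1)

    qf(0.95, df1 = 5, df2 = 28)     # 95% quantile of F(5, 28), about 2.56 (Figure 6.4)
    1 - pf(3.2, df1 = 5, df2 = 28)  # p-value for a hypothetical Fobs = 3.2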

6.7 Exercises

Chapter 7 Model validation and prediction

7.1 Model validation

7.1.1 Residual analysis

Infobox 7.1: Model assumptions

Example 7.1. Stearic acid and digestibility of fat

Figure 7.1: Residual analysis for the digestibility data: residual plot (left) and QQ-plot (right) of the standardized residuals. The straight line has intercept zero and slope one.

Example 7.2. Growth of duckweed

Figure 7.2: Residual plots for the duckweed data. Left panel: linear regression with the leaf counts as response. Right panel: linear regression with the logarithmic leaf counts as response.

Example 7.3. Chlorophyll concentration.

Figure 7.3: The chlorophyll data. Upper left panel: scatter plot of the data. Remaining panels: residual plots for the regression of N on C (upper right), for the regression of log(N) on C (lower left), and for the regression of √N on C (lower right).

Infobox 7.2: Model validation based on residuals

Example 7.4. Antibiotics and dung decomposition

Figure 7.4: Residual analysis for the antibiotics data: residual plot (left) and QQ-plot (right) of the standardized residuals. The straight line has intercept zero and slope one.

7.1.2 Conclusions from analysis of transformed data

Example 7.5. Growth of duckweed

Example 7.6. Growth prohibition.

7.2 Prediction

Example 7.7. Blood pressure.

Example 7.8. Beer content in cans.

7.2.1 Prediction in the linear regression model

7.2.2 Confidence intervals versus prediction intervals

Infobox 7.3: Confidence intervals and prediction intervals

Example 7.9. Stearic acid and digestibility of fat

Figure 7.5: Predicted values (solid line), pointwise 95% prediction intervals (dashed lines), and pointwise 95% confidence intervals (dotted lines) for the digestibility data.

7.2.3 Prediction in the one-sample case and in one-way ANOVA

Example 7.10. Beer content in cans

Example 7.11. Vitamin A intake and BMR

7.3 R

7.3.1 Residual analysis

7.3.2 Prediction
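
Section 7.3 relies on predict() and the standard residual extractors. A sketch, reusing the hypothetical regression fit from the Chapter 2 sketch above:

    # fit <- lm(digest ~ stearic), as in the Chapter 2 sketch
    new <- data.frame(stearic = c(10, 20, 30))
    predict(fit, newdata = new, interval = "confidence")  # CI for the mean response
    predict(fit, newdata = new, interval = "prediction")  # wider interval for a new observation

    plot(fitted(fit), rstandard(fit)); abline(h = 0)      # standardized residual plot (Section 7.1.1)
    qqnorm(rstandard(fit)); qqline(rstandard(fit))        # QQ-plot of standardized residuals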

7.4 Exercises

Figure 7.6: Four different residual plots.

Chapter 8 Linear normal models

8.1 Multiple linear regression

Example 8.1. Volume of cherry trees.

Table 8.1: Dataset on diameter, volume, and height for 31 cherry trees

Figure 8.1: Scatter plot of volume against height (left panel) and volume against diameter (right panel) for 31 cherry trees.

Figure 8.2: Residual plots for cherry tree data (left panel) and log-transformed cherry tree data (right panel).

Example 8.2. Nutritional composition.

Example 8.3. Tensile strength of Kraft paper.

Figure 8.3: Left panel shows paper strength of Kraft paper as a function of hardwood contents in the pulp with the fitted quadratic regression line superimposed. Right panel is the residual plot for the quadratic regression model.

8.2 Additive two-way analysis of variance

Example 8.4. Cucumber disease.

Table 8.2: Two-way table showing infection rate in cucumbers for different combinations of climate and fertilizer dose

Example 8.5. Cucumber disease

Table 8.3: Analysis of variance table for the additive two-way model of the cucumber data

Table 8.4: Tristimulus brightness measurements of pork chops from 10 pigs at 1, 4, and 6 days after storage

Example 8.6. Pork color over time.

Figure 8.4: Interaction plot of the change in meat brightness for 10 pigs measured at days 1, 4, and 6 after storage. Right panel shows the residual plot for the two-way analysis of variance of the pork data.

Table 8.5: Analysis of variance table for the additive two-way model for the pig brightness data

8.2.1 The additive multi-way analysis of variance

8.2.2 Analysis of variance as linear regression

Example 8.7. Pork color over time

8.3 Linear models

8.3.1 Model formulas

8.3.2 Estimation and parameterization

Figure 8.5: Different parameterizations for comparison of means in three groups. In the left panel we have three parameters, α1, α2, and α3, that each describe the average level in groups 1–3, respectively. In the right panel we have one parameter that describes the average level in group 1 and two parameters, the difference α2 − α1 and the difference α3 − α1, that describe contrasts relative to the mean value of group 1.

8.3.3 Hypothesis testing in linear models

Infobox 8.1: Model reduction steps

Example 8.8. Model parameterizations.

Figure 8.6: Graphical illustration of four different statistical models. The points show the expected values for different combinations of two categorical variables A and B (the change along the x-axis). Upper left has an additive effect of both A and B (y = A + B). In the upper right panel there is only an effect of A, y = A, while the lower left figure corresponds to the model y = B. The lower right panel is the model with no effect of A or B, y = 1.

8.4 Interactions between variables

8.4.1 Interactions between categorical variables

Example 8.9. Model parameterizations

Figure 8.7: Graphical example of the expected values from an interaction between two categorical variables A and B, y = A + B + A*B. The interaction model shown here can be compared to the additive models shown in Figure 8.6.

Example 8.10. Pork color over time

8.4.2 Hypothesis tests

Infobox 8.2: Hierarchical principle

Example 8.11. Cucumber disease

Figure 8.8: Interaction plot, with climate “A” represented by filled circles and climate “B” by open squares (left panel), and standardized residual plot for the interaction model (right panel) for the cucumber disease data.

Table 8.6: Analysis of variance table for the cucumber data where we include an interaction between dose and climate

8.4.3 Interactions between categorical and quantitative variables

Example 8.12. Birth weight of boys and girls.

Figure 8.9: Illustration of the possible types of models we can achieve when we have both a categorical variable, A, and a quantitative variable, x. The upper left figure shows an interaction between A and x (i.e., the model y = A + x + A*x), where the interaction allows for different slopes and intercepts according to the level of A. The upper right panel shows three parallel lines (i.e., they have the same slope) but with different intercepts, which corresponds to the model y = A + x. The lines in the lower left panel have identical intercepts but different slopes (i.e., y = A*x), while the lines coincide in the lower right figure, so y = x.

Figure 8.10: Scatter plot of birth weight against age for baby boys (solid dots) and girls (circles). The two lines show the fitted regression lines for boys (solid line) and girls (dashed line). The right panel shows the residual plot for a model with an interaction between sex of the baby and age.

Table 8.7: Mixture proportions and optical densities for 10 dilutions of serum from mice

Example 8.13. ELISA experiment

Figure 8.11: Scatter plot of the optical density against the mixture proportions for the standard dissolution (solid dots) and for the mice serum (circles). The regression lines are the fitted lines in the model with equal slopes for the dissolution types. The right panel shows the residual plot for the model where the slopes are allowed to differ.

8.5 R

8.5.1 Interactions
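
Section 8.5 expresses all of these models through R's formula syntax. A schematic sketch with hypothetical variables (A and B categorical factors, x quantitative):

    # Hypothetical data frame with two factors and a covariate
    d <- data.frame(y = rnorm(24), x = runif(24),
                    A = factor(rep(c("a1", "a2"), 12)),
                    B = factor(rep(c("b1", "b2", "b3"), 8)))

    fit1 <- lm(y ~ A + B, data = d)  # additive two-way ANOVA (Section 8.2)
    fit2 <- lm(y ~ A * B, data = d)  # with interaction, i.e., A + B + A:B
    anova(fit1, fit2)                # test of the interaction (Section 8.4.2)
    lm(y ~ A * x, data = d)          # separate slopes and intercepts per level of A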

8.6 Exercises

Figure 8.12: Data on dimensions for jellyfish. The measurements from Dangar Island are the solid circles while the open circles are the measurements from Salamander Bay.

Chapter 9 Non-linear regression

Example 9.1. Reaction rates.

Figure 9.1: Scatter plot of the puromycin data.

9.1 Non-linear regression models

9.2 Estimation, confidence intervals, and hypothesis tests

9.2.1 Non-linear least squares

Example 9.2. Reaction rates

9.2.2 Confidence intervals

Figure 9.2: The puromycin data. Observed data together with the fitted non-linear Michaelis-Menten regression model (left) and corresponding residual plot (right).

Example 9.3. Reaction rates

9.2.3 Hypothesis tests

9.3 Model validation

Example 9.4. Reaction rates

Figure 9.3: Left: Scatter plot of the reciprocal reaction rate, 1/V, against the reciprocal concentration, 1/C, and the corresponding regression line. Right: Two Michaelis-Menten functions. The solid curve is the curve fitted by non-linear regression, whereas the dashed curve is the transformation of the fitted reciprocal regression.

9.3.1 Transform-both-sides

Example 9.5. Growth of lettuce plants.

Figure 9.4: The lettuce data (untransformed).

Figure 9.5: Residual plot for the untransformed Brain-Cousens model (left) and for the square root transformed Brain-Cousens model (right).

Figure 9.6: Fitted Brain-Cousens regressions for the lettuce data. The data points are shown together with the fitted curve from the raw data (dashed) and the fitted curve for the square root transformed (solid).

Table 9.1: Results from the analysis of the square root transformed Brain-Cousens model for the lettuce data. The confidence intervals are the symmetric ones.

9.4 R

9.4.1 Puromycin data

9.4.2 Lettuce data
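
Section 9.4.1 fits the Michaelis-Menten curve with nls(). A minimal sketch using R's built-in Puromycin dataset as a stand-in (the book's data may differ; the starting values are rough guesses):

    treated <- subset(Puromycin, state == "treated")

    # Michaelis-Menten: V = Vmax * C / (K + C)
    fit <- nls(rate ~ Vmax * conc / (K + conc), data = treated,
               start = list(Vmax = 200, K = 0.1))
    summary(fit)   # estimates and standard errors for Vmax and K

    plot(rate ~ conc, data = treated)
    cgrid <- seq(0, 1.2, by = 0.01)
    lines(cgrid, predict(fit, newdata = data.frame(conc = cgrid)))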

9.5 Exercises

Figure 9.7: Scatter plot of the data for Exercise 9.4.

Chapter 10 Probabilities

10.1 Outcomes, events, and probabilities

Figure 10.1: Relationships between two events A and B.

Example 10.1. Die throwing.

Infobox 10.1: Definition of probability

Example 10.2. Die throwing

Infobox 10.2: Probability rules

Example 10.3. Die throwing

10.2 Conditional probabilities

Infobox 10.3: Conditional probability

Example 10.4. Specificity and sensitivity.

Table 10.1: Presence of E. coli O157: number of positive and negative test results from samples with and without the bacteria

Figure 10.2: Partition of the sample space U into disjoint events A1, ..., Ak. The event B consists of the disjoint events A1 ∩ B, ..., Ak ∩ B.

Infobox 10.4: Bayes’ theorem

Infobox 10.5: Law of total probability
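
Infoboxes 10.3–10.5 combine into the standard diagnostic-test calculation behind Example 10.4. A worked sketch with made-up numbers (not the book's):

    sens <- 0.95   # P(positive | condition present), hypothetical sensitivity
    spec <- 0.98   # P(negative | condition absent), hypothetical specificity
    prev <- 0.01   # hypothetical prevalence

    # Law of total probability: P(positive test)
    p_pos <- sens * prev + (1 - spec) * (1 - prev)

    # Bayes' theorem: P(condition present | positive test)
    sens * prev / p_pos   # about 0.32, despite the accurate test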

10.3 Independence

Infobox 10.6: Independence

Example 10.5. Two dice.

Example 10.6. Card games and independent events.

Example 10.7. Cocaine users in the USA.

10.4 Exercises

Chapter 11 The binomial distribution

11.1 The independent trials model

Example 11.1. Independent trials.

11.2 The binomial distribution

Figure 11.1: Probability tree for an independent trials experiment with n = 3 and probability of success p. S and F correspond to success and failure, respectively.

Example 11.2. Germination.

Figure 11.2: Probability distributions for four binomial distributions, all with n = 20 and with probability of success p = 0.1, 0.25, 0.50, and 0.80.

11.2.1 Mean, variance, and standard deviation

Figure 11.3: Variance of a Bernoulli variable for different values of the parameter p.

Example 11.3. Germination

Example 11.4. Germination

11.2.2 Normal approximation

Example 11.5. Blood donors.

Figure 11.4: Approximation of the binomial probability P(2 ≤ Y ≤ 5) with a normal distribution. To get the best possible approximation, we use the interval from 1.5 to 5.5 when we calculate the area under the normal density curve.

Example 11.6. Blood donors

Figure 11.5: Probability distributions for four binomial distributions, all with n = 20 and with probability of success p = 0.1, 0.25, 0.50, and 0.80 and corresponding normal distribution approximations (dashed curves).

11.3 Estimation, confidence intervals, and hypothesis tests

Example 11.7. Apple scab.

Figure 11.6: Probability distribution under the null hypothesis Y ~ bin(8, 0.35). The gray triangle shows the observation, the dashed horizontal line is the probability of the observation (under the null hypothesis), and the solid vertical lines represent the probabilities of the outcomes that are used for calculating the p-value. The dashed lines are the probabilities of the outcomes that do not contradict the null hypothesis.

Example 11.8. Apple scab

11.3.1 Improved confidence interval

Example 11.9. Apple scab

11.4 Differences between proportions

Example 11.10. Smelly pets.

11.5 R
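
Section 11.5 uses the d/p/q/r functions for the binomial distribution. A minimal sketch:

    dbinom(5, size = 20, prob = 0.25)   # P(Y = 5) for Y ~ bin(20, 0.25)
    pbinom(5, size = 20, prob = 0.25)   # P(Y <= 5)

    # Normal approximation with continuity correction (cf. Figure 11.4)
    pnorm(5.5, mean = 20 * 0.25, sd = sqrt(20 * 0.25 * 0.75))

    # Estimate and confidence interval for a proportion (hypothetical counts)
    binom.test(x = 14, n = 40)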

11.6 Exercises

Chapter 12 Analysis of count data

12.1 The chi-square test for goodness-of-fit

Table 12.1: Leg injuries of coyotes caught by two different types of traps. Injury category I is little or no leg damage, category II is torn skin, and category III is broken or amputated leg

Example 12.1. Mendelian inheritance.

Table 12.2: Observed and expected values for Mendel's experiment with pea plants

Example 12.2. Mendelian inheritance

Figure 12.1: The left panel shows the density for the χ²(r) distribution with r = 1 (solid), r = 5 (dashed), and r = 10 (dotted). The right panel illustrates the density for the χ²(5) distribution. The 95% quantile is 11.07, as illustrated by the gray region which has area 0.95.

12.2 2 × 2 contingency table

12.2.1 Test for homogeneity

Table 12.3: A generic 2 × 2 table

Example 12.3. Avadex.

Example 12.4. Avadex

12.2.2 Test for independence

Table 12.4: A generic 2 × 2 table when data are from a single sample measured for two categorical variables (category 1 and category 2 are the two possible categories for variable 1, while category A and category B are the two possible categories for variable 2)

Example 12.5. Mendelian inheritance

12.2.3 Directional hypotheses for 2 × 2 tables

Example 12.6. Neutering and diabetes.

12.2.4 Fisher's exact test

Example 12.7. Avadex

12.3 Two-sided contingency tables

Table 12.5: A generic r × k table

Example 12.8. Cat behavior.

12.4 R
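
Section 12.4 applies chisq.test() and fisher.test() to such tables. A sketch on a hypothetical 2 × 2 table:

    # Hypothetical 2 x 2 table: rows are groups, columns are outcomes
    tab <- matrix(c(18,  7,
                    12, 23), nrow = 2, byrow = TRUE)

    chisq.test(tab)                   # chi-square test for homogeneity/independence
    chisq.test(tab, correct = FALSE)  # without the continuity correction
    fisher.test(tab)                  # Fisher's exact test (Section 12.2.4)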

12.5 Exercises

Chapter 13 Logistic regression

13.1 Odds and odds ratios

Example 13.1. Avadex

13.2 Logistic regression models

Figure 13.1: The logit transformation for different values of p (left panel) and the inverse function: p as a function of the logit value (right panel).

Example 13.2. Moths.

Figure 13.2: The log odds (left) and the probability (right) of male moths dying based on the estimated logistic regression model. Points represent the observed relative frequencies for the six different doses.

Example 13.3. Feline urological syndrome.

Table 13.1: Data on urinary tract disease in cats

13.3 Estimation and confidence intervals

13.3.1 Complete and quasi-complete separation

Example 13.4. Moths

13.3.2 Confidence intervals

Example 13.5. Moths

13.4 Hypothesis tests

13.4.1 Wald tests

Example 13.6. Moths

13.4.2 Likelihood ratio tests

Example 13.7. Moths

13.5 Model validation and prediction

Example 13.8. Nematodes in mackerel.

Figure 13.3: Residual plot for the mackerel data.

Example 13.9. Moths

Example 13.10. Nematodes in mackerel

13.6 R

13.6.1 Model validation
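
Section 13.6 fits these models with glm(). A sketch with hypothetical grouped dose-response counts, loosely in the spirit of the moth example rather than the book's actual data:

    # Hypothetical counts: dead out of 20 moths at each dose
    dose <- c(1, 2, 4, 8, 16, 32)
    dead <- c(1, 4, 9, 13, 18, 20)
    n    <- rep(20, 6)

    fit <- glm(cbind(dead, n - dead) ~ log(dose), family = binomial)
    summary(fit)                      # Wald tests (Section 13.4.1)
    exp(coef(fit))                    # odds ratios (Section 13.1)
    drop1(fit, test = "LRT")          # likelihood ratio test (Section 13.4.2)
    predict(fit, type = "response")   # fitted death probabilities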

13.7 Exercises

Table 13.2: Degree of pneumoconiosis in coalface workers

Table 13.3: Data regarding willingness to pay a mandatory fee for Danish breeders

Chapter 14 Statistical analysis examples

Figure 14.1: The process of investigating a biological hypothesis in order to reach a biological conclusion through the use of statistics.

14.1 Water temperature and frequency of electric signals from electric eels

Figure 14.2: Plot of the relationship between the frequency of the electric signal and water temperature for 21 electric eels.

14.1.1 Modeling and model validation

Figure 14.3: Left panel shows residual plot of a linear regression model for the eels data. The right panel is the corresponding residual plot for the quadratic regression model.

14.1.2 Model reduction and estimation

Figure 14.4: Plot of the relationship between the frequency of the electric signal and water temperature for 21 electric eels and the fitted regression line.

14.1.3 Conclusion

14.2 Association between listeria growth and RIP2 protein

14.2.1 Modeling and model validation

Figure 14.5: The listeria data. Untransformed listeria growth (left) and log-transformed listeria growth (right). On the horizontal axes we have the groups representing the four different combinations of organ and type of mouse.

Figure 14.6: Residual plot (left) for the two-way ANOVA model with interaction, i.e., the model fitted in twowayWithInt, and interaction plot (right).

14.2.2 Model reduction

14.2.3 Estimation of population group means

Table 14.1: Estimated log-growth of bacteria for the four combinations of mouse type and organ.

14.2.4 Estimation of the RIP2 effect

Table 14.2: Estimated differences in log-growth and estimated ratios in growth between RIP2-deficient and wild type mice.

14.2.5 Conclusion

14.3 Degradation of dioxin

Figure 14.7: Plot of the average total equivalent dose (TEQ) observed in crab liver at two different locations (site “a” is the black circles while site “b” is the white circles) over the period from 1990 to 2003.

14.3.1 Modeling and model validation

Figure 14.8: Left panel shows the standardized residual plot of a linear model for the dioxin data. The right panel is the corresponding residual plot when the response (TEQ) has been log transformed.

14.3.2 Model reduction

14.3.3 Estimation

14.3.4 Conclusion

14.4 Effect of an inhibitor on the chemical reaction rate

Figure 14.9: Scatter plot of substrate concentration and reaction rate for three different inhibitor concentrations. Squares, circles, and triangles are used for inhibitor concentrations 0, 50 μM, and 100 μM, respectively.

14.4.1 Three separate Michaelis-Menten relationships

Figure 14.10: Left: Data points together with the three independent fitted curves (squares and solid lines for inhibitor concentration 0; circles and dashed lines for concentration 50; triangles and dotted lines for inhibitor concentration 100). Right: Residual plot with the same plotting symbols as in the left plot.

14.4.2 A model for the effect of the inhibitor

Figure 14.11: Data points together with the fitted curves from model (14.1) for the substrate/inhibitor/reaction rate relationship (squares and solid lines for inhibitor concentration 0; circles and dashed lines for concentration 50; triangles and dotted lines for inhibitor concentration 100).

14.4.3 Conclusion

14.5 Birthday bulge on the Danish soccer team

14.5.1 Modeling and goodness-of-fit test

14.5.2 Robustness of results

Figure 14.12: Difference between observed and expected number of players on the Danish national soccer team depending on the birth month of the player.

14.5.3 Conclusion

14.6 Animal welfare

14.6.1 Modeling and testing

14.6.2 Conclusion

14.7 Monitoring herbicide efficacy

14.7.1 Modeling and model validation

Figure 14.13: Plot of the observed relative frequencies of dead plants for varying doses. The different symbols correspond to the three different locations/replicates.

14.7.2 Model reduction and estimation

Figure 14.14: Plot of the observed relative frequencies of dead plants for varying doses overlaid with the fitted logistic regression model. The different symbols correspond to the three different locations/replicates.

14.7.3 Conclusion

Chapter 15 Case exercises

Table 15.1: Distribution of gender in families with 12 children

Table 15.2: Data from mass spectrometry experiment

Back Matter

Appendix A Summary of inference methods

A.1 Statistical concepts

A.2 Statistical analysis

A.3 Model selection

A.4 Statistical formulas

A.4.1 Descriptive and summary statistics

A.4.2 Quantitative variables: one sample

A.4.3 Quantitative variables: two paired samples

A.4.4 Quantitative variables: one-way ANOVA

A.4.5 Quantitative variables: two independent samples

A.4.6 Quantitative variables: linear regression

A.4.7 Binary variables: one sample (one proportion)

A.4.8 Binary variables: two samples (two proportions)

Appendix B Introduction to R

B.1 Working with R

B.1.1 Using R as a pocket calculator

B.1.2 Vectors and matrices

B.2 Data frames and reading data into R

B.2.1 Data frames

B.2.2 Using datasets from R packages

B.2.3 Reading text files

B.2.4 Reading spreadsheet files

B.2.5 Reading SAS, SPSS, and Stata files
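
Sections B.2.3–B.2.5 cover data import. A minimal sketch (the file names are hypothetical):

    d <- read.table("mydata.txt", header = TRUE)  # whitespace-separated text file
    d2 <- read.csv("mydata.csv")                  # comma-separated spreadsheet export

    head(d)   # first rows of the data frame
    str(d)    # variable names and types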

B.3 Manipulating data

B.4 Graphics with R

Figure B.1: Plotting symbols (pch=) and line types (lty=) that can be used with the R plotting functions.

B.5 Reproducible research

B.5.1 Writing R-scripts

B.5.2 Saving the complete history

B.6 Installing R

B.6.1 R packages

B.7 Exercises

Appendix C Statistical tables

C.1 The χ² distribution

C.2 The normal distribution

C.3 The t distribution

C.4 The F distribution

Appendix D List of examples used throughout the book

Bibliography

Index


Tags: Claus Thorn Ekstrøm, Helle Sørensen, Statistical Data Analysis, Life Sciences

*Free conversion into popular formats such as PDF, DOCX, DOC, AZW, EPUB, and MOBI after payment.
