- The numbers have no way of speaking for themselves. We speak for them. We imbue them with meaning.
- PPDAC problem-solving cycle
- Problem
- Understanding and defining the problem
- Plan
- What to measure and how?
- Data
- Collection
- Management
- Cleaning
- Analysis
- Sort, table, graphs
- pattern
- hypothesis generation
- Conclusion
- Interpretation
- conclusion
- new ideas
- communication
- None of the data sources could be considered "the truth"
- Framing
- 5% mortality sounds worse than 95% survival
- Nearly everyone has greater than the average number of legs (1.99999)
- And people have on average one testicle
- average-house price (median) vs average house-price (mean)
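- A minimal sketch of why "average" needs care, using invented numbers (the leg counts and house prices below are assumptions for illustration, not data from the book):

```python
from statistics import mean, median

# Hypothetical population: 99,999 people with two legs, one person with one leg.
legs = [2] * 99_999 + [1]
avg_legs = mean(legs)

# Share of people whose leg count is above the mean.
share_above = sum(1 for n in legs if n > avg_legs) / len(legs)
print(f"mean legs: {avg_legs}, share above the mean: {share_above}")

# Made-up house prices (in thousands): one very expensive house pulls the mean up,
# while the median stays close to what a typical house costs.
prices = [150, 180, 200, 220, 250, 2_000]
mean_price = mean(prices)
median_price = median(prices)
print(f"mean price: {mean_price}, median price: {median_price}")
```

- With a skewed distribution the mean and the median answer different questions, so it matters which one "average" refers to.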
- We cannot conclude that the higher survival rates were in any sense caused by the increased number of cases - in fact it could even be the other way round: better hospitals simply attracted more patients
- Alberto Cairo's four common features of a good data visualization:
- 1) It contains reliable information
- 2) The design has been chosen so that relevant patterns become noticeable
- 3) It is presented in an attractive manner, but appearance should not get in the way of honesty, clarity and depth.
- 4) When appropriate, it is organized in a way that enables some exploration
- The first rule of communication is to shut up and listen, so that you can get to know about the audience for your communication
- The second rule of communication is to know what you want to achieve, ideally to encourage open debate and informed decision-making. We have to acknowledge that we are telling a story.
- Hans Rosling: "These facts are not up for discussion. I am right, and you are wrong"
- After someone from the Royal Statistical Society criticized their survey methods, a spokesman for Ryanair's boss Michael O'Leary said, "Ninety-five per cent of Ryanair customers haven't heard of the Royal Statistical Society, 97 per cent don't care what they say and 100% said it sounds like their people need to book a low-fare Ryanair holiday."
- If we believe that runs of good or bad fortune represent a constant state of affairs, then we will wrongly attribute the reversion to normal to any intervention we have made (regression to the mean)
- Football managers who get sacked after a string of losses, only to find their successors getting credit for the return to normal
- Active fund managers dropping in performance, after being tipped after a couple of good years
- The "Curse of Sports Illustrated" in which athletes get featured on the cover following a series of achievements, only to subsequently have their performance plummet
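- A small simulation can make this reversion to normal concrete. This is only a sketch under strong assumptions: every team has the same underlying ability, results are pure luck around it, and the team count, seed and 10% cutoff are arbitrary choices for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible

n_teams = 10_000
# Assumption: identical ability (0) for every team; each season's result is just luck.
season_1 = [rng.gauss(0, 1) for _ in range(n_teams)]
season_2 = [rng.gauss(0, 1) for _ in range(n_teams)]

# "Sack the manager" of every team in the worst 10% of season-1 results.
cutoff = sorted(season_1)[n_teams // 10]
sacked = [i for i in range(n_teams) if season_1[i] < cutoff]

before = mean(season_1[i] for i in sacked)
after = mean(season_2[i] for i in sacked)
print(f"selected teams, season 1: {before:.2f}, season 2: {after:.2f}")
```

- The selected teams look much better in the second season even though nothing about them changed, which is exactly the improvement a new manager would get credit for.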
- A model is a map, rather than the territory itself.
- All models are wrong, some are useful
- Bootstrapping the data - the magical idea of pulling oneself up by one's own bootstraps
- We resample with replacement from the collected data, say 1,000 times, which gives 1,000 possible estimates of the mean.
- You can fit a regression line to each bootstrapped sample
- so you get a picture of the variability in the gradient
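- A minimal bootstrap sketch using only the standard library. The data set is invented (a true gradient of 2 plus noise), and the helper name `gradient`, the seed and the rough percentile interval are assumptions for illustration rather than the book's procedure.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Made-up (x, y) pairs: y = 2x plus Gaussian noise.
data = [(float(x), 2.0 * x + rng.gauss(0, 5)) for x in range(30)]

def gradient(pairs):
    """Least-squares slope of y on x."""
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx

boot_means, boot_gradients = [], []
for _ in range(1_000):
    # Resample the observed pairs with replacement, same size as the original sample.
    resample = rng.choices(data, k=len(data))
    boot_means.append(mean(y for _, y in resample))
    boot_gradients.append(gradient(resample))

# Rough 95% interval: the 2.5th and 97.5th percentiles of the bootstrap gradients.
boot_gradients.sort()
low, high = boot_gradients[24], boot_gradients[974]
print(f"gradient estimate: {gradient(data):.2f}, rough 95% interval: ({low:.2f}, {high:.2f})")
```

- The spread of `boot_means` and `boot_gradients` indicates how much the estimates could vary, without assuming a particular distribution for the data.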
- The American Statistical Association's six principles about P-values:
- P-values can indicate how incompatible the data are with a specified statistical model
- P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone
- Scientific conclusions and business or policy decisions should not be based only on whether a P-value passes a specific threshold.
- Proper inference requires full reporting and transparency
- A P-value, or statistical significance, does not measure the size of an effect or the importance of a result.
- By itself, a P-value does not provide a good measure of evidence regarding a model or hypothesis. For example, a P-value near 0.05 taken by itself offers only weak evidence against the null hypothesis.
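- To illustrate the point that a P-value does not measure the size of an effect, here is a sketch that holds the effect fixed and varies only the sample size. It assumes a simple two-sided z-test with a known standard deviation; the function name `two_sided_p` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(mean_diff, sd, n):
    """Two-sided z-test P-value for an observed mean difference, assuming a known sd."""
    z = mean_diff / (sd / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same small effect (0.1 standard deviations) observed at different sample sizes.
for n in (25, 400, 10_000):
    print(n, two_sided_p(0.1, 1.0, n))
```

- The effect is identical in all three cases, yet the P-value is well above 0.05 at n = 25 and below 0.05 at n = 400, so the P-value reflects the amount of data as much as the effect itself.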
- HARKing - inventing the Hypotheses After the Results are Known
Sunday, 22 May 2022
The Art of Statistics - Learning from Data - David Spiegelhalter