This file comes from Wikimedia Commons and may also be used in other projects.
The file description page is shown below.
Description
This graphic represents the four datasets defined by Francis Anscombe for which some of the usual statistical properties (mean, variance, correlation and regression line) are the same, even though the datasets are different.
Property                                      Value
Mean of each x variable                       9.0
Variance of each x variable                   11.0
Mean of each y variable                       7.5
Variance of each y variable                   4.12
Correlation between each x and y variable     0.816
Regression line                               y = 3 + 0.5x
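The properties listed above can be checked directly against R's built-in anscombe data frame. The following is a minimal sketch, not part of the original script; the names stats and set1 to set4 are chosen here for illustration only.

```r
# Summary statistics for the four x/y pairs in the built-in `anscombe` data frame.
# `stats` and the column labels "set1".."set4" are illustrative names.
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)                      # simple linear regression of y on x
  c(mean_x    = mean(x),
    var_x     = var(x),
    mean_y    = mean(y),
    var_y     = var(y),
    cor_xy    = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]))
})
colnames(stats) <- paste0("set", 1:4)

# One column per dataset; the rows should agree across columns to about two decimals.
print(round(stats, 3))
```

The y-related values agree only approximately across the four sets, which is why the table reports them to two or three significant digits.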
The graph was created by User:Schutz for Wikipedia on 13 June 2006 (and updated on 29 March 2010), using the R statistical project. The program that generated the graphic is given below; it is based on the example provided with the help page of the R dataset anscombe (accessible using the command data(anscombe); help and more information about the dataset is available using the command help(anscombe)), and was slightly modified to improve the result. The graph was directly exported in SVG format.
References:
Anscombe, Francis J. (1973) Graphs in statistical analysis. American Statistician, 27, 17–21.
R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. 2006. ISBN 3-900051-07-0. http://www.R-project.org
svg("anscombe.svg", width=10.5, height=7)
par(las=1)

##-- some "magic" to do the 4 regressions in a loop:
ff <- y ~ x
for(i in 1:4) {
  ## replace the response and the predictor of the formula by y<i> and x<i>
  ff[2:3] <- lapply(paste(c("y","x"), i, sep=""), as.name)
  ## or   ff[[2]] <- as.name(paste("y", i, sep=""))
  ##      ff[[3]] <- as.name(paste("x", i, sep=""))
  ## store the fit as lm.1, lm.2, lm.3, lm.4
  assign(paste("lm.",i,sep=""), lmi <- lm(ff, data= anscombe))
}

## Now, do what you should have done in the first place: PLOTS
op <- par(mfrow=c(2,2), mar=1.5+c(4,3.5,0,1), oma=c(0,0,0,0),
          lab=c(6,6,7), cex.lab=1.5, cex.axis=1.3, mgp=c(3,1,0))
for(i in 1:4) {
  ff[2:3] <- lapply(paste(c("y","x"), i, sep=""), as.name)
  plot(ff, data =anscombe, col="red", pch=21, bg = "orange", cex = 2.5,
       xlim=c(3,19), ylim=c(3,13),
       xlab=eval(substitute(expression(x[i]), list(i=i))),
       ylab=eval(substitute(expression(y[i]), list(i=i))))
  ## add the fitted regression line for dataset i
  abline(get(paste("lm.",i,sep="")), col="blue")
}
dev.off()
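The formula "magic" in the script above rewrites elements 2 and 3 of ff in place and stores each fit with assign(). As an alternative sketch (not the method used to produce the graphic), the same four regressions can be built with reformulate() from the stats package and kept in a named list; the name fits is an assumption introduced here.

```r
# Alternative sketch: build each formula with reformulate() and keep the fits in a list.
# `fits` is an illustrative name; the original script uses lm.1 .. lm.4 via assign().
fits <- lapply(1:4, function(i) {
  f <- reformulate(paste0("x", i), response = paste0("y", i))  # e.g. y1 ~ x1
  lm(f, data = anscombe)
})
names(fits) <- paste0("lm.", 1:4)

# Coefficients of the four fits, one column per dataset.
print(sapply(fits, function(m) unname(coef(m))))
```

With this layout, abline(fits[[i]], col="blue") would take the place of the get(paste("lm.", i, sep="")) lookup in the plotting loop.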
The R project is licensed under the GPL; since this image is a derived work of an example script provided with R, it is also licensed under the GPL.
However, all modifications made by User:Schutz are also licensed under the CC-BY-SA licence.
This work is free software;
you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation;
either version 2 of the License, or any later version.
This work is distributed in the hope that it will be useful, but without any warranty;
without even the implied warranty of merchantability or fitness for a particular purpose.
See version 2 and version 3 of the GNU General Public License for more details. http://www.gnu.org/licenses/gpl.html