<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Scienco.org &#187; Statistics</title>
	<atom:link href="http://www.scienco.org/category/math/statistics/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.scienco.org</link>
	<description>Life&#039;s too short to be unenthusiastic</description>
	<lastBuildDate>Thu, 07 Jul 2011 13:47:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
	<atom:link rel="next" href="http://www.scienco.org/category/math/statistics/feed/?page=2" />

		<item>
		<title>STA: Statistical Toolbox for Android now in version 0.4</title>
		<link>http://www.scienco.org/2010/sta-statistical-toolbox-for-android-now-in-version-0-4/</link>
		<comments>http://www.scienco.org/2010/sta-statistical-toolbox-for-android-now-in-version-0-4/#comments</comments>
		<pubDate>Wed, 15 Sep 2010 20:52:46 +0000</pubDate>
		<dc:creator>Mikkel Meyer Andersen</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[STA]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.scienco.org/?p=264</guid>
		<description><![CDATA[STA: Statistical Toolbox for Android has recently been updated to version 0.4. New in this version, compared to version 0.3, is support for one-way ANOVA and two kinds of Student's t-test. View a full changelog here. The application offers three main areas: Distribution tool Statistical tests Descriptives Distribution tool The distribution [...]]]></description>
			<content:encoded><![CDATA[<p>STA: Statistical Toolbox for Android has recently been updated to version 0.4. New in this version, compared to version 0.3, is support for one-way ANOVA and two kinds of Student's t-test. View a full changelog <a title="Changelog" href="http://evolve.dk/STA/CHANGELOG">here</a>.</p>
<p>The application offers three main areas:</p>
<ul>
<li>Distribution tool</li>
<li>Statistical tests</li>
<li>Descriptives</li>
</ul>
<p><strong>Distribution tool</strong></p>
<p>The distribution tool offers the following features: plot the pdf/pmf, properties (like mean value, variance, and support), cumulative probability, point mass/density, quantiles, and generating/sampling from the distribution. The probability distributions supported are:</p>
<p>Discrete probability distributions:</p>
<ul>
<li>Binomial</li>
<li>Hypergeometric</li>
<li>Negative binomial (or Pascal as it is also called)</li>
<li>Poisson</li>
<li>Zipf</li>
</ul>
<p>Continuous probability distributions:</p>
<ul>
<li>Beta</li>
<li>Cauchy</li>
<li>Chi^2 (Chi squared)</li>
<li>Exponential</li>
<li>F (or Fisher-Snedecor as it is also called)</li>
<li>Gamma</li>
<li>Normal (or Gaussian as it is also called)</li>
<li>Student's t</li>
<li>Weibull</li>
</ul>
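<p>For readers who want to compare the tool's output with a desktop calculation, the operations listed above have direct counterparts in R. The following is only a sketch using R's built-in binomial functions with arbitrarily chosen parameters; it is not STA's own code.</p>

<div class="wp_syntax"><div class="code"><pre class="r" style="font-family:monospace;"># Binomial distribution with 10 trials and success probability 0.3 (arbitrary values)
size &lt;- 10
prob &lt;- 0.3
&nbsp;
# Properties: mean and variance of the binomial distribution
size * prob               # mean: 3
size * prob * (1 - prob)  # variance: 2.1
&nbsp;
# Point mass P(X = 3), cumulative probability P(X &lt;= 3) and the median (0.5 quantile)
dbinom(3, size, prob)
pbinom(3, size, prob)
qbinom(0.5, size, prob)
&nbsp;
# Sampling five values from the distribution
set.seed(1)
rbinom(5, size, prob)
&nbsp;
# Plotting the pmf over the support 0..size
plot(0:size, dbinom(0:size, size, prob), type=&quot;h&quot;, xlab=&quot;x&quot;, ylab=&quot;P(X = x)&quot;)</pre></div></div>

<p>The other distributions follow the same d/p/q/r naming pattern in R, e.g. dnorm, pnorm, qnorm and rnorm for the normal distribution.</p>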
<p><strong>Statistical tests</strong></p>
<p>At the moment, the following tests are supported:</p>
<ul>
<li>One-way ANOVA (i.e. univariate)</li>
<li>Chi^2 tests: Pearson's Chi^2 test for independence and for observed vs expected counts</li>
<li>Two-sample Student's t-tests: both the homoscedastic (equal variances) and heteroscedastic (unequal variances) versions are supported</li>
</ul>
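<p>As a point of reference, the same kinds of tests can be run in R with the built-in functions aov, chisq.test and t.test. The sketch below uses simulated data and made-up counts, and it assumes that the heteroscedastic t-test corresponds to what R calls the Welch test; it does not show how STA computes its results.</p>

<div class="wp_syntax"><div class="code"><pre class="r" style="font-family:monospace;">set.seed(42)
&nbsp;
# Simulated data: three groups of 20 observations (made-up means and spreads)
g1 &lt;- rnorm(20, mean=5, sd=1)
g2 &lt;- rnorm(20, mean=6, sd=1)
g3 &lt;- rnorm(20, mean=5, sd=3)
values &lt;- c(g1, g2, g3)
groups &lt;- factor(rep(c(&quot;g1&quot;, &quot;g2&quot;, &quot;g3&quot;), each=20))
&nbsp;
# One-way ANOVA
summary(aov(values ~ groups))
&nbsp;
# Pearson's Chi^2 test for independence on a 2x2 table of made-up counts
# (R applies a continuity correction to 2x2 tables by default)
counts &lt;- matrix(c(20, 15, 30, 35), nrow=2)
chisq.test(counts)
&nbsp;
# Observed vs expected counts, here against equal probabilities
chisq.test(c(18, 22, 20), p=rep(1/3, 3))
&nbsp;
# Two-sample t-tests: homoscedastic (pooled variance) and heteroscedastic (Welch)
t.test(g1, g2, var.equal=TRUE)
t.test(g1, g3, var.equal=FALSE)</pre></div></div>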
<p><strong>Descriptives</strong></p>
<p>The following descriptive statistics about an entered dataset are given:</p>
<ul>
<li>Number of observations</li>
<li>Min</li>
<li>Max</li>
<li>Mean</li>
<li>Standard deviation</li>
<li>Variance</li>
<li>Median</li>
<li>Skewness</li>
<li>Kurtosis</li>
</ul>
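<p>The descriptives above can be reproduced in R as sketched below, using a small made-up dataset. Base R has no built-in skewness or kurtosis function, so here they are computed from the central moments; this is one common convention and may differ from the one STA uses.</p>

<div class="wp_syntax"><div class="code"><pre class="r" style="font-family:monospace;">x &lt;- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8)  # made-up observations
n &lt;- length(x)
m &lt;- mean(x)
&nbsp;
# Central moments (dividing by n); other conventions apply bias corrections
m2 &lt;- mean((x - m)^2)
m3 &lt;- mean((x - m)^3)
m4 &lt;- mean((x - m)^4)
&nbsp;
c(n = n, min = min(x), max = max(x), mean = m,
  sd = sd(x), variance = var(x), median = median(x),
  skewness = m3 / m2^(3/2),
  kurtosis = m4 / m2^2)  # not the excess kurtosis; subtract 3 for that</pre></div></div>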
<p><strong>Comments</strong></p>
<p>Please do not hesitate to share your thoughts about the application. Ideas for further functionality are also very welcome! And donations to support the continued development are highly appreciated (donations can be made by using the box in the upper right corner of this page)!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scienco.org/2010/sta-statistical-toolbox-for-android-now-in-version-0-4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>STA version 0.2</title>
		<link>http://www.scienco.org/2010/sta-version-0-2/</link>
		<comments>http://www.scienco.org/2010/sta-version-0-2/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 19:06:59 +0000</pubDate>
		<dc:creator>Mikkel Meyer Andersen</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[STA]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.scienco.org/?p=259</guid>
		<description><![CDATA[Already an update with the following changes from version 0.1: General: Icon changed Decimal separator always "." no matter the chosen locale of the phone (for consistency purposes) Screen rotate issues fixed Distribution tool: Typing error: Continuous distributions density output changed from "F([input]) = ..." to "f([input]) = ..." Error description at the parameter tab [...]]]></description>
			<content:encoded><![CDATA[<p>Already an update with the following changes from version 0.1:</p>
<p>General:</p>
<ul>
<li>Icon changed</li>
<li>Decimal separator always "." no matter the chosen locale of the phone (for consistency purposes)</li>
<li>Screen rotate issues fixed</li>
</ul>
<p>Distribution tool:</p>
<ul>
<li>Typing error: Continuous distributions density output changed from "F([input]) = ..." to "f([input]) = ..."</li>
<li>Error description at the parameter tab if the parameters are illegal when trying to plot</li>
<li>Descriptives are calculated automatically when sampling data</li>
<li>The link under properties has been made clickable</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.scienco.org/2010/sta-version-0-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>STA (Statistical Toolbox for Android) version 0.1</title>
		<link>http://www.scienco.org/2010/sta-statistical-toolbox-for-android-version-0-1/</link>
		<comments>http://www.scienco.org/2010/sta-statistical-toolbox-for-android-version-0-1/#comments</comments>
		<pubDate>Tue, 13 Jul 2010 10:50:03 +0000</pubDate>
		<dc:creator>Mikkel Meyer Andersen</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[STA]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.scienco.org/?p=256</guid>
		<description><![CDATA[Finally, a ("beta") version 0.1 of STA is available on the market. Just search for STA. Please let me know if you run into trouble or would like certain features!]]></description>
			<content:encoded><![CDATA[<p>Finally, a ("beta") version 0.1 of STA is available on the market. Just search for STA. Please let me know if you run into trouble or would like certain features!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scienco.org/2010/sta-statistical-toolbox-for-android-version-0-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>STA: Statistical Toolbox for Android</title>
		<link>http://www.scienco.org/2010/sta-statistical-toolbox-for-android/</link>
		<comments>http://www.scienco.org/2010/sta-statistical-toolbox-for-android/#comments</comments>
		<pubDate>Tue, 06 Jul 2010 13:32:46 +0000</pubDate>
		<dc:creator>Mikkel Meyer Andersen</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[STA]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.scienco.org/?p=250</guid>
		<description><![CDATA[After having done some preliminary application development for Android (and finally finished my master's), I've decided to start a new project, and to blog about its creation. (As an aside, I would really like to point out that I haven't forgotten about Watexy, but for now it is not possible to [...]]]></description>
			<content:encoded><![CDATA[<p>After having done some preliminary application development for Android (and finally finished my master's), I've decided to start a new project, and to blog about its creation. (As an aside, I would really like to point out that I haven't forgotten about Watexy, but for now it is not possible to improve it.)</p>
<p>The aim of the project is to develop an Android application with basic statistical tools (I really miss <a href="http://www.r-project.org/">R</a> on my phone, but the project won't be an R clone nevertheless). So far the codename for the application is Statistical Toolbox for Android (or simply STA).</p>
<p>It is not going to be a programming language such as <a title="S programming language" href="http://en.wikipedia.org/wiki/S_programming_language">S</a>, but an easy-to-use graphical statistical toolbox. The features I've thought about including in the first version are:</p>
<ul>
<li>Quantiles (and fractiles) for a wide range of univariate probability distributions</li>
<li>Descriptive statistics (the first two or three empirical moments, correlation measures)</li>
<li>A guide for choosing the right statistical test</li>
</ul>
<p>The features for the later versions could be:</p>
<ul>
<li>Loading datasets (from mail, files on SD-card, or manual input)</li>
<li>A range of statistical tests</li>
</ul>
<p>If any of you have any comments, please do not hesitate to submit a comment here or by mail (use the contact form accessible from the top menu or by sending an e-mail to the reverse of mikl.dk @ scienco ).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scienco.org/2010/sta-statistical-toolbox-for-android/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>qqmultinorm.R version 1.1 - more intelligent plot size</title>
		<link>http://www.scienco.org/2009/qqmultinorm-r-version-1-1-more-intelligent-plot-size/</link>
		<comments>http://www.scienco.org/2009/qqmultinorm-r-version-1-1-more-intelligent-plot-size/#comments</comments>
		<pubDate>Fri, 28 Aug 2009 07:20:47 +0000</pubDate>
		<dc:creator>Mikkel Meyer Andersen</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Math]]></category>

		<guid isPermaLink="false">http://www.scienco.org/?p=229</guid>
		<description><![CDATA[I've updated the qqmultinorm.R script a bit so that it's now capable of picking a more optimal plot size (less unused space). The idea comes down to choosing the sides (dimensions) of a rectangle when its area (the number of plots) is known, i.e. optimising the dimensions of a rectangle given the area. We want the perimeter [...]]]></description>
			<content:encoded><![CDATA[<p>I've updated the qqmultinorm.R script a bit so that it's now capable of picking a more optimal plot size (less unused space).</p>
<p>The idea comes down to choosing the sides (dimensions) of a rectangle when its area (the number of plots) is known, i.e. optimising the dimensions of a rectangle given the area. We want the perimeter to be as small as possible (for better viewing), and the layout preferably wider than it is tall if a square isn't possible. A given number n is factorised into pairs of numbers whose product lies within a given error of n. For example, if the allowed error is 2, then the factorisations of 7 include (1, 7), (2, 4), (4, 2), and (7, 1) (redundant factorisations are included here). Although 2*4 = 8, abs(7-8) = 1 &lt;= 2, so it's acceptable. The default error is actually 10% of the input area.</p>
<p>The proper factorisation is chosen by ordering the pairs ascending by the difference between their components, i.e. how much the width and height differ, and then ascending by height, so that the plot becomes widescreen-like instead of poster-like.</p>
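<p>To make the selection rule concrete, here is a small brute-force sketch for n = 7 with an allowed error of 2. It is not the script's own loop: it simply enumerates all pairs, and it assumes that the product must be at least n so that every plot gets a cell. The names cand and max.error are only used for this illustration.</p>

<div class="wp_syntax"><div class="code"><pre class="r" style="font-family:monospace;">n &lt;- 7
max.error &lt;- 2  # the allowed error in this illustration
&nbsp;
# All (h, w) pairs with sides 1..n whose product holds n plots
# and wastes at most max.error cells
cand &lt;- expand.grid(h = 1:n, w = 1:n)
area &lt;- cand$h * cand$w
cand &lt;- cand[area &gt;= n &amp; area &lt;= n + max.error, ]
&nbsp;
# Most square-like shape first, then the fewest rows (widescreen rather than poster)
cand &lt;- cand[order(abs(cand$w - cand$h), cand$h), ]
&nbsp;
cand[1, ]  # best layout: h = 3, w = 3</pre></div></div>

<p>With the tighter default error of 10% (which rounds to 1 for n = 7), the 3 x 3 layout is no longer allowed and the ordering picks 2 rows by 4 columns instead.</p>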
<p>The new script is as follows (only find.optimal.mfrow.size(...) and a bit of logic in qqmultinorm(...) were added):</p>

<div class="wp_syntax"><div class="code"><pre class="r" style="font-family:monospace;"># File name: qqmultinorm.R
# Version: 1.1
# Last updated: 2009-08-28
#
# This R-code is made by:
# Mikkel Meyer Andersen, Denmark
# mikl [funny-a] math [.] aau [.] dk or 
# mikl [funny-a] mikl [.] dk
#
# Licence: GPLv2
#
# Feel free to use it, but if you do I'd like to hear about it (just for fun).
# If you make corrections, please submit them back so others can enjoy them as well.
&nbsp;
qqchisq &lt;- function(y, main, df=2, continuity.correction = 0.5)
{
  n &lt;- length(y)
  y &lt;- sort(y)
  c &lt;- numeric(n)
&nbsp;
  for (i in 1:n)
    c[i] &lt;- qchisq((i-continuity.correction)/n, df=df)
&nbsp;
  plot(c, y, xlab=&quot;Theoretical Quantiles&quot;, ylab=&quot;Sample Quantiles&quot;, main=main)
  lines(c(c[1], c[n]), c(c[1], c[n]), type=&quot;l&quot;)
}
&nbsp;
dec2bin &lt;- function(x)
{
  if (!is.vector(x) || length(x) != 1 || x &lt; 0)
    stop(&quot;x must be a non-negative integer&quot;)
&nbsp;
  # x = 0 needs one digit; log2(0) is -Inf, so guard with max()
  ndigits &lt;- max(1, floor(log2(x)) + 1)
  bin &lt;- numeric(ndigits)
&nbsp;
  for (i in (ndigits-1):0)
  {
    tmp &lt;- 2^i
&nbsp;
    if (x %/% tmp &gt;= 1)
    {
      bin[i+1] &lt;- 1
      x &lt;- x - tmp
    }
  }
&nbsp;
  return(rev(bin))
}
&nbsp;
# Returns the power set without the empty set
power.set &lt;- function(v)
{
  n &lt;- length(v)  
  N &lt;- 2^n - 1
  ps &lt;- vector(&quot;list&quot;, N)
&nbsp;
  # seq_len() gives an empty loop when N = 1, whereas 1:(N-1) would count down to 0
  for (i in seq_len(N-1))
  {
    Nbin &lt;- dec2bin(i)
    Nbin &lt;- c(numeric(n-length(Nbin)), Nbin)
    Nbin &lt;- rev(Nbin)
    ps[[i]] &lt;- v[which(Nbin == 1)]
  }
&nbsp;
  ps[[N]] &lt;- v
&nbsp;
  return(ps)
}
&nbsp;
# Input: n
#  - here the number of plots
# Returns c(h, w)
#  - the optimal choice of rows and cols to use in mfrow
find.optimal.mfrow.size &lt;- function(n)
{  
  w0 &lt;- ceiling(sqrt(n))
&nbsp;
  # The area may contain at most 10% unused space
  max.error &lt;- round(n*0.1)
&nbsp;
  n.minus.error &lt;- n - max.error
  n.plus.error &lt;- n + max.error
&nbsp;
  # If n is a square, fine!
  if (w0^2 == n)
    return(c(w0, w0))
&nbsp;
  # Col 1 and 2: h and w (rows and cols)
  # Col 3: The difference between w and h: this should be as small as possible
  candidates &lt;- matrix(ncol=3)
&nbsp;
  for (w in w0:1)  
  {    
    h &lt;- ceiling(n / w)
    n0 &lt;- w*h
&nbsp;
    # Because of ceiling we know that n0 &gt;= n
    if (n0 &lt;= n.plus.error)
      candidates &lt;- rbind(candidates, c(h, w, abs(w-h)))
  }
&nbsp;
  # First row is NA; drop = FALSE keeps the matrix shape when only one candidate is left
  candidates &lt;- candidates[-1, , drop = FALSE]
&nbsp;
  # Oops, something went wrong - well, don't panic
  if (nrow(candidates) == 0)
    return(c(w0, w0))
&nbsp;
  # First order by abs(w-h) and then by h to get a widescreen-look 
  # instead of a poster-look
  candidates &lt;- candidates[order(candidates[,3], candidates[,1]), , drop = FALSE]
&nbsp;
  return(candidates[1, c(1,2)])
}
&nbsp;
# dataset: variables in columns and observations as rows
# subset.min.size, subset.max.size: inclusive limits
# filename: if specified, the plots are saved as a png file with this filename
qqmultinorm &lt;- function(dataset, subset.min.size = 1, subset.max.size = 4, filename = NULL, use.optimale.size = F)
{
  p &lt;- ncol(dataset)
  n &lt;- nrow(dataset)
&nbsp;
  if (subset.min.size &lt; 1) stop(&quot;subset.min.size &lt; 1&quot;)
  if (subset.min.size &gt; p) stop(&quot;subset.min.size &gt; p&quot;)
  if (subset.max.size &lt; 1) stop(&quot;subset.max.size &lt; 1&quot;)
  if (subset.max.size &gt; p) stop(&quot;subset.max.size &gt; p&quot;)
  if (subset.min.size &gt; subset.max.size) stop(&quot;subset.min.size &gt; subset.max.size&quot;)
&nbsp;
  if (is.null(colnames(dataset)))
    colnames(dataset) &lt;- 1:p
&nbsp;
  # We have p variables. If all is to be checked against each other,
  # then we have a power-set with 2^p subsets (including the empty set)
&nbsp;
  # Here we get subset containing indexes of the variables to include
  # Note that power.set doesn't include the empty set.
  subsets &lt;- power.set(1:p)
  subsets.len &lt;- length(subsets)
&nbsp;
  # To get the plots with the fewest variables first, we do a little trick:
  # While we find out which plots to include, we build a list
  # where each element holds the indexes of the subsets of one size,
  # and the index of the element corresponds to the size of the subset.
  # (The +1 is because the limits are inclusive!)
  s.included &lt;- vector(&quot;list&quot;, subset.max.size - subset.min.size + 1)
&nbsp;
  plots &lt;- 0
&nbsp;
  for (i in 1:subsets.len)
  {
    s &lt;- subsets[[i]]
    s.len &lt;- length(s)
&nbsp;
    if (s.len &gt;= subset.min.size &amp;&amp; s.len &lt;= subset.max.size)
    {
      plots &lt;- plots + 1
      s.included[[s.len - subset.min.size + 1]] &lt;- c(s.included[[s.len - subset.min.size + 1]], i)
    }
  }
&nbsp;
  # Now it's possible to build the subset index vector; 
  # we disregard the size of each subset; no more need to know it.
  s.indexes &lt;- c()
&nbsp;
  for (s in s.included)
  {
    s.indexes &lt;- c(s.indexes, s)
  }
&nbsp;
  # We want the best view  
  plot.per.row &lt;- ceiling(plots^(1/2))
  plot.per.column &lt;- ceiling(plots^(1/2))
&nbsp;
  if (use.optimale.size)
  {
    mfrow.parameters &lt;- find.optimal.mfrow.size(plots)
    plot.per.row &lt;- mfrow.parameters[1]
    plot.per.column &lt;- mfrow.parameters[2]
  }
&nbsp;
  plot.width &lt;- 300
  plot.height &lt;- 200
&nbsp;
  # mfrow is c(rows, cols): the image width follows the column count, the height the row count
  if (!is.null(filename))
    png(file=paste(filename, &quot;.png&quot;, sep=&quot;&quot;), bg=&quot;white&quot;, width = plot.per.column * plot.width, height = plot.per.row * plot.height)
&nbsp;
  par(mfrow = c(plot.per.row, plot.per.column))  
&nbsp;
  # There's no need to calculate a whole lot several times:
  ybar &lt;- as.vector(colMeans(dataset))
&nbsp;
  S &lt;- as.matrix(var(dataset))
&nbsp;
  current &lt;- 1  
  for (i in s.indexes)
  {
    s &lt;- subsets[[i]]
    s.len &lt;- length(s)
&nbsp;
    if (s.len &gt; subset.max.size)
      next
&nbsp;
    cat(&quot;Processing subset no.&quot;, current, &quot;out of&quot;, plots, &quot;\n&quot;)
&nbsp;
    # Container for our values
    squared.dist &lt;- numeric(n)
&nbsp;
    # qr.solve(A) = A^(-1)  
    Sinv &lt;- qr.solve(S[s,s])
&nbsp;
    # Then calculate the squared distance for each datapoint
    for (i in 1:n)
    {
      c &lt;- dataset[i,s] - ybar[s]
      squared.dist[i] &lt;- t(c) %*% Sinv %*% c
    }
&nbsp;
    # Finding the order statistic
    squared.dist &lt;- sort(squared.dist)
&nbsp;
    qqchisq(squared.dist, paste(colnames(dataset)[s], collapse=&quot;, &quot;), s.len)
&nbsp;
    current &lt;- current + 1
  }
&nbsp;
  if (!is.null(filename))
    dev.off()
}
&nbsp;
# Example:
A &lt;- matrix(rnorm(2000, mean=3, sd=2), ncol=8)
qqmultinorm(A, 2, 2, &quot;multinorm-optimal.size&quot;, use.optimale.size = T)
qqmultinorm(A, 2, 2, &quot;multinorm&quot;, use.optimale.size = F)</pre></div></div>

]]></content:encoded>
			<wfw:commentRss>http://www.scienco.org/2009/qqmultinorm-r-version-1-1-more-intelligent-plot-size/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>qqmultinorm.R - evaluating the normality of a sample</title>
		<link>http://www.scienco.org/2009/qqmultinorm-r-evaluating-the-normality-of-a-sample/</link>
		<comments>http://www.scienco.org/2009/qqmultinorm-r-evaluating-the-normality-of-a-sample/#comments</comments>
		<pubDate>Thu, 27 Aug 2009 11:44:00 +0000</pubDate>
		<dc:creator>Mikkel Meyer Andersen</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Math]]></category>

		<guid isPermaLink="false">http://www.scienco.org/?p=225</guid>
		<description><![CDATA[I wrote an R script that can be used to evaluate the multivariate normality of a sample. It's a kind of generalisation of qqnorm, but this one uses the squared statistical distance of each observation, which is approximately chi squared distributed with degrees of freedom equal to the number of variables included. The useful [...]]]></description>
			<content:encoded><![CDATA[<p>I wrote an R script that can be used to evaluate the multivariate normality of a sample. It's a kind of generalisation of qqnorm, but this one uses the squared statistical distance of each observation, which is approximately chi squared distributed with degrees of freedom equal to the number of variables included.</p>
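<p>The core of the idea can be sketched in a few lines. Note that this sketch swaps in R's built-in mahalanobis function for the explicit loop used in the script further down, and that the data are simulated purely for illustration.</p>

<div class="wp_syntax"><div class="code"><pre class="r" style="font-family:monospace;">set.seed(1)
&nbsp;
# Simulated data: 250 observations of 3 independent normal variables (made up)
X &lt;- matrix(rnorm(750, mean=3, sd=2), ncol=3)
p &lt;- ncol(X)
n &lt;- nrow(X)
&nbsp;
# Squared statistical (Mahalanobis) distance of each row from the sample mean
d2 &lt;- mahalanobis(X, center=colMeans(X), cov=var(X))
&nbsp;
# Compare the ordered distances with chi squared quantiles with p degrees of freedom
theoretical &lt;- qchisq((1:n - 0.5) / n, df=p)
plot(theoretical, sort(d2), xlab=&quot;Theoretical Quantiles&quot;, ylab=&quot;Sample Quantiles&quot;)
abline(0, 1)</pre></div></div>

<p>If the sample is multivariate normal, the points should lie close to the straight line.</p>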
<p>The useful thing about this script is that it's able to plot all possible combinations of the variables. It's possible to specify the minimum and the maximum number of variables to compare. All possible combinations are found by calculating the power set using the decimal-to-binary conversion dec2bin: if a set has k elements, then its power set has 2^k elements, and the binary representations of the numbers from 1 to 2^k - 1 are used to pick out the non-empty subsets.</p>
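<p>As a compact illustration of the binary trick, the following sketch builds the non-empty subsets of a three-element set. It uses integer division instead of the script's dec2bin function, and the vector v is just an example.</p>

<div class="wp_syntax"><div class="code"><pre class="r" style="font-family:monospace;">v &lt;- c(&quot;a&quot;, &quot;b&quot;, &quot;c&quot;)
k &lt;- length(v)
&nbsp;
# Each number i in 1..(2^k - 1) encodes one non-empty subset:
# binary digit j of i (counted from the right) is 1 exactly when v[j] is included
subsets &lt;- lapply(1:(2^k - 1), function(i) {
  bits &lt;- (i %/% 2^(0:(k - 1))) %% 2 == 1
  v[bits]
})
&nbsp;
length(subsets)  # 7 non-empty subsets for k = 3
subsets[[5]]     # 5 is 101 in binary: &quot;a&quot; &quot;c&quot;</pre></div></div>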
<p>Please notice that it's possible to specify a filename so that the plots are written to a png file. This is by far the easiest option, and in practice the only one, if there are more than a few plots.</p>
<p>I haven't written a lot of documentation besides this, but feel free to ask if you are in doubt about anything!</p>

<div class="wp_syntax"><div class="code"><pre class="r" style="font-family:monospace;"># File name: qqmultinorm.R
#
# This R-code is made by:
# Mikkel Meyer Andersen, Denmark
# mikl [funny-a] math [.] aau [.] dk or 
# mikl [funny-a] mikl [.] dk
#
# Licence: GPLv2
#
# Feel free to use it, but if you do I'd like to hear about it (just for fun).
# If you make corrections, please submit them back so others can enjoy them as well.
&nbsp;
qqchisq &lt;- function(y, main, df=2, continuity.correction = 0.5)
{
  n &lt;- length(y)
  y &lt;- sort(y)
  c &lt;- numeric(n)
&nbsp;
  for (i in 1:n)
    c[i] &lt;- qchisq((i-continuity.correction)/n, df=df)
&nbsp;
  plot(c, y, xlab=&quot;Theoretical Quantiles&quot;, ylab=&quot;Sample Quantiles&quot;, main=main)
  lines(c(c[1], c[n]), c(c[1], c[n]), type=&quot;l&quot;)
}
&nbsp;
dec2bin &lt;- function(x)
{
  if (!is.vector(x) || length(x) != 1 || x &lt; 0)
    stop(&quot;x must be a non-negative integer&quot;)
&nbsp;
  # x = 0 needs one digit; log2(0) is -Inf, so guard with max()
  ndigits &lt;- max(1, floor(log2(x)) + 1)
  bin &lt;- numeric(ndigits)
&nbsp;
  for (i in (ndigits-1):0)
  {
    tmp &lt;- 2^i
&nbsp;
    if (x %/% tmp &gt;= 1)
    {
      bin[i+1] &lt;- 1
      x &lt;- x - tmp
    }
  }
&nbsp;
  return(rev(bin))
}
&nbsp;
# Returns the power set without the empty set
power.set &lt;- function(v)
{
  n &lt;- length(v)  
  N &lt;- 2^n - 1
  ps &lt;- vector(&quot;list&quot;, N)
&nbsp;
  # seq_len() gives an empty loop when N = 1, whereas 1:(N-1) would count down to 0
  for (i in seq_len(N-1))
  {
    Nbin &lt;- dec2bin(i)
    Nbin &lt;- c(numeric(n-length(Nbin)), Nbin)
    Nbin &lt;- rev(Nbin)
    ps[[i]] &lt;- v[which(Nbin == 1)]
  }
&nbsp;
  ps[[N]] &lt;- v
&nbsp;
  return(ps)
}
&nbsp;
# dataset: variables in columns and observations as rows
# subset.min.size, subset.max.size: inclusive limits
# filename: if specified, the plots are saved as a png file with this filename
qqmultinorm &lt;- function(dataset, subset.min.size = 1, subset.max.size = 4, filename = NULL)
{
  p &lt;- ncol(dataset)
  n &lt;- nrow(dataset)
&nbsp;
  if (subset.min.size &lt; 1) stop(&quot;subset.min.size &lt; 1&quot;)
  if (subset.min.size &gt; p) stop(&quot;subset.min.size &gt; p&quot;)
  if (subset.max.size &lt; 1) stop(&quot;subset.max.size &lt; 1&quot;)
  if (subset.max.size &gt; p) stop(&quot;subset.max.size &gt; p&quot;)
  if (subset.min.size &gt; subset.max.size) stop(&quot;subset.min.size &gt; subset.max.size&quot;)
&nbsp;
  if (is.null(colnames(dataset)))
    colnames(dataset) &lt;- 1:p
&nbsp;
  # We have p variables. If all is to be checked against each other,
  # then we have a power-set with 2^p subsets (including the empty set)
&nbsp;
  # Here we get subset containing indexes of the variables to include
  # Note that power.set doesn't include the empty set.
  subsets &lt;- power.set(1:p)
  subsets.len &lt;- length(subsets)
&nbsp;
  # To get the plots with the fewest variables first, we do a little trick:
  # While we find out which plots to include, we build a list
  # where each element holds the indexes of the subsets of one size,
  # and the index of the element corresponds to the size of the subset.
  # (The +1 is because the limits are inclusive!)
  s.included &lt;- vector(&quot;list&quot;, subset.max.size - subset.min.size + 1)
&nbsp;
  plots &lt;- 0
&nbsp;
  for (i in 1:subsets.len)
  {
    s &lt;- subsets[[i]]
    s.len &lt;- length(s)
&nbsp;
    if (s.len &gt;= subset.min.size &amp;&amp; s.len &lt;= subset.max.size)
    {
      plots &lt;- plots + 1
      s.included[[s.len - subset.min.size + 1]] &lt;- c(s.included[[s.len - subset.min.size + 1]], i)
    }
  }
&nbsp;
  # Now it's possible to build the subset index vector; 
  # we disregard the size of each subset; no more need to know it.
  s.indexes &lt;- c()
&nbsp;
  for (s in s.included)
  {
    s.indexes &lt;- c(s.indexes, s)
  }
&nbsp;
  # We want a squared view
  plot.per.row &lt;- ceiling(plots^(1/2))
  plot.per.column &lt;- ceiling(plots^(1/2))
  plot.width &lt;- 200
  plot.height &lt;- 200
&nbsp;
  if (!is.null(filename))
    png(file=paste(filename, &quot;.png&quot;, sep=&quot;&quot;), bg=&quot;white&quot;, width = plot.per.row * plot.width, height = plot.per.column * plot.height)
&nbsp;
  par(mfrow = c(plot.per.row, plot.per.column))  
&nbsp;
  # There's no need to calculate a whole lot several times:
  ybar &lt;- as.vector(colMeans(dataset))
&nbsp;
  S &lt;- as.matrix(var(dataset))
&nbsp;
  current &lt;- 1  
  for (i in s.indexes)
  {
    s &lt;- subsets[[i]]
    s.len &lt;- length(s)
&nbsp;
    if (s.len &gt; subset.max.size)
      next
&nbsp;
    cat(&quot;Processing subset no.&quot;, current, &quot;out of&quot;, plots, &quot;\n&quot;)
&nbsp;
    # Container for our values
    squared.dist &lt;- numeric(n)
&nbsp;
    # qr.solve(A) = A^(-1)  
    Sinv &lt;- qr.solve(S[s,s])
&nbsp;
    # Then calculate the squared distance for each datapoint
    for (i in 1:n)
    {
      c &lt;- dataset[i,s] - ybar[s]
      squared.dist[i] &lt;- t(c) %*% Sinv %*% c
    }
&nbsp;
    # Finding the order statistic
    squared.dist &lt;- sort(squared.dist)
&nbsp;
    qqchisq(squared.dist, paste(colnames(dataset)[s], collapse=&quot;, &quot;), s.len)
&nbsp;
    current &lt;- current + 1
  }
&nbsp;
  if (!is.null(filename))
    dev.off()
}
&nbsp;
# Example:
A &lt;- matrix(rnorm(2000, mean=3, sd=2), ncol=8)
qqmultinorm(A, 1, 3, &quot;multinorm-1-3&quot;)
#qqmultinorm(A, 4, 4, &quot;multinorm-4&quot;)
#qqmultinorm(A, 5, 5, &quot;multinorm-5&quot;)
#qqmultinorm(A, 6, 8, &quot;multinorm-6-8&quot;)</pre></div></div>

<div id="attachment_227" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.scienco.org/wp-content/multinorm-1-3.png"><img src="http://www.scienco.org/wp-content/multinorm-1-3-300x300.png" alt="multinorm-1-3" title="multinorm-1-3" width="300" height="300" class="size-medium wp-image-227" /></a><p class="wp-caption-text">multinorm-1-3</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.scienco.org/2009/qqmultinorm-r-evaluating-the-normality-of-a-sample/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

