vendredi 3 mai 2013

quickcheck tutorial 0.0.1

This post follows this one.
  • Checking properties defined from built-in types
So far, we have seen how to define properties when their free variables were of types  bool, int or bigint. Other primary types are handled by default. These are double, complex, char and string.

Whereas any value of the bool and int types is equiprobable ($\frac{1}{2}$ for bool, $\frac{1}{2^{32}}$ for int), some assumptions were made  for the possible values returned by the other types. A produced bigint value, for instance, can't have a length greater than 256 in decimal. The same restriction applies to string values. Also, some restrictions have been added to concentrate the distribution of the double and complex values around 0. Finally, only 573 possible characters for char values have been kept from the unicode set without any justification and this decision affects the content of any string values.

Two useful types have also been integrated by default:
  1. shortint, represents an integer between 0 and 255. Its values are equiprobable;
  2. probability, represents a double between 0 and 1. Its values are uniformly distributed.
For instance, let's check the equality $E((X-E(X))^2)=E(X^2)-E(X)^2$ for the variance of $X$, say for 3 values.

> prop_Var p::probability q::probability x1::shortint x2::shortint x3::shortint =
>  E (map ((^2).(+-EX)) X) == E (map (^2) X) - EX^2
>    when X  = [x1,x2,x3];
>         EX = E X end 

>      with
>         E = foldr1 (+) . zipwith (*) [p, (1-p)*q, (1-p)*(1-q)];
>         a == b = abs (1-a/b) < 1e-11 end;

 You can call an evaluation for a particular set of arguments:
> prop_Var 0.3 0.4 3 5 8;


or run the tests:
> quickcheck prop_Var ;
OK, passed 100 tests.

If these defaults don't match your needs or if you want to build specific rules, see next section.

  • User-defined types
Three steps are required for any new type definition.
  1. Declaration of a nonfix symbol with the same name as the type name.
  2. Definition of the new type.
  3. Implementation of the rule which describes the creation of an arbitrary value of that type.
As an example, suppose we're trying to find a small counterexample of the wrong proposition saying that if $a$ is prime with $b$ and if $a$ prime with $c$ then $b$ is prime with $c$.

Of course, we could simply define:

> infix (mod) primewith;
> a primewith b = gcd a b == 1;

> prop_wrong_may_pass a::shortint b::shortint c::shortint =
>     a > 59 || b > 59 || c > 59 || ~ a primewith b || ~ a primewith c || b primewith c;

> quickcheck prop_wrong_may_pass ;
OK, passed 100 tests.
> quickcheck prop_wrong_may_pass ;
Falsifiable, after 30 tests:

 But, as the example shows, a single run could make believe that the proposition is true.

Instead, let's define a specific type in order to get rid of all the ignored cases:
> nonfix my_adequate_triple;
> type my_adequate_triple t::matrix =
>     all (>0) t && all (<60) t && t!0 primewith t!1 && t!0 primewith t!2;
> arbitrary  my_adequate_triple = {a,b,c}
>     when a = rnd;
>          b = primewith_ a;
>          c = primewith_ a end
>     with rnd = 1 + (abs random) mod 59;
>          primewith_ x =
>              if y primewith x then y
>                               else primewith_ x
>           when y = rnd end
>      end;

You can test if an arbitrary value fits the requirements:
> arbitrary my_adequate_triple ;

Then the property:
> prop_wrong t::my_adequate_triple = t!1 primewith t!2;
> quickcheck prop_wrong;
 Falsifiable, after 0 tests:
> quickcheck prop_wrong ;
Falsifiable, after 15 tests:
> quickcheck prop_wrong ;
Falsifiable, after 0 tests:

See you next time...

mercredi 1 mai 2013

quickcheck tutorial 0.0.0

Here is a short first tutorial on how to use the quickcheck tool (download quickcheck.pure) interactively:

  • Copy the quickcheck.pure file to your working directory (in pure, type  > ! pwd (without >) to know where is it and > ! cd mydir to change it if you wish).
  • Start pure (if it not already started).
 Working with quickcheck
  • Checking logical propositions (built-ins).
 Load the library: 
> using quickcheck;

Suppose we want to check Pierce's law : $((a \Rightarrow b)  \Rightarrow a) \Leftrightarrow a$.

First, declare some new operators (with arbitrarily the same priority as ||): 
> infix (||) ==> <=>;

in order to define some helper functions: 
> p ==> q = ~p || q; 
> p <=> q = (p ==> q) && (q ==> p);

Then define the proposition (for the moment - May 2nd 2013 - it is impossible to include the arguments as lambdas in the definitions so the following proposition can't be defined as prop_Pierce = \ a::bool b::bool -> ((a ==> b) ==> a) <=> a yet): 
> prop_Pierce a::bool b::bool = ((a ==> b) ==> a) <=> a;

This function is nothing but the mathematical predicate where the free variables have been explicitely given a type. Note that these (boolean here) types are mandatory since this is the only way for the program to decide what the possible values for $a$ and $b$ are.

Finally, run the tests:
> quickcheck prop_Pierce;

The answer:
OK, passed 100 tests.

means that no contradiction was found (of course, a lot of the tests are redondant here since with 2 possible values for 2 variables, we could have get this result in 4 tests.)
  • Checking numerical properties (built-ins) 
 Distributivity of the multiplication over the addition: 
> distrib_mult_over_add_big a::bigint b::bigint c::bigint = a*(b+c) == a*b+a*c;
> quickcheck distrib_mult_over_add_big;

OK, tests are passed, we could have guessed. What about: 
> distrib_mult_over_add a::int b::int c::int = a*(b+c) == a*b+a*c;
> quickcheck distrib_mult_over_add;
> quickcheck distrib_mult_over_add;
> quickcheck distrib_mult_over_add;

Tests always pass. But wait, isn't it surprinsing despite the possible overflows ? Well, it isn't since the int type is a sort of modular ring but we have to be careful with interpretation and remember, for instance, that even if 2*(1500 000 000+1500 000 000) = 2*1500 000 000+2*1500 000 000, each member equals 1705032704.

Indeed, with: 
> distrib_mult_over_add_mix a::int b::bigint c::int = a*(b+c) == a*b+a*c;
> quickcheck distrib_mult_over_add_mix;

the output: 
Falsifiable, after 0 tests:

warns that, with triple (a,b,c) where a=-795755684, b=-9167850166993051345L and c=-1459775657, the equality doesn't hold.

to be continued...