18 January 2012

Keeping the "science" in Software Development

If you already feel that math is a science you can safely skip the first section.  What follows are my musings on what it means to be "correct", especially in software.  I pose a challenge at the end for practitioners and researchers.

Math is science too
According to Wikipedia the scientific method is:
Scientific method refers to a body of techniques for investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge.[1] To be termed scientific, a method of inquiry must be based on gathering empirical andmeasurable evidence subject to specific principles of reasoning.[2]
At first glance it may appear that mathematics doesn't follow the scientific method above. The mathematics that you're most likely familiar with is a collection of definitions and the facts, lemmas and theorems, about those definitions that result from applying logical deduction.  You might notice that if you reason that way, then textbook applications of newton's three laws do not follow the scientific method either.

You might argue that in this case the applications of newton's laws is now mathematics and no longer science.  I would agree that it is mathematics and science.  Let's see if I can convince you.

By the time you see the definitions and corresponding theorems in a math text they have been reduced to facts.  The trial and error, as well as correction and integration of previous knowledge, have already happened.  The same is true of a textbook that shows you how to apply newton's three laws.  I recommend taking a moment to familiarize yourself with Lockhart's lament.

Math only seems to not be a science because of our relationship to learning it.  Most of us get taught math as facts.  If you've taken a proof based course you may have had the experience of searching for proofs.  There is a lot of trial and error involved.  Mathematics as a field may have already integrated previous knowledge, but if you are taking a course chances are you have not.  So you employ trial and error as well as hypothesis refinement to build logical deductions to establish what your intuition, or pen and paper examples, says in true.

The set of assumptions that you choose can make or break a result.  Even though mathematical results are timeless, they still bend to the sway of assumptions.  This is exactly what happens to empirical scientists too.  They think they know something, or have the right model, but later it turns out they reasoned from a flawed assumption and suddenly new data transform their field.

Evidence vs Proof
If you've studied proofs in different logical settings you may have noticed that what constitutes a proof depends on what you seek to gain from the proof.  See for example classical logic versus constructive logic.

In a court case or a forensics study, people look for evidence or witnesses to give data points in support of a particular position within a larger case (or claim).

Think about that for a minute.  A claim is made, then people try to establish it or refute it.  Often times, the assertion needs only be accepted or rejected by a "reasonable" skeptic that is willing to accept partial proof.  Such a skeptic might be a jury or peers reviewing a publication for acceptance in a journal.  Because we used partial proof new information could turn up later that reverses our understanding.

At the other extreme we have mathematical proofs, which are composed of logical statements following from assumptions.  Proofs are timeless.  If we get our logic right and we're clear about our assumptions then the result will stand forever.  I say "forever", but in reality, proofs can be found to be incomplete, insufficiently rigorous, or using the wrong logic at a later date.  Sometimes years after the arguments are accepted.

It seems that the difference here isn't in whether the resulting "facts" are immutable but rather the strength with which we intend to make the claim.  In the forensics case, we are making a claim to the best of our incomplete knowledge about what happened.  In the case of mathematical proof, we are making a claim that should be true forever.

In my opinion, the difference between "evidence" and "proof" exists to quantify the strength of our assertion that the claim or conjecture is true.  Claims can roughly be placed on a continuum from weak to strong.  Some claims, such as "a human authored this blog post", are weak in nature and need little evidence.  Other claims, such as "the author of this blog post can see through walls", is a hard to believe as there is good evidence that humans can't see through walls. To prove the second claim I would hope that you demand really strong "evidence".  Perhaps in the form of a demonstration, a scientific model for how I can do it, and validation from peer review.

Due to the continuum for strength of evidence described above, I consider proofs to be a form of evidence.  Proofs, in the mathematical sense, being the strongest form of evidence.  Claims made by politicians would be on the extreme other end of my continuum.

The strength of the evidence for the claim has practical applications too.  I wouldn't engineer a shuttle to the moon based on arguments built by politicians.  I would base my engineering on mathematical proofs as getting to the moon requires a high degree of correctness.

So then, what level of evidence is right if I need to be correct?

Evidence Based Correctness
Now we're really at the heart of the matter.  Sometimes it just doesn't matter if we're correct or not, for example when arguing with a spouse.  Other times it matters deeply, like when we want to engineer a shuttle to the moon.

While mathematics is obsessed with being unequivocally true, other branches of science have been trying to deepen their understanding while accepting a greater risk of being wrong.  This results in a different level of evidence required for progress.

In other words, you have to figure out how important it is to you (or your application) that you know you're correct versus it being reasonable that you are correct.  I like continuums and this is no exception.  I believe that you can build a continuum of how strongly you believe things.  This continuum is a natural dual to the continuum for strength of evidence.

In other words, the bolder your claim is, the stronger the evidence you need.  Unfortunately, no one has a magical formula for knowing when evidence is strong enough.  If we did science would become more precise, court would become more cut and dry, and mathematicians could relax and quantify results in terms of proof strength.

Now wait a second.  Quantify results in terms of proof strength?  That sounds like statistical reasoning doesn't it?  Statisticians are able to cobble together data points and talk about the strength of that evidence in terms of supporting a particular hypothesis.

In software development, testing is probably the cheapest way to have some evidence that your program is correct.  Testing is almost never enough to have a proof of correctness though.  On the other end of the spectrum we have theorem proving, which is known to be costly in terms of human effort.

My point is this, you don't have to force yourself into a staunch mathematical camp to have "correctness" in your software development.  Nor do you have to give up on correctness in order to fit into a budget.  It's not a dichotomy where you are forced to give up on either truth or practicality.  Testing is not "wrong" and proofs are not "required".

What is required is having a level of evidence that is strong enough to support the level of belief we need for our chosen level of correctness.  Currently we have to carefully evaluate where on the continuum of correctness our result rests and then pick a commiserate level of evidence.  You probably do this already intuitively to some degree.  My challenge to you is to do this explicitly and consciously.

Furthermore, can we take the precedent set by the statisticians and find a way to quantify strength of evidence?  Can testing and theorem proving be unified on a continuum by this measure?  Is there a way to look at a successful run of a test suite, that may or may not include theorem proving, and quantify the strength of that evidence that our program is correct?

23 September 2011

Galois offers Haskell course: The Tao of Functional Programming

This year Galois will be offering a course in professional Haskell development. Details on our main website: http://corp.galois.com/haskell-course

17 March 2011

Picking a GUI library to use with OpenGL

OpenGL provides nice real-time graphics primitives, but to get started with OpenGL you have to first get an OpenGL rendering context on the screen.

For years the simplest cross-platform way to do this was by using the GLUT library.  GLUT is unattractive for reasons I'll explain towards the end, but it's easy to get started using it. This tends to make it a nice choice for beginners.  These days there are many alternatives to GLUT for those of us who want simplicity and cross-platform compatibility.  Here I will focus on Haskell options.



Full spreadsheet view here.

The significance of the license columns is that pure Haskell code which has a standard LGPL license forces your binaries to also abide by the LGPL restrictions. This is due to GHC's cross-module inlining. On the other hand, if a foreign library has an LGPL license but BSD3 Haskell bindings then this does not apply. See for example, the SDL bindings. License entries that read "LGPL with linking exception" mean that using it with Haskell code does not cause your binaries to fall under LGPL restrictions due to the license explicitly granting that exception.

The use of atexit() can lead to subtle problems so I've tried to highlight which libraries use it. Please let me know if you see a mistake in my categorizing the use of atexit().

The requirement of extra libraries is more of a hassle for people who will use cabal-install to get and build your program. If your project is a popular one with a large audience you will want to create an installer for end-users, in which case you could offer to install the extra libraries when they install your program.

For simple uses, such as learning or demos programs, my current recommendation is GLFW-b. SDL also makes a nice choice, but it requires more from your windows users as they will have to install the libraries separately. Linux and OSX users will have to install the libraries but they have the option of using a package manager to handle the install.

Gtk2hs and wxHaskell are harder for novices as they come with restrictions to what you can do. For example, at one point in time gtk2hs programs had to be very careful when using threading. I don't know if this is still an issue though.

Surprisingly, all of the libraries in my table have good documentation. Although, in some cases the Haskell bindings themselves do not have documentation. Typically it's not an issue to go by the original library docs.

GLUT is attractive, in part because it's well known, it's included in the Haskell Platform, and lots of documentation and examples exist. Keep in mind, it's technically not free software (although the source is available), it uses atexit(), and even though it's included in the Haskell Platform windows users will still need to download glut32.dll and install it in their application build directory. If you have the urge to use GLUT, try to use freeglut as your implementation.

Based on the above list of offers, you can see that it would be very nice if someone created a library based on the APIs of GLFW, SDL, or GLUT, but did a direct binding to the native APIs of windows, OSX, and linux so that we'd have a "pure" Haskell option that doesn't require extra C libraries. For this to become a nice solution it would be nice if the Haskell Platform bundled a binding to native OSX APIs.

I hope the above comparisons help you to pick the right GUI library for your next project that uses OpenGL!

19 September 2010

Composability, laziness, testing, and proofs (oh my!)

UPDATED: I forgot to mention types as a reason for composability!

By now you've probably all seen the excellent talk that Larry Diehl gave at the Scottish Ruby convention this year. The talk encourages the audience to think of proofs in languages such as Agda as composable unit tests.

If you're an experienced Haskeller then I won't have to explain to you why composability is easier in Haskell than other languages, but let me try anyway as it's easy to forget why composability is so important.

  1. Haskell is pure so we get a lot of composability just from knowing statically the precise set of inputs a function takes. This means we can easily take a function from one context and use it in another one. We've removed any hidden or implicit state. It's all lexically known.
  2. Laziness is another reason why Haskell code is highly composable. The Haskell prelude is full of composable list functions. By operating on (potentially) infinite data structures and only computing as much as is needed you can define combinators that glue together rather nicely. For example, the functions take and iterate allow you to define loops as infinite sequences while allowing you to only process some finite prefix of the sequence.
  3. Another major reason Haskell is highly composable is higher-order functions. Functions like foldr, foldl, and bracket demonstrate that you can write the structure once and reuse it with whatever logic you need in the body.
  4. Haskell has rich types that correspond accurately to what a function takes and what it computes, although they are not as precise as we get with dependently typed languages. These types give us hints about when it's safe to compose things by giving us a bit of machine checkable documentation. Conversely, when things cannot fit together as written we will get type errors.

As you can see, each of these features enhances and enables opportunities for composability which makes us as programmers more efficient, allows us to program at a level closer to our specification, and means we can get things right once and reuse it later.

How does this relate to testing?

First let's review the Curry-Howard correspondence, which says that our types are logical propositions and our programs represent proofs of those propositions.

Unit tests are evidence that our code does what we expect for some finite subset of the inputs. While they are not proofs that our code works in general they do provide some assurance. Larry's point is that in some languages, such as Agda, you can write lemmas that are full blown proofs of correctness using the Curry-Howard correspondence. Furthermore, you can compose these lemmas into theorems.

My question is, can we compose unit tests in Haskell?

It occurs to me that typically with unit tests we don't know what logical proposition the unit test corresponds to. Often the type is very simple, such as Int -> Int -> Bool, or IO (). I suspect that to make unit tests composable in the same way as lemmas we need to know what logical statement corresponds to each unit test. We could compose the body of the unit tests, which corresponds to composing their proofs, but that's not really scaleable and essentially amounts to copy&paste programming.

This is where QuickCheck comes in. In QuickCheck we keep the proposition at the value level, but we take a step forward in at least stating the proposition. We also try to ensure the property for more than manually picked inputs by using a random value generator or clever value enumeration.

Now my question turns into, have you ever considered composing your QuickCheck properties to build bigger properties? Can we build QuickCheck theorems from our QuickCheck lemmas?

Pushing this a bit more, can we redesign QuickCheck so that the properties are reflected in the types of the properties? Would this help make QuickCheck properties more composable or would it lead to Haskell type hackery madness?

Does anyone have pointers to work on this?

18 September 2010

Chrome Developers Don't Listen to Users (or, why no advanced option to show http:// in the url bar?)

I want to take a moment to register my dissatisfaction with the chrome developers.

I think the way they are handling bug 41467 is disrespectful to their user community. Given the uproar surrounding the change why haven't they provided an extension or a user option to display the "http://" part of URLs? I think their behavior shows that they are not listening to their users.

Chrome devs, please give us users more freedom.

Does anyone know of an extension to put the http back in the URL?

01 September 2010

Trying out jsMath

\left(\, \sum_{k=1}^n a_k b_k \right)^2 \le
\left(\, \sum_{k=1}^n a_k^2 \right) \left(\, \sum_{k=1}^n b_k^2 \right)

Well, that was pretty easy.

I just followed the instructions here on downloading and installing.

Next, I went looked in the tests directory and copied some lines out the test file:


Next I went to the html view of the template for my blog and I put those lines at the start of the document body and pointed the paths at my install of jsMath (in a web accessible directory). Next, I put the following lines at the very bottom of the document body:


According to the install instructions there are variations on this that should work, but I couldn't get them to work satisfactorily.

In later posts maybe I'll play with some of the plugins.

30 August 2010

REVISED: More fun pointless code metrics

My apologies to everyone who looked at the numbers for GCC before. I had run cloc from the wrong directory and it was counting all the lines of code in all the source trees on my computer. Oops! The GCC numbers listed at the bottom are corrected. While I'm at it, let's look at GHC vs. GCC. Here is GHC:
1177 text files.
    1159 unique files.                                          
     247 files ignored.

http://cloc.sourceforge.net v 1.51  T=56.0 s (16.6 files/s, 6818.4 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Haskell                         424          38213          64078         125830
C                               139          10199          13583          46412
XML                              33           3839            539          29416
C/C++ Header                    170           3171           5236           8509
HTML                             41            708             82           6761
Perl                             31           1578           1526           6284
Bourne Shell                     26            410            794           3983
Pascal                            2            753            356           2711
m4                                2            293            171           1959
yacc                              4            261              0           1465
make                             42            216            397            606
Bourne Again Shell                5             75            138            393
Lisp                              1             65             76            291
Assembly                          1             34             30            125
Teamcenter def                    3             20              0             54
CSS                               2              8              0             36
D                                 1             10             29             23
C Shell                           2             24             41             20
--------------------------------------------------------------------------------
SUM:                            929          59877          87076         234878
--------------------------------------------------------------------------------
And GCC:
58499 text files.
   57735 unique files.                                          
  116986 files ignored.

http://cloc.sourceforge.net v 1.51  T=2794.0 s (19.4 files/s, 2776.8 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
C                             15704         350409         365314        1790292
Java                           6324         169097         641639         680923
C/C++ Header                   9712         135787         126549         659612
Ada                            4518         227132         307713         659411
C++                           12898          98958         127224         428592
Bourne Shell                    126          56208          48639         317192
HTML                            440          30509           5304         139342
Fortran 90                     2978          12630          22272          71604
m4                              163           6547           1879          57799
Assembly                        195           7543           9400          45902
XML                              56           3621            214          29679
make                            115           3193            941          20374
Teamcenter def                   76           2903            379          18601
Expect                          219           4214           7719          16402
Fortran 77                      381            917           2776          10119
Objective C                     275           1922           1092           7021
Perl                             24            760           1157           4121
XSLT                             20            563            436           2805
Bourne Again Shell               12            372            599           1675
CSS                               8            332            143           1427
awk                              10            221            373           1370
Python                            4            301            142           1311
yacc                              2            107            109            987
Pascal                            4            218            200            985
C#                                9            230            506            879
MUMPS                             4            121              0            521
Tcl/Tk                            1             72            112            393
ASP.Net                           7             37              0            203
lex                               1             36             27            150
NAnt scripts                      2             17              0            148
MSBuild scripts                   1              1              0            140
Javascript                        2             20             81            122
Haskell                          35             15              0            109
Lisp                              1              4             21             59
MATLAB                            3             13              0             52
DTD                               3             28             70             26
Fortran 95                        2             10              7             21
DOS Batch                         3              0              0              7
--------------------------------------------------------------------------------
SUM:                          54338        1115068        1673037        4970376
--------------------------------------------------------------------------------
Keep in mind, the GCC tree includes lots of languages because it's a compiler for lots of things. And it has a lot of tests.