Matthew Podwysocki's Blog

Architect, Develop, Inspire...

  • The Unit Testing Story in F# Revisited

    Last week I posted about some troubles I was having with the unit testing frameworks for F#.  Today, Brad Wilson announced the release of xUnit.net 1.0.1, which addresses the change in the F# compiler and adds integration with the just-released ASP.NET MVC Preview 3.  As always, you can find the latest bits on CodePlex.  The change in question is that F# now compiles modules as static classes, which earlier versions of the test frameworks did not expect. 

    Running the Tests Again

    Now I'm able to run my functions just as before, and the runner recognizes them.  Below is a simple example of some unit tests that determine whether numbers are prime.  I'm extending System.Int32 with an instance property that reports whether the value is prime.  Just to prove a point about how flexible F# really is, I'm also able to extend Int32 with static methods, something that you cannot do with C# extension methods.  More and more, I love the language itself and find myself trapped sometimes by the limits of C#.  But that's sidetracking, so let's get to the unit tests.

    #light

    #r @"D:\Tools\xunit-1.0.1\xunit.dll"

    open Xunit

    // Prime means greater than 1 with no divisor j in 2..sqrt(i)
    let isPrimeNumber(i) =
      let limit = int(sqrt(float(i)))
      let rec check j =
        j > limit || (i % j <> 0 && check(j + 1))
      i > 1 && check 2
       
    type System.Int32 with
      member i.IsPrime
        with get () = isPrimeNumber(i)
       
      static member IsPrimeNumber(x) =
        isPrimeNumber(x)

    [<Fact>]
    let IsPrime_WithPrimeNumber_ShouldReturnTrue() =
      Assert.True((7).IsPrime)
       
    [<Fact>]
    let IsPrime_WithNonPrimeNumber_ShouldReturnFalse() =
      Assert.False((21).IsPrime)
      Assert.False(System.Int32.IsPrimeNumber(45))

    And then when I run it through the GUI runner, sure enough, I get two passing tests.  I was asked last week at the Philly ALT.NET meeting about TDD with F#; I see no problem with it at all, and in fact I actively encourage it.  But you have to think about it in a different light when you move from talking about objects and behaviors to talking about functions and behaviors.

    Getting Going with Gallio

    As I mentioned last time, Jeff Brown has been hard at work to support the F# community as well.  I was finally able to get the right build of Gallio going after some mixups with getting the latest code.  Anyhow, I am now able to get these same tests to work using MbUnit version 3 and the Gallio Icarus Runner.  If you're not familiar with Gallio, it is an open platform of tools and runners that is extensible to all testing frameworks.  Jeff Brown talked about it on Hanselminutes with Brad Wilson of xUnit.net fame, Roy Osherove and Charlie Poole of NUnit on the Past, Present and Future of Unit Testing Frameworks.

    So, let's just modify the above code to migrate to Gallio with MbUnit version 3 and see how we do:

    #light

    #r @"D:\Program Files\Gallio\bin\Gallio.dll"
    #r @"D:\Program Files\Gallio\bin\MbUnit.dll"

    open MbUnit.Framework

    // Prime means greater than 1 with no divisor j in 2..sqrt(i)
    let isPrimeNumber(i) =
      let limit = int(sqrt(float(i)))
      let rec check j =
        j > limit || (i % j <> 0 && check(j + 1))
      i > 1 && check 2
       
    type System.Int32 with
      member i.IsPrime
        with get () = isPrimeNumber(i)
       
      static member IsPrimeNumber(x) =
        isPrimeNumber(x)

    [<Test>]
    let IsPrime_WithPrimeNumber_ShouldReturnTrue() =
      Assert.IsTrue((7).IsPrime)
       
    [<Test>]
    let IsPrime_WithNonPrimeNumber_ShouldReturn_False() =
      Assert.IsFalse((21).IsPrime)
      Assert.IsFalse(System.Int32.IsPrimeNumber(45))
     
    And we can notice through the Gallio runner that it's only detecting the MbUnit tests right now, unfortunately.  Hopefully that issue will be resolved soon.



    BDD Specs in F#?

    F# is a pretty flexible language for unit testing and even BDD style.  I wonder if we could take some lessons from specs, the BDD framework for Scala, and apply them to F#.  Just a thought...

    If you're not familiar with specs, it's a BDD framework with some interesting syntax that I'm still coming to terms with.  But the concept looks interesting.  Take a look at a quick example and see if it speaks to you.

    package podwysocki.specs

    object scalaSpecExample extends Specification {
      "A hello world spec" should {
        "return something" in {
           "hello" mustBe "hello"
        }
      }
    }

    As I've played around with Scala, this is a pretty interesting concept.  I'm much more a fan of F# as a language, but still there are some interesting pieces to Scala.  I'm also interested in using MSpec from Aaron Jensen at some point, but have a bit on my plate and other points of focus right now. 
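
    Just to make the thought above a little more concrete, here is a rough sketch of how the same shape might look in F#.  Everything in it is hypothetical: mustBe and spec are names I made up for illustration and do not come from specs or from any F# library, and the sketch assumes a recent F# compiler rather than the 1.9.x release used in this post.

    ```fsharp
    // Hypothetical helpers; neither name exists in specs or in any F# library.

    /// Fails with a message when the actual value differs from the expected one.
    let mustBe expected actual =
        if actual <> expected then
            failwithf "expected %A but got %A" expected actual

    /// Runs each named example and reports whether it passed or failed.
    let spec (subject: string) (examples: (string * (unit -> unit)) list) =
        examples
        |> List.map (fun (name, body) ->
            try
                body ()
                sprintf "%s should %s: passed" subject name
            with ex ->
                sprintf "%s should %s: FAILED (%s)" subject name ex.Message)

    let results =
        spec "A hello world spec" [
            "return something", (fun () -> "hello" |> mustBe "hello")
            "catch a mismatch", (fun () -> "hello" |> mustBe "goodbye")
        ]

    results |> List.iter (printfn "%s")
    ```

    The first example passes and the second is there to show what a failure report looks like.  A real framework would of course hook into a test runner instead of returning strings.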

    Wrapping It Up


    In the meantime, we have another testing framework to consider.  Me, personally, I prefer xUnit.net because of the functional aspects of Assert.Throws and so on.  But that choice is up to you, quite frankly.  There is a good story to be told here with regards to unit testing and F# that is not to be overlooked. 


  • Static versus Dynamic Languages - Attack of the Clones

    Very recently there has been an ongoing debate between statically and dynamically typed languages.  Since there seem to have been some Star Wars references, I thought I'd add my own.  I originally wanted to cover this as part of the future of C#, but I think it deserves its own topic.  There have been many voices in the matter; I've read all sides and thought I'd weigh in.  Right now I find myself with my feet in the statically typed community.  Dynamic typing definitely has its uses, but to me static verification is a key aspect.  And of course I do appreciate dynamic languages, especially those of the past, including Lisp, Erlang, etc.

    Here are some of the salvos that have been fired so far:


    The Salvos Fired

    First, Steve Yegge posted a transcript of his talk at Stanford called "Dynamic Languages Strike Back".  In the talk, he covers the history of dynamic languages, their performance, what can be done, and the politics of it all.  But at the end of the day, it comes down to the tools used.  It was a pretty interesting talk, but of course it dredged up some pretty strong feelings.  In turn, you had responses from Cedric Beust coming out in favor of statically typed languages, and Ted Neward, Ola Bini and Greg Young analyzing the arguments of the two of them.  I won't get into the me-too aspect of it all, but I encourage you to read the posts and the responses as well.

    Where Cedric lost me, though, is when he brought Scala into the argument.  To me, it was kind of nonsensical to mention it in this case.  And to say that pattern matching is a leaky abstraction is unfortunate and, I think, very wrong.  What functional languages give us is the ability to express what we want, and not necessarily how to get it.  Whether the compiler turns it into a switch statement, an if statement, or anything else doesn't matter, as long as the decision tree is followed.  I don't see any leakiness here.  So, that was a bad aside.  I'm not a huge fan of Scala either, but for entirely different reasons.  First off, the type inference isn't really as strong as it should be, and the syntax just doesn't seem as functional as I'd like.  F# and Scala tackle the problems in vastly different ways.

    Ola Bini, who has been advocating the polyglot programmer for some time, summed up the Steve versus Cedric posts very concisely in these two paragraphs:

    So let's see. Distilled, Steve thinks that static languages have reached the ceiling for what's possible to do, and that dynamic languages offer more flexibility and power without actually sacrificing performance and maintainability. He backs this up with several research papers that point to very interesting runtime performance improvement techniques that really can help dynamic languages perform exceptionally well.

    On the other hand Cedric believes that Scala is bad because of implicits and pattern matching, that it's common sense to not allow people to use the languages they like, that tools for dynamic languages will never be as good as the ones for static ones, that Java generics isn't really a problem, that dynamic language performance will improve but that this doesn't matter, that static languages really hasn't failed at all and that Java is still the best language of choice, and will continue to be for a long time.

    It seems that many of the modern dynamic languages are pretty flexible, but also not as performance oriented as the ones in the past.  Why is this?  It's a good question to ask.  And what can be done about it?  Ola takes the tack, and I think correctly so, that the tooling won't be the same or as rich for dynamic languages as it is for statically typed ones.  It simply can't be.  But that doesn't mean those tools won't exist; they'll just be different.  In the end, Ola argues for the polyglot programmer and for using each language to its strength.  He talks a bit more about this with Mike Moore on the Rubiverse podcast here.

    Impedance Mismatch?

    There was a topic discussed at the ALT.NET Open Spaces, Seattle fishbowl on the polyglot programmer which talked about the impedance mismatch between statically typed languages and dynamic ones.  What's great is that Greg Young got together a session with Rustan Leino and Mike Barnett from Microsoft Research on the Spec# team, John Lam from the IronRuby team, and me.  It was a great discussion which revolved around the flexibility that dynamic languages give you versus the static verification that you give up in exchange.  And there is a balance to be had.  The flexibility that Ruby and other dynamic languages give you also creates a bit more responsibility for ensuring correctness.  It's a great conversation and well worth the time invested.  But one of the benefits we're seeing from the CLR and in turn the DLR is the interop story, so that you could have your front end be Ruby, service layer in C#, rules engine in F#, Boo for configuration and so on.

    Anders Hejlsberg on C# And Statically Typed Languages

    As I noted earlier, Anders Hejlsberg was on Software Engineering Radio Episode 97 to discuss the future of C#.  Although Anders has his foot firmly in the statically typed camp, he sees the value of dynamic dispatch.  The phrase that was used and quite apt was "Static Programming but Dynamically Generated".  I think the metaprogramming story in C# needs to be improved for this to happen.  Doing Reflection.Emit isn't the strongest story for doing this, and certainly not easy.

    Where I think that C# can go, however, is towards making DSL creation much easier.  Boo, F# and others on the .NET platform are statically typed, yet go well beyond what C# can do in this arena.  Ayende has been doing a lot with Boo and making the language, although statically typed, very flexible and readable.  Ruby has a pretty strong story here, and C# and other languages have some lessons they can learn from it.

    Another example is Erlang, which is a dynamic language, yet very concurrent and pretty interesting.  C# and other .NET languages can learn a bit from Erlang.  I'm not sure Erlang itself will be taking off, as it would need some sort of sponsorship and some better frameworks before it could.  F# has learned some of those lessons in terms of messaging patterns, but not in terms of recovery and process isolation just yet.  I covered a bit of that in my previous post.

    Wrapping It Up

    It's a pretty interesting debate, and at the end of the day, it really comes down to what language meets your needs.  The .NET CLR has a pretty strong story of allowing other languages to interoperate that nicely complements the polyglot.  But, I don't think that static typing is going the way of the dodo and I also don't think dynamic typing will win the day.  Both have their places.  Sounds like a copout, I know, but deal with it.  I have a bit more to discuss on this matter, especially about learning lessons from Erlang, one of the more interesting languages that has seen a resurgence lately.



  • DC ALT.NET May Wrapup - Common Lisp and Applying Lessons Learned

    Last night's DC ALT.NET meeting was a great success.  We had Craig Andera, of PluralSight and FlexWiki fame, talk to us about Common Lisp and some of the lessons he learned.  It was great to see the guys from the FringeDC group join us as well.  I can definitely see a lot of overlap between the two groups as we both struggle to find new and innovative ideas for solving our hardest problems.  We tend to look outside of our community to find what has worked and what hasn't worked for each community.  Because at the end of the day, we're all developers with just different backgrounds.  I want to thank the Motley Fool for hosting the event for us, and we'd love to come back if you'll have us.

    The Presentation

    Craig spent much of his summer vacation actually learning Common Lisp.  The original idea was to learn Ruby, but why not go back to the grandfather of them all, Lisp?  I remembered Lisp from back in the college years, but had forgotten most of it.  It was very interesting to watch the presentation and learn how flexible a language it is, since we're just dealing with expression trees.  I can definitely see where other languages got their heritage.  All good things tend to come back to Lisp and Smalltalk!

    Anyhow, the important features we covered were object oriented programming with Common Lisp Object System (CLOS), macros, defining properties and methods, and even the .NET interop story with IronScheme and others.  It was a great time and I learned a lot.  If you wish to grab his presentation notes, you can find them here.

    Next time, I'll be presenting F#, so we'll keep on the functional programming style with some OOP mixed in, so I hope we see another great crowd for that.  As always, join the mailing list here if you want to learn more.


  • What Is the Future of C# Anyways?

    It was often asked during some of my presentations on F# and Functional C# about the future direction of C# and where I think it's going.  Last night I was pinged about this with my F# talk at the Philly ALT.NET meeting.  The question was asked, why bother learning F#, when eventually I'll get these things for free once they steal it and bring it to C#.  Being the language geek that I am, I'm pretty interested in this question as well.  Right now, the language itself keeps evolving at a rather quick pace as compared to C++ and Java.  And we have many developers that are struggling to keep up with the evolution of the language to a more functional style with LINQ, lambda expressions, lazy evaluation, etc.  There are plenty of places to go with the language and a few questions to ask along the way.

    An Interview With Anders Hejlsberg

    Recently on Software Engineering Radio, Anders Hejlsberg was interviewed about the past, present and future of C# on Episode 97.  Of course there are interesting aspects of the history of his involvement with languages such as Turbo Pascal and Delphi and some great commentary on Visual Basic and dynamic languages as well.  But the real core of the discussion was what problems we need to solve next.  And how will C# handle some of these features?  Will they be language constructs or built into the framework itself? 

    Let's go through some of the issues discussed.

    Concurrency Programming

    Concurrency programming is hard.  Let's not mince words about it.  Once we start getting into multiple processors and multiple cores, this becomes even more of an issue.  Are we using the machine effectively?  It's important because with the standard locks/mutexes it's literally impossible to have shared memory parallelism with more than two processors without at some point blocking and going serial. 

    The frameworks and the languages themselves are not currently designed to make concurrency easy.  The Erlang guys of course would disagree, since they started with that idea from the very beginning.  Since things are sandboxed to a particular process, they are free to mutate state to their heart's content, and when they need to talk to another process, they pick up the data and completely copy it over, so there is a penalty for doing so.  Joe Armstrong, the creator of Erlang, covered a lot of these things in his Erlang book "Programming Erlang: Software for a Concurrent World".

    Mutable State

    Part of the issue concerning concurrency is the idea of mutable state.  As far back as I remember, we were always taught in CS classes that you can feel free to mutate state as need be.  But that only really works when you've got a nicely serial application where A calls B calls C calls D, all on the same thread.  That's a fairly limiting idea as we start to scale out to multiple threads, machines and so on.  Instead, we need to focus on the mutability and control it in a meaningful way, not only through our language constructs, but through our design patterns as well.

    In the C# world, we have the ability to create opt-in immutability through the use of the readonly keyword.   This is really helpful to decide those fields that we don't really need to or want to modify.  This also helps the JIT better determine the use of our particular variable.  I'm not sure about performance gains, but that's not really the point of it all, anyways.  Take the canonical example of the 2D point such as this:

    public class Point2D
    {
        private readonly double x;
        private readonly double y;

        public Point2D() { }

        public Point2D(double x, double y)
        {
            this.x = x;
            this.y = y;
        }

        public double X { get { return x; } }

        public double Y { get { return y; } }

        public Point2D Add(Size2D size)
        {
            return new Point2D(x + size.Width, y + size.Height);
        }
    }

    We've created this class so as not to allow mutable state; instead, it returns a new object that you are free to work with.  This of course is a positive thing.  But can we go further in a language than just this?  I think so, and I think Anders does too.  Spec# and design by contract can take this just a bit further.  What if I can state that my object, as it is, is immutable?  That would certainly help the compiler to optimize.  Take for example Value Objects in the Domain Driven Design world.  How would something like that look?  Well, let's follow the Spec# example and mark my class as being immutable, meaning that once I initialize it, I cannot change it for any reason:

    [Immutable]
    public class Point2D
    {
       // Class implementation the same
    }

    This helps make it more transparent to the caller and the callee that what you have cannot be changed.  This enforces the behaviors for my member variables in a pretty interesting way.  Let's take a look at the actual C# generated by Spec# for the above code.  I'll only paste the relevant part, which is what it did to the properties.  I'll only show X, but the same happened for Y as well.

    public double X
    {
        [Witness(false, 0, "", "0", ""), Witness(false, 0, "", "1", ""), Witness(false, 0, "", "this@ClassLibrary1.Point2D::x", "", Filename=@"D:\Work\SpecSharpSamples\SpecSharpSamples\Class1.ssc", StartLine=20, StartColumn=0x21, EndLine=20, EndColumn=0x22, SourceText="x"), Ensures("::==(double,double){${double,\"return value\"},this@ClassLibrary1.Point2D::x}", Filename=@"D:\Work\SpecSharpSamples\SpecSharpSamples\Class1.ssc", StartLine=20, StartColumn=20, EndLine=20, EndColumn=0x17, SourceText="get")]
        get
        {
            double return value = this.x;
            try
            {
                if (return value != this.x)
                {
                    throw new EnsuresException("Postcondition 'get' violated from method 'ClassLibrary1.Point2D.get_X'");
                }
            }
            catch (ContractMarkerException)
            {
                throw;
            }
            double SS$Display Return Local = return value;
            return return value;
        }
    }

    What I like about F# and functional programming is the opt-out mutability, which means my classes, lists, structures and so on are immutable by default.  This makes you think long and hard about any particular mutability you want to introduce into your program.  It's not to say that there can be no mutability in your application, but you do need to think about it and isolate it in a meaningful manner.  Haskell takes a more hardline stance on the issue, and mutability can only occur in monadic expressions.  If you're not aware of what those are, check out F# workflows, which are closely analogous.  But by default, we get code that looks like this and is immutable:

    type Point2D = class
      val x : double
      val y : double
     
      new() = { x = 0.0; y = 0.0 }
     
      new(x, y) =
        {
          x = x
          y = y
        }

      member this.X
        with get() = this.x
       
      member this.Y
        with get() = this.y
    end

    So, as you can see, I'm not having to express the immutability, only the mutability if I so choose.  Very important differentiator.
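
    To make the opt-out idea a bit more concrete, here is a minimal sketch using an F# record instead of a class.  It assumes a recent F# compiler rather than the 1.9.x release used in this post, and the origin, moved and counter names are only there for illustration.

    ```fsharp
    // An F# record: every field is immutable unless it is explicitly marked mutable.
    type Point2D = { X: float; Y: float }

    let origin = { X = 0.0; Y = 0.0 }

    // "Changing" a point means building a new one with copy-and-update syntax.
    // Writing origin.X <- 3.0 instead would be rejected by the compiler.
    let moved = { origin with X = 3.0 }

    // Mutation has to be asked for, and it is visible at the declaration site.
    let mutable counter = 0
    counter <- counter + 1

    printfn "%A %A %d" origin moved counter
    ```

    The original value is untouched after the copy, which is exactly the property that makes sharing it across threads less scary.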

    Method Purity

    Method purity is another important topic as we talk about concurrent programming and such.  What I mean by this is that I'm not going to modify the incoming parameters or cause side effects; instead, I will produce a new object.  This has lasting effects if I'm going to be doing things on other threads.  Eric Evans talked about this topic briefly in the Supple Design chapter of his Domain Driven Design book.  The idea is to have side effect free functions as much as you can, and to carefully control where you mutate state through intention revealing interfaces and so on.

    But, how do you communicate this?  Well, Command-Query Separation gets us part of the way there.  That's the idea of having the mutation and side effects in your command functions where you return nothing, and then queries which return data but do not modify state.  Spec# can enforce this behavior as well.  To be able to mark our particular functions as being pure is quite helpful in communicating whether I can expect a change in state.  Therefore I know whether I have to manage the mutation in some special way.  To communicate something like that in Spec#, all I have to do is something like this:

    [Pure]
    public Point2D Add(Size2D size)
        requires size != null;
    {
        return new Point2D(x + size.Width, y + size.Height);
    }

    This becomes part of the method contract and some good documentation as well for your system.
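
    For comparison, here is a rough sketch of the same Command-Query split in plain F#.  F# has no [Pure] attribute that the compiler checks, so the separation here is only a convention: queries return values and commands return unit.  Size2D is never shown in this post, so its shape is an assumption, and PointLog is a made-up type for illustration.

    ```fsharp
    // Assumed shapes; Size2D and PointLog are hypothetical.
    type Size2D = { Width: float; Height: float }
    type Point2D = { X: float; Y: float }

    // Query: side effect free, builds and returns a new value.
    let add (size: Size2D) (point: Point2D) =
        { X = point.X + size.Width; Y = point.Y + size.Height }

    // The mutable state lives in one place, behind one small type.
    type PointLog() =
        let entries = ResizeArray<Point2D>()
        // Command: changes state and returns unit.
        member this.Record(point: Point2D) : unit = entries.Add point
        // Query: reports state without changing it.
        member this.Count = entries.Count

    let start = { X = 1.0; Y = 2.0 }
    let shifted = add { Width = 10.0; Height = 20.0 } start

    let log = PointLog()
    log.Record shifted
    printfn "%A -> %A (logged %d)" start shifted log.Count
    ```

    The return type is the signal to the caller: anything that returns unit is there for its effect, and anything else can be called as often as you like.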

    Asynchronous Communication and Messaging

    Another piece of interest is messaging and process isolation.  The Erlang guys figured out a while ago that you can have mutation as well as mass concurrency, fail safety and so on with process isolation.  Two ideas come to mind from other .NET languages.  An important distinction must be made between two concurrency models: shared-memory concurrency and message-passing concurrency.  Messaging and asynchronous communication are key foundations for concurrent programming. 

    In F#, there is support for mailbox-style message processing.  This is already popular in Erlang, which is probably where the idea came from.  The idea is that a mailbox is a message queue that you can listen to for a message that is relevant to the agent you've defined.  This is implemented in the MailboxProcessor class in the Microsoft.FSharp.Control.Mailboxes namespace.  A simple receive looks like this:

    #light

    #nowarn "57"

    open Microsoft.FSharp.Control.CommonExtensions
    open Microsoft.FSharp.Control.Mailboxes

    let incrementor =
      new MailboxProcessor<int>(fun inbox ->
        let rec loopMessage(n) =
          async {
                  do printfn "n = %d" n
                  let! message = inbox.Receive()
                  return! loopMessage(n + message)
                }
        loopMessage(0))

    Robert Pickering has more information about the Erlang style message passing here.
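
    The agent above only receives.  Here is a slightly fuller sketch that also posts messages and asks for a reply.  It assumes a recent F# compiler, where MailboxProcessor is available without opening the Mailboxes namespace, and the Message type is made up for illustration.

    ```fsharp
    // Hypothetical message type for this sketch.
    type Message =
        | Add of int
        | GetTotal of AsyncReplyChannel<int>

    // The running total lives only inside the agent's loop; nothing else can touch it.
    let incrementor =
        MailboxProcessor.Start(fun inbox ->
            let rec loop total =
                async {
                    let! message = inbox.Receive()
                    match message with
                    | Add n -> return! loop (total + n)
                    | GetTotal reply ->
                        reply.Reply total
                        return! loop total
                }
            loop 0)

    // Post is fire-and-forget; PostAndReply blocks until the agent answers.
    incrementor.Post(Add 5)
    incrementor.Post(Add 7)
    let total = incrementor.PostAndReply(fun reply -> GetTotal reply)
    printfn "total = %d" total
    ```

    Because the mailbox handles messages in the order they arrive, the reply comes back only after both Add messages have been processed.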

    Now, let's come back just a second.  Erlang also introduces another concept that Sing# and the Singularity OS took up.  It's a concept called the Software Isolated Process (SIP).  The idea is to isolate your processes in a little sandbox.  Therefore if you load up a bad driver or something like that, the process can die and then spin up another process without having killed the entire system.  That's a really key part of Singularity and quite frankly one of the most intriguing.  Galen Hunt, the main researcher behind this talked about this on Software Engineering Radio Episode 88.  He also talks about it more here on Channel9 and it's well worth looking at.  You can also download the source on CodePlex and check it out.

    Dynamic C#?

    As you can probably note, Anders is pretty much a static typing fan and I'd have to say that I'm also firmly in that camp as well.  But, there are elements that are intriguing such as metaprogramming and creating DSLs which are pretty weak in C# as of now.  Sure, people are trying to bend C# in all sorts of interesting ways, but it's not a natural fit as the language stands now.  So, I think there can be some improvements here in some areas.

    Metaprogramming

    Metaprogramming is another area that was mentioned as particularly interesting.  As of right now, it's not an easy fit for C#.  But once again, F# has features such as quotations built in for metaprogramming, because that's what it was created to be: a language built to create other languages.  Tomas Petricek is by far one of the authorities on the subject, as he has leveraged it in interesting ways to create AJAX applications.  You can read his introduction to metaprogramming here and about his AJAX toolkit here.  Don Syme has also written a paper about leveraging meta-programming with F# which you can find here.  But I guess I have to ask the question: does C# need this, or shouldn't we just use F# for what it's really good at and not shoehorn yet another piece onto the language?  The same could be said of Ruby and its power with metaprogramming as well; why not use the best language for the job?

    Dynamic Dispatch

    Dynamic dispatch is an interesting idea as well.  This is the idea that you can invoke a method that doesn't exist on an object and, instead of failing, the system figures out where to send the call.  In Ruby, we have the method_missing concept, which lets us define the behavior when the method being invoked is not found.  Anders thought it was an intriguing idea and something to look at.  This might help in the creation of DSLs as well, since you can define that behavior even though the method may not exist at all.
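
    F# has a small hook in this direction: the dynamic lookup operator (?), which you define yourself and which receives the member name as a string.  The sketch below uses it to imitate the method_missing idea.  It is only an illustration, assuming a recent F# compiler; Responder, fallback and the handler names are all hypothetical, and this is not how Ruby or the DLR actually implement dispatch.

    ```fsharp
    // Hypothetical type: resolves names at run time instead of at compile time.
    type Responder(known: Map<string, int -> int>, missing: string -> int -> int) =
        member this.Dispatch(name: string) : int -> int =
            match Map.tryFind name known with
            | Some handler -> handler
            // No handler with that name: hand the call to the fallback,
            // which plays the role of method_missing here.
            | None -> missing name

    // target?name is rewritten by the compiler to (?) target "name".
    let (?) (target: Responder) (name: string) = target.Dispatch name

    // Fallback: report the unknown name and return the argument unchanged.
    let fallback (name: string) (x: int) =
        printfn "'%s' is not defined, returning %d unchanged" name x
        x

    let responder = Responder(Map.ofList [ "twice", (fun x -> x * 2) ], fallback)

    let known = (responder?twice) 21
    let unknown = (responder?thrice) 21
    printfn "%d %d" known unknown
    ```

    The trade-off is the one discussed above: responder?thrice compiles happily, so a typo in the name only shows up at run time.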

    In the Language or the Framework?

    Another good question, though, is whether these features belong in the language itself or in the framework.  The argument here is that if you put a lot of constraints on the language syntax, you might prematurely age the language and, as a result, see it decline in usage.  Instead, the idea is to focus on the libraries to make these things available.  For example, bringing the MailboxProcessor functionality to all languages might not be a bad idea.  Those sorts of concepts around process isolation would be more of a framework concept than a language concept.  But it's an interesting debate as to what belongs where.  Because at the end of the day, you do need some language differentiation when you use C#, F#, Ruby, Python, C++, etc., or else what's the point of having all of them?  To that point, I've been dismayed that VB.NET and C# have mirrored each other pretty well and tried to make themselves equal, and I wish they would just stop.  Let VB find a niche and let C# find its niche. 

    Conclusion


    Well, I hope this little discussion got you thinking as well about the future of C# and the future of the .NET framework as well.  What does C# need in order to better express the problems we are trying to solve?  And is it language specific or does it belong in the framework?  Shouldn't we just use the best language for the job instead of everything by default being in C#?  Good questions to answer, so now discuss...


  • F# and Unit Testing - Some New Developments

    This past week, I've been focusing a lot of my attention on F# in terms of the presentations that I have been giving.  I'm busy preparing for the Philly ALT.NET meeting tomorrow night on the very subject.  An important aspect of some of the presentation has been unit testing.  There is some good news and some not so good news when it comes to this.  Those who have been following my pursuit of good unit tests in F# know that xUnit.net has been a good option for creating static unit tests without the pomp and circumstance of creating a new class with member functions.

    MbUnit Support for F#

    Very recently Jeff Brown announced on his blog that he's now supporting tests without the requirement for the TestFixtureAttribute to be marked on your class in MbUnit.  This is quite helpful for F# tests and has joined the ranks of xUnit.net in terms of giving me another tool in my toolbelt.  There were other bugs that were filed that also were hindering good unit testing in F# that have been worked out as well. 

    So, I should be able to do this below and everything should just work:

    #light

    #r @"D:\Program Files\Gallio\bin\MbUnit2\MbUnit.Framework.dll"
    open MbUnit.Framework

    let FilterCall protocol port =
      match(protocol, port) with
      | "tcp", _ when port = 21 || port = 23 || port = 25 -> true
      | "http", _ when port = 80 || port = 8080 -> true
      | "https", 443 -> true
      | _ -> false
     
    [<Test>]
    let FilterCall_WithHttpAndPort80_ShouldReturnTrue() =
      Assert.IsTrue(FilterCall "http" 80)

    But... this is not the case.  It doesn't recognize that my tests exist.  Why? 

    The New F# Release

    With the newest release of F#, version 1.9.4.15, there was a change that made the class encapsulating the tests a static class.  So, if I were to look at it through .NET Reflector, it would look like this:

    [CompilationMapping(SourceLevelConstruct.Module)]
    public static class MbUnitTesting
    {
        // Methods
        public static bool FilterCall(string protocol, int port) /// Method under test

        [Test]
        public static void FilterCall_WithHttpAndPort80_ShouldReturnTrue()
        {
            Assert.IsTrue(FilterCall("http", 80));
        }
    }

    This can be a problem because, through reflection, any static class is marked abstract, since you cannot create an instance of it.  The unit testing frameworks cannot process abstract classes yet.  So this is a work in progress, but there has to be some strategy to get around it, as reflection gives us no single flag that says a class is static; a static class shows up only as a type that is both abstract and sealed.
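
    Here is a small sketch of what that looks like through reflection.  It assumes a recent F# compiler and is not how xUnit.net or MbUnit actually detect fixtures; Sample, Marker and isStaticClass are made-up names, and the marker type is only there to get hold of the module's System.Type.

    ```fsharp
    module Sample =
        // Hypothetical marker, used only to reach the enclosing module's type.
        type Marker = class end
        let answer = 42

    // Types declared inside an F# module are nested in the class generated for it.
    let moduleType = typeof<Sample.Marker>.DeclaringType

    // A static class carries both flags; an ordinary abstract class is not sealed.
    let isStaticClass (t: System.Type) = t.IsAbstract && t.IsSealed

    printfn "%s: abstract=%b sealed=%b" moduleType.Name moduleType.IsAbstract moduleType.IsSealed
    printfn "Stream is static: %b" (isStaticClass typeof<System.IO.Stream>)
    printfn "StringBuilder is static: %b" (isStaticClass typeof<System.Text.StringBuilder>)
    ```

    A runner that skips every abstract type throws the module away along with the genuinely abstract base classes, which matches the behavior described above.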

    The Workaround

    The workaround for the issue is pretty simple, which is to actually use classes when creating your unit tests in F#.  I know it's a little bit of a pain, but the unit testing teams are aware of the issue and hopefully we'll have a fix soon enough.  In the meantime, we'll have to create classes such as this in MbUnit:

    [<TestFixture>]
    type MbUnitTests = class
      new() = {}
     
      [<Test>]
      member x.FilterCall_WithHttpAndPort80_ShouldReturnTrue() =
        Assert.IsTrue(FilterCall "http" 80)
     
    end

    or in xUnit.net:

    type XUnitTests = class
      new() = {}
     
      [<Fact>]
      member x.FilterCall_WithHttpAndPort80_ShouldReturnTrue() =
        Assert.True(FilterCall "http" 80)
     
    end


    Then the Gallio Icarus Runner is free to pick up the tests and run them as expected.  Like I said, hopefully the issue will be fixed soon.


  • NoVA Code Camp Wrapup and Thoughts

    This past weekend was the Northern Virginia Code Camp in Reston, Virginia.  There was a pretty good turnout for my two sessions, which were the first two of the day.  Unfortunately, I could not stay the whole day to attend some of the other sessions, including fellow DC ALT.NET'er John Morales on NServiceBus.  I'll have to catch it soon enough, because the ideas around it are pretty intriguing; I've played with TIBCO and a few others, so another tool in my toolbelt is not a bad thing.  I did two sessions, one on Functional C# and the other an introduction to F#.  I'm not quite ready to post my slides, as I have a few more presentations on the subject to give and I'm still tweaking them as I go, so they will be a bit more refined.

    Lessons Learned For Me

    One of the things I came away with is that I need to schedule a little better.  I would have much preferred to have the F# and Foundations of Functional Programming talk come first, as it would give people more of a basis for what functional programming is and how it is expressed in a more purely functional language such as F#.  Next time I should be a bit more upfront about this and get the schedule changed accordingly.  Giving two sessions in a row is also something that could be improved.

    Functional C#

    The first talk I gave was on Functional C#.  The goal was to take ideas from the more purely functional world of Haskell, OCaml and F# and apply them in a C#-ish manner.  Some features of functional programming languages, such as pattern matching, don't translate easily to C#, so there are things that apply and some things that don't.

    Some of these lessons include:

    • Immutable types
      Focus on immutable types with opt-in mutability (as in F#) instead of mutable by default with opt-in immutability (as in C#).  Remember, I've been talking about this in the context of multi-threaded, parallel programming, where it is absolutely crucial to mutate only in very controlled circumstances, keeping the mutation in isolation.  This also applies to the Domain Driven Design world where I was coming from, in regards to Value Objects and supple design.

    • Side Effect Free (Pure) Functions
      The idea here is to control the side effects in your system.  Ideally, in the functional programming world, calling a function such as myList |> List.map (fun x -> x * x) returns another list, not the list you gave it with mutated state.  This is important once again as we get into the concurrent programming paradigm, to focus on method purity and follow the Command Query Separation (CQS) principle.  Once again, this has roots in Domain Driven Design as well, when following Intention Revealing Interfaces and Supple Design.

    • Functions as First Class Types
      The delegate in the .NET world has made the function pointer a first class citizen.  With the use of extension methods, generics and lambda expressions, we are now able to take full advantage of such critical computations as Reduce, Filter, Map and other higher-order function operations.  Other areas in this realm include currying and partial application of functions.

    • Lazy Evaluation
      In functional programming we have the ability to specify infinite sequences, such as all Fibonacci numbers or some other number sequence.  The last thing we'd want is to evaluate that eagerly and ask for its length.  Haskell takes the approach of being lazy by default.  But that's not practical in a framework like .NET, where we want deterministic behavior in the execution order of our code.  So, instead, languages like C# and F# are eager evaluators.  But that doesn't mean we cannot take advantage of laziness.  In fact, from .NET 2.0 onward, IEnumerable<T> gives us a somewhat lazy execution model: each value is only calculated when we call MoveNext() on the enumerator.  So, when you think about it, LINQ follows that deferred execution model and is pretty powerful for working with large sequences.
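
    The four lessons above can be sketched in a few lines.  The sketch is in F# for brevity (the same ideas map onto readonly types, delegates and LINQ in C#), and every name in it, such as Point, addTen and fibonacci, is made up for illustration.

```fsharp
// Immutable types: a record cannot be changed in place, only copied with new values.
type Point = { X: int; Y: int }
let origin = { X = 0; Y = 0 }
let moved = { origin with X = 5 }

// Side effect free functions: List.map returns a new list and leaves its input alone.
let numbers = [1; 2; 3; 4]
let squares = numbers |> List.map (fun x -> x * x)

// Functions as first class values, with partial application.
let add x y = x + y
let addTen = add 10

// Lazy evaluation: an infinite Fibonacci sequence, computed only on demand.
let fibonacci = Seq.unfold (fun (a, b) -> Some(a, (b, a + b))) (0, 1)
let firstTen = fibonacci |> Seq.take 10 |> Seq.toList

printfn "%A %A" origin moved
printfn "%A" squares
printfn "%d" (addTen 5)
printfn "%A" firstTen
```

    Only the ten requested Fibonacci numbers are ever computed; asking the infinite sequence for its length would never return.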

    So, as you can see, there are quite a few lessons the C# developer can learn from functional programming and F#.  The key really is when to apply this knowledge and marry the ideas of OOP and FP in a cohesive manner.  Speaking of which, Anders Hejlsberg was recently on Software Engineering Radio Episode 97 to talk about the past, present and future of C#.  In there, he talked about some of the more functional programming ideas that have been incorporated into C#, a focus on immutability, and how we can make concurrent programming easier.  Definitely not time to stick a fork in C# just yet, as I think there are plenty of ideas yet to come to express some of these problems a little bit better.  In my next post, I'll dive a little deeper into Anders' appearance on SE Radio and some of the interesting things going on around static versus dynamic typing.

    Introduction to F#

    My second talk for the day was an introduction to functional programming with F#.  This was more of my bread and butter presentation on explaining functional programming, as I have with my Adventures in F# series.  In this presentation, I focused on many of the 101 level aspects of functional programming and how they are implemented in F#.  Of course there was some deviation as I explored some of the features that are more library based and exclusive to F# (async workflows, quotations, etc).

    Often, the question comes up about the value proposition of F#.  Yes, many can get behind the ideals of the language, but would rather have C# adopt most of these features and not have to learn another language.  It strikes me as a bit sad that many people are not stretching their wings outside of their comfort zone of the MSDN help files and their language of choice.  Learning a new language with a new paradigm is essential to learning.  This doesn't mean learning C# coming from VB.NET, but instead gravitating towards functional programming with a language that fully supports it (F#, OCaml, Haskell, Erlang, Lisp/Scheme, etc), or towards a dynamically typed language (Ruby, Python, etc).  Then, once you have fully understood and become more fluent in said paradigms, you can take those lessons and express your solutions to your problems in more interesting ways.

    But, back to F# for a moment here.  What is the value of F# and why use it?
    • Concurrency Programming Is Hard
      It is hard, and don't let anyone fool you otherwise.  With locks, mutexes and so on, it is extremely difficult to write a correct concurrent program, even on a dual processor machine.  Instead, with a focus on immutability, side effect free functions and asynchronous workflows, concurrent programming becomes a bit easier.  Without the first two, concurrency is quite difficult.  Messaging is first class through the MailboxProcessor, with lessons learned from Erlang.

    • Representing Data Can be Hard
      With the ideas of tuples, records and discriminated unions, F# gives you a powerful new way of representing your data succinctly.  Then to be able to use such techniques as pattern matching against them makes for an even more compelling case.

    • Creating Other Languages Is Hard
      F# has a firm foundation as a language used to create other languages.  With first class support for lexer and parser generators (fslex and fsyacc), tokenizing and parsing become a bit easier.  Also, the inclusion of quotations as a part of the libraries makes really interesting metaprogramming constructs possible, such as Tomas Petricek's journey into AJAX and metaprogramming.
    Of course there are more than just this simple list of three areas of focus, but the idea is to download it, kick the tires and see if it feels right to you.  That's the important part.  Spending a good amount of time to become fluent in it will definitely help and there is a thriving community waiting to help.  All you have to do is ask...
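
    As a taste of the first two points, here is a minimal sketch that combines a discriminated union, pattern matching and a MailboxProcessor agent.  The message type CounterMessage and the agent named counter are invented for this example; only MailboxProcessor and AsyncReplyChannel come from the F# core library.

```fsharp
// The messages the agent understands, represented as a discriminated union.
type CounterMessage =
  | Add of int
  | Get of AsyncReplyChannel<int>

// The agent owns its state (the running total); nobody else can touch it,
// so no locks are needed.
let counter =
  MailboxProcessor.Start(fun inbox ->
    let rec loop total =
      async {
        let! message = inbox.Receive()
        match message with
        | Add n -> return! loop (total + n)
        | Get reply ->
            reply.Reply total
            return! loop total
      }
    loop 0)

// Post is fire-and-forget; messages are queued and handled one at a time, in order.
[1 .. 10] |> List.iter (fun n -> counter.Post(Add n))

// PostAndReply blocks until the agent answers the Get message.
let total = counter.PostAndReply(fun reply -> Get reply)
printfn "Total: %d" total
```

    Because the Get message is queued behind the ten Add messages, the reply reflects all of them.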

    Teaching Versus Speaking

    D'arcy Lussier had an interesting post which looked at a Scott Bellware tweet about teachers versus speakers.  It's a pretty good post, but I enjoy the comments a bit more on the subject.  So, when you get up in front of that podium, just think, are you just another speaker, or are you being a teacher?  Is it a dialog or death by PowerPoint?

    Wrapping It Up

    It was another great experience at this code camp, but I think the one hour sessions just aren't enough sometimes to fully get into any particular subject.  I sometimes leave a session wanting more, not because the presentation wasn't good, but because there wasn't enough time to express the full intent of it.  I could have gone on for hours about functional programming and F#, as I barely scratched the surface.  Maybe in the future there will be a better venue for this, but I hope to get more in depth in future iterations.

    Don't forget that I'll be at Philly ALT.NET this Wednesday night for an F# presentation and then Thursday night is the DC ALT.NET meeting in Alexandria on Applying Lessons Learned from Common Lisp with Craig Andera!


  • DC ALT.NET - 5/22/2008 - Applying Lessons Learned from Lisp

    The May meeting of DC ALT.NET has been scheduled for May 22nd from 7-9PM.  Check out our mailing list and site for more information as it becomes available.  If you're in the Washington DC area, come check us out.  This month, we're having Craig Andera, of FlexWiki fame, speak about applying lessons learned from learning Lisp and how to be a better programmer because of it.  That's one of the true strengths of DC ALT.NET, or even the ALT.NET movement as a whole: we look outside our .NET community to find better ways to solve problems and apply lessons learned from each community, and Lisp is one of those communities.  Dave Laribee, Jeremy Miller and Chad Myers spoke about this on the first episode of the ALT.NET Podcast with Mike Moore.  If you haven't listened to it yet, I highly recommend that you do.

    Applying Lessons Learned from Lisp

    There has been a lot of talk and some hype (deserved and undeserved) around functional programming lately, partly due to people looking for ways to express parallel applications and multi-core scenarios.  Some might find it interesting that functional programming has its roots back in the 1950s, well before Object Oriented Programming, yet has been relegated mostly to the research community.

    Back in 1958, John McCarthy from MIT designed Lisp, and it has been a mainstay in the Artificial Intelligence field ever since.  Since that time, quite a few Lisp dialects have popped up, due to the fact that many of the universities and labs did not share their information before everyone was connected to ARPANET.  Two that have really emerged are Common Lisp, an attempt to standardize the Lisp variants into one, and Scheme.  Lisp is a strongly typed dynamic language, meaning that types are checked at run time: if a function does not exist when the code is evaluated, an error is raised then rather than at compile time.  By its nature, it is a functional language with such elements as lists, lambdas and so on.  One of the interesting additions to Lisp is the Common Lisp Object System (CLOS), which adds OOP functionality to the Common Lisp language.  It's a bit different from what we think of as OOP in C++, C#, Java and other OO languages.

    In the .NET world, we have IronLisp and IronScheme.  IronLisp has been deprecated in favor of IronScheme going forward.  The beauty of .NET is that these dynamic languages can be built on top of the DLR with relative ease, which truly speaks to how flexible the type system and CLR are.  Making OOP and FP first class citizens within the .NET space is pretty interesting as well.

    Back to Lisp, if you want to hear more, you should check out Dick Gabriel's appearance on Software Engineering Radio Episode 84 on Common Lisp.  Dick has been a noted authority in the Lisp space for some time and was the organizer for OOPSLA back in 2007.  It's one of their better episodes, so I'd encourage you to listen to it.  I know I did, but then again, I have a pretty long commute, so I have time to listen to these things.

    Who We Are

    So, as I said, I run the DC ALT.NET group, which meets monthly to discuss ways of bettering ourselves.  You won't find us doing what most other user groups in the area do; it is more of an intimate environment for learning and discussion.  Typically we have the first hour for the topic of discussion, this month being Lisp, and then the second hour is Open Spaces, which encourages everyone to speak and bring a topic they are passionate about.  As always, we're looking for sponsors to help us out along the way.  Since we're in the Washington DC area, and traffic can be bad, we tend to move from month to month to accommodate.  That may change in the future as we grow, but for now, it works nicely.  So, if you're in the DC area, come check us out.  And, hopefully I'll get Dave Laribee to stop by before too long as well...

    Where I'll Be

    In addition to the meeting next week, I will be speaking at the Philly ALT.NET group meeting on May 21st on F# and an introduction to Functional programming.  This should be a great session and I hope there will be a good crowd for it.  Also, this weekend is the NoVA Code Camp in which I have two sessions, "Introduction to F# and Functional Programming" and "Improve Your C# with Functional Programming Ideas".  Look forward to seeing everyone at those events!


  • Concurrency with MPI in .NET

    In my previous post, I looked at some of the options we have for concurrent programming in .NET applications.  One of the interesting, yet specialized, ones is the Message Passing Interface (MPI).  Microsoft made the move into the high performance computing space with the Windows Server 2003 Compute Cluster Server SKU.  This allowed developers to run their algorithms using MPI on a massively parallel scale.  And now with the Windows Server 2008 HPC SKU, it is a bit improved, with WCF support for scheduling and such.  If you're not part of the beta and are interested, I'd urge you to go through Microsoft Connect.

    When Is It Appropriate?

    When I'm talking about MPI, I'm talking in the context of High Performance Computing.  This consists of having the application run within a scheduler on a compute cluster which can have tens or hundreds of nodes.  Note that I'm not talking about grid computing such as Folding@Home, which distributes work over the internet.  Instead, you'll find plenty of need for this in the financial sector, the insurance sector for fraud detection and data analysis, the manufacturing sector for testing and calculating limits, thresholds and whatnot, and even in rendering computer animation in film.  There are plenty of other scenarios out there, but it's not for your everyday business application.

    I think the real value of .NET here is being able to read from databases and communicate with other servers through WCF or some other communication protocol, instead of being stuck in the C or Fortran world to which the HPC market has been relegated.  Developers can cut down on the code necessary for a lot of these applications by using the built-in functions that we get with the BCL.

    MPI in .NET

    The problem has been that running these massively parallel algorithms left us limited to Fortran and C systems.  This was ok for most things that you would want to do, but cobbling together class libraries wasn't my ideal.  Instead, we could use a lot of the things that we take for granted in .NET, such as strong types and object oriented and functional programming constructs.

    The Boost libraries for MPI in C++ were made available very recently by Indiana University.  You can read more about it here.  This allows the MPI programmer to take advantage of many C++ constructs that are not available in regular C, such as OOP.  Instead of dealing with functions and structs, there is a full object model for dealing with messaging.

    At the same time as the Boost C++ Libraries for MPI were coming out, a .NET implementation based upon the C++ design was made available as MPI.NET.  It's basically a thin veneer over msmpi.dll, Microsoft's implementation of MPI, which is based on MPICH2.  For a list of all operation types supported, check the API documentation here for the raw MSMPI implementation.  This will give you a better sense of the capabilities than the .NET implementation can.

    The way to think of this is that several nodes will be running an instance of your program at once.  So, if you have 16 nodes assigned through your scheduled job, it will spin up 16 instances of the same application.  When you do this on a test machine, you'll notice 16 instances of it in your task manager.  Kind of cool actually.  Unfortunately, MPI.NET is missing some of the neat features in MPI, such as "Ready Sends" and "Buffered Sends", but it does include nice things such as the Graph and Cartesian communicators, which are essential in MPI.

    You'll need the Windows Server 2003/2008 HPC SDK in order to run these examples, so download them now, and then install MPI.NET to follow along.

    Messaging Patterns

    With this, we have a few messaging patterns available to us.  MPI.NET provides several, and we will look at them and how best to use them.  I'll include samples in F#, as it's pretty easy to do and I'm trying to get across the point that F# is a better language than C# for expressing the messaging we're doing.  But these simple examples are not hard to switch back and forth between the two.

    To execute these, just type the following:

    mpiexec -n <Number of Nodes You Want> <Your program exe>

    Broadcast

    A broadcast is a process in which a single process (a la a head node) sends the same data to all nodes in the cluster.  We want to be as efficient as possible when sending out this data for all to use, without having to loop through all sends and receives.  This is good when a particular root node has a value that the rest of the cluster needs before continuing.  Below is a quick example in which the head node sets the value to 42 and the rest will receive it.

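    #light

    #R @"D:\Program Files\MPI.NET\Lib\MPI.dll"

    open System
    open MPI

    let main(args:string[]) =
      // MPI.Environment initializes MPI and finalizes it when it is disposed
      using(new Environment(ref args))(fun _->
        let commRank = Communicator.world.Rank

        // Only the head node (rank 0) sets the value before the broadcast
        let intValue = ref 0
        if commRank = 0 then
          intValue := 42
         
        // Every process calls Broadcast; afterwards they all hold the root's value
        Communicator.world.Broadcast(intValue, 0)
        Console.WriteLine("Broadcasted {0} to all nodes", !intValue)
      )
    // System.Environment is spelled out because open MPI brings in MPI.Environment
    main(System.Environment.GetCommandLineArgs())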

    Blocking Send and Receive

    In this scenario, we're going to use the blocking send and receive pattern.  This will not allow the program to continue until it gets the message it is waiting for.  This is good for times when you need a particular value from the head node, or from any other particular node, before proceeding to your next function.

    #light

    #R @"D:\Program Files\MPI.NET\Lib\MPI.dll"

    open System
    open MPI

    let main (args:string[]) =
      using(new Environment(ref args))( fun _ ->
        let commRank = Communicator.world.Rank
        let commSize = Communicator.world.Size
        let intValue = ref 0
        match commRank with
        | 0 ->
          // The head node blocks on one Receive per worker, from any source and tag
          [1 .. (commSize - 1)] |> List.iter (fun i ->
            Communicator.world.Receive(Communicator.anySource, Communicator.anyTag, intValue)
            Console.WriteLine("Result: {0}", !intValue))
        | _ ->
          // Each worker computes its value and sends it to rank 0 with tag 0
          intValue := 4 * commRank
          Communicator.world.Send(!intValue,0, 0)
      )

    main(System.Environment.GetCommandLineArgs())

    What I'm doing here is letting the head node, rank 0, do all the receiving work.  Note that I don't particularly care where the source was, nor what the tag was.  I can, however, specify that I wish to receive from a certain node and with a certain data tag.  If it's a worker process, then I'm going to go ahead and calculate the value and send it back to the head node, rank 0.  The head node will wait until it has received a value from any node and then print it out.  The Send and Receive methods that I'm using are generic methods.  Behind the scenes, in order to send, the system will go ahead and serialize your object into an unmanaged memory stream and throw it on the wire.  This is one of the fun issues when dealing with marshaling to unmanaged C code.

    Nonblocking Send and Receive

    In this scenario, we are not going to block as we did before when sending.  We want the ability to continue on doing other things while the value is being sent, even though the receivers might need that value before continuing.  Eventually we can force completion of the send through the request object that the call returns, and then at a certain point we can set up a barrier so that nobody can continue until everyone has hit that point in the program.  The sample below has each worker send a multiplied value without blocking.  The head node still has to wait to receive each value, and then everyone waits at the barrier until the job is done.

    let main (args:string[]) =
      using(new Environment(ref args))( fun _ ->
        let commRank = Communicator.world.Rank
        let commSize = Communicator.world.Size
       
        let intValue = ref 0
        if commRank = 0 then
          [1 .. (commSize - 1)] |> List.iter (fun _ ->
            Communicator.world.Receive(Communicator.anySource, Communicator.anyTag, intValue)
            Console.WriteLine("Result: {0}", !intValue))
        else
          intValue := 4 * commRank
          // ImmediateSend returns right away; other work could happen before Wait
          let request = Communicator.world.ImmediateSend(!intValue,0, 0)
          request.Wait() |> ignore
         
        // Nobody continues past this point until every process has reached it
        Communicator.world.Barrier()
      )
     
    main(System.Environment.GetCommandLineArgs())

    Gather and Scatter

    The gather process takes a value from each process and sends them all to the root process as an array for evaluation.  This is a pretty simple operation for taking the values from all nodes and combining them on the head node.  What I'm doing is a simple calculation of commRank * 3 on every node, and gathering all of those values on the head node for evaluation.

    let main (args:string[]) =
      using(new Environment(ref args))( fun e ->
        let commRank = Communicator.world.Rank
        // Every process contributes one value
        let intValue = commRank * 3
       
        match commRank with
        | 0 ->
          // The root (rank 0) receives an array with one entry per process
          let ranks = Communicator.world.Gather(intValue, 0)
          ranks |> Array.iter(fun i -> System.Console.WriteLine(" {0}", i))
        | _ -> Communicator.world.Gather(intValue, 0) |> ignore
      )
     
    main(System.Environment.GetCommandLineArgs())

    Conversely, scatter does the opposite which takes a row from the given head process and splits it apart to be spread out among all processes.  In this exercise I will go ahead and create a mutable array that only the head node will modify.  From there, I will scatter it across the rest of the nodes to pick up and do with whatever they please.

    let main (args:string[]) =
      using(new Environment(ref args))( fun e ->
        let commSize = Communicator.world.Size
        let commRank = Communicator.world.Rank
        let mutable table = Array.create commSize 0
       
        match commRank with
        | 0 ->
          table <- Array.init commSize (fun i -> i * 3)
          Communicator.world.Scatter(table, 0) |> ignore
        | _ ->
          let scatterValue = Communicator.world.Scatter(table, 0)
          Console.WriteLine("Scattered {0}", scatterValue)
      )
     
    main(System.Environment.GetCommandLineArgs())

    There is an AllGather method as well which performs a similar operation to Gather, but the results are available to all processes instead of the root process. 

    Reduce

    Another collective algorithm similar to scatter and gather is the reduce function.  This allows us to combine all values from each process and perform an operation on them, whether it be to add, multiply, find the maximum, minimum and so on.  The value is only available at the root process though, so I have to ignore the result for the rest of the processes.  The following example shows a simple sum of the ranks of all processes.

    let main (args:string[]) =
      using(new Environment(ref args))( fun _ ->
        let commRank = Communicator.world.Rank
        let commSize = Communicator.world.Size
       
        match commRank with
        | 0 ->
          // Only the root (rank 0) gets the combined result
          let sum = Communicator.world.Reduce(Communicator.world.Rank, Operation<int>.Add, 0)
          Console.WriteLine("Sum of all ranks is {0}", sum)
        | _ ->
          // Everyone else still has to take part in the collective call
          Communicator.world.Reduce(Communicator.world.Rank, Operation<int>.Add, 0) |> ignore
      )
     
    main(System.Environment.GetCommandLineArgs())

    There is another variation called the AllReduce which does very similar operations to the Reduce function, but instead makes the value available to all processes instead of just the root one.  There are more operations and more communicators such as Graph and Cartesian, but this is enough to give you an idea of what you can do here. 
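
    If you don't have a cluster handy, it can help to see what the collective operations compute.  The sketch below does not use MPI.NET at all; it only simulates the results in a single process with plain arrays, assuming a job started with four processes.  All of the names (commSize, ranks, gathered and so on) are made up for the illustration.

```fsharp
// Pretend the job was started with four processes: one array entry per rank.
let commSize = 4
let ranks = [| 0 .. commSize - 1 |]

// Gather: every rank contributes rank * 3 and the root receives them ordered by rank.
let gathered = ranks |> Array.map (fun rank -> rank * 3)

// Scatter: the root owns a table and rank i receives element i of it.
let table = Array.init commSize (fun i -> i * 3)
let scattered = ranks |> Array.map (fun rank -> table.[rank])

// Reduce with Add: the root receives the sum of every rank's value.
let sum = ranks |> Array.reduce (+)

// The "all" variations: every rank ends up holding the same combined value.
let sumOnEveryRank = ranks |> Array.map (fun _ -> sum)

printfn "Gathered at root:  %A" gathered
printfn "Scattered by rank: %A" scattered
printfn "Reduced at root:   %d" sum
printfn "Reduced everywhere: %A" sumOnEveryRank
```

    In the real thing each of those array slots lives in a different process, and the communicator moves the values between them.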

    LINQ for MPI.NET

    During my search for MPI.NET solutions, I came across a rather interesting one called LINQ for MPI.NET.  I don't know too many of the details, since the author has been pretty aloof about providing the complete design details.  But it has entered a private beta, if you do wish to contact them for more information.

    The basic idea is to provide some scope models, which include the current scope, the world scope, root and so on.  Also, it looks like they are providing some sort of multi-threading capabilities as well.  Looks interesting and I'm interested in finding out more.

    Pure MPI.NET?

    Another implementation of MPI in .NET has surfaced through PureMPI.NET.  This is an implementation of the MPI specification as well, but built on WCF instead of msmpi.dll.  It does not rely on the Microsoft Compute Cluster service for scheduling, and instead uses remoting and such for communication purposes.  There is a CodeProject article which explains it a bit more here.

    More Resources

    So, you want to know more, huh?  Well, most of the interesting information is out there in C, so if you can read and translate it to the other APIs, you should be fine.  However, there are some good books on the subject which not only provide some decent samples, but also some guidance on how to make the most of the MPI implementation.  Below are some of the basic ones which will help on learning not only the APIs, but the patterns behind their usage.


    Wrapping It Up

    I hope you found some of this useful for learning about how MPI can help with massively parallel applications.  The patterns learned here, as well as the technologies behind them, are pretty powerful in helping you think about how to make your programs a bit less linear in nature.  There is more to come in this series on thinking about concurrency in .NET, so I hope you stay tuned.


  • Thinking Concurrently in .NET

    In recent posts, you've found that I've been harping on immutability and side effect free functions.  There is a general theme emerging from this and some real reasons why I'm pointing it out.  One of the things that I'm interested in is concurrent programming on the .NET platform for messaging applications.  As we see more and more cores and processors available to us, we need to be cognizant of this fact as we're designing and writing our applications.  Most programs we write today are pretty linear in nature, except for, say, forms applications which use background worker threads so as not to freeze the user interface.  But for the most part, we're not taking full advantage of the CPU and its cycles.  We need not only a better way to handle concurrency, but a better way to describe it as well.  This is where Pi-calculus comes into the picture...  But before we get down that beaten path, let's look at a few options.  Note that these aren't all of them, just a select few I chose to analyze.

    Erlang in .NET?

    For many people, Erlang is considered to be one of the more interesting languages to come out of the concurrent programming field.  The language received little attention until now, when we've hit a slowdown in scaling processor speed and are instead moving into multi-core/multi-processor environments.  What's interesting about Erlang is that it's a functional language, much like F#, Haskell, OCaml, etc.  But what makes it intriguing as well is that it's not a statically typed language like the others, and is instead dynamic.  Erlang was designed to support distributed, fault-tolerant, non-stop real-time applications.  Written by Ericsson in the 1980s, it has been the mainstay of telephone switches ever since.  If you're interested in listening to more about it, check out Joe Armstrong's appearance on Software Engineering Radio Episode 89 "Joe Armstrong on Erlang".  If you want to dig deeper into Erlang, check out the book "Programming Erlang: Software for a Concurrent World" also by Joe Armstrong, and available on Amazon.

    How does that lead us to .NET?  Well, it's interesting that someone thought of trying to port the language to .NET in a project called Erlang.NET.  This project didn't get too far as far as I can tell, and for obvious impedance mismatch reasons.  First off, there is a bit of a disconnect between .NET processes and Erlang processes and how the author wants to tackle them.  Erlang processes are cheap to create and tear down, whereas .NET ones tend to be a bit heavy.  Garbage collection also works differently: Erlang collects per process, whereas the CLR takes a generational approach over a shared heap.  And another thing is that Erlang is a dynamic language running on its own VM, so it would probably sit on top of the DLR in the .NET space.  Not saying it's an impossible task, but it is improbable the way he stated it.

    Instead, maybe the approach to take with an Erlang-like implementation is to create separate AppDomains, since they are relatively cheap to create.  This would allow process isolation and messaging constructs to fit rather nicely, and we get rid of the impedance mismatch by mapping an Erlang process to an AppDomain.  Then you can tear down the AppDomain after you are finished, or you could restart it in case of a recovery scenario.  These are some of the ideas if you truly want to dig any further into the subject.  I'll probably cover this in another post later.

    So, where does that leave us with Erlang itself?  Well, we have the option of integrating Erlang and .NET together through OTP.NET.  The idea originally came from an article on TheServerSide called "Integrating Java and Erlang".  This allows for the use of Erlang to do the computation on the server in a manner that best fits the Erlang style.  I find it's a pretty interesting article and maybe when I have a spare second, I'll check it out a bit more.  But, in terms of a full port to .NET?  Well, I think .NET languages have some lessons to learn from Erlang, as it tackled concurrent programming as the first topic, unlike most imperative languages, which bolted it on after the fact.

    MPI.NET

    The Message Passing Interface (MPI) approach has been an interesting way of solving massive concurrency for applications.  It involves using a standard protocol for passing messages from node to node through the system by way of a compute cluster.  In the Windows world, we have Windows Compute Cluster Server (Windows CCS, or CCS for short) to handle this need.  CCS is available in two separate SKUs, CCS 2003 and CCS 2008 for Server 2008.  The Server 2008 CCS is available as a CTP on the Microsoft Connect website.  See here for more information.  You mainly find High Performance Computing with MPI in the automotive, financial, scientific and academic communities, where they have racks upon racks of machines.

    Behind the scenes, Microsoft's MPI implementation is based on MPICH2.  It is made available to C programmers and is fairly low level.  Unfortunately, that leaves most C++ and .NET programmers out in the cold when it comes to taking advantage of it.  Sure, C++ could use the C API directly, but the Boost.MPI library was created to support MPI in a way that C++ could really take advantage of.

    A similar approach was then taken for the .NET platform with MPI.NET.  Indiana University produced a .NET version which looks very similar to the Boost MPI approach, but with .NET classes.  This allows us to program in any .NET language against Windows CCS and take advantage of the massive scalability and scheduling services offered in the SKU.  At the end of the day, it's just a thin wrapper that P/Invokes msmpi.dll, with generics thrown in as well.  Still, it's a nice implementation.

    And since it was written for .NET, I can, for example, write a simple hello world application in F# to take advantage of MPI.  The value is that most of the algorithms and heavy lifting you would be doing there would probably be functional anyway.  So, I can use F# to specify more succinctly what types of actions and what data I need.  Here is a simple example:

    #light

    #r @"D:\Program Files\MPI.NET\Lib\MPI.dll"

    open MPI

    let main (args:string[]) =
      using(new Environment(ref args))( fun e ->
        let commRank = Communicator.world.Rank
        let commSize = Communicator.world.Size
        match commRank with
        | 0 ->
          let intValue = ref 0
          [1 .. (commSize - 1)] |> List.iter (fun i ->
            Communicator.world.Receive(Communicator.anySource, Communicator.anyTag, intValue)
            System.Console.WriteLine("Hello from node {0} out of {1}", !intValue, commSize))
        | _ -> Communicator.world.Send(commRank,0, 0)
      )

    main(System.Environment.GetCommandLineArgs())

    I'll go into more detail in the future as to what this means and why, but this should whet your appetite: what you can do with this is pretty powerful.

    F# to the Rescue with Workflows?

    Another topic for discussion is asynchronous workflows, another area in which F# excels as a language.  Async<'a> values are really a way of writing continuation passing explicitly.  I'll be covering this more in a subsequent post shortly, but in the meantime, there is good information from Don Syme here and Robert Pickering here.

    Below is a quick example of an asynchronous workflow which fetches the HTML from each of the given web sites.  I can then run each in parallel and get the results rather easily.  What I'll do below is a quick retrieval of HTML by calling the Async methods.  Note that these methods don't exist on the .NET classes themselves; the F# library adds them as extensions, which is why Microsoft.FSharp.Control.CommonExtensions is opened.

    #light

    open System.IO
    open System.Net
    open Microsoft.FSharp.Control.CommonExtensions

    let fetchAsync (url:string) =
      async { let request = WebRequest.Create(url)
              let! response = request.GetResponseAsync()
              let stream = response.GetResponseStream()
              let reader = new StreamReader(stream)
              let! html = reader.ReadToEndAsync()
              return html
            }

    let urls = ["http://codebetter.com/"; "http://microsoft.com"]
    let htmls = Async.Run(Async.Parallel [for url in urls -> fetchAsync url])
    print_any htmls

    So, as you can see, it's a pretty powerful mechanism for retrieving data asynchronously and then I can run each of these in parallel with parameterized data.

    Parallel Extensions for .NET

    Another approach I've been looking at is the Parallel Extensions for .NET.  The currently available version is the December CTP, which is available here.  You can read more about it in two MSDN Magazine articles.


    What I find interesting is Parallel LINQ, or PLINQ for short.  The Task Parallel Library doesn't interest me as much.  LINQ in general is interesting to a functional programmer in that queries are lazily evaluated.  The actual execution of your LINQ query is deferred until it is enumerated, that is, until MoveNext() is first called on the enumerator returned by GetEnumerator().  That's definitely taking some lessons from the functional world, and it's pretty powerful.  Adding on top of that the ability to parallelize your heavy algorithms makes for a pretty powerful concept.  I definitely hope this moves forward.
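
    To make the deferred execution point concrete, here is a small C# sketch using plain LINQ to Objects (not the PLINQ CTP).  The list contents and messages are made up for illustration.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class Program
    {
        public static void Main()
        {
            var numbers = new List<int> { 1, 2, 3 };

            // Defining the query executes nothing: the lambda has not run yet.
            IEnumerable<int> squares = numbers.Select(n =>
            {
                Console.WriteLine("squaring " + n);
                return n * n;
            });

            // Added after the query was defined, but still visible to it,
            // because the source is only read when the query is enumerated.
            numbers.Add(4);
            Console.WriteLine("query defined");

            // Enumeration drives the query one element at a time.
            foreach (int square in squares)
                Console.WriteLine(square);
        }
    }
    ```

    "query defined" is printed before any "squaring" line, and the element added afterwards is still squared, since nothing runs until the foreach starts pulling values.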

    Conclusion

    As you can see, I gave a brief introduction to each of these areas, which I hope to dive into a bit more in the coming weeks and months.  I've only scratched the surface of each; each tackles the concurrency problem in a slightly different way and each has its own use.  But I hope I whetted your appetite to look at some of these solutions today.


  • Your API Fails, Who is at Fault?

    I decided to stay on the Design by Contract side for just a little bit.  Recently, Raymond Chen posted "If you pass invalid parameters, then all bets are off", in which he goes into parameter validation and basic defensive programming.  Many of the conversations on that blog take me back to my C++ and early Java days of checking for null pointers, buffer lengths, etc.  It also brings me back to some recent conversations I've had about how to make explicit what I expect.  Typical defensive behavior looks something like this:

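    public static void Foreach<T>(this IEnumerable<T> items, Action<T> action)
    {
        // An extension method can be invoked on a null reference, so check both arguments.
        if (items == null)
            throw new ArgumentNullException("items");
        if (action == null)
            throw new ArgumentNullException("action");

        foreach (var item in items)
            action(item);
    }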

    After all, how many times have you had no idea what the preconditions are for a given method because of non-intuitive method naming?  It gets worse when the authors don't provide much documentation, XML comments or otherwise.  At that point, it's time to break out .NET Reflector and dig deep.  Believe me, I've done it quite a bit lately.

    The Erlang Way

    The Erlang crowd takes an interesting approach to the issue that I've really been intrigued by.  Joe Armstrong calls this approach "Let it crash" in which you only code to the sunny day scenario, and if the call to it does not conform to the spec, just let it crash.  You can read more about that on the Erlang mailing list here.

    Some paragraphs stuck out in my mind.

    Check inputs where they are "untrusted"
        - at a human interface
        - a foreign language program

    What this basically states is that the only place you should do such checks is at the boundaries where you may receive untrusted input, guarding against things such as buffer overflows, unexpected nulls and the like.  He goes on to say about letting it crash:

    specifications always say what to do if everything works - but never what to do if the input conditions are not met - the usual answer is something sensible - but what you're the programmer - In C etc. you have to write *something* if you detect an error - in Erlang it's easy - don't even bother to write code that checks for errors - "just let it crash".

    So, what Joe advocates is not checking at all: if the inputs don't conform to the spec, just let it crash, with no need for null checks, etc.  But how would you recover from such a thing?  Joe goes on to say:

    Then write a *independent* process that observes the crashes (a linked process) - the independent process should try to correct the error, if it can't correct the error it should crash (same principle) - each monitor should try a simpler error recovery strategy - until finally the error is fixed (this is the principle behind the error recovery tree behaviour).

    It's an interesting approach, and it proves to be a valuable one for parallel processing systems.  As I dig further into more functional programming languages, I'm finding such constructs useful.
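
    To give a feel for the shape of this, here is a loose, in-process C# analogy.  It is only a sketch: Erlang uses separate linked processes, while this uses a plain try/catch, and ParseAndDouble and Supervise are hypothetical names invented for the example.

    ```csharp
    using System;

    public static class Program
    {
        // The worker codes only the sunny day scenario: no input checks at all.
        static int ParseAndDouble(string input)
        {
            return int.Parse(input) * 2; // throws FormatException on bad input
        }

        // The supervisor observes the crash and applies a simpler recovery strategy.
        static int Supervise(Func<int> worker, Func<int> fallback)
        {
            try
            {
                return worker();
            }
            catch (Exception ex)
            {
                Console.WriteLine("worker crashed: " + ex.GetType().Name);
                // If the fallback throws as well, the crash propagates to
                // whoever supervises this supervisor (same principle).
                return fallback();
            }
        }

        public static void Main()
        {
            Console.WriteLine(Supervise(() => ParseAndDouble("21"), () => 0));
            Console.WriteLine(Supervise(() => ParseAndDouble("oops"), () => 0));
        }
    }
    ```

    The first call prints 42.  The second reports the FormatException and then prints the fallback value 0.  The error handling lives entirely outside the worker, which is the part of the idea that carries over.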

    Design by Contract Again and DDD

    Defensive programming is a key part of Design by Contract, but in a way the two differ.  With defensive programming, the callee is responsible for determining whether the parameters are valid and, if not, throws an exception or otherwise handles it.  DbC, with the help of the language, makes the contract explicit, so the caller better understands what is expected and how to cope with the exception if it can.

    Bertrand Meyer wrote a bit about this in the Eiffel documentation here.  But, let's go back to basics.  DbC asserts that the contracts (what we expect, what we guarantee, what we maintain) are such a crucial piece of the software that they're part of the design process.  What that means is that we should write these contract assertions FIRST.

    What do these contract assertions contain?  They normally contain the following:
    • Acceptable/Unacceptable input values and the related meaning
    • Return values and their meaning
    • Exception conditions and why
    • Preconditions (may be weakened by subclasses)
    • Postconditions (may be strengthened by subclasses)
    • Invariants (may be strengthened by subclasses)
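
    As a sketch of how those three kinds of assertion look in plain C#, which has no contract syntax of its own, the example below uses an exception for the precondition and System.Diagnostics.Debug.Assert as a stand-in for the postcondition and invariant.  The Inventory class is hypothetical, and Debug.Assert is only active in builds that define DEBUG.

    ```csharp
    using System;
    using System.Diagnostics;

    public class Inventory
    {
        private int _count; // invariant: _count >= 0

        public int Count { get { return _count; } }

        public void Add(int n)
        {
            // Precondition: what we expect from the caller.
            if (n < 0)
                throw new ArgumentOutOfRangeException("n");

            _count += n;

            // Invariant: what we maintain (would also catch an overflow).
            Debug.Assert(_count >= 0, "invariant: count is never negative");
        }

        public void Remove(int n)
        {
            // Precondition: cannot remove a negative amount or more than we hold.
            if (n < 0 || n > _count)
                throw new ArgumentOutOfRangeException("n");

            int before = _count;
            _count -= n;

            // Postcondition: what we guarantee to the caller.
            Debug.Assert(_count == before - n, "postcondition: count reduced by n");
            // Invariant: what we maintain.
            Debug.Assert(_count >= 0, "invariant: count is never negative");
        }
    }
    ```

    Nothing here is checked statically, and the caller can only learn about the contract by reading the code or its documentation.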

    So, in effect, I'm still doing TDD/BDD, but an important part of this is identifying my preconditions, postconditions and invariants.  These ideas mesh pretty well with my understanding of BDD, and we should be testing those behaviors in our specs.  Some readers of my previous posts were afraid I was de-emphasizing TDD/BDD, and that couldn't be further from the truth.  I'm just using another tool in the toolkit to express my intent for my classes, methods, etc.  I'll explain further down below.

    Also, my heavy use of Domain Driven Design patterns helps as well.  I mentioned those previously when I talked about Side Effects being Code Smells.  I combine intention revealing interfaces, which express to the caller what I am intending to do, with assertions not only in the code but also in the documentation.  This usually includes using the <exception> XML tag in my code comments.  Something like this is usually pretty effective:

    /// <exception cref="T:System.ArgumentNullException"><paramref name="action"/> is null.</exception>

    If you haven't read Eric Evans's book, Domain-Driven Design, I suggest you take my advice and Peter's advice and do so.

    Making It Explicit

    Once again, using Spec# to enforce these as part of the method signature makes sense to me.  It puts the burden back on the client to conform to the contract, or else they cannot continue.  And having static checking to enforce that is pretty powerful as well.

    But, what are we testing here?  Remember that DbC and Spec# can ensure your preconditions, your postconditions and your invariants hold, but they cannot determine whether your code is correct and conforms to the specs.  That's why I think that BDD plays a pretty good role with my use of Spec#. 

    DbC and Spec# can also play a role in enforcing things that are harder with BDD, such as enforcing invariants.  BDD does great things by emphasizing behaviors which I'm really on board with.  But, what I mean by being harder is that your invariants may be only private member variables which you are not going to expose to the outside world.  If you are not going to expose them, it makes it harder for your specs to control such behavior.  DbC and Spec# can fill that role.  Let's look at the example of an ArrayList written in Spec#.

    public class ArrayList
    {
        invariant 0 <= _size && _size <= _items.Length;
        invariant forall { int i in (_size : _items.Length); _items[i] == null };  // all unused slots are null

        [NotDelayed]
        public ArrayList (int capacity)
          requires 0 <= capacity otherwise ArgumentOutOfRangeException;
          ensures _size/*Count*/ == 0;
          ensures _items.Length/*Capacity*/ == capacity;
        {
          _items = new object[capacity];
          base();
        }

        public virtual void Clear ()
          ensures Count == 0;
        {
          expose (this) {
            Array.Clear(_items, 0, _size); // Don't need to doc this but we clear the elements so that the gc can reclaim the references.
            assume forall{int i in (0: _size); _items[i] == null};  // postcondition of Array.Clear
            _size = 0;
          }
        }

    // Rest of code omitted

    In the constructor, I've been able to set the inner array to the new capacity, but also ensure that when I do that, my count doesn't go up, only my capacity.  When I call the Clear method, I need to make sure the object ends up consistent with its invariants: all unused slots of the inner array must be null, and the size must be reset.  We use the expose block to tell the verifier that the object's invariants may be temporarily broken inside it.  By the end of the expose block the invariants must hold again, else we have issues.  How would we test some of these scenarios in BDD?  Since they are not exposed to the outside world, it's pretty difficult.  It would leave me with black box artifacts that are harder to prove.  If I were to expose them instead, it would break encapsulation, which is not necessarily something I want to do.  Spec# gives me the opportunity to enforce this through the DbC constructs afforded by the language.

    The Dangers of Checked Exceptions

    But with this comes a cost, of course.  I recently spoke with a colleague about Spec#, and thoughts of checked exceptions in Java instantly came to mind.  Earlier in my career, I was a Java guy who had to deal with those who put large try/catch blocks around methods with checked exceptions and were guilty of just catching and swallowing, or catching and rethrowing as RuntimeExceptions.  Worse yet, I saw this as a way of breaking encapsulation by throwing exceptions that I didn't think the outside world needed to know about.  I was kind of glad that this feature wasn't brought to C#, since I saw rampant abuse for little benefit.  What people forgot during the early days of Java is that exceptions are meant to be exceptional and not control flow.

    Where I see Spec# being different is that we have a static verification tool, Boogie, to verify whether those exceptional conditions are valid.  The green squigglies give warnings about possible null values, arguments out of range, etc.  This gives me further insight into what I can control and what I cannot.  ReSharper also has some of those nice features, but I've found Boogie to be a bit more helpful with more advanced static verification.

    Conclusion

    Explicit DbC constructs give us a pretty powerful tool for expressing our domain and the behaviors of our components.  Unfortunately, in C# there are no really valuable implementations that enforce DbC constructs on both the caller and the callee.  Hence Spec# is an important project to come out of Microsoft Research.

    Scott Hanselman just posted his interview with the Spec# team on his blog, so if you haven't heard it yet, go ahead and download it now.  It's a great show and it's important that if you find Spec# to be useful, that you press Microsoft to give it to us as a full feature.
