Saturday, October 27, 2012

Linear Types with HOAS

I've recently been writing a tagless interpreter for the linear lambda calculus based on the pi calculus as in this paper. While there existed previously a typed interpreter for linear logic in haskell, it used DeBruijn indexes for the context. This isn't great if you'd actually like to use your language, and you can't convert into typed DeBruijn indexes from an untyped representation without template meta-programming, so this needed to be fixed. As it turns out, all DeBruijn indexes are really giving you here is the ability to wade through the context intelligently searching for the correct type. I've created a version where this is unnecessary. Instead type families are used to search for the instance of the variable in the context.

Monday, September 17, 2012

rpc-framework #2: Sessions as Closures


    I'd like to theorize that RPC can really form more of a communicative solution than older wisdom would claim.  I've seen the following claims in numerous places regarding RPC:
  1. It hides the existence of the network from the user.
  2. It hides the location of data in the network.
  3. It hides message passing communication from the user and prevents control of this process. Specifically, local calls can fail without knowing whether the remote call succeeded or not. 
    These drawbacks may be valid for RMI in Java, but they aren't as much of a problem with the implementation in rpc-framework.
  1. While rpc-framework does hide the calls corresponding to sockets, ports, file-io, data marshaling, and concurrency, it does not make an attempt to hide that each remote call to a remote function is on a network.  The WIO monad signifies that remote calls might be made and includes the world that the action is executing in.
  2. In keeping with the Haskell mentality, values which have effects, such as network calls, have types which recognize these effects.  Data returned from a remote call will have an effect type if using that data will require a remote call.  For example, if "getFoo :: WIO Server IO (Int -> Int)" is called as a remote function, the type of the function it returns will be "Int -> WIO m IO Int"  rather than the opaque "Int -> Int".  Remote calls aren't in any way hidden.
  3. If you do not accept that remote calls send and receive data using "show" and "read", then yes, this communication is hidden.  But I'd like to present some methods for controlling communication using closures, which you can hopefully extend to your own needs. 
    The mentality behind RPC shouldn't be to hide communication, but to remove the boilerplate of marshaling data and connecting message passing to message receiving while improving the type safety of the communication.

Simple Sessions


    Suppose you want to create a Server/Client system where the client connects to the server with credentials then receives authorization to perform certain actions on the server.  For example, the guess the number game from the CMU SkillSwap 2012.  Here we want for every client that connects to the server, to generate a session with a new random number for that client to guess where the client can then query the server to see if his guess is correct. 

Solution Without RPC  

   There are a few ways to accomplish this.  In a flat system without RPC, the obvious way would be to have the server assign the client a unique identifier, which we then pass to the client.  We then save the a new random number to a hash table using the identifier as a key.  We would then create a server which accepts requests given given the identifier and a random number guess and queries the global state hash-table for the information connected to the identifier. 

Solution With RPC

   That solution contains lots of boilerplate.  You'd need to setup a server which responds to two different kinds of requests in parallel, write a session hash-table, pass around identifiers, write parsers and printers for your data, you'd also have to free the data in the hash-table when it is no longer needed. The first step to doing this better is to notice that rpc-framework gives you most of this for free.
   Every time you return a function from a remote call, a new service is set up listening for it's unique identifier.  This identifier is what is actually being passed to the caller.  Furthermore, since the returned closure is attached to the servlet, the data it needs to work is saved, similar to attaching a value to a key in a hash-table.   Thus, closures describe simple sessions in this system.  All this potential boilerplate code can be reduced to the following few lines of code:
    sessionServer = do
      onHost Server
      r <- newRandomInteger
      return $ \i -> compare i r

Wednesday, August 29, 2012

rpc-framework #1: Contravariance & Typeclasses

Suppose we have a typeclass:
 class Sendable a b | a -> b, b -> a where  
   send :: a -> IO ()  
   receive :: IO b  
Ideally we should have an instance for functions:
 class (Sendable a' a, Sendable b b') => Sendable (a -> b) (a' -> IO b') where  
This has contravariant narrowing though. Down the line, it will run into problems. Solution: split sendable into two definitions, one for sendable and one for receivable.

Wednesday, August 8, 2012

A remote language

     Last year when programming a database language in Haskell I found myself writing a lot of redundant and unsafe code for sending data across a network. Programming in enterprise java made me realize the usefulness of the service/rmi framework that it gives you. It also seems to be one of the few good reasons to use java over language X. There are of course plenty of drawbacks to network blind rpc, but the widespread acceptance of java's rmi seems to imply that a significant portion of the time people don't care, and that these drawbacks can be remedied by an understanding of the mechanics of rpc. The addition of such an rpc system to Haskell would surely boost it's prototyping usability. While there already exist plenty of rpc packages available for Haskell, most of them seem geared towards interfacing with other languages, which really didn't solve my problem of sending arbitrary data across the network safely.    

    Thus, I wrote my own RPC system. I wanted something that would be useful for prototyping large production code quickly, and also for quickly adding network capabilities to existing programs for programmers not very well versed in networking already. Given that this is Haskell, I wanted to be able to send functions across the network as well as pure data. This is really what rpc-framework solves. It is quite difficult serializing functions in a compiled lazy language, so I quickly gave up on the idea of sending them verbatim. Rather, I send links to the functions, as anonymous RPC calls. Actions work similarly. There is a lot of type level programming magic going on here, which I will definitely explain in the future. For now, please enjoy and report any bugs/missing features:

More info on github
Tutorial on hackage

Thursday, May 3, 2012

Almost LF in Haskell

NOTE:  If this seems a little abstract and useless, you can think of it as a cleaner, more first class version of template Haskell.  I do plan on eventually constructing a library with template Haskell to automatically convert GADT declarations like these into  meta programming dependent types as shown here.

    A while back I had written a post about logic programming using Haskell typeclasses.  The example I had given was fairly minimal, and didn't show the full power of logical programming.  I wanted to show how type inference in combination with logical programming could be used to do meta-programming and program transformations.
    I thus set out to show that lambda calculus could be embedded entirely within the type system using logical programming.  My goal was to have lambda expressions in the type system evaluate to terms which were then converted to runtime values.  This is what I came up with:

infixr 0 :$
type m :$ a = m a

infer_explicit = val :: (Step (App (Lam :$ Var Z) :$ Var :$ S :$ S Z) b => Term b)

When compiled, Haskell infers the following type:

infer_explicit :: Term (Var (S (S Z)))

However, alone this wasn't good enough.  It is very easy to write incorrect logical code, given how many arguments logical predicates can have.  Type checking makes logical programming that much easier.  However, we are already in working Haskell's type system, so there really isn't a well developed next level.  Haskell does have kinds however, and kinds do support checking if a logical variable is a * or a * -> * ...  This means that at least some amount of checking is being performed.  The next step is figuring out how to name *. It is known that many logic systems can embed a type system for themselves within themselves.  Examine the following example of predicate for testing whether it's argument is a logical number:

data Z
data S a

class Nat a
instance Nat Z
instance Nat a => Nat (S a)

Clearly this code would break if anybody were to add another instance of Nat somewhere else.  So type checking logical code in Haskell can't be done simply with predicates, there needs to be some way to ensure that the instances for a type predicate only correspond to those values that are part of a type.  Haskell already has a something thing that behaves like type:  a type.   So to solve this, we need to connect a single type with type constructors to multiple phantom types.  Instead of the type predicate being a class which determines if it's argument is of that type, we can use Haskell's internal definition of type, and GADTs:

data Nat b where
  Z :: Nat Z
  S :: Nat a -> Nat (S a)

data Z
data S a

Now that we have constructed the predicate on the value level in Haskell, we want to leverage "Nat b" to a predicate.  The way to do this is with another predicate.  "Nat" is a type function, and "Z" and "S a" are both type values.  So we need a predicate which ensures that "Nat"'s arguments are only the type values "Z" and "S a" where "Z" are type the only values usable.  The way to do this is with a class that ensures an Isomorphism between the values that inhabit either these types:

module Nat (Iso(), Nat(..), S(), Z()) where
-- AXIOM: into . outof == id
class Iso m a | a -> m where
  into :: a -> m a 
  outof :: m a -> a

data Nat b where
  Z :: Nat Z
  S :: Iso Nat a => Nat a -> Nat (S a)

data Z where Z' :: Z
data S a where S' :: Iso Nat a => a -> S a
instance Iso Nat Z where
  into Z' = Z
  outof Z = Z'

instance Iso Nat a => Iso Nat (S a) where
  into (S' a) = S (into a)
  outof (S a) = S' (outof a)

Notice now that the types S and Z have been given values to make this work.   Also notice that we do not export the values of S' and Z'.  Now when "Iso Nat a" is used as a predicate outside this module, it is guaranteed that "a" will be either "Z" or "S b" where b has the same predicate.

The code from these examples EDIT: Apparently in GHC-7.4 and up, you can define kinds in the language:

{-# LANGUAGE                                                                                           
module Kinds where

data N = Z | S N

data family NatFam (a :: N)
data instance NatFam Z
data instance NatFam (S a)

class NatClass (a :: N)
instance NatClass Z
instance NatClass a => NatClass (S a)

Wednesday, April 25, 2012

ImperativeHaskell and Loops

ImperativeHaskell- has been released and is posted on Hackage. This version has an addition that significantly simplifies using ImperativeHaskell with existing code. In this version I've added a template function "liftOp" which uses the type of its argument to convert its argument into a function in the style of ImperativeHaskell. There are more details in the documentation. I have noticed a potential problem however. "continue';" and "break';" will only work inside of a loop, and will cause an error if used outside of a loop. A type level way of ensuring that they are only used in loops is clearly needed. Turning "Control" into a GADT and adding a phantom type turned out to not be the way (necessarily). Perhaps some logic programing with typeclasses in combination with this method is necessary.

Tuesday, February 7, 2012

Logic Programming with GADTs and Classes.

During the last advanced Haskell class, we discussed how to encode natural numbers in Haskell's type system with GADTs, and use classes to sum two natural numbers together.   This idea can be extended to lists of items of different types, where the type of the list is a list of types.  Here is an example where foldr and append are implemented.

infixr 5 :+

data Nil
data a :+ r

data List a where
  Nil :: List Nil
  (:+) :: a -> List c -> List (a :+c)

class Append a b c | a b -> c where
  append :: List a -> List b -> List c

instance Append Nil a a where
  append Nil v = v

instance (Append l1 l2 r) => Append (f :+ l1) l2 (f :+ r) where
  append (f :+ l1) l2 = f :+ (append l1 l2)

class Foldr f b a c | f -> b c, a -> f where
  foldrList :: List f -> b -> List a -> c
instance Foldr a v Nil v where
  foldrList _ v _ = v
instance (Foldr r1 b r2 r) => Foldr ((a -> r -> c) :+ r1) b (a :+ r2) c where
  foldrList (f :+ r1) b (a :+ r2) = f a (foldrList r1 b r2)

While this code might type-check without functional dependencies, the functional dependencies act a bit like uniqueness mode declarations and make type inference possible in more cases.  Here is a link to the code.

 Edit:  In retrospect, since multiparameter typeclasses act so much like logic programming, why is there no actual mode declaration for multiparameter typeclasses?  Instead there are only functional dependencies, which don't quite capture every case you want.  It would be nice to direct the compiler how to unify typeclasses better without having to restrict them so much.