
George Karachalias - GADTs meet their match (ICFP15)

This involves two things: checking exhaustiveness, and checking redundancy. Additionally in Haskell, we want to account for laziness; we also want to reason about exotic features like view patterns.

Example:

    zip [] [] = []
    zip (x:xs) (y:ys) = (x,y) : zip xs ys

Notice this is not exhaustive: the two input lists do not necessarily have the same length. So we get a warning saying the patterns are not exhaustive.

But with a GADT length-indexed list, we can enforce this. Now you want the pattern checker to NOT complain that it's not exhaustive.
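A minimal sketch of the length-indexed version, to make the point concrete. The names Nat, Vec, vzip and toList are made up for illustration and are not from the talk. Because both arguments share the index n, the mixed cases (VNil against VCons) are ill-typed, so the two clauses below should be all a checker needs to see.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

-- Type-level natural numbers, used only as a length index.
data Nat = Z | S Nat

-- A list whose length is tracked in its type (hypothetical example type).
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Both vectors have the same length n, so these two clauses are exhaustive:
-- a clause such as `vzip VNil (VCons _ _)` would not even type check.
vzip :: Vec n a -> Vec n b -> Vec n (a, b)
vzip VNil         VNil         = VNil
vzip (VCons x xs) (VCons y ys) = VCons (x, y) (vzip xs ys)

-- Forget the length index, for printing and comparison.
toList :: Vec n a -> [a]
toList VNil         = []
toList (VCons x xs) = x : toList xs
```

In GHCi, toList (vzip (VCons 1 (VCons 2 VNil)) (VCons 'a' (VCons 'b' VNil))) evaluates to [(1,'a'),(2,'b')].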

BTW, here's something funny:

    f _ True = 1
    f True True = 2
    f _ _ = 3

GHC claims that this is overlapped: the second clause is redundant. But it's not!! f undefined False should fail, because we poke the first thing. If we remove it, we'll get 3.
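The claim can be checked directly. This is a small sketch, not from the talk: g is f with the second clause deleted, and diverges is a hypothetical helper that reports whether forcing a value throws.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- The function from the notes, with its "redundant" second clause.
f :: Bool -> Bool -> Int
f _    True = 1
f True True = 2
f _    _    = 3

-- The same function with the second clause removed.
g :: Bool -> Bool -> Int
g _ True = 1
g _ _    = 3

-- True if forcing the value to weak head normal form raises an exception.
diverges :: Int -> IO Bool
diverges x = do
  r <- try (evaluate x) :: IO (Either SomeException Int)
  pure (either (const True) (const False) r)
```

Here diverges (f undefined False) yields True, because the second clause forces the first argument before the third clause is tried, while g undefined False is simply 3.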

IDEA: abstract interpretation!

We abstract over all possible values, and then process each clause separately.

The notion of a clause... think about it as a partition function: it partitions them into "Covered", "Divergent" and "Uncovered"

Keep going over clauses till done.

If covered and divergent are empty, it's redundant. If the covered set is empty but divergent is not, it has an inaccessible right hand side (it forces things but still needs to be there.)
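To make the partition idea concrete, here is a toy version where every argument is a Bool. It is only a sketch under that assumption: there are no GADT type constraints and no solvers, and all names (Val, Pat, Parts, match, check) are invented for illustration rather than taken from GHC.

```haskell
-- An abstract argument: unknown, a known constructor, or bottom (diverges).
data Val = Any | Lit Bool | Bot deriving (Eq, Show)
data Pat = PWild | PCon Bool deriving (Eq, Show)

-- How one clause splits a set of abstract value vectors.
data Parts = Parts { covered, divergent, uncovered :: [[Val]] }
  deriving (Eq, Show)

instance Semigroup Parts where
  Parts c d u <> Parts c' d' u' = Parts (c ++ c') (d ++ d') (u ++ u')
instance Monoid Parts where
  mempty = Parts [] [] []

consAll :: Val -> Parts -> Parts
consAll v (Parts c d u) = Parts (map (v :) c) (map (v :) d) (map (v :) u)

-- Match patterns left to right against one abstract value vector.
match :: [Pat] -> [Val] -> Parts
match [] [] = Parts [[]] [] []
match (PWild : ps) (v : vs) = consAll v (match ps vs)
match (PCon b : ps) (Lit c : vs)
  | b == c    = consAll (Lit c) (match ps vs)
  | otherwise = Parts [] [] [Lit c : vs]
match (PCon b : ps) (Any : vs) =
  -- Forcing an unknown value may diverge; otherwise it is True or False.
  Parts [] [Bot : vs] []
    <> match (PCon b : ps) (Lit True : vs)
    <> match (PCon b : ps) (Lit False : vs)
match (PCon _ : _) (Bot : vs) = Parts [] [Bot : vs] []
match _ _ = mempty -- arity mismatch: nothing to report

data Verdict = Useful | InaccessibleRhs | Redundant deriving (Eq, Show)

verdict :: Parts -> Verdict
verdict (Parts c d _)
  | not (null c) = Useful
  | not (null d) = InaccessibleRhs
  | otherwise    = Redundant

-- Feed the uncovered set of each clause into the next one.
-- Returns one verdict per clause plus whatever is still uncovered at the end.
check :: Int -> [[Pat]] -> ([Verdict], [[Val]])
check arity = go [replicate arity Any]
  where
    go unc [] = ([], unc)
    go unc (ps : rest) =
      let parts       = foldMap (match ps) unc
          (vs, final) = go (uncovered parts) rest
      in (verdict parts : vs, final)
```

For the three-clause f from earlier, check 2 [[PWild, PCon True], [PCon True, PCon True], [PWild, PWild]] gives ([Useful, InaccessibleRhs, Useful], []): the second clause covers nothing but can diverge, and nothing is left uncovered.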

How do we represent these sets? We need to say something about all arguments simultaneously. Some constraints can be ruled out by our GADT rules.

By the way, it's implemented in GHC.... it's not precise but you can download it. I want to merge before the end of the month. We solve a lot of bug reports. And some should be implementable within the framework.

Three solvers: type constraints (OutsideIn(X)) and term equalities/strictness constraints (minimal solver).

Managed to find 38 redundant clauses where the previous checker found zero! 99% have size less than 100, but some have a lot. They look like this:

    f A A = ...
    f B B = ...
    f C C = ...

It's quadratic! (and the data type had 54 data constructors.)

Q: This is cool... in the presence of or patterns, you can have a clause that is only partially redundant, in the sense that you wrote a bit of pattern that doesn't contribute at all, but you can't delete the whole thing.

A: In Haskell we don't have or patterns, but... there are many transformations that give you the right result. We represent missing as a tree, so we could split and have two possible paths. We don't have that at the moment.

#icfp15

Practical SMT Based Type Error Localization (ICFP15)

Solve the subproblems independently!

let x = "hi" in not x

Claim: x is an error source; there is a change to this place which can make it well typed. So now you replace it with a hole:

let x = "hi" in not ?

and now it's well typed. In general, there might be many, so the error source is a set.

Now, also there are possibly multiple error sources, where fixing any one is OK. So for example, "hi" is also an error source. How do we rank? We just use the size of the expression. So an error source with minimum cumulative weight is a good one. One actual metric is AST size.

The problem of type error localization is the problem of computing minimum type error sources! This is an optimization problem: define all the error sources and then find the one with minimum cumulative weight.

So let's reduce the problem to Partial Weighted MaxSMT. Here, you have a set of hard constraints (must hold) and soft constraints (assigned a weight). Constraints belong to a fixed first-order theory. Output: subset of soft constraints with max cumulative weight.

So the reduction goes like this. First, generate a hard constraint which is the program structure. Have variables T for the types, and a for internal things. Soft assertions are with the weights.

Solver: tries to set all propositional variables to true; if the program is well typed, this will work. Otherwise it will set some of them to false. By the semantics of implication, the consequent can then be ignored. Now it might hold! The hard constraint is satisfied, but the soft constraint is not (at some weight).
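A brute-force stand-in for the optimization problem, for intuition only. It does not use an SMT solver and is not the encoding from the paper: types are atomic, each soft constraint is an equation tagged with a made-up location label and an AST-size weight, and the search simply tries every subset of locations to drop. All names and the particular constraints for let x = "hi" in not x are assumptions.

```haskell
import Data.List (subsequences)

-- Atomic types only: a type variable or a type constant.
data Ty = TV String | TC String deriving (Eq, Show)
type Eqn = (Ty, Ty)

-- A soft constraint: a unique location label, its weight, and its equation.
data Soft = Soft { label :: String, weight :: Int, eqn :: Eqn }

-- Decide whether a set of equations between atomic types has a solution.
solvable :: [Eqn] -> Bool
solvable [] = True
solvable ((a, b) : rest)
  | a == b = solvable rest
  | otherwise = case (a, b) of
      (TV v, t) -> solvable (map (both (subst v t)) rest)
      (t, TV v) -> solvable (map (both (subst v t)) rest)
      _         -> False -- two distinct type constants
  where
    subst v t x = if x == TV v then t else x
    both g (x, y) = (g x, g y)

-- All minimum-weight sets of locations whose removal makes the rest solvable.
-- Assumes the hard constraints alone are solvable and labels are unique.
minErrorSources :: [Eqn] -> [Soft] -> [[String]]
minErrorSources hard soft =
  [ map label dropped | (w, dropped) <- candidates, w == best ]
  where
    candidates =
      [ (sum (map weight dropped), dropped)
      | dropped <- subsequences soft
      , let gone = map label dropped
      , let kept = [ eqn s | s <- soft, label s `notElem` gone ]
      , solvable (hard ++ kept) ]
    best = minimum (map fst candidates)

-- Hypothetical constraints for: let x = "hi" in not x
exampleHard :: [Eqn]
exampleHard = [(TV "x", TV "hi")] -- the let binding ties x to its definition

exampleSoft :: [Soft]
exampleSoft =
  [ Soft "hi-literal"  1 (TV "hi", TC "string")
  , Soft "x-use"       1 (TV "use", TV "x")
  , Soft "not"         1 (TV "arg", TC "bool")
  , Soft "application" 3 (TV "arg", TV "use")
  ]
```

With these inputs, minErrorSources exampleHard exampleSoft returns [["hi-literal"],["x-use"],["not"]]: each is a weight-1 fix, while replacing the whole application costs 3.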

Type checking is EXPTIME complete. Why here? Exponential number of constraints: for calls to polymorphic functions we have to copy the constraints. Can we solve this? Well, compilers use principal types? No... because this is not type checking, this is an optimization problem. If you have the best error... the whole typing information of a principal type... we might lose the error.

How do we tame blow-up? We're not going to give up on principal types. Instead, only a small fragment is involved in the error. So use the principal types, but only expand them when necessary.

let first (a, b, _) = a let second ... -- etc

Best way to explain principal type abstraction is by "trial and error". Second application: .... (ezyang: this example was too fast for me.)

But the idea is, the solver should ask us to EXPAND into the body of the polymorphic variable. The weight of a cheapest fix... so if the solver thinks that a polymorphic function is a cheap way to do this, then we expand it and try again. INCREMENTAL EXPANSION!

Implementation: EasyOCaml. (Subset of OCaml)

GRASSHOPPER verification tool, which does functional verification of heap manipulating data structures. Took the large modules, and manually added five errors, and ran the tool. So, it still takes 5-10 seconds to generate the error.

Conclusion:

Practically fast algorithm for searching best error sources

No assumptions about what the best type error source

Rely on principal types, only expanding them when necessary

Q: This looks really interesting, I like the work in this area. I'm intrigued by the notion of best. First, have you done any studies to find out how often your notion of best finds the actual error? Also, I can imagine different kinds of programming in different phases might have different things. I may have mistyped a function name (localized), but if I'm refactoring, it may say that it is this change I introduced in refactoring... but actually I want to know all the other places I changed.

A: In the first paper, we took some programs and checked how well our criterion did. For the second question, when we started, we were looking at papers from ten years ago; we didn't see a formal definition of the problem. So we wanted to split it into subproblems. So learning a good ranking criterion would be interesting; we're trying to use machine learning... but even if you have something precise, if it is not XXXX. With this framework, we can invest effort to find good ranking functions.

Q: What's the relationship with type error slicing?

A: In previous work, there were a lot of notions of typing. They decide sometimes they can just overapproximate the program. That's not usable. So here, the locations which are reported by our tool are really the locations to fix. If you don't fix any of them, you won't get rid of the error. There was work that introduced a minimal slice; but here we generalized it so you can choose which minimum slice... this can be any criterion. In previous work, there was just a fixed criterion. Here you can change the criterion from call to call.

Q: (Simon) As I understand it, your system relies on encoding the whole type checking and ... of constraints the SMT solver understands. That's very good for two phase languages; generate and solve them. It would be hard to scale that for OCaml. Haskell compiler is built in this way... but we have a special purpose solver for the constraints. So what happens if the constraints are more than special equality? (E.g. rho unification.)

A: This is a very important question. First for OCaml, we were able to even encode the OO parts. Implemented in the paper. With the first paper we were happy with this; when we look at your paper, undecidable type inference for GADTs... you built constraints... SMT solvers are really expressive these days. I have a good feeling. But I can't say you can actually do it. With a more expressive type system, it might not be that easy, but I won't say that you can't do it either.


Bahr - Certified symbolic management of financial multi-party contracts (ICFP15)

Built a language to express these contracts.

American Option. At any time in the next 90 days, party X may decide to buy EUR 1000 from party Y for a fixed rate 1.1 of USD.

Our contract language: we have transfers, composed conjunctively with a scaled version of the transfer (scaled). The option itself is a conditional which is bounded over a range, and checks if the condition is true. obs() checks if an event has happened.

Combinators to capture financial contracts, symbolic analysis of contracts, and CERTIFY the implementation

Denotational semantics based on cash flows

Contract combinators:

    nil
    a(p -> q) -- transfer p to q of unit a
    c1 & c2
    e x c
    if e within d then c1 else c2
    d ^ c -- what is this??
    let x = e in c -- freezes the value of x

Expression language:

    obs(l, d) -- observe the value of l
    acc(f, d, e) -- accumulate over d days (like a fold)

To do multiparty: credit default swap:

    Bond: if obs(X defaults, 0) within 30 then nil else 1000 x EUR(X -> Y)
    Credit Default Swap: (10 x EUR(Y -> Z)) & if obs(X defaults, 0) within 30 then 900 x EUR(Z -> Y) ...

We can combine this! And then get the combined contract.

Denotational semantics is simple. Contr x Env -> CashFlow. Each time we see what transactions happen; the Env is external behavior which may have happened. Env is both the future and the past. (ezyang: leak!) In general, this will allow us.... for something like a txn in T2, may depend on future observations. We have to restrict ourselves; every such transaction only depends on the past. We can define this semantically, but we need a compositional approximation.
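A rough sketch of what a Contr x Env -> CashFlow function could look like, to make the shape of the semantics concrete. This is a simplification and not the paper's Coq development: let and acc are left out, the meaning given to if-within (check the condition daily, fall through to the else branch after d days) and to d ^ c (delay c by d days) is an assumption, and every name here is invented.

```haskell
type Day   = Int
type Party = String
type Asset = String

-- The environment answers queries about any day, past or future.
data Env = Env
  { realObs :: String -> Day -> Double
  , boolObs :: String -> Day -> Bool
  }

data Expr = Const Double | Obs String Day -- obs(l, d), real valued
data Cond = ObsB String Day               -- obs(l, d), boolean valued

data Contract
  = Nil
  | Transfer Asset Party Party            -- a(p -> q)
  | Both Contract Contract                -- c1 & c2
  | Scale Expr Contract                   -- e x c
  | IfWithin Cond Int Contract Contract   -- if e within d then c1 else c2
  | Translate Int Contract                -- d ^ c (assumed: delay by d days)

data Flow = Flow
  { day :: Day, payer :: Party, payee :: Party, asset :: Asset, amount :: Double }
  deriving (Eq, Show)

evalE :: Env -> Day -> Expr -> Double
evalE _   _ (Const x) = x
evalE env t (Obs l d) = realObs env l (t + d)

evalC :: Env -> Day -> Cond -> Bool
evalC env t (ObsB l d) = boolObs env l (t + d)

-- The transfers a contract produces when started on day t.
flows :: Env -> Day -> Contract -> [Flow]
flows _   _ Nil              = []
flows _   t (Transfer a p q) = [Flow t p q a 1]
flows env t (Both c1 c2)     = flows env t c1 ++ flows env t c2
flows env t (Scale e c)      =
  [ fl { amount = evalE env t e * amount fl } | fl <- flows env t c ]
flows env t (IfWithin b d c1 c2)
  | evalC env t b = flows env t c1
  | d <= 0        = flows env t c2
  | otherwise     = flows env (t + 1) (IfWithin b (d - 1) c1 c2)
flows env t (Translate d c)  = flows env (t + d) c

-- The bond from the notes: X pays Y EUR 1000 unless X defaults within 30 days.
bond :: Contract
bond = IfWithin (ObsB "X defaults" 0) 30 Nil
         (Scale (Const 1000) (Transfer "EUR" "X" "Y"))
```

Nothing in this function stops a contract like Scale (Obs "rate" 1) (Transfer "EUR" "X" "Y"), whose payment today depends on tomorrow's observation. That is the non-causality the time-indexed type system is meant to rule out.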

Type system comes in: every type is annotated with a time index: "the value of e is available at time t and after"; conversely, for a contract time t, there are no obligations strictly before t.

Crucial bit: the scaling can only occur if the time matches.

Reduction semantics: small step that starts with a contract, and goes one timestep into the future and gives you what the contract looks like tomorrow.

Everything here has been formalized in Coq. Extracted Haskell code.

Future work: obvious example is more sophisticated analyses. Main focus was on symbolic analyses; want more numeric ideas. Also, continuous time? (Discrete here)

http://bit.ly/contract-DSL

Q: What sort of representation is used for Real numbers in verification and implementation?

A: Good question. In the Coq formalization we use axiomatic formalization of real numbers. In implementation, we have floating point; but you'd better do fixed point.

Q: I'm puzzled by the denotational semantics. It doesn't rule out non-causal contracts? (Yes.) So it seems odd that the denotational semantics is the primary thing, even though the reduction semantics is more accurate.

A: The thing is, the language itself was designed to be very compositional; you can always compose two contracts. This means you quickly end up with contracts which are non-causal. We wanted a full picture of what you can write down, and then restrict. But you could also argue we should give semantics only to well-typed contracts... that would lead to something more difficult, in terms of denotational semantics. Pragmatic choice to simplify the proof: algebraic rules are simple to prove.

Q: I like the idea of doing... but credit default swaps, every time I have to pay my mortgage; what about recurring contracts?

A: Yes. We cannot do infinitely recurring contracts, but if it's finite runtime, yes. At the moment, this requires explicit unfolding, but we've built in a bounded fixpoint combinator for contracts (not in the paper.) The theme of these contracts is that they always have a finite horizon.

Q: Bitcoin communities, they are interested in smart contracts. Have you thought about it?

A: ...perhaps? I don't really know. Not familiar.

Q: There was a proposal for formalized exchanges by SEC. How do you compare?

A: The difference is we have a very minimal language where we can reason about it easily; e.g. simplifying, whereas the formalization of a language like Python, the idea is to execute and see what they do (but you have to run to understand). (Are you sure you can express all contracts). We cannot. For example, we can't express indefinite contracts. At least the sample we got from our partners, they didn't have these; it's very rare. Every contract with finite runtime we were able to express.

Q: Comment about the indefinite contracts; British WW1 bonds?? (We can't do those). These contracts are always in terms of absolute time, not relative. Is that complicating?

A: We deal explicitly with relative time, because the idea is you can have these as contract templates, and instantiate them to concrete times in the end. (But you said something like... within 90 days.. a real contract will never say 90 days, because the 90th day might not fall on a business day.) These observables can also be used to handle things like business days... like if it's not a business day.


Atze - FRPNow! (ICFP15)

GUI programs often have mutable state. How do we avoid it? FRP.

Problems: space leak (forget the past), no I/O interface (change the future)

Goal: get rid of space leaks without fancy types and without getting rid of higher-order programming

Time -> a; (Time, a)

Leak: snapshot :: Behavior a -> Event () -> Event a (need to keep all of the old behavior)

More general: whenJust :: Behavior (Maybe a) -> Event () -> Event a; "the first point after Easter when the behavior is just"

But another type doesn't have the space leak: Behavior (Maybe a) -> Behavior (Event a). We can only sample the value now. So this says: "tell me the NEXT time it is Just" (ezyang: It looks kind of arrowey)
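A toy denotational reading of that type, assuming discrete integer time so the "next time" can be found by search. This is only a model for intuition, not the FRPNow library: Behavior, Event, at and whenJust are defined locally here, and the real implementation does not work by scanning time steps.

```haskell
type Time = Int

-- A behavior is a value at every time; an event is a value at one time.
newtype Behavior a = Behavior { at :: Time -> a }
data Event a = Event { occursAt :: Time, payload :: a } deriving (Eq, Show)

-- Sampled at time t, the result is the first occurrence at or after t.
-- Nothing before t is ever consulted, so the past can be forgotten.
-- If the behavior is never Just again, the search does not terminate,
-- which models an event that never occurs.
whenJust :: Behavior (Maybe a) -> Behavior (Event a)
whenJust b = Behavior search
  where
    search t = case at b t of
      Just x  -> Event t x
      Nothing -> search (t + 1)
```

For a behavior that is Just (t * 10) on even times from 5 onwards, sampling whenJust at time 0 gives Event 6 60 and sampling it at time 7 gives Event 8 80.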

"I can sample in the future, I will just sample you in the future at that point"

How do you do IO in FRP? Well, it's very reminiscent of how we did IO prior to having monads. The IO code and FRP code are tightly coupled but kept separate. New input? Change both the IO and the FRP code and route it through.

So, a Now monad:

    sample :: Behavior a -> Now a -- tell me the value of a now.
    async :: IO a -> Now (Event a) -- does some IO, its effects will show up in the FUTURE (you get an event when it's done)
    plan :: Event (Now a) -> Now (Event a) -- Give me the event of finishing the now which may occur in the future

Time "stands still" during the execution of the now monad; everything is immediate.

"No more spaghetti with meatballs!"

https://www.reddit.com/r/haskell/comments/3ai7hl/principled_practical_frp_forget_the_past_change/

Q: (Ryan) The Async IO actions are semantically happening in parallel. Can you do it here?

A: It just forks a thread. It's related to par IO monad. (So you could put a lightweight scheduler in.)

Q: (Simon) Can I check something. If I recall, the reason arrowized became popular, was to deal with space leak. But it wasn't very popular; arrow programming is tricky. Is it really true, is this FRP back for the masses? (Yampa in the bin? FRPNow what we want?)

A: I'm not exactly sure what the expected correct answer is. I prefer this over arrows. You can do higher order stuff. (Is it equally expressive?) It's more expressive than Yampa. (Some side-by-side comparisons would be entertaining.) Maybe for the next paper.

Q: I appreciate you didn't give up on continuous time. What about time transformations that refer to the type?

A: In the original FRP... there was a time transformation primitive which allowed you to do things which was non-causal. We don't support it, not a way to support it. But if you have a behavior that gives you the time, you can transform it. You can't go back in time? (I'm thinking of behaviors which slow down, so they have to buffer the past... it's causal, but it doesn't work if you ditch the past.) It does work, but you do a different formulation. To slow down animations, if you have a behavior which changes the time, gives you the number of seconds since, you transform the input behavior to your animation... instead of what was done in Hudak's work, where it actually went back to the past.

Q: You mentioned async can't actually happen at that very moment but it can be scheduled. How far must you schedule it in the future?

A: It is implementation dependent. What I meant is, you are sure the event you get back, if you sample it again, no? It's not happened, because IO takes time. So it can be any point.... you start thread immediately, but the effects are not visible now.

Q: The problem some people have is they start an action, they plan it for the future, they want to cancel.

A: Not in the paper, but we've thought about it; having a callback which cancels an action. It's possible, and nice to have.


Ryan Newton - Adaptive Lock-Free Maps: Purely-Functional to Scalable (ICFP15)

Motivation: LVish. Need to provide data structures. We don't know if your program has a big contended map, or a bunch of small contended maps. So they have a few implementations of data structure variants, at the cost of complexity.

Standard data structure: map in an IORef. Pure data in a box is useful because of constant time snapshots, and lock freedom: if you atomically access the IORef, other stuff is OK.
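For reference, a minimal sketch of the "pure map in a box" baseline, using Data.IORef from base and Data.Map from containers. The wrapper type BoxedMap and its functions are illustrative names, not LVish's API.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as M

-- A pure map held in a single mutable reference.
newtype BoxedMap k v = BoxedMap (IORef (M.Map k v))

newBoxedMap :: IO (BoxedMap k v)
newBoxedMap = BoxedMap <$> newIORef M.empty

-- Writers atomically swap in a new version of the whole map.
insertBoxed :: Ord k => k -> v -> BoxedMap k v -> IO ()
insertBoxed k v (BoxedMap ref) =
  atomicModifyIORef' ref (\m -> (M.insert k v m, ()))

-- A snapshot is one read of the reference: constant time, and immutable,
-- so later inserts cannot change it.
snapshot :: BoxedMap k v -> IO (M.Map k v)
snapshot (BoxedMap ref) = readIORef ref
```

Every writer goes through the same reference, which is what stops scaling once many threads update the map at once.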

Best case: scalable lock free data structures? But it still takes twice as long to allocate scalable structures as opposed to mutable things. Also they take more bytes, slower single-threaded... that's why Java did not throw out those constraints.

In GHC: indirection cost and GC cost.

Can we just make the user just pick the right data structure? Sometimes, the contention is not statically known.

To handle data structures like this: mash up existing data structures. You have inner maps which start as pure, in an IORef, and then convert to scalable lock free. Optimistically, pick pure data, but if you experience contention, switch to the scalable variant.

When transferring, copy thread starts moving things to the scalable structure. Writes go to the scalable structure, while reads query both (in case it hasn't been copied over.) When done, swing it over. The references are monotonic.

What about removals? Copying and a write have to commute, so the copier's updates are only weak: if there's already new data, don't overwrite it. Removals need to put a special marker value (a tombstone) in. Once you finish, just drop them.
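The migration rules can be modelled sequentially with two pure maps. This sketch ignores concurrency entirely and only shows why the copier's insert-if-absent and the removal marker (called a tombstone in the Q&A) make copying commute with writes and removals. The Hybrid type and all function names are assumptions for illustration.

```haskell
import qualified Data.Map.Strict as M

-- During migration: the frozen old map, plus the new structure, where
-- `Nothing` marks a key that was removed after migration began.
data Hybrid k v = Hybrid { old :: M.Map k v, new :: M.Map k (Maybe v) }

-- Reads check the new structure first and fall back to the old one.
lookupH :: Ord k => k -> Hybrid k v -> Maybe v
lookupH k h = case M.lookup k (new h) of
  Just (Just v) -> Just v
  Just Nothing  -> Nothing -- removed: do not fall back to the old map
  Nothing       -> M.lookup k (old h)

-- Writes and removals go only to the new structure.
writeH :: Ord k => k -> v -> Hybrid k v -> Hybrid k v
writeH k v h = h { new = M.insert k (Just v) (new h) }

removeH :: Ord k => k -> Hybrid k v -> Hybrid k v
removeH k h = h { new = M.insert k Nothing (new h) }

-- The copier's weak update: never overwrite something already in `new`.
copyOne :: Ord k => k -> Hybrid k v -> Hybrid k v
copyOne k h = case M.lookup k (old h) of
  Nothing -> h
  Just v  -> h { new = M.insertWith (\_copied existing -> existing) k (Just v) (new h) }

-- Copy every key, then drop the removal markers.
migrate :: Ord k => Hybrid k v -> M.Map k v
migrate h = M.mapMaybe id (new done)
  where done = foldr copyOne h (M.keys (old h))
```

Starting from old = {1: "a", 2: "b", 3: "c"}, writing key 1 and removing key 2 before the copier runs still ends with {1: "z", 3: "c"}: the copier neither resurrects key 2 nor clobbers the newer value for key 1.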

Evaluation: with no contention it's great, and with contention it tracks scalable structure.

Q: The tombstoning is kind of interesting and works well here. How well does it generalize?

A: Tombstoning is an inefficiency for us because Maybe is an extra indirection. Semantically the concept is interesting. It comes up a lot in the literature. Then you often need to record... this is also a concept which is common in other areas.

Q: Do you have any idea why your hybrid one is slower in the larger cases? What is the overhead?

A: We have another level of indirection to get to S2. Really it must be just that, otherwise we're calling it directly. Also the overhead from the pure phase to copy. (And the tombstone?) Yeah, it's a Maybe. (that Maybe is quite annoying... because you're not using it most of the time.)

Q: In the hybrid case, did you do any measurements of read latency for copying really contended ones?

A: We have not looked at the distribution of read latencies. It must be worse because you have to check two... I hope it's not more than x2 slower.

Q: Some of the ... approaches have a problem with add/delete (if you add, delete, add), because the tombstone...

A: That's not a problem here, because the tombstone has only a very specific purpose here.


Matthieu Sozeau - A Unification Algorithm for Coq Featuring Universe Polymorphism and Overloading (ICFP15)

About unification in Coq today. Write a Coq function which maps zero to zero. He tries to be clever and write (fun x => _) and Coq can fill in underscore with x. But then he tries to write _ and Coq says no unifier is found. Hmm, something is fishy.

Another assignment:

    in_head : In a (a :: l)
    in_tail : In a l -> In a (b :: l)
    inR : In a r -> In a (append l r)

Try to prove something.... and it works in Coq and fails in Agda. (didn't understand the example) Coq works because there are some heuristics.

So, what we've done is reverse engineer what is going on in Coq, what are Coq's heuristics, and explicate it. BTW, there are two inference algorithms, one for type inference, and another for rewriting.

BTW, this is not documented at all! BTW, unification is hard/undecidable problem. We kept adding heuristics over time.

It's an old problem. As soon as you go to higher-order, it's undecidable. Fragments were found where some unification could be done. The main refinement is the higher order pattern fragment. But in Coq we added modulo beta-eta; and also modulo delta. It's pretty complicated.

Contrib: formalization of a new unification algorithm. All of Coq, with overloading, universe polymorphism. (BTW, no constraint postponement, which allows you to postpone unification constraints in the fragment. Suppose that there's some...) Implemented as a plugin. And it solves 99.9% of the cases in Mathematical Components (biggest library using unification in nontrivial ways). As we removed constraint postponement, there were some cases where we failed, but we just needed a few typing annotations, and bidirectional type checking would solve these cases. It's not optimized, and obviously faithful.

Taste of algorithm.

Q: Could you tell us a little about why universe polymorphism needs to be treated specially, as opposed to other quantification

A: This is related to the FO approximation. When we are doing unification with universe constraints, we have a notion of flexible and rigid variables. We say f is applied to particular instances, so f is always applied to a universe... we want to know what is the status of... you don't want to force unification of variables. This is also linked to the fact that... convert f of t and f of u, you might get too strong constraints if you just unify, but if you unfold, you might ...

Q: The remark about constraint postponement, which is that it makes unification less easy to reason about: the problem is if you link unification and resolution, the unification order is observable to the user. Can we recognize the unifications that would be useful to the users and the ones that are not observable, and delay those for later? Is there a way to separate? For example, the user needs to mark which holes are useful for resolution in a different way?

A: This is possible. In ... they have ... for constraints, you could say ... more eagerly resolved than others; some users can ... if you want.

Q: You said there are things you still cannot solve. Do you think that it's worth trying to get those corner cases? Or maybe getting Coq to be the algorithm? Or in those cases the user should pose their unification problem the other way?

A: (Beta) I talked to Gonthier, one of the main developers of the Mathematical Components library; he was part of my committee where this work was first presented; we saw some of these problems in equations that were not solved by our algorithm. And really, in some cases, it was because the problem was ill-formed; it was just no no no it shouldn't be solved; I don't know why the current algo is solving this. I don't claim this is for all cases, but we should be more careful.

Q: (Simon) Come back to the slide about the call to append... the natural unifier would be something or other... The natural solution isn't the only solution, there could be others. I want to check, you're going to try the natural solution, you'll backtrack. But you might have a large collection of constraints. If you've got two, they've got natural solutions, ... undoing one might unlock another one... there might be exponentially large numbers. Is it predictable what will happen, or do we do the best we can?

A: We don't have constraint postponement, so we only have one constraint at a time. (But that's an artifact: Coq encounters things in a specific order; this is like tcers for languages, you can see they go left to right...) but you have to go left to right, because of dependency.

Q: Is that 99.9% figure from before adding bidir typechecking, will you get even more complete? (Yes.) Do you think it will be complete enough that you'll use it in main?

A: That's the goal.


Derek Dreyer - Pilsner: A Compositionally Verified Compiler for a Higher-Order Imperative Language (ICFP15)

On Georg Neis not being able to give the talk: "Georg Neis is starting at Google, while his old boss and his new boss are enjoying themselves at ICFP."

Georg Neis is actually the driving force for this project. Pilsner is the first compiler for a higher-order imperative (ML) language which has been compositionally verified.

The starting point for this work is the field of compiler verification, esp. Xavier's work on CompCert. The goal of compiler verification is to show that the output of the compiler preserves the semantics of the input. So the idea is that the target has the same observable behavior. This has various benefits, but the big one is that if you do analysis on your source program, those behaviors will also hold for the target.

But the main limitation is it doesn't say anything about separate compilation: it only says what happens if you build the WHOLE program.

Compositional compiler correctness tries to define something which is definable at the level of separately compiled modules. From our view, the three key criteria are modularity, flexibility and transitivity.

Modularity: a basic criterion, which says that semantics should be definable at the level of a module, and preserved under linking.

Flexibility: your notion of semantics preservation should not be tied to a specific compiler. So if multiple different compilers compile something you should still be able to verify it (or even if it's hand-optimized.) And all of these can be linked together.

Transitivity: a compiler is not monolithic, so you should be able to verify each pass separately and then link them together to get an end to end result.

So what's the problem? There have been proposals for modularity, but they are not flexible or transitive. So... we figured out how to get all three!

The KEY IDEA: a new definition of semantics: "Parametric Inter-Language Simulations." And we mechanized it and made a compositionally verified multipass compiler for a language.

Pilsner starts with an ML like core language, with products, sums, polymorphism, mutable references, etc. going down to an Asm language going through a CPS intermediate language. (Inlining, contification, blah) (ezyang: Actually, what's the relationship between sequent lambda calculus and CPS). We also have a simple compiler (Zwickel, another German beer), and the challenge example which is implemented with self-modifying code. It's not realistic, but just intended to show the flexibility of PILS.

So I'm not going to talk about the compiler, but PILS, the secret sauce which makes everything work.

PILS builds on a mountain of prior work. If you really want to understand what's going on, it's useful to have five PhDs. I'm just going to talk about the two most direct influences on PILS; two previous works at POPL. In the image, it's the barley and hops that go in... it's metaphorical; they are the key ingredients which are not very tasty and satisfying on their own... but when fermented with years of a grad student's life, you get something tasty. So I'm going to explain what we did, how they fell short, and what happened here.

First, a strawman. Compositional semantics preservation? Contextual equivalence. Two modules are ctx-equiv if any client program you link them with cannot tell the difference between them. People really like it because it hits all the criteria on the head: it's inherently modular and transitive; it is also flexible because it is completely extensional. But the Achilles heel is that this is fundamentally one-language. M1, M2 and C must be interoperable in the same language. This makes it seem that ctx-equiv is just not related to compiler correctness.

When we first started thinking about this problem, we thought, maybe we should use the proof technique for ctx-equiv (LRs) which does not suffer from this. So we tried using logical relations, and in the past decade there's been a big advance to Kripke Logical Relations that work with imperative functional languages. The great thing about LRs is they scale to interlanguage reasoning. If you look at the canonical thing you see in a logical relation, it says: two functions are logically related at type A -> B, if for any two related arguments in A, applying them results in a related B. Notice that although they could all be the same, they could be different term languages: 1 and 2 stay on the same side. This is well known: adequacy. So in POPL11, we showed that you could use this to relate ML code with untyped, unstructured assembly code. This was kind of natural, but it was also challenging because you had to map concepts to assembly. So like logical relations, it is modular and flexible. The problem: NOT TRANSITIVE. LR proofs in general, esp. Kripke logical relations, cannot be transitively linked together. So this would support single-pass compilers but not multi-pass. I was proud of this paper, but it was a bit of a bummer. (image macro, a dude... the paper is best read while high.)
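Written out, the function case described above has roughly this shape. The notation is an assumption (the usual value relation V and expression relation E, with the Kripke world parameter left out for brevity); the subscripts 1 and 2 mark the two sides, which may come from different languages.

```latex
(f_1, f_2) \in \mathcal{V}[\![A \to B]\!]
  \iff
\forall (v_1, v_2) \in \mathcal{V}[\![A]\!].\;
  (f_1\,v_1,\; f_2\,v_2) \in \mathcal{E}[\![B]\!]
```

Only the type A -> B is shared between the two sides: f_1 is always applied to v_1 and f_2 to v_2, which is why the definition does not care whether the two sides are written in the same language.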

Fortunately, we came up with a technique for getting around this. POPL12, getting from bisimulations, we came up with parametric simulations which are like Kripke logical relations, but transitive. The detailed level is beyond what I can do in this talk, but the key idea is that LRs are so beautiful, so concisely and elegantly defined, they admit too many proofs. They permit some kinds of proofs you would never actually do in practice. So when the proof space is too large, you can't prove transitivity. So parametric simulations impose a more rigid structure, for which you can then prove transitivity.

Suppose you want to prove two higher-order functions are equivalent, where they take a function, apply a value v to it and then run an expression. So assume the arguments are related and show the let expressions are related. We don't know anything about f1 and f2, except that they are logically related. Subgoals: we want to show the inputs to the fs are related, and if we assume the outputs are related, then the continuations are related too. Notice this proof is parametric in the fs: we never looked into the fs to see what are the possible fs that could fit into this logical relation.

But logical relations, in theory, allow you to do non-parametric proofs! You can case on f. So this is too unstructured. So parametric simulations force you to do this parametric proof.

So, putting everything together: in POPL12, we developed parametric simulations and transitivity (A BEAR), but that was complicated enough for ctx-equiv in a single language setting. We claimed this shoooould scale, but we didn't actually work it out.

So what this paper did, takes the POPL11, and swaps out the Kripke LR with parametric simulations, giving you PILS, marrying the benefits of the two approaches.

Related work: ESOP14 with Perconti and Ahmed: the idea was to stick with contextual equivalence to merge all the languages into a single language which can all interoperate. So this is good for interlanguage interop (which can happen), but the drawback is that it's less flexible. ESOP14 can only link compilers with same IL; POPL15 source/target must have the same model.

Q: Who between Princeton, you and Amal Ahmed finally won? My question: you are doing interlanguage compilation, but how hard would it be to combine it with Perconti's work to get both interlanguage and multilanguage?

A: I don't know. I don't see a fundamental reason why you couldn't use this basic infrastructure... with multilanguage compilation; you just have to instantiate the source language with something that is interoperable. But I didn't see any fundamental reason why this paper couldn't have been written; this stuff is hard to make happen.

Q: Can you elaborate how transitivity fails in the earlier approach?

A: This is typical. I can show you technically where it shows up... Amal has ESOP06 where she proves transitivity for a restricted pure language with recursive types... but it relies on adding syntactic types to the LR, which we did not want to do. Also Lars and his student propose another way, but that also bakes in that both sides are the same. It's a good question, I've kept asking this question, but there don't seem to be crisp answers to exactly what breaks.

Q: (Ben) I'm a little confused: you're relating untyped languages? What are the types on the right, and what color should they have?

A: We use types even in PILS to describe the contract that should be satisfied: the type can specify what kind of things.... it's the specification for the linking.


Mary Sheeran - Hardware Design and Functional Programming: Still Interesting After All These Years

Thank you for the opportunity to come and rant, unconstrained by science! (laughter)

What I plan to do: show you some stuff I think is interesting, in the hope of luring you into FP and hardware design. I consulted some oracles (who work at the coal-face of hardware design); this is a mixture of things I like and they told me about.

Let's start with hardware description languages. How old are they? When did it start? I found a paper from 1968 which starts nicely: it states "specifying, documenting and controlling the design of digital systems are problems of increasing severity." That's a nice introduction! Then it goes ahead and says, "Unfortunately, their contribution is mostly oriented toward the machine that they were developing at the time and is not generally useful." Papers were snarkier in the 60s? (Another on Algol60: "it is less than satisfying"). The paper being snarkily referred to here is actually the one that introduced microprogramming. Paper 1953, presented 1951. That's cool, but I should add... my slides are full of links. Eventually you will be able to click and look at these things.

So... that's one possible oldest HDL. But there's also Reed from MIT, who wrote "Symbolic synthesis of digital computers." Unfortunately, my university doesn't think I want to read papers that are older than 1980, so I couldn't get it. But I could get the abstract: it talks about how a boolean machine is an "automatic operational filing system": information is contained in sets of elementary boxes or files, each containing one of the symbols 0 and 1. You can think back to this time where people couldn't even agree what to call things! But it has a language which describes circuits.

But I finally got back to 1937: Shannon's MSc thesis, it has the idea of using boolean algebra to reason about switching circuits. If I had known that someone had written a masters thesis and started a whole field, I would have given up! It's online at MIT; and it's fascinating. IF THERE IS ONLY ONE THING YOU LOOK AT, IT'S THIS ONE.

If we jump forward in time, for actually recognizable hardware languages... APL has a serious past as a hardware language. An entire formal description of System 360 was made in APL. He likes the idea of using APL for systems design... they take this seriously at IBM. They hooked up APL to their synthesis language; it was their main hardware description language; in 1970, they published a paper where they compared the results of a hand design of the IBM 1800 versus a generated circuit from their hardware synthesis. They discovered the generated circuits were 2.6x larger; but they decided that if they had more time and resources, they could get it to be 1.3x larger. In this paper, they said, "This is a good thing! APL is good for hardware generation." There was a plethora of hardware description languages.

But then, what happened?! I became an undergrad in the 70s, and there wasn't hide nor hair of an HDL: they had all gone away. There was no way of describing adders except with pictures. I did learn about formal methods... I met Henderson who advised me for my masters. He was doing art; but he was also interested in "Introduction to VLSI systems". It opened up circuit design to NORMAL PEOPLE (like computer scientists.) (laughter) It has more than ___ citations, it started Intel and MIPS. I looked on Amazon, and I discovered you can buy it for 77 cents... and I think somebody should do that right now!

So Peter Henderson was having his Mead and Con.... phase; for my master's thesis I generated the layout of a circuit, which we then implemented in the Mead and Conway way. It was really stressful; we were only thinking about the layout... and then we had denotational semantics and functional programming, and there was the answer! Also 1978 was when Backus introduced an alternative form of combining forms; these were exactly what I needed. So I wrote papers where I advocated reasoning about regular circuits: Mead and Conway advocated... using the combining forms. We had to draw pictures by hand back then! This is one of those algebraic laws Backus talked about. On the left is a reduce of G; on the right is a reduce of F composed with G, and the algebraic law says if you can push the .... (etc etc) and the Fs will appear on the carry. This law helps you reason about things like pipelining. Strange as it may seem, we actually got users. Most of the work was done by .... and a design team at Plessey, making a motion detection array, used this to think about the flow of data in the circuit. This was great because previously, the way to design the circuit was to draw it on large pieces of graph paper in a room.... you could see the joy in their faces at having a simulator. They wrote a very nice paper about their array: "Using MuFP the array processing element was described in just one line of code..." It goes on to talk about the importance of having a language to play with your circuit. I didn't do this: it was done by G. Jones...

This was a success! But then bad things happened. Our partner Plessey was bought by GEC, and they closed down the design team, and everything stopped. And if you thought we could convince GEC to use MuFP... no, that didn't happen.

So let's talk about the reality today of hardware descriptions.

I took, a couple of years ago, a picture of the Wikipedia page for HDL in Swedish. I had to use the Swedish one, because the English one is different. It says, "There are two pages: Verilog, VHDL". This is correct. The reality is that hardware design at the low level is completely dominated by 25-year-old and not very nice PLs. So maybe you'd say, "We can have coffee, we failed" but I don't want to say that; I want to say something different. So I want to say, despite this reality, I'm still interested in hardware description. Even if we can't persuade Intel to change how they do the lowest level of hardware design... and I don't think we can do it, I still think it's interesting to think about it!

First, let's think about circuits which operate on arrays. I draw them as boxes. I'm reading them in column-major. I'm going to think about combinators for plugging together circuits on these arrays. Here is "interleave" (ilv), which applies a function on every second element of the array. "Two" applies a function on the first half and the second half. And it turns out two and ilv commute with each other. Once you have things like ilv, you can describe well-known networks, like a butterfly network. It's made out of ilv: it's made by interleaving two half-size networks, followed by what I call evens, which applies f to adjacent elements of the array. If we draw a picture of the butterfly network: the interleaves cause these rifflings... it's a bit hard to see what's happening... so what I'll do is take the wires and pull them, so the little wires are stretched, and we end up with a different rep. Each vertical line corresponds to the two-input, two-output function. And this is a butterfly network. And this is the shape used in making FFT... but it is also Batcher's bitonic merge. Batcher in 1969 wrote a great paper explaining how to do sorting in a hardware network. He introduced a bitonic sequence: one that starts off increasing, and then is decreasing, or some cyclic rotation. His eureka was, if you take such a bitonic sequence, and push it through the first column of the two-sorters, you end up with two smaller bitonic sequences: all the elements on the top one are greater than the bottom one. So if you keep going, you get a sorted sequence. So how do we make a bitonic sequence? We can use recursion! So we can take two half-size sorters... reverse one of them, and then put that into a butterfly! (Diagram of the half-size sorters.) For sixteen inputs, this uses 80 two-sorters.
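To make the combinators concrete, here is a minimal sketch that models the circuits as plain Haskell functions on lists. The names ilv, two, evens and the butterfly come from the talk; the list-based definitions, and the helper names cmp2, halve, riffle, unriffle, onSecondHalf, bflySize, sortSize and sortsAllBits, are assumptions made for this illustration and are not the Lava or MuFP definitions. Inputs are assumed to have length 2^n.

```haskell
import Data.List (sort)

-- A two-sorter: smaller value first. (Hypothetical helper name.)
cmp2 :: Ord a => [a] -> [a]
cmp2 [x, y] = [min x y, max x y]
cmp2 xs     = xs

halve :: [a] -> ([a], [a])
halve xs = splitAt (length xs `div` 2) xs

-- "two f": apply f to the first half and to the second half.
two :: ([a] -> [a]) -> [a] -> [a]
two f xs = f l ++ f r where (l, r) = halve xs

-- Gather even-indexed elements, then odd-indexed ones.
unriffle :: [a] -> [a]
unriffle xs = evensOf xs ++ oddsOf xs
  where
    evensOf (a : _ : rest) = a : evensOf rest
    evensOf rest           = rest
    oddsOf (_ : b : rest)  = b : oddsOf rest
    oddsOf _               = []

-- Inverse of unriffle (for even-length lists): interleave the two halves.
riffle :: [a] -> [a]
riffle xs = concat (zipWith (\a b -> [a, b]) l r) where (l, r) = halve xs

-- "ilv f": apply f to the even-indexed and to the odd-indexed elements.
ilv :: ([a] -> [a]) -> [a] -> [a]
ilv f = riffle . two f . unriffle

-- "evens f": apply f to each adjacent pair.
evens :: ([a] -> [a]) -> [a] -> [a]
evens f (x : y : rest) = f [x, y] ++ evens f rest
evens _ xs             = xs

-- Butterfly on 2^n inputs: two interleaved half-size butterflies, then evens.
bfly :: Int -> ([a] -> [a]) -> [a] -> [a]
bfly 0 _ = id
bfly n f = evens f . ilv (bfly (n - 1) f)

onSecondHalf :: ([a] -> [a]) -> [a] -> [a]
onSecondHalf g xs = l ++ g r where (l, r) = halve xs

-- Bitonic sorter: two half-size sorters, reverse one, feed a butterfly.
sortB :: Ord a => Int -> [a] -> [a]
sortB 0 = id
sortB n = bfly n cmp2 . onSecondHalf reverse . two (sortB (n - 1))

-- Number of two-sorters used, following the same recursion.
bflySize, sortSize :: Int -> Int
bflySize 0 = 0
bflySize n = 2 * bflySize (n - 1) + 2 ^ (n - 1)
sortSize 0 = 0
sortSize n = 2 * sortSize (n - 1) + bflySize n

-- Compare against Data.List.sort on every 0/1 input of length 2^n.
sortsAllBits :: Int -> Bool
sortsAllBits n =
  all (\xs -> sortB n xs == sort xs) (sequence (replicate (2 ^ n) [False, True]))
```

In GHCi, sortB 4 [16, 15 .. 1] gives [1 .. 16], and sortSize 4 evaluates to 80, matching the count of two-sorters quoted above for sixteen inputs.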

(computer crashed here)

Satnam convinced me to be interested in median networks... you don't need the values to be sorted, you just want the middle thing. Median filtering. It's an interesting function in reality. I wrote a paper in 98, where I got down to 98 comparators. That was the piece of code I was willing to show. I did get down to 96, but it was too embarrassing to show in the paper. On the web, there's a famous piece of C code with 99 compare-and-swaps, and the folklore was that you couldn't do better. I wrote to Paeth but he didn't respond.

It looks very unsymmetrical to me! I thought it would be symmetric about the middle line but it wasn't. What I've done is I've done the usual running of a parallel sorter, but I've also done computation about what I know about wires, so I only include the wires that I actually need. This synthesis hackery allows me to beat numbers. This kind of hackery is interesting, not only if you want to make circuits, but also pieces of sequential C code that do such comparisons.

Now you might be thinking, "Search?" I have played with search. My second most proud moment was finding this sentence in a paper from 2013: "Recently, a sequence of 2^n input prefix circuits of depth n and complexity L(2^n) (at least for n <= 25) was discovered by Sheeran, VIA COMPUTER PROGRAMMING." (laughter) This sentence was written by a mathematician. Two days after it went up on JFP, he said, "You found prefix networks that match my lower bound! Unfortunately all my papers are in Russian!" But he has written a lower bound... 3.5 * 2^n... and so on. In my JFP paper, I found the prefix networks that exactly matched. I did not find them by search: I invented them, but search was an aid to the invention. I searched with lazy dynamic programming; I pored over the graphics, and then I had a Eureka moment and invented them. So having a language to describe things and play with them is important.
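For readers who have not met prefix circuits: a prefix network computes all the running combinations x1, x1*x2, x1*x2*x3, ... of an associative operator. The sketch below is the classic Sklansky divide-and-conquer construction, which reaches depth n on 2^n inputs; it is a well-known textbook network swapped in here for illustration, not the size-optimal networks from the JFP paper. The names sklansky and sklanskySize are assumptions of this sketch.

```haskell
-- Sklansky-style parallel prefix: solve both halves independently, then
-- combine the last output of the left half into every output of the right
-- half. Correct for any associative operator.
sklansky :: (a -> a -> a) -> [a] -> [a]
sklansky _ []   = []
sklansky _ [x]  = [x]
sklansky op xs  = l' ++ map (last l' `op`) r'
  where
    (l, r) = splitAt ((length xs + 1) `div` 2) xs
    l'     = sklansky op l
    r'     = sklansky op r

-- Number of operators used on 2^n inputs, following the same recursion:
-- two half-size networks plus 2^(n-1) operators to fix up the right half.
sklanskySize :: Int -> Int
sklanskySize 0 = 0
sklanskySize n = 2 * sklanskySize (n - 1) + 2 ^ (n - 1)
```

The result agrees with the serial specification scanl1 op, so sklansky (+) [1 .. 16] equals scanl1 (+) [1 .. 16]. The size recursion gives n * 2^(n-1) operators, i.e. 32 for sixteen inputs, which is what leaves room for the smaller depth-n networks discussed above.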

I told him we should write a paper together, and he said "Nooo, computer scientists will not understand!" It is a heroic proof... I can't say that I really grasp it.

(Another network.) The whole key is: with a notation, you can play; and if you can play, you can invent!

Search is a fantastic help in discovering new things. Some more examples: SPIRAL, an approach in which you have a small DSL for expressing networks and the algebra of networks, and they search for good implementations by applying algebraic laws. It's the kind of search which relies on the algebra; they generate circuits, low-level code. They have a great website that should be guiding all of us about how to present research on a website. And now it turns out... I didn't know this, but there are a bunch of people using search for sorting networks. These people (Valsalam and Miikkulainen), they are trying to push the limits of small sorting networks. Up to 16 inputs, they couldn't make improvements, but in their SAT based approach, they reduced the best-known 17-input network from 73 to 71. That, with a result by Van Voorhis, can reduce the best known for others. This is heroic search... I think some of these numbers are up for grabs. I think 71 is quite high. Maybe I can get it down. If you get it down from 71 to 70, you can get your name in a book by Knuth. That was my favorite career moment, though for another reason.

Also, you can use search to prove optimality by searching through all possible 24-comparator nine-input sorting networks, to show you can't do better than 25. At the top of the line, it shows what we know, and known lower bounds. So for 9 and 10 inputs, we've made the lower bound match. That's where we are! These are very small numbers... but also very hard searches. Only as far as 10 do we know what the smallest number of comparators is.

And why are people who are not actually interested in hardware design playing with sorting networks? Well, we have another reason for needing circuits: we can use these circuits for doing verification. This paper from the MiniSAT people is about translating pseudo-boolean constraints to SAT... and they use Batcher's sorting network! So they are interested in designing better sorting networks, median networks, etc.... so there's a reason you'd be interested in this.

By the way, I want to reduce some of these numbers, and I want to understand these structures in the large, to replace the structures from the 60s and 70s. We haven't made progress designing them since then. So this is my take on FP and circuit-like things. But you might be thinking, ugh, that's not real circuits! So let's talk about FP and hardware in the real world...?
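The basic building block of any such search is a check that a candidate comparator network really sorts. A minimal sketch, assuming a network is given as a list of wire pairs (i, j) with i < j: by the zero-one principle, a comparator network sorts every input if and only if it sorts every input of 0s and 1s, so 2^n tests suffice for n wires. The names Comparator, runNetwork, sortsAll and net4 are made up for this illustration; the published optimality proofs use far more sophisticated pruning than this brute-force test.

```haskell
-- A comparator (i, j), with i < j, puts the smaller value on wire i.
type Comparator = (Int, Int)

applyComparator :: Ord a => [a] -> Comparator -> [a]
applyComparator xs (i, j) = [pick k x | (k, x) <- zip [0 ..] xs]
  where
    lo = min (xs !! i) (xs !! j)
    hi = max (xs !! i) (xs !! j)
    pick k x
      | k == i    = lo
      | k == j    = hi
      | otherwise = x

-- Run the comparators left to right over the input wires.
runNetwork :: Ord a => [Comparator] -> [a] -> [a]
runNetwork net xs = foldl applyComparator xs net

isSorted :: Ord a => [a] -> Bool
isSorted xs = and (zipWith (<=) xs (drop 1 xs))

-- Zero-one principle: test only the 2^n Boolean inputs.
sortsAll :: Int -> [Comparator] -> Bool
sortsAll n net = all (isSorted . runNetwork net) (sequence (replicate n [False, True]))

-- The standard five-comparator network for four inputs.
net4 :: [Comparator]
net4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
```

Here sortsAll 4 net4 holds, while dropping the last comparator (sortsAll 4 (init net4)) fails, for example on the input 0,1,0,1.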

In 1994, Intel released a faulty Pentium, and then they screwed up on dealing with the flak that came with it. Afterwards, they did produce a keyring of these faulty things, handed it out to each employee, and on the back, from Andy Grove, it said, "Great companies are improved by crises. Good companies survive them." The effect of this Pentium bug was that half my friends in formal verification were working for Intel. Many of them still do. I consulted Carl Seger about the current status of FP and FV in Intel. He gave me some slides...

In Intel, they use a system called Forte to do formal verification of "computational structures"; these might be called the data path parts of processors; the algorithmic parts; the floating point units. This Forte system has thousands of users, sitting inside Intel, using a system based on a lazy functional language called fl. It has built-in BDDs, decision procedures, and a HW symbolic simulator (a kind of model checking). fl is used in this context in many ways... ways that are familiar (design language, high-level spec, implementation language...) This is a success; a quiet success, that's going on inside Intel. Carl gave me examples of fl in various contexts. He also told me about two tools that are used inside Intel: IDV, which is 280k of fl plus tcl/tk; this is the nearest tool to the vision I had as a doctoral student of using algebraic transformations on circuits. It allows you to start on a high level, start applying transformations, and then eventually deal with physical layout. I tried to persuade them to give it to me for teaching, but it didn't get out; and Intel stopped using it. In order to move designs from one process to another, they did the opposite of refinement: up to a spec, and then back down to another. And then we talked about STEP. Some papers describing them. These are huge success stories...

There is another x86 provider called Centaur; they also do heroic formal verification. In hardware, the price of getting it wrong is very high. They took a 47 million dollar loss. He said, "We can afford one more bug, but two will kill the company." So formal verification in that context is very important. The work at Centaur is verification: it starts with a spec of what an instruction should do, and compares the two to make sure instructions are implemented correctly. It's all based on ACL2. They're increasing the areas of trust: the parts of the chip which are verified. He also has spectacular results for formalizing entire x86 instruction sets, building a verified compiler.... but it's not about design; it's about post hoc verification.

My next oracle was Nikhil, CTO of Bluespec. I said to him, I have to give a keynote, what should I say about Bluespec? He said: I wouldn't have said "still" interesting, I would say "even more" interesting. Over the last two years, the interest in Bluespec has greatly increased. Two reasons for the upsurge of interest in another way to do hardware: one is the rise of FPGAs. FPGAs are being designed by people without a hardware background; and the other reason is malware and hacking scares... in the FPGA world, there is an interest in end-to-end because of fears of bad things.

BSV combines a structural hardware description language, familiar to those of us who did Lava things, with something like guarded rewrite rules: Lennart is one of the developers. And Nikhil gave me some slides... the slides also contain butterflies. One of the things he argued for in the slides he gave me was that with Bluespec it is possible to explore very many implementations of the same function; e.g. inverse FFT. There are many choices: a combinational circuit, or add pipelining, or you can feed data back around. And the combination of a library for doing structural descriptions with the rules gives you something extremely powerful. Adding rules to structural hardware description gives you a very attractive approach.

Just like we saw with the IBM 1800, here's an example where they compared BSV generated with .... now the numbers look a lot better. Compared to hand coded, BSV generated an implementation which is one third the size! For the larger size, it's 0.81x. These are spectacular results. Not just competitive, but far superior. Is this real? Nikhil's answer is... BSV often beats hand-coded RTL code. Why? Algorithmically superior designs, because it's easy to make changes, even major architectural changes. You have a much better chance of finding a good implementation.

Now, they also have... another tool in their arsenal. This is something new: Bluecheck. It's related to QuickCheck, which some of you might have heard of. It hasn't appeared yet, but it's a generic synthesizable test bench; it lets FPGA designers provide an executable specification, and for free get an entire test bench including shrinking. Apparently this is blowing minds! The point is that the shrinking and iterative deepening, the approach for generating inputs, CAN HAPPEN on the FPGA. You change from doing 350 tests/s to 100000 tests/s. They've linked it up to a software tool for studying. They're in a project making a memory system for multicore. They worry about sequential consistency; with this QuickCheck thing... they end up with much shorter failing test cases than they do in any other way. We've all heard this before: shrinking is a huge win. Fantastic work! Bluespec is making a lot of use of this.

And around Bluespec, there's interesting work on formal verification. I picked on two papers by Adam Chlipala: Formal Verification of Hardware Synthesis; not verifying resulting circuits of fixed size, but the whole generation of parametric circuits. That paper also has a bitonic sorter in it! There's a theme arising here. There's a follow-on paper from CAV this year which uses Coq to do the first machine verification of sequential consistency for a multicore hardware design that includes caches and speculative processors. Bluespec is not only winning, it's a basis for really interesting work on FV.

What about Lava and some other things? Chisel is a last... implemented in Scala. With lots of users in an arch design group. You should look at it.

And there's Cryptol, for crypto algorithms with a route to FPGA. That book from 2010 has a lot of interesting papers. I like this paper, it says how they get to FPGA; and the notion of undelay (antidelay); the use of thinking about circuits this way.

And there are opportunities in OpenSPL. There's a company, Maxeler, with a gigantic FPGA, and when you buy it you also get a programmer, because it's just so hard to program. They're trying to pretend it's "spatial programming" but it's really just hardware. Open spatial language: they want input, we should provide it!

Final oracle is Andreas Olofsson of Adapteva; I could #include a rant from Satnam, or the #include of what machines will look like in the future. In these slides, he told his whole story of developing the Parallella board, which contains a multicore chip, plus two ARM cores. Credit card sized, $99; it's very low power, very interesting, and very difficult to program. And he wants help. His story was how he went from Kickstarter to being invested in by Ericsson, and at the end, "it's the software, stupid!!!" He can do the hardware design... but we need to help him on how to program these things! Programming these is going to be difficult because it's not one... but many.

So the division between hardware and software: those days are gone. This is what machines look like now: FPGAs, multicores, CPUs, GPUs. I asked Carl Seger: is there any hope for us to influence what happens? He said: FPGAs are moving into the processors. There's an announcement today of a Xeon chip with an FPGA in the processor. So we can't make this distinction. We're going to have a large number of these things to program. We need to deal with heterogeneity and massive parallelism. Lots of relevant work.

BUT STILL. I kept... I was positive until the very end. STILL, I lack a high-level language that allows me to think about playing with time and space the way hardware designers do. The research I started in my PhD: I have not succeeded in. There's still a lot of work to do. Work by ... I need help. I'm hoping to lure somebody into this. How do we think about the tradeoffs of space and time that designers make? I've been involved in setting up workshops to think about this. It used to be that this conference was FP and comparch... but we've gone away from it. Maybe we should come back.

Programming future machines will be more like hardware design than is comfortable. So not only is FP + HW interesting, the ideas might be important for software.

Q: (Ragde) My grad work, which was done about the same time, was lower bounds for || computation. The big result was AKS log depth sorting result. Have you been thinking about that result?

A: To get the constants down? (Not just get it down, but log depth networks which seem to be hard.) I have not really thought about it.

Q: (Bodik) On the note of how abstractions are changing; substrate changing; with the sorting network you mentioned, the goal is to count the comparators and minimize that. But if you want to implement in software, or on an FPGA, the wires are not free; you need to implement them as indexing into arrays. There, minimizing comparators may lead to complicated indexing. Are there thoughts on a different cost model for networks?

A: I haven't played with this for sorting, but I have for parallel prefix, because when I did DP search, I sometimes didn't minimize comparators, but length of wires. You might also minimize the size of the largest fanout. I'm lucky I have a VLSI group who helped with deciding what are reasonable cost models. I think it's possible to do work in that direction, we should do more.

Q: (Svenningsson) We often sort in order to search with binary search. On modern hardware, which has caches, binary search is not the most efficient way... often it's better to have a B-tree or other order. What is known about circuits that produce this kind of order?

A: I don't know! Sounds very interesting.

Q: (Gershom) As I understand parallel prefix, it's a lovely derivation... then you jump ahead and you have to synthesize a bunch of stuff, think hard, test it, get better ones. There's a very big gap there. Is there a point where program derivation falls off?

A: For prefix networks, we know very well what they look like, we know how they decompose, but what we don't know is what the size of the components should be. We understand them very well... it's not a good example of where program derivation would be of value. Sorting would be a better example.
