Discover Top Posts Tagged with #pldi

L'IDLP au Pays Bas...au moins à l'aéroport. ;-)

19 December 2014

Everything is completely setup and working, yay! I started doing a bunch of things today, but kept getting sidetracked and distracted. Not a very productive day. I have, however, almost finished https://github.com/papers-we-love/papers-we-love/blob/master/plt/fundamental-concepts-in-programming-languages.pdf?raw=true and I don't plan on sleeping until it is done.

#pldi #needfocus

Quentin Carbonneaux - End-to-End Verification of Stack-Space Bounds for C Programs

Toyota unintended acceleration; stack overflow. Stack usage of embedded systems important.

Quantitative logic for C (for stating stack usage) as well as a (certified) automatic bound derivation program. Extends CompCert

CompCert models behavior as traces; now talk about call/ret in trace

Trace weight = maximum stack usage (supremum over prefixes)

Correctness: source does not go wrong, and all traces have weight smaller than stack size

Assertions in quantitative logic are natural numbers (saying how much stack space is available.) 0 is True, infinity is False. Logic works the way you would expect.

"Multipel function arguments": varargs?

Q: Do you support embedded code pointers? Function pointers in memory.

A: The logic basic, so we don't support it. But if we used XCaP, then we should bea ble to use it.

Q: How did you prove soundness? Agument semantics with calling depths?

A: Continuation based definition

Q: Do you have recursions? (Yes) Does simple induction work? (That's not simple)

Q: (Alastair Reid) This is really important. But embedded systems have another problem: interrupt driven. So it's really important to know how much stack the interrupt handlers are using.

A: First we need semantics of C with interrupts. Assuming we have that, we can bound it with interrupt handlers

Q: They are often nested. So it's important to reason about priority levels.

#pldi #pldi14

PLDI Artifact Evaluation Session

(Shriram Krishnamurthi) We wanted to explain this process, since it's new to PLDI and it has started to happen at other conference. (Insert description of process here: only after paper acceptance, AE does not change status of paper)

What is the criteria? Is it good? Does it build? Actually: evaluate against expectation set by the paper. About the reviewers: "They're young and naive and actually wrote long reviews." One paper had 29 comments. "People on the PC actually responded and stuff, which is amazing."

52 accepted, 20 artifacts. It's a bit questionable that only 12 passed. That's low--maybe a little disturbing.

What are quotes about good artifacts? (Sometimes reviewers thanked the submitter, etc.) Some people were positive. High praise for some artifacts: "I'd be happy to use it in a production web site today"

Distinguished in artifact evaluation: Doppio: Breaking the Browser Language Barrier (Vilk, Berger). Java bytecodes to JavaScript, and virtualize filesystem and everything else. "Worst of both worlds!" (chuckles)

Questionable reviews: "slightly crashy, undocumented format, but it works". "The virtual image didn't work, but it was easy to build the source archive." (AEC did their job!) "Once it runs, it runs, but getting to that point wasn't easy." That is shady. "Artifact is easy to use but insufficient to reproduce the experiment." "I couldn't reproduce quantitative results because there were no parsing for output files." "Couldn't reproduce what happened in the screencast." (There is an innocent explanation: the paper was based on original data, but the artifact was after the paper was accepted.) "The configure script exited and just printed 'Sorry'." "Too many dependencies!"

There was one paper where the AE might have affected the actual acceptance. "Comparing to related work without faithfully implementing the other optimization." We wrote to the authors, but we weren't setup to have a detailed discussion, but it was very questionable.

(this ends the slides)

(Emma Tosch) This is for my own benefit, since the OOPSLA AE is due today. I've been making lots of changes to things: I have data, I was wondering, for the system and the analysis I do, these are going to be different, so I need to give an explanation about what's different. How much documentation should I give and how much explanation? Nothing in my paper depends on it; I'm just reporting. What do you recommend w.r.t. the goodness of science and reproduceability?

(Jan Vitek) I think you should think about the most important criterion: expectations set by the paper. It would be nice if they can doublecheck the things in the paper. If some small things change, that's fair. You're constructing a closure: here is everything from the point of the submission, to reproduce the submission. As long as you keep data sets around, if you change format, well you should have also changed your scripts.

(SK) Explaining what happened is fine too. But if something mysteriously disappeared, you don't have a replayable artifact. It doesn't invalidate the research, but it's an asterisk. You might be able to explain why we couldn't get the stamp, e.g. "here's the tool that disappeared."

(ET) One problem is there are many different types of artifacts. The trouble with a dataset, the analyses are often very sensitive to the implementation. So if you're using one library to do a simulation, your RNG changes. Reproduceability and understanding bounds can be difficult.

(SK) Yes, so some papers mention version numbers.

(JV) And some people do VMs so that you can have true reproduceability

(SK) But if it's that sensitive, why do you trust the result?

(JV) If there's a range, and you're in the tolerance, that's OK. If you get varying numbers, which are not discussed in the paper, well, that's questionable.

(SK) One quote from a reviewer, which I didn't put on the slides: "I got something roughly similar, but I got so many different answers, it was patently clear the paper should have error bars." Well, of course you should have error bars, but it's possible that the authors were picking desirable data.

(Adrian Sampson) This came up when I was doing reviews: some people place more weight on reproduceability and some placed weight on general expectations. I think it's worth teasing this out. The paper expectation is different. Do we want to reproduce the evaluation? The paper is not just the evaluation. I was willing to be more lenient; the evaluation might be logistically complicated; the overall description of the tool, that seemed more important. But that's not necessarily to argue, but I think it's an argument for being clear about the role of reproduceability.

(Eric Eide) This was pretty tricky. I had to explain multiple times the criterion was not perfect repeatability. There are lots of reasons why perfect repeatability is not attainable. There are a lot of performance numbers! General trends? Whatever. That said, I think there is expectation the tool should plausibly support the conclusions made in the paper. If there are questions about variability, maybe the paper should note that. In general, part of the idea is to improve repeatability and move towards an understanding that we should have a scientific process.

(JV) Another benefit is we have to be much more precise about what we did. In the paper, we might say, "I've implemented a C++ compiler." Then a reviewer might throw any C++ program at it. But more likely, they've implemented a compiler for a subset of C++ for parallel loops. (SK: you actually implemented a compiler for Pascal). Is the description in the paper matching what was actually implemented?

(EE) The point is to put the D back in PLDI. Someone made a dialect of Python, it's easy to use? They have a bunch of language constructs, semantics part, and then a big user study, whatever. Then they give you the compiler. What would you expect to be able to do with the compiler, to evaluate it? The example programs should go through, maybe the semantics is formal, but the user study is not reproduceable.

(SK) Well, be careful: this is the question of repeatability versus reproduceability. The user study is not repeatable, but we might still evaluate the raw data. To get back to the high level point, reviewers often have an obsession with exactly reproducing what the paper said. For some papers, it is obvious that you can't. Embedded hardware you don't have? Well! But for things that look similar enough, you might think you can. I think we shouldn't be too obsessed, but I've published papers, we generated some number of states (e.g. two), get some speedup, there was a lot of writing. If someone else gets five, did I really understand what was going on? Maybe there's a problem.

(AS) Maybe in addition to this, maybe as an overall goal, say, "Give us all the tools to reproduce the evaluation, or tell us what the gaps are, which are infeasible"

(SK) It wasn't good enough?

(AS) It felt people weren't totally sure... often people said here are the things that are reasonable to reproduce, but didn't explain other things. This left reviewers at a loss. It would be good if things said: "Here's why there's variability."

(ET) I like the idea about responses: maybe having a conversation for making the paper better. For AE, that seems necessary, in that, for error bars, maybe people weren't thinking about it, making typos, especially for last minute.

(SK) Part of the problem is this: every time we've run this process, there was not enough time. There are four weeks to do it: it has to be before the camera ready. There's not enough time, but we did make up an ad hoc response period this year. We sat on the reviewers so they did the reviews ahead of time. We told authors, this is ad hoc, we gave authors the full reviews, and told them you're welcome to respond. We got some responses, some were not prepared. It's just really constrained.

(Carl Bolz) Did it make a difference?

(SK) I can't think of a case.

(Paolo Giarrusso) Was there a phase for all papers?

(EE) We only solicited ones where there was a question.

(George Kastrinis) Given the code which cast doubts, do you think the process should change in the future and take artifacts into consideration?

(PG) Yeah, didn't we just lose this data (from the AEC)?

(SK) All reviews are like that. At ESOP, people were very concerned: how is this going to affect the process? The community was fairly conservative. The idea was, if we do AEC after decisions, then nobody will be hurt by this.

(Ben Liblit) Even the bad papers!

(JV) Overall the number of bad papers is not overwhelming, and the whole point is to encourage people to do the right thing. If it's punitive, then it will change the tenor. People know that the paper is accepted, so they are trying to be encouraging and helpful. So it was an artifact of the genesis of the paper, but it's a feature we should keep.

(SK) Here are the things people were concerned about http://www.artifact-eval.org/motivation.html (Design Criteria). (Singled out the exposing artifacts to the general public is unfair to the group)

(ET) About exposing artifacts and people poaching your research. Do we really want to a community that has this fear?

(JV) Well, in Biology, you explain how you did it, but you will not share the data. It cost you millions, it's precious, you will not share it.

(EE) Well, in computer science, things tend to be very open.

(SK) What kind of community do we want to be?

(ET) Think about informal publications, I have a research blog where I write lots of details. Maybe two people read it.

(SK) I didn't know you had a blog!

(ET) My understanding is, today, that's considered a formal document--if there was bad faith behavior, it functions as lab notebook. But if you're in the public, it seems to foster a sense of scientific community and shared goals and published, rather than the economic system of biology, which is setup to divide people.

(SK) My view is if someone builds on it, I get citations without doing any work. That's great!

(JV) The reward system is not quite right...

(SK) Can people write papers based on your work? Then you will get your next NSF grant for free. "This is something people will use!"

(ET) Do people ever contact original authors?

(SK) Never.

(Jean Yang) How much work is evaluated on how easy it was to setup environment? Should we build environments to improve that?

(SK) That was not the goal. Even if painful and undocumented, if we can push it through, we accepted it. Here is the social problem. If we reject most things, then the AEC process would die. 20 grad students rejecting papers; we'd have 50 professors would have pitchforks saying what a terrible process this is. We want to encourage people: they did a lot more work than they needed to! The primary thing was not the artifact, it's the paper: you are NOT EVEN REQUIRED to publish the artifact. The paper is key.

(JV) It's a process. People don't know how to package these things. Every year we'll know better.

(Gregor Richards) I have a response. The dependencies comment was my comment. I wanted to say how excruciating it was, but it wasn't part of my conclusions about the quality the artifact. It was, "In the future...", but to actually judge it, here are the other flaws.

(SK) And in the future you might get rejected; you won't get an easy pass.

(JY) If we make it easier to submit artifacts, will more people do it?

(SK) Well, we provided links to many tools for doing this. See the section "Code Artifacts Packaging." This year we saw aa lot more Docker instances.

(Ross Tate) I have some relevant experience, in the context of whether this should be required or not. This year I have a bunch of students work on established GitHub projects, but the biggest headache is setting things up. These things are just part of the process. It can be intimidating, training people on these things. As an untrained user, it's very intimidating. But if it's not required, then maybe it doesn't work out.

(GR) That's why one suggestion is to provide a VM image.

(RT) To have someone compile my code, that is not easy!

(SK) That's why I really like the example where the VM did not work, but the compiler image did.

(Swarpenedu Biswas) I submitted an artifact and benefited from it; one reviewer told about documentation. Do you plan on making the response period a mandatory part? Or keep it ad hoc? It's very possible to misunderstand the documentation.

(SK) We would like to make it a standard part, but there's just not enough time.

(JV) We tried a "kick the tire" thing. First three days, try to get proof of life (build it.) If you get stuck...

(SK) Well, you don't notice subtle things... the VM runs, that's nice, but that's often not the tricky thing. It's all the other stuff; the weird output format, numbers don't match up. We're squeezed on time. Here's why I think we had time here: because we only got 20 submissions. If 40 submitted (we expected 30), there's no way we would have gotten it done.

(GK) Are there valid reasons for not submitting an artifact?

(SK) Absolutely. We should not turn this into a litmus test for every paper. I write theory papers sometimes. These should not be pounded out of the system. There's enough ... going on from SIGPLAN. That's why expectations are important. There's a running joke, since POPL has done AEC; is there anything besides Coq that should be done? Maybe you have a 30 page technical report, and that's the artifact. It's all a function of claims, and justification of claims.

(JV) Don't knock proofs as artifacts. I have seen papers which have claimed to have proven something, but authors haven't typed it up.

(SK) It turns out dogs eat everything: codes, proofs, manuscripts...

(JV) I don't think this is about fraud.

(SK) But I've been there.

(AS) What about super-complicated artifacts? I know someone anecdotally, because he has a big cluster system, you need Infiniband between 102 nodes. I was trying to argue, take your analysis part, and do it, but he couldn't find guidelines about how to extract the useful part.

(JV) Do they have any data? How did they compute that data? There is some output of simulation, a complicated set of scripts which made the graph, that is very valuable.

(AS) I agree, but I wish the guidelines for the AECs had ideas for how to deal with this. You don't need the whole thing.

(SK) We also have to accept at some point we're not going to be able to do it. Let's just continue to keep this after the paper is accepted, because to reject it would be crazy. That's too bad, maybe you depend on very specialized hardware, that was the point... what are you going to do. There's two copies... etc. One author offered to ship us hardware. we were like, well, we have three reviewers, we can't clone it, it just isn't going to work. Also, I think this is a case where the common case is not problematic, so I don't want the uncommon case to hold up the common case. One of the things we do is we give the decision back to =the authors. They can choose to print it however they want. "This paper was not submitted AEC because could ship Amazon things." What I want: I read a paper two years later, I want to know if anyone evaluated this. If I see AEC, someone evaluated it!

(CB) Here's a thought: there's a tension between making artifacts easy to evaluate for AEC, and making it actually useful for consumers of research. Shipping VMs is sensible for AEC, but not useful for sticking it in a pipeline. This is something to figure out over time.

(SK) We got some feedback: it was helpful to have someone outside the group run the damn thing. Multiple people! This automatically helps.

(CB) It also helps you find holes in arguments.

(SK) Some people had hardcoded usernames... but if the only thing I have is my machines, well, maybe you forgot. It's hard to think of everything.

(RT) The topic, is to verify claims? Then could another way to do it is have the AEC work with the authors, rather than independent.

(SK) That's painful. I've not known many cases where the authors vigorously disagreed, but some authors were mad. Imagine your paper is rejected... you write back and refuse to accept the rejection...

(BL) If evaluators are grad students, you don't want them getting into a fight with a tenured professor.

(EE) The chairs acted as intermediaries between committee members and authors.

(JV) Signed NDAs...

(SK) Commercial software is tricky. We want people to be anonymous, there was commercial software freely available for people who sign the document. So Jan signed the document, and told the reviewers, "Don't fuck up!" It's not clear what to do in that situation.

(BL) Many people have contacted authors of papers...

(SK) People started to nod...

(BL) ...asking for copy of tool. What happens, your email address for the first two-three authors bounces, their emails are invalid; finally you get a response from the professor, they don't even remember the project existed, because they never ran it. I'm cautiously optimistic to see what effect AECs will have in the long run, when five years from now I want to cite/build on an AEC paper. I hope it's better.

(EE) There are two papers that have attached artifacts going into digital library. I think this is the very first time you can do that. Positive effect!

(BL) Trying to get and use artifacts has failed so often. I just hope... maybe it will be better.

(SK) I don't want to give oxygen to a terrible piece of work, but I'm going to do it. Here is Christian Collberg's buildability of artifacts. They did a dreadful job when they found things... they did not make Arizona graduate students look very smart. One bottom line. Workshop TRUST, Collberg is going to be there, and I'm trying to figure out the most measured way to tell him what I did. A bunch of us setup a crowdsourced effort, to reproduce their reproduceability subject. But they were unresponsive, finally, they talked to a coauthor, got responses, sure, and now, here's what we have. Here's the classification: disputed, etc... http://cs.brown.edu/~sk/Memos/Examining-Reproducibility We did this, there was a bunch of response, then interest died out (as usual), but if you read the emails, the conversations, it is truly scary.

(GR) I have a thought: if we're thinking so much about how authors are going to communicate to the AEC what claims their artifact correspond to, and in the end, the stamp, there's a disconnect, communicating to the reader what claims it corresponds to.

(SK) Yeah... computer science, more complicated mechanisms don't work. If the evaluators have decided overall the paper is credible, it gets a stamp, otherwise, it just is like... when a paper is accepted, it doesn't mean the PC agrees to everything. All it means they signed onto something in the original paper. So there is some amount of trust.

(JV) In the long run, maybe they will submit the code for external review...

(PG) It's good style... to update the paper to say what things. I want to see how many papers were altered as a result of the feedback. Did you add error bars? The optimization paper? They would have had to withdraw the paper...

(EE) It would be nice to figure out how to work it into the process... here is feedback. Getting that to authors for them to modify things... I don't know how to do that in the current schedule unless... if you participate in AEC, you get an extra week? Two weeks? That's one way to get artifacts.

(SK) We have this problem camera ready three months before.

(JV) That could change.

(PG) What about loop stuff?

(SK) First stage?

(JV) Second stage. But we could do it at the end of first stage, because most papers.

(CB) It's pipelining.

(ET) This goes back to Adrian's comment, for things which are difficult to reproduce. Simulators: it's nice to have, but you don't want to talk about simulator in paper. in terms of reproduceability, it's nice to have alternatives. Clearly... it's very selfish, but my work requires Mechanical Turk, but I also have a local host to walk through. This is feasible, but in terms of encouraging understanding, the pipeline works, do we emphasize simulators? It's nice to be able to model a system

(SK) In any case, you had one for testing.

(EE) I think having a version of your system which works in a simulator, to validate the parts you can validate, is an excellent idea. I have one foot in PL, systems, many of my colleagues in SDN and datacenters... unless I have a datacenter, I can reproduce, but there's ton of SDN simulators.

(ET) So should we ask people to submit this?

(JV) It's not appropriate for all papers. Requesting something... extra work? We don't want it. It's god to have independent evaluation...

(SK) How are you going to make your case? They have to believe it. you take your best shot. There were all sorts of questions about how to write a paper, because a whacko reviewer didn't... the advisor now can tell you what to do. I think with a little practice, we'll figure it out.

(ET) Well, the thing about simulator, is the conflict for users and AEC. Users have no use for simulator. For cases where people who didn't need it... proper system. I think we should try to encourage people to have these kinds of tools, but in terms of what we're asking people, maybe clear guidelines on it, so you don't have people saying it's not appropriate at all.

(SK) that's the really troubling thing. "Why didn't you ask?"

(JV) The reason you have to say it often is because people don't ask.

(RT) There are two things you're trying to do, they need to be separate. Verifying claims, and verifying there is a useful tool (helping others.) There are different things.

(JV) Our hope is pushing on one push the other.

(SK) I think that's true. It makes it all a little better.

(CB) The virtual machine makes it possible.

(SK) As an example, I'm trying to build on something ....

...

(JV) We're looking for the least work. You did something, you did experiments to convince people.

(EE) Artifact is message to research peers, not users. If you can address both, that's great. We like pretty webpages, but users won't care about some things. Research colleagues do care about. users don't care about source code, researchers do.

(RT) Well, it's tough...

(SK) We hope it's similar enough, so having a third processor is unlikely to be helpful.

(RT) Well, making my tool useful... if I'm not going to use it...

(SK) The tool paper on the tool is different from the result of using it.

(AS) Here's a funny idea: have a list of stories of successful artifacts. Because everyone's case is so unique. Once there was a person who submitted [X], or made a cluster thing, they didn't give us anything... they don't have to be real stories.

(?) If I can't reproduce your story...

(ET) We all hope we're doing work we think is valid, I don't want to shame my advisor by not earning a sticker.

...

(SK) Now we require a webpage. I think this is really important.

(JY) Is there a public announcements?

(EE) You have to look at the frontmatter.

#pldi #pldi14 #aec

Tony Hoare - The Laws of Concurrent Programming

"There are two ways of constructing a software design. One way is to make it so simple so there are no obvious deficiencies. The other is to make it so complicated so there are no obvious deficiencies."

I think I'll continue Keshav's catechism, and see if you understand all the things I've got on this slide. Relational calculus to residuals: I have a confession to make: if you're familiar with all of these -isms, you will have thorough difficulty understanding my talk. If you don't, you'll have to exercise some gray cells, but you'll have an easier time understanding. The reason is I'll be simplifying everything to the greatest extent possible.

Algebra is the main subject of my talk. The algebra I'm interested in is based on Tarski's relational calculus. This is one of the great unknown articles of computer science. The secret is to introduce into the relational calculus a second operator for composing programs, the concurrent composition, to match the sequential composition. Then to the relational calculus, we'll add an exchange law, iteration (while loops) and a definition of residuals, which is also less well known in relational calculus, and recursion (also due to Tarski). Put together, these are a complete set of algebraic laws for concurrency.

The most important topics are process calculus (Robin Milner's discovery), verification logic (attributed to me, but I will use separation logic, due to Reynolds and O'Hearn), algebraic laws and denotational models (Dana Scott).

Algebraic laws play a central role in the semantics of programming language. Frm the algebraic laws it is possible to deduce a verification logic, simple mathematical reasoning, and also deduce a process calculus, ala Milner's CCS. From the conjunction of a process calculus and verification logic, it is possible to deduce all of the algebraic laws I would propose. And finally, to make sure this little triangle is based on something that exists or could exists, one should provide a denotational semantics, which is a model for the algebraic laws.

Here is a simple subset which only has the composition operators, which are applied not just to complete programs, but individual traces of programs: a trace model of my model of programming, using P, Q and R to stand for individual traces. This will give us a simpler explanation of what the meaning of the operators are. (Or subtraces, since the execution of a program splits into as many parts of the program there are, and we can identify which part of the trace corresponds to each part of the program.)

P;Q stands for a trace which has been split between P and Q, such that all the activities of P have been completed before the execution of Q began. A time-wise split of the trace. The concurrent execution expresses a space-wise split of execution, where P is executed on a different component, using a different set of local objects than Q. If we distinguish splitting traces in space and time, we get the idea we need two different operators. Finally, the trace which I will call I, is very lazy, it describes doing nothing.

The algebraic laws which I recommend to you are the simple ones simple from high-school algebra, of commutivity, associativity and a unit law. These same laws apply to sequential composition and concurrent composition. I would like you to feel these laws ought to be true of programs; when we write a program, we would not expect to have to put extra brackets in to disambiguate the associative operator, and similarly, you'd expect doing something at the same time as something else... and doing nothing has no effect anywhere... These laws are not exactly surprising.

I'd like you to notice something about the first law,because it illustrates the property of duality. If you look at the right hand side of the law, you see if you read the rhs backwards, it has the same form as the lhs. If you interchange the order of operands of semicolon, you get a statement which is a theorem if you start with the theorem; it will remain a theorem after reading it backwards. This is ture of associative axiom, and it will be true of all the theorems derived from the axioms. Reading backwards preserves the truth of axioms, and it thus preserves the truth of all theorems.

Summarized on this slide. This is very similar to the time symmetry principle obeyed by many laws of theoretical physics: they remain true if you go forward in time or backwards in time.

I'm going to introduce you to the concept of interleaving. When we get a parallel combination in a program, we would not be surprised if the implementation decided to interleave the actions of the two threads involved, perhaps executing them on the same computer or core. I'm not going to make it compulsory, but if you did interleave them, you'd get a valid execution of the same program, even if it was a more interleaved execution--a more sequential one, maybe taking longer, because it didn't exploit full concurrency. The <= sign is used to mean that P and Q perform the same action, but P is more sequential, and Q is more concurrent. This means I can have true concurrency, but implement it sequentially whenever I want. I am later going to equate this with refinement ordering later, as P is more determinate than the trace which contains true concurrency, as it sets a particular order for actions, whereas true concurrency does not.

I've used an ordering symbol; I've talked about partial interleavings; only some of the interleavings have to be implemented. The result will be in the interleaving ordering as compared with the uninterleaved. So the laws are that it's a partial order, and laws of monotonicity, which say that if you have a partial interleaving and compose it sequentially with a trace, the results will also be ordered by partial interleaving. We can use partial interleaving anywhere inside a trace and the result will sitll preserve it. This is an important reasoning principle for reasoning about various kinds of ordering, including refinement ordering.

Now I come to the central part of my talk: the exchange law, which relates concurrent composition and sequential composition. It's a sort of distribution law, but it's not familiar because it has four operands. The law is easier to understand if you write a 2D version where parallel composition is written by putting operands above the other. (P || Q); (P' || Q') <= (P; P') || (Q; Q'). The 2D law is familiar because it is true for multiplication and division of natural numbers. But I think the point I'd like to make is that the law is self-dual, in that if you read it backwards, it is still the same law.

The left-hand side is an interleaving, implementation of the rhs. It is the special case when the two rhs ';'s happen to be simultaneous, perhaps as a result of scheduling decisions, which you have to make when you're sharing a single processor among two concurrent threads.

From this exchange law, we can derive a number of simple laws which only have three or two operands, simply by replacing an operand in the frame rule with the do-nothing operator. Theorem 1 (P || Q); Q' <= P || (Q; Q') is done by replacing P' with I. You can do this with the other variables. The question is, in order to characterize interleaving, do we need new laws for five operands, or six operands? The answer is no: the four operand form is powerful enough to derive all theorems about interleaving. At least, I think so! I'l show you by example: abcd || xyzw (Tony uses a semicolon to show where he wishes to apply the exchange law) <= xayzbwcd (see slides.)

Now I want to move onto an idea which is also obtained from boolean logic, namely proof rules from the algebraic laws. In freshman logic, your students learned how to prove the validity of the laws of natural deduction, from the laws of boolean algebra. (And the other way around too.) if you want to prove an interleaving with r||r' on the rhs, of the shape shown below th eline, you can split the task of proving the conclusion into two simpler parts, where you only have to prove the interleaving of single operands. Like a natural deduction law, it shows how you can match a conjecture, and the antecedents of the law will tell you how to structure the proof. Now the proof of the modularity law is not difficult; no more difficult than the proofs of modularity rule implies exchange law. The short undergrad proof demonstrates it, using covariance to deduce something that looks like the conclusion.

We can now use the exchange law to derive the rules of verification logic and process calculus. But to do so, I have to define the primitive notations of verification logic and process calculus in algebraic terms; I have to give an algebraic definition of what a Hoare triple means.

I've done that on this slide. A Hoare triple is {p} q {r} = p;q <= r. It is interpreted as saying, that if p describes the part of the trace that has happened so far, and q describes the rest of the trace, then p;q describes the whole of the trace, r--the part that has happened already, and the part you have just performed. Let's see, the definition may seem a bit bold, but we'll see the consequences soon. The Milner transition is given a similar definition. (Who's familiar with Milner transition, the arrow with transition on top of it? In that case, I'm doing something new. I can't write it on top, so I have it in the middle.) The intended meaning of r >- q -> p = q;p <= r; if you have to execute the program r, one way of doing it is to execute q first, and then saving up p to execute afterwards. It's written the other way around from the Hoare triple. One consequence of this law is the first rule of Milner's process calculus, which is if (q;p) >- q -> p (i.e. you can execute (q;p), you execute q first, and then execute p.) Th eproof is very simple.

The point is that it is proved from a law and the definition; it is not postulated as an axiom or part of the language. The modularity rule is the main law which governs concurrency, both in separation logic and in Milner's process calculus. This slide just contains two different translations of the modularity theorem I proved a little while ago. The top line tells how to prove q||q'...

Now, Milner's form of the same rule has a different interpretation from the Hoare logic. If you wnat ot execute (r||r'), then one of the things that can happen first, is the concurrent occurrence of q||q', where r >-q-> p, and q'... What remains to be done afterwards, is what remains to be done on the first of the two concurrent operands. It gives an operation rule by which one can implement concurrent composition of processes in CCS.

The most important law [O'Hearn Frame Rule] of separation logic, which makes concurrent programs mostly a question of proving sequential (reduces the problem), using local variables. If we can prove {p} q {r} with normal reasoning, if f is a trace that is independent of p, then {p||f} q {r||f} can be deduced. It is easy to understand this rule with a rule I gave a while ago, rule of adapation, which enables you to take a triple which you proved about assertions of some of the variables, and extending it to an environment with more local variables, described by the proposition f. So you can make the prover's simple triple adapted to the context it's going to be used. But we don't need the awkward side condition that the variables of f are not changed by q. In fact, separation logic is defined so that it produces a proof that there are no race conditions at the same time you are defining your rules. Absence of race conditions and validity of triples, and that's what makes separation logic so attractive. Of course, it doesn't solve the difficult problems, not all of them. Just writing the assertions for a concurrent program can be considerably more laborious than writing them for a conventional program; you have to use many more ghost variables.

Milner has a frame rule as well. If you have to execute r concurrently with f, you can just do q first, provided q is something you can do first if you just want to run with r. f has not noticed q has happened.

The same thing is true for sequential composition. The traditional Hoare rule for sequential composition: if you do the same kind of derivation, you find, using covariance, that the rule {p} q;q' {r} is equivalent to half of the law of association.

Note that the Milner rules are completely dual to the verification logic. Therefore, every law of Hoare logic can be translated, by duality, into a law of normal logic, and vice versa. I think this is a nice result, because for many years Milner and I would argue about this: I believed in verification logic, and Milner believed in... We thought we were arguing about something different, but actually we were arguing about the same thing; we just had the order of the semicolon different. To restate, the Milner rule of sequential composition (which looks very similar to the Hoare rule), except it's reversed, but that's exactly what we should see.

Thus far, I have carried out my promise to show how the laws of process calculus and verification logic can be derived from algebraic laws. This is not a complete story; I haven't given you the denotational semantics. But I want to say a word. Why is this important? I think the unification of theories is very important, for a human factor: no human engineer, in his right mind, will listen to a theorist, if he disagrees with another theorist, who thinks they know better. He will tell them to go away, and come back when they come to agreement. But it is very rare for theorists to come back (chuckles). We did have a twenty year rivalry about these theories, but now we realize they are the same. Why is this important? Process calculus, verification logic, algebraic laws, are the basis for all the tools we have for helping us reason about our programs. Process calculus = operational semantics, for implementation by compiler/interpreter/hardware. Verification logic was designed to be useful for proof tools and program analyzers. And how do you make sure if you have written an eight thousand line compiler, that those two large chunks of software, are actually dealing with the same language? You'd better base them on the same theory, because otherwise there may be hopeless mismatches. Preferably, you should show the theories on which they are based are consistent, and you should conduct that proof before you have written that eight million lines of code, because I would not want to check the consistency of a verification tool and an implementation tool, after they have both been implemented.

Thank you very much for your attention.

Q: (Keshav Pingali) Back in the 60s and 70s, your work and Djikstra for proving properties of languages, suggested we needed to use structure programming rather than GOTO. That transformed how we wrote sequential programs. Has this work for parallel programs, told us how to design parallel programming languages?

A: I maintain my original lifetime goal: the needs of verification need to be balanced with the needs of implementation, the contract between the implementors of the language, and the implementors of the proof tools, and the actual programmer who is actually going to use the program and the tools. What I've learned in my study of concurrency, although the laws that I would propose as algebraic laws, and the basis for verification and implementation, although they're the same for many different forms of concurrency, there are many different forms, and the way in which the forms of concurrency are defined and used these days is not by designing new languages, but designing new design patterns for the use of existing languages. The way concurrency is currently built into languages, is what hardware people did to implement it with cache memories and complicated devices. The only way in which we can use these very elaborate schemes is by observing protocols and conventions, which, provided everyone writing the program have observed the same conventions, it's possible to verify reasonably simply. I'm shifting from design of PLs, to design of design patterns: protocols and rules and an API which tells you how to correctly use that design pattern. If you use the design pattern, the rules will give you structural correctness. This is good news for the research community, since we can establish the validity of design patterns, before they are used by many different people who get into trouble. So logic and algebra is a means of defining, adpating tools to check observance of the design patterns. Rather than as influencing the design of PLs themselves, which I don't believe is likely to happen in the near future.

Q: Let me press more. If we go back to sequential programming, we use if-then-else, for-loops. In parallel programs, what are the patterns we should be programming with? What are the equivalents?

A: I think I've shown you: it's the single parallel operator, which is of equal importance to the sequential operator and has nice laws. That's the way to think about it, before you introduce all the complicated weak memory models... I should confess that, as I did write in the beginning, I've left out all the complicated things; all the basic commands, 100s of them. I maintain and hope each command can be specified fairly simply as a proof rule or implementation transition (it doesn't matter which), but I haven't shown that. When you do that, that's when you introduce all the complexity, when you describe what the basic commands of your language do.

Q: (Phil Wadler) John Reynolds put us all to shame by inventing separation logic at the end of his career. I hoped at some point I could unify and stop making new things.

A: I won't hold you to that. It's very nice to retire too, and I shall be doing it some time soon.

Q: I'm glad you didn't do it before I did this unification.

A: I feel very fulfilled by that. I worked on unification explicitly with my colleagues for over ten years before we published the book. The book... I've dated it now. I've not found the right way of expressing the unification. The main point of the book, take algebra, the relational calculus model, defining the unification of theories and programming. The algebra is much simpler and more general than the model could be.

Q: I want to ask how this handout could correspond more closely to theories. What about the connectives connecting things in separation logic?

A: Vertical bar is very much like a separating conjunction, but for programs.

Q: For CCS/CSP, an a-bar communication combine to form a tick.

A: The Milner laws I showed you, the version of the modularity law, he requires that the Q and Q'...both occur simultaneously. That is a specific use of the law I've introduced to him.

Q: So if you add such a law, it becomes equivalent?

A: What he has done is a specialization of my law. I allow Q and Q' to be arbitrary programs, while Milner says they have got to be atomic.

Q: () Your concurrency logic naturally gives rise to executions where all of the concurrency are properly nested. But many of us use fork/join which does not give proper nesting. I wonder if that unstructured concurrency of fork/join is analogous to GOTOs, or if that's too much of a stretch.

A: I think it's analogous. It's a barrier; it's perfectly possible to define fork/join as primitives, and have the convention they are always the first actions of any new thread.

Q: Such as how we can program in a structured way with GOTOs. So would you say we shouldn't use fork/join.

A: I've become a lot more tolerant. I don't think theorists should be telling engineers what they should and should not do. I've come to have a great respect for the engineers who actually do things, and master the complexity that we as theoreists deplore. But I also think that theorists have a duty to plumb the horrible depths of things that have actually happened. When we say that we recommend you shouldn't do this, we should give a clear demonstration that we know exactly what the consequences of what you do. Like the dentist who says, "Well, I could mend that tooth." The theorist should know what is required to use a complicated feature safely, and recommend something else should be used instead.

#pldi #pldi14 #tony hoare #concurrency

Future of PLDI panel

Why a panel on the future of PLDI? The discipline has evolved. We would like to have selective PL conference have a positive force, without trusting an "invisible hand". We want to put our hand to guide the direction.

If you look at the past 27 years, much has changed in terms of research problems, composition of topics we present. Importantly, the number of submissions has grown significantly. While for some the quality of a conference is related to the selectivity, if you raise the selectivity too much, you will start rejecting papers which are good. That's bad for science, and bad for the people doing the work. Additionally, journals have declined--they are less influential, and less likely for people to focus their best work on. Often, we never get around to polishing a paper for journal submission. Some people say the conference is our journal, but it's not the same, typically there is only one submission.

Two main questions. How should PLDI change in answer to an evolving research environment? (Should it?) What should be the relation of PLDI to other top PL conferences? Overlap? What should distinguish? What should we do about the increased number of submissions? Should we change the review process? (Two rounds?) What should we do with journals?

If we say something interesting, we'll try to write a short document and submit it to the SIGPLAN committee. In order for us to catch the conversation, we have two volunteers to scribe. Milind and Ryan Newton.

Blackburn, Krishnamurthi, McKinley and Walker, then open floor.

B: When David invited me on, I said, "I don't have any opinions, only facts." So what I am showing you right here is some grounding information. I have been digging around the corpus of all SIGPLAN publications for the last seven years, and I was doing that for a different reason. Here is a word cloud which reflects the words that appear in PLDI publications over the last six years, after subtracting the SIGPLAN corpus. (it is funny that aiken is on the cloud). Now here is POPL. (chuckles). Something that doesn't show is the deviation. Now, you can see that POPL has deviated much further from SIGPLAN, whereas PLDI is closer. It's about a factor of two.

(Wadler) Now by size do you mean area or height?

It's font size. These are scaled, so it's distorted by squared. So PLDI and POPL are about the same.

Now what do they not like? Here is PLDI, and here is POPL. (laughter) And if you scale them relative... as you can see, there is a big difference between what's hot and what's not.

I also did pairwise paper similarity (for different reasons), I took all papers, analysis of full text, keywords, references, and then did similarity to see how similar a paper is to another paper. Then I took a given paper and looked at all the othe rpapers, and found for each SIGPLAN venue, what the average score was. It's important to remember it's the average score. If I had a GC paper, you'd expect the best average score is ISMM, but that's not saying the closest match might not be PLDI or OOPSLA. So if you take that to each of the venues, where do they wind up?

Graph. He clustered based on things which are closely related. Here is something curious: only 25% of PLDI papers, the best fit is PLDI. But this is not so surprising: if there are architecture papers, it will go to ASPLOS; if it's GC, it will go to ISMM. So the fact that PLDI is low makes it a diverse conference. What does POPL look like? (chuckles) It's clearly different. And then here is OOPLSA, and ASPLOS (note, anything that is OS or Arch, it is going to best fit here, since SIGPLAN doesn't have anything else).

Next, I want to show how things have changed over time. It's been pretty table. In 2012 there was less POPL, but this is basically noise. POPL has also been fairly stable. OOPSLA on the other hand, seems to be changing from OOPLSA to ASPLOS.

Walker: There's another conference called POPL, and this deadline is coming up (earlier than last year). I hope despite differences in word clouds, I hope people are submitting to POPL. I'm interested in the relationship between the two. I think different people form different communities have different perceptions about these conferences. We were just discussing amongst ourselves. I think it would be a disaster if one fragment of the PL community considered themselves PLDI-only practical, and another fragment POPL-only, because I think there is a very large middle that needs both theoretical and empirical evaluation. I have some data about what tends to get submitted to POPL and what is accepted. Some traditionl bulwarks (static program analysis, type systems, semantic models, functional languages), OO languages nad dynamic languages, much less, and compiler optimization shows up less, though when a POPl person clicks on compiler optimization and design, they may mean something very different from what a PLDI person might mean.

Also I have an acceptance rate ratio, but I wouldn't read too much into this because some are very small numbers (ezyang: you should have added error bars). OO in particular looks really low, but that's because not many were submitted.

To end my spiel, I wanted the authors at this conference to fill in a little survey. If you have a paper at PLDI, suppose you didn't get it done in time, nad you had the opportunity to submit it to POPL. Would you? If not, why not? (Is POPL too theoretical? Is it not in POPL's scope, since the scope is too small? Will POPL not appreciate it? Would they just prefer to live in a different community?)

Krishnamurthi: Axiomatically, I don't think there's been a more exciting time since 1960, which is when we had ALGOL, Lisp, COBOL come out. Every company is in the PL business. Therefore, I think it's unproductive to be talking about POPL over here. POPL's great, but even if it's not, it's going to keep doing what it does. I think we should talk about what we should do about the large number of submissions; PLDI and OOPSLA, is there maybe one conference there? Are there some differences? Are you nervous? If PLDI really is that different from POPL, why is there only one PLDI a year? Maybe there should be more. What about our reviewing resources? We throw out all those reviews and start again. I know some authors have a ping pong, where every PC has a completely different opinion. We're doing this to ourselves, and we're saying it's OK to throw out everything and start again. Maybe we can think of a matching system. Say where you're willing to let the paper come out, and they decide what conferences you can appear at. I think it's instructive and productive to think about that. Multiple instances of PLDI. I think we should leave POPL off the table.

McKinley: I want to talk about the topics we cover in PLDI. It's important to let a thousand flowers bloom. Historically, parallelism was not supported very well at PLDI, especially when I was working on it. Now you see 25% papers on parallelism, and we're not really ready for it, we haven't given the rest of the communitee the tools they need. PLDI needs to stay as broad as possible. I think we could improve in security. People are proving things about security, but they are not proving things in the languages people are writing things in. To encourage things, you can always change the CFP. Invite external people onto the PC. While security is one example, I think there are many such examples. One additional point is this up and down of things that are popular. For example, right now it's out of favor to do optimization, and we see CGO is the new conference. Now that Moore's law is dead, efficiency and performance is something people are thinking about. We could have done better, even though we are one of the most general communities. These are opinions rather than facts.

Laris (EPFL): Something that's bothered me is we haven't really taken advantage of modern technology to disseminate research results. We're still publishing papers that look like the papers that I used to do, except maybe longer, on a memory stick. This bugs me: the latency is very long, and people are not using the modern technology, like blogs and other techniques to get results out faster. SEcond, we're still tied to the very long paper, which is a journal article format. Most have 1-2 page of new idea, 1-2 page of experimental results, and we have to wade through 12 pages to get them. So how can we join the 21st century of information dissemination.

Walker: I think the AEC is a nice thing that we have done. (Though this is not quite what you were talking about). (Though this is not quite what you were talking about). (Though this is not quite what you were talking about). (Though this is not quite what you were talking about).

K: Who could object. As long as we're stuck with ACM, where are the SIGPLAN people here, leadership sent an open letter for people running for positions, asking various questions, much less sophisitcated, but aobut open access. These candidates had positions on these? No, they wrote a unified letter which said absolutely nothing. The difficulty here is enough people need to infiltrate the ACM, or we have to dissociate from the ACM. The ACM is an embarrassment to us for its ability to give us technology. I shudder at live papers. We haven't considered it, because the technology constraint is the ACM, not th web.

(): When I tell students what to expect from reviews, I say its a random process. When some of my papers don't get good reviews, they often missed the point. When I submitted papers to workshop, I thought I got really good reviews, and it was useful. So this is the problem with a very broad conference. I often get a paper that I kind of know something about, but I haven't followed it for the last five years. I'll write a review but I know it's not the best. Because we have so many areas but we are focused on such a small fraction. Perhaps we can datamine related work or something to find reviewers who can give good reviews, rather than general broad reviews.

M: High quality reviews are the most important reason we submit to conferences. So that is the most important question as a conference organizer to address. Related work is possibly a step to better informing the process, but also self-identifying as an expert. Have the external reviewer community be bigger? It requires a lot of attention from the PC to get it right. If it works, it's great; otherwise you get random reviews. Even when I submit, I see a range of things. When you're organizing a conference, that's the most important thing to do. Let's bring some modern tools to the process.

K: This is why I'm trying to move this modest proposal about the PC. We're supposed to review 20 paperS? That makes no sense to me. It's kind of bizarre, where the numbers are revolving around the number of submissions. We need a new communal process, because in the end the community is involved. Maybe we should give more credit for the work they do. But the balance is to avoid having every little subcommunity from accepting every paper they care about, but maybe that's the role of the PC.

B: On a more pragmatic note, broad tension is diversity and specificty. Diversity makes the experience researcher. How to address expert review? We have a much larger external review community. I also did something Keshav did, which was look at previous year submissions as the grounding for where we should be finding PC members, because if you always look at acceptances, you have a rut. The reason I did all this analysis, I wanted to identify people who were doing ....

M: Here is one point about generality versus specific. If you have one topic where a lot of people ar focusing and going in one way, and you want to do something different, the broad community works in your favor because established wisdom is in one direction and you're in another: that's conflict. Even in a narrow community breadth can lead to benefits. Tension for getting these things right. It's not just having the right aspects, but having youth so they're not as entrenched.

Ed deVries (Cornell): Content based question. you said it's exciting to be in PL, since industry is developing new PL. I think that's right, and it shows we need new PL, because when people build these things, it's not because they want to do it. But this design is pretty disconnected from the academic setting. There are people who are involved, but we're not contributing enough to the process. Industry should not be setting up strawmen for us to knock down. Who in SIGPLAN is going to do that? A conference with Programming Language Design might seem like a place for that to happen. Historically, OOPSLA used to do a fair amount of language design, maybe slightly flaky, but innovative, and it had impact. They're now moving away form that. How can we make room for those kinds of papers.

W: We always run into here, the problem with evaluating design papers. A theorem is right or wrong, a graph shows your 20% benefit, but we don't have good ways to evaluate design. This isn't a reason to say we shouldn't accept it, but we should think about how we set the criterion for the acceptance for those types of papers maybe differently. Few people can afford to do user studies these days. I ompletely agree that is missing from PLDI and POPL (even though we're not talking about POPL (laughter))

(): We only use it to submit papers and accept... we should be able to find... JavaScript is just showing up at PLDI papers. We should have languages like Go. There's been a way to write... get some sort of forum everyone here, so new languages people are thinking about, get it here.

K: How many have gone to StrangeLoop. (Something like 20% of people raise). There are nice venues for designers of new languages to go. There are now professional conferences which are full of PL, quite competitive. Are they going to colocate with us? Well, there's some new language like X, well, there's nothing new... but this is the new normal is now a good baseline to work from. Is it going to lag? I ugess os. Have we accelerated the rate of closures? I guess that is kind of a success. I've been on PCs where people tried to do it, Phil had applets/thislets, but we ended up having acceptance rates of 0%. I don't think the conventional mechanism will work, and it's hard to incentivize people to come for this. Unless there is a conscious effort, we have to convince they will get some value.

(): The issue about evaluation is absolutely real, it's real in other communities, such as security. An attack paper is self-validating, but defense is hard. But this is an element of a drunk searching for keys under the lamppost. I worry we fall into that.

(): Many people are troubled by that.

W: If I do a type system paper, I would have a type safety proof, and then an evaluation checkmark, but it wouldn't say anything about usability.

(): you want to be as inclusive as possible, while making it usable for other people.

M: Follow-up: historical perspective on new language dev, it's not generally our community whose been inventing new language. They have some new piece of hardware, thing to do, the current language doesn't work, so we get things like JavaScript, GPU languages, at the point those things start we should jump in and help people solve those problems, then we'd have better synergy than just denigrate JavaScript. If we had jumped in when JS appeared on the horizon, we should have contributed

Keshav Pingali: As the PC, I was the guy who asked you to run this panel. I have some comments. Before we start talking about solutions, we need to define problems. We may have ideas about what are problems. Personal perspective: I have three problems. Coming from Texas, we had a governer he wanted to close three government agencies, but after naming one, he couldn't remember the other two. I want to get away from the focus from POPL. The first problem is growing fragmentation, even though it's a small community. Go to SIGGRAPH, or security, they're just always a magnitude bigger than we are. It's a shame if a small community like ours fragments. There's ICFP that has its own clientele, David may disagree with it, but POPL attracts certain theoretically oriented people; I have not gone to POPL for many years; PLDI has its own clientele. We have a lot to learn from each other; I think more theory oriented people would improve from looking at what we're interested in. Let's reduce fragmentation. Second problem is an existential angst about what they represent. I talked to Jan Vitek, the D and the I has disappeared from PLDI, we talk more about the ... Same with OOPSLA, what does OO mean? Everyone says great idea, we're going to have ADTs, inheritance, blah, so they renamed to SPLASH, and he's broadened the call for papers, but I see the same sort of feeling, what is OOPSLA all about? This is what I mean. Finally, I wanted to note, it would be wonderful if we had multiple opportunities to submit to conferences. We submit to PLDI, and now we either have to send it to a lower tier conference or wait another year. In the security community, they have four conferences which are at the same level (meh) and if you don't get in, you can submit three months from now. This is better for grad sutdents and junior faculty problems. One way to solve all these problems is to do something similar to security guys: have three PL conferences (not POPL, PLDI, ICFP), call them summer, spring and fall, and negotiate, whatever, or four, or two, and we have broad PL conferences every year, instead of the current structure, and the PCs would be chosen to have broad representation to have type theory to GPU programming (which I have not seen in a long time.) We encourage everyone in the community to submit, multiple tracks, different areas, whatever. There are many advantages to that structure and I want to know what people think about it

() I wanted to advocate for VLDB, so yes I'm in favor.

Rinard: I support what Eric said. More submit chances, the better it is.

Michael Scott (Rochester): Observation and pose a question, with Jim Morrison's comment about modern technology, shorter turnaround, it also gets us the opportunity to have arbitrary length versions. Not for reviewing purposes, but for archival. As long as everything is coming out that a six page paper can't have 30 page appendix. More directly to this question of new strategies for publications out there, there is a lot of work, not only in DB community, but architecture community, not that we're going to set a bar, but we're going to do a quality, and take as many that pass the bar. We had 50 papers at ASPLOS. Here in Europe, ACM TAO has had rolling submissions that's intriguing.

K: My concern with this rolling submissions is we kind of want three outcomes. we want accept, reject and give us something more. We have a response period, but it's not actually changing anything. You're not allowed to say anything technical new. I understand this constraint but I don't see why it's a global constraint. The problem is it's razy to do wha tthe security community does: they're crazy, two months review, the author gets a reject, they submit to a new PC. It could not possibly have changed that much in a week. And all this work the PC had done was thrown out. This is why I want a continuous process.

(): Astr... has an interesting model. You submit a paper to a public website, and then you have a year for public comment, and then it goes to the conference. So you can't do bait and switch papers, because there are lots of things there. People make their name in this process by giving very good feedback. This process might be good, beyond anonymous reviews. To see if we understand soething. And then we don't have to travel three times a year somewhere. There's one conference. Rolling process. One thing people are missing. There are some papers which need discussion, but you either get into PLDI or not. If you get in, you've won the game and you can go home. That's not how we can.

(): How to evaluate design? I think you can't. I don't think there's a bigger bullseye to put on your paper, than evaluation on design. Anyone can find a fault. People don't believe the benchmark, whatever. Design is different and people have to evaluate and discuss it differently. PLDI model is too selective here.

Livschits (MSR): I copublish in security, and have been on submitted and reviewer. It's not perfect, but it's far superior to this model. It's not quite rolling review, you get feedback, there is a general sense of continuity. PC members are retained. It is better.

Armando: One big issue is our conferences are serving three purposes which are at odds. We want to learn what cool new things are, we want a gold star on the resume, and then we want an archival document which describes new tech. These three are at odds. The restriction of no new information in rebuttal, weird for gold star, but it's no sense for best paper and best talk.

Steve: Metapoint, as being a PC chair, the inertia is so big, it's hard to make a change. I think we should be doing right now is how to move more freely. I think we should commit to you.

#pldi #pldi14

Matthew Hammer - Adapton: Composable, Demand-Driven Incremental Computation

Limitations with existing systems: eager trace computation does unnecessary work sometimes. Improve experience with incremental systems.

Interaction pattern: laziness. A few spreadsheets, I look at only one, then go to another one. Switching: I setup a dependence to A then switch to B

(sorry, I was jetlagged and couldn't stay awake)

Q: Is it psosible to create recursive computations?

A: I.e. a thunk which references itself? No. In the implementation, it might be, but formally it's not.

Q: Do you know how this compares to reactive programming?

A: I alluded to this, esp. FRP. You have a notion of stream as time varying label. I have trouble with this question because there are a lot of variants of FRP. High-level, it's not. We're doing something that might be more general than FRP, but I don't think you can apply that here.

Q: You talked about having no assumptions about applications; you want to incrementalize all applications (i.e. support general recursion). Do you have speedups for bigger applications?

A: We are reimplementing in Rust, but all we have are smaller.

Q: For any of these, did you implement the SML approach?

A: This is subtle. You're asking, why didn't we use self-resistant computation? There's a lot of apps we haven't done yet. We reimplemented to do an apples-to-apples comparison in OCaml. Key allocation is important.

#pldi

Paolo G. Giarrusso - A Theory of Changes for Higher-Order Languages - Incrementalizing λ-calculi by Static Differentiation

Paolo motivates incrementalization, and walks through an example of incrementalizing a sum operation. In the presentation, he uses special "circle" operators (circle plus and circle plus) to represent calculating the diff, and applying the result of the differential to the old result. The language here is pretty suggestive.

So as it turns out, these operators are going to play a key role in the algebraic theory they're interested in. Types: V (base set), delta v (change set for v), (+) : (v : V) -> delta v -> V, (-) : (v : V) -> V -> delta v, obeying some laws (e.g. cancellation)

NB: change set might be different from base set. Example: for natural numbers, delta v is integers z such that z + x is a nat. (-4 is not delta 3)

How to incrementalize? Static differentiation? (ezyang: Is this the same as symbolic differentiation?) Correctness: [[ f (a (+) da) ]] = [[ f a (+) Derive(f) f da ]]

Well, actually, we don't have dependent types

Well, watch out: it's not derivatives, but function changes. Not f' x dx = f x2 (-) f x1 = y2 (-) y1, but df x dx = f2 x2 (-) f1 x1 = ... Function changes are necessary for application, since the function could change. delta (T -> U) = T -> delta T -> delta U

Setting: language plugins which define datatypes and change structures / primitives.

They did a case study on map reduce

Correctness proof uses logical relations!

Q: How do you compute the derivative of a function? E.g. for a fold function, it has to assume the input is a valid group.

A: We provide a primitive which assumes that the it gets a group operation, then the designer of the language plugin provides the derivative. (So fold is a primitive?) Yes, fold is a primitive. Since we are a higher order language, we can have a few primitives, such as fold, and then other operations can be derived from it.

Q: For some functions, derivatives won't be defined. Does your system allow the developer to recognize them? Or does this have to be recognized. (I.e. mathematical derivative?)

A: Well, watch out, this is different from mathematics. Actually, the derivative always exists, the real question is whether or not it is efficient. Sometimes this is impossible. But you can often provide more general primitives. We are limited to simple examples. We don't have support yet for caching intermediate results.

Q: Could you say more about the description of change in datatype? You have support for bag and map. Can I have changes for the values contained?

A: Something better to do, a change to a bag, is add/remove/change, and the change is just the change type for T. This is more efficient. We did not do it in the case study, since it still requires (indistinct), there is past work which we should reuse, Liu&Teitelbaum '95

Q: What is the programming interface. Can I use this to take derivatives in my program?

A: Our language is embedded as a DSL. Inside the DSL, you don't, but outside you can take derivatives and apply changes. So... yes.

#pldi #pldi14 #incrementalization

PLDI Artifact Evaluation Session

52 accepted, 20 artifacts. It's a bit questionable that only 12 passed. That's low--maybe a little disturbing.

(this ends the slides)

(SK) Yes, so some papers mention version numbers.

(JV) And some people do VMs so that you can have true reproduceability

(SK) But if it's that sensitive, why do you trust the result?

(JV) If there's a range, and you're in the tolerance, that's OK. If you get varying numbers, which are not discussed in the paper, well, that's questionable.

(AS) Maybe in addition to this, maybe as an overall goal, say, "Give us all the tools to reproduce the evaluation, or tell us what the gaps are, which are infeasible"

(SK) It wasn't good enough?

(Carl Bolz) Did it make a difference?

(SK) I can't think of a case.

(Paolo Giarrusso) Was there a phase for all papers?

(EE) We only solicited ones where there was a question.

(George Kastrinis) Given the code which cast doubts, do you think the process should change in the future and take artifacts into consideration?

(PG) Yeah, didn't we just lose this data (from the AEC)?

(Ben Liblit) Even the bad papers!

(SK) Here are the things people were concerned about http://www.artifact-eval.org/motivation.html (Design Criteria). (Singled out the exposing artifacts to the general public is unfair to the group)

(ET) About exposing artifacts and people poaching your research. Do we really want to a community that has this fear?

(JV) Well, in Biology, you explain how you did it, but you will not share the data. It cost you millions, it's precious, you will not share it.

(EE) Well, in computer science, things tend to be very open.

(SK) What kind of community do we want to be?

(ET) Think about informal publications, I have a research blog where I write lots of details. Maybe two people read it.

(SK) I didn't know you had a blog!

(SK) My view is if someone builds on it, I get citations without doing any work. That's great!

(JV) The reward system is not quite right...

(SK) Can people write papers based on your work? Then you will get your next NSF grant for free. "This is something people will use!"

(ET) Do people ever contact original authors?

(SK) Never.

(Jean Yang) How much work is evaluated on how easy it was to setup environment? Should we build environments to improve that?

(JV) It's a process. People don't know how to package these things. Every year we'll know better.

(SK) And in the future you might get rejected; you won't get an easy pass.

(JY) If we make it easier to submit artifacts, will more people do it?

(SK) Well, we provided links to many tools for doing this. See the section "Code Artifacts Packaging." This year we saw aa lot more Docker instances.

(GR) That's why one suggestion is to provide a VM image.

(RT) To have someone compile my code, that is not easy!

(SK) That's why I really like the example where the VM did not work, but the compiler image did.

(SK) We would like to make it a standard part, but there's just not enough time.

(JV) We tried a "kick the tire" thing. First three days, try to get proof of life (build it.) If you get stuck...

(GK) Are there valid reasons for not submitting an artifact?

(JV) Don't knock proofs as artifacts. I have seen papers which have claimed to have proven something, but authors haven't typed it up.

(SK) It turns out dogs eat everything: codes, proofs, manuscripts...

(JV) I don't think this is about fraud.

(SK) But I've been there.

(JV) Do they have any data? How did they compute that data? There is some output of simulation, a complicated set of scripts which made the graph, that is very valuable.

(AS) I agree, but I wish the guidelines for the AECs had ideas for how to deal with this. You don't need the whole thing.

(SK) We got some feedback: it was helpful to have someone outside the group run the damn thing. Multiple people! This automatically helps.

(CB) It also helps you find holes in arguments.

(SK) Some people had hardcoded usernames... but if the only thing I have is my machines, well, maybe you forgot. It's hard to think of everything.

(RT) The topic, is to verify claims? Then could another way to do it is have the AEC work with the authors, rather than independent.

(BL) If evaluators are grad students, you don't want them getting into a fight with a tenured professor.

(EE) The chairs acted as intermediaries between committee members and authors.

(JV) Signed NDAs...

(BL) Many people have contacted authors of papers...

(SK) People started to nod...

(EE) There are two papers that have attached artifacts going into digital library. I think this is the very first time you can do that. Positive effect!

(BL) Trying to get and use artifacts has failed so often. I just hope... maybe it will be better.

(JV) In the long run, maybe they will submit the code for external review...

(SK) We have this problem camera ready three months before.

(JV) That could change.

(PG) What about loop stuff?

(SK) First stage?

(JV) Second stage. But we could do it at the end of first stage, because most papers.

(CB) It's pipelining.

(SK) In any case, you had one for testing.

(ET) So should we ask people to submit this?

(JV) It's not appropriate for all papers. Requesting something... extra work? We don't want it. It's god to have independent evaluation...

(SK) that's the really troubling thing. "Why didn't you ask?"

(JV) The reason you have to say it often is because people don't ask.

(RT) There are two things you're trying to do, they need to be separate. Verifying claims, and verifying there is a useful tool (helping others.) There are different things.

(JV) Our hope is pushing on one push the other.

(SK) I think that's true. It makes it all a little better.

(CB) The virtual machine makes it possible.

(SK) As an example, I'm trying to build on something ....

...

(JV) We're looking for the least work. You did something, you did experiments to convince people.

(RT) Well, it's tough...

(SK) We hope it's similar enough, so having a third processor is unlikely to be helpful.

(RT) Well, making my tool useful... if I'm not going to use it...

(SK) The tool paper on the tool is different from the result of using it.

(?) If I can't reproduce your story...

(ET) We all hope we're doing work we think is valid, I don't want to shame my advisor by not earning a sticker.

...

(SK) Now we require a webpage. I think this is really important.

(JY) Is there a public announcements?

(EE) You have to look at the frontmatter.

#pldi #pldi14 #aec

#pldi

Trending Tags

Recently Viewed Tags

#pldi