YQL+ on Amazon Serverless
Yahoo has quietly open sourced the YQL+ engine, parser and SDK. It’s a powerful platform for querying and orchestrating data, and if you’ve used their YQL platform, some of the concepts will be familiar. But in YQL+, your programs live in your application rather than being sent as part of a request, and you can expose them in a RESTful (or any other) manner.
YQL+ implements a query language for services. A YQL “program” looks a little like SQL, but its sources can provide data from anywhere. For example, you could have a source that pulls data from a webservice, or redis, or MySQL, or DynamoDB. Your program may query multiple sources, join them, have the output of one as the input of another, or a complicated combination. The YQL+ engine will figure out which sources it can execute in parallel and then provide the result(s).
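To make the "parallel sources, then join" idea concrete, here is a minimal plain-Java sketch. It does not use the YQL+ API, and it is not how the engine is implemented; the class name, the two fetch methods and the in-memory data are all assumptions made up for illustration. Two independent "sources" are fetched concurrently and then joined on stockId.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;

// Illustrative only: plain Java, not the YQL+ API. All names and data are made up.
class ParallelSourcesSketch {

    // Stand-in for a database-backed source: stockId -> quantity held.
    static Map<String, Integer> fetchHoldings() {
        return Map.of("2318.HK", 3000, "0354.HK", 3000);
    }

    // Stand-in for a web-service-backed source: stockId -> latest price.
    // "0001.HK" has no matching holding, so the join below drops it.
    static Map<String, Double> fetchPrices() {
        return Map.of("2318.HK", 59.95, "0354.HK", 4.33, "0001.HK", 1.0);
    }

    // Starts both independent fetches concurrently, then joins them on stockId.
    static Map<String, Double> positionValues() {
        CompletableFuture<Map<String, Integer>> holdings =
                CompletableFuture.supplyAsync(ParallelSourcesSketch::fetchHoldings);
        CompletableFuture<Map<String, Double>> prices =
                CompletableFuture.supplyAsync(ParallelSourcesSketch::fetchPrices);

        return holdings.thenCombine(prices, (h, p) -> {
            Map<String, Double> joined = new TreeMap<>();
            for (Map.Entry<String, Integer> holding : h.entrySet()) {
                Double price = p.get(holding.getKey());
                if (price != null) { // inner join: keep ids present on both sides
                    joined.put(holding.getKey(), holding.getValue() * price);
                }
            }
            return joined;
        }).join();
    }

    public static void main(String[] args) {
        System.out.println(positionValues());
    }
}
```

In a YQL+ program you would only write the SELECT and JOIN; the point of the sketch is the kind of plumbing you no longer write by hand.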
One use case I can imagine is using a YQL+ application as the public interface for a mobile app. It’s good at orchestrating data, querying multiple downstream sources, joining data, perhaps executing a little business logic the apps have no business knowing about themselves, and returning only the necessary data.
Here, I have combined YQL+ with Amazon Serverless to create microservices. Each endpoint is its own lambda function. The source for this example is located at https://github.com/wabzqem/yql-plus-serverless-example
It works as follows:
Client request -> Amazon API Gateway -> lambda handler which executes appropriate YQL Program -> downstream source(s) -> back to client
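The handler step in that flow can be sketched roughly as follows. This is a simplified illustration, not the code in the repository: it uses no AWS or YQL+ types, and ProgramRunner, EndpointHandler and the fake runner are hypothetical names introduced here. The idea is just that each endpoint's handler owns one program, passes the request parameters in as program arguments, and returns the resulting rows.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: no AWS or YQL+ types are used; every name here is hypothetical.
class HandlerFlowSketch {

    // Stand-in for "compile and execute a YQL+ program with these arguments".
    interface ProgramRunner {
        List<Map<String, Object>> run(String program, Map<String, Object> arguments);
    }

    // One handler per endpoint: it owns exactly one program.
    static class EndpointHandler {
        private final String program;
        private final ProgramRunner runner;

        EndpointHandler(String program, ProgramRunner runner) {
            this.program = program;
            this.runner = runner;
        }

        // Request parameters become program arguments; the rows become the response body.
        List<Map<String, Object>> handle(Map<String, Object> requestParameters) {
            return runner.run(program, new HashMap<>(requestParameters));
        }
    }

    public static void main(String[] args) {
        // A fake runner that just echoes what it was asked to execute.
        ProgramRunner fakeRunner = (program, arguments) -> {
            Map<String, Object> row = new HashMap<>();
            row.put("program", program);
            row.put("argumentCount", arguments.size());
            return List.of(row);
        };
        EndpointHandler update = new EndpointHandler(
                "PROGRAM(); CREATE TEMPORARY TABLE updated AS (SELECT * FROM stocks); "
                        + "SELECT * FROM updateStocks(@updated) OUTPUT AS result;",
                fakeRunner);
        System.out.println(update.handle(Map.of()));
    }
}
```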
The files in the “function” package are the ones mapped to each AWS lambda function. The files in the “sources” package are the YQL+ Source classes, which fetch/manipulate downstream data. Table names are mapped to Sources in SourceModule.java. And the methods in the source classes are annotated with @Query (more on that later).
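Conceptually, the table-name mapping is a lookup from a name used in a program to a source object. The sketch below shows that idea with a plain map. It is not the contents of SourceModule.java: the Source interface here is a local placeholder, and UpdateStocksSource and PutStockSource are assumed class names (only StockSource is named in this post).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a plain map standing in for the table-name bindings.
class SourceRegistrySketch {

    // Local placeholder; the real engine has its own source type.
    interface Source {}

    static class StockSource implements Source {}
    static class UpdateStocksSource implements Source {} // hypothetical name
    static class PutStockSource implements Source {}     // hypothetical name

    private final Map<String, Source> byTableName = new LinkedHashMap<>();

    // Registers a source under the table name used in programs.
    void bind(String tableName, Source source) {
        if (byTableName.putIfAbsent(tableName, source) != null) {
            throw new IllegalArgumentException("Table already bound: " + tableName);
        }
    }

    // Looks up the source behind "SELECT * FROM <tableName>".
    Source resolve(String tableName) {
        Source source = byTableName.get(tableName);
        if (source == null) {
            throw new IllegalArgumentException("Unknown table: " + tableName);
        }
        return source;
    }

    // The three table names that appear in the programs in this post.
    static SourceRegistrySketch defaultRegistry() {
        SourceRegistrySketch registry = new SourceRegistrySketch();
        registry.bind("stocks", new StockSource());
        registry.bind("updateStocks", new UpdateStocksSource());
        registry.bind("putStock", new PutStockSource());
        return registry;
    }

    public static void main(String[] args) {
        System.out.println(defaultRegistry().resolve("stocks").getClass().getSimpleName());
    }
}
```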
This example has the following endpoints:
GET / - fetches the current stocks
POST / - adds a stock to the portfolio and fetches data for it
GET /update/ - updates the current prices in the portfolio
See README.md in the repository for deployment instructions.
To get an idea of what’s going on here, let’s see how the update endpoint works. Here’s the YQL program which is mapped to the /update/ endpoint (see net.whatsbeef.portfolio.webservice.function.UpdateStock to see how this gets mapped):
PROGRAM();
CREATE TEMPORARY TABLE updated AS (SELECT * FROM stocks);
SELECT * FROM updateStocks(@updated) OUTPUT AS result;
This program has no inputs. First, a temporary table is created, which holds the output of the stocks query. The stocks “table” is mapped to the StockSource class, and no input is given. The StockSource query method without any parameters simply fetches all stocks from the database.
A critical point here is that the output of this query is passed as input to the updateStocks table. Because the second statement depends on the first, the YQL+ engine runs them in sequence rather than in parallel (independent statements would still run in parallel), and the types are mapped appropriately. The updateStocks table’s source makes several HTTP requests to update the stock data, updates DynamoDB and returns the results.
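One way to picture the ordering decision is as grouping statements into stages, where a statement can only run once everything it reads from has finished. The sketch below does that with a simple loop over a dependency map. It is an assumption-laden illustration of the general technique, not the algorithm the YQL+ engine actually uses, and the statement names are just labels taken from the program above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: groups statements into stages by their data dependencies.
class ExecutionStagesSketch {

    // deps maps a statement name to the names of the statements whose output it reads.
    // Statements in the same stage do not depend on each other and could run in parallel.
    static List<List<String>> stages(Map<String, Set<String>> deps) {
        List<List<String>> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        while (done.size() < deps.size()) {
            List<String> ready = new ArrayList<>();
            for (Map.Entry<String, Set<String>> entry : deps.entrySet()) {
                if (!done.contains(entry.getKey()) && done.containsAll(entry.getValue())) {
                    ready.add(entry.getKey());
                }
            }
            if (ready.isEmpty()) {
                throw new IllegalStateException("Cyclic or unknown dependency");
            }
            Collections.sort(ready); // stable, readable ordering within a stage
            result.add(ready);
            done.addAll(ready);
        }
        return result;
    }

    public static void main(String[] args) {
        // The update program: "result" reads the temporary table "updated".
        Map<String, Set<String>> update = new LinkedHashMap<>();
        update.put("updated", Set.of());
        update.put("result", Set.of("updated"));
        System.out.println(stages(update)); // prints [[updated], [result]]
    }
}
```

For the update program this yields two stages of one statement each, which matches the sequential behaviour described above. Two statements with no dependency between them would land in the same stage.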
Let’s have a look at the putStock program:
PROGRAM(
    @stockId string,
    @quantity int64,
    @boughtPrice double
);
CREATE TEMPORARY TABLE putTable AS (SELECT * FROM putStock(@stockId, @boughtPrice, @quantity));
SELECT * FROM updateStocks WHERE stockId = @putTable[0].stockId OUTPUT AS result;
This program takes 3 inputs, which are pretty self-explanatory. YQL+ maps int64 to Java’s Long type. YQL+ chooses which @Query method to call in your source based on the number of parameters, not their types, unfortunately. The types must still match, though.
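To see why "chosen by parameter count, but types must still match" matters, here is a small reflection-based sketch of that kind of dispatch. It is a stand-alone imitation, not YQL+ internals: the @Query annotation is redeclared locally so the example compiles on its own, and DemoSource and invokeByArity are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Illustrative only: imitates "pick the @Query method by argument count".
class ArityDispatchSketch {

    // Local stand-in for the real @Query annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Query {}

    static class DemoSource {
        @Query
        public String all() {
            return "all stocks";
        }

        // int64 in the program corresponds to long/Long here.
        @Query
        public String put(String stockId, double boughtPrice, long quantity) {
            return stockId + " x" + quantity + " @ " + boughtPrice;
        }
    }

    // Selects the first @Query method whose parameter count equals the argument count.
    // Types are NOT considered during selection; a mismatch only surfaces at invoke time.
    static Object invokeByArity(Object source, Object... args) throws Exception {
        for (Method method : source.getClass().getMethods()) {
            if (method.isAnnotationPresent(Query.class)
                    && method.getParameterCount() == args.length) {
                return method.invoke(source, args);
            }
        }
        throw new NoSuchMethodException("No @Query method with " + args.length + " parameters");
    }

    public static void main(String[] args) throws Exception {
        DemoSource source = new DemoSource();
        System.out.println(invokeByArity(source));
        System.out.println(invokeByArity(source, "2318.HK", 34.45, 3000L));
    }
}
```

With this style of dispatch, two @Query methods that take the same number of parameters are ambiguous, and passing a String where a double is expected fails with an IllegalArgumentException at call time rather than selecting a different overload.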
The second SELECT statement takes the first result from the temporary putTable query, and only the given stock is updated.
The return value of a Source is generally your model object. I’ve used simple Kotlin data classes, and they work fine when passed from one source to another.
And as mentioned, requests/responses are simply standard REST requests. Here's a sample GET:
$ curl -s https://<hidden>.execute-api.ap-southeast-2.amazonaws.com/Prod/ | python -m json.tool
[
    {
        "boughtDate": 1506928655152,
        "boughtPrice": 34.45,
        "currentPrice": 4.33,
        "quantity": 3000,
        "stockId": "\u20180354.HK'",
        "stockName": "CHINASOFT INT'L"
    },
    {
        "boughtDate": 1506928693550,
        "boughtPrice": 34.45,
        "currentPrice": 59.95,
        "quantity": 3000,
        "stockId": "2318.HK",
        "stockName": "PING AN"
    }
]