f06code @f06code - Tumblr Blog

New blog!

Hello, you've reached my old blog. All of the posts are still here, but new ones will be added over there.

plredmond.github.io

Some old posts will also get updates on the new blog as I sift through this site for things worth highlighting.

`stackdoc` to search haddocks offline

There's an updated version of this post at my new blog.

You can generate all of the haskell haddock docs you need offline and search them with a shell function. Since Dash just added Haskell support I decided to share the lightweight approach that I use. The idea should work for both cabal and stack based workflows so long as you've generated haddocks for the package in question. I've written it for stack users.

stackdoc is available as a gist for bash or fish.

stackdoc.sh

stackdoc.fish

Usage is simple.

~ $ stackdoc Usage: stackdoc PACKAGE [COMMAND]

In a project directory stackdoc will search local packages before checking user-level snapshot packages. For example, in a project that depends on GPipe you can call:

~/my/project/dir $ stackdoc GPipe Search GPipe in ~/my/project/dir/.stack-work/install/x86_64-osx/nightly-2015-11-17/7.10.2/doc Found ~/my/project/dir/.stack-work/install/x86_64-osx/nightly-2015-11-17/7.10.2/doc/GPipe-GLFW-1.2.1/index.html

To look up the MVar docs which are in the package base you can invoke:

~ $ stackdoc base Search base in ~/.stack/global/.stack-work/install/x86_64-osx/lts-3.9/7.10.2/doc Search base in ~/.stack/snapshots/x86_64-osx/lts-3.9/7.10.2/doc Found ~/.stack/snapshots/x86_64-osx/lts-3.9/7.10.2/doc/base-4.8.1.0/index.html

The utility of this command comes from being able to launch your browser. The default command just prints Found ..., but you can specify your own.

~ $ stackdoc $PACKAGE chromium

Finally, if you can't find docs for a package it is possible the haddocks have yet to be generated. Try running:

~ $ stack haddock $PACKAGE ~ $ stackdoc $PACKAGE chromium

Incidentally, this was a nice exercise in translating code from fish to bash. fish is way nicer to code in!

#haskell #haskell-stack #stackage #dash #documentation #haddock #programming #bash #fish #shell scripting #stack

Hello! I'm going out on a limb here, but it's worth a shot. I'm looking for insight on how someone might take FFT spectral data and use it to create an obj or similar file.. Right now, the only success I've had was by taking a spectrogram image (b&w) and using it as the depth/height map of a terrain. The resulting object works, but not as well asthe FFT data itself would. Any pointers in the right direction would be appreciated.. Great work, btw!!

Hi, and thanks! :)I think the approach you took, using the spectrogram brightness as the heightmap of a 3d terrain, is basically right.For example if I took the first image below (excepting the vertical bars, which I think are a glitch in the python FFT that I used), and built a heightmap out of it, I might get something like the second image below.**Spectrogram data as an image:*** `x` is time (left side is time 0, it proceeds to the right)* `y` is audio frequency (bottom are low frequencies, top are high)* pixel brightness is audio amplitude (dark is silent, light is loud)![](http://31.media.tumblr.com/c3d6e8725a5479bd744ab46056c8d81b/tumblr_inline_n1p4zlL5kO1r8s6rf.png)**Spectrogram data as a surface:*** blue `z` axis is time (deep is time 0, it proceeds toward the screen)* red `x` axis is audio frequency (left are low frequencies, right are high)* green `y` axis is audio amplitude (down low is silent, up high is loud)![](http://38.media.tumblr.com/a4c83a6f3e0da84370c9e628037be127/tumblr_inline_n35oenwdsp1r8s6rf.png)So in this way the same data can be represented as an image or a surface. (In this case, it’s actually different data, but conceptually it should be true.)Does that seem right to you? Perhaps I’m misunderstanding your technique.

#code #fft #fast fourier transform #data representation

Hi, what editor do you use, especially for Haskell?

I use `vim`, with a few plugins and I keep GHCI via `cabal repl` open in a `tmux` frame.* [syntastic](https://github.com/scrooloose/syntastic) with the linter [hlint](https://hackage.haskell.org/package/hlint) shows helpful improvements and common mistakes* [tagbar](https://github.com/majutsushi/tagbar) with [lushtags](https://github.com/bitc/lushtags) lets you navigate around your code via a sidebar

#haskell #code #vim #vim plugins #coding

spectrogram3d and gltut now buildable!

I've written short buildscripts for spectrogram3d_hs and gltut_haskell-gpipe which basically boil down to a wget and sh build.sh. No cabal hell necessary! Now you can build them from the comfort of your own home. Deets in the readmes!

spectrogram3d_hs

A live 3D spectrogram written in Haskell. It takes sound from JACK, and draws a mesh with GPipe. You can use it visualize audio from VLC, iTunes, or even microphone input (whistling is particularly interesting to watch).

gltut_haskell-gpipe

These are ports of the examples from Jason L. McKesson's wonderful online book Learning Modern 3D Graphics Programming from C++ and GLSL into Haskell. The ports use Tobias Bexelius' OpenGL framework GPipe [1] [2] to express OpenGL buffers, uniforms, vertex shaders, and fragment shaders in normal Haskell code.

new things

In lieu of continuing to work on GPipe projects I'm trying out LambdaCube3D [lc-dsl] for a spell, and with my new job being mostly clojure I expect to be making interesting things with it and posting them here soon. Stay tuned!

#haskell #spectrogram #opengl #build script #update #news flash

clojure's style from haskell's viewpoint

There's an updated version of this post at my new blog.

Starting a new job by learning a new language can be tough. These past two weeks I've been adjusting to clojure after using haskell all year. Returning to the world of dynamic types has been a little rough, and there are some fundamental differences between the lisp and haskell approaches to coding.

Haskell boots

You can chunk a list with Data.List.Split.chunksOf, but in its absence you could write it this way.

chunksOf :: Int -> [a] -> [[a]] chunksOf ct xs = filter ((== ct) . length) . map (take ct) . map (flip drop xs) $ [0,ct..length xs]

This might not be terribly efficient, but it's easy to understand and follows the style I've adopted in haskell.

partial application

composition as a pipeline

These two things dominate much of my code, but clojure doesn't really bend this way.

(defn chunksOf [ct xs] ((comp (partial filter (comp (partial = ct) count)) (partial map (partial take ct)) (partial map #(drop % xs))) (range 0 (count xs) ct)))

Although this code accomplishes the same task, it's pretty unreadable. Partial application as an explicit functon (or macro?) quickly clutters things up. And despite their simplicity, reading s-expressions is about as fun as reading xml. My eyes muddle through all the parens; they don't provide me with anchors to facilitate reading comprehension.

Clojure boots

Well, any clojure programmer would probably know better than to write that way. Since I've been reading clojure code for two weeks straight now, here's a stab at writing it better.

(defn chunks-of [ct xs] (->> (range 0 (count xs) ct) (map #(drop % xs)) (map #(take ct %)) (filter #(= ct (count %)))))

To me, this renamed function feels about as readable as the haskell code was; perhaps even moreso. The ->> macro lets you thread a value through all the following forms as their last argument, top to bottom, which seems like an improvement over haskell's bottom to top pipeline. I've also replaced the clunky calls to partial with a very concise function literal which simply expresses where to drop in argument values. My company's clojure codebase relies heavily on macros, and while they are powerful they take some getting used to.

FP boots

At least I'm finally working in FP! My days of fighting massively flawed in-memory state and poor integration with external libraries are past. Yay!

#haskell #fp #prismatic #clojure #macros #programming #code style #code #coding #functional programming

cabal usage notes

Follow along in bash to see the effects of all the commands in this post.

mkdir ~/sandbox-play cd ~/sandbox-play

Play in a sandbox

If you're not already starting all your projects with the cabal sandbox command, then you might consider beginning today. If you're a haskeller then this is the fastest way to save yourself from dealing with the cabal hell caused by installing all the dependencies for your various projects into the global or user-level registries.

mkdir super-giraffe-game # your project directory cd super-giraffe-game cabal sandbox init ls -la

Now any cabal install commands you run in the project directory will install to a subdirectory. This goes for all the cabal commands in this post. Here we install the utility-ht package to the sandbox.

cabal install utility-ht

You can list what's been installed to the sandbox with the following command. It'll be all the packages after the path to your sandbox.

cabal sandbox hc-pkg list ... xhtml-3000.2.1 zlib-0.5.4.1 /Users/f06code/sandbox-play/super-giraffe-game/.cabal-sandbox/x86_64-osx-ghc-7.6.3-packages.conf.d utility-ht-0.0.10

You might want to unregister packages from a sandbox too. You can use the following command.

cabal sandbox hc-pkg unregister utility-ht

Share the sandbox

An even better habit to start is the use of shared sandboxes. When you have multiple projects which use the same dependencies, shared sandboxes mean that you only have to install things once. Let's change your project directory to use a shared sandbox.

rm -r cabal.sandbox.config .cabal-sandbox/ cabal sandbox init --sandbox ~/sandbox-play/playpen ls . ../playpen/

You can use that second command with the absolute path to playpen from any project directory on your system, and all will use the same sandbox. Sandboxes contain absolute paths in their configuration files. Make sure you create your sandboxes in a directory where you won't mind them staying.

Look before you leap

Always check what will happen before performing a cabal install. This will help you recognize likely cabal hell situations in the future before they crop up. Let's do this for the GPipe package.

cabal install GPipe --dry-run -v

This will print out the required dependencies of the target package in order of planned installation, and it will tell you if each package is completely new or just a new version of an existing package. New versions frequently put you in cabal hell, and you can sometimes reduce them by adding --avoid-reinstalls to a cabal install command.

You can tweak the install plan by including multiple package names or including a package version after any package name.

cabal install containers-0.5 GPipe --dry-run -v # use a specific version of containers

When you're satisfied with the install plan, remove both --dry-run and -v to install. Cross your fingers, because this is where you might end up in cabal hell. Don't actually do this today.

Aside: If you actually do want to install GPipe with haskell-platform: stable 2013.2.0.0 then follow these instructions which have fewer dependencies.

You can also look before you leap when installing the dependencies for a package you've cloned or are building yourself. Simply leave off the package name from the install command and include --dependencies-only, as I've done on the last command in this paragraph.

cd ~/sandbox-play/ cabal get GPipe cd GPipe-*/ cabal sandbox init --sandbox ~/sandbox-play/playpen cabal install --dependencies-only --dry-run -v

Shared sandboxes are a useful tool for escaping the cabal hell that occurs when a dependency fails to compile. You just clone the correct version of the failing dependency, link it to your project's shared sandbox, and fix it.

Get out of hell

Don't execute any of the commands below; they're for a hypothetical fix.

Hypothetical Situation

You installed some dependencies for a package you are working on and it ended with "error" or some other useless dribble. Cabal happily installed all the other planned packages excepting those that depended on the failed package. This means that the actual error is likely not on your screen.

Recommended Solution

Re-run the previous installation command. Cabal won't re-install any of the successful packages. You'll see the compiler or cabal error for the failed package at the bottom of your screen this time. If you are confident, you can try to fix the package by delving into its source:

cd .. # get out of your project directory cabal get [package[-version]] # download source for the failed package cd [package[-version]] cabal sandbox init --sandbox [your project's shared sandbox] cabal install --dependencies-only --dry-run -v # should report no dependencies are required

Now you can fix the source of the failed package. If you think your work is done, try to build with:

cabal build

Once it's working you can cabal install your fixed version into the shared sandbox.

cabal install

Now when you return to your project directory and run your original dependency installation command, it works!

Happy packaging.

#cabal #cabal hell #haskell #cabal sandboxes #sandboxes #cabal packaging #cabal sandbox #cabal sandbox management #sandbox management #code #programming #package hell #tutorial #basic commands

stasis-field

So I have a new function that’s leaking space:

f06code

data C3 a = C3 a a a c3_0 (C3 x _ _) = x c3_1 (C3 _ x _) = x c3_2 (C3 _ _ x) = x splitBands :: IOArray Int (C3 Double) -> IO (C3 (IOArray Int Double)) splitBands vc = do b0 <- mapArray c3_0 vc b1 <- mapArray c3_1 vc b2 <- mapArray c3_2 vc return $ C3 b0 b1 b2

I know there are probably more concise ways of writing this, but this is the simplest version of the function that exhibits the problem. It takes a mutable array of containers of three doubles, and converts it into a single container of three mutable arrays of doubles. And it leaks. Apparently.

I’m super lost. Like, pretty much the only bits that do anything are the mapArrays, and they’re library functions so they probably shouldn’t leak, right?

Strict constructor

Try making the constructor C3 strict in its arguments.

data C3 a = C3 !a !a !a

This should force the mapArray calls to evaluate to WHNF. Without seeing how this code is used I can't really say how the thunks are piling up though, and this may only slow the leak without actually fixing it (it also may do nothing).

Match input with (strict) output

Make sure that every time you receive input data, there's a corresponding strict action which consumes that data by giving it to the outside world (writing it to a file, rendering it to a screen, etc.). This should force evaluation, which prevents thunks from piling up.

Explicit strict evaluation

Control.DeepSeq contains a class NFData which allows you to daisy-chain strictness into a data structure such that evaluation to WHNF becomes evaluation to NF.

You may have to implement your own instance of NFData for IOArray (and you definitely will for C3), but once it's in place, calling literally anything on your IOArray should force the whole underlying computation.

#haskell #lazy evaluation #thunks #stack leak #whnf

Recurse Center projects

I worked on a lot of different things at Recurse Center (previously known as Hacker School). My time was mostly split between exploration of unfamiliar web technologies and getting better at Haskell. (Heroku and OAuth2 are sweet digs!!) Here's a summary of the projects which resulted in either a github repo or a blog post. I've added ★ on projects which represent a significant investment of effort or which resulted in loads of learning.

table#recap tr:nth-child(odd){ background-color: rgba(0, 0, 0, 0.08); } table#recap td{ padding: 6px 4px; }

Date Project Links Tech ★ 2/9 Completed a package to make text stats for vector-space ranking systems. repo python3 2/10 Requested pull of updated version dependencies into GPipe (first pull request!). repo cabal 2/13 Completed cleanup and build instructions of translated GLTut examples (preexisting). repo haskell, gpipe, cabal 2/19 Wrote a post about using the AWS python api to change instance properties. blog aws, python ★ 2/19 Requested pull of options page for Katee's chrome extension Tab Explorer. repo js, css, html, d3 2/28 Wrote a post about visualizing some confusingly stored geographic data. blog gist english, js, threejs ★ 3/11 Completed an offline spectrogram generator. repo blog python3, numpy, scipy ★ 3/23 Wrote a series of posts about structural recursion. blog haskell, python, svg ★ 3/27 Completed pairing with clin88 on a multithreaded bittorrent client. branch unmerged haskell, stm, mtl ★ 3/28 Completed a live 3D audio spectrogram (preexisting). repo blog/vid haskell, gpipe, jack 3/30 Cleaned up and modularized an example UDP echo server. repo haskell 4/1 Explored creation of an evaluator for a small language. repo haskell 4/8 Learned about nginx-embedded lua web apps from davidad. fork demo nginx, lua, heroku ★ 4/10 Completed pairing with pgayane on an arithmetic expression parser/repl. repo haskell 4/10 Completed a Flask app which authenticates against Hacker School. repo demo python3, flask, oauth2, heroku 4/15 Updated and deployed a website to play wfnx music (preexisting). repo demo python3, sqlite3, js, html, css, svg, youtube, heroku

There's still a little time left of the batch, and I'm currently setting up a cross-compiling gcc to build executables for my Kindle. I'm not sure where that will lead. Maybe I'll get around to the Barnes-Hut gravity simulation too someday. There isn't enough time in one life to explore all the interesting things that code can do.

#hacker school #recap #didstuff #code #programming #recurse center

Long live WFNX

This is an old project, but thanks to davidad exposing me to Heroku, I've finally put the wfnx tribute site back up in a permanent place (warning! it plays music!). The code is also available.

The Ghost of Local Radio (plays music)

Left button pauses the music

Right button goes to the next song.

#radio #wfnx #local radio #boston #adobe flash #heroku

Live 3D Audio Spectrogram in Haskell

Recently I wrapped up a project to create a live 3D spectrogram in Haskell which visualizes audio obtained using JACK Audio Connection Kit (JACK). The fun gimmick here is that one may route arbitrary audio sources into the spectrogram while it's running. VLC, iTunes, or the system microphone are all options. You can watch your own voice slide along the spectrogram if you sing along. It's great.

Fig. 1 - TL;DR demo screencast (1m 39s)

Background

This project started a few weeks back with a spectrogram to visualize a tonal swoosh in the song All I Need by the band Air. The program was an off-line (batch) analysis which took as input a .wav file and some parameters to produce a white-on-black image of the sound.

Fig. 2 - The "tonal swoosh" in question (6s)

A spectrogram visualizes the relative amplitude of pitches in a sound over time. It is made up of many Fast Fourier Transforms (FFTs) representing each of many time intervals in the sound. A single FFT depicts along its X axis the frequencies exhibited by a chunk of a waveform and along its Y axis the relative amplitude of those frequencies. Check out Visualizing a Tonal Swoosh for more background on the idea of an audio spectrogram.

Fig. 3 - The off-line spectrogram of the tonal swoosh

In this image, time proceeds along the X-axis from left to right and pitch increases upwards along the Y-axis. Note the staggered introduction of several voices which pitch-bend down into one tone and then swing back up together.

Live spectrogram

Visualizing the swoosh with my new program also works. Check out these results!

Fig. 4 - A snapshot of the live Haskell spectrogram just as the tonal swoosh swings downward in pitch

In this image, time proceeds toward the viewer, and older events slide into the distance. Higher pitches appear as spikes to the right, and lower pitches clump on the left. Long lived pitches appear as sustained ridges sliding away from the camera into the distance.

Fig. 5 - Demo of the live 3D tonal swoosh (10s)

Program design

The following paragraphs include links to a repository on Github. Each link destination is the line of code which implements the idea indicated in the link-text.

The live spectrogram is implemented with two threads. One thread accumulates audio samples in a sliding window and periodically generates an FFT from these samples. The other thread draws frames to the screen as quickly as possible, occasionally picking up a new FFT and pushing it into a queue of 40.

The audio thread includes a callback to consume audio samples. It provides this callback to the plugin architecture exposed by JACK. State is maintained in an IORef which is atomically updated for each incoming audio frame.

The display thread inspects the IORef of the audio thread once each frame, and atomically extracts an FFT if one is available. It processes the FFT and atomically updates its own state.

Fig. 6 - State-data and threads used in the live spectrogram

FFT display processing

This section discusses the FFT post-processing I implemented to achieve an aesthetically pleasing visual. It does not discuss FFT error cases or how to convert to the dB scale. Readers interested in how to utilize the arb-fft package in their own code may benefit by reading my wrapper module which includes code for those concerns.

1. Exaggeration

The FFT is cubed to exaggerate loud pitches and make quiet pitches less noticeable. This is helpful to reduce the effect of background noise on the visualization as long as background noise amounts to pitch-peaks of less than 1.

Although exaggeration helpfully increases the contrast of foreground features from the background, this is a work in progress. Input audio is not normalized in the current version. Consequently, the question of which pitches are considered to be "loud" or "quiet" is dependent on the volume of the audio source (more on this later).

2. Frequency maximum

We assess maximum interesting frequency f. The visualization includes only those frequencies lower than f; higher frequencies are left out.

We define an interesting frequency as one which is more than one standard deviation away from zero amplitude, where the standard deviation is calculated over all the values in the FFT. In plain english, an interesting frequency is one which is audibly distinguishable.

Fig. 7 - In this contrived example, the standard deviation (σ) of the FFT is 200,000. The peaks in the dashed-box on the right side of the graph are all less than σ, therefore the visualization would consider them uninteresting and use only the data for x=0 through x=125.

The purpose of identifying the maximum interesting frequency is to restrict the part of the FFT included in the visualization to that part which contains the most interesting audio features. The interesting part of the FFT is then stretched across the whole horizontal of the screen, allowing for more easy visual tracking of those features which dominate the audio.

3. Transformation to a mesh

The components of the FFT are paired with interpolated X coordinates from between -0.5 and 0.5, and loaded onto the GPU using GPipe. This stream is then pushed into a queue of no more than 40 FFT meshes.

4. Display prep

Elements in the queue of FFT meshes are individually shifted along the negative Z axis according to their index in the queue. They are also scaled vertically according to the defined height of peaks in the visualization and the learned maximum amplitude. These transformations are written in Haskell, but execute in GLSL produced by GPipe.

The queue of FFT meshes is then concatenated into a single buffer. We draw the buffer to the screen using a display function which could potentially be rewritten more compactly like this one.

Configuration

Configuring the various parameters to optimally display the features of a given audio source is difficult and subjective. Some variables are automatically learned, such as frequency maximum (how much of the FFT to display) and amplitude maximum (how to vertically-scale the visual), but even these have hard-coded learning parameters with no real "best" value.

Some parameters are truly subjective, though. Three in particular are adjustable by a user of the program.

1. Stride

How wide to make the sliding window of FFT input. Measured in count of audio samples.

Stride must be set wide enough to produce a detailed spectrum analysis, but not so wide that it leads to a spikey line with hundreds of harmonics. If stride is too wide, the display thread receives a new FFT infrequently and the visual effect is disrupted.

Fig. 8 - Too wide (left). Too narrow (right). The default stride is 512 samples, but anything between 384 to 3072 works well enough.

2. Offset

How much to slide the window before calculating another FFT. Measured in count of audio samples.

When offset is smaller than stride, there is overlap between the chunks of audio used to create successive spectrograms. This produces a smoother animation and smoother ridges.

When offset is larger than stride, the visualization skips some information between each pair of FFTs.

Fig. 9 - Too close (left). Too far (right). The default offset is 512 samples, but anything within a factor of 1 of the stride should work well enough.

3. Volume

How loud the waveform is on which the FFT is calculated.

The quality of the FFT calculation depends in part on how loud the audio signal is. For quiet waveforms, the calculation is less able to discern signals from noise, and many of the higher pitched sounds go undetected. This may be an artifact caused by clipping small FFT values before converting to the log scale, but it is a parameter of the visualization nonetheless.

In an attempt to resolve this artifact, I implemented naive audio normalization to scale the average of the audio stream to a known value before performing the FFT on it. This system blindly scales up silence to become loud white noise, and so it has been disabled.

As such, in the current version the act of changing volume in iTunes or VLC will affect the quality of the visualization. Loud volumes will allow the spectrogram to hear more nuances in the audio. Quiet volumes will isolate only the loudest features.

Fig. 10 - VLC 100% Volume (left). VLC 50% volume (right). This is the "low point" in the tonal swoosh. Notice how it's all but absent from the image at right.

Packages used

Audio is skimmed using JACK (OSX) which has bindings on Hackage as jack.

The FFT is calculated in pure code utilizing the arb-fft package. Ian Ross gives an extensive overview of how he optimized arb-fft on his blog.

Visual code is written using Tobias Bexelius' GPipe package, which expresses the whole graphics pipeline as a pure, functional abstraction. Even shaders are written in Haskell and compiled to GLSL. I've translated a number of OpenGL tutorials into GPipe code, and the Haskell community has built a package inspired by GPipe called LambdaCube.

Code

Clone this project from github and try it for yourself! spectrogram3d_hs

Credit

Thanks to Maja, Ralf, and Robert for proofing my post and providing suggestions about how to improve comprehensibility! I'm sure I've added more typos while implementing their suggestions and and would appreciate your feedback.

#3d graphics #haskell #fft #audio visualization #near realtime #multithreading #pure programming #functional programming

Structural recursion

(Intro post)

[Update: Mindy made me aware of the concept of Tying the Knot, which allows the construction of cyclical data structures in Haskell. It is not safe to use structural recursion with cyclical data structures, but it remains a useful recursive pattern nonetheless. See Skeleton.hs in the examples section for a practical application which would be difficult without the guarantees provided by structural recursion.]

Introduction

A “recursively defined data type” is a type which references itself in its definition. This is possibly vague, weird, and bad-sounding if you're coming at this from an OOP angle, but it's a very fundamental concept in the Lisps and Schemes of the world. It is this idea which changes recursion from a nasty beastie to a friendly nyancat.

In this post I try to explain structural recursion in a way that is accessible to programmers from all backgrounds. I start by defining a datatype in Haskell and translating this datatype to Python. Next I define a __repr__ method for the datatype (this is like a toString method in Java), and show how structural recursion enables a natural definition of repr in both Haskell and Python.

Recursively defined data type

Here's an example of how you might define a linked list in Haskell:

data List thing = Empty | Element thing (List thing)

You might read that definition as:

A List of things is one of two possibilities. It may be either an Empty or it could be an Element carrying one thing and a whole List of more things.

(Aside: The thing is the type of elements in the list. Lists are homogenous in Haskell for the most part. If this is confusing, try reading the definition aloud while saying Int wherever you see thing.)

Our use of List thing as a type in the second argument to the Element constructor is what makes this a recursively defined data type.

Let's see some examples.

According to our definition, this is a list:

Empty

This is also a list. It has a length of one and contains some sort of Inty number.

Element 4 Empty

This list contains a 3 followed by a 4.

Element 3 (Element 4 Empty)

This may seem foreign. It is a very functional-programmingy way to think about linked-lists, and Haskell might be a foreign syntax for you. Let's explore more in Python.

Recursively defined data type dot py

We need to implement our list as a thing that can be constructed in one of two ways, where both constructors produce something recognizable as being a part of the same type. Looking to the expression problem [1] for guidance hints at the right way to set things up.

First we define a list type. Our two constructors will be members of this type by extending it.

class List(object): def __init__(self): raise NotImplementedError()

The list type is abstract (you can't construct one). It's just there so that we can extend it.

class Empty(List): def __init__(self): pass def __repr__(self): return '{}()'.format(self.__class__.__name__)

The Empty class lets us construct and repr an empty list.

class Element(List): def __init__(self, thing, nxt): assert isinstance(nxt, List), 'nxt argument should be a list' self.thing = thing self.nxt = nxt def __repr__(self): return '{}({}, {})'.format(self.__class__.__name__, repr(self.thing), self.nxt)

The Element class lets us carry one thing and another which must be a subclass of List. It defines __repr__ in terms of these properties.

gist

Now let's see the same examples that we defined above:

An empty list.

Empty()

A list of length one containing an int.

Element(4, Empty())

A list containing 3 followed by 4.

Element(3, Element(4, Empty()))

Recursion that works

>>> repr( Element(3, Element(4, Empty())) ) 'Element(3, Element(4, Empty()))'

Our implementation of Element.__repr__ is recursive. Inside the method we implicitly coerce self.nxt to a string, and Python dispatches to the corresponding __repr__ method on the class of self.nxt. Since self.nxt is restricted by Element.__init__ to a subclass of List, it's probably either an instance of Element or Empty.

In the absence of mutation, it's not possible to put a reference to the whole structure anywhere inside the structure [2]. That means that we'll eventually stop seeing instances of Element, find an instance of Empty, and stop recursing. This recursion is guaranteed Halting™.

Infinite recursion

At this point you may notice that this approach has some problems. You're right. Never write production code like this in Python. Eventually somebody will make a list point to itself and everything will break.

>>> quux = Element(3, Element(4, Empty())) >>> quux.nxt.nxt Empty() >>> quux.nxt.nxt = quux

Now if you were to call repr(quux) you'd see Python attempt to recurse infinitely.

RuntimeError: maximum recursion depth exceeded while calling a Python object

The infinite part of this recursion originated when we mutated the last pointer in our linked list from Empty() to quux. Now the coercion of self.nxt to a string always calls another Element.__repr__.

That mutation of the last pointer inside quux is called a side effect. It introduces a notion of time into our code which didn't previously exist. Before and after the execution of that assignment statement are distinct moments, and the thing we call quux has had its internals changed between those moments.

You wouldn't do this in Haskell because the language intentionally discourages the use of side effects [2,3].

(Aside: Not only does this cause infinite recursion, but it's also a reference cycle, the bane of reference-counting garbage collectors.)

Structural recursion

A recursively defined data type is useful because writing a function to operate on a recursively defined data structure is a mechanical reexpression of the original data definition.

Here's how one might implement repr for List a in Haskell.

repr xs = case xs of Element first rest -> "Element " ++ (show first) ++ " (" ++ (repr rest) ++ ")" Empty -> "Empty"

gist

This function is simply a casewise analysis over the same constructors used in the data definition for List a. Notice how recursion into repr occurs at precisely the same place in our function as List a occurred in the data definition.

Transforming the definition of List to a recursive function over Lists is so easy, you could write a program to do it! *ahem*

Progress

To prevent infinite recursion, structural recursion must always take place on a smaller formulation of the same problem. This is why the first step in the Haskell repr is to break off the current item as first from the rest of the list. Recursion only occurs on the rest. Since the problem always reduces in size, eventually it reaches a base case.

In the Python examples, self.thing is always the current item and self.nxt always represents the rest of the list. There's little chance for a mistake when implementing the recursion because they are already separate names in the class definition.

Combination

Once you've identified the place(s) where recursion will appear in your function definition, an easy way to decide how the recursion should look is to ask yourself this question:

If I've already finished writing this function and have obtained the result for the rest of the structure, how do I combine that result with the current one to make a correct result?

A big part of getting structural recursion right is the recognition that each invocation of your function is responsible for only two things:

Process the current item.

Combine the current item with the result of the rest of the structure.

The examples below generalize a few operations and should help to solidify your notions of structural recursion.

More examples

Child post, chrs (which generalizes to map_)

Child post, len_ (which generalizes to reduce_)

On github, Skeleton.hs defines two recursively defined data structures, Skeleton m a and FrozenSkel m a.

freezSkel maps from Skeleton m a to FrozenSkel m a.

flattenSkel reduces from the tree-like FrozenSkel m a to a list.

HTDP is the text book used in the Northeastern University College of Computer and Information Systems. It contains many examples with excruciating detail about structural recursion (and a few other types of recursion).

Subscripts

[1] Expression problem: http://www.infoq.com/presentations/post-functional-scala-clojure-haskell

[2] Lazy evaluation in Haskell actually allows you to create cyclic data structures, but structural recursion on cyclic data structures is not goung to terminate: http://www.haskell.org/haskellwiki/Tying_the_Knot

[3] Mutation inside pure code is possible in Haskell using unsafePerformIO, but this practice comes with some severe caveats: http://hackage.haskell.org/package/base-4.6.0.1/docs/System-IO-Unsafe.html

Credit

Ralf W'14 noticed two grammar errors and suggested a mention of toString in the intro. Thanks!

Mindy W'14 alerted me to the practice of "tying the knot" [2], which is a serious caveat to the whole point of this post.

#haskell #python #functional programming #structural recursion #recursive datatype #recursively defined datatype #recursion #it's easy

Recursively defined data type

There's an instant messaging client called Pidgin which was once upon a time known as..

GAIM

That is an acronym which stands for..

GTK+ AOL Instant Messenger

There are a few acronyms in that, which helpfully expand to..

GIMP Toolkit America Online Instant Messenger

Wait, there's more!

GNU Image Manipulation Program Toolkit America Online Instant Messenger

And now we find a little bit of infinite recursion.

GNU's Not Unix Image Manipulation Program Toolkit America Online Instant Messenger

GNU's Not Unix Not Unix Image Manipulation Program Toolkit America Online Instant Messenger

GNU's Not Unix Not Unix Not Unix Image Manipulation Program Toolkit America Online Instant Messenger

Where am I going with all of this? Well, today a fellow Hacker Schooler asked me:

What do you mean by a "recursively defined data type"?

(Followup post)

#functional programming #recursion #recursively defined data structure #the matrix #hacker school #unix #symbol manipulation

Visualizing a tonal swoosh

I was inspired by Mel Chua to spend the past couple of days implementing a spectrogram generator for wav files. It's on github as spectrogrampy.

It's easy to use. You simply give it a wav, and the default options output a low-resolution overview of the sound.

./spectrogram.py bird.wav

I'm using a twittering bird, Parus major 15mars2011, from Wikipedia that I converted to wav with VLC.

The project came out of a possibly naive assumption that a spectrogram is just a bunch of FFTs (Fast Fourier Transforms) expressed as brightness and then squished together vertically on a picture.

Compact data presentation

What we're seeing in the image above is a bird singing briefly at a low note, then a high note, then a low note, then a high note ... 9 times. Can you see it?

Each vertical column of pixels corresponds to an FFT on a chunk of the sound. The FFT compacts the time which elapsed during that chunk into an instant, and represents the instant on the spectrogram as a set of brightness values.

So what I mean is that each column of the spectrogram...

...is itself a graph...

... which depicts the loudness of all the pitches in the sound during that slice of time!

In this FFT, the units on the Y axis are dB^4 to emphasize loud things over quiet things.

Mistakes blur the lines

One of the resources I saw said that the "input data for the FFTs should overlap". This might be true for some applications, but for me it caused nothing but headaches.

I implemented options for this behavior in the script.

-o --offset; how far to jump before making another FFT -s --stride; how many samples to take for each FFT

At first I tried all kinds of variations for stride, making sure to overlap liberally. All of my spectrograms came out looking terrible! Then I thought to keep the other values constant.

Spectrograms with increasing FFT stride, and roughly constant FFT offset.

Clearly, when stride is too large you lose clarity in time! The important thing is to strike a balance between stride and offset according to the length of your audio file and desire for clarity in time vs spectrum.

Munge for fun

Tonal swoosh at the beginning of All I Need by Air

#spectrogram #audio analysis #visualization #fft #python #numpy #scipy #all i need #data science #munging

Demystifying geographic data for display

A fellow Hacker Schooler wanted to show some geographic data on a grid-aligned set of terrain splines with *three.js*, but he was having some trouble. His data came as a flat JSON array grouped by threes. ... [-118.433333333,32,-821], ... The data was on a grid, but he didn't know the dimensions of the grid. Furthermore, the grid seemed to have been rotated; both the **x** and **y** values changed, rather than one staying constant along each row.

He needed to find the resolution of the data before he could generate terrain splines and iterate through to make an elevation map.

Finding the jumps

If the data is on a grid, there should be an easy way to detect the resolution by looking for when the data jumps. If we detect how often the jumps are, then we'll have found the major axis of the 2D storage arrangement of the grid.

To do this generally, I wrote some routines in javascript to calculate the distance between every adjacent pair of data points.

Having obtained a list of deltas, I found the standard deviation of the deltas and then searched for the indexes of deltas that jumped more than a standard deviation. These deltas are the row-reset.

After finding the unusual deltas, I looked at the deltas of their indexes to find the major axis' resolution.

deltas_indexes = [6, 13, 20]; deltas_indexes_deltas = [7, 7];

With the major axis, the minor axis is easy to find.

major = 7; minor = data.length / major;

And then the problem is solved! Axis aligned data is then easily shunted into splines of the proper resolution.

#threejs #geographic data #munging #statistics #mvue #javascript #data science

Amazon EC2 instance safety tweak: Turn off "Delete on Termination"

I set up an Amazon EC2 instance today and spent a chunk of time configuring some server software. When it was all set up, I noticed that the root partition was set to delete on termination of the instance, but there wasn't an obvious way turn this automatic "feature" off.

This is bad for me because if my instance was accidentally terminated, I'd lose all the data on the root partition. It took awhile to figure out how to flip this bit, and so I'm writing it up here.

1. Get your AWS API key

Navigate to your Amazon Web Services Console and select IAM from the services available.

Change the URL from this: To this:

You may get a modal/lightbox popup telling you that best practices dictate that you create a user... but you're just gaining temporary API access so that you can fix one setting. I just clicked Continue to Security Credentials.

Expand the section titled Access Keys (Access Key ID and Secret Access Key) and proceed through Create New Access Key to download your rootkey.csv secrets file.

Remember where you put the secrets!

2. Make changes from within Python

Use pip to install the AWS API binding, boto.

pip install boto

Fire up a python repl and try to connect to your AWS services using boto and the data in the secrets file you downloaded from IAM:

$ python2 >>> import boto.ec2 >>> conn = boto.ec2.connect_to_region( ... "<region_name>", ... aws_access_key_id="<key_id>", ... aws_secret_access_key="<key>" ... )

My "<region_name>" was the string "us-east-1", but yours will depend where you created your instance. Your "<key_id>" and "<key>" are in the rootkey.csv file which you downloaded earlier.

If you got this far, and you know your way around Python, you might think my usefulness ends here. You'd be right if Amazon had a sane API, but they don't.

There are scant examples in the boto documentation, and the boto API reference is more confusing than helpful.

Continuing where we left off in the repl, get a list of the instances on your account.

>>> all_insts = conn.get_only_instances() >>> print all_insts [Instance:i-7890abcd]

Pick one instance, or follow the rest of these instructions with each of yours. I only have one instance, so I did this:

>>> inst = all_insts[0]

2.1 Fixing one instance

You need to modify an attribute on one or more volumes attached to the instance. This dictionary comprehension will tell us the current value for each volume on inst.

>>> {dev: bd.delete_on_termination for dev, bd in inst.block_device_mapping.items()} {u'/dev/sda1': True}

If you've gotten this far and you know your way around python, you might think again that my usefulness is at an end. Sorry, but assigning False to the delete_on_termination property of a block device doesn't actually sync back to AWS!

Call modify_attribute on the instance, giving as the second argument a list of the block device paths and your desired setting for delete on termination. Do this with a string like "/device/path=int" because this is the future.

>>> inst.modify_attribute('blockDeviceMapping', ['/dev/sda1=0'])

This does change the value in the AWS backend (I promise!), but it doesn't change the local value of delete_on_termination properties on your boto block device objects. You're Done!

3. Verify the change

I haven't experimented enough to figure out how little work it takes to retrieve the updated value from AWS, but starting over from the conn = ... is enough to verify that the change has occurred

>>> conn = boto.ec2.connect_to_region( ... ) >>> insts = conn.get_only_instances()[0] >>> inst.block_device_mapping['/dev/sda1'].delete_on_termination False

Finally.

Credit

Google Groups pointed me in the right direction, but is wrong about the type of the second argument to modify_instance.

StackOverflow gave me most of the correct answer, but is wrong about how to turn it off. For that you need ..=0.. as I stated above.

#aws #amazon web services #ec2 #elastic compute #delete on termination #DeleteOnTermination #BlockDeviceMapping #boto #elastic block store #ebs #data integrity

Gravity in JS [2010]

Gravity in JS

Objects have a minimum distance of 20 meters, to reduce slingshotting. There is no inter-object collision detection or resolution. The ":P" object is about the weight of an adult.

This is the second thing I ever made in Javascript.

#gravity #javascript #simulation

Trending Blogs

Recently Viewed Blogs

f06code