What’s in the Black Box (Part 2): The Serendipity Engine algorithm
OK, yesterday I explained what social science went into the Serendipity Engine algorithm. Today, I explain how it functions: how it does what it does. I unveil what else is inside the black box.
Remember, the Serendipity Engine was an exercise in trying to understand two things: 1) what is serendipity, and 2) how it might be “produced” by a digital technology. The first version of the engine (created with Kat Jungnickel) tackled the first question, and the second (created with Ben Hammersley) the second.
So. How does it produce its results?
The Engine uses a combination of automated and human-powered techniques to generate a personalised Serendipity Recipe.
Computers are incredible at holding lots of things in their processors at one time. This means they can cross-reference information in a way that our brains simply are unable to do. That makes them inherently better at making connections than we are, and making connections between things is a prerequisite for serendipity.
But at the same time, computers aren’t very good at assessing human qualities like physical attractiveness or creativity. Deciding whether someone has these traits requires human judgement, which is why the Engine is also made up of human components.
The forms you complete feed the Engine with two things:
A) keywords that identify what should be relevant and valuable to you
B) your level of Serendipitousness, measured across seven scales
The information fed to the Engine on a paper form contributes to your personal Serendipitousness rating.
You are asked to do three things: draw a circle, create a drawing from a squiggle, and photograph both alongside your face.
The photograph of your face, your circle and your drawing is sent to a system called Mechanical Turk. Mechanical Turk is an automated, human-powered “computer” run by Amazon that farms out tasks that can’t be done by a machine to people who, for a small sum, take a look at what needs doing and complete the process. In this case, someone somewhere in the world is paid $1.50 to look at the photograph you submitted and assess it in four ways:
1) The circle you draw is compared with a Japanese ensō, a symbol that represents universality, openness and creativity
2) The drawing you make based on the squiggle is rated for its creativity
3) Your face is assessed for its physical attractiveness
4) Your face is rated for your psychological well-being
The information you provided the Engine on the online form is used to identify the keywords that are part of your personalised Serendipity Recipe. The form splits into three sections, with three different types of information:
1) The first four questions authenticate who you are by cross-referencing what you have told the Engine about your basic demographic information. This is done by using a Google search of your name and checking the search results with other details (house number, post code, age, etc). After identifying you in Google, the Engine extracts keywords from the top three search results that define who you are in public.
2) Your social media handles are used to identify what is most relevant to you right now, by extracting nouns from your 10 most recent status updates on Twitter, Facebook and/or Google+.
3) The remaining questions are used to extract keywords about you that online databases are very unlikely to have on record. Questions about your favourite subject at school or your Desert Island Disc, or even the food you’d eat tonight if today was your last day on Earth require a much more nuanced level of self-assessment and personal expertise than is catered for in online profiles. This is also the information that can’t be collected based on your browsing behaviour.
In other words, what you listed as the most influential film in your life may (or may not) be different from the film you’ve watched the most, or rented from Netflix most often, or even what you listed on a social networking site as your favourite film. And your parents’ specialist subjects may be completely different from what they do or did for a living. For each of these categories, the Engine extracts keywords from the tags associated with your answer in open data sources.
Step 3: Your Serendipity Assessment
Finally, there are 23 questions in the Serendipity Assessment section of the Engine. In the physical version, they're inside a suitcase and made of knobs, switches and dials. The questions form seven scales, each of which has been identified in the research literature as an important predictor of serendipity. I described the scales yesterday. The outcome of your answers to these questions, combined with the responses from the Mechanical Turk, is a measure of your personalised Serendipitousness.
There are also two questions in this section that extract even more keywords: What is your profession? What is the nature of your business?
How’s my personalised Serendipity Recipe determined?
By completing the Paper Form, the Online Form and the Serendipity Assessment, you provide all the information the Engine needs to deliver a personalised Serendipity Recipe.
Between 5 and 10 keywords are randomly selected from the long list of keywords you gave the Engine access to through your responses to the keyword questions in the Online Form and the Serendipity Assessment, whether you told it directly or it extrapolated them through its automated searching system. These keywords are then put through a process of filtration determined by your Serendipitousness score.
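Here is a rough sketch of what that random selection could look like in Python. It is not the Engine's actual code: the function name `pick_keywords`, its parameters, and the choice to drop duplicate keywords before sampling are all my assumptions for illustration.

```python
import random


def pick_keywords(keyword_pool, rng=None, low=5, high=10):
    """Randomly pick between `low` and `high` distinct keywords from the pool.

    Hypothetical helper: the name, the parameters and the de-duplication
    step are assumptions, not details taken from the Engine itself.
    """
    if rng is None:
        rng = random.Random()
    unique = list(dict.fromkeys(keyword_pool))  # drop duplicates, keep order
    count = rng.randint(low, high)              # inclusive on both ends
    count = min(count, len(unique))             # a short pool caps the count
    return rng.sample(unique, count)


if __name__ == "__main__":
    pool = ["tea", "cycling", "geology", "film", "curry", "maps",
            "radio", "knitting", "robots", "gardens", "jazz", "chess"]
    print(pick_keywords(pool))
```

Passing in a seeded `random.Random` makes a run repeatable, which is handy if you want to check the same recipe twice.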
What’s Serendipitousness?
Your level of Serendipitousness is based on your answers to the questions in the Serendipity Assessment and on the Mechanical Turk’s responses. These are aggregated and weighted to produce your Serendipitousness Score.
The scales are weighted differently according to how important each is in predicting whether someone will think something is serendipitous or not. This is a difficult combination of things, but briefly, it’s about how likely it is that you’ll be able to connect the random keywords in your personalised Serendipity Recipe, and how likely you are to think that those connections could be valuable to you.
Creativity, Attention and HeadRAM are the three most important scales in the Assessment, and so they’re weighted the most heavily. The other scales - Social Support, Physical Well-Being, Psychological Well-Being and Grit - are also important, but for the purposes of this Engine, they are equally weighted.
See more about the scales and their weightings.
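To make the weighting idea concrete, here is a minimal sketch of a weighted average across the seven scales. The actual weights are not given in this post, so the values below (2.0 for the heavily weighted scales, 1.0 for the rest) are placeholders I made up, as is the assumption that each scale has already been normalised to a number between 0 and 1.

```python
# Scales named in the post as the most heavily weighted.
HEAVY_SCALES = ("Creativity", "Attention", "HeadRAM")
# Scales the post says are equally weighted.
OTHER_SCALES = ("Social Support", "Physical Well-Being",
                "Psychological Well-Being", "Grit")

HEAVY_WEIGHT = 2.0  # assumption: placeholder value, not the Engine's weight
OTHER_WEIGHT = 1.0  # assumption: placeholder value, not the Engine's weight


def serendipitousness(scale_scores):
    """Return a weighted average of the seven scale scores.

    Assumes every scale score is already normalised to the range 0..1.
    """
    weights = {name: HEAVY_WEIGHT for name in HEAVY_SCALES}
    weights.update({name: OTHER_WEIGHT for name in OTHER_SCALES})
    missing = set(weights) - set(scale_scores)
    if missing:
        raise ValueError(f"missing scale scores: {sorted(missing)}")
    total = sum(weights[name] * scale_scores[name] for name in weights)
    return total / sum(weights.values())


if __name__ == "__main__":
    example = {name: 0.5 for name in HEAVY_SCALES + OTHER_SCALES}
    example["Creativity"] = 0.9
    print(serendipitousness(example))
```

Dividing by the sum of the weights keeps the result on the same 0 to 1 range as the inputs, whatever weights are chosen.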
You said it’s filtered? What does that mean?
The randomly selected keywords are filtered through Google Translate. If you have a low level of Serendipitousness, the keywords will be translated only once from English to German and back to English. If you have a high score in the Serendipity Assessment, your keywords will be translated up to 5 times.
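As a sketch of how a score could decide the amount of filtering, the snippet below maps a score between 0 and 1 to a number of round trips between 1 and 5. The linear mapping is my assumption; the post only gives the two ends of the range. The `translate` argument is a hypothetical stand-in for whatever calls the translation service, and reusing German as the pivot language for every round trip is also an assumption, since only the low-score case names a language.

```python
def round_trips(score, max_trips=5):
    """Map a score in 0..1 to a number of round trips from 1 to max_trips.

    Assumption: a simple linear mapping between the two documented extremes.
    """
    clamped = min(max(score, 0.0), 1.0)
    return 1 + round(clamped * (max_trips - 1))


def filter_keyword(word, score, translate, pivot="de"):
    """Send a keyword out to the pivot language and back, once per round trip.

    `translate(text, source, target)` is a hypothetical callable supplied by
    the caller; it is not a real library function.
    """
    for _ in range(round_trips(score)):
        word = translate(word, "en", pivot)
        word = translate(word, pivot, "en")
    return word


if __name__ == "__main__":
    # Stand-in translator that only tags the text, so the sketch runs offline.
    def fake_translate(text, source, target):
        return f"{text}>{target}"

    print(filter_keyword("tea", 0.0, fake_translate))
    print(filter_keyword("tea", 1.0, fake_translate))
```

With a real translator plugged in, each extra round trip gives the wording another chance to drift away from the keyword you started with.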
The Engine creates relevant randomness. It takes things that you are likely to pay attention to and puts a new spin on them.
In order for the Engine to work for as many people as possible, it has to cater to the different abilities of the people who use it. Serendipity is a very relative thing, based on where you are in the world, what time it is, how you feel that day, what resources you have access to, what political regime you live in, and which culture you’re from. Some of these can vary day-by-day, and even hour-by-hour. So to cope with the constantly moving target of your own Serendipitousness, the Engine works for everyone.
The more filtered the keywords are, the less they resemble the keywords you provided, which means you might make more connections that are tangential to who you are and what you know. The less filtered the keywords are, the more they resemble the ones you put in. In other words, the Engine makes you work a bit harder to find the connections, have the insight and see the value if you are more serendipitous, and less hard if you aren’t.
Almost. There is one more thing. Serendipity is often associated with a flash of insight out of the blue: when walking the dog, taking a shower, having a cup of tea. And so the Engine suggests three contexts to guide you to where you should consider the words, what you should be doing, and who you should be with.
Yup. Predicting serendipity. It's that easy.
* Thanks to Nominet Trust and Google for their support in this research.