NM3213 Final Project - Tasty Cartography (Post)

Download the KMZ files to simulate my prototype in Google Earth here: https://drive.google.com/open?id=0BypBpSlUszqAVVg0c2N6ek50OXM

--------------------------------------------------------------------------------------------------

Addressing Two Key Digital Humanity Terms

According to Wallack and Srinivasan (2009), “Meta-ontology” and “Local knowledge” are always at odds with each other – they represent a struggle of the state versus the layman: both attempt to negotiate a middle ground, but rarely succeed. To be clear, 'ontology' refers to the construction of reality, or how the world is perceived.

Meta-ontology refers to the reality of information constructed by power-holders – like governments – which is used to run nations or organizations. It essentially manifests as a state's data system: annual reports, financial earnings, crime rates, and more (Wallack & Srinivasan, 2009, p. 2). The concern is that such power-holders only prioritize specific "development" markers like health and education – which do not necessarily equate to a greater experience of well-being for citizens. (ibid., p. 3) As a result, the local knowledge – or ontology of ground-level citizens – is often ignored by authorities.

Local knowledge is important because it underscores the inner workings of smaller communities, and essentially defines their unique cultures. A positive example would be the rural communities in Vietnam, which store drinking water in tanks – when government officials planned to release microscopic copepods into such tanks to cull mosquito breeding, they consulted the villagers to understand who would be most capable of managing these tanks as project collaborators. (Vu, Nguyen, Kay, Marten, & Reid, 1998) Conversely, when meta-ontology and local knowledge clash, we get America's social climate: where less than half of eligible voters actually cast their electoral votes – showcasing a lack of confidence in a government which has ignored their needs. (Miller & Shanks, 1996; Wallace & Yoon, 2016)

Emotional analysis (a subset of sentiment analysis), however, presents a possible marriage of both meta-ontology and local knowledge. Emotional analysis classifies emotional responses into categories such as anger or happiness – and can be understood as a field interested in utilizing local knowledge to better interpret people's feelings.

Suggested Digital Humanities Term

How can analyzing emotional reactions bridge the gap between local knowledge and meta-ontology? According to Westbrook and Oliver (1991, p. 85), an individual's positive emotional experience with a product is somewhat dependent on their preconceived perceptions about it, and whether those expectations are affirmed or betrayed after they use that product. Simply put, emotional analysis can reveal an individual's orientation – and by extension, the state of their 'local knowledge'.

Additionally, local culture plays a part in shaping emotional responses – in Japan, positive emotions (e.g. being calm) are typically associated with experiences of interdependence with others. (Kitayama, Markus, & Kurokawa, 2000) As a result, Japanese service standards are associated with attending to a customer's needs first and foremost. (Winsted, 1997, p. 353) In short, emotional analysis also has to factor in the baseline of a culture – the shared orientation of a group.

To that end, I would suggest the term 'Crowd-sourced ontology' – emotional analysis represents a different breed of meta-ontology which has to account for local culture and personal orientations when interpreting facial expressions. Positive and negative facial responses vary between races and genders – and because most facial analysis programs mathematically examine facial dimensions to interpret emotion, good facial analysis tools must have a large sample of facial references to pinpoint and seek out common ethnic traits (for example, Asians having comparatively smaller eyes than Americans).

In this sense, the meta-ontological parameters of emotional analysis are only effective when its foundations are firmly modeled on data sourced from a local culture and population. Notably, a lack of ethnic diversity in facial references can restrict the scope of usefulness for such tools – which will be discussed further in the "Limitations" section.

Background/Context of Sentiment and Emotional Analysis

Textual analysis used to be at the forefront of sentiment analysis – the technology has been particularly useful for identifying themes in historical texts (Lincoln Logarithms, n.d.), but has additionally been co-opted to weigh the tone of online tweets (Barbosa & Feng, 2010) and reviews. (Pang & Lee, 2008) This is typically done by breaking textual data up into individual words and aggregating their positivity and negativity.
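The word-level approach described above can be sketched roughly as follows. This is only a minimal illustration: the tiny `LEXICON` and the `sentiment_score` function are hypothetical, and real tools rely on far larger, validated word lists and more careful tokenization.

```python
import re

# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
LEXICON = {
    "delicious": 1, "tasty": 1, "great": 1,
    "bland": -1, "rude": -1, "terrible": -1,
}

def sentiment_score(text):
    """Split text into words and sum their polarity; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)
```

A score above zero suggests an overall positive text and a score below zero a negative one. Because each word is scored in isolation, sarcasm and slang will be misread by a sketch like this.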

London Feels is a sentiment-mapping visualization that highlights hotspots of tweets from London in real-time (see http://london.feels.website/). Tweets are interpreted for their sentiment level before being placed on the map with a color representing positivity or negativity – ranging from blues and purples to represent "great" to "not so great" sentiment. Each hotspot can also be clicked on to reveal the full tweet, and the visualization aggregates an overall sentiment rating for these tweets originating from within London.

Textual analysis, however, has become less effective with the increasing prevalence of social media, where viral memes are birthing colloquialisms that machines cannot interpret. Textual analysis tools also face language barriers – most are in English, and cannot read foreign languages like German or Mandarin.

Enter facial recognition technology, which is mostly feature-based – scanning faces for geometric coordinates (of the eyes and mouth) and comparing them to an existing library of mathematically mapped emotional expressions. (Zhang, 1999) One may view this as digital mediation – "folding" real world images into computer language. (Berry, 2011, p. 1)

In general, facial analysis can be an accurate indicator of human sentiment – a study of more than 5000 recordings of facial reactions to funny commercials by McDuff, El Kaliouby, and Picard (2015) found that facial analysis results could predict participants' self-reported levels of enjoyment. Essentially, such tools are capable of capturing positive or negative emotional expressions – which are more uniform across different individuals, and do not suffer as heavily from the complications of context-sensitive textual analysis.

Face++ (see http://www.faceplusplus.com/demo-detect/) was chosen as a tool for my project because it outputs a full suite of facial measurements that factor into a 'smile' rating, denoting how likely it is that someone is smiling. Skybiometry (see https://skybiometry.com/) was also utilized as it is capable of classifying facial expressions according to multiple emotional categories (Happy, Sad, Angry, Disgust, and Surprise).

The data produced by these two tools would be considered nominally classified, as each image is assigned a single emotional classification. (Sperberg-McQueen, 2004) However, it has the added benefit of quantifying these emotional characteristics with scalar ratings – instead of being lumped into broad emotional categories, each face retains some information about its degree of emotional intensity. (ibid.)

Intraface (see http://www.humansensing.cs.cmu.edu/intraface/) is an iPhone application which gives real-time emotional analysis of video feeds from an iPhone's front camera. This data is displayed as bars denoting seven categories: Anger, Disgust, Fear, Happy, Neutral, Sadness, and Surprise. Unfortunately, the app does not have recording features, making it difficult to obtain data from its live-analysis interface. However, using such mobile technology was considered, and will be discussed further in the "Possible Future Developments" section.

Building An Alternative Singapore Map

It is evident that sentiment analysis mapping is not new to the data visualization landscape. However, such visualizations are merely overlays that do not necessarily disrupt the geographical landscape that they observe. Their intent is merely to show where positive and negative sentiment originate from – and little else. Similarly limited is emotional analysis, which has mostly only been used to gauge consumers' feelings toward advertisements – its usefulness has not yet extended beyond the commercial realm.

In contrast, my visualization prototype aims to break away from such conventions and utilize facial analysis to represent local citizens' feelings about food – of which some dishes are "endowed with iconic status." (Henderson, 2014, p. 904) To achieve this, my prototype mapping visualization will take the form of a three-dimensional choropleth (Cartogeek, 2016) – with regions of Singapore lifted up at varying altitudes to represent citizens' emotions about local food. A reference that inspired my project is the 2011 LIVE! Singapore exhibition, which visualized Singaporeans' intensity of mobile data usage on a daily basis (see http://senseable.mit.edu/livesingapore/visualizations.html) by using a three-dimensional overlay as well.

However, the map is not primarily meant to be an objective interpretation of reality, and can never be one – Drucker (2011, p. 12) convincingly argues that all 'data' is fundamentally biased because researchers have to interpret it. No 'accurate' sentiment map regarding food can exist, because it would inevitably leave out the sentiments of citizens outside my sample. As such, my disclaimer is that this project is less of a measurement of Singapore's population, and more of an experiment to structure emotional sentiment in visual forms.

Instead, my project's aim is to kick-start a platform for multiple reinterpretations of Singapore through its culinary scene, so as to counter the mainstream cartography of roadmaps. Those viewing the project may reflect on whether eateries can become new intersections in their daily commute, as opposed to urban infrastructures (like train stations and skyscrapers) that render the local citizen "insignificant." (Sim, 2011, p. 361)

Description of Project

My proposed project is a prototype digital map based on Google Earth, programmed with emotional ratings toward eateries in Singapore – each location appears as a vertical three-dimensional polygon (like a tower). These polygons will be uploaded and available as KMZ (Compressed Keyhole Markup Language) files with the submission of this report – and can be imported into any installation of Google Earth to simulate the prototype.
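As a rough illustration of how such a tower could be expressed in KML and packaged as KMZ, consider the sketch below. It is not the code used to build the prototype: the function names, the square footprint, the footprint size, and the sample coordinates in the usage note are all assumptions made for this example. The KML elements themselves (`Placemark`, `Polygon`, `extrude`, `altitudeMode`) are standard KML 2.2, and KML colors are written in `aabbggrr` order.

```python
import io
import zipfile
from xml.sax.saxutils import escape

def square_ring(lon, lat, height_m, half_side_deg=0.0005):
    """Closed square of 'lon,lat,alt' tuples around a point (assumed footprint)."""
    d = half_side_deg
    corners = [(lon - d, lat - d), (lon + d, lat - d),
               (lon + d, lat + d), (lon - d, lat + d), (lon - d, lat - d)]
    return " ".join(f"{x:.6f},{y:.6f},{height_m}" for x, y in corners)

def tower_placemark(name, lon, lat, height_m, kml_color="ff00ffff"):
    """Build one extruded polygon; the default color is opaque yellow (aabbggrr)."""
    ring = square_ring(lon, lat, height_m)
    return f"""<Placemark>
  <name>{escape(name)}</name>
  <Style><PolyStyle><color>{kml_color}</color></PolyStyle></Style>
  <Polygon>
    <extrude>1</extrude>
    <altitudeMode>relativeToGround</altitudeMode>
    <outerBoundaryIs><LinearRing><coordinates>{ring}</coordinates></LinearRing></outerBoundaryIs>
  </Polygon>
</Placemark>"""

def kml_document(placemarks):
    """Wrap placemark strings in a KML 2.2 document."""
    body = "\n".join(placemarks)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<kml xmlns="http://www.opengis.net/kml/2.2">\n<Document>\n'
            f"{body}\n</Document>\n</kml>\n")

def kmz_bytes(kml_text):
    """A KMZ file is a zip archive whose main file is conventionally doc.kml."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("doc.kml", kml_text)
    return buffer.getvalue()
```

For instance, `kmz_bytes(kml_document([tower_placemark("Laksa stall", 103.85, 1.29, 1400)]))` would return the bytes of a KMZ file containing one yellow tower at made-up coordinates, which could then be written to disk and opened in Google Earth.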

Two data types are present in the prototype: emotional data (a scalar rating associated with one of five categories: 'happy', 'sad', 'angry', 'disgust', and 'surprise'), as well as a 'smile' rating (also a scalar rating). Two of the tools discussed earlier were utilized – the emotional rating was extracted with a facial analysis tool demo from Skybiometry, while the 'smile' rating was obtained with a similar demo from Face++.

Each polygon's height is its emotional rating multiplied by a constant, which in this case is 25 meters, so that all polygons remain proportional to one another. Meanwhile, each polygon is color-coded according to the emotion detected: yellow for 'happy', blue for 'sad', red for 'angry', pink for 'surprise', and green for 'disgust'. For example, a detection of 'happy' emotion at 56 percent is represented by a yellow polygon that is 1400 meters (i.e. 56 times 25 meters) tall.
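The height and color rules can be written down as a short sketch. The names `METERS_PER_PERCENT`, `EMOTION_COLORS`, and `polygon_style` are hypothetical; only the constant of 25 and the color assignments come from the description above.

```python
METERS_PER_PERCENT = 25  # scaling constant stated in the text

# Color assigned to each detected emotion, as listed in the text.
EMOTION_COLORS = {
    "happy": "yellow",
    "sad": "blue",
    "angry": "red",
    "surprise": "pink",
    "disgust": "green",
}

def polygon_style(emotion, rating_percent):
    """Return (color, height in meters) for one emotional rating."""
    if emotion not in EMOTION_COLORS:
        raise ValueError(f"unknown emotion: {emotion!r}")
    if not 0 <= rating_percent <= 100:
        raise ValueError("rating must be a percentage between 0 and 100")
    return EMOTION_COLORS[emotion], rating_percent * METERS_PER_PERCENT
```

With these rules, `polygon_style("happy", 56)` gives a yellow polygon 1400 meters tall, matching the worked example, and the tallest possible polygon (a rating of 100 percent) is 2500 meters.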

Accompanying each emotional rating is a 'smile' rating – this measures the percentage certainty that an analyzed face is depicting a smiling expression, and acts as a secondary indicator in identifying if a subject has a positive response. In general, having a common rating like this across all datasets would allow for different emotional responses (for example, 'angry' faces and 'happy' faces) to be compared on a similar scale.

To sum up: one unit of data in this visualization comprises a photo of some facial response taken when someone was eating food – accompanied by its respective emotional rating and 'smile' rating. Due to the limited resources currently available for this prototype, the images were extracted from food video logs (or "vlogs") found on YouTube, which still feature some involuntary reactions on-screen. In a full-scale project, however, a dedicated sampling of such images from real life would be preferable.
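One way to picture a single unit of data is as a small record. The sketch below is an assumption about how it could be modeled, not a description of how the prototype stores anything: the class name `FoodReaction`, its field names, and the values in the usage note are all hypothetical.

```python
from dataclasses import dataclass

# The five emotional categories used in this project.
EMOTIONS = ("happy", "sad", "angry", "disgust", "surprise")

@dataclass(frozen=True)
class FoodReaction:
    image_path: str        # photo of the facial response
    dish: str              # dish being eaten
    location: str          # general location selling the dish
    emotion: str           # one of EMOTIONS (nominal classification)
    emotion_rating: float  # scalar intensity, in percent
    smile_rating: float    # certainty of a smile, in percent

    def __post_init__(self):
        if self.emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {self.emotion!r}")
        for label in ("emotion_rating", "smile_rating"):
            if not 0 <= getattr(self, label) <= 100:
                raise ValueError(f"{label} must be between 0 and 100")
```

A record such as `FoodReaction("laksa_01.jpg", "Laksa", "Katong", "happy", 56, 80)` keeps the nominal category and the two scalar ratings together, which is the combination the pop-up for each polygon would display.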

Proposed Interaction Experience of User/Viewer/Audience

The flowchart below is a rough outline of a viewer's interaction experience with a hypothetical, fully fleshed-out version of the map. Due to the limitations of Google Earth, the prototype does not have most of these processes (like the live generation of polygons).

When booted up, the interactive map zooms into Singapore immediately for convenience. Users should be able to freely zoom in or rotate the map to view it at different angles or scales, and they will be prompted to search for a local dish of their choice.

When users search for a local dish (e.g. Laksa) in the fully-featured map, they will see three-dimensional polygons rising up from the ground to mark out eateries that sell that local dish. As mentioned earlier, users can rotate the map to get a better perspective of how an eatery compares to others based on the height of their polygons. Alternatively, users can search 'all' to see every dataset currently available at the same time.

When users click on a polygon, a pop-up displays the emotional rating and 'smile' rating alongside the image analyzed to produce those ratings. Indicated as well is the dish being consumed, and the general location that sells that dish. Essentially, these pop-ups summarize subjects' emotional responses with regard to the food being eaten.

There should also be an option to toggle a visual style where each polygon is represented by stacks of the dish that it represents, with its emotion rating on top. This might make it easier to interpret from afar what the data is representing, instead of a static polygon that has to be clicked on for clarification. In the example below, a 'sad' emotional rating is associated with McDonald's hamburgers.

In the current prototype, attempting to compare static polygons that are far apart from each other does not yield any meaningful analysis – however, the fully-featured map should also allow users to select two or more polygons to compare side by side. By dragging two polygons to a side-menu, food enthusiasts and researchers alike can get a direct comparison between the various aspects of two datasets.

Limitations

Unfortunately, facial analysis is not flawless. As mentioned earlier, a caveat of emotional analysis tools is that they require a global sample of faces to account for different ethnicities' unique facial dimensions, as well as the differences in emotional intensity between cultures. As this task is costly and tedious, it is not uncommon for facial analysis tools to be restricted to a smaller sample of facial references.

This was evident in Skybiometry's facial demo – a practice run of the tool revealed that it sometimes categorized Asian faces erroneously as giving 'disgusted' expressions. It is likely that Skybiometry's facial recognition algorithm is mostly based on Caucasian facial data, rendering it less accurate for a project that aims to represent Singaporeans – who are predominantly Chinese, Malay, and Indian in ethnicity.

Designating each photo to a discrete emotional 'category' is itself also problematic, as Drucker (2011, p. 11) argued – doing so might erroneously imply that only happiness, sadness, anger, disgust, and surprise exist. Reducing the complexity of such data makes it more palatable – however, we would risk ignoring other expressions like boredom or uncertainty. Drucker suggests a "sliding scale" to represent data as a continuity between binary labels such as male and female (ibid.) – but this might not be feasible for the multiple scalar categories used in this project.

Another issue lies in the photo data – in particular, the sources used were mostly video logs of foreign tourists eating, and not of Singaporean citizens. This was probably because the novelty of eating local Singapore dishes was an experience worth capturing – conversely, however, Singaporean citizens would likely not go to the trouble of recording themselves eating dishes that they are already familiar with.

By extension, this meant that the polygons mostly reflected locations that these tourists visited – a number of them being renowned eateries recommended by online reviews, instead of coffee shops in smaller neighborhoods. This is unfortunate, given that the food of Singapore's heartlands is precisely what constitutes 'local knowledge' and culture – which was the focus of this project. The use of mobile technology to circumvent this overarching issue will be discussed in the next section.

Possible Future Developments

Given the involuntary nature of emotions, it might be possible for emotional analysis tools to evaluate the faces of those who consume spicy food – which can cause cheeks and foreheads to flush red. (Ka, Kim, Kim, Kim, & Cho, 2014) Having a facial analysis tool detect the intensity of redness at those areas of the face (in addition to an emotional rating) might yield data that is useful for determining a meal's spiciness, and how enjoyable that spiciness is. Given the prevalence of spiciness in Singapore's dishes (Huat & Rajah, 2001), a visualization built on such data could yield other reorganizations of Singapore's geography.

As mentioned earlier, a particular limitation of my prototype was the difficulty of gathering large amounts of image samples from Singaporean citizens. To obtain a representative visualization of locals' feelings, the means to perform facial analysis should be widely accessible – the ideal would be a mobile phone application that boasts a facial analysis API. Users would be able to use the tool to upload their own emotional ratings and tag them to their geographical location, thereby crowd-sourcing the map visualization. As noted earlier, existing facial analysis apps like Intraface could be useful if they were further adapted for this purpose.

Concluding Remarks

With this project, my goal was to explore how to reconcile 'Local Knowledge' and 'Meta-ontology' through emotional analysis – with the express aim of populating Singapore's map with visual manifestations of Singaporean citizens' feelings about local dishes. The biggest limiting factors that prevented further development beyond the prototype stage were (1) a lack of photos from local citizens, and (2) the restricted scope of some facial recognition tools. Nevertheless, it is my hope that this project acts as a breeding ground for future visualizations which elevate the importance of local citizens' emotions.

In particular, food as a topic was chosen because of its cultural diversity in Singapore, which makes it a popular subject of small-talk amongst citizens. Other national issues such as education or housing can also elicit emotional responses which simply go unheard. However, with the ideas presented here, such local sentiments need not be lost in the rapid flow of life – instead, they might one day be manifested as the building blocks of a larger, more colorful world.

(Word count: 2989)

References

Barbosa, L., & Feng, J. (2010, August). Robust sentiment detection on twitter from biased and noisy data. In Proceedings of the 23rd International Conference on Computational Linguistics. Association for Computational Linguistics. 36-44.

Berry, D. M. (2011). The computational turn: Thinking about the digital humanities. Culture Machine, 12(0), 1-22.

Cartogeek. (2016). 2012 US Presidential Election Results by County: Democrat & Republican Share of Vote. Retrieved from https://www.arcgis.com

Drucker, J. (2011). Humanities approaches to graphical display. Digital Humanities Quarterly, 5(1), 1-21.

Henderson, C. J. (2014). Food and Culture: in Search of A Singapore Cuisine. British Food Journal, 116(6), 904-917.

Huat, C. B., & Rajah, A. (2001). Hybridity, Ethnicity and Food in Singapore. Changing Chinese Foodways in Asia, 161-98.

Ka, M. K., Kim, M. H., Kim, B. H., Kim, H. D., & Cho, D. U. (2014). Effect of Spicy Food on Face Image Color According to Sasang Constitution. Journal of the Korea Academia-Industrial Cooperation Society, 15(5), 2671-2677.

Kitayama, S., Markus, H. R., & Kurokawa, M. (2000). Culture, Emotion, and Well-being: Good Feelings in Japan and the United States. Cognition & Emotion, 14(1), 93-124.

Lincoln Logarithms: Finding Meaning in Sermons. (n.d.). Retrieved from http://disc.library.emory.edu/lincoln/

McDuff, D., El Kaliouby, R., & Picard, R. W. (2015, September). Crowdsourcing facial responses to online videos. In Affective Computing and Intelligent Interaction (ACII), 2015 International Conference. IEEE. 512-518.

Miller, W. E., & Shanks, J. M. (1996). The new American voter. Cambridge, MA: Harvard University Press.

Pang, B., & Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and trends in information retrieval, 2(1-2), 1-135.

Sim, G. (2011). Historicizing Singapore cinema: questions of colonial influence and spatiality. Inter-Asia Cultural Studies, 12(3), 358-370.

Sperberg-McQueen, C. M. (2004). Classification and its Structures. In S. Schreibman, R. Siemens, & J. Unsworth. (eds.), A Companion to Digital Humanities (Chapter 14). Oxford: Blackwell.

Vu, S. N., Nguyen, T. Y., Kay, B. H., Marten, G. G., & Reid, J. W. (1998). Eradication of Aedes Aegypti from a Village in Vietnam, Using Copepods and Community Participation. The American journal of tropical medicine and hygiene, 59(4), 657-660.

Wallace, G., & Yoon, R. (2016, November 12). Voter turnout at 20-year low in 2016. Cable News Network. Retrieved from http://edition.cnn.com/2016/11/11/politics/popular-vote-turnout-2016/

Wallack, J. S., & Srinivasan, R. (2009, January). Local-global: Reconciling mismatched ontologies in development information systems. In System Sciences, 2009. HICSS'09. 42nd Hawaii International Conference. IEEE. 1-10.

Westbrook, R. A., & Oliver, R. L. (1991). The dimensionality of consumption emotion patterns and consumer satisfaction. Journal of consumer research, 18(1), 84-91.

Winsted, K. F. (1997). The service experience in two cultures: A behavioral perspective. Journal of Retailing, 73(3), 337-360.

Zhang, Z. (1999). Feature-based facial expression recognition: Sensitivity analysis and experiments with a multilayer perceptron. International journal of pattern recognition and Artificial Intelligence, 13(06), 893-911.

NM3213 Final Project Outline

Address two key terms

“Meta-ontology” and “Local knowledge” have always been at odds with one another – they represent a constant struggle of the state versus the layman, both attempting to negotiate and communicate with each other, but often failing to achieve a middle-ground understanding of reality. To be clear, “ontology” refers to the construction of reality, or how someone perceives the reality they exist within.

Meta-ontology refers to the reality of information that is constructed by a power-holder – such as a governing state – which is used to run the administration of a nation or organization from an executive position. It essentially manifests as information that is consolidated for a bird's-eye perspective of some reality – annual reports, financial earnings, crime rates, and so forth. The concern is that this information is being compiled solely based on the interests of such governing bodies – which often do not connect with the priorities of their citizens (or subordinates).

As a result, the Local Knowledge – or ontology of ground-level citizens – is often ignored by authorities despite its value within the community it stems from. Local knowledge is important because it underscores the working(s) of an individual or smaller community, and basically shapes how they interact with the society around them. A positive example would be the rural communities in Vietnam, which use tanks to store their drinking water: when the Vietnamese government wanted to release microscopic creatures into these tanks to cull mosquito breeding in them, they intensely consulted the villagers first to see which members of the community were most capable of being volunteers to test this method for everyone. This is because they understood the local habits of these villagers after years of observation.

On the other hand, when meta-ontology and local knowledge become disconnected, we get the current social atmosphere of America: just under half of eligible voters actually cast their votes in recent presidential elections – a sign that many have little confidence in a US administration that has not been able to perceive and address the needs of its numerous citizens. The state is stuck with its own priorities and many politicians are self-seeking in their interests and policies.

Suggest a key term to contribute

I would suggest the term “Crowd-sourced ontology” – sentiment analysis specifically requires a sampling of an audience to get an understanding of what certain words denote or connote. Although sentiment analysis word lists ultimately require researchers (i.e. executives) to make a call on how these word lists are compiled, the description of a keyword as ‘positive’ or ‘negative’ is highly reliant on the responses of individuals toward a word or subject. In a sense, researchers might pick out ‘appropriate’ people to determine the sentiment characteristics of a word, but it is the local knowledge of these individuals that aggregates to form the meta-ontology used in a sentiment analysis tool.

In short, the term “Crowd-sourced ontology” would refer to a different breed of meta-ontology, one that blends in local knowledge as its main body of data. Sentiment analysis encapsulates this ethos precisely because it wants to understand local sentiment as natively as possible.

Description of Project

A digital interactive map prototype based on Google Earth API, programmed with locations and sentiment ratings of various eateries in Singapore.

It will be accompanied by a word cloud of commonly-encountered words as obtained from an aggregate of 60 Singapore food reviews on HungryGoWhere and processed via Voyant Tools and Cortical.io. This acts as an introductory exploration of keywords that might be considered in an actual sentiment analysis tool based on Singaporean food reviews.

Interaction experience of user/viewer/audience

*(These describe the final product – the prototype will likely lack most of these features. Drawn mock-ups will supplement and describe them instead in my final submission.)

When initiated, the interactive map zooms into Singapore immediately for convenience. The available geographical data should be restricted to Singapore only (at least for this project). Users should be able to freely zoom in and rotate the map to view it from different angles and scales.

For the digital map prototype, users should be able to search up a local dish (e.g. Laksa) and observe three-dimensional polygons rise up from the geography to mark out eateries of interest – the higher the polygon, the more positive sentiment its reviews exhibited. As mentioned earlier, users should be able to rotate the map freely to get a better perspective of how each eatery compares to others in terms of the height of their “sentiment”.

Users should also be able to search by location. Searching by location will highlight a predetermined boundary that will isolate all the eateries ranked with sentiment in the area. An advanced search should also be available to let users search by location and dish simultaneously.

The polygons that appear will be masked with textures of the searched food to imply their nature. For example, a search for Laksa would produce polygons that have orange textures mimicking that of Laksa gravy. For eateries serving more than one indexed dish, their polygon might shift to displaying textures of other dishes when the computer mouse is placed over them (e.g. peanut bits on dark sauce to denote Rojak). This might also be useful for food enthusiasts who want to be surprised by their search results and uncover what other dishes are recommended at their eatery of choice.

When users click on a polygon, which represents an eatery, a side bar displays the sentiment ranking of the eatery, as well as the review that was analyzed to give that result. The keywords identified in the text of these reviews will also be highlighted in red and green to denote if they were perceived by the sentiment analysis as ‘negative’ or ‘positive’ respectively. This allows the tool to be more transparent, and allows researchers to examine the data and tools used more critically.

Background/Context

There are many sentiment analysis applications available online – the most basic being a demo that uses the Natural Language Toolkit (NLTK) and is built on movie-review data developed by Bo Pang and Lillian Lee, which only gives a raw numerical score aggregating all the sentiment identified in a text, ranging from 1 (most positive) to 0 (most negative). Unfortunately this tool does not provide a breakdown of how the text was read and interpreted for its sentiment, and the numerical figure cannot be explored in further detail.

Tweet Sentiment Map by Daneshvar and Bar-Yam is a real-time interactive map using the Google Maps API to highlight the positivity and negativity of tweets from regions in New York. Each region is color-coded with a spectrum of colors from green (positive) to red (negative), and can be clicked on to see how many positive and negative tweets occurred there. However, not much other information is available – clicking on a region does not allow us to examine those tweets in greater detail.

Visualizations of sentiment analysis do exist – Tweet Sentiment Visualization by Healey and Ramaswamy grabs the past 100 or so tweets based on the hashtag being searched, and places them on a four-way graph to describe how positive and active they are. Each circle representing a tweet can also be clicked on to examine the full tweet and what words contributed to its sentiment score. This is far better as it allows researchers to examine what key words cause a tweet to be more positive or negative, as well as to discover cases of sarcasm (i.e. when a negative post has positive keywords). The tool is also a suitable means to study the Affective Norms for English Words (ANEW) dictionary.

A keyword extraction demo at Cortical.io has proven to be useful – it reviews the entire text based on both the frequency and the semantic meaning of each word. The process is a little dense for the layman to understand, but it essentially weighs the frequency of a word against its meaning (this allows the tool to omit non-unique vocabulary like "the", "him", and so on). An initial test with a collected corpus of 60 food reviews unearthed the prominence of the term 'chili' amongst other ingredients – as well as the term 'spicy', once common food ingredients (like "chicken" and "pork") were removed. This finding has given me some new ideas for further development of my proposed project (see 'Possible further developments').
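The frequency half of this process can be approximated with a simple word count. To be clear, the sketch below is not how Cortical.io works, since it ignores semantic meaning entirely; it is a plain frequency count with a stopword filter, and `STOPWORDS`, `top_keywords`, and the sample reviews in the usage note are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list standing in for "non-unique vocabulary".
STOPWORDS = {"the", "a", "an", "and", "was", "is", "it", "him", "her", "of", "to"}

def top_keywords(reviews, n=5, extra_stopwords=()):
    """Count words across all reviews, skipping stopwords, and return the n most common."""
    ignored = STOPWORDS | set(extra_stopwords)
    counts = Counter()
    for review in reviews:
        for word in re.findall(r"[a-z']+", review.lower()):
            if word not in ignored:
                counts[word] += 1
    return counts.most_common(n)
```

Passing common ingredients through `extra_stopwords`, for example `top_keywords(reviews, extra_stopwords=("chicken", "pork"))`, mirrors the step of removing them so that descriptive terms can surface.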

The NUS Natural Language Processing Group has a variety of text-analysis tools available for free – of note are BioKIT and IMS (It Makes Sense).

BioKIT was built using biomedical texts as data, and is capable of reading the semantic meanings of words in a text. Unfortunately, I have been unable to evaluate its usefulness – running the software requires knowledge of the Python language that I lack at present. If I were to utilize this, I would certainly need a basic lesson in Python to get started.

IMS, on the other hand, has an active demo. When reading a text, IMS parses and recognizes phrases and words in the text that have a specific function (e.g. adverbs). It can even detect negatives ("not hungry") – making it a powerful tool for pinpointing areas of particular interest in a text. Using IMS should also allow us to identify ambiguities within the text, like words with double meanings. The main drawback of using IMS is that it does not consolidate this information into a neat overview – it just lists out all the text that was entered, with red overlays to indicate words that can be explored further. For a corpus of 18,000 words like my food reviews, this is not very practical.

Many of the more powerful sentiment analysis tools are marketed as commercial products and services – because sentiment analysis is such a powerful factor in adjusting business practices to suit public opinion, it is often considered too lucrative to release into the public domain. Even the full list of the ANEW, developed by the University of Florida, is only available by request for academic researchers. In short, the available avenues of sentiment analysis are largely limited to the academic and commercial spheres.

Singapore food reviews are mostly aggregated on food review websites like HungryGoWhere. The website mostly relies on star ratings and “recommended” or “not recommended” votes (much like e-commerce websites such as Amazon). Many of the text reviews, however, are not carefully vetted – they often contain pseudo-expletives and tactless language (a number of negative reviews read more like personal complaints). As a result, it can be difficult to pick out objectively good places to eat, especially when some reviewers are simply too demanding and post a negative rant that pulls down the review scores of some eateries.

Searching for a single dish on HungryGoWhere also calls up a long list that progressively loads as a user scrolls downwards – this can be very tedious, and giving up after the 20th entry is quite understandable for many users. In short, much of the review information on the review website is not presented simultaneously for ease and flexibility of exploration, and many lesser-known stalls are overshadowed by eateries that are more publicly visible or more controversial among food enthusiasts.

How does my project differ?

For one, sentiment analysis has not been applied to Singapore’s culinary landscape before – local citizens have specific ideas about what makes eating at a location a good experience (e.g. attitude of staff, service standards, and quality of ingredients). A sentiment analysis tool called “FoodMood” does exist, but it only looks at the perception of dishes in general, not the eateries that serve those dishes. Fast food brands like KFC have had sentiment analysis performed on them before, but those studies examine the brand as a whole, and that includes corporate image and other matters not directly related to food.

I seek to make this project dependent on specific local knowledge instead – from plain reviews on the food review website HungryGoWhere. These pertain to plain hawker stores as well, which usually do not get much exposure as public brands. Allowing users to view the actual food reviews used for each polygon also allows for more interactivity (which is lacking in other sentiment analysis tools that merely act as aggregators of online sentiment).

Each review (this can range from a sentence to several paragraphs) will be counted as one unit of data - accompanied by a numeric sentiment score (ranging from 0 to 1).
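As a rough sketch of what one such unit of data could look like, a review might be stored as a small Python record. The field names, the reading of 0 as most negative and 1 as most positive, and the choice to average scores per eatery are all my own assumptions here, not a fixed design.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Review:
    eatery: str       # hypothetical field: name of the stall or restaurant
    dish: str         # hypothetical field: dish being reviewed
    text: str         # the full review, from one sentence to several paragraphs
    sentiment: float  # assumed scale: 0 = most negative, 1 = most positive

    def __post_init__(self):
        # Reject scores that fall outside the 0-to-1 range.
        if not 0.0 <= self.sentiment <= 1.0:
            raise ValueError("sentiment must lie between 0 and 1")

def average_sentiment(reviews):
    """Group reviews by eatery and average their sentiment scores."""
    scores = defaultdict(list)
    for review in reviews:
        scores[review.eatery].append(review.sentiment)
    return {eatery: sum(vals) / len(vals) for eatery, vals in scores.items()}

# Invented reviews for illustration only.
reviews = [
    Review("Stall A", "chicken rice", "Loved it.", 1.0),
    Review("Stall A", "chicken rice", "It was alright.", 0.5),
    Review("Stall B", "chicken rice", "Too oily for me.", 0.25),
]
print(average_sentiment(reviews))
```

One averaged value per eatery is the kind of number that could then drive the appearance of each polygon on the map.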

Using a three-dimensional map that can call up all indexed instances of a dish simultaneously also departs from how HungryGoWhere loads its reviews. By providing users with all the information required at once, they can zoom in to any polygons that interest them immediately, without having to wait for a long list of poorly reviewed stalls to appear. This makes the visualization more of a bird's-eye view.

Possible further developments

The proposed final product requires a proper, expansive corpus for Python and the NLTK suite to process into usable training data for a sentiment analysis tool – this can require up to 5000 ‘positive’ and 5000 ‘negative’ bodies of text. At the present moment, it is impractical to mine the whole of Hungrygowhere.com for all of its reviews and categorize them by dish and location. A proper realization of my proposed prototype would require such a corpus to be collected and categorized.

Other food review blogs could be considered for inclusion in the corpus as well, given that some are quite professionally maintained as compared to the messier reviews on HungryGoWhere.

Of course, expanding such a concept to the rest of the world would be the ultimate dream. To preserve the importance of local knowledge in such a tool, reviews would have to be sourced mostly from the locals living near an eatery. For example, examining a curry rice eatery in the basement of Sunshine Plaza, Ikebukuro (of Tokyo), would require responses from Tokyo residents living nearby. This would be a task of monumental difficulty – but if Google has already charted out nearly all of the world’s roadways and landscapes, then the resources do exist.

The use of cortical.io to reveal 'chili' as one of the keywords in my collected food reviews holds potential for an alternate project where NLTK is used to identify instances of the words "chili" and "spicy" as indicators of how spicy the food of an eatery is. This might be considerably more practical – instead of weighing reviews, a sentiment analysis tool could be re-defined to understand chili-related terms as 'positive' hits. The same concept of a Google Maps API could be used, but the polygons that rise from the map can now be more uniform in appearance if all of them are to represent 'spiciness' and presence of good chili.
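A minimal sketch of this counting idea, using only the Python standard library as a stand-in for NLTK. The term list, the function names and the sample reviews are assumptions made for illustration; a real version would need a much fuller vocabulary of chili-related words.

```python
import re
from collections import defaultdict

# Assumed list of chili-related terms; both common spellings are included.
SPICE_TERMS = {"chili", "chilli", "spicy"}

def spice_hits(text):
    """Count chili-related words in one review."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for word in words if word in SPICE_TERMS)

def spiciness_by_eatery(reviews):
    """Average number of chili-related hits per review, for each eatery.

    `reviews` is an iterable of (eatery_name, review_text) pairs.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for eatery, text in reviews:
        hits[eatery] += spice_hits(text)
        totals[eatery] += 1
    return {eatery: hits[eatery] / totals[eatery] for eatery in totals}

# Invented reviews for illustration only.
sample = [
    ("Stall A", "Very spicy, the chili is great"),
    ("Stall A", "Nice rice"),
    ("Stall B", "Bland soup"),
]
print(spiciness_by_eatery(sample))
```

Dividing by the number of reviews keeps heavily reviewed eateries from looking spicier simply because more people wrote about them.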

#nm3213

NM3213 Week 10 Homework

*NOTE* It seems as though this task was already assigned to us by Professor Nancy during the recess week - In any case, I will re-post it for our homework task this week for posterity’s sake.

------------------------------------------------------------------

1. Think about and describe - what are these projects trying to do?

They are all attempting to re-examine and reconfigure historical entities and events through the use of digital humanities tools. More importantly, they are trying to create links between these entities and events, so as to create an alternative, unified narrative from the mere inferences made by basic academic literature. These narratives are ones that can be explored and added to over time in an ongoing process of research and digitization.

Aegaron offers users multiple types of drawings for ancient Egyptian architecture, as opposed to a single sketch image.

Quantifying Kissinger raises new questions about the US administration by examining the phone calls that Kissinger made when he was still establishing US diplomacy in a daunting part of history (Cold War) - and how those calls focused on topics not made public at the time.

FreedomsRing combines Martin Luther King’s speeches into a dramatic narrative that flows chronologically - but also shows parts of his speeches that have been redacted for conservative purposes.

SlaveVoyages gives a global perspective of the slave trade, where previous databases only focused on single countries.

Finally, Kindred Britain attempts to link together historic figures’ lives as one singular narrative by showing the potential connections between them.

————————————————————————————————–

2. Are the projects doing something that a book or article couldn’t do? If so, what?

Yes, they are - they are giving users the agency to examine data in whatever order they see fit. This is certainly the case with Kindred Britain, where viewers can pick any historical figures in the United Kingdom to see how they might be connected, as opposed to the website having a static dataset for the edge between two particular nodes. In short, these projects are ceding more independence to users and not forcing them down a linear path of reasoning with the data presented. FreedomsRing also chains together so many visual and audible elements in a vertical fashion that would not be possible in a typical publication.

————————————————————————————————–

3. How would you classify each of these projects (e.g., history, literature, philosophy …)?

Aegaron would be historical - it focuses on somewhat objective visualizations of once-existing Egyptian Architecture.

Quantifying Kissinger is most certainly political in nature - examining what topics concerned the USA at the time, and probably even recently.

FreedomsRing is both political and philosophical - Martin Luther King’s speeches have some religion poured in, and the entire presentation plays out as an argument or rhetoric for the emancipation of people lacking civil rights (in particular, African-American citizens in the USA).

SlaveVoyages can be classified as history - it examines the movements of slaves in the past according to known sailing routes.

Kindred Britain can be considered historical as well - it tracks family trees and also working relations between writers in Britain’s past.

————————————————————————————————–

4. What do you like about these projects?

Aegaron peers into a culture that has not been examined for its architecture before - the data looks original and fresh.

Quantifying Kissinger gives an exclusive perspective into international politics that has never quite been achieved before.

FreedomsRing runs solely on Martin Luther King’s speeches, without any overarching narration breaking up the man’s message - it gives the impression that he is speaking directly to viewers, not just being presented.

SlaveVoyages has a large database of African slaves with all of their travel details available - this is data that raises many questions about the USA’s past as well.

Kindred Britain almost feels like the historical equivalent of social networks nowadays - people are connected through other people visually - this gives us a perspective on our own modern social networks as well.

————————————————————————————————–

5. What do you dislike or how would you improve them?

Aegaron appears to be unavailable to the public currently - the most I can suggest is that they provide a timeline of how these buildings changed over time.

Quantifying Kissinger is overladen with information - the graphs might be hard for laymen to interpret meaningfully apart from the larger keywords. A short discussion could be included with each graph to summarize what it illustrates and why that illustration is important.

FreedomsRing is quite abstract and does not allow for much exploration into the context of each proclamation made by Martin Luther King. If it were made into an interactive narrative, each utterance should have some elaboration on the topic (e.g. elections, slave trade) that he is making reference to or implying.

SlaveVoyages is very comprehensive - but it also does not attract laymen into exploring it further. If possible, the website should publish thought-provoking articles on some of the revelations that its data has uncovered - so as to encourage visitors to come back for more.

Kindred Britain unfortunately lacks substance - it does not examine the achievements of each of the people listed, and so some of the connections made between people seem superfluous and based solely on blood relations. It would be much more interesting to see how these people have worked with one another as well - what common academic work did they achieve? Have they ever collaborated in real life?

#nm3213

NM3213 Week 9 Homework

[Characters who talked to one another]

[Characters who appeared in the same scene]

I looked at "The Disappearing" by Fatima Bhutto - what was interesting was that the main character of Delawar talks to animals, although they do not necessarily respond to him in kind.

The two graphs very clearly visualize that Delawar is at the centre of the narrative - he speaks to nearly everyone and is spoken to by nearly everyone. The same goes for his appearances in scenes with other characters. What struck me as surprising was that the author, who was implied to be married to Nowruz, Delawar's father, does not appear with him in a single scene or even converse with him! In this sense, their relationship of husband and wife is absent because I based my edges only on who talked to whom, and who appeared with whom. One other limitation I noticed was that I could not really perceive the relationships being described...without my labels above, other people might not be able to do so either!
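The idea of a character sitting "at the centre" can also be checked numerically by counting how many edges touch each node. The sketch below is only an illustration with placeholder characters A to D, not the actual edge list from my graphs, and it treats every conversation as an undirected edge.

```python
from collections import Counter

def degree(edges):
    """Count how many edges touch each character (undirected)."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

# Placeholder "who talked to whom" edges, invented for illustration.
talk_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

degrees = degree(talk_edges)
central, links = degrees.most_common(1)[0]
print(central, links)
```

A pair of characters with no edge between them simply never shows up in the list, which is exactly how a relationship like a marriage can vanish when edges are based only on conversations or shared scenes.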

#nm3213

Week 8 - comments on Cognitive and Aesthetic Issues in Data Visualization

Hi Gabriel,

Thanks for your insightful post, I think the example you gave is very apt in addressing the importance of human perception in designing data visualizations! I think having a clear research purpose before deciding how the data should be arranged is a valid starting point, but I do wonder how we can go about reconciling with the fact that the analysis of messy and large datasets can sometimes change the focus of our research when undiscovered relationships between data points start to emerge.

In that case, should the research purpose inform the data collection, or should the data collection inform the research purpose?

Cheers,

Sharmaine

#nm3213 #submission

(NM3213, Weekly Blog Post for Week 8 – Cognitive and Aesthetic Issues in Data Visualization)

Image from: http://www.forbes.com/sites/naomirobbins/2012/05/30/winner-of-the-bad-graph-contest-announced-2/#5e9565813307

And this is why Microsoft Excel cannot be relied on.

Before today’s blog post, I understand that Nicole and Hilary will be taking us through concepts about the interfaces and aesthetics of visualizations, as well as the many roadblocks that visualizations present. In any case, I wish to contribute some of my thoughts - hopefully they will find something helpful here. :)

Back in 2012, Naomi Robbins of Forbes held a user-submission contest for some of the worst graph solutions to ever set foot on this Earth - the above is one such rotten winner. What’s wrong with it?

Well...what’s RIGHT with it?

Kirschenbaum observes that an interface ultimately cannot be separated from its aesthetics, and I personally feel likewise for data visualizations - as they are technically an interface through which viewers access the data we are presenting.

Kirschenbaum illustrates a point with TextArc - a text-analysis program that arranges words from a text in a radial, circular formation, with the most frequently used words enlarged at the center. While text analysis tools have been around long before TextArc, the tool received a lot of public attention for its viewer-centered visualizations that expressed keywords of a text in a manner that was natural to the human eye.

At the same time, Jesse James Garrett emphasizes five stages of considerations when developing an interface:

1. Strategy (what an interface aims to do for users)
2. Scope (the functional aspects of an interface)
3. Structure (how the user moves through and makes sense of data)
4. Skeleton (how visual elements are arranged to help comprehend data)
5. Surface (how the product looks)

These, in my personal understanding, are strong considerations to follow in that particular order - one should consider what kind of story one wants to tell with a visualization, before deciding how the data should be arranged, and so forth.

Now. Back to this 3D Excel graph.

Although this is an example graph, we can get an inkling of its strategy - it wants to draw comparisons between five different individuals, and perhaps between the compass orientations for each of these individuals.

The scope is also present in some capacity - by placing bar graphs of different heights next to each other, we can compare their heights and get an understanding of how Peter compares to Tom in some characteristic, and so forth.

Once we hit structure, however, we encounter a MAJOR roadblock. How do users actually decipher this graph? The main issue is that the graph is in isometric 3D - not real, two-eyed 3D where depth is perceivable. The major error in this graph’s design choice is a lack of consideration for the human being’s perception of three dimensions!

This messiness cascades down to the skeleton - in order to improve on this issue, the graph attempts to place numbers to denote the value of each bar more clearly. Unfortunately, this probably makes it worse - the green bars’ numbers actually leak into other graphs!

Thus, we finally get to its surface...

...andddddd it looks like fresh road-kill.

The graph attempts to inform viewers of so much information through an extra dimension, but has not made considerations about how people perceive depth, and does not have the functional capacity to deal with that issue (i.e. allow free rotation of the graph). This all culminates in a data visualization that lacks focus or principle - let alone a purpose.

Given the open nature of our lectures from here on out, I hope this has been a fun dive into my musings on interface and aesthetic issues within data visualizations. All the best to Nicole and Hilary for their lessons - you have my support! ^w^

#nm3213

NM3213 Week 7 Homework (Data Visualization Exercise)

I decided to work on the Amazon and Coca-Cola business figures! I've grouped them into two interactive tables in Google Fusion Tables: the first one compares their closing prices for 10 years, the second one compares their daily percent returns over the same period!

Closing Prices Compared

Daily Percent Returns Compared
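For anyone curious how the second table relates to the first, here is a small sketch assuming the usual definition of a daily percent return: the change from the previous day's closing price, as a percentage of that previous close. The function name and the prices are made up for illustration and are not the real Amazon or Coca-Cola figures.

```python
def daily_percent_returns(closes):
    """Percent change between each closing price and the one before it."""
    return [
        (today - previous) * 100 / previous
        for previous, today in zip(closes, closes[1:])
    ]

# Invented closing prices for three consecutive trading days.
closes = [100.0, 110.0, 99.0]
print(daily_percent_returns(closes))
```

The list of returns is always one entry shorter than the list of prices, since the first day has no previous close to compare against.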

#nm3213

NM3213 Week 7 Task 2: Commenting on 5 Digital Humanities Projects

1. Think about and describe - what are these projects trying to do?

Aegaron offers users multiple types of drawings for ancient Egyptian architecture, as opposed to a single sketch image.

FreedomsRing combines Martin Luther King’s speeches into a dramatic narrative that flows chronologically - but also shows parts of his speeches that have been redacted for conservative purposes.

SlaveVoyages gives a global perspective of the slave trade, where previous databases only focused on single countries.

Finally, Kindred Britain attempts to link together historic figures’ lives as one singular narrative by showing the potential connections between them.

————————————————————————————————–

2. Are the projects doing something that a book or article couldn’t do? If so, what?

————————————————————————————————–

3. How would you classify each of these projects (e.g., history, literature, philosophy …)?

Aegaron would be historical - it focuses on somewhat objective visualizations of once-existing Egyptian Architecture.

Quantifying Kissinger is most certainly political in nature - examining what topics concerned the USA at the time, and probably even recently.

SlaveVoyages can be classified as history - it examines the movements of slaves in the past according to known sailing routes.

Kindred Britain can be considered historical as well - it tracks family trees and also working relations between writers in Britain’s past.

————————————————————————————————–

4. What do you like about these projects?

Aegaron peers into a culture that has not been examined for its architecture before - the data looks original and fresh.

Quantifying Kissinger gives an exclusive perspective into international politics that has never quite been achieved before.

SlaveVoyages has a large database of African slaves with all of their travel details available - this is data that raises many questions about the USA’s past as well.

————————————————————————————————–

5. What do you dislike or how would you improve them?

Aegaron appears to be unavailable to the public currently - the most I can suggest is that they provide a timeline of how these buildings changed over time.

#nm3213