Conclusion
Excel users have a strong mental model of how data analysis works, and this makes learning to program more difficult. However, learning to program will allow you to do things that you can’t do easily in Excel.

@jrlittle
A quick introduction to Apache Spark
The Long Read: The ability of statistics to accurately represent the world is declining. In its wake, a new age of big data controlled by private companies is taking over – and putting democracy in peril
“Statistics began life as a tool through which the state could view society, but gradually developed ... [to become] one of many pillars of liberalism and Enlightenment.”
“The declining authority of statistics – and the experts who analyse them – is at the heart of the crisis that has become known as ‘post-truth’.”
In contrast, “Data Analysts’ skills are often not developed [from or] for the study of society.”
As datafication becomes a normative foundation, “... it is not just the quantity of data that is different. It represents an entirely different type of knowledge, accompanied by a new mode of expertise.”
... there is no fixed scale of analysis (such as the nation) nor any settled categories (such as “unemployed”).
A “post-statistical” society questions whether indicators such as GDP and unemployment continue to carry political clout ... if they don’t, it won’t necessarily herald the end of experts, less still the end of truth. The question to be taken more seriously, now ..., is where the crisis of statistics leaves representative democracy.
After 174 years, John Little is closing its last department store in Singapore. The remaining outlet in Plaza Singapura will shutter by the end of December. Read more at straitstimes.com.
That’s it. I’m closing all my Singapore options. Retail storefronts just can’t compete with the sexy allure of data science!
R has found its way into a good number of news groups who do data journalism. Andrew Flowers for FiveThirtyEight talks about how they use the statistical computing language throughout their workflo…
22-minute video (you can alter the playback speed). Informative and interesting.
Quick Start to R for Data Wrangling
Introduction to R - interactive and online tutorial by DataCamp
Data Wrangling with R & RStudio (dplyr & tidyr) -- Video
Introduction to dplyr
Quick Start to Exploratory.io (Exploratory.io is a User Interface [UI] to pre-configured R with dplyr, tidyr, and ggplot2, aka Hadleyverse)
Quick Start Introduction to Tableau
Tableau for Students
Tableau Guide
Tableau Introduction and Workshop with data (Video)
Lynda Video for Tableau
Created by libjohn
Workshops by Duration
Registrations by Workshops Sessions Offered
Registrant's Academic Status by Academic Year
Total Workshop Sessions offered by Academic Year
Unique Workshops offered by Academic Year
I recently had a chance to play with Exploratory.io. The tool/website/desktop application bills itself as “an interactive and reproducible real data wrangling and analysis experience powered by R and visualization”.
Upon download, what you find is a very nice drag-and-drop interface combined with a cloud-based sharing platform. The tool masks much of the command-line complexity of R and simplifies the wrangling process when using the dplyr and ggplot2 packages. (I think a few additional packages are also loaded.) It does this without removing the command line, so tweaking is still possible.
Of course, with simplification comes a loss of customization. I think it’s a fair trade-off. It is perhaps short of being a total and unbridled power tool, but it delivers a quicker start and a gentler learning curve by putting many of the necessary data munging commands at your fingertips, relieving you of the need to memorize the arcana of [R] commands and switches.
For the non-data scientist (and perhaps also for a data scientist), this provides a welcome shortcut on the way to simple analysis.
Recommended.
Brief Definitions from ShellyPalmer on Data Science
Brevity from “What Do You Do With Data” referenced in another ShellyPalmer article which notes Data Science literacy does not demand fluency ...
Transformational Analytics
Aggregation – a class of techniques used to summarize data including basic statistics such as mean and weighted averages, median, Gaussian distribution and standard deviation. Other aggregation techniques include probability distribution fitting (the repeated measurement of variable phenomena – remember “method of moments” and “maximum likelihood” from Stats class?) and good, old-fashioned plotting points on a graph.
Enrichment – a set of techniques employed to add information to, or fill gaps in, a data set – for example, adding zip + 4 to five-digit zip codes, appending purchase data or credit scores or even simply standardizing prefixes or suffixes.
Processing – everything from data munging or data wrangling (the cleaning up of data) to entity extraction (identifying key terms in unstructured data that have value) to true feature extraction (building derived values from existing data).
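To make the aggregation entry above a little more concrete, here is a minimal Python sketch of the basic summary statistics it names (mean, weighted average, median, standard deviation). The attendance numbers and weights are made up for illustration, and `weighted_mean` is a hypothetical helper, not something from the quoted article.

```python
import statistics

# Hypothetical data: attendance at five workshop sessions, and a weight
# for each session (illustrative values only, not from any real source).
attendance = [12, 15, 9, 20, 14]
weights = [1, 2, 1, 3, 1]


def weighted_mean(values, value_weights):
    """Return the weighted average of values (hypothetical helper)."""
    if len(values) != len(value_weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(value_weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, value_weights)) / total_weight


# Collapse the raw numbers into a handful of summary figures.
summary = {
    "mean": statistics.mean(attendance),
    "weighted_mean": weighted_mean(attendance, weights),
    "median": statistics.median(attendance),
    "stdev": statistics.stdev(attendance),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The point is only that aggregation reduces many observations to a few numbers; which summary is appropriate depends on the data and the question.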
Learning Analytics
Regression – a common way to predict the future based on the past by exploring spatial relationships. There are many types of regression techniques, but all share the common goal of predicting the value of a dependent variable where partial related variables are available, or estimating effects of an explanatory variable on the dependent variable.
Clustering – is just what it sounds like. The goal is to group a set of data points so that the ones with the most in common are closest together. Importantly, clustering is not a specific formula; it is accomplished by using a series of algorithms. And it is almost always an iterative process.
Classification – algorithms and other techniques used to identify to what category or subpopulation a data point belongs. When speaking about classifications, you must be careful to also identify the discipline you are speaking about. Statisticians use the term differently than practitioners of machine learning do.
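As a rough illustration of the regression entry above, the sketch below fits a straight line to past values by ordinary least squares and uses it to predict the next one. This is only one of the many regression techniques the definition alludes to; the registration counts are invented, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: workshop registrations over five academic years.
years = [1, 2, 3, 4, 5]
registrations = [110, 130, 150, 170, 190]

slope, intercept = fit_line(years, registrations)

# Predict the dependent variable (registrations) for year 6.
predicted_year_6 = slope * 6 + intercept
print(f"slope={slope:.1f}, intercept={intercept:.1f}, year 6 ~ {predicted_year_6:.1f}")
```

Real data will not sit exactly on a line, so in practice you would also want to look at how far the points fall from the fitted values before trusting a prediction.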
Predictive Analytics
Simulation – a set of techniques used to create a simulated environment for testing predictive models.
Optimization – a wide-ranging tool set for making optimal selections from a set of alternatives. Commonly used for pricing and maximizing yield.
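For the optimization entry above, a toy pricing example may help. The sketch below searches a set of candidate prices for the one that maximizes revenue under an assumed linear demand curve. Both the demand function and its numbers are hypothetical, and a brute-force search is just the simplest possible stand-in for the wider tool set the definition describes.

```python
def expected_demand(price):
    """Hypothetical linear demand curve: fewer units sell as price rises."""
    return max(0, 100 - 2 * price)


def revenue(price):
    """Revenue at a given price under the assumed demand curve."""
    return price * expected_demand(price)


# The set of alternatives: whole-dollar prices from 0 to 50.
candidate_prices = range(0, 51)

# Optimal selection: the candidate with the highest revenue.
best_price = max(candidate_prices, key=revenue)
best_revenue = revenue(best_price)

print(f"best price: {best_price}, revenue: {best_revenue}")
```

With a small, discrete set of alternatives, checking every option is perfectly reasonable; larger or continuous problems are where dedicated optimization methods earn their keep.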