Entrepreneurship n stuff @namityadav - Tumblr Blog

In this course, you will train a model to tackle a regression problem and predict house prices using TensorFlow.

Just completed this free 49-minute hands-on session on our platform https://rhyme.com. Here, we train a model to tackle the regression problem of predicting house prices using TensorFlow.

#rhyme #machine learning #supervised learning #regression #tensorflow

In this course, you will learn to classify movie reviews as either positive or negative with the help of TensorFlow.

Just completed this free 52-minute hands-on session on our platform https://rhyme.com. This session is about classifying movie review text as either positive or negative with the help of TensorFlow.

#rhyme #machine learning #classification #tensorflow #supervised learning

In this course, you will learn how to perform image classification with TensorFlow. You will train a neural network model to classify images of clothes.

Just completed this free 58-minute hands-on session on “Basic Image Classification with Tensorflow” on our platform https://rhyme.com. In this session, we learn how to train a neural network to perform image classification with TensorFlow.

#rhyme #machine learning #classification #tensorflow #supervised learning

Welcome to this course on the Introduction to Jupyter Notebooks. In this course, we will get familiar with one of the best open-source software platforms for interactive computing. We will learn what Jupyter Notebooks are used for and explore a number of advanced and powerful features that we can leverage in our own projects. A basic familiarity with Python is assumed. It is the primary language we will use to explore Jupyter. We hope you enjoy this course as much as we loved creating it!

Completed this free 30-minute hands-on session on Jupyter Notebooks on our platform https://rhyme.com.

#rhyme #jupyter #python #machine learning

Machine Learning - Introduction

Time to start blogging again. I am starting to study Machine Learning, and I am going to try to write up my notes on what I learn as blog posts here. Hopefully this helps someone, and hopefully others will tell me if they notice any mistakes in my understanding.

Machine Learning vs Data Mining:

Machine Learning is the subset of Artificial Intelligence that focuses on using data for self-learning. Note that AI itself doesn't need to involve learning at all.

Data Mining is digging into large amounts of data using ML techniques to discover patterns that were not immediately apparent. Unlike ML, DM is done by a person, using ML tools.

Terms:

Attribute: Variable (e.g., mileage).

Feature: Variable + Value (e.g., mileage = 15K).

Label / Response: Dependent variable / attribute.

Predictor: Independent variable / attribute.

Learning Styles:

A) Supervised Learning

Learn a model from labeled training data. Use it to make predictions on unseen or future data. For example, train a spam filter on emails marked spam or not-spam, and the filter will then predict whether future emails are spam.

Classification: Classify data into discrete class labels. E.g., this email is spam vs not (binary classification) or this handwritten letter is 'X' as opposed to any other letter between A-Z (multi-class classification).
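
To make the spam example concrete, here is a minimal sketch of binary classification with k-Nearest Neighbors in plain Python. The two features (link count and ALL-CAPS word count) and all the numbers are made up for illustration; a real spam filter would use far richer features.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    nearest = sorted(train, key=lambda item: squared_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up features per email: (number of links, number of ALL-CAPS words)
labeled_emails = [
    ((8, 12), "spam"), ((6, 9), "spam"), ((7, 15), "spam"),
    ((0, 1), "not-spam"), ((1, 0), "not-spam"), ((2, 1), "not-spam"),
]

print(knn_predict(labeled_emails, (5, 10)))  # spam
print(knn_predict(labeled_emails, (1, 2)))   # not-spam
```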

Regression: The response is a continuous value. Given a number of predictor variables and a continuous response variable, try to estimate the relationship between them so that you can predict responses in the future. E.g., given H hours of studying, your SAT score will be S. Note: Regression can be used for classification as well. E.g., the response can be a value that corresponds to the probability of belonging to a given class (like a 20% chance of being spam).
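
The hours-to-score example can be sketched as a one-predictor least-squares fit. This is plain Python with invented, perfectly linear data, so treat it as an illustration of the idea rather than a realistic model.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    slope, intercept = model
    return slope * x + intercept

# Invented data: hours of studying (predictor) and SAT score (response)
hours = [10, 20, 30, 40, 50]
scores = [1000, 1100, 1200, 1300, 1400]

model = fit_line(hours, scores)
print(predict(model, 35))  # 1250.0
```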

Example Supervised Learning Algorithms: k-Nearest Neighbors, Linear Regression, Logistic Regression, Support Vector Machines, Decision Trees and Random Forests, Neural Networks.

B) Unsupervised Learning

Deal with data having unknown structure or no labels and extract meaningful information without any known outcomes or reward signals.

Clustering: Organize a pile of information into meaningful subgroups (clusters) without having any prior knowledge of the nature or types of these clusters. Clustering is also called "Unsupervised Classification." E.g., marketers may want to cluster customers into subgroups based on their behavior without knowing exactly what makes those customers similar. If you use a Hierarchical Clustering algorithm, it may also subdivide clusters into sub-clusters.
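
As a rough sketch of clustering, here is a tiny k-means (Lloyd's algorithm) in plain Python. I am assuming k-means as the algorithm and seeding the centroids with the first k points to keep the run deterministic; the customer data is invented.

```python
def kmeans(points, k, iterations=10):
    """Group points into k clusters; centroids are seeded with the first k points."""
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, clusters

# Invented customers: (visits per month, average spend in dollars)
customers = [(1, 10), (9, 95), (2, 12), (1, 8), (10, 100), (8, 90)]

centroids, clusters = kmeans(customers, k=2)
print(clusters)
```

Nothing told the algorithm what a "low spender" or "high spender" is; the two groups fall out of the distances alone.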

Dimensionality Reduction: Compressing data into a lower-dimensional subspace (i.e., removing features / dimensions) by reducing noise, without any prior knowledge of which features / dimensions can be removed. A lower-dimensional space can also be achieved if a few dimensions are highly correlated (e.g., a car's mileage and its age). Dimensionality Reduction can be the first step that prepares data for another ML step (e.g., supervised learning).
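
The mileage-and-age case can be sketched with a simple correlation filter. Note this is the crudest form of dimensionality reduction (dropping a redundant column), not a projection method such as PCA; the 0.95 threshold and the car data are assumptions for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def drop_correlated(columns, threshold=0.95):
    """Keep a column only if it is not strongly correlated with one already kept."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept

# Invented car data; mileage tracks age almost perfectly
cars = {
    "age_years": [1, 3, 5, 7, 9],
    "mileage_k": [12, 35, 61, 80, 110],
    "doors": [4, 2, 4, 4, 2],
}

print(list(drop_correlated(cars)))  # ['age_years', 'doors']
```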

Anomaly Detection: Finding anomalies. E.g., unusual credit card transactions, catching manufacturing defects, etc. Anomaly Detection can also be the first ML step that prepares data for another ML step (e.g., supervised learning) by removing outliers from the data.
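
A minimal sketch of anomaly detection on made-up transaction amounts. I am assuming a median-based "modified z-score" here rather than the usual mean-and-standard-deviation version, because in a small sample a single huge value inflates the standard deviation enough to hide itself. The 3.5 cutoff is a commonly quoted rule of thumb, not something from my notes.

```python
import statistics

def find_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (based on the median) exceeds threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread around the median, so nothing can be scored
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]

# Made-up card transaction amounts in dollars
amounts = [25, 30, 28, 27, 31, 26, 29, 950]

print(find_outliers(amounts))  # [950]
```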

Association Rule: Discover relations between attributes. E.g., sales logs show that people who buy ketchup also buy potato chips.
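
The ketchup-and-chips rule can be scored with two standard measures, support and confidence. This is a small sketch over invented shopping baskets; real association-rule mining (e.g., Apriori) also has to search for candidate rules, which is skipped here.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Of the transactions containing antecedent, the fraction that also contain consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Invented sales log, one set of items per purchase
baskets = [
    {"ketchup", "chips", "soda"},
    {"ketchup", "chips"},
    {"ketchup", "bread"},
    {"chips", "soda"},
    {"ketchup", "chips", "bread"},
]

print(support(baskets, {"ketchup", "chips"}))
print(confidence(baskets, {"ketchup"}, {"chips"}))
```

Here 3 of the 5 baskets contain both items, and 3 of the 4 ketchup baskets also contain chips.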

C) Reinforcement Learning

Develop a system (agent) that improves its performance based on interactions with the environment. The environment typically includes a reward signal. The agent learns a series of actions that maximizes this reward signal through a mix of trial and error and deliberate planning (by the programmer). E.g., learning to play chess by treating the board (and game rules?) as the environment, and winning or losing the game as the reward signal.
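
Chess is far too big for a short example, so here is a much simpler stand-in: an epsilon-greedy agent on a multi-armed bandit. It has no board state at all, only actions and a reward signal, but it does show the trial-and-error loop. The reward probabilities, step count, and epsilon are all assumptions.

```python
import random

def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn the average reward of each action by trial and error."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    values = [0.0] * len(reward_probs)  # running average reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))  # explore a random action
        else:
            action = values.index(max(values))         # exploit the best estimate
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

# Hidden from the agent: the chance that each action pays a reward of 1
values, counts = run_bandit([0.2, 0.5, 0.8])

print("estimated values:", [round(v, 2) for v in values])
print("times each action was tried:", counts)
```

The agent is never told which action is best; it ends up choosing the highest-paying one most of the time purely from the rewards it observed.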

Other Notes:

Matrix item: To represent data in a Matrix, I will use M(i,j) to indicate an individual item in the matrix where i represents the training sample (think row) and j represents feature / dimension (think column).

Vector: One row of training sample or a column of one feature is called a vector (row or column vector).
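
A quick sketch of the M(i,j) notation using nested Python lists. The car data is invented, and note that Python counts rows and columns from 0.

```python
# Rows are training samples, columns are features: [mileage, age in years]
M = [
    [15000, 3],
    [42000, 6],
    [88000, 11],
]

def item(matrix, i, j):
    """M(i,j): feature j of training sample i."""
    return matrix[i][j]

def row_vector(matrix, i):
    """All features of training sample i."""
    return list(matrix[i])

def column_vector(matrix, j):
    """Feature j across all training samples."""
    return [row[j] for row in matrix]

print(item(M, 1, 0))        # 42000
print(row_vector(M, 2))     # [88000, 11]
print(column_vector(M, 1))  # [3, 6, 11]
```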

Preprocessing: Preprocessing is the first step of machine learning. The goal is to give the raw data the shape and form that optimizes effectiveness of a learning algorithm. Preprocessing may consist of randomization (randomize rows to remove any biases), feature extraction (find meaningful features), feature scaling (scale various features to a range of 0 to 1 or to a normal distribution with zero mean and unit variance), dimensionality reduction (remove irrelevant or highly correlated features), and division (randomly divide data into training and test sets).
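
Three of these preprocessing steps (both kinds of feature scaling, plus randomize-and-divide) can be sketched in plain Python. The 25% test fraction, the fixed seed, and the mileage values are assumptions for illustration.

```python
import random
import statistics

def min_max_scale(column):
    """Rescale values to the range 0 to 1."""
    low, high = min(column), max(column)
    return [(v - low) / (high - low) for v in column]

def standardize(column):
    """Rescale values to zero mean and unit variance."""
    mean = sum(column) / len(column)
    spread = statistics.pstdev(column)
    return [(v - mean) / spread for v in column]

def shuffle_and_split(rows, test_fraction=0.25, seed=0):
    """Randomize row order, then divide into (training set, test set)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]

mileage = [15000, 42000, 88000, 61000]  # invented values
print(min_max_scale(mileage))
print(standardize(mileage))

train, test = shuffle_and_split(range(8))
print(len(train), len(test))  # 6 2
```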

The best ML algorithm: There's no single all-powerful ML algorithm. Different algorithms have different biases and assumptions. So, one should pick all relevant algorithms, try them with the training and test sets, and pick the one performing best for that kind of data. Note: Even within an algorithm, there are going to be configurable parameters (think knobs) for tweaking the algorithm for further maximizing effectiveness.
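
Here is a small sketch of that "try them all and keep the winner" loop. The candidates are a mean-only baseline and a nearest-neighbor regressor with two settings of its knob k; the candidates, the data, and the use of mean squared error as the yardstick are all my assumptions.

```python
def fit_mean(train):
    """Baseline: always predict the average training response."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_knn(train, k):
    """Predict the average response of the k training points nearest to x."""
    def model(x):
        nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return model

def mean_squared_error(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def pick_best(candidates, train, test):
    """Fit every candidate on train, score it on test, return (best name, all scores)."""
    scores = {name: mean_squared_error(fit(train), test) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores

train = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0), (5, 10.0), (6, 12.0)]
test = [(2.4, 4.8), (4.6, 9.2)]

candidates = {
    "mean": fit_mean,
    "1-nn": lambda data: fit_knn(data, k=1),
    "2-nn": lambda data: fit_knn(data, k=2),  # same algorithm, different knob setting
}

best, scores = pick_best(candidates, train, test)
print(best)  # 2-nn
```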

#machine learning #supervised learning #unsupervised learning #reinforcement learning

Fixing common git-commit mistakes using pre-commit hook

There are two mistakes that I make fairly regularly in git-commit.

(1) Accidentally commit code directly to a production branch (because I sometimes forget to change branches after the last deploy).

(2) Accidentally add a big binary file. Big binary files are one area where git fails miserably. It’s better to use something like git-annex, git-lfs, etc. to git-commit just the hashes and keep the binaries elsewhere.

The code here uses git’s pre-commit hook to warn me about these two mistakes. BTW, I’ve added this pre-commit in a template directory so that it gets copied to all of my repos.

https://gist.github.com/namit/f74376e5c80d22b7aa0c
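
For readers who want the general shape of such a hook without opening the gist, here is a rough Python sketch of the idea. It is not the contents of the gist. The protected branch names and the 5 MB limit are hypothetical, and this version blocks the commit by returning a non-zero exit code, which is one possible policy.

```python
#!/usr/bin/env python3
import subprocess

PROTECTED_BRANCHES = {"master", "production"}  # hypothetical branch names
MAX_FILE_BYTES = 5 * 1024 * 1024               # hypothetical 5 MB limit

def git(*args):
    """Run a git command and return its stdout as text."""
    return subprocess.check_output(["git", *args], text=True)

def current_branch():
    return git("rev-parse", "--abbrev-ref", "HEAD").strip()

def staged_file_sizes():
    """Map each staged (added or modified) path to its staged size in bytes."""
    listing = git("diff", "--cached", "--name-only", "-z", "--diff-filter=AM")
    names = [name for name in listing.split("\0") if name]
    return {name: int(git("cat-file", "-s", ":" + name)) for name in names}

def problems(branch, sizes, protected=PROTECTED_BRANCHES, limit=MAX_FILE_BYTES):
    """Return a list of human-readable warnings; empty means the commit looks fine."""
    found = []
    if branch in protected:
        found.append(f"committing directly to protected branch '{branch}'")
    for name, size in sorted(sizes.items()):
        if size > limit:
            found.append(f"{name} is {size} bytes (limit {limit})")
    return found

def main():
    found = problems(current_branch(), staged_file_sizes())
    for message in found:
        print("pre-commit warning:", message)
    return 1 if found else 0  # a non-zero exit status makes git abort the commit

# When saved as an executable .git/hooks/pre-commit, end the file with:
#     raise SystemExit(main())
```

If the hook blocks a commit you really do want, `git commit --no-verify` skips it.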

Computer Science degree from The University of Self Study

Thanks to the great work done by many innovative educators, and entrepreneurs, there's now a lot of computer science content on a variety of platforms. And a lot of generous educators and students have created very insightful and exciting curricula, which allow anyone interested in Computer Science to learn or revise most of the core Computer Science concepts using freely available online courses. There is a curriculum from the good folks at ADUni.org. Then, there's Scott Young's MIT Challenge, Adam Fletcher's You Say You Want An Education?, and aGupieWare's curriculum.

I wanted to propose a slightly different alternative to these curricula. Instead of online courses, I wanted to propose a list of textbooks that one could read (and work through) to acquire knowledge comparable to that of a Computer Science graduate from one of the best schools. And I wanted to propose an order in which the books could be read, so that someone not aware of how all these subjects come together could just trust the wisdom of the Computer Science community, and read the books in the given order. In addition to the above-mentioned curricula, I looked at the guidelines from The Joint Task Force on Computing Curricula ACM IEEE Computer Society.

NOTE: I tried to stick to the classics because, otherwise, it would be almost impossible to settle on one book for a subject. And I tried to limit to under 15 subjects.

Calculus: Calculus: Early Transcendentals - James Stewart

Statistics (and Probability): Probability and Statistics - Morris DeGroot

Discrete Mathematics / Mathematics for CS: Discrete Mathematics With Applications - Susanna Epp

Algorithms (and Data Structures): The Algorithm Design Manual - Steven Skiena OR Algorithms - Sedgewick

Computer Architecture: Computer Architecture - John Hennessy

Programming: Structure and Interpretation of Computer Programs - Harold Abelson

Operating Systems: Modern Operating Systems - Tanenbaum OR Operating Systems Design & Implementation - Tanenbaum

Computer Networks: Computer Networking - James Kurose (for top-down) OR Computer Networks - Tanenbaum (for bottom-up)

Compiler Design: Engineering a Compiler - Linda Torczon OR Modern Compiler Implementation in ML - Andrew Appel (implementation focus)

Theory of Computation: Introduction to the Theory of Computation - Michael Sipser

Database Systems: An Introduction to Database Systems - C J Date

Parallel Programming: An Introduction to Parallel Programming - Peter Pacheco

Artificial Intelligence: Artificial Intelligence A Modern Approach - Stuart Russell

Cryptography (and Security): Understanding Cryptography: A Textbook for Students and Practitioners - Christof Paar

If you have any suggestions for improvements, please let me know.

#computer science #books #cs books #bs in cs

Publishing notification in Amazon Simple Notification Service (SNS) + Apple Push Notification Service (APNS) in Ruby

Now that we have created and edited the endpoints correctly in the previous post, we can publish a notification.

As usual, please let me know if you find a mistake, or have suggestions for improvement. You can see the gist here:

https://gist.github.com/namit/11189637
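
The gist is in Ruby. For anyone doing the same thing from Python, here is a rough boto3 sketch (not a translation of the gist). The part that is easy to get wrong is that, with MessageStructure set to "json", the APNS payload must itself be a JSON string nested inside the outer JSON message. The function names, the default region, and the default sound are my own assumptions.

```python
import json

def build_apns_message(alert, badge=None, sound="default", sandbox=False):
    """Build the SNS Message string for an APNS push (JSON nested inside JSON)."""
    aps = {"alert": alert, "sound": sound}
    if badge is not None:
        aps["badge"] = badge
    platform_key = "APNS_SANDBOX" if sandbox else "APNS"
    return json.dumps({
        "default": alert,                        # fallback text for other protocols
        platform_key: json.dumps({"aps": aps}),  # the APNS payload, encoded as a string
    })

def publish_to_endpoint(endpoint_arn, alert, region="us-east-1", **options):
    """Publish to one endpoint; needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so build_apns_message works without boto3
    sns = boto3.client("sns", region_name=region)
    return sns.publish(
        TargetArn=endpoint_arn,
        Message=build_apns_message(alert, **options),
        MessageStructure="json",
    )

print(build_apns_message("Hello", badge=1))
```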

Creating and Editing Endpoints in Amazon Simple Notification Service (SNS) + Apple Push Notification Service (APNS) in Ruby

Amazon SNS supports APNS, but there's almost no documentation or code-examples. After struggling to get the code working to create and edit endpoints in Ruby, I decided to publish it so that others desperately searching for an example can have an easier time than me. Please let me know if you find a mistake, or have suggestions for improvement.

BTW, why is set_endpoint_attributes' CustomUserData attribute key not custom_user_data, to be consistent with create_platform_endpoint?

You can see the gist here:

https://gist.github.com/namit/10911934
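
Again, the gist is Ruby. As a rough Python equivalent, here is a sketch of the create-or-edit flow using the boto3 call names create_platform_endpoint and set_endpoint_attributes. The helper name and the ARNs are hypothetical, and the small recording class stands in for `boto3.client("sns")` so the example runs without AWS access. In boto3 both calls spell the key CustomUserData.

```python
def upsert_endpoint(sns, app_arn, device_token, user_data, endpoint_arn=None):
    """Create the endpoint if we have no ARN yet; otherwise refresh its attributes."""
    if endpoint_arn is None:
        response = sns.create_platform_endpoint(
            PlatformApplicationArn=app_arn,
            Token=device_token,
            CustomUserData=user_data,
        )
        return response["EndpointArn"]
    sns.set_endpoint_attributes(
        EndpointArn=endpoint_arn,
        Attributes={
            "Token": device_token,
            "CustomUserData": user_data,
            "Enabled": "true",  # attribute values are strings
        },
    )
    return endpoint_arn


class RecordingSNS:
    """Offline stand-in for boto3.client("sns"); it only records the calls."""
    def __init__(self):
        self.calls = []

    def create_platform_endpoint(self, **kwargs):
        self.calls.append(("create_platform_endpoint", kwargs))
        return {"EndpointArn": "arn:aws:sns:example:endpoint/hypothetical"}

    def set_endpoint_attributes(self, **kwargs):
        self.calls.append(("set_endpoint_attributes", kwargs))
        return {}


sns = RecordingSNS()
app_arn = "arn:aws:sns:example:app/hypothetical"
arn = upsert_endpoint(sns, app_arn, "device-token", "user-42")
upsert_endpoint(sns, app_arn, "new-device-token", "user-42", endpoint_arn=arn)
print([name for name, _ in sns.calls])
```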

Internet Startups: 5 Information Arbitrage

In economics, arbitrage is the practice of taking advantage of an imbalance in price between two markets. Such arbitrage carries almost no risk and is short-lived.

I define arbitrage as the practice of discovering data and then extracting previously unavailable information from it to create market opportunities. So, in my opinion, the economics definition of arbitrage is just one example of arbitrage. But, to not confuse the two, let me call my definition Information Arbitrage.

Information Arbitrage just provides an opportunity. So, it is naturally more risky. But it is also long-lived and can even become your source of defensibility.

To benefit from Information Arbitrage (similar to the "Data Opportunities" discussed here), look for industries where:

even more data can be generated.

data can be captured more efficiently.

data can be better analyzed to produce actionable information.

data can be processed faster (real-time).

#mba #internet startups #arbitrage

Internet Startups: 4 Objectivity

Go against convention

Listen to conventional wisdom to understand:

a market and your potential customers and competitors.

the potential flaws in the conventions that you can poke holes in. (convention != disruption)

For example, while Xbox 360 and PlayStation 3 were battling it out in the red ocean of graphics and high-def, Nintendo Wii was setting the rules in the blue ocean of casual gaming.

Learn about an industry like a child (Why? Why? Why?). Don't be afraid of asking stupid questions.

But, be objective

Objectivity is what differentiates a disruptive idea from a simply bad idea.

Test your idea in the market with solid research and non-negotiable metrics to make a go/no-go decision. And don't be afraid to pull the plug.

Example metrics: We will personally work with 10 B2B customers for 3 months. After 3 months, more than 5 should want to pay us to continue working with them.

Example metrics post launch: Are we at break-even (Revenue >= Expenses) after one year? Are we getting customers in greater numbers or more cheaply than before?

As technology and consumer needs evolve, will your solution serve a big enough market? Or will the market become big enough by then?

#mba #internet startups #objectivity

Idea, Execution, Marketing

Idea

Solve a pain (product-market fit).

Learn continuously and early, when the cost of change is small.

Talk to customers - Go where they hang out (For example, if the customer is a small startup, go to Hacker News and "offer" them a service).

Adwords testing with fake landing pages (simple, to-the-point, get email for prospects).

Be very focused on a niche (Facebook started at just Harvard).

Execution

Do just one thing for customers, but do it very very well.

Fake it till you make it (Simulate some features till they are ready).

You can never make something "too easy". Assume that reducing every additional step for the customer doubles your customer-base.

Always ask for very basic feedback (For example, Dropbox's simple question - How's Dropbox? (a) Awesome (b) Needs work).

How do customers look at your product (hint: not like you do)?

Marketing (and Sales)

Your small customer base may also mean you have a "scarce resource" (Ex, Gmail invites).

Leverage your connection to a bigger trend to get people's (especially bloggers' and early adopters') interest.

Build connections with bloggers today. Help them with something today to get a return-favor next year.

Are you doing enough for your power-users to turn them into evangelists?

Referrals - Encourage word-of-mouth by giving something valuable to the referrer.

Provide easy import functionality from Facebook, Twitter, Email etc.

Understand if your business can really get virality (Do people really wanna tell their friends that they signed up on eHarmony?)

Understand your CAC and LTV.
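
As a back-of-the-envelope sketch of CAC (customer acquisition cost) and LTV (customer lifetime value): the formulas below are the simplest common versions, and every number is invented. Real businesses usually need more careful definitions (cohorts, discounting, which costs count as acquisition spend).

```python
def cac(sales_and_marketing_spend, new_customers):
    """Customer acquisition cost: spend divided by customers acquired in the same period."""
    return sales_and_marketing_spend / new_customers

def ltv(monthly_revenue_per_customer, gross_margin, monthly_churn_rate):
    """Lifetime value: monthly margin per customer times expected lifetime (1 / churn)."""
    return monthly_revenue_per_customer * gross_margin / monthly_churn_rate

acquisition_cost = cac(sales_and_marketing_spend=50_000, new_customers=250)
lifetime_value = ltv(monthly_revenue_per_customer=40, gross_margin=0.75, monthly_churn_rate=0.05)

print("CAC:", acquisition_cost)
print("LTV:", round(lifetime_value, 2))
print("LTV / CAC:", round(lifetime_value / acquisition_cost, 2))
```

With these numbers each customer costs about $200 to acquire and is worth about $600 over their lifetime, a ratio of roughly 3.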

#startup

Internet Startups: 3 Pain

There’s no happy without sad, no good without evil – You need pain as a catalyst to overcome reluctance to change. A pain can be an inconvenience, added cost, missed opportunity, obsolescence etc.

Mathematically, Pain to resolve > Cost of resolving <=> Adoption

You can't really understand customers' pain till you really understand your customer.

Making an opaque part of an industry transparent and cost-effective usually solves a pain.

Some questions to consider:

Does the industry have a supply / demand imbalance? (Groupon / Echo)

Is there significant price elasticity?

Is there fragmentation in the market?

Are there too many steps in the process?

Is there little automation in the process? Time consuming, tedious, error-prone.

Is the customer over-paying for something?

Is the customer not making as much profit as she can?

Is the pain significant enough for the customer to pay for it?

Is the customer solving the pain through some home-grown solutions?

Is the market big enough?

More questions to discover articulated customer needs:

http://namityadav.com/post/3166857651/marketing-developing-new-products-3-customers

Note: Replicating a US business (and its business model) in other countries is getting easier because markets, customers, and their pains are very alike everywhere.

Painful industries

Based on the American Customer Satisfaction Index, some industries have been historically bad in customers’ eyes (their scores in parentheses):

Newspapers (65)

Subscription TV (66). Major offenders – Charter, Comcast, Time Warner

Airlines (66). Major offenders – United, Northwest, US Airways, Delta

Gas Stations (70)

Social Media (70). Major offenders – MySpace, Facebook

USPS (71)

Wireless Telephone (72). Major offenders – AT&T, Sprint

Hospitals (73)

Health Insurance (73). Major offenders – United Health, Aetna, WellPoint

Internet News (74). Major offenders – CNN, MSNBC

#mba #internet startups #pain

Internet Startups: 2 Data

With all the data available to us today, very few decisions should be based on faith.

Look for industries where data is fragmented and needs aggregation.

(a) Find the decisions and actions that your business has to take. (b) Then find the answers based on which you can take those decisions and actions. (c) Then design your information system to get those answers.

Get data from people & give them something useful in return (Think Google’s services). Data is the currency that runs businesses.

"tackling Big Data will determine the winners and losers in the next wave of cloud computing innovation." - A GigaOm article.

Data can provide defensibility (Ex, Groupon’s future direction) by acting as a source of customization, uniqueness, innovation, etc.

Data can be used as a marketing tool: OKTrends, Admob Mobile Metrics.

Data Opportunities

Infrastructure - Serve the infrastructural needs of big data. For example, storage, management and security of data in cloud-based systems. Ex, Imperva.

Capture – Collect data more efficiently, cheaply, or from new sources. Ex, Google Analytics.

Structure – Translate and structure data for use / analysis.

Analysis – Extract meaningful information from data that help customers in making important decisions. Future opportunities lie in predictive analysis. Ex, Netflix recommendations.

Visualization – Present data in a simple and friendly way. Ex, Hipmunk, Flipboard.

Real-time – Use real-time data to remove inefficiency and unreliability. Ex, Echo.

#mba #internet startups #data

Internet Startups: 1 Disruption

Disruption is an innovation that improves a product or service in unexpected ways and lowers price or brings value to a different set of consumers.

Sometimes disruption comes directly from technology. In other cases, it comes from strategy and business models (which may or may not have been enabled by technology).

Disruption may consist of off-the-shelf components put together in a simple fashion to bring value to a new market, or more commonly, to the lower end of the market (un-served or under-served). If the rate of improvement is significant, the solution then moves up-market.

Disruptive solutions initially have lower gross margins and smaller markets.

For example, digital music’s disruptive effect on music CDs.

Disruption through technology

How can I use technology to disrupt an established industry?

Can I make an opaque part of an industry transparent and cost-effective?

Is the industry fragmented?

What will the customers need one/three/five years from now?

Can disruption be brought through a hybrid of technology and human processes?

Notes:

Use a rapid application development tool to quickly launch a solution and see if/how customers use it.

Define market segments in terms of “different people”, not in terms of “product usage”.

#mba #internet startups #disruption

Network Structures: 7 Stimulation

Spillover

Is it true that to obtain brokerage benefits, one should build connections with brokers? Many business practices, like mentoring, assume this spillover benefit.

There's a strong correlation between rewards and indirect network constraint (i.e., your connections are brokers).

However, a lot of that correlation may be explained by the fact that those having low indirect network constraints also have low direct network constraints. The partial or pure effect of indirect network constraint may be negligible.

Effect of indirect network: Global process - High; Local process - Low; Personal process - Negligible

Stimulation

Agency Question: How much do individuals matter relative to their social structure?

Network advantage is not a result of having the right access. An individual's reaction to the network is the critical performance variable.

Bent Preferences are evaluations shaped by social comparison. What will I get in comparison to what I already have and what my peers have?

Network fear: The feeling of loss as peers overtake you is more severe than the feeling of gain in overtaking peers. However, the feeling of loss fades as peers continue to do well (because they are no longer your peers?).

Brokers are more motivated by gain than fear of failure. Maybe because they don't have as many direct peers?

#mba #network #network structures

Network Structures: 6 Outsider

In business, insider-outsider distinctions manifest in casual conversation as gossip-enforced stereotypes about "those people."

Outsiders should consider a third kind of network (other than Broker and Closed) -- Partner (Hierarchy) network.

In a partner network, a strategic partner lends you access to others/nodes. She is your sponsor, and has a frame-of-reference effect on others.

For insiders, correlations of Rewards and: Network Size (+ve), Network Density (-ve), Network Hierarchy (-ve).

For outsiders, correlations of Rewards and: Network Size (insignificant), Network Density (insignificant), Network Hierarchy (+ve).

(Image: Three kinds of networks: Broker, Partner, Closed)

#mba #network #network structures
