Explain things simple @nick1100 - Tumblr Blog

Exceptions handling

Sometimes the question "When and where to use exceptions" comes up on different forums, like Quora. In my opinion, if an operation could unexpectedly go wrong, then we should try and catch it. But where in our call stack should we do it? If we can handle it, then we obviously should do it where it happens. For instance, let's assume we have some config data we need in an XML or JSON file. What if that file doesn't exist? Perhaps we have some entry in a registry that we can load and use instead; that way we can let the program continue to execute. But if our system depends on that file in order to execute, then it doesn't make sense to try to handle it anywhere other than in our main control flow, where we can log the error and inform the user.
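Here is a minimal sketch of the "handle it where you can recover" idea. It is written in Python rather than for the CLR, and the file name, the `DEFAULT_CONFIG` values and the `load_config` helper are all made up for illustration. Only the missing-file case is handled, because that is the only case this function knows how to recover from; anything else (for instance a corrupt file) still propagates to the caller.

```python
import json

# Hypothetical fallback settings, standing in for the "registry entry" in the text.
DEFAULT_CONFIG = {"language": "en", "retries": 3}

def load_config(path):
    """Return the settings stored at `path`, or the defaults if the file is missing."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # We know how to recover from this, so we handle it right here.
        return dict(DEFAULT_CONFIG)

config = load_config("this-file-does-not-exist.json")
print(config)
```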

I've seen this piece of code a few times:

    try
    {
        ExecuteSomeOperation();
    }
    catch (Exception ex)
    {
        throw ex;
    }

Which, at least in my opinion, is pretty much useless; we get the same result from the CLR if we just leave it out: execution stops and the stack is unwound until something catches the exception. If anything it is slightly worse, since "throw ex;" resets the stack trace (a bare "throw;" would preserve it).

Even worse is to just swallow the exception:

    try
    {
        ExecuteSomeOperation();
    }
    catch {}

This kind of "error handling" is going to cost time and perhaps money; it's just a matter of time. Please just don't. :)

Why do we only have a current when a circuit is closed?

It has been a very long time since I wrote a post here! I am in the middle of a project where this blog will be replaced by another one any time now, but I am publishing this post here for the time being. The original domain will not be renewed; more info soon.

A fundamental law of electricity is that current only flows if a circuit is closed, that is, if there is a path with a voltage across it.

For example, if we take a battery with voltage $ U $ and connect it to a lamp with resistance $ R $ using a conductor, preferably copper which has high conductivity, then the current through the lamp is $ I = \frac{U}{R} $ according to Ohm's law and the lamp lights up, given that the voltage is large enough. If, on the other hand, we cut the conductor after the lamp (that is, just before the copper wire makes contact with the positive terminal of the battery), the lamp goes out.

The question that often comes up about this is: how can the electrons know in advance whether the circuit is closed or not? Do they "see" that the circuit is broken just after the lamp and decide not to go there?

Okay, time to correct our understanding a little. First, that the battery has a voltage means that the negative terminal of the battery has a surplus of electrons compared to the positive terminal. We call this a potential difference; if the battery has a voltage of 5 V, we mean that the potential is 5 V higher at the positive terminal than at the negative terminal, which we can define as ground (0 V). We can only ever measure voltage between two points.

Second, in a circuit it is not electrons flowing one after another, or as one big lump. Sure, electrons are very fast, but since they constantly change direction they never get very far. Instead one usually talks about a drift velocity, which is the average velocity; roughly as little as a few mm per second. What is special about conductive materials, such as copper, is that they have free charge carriers: electrons that can move freely from atom to atom. In other words, a conductor is full of them. What makes it possible for the energy to be transferred so quickly is that an electric field propagates the moment the voltage source is connected. The free electrons are then pushed and bump into others, which in turn bump into their neighbors, forming one big wave. That is how the energy is transferred. A bit like the wave at a hockey game https://www.youtube.com/watch?v=9HJPu3_fm4M. Another way to see it is like a shot in billiards.

Anyway, the main question has not really been answered yet, but now we at least have a little more meat on the bones. What happens in this specific case is that, of course, electrons cannot "know" in advance whether they are in a closed circuit. When the voltage source is connected, the electric field propagates at close to the speed of light as usual and electrons flow through the lamp, but because the circuit is broken after the lamp the electrons meet a very large resistance there, which leads to an accumulation of charge. Around every charge there is an electric field, and the field of the accumulated charges eventually cancels the electric field that pushed the electrons initially, which was the very cause of the current in the first place. Nature always strives for equilibrium, and that is what is reached. If this is too abstract, have a look at how a capacitor works and what happens there to get some deeper insight.

This knowledge can also be used to understand Kirchhoff's laws a little more deeply.

Maximum nodes of a binary tree of depth k

Let’s see if we can come up with a mathematical expression for maximum nodes $ n $ in a given binary tree of depth $ k $.

As a matter of fact, this is pretty easy to come up with if we take a look at the pattern when we count the nodes. Let's take this tree as an example.

Notice that level $ i $ of the tree holds at most $ 2^i $ nodes, counting the root as level zero. Why? First, the root node will have two children A and B. They will in turn have two children each, meaning $ 2 \cdot 2 = 2^2 = 4 $ nodes for that level. Then, those $ 4 $ nodes will all have two children each, meaning $ (2 + 2 + 2 + 2) = 4 \cdot 2 = 2^2 \cdot 2 = 2^3 $. And then, every one of those $ 2^3 $ nodes will have children of their own: $ 2^4 $ and so forth. Notice the pattern? The tree grows exponentially.

We can describe the total as the sum $$ 2^0 + 2^1 + 2^2 + ... + 2^k = \sum_{i=0}^{k} 2^i. $$

Not surprisingly, this is a geometric sum so we can rewrite it to the closed expression $$ \sum_{i=0}^{k} 2^i = \frac{2^{k+1} - 1}{2-1} = 2^{k+1} - 1. $$

Thus, the maximum number of nodes in a binary tree of depth $ k $ is $ 2^{k+1} - 1 $.
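If you want to convince yourself, here is a small sketch in Python that counts the nodes level by level and compares the total with the closed expression. The function name `max_nodes` is just something I picked for this example.

```python
def max_nodes(depth):
    """Count the nodes of a completely filled binary tree, level by level (root is level 0)."""
    total = 0
    nodes_on_level = 1  # the root
    for _ in range(depth + 1):
        total += nodes_on_level
        nodes_on_level *= 2  # every node gets two children
    return total

# depth, counted nodes, closed expression
for k in range(6):
    print(k, max_nodes(k), 2 ** (k + 1) - 1)
```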

Yet another example of proof by induction on sums

Here’s a problem (although quite an easy one) that was posted on a forum.

Prove that

$$ \sum_{i=0}^{n} a^i = \frac{a^{n+1} - 1}{a-1 } $$

for all integers $ n \geq 0 $ and $ a \neq 1 $.

This is a typical target for the proof by induction technique.

(1) We show that the equality holds true for $ n = 0 $. Both sides are equal to one:

$ a^0 = 1 $ and $ \frac{a^{0+1} - 1}{a-1} = \frac{a - 1}{a-1} = 1 $. OK

(2) We assume that the equality holds true for some integer $ n = k \geq 0 $, i.e. $ \sum_{i=0}^{k} a^i = \frac{a^{k+1} - 1}{a-1} $.

(3) Since $ k $ was chosen arbitrarily, it remains to prove that the equality then also holds for $ n = k + 1 $. The left-hand side for $ n = k + 1 $ is $ \sum_{i=0}^{k+1} a^i = \sum_{i=0}^{k} a^i + a^{k+1} $. This means, using our induction hypothesis (2), if we can show that

$$ \frac{a^{k+1} - 1}{a-1} + a^{k+1} = \frac{a^{k+2} - 1}{a-1}, $$

we’re done.

Rewriting the left-hand side, we get that

$$ \frac{a^{k+1} - 1}{a-1} + a^{k+1} = \frac{a^{k+1} - 1}{a-1} + \frac{a^{k+1}(a-1)}{a-1} = \frac{a^{k+1}(1+(a-1))-1}{a-1} = $$

$$ = \frac{a^{k+1}a - 1}{a-1} = \frac{a^{k+2} - 1}{a-1} $$

for $ a \neq 1 $. Thus, we’re done.
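A proof is a proof, but it doesn't hurt to spot-check the formula numerically. This is a small sketch using exact fractions so there are no rounding issues; the helper names `geometric_sum` and `closed_form` are my own.

```python
from fractions import Fraction

def geometric_sum(a, n):
    """Left-hand side: a^0 + a^1 + ... + a^n, added up term by term."""
    return sum(a ** i for i in range(n + 1))

def closed_form(a, n):
    """Right-hand side: (a^(n+1) - 1) / (a - 1), only valid for a != 1."""
    return (a ** (n + 1) - 1) / (a - 1)

# Try a few bases (positive, negative, fractional) and the first ten values of n.
for a in (Fraction(2), Fraction(-3), Fraction(1, 2)):
    for n in range(10):
        assert geometric_sum(a, n) == closed_form(a, n)
```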

#Mathematical Induction

Let’s talk about binary trees and search

Binary search is a pretty straightforward search technique, at least if we’re looking for some good performance. It’s pretty easy to explain binary search because the algorithm itself is pretty intuitive. Let’s dive right into it.

Let’s take the classic and intuitive example: the phonebook. Assume that we don’t have the Internet and we’re actually still using one of these. ;) So if we wanted to find **Smith**, how would we go about doing that? Well, of course, you could flip one page at a time until you get there, although this method would probably take some time considering how large phonebooks can actually be. Or, we could simply use the fact that the phonebook is sorted alphabetically, so we flip to the middle page instead and check from there. If the page we turned to lists names starting with **M** then we know that **Smith** must be listed in the right part, so we flip to the middle page of the right part, and continue doing this until we find Smith. This is a lot more efficient than checking every page. Sorry I had to tell you that, I know you're not stupid. :)

Binary search is built on exactly this logic. Let's say, for simplicity, that we have a sorted list of integers $$ A = [1, 14, 18, 27, 33, 35, 37, 44, 58, 81, 103] $$ and we want to see if $ K = 14 $ is in that list. Then we start by selecting the middle element, in this case index $ i = 5 $, which has the value $ 35 $. We compare $ 14 $ with $ 35 $ and see that $ 14 < 35 $, thus if $ 14 $ is in the array then it must be in the left part. We repeat this process by again selecting the middle element and so on, until our comparison $ K = A[i] $ is true, or until there are simply no elements left to search.

Let's have two variables $ l $ and $ r $ to keep track of the start and end index of the current subarray. To figure out the middle index of the subarray we simply compute the average of $ l $ and $ r $ (rounded down), i.e. $ m = \frac{l + r}{2} $, where $ m $ is this midpoint. Remember we wanted to search for $ 14 $ in the list. A **bold** element indicates the element we're comparing with.

In the first iteration, we have $ l = 0 $ and $ r = n - 1 = 10 $, where $ n $ is the length of the input array. Then $ m = \frac{0 + 10}{2} = 5 $.

A[1, 14, 18, 27, 33, **35**, 37, 44, 58, 81, 103]

Since $ 14 < 35 $ we go into the left subarray $ A[0..4] $: $ l = 0 $ and $ r = m - 1 = 4 $. Then $ m = \frac{0 + 4}{2} = 2 $.

A[1, 14, **18**, 27, 33, 35, 37, 44, 58, 81, 103]

Since $ 14 < 18 $ we go into the left subarray $ A[0..1] $: $ l = 0 $ and $ r = m - 1 = 1 $. Then $ m = \frac{0 + 1}{2} = 0 $ (integer division).

A[**1**, 14, 18, 27, 33, 35, 37, 44, 58, 81, 103]

In this case, $ 1 $ is smaller than $ 14 $, so we'll have the following values: $ l = m + 1 = 1 $, $ r = 1 $ and $ m = \frac{1 + 1}{2} = 1 $.

A[1, **14**, 18, 27, 33, 35, 37, 44, 58, 81, 103]

And we're done!

Implementation in Python

This can easily be implemented using recursion like this

    def binarysearch(arr, left, right, k):
        # if this is true, there are no elements left to search. Not found
        if left > right:
            return False
        mid = (left + right) // 2
        # simplest case, we found the element immediately
        if arr[mid] == k:
            return True
        if arr[mid] > k:
            return binarysearch(arr, left, mid - 1, k)
        else:
            return binarysearch(arr, mid + 1, right, k)

The iterative version is pretty similar

    def binarysearch(arr, left, right, k):
        while left <= right:
            mid = (left + right) // 2
            if arr[mid] == k:
                return True
            if arr[mid] > k:
                right = mid - 1
            else:
                left = mid + 1
        # Not found
        return False

Performance

But how well does binary search perform? BS is a decrease-and-conquer algorithm, since during each iteration it decreases its problem instance by a constant factor, namely $ 2 $.

So in the worst case, the total number of comparisons we'll actually have to make is the number of times we can divide the input size $ n $ in halves, plus one, i.e. $ \lfloor \log_2 n + 1 \rfloor $. We add one because we're doing a comparison initially before we divide. The $ \lfloor \, \rfloor $ notation is the floor function (it rounds down its argument). Why is $ \log_2 n $ how many times we can divide $ n $ by $ 2 $? Read my post about it (it's pretty short).

So $ C_{worst}(n) $ is

$$ C_{worst}(n) = \lfloor \log_2 n + 1 \rfloor $$

meaning $ C_{worst}(n) \in \Theta(\log n) $.

On average, BS will run near its worst case, so $ C_{average}(n) \in \Theta(\log n) $.
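To see the worst case in action, here is a sketch that counts how many elements the iterative version looks at. The function name `comparisons_needed` is made up for this example, and it counts one (three-way) comparison per loop iteration. Searching for a key larger than everything in the list always walks into the right half, which is the worst case here.

```python
import math

def comparisons_needed(arr, k):
    """Iterative binary search that returns how many elements were compared with k."""
    left, right = 0, len(arr) - 1
    comparisons = 0
    while left <= right:
        mid = (left + right) // 2
        comparisons += 1
        if arr[mid] == k:
            return comparisons
        if arr[mid] > k:
            right = mid - 1
        else:
            left = mid + 1
    return comparisons

# input size, comparisons for a missing (too large) key, the formula from the text
for n in (1, 10, 1000, 10**6):
    arr = list(range(n))
    print(n, comparisons_needed(arr, n), math.floor(math.log2(n) + 1))
```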

However, even though binary search is pretty fast, we can do even better. For instance, we'll look at hash tables and maps in an upcoming post. But they all have their pros and cons.

Binary search tree

Let's look at the binary search tree while we're at binary search. This is an interesting data structure that's used for searching, where we apply this search technique.

So what's the difference between a binary tree and a binary search tree? The simple answer is that a binary tree is any tree where each node has at most two child nodes. A binary search tree (BST), however, is a special binary tree used for searching. The BST has a special ordering property: every value in a node's left subtree should be smaller than the node's own value, and every value in its right subtree should be greater.

For instance, this is a binary tree:

      1
     / \
    2   3

and this is a binary search tree:

      2
     / \
    1   3

See that the left child node is smaller, and the right is greater. This property must be satisfied at all times for the tree to be a BST.

Binary and trees in general

So a tree consists of nodes. These are basically just "places" where we put our data. For instance, in the trees above: 1, 2, 3 are all nodes.

In computer science we draw our trees upside down (so relate to trees in nature, but think of an "upside-down" tree). This means that the root of the tree is the start node, at the top. In the BST example, 2 is the root node.

Each node can have children, i.e. child nodes, but like I said, nodes in a binary tree can only have up to two children. A node isn't required to have children (in programming we usually represent an empty child as null).

Since we're working with an "upside-down tree" and the root is at the top, the leaves must be at the bottom. So in the BST example above, 1 and 3 are leaf nodes since they are at the bottom.

The height of a binary tree is simply how many levels there are in the tree. In both examples above there's one level, since we never include the root node while counting the levels; for instance, a tree with only a root node has a height of $ 0 $.

Intuitively, the levels can be seen as how many times we can divide something in halves. However, there's a problem with binary trees (which we will look at soon), and that is that they can be very unbalanced. This means that the height of a binary tree can actually be proportional to the input size: at worst $ n - 1 $ (-1 since we don't count the root node). However, the minimum height a BST with $ n $ nodes can have is $ \lfloor \log_2 n \rfloor $, which happens when every level is filled. This means that the height $ h $ of a given BST with size $ n $ satisfies the following inequality

$$ \lfloor \log_2 n \rfloor \leq h \leq n - 1 $$

So I said binary search trees can end up really bad. Imagine the scenario where we keep inserting elements into a BST in increasing order, i.e. every element is greater than its predecessor; the height of the tree will then increase with every insertion. For example, say we want to insert $ [3, 5, 7, 9, 15] $, starting from the left.

So we start with $ 3 $. This will be the root node

    3

Next to insert is $ 5 $. $ 5 $ is greater than $ 3 $, so it should be inserted at the right side.

    3
     \
      5

Now we want to insert $ 7 $. Again, $ 7 > 5 $ so we do the same

    3
     \
      5
       \
        7

Next one is $ 9 $ and again, $ 9 > 7 $ so

    3
     \
      5
       \
        7
         \
          9

Last one is $ 15 $ and, $ 15 > 9 $ so

    3
     \
      5
       \
        7
         \
          9
           \
            15

Conclusions? A pretty skewed tree we ended up with. If we count the levels (not including the root) then we see it's $ 4 $. That's why the height of a BST can at most be $ n - 1 $. This is an issue that also affects the time complexity: searching, inserting and deleting are all $ \Theta(n) $ in the worst case if the tree is skewed like this one.

It's pretty easy to come up with an algorithm for insertion in a BST; deletion is a bit harder. A BST is usually constructed using nodes with pointers to their child nodes. For searching, the recursive algorithm is simple (as you've seen before):

Let $ N $ be a node in a BST with pointers to its left child node $ N.left $ and right child node $ N.right $. $ N.value $ is the "compare" element in that node.

If $ N = null $ or $ N.value = v $, return $ N $.

If $ v > N.value $, search the right subtree recursively

else, search the left subtree recursively

where $ v $ is the value you want to search for in the tree.

With insertion, we again use the same variables, and $ v $ is now the value to be inserted.

If $ N = null $ then $ N = new Node(v) $

else if $ v > N.value $, insert into the right subtree recursively

otherwise, insert into the left subtree recursively
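The two recipes above could be sketched in Python roughly like this. The `Node` class and the `height` helper are my own additions for the example, and I've assumed that duplicates simply go into the left subtree (the text doesn't say what to do with them). Since Python can't reassign the caller's variable, `insert` returns the root of the subtree instead.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None   # subtree with smaller values
        self.right = None  # subtree with greater values

def search(node, v):
    """Return the node holding v, or None if v is not in the tree."""
    if node is None or node.value == v:
        return node
    if v > node.value:
        return search(node.right, v)
    return search(node.left, v)

def insert(node, v):
    """Insert v below node and return the (possibly new) root of that subtree."""
    if node is None:
        return Node(v)
    if v > node.value:
        node.right = insert(node.right, v)
    else:
        node.left = insert(node.left, v)
    return node

def height(node):
    """Number of levels below the root; an empty tree gets -1 by convention."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

# Same values, different insertion order: one skewed tree, one much flatter.
skewed = None
for v in [3, 5, 7, 9, 15]:
    skewed = insert(skewed, v)

flatter = None
for v in [7, 3, 9, 5, 15]:
    flatter = insert(flatter, v)

print(height(skewed), height(flatter))
```

The insertion order is the only difference between the two trees, which is exactly the problem with plain BSTs described above.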

Not harder than that! I will leave deletion as an exercise. The scenarios that you need to consider are

If deleting a node with no children, remove the node

If deleting a node with one child, remove the node and replace with its child

The little "tricky" part is what happens when you have two child nodes. That's the problem to solve!

Have fun. :)

#algorithms #binary search

Finding the kth smallest element in a list efficiently

Let’s talk about finding the kth smallest element in a list. We’ll first look at some trivial solutions and then check if there might be any better. The absolute smallest element in a list is easy (and we will never do better than $ \Theta(n) $ time for that). Let’s quickly see if we can design a naive **recursive** algorithm for that. This function will accept a list of (in this case) _integers_ and will return the smallest element. The base case is simply if the list has zero or one element. So we get something like

    function smallestElem(A[0..n-1]):
        if length(A) is zero:
            return []
        if length(A) is one:
            return A[0]
    end

Now look at the recursive case. If the last element, $ A[n-1] $, is smaller than smallestElem(A[0..n-2]) then we just return that; otherwise we return the smallest element of the subarray $ A[0..n-2] $. We get

    function smallestElem(A[0..n-1]):
        if length(A) is zero:
            return []
        if length(A) is one:
            return A[0]
        // Smallest element of subarray A[0..n-2]
        smallest = smallestElem(A[0..n-2])
        if A[n-1] ≤ smallest:
            return A[n-1]
        else:
            return smallest

In Python, we can easily translate this into

    def smallestElem(a):
        if len(a) == 0:
            return []
        if len(a) == 1:
            return a[0]
        # Smallest element of subarray a[0..n-2]
        smallest = smallestElem(a[:-1])
        if a[-1] <= smallest:
            return a[-1]
        else:
            return smallest

Finding the kth smallest element

So what about the $ k $th smallest element? A little trickier. This means that, if $ k = 2 $ then we want the second smallest element in the list. This is a very simple solution (and quite readable) but not particularly efficient:

    def kSmallestElem(array, k):
        stack = []
        while k > 0:
            currSmallest = float("inf")
            for j in range(0, len(array)):
                if array[j] < currSmallest and array[j] not in stack:
                    currSmallest = array[j]
            # remember the smallest element found in this pass
            stack.append(currSmallest)
            k -= 1
        return stack.pop()

Obviously not very efficient. However, we can also sort the array first (transform-and-conquer), which lets us do this in $ \Theta(n \log n) $ time on average:

    def kSmallestElem(array, k):
        array.sort()
        return array[k-1]

However, this doesn't take into account if there are duplicates (but it can easily be worked around). But can we do even better? Yes, absolutely, at least we can improve the computation time by choosing a different strategy. If you have been introduced to the sorting algorithm Quicksort before, then this will feel familiar.

Quickselect

Like Quicksort, Quickselect can be understood in two separate parts: the overall strategy and the partition process (for which there are several algorithms; see my post about subarray partition if you need a refresher). For Quicksort, the strategy is divide-and-conquer. In Quickselect, it's variable-size decrease (decrease-and-conquer).

In pseudocode, the algorithm looks like this

    function quickselect(A[l..r], k):
        s = lomutoPartition(A, l, r)
        if s = k - 1:
            return A[s]
        else if s > k - 1:
            return quickselect(A[l..s-1], k)
        else:
            return quickselect(A[s+1..r], k)

$ s $ is the split position returned by lomutoPartition, $ l $ is the start of the subarray and $ r $ is the end. Note that $ s $ is an index into the whole array, not into the subarray.

The idea is to use subarray partition (here we use Lomuto's partition); we basically pick an element as a pivot and then we partition the subarray such that all elements less than the pivot are to the left of it, and the greater ones are to the right. The pivot element will then be put at its right place, i.e. the position it would have if the whole array were sorted, and we just return the index of the pivot's new position. But why do we do all this? Well, in a sorted array the $ k $th smallest element sits at index $ k - 1 $ (we have to do $ k - 1 $ because arrays begin at index zero). So if our split position $ s $ is greater than $ k - 1 $, then we know that the $ k $th smallest element must be in the left subarray.

The same goes if $ s $ is smaller than $ k - 1 $; then the $ k $th smallest element must be in the right subarray.

So we first check the simplest case (a.k.a. base case), which is when the split position already is the position of the $ k $th smallest element; then we just return it. Otherwise we call quickselect recursively on the left or right subarray (of the split position $ s $), depending on where the $ k $th smallest element must be.

Implementation in Python. Neat, don't you think? ;)

    ## wrapper
    def quickselect(a, k):
        return quickselect2(a, 0, len(a) - 1, k)

    def quickselect2(a, left, right, k):
        splitPos = lomuto_partition(a, left, right)
        # splitPos is an index into the whole list, so compare with k - 1 directly
        if splitPos == k - 1:
            return a[splitPos]
        if splitPos > k - 1:
            return quickselect2(a, left, splitPos - 1, k)
        else:
            return quickselect2(a, splitPos + 1, right, k)

    def lomuto_partition(a, l, r):
        pivot = a[r]
        i = l - 1
        for j in range(l, r):
            if a[j] <= pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        return i + 1

    ## test
    print(quickselect([7, 3, 2, 9, 12, 5], 6))

Quickselect - Time complexity

So how well does Quickselect perform?

Remember that the time complexity for partition will always be $ \Theta(n) $, more specifically $ n - 1 $ key comparisons as we have to scan the whole array.

The best case for quickselect is easy. If $ s = k - 1 $ immediately after the first partition, then $ C_{best}(n) = n - 1 \in \Theta(n) $, i.e. linear time.

However, the worst case scenario is interesting. Since in Lomuto's algorithm we're (in this implementation) choosing the last element as pivot, things might not be as good as expected. Assume that we get the following input array

$$ [1, 2, 3, 4, 5, 6, 7] $$

and we have $ k = 1 $ (purely for demonstration).

Then Lomuto will go through the array and say, hey, the split position is $ s = 6 $. We will then say ok, let's recurse on the left subarray $ A[0..5] $. This time, Lomuto will yield $ s = 5 $, and next time $ s = 4 $ and so forth. This leads to the computation sum

$$ c_{worst}(n) = (n - 1) + (n - 2) + (n - 3) + ... + 1 $$

This is a well-known sum that occurs in many different areas. We know from math that

$$ c_{worst}(n) = (n - 1) + (n - 2) + (n - 3) + ... + 1 = \frac{n(n-1)}{2} $$

Thus, $ c_{worst}(n) \in \Theta(n^2) $.

Not good at all; pre-sorting the array and picking the $ k $th element is more efficient in the worst case. However, it's been shown that quickselect runs in $ \Theta(n) $ time on average, and it can be improved by using a better strategy for choosing the pivot (instead of the first or last element). So quickselect is, on average, faster than pre-sorting.
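The worst case is easy to reproduce. Below is a sketch that counts key comparisons; it is an iterative variant of quickselect written just for this experiment (the name `quickselect_count` and the `counter` list are my own), with the last element as pivot as before.

```python
def lomuto_partition(a, l, r, counter):
    """Lomuto partition of a[l..r] around a[r]; counter[0] tallies key comparisons."""
    pivot = a[r]
    i = l - 1
    for j in range(l, r):
        counter[0] += 1  # one key comparison per scanned element
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def quickselect_count(a, k):
    """Return (k-th smallest element, number of key comparisons made)."""
    a = list(a)  # work on a copy so the caller's list is untouched
    counter = [0]
    left, right = 0, len(a) - 1
    while True:
        s = lomuto_partition(a, left, right, counter)
        if s == k - 1:
            return a[s], counter[0]
        if s > k - 1:
            right = s - 1
        else:
            left = s + 1

n = 7
print(quickselect_count(range(1, n + 1), 1), n * (n - 1) // 2)  # sorted input, k = 1
print(quickselect_count(range(1, n + 1), n))                    # best case: n - 1 comparisons
```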

#algorithms #selection problem

Subarray partition algorithms: Lomuto and Hoare

Lomuto's and Hoare's algorithms are used for partitioning, which for instance the famous sorting algorithm Quicksort is built on top of. Lomuto's is a simpler solution but less efficient, while Hoare's is a little more sophisticated and was invented by the Quicksort man himself, Tony Hoare, hence the name.

In these algorithms we want to partition a given array around something that we call the pivot, which is basically just an element in the array (and can be chosen differently).

Overview

Let $ A[p] $ be our pivot element. Then we want to partition the array such that $$ A[0..p-1] \quad A[p] \quad A[p+1..n-1] $$

where all elements in $ A[0..p-1] $ are smaller than $ A[p] $ and all elements in $ A[p+1..n-1] $ are greater.

For example, assume that we have the array $$ A = [4, 6, 2, 1, 3] $$

And that we choose the last element, $ 3 $, as pivot element. Then we want to partition the array so that all elements smaller than $ 3 $ are to the left of it, and all greater elements are to the right. Note that, it doesn't matter if the elements left of $ 3 $ or right of $ 3 $ would be in the wrong order. For example, this can be a result that we're looking for:

$$ A = [2, 1, 3, 6, 4] $$

Using Lomuto's algorithm

Lomuto's algorithm is fairly simple. We scan the whole array and use a variable $ i $ that you can think of as a marker, behind which we put all elements smaller than our pivot. So whenever we find a smaller element we just increment $ i $, put the element there and move on to the next element in the list. This particular pseudocode uses the last element as pivot.
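As a sketch, here is how the description above could look in Python, with the last element as pivot. It's the same partition routine I use in the Quickselect post; elements equal to the pivot are treated as "smaller" here, which is a choice made for this sketch.

```python
def lomuto_partition(a, l, r):
    """Partition a[l..r] in place around the pivot a[r]; return the pivot's final index."""
    pivot = a[r]
    i = l - 1  # marker: everything in a[l..i] is <= pivot
    for j in range(l, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    # put the pivot right after the block of smaller elements
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

A = [4, 6, 2, 1, 3]
s = lomuto_partition(A, 0, len(A) - 1)
print(A, s)  # [2, 1, 3, 6, 4] 2
```

Running it on the array from the overview gives exactly the partition shown there, with the pivot $ 3 $ ending up at index $ 2 $.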

Example

#algorithms #partition #lomuto #hoare

How to solve integrals with u-sub and why it works

Finding a primitive, or antiderivative, of a function is at times hard. But there are some important and well-known techniques for doing that, and one of them is the u-substitution technique. You could say that it's based on the chain rule. If you want to refresh your memory on that, I've blogged about it in another post.

Perhaps you've seen before that if you have an integral of the following kind

$$\int f(g(x)) g'(x) \, dx $$

and we let $ u = g(x) $, it can (hopefully) be rewritten as a more manageable integral

$$ \int f(u) \, du. $$

But why it works is very interesting. How fun is it to use something without knowing why it works the way it does? ;) I said in the beginning that you could say this technique comes from the chain rule. To get an idea of why it works (a proof could be made using the fact that chain rule is true), recall that, the chain rule tells us what happens if we want to figure out the derivative of a composition of two or more functions:

$$ \frac{d}{dx} f(g(x)) = f'(g(x))\cdot g'(x) $$

And we also know that, in order to get back to $ f(g(x)) $, we just need to integrate $ f'(g(x))\cdot g'(x) $ with respect to $ x $, i.e.

$$ \int f'(g(x))\cdot g'(x) \,dx= f(g(x)) + C $$

Usually we let $ F $ be an antiderivative of $ f $ (so $ f $ now plays the role that $ f' $ played above), and also note that it doesn't really matter what we call our functions, so for pedagogical reasons I will just rename $ g(x) $ to $ u(x) $:

$$ \int f(u(x))\cdot u'(x)\, dx = F(u(x)) + C $$

Now compare this with the following formula

$$ \int f(u) \, du = F(u) + C $$

If we say that $ u = u(x) $ and $ du = u'(x) \, dx $ we end up with the exact same thing! So you can indeed say that the u-substitution technique works because of the fact that the chain rule works.

Some teachers probably tell you to imagine that $ \frac{dy}{dx} $ is a fraction and not an operator. If you want to do that, it's fine, as long as you feel confident. But now you probably have an idea of why you can "pretend" that it is a fraction and treat it that way.

Example

Compute the indefinite integral

$$ \int 2x \sin(x^2) \, dx $$

It's not hard to see that if we let $ u = x^2 $ then its derivative is $ u' = 2x $. If we now treat $ \frac{du}{dx} $ as a fraction (remember that we said it was "legal" in this case) we would get

$$ \frac{du}{dx} = 2x \Leftrightarrow dx = \frac{du}{2x} $$

So now that we also know what $ dx $ is, we get

$$ \int 2x \sin(u) \, \frac{du}{2x} $$

Notice that $ 2x $ cancels out, thus

$$ \int 2x \sin(u) \, \frac{du}{2x} = \int \sin(u) \, du $$

Now it's easy to see the antiderivative, namely

$$ \int \sin(u) \, du = -\cos(u) + C $$

Remember we had the variable substitution $ u = x^2 $ so we substitute it back, thus our answer is

$$ -\cos(x^2) + C $$
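If you don't trust the algebra, a quick numerical sanity check helps. This sketch approximates the definite integral from $ 0 $ to $ 2 $ with a simple midpoint rule and compares it with the antiderivative we just found; the interval, the step count and the helper name `riemann_midpoint` are arbitrary choices for this example.

```python
import math

def riemann_midpoint(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def integrand(x):
    return 2 * x * math.sin(x ** 2)

def antiderivative(x):
    return -math.cos(x ** 2)

numeric = riemann_midpoint(integrand, 0.0, 2.0)
exact = antiderivative(2.0) - antiderivative(0.0)
print(numeric, exact)  # the two values should agree to several decimals
```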

More examples can be found in this post

#u-substitution #math #integrals

Calculus: Limits

What are limits? How should we think about them? Why are they so useful?

A limit is really the value that a function gets close to when its input approaches some value. Let’s jump right to a classic example:

Let’s say we have this function

$$ f(x) = \frac{x+1}{x-1}. $$

If we would try to plot this graph, we would see that when $ x = 1 $, i.e. $ f(1) $, it doesn’t really have any value! Why? Because you can’t divide something by zero, and it wouldn’t make much sense either. We say that $ f $ is not defined for $ x = 1 $.

Notice how the curve just keeps increasing when it approaches $ x = 1 $ from the right.

But one might ask: what happens when we get really close to $ x = 1 $? That’s exactly where limits come to use. If we look at the graph above, we see that the closer $ x $ gets to $ 1 $ from the right side, the more the function increases, and it goes to infinity $ \infty $.

Another example is the function $$ f(x) = \frac{1}{x} $$

Here it's obvious that $ x \neq 0 $. But what happens if we get close to zero? Let's check.

$$ f(-1) = -1 $$ $$ f(-0.1) = -10 $$ $$ f(-0.01) = -100 $$ $$ f(-0.00001) = -100 000 $$
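The same table is easy to generate with a few lines of Python, and it's worth looking at both sides of zero while we're at it. This is just a sketch; the step sizes are arbitrary.

```python
def f(x):
    return 1 / x

# Approach zero from the left (negative x) and from the right (positive x).
for exponent in range(6):
    h = 10.0 ** -exponent
    print(f"f({-h:g}) = {f(-h):g}    f({h:g}) = {f(h):g}")
```

From the left the values run off towards $ -\infty $, and from the right towards $ +\infty $.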

#math #limit #calculus

Derivatives: The chain rule

You’ve probably heard of it before: the almighty chain rule for computing the derivative of the composition of two or more functions, for example

$$ f(x) = g(h(x)). $$

In perhaps more concrete terms, an example of this can be

$$ f(x) = \sqrt{x^2 + 1} $$

or

$$ f(x) = \sin(2x^2) $$

and so forth.

It states that if we have a function of the form

$$ f(g(x)) $$

and we let $ u = g(x) $, then its derivative is

$$ \frac{df}{dx} = \frac{df}{du} \cdot \frac{du}{dx} $$

given that $ f $ is differentiable at $ g(x) $ and $ g $ is differentiable at $ x $.

If you don't like Leibniz notation (how could you not!) then it's equivalent to saying

$$ D(f(g(x))) = f'(g(x)) \cdot g'(x) $$
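As a sketch of the rule in action, take the first example from above, $ \sqrt{x^2 + 1} $, with outer function $ \sqrt{u} $ and inner function $ u = x^2 + 1 $. The code below computes the derivative with the chain rule and compares it with a plain numerical estimate (a central difference); the helper names and the step size `h` are my own choices.

```python
import math

def f(x):
    return math.sqrt(x ** 2 + 1)

def chain_rule_derivative(x):
    # outer: sqrt(u) has derivative 1 / (2 sqrt(u)); inner: u = x^2 + 1 has derivative 2x
    u = x ** 2 + 1
    return (1 / (2 * math.sqrt(u))) * (2 * x)

def central_difference(func, x, h=1e-6):
    """Numerical estimate of func'(x), independent of the chain rule."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (-2.0, 0.0, 0.5, 3.0):
    print(x, chain_rule_derivative(x), central_difference(f, x))
```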

How do I know when to use the chain rule?

Every time you see a composition of two or more functions. The trickier part is probably to identify this, and in my experience this is probably a consequence of not really knowing what a function is. But it comes with practice!

#math #chain rule #derivatives

Understanding log n complexity

Binary search, mergesort, quicksort, AVL Trees, Splay trees, B-trees and red-black trees, etc. are only some algorithms or data structures that $ log \, n $ is related to. But what does it mean, really? Can we think intuitively about it?

I believe one intuitive way of thinking about $ log \, n $ complexity is: How many times can $ n $ be divided in halves until reaching $ 1 $?

This is a good way to think about it, but let’s show why we can think of it that way. I’m aware that there are many attempts at explaining this in an intuitive way out there, but I remember that I was not really satisfied until I got to the more mathematical way of seeing it.

Let's see how many times $ n $ can be divided until we reach $ 1 $.

$$ n / 2 $$ $$ \frac{\frac{n}{2}}{2} = \frac{n}{2} \cdot \frac{1}{2} = \frac{n}{2^2} $$ $$ \frac{n}{2^2} \cdot \frac{1}{2} = \frac{n}{2^3} $$ $$ \frac{n}{2^3} \cdot \frac{1}{2} = \frac{n}{2^4} $$ $$ \frac{n}{2^4} \cdot \frac{1}{2} = \frac{n}{2^5} $$ $$ \frac{n}{2^5} \cdot \frac{1}{2} = \frac{n}{2^6} $$ $$ ... $$ $$ \frac{n}{2^x} = 1 $$

Note: Dividing by $ 2 $ is the same as multiplying with its multiplicative inverse $ \frac{1}{2} $.

So we keep dividing by 2 until we get $ 1 $, where $ x $ is how many times we have divided. See how we're actually left with an equation, namely

$$ \frac{n}{2^x} = 1 \Leftrightarrow 2^x = n. $$

Recall from logarithms in your math class (if you need to refresh, read my post about it) that this is an exponential equation that can be solved easily using the definition of logarithms $$ 2^x = n \Rightarrow x = \log_2 n $$

So what did we end up with? $ log \, n $, just what we wanted.
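You can also just count. This small sketch halves $ n $ (with integer division) until it reaches $ 1 $ and prints the count next to $ \log_2 n $; the function name `halvings` is made up for the example. For values that aren't powers of two, the count is $ \log_2 n $ rounded down.

```python
import math

def halvings(n):
    """How many times can n be halved (integer division) before reaching 1?"""
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count

# n, number of halvings, log2(n)
for n in (1, 2, 8, 1000, 1024, 10**6):
    print(n, halvings(n), math.log2(n))
```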

Conclusions

First I suggested an intuitive way of thinking about logarithms; how many times $ n $ could be divided in halves until reaching one. This is a really useful property, because it's used in computer science all the time when talking about trees (and other applications). For instance, the height of a complete binary tree is at most $ log_2 \, n $.

Then we saw how we could come up with this in a mathematical way.

Not really a long post, but it's pretty concrete. Ask any questions you might have in the comments field. =)

#algorithms #complexity analysis #log n

Merge sort algorithm

Learn how to sort using the Merge Sort algorithm. Merge sort works by recursively splitting a given list into two halves, sorting them, and finally merging them. What makes this fast is that merging two lists that are already sorted is cheap: it only takes a single pass over both of them.

Let's first look at the pseudo code and then we'll take an example. I recommend that you apply the algorithm "by hand" on a random list that you can choose how you like. I think it's the easiest way for learning algorithms.

Pseudo code

    function mergesort(A[0..n-1]):
        if n > 1 then
            left = subarray(A, 0, n/2 - 1)
            right = subarray(A, n/2, n - 1)
            mergesort(left)
            mergesort(right)
            merge(left, right, A)
        end
    end

Looks elegant, doesn't it? That's the beauty of recursion. But is that really it? Well, most of the work is going to happen in the merge function. "Merge" has the responsibility to merge two sorted lists together: $ A[0..p-1] $ and $ B[0..q-1] $ into a new sorted list $ C[0..p + q - 1] $.

For example, if $ A = [2, 4, 6, 8] $ and $ B = [1, 3, 5, 7, 9] $ then

$$ C = [1, 2, 3, 4, 5, 6, 7, 8, 9] $$

As an exercise: Come up with this algorithm, i.e. write pseudocode for an algorithm that merges two sorted lists.

Here's one way to do it:
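The original listing is not available here, so what follows is a minimal Python sketch of one possible merge, not necessarily the version the post had in mind. The names `merge` and `merge_sort` and the choice to write the result back into an existing list `out` are assumptions made to mirror the pseudo code.

```python
def merge(left, right, out):
    """Merge the sorted lists left and right into out.

    Assumes len(out) == len(left) + len(right).
    """
    i = j = k = 0
    # Repeatedly take the smaller front element of the two lists.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out[k] = left[i]
            i += 1
        else:
            out[k] = right[j]
            j += 1
        k += 1
    # One list is exhausted; copy whatever remains of the other.
    while i < len(left):
        out[k] = left[i]
        i += 1
        k += 1
    while j < len(right):
        out[k] = right[j]
        j += 1
        k += 1


def merge_sort(a):
    """Sort the list a in place by splitting, sorting the halves and merging."""
    n = len(a)
    if n > 1:
        left = a[:n // 2]    # first half (a copy)
        right = a[n // 2:]   # second half (a copy)
        merge_sort(left)
        merge_sort(right)
        merge(left, right, a)


c = [0] * 9
merge([2, 4, 6, 8], [1, 3, 5, 7, 9], c)
print(c)
```

Taking from `left` when the two front elements are equal keeps equal elements in their original order, which is what makes this sketch stable.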

#algorithms #merge sort

Insertion Sort algorithm

Yet another algorithm is the so-called insertion sort. It's a little faster than bubble sort and selection sort on average, but still slow in practice for large lists. The idea is to go through an orderable list and, during each iteration, insert the current element at its right place by comparing it with the previous elements in the list. You can think of the algorithm as growing a sorted list "behind" the current position during each run.

Pseudo code

Let $ A $ be a sortable array $ A[0..n-1] $

    for i = 0 to n - 1
        j = i
        while j > 0 and A[j-1] > A[j]
            swap(A[j], A[j-1])
            j = j - 1

How can we be sure that this works?

Loop invariant

You can think of it as maintaining a sorted partition of the array that gradually grows with each iteration. For example, assume that we want to sort the array $ A[0..n - 1] $. At the start of each iteration $ i $ we have a sorted partition $ A[0..i - 1] $ into which we insert $ A[i] $, and the result is a sorted list $ A[0..i] $. This holds for every step: even if our smallest element is the last element in the list, that is at $ i = n - 1 $, then $ A[0..i - 1] $ is already sorted, and $ A[i] $ is inserted at its right place, so $ A $ ends up as a sorted list.

Of course in practice, as shown in the pseudo code, we just shift up the greater elements during each run by swapping them.
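As a rough illustration, the pseudo code translates to Python almost line by line. This is only a sketch; the function name `insertion_sort` is chosen here, and the `assert` is added purely to check the loop invariant while the function runs.

```python
def insertion_sort(a):
    """Sort the list a in place and return it."""
    for i in range(len(a)):
        j = i
        # Move a[i] to the left until the element before it is not greater.
        while j > 0 and a[j - 1] > a[j]:
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
        # Loop invariant: the partition a[0..i] is sorted at this point.
        assert all(a[k] <= a[k + 1] for k in range(i))
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))
```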

#algorithms #sorting #insertion sort

Selection Sort

Another simple sorting algorithm is the famous Selection Sort. It works a bit like Bubble Sort, but instead of comparing each adjacent pair and letting the sorted elements "bubble up", we start at the first element, search through the rest of the list to find the minimum element, swap the two, and then advance to the next element. This process continues until the whole list is sorted.

Pseudo code

Let $ A $ be a sortable array $ A[0..n-1] $

    for i = 0 to n - 2
        min = i
        for j = i + 1 to n - 1
            if A[min] > A[j] then
                min = j
        swap(A[min], A[i])

An invariant that holds for selection sort is that after iteration $ i $ of the outer loop, the first $ i + 1 $ elements $ A[0..i] $ are sorted and in their final positions.
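A minimal Python sketch of the same pseudo code could look like this. The name `selection_sort` is chosen here, and `smallest` stands in for `min` to avoid shadowing Python's built-in function.

```python
def selection_sort(a):
    """Sort the list a in place and return it."""
    n = len(a)
    for i in range(n - 1):
        smallest = i  # index of the minimum element found so far in a[i..n-1]
        for j in range(i + 1, n):
            if a[smallest] > a[j]:
                smallest = j
        # Put the minimum of the unsorted part at position i.
        a[smallest], a[i] = a[i], a[smallest]
    return a


print(selection_sort([64, 25, 12, 22, 11]))
```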

#algorithms #sorting #selection sort #complexity analysis

Bubble Sort algorithm

Bubble sort is a simple sorting algorithm where we sequentially step through the list, compare each adjacent pair, and swap them if necessary, i.e. if they are in the wrong order. Then we start over and do the same until the whole list is sorted. In this post we'll take a closer look at the algorithm and do some complexity analysis.

Pseudo code

Let $ A $ be a sortable array $ A[0..n-1] $

    for i = 0 to n - 2
        for j = 0 to n - 2
            if A[j] > A[j+1] then
                swap(A[j], A[j+1])

One invariant we can rely on is that after each iteration of the outer loop, at least one more element (the largest remaining one) is guaranteed to be in its right place at the end of the list, so we actually don't need to check the whole list again. This means we can make our pseudo code a little more efficient and not let it compare elements unnecessarily. We do this by modifying the inner loop a little:

    for i = 0 to n - 2
        for j = 0 to n - 2 - i
            if A[j] > A[j+1] then
                swap(A[j], A[j+1])
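To see the effect of the shorter inner loop, here is a small Python sketch of the improved version. The name `bubble_sort` and the idea of returning the number of comparisons are assumptions added for this illustration.

```python
def bubble_sort(a):
    """Sort the list a in place and return the number of comparisons made."""
    n = len(a)
    comparisons = 0
    for i in range(n - 1):
        # The last i elements are already in place, so stop before them.
        for j in range(n - 1 - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return comparisons


data = [5, 1, 4, 2, 8]
print(bubble_sort(data), data)
```

With $ n = 5 $ this makes $ 4 + 3 + 2 + 1 = 10 $ comparisons, whereas the first version of the pseudo code would make $ 4 \cdot 4 = 16 $.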

#algorithms #bubble sort #complexity analysis

Derivatives: The quotient rule

In the previous post we talked about taking the derivative of a product of two functions and saw why our old rules for derivatives fail there. Now let's take a look at how to take the derivative of a quotient of two functions.

Suppose that we want to figure out the derivative for

$$ f(x) = \frac{g(x)}{h(x)} $$

assuming $ h(x) \neq 0 $ and that $ g $ and $ h $ are differentiable at $ x $.

It might not come as a surprise that since division is the opposite of multiplication, we can use the product rule to figure out $ f'(x) $.
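A sketch of how that can go, assuming $ f $, $ g $ and $ h $ are all differentiable at $ x $ (an assumption made for this sketch): rewrite the quotient as the product $ g(x) = f(x) \cdot h(x) $, apply the product rule to it, and solve for $ f'(x) $.

$$ g'(x) = f'(x) \cdot h(x) + f(x) \cdot h'(x) $$

$$ f'(x) = \frac{g'(x) - f(x) \cdot h'(x)}{h(x)} = \frac{g'(x) - \frac{g(x)}{h(x)} \cdot h'(x)}{h(x)} = \frac{g'(x) \cdot h(x) - g(x) \cdot h'(x)}{h(x)^2} $$

In the last step both the numerator and the denominator were multiplied by $ h(x) $.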

#math #derivatives #quotient rule

Derivatives: Product rule

When taking the derivative of a product of two functions things get a little bit more complicated. Here I will show you why.

Assume that we have two functions: $ f(x) $ and $ g(x) $ and that $ h(x) = f(x) \cdot g(x) $.

The definition of the derivative then says that $$ h'(x) = \lim_{a \to 0} \frac{h(x+a) - h(x)}{a}. $$

Since $ h(x) = f(x) \cdot g(x) $, evaluating $ h $ at $ x + a $ gives

$$ h'(x) = \lim_{a \to 0} \frac{f(x+a) \cdot g(x+a) - f(x) \cdot g(x)}{a} $$

but it’s not really obvious that we can simplify anything like we did for the more common rules for derivatives. However, what if we actually can rewrite the expression in the numerator so we can use the definition of the derivative to simplify? - Well, that’s the idea. Let me show you what I mean.
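One common way to do this, which may not be exactly the route the original post takes, is to subtract and add the same term $ f(x+a) \cdot g(x) $ in the numerator. The sketch below assumes that $ f $ and $ g $ are both differentiable at $ x $.

$$ h'(x) = \lim_{a \to 0} \frac{f(x+a) \cdot g(x+a) - f(x+a) \cdot g(x) + f(x+a) \cdot g(x) - f(x) \cdot g(x)}{a} $$

$$ h'(x) = \lim_{a \to 0} f(x+a) \cdot \frac{g(x+a) - g(x)}{a} + \lim_{a \to 0} g(x) \cdot \frac{f(x+a) - f(x)}{a} $$

$$ h'(x) = f(x) \cdot g'(x) + g(x) \cdot f'(x) $$

The last step uses the definition of the derivative for each fraction, together with $ f(x+a) \to f(x) $ as $ a \to 0 $, which holds because a differentiable function is continuous.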

#math #derivative #product rule
