Sunday, April 20, 2014

Everything is connected... but how?

    Mathematical reasoning is not exclusively grounded in the use of numbers, and it does not need to be inaccessible to people who lack a knack for the numbers. At the same time numbers can be very handy for describing different kinds of relationships, and people who don't completely understand how the math works behind the scenes can still benefit from the information provided by logically structured and quantified arguments.

Collaborative Knowledge Distribution and Problem Solving Networks

    The modern world is faced with some truly astronomical problems, having to do with the environment, the distribution of resources among its residents, and of course the ability our species has developed for self-destruction. Fortunately, we are also equipped with some unbelievably powerful tools. Cloud computing and social media are still blooming technological industries whose full potentials are as yet no where near being realized. Lets go over what there is though.

Google

    The PageRank algorithm itself is a great example of math being used behind the scenes to everyone's great benefit. This brilliant application of Markov Chains was one of the most recent great leaps towards human beings attaining the ability to make decisions based on the collective knowledge of our entire species. One could almost call it the best thing since written language (Note Baidu is about the same age, those Chinese, and who knows what else is out there). Google of course didn't stop there though. Google Drive, Doc, Calendar, and of course Gmail (obviously they didn't invent email but it needs to be mentioned, might as well be here), are all services that revolutionize our workflow, letting us with people from miles away (we might as well throw Google Hangouts in here, though yes Skype came first), and access our documents anywhere with an internet connection. Google's search is still probably the most influential however, connecting people with information, people, and businesses they would otherwise never know about, enjoy, or make use of.

#KnowledgeDistribution

Wikipedia

    Next up comes the online encyclopedia of everything ever that anyone has ever talked about, or so it attempts to be. It seems to do a pretty wonderful job of it though, and the partnership between Google and Wikipedia is one of the few things that gives me hope for the future of humankind. Wikipedia is the current paradigm for crowd sourced knowledge aggregation. The way the encyclopedia is interconnected with links supports the way psychologists believe knowledge is structured in the human mind (Not brain, mind. We are talking about behavioral data here). A main point of support from the realm of psychology has to do purely with motivation. People tend to start digging around on Wikipedia out of real interest. Studies of attention tend to show that as we might expect, people are better able to recall information if they were genuinely interested in what they were reading in the first place. This comes back to Google, and especially to the Google-Wikipedia alliance. For the readers at home - never stop Googling things when they interest you, never tell yourself you are wasting time on Wikipedia. I'm not entirely sure there really is such a thing.

#KnowledgeDistribution
#KnowledgeRepresentation

Reddit

    After discussing that enormous body of information that is simply there for the reading, we should probably discuss the one that lives and breathes. Well, one of the digital ones that is. Up-voting, down-voting, link karma, and comment karma are all ideas that fill me with heady thoughts of distributed cognition. We will have to come back to this after we discuss logical structures and the current under-use of the idea of isomorphism in social media.
#KnowledgeDistribution

Facebook and Twitter

    While we are on the topic of social media, lets talk about the sites everyone actually uses, or that most people use. This is how a lot of people, if not the majority of my generation at least, get their news. I personally think that it is much more efficient than televised news at least. Newspapers do tend to have more investigative reporting and less cat videos, but even these are more systematically biased than the aggregation of the things everyone you know personally finds important. The Facebook activist can be seen as a lazy bystander sitting at his or her computer instead of standing in a street protest somewhere, but is that really necessary? The first point of activism is almost always to spread awareness about something that may be wrong with our society, or about something that could be improved upon. This we can do through Facebook and Twitter, and it really can be meaningful. That isn't to mention the ease with which petitions can be created and signed over the internet. I have yet to discuss Twitter specifically, but don't worry. It'll come again up soon.

#KnowledgeDistribution

HashTagging

   This phenomenon is actually very mathematical to me. It conveys a very simple, widely interpretable "is related to" relationship between some concept or media object and the label of some other concept or media object. These links can be treated as edges in a graph where the concepts (network files; the Graphs) and objects (text, links, images, video, applications) are vertices.
#KnowledgeRepresentation
#Ok,enoughhashtagging. 

Wolfram

   The world's first large attempt at a computational knowledge engine that I know of, WolframAlpha is impressive, and Mathematica is so useful for so many things. I don't understand why they don't work with Wikipedia, but whatever. Their natural language programming project will be useful as well. It is young, and seems to have a lot of potential. They already have libraries of code to deal with different types of social network data. While we are talking about problem solving networks though,

Yahoo Answers and Cha-Cha

    From every text book physics problem to relationship advice, these human powered question answering services answer a lot of questions. They are also faster than Wolfram's buggy free version at giving you useful information, especially from a phone. The idea of forums in general, and this goes back to Reddit, is incredibly powerful and still underutilized in the political domain. Domain experts have valuable, targeted information to offer in response to reasonably well formed linguistic queries, and once some questions have been answered once, that answer can be easily reused. The challenge is to store them in a sensible way so that they may be retrieved when necessary. These types of systems could be incorporated with knowledge computation systems like the ones Wolfram is working on to great effect. It may even become reasonable to approach larger complex problems from this direction. However, new was to structure information and represent knowledge might be helpful.

Mind and Concept Mapping

    Mind maps are more and more frequently being used by professionals to structure their thought process and workflow, and to create demonstrations and present information. Concept mapping on the other hand seems to interest more psychologists and linguists.

CMAPS and IHMC

   Researchers at the Florida Institute for Human and Machine Cognition head a large collaborative online concept mapping project, and are connected with the engineering of the Semantic Web, as it is called. They do other interesting things too, like teaching robots to make use of spoken natural language (http://www.ihmc.us/Research/human_machine.php), and using our understanding of human cognition to most advantageously integrate human interaction into complex automated decision-making processes (http://www.ihmc.us/Research/knowledge_discovery.php). The CMaps interface comes along with great features, like the ability for multiple users to simultaneously edit a concept map, and a built in similarity test with several options that compares any two concept maps.

XMind, FreeMind, MindMeister, etc.

   The mind mapping game seems to always have been more about creating visual displays of information that they were about attempting to encode complex contextual information in way that can be retrieved by natural language queries. The structures that can be made in CMaps can also be made in programs like XMind, and I think that these mind mapping companies have done a good job of making the interface flexible and pretty.

The 3rdMind

    When I originally had this idea I had never heard of mind mapping software or CMaps, or the semantic web. What it amounts to though is combination of those and a lot of the ideas I have already been describing, with one added little feature. The ability to send messages between any two nodes. An object-oriented database would be formed that could be populated through crowd-sourced knowledge, pulled right from Yahoo Answers and Wikipedia themselves, preferably. Users would be nodes themselves, or indexed by nodes anyway. This approach would allow for personal knowledge databases to be compiled and accessed by others. 


Local Solutions

    One truth that I think new technologies could exploit, is that local problems often have local answers that can be addressed locally but informed by large, complex data that regards systems more or less holistically, on different levels one could say. Systems for rewarding and making use of people with good ideas could be more prevalent than they are now on a casual level. This is getting unfocused though, and I think I will pick up here sometime later.

Saturday, April 19, 2014

The Koch-Sierpinski Ninja Star

Fractals
    A fractal is, in general, a mathematical set that displays self-similar patterns. Most people recognize these geometrically, as a set of points plotted in two or three dimensions. We think of them by the step, or iteration, going on towards infinity. Two of the first-to-be-discussed, simplest, and most recognized fractals are known as Koch’s Snowflake and Sierpinski’s Triangle. A few years ago I had an idea for how to draw a fractal that involves some similar generative functions, but is indeed quite distinct from both and possesses some surprisingly unique properties. I’ve been looking into it on and off ever since.


Koch’s Snowflake

    As Wikipedia explains: 

The Koch snowflake can be constructed by starting with an equilateral triangle, then recursively altering each line segment as follows:

1) divide the line segment into three segments of equal length.

2) draw an equilateral triangle that has the middle segment from step 1 as its base and points outward.

3) remove the line segment that is the base of the triangle from step 2.



Like so:

File:KochFlake.svg
   What I find most interesting about Koch’s Snowflake, is that its perimeter has infinite length. It will be interesting to see if the Ninja Star does as well, because its perimeter grows similarly to but slower than that of Koch's Snowflake.


Sierpinski’s Triangle

File:Sierpinski triangle evolution.svg

The cool thing about this guy’s fractal is that it has connections to other foundational patterns in mathematics, like Pascal’s Triangle and the Towers of Hanoi puzzle. It doesn’t have really cool limits in it like Koch’s Snowflake or the Ninja Star, but there are cool limits having to do with other things (like Pascal’s Triangle) that actually converge to it, which is also cool.

The Koch-Sierpinski Ninja Star

    What I did originally was to literally just start drawing Sierpinski's triangle inside of Koch's snowflake, which you can do. However, though it looks kind of cool there isn't much that is interesting about it that isn't interesting about the first two. I was interested in the area ratio of dark to light triangles, so continuing the Sierpinski pattern in the white triangle didn't appeal to me for long. I then took a liking to the way the corners of the inner triangles touched each other at a point, and decided to limit generation of new triangle to include only those. Notice the difference that emerges between the outer bounds of this first version and Koch's Snowflake ( You can see it in the fourth iteration ).
It wasn't until this year that I realized the whole snowflake can be inscribed into each blank triangle that composes it. That realization was how it came to look like this.

The area question was actually really easy before, though somehow I wasn't convinced at the time - it will stay 1/4 colored the whole time as it iterates on toward infinity. However, that changes entirely when the white area in one iteration starts to be filled with more Ninja Stars.

Notation

    After going through a few different versions of tables trying to count the triangles in each iteration of this thing, I settled on keeping track of the triangles by level using a sub-scripted variable TL sub k of n. This variable tracks how many new full triangles ( 3 blank surrounding one colored ) are created at level k, where a level k is defined as one less than how many triangles the base of the star that is sprouting the triangles is directly and indirectly inscribed into. This way we are are able to count the number of a given size of triangle by the n value at which they are produced, while reusing the original recursive generating function. This is necessary because the way it goes, 3 triangles sprout off of the initial one, but then only 2 sprout off of each of those and so on. Notice in the GIF that we add all of the triangles that there will ever be of a given size to the design in each iteration. Then a recursive formula for the number of triangles produced at each level can be:

**There is a typo here! The second to last line should read,
TL_k (n) = 3 (SUM from i = 1 to k - 2 of: TL_i (k - 1)) TL_1 (n - k

Basically we are just reusing the base sequence, which is a binary tree of valency 3.

Remember however that this only counts the number of new triangles at a particular level. A formula for the number of new triangles produced in a given iteration can be:


Then the total number of triangles is given by
The sequences given by both of these last two formulas look like this:

T(n): 1,3,9,30,96,309,996,3207,...

TT(n): 1,4,13,43,139,448,1444,4651,...

Neither of these two sequences are present in the Online Encyclopedia of Integer Sequences (OEIS), so that's sort of cool.

The Area Question 

    Now that we can track how many triangles are created in each iteration under the original assumption that all triangles of a given size are created in a single iteration, we can start to really pursue the question: What is the ratio of colored to blank space in the whole design as n approached infinity?

First we could define another sequence, this one an ordinary convergent geometric series. We count the area in terms of the original triangle at n=1 having an area of 4, so that each triangle composing it has area 1. Then each new triangle made in iteration n has an area of 4(1/4)^n. We plug this expression in to the sum given by TL sub 1 of n. This will calculate the area of the entire design as n approaches infinity. The limit we are interested is given below, followed by a simplification of the function we are trying to evaluate as n approaches infinity. The way we represent it initially includes recursive actions that we can't deal with in a limit.
    The limit of A(n) is now easy to calculate as n approaches infinity. It is equal to 10. This is interesting, because we can observe that the equilateral triangle that the design could be inscribed into has a area of 16 by our convention, and the hexagon within that triangle that clearly bounds the outward growth of the design has an area of 13 by that same convention. Now I will leave you with a limit that I have yet to solve, as it also includes recursive functions. We can define a function CA(n) to represent the total area of colored triangles in the design as well. We do so by starting at one fourth the area of the whole pattern at n=1, so 1. This is the area of the first colored triangle. We then add (1/4)^n of the number of the total number of triangles created at each iteration to the total colored area in each iteration. It looks like this,
Since T(n) is recursively dependent on the weird TL_k(n), it is tricky to get a closed formula for. TL_1(n) is actually in the OEIS, and a closed formula for it was not so hard to find. This on the other hand stumps me, and the corresponding limit is thus out of my reach as of now. The final relationship of interest is precisely the following; the ratio of colored area to total, 
In the spreadsheet where I came up with these recursive definitions I approximated this R(n) function up to the 17th iteration, and it actually seems that it may be approaching 1/2. I find that very interesting personally, but as of yet we cannot know for sure what the value is. We know visually that the limit must converge to some number between 0 and 1, because there are some white spaces which we know will never be filled, as given by the difference between 10 and 16, the area of the design and the equilateral triangle it is inscribed into.

Check out the spreadsheet. Just make a copy if you want to mess with it.

https://docs.google.com/spreadsheet/ccc?key=0AkvjjjAi1kyWdE9YcW1XWXlCeFJURUxBTFBHS3ZRSnc&usp=sharing

GIF of triangles by level.

Geogebra drawings all courtesy of Professor John Golden. Your beautiful pictures inspire me to my best :)