Tuesday, August 4, 2015

0011 - My Frustration...again

Just had to crank up my "Pick-me-up" playlist on Youtube. Just in case you're wondering, my playlist consists of 4 songs that loop forever. Feel free to listen to these as you read, or at your leisure:







Why am I playing my "Pick-me-up" playlist? If you can recall a previous blog post about "My Frustration," just know that I'm more frustrated right now than I have been at any point in this process. Granted, I'm 3 months into working on a problem that has consumed the greater portions of great men's lives before me. Those guys must have gone completely insane, because damn. This problem is stupid hard.

I just had my latest algorithm go up in smoke. Granted, I knew it wasn't complete, I knew it wasn't correct. I also knew that it was returning accurate results on large-sized graphs upwards of 90% of the time, which is pretty good. However, I finally admitted to myself tonight something that I've known for at least a week: at best, my algorithm is a decent heuristic, nothing more. It's like solving this problem is winning the World Cup, and I'm playing hopscotch while jumping rope. It makes no sense. I'm not even part of the game. I'm not even playing my game correctly. I feel like a dumbass.

So, since tonight is apparently full disclosure night, I'm going to share the BS algorithm I built because I'm brainless.

This algorithm will solve a majority of the graphs you throw at it, assuming that the graphs are formatted in a peculiar way: if two nodes are connected by an edge, it is represented in a 2-dimensional array with a 1; if they are not connected by an edge, it is represented as a 0; also, each node itself is represented with a 2. For example,

0 1 2 3 4 5 6 7 8 9
0 2 0 0 0 0 0 1 0 1 0
1 0 2 1 1 1 1 1 1 1 0
2 0 1 2 0 0 0 1 1 1 1
3 0 1 0 2 1 1 0 1 1 0
4 0 1 0 1 2 0 0 1 1 0
5 0 1 0 1 0 2 1 0 1 1
6 1 1 1 0 0 1 2 1 0 1
7 0 1 1 1 1 0 1 2 1 0
8 1 1 1 1 1 1 0 1 2 0
9 0 0 1 0 0 1 1 0 0 2

is a 10 node graph (nodes 0-9). You'll notice the diagonal line of 2's - these correspond to the nth column on the nth row, such as (0,0), (1,1), etc. So the first row represents the relationships of node 0. There is a 2 for 0, meaning that this is node 0. There are 0's for 1, 2, 3, 4, 5, 7, and 9, indicating that 0 is not connected to those numbers. There are 1's for 6 and 8, indicating that these two nodes are connected to node 0.

Okay, so that graph is used by the following algorithm (first few lines explain global variables):

gSize - integer that represents the number of nodes in the graph
best - integer that contains the largest independent set found thus far
g[][] - 2 dimensional integer array that contains the original graph (such as expressed above)
s[i][j] - 2 dimensional integer array that contains the optimal solution array for each node i; initialized with the original values of g[][]

IS():
    for i = 0 to gSize:
        while(true):
            temp = Weight(i)
            if temp < gSize:
                Switch(i, temp)
            else:
                break
         currBest = GetBest(i)
         if currBest > best:
             best = currBest

Weight(i):
    currBest = 0, currIndex = gSize
    for j = i to gSize:
        currWeight = 0
        if s[i][j] = 0:
            match = true
            for k = i to gSize:
                if s[i][k] == 0 && g[j][k] == 0:
                    currWeight++
                else if g[j][k] == 2:
                    currWeight++
                else if s[i][k] == 0 && g[j][k] == 1:
                    match = false
            if match == true:
                return j
        if currWeight > currBest && currWeight != 0:
            currBest = currWeight
            currIndex = j
        else if currWeight == currBest && currWeight != 0:
            first = 0, second = 0
            for k = i to gSize:
                if s[i][k] == 0 && g[currIndex][k] == 0:
                    for l = i to gSize:
                        if s[i][l] == 0 && g[currIndex][l] == 0 

                          && g[k][l] == 0:
                            first++
                if s[i][k] == 0 && g[j][k] == 0:
                    for l = i to gSize:
                        if s[i][l] == 0 && g[j][l] == 0 && g[k][l] == 0:
                            second++
            if second > first:
                currIndex = j
    return currIndex

Switch(i, j)
    s[i][j] = 2
    for k = 0 to gSize:
        if g[j][k] == 1:
            s[i][k] = 1

GetBest(i)
    currBest = 0
    for j = i to gSize:
        if s[i][j] == 2:
            currBest++
    return currBest


If you don't understand that, it's totally fine. It doesn't work anyway. If you do understand it, feel free to try to solve the graph I gave above with this algorithm. Here's a hint: you'll get a largest independent set of 3, while the correct answer is 4 because node 0 will connect with node 9 before the rest of them, while node 9 is actually not a part of the true largest set. In fact, my algorithm will come up with a largest independent set made up of nodes 0, 9, and 1 (picked in that order), whereas the actual largest independent set (as determined by me working it out by hand) is 0, 2, 4, and 5 (not necessarily picked in that order).

The problem is that I tried to simplify the selection process by weighting nodes based on similarity to the current node in question (in this algorithm, this would be node i). In so doing, I managed to actually solve a vast majority of the graphs that I tested. However, it was not perfect, so there were incorrect solutions.


So, how do I fix this? In short, I'm going to have to teach my computer to think, or at least to appear to be thinking. Backtracking, if you remember, has a worst-case time complexity of T(n) = O(2^n). I'm not taking that many steps when I work these out by hand, but I can't rightly explain to you my by hand method (which is part of the problem). The algorithm I outlined above has a time complexity of T(n) = O(n^5) (more or less...), which is much mo' bettah, but also doesn't get the right freaking answer.

So I'm creating a new class in my test app, opening up my steno pads, busting the dry erase markers back out, and listening to my Pick-me-up Playlist. Because I've got a lot of work to do.





No comments:

Post a Comment