Wednesday, June 10, 2015

0008 - My Frustration

Part of my childhood that won't die (likely for many, if not most, others) is my imagination. I like to daydream. I like to construct elaborate fantasies about potential futures, and plot ways to make those futures a reality.

Unfortunately, all of the plots I concoct center on one of a few highly improbable events:
    1. Winning the lottery (I don't play the lottery, so will never win)
    2. Finding some other source of massive income within a relatively short timeframe

As I've gotten older, my daydreams have reached less and less into the future as the window for one of these improbable events narrows each and every day. As such, I have plunged myself headfirst into the second event: finding a source of massive income. My current event? Solving the P vs. NP problem and proving that P = NP.

For those who are not familiar with this problem or are a little rusty on the particulars, here's a brief rundown with some heavy generalization:
The complexity of an algorithm can be judged by its worst case time complexity. Worst case time complexity is a measure of how long it will take to solve the given problem using the given algorithm as a product of the size of the problem.
For example, an algorithm may use a loop to add the numbers 1-N together, like so:

sum = 0
for i = 1 to N:
    sum = sum + i

This has a worst case time complexity of N (i.e. if N = 10, this will take 10 steps to complete)
If we wanted, instead, to add each number from 1 to N to each other N times, we could use the following algorithm:

sum = 0
for i = 1 to N
    for j = 1 to N
        sum = sum + i

This has a worst case time complexity of N^2, because it takes N steps N times. Both of these problems are called P problems, because they can be solved in deterministic polynomial time (i.e the worst case time complexity is N^X, where X is a finite number less than N).
If, however, we wanted to find the largest independent set of a graph using backtracking, our algorithm would have a worst case time complexity of 2^N.

Confused?

Okay, first off, a "graph" in this instance is referring to a set of nodes that are connected by edges:
A-B-C
In this graph, A, B, and C are the nodes, and edges connect A to B and B to C.
Another example:
A-B-C
  \  |
    D
In this graph, A, B, C, and D are the nodes, and edges connect A to B, A to D, B to C, and B to D.

Second, an "independent set" is a collection of 1 to N nodes that are not connected with edges. For our first graph, the independent sets are {A}, {B}, {C}, and {A,C}. In our second graph, the independent sets are {A}, {B}, {C}, {D}, {A,C}, and {C,D}. The largest independent set is the independent set containing the most nodes (for graph 1, it is {A,C}; for graph 2, it is either {A,C} or {C,D}).

To date, the best algorithm to solve for the largest independent set utilizes backtracking, which I don't want to talk about in this post. Essentially, it attempts to test every possible combination of nodes and has a worst case time complexity of 2^N (i.e. with a problem size of 3, it would take 8 steps; with a problem size of 4, it would take 16 steps; with a problem size of 10, it would take 1024 steps). This is called nondeterministic polynomial time, or NP.

Enough of that.
My goal, and what I've been working on in my spare time for almost a month now, is to prove that P = NP, meaning that problems that are contained in the NP class (such as determining the largest independent set of a graph) can be solved with an algorithm in the P class. Unfortunately, this is very difficult, and is starting to frustrate me.
This problem was formally stated in 1971, so it's something that nobody has proved in the last 44 years, though many, many intelligent people have tried. As such, I'm not being very hard on myself for not solving it yet. However, every time I work up an algorithm that seems to work, I'm missing one of two important points:
1. It's accurate (i.e. it always reaches the correct solution)
2. It's in the P class
So far, I can only get one or the other. The closest I've come (earlier this evening) is an NP algorithm that is 100% accurate that, when removing two simple loops (and thereby changing the algorithm to the P class), becomes accurate only for problem sizes < 5.

So, I've decided that, as of about an hour ago, I am going to completely scrap what I've done so far. I've been approaching and representing the problem in the same way that everyone else has. I'm reinventing the wheel over and over (of my accurate solutions, I have worst case time complexities of N!, 2^N, and N^N!; don't ask about that last one - I was drinking). Now I just have to figure out a novel approach to the problem, and that will hopefully yield better results.

Or I can buy a Powerball ticket. Either way works.

Edit: as for why solving this problem will give me massive income, please view the following links:
http://en.wikipedia.org/wiki/Millennium_Prize_Problems
http://en.wikipedia.org/wiki/Fields_Medal
http://en.wikipedia.org/wiki/Abel_Prize
etc.

No comments:

Post a Comment