Saturday, August 6, 2022

P = NP? A Solution to the Clay Mathematics Institute's Sample NP Problem

ABSTRACT:

This paper offers a solution to the sample NP problem that can be found at the Clay Mathematics Institute's website. A python program is included along with commentary on whether or not P = NP.

At their P vs. NP page, the Clay Mathematics Institute (CMI) provides the following NP problem example:

"Suppose that you are organizing housing accommodations for a group of four hundred university students. Space is limited and only one hundred of the students will receive places in the dormitory. To complicate matters, the Dean has provided you with a list of pairs of incompatible students, and requested that no pair from this list appear in your final choice. This is an example of what computer scientists call an NP-problem, since it is easy to check if a given choice of one hundred students proposed by a coworker is satisfactory (i.e., no pair taken from your coworker's list also appears on the list from the Dean's office), however the task of generating such a list from scratch seems to be so hard as to be completely impractical. Indeed, the total number of ways of choosing one hundred students from the four hundred applicants is greater than the number of atoms in the known universe! Thus no future civilization could ever hope to build a supercomputer capable of solving the problem by brute force; that is, by checking every possible combination of 100 students."

To make this problem even more difficult and more explicit, let's assume there is only one set of 100 students who qualify for places in the dormitory, and, let's assume we can randomly select any two students from this set--and the pair will not appear on the dean's list of pairs of incompatible students. Thus, no pair from this set will match any pair on the dean's list. The following math shows that this ideal set is buried in a haystack of approximately 10^96 (not 10^200) sets of 100 students!

Below we derive equation 4 which shows that not only k increases as n increases, but the exponent increases as well:

A brute-force algorithm will undoubtedly exceed polynomial time (P). However, we can convert this NP problem to a P problem if we replace student names with numbers ranging from 0 to 399, then pair with each student number an incompatibility score. We do this by examining how many times each student number appears on the dean's list of student pairs. For example, if the list shows [[1,2], [1,3] ...], student 1 appears twice and receives a score of 2. Students 2 and 3 only appear once, so they receive a score of 1. Having the lowest incompatibily scores, students 2 and 3 will appear on the final 100 list along with 98 other low-scoring students.

The final list of 100 students can be checked against the dean's list in polynomial time. The process of checking the dean's list to give each student a score can also be performed in polynomial time. This can be mathematically shown. Let's start by calculating the size of the dean's list of pairs:

The first term on the right side of equation 5 is the total number of student pairs possible with no repeats. We subtract from that the total possible pairs (with no repeats) from the final 100 list. Thus, the size and complexity of the problem has been reduced from approximately 10^96 arrangements of 100 students to less than 100,000 arrangements of student pairs. Below is the relevant math showing the problem is solved in polynomial time:

The term on the right side of equation 11 does not have the number of students (n) as an exponent. The variable k' increases as n increases but the exponent stays fixed at 2--unlike equation 4 where the exponent increases as n increases.

Now, on paper it looks as though the NP problem can be reduced to a P problem, but what about in practice? Below is a python program that solves the CMI's sample NP problem:

Here is the final output:

Lines 1 to 21 are devoted to creating a random dean's list. The remaining program solves the puzzle by giving each student number an incompatibility score. The 100 students with the lowest scores make the final cut. As you can see, there is only 45 lines of code, most of which are used merely to set up the problem. Only the final 22 lines of code are used to solve the problem. The total run time on my personal computer is about one minute. Now, does this mean that P = NP?

At the very least it can be shown that an NP problem can be reduced to a P problem. But is this always the case? One thing I have noticed is that some NP problems have more clues than others. The CMI's example has 74,850 clues, i.e., the student pairs on the dean's list. These clues tell which pairs don't belong to the set of pairs of students on the final 100 list. So there's no need to guess which of the approximately 10^200 sets of 100 students is the right answer. We can use the power of deduction.

Now, imagine this same NP problem without any dean's list, without any clues. The dean simply tells you whether or not you have the right answer. Even the great Sherlock Holmes would be forced to test up to approximately 10^96 sets of 100 students. His powers of deduction would be useless. So even if we can make NP equal to P, we can always make the NP problem harder--the hardest problem has no clues. Thus it seems reasonable to conclude that P doesn't equal NP where there are no clues. Where there are clues, there is opportunity!

References:

1. Cook, Stephen. The P Versus NP Problem. Clay Mathematics Institute

2. Stewart, Ian. Ian Stewart on MInesweeper. Clay Mathematics Institute

3. P vs NP Problem. Clay Mathematics Institute