The Random Number Generator - One of the best Maths problems ever IMO

bellenuit · 16 November 2009

Wysiwyg said:
Okay.
Bellenuit, any formula works upon the knowing of the numbers. The likelihood of answering yes correctly the first time is 1 in 1000. Everyone seems to be collecting the data and working it out. It is easy to eliminate numbers and increase the odds if the numbers are known. That is not part of the stated operation.

But you are not collecting the data and working it out, otherwise you would be right every time. You are allowing 500 numbers to pass answering No to each one, but taking a note of the highest number within that 500. Then you say Yes when the computer generates a number that is higher than that. The probability is that you will be right at least 25% of the time if you use that strategy.

As an additional hint for everyone. You should ask yourself is 500 the optimum number of random numbers to let pass before making the decision. If you only let 1/5 of the random numbers pass, or 1/4 or 1/3 or perhaps more than half, 2/3 say, would your chances improve using the same strategy.

Wysiwyg · 16 November 2009

bellenuit said:
But you are not collecting the data and working it out, otherwise you would be right every time. You are allowing 500 numbers to pass answering No to each one, but taking a note of the highest number within that 500.

Is that not collecting data? You state "taking a note of the highest number within that 500".

skyQuake · 16 November 2009

I think i've figured it out, the optimal number to choose depends on how big the number is at what point in the sequence.

-Let H denote the highest number so far in the series.
-Let the range of finite numbers be limited to x and -x (giving a range of 2x)
-Let n denote how many numbers have already been drawn.

The relative position of the current highest number in the series is thus:
x-H (ie. how many can possibly be ABOVE the current high)

Thus the probability that a higer number will not be drawn on the next draw is:
(x-H)/(2x)
ie. Number of possible higher integers divided by the total amount of possible integers.

Consequently, assuming we have drawn n numbers, we have (1000-n) numbers left to draw.

Thus the probability that no higher number will occur in the rest of the draw is given by:

[(x-H)/(2x)]^(1000-n)

As such, it is best to wait till close to the end of the series to make the choice. H = say 0.8x early on at say n=100 would yield a very low probability that this is the highest possible number and would not be picked. But if H increases significantly and/or n approaches 1000, the probability that 'no higher number would occur' increases and you would be more likely to pick...

bellenuit · 16 November 2009

Wysiwyg said:
Is that not collecting data? You state "taking a note of the highest number within that 500".

Who said you had to forget each number once you said No to it? Of course the chances will always be 1 in 1000 if you simply press Yes at any number without any system or method behind your decision. But the question as stated implies you adopt an optimum strategy, and such a strategy will usually involve basing your decision for each number on the information you have gathered from the numbers that have gone before. For example, you would hardly say Yes to a number that is less than the one immediately preceding it.

This is directly from the text of the problem.....

If you answer No, it will print the number it displayed on the output listing so that you have a record of what the number was for whatever calculations you might want to do.

and also....

Assuming you are a perfect logician and will try and maximise your chances of identifying the highest number.....

that implies you will apply a strategic approach to trying to pick the highest number

Wysiwyg · 16 November 2009

skyQuake said:
I think i've figured it out, the optimal number to choose depends on how big the number is at what point in the sequence.

-Let H denote the highest number so far in the series.
-Let the range of finite numbers be limited to x and -x (giving a range of 2x)
-Let n denote how many numbers have already been drawn.

The relative position of the current highest number in the series is thus:
x-H (ie. how many can possibly be ABOVE the current high)

Thus the probability that a higer number will not be drawn on the next draw is:
(x-H)/(2x)
ie. Number of possible higher integers divided by the total amount of possible integers.

Consequently, assuming we have drawn n numbers, we have (1000-n) numbers left to draw.

Thus the probability that no higher number will occur in the rest of the draw is given by:

[(x-H)/(2x)]^(1000-n)

As such, it is best to wait till close to the end of the series to make the choice. H = say 0.8x early on at say n=100 would yield a very low probability that this is the highest possible number and would not be picked. But if H increases significantly and/or n approaches 1000, the probability that 'no higher number would occur' increases and you would be more likely to pick...

Okay here is a simple "test" for this formula. I will present five numbers from one thousand random numbers my computer has generated. Please answer yes or no to each number. I know what the highest number is in the 1000 so good luck.

192821239
-616178961
789673834
680667335
998670174

skc · 16 November 2009

I think we are missing the probability enhancement coming from saying yes only if the number on screen is higher than all the numbers already showned.

(A). Pass up on first n numbers. The probability you are wrong would be n/1000.

(B). Look at number (n+1), if it is the highest number so far, you answer yes. The chance of that being correct is 1/(1000-n). But you only need to make this guess 50% of the time (assuming a range so vast that the chance of the next number being higher or lower is simply 50%).

(C). The remaining 50%, number (n+1) is lower than the highest number so far. You are guaranteed to make the correct call by saying no.

So...the probability of us being right is some sort of product or sum of the 3 terms above. And then we need to solve for the optimal n.

I have a hunch that the optimal n may in fact simply be 1, but I am struggling to put the correct formula together...

Robb · 16 November 2009

skyQuake said:
-Let the range of finite numbers be limited to x and -x (giving a range of 2x)

You cannot use this, as the range is unknown/infinite to the person answering yes or no. ie. We could assume that a single observation of a number generted is Y, so it could be said that Y is continuously uniformly distributed, Y~U(-X,X). Ie. Y doesn't have to take an integer value?

So you would need to use some form of order statistic distribution, Y(n): max(Y1,...Yn)
http://en.wikipedia.org/wiki/Order_statistic#The_order_statistics_of_the_uniform_distribution

I just had a stat exam this morning, and not getting where to take this, i assume this is the more 'advanced' way of solving it? or Im looking at it completly from the wrong angle?

skyQuake · 16 November 2009

Wysiwyg said:
Okay here is a simple "test" for this formula. I will present five numbers from one thousand random numbers my computer has generated. Please answer yes or no to each number. I know what the highest number is in the 1000 so good luck.

192821239
-616178961
789673834
680667335
998670174

Lol can't do; The equation needs the limits. If limits not specified, its as good as finite. What can be done is say this high number is 85% of Limit.
Thanks for the link, reading up on it.

Robb said:
You cannot use this, as the range is unknown/infinite to the person answering yes or no. ie. We could assume that a single observation of a number generted is Y, so it could be said that Y is continuously uniformly distributed, Y~U(-X,X). Ie. Y doesn't have to take an integer value?

So you would need to use some form of order statistic distribution, Y(n): max(Y1,...Yn)
http://en.wikipedia.org/wiki/Order_statistic#The_order_statistics_of_the_uniform_distribution

I just had a stat exam this morning, and not getting where to take this, i assume this is the more 'advanced' way of solving it? or Im looking at it completly from the wrong angle?

Oh i was doing this under the assumption that bellenuit had changed it to a finite range.

bellenuit · 16 November 2009

skyQuake said:
Lol can't do; The equation needs the limits. If limits not specified, its as good as finite. What can be done is say this high number is 85% of Limit.
Thanks for the link, reading up on it.

Oh i was doing this under the assumption that bellenuit had changed it to a finite range.

Yes. After there was some anomalies encountered when infinities were involved in the maths, the range of random numbers was changed to a vast but unknown finite range.

You have a simple computer program that sequentially generates 1000 perfectly random numbers within a finite, but vast range. You do not know what the range is, other than that it is finite.

Wysiwyg · 16 November 2009

skyQuake said:
Lol can't do; The equation needs the limits. If limits not specified, its as good as finite. What can be done is say this high number is 85% of Limit.

Okay thanks. Saying yes to a higher number than previous numbers narrows the chances.

:

Robb · 16 November 2009

bellenuit said:
Yes. After there was some anomalies encountered when infinities were involved in the maths, the range of random numbers was changed to a vast but unknown finite range.

You have a simple computer program that sequentially generates 1000 perfectly random numbers within a finite, but vast range. You do not know what the range is, other than that it is finite.

So discrete random iid numbers then.
Well, its a conditional probability; let Y be the observed value, so Y(1) is the first observation, Y(2) is the second etc.
So the Probability that an observation is the maximum is
P(Y(1))=1/1000
P(Y(2))=1/999 given (Y(2)>Y(2)) otherwise 0
P(Y(3))=1/998 given (Y(3)>Y(1),Y(2)) otherwise 0.
....
P(Y(n))=1/(1001-n) given (Y(n)>Y(1),...,Y(n-1)) otherwise 0

hmmm... good question

schnootle · 16 November 2009

Very famous problem this one. I haven't heard it formulated this way, i believe you would be called a misogynist if you gave it its original name.

I won't let be a spoiler, but am happy to give hints for people that are having fun.

bellenuit · 16 November 2009

schnootle said:
Very famous problem this one. I haven't heard it formulated this way, i believe you would be called a misogynist if you gave it its original name.

I won't let be a spoiler, but am happy to give hints for people that are having fun.

It's at least 35 years since I first heard this problem. It was originally formulated as numbers written on the backs of pieces of paper that were turned upside down. You were to turn a paper and nominate if that was the highest number on all the pieces of paper, including those unturned obviously. I thought I would just change the setting.

I plan to put the solution up tonight sometime. I hope my logic is correct as I have had to recreate what I think is the correct strategy from memory. You can correct me if I'm wrong.

schnootle · 16 November 2009

bellenuit said:
It's at least 35 years since I first heard this problem. It was originally formulated as numbers written on the backs of pieces of paper that were turned upside down. You were to turn a paper and nominate if that was the highest number on all the pieces of paper, including those unturned obviously. I thought I would just change the setting.

I plan to put the solution up tonight sometime. I hope my logic is correct as I have had to recreate what I think is the correct strategy from memory. You can correct me if I'm wrong.

You have done pretty well if you haven't seen the problem for 35 years. I will direct people to some more information when you have put up your solution.

2020hindsight · 16 November 2009

well you're right bellenuit - it is a ripper.

I look forward to the answer.
THe following is not mathematically helpful whatsoever. Just some scribling I've been doing whilst flying around the place.

I started trying to simplify it.
That at some point of time you would have turned n numbers.
the chance of the largest in that sample would be n/1000.
then the chance of the largest being the most recently turned in that sample would be 1/n.

So - surprise surprise, the chances of the largest being the most recently turned would be (n/1000) x (1/n) which simplifies to 1/1000 lol - ... (0.1%)

doh ... big deal I found myself saying.

Not sure how you take account of the fact that , in the other cases (of these) where the most recent wasn't the largest ( = (1- 1/n) (n/1000) ) - that after "n" you keep turning , hoping that the largest is not in the disclosed numbers.

I thought 67% was a bit too much risk that you'd blown it already.

Pure guesswork ( compare betting on one column of the roulette wheel) , my gut feel is that it is about the 33% mark, that you turn up 333 of them, then continue until (as others have said) you get a bigger number than all the numbers to date = the biggest in the upturned pack. THen stop.

That way ( stopping at the first opportunity after the third point) you only have 1/3 chance that you've blown it and gone too far. And you could well get the largest number/card at around the 50% mark - not that there's any logical reason it has to be in the middle of the pack, lol.

Sorry very unmathematical here, just playing around with numbers and thoughts - and approximations. ( the same way I'd play roulette - if I were to get real pissed and play roulette that is )

PS But I wouldn't be surprised if it was after the second number as was suggested way back there somewhere.

But it has been real challenging and fun in a masochistic way. Sadomasochism - never have to say you're sorry. :2twocents

waza1960 · 16 November 2009

Lets just ask Howard Bandy .I reckon he will be able to answer correctly within seconds with 98% probability.

2020hindsight · 16 November 2009

PS
after this you're gonna have to give us the good oil on playing that game where people hold all those golden suitcases - "Deal or No Deal:" I think it's called

schnootle · 16 November 2009

2020hindsight said:
Sorry very unmathematical here, just playing around with numbers and thoughts - and approximations. ( the same way I'd play roulette - if I were to get real pissed and play roulette that is )

Your guessing is better than you give yourself credit for

bellenuit · 16 November 2009

People seem to be now on the right track, so I will give the solution as best as I can. brty was close to the mark when he/she posted "By waiting until the first 500 numbers have been drawn,and then putting your name to the first number that is higher than any in the first 500, should give a probability of 1 in 4 of being correct" That is the strategy, but the number to pass over is not the optimum. 2020 was closer when he/she said " my gut feel is that it is about the 33% mark, that you turn up 333 of them, then continue until (as others have said) you get a bigger number than all the numbers to date = the biggest in the upturned pack".

As I mentioned at the very beginning you can get an approximate answer with very basic probability theory, but to get the exact answer requires a better knowledge of maths than I have. In fact my maths has improved enormously over the last day or so when I started using the Wolfram Alpha computational engine, but more about that below.

If you don't want to get deeply into the maths, you can derive the optimum fraction to use by evaluating for easy fractions like 1/5, 1/4, 1/3, 1/2 etc. and seeing which if these gives the greatest chance of being right. If you do this, you will find that 1/3 gives the greatest chance. Later I will show how to work out the optimum fraction using more advanced mathematics.

So your strategy is to answer No to the first 333 numbers generated (noting the highest number among these) and then say Yes to the first number (if any) that is higher than this. This will give you the greatest chance of success. This is how you work out the probability of being right....

This strategy will give you the correct answer if the 2nd highest number is in the first third (1/3 chance) AND the highest number is not in the first third (2/3 chance). Thus you will be correct 2/9 or 22% of the time (at least).

But it will also give you the correct answer if the 3rd highest number is in the first third (1/3 chance) AND the highest and second highest numbers are not in the first third (2/3 X 2/3 chance) AND the highest number comes up before the second highest number (1/2 chance). This comes to 2/27 or 0.074 or 7.4%. Since this is a mutually exclusive result to the first, both are potential outcomes, so the chances of either happening are 29.4%.

You would also be successful if the 4th highest number is in the first third (1/3) AND the first, second and third highest are not in the first third (2/3 X 2/3 X2/3) AND the highest number is generated before the second or third highest (1/3). This is 8/243 or .033 or 3.3%. Again, as this is a mutually exclusive result, the chances of any one of these three outcomes occurring, which would give the correct result, are now up to 32.7%.

So this is almost a one in three chance, but it will increase slightly more each time you evaluate and include additional possible successful outcomes. The 667th term will be the highest you can go and this will be when the 667th highest number is in the first third AND all 666 higher numbers are not in the first 1/3 AND the highest number comes before the other 665. However, evaluating beyond a few terms is pointless, as each additional term adds proportionally less and less to the final result.

So as an approximation, the chances of success using this strategy is about 1 in 3.

Now for the heavy stuff. If any of you are not aware of the computational engine Wolfram Alpha, I urge you to go to the website http://www.wolframalpha.com/ and play around with it. It will blow your mind when you see its capabilities. There is an introductory video there that is worth watching. I will be using that website in this section.

In the advanced solution instead of using 1/3 as the fraction to turn, you need to first determine what actual fraction will yield the highest chances of success. 1/3 is close, but not the optimum fraction.

Lets say this fraction is n, where n is between 0 and 1.

So following the above basic probability logic, using the same strategy, but this time using n rather than 1/3, we get:

This strategy will give you the correct answer if the 2nd highest number is in the first n of the generated numbers (n chance) AND the highest number is not in the first n of the generated numbers (1-n chance). Thus your chance of success is n(1-n) at least.

But it will also give you the correct answer if the 3rd highest number is in the first n of the generated numbers (n chance) AND the highest and second highest numbers are not in the first n of the generated numbers ((1-n)(1-n)) AND the highest number comes up before the second highest number (1/2). The chance of this happening is 1/2*n*(1-n)^2. Since this is a mutually exclusive result to the first, both are potential outcomes, so the chances of either happening are n(1-n) + 1/2*n*(1-n)^2.

You would also be successful if the 4th highest number is in the first n of the generated numbers (n chance) AND the first, second and third highest are not in the first n of the generated numbers ((1-n)(1-n)(1-n)) AND the highest number is generated before the second or third highest (1/3). This is 1/3*n*(1-n)^3. Again, as this is a mutually exclusive result, the chances of any one of these three outcomes occurring, each of which would each give the correct result, are now n(1-n) + 1/2*n*(1-n)^2 + 1/3*n*(1-n)^3.

This can be continued on by evaluating for the 5th and subsequent highest numbers in a similar fashion, increasing the accuracy of your answer each time. It should be apparent that including the (x+1)th highest number in the evaluation will add 1/x*n*(1-n)^x to the result for your chances of success. As the highest term you can evaluate to is the 1000*(1-n)th highest number, the complete formula for the probability of success is:

sum of 1/x*n*(1-n)^x for x =1 to 1000*(1-n)

So what we want to now do is find what value of n gives the highest value to that formula.

The days when I could work that out are long behind me, but we can use Wolfram Alpha to help us.

Remember each additional term adds proportionally less and less to the accuracy of the answer so lets try firstly just for three terms (e.g. that the 4th highest number is in the first n etc. as described above). So we want Wolfram Alpha to evaluate this formula

global maximum sum(1/x*n*(1-n)^x) for x = 1 to 3

So copy this line in its entirety (including the words global maximum) and paste it into the search field at www.wolframalpha.com and then hit enter or press the = sign. Wait a few seconds for the results to be calculated. There will be a section called Global Maximum. If it shows "approximate form" at the top right of that section, click on that term and it should show "exact form" You will get the formula expanded out into individual terms and then to the right of it, it shows:

0.341572 at n~~0.422862

What that is saying is that formula is at its maximum when n is approximately 0.422862 and it evaluates to 0.341572 when n has that value. In other words, if we calculate to just 3 terms, the optimum number of random numbers to let pass is 422 (instead of 333 using the 1/3 approach), and the chance of success will be 34%.

Now lets make Wolfram Alpha really work and change the 3 in the formula to 199 so it reads "for x = 1 to 199". So we are trying to get really accurate by evaluating to 199 terms. The result under global maximum is so big, you will need to scroll down to see the result which this time shows:

0.367879 at n~~0.367861

If you try to go higher than 199, Wolfram Alpha usually gives up, so this is as close to the optimum as we can get using that tool. In fact the optimum would be only ever so slightly different and would be:

0.367879 at n~~0.367879

Both the optimum fraction to use to maximise the chances of success and the probability of success at that optimum value are the same.

For those of you still awake, this fraction just happens to be 1/e, e being the natural number e. Now, ain't that interesting.

So what the final result is saying is that to maximise the chance of success, say No to the first 367 numbers (but note the highest of them). Then answer Yes to the first number that is higher than the number noted. If you follow that strategy, your chance of identifying the highest number overall (of the 1000 generated) is 36.7%.

2020hindsight · 17 November 2009

bellenuit said:
.... For those of you still awake, this fraction just happens to be 1/e, e being the natural number e. Now, ain't that interesting.

bellenuit
bludy hell , lol.
so we end up in the logarithmic domain after all -

and for someone who used to do calcs on slide rules, I (also) find that fascinating.

cheers, and thanks for the cerebral workout there

PS how do you pronounce your nicname anyways?
Belly knew it?

The Random Number Generator - One of the best Maths problems ever IMO

bellenuit

Wysiwyg

Everyone wants money

skyQuake

Attachments

bellenuit

Wysiwyg

Everyone wants money

skc

Goldmember

Robb

skyQuake

bellenuit

Wysiwyg

Everyone wants money

Robb

schnootle

bellenuit

schnootle

2020hindsight

waza1960

2020hindsight

schnootle

bellenuit

2020hindsight

Similar threads

Connect with us