The_Templar
your Country 52797 Posts
August 28 2014 15:18 GMT
#2701
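There was an argument in my information and coding class today about two binomial strings, where I was the only person who thought my point was valid at all.
1010101001 0001110101 0110100110 1001010100 1001001101
1000111010 0111101101 1110111111 1011001111 1100010110
Which of these is randomly generated, and which of these was created by a human?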
Acrofales
Spain 18000 Posts
August 28 2014 15:54 GMT
#2702
On August 29 2014 00:18 The_Templar wrote:
There was an argument in my information and coding class today about two binomial strings, where I was the only person who thought my point was valid at all.
1010101001 0001110101 0110100110 1001010100 1001001101
1000111010 0111101101 1110111111 1011001111 1100010110
Which of these is randomly generated, and which of these was created by a human?

Honestly, there's not really enough info to go on... what is the human trying to do when creating this? Generate a random sequence? Or give some kind of meaning? If he's trying to create a random sequence, I'll go with him writing the first sequence, because it has fewer long runs, and humans tend to think long runs of the same character are atypical of random strings. However, it's pretty tenuous.
EDIT: I say it's tenuous because if these strings actually represent something else, and we were to generate the objects they represent (for instance, integers between 0 and 1023) and then convert them to string form, this interpretation of human bias is invalidated.
Another argument, maybe in favour of the second one being the human's, is that the last two groups of the first sequence are quite similar (both start with 10010). A "human random generator" doesn't like that kind of pattern either.
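The run-length observation can be checked directly. This is a minimal Python sketch, assuming the two 50-bit strings are exactly as posted; the names STRING_1, STRING_2 and longest_run are illustrative and do not come from the thread.

```python
from itertools import groupby

# The two strings from the opening post, with the grouping spaces removed.
STRING_1 = "1010101001" "0001110101" "0110100110" "1001010100" "1001001101"
STRING_2 = "1000111010" "0111101101" "1110111111" "1011001111" "1100010110"


def longest_run(bits: str) -> int:
    """Length of the longest block of identical consecutive characters."""
    return max(len(list(group)) for _, group in groupby(bits))


run_1 = longest_run(STRING_1)
run_2 = longest_run(STRING_2)
print(run_1, run_2)
```

On these inputs the first string never repeats a bit more than 3 times in a row, while the second contains a run of 7, which is the difference the "humans avoid long runs" heuristic relies on.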
ComaDose
Canada 10357 Posts
August 28 2014 16:04 GMT
#2703
wouldn't there be about the same chance that the computer generated either of those strings
Acrofales
Spain 18000 Posts
August 28 2014 16:05 GMT
#2704
On August 29 2014 01:04 ComaDose wrote:
wouldn't there be about the same chance that the computer generated either of those strings

Yes. But I think the question is not so much about the computer, but about human bias.
Najda
United States 3765 Posts
August 28 2014 16:49 GMT
#2705
Zess
Adun Toridas! 9144 Posts
August 28 2014 17:30 GMT
#2706
The_Templar
your Country 52797 Posts
August 28 2014 17:39 GMT
#2707
There are only two strings, I just happened to divide them into groups of ten
Najda
United States 3765 Posts
August 28 2014 17:55 GMT
#2708
On August 29 2014 02:39 The_Templar wrote:
There are only two strings, I just happened to divide them into groups of ten

Oh I see that now, my phone broke the format and it's much more obvious on the computer. I'll just say the first string then.
LSB
United States 5171 Posts
August 28 2014 18:52 GMT
#2709
On August 29 2014 00:18 The_Templar wrote:
There was an argument in my information and coding class today about two binomial strings, where I was the only person who thought my point was valid at all.
1010101001 0001110101 0110100110 1001010100 1001001101
1000111010 0111101101 1110111111 1011001111 1100010110
Which of these is randomly generated, and which of these was created by a human?

For a serious answer.
Assumption #1: One of the strings is human generated, one of the strings is computer generated.
Assumption #2: The computer picks 0 and 1 at true random.
String 1 has 24 ones; this seems to be the one most likely to be generated by a random number generator.
String 2 has 33 ones.
The chance of observing 33 or more successes in 50 trials is 1.64%; double this if you want to include the chance of 17 or fewer, for 3.28%, which is less than the 5% value typically used for "statistical significance".
Thus it is far more likely the first is randomly generated. My statistics is rusty so correct me if I'm wrong plox.
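These figures can be reproduced with Python's standard library. This is a minimal sketch, assuming that Assumption #2 means 50 independent fair bits (p = 0.5); the names STRING_1, STRING_2 and upper_tail are illustrative and do not come from the thread.

```python
from math import comb

# The two strings from the opening post, with the grouping spaces removed.
STRING_1 = "1010101001" "0001110101" "0110100110" "1001010100" "1001001101"
STRING_2 = "1000111010" "0111101101" "1110111111" "1011001111" "1100010110"


def upper_tail(n: int, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 1/2), computed exactly from binomial coefficients."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


ones_1 = STRING_1.count("1")
ones_2 = STRING_2.count("1")

# One-sided: at least as many ones as string 2 has.
one_sided = upper_tail(len(STRING_2), ones_2)
# Two-sided: the fair-coin distribution is symmetric, so "17 or fewer ones"
# has the same probability as "33 or more ones".
two_sided = 2 * one_sided

print(ones_1, ones_2)
print(f"one-sided: {one_sided:.2%}, two-sided: {two_sided:.2%}")
```

The counts come out as 24 and 33, and the tail probabilities round to the 1.64% and 3.28% quoted in the post.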
Acrofales
Spain 18000 Posts
August 28 2014 19:20 GMT
#2710
On August 29 2014 03:52 LSB wrote:
For a serious answer.
Assumption #1: One of the strings is human generated, one of the strings is computer generated.
Assumption #2: The computer picks 0 and 1 at true random.
String 1 has 24 ones; this seems to be the one most likely to be generated by a random number generator.
String 2 has 33 ones.
The chance of observing 33 or more successes in 50 trials is 1.64%; double this if you want to include the chance of 17 or fewer, for 3.28%, which is less than the 5% value typically used for "statistical significance".
Thus it is far more likely the first is randomly generated. My statistics is rusty so correct me if I'm wrong plox.

Eh, I kinda disagree. While you seem to be right on the math (I just calculated part of the tails manually, didn't plug it into R, and got bored after 36/14, but it seems to be heading for the percentages you give), you're dismissing the fact that it's not just one string being drawn up by a computer, but also the other one being drawn up by a human, who we are assuming is doing his best to generate a "random" sequence. Maybe a "bias towards 1s" is a human bias (it might be, for all I know), but I think the human would generate fewer than 3 in 100 sequences with such a lopsided count: if asked to write a random sequence of 50 1s and 0s, I for one would take good care to never stray too far from 25 of each.
LSB
United States 5171 Posts
August 28 2014 19:34 GMT
#2711
On August 29 2014 04:20 Acrofales wrote:
Eh, I kinda disagree. While you seem to be right on the math (I just calculated part of the tails manually, didn't plug it into R, and got bored after 36/14, but it seems to be heading for the percentages you give), you're dismissing the fact that it's not just one string being drawn up by a computer, but also the other one being drawn up by a human, who we are assuming is doing his best to generate a "random" sequence. Maybe a "bias towards 1s" is a human bias (it might be, for all I know), but I think the human would generate fewer than 3 in 100 sequences with such a lopsided count: if asked to write a random sequence of 50 1s and 0s, I for one would take good care to never stray too far from 25 of each.

I considered that approach; however, you are adding even more assumptions.
Theoretically we can assume that the collection of human biases is normally distributed around some number; however, we have no idea what that number is (it might not even be 50%), and if we do assume 50%, we would be sampling an assumption, which would introduce a boatload of unmeasurable error.
ComaDose
Canada 10357 Posts
August 28 2014 19:38 GMT
#2712
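how much someone knows about statistics and random number generation would also affect how well they made a random string of numbers, so it would vary greatly from person to person.
can you tell us what your point was and what the answer is if there is one? my answer is that it could be either, we don't know.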
GettingIt
1656 Posts
August 28 2014 19:58 GMT
#2713
The_Templar
your Country 52797 Posts
August 28 2014 20:13 GMT
#2714
On August 29 2014 04:38 ComaDose wrote:
how much someone knows about statistics and random number generation would also affect how well they made a random string of numbers, so it would vary greatly from person to person.
can you tell us what your point was and what the answer is if there is one? my answer is that it could be either, we don't know.

The point I made is that, in isolation, both are far more likely to be human generated, and there was therefore no way to actually tell. Nobody agreed with me, and everyone found it obvious that the second one was computer generated and not the first. Of course this was correct.
LSB
United States 5171 Posts
August 28 2014 20:25 GMT
#2715
On August 29 2014 05:13 The_Templar wrote:
The point I made is that, in isolation, both are far more likely to be human generated, and there was therefore no way to actually tell. Nobody agreed with me, and everyone found it obvious that the second one was computer generated and not the first. Of course this was correct.

Welcome to peer pressure and confirmation bias.
Najda
United States 3765 Posts
August 28 2014 20:28 GMT
#2716
On August 29 2014 05:13 The_Templar wrote:
The point I made is that, in isolation, both are far more likely to be human generated, and there was therefore no way to actually tell. Nobody agreed with me, and everyone found it obvious that the second one was computer generated and not the first. Of course this was correct.

I'll agree with that now that I see LSB's statistical analysis
Acrofales
Spain 18000 Posts
August 28 2014 20:31 GMT
#2717
On August 29 2014 05:13 The_Templar wrote:
The point I made is that, in isolation, both are far more likely to be human generated, and there was therefore no way to actually tell. Nobody agreed with me, and everyone found it obvious that the second one was computer generated and not the first. Of course this was correct.

I don't think you phrased that properly, because I don't really see why either of the strings is "far more likely" to be generated by a human than by a computer. I do agree that the underlying assumptions for stating the second one is computer-generated are tenuous... and a better argument is that in isolation it is not easy to state which is which. As LSB's math above shows, a computer will only generate a similarly lopsided string in 3% of the cases, so it's not exactly a "typical" outcome for a random string generator either.
@LSB: you have to make some assumptions. Otherwise all you're saying is that a string similar to the bottom one is less likely to be generated by a computer than the top one, in which case you are throwing away the information that you know the other one is generated by a human... and it's not as if we know absolutely nothing about humans and should therefore simply assign to them the one that is less likely to be generated by a computer.
LSB
United States 5171 Posts
August 28 2014 20:35 GMT
#2718
On August 29 2014 05:31 Acrofales wrote:
I don't think you phrased that properly, because I don't really see why either of the strings is "far more likely" to be generated by a human than by a computer. I do agree that the underlying assumptions for stating the second one is computer-generated are tenuous... and a better argument is that in isolation it is not easy to state which is which. As LSB's math above shows, a computer will only generate a similarly lopsided string in 3% of the cases, so it's not exactly a "typical" outcome for a random string generator either.
@LSB: you have to make some assumptions. Otherwise all you're saying is that a string similar to the bottom one is less likely to be generated by a computer than the top one, in which case you are throwing away the information that you know the other one is generated by a human... and it's not as if we know absolutely nothing about humans and should therefore simply assign to them the one that is less likely to be generated by a computer.

Just because you have data doesn't mean you have to, or should, incorporate it into a model. In fact, in this case incorporating the data would introduce a huge amount of error, rather than simplify things.
EDIT: Technically speaking it is impossible to incorporate it into the model unless you want to throw out statistics. There are a variety of reasons, the chief being that you can't use two variables to describe two data points.
Acrofales
Spain 18000 Posts
August 28 2014 20:42 GMT
#2719
On August 29 2014 05:35 LSB wrote:
Just because you have data doesn't mean you have to, or should, incorporate it into a model. In fact, in this case incorporating the data would introduce a huge amount of error, rather than simplify things.

I disagree. As long as you do it in a principled manner.
I think I could make a fairly simple Bayesian classifier that does better than random at predicting human strings, looking at "longest run of identical consecutive digits" as one of the features. Perhaps "deviation from the expected number of 1s" is another one, although I have no evidence to back the second one up.
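A "longest run" feature only becomes a classifier once it has a likelihood under each source. The thread supplies no human model, so the sketch below covers only the fair-coin side: the exact probability that n random bits contain no run longer than k. The function name and the counting approach (ordered sums of run lengths) are illustrative assumptions, not something proposed in the thread.

```python
from fractions import Fraction


def p_longest_run_at_most(n: int, k: int) -> Fraction:
    """Exact P(longest run <= k) for n independent fair bits (n >= 1, k >= 1)."""
    # A bit string is a sequence of alternating runs. Once the first bit is
    # fixed, strings of length m whose runs are all <= k correspond to the
    # ways of writing m as an ordered sum of parts taken from 1..k.
    ways = [0] * (n + 1)
    ways[0] = 1
    for m in range(1, n + 1):
        ways[m] = sum(ways[m - j] for j in range(1, min(k, m) + 1))
    # Two choices for the first bit, 2**n equally likely strings in total.
    return Fraction(2 * ways[n], 2**n)


# Fair-coin probability of a longest run of at most 3 (as in the first string)
# and of a longest run of at least 7 (as in the second string).
p_short = p_longest_run_at_most(50, 3)
p_long = 1 - p_longest_run_at_most(50, 6)
print(float(p_short), float(p_long))
```

A Bayesian comparison would weigh these against the matching probabilities under a human model, and that model would have to be estimated from real human-written strings.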
LSB
United States 5171 Posts
August 28 2014 20:54 GMT
#2720
On August 29 2014 05:42 Acrofales wrote:
I disagree. As long as you do it in a principled manner.
I think I could make a fairly simple Bayesian classifier that does better than random at predicting human strings, looking at "longest run of identical consecutive digits" as one of the features. Perhaps "deviation from the expected number of 1s" is another one, although I have no evidence to back the second one up.

This is the fatal trap I am pointing out that you are falling into.
You have three assumptions:
1) The computer behaves a certain way
2) A typical human behaves a certain way
3) The specific human who picked the number sequence behaves like a typical human
I make one. See the difference?