|
A curious puzzle for those who want to have some fun: https://gist.github.com/evanthebouncy/bde3e510e562da5db6a2 Take a look at this code and try to run it. It creates some random functions f1, f2, ..., f100, then forms a chain by composing them together, and the result is a (random) constant function! How can that be?
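For anyone who can't run the gist, here is a minimal sketch of the kind of setup being described, assuming each f_i is a sigmoid with a random coefficient; the exact functions in the gist may differ:

```python
import math
import random

# Assumed setup: each random function is f_i(x) = sigmoid(r_i * x)
# with a random coefficient r_i (the gist's exact form may differ).
random.seed(0)
coeffs = [random.uniform(-2, 2) for _ in range(100)]

def sigmoid(x):
    # numerically stable: avoids overflow in exp() for large |x|
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def chain(x):
    # computes f100(f99(...f2(f1(x))...))
    for r in coeffs:
        x = sigmoid(r * x)
    return x

# wildly different inputs produce (to double precision) the same output
for x0 in (-1e6, -1.0, 0.0, 3.14, 1e6):
    print(x0, chain(x0))
```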
|
As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
|
By order arguments, one can show that at each value of n, the result of the composition depends more on the random numbers you generated than on the initial x.
This is because at the first iteration, f_1 has already mapped any x to a number between 0 and 1. Then look at the series expansion of 1/(1+e^{-x}) (respectively 1/(1+e^{x})) for n=2 onwards and you'll see that the terms depending on x become negligible, since the powers of 0 < f_1(x) < 1 vanish quickly (unless f_1(x) = 1 identically, in which case f_1 is already a constant function).
So yeah, the result is (almost) never a constant function, but damned close to one. Caveat: if your random number at any point is zero, the function becomes constant.
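To make the expansion concrete, this is just the standard Maclaurin series of the sigmoid, written for the plain 1/(1+e^{-x}) rather than whatever random coefficients the gist uses:

```latex
\frac{1}{1+e^{-x}} \;=\; \frac{1}{2} + \frac{x}{4} - \frac{x^{3}}{48} + O\!\left(x^{5}\right)
```

Each application keeps at most about a quarter of the linear dependence on its input (the slope of the sigmoid is at most 1/4 everywhere), so after n applications the dependence on the original x is suppressed by roughly 4^{-n}.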
|
On June 25 2015 18:27 fixed_point wrote: By order arguments, one can show that at each value of n, the result of the composition depends more on the random numbers you generated than on the initial x.
This is because at the first iteration, f_1 has already mapped any x to a number between 0 and 1. Then look at the series expansion of 1/(1+e^{-x}) (respectively 1/(1+e^{x})) for n=2 onwards and you'll see that the terms depending on x become negligible, since the powers of 0 < f_1(x) < 1 vanish quickly (unless f_1(x) = 1 identically, in which case f_1 is already a constant function).
So yeah, the result is (almost) never a constant function, but damned close to one. Caveat: if your random number at any point is zero, the function becomes constant.
It's fitting that someone with a name like "fixed_point" should answer this lol. But yeah, I know it's not a constant function, it's just faster to say it that way.
Can you elaborate a bit on the derivative argument (series expansion)? Derivatives and function composition have a strong relationship via the chain rule, so we can take the derivative of the huge chain and get a big product of derivatives (which approaches 0 in our problem). Is that what you were trying to say?
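Spelling that out, for the plain sigmoid (the gist's random coefficients would rescale each factor):

```latex
(f_n \circ \cdots \circ f_1)'(x) \;=\; \prod_{i=1}^{n} f_i'(x_{i-1}),
\qquad x_0 = x,\quad x_i = f_i(x_{i-1}).
```

For f_i(t) = 1/(1+e^{-t}), each factor satisfies f_i'(t) = f_i(t)(1 - f_i(t)) \leq 1/4, so the derivative of the whole chain is at most 4^{-n}: exactly the "big product of derivatives that approaches 0".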
|
On June 25 2015 16:42 Xxazn4lyfe51xX wrote: As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
The way I reasoned about it is: consider a pair of numbers x, y and push both through the chain, i.e. x, y → f1(x), f1(y) → f2(f1(x)), f2(f1(y)) → ...
and try to prove that at each step the two values get closer together.
Once you're done with that, you can show that any two distinct values (x, y) converge together after a sufficiently long chain, so the chain is close to a constant function.
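A quick numerical check of that picture, under the same assumption as the sketch above (each f_i is a sigmoid with a random coefficient; the gist's exact functions may differ):

```python
import math
import random

def sigmoid(x):
    # numerically stable sigmoid
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

# assumed setup: f_i(x) = sigmoid(r_i * x) with random r_i
random.seed(1)
coeffs = [random.uniform(-2, 2) for _ in range(30)]

x, y = -5.0, 7.0  # two arbitrary, well-separated inputs
for i, r in enumerate(coeffs, 1):
    x, y = sigmoid(r * x), sigmoid(r * y)
    if i % 5 == 0:
        print(f"step {i:2d}: |x - y| = {abs(x - y):.3e}")
```

The printed gap shrinks by orders of magnitude every few steps, which is the pairwise convergence described above.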
|
On June 25 2015 20:35 evanthebouncy! wrote:
Show nested quote +
On June 25 2015 18:27 fixed_point wrote: By order arguments, one can show that at each value of n, the result of the composition depends more on the random numbers you generated than on the initial x.
This is because at the first iteration, f_1 has already mapped any x to a number between 0 and 1. Then look at the series expansion of 1/(1+e^{-x}) (respectively 1/(1+e^{x})) for n=2 onwards and you'll see that the terms depending on x become negligible, since the powers of 0 < f_1(x) < 1 vanish quickly (unless f_1(x) = 1 identically, in which case f_1 is already a constant function).
So yeah, the result is (almost) never a constant function, but damned close to one. Caveat: if your random number at any point is zero, the function becomes constant.
It's fitting that someone with a name like "fixed_point" should answer this lol. But yeah, I know it's not a constant function, it's just faster to say it that way. Can you elaborate a bit on the derivative argument (series expansion)? Derivatives and function composition have a strong relationship via the chain rule, so we can take the derivative of the huge chain and get a big product of derivatives (which approaches 0 in our problem). Is that what you were trying to say?
Derivatives are unnecessary. I can write more details with greater clarity in LaTeX when I get home.
|
I thought this was pure math. Seems like I need some programming knowledge to understand :\
|
I don't really know how fun that was. If you jam a number through 100 iterations of
shrink_fun(x) = 1 / (1 + e^{-x})
you don't leave a lot of information about the original x at the very end.
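If you literally iterate that single map, everything gets pulled to its unique fixed point. A minimal check:

```python
import math

def shrink_fun(x):
    return 1.0 / (1.0 + math.exp(-x))

# math.exp(-x) overflows for x below about -709, so keep starts in range
for x0 in (-500.0, -3.0, 0.0, 10.0, 500.0):
    x = x0
    for _ in range(100):
        x = shrink_fun(x)
    print(f"start {x0:7.1f} -> {x:.15f}")
```

Every start prints the same value: the unique solution of x = shrink_fun(x), about 0.6590.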
|
On June 26 2015 06:26 fixed_point wrote: Heuristically, this is why the function converges to a constant one. PDF file on google drive
Wow, that's quite a bit more involved than I expected. But I'm sure it's second nature to you, so good!!
|
On June 25 2015 20:37 evanthebouncy! wrote:
Show nested quote +
On June 25 2015 16:42 Xxazn4lyfe51xX wrote: As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
The way I reasoned about it is: consider a pair of numbers x, y and push both through the chain, i.e. x, y → f1(x), f1(y) → f2(f1(x)), f2(f1(y)) → ... and try to prove that at each step the two values get closer together. Once you're done with that, you can show that any two distinct values (x, y) converge together after a sufficiently long chain, so the chain is close to a constant function.
Sounds like a typical contraction to me.
|
On June 26 2015 09:12 Kleinmuuhg wrote:
Show nested quote +
On June 25 2015 20:37 evanthebouncy! wrote:
On June 25 2015 16:42 Xxazn4lyfe51xX wrote: As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
The way I reasoned about it is: consider a pair of numbers x, y and push both through the chain, i.e. x, y → f1(x), f1(y) → f2(f1(x)), f2(f1(y)) → ... and try to prove that at each step the two values get closer together. Once you're done with that, you can show that any two distinct values (x, y) converge together after a sufficiently long chain, so the chain is close to a constant function.
Sounds like a typical contraction to me.
If the domain is two fixed numbers, the functions are contractions. Generally you'll need a fixed constant 0 \leq \lambda < 1 for which |f_i(x) - f_i(y)| \leq \lambda |x - y| for all x, y \in R. I'm not sure we have that here, but then again my brain doesn't work at this hour (need to take a look at the behaviour near x=0)
|
I have a feeling I should know what is going on, but I have no idea
|
On June 26 2015 15:00 fixed_point wrote:
Show nested quote +
On June 26 2015 09:12 Kleinmuuhg wrote:
On June 25 2015 20:37 evanthebouncy! wrote:
On June 25 2015 16:42 Xxazn4lyfe51xX wrote: As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
The way I reasoned about it is: consider a pair of numbers x, y and push both through the chain, i.e. x, y → f1(x), f1(y) → f2(f1(x)), f2(f1(y)) → ... and try to prove that at each step the two values get closer together. Once you're done with that, you can show that any two distinct values (x, y) converge together after a sufficiently long chain, so the chain is close to a constant function.
Sounds like a typical contraction to me.
If the domain is two fixed numbers, the functions are contractions. Generally you'll need a fixed constant 0 \leq \lambda < 1 for which |f_i(x) - f_i(y)| \leq \lambda |x - y| for all x, y \in R. I'm not sure we have that here, but then again my brain doesn't work at this hour (need to take a look at the behaviour near x=0)
Never mind, I'm an idiot. y = f_1(x) is between 0 and 1, and so f_100(f_99(...f_2(y)...)) is a contraction. Therefore you can definitely use the contraction mapping/Banach fixed point theorem to prove that the composition/chain converges to a fixed point (i.e. a constant number) for sufficiently large compositions. Pretty ironic, given my name...
However, I still think the series expansion is perhaps more intuitive for one unfamiliar with this theorem, since the original question asks for a finite sequence of compositions. (Of course, from an abstract point of view, this is just a specific case of the fixed point theorem.)
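The quantitative version of that argument, written for the plain sigmoid (the gist's random coefficients would change the constant): by the mean value theorem each f_i is Lipschitz with constant 1/4, since its derivative is at most 1/4 everywhere, so

```latex
|f_i(x) - f_i(y)| \le \tfrac{1}{4}\,|x - y|
\quad\Longrightarrow\quad
\big|(f_n \circ \cdots \circ f_1)(x) - (f_n \circ \cdots \circ f_1)(y)\big|
\le 4^{-n}\,|x - y|,
```

so any two inputs end up within 4^{-n} of each other. This is the contraction estimate behind the Banach-style argument, and the same idea applies with random coefficients whenever each factor's Lipschitz constant stays below 1.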
|
Having trouble with Berkeley homework?
|
On June 26 2015 21:11 Dagobert wrote: Having trouble with Berkeley homework?
I wish
|
On June 26 2015 21:11 Dagobert wrote: Having trouble with Berkeley homework?
Nah, I graduated 3 yrs ago lol, I'm doing a PhD now. This problem actually came up when I tried to run a randomly initialized neural network of 10 layers without training and noticed it always computes a constant function. I thought that was really cool, so I abstracted away the non-essentials in the hope people would find the simplified problem interesting to look at.
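Roughly the experiment being described, as a hypothetical minimal numpy sketch (the width, weight scaling, and layer count here are illustrative choices, not the original network):

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 fully connected sigmoid layers, weights random and never trained;
# width 32 and 1/sqrt(fan_in) scaling are assumptions for illustration
layers = [(rng.standard_normal((32, 32)) / np.sqrt(32),
           rng.standard_normal(32))
          for _ in range(10)]

def forward(x):
    for W, b in layers:
        x = 1.0 / (1.0 + np.exp(-(W @ x + b)))  # sigmoid activation
    return x

# two very different inputs give nearly identical outputs
a = forward(rng.standard_normal(32) * 10)
b = forward(rng.standard_normal(32) * -10)
print(np.abs(a - b).max())  # small: the untrained deep net is near-constant
```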
|
On June 26 2015 19:59 fixed_point wrote: .... Pretty ironic, given my name... ....
that was my thought the entire time lawl fffffffffffffffffffffffffffffffff(uck)
|
Oh, that's more interesting than the problem here. What are you trying to get the neural network to do?
|
On June 27 2015 16:15 Dagobert wrote: Oh, that's more interesting than the problem here. What are you trying to get the neural network to do?
Nothing, just wanted to play with one for fun (cuz people kept talking about their fantastic learning properties). Just wanted to run it on the MNIST dataset. Couldn't do backpropagation on the deep one unfortunately, so I trained a shallow one to 94% accuracy, which is not bad I think.
|