|
A curious puzzle for those who want to have some fun: https://gist.github.com/evanthebouncy/bde3e510e562da5db6a2 Take a look at this code and try to run it. It creates some random functions f1, f2, ..., f100, then forms a chain by composing them together, and the result is a (random) constant function! How can that be?
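For anyone who can't run the gist, here is a minimal sketch of the kind of setup being described, assuming each f_i is a sigmoid with a random coefficient; the exact functions in the gist may differ:

```python
import math
import random

# Assumed setup: each random function is f_i(x) = sigmoid(r_i * x)
# with a random coefficient r_i (the gist's exact form may differ).
random.seed(0)
coeffs = [random.uniform(-2, 2) for _ in range(100)]

def sigmoid(x):
    # numerically stable: avoids overflow in exp() for large |x|
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def chain(x):
    # computes f100(f99(...f2(f1(x))...))
    for r in coeffs:
        x = sigmoid(r * x)
    return x

# wildly different inputs produce (to double precision) the same output
for x0 in (-1e6, -1.0, 0.0, 3.14, 1e6):
    print(x0, chain(x0))
```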
|
As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
|
By order arguments, one can show that at each value of n, the result of the composition depends more on the random numbers you generated than on the initial x.
This is because at the first iteration, f_1 has already mapped any x to a number between 0 and 1. Then look at the series expansion of 1/(1+e^{-x}) (respectively 1/(1+e^{x})) for n=2 onwards and you'll see that the terms depending on x become negligible, since the powers of 0 < f_1(x) < 1 vanish quickly (unless f_1(x) = 1 identically, in which case f_1 is already a constant function).
So yeah, the result is (almost) never a constant function, but damned close to one. Caveat: if your random number at any point is zero, the function becomes constant.
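To make the expansion concrete, this is just the standard Maclaurin series of the sigmoid, written for the plain 1/(1+e^{-x}) rather than whatever random coefficients the gist uses:

```latex
\frac{1}{1+e^{-x}} \;=\; \frac{1}{2} + \frac{x}{4} - \frac{x^{3}}{48} + O\!\left(x^{5}\right)
```

Each application keeps at most about a quarter of the linear dependence on its input (the slope of the sigmoid is at most 1/4 everywhere), so after n applications the dependence on the original x is suppressed by roughly 4^{-n}.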
|
On June 25 2015 18:27 fixed_point wrote: By order arguments, one can show that at each value of n, the result of the composition depends more on the random numbers you generated than on the initial x.
This is because at the first iteration, f_1 has already mapped any x to a number between 0 and 1. Then look at the series expansion of 1/(1+e^{-x}) (respectively 1/(1+e^{x})) for n=2 onwards and you'll see that the terms depending on x become negligible, since the powers of 0 < f_1(x) < 1 vanish quickly (unless f_1(x) = 1 identically, in which case f_1 is already a constant function).
So yeah, the result is (almost) never a constant function, but damned close to one. Caveat: if your random number at any point is zero, the function becomes constant.
It's fitting that someone with a name like "fixed_point" should answer this lol. But yeah, I know it's not a constant function, it's just faster to say it that way.
Can you elaborate a bit on the derivative argument (series expansion)? Derivatives and function composition have a strong relationship via the chain rule, so we can take the derivative of the huge chain and get a big product of derivatives (which approaches 0 in our problem). Is that what you were trying to say?
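Spelling that out, for the plain sigmoid (the gist's random coefficients would rescale each factor):

```latex
(f_n \circ \cdots \circ f_1)'(x) \;=\; \prod_{i=1}^{n} f_i'(x_{i-1}),
\qquad x_0 = x,\quad x_i = f_i(x_{i-1}).
```

For f_i(t) = 1/(1+e^{-t}), each factor satisfies f_i'(t) = f_i(t)(1 - f_i(t)) \leq 1/4, so the derivative of the whole chain is at most 4^{-n}: exactly the "big product of derivatives that approaches 0".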
|
On June 25 2015 16:42 Xxazn4lyfe51xX wrote: As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
The way I reasoned about it is: consider a pair of numbers x, y and push both through the chain, i.e. x, y → f1(x), f1(y) → f2(f1(x)), f2(f1(y)) → ...
and try to prove that at each step the two values get closer together.
Once you're done with that, you can show that any two distinct values (x, y) converge together after a sufficiently long chain, so the chain is close to a constant function.
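A quick numerical check of that picture, under the same assumption as the sketch above (each f_i is a sigmoid with a random coefficient; the gist's exact functions may differ):

```python
import math
import random

def sigmoid(x):
    # numerically stable sigmoid
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

# assumed setup: f_i(x) = sigmoid(r_i * x) with random r_i
random.seed(1)
coeffs = [random.uniform(-2, 2) for _ in range(30)]

x, y = -5.0, 7.0  # two arbitrary, well-separated inputs
for i, r in enumerate(coeffs, 1):
    x, y = sigmoid(r * x), sigmoid(r * y)
    if i % 5 == 0:
        print(f"step {i:2d}: |x - y| = {abs(x - y):.3e}")
```

The printed gap shrinks by orders of magnitude every few steps, which is the pairwise convergence described above.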
|
On June 25 2015 20:35 evanthebouncy! wrote:
Show nested quote +
On June 25 2015 18:27 fixed_point wrote: By order arguments, one can show that at each value of n, the result of the composition depends more on the random numbers you generated than on the initial x.
This is because at the first iteration, f_1 has already mapped any x to a number between 0 and 1. Then look at the series expansion of 1/(1+e^{-x}) (respectively 1/(1+e^{x})) for n=2 onwards and you'll see that the terms depending on x become negligible, since the powers of 0 < f_1(x) < 1 vanish quickly (unless f_1(x) = 1 identically, in which case f_1 is already a constant function).
So yeah, the result is (almost) never a constant function, but damned close to one. Caveat: if your random number at any point is zero, the function becomes constant.
It's fitting that someone with a name like "fixed_point" should answer this lol. But yeah, I know it's not a constant function, it's just faster to say it that way. Can you elaborate a bit on the derivative argument (series expansion)? Derivatives and function composition have a strong relationship via the chain rule, so we can take the derivative of the huge chain and get a big product of derivatives (which approaches 0 in our problem). Is that what you were trying to say?
Derivatives are unnecessary. I can write more details with greater clarity in LaTeX when I get home.
|
I thought this was pure math. Seems like I need some programming knowledge to understand :\
|
I don't really know how fun that was. If you jam a number through 100 iterations of
shrink_fun(x) = 1 / (1 + e^{-x})
you don't leave a lot of information about the original x at the very end.
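If you literally iterate that single map, everything gets pulled to its unique fixed point. A minimal check:

```python
import math

def shrink_fun(x):
    return 1.0 / (1.0 + math.exp(-x))

# math.exp(-x) overflows for x below about -709, so keep starts in range
for x0 in (-500.0, -3.0, 0.0, 10.0, 500.0):
    x = x0
    for _ in range(100):
        x = shrink_fun(x)
    print(f"start {x0:7.1f} -> {x:.15f}")
```

Every start prints the same value: the unique solution of x = shrink_fun(x), about 0.6590.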
|
On June 26 2015 06:26 fixed_point wrote: Heuristically, this is why the function converges to a constant one. PDF file on google drive
Wow, that's quite a bit more involved than I expected. But I'm sure it's second nature to you, so good!!
|
On June 25 2015 20:37 evanthebouncy! wrote:
Show nested quote +
On June 25 2015 16:42 Xxazn4lyfe51xX wrote: As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
The way I reasoned about it is: consider a pair of numbers x, y and push both through the chain, i.e. x, y → f1(x), f1(y) → f2(f1(x)), f2(f1(y)) → ... and try to prove that at each step the two values get closer together. Once you're done with that, you can show that any two distinct values (x, y) converge together after a sufficiently long chain, so the chain is close to a constant function.
Sounds like a typical contraction to me.
|
On June 26 2015 09:12 Kleinmuuhg wrote:
Show nested quote +
On June 25 2015 20:37 evanthebouncy! wrote:
On June 25 2015 16:42 Xxazn4lyfe51xX wrote: As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
The way I reasoned about it is: consider a pair of numbers x, y and push both through the chain, i.e. x, y → f1(x), f1(y) → f2(f1(x)), f2(f1(y)) → ... and try to prove that at each step the two values get closer together. Once you're done with that, you can show that any two distinct values (x, y) converge together after a sufficiently long chain, so the chain is close to a constant function.
Sounds like a typical contraction to me.
If the domain is two fixed numbers, the functions are contractions. Generally you'll need a fixed constant 0 \leq \lambda < 1 for which |f_i(x) - f_i(y)| \leq \lambda |x - y| for all x, y \in R. I'm not sure we have that here, but then again my brain doesn't work at this hour (need to take a look at the behaviour near x=0)
|
I have a feeling I should know what is going on, but I have no idea
|
On June 26 2015 15:00 fixed_point wrote:
Show nested quote +
On June 26 2015 09:12 Kleinmuuhg wrote:
On June 25 2015 20:37 evanthebouncy! wrote:
On June 25 2015 16:42 Xxazn4lyfe51xX wrote: As far as I can tell, it's not really a constant function. As n grows larger, the function approaches a random, constant value. For the purposes of the interpreter they give us, the point at which the function's output stops changing, to the number of digits the interpreter prints, is around n = 16.
I'm not good enough, or at least not patient enough, with math to figure out exactly why pushing numbers through more and more random functions tends towards constancy, though.
The way I reasoned about it is: consider a pair of numbers x, y and push both through the chain, i.e. x, y → f1(x), f1(y) → f2(f1(x)), f2(f1(y)) → ... and try to prove that at each step the two values get closer together. Once you're done with that, you can show that any two distinct values (x, y) converge together after a sufficiently long chain, so the chain is close to a constant function.
Sounds like a typical contraction to me.
If the domain is two fixed numbers, the functions are contractions. Generally you'll need a fixed constant 0 \leq \lambda < 1 for which |f_i(x) - f_i(y)| \leq \lambda |x - y| for all x, y \in R. I'm not sure we have that here, but then again my brain doesn't work at this hour (need to take a look at the behaviour near x=0)
Never mind, I'm an idiot. y = f_1(x) is between 0 and 1, and so f_100(f_99(...f_2(y)...)) is a contraction. Therefore you can definitely use the contraction mapping/Banach fixed point theorem to prove that the composition/chain converges to a fixed point (i.e. a constant number) for sufficiently large compositions. Pretty ironic, given my name...
However, I still think the series expansion is perhaps more intuitive for one unfamiliar with this theorem, since the original question asks for a finite sequence of compositions. (Of course, from an abstract point of view, this is just a specific case of the fixed point theorem.)
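The quantitative version of that argument, written for the plain sigmoid (the gist's random coefficients would change the constant): by the mean value theorem each f_i is Lipschitz with constant 1/4, since its derivative is at most 1/4 everywhere, so

```latex
|f_i(x) - f_i(y)| \le \tfrac{1}{4}\,|x - y|
\quad\Longrightarrow\quad
\big|(f_n \circ \cdots \circ f_1)(x) - (f_n \circ \cdots \circ f_1)(y)\big|
\le 4^{-n}\,|x - y|,
```

so any two inputs end up within 4^{-n} of each other. This is the contraction estimate behind the Banach-style argument, and the same idea applies with random coefficients whenever each factor's Lipschitz constant stays below 1.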
|
Having trouble with Berkeley homework?
|
On June 26 2015 21:11 Dagobert wrote: Having trouble with Berkeley homework?
I wish
|
On June 26 2015 21:11 Dagobert wrote: Having trouble with Berkeley homework?
Nah, I graduated 3 yrs ago lol, I'm doing a PhD now. This problem actually came up when I tried to run a randomly initialized neural network of 10 layers without training and noticed it always computes a constant function. I thought that was really cool, so I abstracted away the non-essentials in the hope people would find the simplified problem interesting to look at.
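Roughly the experiment being described, as a hypothetical minimal numpy sketch (the width, weight scaling, and layer count here are illustrative choices, not the original network):

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 fully connected sigmoid layers, weights random and never trained;
# width 32 and 1/sqrt(fan_in) scaling are assumptions for illustration
layers = [(rng.standard_normal((32, 32)) / np.sqrt(32),
           rng.standard_normal(32))
          for _ in range(10)]

def forward(x):
    for W, b in layers:
        x = 1.0 / (1.0 + np.exp(-(W @ x + b)))  # sigmoid activation
    return x

# two very different inputs give nearly identical outputs
a = forward(rng.standard_normal(32) * 10)
b = forward(rng.standard_normal(32) * -10)
print(np.abs(a - b).max())  # small: the untrained deep net is near-constant
```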
|
On June 26 2015 19:59 fixed_point wrote: .... Pretty ironic, given my name... ....
that was my thought the entire time lawl fffffffffffffffffffffffffffffffff(uck)
|
Oh, that's more interesting than the problem here. What are you trying to get the neural network to do?
|
On June 27 2015 16:15 Dagobert wrote: Oh, that's more interesting than the problem here. What are you trying to get the neural network to do?
Nothing, just wanted to play with one for fun (cuz people kept talking about their fantastic learning properties). Just wanted to run it on the MNIST dataset. Couldn't do backpropagation on the deep one unfortunately, so I trained a shallow one to 94% accuracy, which is not bad I think.
|