The time has come, and I have been politely asked to remove my math questions from the Big Programming Thread.

So why not have a math thread? Hopefully it isn't just me by myself in here, but I would think that there should be as much, if not more interest in maths than in programming anyways.

So what is welcome here? Anything math related! Random questions, theory, homework discussion, anything. But we must have proper etiquette (as arbitrarily determined by me) with the following Dos and Don'ts.

Do: Be respectful - don't be condescending or make fun of people. Not everyone is at the same level and we should all be here to learn or teach. There will be zero tolerance for excessive rudeness or trolling.

Use google - If you are looking for help understanding or working out a problem, do a simple search to see if there are sufficient resources out there for you already. If it's a problem with many resources but you still don't understand, explain why.

Accurately explain your problem - if people are expected to put time into helping to explain a problem, then you should be putting the time in to accurately explain the problem so they can understand it.

Make your work readable. If you upload a scan/pdf, we need to be able to read the whole thing. If the equation is too complicated (symbols, sums, etc.) to simply type out and you don't have a scanner, you can use LaTeX online. There is a free editor: here.

Do not: Use this thread to cheat on homework. It's for learning. You don't learn by cheating. I don't think it's reasonably enforceable to try to stop people from cheating, but if people think you are only here for solutions then you will probably be shamed for it.

Take contributors for granted. Don't spam your questions, or get upset if you don't get answered. If you received help, maybe try to help others if you can. Or at least continue to be respectful.

Go off topic. If it's loosely related to math you can probably get away with it. If it's completely off topic you may lose privileges to post in here.

I've been eager to learn about the Fourier transform. In particular I wanted to use it to do pitch adjustments of sound samples. I have some code here for playing sounds, and want to add some pitch adjustment stuff.

Would anyone mind chatting with me about the mathematics? I was hoping to find someone knowledgeable about Fourier transforms and their applications that I could bounce a bunch of questions off of. Please PM me if anyone would be so kind!

Edit: Ended up figuring it all out! Posted followup on 6th page of this thread here.

i took a 3rd year math course in "applied combinatorics". 4 month course. had to take it to complete my degree. the grade was based on 2 mid-terms at 25% each and the final exam at 50%.

i got the 3rd highest mark in a class of ~35 on the 1st mid-term with 50/100. 1 guy got 66/100 and another guy got a perfect 100/100. obviously, the class where we got our grades back, we were a little devastated.... lots of moaning, whining and complaining... prof speaks up.... and says...

"if i flip a coin 35 times and it only comes up heads 2 times i do not statistically adjust that result. i note it as a special and unusual case". what he was saying was "there could well be 33 morons in this room ... so shut up you idiots"

a bunch of people dropped the course without academic penalty immediately after receiving their grade in the 1st mid-term. There were 20 people left in the class. it was fucking brutal. toughest math course i ever took. i spent 10 amazing fun filled weekends with my applied combinatorics textbook that semester. i still remember half the content of the course and i took it more than 8 years ago.

@JimmyJRaynor: if your university needs more than 3/35 people to pass the year (and it's very likely), your course was very badly designed imho :o.

I have a question but I'm not sure if the answer is trivial or if we don't have the answer yet; it's related to probability altho with some CS in it so I'll ask here!

Say we build a bayesian model that estimates the odds of a real-life event A happening at 95%. But the event happens in real life only once (the results of an election, for example). Say A happens as "predicted". So what? Was our estimation accurate? Maybe it actually had an 80% chance of happening, but it still happened because it's still a likely event. And it didn't necessarily happen by random chance, especially if the event we are trying to predict is an election and not some randomly generated thing. So how can we judge if our model was well suited? I guess if we just want our model to predict what will happen while minimizing loss and so on, like we often do, then there is no problem. But I feel like there's an inherent philosophical/epistemological problem with Bayesian models :/. edit: would being able to reproduce the event enough times for the results to be statistically significant allow us to correctly evaluate whether our estimation was right? But it still wouldn't be an absolutely precise estimation, and is it even possible to have such a thing?

(think about FiveThirtyEight and the likes for context)

edit 2: another "application" of this question would be smartphone weather predictions! They probably use some kind of bayesian model for that, and they will tell you: "there is a 30% chance that it'll rain at this hour." How are we supposed to use this information? Assuming their estimation is roughly correct, a wise choice would be to take an umbrella if doing so has a positive mathematical expectation, because we have the probability of the event... but how can I quantitatively assess how much of a pain in the ass not having an umbrella would be? Like I can say: "I would feel neutral if I have an umbrella and it is raining, so I assign 0 value to having an umbrella". But to have a rough idea of how painful it'll be to not have an umbrella if it happens to rain... I would need to know how much and for how long it would rain! And if they can't tell us that, I can't really put their intel about the weather to good use. It won't ever be the wisest choice :/.
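To make the umbrella example concrete: the "positive expectation" rule is just comparing expected costs. A tiny sketch with made-up utility numbers (the costs here are pure illustration, exactly the quantities the post says are hard to pin down):

```python
def take_umbrella(p_rain, cost_carry, cost_soaked):
    """Take the umbrella iff the expected pain of getting soaked exceeds the carry cost."""
    return p_rain * cost_soaked > cost_carry

take_umbrella(0.30, cost_carry=1.0, cost_soaked=5.0)  # 0.3 * 5.0 = 1.5 > 1.0 -> True
```

The hard part is exactly what the post says: the forecast gives you p_rain, but cost_soaked depends on how much and for how long it rains, which the forecast doesn't tell you.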

Any program such as Mathematica or Matlab will give you the answer ezpz. Why bother calculating it by hand?

It would take a very long time to properly type in the right order, so it is probably safer to quickly check if it's solvable (by looking for forbidden operations or whatever), and if it is, a lot of software could indeed do it.

With a basic (high school) understanding of logarithms and trigonometry, this should be a cinch conceptually. It's just a lot of procedures and arithmetic.

Seems trivial, actually. Just a load of simplifications and then arithmetic. But things like the sqrt of a cubed number are simply the absolute value of that number. I looked it over and don't see an obvious multiplication by 0, so don't feel like doing the work. But it's not difficult as such.

A lot of procedures? 1/2 hours?

Probably about 3-5 minutes if you have a calculator and know what you're doing.

You can just use a calculator. Don't be scared by complex looking things. Take a step back and look at what is actually written. In the end it's just a sum and multiplication of a few terms.

Now all you have to do is calculate A till E first and then use the sum for the final answer.

I updated the OP. If you upload scans, please make sure they are readable.

If your problems are too hard to simply type and you do not have a scanner, please use LaTeX. An easy solution is an online LaTeX editor like this: https://www.codecogs.com/latex/eqneditor.php

On June 09 2017 23:21 Nin54545 wrote: Thank you for the A Manit0u, i am not so confident in my ability, i will study equations in a few years...

Then it is probably too early

In this specific case, sin(30), cos(60) and log100(10) are just clever ways of writing 1/2, and most sqrt() also disappear. Pretty much everything simplifies nicely; it's basically an equation that writes something simple in a complicated way.
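Those three are easy to sanity-check in a couple of lines of Python (using radians for the trig, and log(10, 100) for log base 100 of 10):

```python
import math

half_1 = math.sin(math.radians(30))  # sin(30 degrees)
half_2 = math.cos(math.radians(60))  # cos(60 degrees)
half_3 = math.log(10, 100)           # log base 100 of 10

# all three equal 1/2 (up to floating point)
```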

The weird ones are:
- 14.661 and 21584, which seem arbitrary. Probably aimed at obtaining a given number as a result, but not an elegant way to do it
- D just looks wrong. Probably 512 and not 5*sqrt(2), but even then it should probably read (512*0.5)² instead of 512*(0.5)² (and then D=8)
- sqrt(1/2.sqrt(16)) in C looks strange, it leaves a sqrt(2) in the equation which is awkward when everything else is just fractions

Doing some biophysical modeling and analyzing the (possible) oscillations in noise, I did write some FT code. It's a bit different from pitch adjustments. I wanted a power spectrum of time series data. I used the FFTW C library, which seems the fastest thing you can get to do FT, unless you use something optimized for a specific architecture/chipset/hardware. It is reasonably straightforward to use, and you can call it from any language; C/Python/matlab/Julia.
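For anyone curious, the same power-spectrum computation can be sketched without FFTW at all, e.g. with numpy's built-in FFT (slower, but fine for trying things out; the function name here is just illustrative):

```python
import numpy as np

def power_spectrum(x, dt):
    """One-sided power spectrum of a real time series sampled every dt seconds."""
    freqs = np.fft.rfftfreq(len(x), d=dt)
    power = np.abs(np.fft.rfft(x)) ** 2
    return freqs, power

# a 5 Hz sine sampled at 100 Hz for 1 second should show a single peak at 5 Hz
t = np.arange(0.0, 1.0, 0.01)
freqs, power = power_spectrum(np.sin(2 * np.pi * 5 * t), dt=0.01)
```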

It is very much a black box; the library is so fast because it divides the problem into chunks and uses a mix of several numerical methods, depending on the nature of the problem and the hardware you are running on. It is completely opaque. But since it is an industry standard, that's ok.

To me the signal processing element of it all was a bit of a dark art. You need to be an electrical engineer specialized in signal processing to really know how to decide on the parameters you want to use to most effectively convert time series data into frequency series data. Windowing, sampling, spectral leakage, aliasing, frequency resolution, some version of the Heisenberg uncertainty principle saying that increasing frequency resolution necessarily decreases time resolution, and all kinds of artifacts that might pop up; all that was not very easy to understand 'on the fly'. I still remember that 'convolution in the time domain corresponds to multiplication in the frequency domain', but if I had to explain it right now, I'd fail. In the end, I am a chemist by training, working with mathematicians turned biologists. Signal processing using FT is a big thing in engineering, and scientists just use it as a black box most of the time.

The discreteness also doesn't help, as the continuous math is 'simple' to understand, as long as you are comfortable with the complex plane. But the implications of discreteness made it all a bit more confusing, especially since I never took a course in discrete maths. And I was on a deadline to just get it working, so I didn't have the time to patiently go through a signal processing textbook and try out simple things step by step.

That said, for what you are doing: if you transform some sound from the time domain to the frequency domain, you can hit it with some function. Then the frequencies in your signal will change. When you then convert it back to the time domain, it will be a different sound, as it contains different harmonics/overtones. I guess this is how autotune works, in a way.
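"Hit it with some function" can be made concrete with a deliberately naive sketch: move each frequency bin to a scaled position and transform back. This is not how real pitch shifters work (a phase vocoder handles phase coherence and windowing properly), and the function name is made up; it only shows the transform -> modify spectrum -> inverse transform idea:

```python
import numpy as np

def crude_pitch_shift(samples, factor):
    """Naively shift pitch by moving each frequency bin i to bin round(i * factor).

    Ignores phase coherence and windowing, so real audio will sound artifact-y;
    purely an illustration of modifying a signal in the frequency domain.
    """
    spectrum = np.fft.rfft(samples)
    shifted = np.zeros_like(spectrum)
    for i, value in enumerate(spectrum):
        j = int(round(i * factor))
        if 0 <= j < len(shifted):
            shifted[j] += value
    return np.fft.irfft(shifted, len(samples))
```

Feeding it a 5 Hz sine with factor=2 gives back (approximately) a 10 Hz sine.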

For the math, I thought this video was best:

In the end it is all about projecting the time data onto the complex plane. That's why it uses the sine and cosine.

As for applications, it is used all over the place. It is probably one of the most commonly used algorithms around. Everyone with some electronic device, phone, mp3 player, etc., uses it all the time. Sound, spectra, analysis/recording/sampling of data, but also data compression. As scientists, we usually use it when we record a spectrum of a molecule. Instead of getting how many photons it absorbs at each point in time, we get a fingerprint of which frequencies it absorbs in general. It removes noise, compacts what is happening over a longer period of time, and shows all the info we want to know in a straightforward manner.

He has some nice playlists explaining stuff about the Fourier transform, discrete Fourier transform, Z domain, sampling, zero padding, etc... all the things you will need if you are working with sound.

Wish I could help you more, but I always understood Fourier transforms in a very superficial way. If you really want to understand, you will need to go deep into the math, but there are some very good ways to visualize these transforms which help with the extremely abstract math.

For your first question, it doesn't matter if the model is Bayesian or not. Bayesian statistics uses Bayes' theorem to come up with a posterior for the model parameters, but the point estimate from the posterior that you're using as your prediction has the same interpretation as in a non-Bayesian model. If you want to quantify model "accuracy" (using the term loosely here), a Bayesian model is evaluated with the same metrics as non-Bayesian models (with the exception of metrics that require a posterior).

Of course, it's difficult to evaluate the quality of a probability prediction model with a single test point. However, with good modeling, good priors, and a number of test points that is large enough to make evaluation sensible (but not so large that the value of the prior information becomes negligible), Bayesian models will usually outperform most non-Bayesian models. (Disclaimer: I'm making a lot of assumptions in here, but trying to speak generally enough to be useful and carefully enough to stay accurate.)

I don't see why you think this is a philosophical problem with Bayesian inference. Bayesian inference isn't really advertised as something that allows you to evaluate models with fewer test points. It's usually advertised as something that allows you to incorporate prior information to build better models when there's little data available, has nicer interpretations of uncertainty measures, relies somewhat less on parametric assumptions than frequentist statistics, and gives you a full posterior as opposed to point estimates.
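For a concrete example of "the same metrics as non-Bayesian models": the Brier score is one standard way to score probability forecasts against 0/1 outcomes (the numbers below are purely illustrative):

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes; lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

brier_score([0.95, 0.90], [1, 1])  # 0.00625: confident and right
brier_score([0.50, 0.50], [1, 1])  # 0.25: hedging scores worse on these events
```

With a single test point this still can't separate a true 95% from a true 80%, which is exactly the one-off-event worry; the score only becomes informative over many forecasts.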

But my question is: can you know the real probability of the event?

With a finite amount of data you will never be able to learn the "real" probability of an event. What you obtain using statistical methods is always an estimate. If you read scientific articles that use statistical analysis, they will always report both point estimates and standard deviations, confidence intervals or posterior belief distributions to measure how precise their reported point estimate is.
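A minimal illustration of "always an estimate" is the textbook Beta-Binomial example: with a uniform prior, after seeing h heads in n flips the posterior for a coin's bias is Beta(h+1, n-h+1), whose mean is Laplace's rule of succession:

```python
def posterior_mean(heads, flips):
    """Posterior mean of a coin's bias under a uniform prior (rule of succession)."""
    return (heads + 1) / (flips + 2)

posterior_mean(2, 35)      # ~0.081, pulled toward 1/2 relative to the raw 2/35
posterior_mean(200, 3500)  # more data -> closer to the raw frequency, never exact
```

The posterior always keeps some spread; no finite amount of data collapses it to the "real" value.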

Suppose we knew everything about how everything works - then yes, in that case we would be able to know the real probability (which would be either 0 or 1).

But at the end of the day we know very little about how everything works and we may not even truly realize how little we know.

Trying to predict something like an election or the weather comes down to simplified models of the factors we think influence the outcome - then the models can be reevaluated, how the factors are weighted can be adjusted, etc.

Yeah, probability is a way to estimate the likelihood of something happening without complete information. If we had precise knowledge of how you flip the coin - initial starting position, angle of the flick, force of the flick, exact dimensions and weight of the coin, wind conditions, and who knows how many other parameters - the probability of heads would no longer be 1/2. You'd be able to build a physical model to know the exact answer. It'd be 0 or 1. At that point it's purely deterministic.

On June 09 2017 09:22 Poopi wrote: @JimmyJRaynor: if your university needs more than 3/35 people to pass the year (and it's very likely), your course was very badly designed imho :o.

I have a question but I'm not sure if the answer is trivial or if we don't have one yet; it's related to probability, although with some CS in it, so I'll ask here!

Say we build a Bayesian model that estimates the odds of a real-life event A happening at 95%. But the event happens in real life only once (the result of an election, for example). Say A happens as "predicted". So what? Was our estimate accurate? Maybe it actually had an 80% chance of happening, but it still happened because it's still a likely event. And it didn't necessarily happen by random chance, especially if the event we are trying to predict is an election and not some randomly generated thing. So how can we judge whether our model was well suited? I guess if we just want our model to predict what will happen while minimizing loss and so on, like we often do, then there is no problem. But I feel like there is an inherent philosophical/epistemological problem with Bayesian models :/. edit: would being able to reproduce the event enough times for the results to be statistically significant allow us to correctly evaluate whether our estimate was right? But it still wouldn't be an absolutely precise estimate, and is it even possible to have such a thing?

(think about FiveThirtyEight and the likes for context)

But my question is: can you know the real probability of the event?

In science (i.e., not mathematics), nothing is absolute - even for the most mundane and predictable of events, we cannot know anything with 100% certainty. For example, when I throw a die, we cannot know that there is zero chance of the die shattering on impact. Yes, we can calculate the forces involved. But who is to say that, for the first time ever, the laws of nature won't suddenly change?

In the same way, if a die truly gives 50/50 for odd vs. even, you only get to exactly 50/50 in the limit of infinitely many throws. In fact, it is completely impossible to get 50/50 if you throw a die an odd number of times. The reason we know a die is 50/50 to be odd or even is because we know exactly how many sides it has. The assumption then is that the die is perfectly fair, which probably isn't the case. If you are really bored, you can take a bunch of dice or coins and throw/flip each of them an absurd number of times, then calculate how likely the outcome you got is under the assumption that the die/coin is fair.
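That last calculation can be sketched with exact binomial probabilities; the 60-heads-in-100-flips experiment below is a made-up example:

```python
# How likely is an observed head count under the fair-coin hypothesis?
from math import comb

def binom_pmf(k, n, p=0.5):
    """P(exactly k heads in n flips of a coin with P(heads) = p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def tail_prob(k, n, p=0.5):
    """P(at least k heads in n flips) -- a one-sided tail probability."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

# Hypothetical experiment: 60 heads in 100 flips of a supposedly fair coin.
print(tail_prob(60, 100))  # roughly 0.028: unlikely under fairness, not impossible
```

A small tail probability is evidence against fairness, but as the rest of the post says, it never gets you to certainty.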

In principle, any imperfection or flaw means the die/coin isn't perfectly symmetrical, and thus it can in principle be biased. So in the case of an unfair die/coin, there is no way to know with 100% accuracy what the exact probabilities are. The law of large numbers can get you far enough - far enough for any real-world application. (Now, if you really need a huge number of throws, the die/coin may actually wear, and its fairness may change as a function of the number of throws, adding another layer of complexity.) So it is mainly a philosophical debate.

That said, you could in principle control the outcome of the die/coin 100% by 'deciding' how you throw it. But I don't know of any magician with enough skill to throw a die in a certain way that gets them the outcome they want.

On June 10 2017 07:43 fishjie wrote: Yeah probability is a way to estimate the likelihood of something happening without complete information. If we had precise knowledge of how you flip the coin - initial starting position, angle of the flick, force of the flick, exact dimensions of the coin and its weight, wind conditions, and who knows how many other parameters, the probability of heads is not 1/2. You'd be able to build a physical model to know the exact answer. It'd be 0 or 1. At that point its purely deterministic.

And even then it would be an exact answer under the assumption that the physical model is completely accurate.

On June 10 2017 07:45 Ernaine wrote: In science (ie: not mathematics), nothing is absolute. Even the most mundane and predictable of events; we cannot know anything for 100%.

For example, when I throw a dice, we cannot know that there is a zero chance of the dice not shattering on impact. Yes, we can calculate the forces involved. But who is to say that for the first time ever, the laws of nature won't suddenly change?

Even if a magician appeared before us and rolled 1 a billion times in a row, that wouldn't confirm anything other than rolling 1 being a possibility. Now, assuming a uniform distribution (an equal likelihood of each occurrence - 1/6 for rolling 1), what the magician did occurs astronomically rarely - with probability (1/6)^1,000,000,000 - so for real-world and everyday applications, we would argue the magician seems able to manipulate the dice such that the distribution is not uniform.

You don't only get exactly 50/50 in the limit: you have a 50% chance of an exact 50/50 split if you throw it just 2 times. And I know the definition of probability as a limit and such; that's not really my question. About the dice example: a die has slightly lower odds of giving 6 than 1, afaik, because there are fewer holes drilled into the 1 face - but again, that's not my question :/.

And I'm not really talking about dice at all, because dice are pretty much random.

What I'm talking about is the probability of real-life events that are a priori not random: do we still run into determinism issues there?

Well, I don't agree. Yes, dice may not be fair; no real physical die can be perfectly fair.

But if we throw only 2 times and get both results once, the hypothesis that the true nature of the coin is 10% heads and 90% tails is still somewhat in agreement with the data. And that is quite a bit different from 50/50.

Yes, after 2 trials we do get exactly 50/50, and we can get similar results at 4 trials and all the other even trial counts, but we don't thereby know with high probability that the coin is in fact a completely fair 50/50 coin.

So yes, we need to go to infinity to nail down exactly 50/50. You can try it with a computer. (Yes, it will have pseudo-random numbers, so it is still a bit iffy, just like having a completely fair die.)

? That's not what I am saying. I'm saying that you can have the same number of evens as odds before going to infinity, not that you can be sure of the probability before going to infinity - that's the definition of probability, so it's true by definition :o.

On June 09 2017 07:03 JimmyJRaynor wrote: lots of moaning whining and complaining... prof speaks up.... and says...

"if i flip a coin 35 times and it only comes up heads 2 times i do not statistically adjust that result. i note it as a special and unusual case". what he was saying was "there could well be 33 morons in this room ... so shut up you idiots"

It's even more devastating when you take into account that a coin actually has 3 sides. Seems almost like that one lucky bastard who got 100/100 tossed a coin and it came up on the 3rd side.

Either what you are saying is wrong, or it is beside the point. Whatever the true (secret?) probability of the coin is, there is always a possibility that after an even number of flips you have exactly the same number of odds and evens. And the larger the number of throws gets, the smaller those odds are - even in the case of a perfectly fair coin. Only towards infinity do the deviations from 50/50 slowly disappear.

If you refuse to take my word for it, try it out yourself.
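Here is one way to try it out in a few lines: as n grows, the running proportion of heads drifts toward 1/2, while the probability of an *exact* 50/50 split (which has a closed form for a fair coin) shrinks toward zero:

```python
# Simulate fair-coin flips and compare the empirical proportion of heads
# against the exact probability of a perfect 50/50 split.
import random
from math import comb

random.seed(0)  # reproducible runs

for n in (10, 100, 10_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    exact_split = comb(n, n // 2) / 2**n  # P(exactly n/2 heads), fair coin, even n
    print(n, heads / n, round(exact_split, 4))
```

Both claims in the argument above come out: convergence of the proportion, and vanishing probability of hitting 50/50 exactly.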

? I'm just saying that if you throw it 2 times you can get one even and one odd - how is that wrong? There is no point here because you didn't answer my question and instead went on about the definition of probability, which I already know :x. edit: oh nevermind, you are the troll from the other time, that's why you talk nonsense all the time xD.

I talk nonsense because I have a PhD. That's what we do. Troll. Sorry bro. These things are no longer a matter of opinion.

The 'troll from the other time'? At least my account doesn't have a troll name.

[edit]

Ooh, I see now what you mean. You refer to the Go-Deepmind thread, where you casually called people idiots for not agreeing with your trivially wrong position. I called you out, and you didn't like that. How do people like you get 7000 posts on TL? Amazing.

If you throw a coin twice, and it lands 'heads' once and 'tails' once, you know absolutely nothing.

I am one of the few people here able to make quality posts. Why personally attack me for no reason? We have a guy here saying it is fair to believe a coin is 50/50 after throwing it twice. And he calls me a troll. What do you want me to do? Yes, I know not everyone has a sophisticated mind. But that is ok, as long as you can accept that. I am looking at the guy above me.

Honestly, I think there's a lot of conversational equivocation going on right now between the law of large numbers in regards to theoretical probability and then the empirical, actual results obtained along the way. People are making very valid points about probabilistic models but are also talking past each other and getting hung up on semantics.

On June 10 2017 09:03 Ernaine wrote: If you throw a coin twice, and it lands 'heads' once and 'tails' once, you know absolutely nothing.

Well, "absolutely nothing" is a stretch ... you know that the probability of getting heads, at the time you made the first throw and in the conditions of that event, was nonzero.

If you make the assumption that the outcome of a coin toss follows a multinomial distribution that is fixed over time, you can say that the list of outcomes contains the 'heads' and 'tails' events, each with nonzero probability.

In reality the roll of a die or the throw of a coin is perfectly deterministic. Sure, we cannot calculate this (most of the time), given that there are multiple forces at play and their values/vectors are extremely hard to measure accurately. Still, such rolls are not really random in a strong sense.

That's the thing. Statistics is, for a lot of things, used on processes that are deterministic in nature but have so many parameters tied to them that it's almost impossible to compute the deterministic outcome. Of course very educated guesses can be made with it, but it's rather sad that we still just can't perfectly model some deterministic events imo - or maybe it's that we have statistics and we're too comfy settling for one or the other.

Edit: You also might wanna reevaluate your point about quantum mechanics, "every quantum state can be represented as a sum of two or more other distinct states" and "the result will be another valid quantum state". (Source: Quantum Superposition)

So I've spent a day studying the Fourier transform. So far the math is very understandable, at least as far as applying it goes. Understanding why it works and specifically what it does is more challenging. Most resources on the subject are just absolutely awful, and I may have to go out and buy an old textbook to get good information.

It seems like FFT implementations popped up around the '80s-'90s, and nobody ever really looked at them again in depth. Since then everyone takes an old implementation and wraps it. Well, there's FFTW, but that library is a software engineering nightmare: a horribly obfuscated and heavyweight solution for a particularly small function. But what can you expect from a bunch of academics.

The real challenge is to be able to derive all the math and understand it completely. Only this way can I write a good implementation.

So, any book recommendations from anyone that really understands the math?
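For what it's worth, the core radix-2 Cooley-Tukey recursion is small enough to derive by hand and check against the DFT definition - a minimal sketch (power-of-two lengths only, not production code):

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform, straight from the definition."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    twiddled = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return ([even[k] + twiddled[k] for k in range(n // 2)] +
            [even[k] - twiddled[k] for k in range(n // 2)])

# Sanity check: the fast version must agree with the definition.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
assert all(abs(a - b) < 1e-9 for a, b in zip(fft(x), dft(x)))
```

The whole derivation is just splitting the DFT sum into even- and odd-indexed terms and reusing the half-size transforms; the real libraries' complexity is mostly about memory layout and non-power-of-two sizes.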

I'm currently a web dev who has moved into analytics, data science, and usability. The next stop in my career is eventually aerospace. I'm looking to get back into some general math courses as a refresher. Any thoughts on where I should start? I've been looking into QPLs (quantum programming languages) and find them interesting as well. I'm currently in a position where I might be able to go to a university for free, and I would like to eventually pivot my career. I feel like a general refresher from the beginning would help. I know Khan Academy has a few courses; any thoughts on where to start from there?

The question is whether or not we can induce events we have not yet seen from past events (by coming up with deterministic laws governing physics). So at best, you can achieve a near-100% probability.

For example, Newton used induction to come up with his laws of motion based on astronomy data collected by Tycho Brahe. He stated in the Principia that these laws can be generalized to all objects: "In this experimental philosophy, propositions are deduced from the phenomena and are made general by induction. The impenetrability, mobility, and impetus of bodies and the laws of motion and law of gravity have been found by this method. And it is enough that gravity should really exist and should act according to the laws that we have set forth and should suffice for all the motions of the heavenly bodies and of our sea."

In other words, he observed the orbits of the planets, extracted the laws of motion, and said they applied to everything. It's a big jump. So although the physical laws that humans have come up with are deterministic and have been verified empirically, you cannot know with 100% certainty that they apply deterministically to everything in the universe, because we have not observed everything in the universe. For a contrived example: if we are living in a simulation and some coder changes some variables, then the laws of motion could change tomorrow. Or consider dark matter/dark energy - from what I understand, our current laws of physics cannot describe the phenomena we are seeing, so those laws don't apply to a vast majority of the universe.

When this thread was created, I had high hopes. Now, this discussion has been interesting so far, but imho it is not so much mathematics as physics/general science principles mixed with some philosophical consequences.

Maybe I'm taking a narrow view of mathematics - or maybe I'm not, and it is also time for a "Physics Thread".

edit: Also, unfortunately I don't know much about mathematics literature for non-mathematicians.

On June 13 2017 06:23 CecilSunkure wrote: Literature for mathematicians is perfectly fine. This is the math thread after all.

E: Also I vote to start banning goons coming in here to talk philosophy. It's annoying and pointless for a math thread.

A large chunk of computer science is based on discrete math, whose underpinnings are the same as philosophy's. Famous philosophers such as Bertrand Russell and Descartes made huge contributions to mathematics. The overlap is real.

Sure, the overlap is real. Real annoying. Just open a new thread.

Also I'm sure nobody would mind if the philosophy discussion was actually related to some real math discussion. But like 99% of the time that won't be the case.

Anyone have suggestions for (introductory/intermediate) books/open courses related to functional analysis?

I started to follow along with a course from Coursera a few years ago, but didn't really stick with it. I've done a bunch of undergraduate level calculus/algebra courses, but never really got into analysis. Seems like it would be interesting to learn

RE: The recent discussion, who volunteers to start the epistemology thread? Seems like that's almost what people want, haha

Well, functional analysis is a rather advanced topic. Typically, by the time you study it, you will have both real analysis and complex analysis under your belt. To get the most out of it you will need measure theory. For this reason most of the books are pretty difficult to read.

Now, if your math background is not that strong (i.e., you are not a math major), the best book I believe is "Introductory Functional Analysis with Applications" by Kreyszig. Here is the link: www.amazon.com

I think this is the only book that doesn't require prior knowledge of any analysis.

I have to say that functional analysis was always at the top of my list of things to learn, but at this point I think I will never have time to study it properly. It is a really fascinating but complex subject.

Please, no! The "ask and answer stupid questions" thread already derails often enough; there is no need for a dedicated thread for derailment.

it is obvious and intuitive that multiplication and division are inverse mathematical operations. does anyone have a simple 2- or 3-line intuitive way to summarize why differentiation and integration are inverse operations?

IMHO, a good introductory book for functional analysis is Rudin's Functional Analysis. I should note that while it is a good book to study functional analysis from, it is really useful to have at least a basic understanding of linear algebra, real and complex analysis, and topology beforehand to understand the motivation behind the theory.

On the other hand, I've heard good things about Introductory Functional Analysis with Applications by Erwin Kreyszig for people who are looking for applications. I haven't read it myself, though.

On June 14 2017 01:43 JimmyJRaynor wrote: it is obvious and intuitive that multiplication and division are inverse mathematical operations. does any one have a simple 2 or 3 line intuitive way to summarize why differentiation and integration are inverse operations?

Intuitively, the theorem simply states that the sum of infinitesimal changes in a quantity over time (or over some other variable) adds up to the net change in the quantity.

I disagree about Rudin's Functional Analysis. That book targets a graduate math audience. Unless you have a really strong math background that includes measure theory, it's not the book to start with. But at the graduate level it is a great book.

Besides Kreyszig, which Ingvar and I mentioned in our posts, other introductory books aimed at people with modest backgrounds are "Beginning Functional Analysis" by Saxe and "Elementary Functional Analysis" by MacCluer. I own the latter and found it quite easy to follow if you have some background in analysis.

I should say that I self-study quite a bit of math, and I often find that it is best to own a couple of books, because invariably you will get stuck at some point. Since you are doing things on your own, there is no professor or classmate to ask to help clarify your confusion. So I've found it best to have a couple of sources to consult.

On June 14 2017 01:43 JimmyJRaynor wrote: it is obvious and intuitive that multiplication and division are inverse mathematical operations. does any one have a simple 2 or 3 line intuitive way to summarize why differentiation and integration are inverse operations?

It's a little difficult without drawing but here's my best attempt.

I'm guessing you're thinking about integrals as area under a curve. (Alternatively, you could think of it as the inverse of differentiation, which would make the question trivial) So the question is, if I'm looking at the area under a curve, why does it change exactly as fast as the height of the original curve?

Well, area is height * width, at least for rectangles. The slope of the area function (the derivative of the integral function) is rise over run taken at smaller and smaller intervals. So it's (height * width) / width, as long as we take a small enough interval that the area under the original function f(x) looks like a rectangle. Substituting f(x) for height, we get that the derivative of the integral at some value x is:

(height f(x) * width) / width = f(x), that is, the value of the original function. So differentiating the area function gives back the original function.

There are a number of small assumptions there that aren't strictly necessary, but avoiding them would make the explanation even more complicated.
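To make the rectangle argument concrete, here is a quick numerical sketch (my own toy example with f(x) = x², not from the posts above): build the "area so far" function from thin rectangles, then check that its slope at a point matches the original function there.

```python
# Numerical sketch of why differentiating the "area so far" function
# gives back the original function. f and the step counts are arbitrary.

def f(x):
    return x * x

def area_under_f(b, steps=100000):
    # left Riemann sum of f from 0 to b: many thin rectangles
    dx = b / steps
    return sum(f(i * dx) * dx for i in range(steps))

x, h = 1.5, 1e-3
slope_of_area = (area_under_f(x + h) - area_under_f(x)) / h
print(slope_of_area)  # close to f(1.5) = 2.25
```

Shrinking h and increasing steps pushes the slope closer and closer to f(x), which is exactly the (height * width) / width argument above.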

Thanks for the recommendations, Lebesgue and Ingvar, I'll check them out!

I've got a bit of a weird math background, as I got three-ish years into a physics degree before switching into computer science and taking a few additional algebra courses that interested me. The end result is a mish-mash of different subjects, along with a bunch of implicit stuff that I learned in my non-math courses.

Integration is essentially multiplication of a function with an infinitesimal quantity, whereas differentiation is division of the function's change with an infinitesimal quantity.

I am not sure how to do it. I understand how n grows and I understand how i grows and I can relate them positionally in a list but I don't know how to write i as a function of n

I know we have some clever people at TL... here is a chance to show off

On June 15 2017 05:13 travis wrote: I am looking for a relation

the pattern is as follows

n: 1, 3, 7, 15, 31, 63
i: 2, 4, 6, 8, 10, 12

I need a formula for i as a function of n

Introduce a new variable k and write n(k) and i(k). Now find the inverse, i.e. k(n). Substitute to get i(k(n)), or i(n).

edit: for example you could just say k is the position in the list, so you would have n = 2^k - 1 and i = 2k
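Following that substitution through for this particular pattern gives a closed form (a sketch; `i_of_n` is my name for it):

```python
import math

def i_of_n(n):
    # n = 2**k - 1  =>  k = log2(n + 1), and i = 2k
    return 2 * math.log2(n + 1)

print([i_of_n(n) for n in (1, 3, 7, 15, 31, 63)])
# pairs n: 1, 3, 7, 15, 31, 63 with i: 2, 4, 6, 8, 10, 12
```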

On June 15 2017 04:06 Shalashaska_123 wrote: Hi, JimmyJRaynor.

Integration is essentially multiplication of a function by an infinitesimal quantity, whereas differentiation is division of the function's change by an infinitesimal quantity.

that's pretty good. i like that one. thx for posting.

Here are some resources. I took applied algorithms in spring as part of my professional master's program, and most of the stuff flew over my head. I only learned what I had to for homework. Now I'm going back to try to understand the material. One of the most unexpected things for me (and apparently other people who've taken applied algorithms shared the same surprise) is that the class is almost entirely theory. We did have some coding assignments, but most of it is math. I'll be posting more as I go over my class notes. I just finished my master's and realized I haven't mastered anything.

I analyze my elements, reduce the number of elements by 1, then cut the elements in half. Now I have 2 lists of elements, one of size (n-1)/2 and the other of size n/2.

I do this over and over until I am left with many lists of 2 elements (and lists of 1 element, but for this question we don't consider those).

So the question is: how many lists did we go through in total? n was the list at the top; (n-1)/2 and n/2 were lists; ((n/2)-1)/2 and (n/2)/2 were lists, etc., all the way down to our lists of 2. Lists of 1, left over, do not count.

Not sure I understand the question. Would this be how you do it for n=100:

100
2*50
4*25
8*12 (and 4*1 which we discard)
16*6
32*3
32*2 (and 32*1 which we discard)

So 95 in total?

If so, approximately n/2 + n/4 + ... ~= n

A precise answer is harder, because you need to know how many times you end up with an odd number of elements in your list. There's almost certainly a numerical way of figuring that out, but I'm lazy right now.

On June 16 2017 08:34 DanielReeLee wrote: Hello, I'm looking for some problem sets for multivariable and vector calculus. Could I have some recommendations?

Check out the multivariable calculus course on MIT OCW.

What is the longest gondola that can take a right-angle turn in a Venetian canal? The width of the canal is 2, respectively 3, length units before, respectively after, the turn.

On June 17 2017 04:38 Acrofales wrote: Not sure I understand the question. Would this be how you do it for n=100: 100 2*50 4* 25 8*12 (and 4*1 which we discard) 16*6 32*3 32*2 (and 32*1 which we discard)

So 95 in total?

If so, approximately n/2 + n/4 + ... ~= n

A precise answer is harder, because you need to know how many times you end up with an odd number of elements in your list. There's almost certainly a numerical way of figuring that out, but I'm lazy right now.

well, for n = 100

100
50, 49
25, 24, 24, 24
12, 12, 11, 12, 11, 12, 11, 12
5, 6, 5, 6, 5, 5, 5, 6, 5, 5, 5, 6, 5, 5, 5, 6
2, 2, ... - oh god there is a lot, u get the idea

it would stop at all 1s and 2s, but the 1s don't count, and what I want is the count of EVERY list of length > 1, including the original list and the lists in every step

On June 17 2017 04:15 travis wrote: Now I have 2 lists of elements, 1 of size (n-1)/2 and the other of size n/2.

One of these is not an integer.

hmm yeah that's true what I really want is... uh.. the floor of (n-1)/2 .. I think. see above, lol

I expect the best way to solve this is to represent it with sums and then simplify them but I am not good enough

Oh, ok. That seems incomplete. The 6s would expand to 3, 2, right? And what would then happen? It ends there? Seems like a weird algorithm.

I thought I had a simple solution, but it breaks if any list in your subdivisions has a length equal to a power of 2 (that adds 1, which can occur at different points in the tree). So that doesn't work, and given the weirdness of your algorithm, I'm not sure there's an easy way of figuring out how often you'll run into a power of 2.

On June 17 2017 04:53 Amanebak wrote: Hey. I stumbled across a problem like this:

What is the longest gondola that can take a right-angle turn of a Venetian canal? The width of the canal is 2, respectively 3 length units before, resp. after the turn.

I apologize for my English.

Unless I missed something, the result is 5*sqrt(2).

The gondola needs to be able to fit into the diagonal of the (2+3)*(2+3) square, which has a length of 5*sqrt(2).

At that point, 2*sqrt(2) of the gondola is in the narrower canal, and 3*sqrt(2) is in the wider canal. If you draw a picture of the situation, it is obvious that no longer gondola could reach as far into the width-3 canal, and the 5*sqrt(2) gondola can continue onwards from this point.

On June 17 2017 05:39 hypercube wrote: Did you try some numerical experiments?

Are you interested in an exact expression or only asymptotic behaviour? The second one seems to be just on the order of n.

I am starting to think there isn't a reasonable way to get the exact expression. I've looked up analyses of this algorithm and there doesn't seem to be any exact analysis online.

This is actually an analysis of best-case quicksort comparisons that I am trying to do, if you are familiar with it.

I can see conceptually how this would be in the ballpark of n, but can you show the math for that?

Also, I solved for the actual values for 1 to 20. It's a bit odd: nowhere near n for low values.

On June 17 2017 05:47 Liebig wrote: let f(n) be the result of your weird stuff for n

I've never seen recursion used this way, gonna take a minute to wrap my head around this lol. edit: ok I get it, that's neat. But I am not as quick as you at using it to analyze the outcome.

Well, it's m, such that m+1 is the greatest power of 2 less than n. However, add 1 for every power of 2 you encounter along the way. For instance, n=16 has 8 sublists of size 2 (or greater), rather than 7. How many powers of 2 you'll encounter is hard to predict, but if you want asymptotic behaviour, take a completely absurd upper bound: say all m of your sublists have a size equal to a power of 2. Even that would only give you 2m total lists (and is completely impossible), which is still O(n).

Asymptotically it is definitely at most on the order of n.

We can easily get an upper bound, because the kth step (with 0 being the initial set) produces <= 2^k new groups of numbers, and there are <= log2(n) steps. Thus, there are fewer than Sum from k=0 to log2(n) of 2^k = 2^(log2(n) + 1) - 1 = 2n - 1 total sets in this algorithm.
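Liebig's idea of writing the count as a recursion also gives exact values directly. Here is a sketch (`total_lists` is my naming; it encodes travis's splitting rule and counts only lists of length > 1):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def total_lists(n):
    # a list of n elements splits into lists of (n-1)//2 and n//2 elements;
    # count every list of length > 1, including the starting one
    if n < 2:
        return 0
    return 1 + total_lists((n - 1) // 2) + total_lists(n // 2)

print([total_lists(n) for n in (3, 7, 15)])  # [1, 3, 7], matching travis's counts
print(total_lists(100), 2 * 100 - 1)         # exact count for n = 100 vs. the 2n - 1 bound
```

The small values line up with travis's exchange counts for n = 3, 7, 15, and everything stays comfortably under the 2n - 1 bound.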

So, this is actually from homework that I have been working on for a long time. The actual question is "What is the fewest exchanges that the algorithm will execute for an input of size n. No justification needed."

(What we have been doing demonstrates the best case for exchanges)

In exact counts, with n elements: if n = 3 we do only 1 exchange; if n = 7, only 3 exchanges; if n = 15, 7 exchanges. After that it is always worse than this.

So does that mean that a technically correct answer to this question is: "(the floor of n/2) - 1" ?

Thank you for your reply.

If I understand it correctly, the gondola is turned 45 degrees, touches the sides of the canal at both ends, and touches the 'inner' corner of the turn with its side. That is the diagonal of the (2+3)*(2+3) square. Yes, it can go on, but I don't think it could go back, unless it bends. To see why, make the difference in widths greater, like 1 before the turn and 100 after.

On June 17 2017 05:36 Acrofales wrote: I thought I had a simple solution, but it breaks if any list in your subdivisions has length equal to a power of 2 (that adds 1, which can occur at different points in the tree). So doesn't work, and given the weirdness of your algorithm, I'm not sure there's an easy way of figuring out how often you'll run into a power of 2.

I think I have an idea. Let's write the sets in rows like travis did, so the k-th row will have 2^(k-1) sets. So if we need K rows, we will have 2^K - 1 sets in total. Unfortunately, this only works if we terminate in a perfect row (e.g. a combination of 3s and 2s that can't be broken down further by our rule).

So here's an outline:

0. Observe that there are at most two unique (and consecutive) element values in any given row.

1. Prove that there is at most 1 non-perfect row, that is, a row which contains both elements > 1 and elements <= 1. I will refer to this as the final row.

2. Give a formula for the sum of the elements in a row. This is doable because each element n splits into (n-1)/2 and n/2, which sum to n-1, so the sum of the child elements is one less than the parent element.

3. Use this formula to calculate the number of rows, up to and including the first non-perfect row. (call this K)

4. Use the same formula to calculate the number of ineligible elements in the final row. (call this D)

Using the original idea, the final answer is 2^K - D - 1.

edit: 3 seems the trickiest, I'm pretty sure I can do the others.

On June 09 2017 04:13 CecilSunkure wrote: I've been eager to learn about Fourier transform. In particular I wanted to use it to do pitch adjustments of sound samples. I have some code here for playing sounds, and want to add some pitch adjustment stuff.

Would anyone mind chatting with me about the mathematics? I was hoping to find someone knowledgeable about Fourier transforms and their applications that I could bounce a bunch of questions off of. Please PM me if anyone would be so kind!

Well, it took a while, but after some studying I got an implementation down. I referenced "Computational Frameworks for the Fast Fourier Transform" by Charles Van Loan. I also found two good reference implementations here and here. The first link is an implementation of some algorithms from the Van Loan book, which helped show a proper way to convert the algorithms to C and verify correctness. The second link, by Paul Bourke, is probably the most popular FFT implementation in existence. The Bourke implementation has a nasty side effect of scaling floating point error across the DFT, so for large inputs the results can zero out early. This can easily be fixed by adding in some direct cos/sin calls, so I did that.

In the end I got down my own implementation after referencing all 3 sources mentioned above. Luckily the Van Loan book is all in vector and matrix algebra, something I'm already good at.

Anyways, here's my current implementation. I'll probably work on it a little more to create a more useful API. Also, the final scale loop in the if statement ( sign > 0 ) would be a good candidate for SIMD ops. My implementation is in a more modern style than the older ones, so it should be a little faster due to fewer branch misses, in favor of a little more raw computation. Also there's a symmetry that can cut the computational cost in half that I've yet to implement. But this is fine for now:

#include <stdint.h>
#include <math.h>

uint32_t PopCount( uint32_t x )
{
    uint32_t a = x - ((x >> 1) & 0x55555555);
    uint32_t b = (((a >> 2) & 0x33333333) + (a & 0x33333333));
    uint32_t c = (((b >> 4) + b) & 0x0f0f0f0f);
    uint32_t d = c + (c >> 8);
    uint32_t e = d + (d >> 16);
    uint32_t f = e & 0x0000003f;
    return f;
}

uint32_t Log2( uint32_t x )
{
    uint32_t a = x | ( x >> 1 );
    uint32_t b = a | ( a >> 2 );
    uint32_t c = b | ( b >> 4 );
    uint32_t d = c | ( c >> 8 );
    uint32_t e = d | ( d >> 16 );
    uint32_t f = e >> 1;
    return PopCount( f );
}

// reverse the bits of x (needed by the bit reversal stage below)
uint32_t Rev32( uint32_t x )
{
    x = ((x & 0x55555555) << 1) | ((x >> 1) & 0x55555555);
    x = ((x & 0x33333333) << 2) | ((x >> 2) & 0x33333333);
    x = ((x & 0x0f0f0f0f) << 4) | ((x >> 4) & 0x0f0f0f0f);
    x = ((x & 0x00ff00ff) << 8) | ((x >> 8) & 0x00ff00ff);
    return (x << 16) | (x >> 16);
}

// x contains real inputs
// y contains imaginary inputs
// count must be a power of 2
// sign must be 1.0f (forward transform) or -1.0f (inverse transform)
void FFT( float* x, float* y, int count, float sign )
{
    int exponent = (int)Log2( (uint32_t)count );

    // bit reversal stage
    // swap all elements with their bit reversed index within the
    // lowest level of the Cooley-Tukey recursion tree
    for ( int i = 1; i < count - 1; i++ )
    {
        uint32_t j = Rev32( (uint32_t)i );
        j >>= (32 - exponent);
        if ( (uint32_t)i < j )
        {
            float tx = x[ i ];
            float ty = y[ i ];
            x[ i ] = x[ j ];
            y[ i ] = y[ j ];
            x[ j ] = tx;
            y[ j ] = ty;
        }
    }

    // for each recursive iteration
    for ( int iter = 0, L = 1; iter < exponent; ++iter )
    {
        int Ls = L;
        L <<= 1;
        float u1 = 1.0f; // cos( 0 )
        float u2 = 0;    // sin( 0 )

        // per-stage rotation increment w = e^(-sign * i * pi / Ls),
        // via the direct cos/sin calls mentioned above
        float wr = cosf( 3.14159265358979f / (float)Ls );
        float wi = -sign * sinf( 3.14159265358979f / (float)Ls );

        // rows in DFT submatrix
        for ( int j = 0; j < Ls; ++j )
        {
            // do butterflies upon DFT row elements
            for ( int i = j; i < count; i += L )
            {
                int index = i + Ls;
                float t1 = u1 * x[ index ] - u2 * y[ index ];
                float t2 = u1 * y[ index ] + u2 * x[ index ];
                x[ index ] = x[ i ] - t1;
                y[ index ] = y[ i ] - t2;
                x[ i ] += t1;
                y[ i ] += t2;
            }

            // Rotate u1 and u2 via Givens rotations (2d planar rotation).
            // This keeps cos/sin calls in the outermost loop.
            // Floating point error is scaled proportionally to Ls.
            float z = u1 * wr - u2 * wi;
            u2 = u1 * wi + u2 * wr;
            u1 = z;
        }
    }

    // scale factor for forward transform
    if ( sign > 0 )
    {
        float inv_count = 1.0f / (float)count;
        for ( int i = 0; i < count; i++ )
        {
            x[ i ] *= inv_count;
            y[ i ] *= inv_count;
        }
    }
}

I am glad I set you guys off on a fun math journey. I figured out I was doing the question I was assigned completely wrong, which explains why it was so difficult. LOL

Hm, I calculated it another way, and got a better result:

Just take a look at the segments which touch the corner of the canal. Before and after the turn, their parts have the lengths (2+x) and (3+y), where 2/y = x/3 (intercept theorem), i.e. y = 6/x. Thus, we are looking at the diagonals of rectangles with sides (2+x) and (3+6/x), and specifically for the minimum of those diagonals.

The diagonal has the length

f(x) = sqrt((2+x)² + (3+6/x)²)

and thus the derivative

f'(x) = 1/(2 f(x)) * (2(2+x) - (12/x²)(3+6/x)),

which you would now need to find the zeroes of. (I had Mathematica do it because I am lazy; the zero is apparently at x = 18^(1/3), and further checks will almost certainly prove it to be a minimum.) That means the minimal diagonal, i.e. the longest possible gondola, is f(18^(1/3)), which you can simplify to sqrt(13 + 6*18^(1/3) + 3*18^(2/3)). That is about 7.02, and indeed slightly less than 5*sqrt(2). Sorry for that.
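A quick numerical sanity check of that minimum (my sketch; `length` is just the f(x) above under my naming):

```python
# check the claimed critical point x = 18^(1/3) of
# f(x) = sqrt((2+x)^2 + (3+6/x)^2)

def length(x):
    return ((2 + x) ** 2 + (3 + 6 / x) ** 2) ** 0.5

x_star = 18 ** (1 / 3)
print(length(x_star))   # about 7.02
print(5 * 2 ** 0.5)     # about 7.07, the earlier square-diagonal answer
print(length(x_star) < min(length(x_star - 0.01), length(x_star + 0.01)))
```

Nudging x away from 18^(1/3) in either direction makes the diagonal longer, which is consistent with it being the minimum.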

Good job!

There is another way, instead of using the intercept theorem. Consider the segments that touch the inner corner with their side and find the minimal one, parameterized by α, the angle between the segment and the direction of the canal after the turn. We want to minimize the function

l = l_1 + l_2 = 3/sin(α) + 2/cos(α)

I got α_min = arctan((3/2)^(1/3)), which is interesting to me. If we generalize this a bit, taking a, resp. b, as parameters instead of 2, resp. 3, and think about the symmetries of this problem, we conclude

On June 12 2017 02:45 CecilSunkure wrote: So I've spent a day studying the Fourier transform. So far the math is very understandable, at least on how to apply it. Understanding why it works and specifically what it does is more challenging. Most resources on the subject are just absolutely awful, and I may have to go out and buy an old textbook to get good information.

It seems like FFT implementations popped up around the 80s-90s, and nobody ever really looked at them again in depth. Since then everyone takes an old implementation and wraps it. Well, there's FFTW, but that library is a software engineering nightmare: a horribly obfuscated and heavyweight solution for a particularly small function. But what can you expect from a bunch of academics.

You need to treat FFTW3 as a black box, because that is what it is supposed to be. It is used by Matlab and free to use for academics, so it is an industry standard.

The real challenge is to be able to derive all the math and understand it in all its correctness. Only this way can I write a good implementation.

Well, this is just not true. You use it as a black box. There is no point in understanding what is inside if you are just using it. If you want to improve it or change it, then you need to know what is inside the box.

BTW, when I did some FFTW stuff and wanted to test correctness, I ran into difficulties because defining PI with 12 decimals, or even 30, wasn't accurate enough. Rounding would cause funky stuff: frequency bins that weren't supposed to fill up got filled, because of oscillations in how each sin or cos call was rounded each period.

On June 17 2017 10:29 BrTarolg wrote: Man, i have a degree in maths with a focus on pure

I *still* have no idea what's going on most of the time, inc in this thread lmao

That experience I get all over again every 3 or 4 weeks. You talk to people or read some reference, and suddenly there exists a whole new field, with all its own intricacies, that you never knew about, barely understand, and think you want to know more about. But that is just impossible. So you figure out some silly pragmatic workaround, accepting that you are just one person with finite time and finite talent.

On June 17 2017 20:28 Kadungon wrote: Well, this is just not true. You use it as a black box. There is no point in understanding what is inside if you are just using it. If you want to improve it or change it, then you need to know what is inside the box.

But that's what I did: improve and change it. Obviously not mathematically improve or change it, just in terms of software engineering benefits.

i teach elementary school and we do probability. things like dice and spinners. basic stuff. was talking to a friend about probability and he gave me this problem. i've been unable to make sense of it. any help is appreciated.

You and I are playing a game. We each shout out a number 1-100 every 5 seconds. Once someone says a number, that person can't say it again. What's the probability we match at least once?

The probability of matching at least once equals 1 minus the probability of never matching, so the answer to your question is:

1 - !100/100!, which approximately equals 63 percent.

I can explain what a derangement is all about tomorrow if you're interested, but for now I need to go to bed.
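For reference, !100 here is the 100th derangement number (subfactorial), and the whole thing can be checked in a few lines using the standard recurrence (my sketch, my naming):

```python
import math

def subfactorial(n):
    # derangement numbers: !n = (n-1) * (!(n-1) + !(n-2)), with !0 = 1, !1 = 0
    a, b = 1, 0
    for k in range(2, n + 1):
        a, b = b, (k - 1) * (a + b)
    return a if n == 0 else b

p_match = 1 - subfactorial(100) / math.factorial(100)
print(p_match)  # about 0.632, i.e. roughly 1 - 1/e
```

The ~63% barely depends on the 100: !n/n! converges to 1/e almost immediately, so the answer is essentially 1 - 1/e for any list length beyond a handful of numbers.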

Had to take three stats courses to get my degree and I still have no idea what I'm doing when it comes to probability. Just keep guessing until I get the answer

On June 23 2017 04:15 travis wrote: Here's a math question I have.

i want to write a sum:

from i=1 to n of: 1/2*(i-1)i

But I don't actually want i to stop when i = n. I want it to stop when the running sum of the terms has reached n. How do I do that?

It's like if I could write my sum as "from i = i +1 to n"

Example:

If n = 10 I want my sum to go through i values of: 1, 2, 3, 4 - then STOP. Which gives me a sum of 0+1+3+6 = 10

Is the only way to do this to solve for the relationship between what I want i to do, and n? Because I am not sure what the relationship is.

You can write basically anything below the sum symbol as a condition, but usually these conditions only describe the individual elements of the sum, not the sum as a whole.

I would write it as the minimum of { (your sum from i=1 to m) | m ∈ ℕ, (your sum) ≥ n0 }, with n0 being the threshold you want. That works as long as all of the terms in the sum after that point are positive. If the sum has negative terms afterwards, it becomes more complicated, because the minimum could be further down the line.
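If you just want to compute it rather than notate it, the stopping rule is a plain loop (a sketch of the n = 10 example; names are mine):

```python
def sum_until(threshold):
    # add terms (i-1)*i/2 for i = 1, 2, 3, ... and stop once the
    # running total reaches the threshold
    total, i = 0, 1
    while total < threshold:
        total += (i - 1) * i // 2
        i += 1
    return total

print(sum_until(10))  # 0 + 1 + 3 + 6 = 10
```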

The probability of matching at least once equals 1 minus the probability of never matching, so the answer to your question is:

1 - !100/100!, which approximately equals 63 percent.

I can explain what a derangement is all about tomorrow if you're interested, but for now I need to go to bed.

that's amazing. i had to look up what the factorial in the front meant. is the reason you divide by 100! because you have 1/100 chance, then 1/99, and so on?

So I have a bit of a problem I've worked on in the past, but my math is too lacking to solve it. The problem is as follows: with a standard deck of cards (52 cards; 26 black, 26 red), what is the probability that one of the colors is found 5 or more times in a row when the deck is shuffled into a random sequence of 52 cards? I've written a small piece of code that can easily do over 1x10^6 runs, and it comes to just under 80%. I know it's a combinatorial problem, but I have no idea how to formally solve it. Nor does the probability seem consistent: it fluctuates depending on the code used (different people, different code; or maybe my code sucks, but I've tested it thoroughly, so that shouldn't be it...)

On June 23 2017 11:21 Uldridge wrote: So I have a bit of a problem I've worked on in the past, but my math is too lacking to solve it. The problem is as follows: with a standard deck of cards (52; 26 black, 26 red), what is the probability that one of the colors is found 5 times or more (so at least 5 times) in a row when the deck is shuffled and laid out in a sequence (so a random sequence of 52 cards)? I've made a small piece of code that can easily make over 1x10^6 runs and comes out to just under 80%. I know it's a combinatorial problem, but I have no idea how to formally solve it. Nor does the probability seem consistent: it fluctuates depending on the code used (different people, different code, or else my code sucks, but I've tested it thoroughly, so that shouldn't be it...)

Isn't that just, with b = number of black cards left when drawing the first black card and c = total number of cards left in the deck: (b/c) * ((b-1)/c) * ((b-2)/c) * ((b-3)/c) * ((b-4)/c)? The -1/-2/-3/-4 comes from one more black card already being drawn each time, and therefore the chance to draw another black card decreases.

EDIT: So basically, the chance to draw a black card is always (black cards left) / (total number of cards left), and the chance to draw multiple black cards in a row is (chance for card 1) * (chance for card 2) * ... * (chance for card n).

I think he's asking for the probability of a sequence of 5 black cards appearing anywhere in the stack. I have calculated this for sequences with replacement. Without replacement it's harder and I'm not sure you can use the same approach at all, but with replacement it works as follows:

The chance of 5 black cards in a row is simply 1/2^5. With 52 cards, you have 52-5+1=48 different starting positions for a sequence of 5 cards. So the chance of one or more of these being all black is simply 1 minus the chance of none of them being all black (treating the 48 windows as independent, which is an approximation):

1 - (1 - 1/2^5)^48. So with replacement, you'd have a ~78% chance of getting a 5-card sequence of all-black cards. Without replacement the problem seems much harder, but perhaps this gives you an idea of how to approach it?

Edit: hmm, without replacement you should just be able to enumerate all the ways of getting a flush, right? I am missing something here, but a first stab: there are 47 cards left after your flush, so that's 47! possible orders. 48 places for starting your flush, so 48 * 47! = 48! combinations... and 26 choose 5 ways of actually making the flush, so (26 choose 5) * 48!/52!. This seems way too small, so I probably have a mistake somewhere. But this approach seems promising.

On June 23 2017 11:21 Uldridge wrote: So I have a bit of a problem I've worked on in the past, but my math is too lacking to solve it. The problem is as follows: with a standard deck of cards (52; 26 black, 26 red), what is the probability that one of the colors is found 5 times or more (so at least 5 times) in a row when the deck is shuffled and laid out in a sequence (so a random sequence of 52 cards)? I've made a small piece of code that can easily make over 1x10^6 runs and comes out to just under 80%. I know it's a combinatorial problem, but I have no idea how to formally solve it. Nor does the probability seem consistent: it fluctuates depending on the code used (different people, different code, or else my code sucks, but I've tested it thoroughly, so that shouldn't be it...)

I expect that a closed-form expression for this probability will be rather complicated. But you can get a recursive formula which can then be evaluated by a computer.

Let A(b,r) be the probability that you will have a streak of 5 black cards in a deck with b black cards and r red cards. For example, if you start with a stack of 26 black cards (and no red cards), you obviously have a probability of 1 of getting a streak of 5 black cards. Whereas in a stack with 26 red cards (and 0 black cards), the probability is 0.

Now suppose that you have a stack of b black cards and r red cards. If the first card is red (which happens with probability r/(b+r)), then a streak of 5 black cards must be contained in the remaining b black and (r-1) red cards, where it has probability A(b, r-1). The probability of getting a 5-black streak in this case (first card is red) is therefore (r/(b+r)) * A(b, r-1). If the first card is black and the second card is red (which happens with probability (b/(b+r)) * (r/(b+r-1))), a 5-black streak must be contained in the remaining (b-1) black and (r-1) red cards. The probability of getting a 5-black streak in this case (first card black, second card red) is therefore (b/(b+r)) * (r/(b+r-1)) * A(b-1, r-1). Now you iterate this idea for (1st card black, 2nd card black, 3rd card red), ..., until (1st B, 2nd B, 3rd B, 4th B, 5th B), in which case you have a 5-black streak regardless of what happens with the remaining cards.

Altogether, you obtain a recursive formula for A(b,r) with initial values A(b,0)=1 for b >= 5 and A(0,r)=0 for any r. (You might simplify by saying A(26, 5)=1.) I suppose this should be easy to code once you've found all the formulas, and a computer should be able to find A(26,26), the value in question, in almost no time.

edit: I think my initial values are not sufficient yet for the recursion. But if you understand the principle, adding the missing initial values should not be much of a problem.
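A sketch of that recursion in code (memoized; `run_prob` is an illustrative name, and the loop conditions on how many black cards come before the first red card, exactly as described above):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def run_prob(b, r, k=5):
    """Probability that a uniformly shuffled stack of b black and r red
    cards contains a streak of at least k black cards."""
    if r == 0:
        return 1.0 if b >= k else 0.0
    total = b + r
    p = 0.0
    prefix = 1.0  # probability that the first j cards are all black
    for j in range(k):  # case: first red card appears after j black cards
        if j > b:
            break
        p += prefix * (r / (total - j)) * run_prob(b - j, r - 1, k)
        prefix *= (b - j) / (total - j)
    # prefix is now the probability that the first k cards are all black:
    # an immediate streak regardless of the rest of the deck
    # (it is automatically 0 when b < k)
    p += prefix
    return p
```

run_prob(26, 26) then gives the probability of a black streak. Note this is the one-colour version of the question, so it is not yet the "either colour" number the simulation measures.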

On June 23 2017 04:15 travis wrote: Here's a math question I have.

i want to write a sum:

from i=1 to n of: 1/2*(i-1)i

But I don't actually want i to stop when i = n. I want i to stop when the sum of "i values" has reached n. How do I do that?

It's like if I could write my sum as "from i = i +1 to n"

Example:

If n = 10 I want my sum to go through i values of: 1, 2, 3, 4 - then STOP. Which gives me a sum of 0+1+3+6 = 10

Your series can be written as

sum from i=1 to k of (i-1)i/2,

where k is defined by your requirement that the sum of the i values has to equal n:

sum from i=1 to k of i = n. (1)

All that we have to do now is solve equation (1) for k in terms of n. The finite sum on the left side is one we can look up in a table, giving k(k+1)/2 = n.

What we have here is a quadratic equation for k: k^2 + k - 2n = 0.

Use the quadratic formula to solve it: k = (-1 - sqrt(1+8n))/2 or k = (-1 + sqrt(1+8n))/2.

We use the expression on the right for k since we want it to be a positive value. The one on the left is negative, so we discard it. Therefore, your sum can be written in terms of n as follows: sum from i=1 to (-1+sqrt(1+8n))/2 of (i-1)i/2.

This sum can be evaluated. Switch the upper limit back to k for a moment and split up the sum like so: sum from i=1 to k of (i-1)i/2 = (1/2)*(sum from i=1 to k of i^2) - (1/2)*(sum from i=1 to k of i).

The finite sum of i^2 and the finite sum of i can be looked up in a table and/or proved using mathematical induction: sum from i=1 to k of i^2 = k(k+1)(2k+1)/6, and sum from i=1 to k of i = k(k+1)/2.

Simplifying this gives us (k^3 - k)/6.

Now we can plug in that value for k. Doing so and simplifying gives us the final result: ( ((-1+sqrt(1+8n))/2)^3 - (-1+sqrt(1+8n))/2 ) / 6.

If we plug in n = 10, then the upper limit evaluates to 4 and the sum evaluates to 10, which is consistent with your example.

Is the only way to do this to solve for the relationship between what I want i to do, and n?

I'm reluctant to say it's the only way, but it is the most convenient way as far as I can see.

Because I am not sure what the relationship is.

Yeah, you are. You say yourself that you want the sum to stop when the sum of i values equals n. We write equation (1) from that fact.
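The closed form can be sanity-checked against a direct loop (helper names are made up; the integer square root assumes 1+8n is a perfect square, i.e. n is a triangular number, as in the example):

```python
from math import isqrt

def partial_sum(n):
    # sum (i-1)*i/2 for i = 1, 2, ..., stopping once the i values
    # themselves add up to n
    total, acc, i = 0, 0, 1
    while acc < n:
        total += (i - 1) * i // 2
        acc += i
        i += 1
    return total

def closed_form(n):
    # k = (-1 + sqrt(1 + 8n)) / 2, then (k^3 - k) / 6
    k = (-1 + isqrt(1 + 8 * n)) // 2
    return (k**3 - k) // 6

# n = 10 gives k = 4 and the sum 0 + 1 + 3 + 6 = 10, matching the example
```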

The probability of matching at least once equals 1 minus the probability of never matching, so the answer to your question is:

1 - !100/100!, which is approximately 63 percent.

I can explain what derangement is all about tomorrow if you're interested, but for now I need to go to bed

that's amazing. i had to look up what the factorial in the front meant. is the reason you divide by 100! because you have 1/100 chance, then 1/99, and so on?

On the bottom you have 100!, which is the total number of ways to order the 100 numbers. On the top you have !100, which is the number of different ways to order the 100 numbers such that none of the numbers are shouted at the same time. I have to admit I've never even heard of derangements or subfactorials before today.

On June 23 2017 05:42 esla_sol wrote: i teach elementary school and we do probability. things like dice and spinners. basic stuff. was talking to a friend about probability and he gave me this problem. i've been unable to make sense of it. any help is appreciated.

You and I are playing a game. We each shout out a number 1-100 every 5 seconds. Once someone says a number, that person can't say it again. What's the probability we match at least once?

If neither person was allowed to repeat the other's numbers the problem would be considerably easier (just break it down into a big conditional probability chain). I'm stumped.

The probability of matching at least once equals 1 minus the probability of never matching, so the answer to your question is:

1 - !100/100!, which is approximately 63 percent.

I can explain what derangement is all about tomorrow if you're interested, but for now I need to go to bed

that's amazing. i had to look up what the factorial in the front meant. is the reason you divide by 100! because you have 1/100 chance, then 1/99, and so on?

!100 (subfactorial 100) is the numbers of ways you can derange 100 objects.

!1 = 0, because there is no way to derange one object
!2 = 1, because there is one way to derange two objects (BA)
!3 = 2 (BCA, CAB)
!4 = 9 (BADC, BCDA, BDAC, CADB, CDAB, CDBA, DABC, DCAB, DCBA)
!5 = 44
!6 = 265
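The subfactorial is easy to compute with the standard recurrence !n = (n-1)(!(n-1) + !(n-2)), which also lets you check the 1 - !100/100! ≈ 63% figure; a quick sketch (function name is illustrative):

```python
from math import factorial

def subfactorial(n):
    # !n via the recurrence !n = (n-1) * (!(n-1) + !(n-2)), with !0 = 1, !1 = 0
    if n == 0:
        return 1
    prev2, prev1 = 1, 0  # !0, !1
    for m in range(2, n + 1):
        prev2, prev1 = prev1, (m - 1) * (prev1 + prev2)
    return prev1

# probability of at least one match in the shouting game
p = 1 - subfactorial(100) / factorial(100)   # about 0.632, i.e. ~63 percent
```

(!100/100! converges to 1/e extremely fast, which is where the 63% comes from.)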

Firstly: wow. Secondly, sorry HKTPZ, but what you posted actually wasn't what I wanted; I don't know how I got confused. This is the solution I was looking for, since I needed things in terms of n instead of in terms of the iteration I want to stop at.

but now I am shocked because while this answer certainly seems correct, this seems way more advanced than anything I should be having to do for this class, lol

maybe not though... since I do actually understand it

I really don't understand what you are doing in this step:

-what am I doing wrong? why is this image not loading?

edit2: oh wait, I see what you are doing now! I misunderstood the picture, you are saying the entire sum = n. ok that makes sense

It's the final question there (2b). It works like the first one, except that now the sum goes from i=1 to k of (i-1)i/2,

but this sum will be repeated Z times, and after every Z repetitions of this inside sum, Z is raised to the next power of 2 (if it was 1 it becomes 2, if it was 2 it becomes 4, if it was 4 it becomes 8), and k is incremented by 1

But then finally all this needs to be related to n, where n is equal to the total times i was incremented over all repetitions

aaand this was probably all insanely confusing so I will write it like this

If our n = 17

Then we need to do the sum from i=1 to 1 of (i-1)i/2 ONE TIME
Then we need to do the sum from i=1 to 2 of (i-1)i/2 TWO TIMES
Then we need to do the sum from i=1 to 3 of (i-1)i/2 FOUR TIMES

If our n was 49

Then we need to do the sum from i=1 to 1 of (i-1)i/2 ONE TIME
Then we need to do the sum from i=1 to 2 of (i-1)i/2 TWO TIMES
Then we need to do the sum from i=1 to 3 of (i-1)i/2 FOUR TIMES
Then we need to do the sum from i=1 to 4 of (i-1)i/2 EIGHT TIMES

and we don't worry about n-values that don't line up nicely

I could definitely use help.. I am about 75% sure I've got this much correct. I'll be working on it.. lol I'll try using what shalashaska showed. But first... lunch
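If I've read the scheme right, it can be prototyped directly (all names here are hypothetical; this just follows the n = 17 and n = 49 examples above):

```python
def nested_total(n):
    # Repeat the inner sum_{i=1}^{k} (i-1)i/2 Z times per level,
    # doubling Z and incrementing k after each level, until the
    # total number of i-increments across all repetitions reaches n.
    total, increments, k, z = 0, 0, 1, 1
    while increments < n:
        inner = sum((i - 1) * i // 2 for i in range(1, k + 1))
        total += z * inner
        increments += z * k
        k += 1
        z *= 2
    return total

# n = 17 hits levels (k=1, Z=1), (k=2, Z=2), (k=3, Z=4): 1 + 4 + 12 = 17 increments
```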

Apologies if this has already been answered somewhere, but I wanted to ask this.

It's been almost 20 years since I did any serious kind of mathematics, and I don't have much exposure to mathematics in my current role as an application programmer.

Can any of you recommend some resources to start learning calculus and probability from scratch? I am looking for books that talk more about the theory rather than working out exercises. A few books I tried in the past turned out to be extremely dry affairs that jumped straight into problem solving without much background text.

On June 24 2017 00:49 Piledriver wrote: Apologies if this has already been answered somewhere, but I wanted to ask this.

It's been almost 20 years since I did any serious kind of mathematics, and I don't have much exposure to mathematics in my current role as an application programmer.

Can any of you recommend some resources to start learning calculus and probability from scratch? I am looking for books that talk more about the theory rather than working out exercises. A few books I tried in the past turned out to be extremely dry affairs that jumped straight into problem solving without much background text.

Thank you.

For the calculus portion, do you mean its theoretical foundations? If that is the case, you are looking for an introductory text to real analysis; you can go on math.stackexchange to find all sorts of textbook recommendations. However, the typical answer for calculus theory / real analysis has always been Rudin's real analysis textbook, though there are friendlier versions out there.

On June 24 2017 03:54 ninazerg wrote: Does 1 + 1 ever not equal 2?

Usually, one defines 2 to be 1+1. In this sense, 1+1=2 always holds.

But you can have 1+1=0, for example, in which case 2=0 (so still 1+1=2). In this case, if there is already some object taking the role of "2", one typically writes 0 instead of 2, to have an unambiguous notation.

On June 24 2017 03:54 ninazerg wrote: Does 1 + 1 ever not equal 2?

Usually, one defines 2 to be 1+1. In this sense, 1+1=2 always holds.

But you can have 1+1=0, for example, in which case 2=0 (so still 1+1=2). In this case, if there is already some object taking the role of "2", one typically writes 0 instead of 2, to have an unambiguous notation.

Indeed. In the realm of natural numbers (or the reals, or the rationals, or a lot of other realms), 1 + 1 = 2.

However, it is quite easy to conceive of a mathematical object worth studying where that is not the case. Take, for example, F2: the field of integers modulo 2. In that field there are only two elements: everything congruent to 0 modulo 2, and everything congruent to 1 modulo 2. (One could of course also call these by different names.)

It is quite obvious that 1+1 in that field equals 0, because 2 is equal to zero in that field.

Merely a joke about rounding numbers... typically, 1.3 would be rounded down to 1 while 2.6 would be rounded up to 3, so while 1.3+1.3=2.6, individually rounding those three numbers would create the incorrect estimation that 1+1=3.

Merely just a joke about rounding numbers... typically, 1.3 would be rounded down to 1 while 2.6 would be rounded up to 3, so while 1.3+1.3=2.6, individually rounding those three numbers would create the incorrect estimation that 1+1=3.

Not so much a joke, just showing you're a physicist :p Coming from math, I had a real hard time at first understanding the shenanigans behind 'pi = 1' or '1000+200 = 1000' and such.

Merely just a joke about rounding numbers... typically, 1.3 would be rounded down to 1 while 2.6 would be rounded up to 3, so while 1.3+1.3=2.6, individually rounding those three numbers would create the incorrect estimation that 1+1=3.

Not so much a joke, just showing you're a physicist :p Coming from math, I had a real hard time at first understanding the shenanigans behind 'pi = 1' or '1000+200 = 1000' and such.

I feel just the opposite. I get annoyed when people are working with rough estimates but use a value of pi that is correct to 5 decimal places. I know that's what some textbooks tell you to do, but come on people, use common sense.

So, this is kind of a crossover between CS and math. Not sure which thread to post this in. I guess math, but hopefully some people here understand it.

I have a boolean formula: (A ^ B) ^ (C V D) V (B ^ D) ^ (~C V ~D), etc. We are looking for the asymptotic complexity of finding satisfiability (a combination of trues and falses that makes the final output true) through brute force. The question is basically how many combinations of inputs there are. I believe the answer is 2^n.

The 2nd part of the question says that we now have the ability to pass our algorithm the conditions (A = true, B = false, C = true, etc. - one of our 2^n combinations) - but we can ALSO pass an integer "k". Our algorithm will return whether we have a satisfying assignment, but will ALSO return whether the boolean formula can be satisfied with "k" or fewer TRUE values.

The question for the 2nd part is what is the asymptotic complexity to find a satisfying assignment. Here is what I believe the answer is.

First, find the exact "k" value. This can be done by passing "k = n": if it comes back true, cut "k" in half and pass it again; if that's false, cut the interval in half between k and n; if it's true, cut it in half again, etc. Keep halving the interval until we get the exact k. This is log_2(n) operations.

Once we have an exact k, we know we need to pass that many trues into our algorithm. Therefore the total number of combinations to try will be (n choose k).

So I think the complexity this time will be log_2(n) + (n choose k), which asymptotically is just n choose k - which in the average case I think is much better than the first scenario.

Does this math look good? I just wanted to see if anyone spots any obvious mistakes.
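A toy version of the two-part scheme (the formula and all names here are made-up stand-ins; the part-2 oracle is itself implemented by brute force, just to make the binary search concrete):

```python
from itertools import combinations, product

def formula(a, b, c, d):
    # stand-in boolean formula; it needs a, b and at least one of c, d
    return (a and b) and (c or d)

def satisfiable_with_at_most(k, n, f):
    # the part-2 oracle: can f be satisfied using k or fewer True inputs?
    return any(f(*bits) for bits in product([False, True], repeat=n)
               if sum(bits) <= k)

def min_true_count(n, f):
    # binary search for the smallest accepted k: about log2(n) oracle calls
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if satisfiable_with_at_most(mid, n, f):
            hi = mid
        else:
            lo = mid + 1
    return lo

def find_assignment(n, f):
    # with k known, only the C(n, k) assignments with exactly k Trues remain
    k = min_true_count(n, f)
    for true_positions in combinations(range(n), k):
        bits = [i in true_positions for i in range(n)]
        if f(*bits):
            return bits
    return None
```

This mirrors the count in the post: log_2(n) oracle calls for the search, then at most (n choose k) candidate checks.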

Merely just a joke about rounding numbers... typically, 1.3 would be rounded down to 1 while 2.6 would be rounded up to 3, so while 1.3+1.3=2.6, individually rounding those three numbers would create the incorrect estimation that 1+1=3.

Not so much a joke, just showing you're a physicist :p Coming from math, I had a real hard time at first understanding the shenanigans behind 'pi = 1' or '1000+200 = 1000' and such.

I feel just the opposite. I get annoyed when people are working with rough estimates but use a value of pi that is correct to 5 decimal places. I know that's what some textbooks tell you to do, but come on people, use common sense.

I still agree. I mean pi at 5 decimals is as bad as taking 1 or 10. You can't replace pi by anything, just keep it. This has been the topic of many discussions at the (physics) lab :p

Merely just a joke about rounding numbers... typically, 1.3 would be rounded down to 1 while 2.6 would be rounded up to 3, so while 1.3+1.3=2.6, individually rounding those three numbers would create the incorrect estimation that 1+1=3.

Not so much a joke, just showing you're a physicist :p Coming from math, I had a real hard time at first understanding the shenanigans behind 'pi = 1' or '1000+200 = 1000' and such.

I feel just the opposite. I get annoyed when people are working with rough estimates but use a value of pi that is correct to 5 decimal places. I know that's what some textbooks tell you to do, but come on people, use common sense.

I still agree. I mean pi at 5 decimals is as bad as taking 1 or 10. You can't replace pi by anything, just keep it. This has been the topic of many discussions at the (physics) lab :p

It depends on *why* your error is what it is. If you have a 10% error because that's the limit of your instrument and there's no reasonable way to improve it, then sure use the best value of pi you can. But if you have a 10% error because you are just looking for a ballpark figure then what's the point? You might as well put that extra effort into improving your measurement or your model or whatever.

There's a difference between measurements and ballpark estimates. Both have errors, but when you estimate you create errors in order to keep moving along and get to some reasonable number as fast as possible. Saying that pi ~ 3.14 is no different than saying 900 + 95 ~= 1000

Soon we are going to need the Algorithms Thread.

I'm not sure I understand part 2, do you mean you can choose k out of n inputs arbitrarily? Or the first k of n?

In part 2 you can pass an integer "k" in addition to your series of true/false.

It then tells you if the formula is satisfiable with "k" true values or less. So if your formula needs 10 true values, and you pass it a "k" value of 9, it will return false for that.

It's not tied to the actual series of boolean variables you are sending, other than that they are both related to solving the formula. They could be 2 different functions for all it matters.

On June 25 2017 10:37 travis wrote: In part 2 you can pass an integer "k" in addition to your series of true/false.

It then tells you if the formula is satisfiable with "k" true values or less. So if your formula needs 10 true values, and you pass it a "k" value of 9, it will return false for that.

It's not tied to the actual series of boolean variables you are sending, other than that they are both related to solving the formula. They could be 2 different functions for all it matters.

Are there any restrictions on the formula? The example is (A op B) op (...) ..., so is it always the same pattern of brackets? Not sure about this off the top of my head. The naive solution is to check everything at O(2^n), but if you analyse the formula itself, shouldn't it be something in O(n), because you just count the operations in a certain way? E.g., (n-k) equals the number of ANDs or some such.

Edit: The problem I have with your original solution is that you are looking at the complexity of how many function calls to f(n, k) as a binary search on k. But I am wondering if the question is to figure out the complexity of f, how it determines what to return for a given k.

in all my years in academic mathematics, this is the best math problem i have ever encountered. so simple yet so frustratingly elusive. And the answer is elegant and massively satisfying.

problem: prove that 1+1=2 without using addition and numbers in your process and solution.

On June 25 2017 14:17 xwoGworwaTsx wrote: in all my years in academic mathematics, this is the best math problem i have ever encountered. so simple yet so frustratingly elusive. And the answer is elegant and massively satisfying.

problem: prove that 1+1=2 without using addition and numbers in your process and solution.

anybody want to have a go?

Peano arithmetic defines numbers as literals in a FOL. Does that count as "using numbers"? Otherwise you could do it in set theory, but it's basically the same.

Alright travis. I am going to split my answer into two parts. I think if you understand the logic behind the Gauss sum, you can very easily apply the same logic to your original problem and arrive at the formulas you desire.

We will only consider even n in the following, as the Gauss sum for odd n follows directly.

1) Gauss sum: The idea behind the Gauss sum is to make pairs: you take the first and the last element (1 and n), the second and the second-to-last element (2 and n-1), and so on. Notice that all these pairs have sum n+1. Now you have the n elements split up into n/2 pairs of sum n+1, whence the whole sum is (n+1)*n/2, the Gauss sum.

2) Partial sums: If you understand 1), I advise you to try to apply the same logic yourself before simply reading the solution.

Now instead of summing 1+2+...+n you just have the sum 1+2+...+n/2. This means our pairs are (1 + n/2), (2 + n/2 - 1), and so on. There are (n/2)/2 pairs, as you have n/2 elements. This leaves us with (n/4)*(1 + n/2) = (1/8)*n*(n+2).
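Both closed forms are easy to spot-check; a throwaway sketch (function names are illustrative):

```python
def gauss(n):
    # 1 + 2 + ... + n = n*(n+1)/2
    return n * (n + 1) // 2

def gauss_half(n):
    # 1 + 2 + ... + n/2 = (n/4)*(1 + n/2) = n*(n+2)/8, for even n
    return n * (n + 2) // 8

assert gauss(10) == sum(range(1, 11))       # 55
assert gauss_half(12) == sum(range(1, 7))   # 21
```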

Also, I wanted to share in this thread one of my favorite proofs. The problem is very simple and so is the proof, once you see it, though it takes a lot of creativity to solve it yourself. I dare you to try before reading the solution.

"For all prime numbers n with n greater than or equal to 5, (n^2) - 1 is divisible by 24."

First we note that (n+1)*(n-1) = (n^2) - 1. (*) As n is prime, we know that n is odd, so n-1 and n+1 are consecutive even numbers. Since every second even number (4, 8, 12, ...) is divisible by 4, we know that one of n+1 and n-1 is divisible by 4 and the other by 2. Hence we know that 8 divides (*). Now we regard n-1, n, n+1 as 3 neighbours. Of three neighbouring natural numbers, exactly one is divisible by 3. And it is not n, as n is prime and at least 5; hence it is n-1 or n+1. Therefore (*) is also divisible by 3, and therefore by 3*8 = 24.

I really like it because you can explain it to friends or family who don't know maths, to show them what a proof is (or you can be that guy at a party).

On June 28 2017 05:58 Kleinmuuhg wrote: Also I wanted to share in this thread one of my favorite proofs. The problem is very simple and so is the proof, once you see it, though it takes a lot of creativity to solve it yourself. I dare you to try before reading the solution

"For all prime numbers n with n greater than or equal to 5, (n^2) - 1 is divisible by 24."

First we note that (n+1)*(n-1) = (n^2) - 1. (*) As n is prime, we know that n is odd, so n-1 and n+1 are consecutive even numbers. Since every second even number (4, 8, 12, ...) is divisible by 4, we know that one of n+1 and n-1 is divisible by 4 and the other by 2. Hence we know that 8 divides (*). Now we regard n-1, n, n+1 as 3 neighbours. Of three neighbouring natural numbers, exactly one is divisible by 3. And it is not n, as n is prime and at least 5; hence it is n-1 or n+1. Therefore (*) is also divisible by 3, and therefore by 3*8 = 24.

I really like it because you can explain it to friends or family who don't know maths, to show them what a proof is (or you can be that guy at a party).

By extension of your proof, it's applicable to all odd numbers that are indivisible by 3, not necessarily prime.
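Both the original claim and this stronger version can be checked mechanically; a quick sketch (the trial-division primality test is just for illustration):

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

# original statement: for every prime n >= 5, n^2 - 1 is divisible by 24
assert all((n * n - 1) % 24 == 0 for n in range(5, 1000) if is_prime(n))

# the extension: it holds for every n >= 5 that is odd and not divisible by 3
assert all((n * n - 1) % 24 == 0
           for n in range(5, 1000) if n % 2 and n % 3)
```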

On June 28 2017 04:43 travis wrote: Could someone help me with summation algebra

I keep getting problems where I am summing to some portion of n, say

the sum from i=1 to (n/2) of i

now, I know that the sum i=1 to n of i is gauss sum: 1/2n(n+1)

I know this because I have memorized it

but when we sum to n/2 it becomes 1/8n(n+2)

I know this because wolfram alpha tells me so

but how can I manually solve this sum? Is there some rule for what happens to the sum when I only sum to a portion of my "n" ?

It really isn't complicated. The formula for the finite sum, as you know, is this:

sum from i=1 to n of i = n(n+1)/2.

If the upper limit of the sum changes to n/2, then wherever we see an n in the formula, we replace it with n/2: (n/2)((n/2)+1)/2. Factor out a 1/2 to get Mathematica's answer, (1/8)*n*(n+2).

That's why math and physics are awesome, so I thought I'd leave this here! YT video: Math vs Physics - Numberphile Check it out, if you have 13 minutes and 52 seconds to spare.

I tried my luck googling and going through youtube videos but I couldn't find anything, so here goes:

When you are solving a separable differential equation by separating the variables, how is splitting up dy/dx justified (apart from the fact that the technique is without a doubt producing the right answer), and what is happening? Do you multiply each side of the equation by an infinitely small change of x?

On August 03 2017 05:55 HKTPZ wrote: I tried my luck googling and going through youtube videoes but I couldnt find anything so here goes:

When you are solving a seperable differential equation by seperating the variables, how is splitting up dy/dx justified (apart from the fact that the technique is without a doubt producing the right answer) and what is happening? Do you multiply each side of the equation by an infinitely small change of x?

Ah yes, this question touches on higher level calculus and my uni professor's answer to it in a calculus class was more or less "don't worry about it". If you want a fairly abstract and readable discussion on it, I'd recommend this stackexchange answer, or, if you prefer, this more technical blogpost.

On August 03 2017 05:55 HKTPZ wrote: I tried my luck googling and going through youtube videoes but I couldnt find anything so here goes:

When you are solving a seperable differential equation by seperating the variables, how is splitting up dy/dx justified (apart from the fact that the technique is without a doubt producing the right answer) and what is happening? Do you multiply each side of the equation by an infinitely small change of x?

Ah yes, this question touches on higher level calculus and my uni professor's answer to it in a calculus class was more or less "don't worry about it". If you want a fairly abstract and readable discussion on it, I'd recommend this stackexchange answer, or, if you prefer, this more technical blogpost.

It could also be fun to note that in algebraic geometry you introduce [Kähler] differentials on curves and surfaces basically by saying "let's consider the vector space of things of the form dx" and just treating it formally, and it ends up behaving quite like the differential-geometric notions you're discussing.

There is a lot of difference between the answers when studying maths and when studying physics. If you are studying physics, the answer is: Because it works.

If you are studying maths, you have to get into what total differentials mean.

I'd guess your first step is to learn about limits from a formal perspective, and then to learn about different kinds of spaces and representations of them and how you end up doing limits in them.

Actually, mathematically this is not permitted, as it is not formally correct. Often you would rather use substitutions that are permitted. Or you can (and this is common practice) informally calculate a solution using that trick and then afterwards show that it solves the diff eq. This however is no better than saying: "oh look, this function magically appeared and now I can prove it solves my issue".

On August 05 2017 01:52 Kleinmuuhg wrote: Actually, mathematically this is not permitted, as it is not formally correct. Often times you would rather use substitutions that are permitted. Or you can (and this is common practice) informally calculate a solution using that trick and then afterwards show that it solves the diff eq. This however is no better than saying:oh look this function magically appeared and now I can prove it solves my issue.

Which for differential equations is usually how it works anyway. You have theorems that prove that if a solution with the necessary number of degrees of freedom exists, it is the only one. So if you have one, no matter where you got it from, that is all the information you need.

And I think if you interpret the "dx" and "dy" as total differentials, you can actually make that approach work.

On August 05 2017 01:52 Kleinmuuhg wrote: Actually, mathematically this is not permitted, as it is not formally correct. Often times you would rather use substitutions that are permitted. Or you can (and this is common practice) informally calculate a solution using that trick and then afterwards show that it solves the diff eq. This however is no better than saying:oh look this function magically appeared and now I can prove it solves my issue.

Which for differential equations is usually how it works anyways. You have theorems that prove that if a solution with the necessary amount of degrees of freedom exists, it is the only one. So if you have one, no matter where you got it from, that is all the information you need.

And i think if you interpret the "dx" and "dy" as total differentials, you can actually do make that approach work.

Even if interpreted as total differentials, there is no general concept of simplifying differentials, I think. I'm not quite sure on that though, since "total differential" is not something I really meet in the kind of functional analysis I'm usually concerned with. I seem to remember an exercise from a couple of years ago that gave an example in which simplifying dx/dy * dy/dz * dz/dx to 1 in an equation would yield a wrong solution. I'll see if I can dig it up.

From my first couple of classes on ordinary differential equations I seem to remember that the proofs of the proper mathematical way of solving these problems involved integrating an inverse function and applying it to the original problem, which relies heavily on the involved functions being smooth homeomorphisms (at least locally) and on the ranges/domains getting along. These two things are often not checked at all by the engineers/physicists I've been working with.

Edit: Found it! By an application of the implicit function theorem, you can show that for the function F with open domain Omega and for positive constants a,b,R

(Here it is assumed that the function has a zero in D, i think).

Interpreting p as pressure, v as molar volume and T as absolute temperature, the equation F(p,v,T) = 0 (or p = RT/(v-b) - a/(v^2)) should be the Van der Waals equation, which approximates the behaviour of real gases, iirc.

So yeah, in this case the differentials cannot be simplified, as that would yield 1 instead of -1 in the above equation (the triple product rule (dp/dv)(dv/dT)(dT/dp) = -1).

TL are my bros so y'all get this secret: the best math resource I have EVER found is this blog: https://betterexplained.com/cheatsheet/ (I have a Master's in Chemistry with Math and moonlight as a high school Math teacher, and this shit is so legit it helps my students and helps me like crazy)

If we want science to go really fast, and get to the future sooner, we need to massively speed up our math education (especially in the West). We have the Education knowledge of 'how to teach well', we just need to apply that to our math classes. This BetterExplained blog does that brilliantly, and I strongly suggest everyone check it out.

Back in school, doing a linear algebra review assignment.

It has me plot a couple vectors in R2 and then the vectors under a transformation.

The transformation matrix is [0 1; 1 0] --->(the ; denotes the next line of the matrix)

So I can see that what is happening to my vectors is that the x and y coordinates are swapped.

The assignment wants me to "Describe geometrically what T does to each vector x in R2". Is my answer that the coordinates are swapped an acceptable answer? Is there some other way to describe what this transformation matrix is doing? Thank you.

Swapping coordinates doesn't sound very geometric.

I assume the expected answer is "mirrored on the straight line through (0,0) and (1,1) (and beyond)". (I don't know what that line is called in English; it has a special name in German, something like "first median".)

Obviously any other straight line at a 45° angle to the coordinate axes would also be fine if you are only talking about vectors. If you are talking about points, you need exactly that one.

A geometric description is something like a rotation/reflection/translation. They are almost certainly looking for the word "reflection" and what the axis of reflection is.
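A tiny sketch of why "swap the coordinates" is the same as "reflect across the line y = x": points on that line are fixed, and everything else lands at its mirror image.

```python
# Apply the matrix [0 1; 1 0] to a vector (x, y) by hand.
def T(v):
    x, y = v
    return (0 * x + 1 * y, 1 * x + 0 * y)  # matrix-vector product

print(T((3, 1)))  # (1, 3): the mirror image of (3, 1) across y = x
print(T((2, 2)))  # (2, 2): points on y = x are fixed
```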

What do you mean by "or whatever axis labels you are using"? Using x and y is universal for the horizontal and vertical axes in R^2 afaik. Although it is weird that they would also use 'x' for a generic vector.

For Simberto; Spiegelung = reflection here. We don't have a special word for the x=y line.

On September 05 2017 01:55 travis wrote: In my book they tend to label the axes x_1, x_2... x_n

That generalises to arbitrarily many dimensions, but using x_1, x_2 is unnecessary for R^2. Tbh, I don't think I often see the axes labelled for R^n; it was always referred to as the i-th coordinate or i-th axis. That keeps x = (x_1, ..., x_n) free to use as an arbitrary point in R^n.

Then the way to refer to a line in the space is by using a scaling variable. You can always describe a line by giving one point p on the line and a direction vector v, making the line r v + p for all real r. So the line y = x would be r (1,1).

Anyhoo, this is all about notation rather than anything useful for the question, and may have just confused matters more.

Notation is boring, and usually people know what you mean regardless. The people who worry about notation are usually those who are very uncertain about the subject matter at hand. People who know their shit usually just swap notation on the fly to whatever fits better for the current situation.

On September 05 2017 04:12 Simberto wrote: Notation is boring, and usually people know what you mean regardless. The people who worry about notation are usually those who are very uncertain about the subject matter at hand. People who know their shit usually just swap notation on the fly to whatever fits better for the current situation.

I had a course with a lecturer who wrote all functions on the right and all group actions as right actions. It got very messy and confusing. Even if you knew what was going on and what you wanted to say it could be very difficult to write it properly.

Notation is just another form of language and - just like language - some people are very fussy and pedantic and others are not. If you are too careless with notation/language then it can make things harder to understand and explain. If you are too pedantic then you can drown in details. Most people are somewhere in the middle.

So, I've been thinking a bunch about a specific problem, and the closest math I can find to it is array signal processing.

The problem is thus: assume you have a regular grid of points p_m. Each point has a value v_m. You have N sensors s_n at known positions. You can measure each sensor s_i, and the result is the sum over all points j of v_j / dist(p_j, s_i)^2.

Given the measurements at each sensor, how well can you guess the values at each point? I think the main difference between this and traditional array processing is that we assume M >> N, and we aren't concerned with the positions of the points, as we assume them to be static and known.

Does anybody have any recommendations of stuff I could look up to learn more about this? Any terms that relate to this subproblem specifically?

Also, I've got a couple basic questions that might be answerable by anyone knowing any array signal processing at all. Of course if you make your sensors more spread out, you can get a better approximation. And if you get more sensors over the same area, you get a better approximation. But can anyone give me a rough idea how much each of these helps? Like, does doubling the spread halve the error on far points, and does squaring the number of sensors halve the error on far points?
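If I read the setup right, the measurement model is linear in the unknown values: m = A v with A[i][j] = 1/dist(p_j, s_i)^2, so recovering v is a linear inverse problem (underdetermined when M >> N). A toy sketch of the forward model, with made-up grid and sensor positions:

```python
import random

# Hypothetical 4x4 grid (M = 16 points) and N = 3 sensors; positions are invented.
points = [(x, y) for x in range(4) for y in range(4)]
sensors = [(0.5, -1.0), (3.5, -1.0), (1.5, 4.2)]

def dist2(p, s):
    return (p[0] - s[0]) ** 2 + (p[1] - s[1]) ** 2

# Forward model: measurement i is sum_j v_j / dist(p_j, s_i)^2, i.e. m = A v.
A = [[1.0 / dist2(p, s) for p in points] for s in sensors]
values = [random.uniform(0.5, 1.5) for _ in points]
measurements = [sum(a * v for a, v in zip(row, values)) for row in A]
print(measurements)
```

Since A has far more columns than rows you cannot recover v exactly; terms worth searching for are "linear inverse problem", "least squares", "Tikhonov regularization", and, if the values are sparse, "compressed sensing".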

Question on probabilities: You have one die. If you roll a 1, you win the prize. If you roll anything else, you don't win the prize but you are free to roll again. If your next roll is a number higher than your last, or is a 1, you win. Otherwise, you roll again. You continue rolling until you win.

I'm trying to figure out how to calculate with high precision (without simulating) the expected number of rolls it takes on average to win the prize. I know there are ways like markov chains that allow you calculate the chances of having won after x rolls, but I'm looking for the number of rolls on average to win. Any help would be appreciated, thank you.

On September 13 2017 06:09 enigmaticcam wrote: Question on probabilities:

You have one die. If you roll a 1, you win the prize. If you roll anything else, you don't win the prize but you are free to roll again. If your next roll is a number higher than your last, or is a 1, you win. Otherwise, you roll again. You continue rolling until you win.

I'm trying to figure out how to calculate with high precision (without simulating) the expected number of rolls it takes on average to win the prize. I know there are ways like markov chains that allow you calculate the chances of having won after x rolls, but I'm looking for the number of rolls on average to win. Any help would be appreciated, thank you.

This definitely isn't the right way to go about it, but: 1/6 chance to win immediately. If not, then 15 out of 30 of your second rolls will result in a win. So half of your second rolls will win it for you, and if you add your 1/6 to it, you have a higher than 50% chance to have won by the second roll, so on average you win by roll 2.

On September 13 2017 06:27 Dark_Chill wrote:This definitely isn't the right way to go about it, but: 1/6 chance to win immediately. If not, then 15 out of 30 of your second rolls will result in a win. So, half of your second rolls will win it for you, and with if you add your 1/6 to it, then you have higher than 50% to win on your second roll, so on average you win by roll 2.

I need it in high precision though, say three places past the decimal point. I can easily simulate it: after 10 million games, winning takes on average 2.441 rolls. Just trying to figure out how to calculate that.

I've tried blowing it out to all possibilities and averaging the number of turns it took to win for each possibility, but I'm always stumped by how to measure those where you haven't yet won. They can't be 0, otherwise that would be as if you won without even playing, and they can't be infinity, as that would just be silly. I'm sure there's a better way to do it.

On September 13 2017 06:27 Dark_Chill wrote:This definitely isn't the right way to go about it, but: 1/6 chance to win immediately. If not, then 15 out of 30 of your second rolls will result in a win. So, half of your second rolls will win it for you, and with if you add your 1/6 to it, then you have higher than 50% to win on your second roll, so on average you win by roll 2.

I need it in high precision though, say three past the decimal point. I can easily simulate it: after 10 million games I win on average 2.441 times. Just trying to figure out how to calculate that.

I've tried blowing it out to all possibilities and averaging the number of turns it took to win for each possibility, but I'm always stumped with how to measure those where you haven't yet won. They can't be 0, otherwise that would be as if you won without even playing, and they can't be infinity as that would just be silly. I'm sure there's a better way to do it.

Consider the expected number of rolls required after you roll a 2; call it E[2]. We know that it'll take at least one more roll, and then there are two situations, based on whether you roll a 1 or 3-6 (you win) or a 2 (you are back in the same state).

E[2] = 1 + 0*(5/6) + E[2]*(1/6)
E[2] = 1.2

So once a 2 is rolled, it'll take an expected number of 1.2 more rolls to win the game. Similarly for 3, but now we know the expected number of rolls we would need if we rolled a 2 next: E[3] = 1 + E[2]*(1/6) + E[3]*(1/6), giving E[3] = 1.44 = 1.2^2, and in general E[k] = 1.2^(k-1).

or, in Python terms: sum([1+1.2**i for i in range(1,6)] + [1])/6

This should be an exact answer.

(There's a bunch of stuff regarding Bernoulli distributions and things that I think you could set up if you wanted to keep it all purely symbolic, but I'm 98% confident that this simple approach is completely valid, and that I'm not cheating anywhere. I'd love to be informed if that's not the case!)
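Carrying out raNazUra's recursion with exact fractions (a sketch; E[k] here is the expected number of further rolls when your last roll was k, and the recursion E[k] = 1 + (1/6)(E[2] + ... + E[k]) is solved for E[k] at each step):

```python
from fractions import Fraction

E = {}  # E[k]: expected extra rolls when the last roll was k
for k in range(2, 7):
    s = sum((E[j] for j in range(2, k)), Fraction(0))
    # E[k] = 1 + (s + E[k]) / 6  ->  solve for E[k]
    E[k] = (1 + s / 6) * Fraction(6, 5)

total = 1 + sum(E.values(), Fraction(0)) / 6
print(total, float(total))  # 7776/3125 = 2.48832
```

The exact answer comes out as 7776/3125 = (6/5)^5 = 2.48832, matching the truncated 2.48832 computed later in the thread.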

The answer is converging, so set inf to be a really large number to calculate, and you'll get more or less the exact answer; with only 100 terms, my error is on the scale of 1E-69.

DO CHECK: SUM( W%Real(1): W%Real(inf)) = 1 , good.

W%weighted(1, 2, ..., n) = (1*W%Real, 2*W%Real, ..., inf*W%Real)
AVERAGE = SUM( W%weighted(1) : W%weighted(inf) ) = 2.48832 (truncating after the first 100 rolls)

Well, I spent a good 3 hours doing this, not very rigorous at all, but I'm 99.99% sure the answer is right. I'm curious to see the mathematically beautiful way to get this answer if there is one.

On September 13 2017 14:48 Acrofales wrote: @fiwi: a mathematically elegant way of doing it is in the answer above yours.

Oh, I'm just too dumb to understand it, I see.

edit: Oh nevermind, I get it. I read the problem and I got too excited about trying to solve it rather than properly reading the comments that followed.

A right answer is a right answer. Shows the difference between how an engineer vs. a mathematician approaches a problem.

On September 13 2017 08:47 raNazUra wrote:Consider the expected number of rolls required after you roll a 2, call it E[2]. We know that it'll take at least one more roll and then there will be two situations, based on whether you roll a 1,3-6 or a 2.

E[2] = 1 + 0*5/6 + E[2]*1/6
E[2] = 1.2

How did you figure out what E[2] is in this equation? I'm not sure how you got from "E[2] = 1 + 0*5/6 + E[2]*1/6" to "E[2] = 1.2".

On September 13 2017 08:47 raNazUra wrote:Consider the expected number of rolls required after you roll a 2, call it E[2]. We know that it'll take at least one more roll and then there will be two situations, based on whether you roll a 1,3-6 or a 2.

E[2] = 1 + 0*5/6 + E[2]*1/6
E[2] = 1.2

How did you figure out what E[2] is in this equation? I'm not sure how you got from "E[2] = 1 + 0*5/6 + E[2]*1/6" to "E[2] = 1.2".

E[2] is, at the end of the day, some number. That means that I can treat it like I treat any other variable and do standard algebra to solve for it:

E[2] = 1 + 0*5/6 + E[2]*1/6
E[2] = 1 + E[2]/6    # clean up
E[2]*5/6 = 1         # subtract E[2]/6 from each side
E[2] = 1.2           # multiply by 6/5

Probably would have been more clear had I not switched from fractions to decimals halfway through.

On September 13 2017 14:48 Acrofales wrote: @fiwi: a mathematically elegant way of doing it is in the answer above yours.

Oh, I'm just too dumb to understand it, I see.

edit: Oh nevermind, I get it. I read the problem and I got too excited about trying to solve it rather than properly reading the comments that followed.

A right answer is a right answer Shows the difference between how an engineer vs how a mathematician approaches a problem.

i.e. Python vs Excel :p

Matlab if I have an idea about the method of the solution, Excel when I have no idea how to start, Mathematica if I want to be a smart ass instead of being practical ^^.

But that was all about being clever and finding how to start it, not about the software.

I've been messing around with differentiation of exponential functions the last few days and there's something I can't seem to figure out, so I thought I'd ask TeamLiquid for advice/help/inspiration.

By the definition of the derivative, the derivative of a^x is a^x times the limit of (a^h - 1)/h as h goes to 0. Evaluating that limit is trivial by applying L'Hôpital, BUT that rests upon already knowing the derivative of a^x, so assuming that isn't known, how does one figure out that limit?

The brute approach obviously is just convincing oneself that the derivative of 2^x is 2^x itself times a constant which appears to be approximately 0.7.

Then one could do the same for 8^x and come to the realization that the derivative of 8^x is 8^x times a constant which is about 3 times as great as 0.7, which suggests we're looking at something logarithmic, the base of which can be approximated by a Taylor series, etc.

This approach seems so brutish and inelegant, so my question is: is there a cleverer and more elegant way to evaluate the limit which holds the key to differentiating exponential functions?

(Let me know if my wording was too poor/confusing; English is not my first language.)
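Your two observations can at least be checked numerically: the constant for 2^x is ln 2 ≈ 0.693 (your "approximately 0.7"), and the constant for 8^x is ln 8 = 3 ln 2 (your factor of 3). A quick sketch:

```python
import math

# Watch (a^h - 1)/h approach ln(a) as h shrinks.
for a in (2.0, 8.0):
    for h in (0.1, 0.001, 0.00001):
        print(a, h, (a ** h - 1) / h)
    print("ln(%g) = %f" % (a, math.log(a)))
```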

Meeh :/ I tried using a latex-to-bbcode converter but it doesn't work the way I want it to. So in short and without formality:

You want a proof that the limit of f(x) as x -> 0, with f(x) = (b^x - 1)/x, exists. Did I get that right? You prove this by showing that f is strictly monotonically increasing on R+, which you can do using the mean value theorem. Hence the limit of f(x) as x -> 0 equals inf{ f(x) | x in R+ } =: s. (1) Now you take a convergent sequence h_n in R\{0} with h_n -> 0 and show lim f(|h_n|) = s (via the monotonic increase and the infimum property). You can easily show that f(-h) = b^(-h) f(h) for h in R+. Also x -> b^x is continuous at 0, so lim b^(-|h_n|) = b^0 = 1. Hence lim b^(-|h_n|) f(|h_n|) = 1 * s = s. (2)

Let d be in R+. From (1) it follows there's N_1 in the naturals with |f(|h_n|) - s| < d for all n > N_1. From (2) it follows there's N_2 in the naturals with |b^(-|h_n|) f(|h_n|) - s| < d for all n > N_2. For all n, f(h_n) is either f(|h_n|) or b^(-|h_n|) f(|h_n|), so |f(h_n) - s| < d for all n > max{N_1, N_2}.

Hope this is helpful. I never did any math in English so I'm lacking all the vocabulary; however, I hope you can follow my "proof".

Edit: There is an elegant way to determine the derivative of a^x (and hence the limit you are looking for), which is exactly what the dude wrote before me: a^x = exp(x ln(a)), and it tells you the limit is ln(a). I don't think you need to look any further for a clever and elegant way, because you already use it all the time: the common derivation rules and the fact that exp is differentiable on its entire domain. (Btw, you don't need L'Hôpital to prove any of this.)

Can't really answer how to approach the limit question without knowing the exponential beforehand, but my 2 cents:

So if you assume you don't know the exponential, the only way (I know of) that you can define a^x is as the unique solution to the functional equation f(x+y) = f(x)*f(y) with f(1) = a, for a > 0.

So f' must satisfy f'(x+y) = f'(x)*f(y); setting x = 0 (and renaming y to x), you have f'(x) = f'(0)*f(x).

And now I'm stuck, because one of the definitions of e^x is as the solution of the differential equation f'(x) = f(x).

But that gives an answer to why there is some constant times the function itself.

I think one of the working definitions of e is that the derivative of e^x is e^x. Whichever of the different ways you choose to define e, you can prove from there that that sentence is true. I don't remember the specifics, but it works.

If you have that settled, and also have the chain rule of calculus proven, you can take the route that Liebig explained above.

And that is an absolutely fine way to prove the thing you wanted to prove. You know other things, and derive the result from them. So you don't know what the derivative of a^x is, but you know what the derivative of e^x is, either by definition or by something that follows from that definition.

You are trying to go in the opposite direction. You don't start with proving what the derivative of a^x is and go to e^x from there; you go the other way, start with e^x, and go to a^x.

exp is not defined by exp'(x) = exp(x). It's defined by a power series (exp(x) = sum x^n/n!, n = 0 to infinity) or as the limit of the sequence (1 + x/n)^n as n -> infinity.

But besides that I agree with you, Simberto. It feels like HKTPZ goes about it the wrong way. That's also what I was trying to say in some way: you already know d(a^x)/dx = ln(a) a^x because of the chain rule and the exp properties you use daily: d(a^x)/dx = d(exp(x ln(a)))/dx = exp(x ln(a)) ln(a) = a^x ln(a). And d(a^x)/dx = lim ( a^(x+h) - a^x )/h = a^x lim (a^h - 1)/h as h -> 0. In short: a^x ln(a) = d(a^x)/dx = a^x lim (a^h - 1)/h.

Commonly you calculate d(a^x)/dx = a^x ln(a) via the derivation rules. These are no rules of thumb but proven generalizations that always work for the right parameters. So the clever and elegant way you are looking for to determine the limit lim (a^h - 1)/h is the set of derivation rules plus the properties of exp.

On September 29 2017 06:16 jodljodl wrote: exp is not defined by exp'(x) = exp(x). Its defined by a power series (exp(x) = sum x^n/n! , n= 0 to infinity) or as the limit of the sequence (1 + x/n)^n , n -> infty.

Exponential can also be defined as the only solution of y' = y, with y(0) = 1, implying exp'(x) = exp(x).

On September 29 2017 06:16 jodljodl wrote: exp is not defined by exp'(x) = exp(x). Its defined by a power series (exp(x) = sum x^n/n! , n= 0 to infinity) or as the limit of the sequence (1 + x/n)^n , n -> infty.

Exponential can also be defined as the only solution of y' = y, with y(0) = 1, implying exp'(x) = exp(x).

You need a lot more mathematical background to prove that this differential equation has a unique solution than you need to define the exponential function by its series representation. Also, the series representation can be generalized more easily (say to bounded operators acting on possibly infinite-dimensional Banach spaces), and it allows you to verify properties of the (real) exponential function, like its limit behaviour when composed with polynomial functions, in a very simple way. I don't really see the advantage of using your proposed definition, unless it's part of a class for, say, engineers who are primarily interested in solving differential equations.

Mind you, I don't want to say that either definition is really more "correct" (at least in the case of real exponential functions); I just think that the series representation fits more naturally into the way an analysis class is structured. At least where I'm from; maybe it's different if you do several classes of "calculus" before developing an axiomatic theory.

Edit: The more I think about it, the less I like your way of defining the exponential function. It really only works properly for functions defined on at least some open neighbourhood of 0 on the real axis. And as soon as you try to generalize to the complex plane, you will inevitably need the series representation to prove uniqueness of the solution anyway, because you will need some form of an identity theorem for holomorphic functions... It just feels like it's really only useful in a setting where the people being taught are interested only in the theory of ordinary differential equations on real functions and their strong solutions.

On September 29 2017 06:16 jodljodl wrote: exp is not defined by exp'(x) = exp(x). Its defined by a power series (exp(x) = sum x^n/n! , n= 0 to infinity) or as the limit of the sequence (1 + x/n)^n , n -> infty.

Exponential can also be defined as the only solution of y' = y, with y(0) = 1, implying exp'(x) = exp(x).

You need a lot more mathematical background to prove that this differential equation has a unique solution than you need to define the exponential function by it's series representation. Also the series representation can be generalized more easily (say to the case of bounded operators acting on possibly infinite-dimensional banach spaces) and it allows you to verify properties of the (real) exponential function like its limit behaviour when concatenated with polynomial functions in a very simple way.. I dont really see the advantage of using your proposed definition, as long as it's not part of a class for say engineers that are primarily interested in solving differential equations.

Mind you I dont want to say that either definition is really more "correct" (at least in the case of real exponential functions), I just think that using the series representation fits more naturally into the way an analysis class is structured. At least where I'm from, maybe it's different if you do several classes of "calculus" before developing an axiomatic theory..

Edit: The more I think about it the less I like your way of defining the exponential function. It really only works properly in the case of functions defined on at least some open neighbourhood of 0 on the real axis. And at the very least as soon as you try to generalize to the complex plane you will inevitably need the series representation to prove uniqueness of the solution anyway, bc you will need some form of an identity theorem for holomorphic functions... Just feels like it really only is useful in a setting where the people being taught are interested only in the theory of ordinary differential equations on the set of real functions and their strong solutions....

I remember at least three different definitions of e or exp: one via the infinite series, one via the differential equation, and one as the limit of (1+1/n)^n. Though I admit I had to look that last one up. It has been a while since my introductory maths classes, but I am rather certain that you can prove they are all equivalent anyway. So basically, just use whichever is most useful in the situation you are in. The differential equation is probably not the most useful for introductory maths classes for maths students, due to the problems you mention. And except for maths students, no one cares about the exact way something can be derived from other stuff and whether that is actually correct.

Edit: Also, to get from the infinite series of the exp function to d/dx exp(x) = exp(x) you just need to differentiate polynomials, and derivatives of polynomials are easily proven from the definition of the derivative. So with regards to the original question, it doesn't really matter which one we start with. The way to d/dx a^x (and the limit in question) from not a lot of other stuff is: prove d/dx x^n = n x^(n-1); prove the chain rule; define exp as the infinite series sum x^n/n!; do some work to get a reasonable definition of ln(x); derive d/dx exp(x) = exp(x); define a^x as exp(x ln a); and differentiate via the ruleset that has been proven.
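That chain of steps can be illustrated numerically (a sketch, not a proof): truncate the series, differentiate it numerically, and compare against the closed forms.

```python
import math

# exp_N(x) = sum_{n=0}^{N} x^n / n!; differentiating term by term shifts the
# series down by one index, which is why exp' = exp.
def exp_series(x, N=30):
    return sum(x ** n / math.factorial(n) for n in range(N + 1))

x, h = 0.7, 1e-6
numeric_deriv = (exp_series(x + h) - exp_series(x - h)) / (2 * h)
print(numeric_deriv, exp_series(x))  # both are ~e^0.7

# d/dx a^x = ln(a) * a^x, via a^x = exp(x ln a) and the chain rule.
a = 2.0
numeric = (a ** (x + h) - a ** (x - h)) / (2 * h)
print(numeric, math.log(a) * a ** x)
```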

On September 29 2017 06:16 jodljodl wrote: exp is not defined by exp'(x) = exp(x). Its defined by a power series (exp(x) = sum x^n/n! , n= 0 to infinity) or as the limit of the sequence (1 + x/n)^n , n -> infty.

Exponential can also be defined as the only solution of y' = y, with y(0) = 1, implying exp'(x) = exp(x).

You need a lot more mathematical background to prove that this differential equation has a unique solution than you need to define the exponential function by it's series representation. Also the series representation can be generalized more easily (say to the case of bounded operators acting on possibly infinite-dimensional banach spaces) and it allows you to verify properties of the (real) exponential function like its limit behaviour when concatenated with polynomial functions in a very simple way.. I dont really see the advantage of using your proposed definition, as long as it's not part of a class for say engineers that are primarily interested in solving differential equations.

Mind you I dont want to say that either definition is really more "correct" (at least in the case of real exponential functions), I just think that using the series representation fits more naturally into the way an analysis class is structured. At least where I'm from, maybe it's different if you do several classes of "calculus" before developing an axiomatic theory..

Edit: The more I think about it, the less I like your way of defining the exponential function. It really only works properly for functions defined on at least some open neighbourhood of 0 on the real axis. And as soon as you try to generalize to the complex plane, you will inevitably need the series representation to prove uniqueness of the solution anyway, because you will need some form of an identity theorem for holomorphic functions. It just feels like it is only useful in a setting where the people being taught are interested solely in the theory of ordinary differential equations on real functions and their strong solutions.

Your last one should read (1+x/n)^n if you want to define the actual exponential map, I think.

Of course they are "equivalent" as long as you stay in the realm of real functions. But that was precisely the point I was trying to make: understanding why the differential equation a) has a solution and b) has a unique solution requires far more machinery than covering convergent series. The simplest way of proving that the differential equation in question has a solution is to give one. But if you are going to define the exponential function that way, then you can hardly use the exponential function to show that a solution exists. Using the series expression would also defeat the purpose of using the differential equation to define the exponential function in the first place. So all that is left is relying on more sophisticated tools, like some kind of existence/uniqueness result for differential equations, I guess. That does seem rather 'high level' for such simple endeavours.

I mean, assuming we want to define the exp function as the unique solution to the differential equation: how would I go about proving that the differential equation has a solution and that it is unique? Even attempting some fixed-point iteration seems a bit ridiculous, because that relies on an understanding of convergent sequences (which correspond 1:1 to convergent series, the exact thing we were trying to avoid).

Concerning your edit: you can't just assume that the derivative of a series (i.e., of the limit of finite sums) is the same as the limit of the derivatives of the finite sums. There are certain settings where limits and derivatives can be exchanged, but this is NOT true in general, and you should always make very sure why exactly you are allowed to interchange limits (after all, taking a derivative is just another limit). In this case the fact that the termwise-differentiated series also converges uniformly on compact sets (together with the pointwise convergence of the original series) saves the day, but we should always make sure we know the reasons why certain operations yield correct results.

Of course after making sure that all this is allowed, you're perfectly right. Then all that remains to do is exchange the series' limit and the d/dx and differentiate dem juicy polynomials. :>
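To illustrate the warning about exchanging limits and derivatives, here is a quick numerical sketch (my own textbook-style example, not from the thread): f_n(x) = sin(nx)/n converges uniformly to 0, yet the derivatives f_n'(x) = cos(nx) do not converge to the derivative of the limit function, which is 0.

```python
import math

def f(n, x):
    # f_n(x) = sin(n x)/n; |f_n(x)| <= 1/n, so f_n -> 0 uniformly
    return math.sin(n * x) / n

def f_prime(n, x):
    # f_n'(x) = cos(n x), which keeps oscillating in [-1, 1] as n grows
    return math.cos(n * x)

x = 1.0
for n in (10, 100, 1000):
    print(n, f(n, x), f_prime(n, x))
```

The limit function is identically 0, so its derivative is 0 everywhere, while cos(n·1) never settles down: lim and d/dx give genuinely different answers here.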

If you define e^x as an infinite series, then the derivative is obviously the derivative of that infinite series. You don't need the general case, you need this very specific case. You should of course prove f(x) = f'(x) for f(x) = e^x, using whichever definition of the derivative you prefer.

Not sure what you're trying to say, or even which part of my post you are referring to exactly. If it's about the last part: "the derivative of the series" would be d/dx \lim_n\sum_{k=0}^n x^k/k!, and all that I said is that you need a reason why this equals \lim_n d/dx \sum_{k=0}^n x^k/k!, i.e., a reason why you are allowed to exchange the order of derivative and limit. This is NOT trivial, nor true for arbitrary series. It is, however, exactly what you need to do to reduce the case of differentiating the exponential function to the case of differentiating polynomials, i.e., applying the differentiation to the finite sums.

I know that in some courses for engineers (or even physicists) people pretend that limits can be arbitrarily exchanged. Time to accept that this is just wrong and is only ever done because it would be too time-consuming to give the full picture.

Clearly, I didn't manage to ask my question the right way, so I will try to rephrase ^^

I know a^x can be rewritten as e^(x*ln a), I know the derivative of e^x is e^x times ln e, which simplifies to e^x, and I know Leibniz's chain rule.

I'm asking: WITHOUT using that knowledge, is there a clever and elegant way to show that lim ((a^h)-1)/h as h goes to 0 is exactly 1/n of lim (((a^n)^h)-1)/h as h goes to 0, implying there is a connection to log?

I can show that lim ((a^h)-1)/h as h goes to 0 exists, but I'm at a loss trying to show the limit is ln a (what the base is is trivial; the point is to show the connection to logarithmic behaviour).
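For what it's worth, the claimed value can at least be checked numerically. A sketch of mine (L is just a name for the difference quotient at a small fixed h, not anything standard):

```python
import math

def L(a, h=1e-7):
    # difference quotient (a^h - 1)/h at small h; should approach ln a
    return (a ** h - 1) / h

print(L(2), math.log(2))   # both near 0.6931...
print(L(8), 3 * L(2))      # L(2^3) is about 3 * L(2): logarithmic behaviour
```

This of course demonstrates the claim rather than proves it, but it makes the "1/n" relation L(a^n) = n L(a) easy to see in numbers before attacking it formally.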

Thank you for talking down to engineers and physicists. I am actually a mathematician (well, at least I got my propaedeuse in math; I never completed my BSc, as I moved to CS, which is less hooked up on infinitesimal details).

Anyway, I see your problem, and I think you are simply misunderstanding what I am saying. I do see that there is a more fundamental problem with Simberto's approach, and it's all in the definition of e^x as an infinite series, but it's not in the proof of the derivative. So let's clarify that first:

We define:

f(x) = sum x^n/n! (with n 0 to infinity)

Now:

df /dx = d / dx sum x^n/n! (with n 0 to infinity) (by definition)

Now we have to prove that df / dx = f.

Lemma: d / dx sum x^n/n! (with n 0 to N) = sum x^n/n! (with n 0 to N - 1).

Proof:

Base step: d / dx sum x^n/n! (with n 0 to 1) = d / dx (1 + x) = 1 = sum x^n/n! (with n 0 to 0). (little square box)

Inductive step: Assume d / dx sum x^n/n! (with n 0 to M) = sum x^n/n! (with n 0 to M - 1).

d / dx sum x^n/n! (with n 0 to M + 1) = d / dx (x^(M + 1)/(M + 1)! + sum x^n/n! (with n 0 to M)) = d / dx x^(M + 1)/(M + 1)! + d / dx sum x^n/n! (with n 0 to M) = (M + 1)x^M / (M + 1)! + d / dx sum x^n/n! (with n 0 to M) = x^M / M! + d / dx sum x^n/n! (with n 0 to M) = x^M / M! + sum x^n/n! (with n 0 to M - 1) = sum x^n/n! (with n 0 to M). (little square box)

QED.

Theorem:

d / dx sum x^n/n! (with n 0 to infinity) = sum x^n/n! (with n 0 to infinity).

Proof (using Lemma above): d / dx sum x^n/n! (with n = 0 to infinity) = lim(N -> inf) d / dx sum x^n/n! (with n = 0 to N) (this step interchanges d / dx with the limit, which is exactly the step that needs justification, as discussed above) = lim(N -> inf) sum x^n/n! (with n = 0 to N - 1) = lim(N -> inf) sum x^n/n! (with n = 0 to N) = sum x^n/n! (with n = 0 to infinity).

QED

So there we have it: f(x) = f'(x) for f defined as above.

Now, of course, there is a problem, and that is that we haven't defined a^x, only e^x. I believe the usual solution is to simply define a^x = e^(x ln a). And from then onwards it's easy. But I'm not sure that this satisfies what the original question was asking.
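A quick numerical sanity check of the lemma and theorem above (a sketch of mine; the names partial and partial_derivative are made up here): the termwise derivative of the degree-N partial sum equals the degree-(N-1) partial sum, and both approximate e^x.

```python
import math

def partial(x, N):
    # partial sum f_N(x) = sum_{n=0}^{N} x^n/n!
    return sum(x ** n / math.factorial(n) for n in range(N + 1))

def partial_derivative(x, N):
    # termwise derivative of f_N: sum_{n=1}^{N} n x^(n-1)/n! = f_{N-1}(x)
    return sum(n * x ** (n - 1) / math.factorial(n) for n in range(1, N + 1))

x, N = 1.5, 20
print(partial_derivative(x, N), partial(x, N - 1))  # equal, per the Lemma
print(partial(x, N), math.exp(x))                   # both close to e^1.5
```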

On September 29 2017 01:39 HKTPZ wrote: Hey y'all.

I've been messing around with differentiation of exponential functions for the last few days, and there's something that I can't seem to figure out, so I thought I'd ask TeamLiquid for advice/help/inspiration.

By the definition of differentiation, the derivative of a^x is a^x times the limit of ((a^h)-1)/h as h goes to 0. Evaluating that limit is trivial by applying L'H, BUT that rests upon already knowing the derivative of a^x, so assuming that isn't known, how does one figure out that limit?

the brute approach obviously is just convincing oneself that the derivative of 2^x is 2^x itself times a constant which appears to be approximately 0.7

then one could do the same for 8^x and come to the realization that the derivative of 8^x is 8^x times a constant which is about 3 times as great as 0.7 - which suggests we're looking at something logarithmic - the base of which can be approximated by a Taylor series etc etc

this approach seems so brutish and non-elegant, so my question is: is there a cleverer and more elegant way to evaluate the limit which holds the key to differentiating exponential functions?

(let me know if my wording was too poor/confusing - English is not my first language)

Hello, HKTPZ.

Honestly, I would do it the way Liebig suggested.

On September 29 2017 02:37 Liebig wrote: a^x = exp(ln(a^x))=exp(x*ln(a))

so the derivative of a^x is ln(a)*a^x

If we can't use that approach, though, then we have to go back to the definition of the derivative.

The function we have is f(x) = a^x. Plug this into the definition.

Use one of the properties of exponents.

Factor a^x and bring it in front of the limit. This can be done because it doesn't depend on h.

We have here the indeterminate form 0/0, so we can apply L'Hôpital's rule.

The denominator is just 1, so it disappears. In the numerator, the constant disappears. We're just left with what we set out to find from the beginning.

This is just f'(0).

We stated at the beginning that f(x) = a^x. Substitute it here.

We thus have a differential equation for f(x). Divide both sides by f(x).

The left side is just the derivative of ln f.

Now integrate both sides with respect to x.

Exponentiate both sides so that we just have f(x) on the left side.

Use one of the properties of exponents.

Use a new constant for e^C.

Now substitute a^x for f(x).

We can determine A by setting x = 0. Doing so gives us 1 = A.

All that's left to do is to solve for f'(0). Take the natural log of both sides.

Use the property of logs that allows us to bring the power of the argument to the coefficient.

ln e = 1 and x cancels from both sides. We're left with the value of f'(0): it is ln a.

f'(x) = a^x f'(0). Therefore, f'(x) = a^x ln a.

I hope this helped you out.

Sincerely, Shalashaska_123
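The LaTeX images that originally accompanied these steps have not survived; reconstructing from the prose, the chain of equations appears to be the following (my reconstruction, so read it against the steps above):

```latex
% Reconstruction of the lost images, following the prose step by step
\begin{align*}
f'(x) &= \lim_{h\to 0}\frac{a^{x+h}-a^x}{h}
       = \lim_{h\to 0}\frac{a^x a^h - a^x}{h}
       = a^x \lim_{h\to 0}\frac{a^h-1}{h} = a^x f'(0)\\[4pt]
\frac{f'(x)}{f(x)} &= f'(0)
 \;\Longrightarrow\; \bigl(\ln f(x)\bigr)' = f'(0)
 \;\Longrightarrow\; \ln f(x) = f'(0)\,x + C\\[4pt]
f(x) &= e^{f'(0)x + C} = A\,e^{f'(0)x},
 \qquad f(0) = a^0 = 1 \;\Longrightarrow\; A = 1\\[4pt]
a^x &= e^{f'(0)x}
 \;\Longrightarrow\; x\ln a = f'(0)\,x
 \;\Longrightarrow\; f'(0) = \ln a,
 \qquad f'(x) = a^x \ln a
\end{align*}
```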

On September 29 2017 17:23 HKTPZ wrote: Clearly, I didnt manage to ask my question the right way so I will try to rephrase ^^

I know a^x can be rewritten as e^(x*lna) and I know the derivative of e^x is e^x times ln e which simplifies to e^x and I know Leibniz' chain rule.

Im asking: WITHOUT using that knowledge, is there a clever and elegant way to show that lim ((a^h)-1)/h as h goes to 0 is exactly 1/n of lim (((a^n)^h)-1)/h as h goes to 0 implying there is a connection to log.

I can show that lim ((a^h)-1)/h as h goes to 0 exists but Im at a loss trying to show the limit is ln a (what the base is is trivial - the point is to show the connection to logarithmic behaviour)

As in the post above: please define a^x in a suitably formal manner, and we can take it from there.

Holy wowzerz. You generated a million different images from latex, uploaded them to imgur and included them here. You're my hero.

Also, a super elegant solution, but there is one slight hitch:

"The left side is just the derivative of ln f."

It's easy to prove, but you'd still have to return to first principles. I think you don't escape needing to define e^x in order to be able to define ln x as its inverse. And then you'd need to prove that sentence there. Which means a lot of work.

On September 29 2017 01:39 HKTPZ wrote: Hey y'all.

I've been messing around with differentiation of exponential functions the last few days, and there's something that I can't seem to figure out, so I thought I'd ask TeamLiquid for advice/help/inspiration.

By the definition of differentiation, the derivative of a^x is a^x times the limit of (a^h - 1)/h as h goes to 0. Evaluating that limit is trivial by applying L'Hôpital's rule, BUT that rests upon already knowing the derivative of a^x, so assuming that isn't known, how does one figure out that limit?

the brute approach obviously is just convincing oneself that the derivative of 2^x is 2^x itself times a constant which appears to be approximately 0.7

then one could do the same for 8^x and come to the realization that the derivative of 8^x is 8^x times a constant which is about 3 times as great as 0.7 - which suggests we're looking at something logarithmic - the base of which can be approximated by a Taylor series, etc.

this approach seems so brutish and inelegant, so my question is: is there a cleverer and more elegant way to evaluate the limit which holds the key to differentiation of exponential functions?

(let me know if my wording was too poor/confusing - English is not my first language)

Hello, HKTPZ.

Honestly, I would do it the way Liebig suggested.

On September 29 2017 02:37 Liebig wrote: a^x = exp(ln(a^x)) = exp(x*ln(a))

so the derivative of a^x is ln(a)*a^x

If we can't use that approach, though, then we have to go back to the definition of the derivative.

The function we have is f(x) = a^x. Plug this into the definition.

Use one of the properties of exponents.

Factor a^x and bring it in front of the limit. This can be done because it doesn't depend on h.

We have here the indeterminate form 0/0, so we can apply L'Hôpital's rule.

The denominator is just 1, so it disappears. In the numerator, the constant disappears. We're just left with what we set out to find from the beginning.

This is just f'(0).

We stated at the beginning that f(x) = a^x. Substitute it here.

We thus have a differential equation for f(x). Divide both sides by f(x).

The left side is just the derivative of ln f.

Now integrate both sides with respect to x.

Exponentiate both sides so that we just have f(x) on the left side.

Use one of the properties of exponents.

Use a new constant for e^C.

Now substitute a^x for f(x).

We can determine A by setting x = 0. Doing so gives us 1 = A.

All that's left to do is to solve for f'(0). Take the natural log of both sides.

Use the property of logs that allows us to bring the power of the argument to the coefficient.

ln e = 1 and x cancels from both sides. We're left with the value of f'(0).

f'(x) = a^x f'(0). Therefore,

I hope this helped you out.

Sincerely, Shalashaska_123

Holy wowzerz. You generated a million different images from latex, uploaded them to imgur and included them here. You're my hero.

Also, a super elegant solution, but there is one slight hitch:

"The left side is just the derivative of ln f."

It's easy to prove, but you'd still have to return to first principles. I think you don't escape needing to define e^x in order to be able to define ln x as its inverse. And then you'd need to prove that sentence there. Which means a lot of work.

Thanks, mate.

The only justification needed for that step is this: we know that the derivative of ln x is 1/x. If we have ln f(x) instead, then we have to apply the chain rule: (1/f)*f', which is f'/f. Having to go any further than that is just splitting hairs, in my humble opinion, and would take focus away from the original problem.
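That chain-rule step is easy to spot-check numerically (an editor's sketch; the choice of f(x) = 2^x as the test function is mine): a central difference of ln f should land on f'/f = ln 2 at every x.

```python
import math

# Editor's spot check: for f(x) = 2**x, d/dx ln f(x) should equal
# f'(x)/f(x) = ln(2), independent of x.
f = lambda x: 2.0 ** x
h = 1e-6
for x in [-1.0, 0.0, 1.3]:
    d_lnf = (math.log(f(x + h)) - math.log(f(x - h))) / (2 * h)  # central difference
    print(x, d_lnf, math.log(2))
```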

On September 29 2017 18:34 Shalashaska_123 wrote: + Show Spoiler +


This is exactly what I was looking for. Elegant and simple. Thank you very much for spelling it out to such length.

I feel so ambivalent right now xD happy that it makes sense to me now and sad that I was too stupid to figure it out on my own


Clearly it's academic, because HKTPZ is happy with it, but the way I interpreted the question was to remove all of the plumbing, which also means you can't just take as known that the derivative of ln x is 1/x; you would have to prove that, which would need some basic knowledge of what the natural logarithm is... and HKTPZ kinda forbade us from using that knowledge.

It's a bit strange to ask to prove that the derivative of f(x) = a^x is ln(a) * a^x WITHOUT using the knowledge that a^x = e^(x ln a), but WITH using the general knowledge of what ln is. One follows quite naturally from the other. But I guess it's homework. Placing arbitrary limitations is a great way of figuring out what students actually know.


well fortunately for my odd request, the derivative of ln x does not rely on knowing what the derivative of a^x is, so all's good ^^

I don't think it's a completely arbitrary limitation - in fact I think it's very reasonable. But to each their own :D Also, it isn't even for homework - it was just something I was wondering about


The derivative of ln x kind of depends on having a good idea what ln x is, though. Which requires a good idea of what e^x is, which means you probably know the derivative of e^x, at which point you can use a^x = e^(x ln a).

On September 29 2017 06:16 jodljodl wrote: exp is not defined by exp'(x) = exp(x). It's defined by a power series (exp(x) = sum x^n/n!, n = 0 to infinity) or as the limit of the sequence (1 + x/n)^n, n -> infinity.

The exponential can also be defined as the unique solution of y' = y with y(0) = 1, implying exp'(x) = exp(x).

You need a lot more mathematical background to prove that this differential equation has a unique solution than you need to define the exponential function by its series representation. Also, the series representation can be generalized more easily (say, to the case of bounded operators acting on possibly infinite-dimensional Banach spaces), and it allows you to verify properties of the (real) exponential function, like its limit behaviour when composed with polynomial functions, in a very simple way. I don't really see the advantage of using your proposed definition, as long as it's not part of a class for, say, engineers that are primarily interested in solving differential equations.

Mind you, I don't want to say that either definition is really more "correct" (at least in the case of real exponential functions); I just think that using the series representation fits more naturally into the way an analysis class is structured. At least where I'm from - maybe it's different if you do several classes of "calculus" before developing an axiomatic theory.

Edit: The more I think about it, the less I like your way of defining the exponential function. It really only works properly in the case of functions defined on at least some open neighbourhood of 0 on the real axis. And at the very least, as soon as you try to generalize to the complex plane, you will inevitably need the series representation to prove uniqueness of the solution anyway, because you will need some form of an identity theorem for holomorphic functions... It just feels like it is really only useful in a setting where the people being taught are interested only in the theory of ordinary differential equations on the set of real functions and their strong solutions.

The background knowledge you need is different indeed. But I learnt both the same year (how to properly solve a differential equation, or at least know that we can, and power series and what goes with them), which is why I tend to put both definitions in the same bag. Although I agree with you that the series expansion definition is more convenient anyway, especially when it comes to complex numbers, but also matrices and other objects.
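The two definitions being compared can be checked against each other numerically (an editor's sketch; the sample point x = 1.5 and the truncation choices are mine): partial sums of the power series and the sequence (1 + x/n)^n both converge to the same value as math.exp.

```python
import math

x = 1.5

# Definition 1: power series, sum of x**n / n! (30 terms is plenty here)
series = sum(x**n / math.factorial(n) for n in range(30))

# Definition 2: limit of (1 + x/n)**n, sampled at a large n
seq = (1 + x / 1_000_000) ** 1_000_000

print(series, seq, math.exp(x))  # all three agree to several decimal places
```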

random statistics question I want to verify. I missed a point for this on my exam. We had to draw a box plot based on some data. They gave us 15 values.

15 20 22 22 24 24 24 28 30 30 34 36 37

so I calculated the quartiles in this way. q2 = median = 24

then remove the median from the data. then find the median of the lower half of the data, and the median of the upper half of the data.

lower half:

15 20 22 22 24 24

median = 22

upper half:

28 30 30 34 36 37

median = 32 (the mean of 34 and 30)

so for q3 I used 32. I got docked a point, they crossed it out and wrote 30.

Did I do something incorrect, or were they wrong to dock me a point? Thanks.

I see 13, not 15 values. Assuming that there are 15 values, using the simplest method, then then surely the q3 would be 4th from the highest value, which is 30. If there are 13 values, then it should be 3rd from the highest which is 34. Using another simple method, with 15 values, then q3 will be the mean from the 4th and 5th from the highest and assuming there are 13 values then q3 will be 4th from the highest which is 30. The other methods are probably too complicated for you and you can't seem to count either 13/15 properly anyways.

Statistics is strange. It isn't a universal maths, so you have to follow whatever methodology is taught. So tell us first the method you are supposed to use.

sorry i said 15 when I meant 13, that doesn't mean I can't count properly lol. what does "then then" mean? clearly you don't understand english, since you made a mistake in what you typed.

anyways, they never told us a method we are supposed to use.

What kind of exam tests something that you have not been taught?

Of course there is supposed to be some transfer, but still, exams are for testing whether you learned the stuff they taught you. If they didn't teach you how to do a box plot, why would they test that? It is not like it is something that you can derive from other knowledge either. You either know how to do it or not.

Yeah, at this point I am just wondering what kind of exam is this? They don't teach you methodology to a statistical analysis, and from the sounds of it you can't ask any fellow colleagues or the examiner and there is no teacher.

Also this is a maths thread, not an English thread. Which is why I don't mention your lack of capitalisation.

On October 05 2017 02:34 travis wrote: random statistics question I want to verify. I missed a point for this on my exam. We had to draw a box plot based on some data. They gave us 15 values.


You computed it correctly. The way you solved it is the conventional way that quartiles are computed, both in AP Statistics in high school as well as in college statistics.

Source: I'm a high school math/ statistics teacher and a college math/ statistics professor. lol.

Furthermore, that 10th data point (the second 30) is only greater than 9/13 of the data, which is 69% of the data and not 75% so... no, it can't be Q3. Your averaged value of 32, on the other hand, is greater than 10/13 of the data, which is 77% (and that's the best average you can get to ensure that you're above 75% of the data, based on there being 13 data points). Your value of 32 does a significantly better job at splitting the data into a lower 75% and a higher 25% than the 30 does.

Thanks plasma! I guess I'll go try to get that point back then.

As for them teaching us certain methodology, they taught us how to make box plot and what quartiles are. The professor then mentioned that there are multiple ways to calculate quartiles. He briefly went over 3 methods, I don't remember which. It seemed like he wasn't too concerned about which method we used.

Your "mistake" was in removing the median from the data. If you include 24 in the upper and lower halves and take the median of both, you'll get 22 and 30. This is basically method 2 from that wikipedia article you posted. The way you did the problem (method 1) is legitimate, but I suppose method 2 is how they want you to compute quartiles in this class.
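Python's standard library happens to implement both conventions, which makes the disagreement easy to reproduce (a sketch; the method names 'exclusive'/'inclusive' are Python's, not the textbook's, but for this data set they yield exactly the two competing answers):

```python
from statistics import quantiles

data = [15, 20, 22, 22, 24, 24, 24, 28, 30, 30, 34, 36, 37]  # the 13 exam values

# Two standard quartile conventions disagree on Q3 for this data:
print(quantiles(data, n=4, method='exclusive'))  # travis's answer: Q3 = 32
print(quantiles(data, n=4, method='inclusive'))  # the grader's answer: Q3 = 30
```

These correspond to R's quantile types 6 and 7, respectively, so neither is "wrong" - they are just different conventions.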

This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive? It seems obvious to me that the opposite of an opposite is the original, yet my uncle insists that the entire world operates on negatives multiplying out to positives because some guy a while ago made it that way. Any help in convincing a crack pot would be appreciated.

On October 05 2017 12:04 bo1b wrote: This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive?

I'm bad at proofs but I'll give it a shot.

Math is consistent. Let's say we have positive integers x and y, and suppose -x * -y were negative. I assume your uncle understands that a positive number times a negative number (say, x * -y) is negative. Then (x + (-x)) * -y = x * -y + -x * -y, which would be the sum of two negative numbers and therefore negative. However, x + (-x) = 0, and 0 times any number is 0 (which is neither negative nor positive). Therefore we have a contradiction, meaning -x * -y cannot be negative.
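The argument above is just arithmetic, so it can be spot-checked mechanically (a trivial editor's sketch over a small grid of positive integers):

```python
# Editor's spot check of the identity used in the proof above:
# (x + (-x)) * (-y) equals x*(-y) + (-x)*(-y) by distributivity, and the
# left-hand side is 0, which forces (-x)*(-y) to cancel the negative
# x*(-y), i.e. to be positive.
for x in range(1, 6):
    for y in range(1, 6):
        assert (x + (-x)) * (-y) == 0
        assert x * (-y) + (-x) * (-y) == (x + (-x)) * (-y)
        assert (-x) * (-y) == x * y > 0

print("identity holds for all sampled pairs")
```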

On October 05 2017 12:04 bo1b wrote: This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive?

I'm not a mathematician, but this 'debt' example made sense to me:

On October 05 2017 12:04 bo1b wrote: This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive?

Your uncle's point of view has some philosophical appeal... After all, are there really negative numbers? Can they be observed in the real world? If they aren't something we can observe in the real world, how can we be sure what properties they really "have"?

The mathematical answer to the problem of existence is constructing them. This is done the following way (warning: This might not be helpful at all, I included it for sake of completeness):

<math> The cleanest way to do so is by taking ordered pairs of natural numbers (n,m) and introducing the equivalence relation (n,m) ~ (n',m') iff there exists a natural number k such that either (n+k, m+k) = (n',m') or (n,m) = (n'+k, m'+k). Then the set of integers can be defined as the set of all equivalence classes and equipped with an addition and multiplication. Identifying (n,m) with the integer solution x to the equation n + x = m is how one usually imagines these integers.

This construction solves the issue of "existence" because once we have established knowledge about natural numbers, this allows us to construct integers using nothing but equivalence classes of natural numbers.

When reading an integer nobody thinks "wow, I guess that's a nice equivalence class of pairs of natural numbers", so after establishing this, one proves that this construct indeed has all the properties one wants from integers and moves on with life. </math>

A more "natural" way is to think of the integers as a model (one that does not necessarily correspond to anything imaginable in the real world) that helps us better formulate and formalize certain problems, like debt. Then your uncle's point reduces to "okay, but someone had to come up with this model!", which is true, and the question of why the product of two negative numbers is a positive number is answered by "because we want the model to behave that way".

Why do we want that? Well, we want to have the laws of distributivity, associativity and commutativity of integers, as well as -1*x = -x... So for two positive integers x, y, their corresponding additive inverses -x, -y have to satisfy -x*-y = (-1)*x*(-1)*y = (-1)*(-1)*x*y = x*y.

So while there is no real universal "truth" about why this is the case, we want to have a set of integers that obeys these laws, so we made them in a way that they do. Arguing that negative times negative does not necessarily have to equal positive boils down to arguing that at least one of these laws should be omitted. It seems reasonable to assume these laws for what we are trying to model, though.

Using the consistency of the results in the lines as an argument that negative * positive is negative, and negative * negative is positive. This is obviously not a complete proof, but a good argument nonetheless.

Also, I would like to mention that this thread is not only for maths students or professors. Anyone that has any problem with math is very welcome here!

And according to this, your uncle is actually kind of right. (-1)*(-1) = 1 is a convention that has been decided by people, but any other possible convention breaks some part of maths that is important to us, because we like maths to be consistent.

This sounds like saying 1+1=2 because somebody decided to make it that way. Whilst in one sense it is true linguistically, in many senses it also isn't true, in that the concept itself is real. That a product of two negatives is a positive is because reality works that way. Whilst someone or a group had to decide it works that way, it was decided it works that way because it indeed can only work that way. It's like the concept of imaginary numbers (the square root of -1). Whilst the number isn't "real" as in you can count it with your fingers, it is mathematically "real" and exists as a "concept". The concept and its applications are absolutely useful in describing the real world, and your computer wouldn't run if people didn't understand imaginary numbers. (Which aren't imaginary at all btw; it's just the name attached, and we ran out of words to describe mathematical constructs.)

Edit: that's a bad example. Take fractions. You cannot count fractions on your fingers, but you can add and subtract them, divide and multiply them, and otherwise use them as if they were whole numbers you can physically count on your fingers, using methods and notations that are fairly easy to work with. Now, it is true that fractions and everything you do with them are the way they are because someone, or a group of someones, or independent, separate groups decided that fractions work and are described that way, but ultimately they decided it works that way because that's just how reality operates. You don't cut a cake in half, put it in the fridge, and find out that it is now 3/4 of a cake. Unless your uncle ate your cake, brought a new one, and ate a quarter of it.

Or take pi. Pi was worked out by physically measuring the circumference of a circle and comparing it with its diameter (or radius). Different groups in the ancient world independently "discovered" pi, and they all decided what pi was (some rather more accurately than others). Now, it is true that pi is what it is because some guy a while ago defined it that way, but in reality it can only be that value (or around that value, depending on how accurate you want to be), because that is simply how reality is.

Of course you have to question your uncle's motive. In one sense he is broadening your mind: to question reality, to inquire about the world around you, to introspect on the nature of truth, to set you upon the path of self-knowledge that cannot be taken away from you. In another sense he sounds like an idiot. Ask him why 1+1=2 and you will find out which is which.

Is your uncle Leopold Kronecker? It is a rather fundamental question whether math is discovered or invented. I do not agree with most of Dangermousecatdog's examples; heck, I don't agree with Leopold Kronecker. Math is like a hammer we invented to cut cheese, in that pi is real, yet in our conventional number system transcendental. There are different ways to set up math without it contradicting itself. Buzzwords are hyperbolic geometry, rational geometry, and Inter-universal Teichmüller theory. I don't know anything about these things, yet in my ignorance I am willing to imagine that two negatives multiplied do not give you a positive in all shapes math can take.

Is cheesecake a pie? Can it be a cake? What if I cut the pie in half with a hammer, put it in a fridge and took half away and multiplied it with another negative half pie? Do I receive a positive quarter pie? I now have 3/4 of a cake?

On October 05 2017 21:28 Dangermousecatdog wrote: Is cheesecake a pie? Can it be a cake? What if I cut the pie in half with a hammer, put it in a fridge and took half away and multiplied it with another negative half pie? Do I receive a positive quarter pie? I now have 3/4 of a cake?

If you multiply two negative half pies, you get a positive quarter square pie.

I'm having a little bit of trouble understanding the steps in the example (mainly because a "negative half pie" isn't an actual thing, and I don't understand why we're comparing cakes to pies), so let me try to come up with an example that helps illustrate the "negative * negative = positive" issue. Obviously, it's pretty hard to come up with a concrete example because the act of negating a value (owing, lowering, lessening, reversing, opposing, etc.) is usually a verb rather than a noun. For that reason, it might not make sense to assume that we can take a negative tangible object and multiply it by another negative tangible object, but we can still describe a process where those negative labels might make sense in context.

- The act of removing*** an object, for example, can be represented as negating that object. If we take the reverse*** of that decision, we're adding an object. Therefore, we have a double negative which serves the same purpose as a positive.

- The opposite*** of a temperature decreasing*** means that a temperature is increasing.

- If I owe*** you money and then the debt becomes reversed***, then I'll be receiving money from you.

***All of these terms can be represented as a negative sign in some context, computationally or conceptually.

Therefore, we may choose to represent these situations as a double-negative. While they may be "simplified" into a positive overall, retaining the double negative may actually help describe the two-part negation context that would otherwise be lost by merely seeing a positive.

The same line of reasoning works whether we're establishing that x - (-y) = x + y or that -x * -y = xy, because once we understand that a double negative nets you a positive, the latter can be rewritten as -1*x*-1*y = (-1)(-1)(xy) = (1)(xy), unless someone has a further issue with multiplication or commutativity.

Multiplying by a positive number represents stretching or shrinking. Multiplying by a negative number represents stretching/shrinking together with a 180° rotation.

So if you multiply by two negative numbers, the two 180° rotations make a 360° rotation = no rotation. You are just left with the stretching/shrinking, which is a positive number.

I don't know if that would be a formal proof per se, but that's what I wrote above (the factoring out of the two -1's) so I definitely think it's a pretty decent justification.

On October 06 2017 00:29 Day_Walker wrote: A geometric argument:

Multiplying by a positive number represents stretching or shrinking. Multiplying by a negative number represents stretching/shrinking together with a 180° rotation.

So if you multiply by two negative numbers, the two 180° rotations make a 360° rotation = no rotation. You are just left with the stretching/shrinking, which is a positive number.

No, that's silly. That's like saying the reason the earth moves the way it does is that shadows on earth have to look the way they do. You have it the other way around: it's a consequence of more fundamental things.

You're better off thinking about multiplying by a constant moving along a line in a x-y coordinate system, and you can stretch at any point on this line (going to infinity on both sides)

On October 05 2017 20:15 Dangermousecatdog wrote: This sounds like saying 1+1=2 because somebody decided to make it that way. Whilst in one sense it is true linguistically, in many senses it also isn't true in that the concept itself is real.

1+1=2 is merely an abstraction with no meaning in the real world. However, if I have 1 apple and then I acquire a 2nd apple, I now have 2 apples. There is the real-world application of 1+1=2.

Also, "pi" isn't really "math"; "pi" is a constant.

On October 06 2017 00:29 Day_Walker wrote: A geometric argument:

Multiplying by a positive number represents stretching or shrinking. Multiplying by a negative number represents stretching/shrinking together with a 180° rotation.

So if you multiply by two negative numbers, the two 180° rotations make a 360° rotation = no rotation. You are just left with the stretching/shrinking, which is a positive number.

No, that's silly. That's like saying the reason the earth moves the way it does is that shadows on earth have to look the way they do. You have it the other way around: it's a consequence of more fundamental things.

Right, in this case we get to choose how the earth moves, and we can make our choice based on the shadows we want to see. The same thing is actually going on in the algebraic proof that travis gave: we want certain properties (shadows) like distributivity, and for those properties to hold we see that -1 * -1 must be 1 (the earth must move a certain way).

On October 06 2017 09:57 FiWiFaKi wrote: You're better off thinking about multiplying by a constant moving along a line in a x-y coordinate system, and you can stretch at any point on this line (going to infinity on both sides)

I'm not sure I understand you here. Are you looking at the graph y = kx? What do you mean by stretching at a point?

The reason I like thinking about rotations is that it fits with arithmetic in C. For example, if we view -1 as a 180° rotation, it makes sense that -1 should have two square roots: a 90° rotation clockwise, and a 90° rotation counterclockwise.
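The rotation picture is easy to play with using complex numbers. A quick sketch in Python; the specific values are just for illustration:

```python
import cmath

z = 3 + 2j

# Multiplying by -1 is a 180° rotation about the origin.
print(z * -1)            # (-3-2j)

# Two 180° rotations bring you back to where you started.
print(z * -1 * -1)       # (3+2j)

# And -1 has two square roots: the two 90° rotations.
r90 = cmath.exp(1j * cmath.pi / 2)    # 90° counterclockwise
print(round((r90 ** 2).real, 10))     # -1.0
print((1j) ** 2)                      # (-1+0j)
```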

I think his argument is along the lines that pi is only a number. Saying a number is maths is slightly weird, but not entirely. I would say that "3 is maths" doesn't sound quite right. Obviously numbers are a part of maths, and if you get into the definition of the natural numbers via the Peano axioms, it gets quite mathy. But I still don't think "3 is maths" is a very sound statement.

Basically the same goes for pi. Pi is something you use in geometry, and geometry is maths. But pi itself is just a number.

Granted, the whole rant isn't very exact and is more feelings-based, but I can understand why one would say that pi isn't maths.

Pi itself is not just a number. All constants are mathematically interesting; all constants are "maths". It's like saying e is just a number. Pi is a number with meaning, one of the foundations of trigonometry and geometry, and a number that has to be calculated algorithmically. It is also a good example of something that at face value is the way it is because some guy made it that way, but if you look into it deeper, it is both true and not true at the same time, just like multiplying two negatives to make a positive.

On October 05 2017 12:04 bo1b wrote: This is probably below everyone's pay grade, but is there a proof that two negatives multiplying make a positive? It seems obvious to me that the opposite of an opposite is the original, yet my uncle insists that the entire world operates on negatives multiplying out to positives because some guy a while ago made it that way. Any help in convincing a crack pot would be appreciated.

A mathematical proof only works if you have defined your axioms. As was mentioned before, a completely rigorous proof would involve the construction of the integers and the respective definition of the order relation (starting from the axioms of some set theory), which would be overkill right now. Instead we take the following rules (and the typical rules of calculating) as axioms themselves, which I doubt your uncle would take issue with. For all integers (or rational numbers, or real numbers) a, b, c it holds that:

A0: exactly one of a<b, a=b, or b<a holds
A1: (a<b and b<c) implies a<c
A2: a<b implies a+c<b+c
A3: (a<b and 0<c) implies ac<bc

Statement 2. (c<0 and a<b) implies bc<ac. Proof: c<0 implies 0<-c by Statement 1 (add -c to both sides via A2). It then follows from A3 that -ac<-bc. Now add ac to both sides (via A2) to arrive at 0<ac-bc. Finally, add bc to both sides (again via A2) to arrive at bc<ac.

Corollary (a<0 and c<0) implies 0=0*c<ac.

Another nice corollary is 0<1: Assume this is wrong, i.e. 1<0 (1=0 is excluded). Then a<b implies b = 1*b < 1*a = a by Statement 2, which contradicts a<b by A0.

[EDIT]: I just realized that it's nicer for your purpose if you substitute A3 with A3': (0<a and 0<c) implies 0<ac.

It shouldn't be hard to convince anyone that the product of two positives remains positive. From A3' you get A3 in the following way: a<b implies (by A2) 0<b-a, which together with 0<c implies (by A3') 0<(b-a)c = bc-ac, from which it follows that ac<bc (again by A2).
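For what it's worth, the identity itself is a one-liner for a proof assistant. A sketch assuming Lean 4 with Mathlib, where the `ring` tactic discharges the goal from the ring axioms alone:

```lean
import Mathlib.Tactic

-- (-1) * (-1) = 1 is forced by the ring axioms
example : (-1 : ℤ) * (-1) = 1 := by ring

-- and more generally, so is (-a) * (-b) = a * b
example (a b : ℤ) : (-a) * (-b) = a * b := by ring
```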

Pi and its relation to a perfect circle illustrate neatly why it's only tangentially related (yet still very useful) to our understanding and applications of nature. You can't create a perfect circle in reality; you'd need infinite precision for that (infinitesimally small sizes), which is reflected neatly in the constant. I guess a similar story could be told for e, but I haven't thought about it nearly enough, and its uses are much more esoteric to me, so no claims about that one!

I'm not trying to diminish the beauty of math, by the way, just trying to argue how man-made (or how conceptual) maths actually is. It's more of an abstract thing than a real thing, imo, and that in itself is awesome if you think about it.

Random linguistic question, after reading everyone's preferences:

In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and corresponds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks

On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:

In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks

i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural

On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:

In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks

i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural

Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.

On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:

In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks

i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural

Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.

That's why he wrote it sounds plural, not it is

Also, proving fundamentals without defining axioms, especially when using derived concepts which are often defined using those fundamentals, seems pointless.

On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:

In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks

i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural

Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.

thanks for the english lesson lol, now how about you take some comprehension lessons

On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:

In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks

i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural

Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.

That's why he wrote it sounds plural, not it is

Also proving fundamentals without defining axioms,especially when using derived concepts which are often defined using said fundamentals seems pointless.

Right, but that's an inconsistency in what ahswtini said, not in what I said. He said that since mathematics "sounds plural", he thinks that the abbreviation math should "be plural". Sounding plural doesn't mean something is plural, and if an abbreviation is supposed to represent the same thing, then shouldn't the abbreviation be the same singular or plural type? It matters for subject/verb agreement, etc.

On October 06 2017 23:27 Uldridge wrote: I literally don't know what it is, so I'm always confused about how to write it. I guess that's why I write both math and maths lol

That's fair. I think consistency is probably better.

On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:

In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks

i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural

Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.

thanks for the english lesson lol, now how about you take some comprehension lessons

Not sure why you're so offended, but your response doesn't help clarify anything.

The OP says "So what is welcome here? Anything math related! Random questions, theory, homework discussion, anything." This is definitely a related random question, although apparently it's upsetting you >.> I didn't mean to offend you; I was just curious as to conventional spelling preferences.

If the pair of you can just stop with the little love spat: the maths/math spelling and spoken difference probably occurs for the same reason as Aluminium and Aluminum...

...Americans are too lazy to speak English properly.

On October 07 2017 02:08 Dangermousecatdog wrote: If the pair of you can just stop with the little love spat, maths and math, the spelling and spoken difference probably occurs for the same reason as Aluminium and Aluminim...

...Americans are too lazy to speak English properly.

Aluminum*

But seriously, it was an honest, math-related question. Other people responded with actual replies; I'm not sure why ahswtini took it so personally. People are more than welcome to move on to a different question, and I don't think anyone is going to be trying to spell-check everyone else who chooses to spell it "math" or "maths"

On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:

In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks

i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural

Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.

It may not be plural in English, but it is in Spanish: it's las matemáticas, not la. So given that Spanish is etymologically closer to the common Latin root "ars matematica" (and the Romans simply stole the Greek word for learning there), treating it as plural if it has the s is probably best.

However, English in general has a weird fetish for replacing the original a with an s (while keeping the word singular): mechanics, electronics, analytics, physics... all originally Greek words ending in an a sound (especially after passing through Latin).

Finally, your example is wrong if it's meant to prove mathematics is not plural (not saying it is; English is a fucking weird bastardization of a language). "My favorite topic is birds/trees/cars": these are all plural, but it's not "my favorite topics are ..." in a sentence like this. The subject works in the same way.

On October 06 2017 21:43 DarkPlasmaBall wrote: Random linguistic question, after reading everyone's preferences:

In the United States, we typically abbreviate "mathematics" as "math", but I've noticed that people from many other countries prefer "maths", often times as a singular noun. I can understand if someone says something like "The different maths you might explore in high school are algebra, geometry, trigonometry, and calculus" - implying that maths is plural and responds to multiple branches of mathematics - but the idea that "pi is maths" or "algebra is maths" or "this is an example of maths" as a singular noun is foreign to me. I was just wondering if there was any additional nuance as to why some people prefer "maths" over "math" when referring to a singular entity. Thanks

i think it's just because the full word is 'mathematics' which sounds plural, and so the abbreviation 'math' should naturally also be plural

Oh... but just because a word ends in "s" doesn't mean it's plural lol. Mathematics is almost always a singular noun, not plural... mathematics is the study of many topics, but there's no such word as "mathematic". My favorite subject is mathematics, not my favorite subjects are mathematics. Oh well.

It may not be plural in English, but it is in Spanish. It's las matemáticas, not la. So given that Spanish is closer etymologically to the common Latin root "ars matematica" (and the Romans simply stole the Greek word for learning there), treating it as a plural if it has the s is probably best.

However, English in general has a weird fetish of replacing the original a with an s (and keeping it as a singular word): mechanics, electronics, analytics, physics.. all from originally Greek words ending in an a sound (especially after passing through Latin).

Finally, your example is wrong if trying to prove mathematics is not plural (not saying it is: English is a fucking weird bastardization of a language). "My favorite topic is birds/trees/cars". These are all plural, but it's not "my favorite topics are ..." In a sentence like this. Subject works in the same way..

English is definitely an inconsistent language lol, and those other Greek words are good examples too. And I didn't know about the relationship to Spanish; thanks for sharing that!

Depending on the dictionary (which just makes the issue even less consistent), it seems that "mathematics" can be seen as technically singular, yet function as either singular or plural: http://www.thefreedictionary.com/mathematics

So I guess the bottom line is: "Math is ___" and "Maths is ___" are both acceptable conventions based on location (and the English language is oftentimes silly and inconsistent). Understood. Thanks, everyone, for responding

I am not a mathematician. I am reading Matt Parker's "Things to Make and Do in the Fourth Dimension" and he just mentions that. I looked it up on Wikipedia and they just mention it too, and I wonder how to show it. Intuitively I would have guessed there are infinitely many.

edit: Oh, I got it. Fun question though, don't you think?

I get why there would be a finite number. There are a finite number of 10-digit polydivisible numbers, and each one can create either 0 or 1 polydivisible numbers with the same starting digits. Repeat, and the total number of polydivisible numbers with n+1 digits is fewer than or equal to the number of polydivisible numbers with n digits (n≥10). I'm not sure how to show this eventually reaches 0, though.

On October 09 2017 03:51 The_Templar wrote: I get why there would be a finite number. There are a finite number of 10-digit polydivisible numbers, and each one can create either 0 or 1 polydivisible numbers with the same starting digits. Repeat, and the total number of polydivisible numbers with n+1 digits is fewer than or equal to the number of polydivisible numbers with n digits (n≥10). I'm not sure how to show this eventually reaches 0, though.

Presumably you show it with some kind of inductive proof over the number of digits. However, that just shows there is a finite number. Figuring out that it is 20,456 takes further work. I don't know much about number theory, though, so count me out.

I'd guess that the simplest way to figure out how many there are is to just find them all, because it is very easy to prove that you have found the last one. (Take all of the polydivisible numbers with n digits, append 0-9 as digit n+1, and check whether the result is divisible by n+1. If nothing works, there are no more of them ever afterwards.)

With computers, it shouldn't be hard to just try numbers until they stop working.
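Following that suggestion, here's a brute-force sketch in plain Python (base 10 assumed): extend every polydivisible prefix by one digit, keep only extensions divisible by the new length, and stop when a level comes up empty.

```python
def polydivisible_numbers():
    """Return all base-10 polydivisible numbers by extending prefixes."""
    level = list(range(1, 10))   # the 1-digit polydivisible numbers
    found = list(level)
    n = 1
    while level:
        n += 1
        # Keep only the one-digit extensions divisible by the new length.
        level = [10 * x + d for x in level for d in range(10)
                 if (10 * x + d) % n == 0]
        found.extend(level)
    return found

nums = polydivisible_numbers()
print(len(nums))     # 20456, matching the count quoted above
print(max(nums))     # the largest one has 25 digits
```

The loop terminates on its own, which is itself the proof of finiteness: once one level is empty, no longer polydivisible number can exist.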

The problem was to find the distance from y to the plane in R^3 spanned by u_1 and u_2, where y = [5; -9; 5], u_1 = [-3; -5; 1] and u_2 = [-3; 2; 1].

Can someone explain why step 5/6 cares about squaring the distance? Why does that matter? They are just squaring the distance and then immediately taking the square root again.

Isn't the entire point to find the length of y - yhat? My book's examples do the same thing. I feel like there must be a point, but I don't understand what it is.

Squaring and then square rooting is typically how you find the absolute value of a number, so that negative differences and positive differences are still represented in the final summation, rather than them cancelling out ahead of time. That's typical in distance formula problems and other kinds of problems where you need to add up variations, like with standard deviations in statistics.

But the length of a vector, ||v||, means precisely taking v*v and then square rooting it, right? So the length will always be positive, and squaring ||v|| and then square rooting it is redundant. (Well, in R^n, right?)

In R^n you are right: this is redundant; ||x-y|| is sufficient. If I had to guess why it is written this way: in other applications the 2-norm (Euclidean norm) is used in combination with a (...)^2 to cancel out the square root, which may be where the author got the notation from. Obviously, to get the distance you then have to re-add the root.

You calculate ||x||² first because that just means not taking the square root when computing the norm. You don't compute ||x||, then square it, then take the root; that would indeed be silly.

They calculate ||x||², which just means taking the squares of all components and adding them up (without the root), and then in the next step they take the sqrt of that.

There is no deeper mathematical reasoning behind this; it is just a different way of writing down what you would already do, namely squaring all components, adding them up, and then taking the square root. So they don't calculate sqrt(v*v), then square it, then sqrt it again. That would be silly. They calculate v*v = ||v||², and then sqrt it to get the value of the distance.

The reason this style of writing is chosen is probably that in a lot of cases you don't actually need to take the square root at the end. An example would be sphere equations in R³. Sure, you could take the root on both sides of the equation, but that just makes the equation uglier and more annoying to deal with.
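To sanity-check the numbers in that problem, here's a sketch assuming numpy. It relies on u_1 and u_2 happening to be orthogonal here, so the projection splits into two independent components:

```python
import numpy as np

y  = np.array([5.0, -9.0, 5.0])
u1 = np.array([-3.0, -5.0, 1.0])
u2 = np.array([-3.0, 2.0, 1.0])

assert u1 @ u2 == 0  # orthogonal basis, so we can project componentwise

# Orthogonal projection of y onto span{u1, u2}
yhat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2

diff = y - yhat
dist_sq = diff @ diff        # ||y - yhat||^2: no square root yet
dist = np.sqrt(dist_sq)      # the root is re-added only at the end
print(yhat)                  # [ 3. -9. -1.]
print(dist_sq)               # 40.0, so the distance is 2*sqrt(10)
```

Note the same pattern the book uses: the squared distance is computed first, and the square root is applied only in the final step.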

That's almost how I first wrote it down, but not quite, and I'm still not reasonably sure I understand why that's the case. It looks like you've set it up as a calculation of a weighted percent increase, but I don't understand why that works.

5000 games came to 49% success, so 0.49 * 5000 = 2450 successes occurred. In your case, 500 games came to 44% success, which is 0.44 * 500 = 220 successes. That means that 2450-220 = 2230 successes came from other people in 5000 - 500 = 4500 games. The success rate of that is 2230 / 4500 = 49.556%.

Pretty easy. You know your own win percentage and how many games that was. You also know that those games are a subset of a bigger set, for which you also know the win percentage. You want to know the win percentage of the bigger set without your own games. Or, in equation form:

THANK YOU. i had been doing exactly that but couldn’t make the logical leap to that one last step.

On October 19 2017 23:41 Acrofales wrote: Pretty easy. You know your own win percentage and how many games that was. You also know that those games are a subset of a bigger set, for which you also know the win percentage. You want to know the win percentage of the bigger set without your own games. Or, in equation form:

4500*x + 500*.44 = 5000*.49

Solving for x gives you the earlier equation.
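As a quick check, the equation can be solved in a few lines (numbers taken from the example above):

```python
# Solve 4500*x + 500*0.44 = 5000*0.49 for x, the win rate of everyone else
my_games, my_rate = 500, 0.44
total_games, total_rate = 5000, 0.49

other_games = total_games - my_games                              # 4500
other_successes = total_games * total_rate - my_games * my_rate   # 2450 - 220
x = other_successes / other_games
print(round(100 * x, 2))  # 49.56
```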

yeah, this is exactly how i started except for the 5000. i was missing the realization TT walked me to in order to figure out my problem.

I have a problem in my linear algebra class, I don't understand how to solve it. I find it very confusing.

The question says to give an example of a compact set A and a closed set B in R^2 such that ((conv A) intersection (conv B)) is empty, but A and B cannot be strictly separated by a hyperplane.

Eventually I just looked for an answer online and I found this:

I don't understand why conv A would be the segment [-3, 3]. Wouldn't it be the segment (-10, 10) ? And isn't A in this example not even closed?

Okay, so maybe someone can explain my misunderstanding there. But I guess, if I am correct, I could use -10 < x <10 from his example. And if I do that, then I think this answer works, there should be no hyperplane that strictly separates A and B, and (conv A intersection conv B) should be empty.

Sorry, the page doesn't display properly for me, so I can't help you out with this concrete example. But the point of this problem seems to be that conv(B) of an unbounded closed set B is not necessarily closed. If, for such a B, you take as A any point on the boundary of conv(B) (that is not in the interior of conv(B)), you get your solution.

Example: B:={(x,|1/x|):x not 0}, A:={(0,0)}. As a graph B is closed and never touches the x-axis but conv(B) contains lines parallel to the x-axis that get arbitrarily close.

[EDIT]: Finally I was able to open the page. Funnily enough, the example constructed there is basically the same as mine (it is the simplest one I can think of). It appears [-3,3] is a typo; it should be [-10,10] like you said, but you must define -10 <= x <= 10, otherwise the set is not closed. Or, simpler still, just take my example where A is a single point on the x-axis.
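The example above can be made concrete numerically: the midpoints of (t, 1/t) and (-t, 1/t), both in B, lie in conv(B) and approach A = {(0, 0)} without ever reaching it, which is exactly why no hyperplane strictly separates the two sets:

```python
# Midpoints of (t, 1/t) and (-t, 1/t), both in B = {(x, |1/x|) : x != 0},
# lie in conv(B) and approach A = {(0, 0)} without ever reaching it.
for t in [1, 10, 100, 1000]:
    p, q = (t, 1 / t), (-t, 1 / t)
    mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    print(mid)  # (0.0, 1/t): arbitrarily close to the origin, never equal
```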

Hey everyone - there's this one problem I can't seem to figure out.

Here's what we are given:

Grad f(2, 5) = (1, 3)

h(x,y) = f(2y+x^2, 2x-y)

We are asked to find the partial of h with respect to x

My thoughts/confusion:

Isn't grad of a function supposed to produce a vector-valued function? Notation-wise, I am perplexed as to which x we are meant to take the partial with respect to.

I am really at a loss - either I just really don't get this or there's something about the notation that isn't clear.

If any of you could help me see the light, that'd be greatly appreciated.

On November 16 2017 19:40 amyamyamy wrote: Hey everyone - there's this one problem I can't seem to figure out.

Hello, amyamyamy.

You're correct: grad of a function gives a vector-valued function. The first given equation we have is called a vector equation which gives rise to a system of equations. I'll show you. We can rewrite the left side as a vector (the definition of gradient) evaluated at x = 2 and y = 5.

Distribute the evaluate symbols to each of the components.

The fact that the vectors are equal means each component must be equal. This gives us the system of equations I mentioned earlier.

We'll make use of this system in a bit. For now we can start to look at the second given equation.

We don't just have x or y in the arguments of f but actually functions of x and y, so we make substitutions. We can use whatever variables we like for them. I'll use a and b.

So we have

Now differentiate both sides partially with respect to x. We can use the chain rule for multivariable functions to write dh/dx in terms of the new variables, a and b, as shown below. Substitute the derivatives of a(x,y) and b(x,y).

Setting x = 2 and y = 5, we can use the first equation of the system.

We have here one equation in two unknowns, df/da and df/db. To obtain a second equation for them, we consider dh/dy. Use the chain rule again to write it in terms of the new variables, a and b.

Setting x = 2 and y = 5, we can use the second equation of the system. (x and y aren't present, so it doesn't change anything.)

Solve this system of equations for df/da and df/db with substitution or elimination, whichever you prefer.

Plugging df/da and df/db into the formula for dh/dx, we get

Therefore,

I hope this helped you out. In order to avoid having to think about what the "real" x is, just make a substitution like I did, and you should be fine.
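The chain-rule step above can be sanity-checked numerically. The f below is a throwaway choice (not the problem's unknown f), used only to confirm that dh/dx = f_a * 2x + f_b * 2 for the inner functions a = 2y + x^2 and b = 2x - y:

```python
import math

# Throwaway test function f, chosen only to exercise the chain rule
def f(a, b):
    return a * a * math.sin(b) + 3 * b

def f_a(a, b):            # partial of f with respect to its first argument
    return 2 * a * math.sin(b)

def f_b(a, b):            # partial of f with respect to its second argument
    return a * a * math.cos(b) + 3

def h(x, y):
    # Inner functions from the problem: a = 2y + x^2, b = 2x - y
    return f(2 * y + x * x, 2 * x - y)

def dh_dx_chain(x, y):
    # Chain rule: dh/dx = f_a * da/dx + f_b * db/dx = f_a * 2x + f_b * 2
    a, b = 2 * y + x * x, 2 * x - y
    return f_a(a, b) * 2 * x + f_b(a, b) * 2

def dh_dx_numeric(x, y, eps=1e-6):
    # Central finite difference, as an independent check
    return (h(x + eps, y) - h(x - eps, y)) / (2 * eps)

print(dh_dx_chain(2.0, 5.0), dh_dx_numeric(2.0, 5.0))  # the two agree closely
```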

Hello! I got an idea: let P be a polygon, and let SP denote the polygon whose sides connect the centers of P's neighbouring sides. For each integer n, find a construction of P such that S^i P is not convex for all i from 0 to n, and S^{n+1} P is convex.

Perhaps you've discussed that already so if you have I apologize.

There's a frog riddle going around where you have two frogs that are equally likely to be male or female, and you're trying to find out the chance that at least one of them is female. You have 75% (MM, FM, MF, FF). You now learn that one of them is male and you don't know which one, so the riddle says that it becomes 66% likely (MM, FM, MF)

Now if you knew that the first one was male instead of either one, the chance would be 50% (MM, MF).

The initial riddle is flawed because you find out the information that one is male because males croak and you heard a croak, so you know one is male; however it doesn't take into account that if only males croak, you're more likely to hear a croak from a set of MM than a set of MF or FM, because both of them could have croaked at that moment rather than just one.

I want to discard that and ask: shouldn't the information that we are aware one of the frogs is a male be introduced in the equation? Something like "sure male" (sM); then we would get (sMM, MsM, sMF, FsM) and a 50% chance?

If you have a detector that can only detect if there is a croak, but not the volume of croak, and if male frogs certainly croak, the riddle works.

Otherwise, you have to do pretty complicated stuff with regard to calculating the probable amount of male frogs based on the amount of croaks that you detect, and the average amount of croaks per second a male frog emits. If you want to solve something like that, more information is needed.

You can use a different version of the riddle and just ask a woman, of whom you know that she has two children, whether she has a son, avoiding the croaky problem.

On November 29 2017 03:17 Nebuchad wrote: I want to discard that and ask: shouldn't the information that we are aware one of the frogs is a male be introduced in the equation? Something like "sure male" (sM); then we would get (sMM, MsM, sMF, FsM) and a 50% chance?

sMM, MsM, sMF, FsM : yes

50% : no, both sMM and MsM map to the MM category that had a 25% initial probability and 33% probability knowing that at least one frog is male (so sMM 16.5, MsM 16.5, sMF 33 and FsM 33).

That one frog is male is introduced into the equation - the total number of outcomes has been reduced based on that information. I'll try to explain clearly by first explaining a bit about how probability works. When the options are random (i.e. all equally likely), the probability of something being true is (number of outcomes where the thing is true) / (total number of outcomes). This is how you get 3/4 = 75% for the first one: there are 4 outcomes in total and 3 have at least one female. If you are told that at least one is male then you have only 3 total outcomes, of which 2 include a female. Hence the probability is 2/3.

It is called conditional probability. The general formula is P(A|B) = P(A and B) / P(B), where P(A|B) means the probability of A being true given that we know B is true. Justification:

If you are thinking about random outcomes then the formula is easy to show. If you let N be the total number of outcomes and N(A) be the number of outcomes for which A is true, then P(A) = N(A) / N and P(A|B) = N(A and B) / N(B) = ( N(A and B) / N ) / ( N(B) / N ) = P(A and B) / P(B).

In general we define conditional probability this way.

So in this case you get P(one female given at least one male) = P(one female and one male) / P(at least one male) = (2/4) / (3/4) = 2/3

I am not entirely certain how you have become confused with the 'sure male' thing, but I think you have overlooked that the four outcomes are no longer equally likely, because the sexes of the two frogs are no longer independent: if the first frog is female then the second frog must be male. Hence you cannot do probability in the same way.

An alternative way of looking at this is to say that if only one frog is male (FM or MF) then that male frog must be the sure male but if both are male then there is equal probability that it is the first or second, hence P(FsM) = P(sMF) = 2P(sMM) = 2P(MsM) Also knowing that P(sMM) + P(MsM) + P(FsM) + P(sMF) = 1 gives you P(FsM) = P(sMF) = 1/3 and P(sMM) = P(MsM) = 1/6.
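A quick Monte Carlo run backs up the 2/3 (assuming each frog is independently male or female with probability 1/2, and conditioning on "at least one male"):

```python
import random

random.seed(0)
at_least_one_male = one_female = 0
for _ in range(200_000):
    frogs = [random.choice("MF") for _ in range(2)]
    if "M" in frogs:                 # condition: at least one male
        at_least_one_male += 1
        if "F" in frogs:             # event: at least one female
            one_female += 1

print(one_female / at_least_one_male)  # close to 2/3
```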

On November 29 2017 04:56 Melliflue wrote: That one frog is male is introduced into the equation - the total number of outcomes has been reduced based on that information. I'll try to explain clearly by first explaining a bit about how probability works. When the options are random (ie all equally likely)

Important to mention here: Random and equally likely are not the same. If i roll a D6, and look at whether i have rolled a 1 or not, the result is obviously random, and the options are also obviously not equally likely (1/6 yes, 5/6 no). The Laplace assumption is tempting, but one needs to think for a few seconds about whether it actually applies to the current situation, and if yes, to which basic events it applies.

Other than that, your analysis of the situation seems to be good.

This comes down to your definition of "random". The definition of "random" that I'd seen used in probability theory is "all outcomes are equally likely". So the outcome of 1 D6 is random (6 outcomes, all equally likely) but the sum of 2 D6 is not (11 outcomes, probability skews towards the middle) although the full set of 36 outcomes is random. A D6 always has 6 outcomes even if you are grouping multiple outcomes together. You cannot restrict a D6 to two outcomes (1 and not 1). If you had a cube with a "1" on one side and "not 1" on five sides then it is not random in that strict sense.

The fuzzier definition of "random" as "unpredictable" is not mathematically useful, at least in probability theory. Probability theory only makes sense if outcomes are unpredictable.

That is a weird definition of random which i have never heard before. Might have to do with language, maybe "random" doesn't 100% map onto "Zufall", but i don't really think that that is the case. I just looked up the first two books on probability theory available as ebooks from my university library (Klenke, Bhattacharya), and neither of them seemed to use that definition of random. (They don't actually define random at all, because that isn't necessary in probability theory: you are just measuring subsets of a full event space with a probability distribution function that has a few basic properties, making it behave the way you expect probabilities to behave, based on the Kolmogorov axioms extended to fit larger event spaces.)

What you call random i have always heard titled "Laplace assumption" in my probability classes. I don't think you are correct.

Would you mind specifying the level of the course where you heard this definition? Maybe it is a language problem, but this looks very weird to me. From my impression (I've taken courses in probability theory up to the Black-Scholes formula, though this was a few years back), the word "random" is almost exclusively used in conjunction with other terms in a precise mathematical context, such as "random variable". What you call "random" is a specific property which a probability measure may or may not have.

edit: Well I should read all new posts before posting.

I was trying to avoid making it too complicated, and this may be a language issue. The word 'random' has a strict and a loose meaning in English. The looser version means that there is some element of chance involved, it wasn't picked. In expressions such as "random variable" it is being used in this looser sense.

The stricter definition is that all possible results are equally likely (if in discrete probability) or has a uniform distribution (if in continuous probability). This is what is meant when talking about a random die, a random sample, or a random number generator.

I doubt either definition appears in a textbook, particularly not at undergrad level or above. You would be better off looking in a normal English dictionary. Both definitions are used inside and outside mathematics, although in my experience, at uni level the stricter version is less common.

While "Zufall" does map onto "random" for some things it is far from perfect. I did not take a course on probability while in Germany so I am not certain on all the German terms in probability, but in conversational usage I would not translate "Zufall" or "zufaellig" as "random". A better translation of "Zufall" would be "accident" or "coincidence" or "fluke".

I got annoyed by Simberto's use of "obviously". It is a pet peeve of mine when people use "obviously" (or "clearly" or any other such word) when writing about maths. It adds nothing but can be condescending. If the reader understands then they don't need to be told it is obvious, and if they don't understand then telling them it is obvious will not help them to understand.

In my experience, "obviously" in a math lecture means "I don't want to prove this, do it yourself if you don't see why it is true". And i can see why it would annoy you; in fact it is a pretty common joke among the students of the maths department at my university to randomly use "obviously" for things which very clearly are not obvious (or to try to get away with not proving something that actually should be proven by calling it obvious), to mock the professors who use it in the way you mentioned above.

I didn't intend to use it in that way, though, i wanted to use the more common meaning of "I don't think anyone would disagree"

And i think that sometimes telling someone that something is obvious may help with understanding: it means that you have to look for a simpler explanation of why it is true. As soon as you need too many steps to figure it out, you are probably not on the right track when looking for why something obvious is obvious. In most cases, though, especially in textbooks, it is infuriating. It usually takes me 20 minutes to understand the actual proof, and then another two hours to figure out why something which is called obvious is in fact obvious. And then i notice that instead of just writing that it is obvious, the author could have used the same number of words explaining the thing, and i get pissed off.

On December 06 2017 09:43 travis wrote: This is probably way beyond anything I should be asking, but,

If I have N points on a cartesian plane, is there a way to construct a function that describes either:

1.) all the lines between all the points

or

2.) the polygons that can be created by the points (I guess this would require more than one function....)

thanks.

I think you should specify that question. Are you looking for an object in the shape of

f: {subsets of R^2 with N elements} -> R^2
(a1, ..., aN) |-> (all points that are contained in the lines between a1, ..., aN)

Because at least in theory, this is rather easy. You can define f((a1,....,aN)) by f((a1,....,aN)) ={x in R^2| there are ai, aj, m in [0,1] such that x= ai +m*(aj-ai)}

Hope this can be understood. Haven't learned how to embed TeX here.
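Mafe's definition can be turned into a direct membership test: x lies in f((a1, ..., aN)) iff x = ai + m*(aj - ai) for some pair ai, aj and some m in [0,1]. A sketch (the eps tolerance is my own addition, for floating point):

```python
from itertools import combinations

def on_union_of_segments(x, points, eps=1e-9):
    # True iff x = a_i + m*(a_j - a_i) for some pair a_i, a_j and m in [0, 1]
    for (px, py), (qx, qy) in combinations(points, 2):
        dx, dy = qx - px, qy - py
        # Collinearity: cross product of (q - p) and (x - p) must vanish
        if abs(dx * (x[1] - py) - dy * (x[0] - px)) > eps:
            continue
        denom = dx * dx + dy * dy
        if denom == 0:            # degenerate pair (coincident points)
            continue
        # m is the projection of (x - p) onto the segment direction
        m = ((x[0] - px) * dx + (x[1] - py) * dy) / denom
        if -eps <= m <= 1 + eps:
            return True
    return False
```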

The question is what should map onto what with that function. If you have your points in R², and want to have a function that maps x coordinates onto the y coordinates of the lines/polygons that you have described, that is generally not possible. The reason for that is that a function of R-->R always maps one x onto exactly one y, but the lines you describe will in most cases have more than one point with the same x coordinates, but different y coordinates.

If you go more abstract, and first define a space of all possible lines in R² (for example by describing a line via its starting and ending points, making a line basically an object of R^2 x R^2), let's call it A,

and then map a set of N points in R^2 (which is an object in (R^2)^N) onto an appropriate overspace of A, you will probably need something like A^((N-1)*N/2).

(N-1)*N/2 is the number of lines between N points here, if i am not mistaken.

You can create a function f: R^2N ---> A^((N-1)*N/2) which maps (x1, y1) x ... x (xN, yN) onto Line(x1,y1,x2,y2) x Line(x1,y1,x3,y3) x ... x Line(xN-1,yN-1,xN,yN)

The same should be possible with polygons.

Thinking about this, this is probably way too stupid, and you should go with the above described map of R^2 ---> R^2, which also works.

Indeed, I should've put P(R^2) (the set of all subsets of R^2) as codomain instead of R^2 itself. I'm a bit in a hurry how, will comment later when I have more time.

Not quite sure what Simberto is on about. The input is clearly not R^2, but rather R^4 (x1, y1, x2, y2) for two points, R^6 for 3 points, etc., and the output are "characterizations of lines (or polygons)". Exactly what that's supposed to be beyond a set of points in R^2 I don't know, but let's be less restrictive and map them to equations; then we can do something like:

F(x1, y1, x2, y2) = [(y-y1)(x2-x1)=(y2-y1)(x-x1)]

if you can guarantee x1 and x2 are unequal you can rewrite this as a standard linear function, but I will leave it like this for now.

This is a general function that, for any two points, will output a linear equation characterizing the line that goes through them. Now for N points, you have N(N-1)/2 different pairs of points, and thus you will get N(N-1)/2 different equations as your output. I'm not quite sure what the exercise is supposed to be, but let's roll with it.

For the rest of the question, I am just as confused as ^^ those guys. You can of course, create an algorithm to find all possible triangles, quadrilaterals, etc. etc. that can be constructed for a set of N points. I'm not sure what precisely the time complexity is, but something like O(n^(n/2 +1)) is a good starting estimate: this is superexponential. So you should probably rethink why you need all possible polygons that can be drawn using N points in a plane.

As for the function representing this construction? It'd be some monstrosity of the kind f: R^2N -> P(R^2N)
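The line-per-pair part can be sketched directly; the (a, b, c) representation with a*x + b*y = c is my own choice here, since it also covers the vertical-line case where x1 = x2:

```python
from itertools import combinations

def line_through(p, q):
    # Coefficients (a, b, c) with a*x + b*y = c, obtained by rearranging
    # (y2 - y1)*(x - x1) = (x2 - x1)*(y - y1)
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    return (a, b, a * x1 + b * y1)

def all_lines(points):
    # One equation per unordered pair of points: N*(N-1)/2 of them
    return [line_through(p, q) for p, q in combinations(points, 2)]
```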

On December 06 2017 20:50 Acrofales wrote: Not quite sure what Simberto is on about.

I wanted to basically say the same thing as you. You can't solve this as the graph of a function R-->R

You can define a function as Mafe did, which maps from P(R²)--->P(R²).

Or you can define a function as i did, from R^2N ---> A^((N-1)N/2). A problem i just noticed is that in this case, the order of the points in space and the order of the resulting lines are both relevant, which they don't actually need to be. Mafe's solution is definitely more elegant.

I should note to my defense that i am currently sick and my brain feels muddy.

Sorry for not replying. You guys pretty much covered what I needed, and I moved on.

I do have a new question, you guys might find interesting.

Let's say I have a set of n numbers. I have a target, which is the sum of all n numbers. Now let's say I create a new set with every k-combination of numbers. So, (n choose k) combinations. In this instance, let's say I choose 4. So I have (n choose 4) combinations made up of my n numbers.

How many combinations of my (n choose 4) combinations can equal my target sum?

Is there a general formula or easy way to calculate how many combinations can equal the target sum, for any (n choose k) combinations?

Thank you!

edit: I should be more specific. I am asking about how many permutations there are to reach the sum, not combinations. Which makes me ask, is the answer (n/4)! ? edit2: nope.. it couldn't be. Is it (n/4)!*(n choose 4) ? This is mindbending.

S is the sum of all of your n numbers. M is the sum of k <= n numbers.

You are looking for the amount of picks for which M=S for a given k?

If you have only positive numbers, that amount of combinations (lets call it q) is 0 if k <n, because any sum of k numbers will always be smaller than the sum of those numbers + n-k additional positive numbers. If k=n, q=1.

If you also allow 0 in your base set, the amount of combinations depends on how many 0 are in there.

If you also allow negative numbers, i doubt that there is a general solution. It depends on how many combinations of numbers there are which cancel out, and is gonna get complicated.

We started with a set of numbers (1, 2, .... n), where the elements increment by 1, sequentially. The elements are positive integers, starting at 1.

We have a target, S, which is the sum of all numbers in this set.

We take this set, and we create (n choose k) unique combinations.

We now have a set with (n choose k) combinations in it. For example, with k = 4, our set: ((1,2,3,4), (1,2,11,n), ..., (11,23,73,88), ..., (7,55,386,n)) etc.

Each element of this set has a value that is equal to the sum of its elements. So, (1,2,3,4) has a value of 10, and (7,55,386,n) has a value of (448+n).

How many permutations are there by which we can take k elements from our new set, and reach our target sum S? How many combinations are there? (probably the question I care more about)

Additional constraint: How many permutations are there by which we can take k elements from our new set, in which every element of every chosen combination is unique?

Thanks!

Edit: Sorry, I am back from lunch. I am less rushed now, and have edited my post to be specific.

If we start categorizing the sums in your second set, we need to take a look at how often each sum appears.

You have:
1 times (k+1)k/2 (the smallest possible sum),
1 times (k+1)k/2 + 1 (1+2+3+5 in the example),
2 times (k+1)k/2 + 2 (1+2+3+6 = 1+2+4+5),
and so on.

Maybe it helps to see which different sums one gets, and in what amounts, to figure stuff out? Not really at a conclusion yet, and also gonna eat stuff now.

But i am still not convinced that there is a simple answer. Even for very small numbers, the stuff fluctuates pretty hard.

n=3, k=1 has one solution (k=1 and k=n always mean there is only one solution, since both sets are equal)
n=3, k=2 has none (the second set is (1;2), (1;3) and (2;3))
n=4, k=1 has one solution
n=4, k=2 has three solutions (the second set is (1;2), (1;3), (1;4), (2;3), (2;4), (3;4)), each of which has each number of the base set exactly once in it
n=4, k=3 has no solutions

If there is a general solution, it would have to factor in things like how easily k fits into n.
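Those small cases can be brute-forced. This reads a "solution" as any selection of distinct k-subsets whose sums total S = n(n+1)/2, which is my best reading of the examples; if the intended rule is different, the counts change:

```python
from itertools import combinations

def count_solutions(n, k):
    # Base set is {1, ..., n}; the target is the sum of all of it
    target = n * (n + 1) // 2
    subsets = list(combinations(range(1, n + 1), k))
    total = 0
    # Try every selection of distinct k-subsets, of every size
    for r in range(1, len(subsets) + 1):
        for pick in combinations(subsets, r):
            if sum(map(sum, pick)) == target:
                total += 1
    return total

print([count_solutions(*nk) for nk in [(3, 1), (3, 2), (4, 1), (4, 2), (4, 3)]])
# [1, 0, 1, 3, 0] -- matching the small cases listed above
```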

There are ((n choose k) * ((n-k) choose k))/(n/k) combinations that will equal our sum,

(n choose k) is based on each individual starting element, and then ((n-k) choose k) are the combinations of what is left that can be combined with the starting element to form the sum. But then this all must be divided by (n/k), otherwise we will repeat with starting elements that have already been used in a previous combination.

and then permutations would be equal to combinations * k!, which is the possible orderings of the given combination.

so yeah, I did this and then I realized I wasn't considering duplication of elements. For example, making a sum of 45 with (1,2,3),(6,7,8),(1,8,9)

so I will be impressed if anyone can solve how many combinations there are that don't include repeats.

(I'll also be impressed if I didn't screw up my above conclusions)

I guess it could be said that when you have a combination, like above, trying to reach 45 with 3 combinations of 3 numbers, with no repeats, then you must use every number in n once.

so we can say there are (9 choose 3) possibilities for set 1, then (6 choose 3) possibilities for set 2, then only 1 possibility for set 3.

So instead of ((n choose k) * ((n-k) choose k))/k

It needs to be [the product from i = 0 to (n/k) - 1 of: ((n - ik) choose k)] / (n/k) ?

That sounds correct, but it only solves that questions of what to do without allowing doubles.

And you can further simplify your result.

n choose k = n! / (k! (n-k)!)
n-k choose k = (n-k)! / (k! (n-2k)!)
n-2k choose k = (n-2k)! / (k! (n-3k)!)
...
with the last factor being (n - (n/k - 1)k) choose k = (n - (n/k - 1)k)! / (k! (n-n)!)

so with all the (n-ik)! dropping from your product, that only leaves n!/((n/k)k!) or k*n!/n*k!

Edit: which is (n-1)! / (k-1)! /Edit
Edit2: Crap, i suck at calculating, it is actually n! / k!^(n/k) /Edit2

Could be that i am missing something, but that should generally work for your result

And i have no idea how to solve the general problem, or if it is even possible. With the above simplification of only choosing once, any problem where k does not divide n has 0 solutions.

And then you want to know how many such B there are. Is that right?

If so, then for k = 1 there are no solutions, since the highest number you can pick is B = {n} and that is not big enough. But your number of combinations would be (using your first try) ((n choose 1) * ((n-1) choose 1))/(n/1) = n * (n-1) / n = n-1

If you don't want repeated elements (in the elements of B) then it can only possibly work if k is the square root of n. Since you can only include every number once, you need to have all of the numbers. At which point the question becomes 'how many ways are there of splitting k^2 elements into k sets, each of size k.'

The answer to that is k^2!/((k!)^k * k!), because you have k^2! ways of listing 1, ..., k^2 and then you say the first k are the first set, the second k the second, etc. But then I don't care about the order within the first k, so I divide by k!. I have to do this for all k sets of k, which leads to (k!)^k. Finally, I can swap the first set of k with the second set of k, etc., which gives another k! factor on the bottom.
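For small k this count is easy to verify by brute force; a sketch (the function names are just for illustration):

```python
from itertools import permutations
from math import factorial

def by_formula(k):
    # k^2! listings, divided by k! per block for internal order,
    # and by one more k! for the order of the blocks themselves
    return factorial(k * k) // (factorial(k) ** k * factorial(k))

def by_brute_force(k):
    # enumerate every way of splitting {1, ..., k^2} into k unordered
    # blocks of size k, deduplicating with frozensets
    n = k * k
    seen = set()
    for p in permutations(range(1, n + 1)):
        seen.add(frozenset(frozenset(p[i * k:(i + 1) * k]) for i in range(k)))
    return len(seen)

print(by_formula(2), by_brute_force(2))  # both 3: {12|34, 13|24, 14|23}
```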

@Travis, while my combinatorics is far from amazing, this question looks quite hellish XD Part of my motivation for this is that this turns into (or at least looks like it) a partition problem, and even finding the partitions of S with no constraints is tricksy.

However your fully constrained question looks a lot more achievable. If both of your k's are the same (so each combination mini-sum is of a k subset of (1, 2, ..., n) and you now want to take k of these k-subsets, and add all k mini-sums to get a total of S) and S is the sum from 1 to n, then you have a situation where you will need each element of (1, 2, ..., n) to appear exactly once. This is because, as per uniqueness, you can't repeat any number, so you need every number from 1 to n to appear in exactly one combination. (Again hoping I'm reading your question right).

This leads to the following observation: your constrained question only has ANY solution when n = k^2. (k subsets, each of k elements, all elements unique). Because if not, you can't even partition all of (1, 2, ..., n) across the k-subsets taken.

So the problem reduces to all the ways of arranging (1, 2, ..., n) in k boxes, each box having k elements. But these arrangements correspond one to one with permutations of (1, 2, ..., n). (*) Which is not particularly interesting, which leads me to conclude I have misread something :| Although your additional constraint may be strong enough that this must be the case.

If this wasn't at all clear I can type something up in LaTeX that is a bit less of a mess.

EDIT: to see (*), first note that, when we "remove" the boxes, we can't possibly get an arrangement that isn't already a permutation of (1,2,...,n), and that if we "impose" the boxes, every permutation of (1,2,...,n) is also a different box arrangement. Although I'm more dubious on this than I'd like =/

EDIT 2: RIP, no. This assumes permutations all the way through, not permutations of the set of combinations. So... I lie. I'll leave it up but I lie. The previous answer explains this better and gives an answer I can't yet see to be wrong.

On December 15 2017 04:38 Melliflue wrote: I'm not sure what you want. Is it this?

And then you want to know how many such B there are. Is that right?

If so, then for k = 1 there are no solutions, since the highest number you can pick is B = {n} and that is not big enough.

there is a miscommunication here. we make n choose k combinations, but these combinations are of size n/k, not of size k.

so for k = 1, we would have a single set of (1,2,3,...,n), which would be our single solution.

The reasoning for needing all numbers from 1 to n appearing exactly once still holds, so I think you can easily adapt the solution here. You will still have k of the combinations appearing (to get each element exactly once, you need exactly k subsets of size n/k), whose order we do not care about. And each combination now has n/k elements, and again, we do not care about the order of elements within the combination.

So I think this works?

EDIT: If you *do* care about the orders of the combinations (but not the elements within them), then just don't divide by the k!

EDIT 2: WTF, the site I'm using for latex images changed it. It was n!/(((n/k)!)^k * k!)

And the solution for that question is in Travis' post #315 and then further simplified in my post #316. It is basically a slightly more general version of your k²!/k!^(k+1). Though it might have gotten lost in the editing of posts while others were talking.

Though I didn't get that +1, might have overlooked it somewhere.

The result was n!/k!^(n/k) and only works if k divides n. (Otherwise there is no answer to the question.)

On December 15 2017 04:38 Melliflue wrote: I'm not sure what you want. Is it this?

And then you want to know how many such B there are. Is that right?

If so, then for k = 1 there are no solutions, since the highest number you can pick is B = {n} and that is not big enough.

there is a miscommunication here. we make n choose k combinations, but these combinations are of size n/k, not of size k.

so for k = 1, we would have a single set of (1,2,3,...,n), which would be our single solution.

Ah, sorry about that. So in what I wrote, you would need |B| = n/k. As Ciaus_Dronu said, the logic still works. Let j = n/k for convenience. There are n! permutations of 1,...,n. I split that into k blocks, each of size j. I don't care about the order within a block so I must divide by j! for each block, so that gives me (j!)^k. I don't care about the order of the blocks either so I divide by k! too. Final answer is n!/((j!)^k * k!).

(I only repeated what Ciaus_Dronu said to add that expression since they had trouble with their latex image.)

I know this is almost what you and Simberto got too, minus a notational difference and a k! in the denominator for whether or not you care about the order of the blocks. I did miss some of the edits, but I wanted to give this explanation because I think it is neater. You get straight to the final answer without needing to simplify anything, and it nicely explains what that final expression means.

I am still thinking about the case where duplicates are allowed. I have not yet seen a simple way to do that.

Hey I was messing around with continued fractions and I came across something I definitely couldnt figure out myself and something Google didnt help me understand either

Basically I was wondering if there exists an elegant way to represent pi as an infinite continued fraction

On December 18 2017 06:36 amyamyamy wrote: Hey I was messing around with continued fractions and I came across something I definitely couldnt figure out myself and something Google didnt help me understand either

Basically I was wondering if there exists an elegant way to represent pi as an infinite continued fraction

I'm pretty sure there are several based on the wolfram alpha page, in particular these two:

Oh what I meant is if theres an elegant way to derive a continued fraction representation (simple - or non-simple like your example) of pi.

Obviously one could approximate pi and construct a continued fraction from that approximation but what im wondering is if theres another way to derive CF for pi
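For reference, the approximation route is just repeatedly taking the integer part and reciprocating the remainder; a quick sketch (float precision makes later coefficients unreliable):

```python
from math import pi

def simple_cf(x, terms):
    # peel off the integer part, invert the fractional remainder, repeat
    coeffs = []
    for _ in range(terms):
        a = int(x)
        coeffs.append(a)
        frac = x - a
        if frac == 0:
            break
        x = 1 / frac
    return coeffs

print(simple_cf(pi, 5))  # [3, 7, 15, 1, 292]
```

Truncating this gives the familiar convergents 3, 22/7, 333/106, 355/113, ... but there is no visible pattern in the coefficients, which is exactly why a "derived" representation would be nicer.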

I need to find a summation, and it might be too difficult for me to figure out myself in any reasonable amount of time. If any of you can figure it out, I would be very, very thankful. So thankful that in the future you may possibly (unlikely, but possible), get compensation.

Say we have a number of elements, n.

I have a target I want to reach: ( [the floor of(n-1)/2] - n - 1)

This needs to be found over a sum of (n-2) steps. No more, no less. I need it to be distributed such that

1.) the summation uses all (n-2) steps
2.) the size of the value being added in an earlier step is never lower than the size of the value being added in a later step
3.) the size of the value being added in each step is as uniform as possible

example:

n = 9
target sum: 22
number of steps: 7

example sum: 4+3+3+3+3+3+3

so, I realize the question might be a little non-specific, so there may be some wiggle room for interpretation. But hopefully it's enough information to garner a good general solution.

On December 25 2017 02:20 The_Templar wrote: Maybe he meant the floor of (n-1)^2/2, and not (n-1)/2?

That would make sense with the example given. So far, splitting into cases where n is even/odd, I'm getting somewhere for even. Although it looks far less technical and annoying than for n odd.

EDIT: Still long and annoying though

EDIT 2: Analysis for n even won't be fun to format here, can I link an online latex document when that's done? It'll only be for the even case unfortunately. Odd will likely take hours T.T

Sorry guys, I meant ( [the floor of((n-1)^2)/2] - n - 1)

So, (n-1) is squared.

that's why in my example, with a value of n=9 we do: n-1 = 8. 8^2 = 64. 64/2 = 32. 32-1-n = 22 so our target is 22

again, sorry about the error, I need to check these more closely before I walk away.

edit: just noticed the_templar predicted what I had meant. nice job

edit2: I'm able to solve it in Python (well, one possible solution.. depending on interpretation I think there are more), but I still don't know how to translate it to math, and having the math may make it more likely I can extract useful information to improve my algorithm.

If it's helpful for you, this is the code:

max = ((n-1)**2)//2 - 1 - n
levels = n-2
min = max//levels
other = max - (min*levels)
max = min+1
maxlevels = other
minlevels = levels - other

this gives me a sum where the first "maxlevels" steps are of value "max", and the following "minlevels" steps are of value "min", and these values will vary by only 1
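Wrapped up as a function (same logic, hypothetical name, assuming Python 3 integer division), the check against the n = 9 example looks like:

```python
def split_target(n):
    target = (n - 1) ** 2 // 2 - 1 - n
    levels = n - 2
    lo = target // levels          # the smaller step value
    rest = target - lo * levels    # how many steps need to be one bigger
    return [lo + 1] * rest + [lo] * (levels - rest)

terms = split_target(9)
print(terms, sum(terms))  # [4, 3, 3, 3, 3, 3, 3] 22
```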

Okay here is an overleaf read-only thing you can view. It should let you generate a PDF. It only has the even solution, but I can do the odd one now quickly too.

EDIT: You can click PDF on the top bar of the screen to get something easier to read. EDIT 2: Lots of typos, that is a "live" link so I'll fix them as I see them.

I'm not able to get this to work nicely for odd case, keep getting a 1/2 term floating around that definitely doesn't work. I've probably done something not clever >.>

EDIT: Solved. For odd numbers greater than 7 anyway (everything smaller you can have as its own case). I'll put in the solution soon, I just need to shower and then I'll type it up.

EDIT: The conclusion is as follows: For n even, you want the first (n-5) terms to be n/2 - 1, and the last 3 terms to be n/2 - 2. For n odd, you want the first (n-1)/2 - 3 terms to be (n-1)/2, and the rest of the terms to be (n-3)/2.

With the steps I have included, it is also easy to spot the special cases where all the terms can be the same and where this can't be done at all, although that's not difficult on its own.
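A quick script (assuming I've transcribed the closed forms right) confirms they hit the target with exactly n - 2 steps:

```python
def target(n):
    return (n - 1) ** 2 // 2 - 1 - n

def closed_form_terms(n):
    # n even: (n-5) terms of n/2 - 1, then 3 terms of n/2 - 2
    # n odd:  (n-1)/2 - 3 terms of (n-1)/2, the rest (n-3)/2
    if n % 2 == 0:
        return [n // 2 - 1] * (n - 5) + [n // 2 - 2] * 3
    big = (n - 1) // 2
    return [big] * (big - 3) + [big - 1] * (n - 2 - (big - 3))

for n in range(8, 60):
    t = closed_form_terms(n)
    assert len(t) == n - 2 and sum(t) == target(n)
print("closed forms match the target for n = 8 .. 59")
```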

Not sure this is math. It's more signal processing, but guess I'll post it here.

I am trying to find useful descriptors for a completely irregular (discrete) time series. And when I say completely irregular, I mean that it looks random, and is made to look even more random after a DFT. Here's an example:

Every sample produces a series like this one. What I plotted here is the time series, the mean and mean + 1 std deviation for this sample, and mean, and mean +1 std deviation for the entire population. Mean - 1 std deviation is always < 0, so not very useful.

What are useful ways of describing this kind of data? Because I am mostly interested in the peaks, their spacing and so on, I was thinking of using the following descriptors:

number of peaks > mean + 1 std dev
height of the highest peak (probably not meaningful data)
%time > mean + 1 std dev
%time < mean
max time between peaks
min time between peaks

But they seem completely ad-hoc choices. Are there more principled ways of describing a time series with what appears to be no periodical component?
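Ad-hoc or not, they are at least cheap to compute and try out; a sketch in plain Python (treating a "peak" as a local maximum, which is an assumption on my part):

```python
import statistics

def describe(series):
    mu = statistics.mean(series)
    sd = statistics.pstdev(series)
    thresh = mu + sd
    # indices of local maxima that clear mean + 1 std dev
    peaks = [i for i in range(1, len(series) - 1)
             if series[i - 1] < series[i] > series[i + 1]]
    high = [i for i in peaks if series[i] > thresh]
    gaps = [b - a for a, b in zip(high, high[1:])]
    return {
        "n_high_peaks": len(high),
        "highest_peak": max(series),
        "frac_time_above": sum(v > thresh for v in series) / len(series),
        "frac_time_below_mean": sum(v < mu for v in series) / len(series),
        "max_gap": max(gaps) if gaps else None,
        "min_gap": min(gaps) if gaps else None,
    }

print(describe([0, 5, 0, 1, 0, 6, 0, 0, 4, 0]))
```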

Oh, what I forgot is that I can play with the window. I am now grouping events together by minute, but I can play with the window size, or do a sliding window (not sure what the point of that is in this case, other than to have a smoothing effect). Making the window size too small (10 sec) seems to make it uselessly jittery though. The peaks get broken up into seemingly meaningless rapid peak-vale-peak-vale intervals. 1 minute seems intuitively like a good trade-off between resolution and noise... but if there is a better way of analysing this, that would also help

Hey, seeing as no one else gave you an answer I thought I might add something. If I understand you correctly, your sample points are seemingly random with no correlation to the time points, so why not leave time out of it and treat them as 1-dimensional samples? A good way of presenting such samples would be a box plot, which includes the most commonly used statistical metrics.

What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?

The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.

On December 18 2017 08:17 amyamyamy wrote: Oh what I meant is if theres an elegant way to derive a continued fraction representation (simple - or non-simple like your example) of pi.

Obviously one could approximate pi and construct a continued fraction from that approximation but what im wondering is if theres another way to derive CF for pi

There is (despite what you may have been told otherwise). But to understand it you would need a deep understanding of Jacobi theta functions.

However, I can give you a cleaner way.

Remember arctan(1)=Pi/4, right? but also, arctan(x) = x - x^3/3+x^5/5-x^7/7+x^9/9 - .... so you can write:

PI = 4 x (1-1/3 +1/5 -1/7 +1/9-1/11+1/13-...... )

Many other series of similar type abound. The best are in base 8 and base 16, though. The best-known is called the BBP digit extraction formula, because it lets you compute hexadecimal digits of pi one at a time, without needing all the earlier ones (think: base 16 is denser than base 10). It reads:

PI = Sum over k from zero to infinity 1/16^k * [ 4/(8k+1)- 2/(8k+4) -1/(8k+5)-1/(8k+6)]

Note that these aren't "magic numbers". The coefficients fall out of an integral identity, and the digit-extraction trick works via modular exponentiation of 16 with respect to the denominators 8k+1, 8k+4, 8k+5, 8k+6.
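If you just want to see the convergence (the actual digit extraction additionally uses the modular arithmetic, which this sketch skips), a short partial sum already nails double precision, since each term adds roughly one hex digit:

```python
import math

def bbp_partial_sum(terms):
    # sum_{k=0}^{terms-1} 1/16^k [4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6)]
    s = 0.0
    for k in range(terms):
        s += (1 / 16 ** k) * (4 / (8 * k + 1) - 2 / (8 * k + 4)
                              - 1 / (8 * k + 5) - 1 / (8 * k + 6))
    return s

print(abs(bbp_partial_sum(12) - math.pi))  # already down at machine precision
```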

On February 23 2018 23:34 Joni_ wrote: Random pure maths question:

What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?

The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.

Look into what happens to the fixed orbit of elements acting on rational points under the Frobenius automorphism.

On February 23 2018 23:34 Joni_ wrote: Random pure maths question:

What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?

The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.

Look into what happens to the fixed orbit of elements acting on rational points under the Frobenius automorphism.

Care to elaborate? I only know the Frobenius automorphism as an automorphism of fields of prime characteristic.

Tbh Im not particularly knowledgeable in a lot of advanced C*-topics like K-Theory in general, it's mostly spectral theory/Gelfand theory for me, so might be that I simply lack the required conceptual background. :>

On February 23 2018 23:34 Joni_ wrote: Random pure maths question:

What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?

The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.

Look into what happens to the fixed orbit of elements acting on rational points under the Frobenius automorphism.

Care to elaborate? I only know the Frobenius automorphism as an automorphism of fields of prime characteristic.

Tbh Im not particularly knowledgeable in a lot of advanced C*-topics like K-Theory in general, it's mostly spectral theory/Gelfand theory for me, so might be that I simply lack the required conceptual background. :>

What I meant is you need to formulate an injective but not surjective map (or measure) from your Banach space to an appropriate field with a more restrictive topology. Study what happens in that case; it will illuminate the full problem. This is a common strategy. Look at Tykhonov's theorem.

(after an hour of thinking and editing this post: ) Some thoughts: Cauchy's integral theorem tells us for which kind of domain a holomorphic function will definitely have an antiderivative. When you just look at the fundamental theorem, it's unclear if f(z)=1/z could ever have an antiderivative, even though it's holomorphic. But with Cauchy it's clear that it will have an antiderivative on a simply connected domain.
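Concretely, the standard example:

```latex
\oint_{|z|=1} \frac{\mathrm{d}z}{z} = 2\pi i \neq 0,
```

so f(z) = 1/z cannot have an antiderivative on the punctured plane C \ {0} (which is not simply connected), while on the cut plane C \ (-infinity, 0] the principal branch of log z works as an antiderivative.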

PS: I hope I'm making enough sense here. My "math english" is terrible :/

Your additional thoughts are exactly on point. The fundamental theorem takes functions which have a holomorphic antiderivative as its starting point, which is not the same class as holomorphic functions.

Cauchy's integral theorem states that holomorphic functions integrate to zero over closed contours in simply connected domains. It is thus stronger in that regard.

Meanwhile, as you noticed, once you know a holomorphic antiderivative exists, the fundamental theorem becomes stronger.

On February 23 2018 23:34 Joni_ wrote: Random pure maths question:

What are sufficient conditions on a Banach Algebra to guarantee that an element whose spectrum is a subset of {0,1} is a projection, i.e., self-adjoint and idempotent?

The Algebra being a C*-Algebra is certainly sufficient, but I'm specifically looking for weaker conditions, because C* is not guaranteed in my case.

Look into what happens to the fixed orbit of elements acting on rational points under the Frobenius automorphism.

Care to elaborate? I only know the Frobenius automorphism as an automorphism of fields of prime characteristic.

Tbh Im not particularly knowledgeable in a lot of advanced C*-topics like K-Theory in general, it's mostly spectral theory/Gelfand theory for me, so might be that I simply lack the required conceptual background. :>

What I meant is you need to formulate an injective but not surjective map (or measure) from your Banach space to an appropriate field with a more restrictive topology. Study what happens in that case; it will illuminate the full problem. This is a common strategy. Look at Tykhonov's theorem.

Thanks for the reply, I didnt even notice until today. Yet there seems to be some communication mishap (I guess I just dont quite get what you mean):

There will never be an injective map that is also an algebra homomorphism from a general Banach Algebra into any field, as a Banach Algebra is not in general an integral domain while a field is.

A map that is not an algebra homomorphism will not allow me to deduce any algebraic properties of the elements of an algebra (like idempotence) from their images, so I dont see how that could help.

As far as I can tell Tykhonov's Theorem seems to be completely unrelated to my case, but maybe Im just not seeing the proper broader picture.

Hey Ciaus sorry for not replying, I got deep into my work. I will tell you I did use what you provided, but the algorithm itself did not pan out.

but anyways here I am with a new question for you math geniuses

I have written an algorithm that works with graphs. The long and short of it is, I arrive at a point where I have k sets, each has unique values within them. The sets vary in size.

So maybe I have 8 sets: three are size 1, two are size 2, two are size 4, and one of them is size 9.

I have to separate these k sets into n sets, where n is a value that is fixed ahead of time. Let's say n is 18.

Well, using basic math I can see that I have 3+4+8+9 = 24 values to work with. 24/18 = 1.3333.. average values per set, in order to form n sets.

Now here is where things get tough. The k sets are formed as such because they are disjoint, and the values must remain disjoint. This means that each set of size one, must remain size one. In a similar fashion, any of the other sets cannot be "mixed" with values from another set. So, sets of size two could never form a new set of size 3. Nor could a value from the set of size 9 be put into a set with a value from a set of size 4 to form a set of size 2. Essentially, values can only form a new set with other values they were originally matched with.

Using this information, is it possible to actually solve what the sizes of my n disjoint sets must be?

DISCLAIMER: The above example is not a real example and likely does not have an actual solution. If needed, I could run my algorithm to get a real world example.

If I understand the problem correctly then there are obviously only solutions for n greater than or equal to k. Can't you just sort the k sets in ascending order and pick single elements to form new size-1 sets until you have as many sets as you want? Now calculating the sizes of the new sets shouldn't be a problem. (You have all untouched old sets, some new size-1 sets and maybe 1 half-used old set.)

@travis: awesome, glad it was of some help, sorry things didn't work =/

For the new problem, is there any requirement other than simply the number of new sets you must have? If not, then @Kleinmuuhg's answer looks perfect. If you want to keep sets small-ish maybe, you can just subdivide up the largest sets first, until you have n in total.

Also while I'm here, here's a fun thought-experiment that uses a bit of basic graph theory and probability intuition:

Say you are running some NGO which delivers vaccines to countries suffering epidemics. You only have enough vaccines to vaccinate some small portion of the population (say less than 5%), and you can't gather any data on the population / distribution / individuals in the country, all you can do is randomly give people the sealed vaccine, and some instructions. You may assume the people you give the vaccine to will follow your instructions.

What instructions do you give to try most effectively stem the epidemic?

With varying extra details and depth, the very basic idea is as follows: You instruct the person you give the vaccine to not to take it themselves, but rather to give it to a friend, who should then take it. Those with more friends are hence more likely to get vaccinated. The assumption being that a person's capacity to spread the infection is proportional to how much they interact with others in their surroundings. In essence, the strategy targets those with greatest ability to spread infection.

Explicitly modeled, your population is a graph. People = vertices, friendships = edges. More incident edges, more likelihood of receiving vaccine from a friend as well as higher expected amount of infection spreading.
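The degree bias is easy to see in a toy simulation; a sketch with a made-up 5-person network (purely illustrative):

```python
import random

def acquaintance_counts(adj, trials=100_000, seed=1):
    # pick a uniformly random person, then a uniformly random friend of
    # theirs; tally who ends up holding the vaccine
    rng = random.Random(seed)
    people = list(adj)
    counts = {p: 0 for p in people}
    for _ in range(trials):
        person = rng.choice(people)
        counts[rng.choice(adj[person])] += 1
    return counts

# person 0 knows everyone; 3 and 4 also know each other; 1 and 2 only know 0
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0, 4], 4: [0, 3]}
counts = acquaintance_counts(adj)
print(counts)  # the hub (vertex 0) is reached far more often than anyone else
```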

On May 17 2018 04:57 Kleinmuuhg wrote: If I understand the problem correctly then there are obviously only solutions for n greater or equal than k. Cant you just sort the k sets in ascending order and pick single elements to form new size 1 sets until you have as many sets as you want? Now calculating the sizes of the new sets shouldnt be a problem. (You have all untouched old sets, some new size 1 sets and maybe 1 half-used old set).

On May 17 2018 05:28 Ciaus_Dronu wrote: For the new problem, is there any requirement other than simply the number of new sets you must have? If not, then @Kleinmuuhg's answer looks perfect. If you want to keep sets small-ish maybe, you can just subdivide up the largest sets first, until you have n in total.

yes, n will always be >= k

and I failed to mention the stipulation that every element in all of our sets must be used (and every element is unique, you will not see the same value twice).

in terms of the actual problem, I am not looking for what values would go in what set. that actually has an entirely new set of constraints. so, what I am hoping for is "potential solutions" in terms of the amount of sets and the sizes of them. but I think for the answer you are giving this doesn't matter.

anyways let me see if my understanding of the suggested solution is correct:

we have M elements in K sets, trying to form N sets of size Y

# at each step, we can stop and use this solution and/or backtrack to find another solution
1.) form N sets of size 1, first using the sets of size 1, and then using at least one value from each of the other sets
2.) kind of like... fill in these sets with the remaining values? potentially a huge amount of ways they could be put in... is it the case that every time we do #2 it will give us a potential solution?

On May 17 2018 05:40 Ciaus_Dronu wrote: Also while I'm here, here's a fun thought-experiment that uses a bit of basic graph theory and probability intuition:

Say you are running some NGO which delivers vaccines to countries suffering epidemics. You only have enough vaccines to vaccinate some small portion of the population (say less than 5%), and you can't gather any data on the population / distribution / individuals in the country, all you can do is randomly give people the sealed vaccine, and some instructions. You may assume the people you give the vaccine to will follow your instructions.

What instructions do you give to try most effectively stem the epidemic?

With varying extra details and depth, the very basic idea is as follows: You instruct the person you give the vaccine to not to take it themselves, but rather to give it to a friend, who should then take it. Those with more friends are hence more likely to get vaccinated. The assumption being that a person's capacity to spread the infection is proportional to how much they interact with others in their surroundings. In essence, the strategy targets those with greatest ability to spread infection.

Explicitly modeled, your population is a graph. People = vertices, friendships = edges. More incident edges, more likelihood of receiving vaccine from a friend as well as higher expected amount of infection spreading.

instructions: kill anyone who is infected

ok no but actually I would have never gotten the recommended solution, which is actually pretty damn clever

On May 17 2018 04:57 Kleinmuuhg wrote: If I understand the problem correctly then there are obviously only solutions for n greater or equal than k. Cant you just sort the k sets in ascending order and pick single elements to form new size 1 sets until you have as many sets as you want? Now calculating the sizes of the new sets shouldnt be a problem. (You have all untouched old sets, some new size 1 sets and maybe 1 half-used old set).

On May 17 2018 05:28 Ciaus_Dronu wrote: For the new problem, is there any requirement other than simply the number of new sets you must have? If not, then @Kleinmuuhg's answer looks perfect. If you want to keep sets small-ish maybe, you can just subdivide up the largest sets first, until you have n in total.

yes, n will always be >= k

and I failed to mention the stipulation that every element in all of our sets must be used (and every element is unique, you will not see the same value twice).

in terms of the actual problem, I am not looking for what values would go in what set. that actually has an entirely new set of constraints. so, what I am hoping for is "potential solutions" in terms of the amount of sets and the sizes of them. but I think for the answer you are giving this doesn't matter.

anyways let me see if my understanding of the suggested solution is correct:

we have M elements in K sets, trying to form N sets of size Y

# at each step, we can stop and use this solution and/or backtrack to find another solution
1.) form N sets of size 1, first using the sets of size 1, and then using at least one value from each of the other sets
2.) kind of like... fill in these sets with the remaining values? potentially a huge amount of ways they could be put in... is it the case that every time we do #2 it will give us a potential solution?

Hmm, there are actually a huge number of ways you may be able to end up doing this. If n >> k, and some of your original k sets are themselves large, you end up with something worse than set-partitions in terms of how many ways you can break it up.

But let's be explicit. Let m be the total number of (unique) numbers in our k sets. Let n be the target number of sets.

Constraints: There exists at least one way of getting our m numbers into n new sets if and only if k <= n and n <= m.

Only if: The first inequality (k <= n) holds because you can never merge old sets, only break them up or keep them the same. The second inequality (n <= m) holds because every set needs at least 1 number in it, so you can't have fewer total numbers than sets.

if: If both inequalities hold, and n = k, then the sets as they are given are fine. If k < n, choose any set with more than 1 element (which must exist as k < m) and break it into 2 sets. We've now increased "k" by 1. Keep doing this until "k" = n. (I say "k" as we've essentially done induction to get a new value for the number of sets we already have).

Horror of counting all possible solutions: The issue is that there are a huge number of ways to break up your k old sets into n new ones (every new set will just be some subset of an old one). Finding even a rough solution is nightmarish at best. Let's give an example to illustrate. Say k = 3, n = 5, m = 15. Let our 3 starting sets be A, B, C with |A| = |B| = |C| = 5.

To get to 5 sets from these 3, we could partition C into 3 subsets, and now we'd have 5 new sets, done. Call the number of ways of doing this {5, 3}. I think it's called the Stirling number of the second kind, and it's generally large. We could instead, however, partition just A or just B into 3. But... We could rather partition A and B into 2 and leave C as is, that'd also give us 5 new sets. Same for all pairings. As you can see this gets... well I'd never tackle this as a research problem unless I wanted to be only speaking in generating functions by the end of it.

Easy way to get any solution: All hope is not lost though. If ALL you care about is finding some way of turning your k sets into n sets, with those same elements, by only breaking sets up and never merging them, then this is easy. First, check that our first inequality holds (otherwise no solution exists). Second, check if k = n; if so, we are done. If not, take the largest set you have. Cut off 1 single element and make a new set with just that element. You now have k+1 sets. Find the new largest set (it might not be the same as in the previous step) and again, take off 1 element and make that element its own set. Keep doing this until you have n sets.

In our previous example with A, B, C, the algorithm would take 1 element from, say, A, at step 1. Then another element from say B (NOT A, as it is smaller than B after we chopped an element off of it) at step 2. Then you'd be left with C, two sets of size 4, and two sets of size 1.
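The peel-one-off-the-largest procedure is only a few lines; a sketch (hypothetical function name, sets as Python sets):

```python
def split_into(sets_list, n):
    # only ever break sets apart, never merge; peel single elements off
    # the current largest set until we have n sets in total
    parts = [set(s) for s in sets_list]
    total = sum(len(p) for p in parts)
    if n < len(parts) or n > total:
        return None  # the k <= n <= m condition fails: no solution exists
    while len(parts) < n:
        largest = max(parts, key=len)
        parts.append({largest.pop()})
    return parts

# the A, B, C example above: three sets of five elements into five sets
A, B, C = {1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}, {11, 12, 13, 14, 15}
print(sorted(len(p) for p in split_into([A, B, C], 5)))  # [1, 1, 4, 4, 5]
```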

That last part of Ciaus_Dronu's answer is basically what I suggested. I think he explained it pretty well. If needed I can type up some pseudo code if you want.

Agreed. I don't actually think the problem is that hard at all. You just need a principled way to run through your sets. Given that you have some kind of condition for what elements can be inside a set, which you haven't specified, you can simply iterate through this generation process, and stop when you find an acceptable solution. Greedy algorithms work!

Of course, if you can somehow use your acceptability criterion to restrict the search space, then you should. But if you can't and can only decide whether a solution is acceptable or not, then just iterate greedily until you find an acceptable partition.

Oh, of course, if you have a very tiny number of actually acceptable solutions, or you are looking not just for any solution, but for the best solution, then this is not going to work as well. But for just finding *a* solution, greedy is fantastic.

If your additional constraints involve a specific ordering of the elements within the sets and how they can be divided according to that, I thought of an easy way to both count and generate possible partitions. Basically, if you can "linearise" the problem a bit and get rid of partitions that don't look like intervals when you give the elements some order, it's not too hard to count all possible partitions, and to see how you get them.

If this is of value to you, I can type it up. But it's a very specific constraint so maybe not =/

The gist is to model the covering relation of the order inside each set as edges between values (i.e. an edge between u and v if v is the smallest value bigger than u). Then you just delete any n-k edges and get a graph with n components, corresponding exactly to an allowed partition into n sets.
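If I've understood the construction, here's a sketch (the two sorted "sets" below are a hypothetical example): m elements in k sets give m - k covering edges, so deleting n - k of them yields C(m - k, n - k) possible partitions, and enumerating the deleted edge subsets generates them.

```python
from itertools import combinations
from math import comb

def interval_partitions(blocks, n):
    """Enumerate ways to cut k sorted lists into n intervals by deleting edges."""
    k = len(blocks)
    # one "covering edge" between each pair of consecutive values
    edges = [(i, j) for i, b in enumerate(blocks) for j in range(len(b) - 1)]
    for deleted in combinations(edges, n - k):
        parts = []
        for i, b in enumerate(blocks):
            cuts = sorted(j + 1 for (bi, j) in deleted if bi == i)
            bounds = [0] + cuts + [len(b)]
            parts.append([b[a:c] for a, c in zip(bounds, bounds[1:])])
        yield [p for ps in parts for p in ps]

blocks = [[1, 2, 3], [4, 5, 6, 7]]            # k = 2 sets, m = 7 elements
parts = list(interval_partitions(blocks, 4))  # split into n = 4 intervals
print(len(parts), comb(7 - 2, 4 - 2))         # 10 10
```

Both numbers agree, which is the point: the count and the generation come for free from the same edge-deletion picture.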

I'm having an issue (dispute) with someone and need some clarity, as my brain is fuzzing whenever I try to properly prove it one way or the other.

The core issue is this: trying to calculate the expected cost of something, in the context of a game.

Basically: there's a yearly chance that the event might happen if you pay a fee. Say X% a year, and each year you have to pay K. I.e. each year you pay K, and you get an X% chance of the event.

Now there's a modifier N, which can sometimes affect this, and I'm trying to figure out how the expected cost varies as a function of N, roughly. It doesn't have to be exactly right, as the dispute is simply about the scaling rate.

N typically has values near 1, with the baseline case being N = 1. It always has a positive value, I think; all values are between 0.8 and 1.5.

N has the following effects: it causes the yearly cost to be K*N, and it modifies the yearly chance of the event happening, I think to X*(1/N), though it might be X*(2-N), which is unfortunately a whole different calculation; they're fairly similar over the range of values actually used.

You are not being 100% clear. Are the following true?

If you pay, the event might happen, otherwise it doesn't. The event is positive, and you wish to know how much the average cost of the event is, without caring how long it takes for the event to happen? So an example would be a lottery, where you pay a monthly fee for a chance of winning a cruise.

For the base case, the expected cost of the event is K/p, where K is your cost and p is the probability of the event happening (The percentage X you declared, divided by 100). The reason for that is that you expect on average p events happening per year that you pay (It doesn't matter that p is smaller than 1). So you pay K for p events, thus the cost of one event is K divided by p.

The same logic still works in your modified case. Through a change of N, the cost changes to K*N, and the number of events per year changes to p/N. Thus, you pay K*N for p/N events, and thus each event costs you K*N divided by p/N. Fractional maths means that this is equal to K*N*N/p. So in this case, the total cost is K*N² / p

In the second case, a similar argumentation leads to K*N / (p*(2-N))
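That K*N²/p formula is easy to sanity-check with a quick Monte Carlo simulation (the values of K, p and N below are made-up example numbers, not from the game):

```python
import random

def mean_cost(K, p, N, trials=200_000):
    """Average total amount paid until the event occurs."""
    total = 0.0
    for _ in range(trials):
        cost = 0.0
        while True:
            cost += K * N                 # pay the (modified) yearly fee
            if random.random() < p / N:   # modified yearly event chance
                break
        total += cost
    return total / trials

random.seed(0)                # for reproducibility
K, p, N = 100, 0.2, 1.25      # made-up example values
print(mean_cost(K, p, N))     # should land close to...
print(K * N**2 / p)           # ...the predicted 781.25
```

The simulated average should sit within a percent or so of the closed-form value.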

On May 25 2018 09:06 Simberto wrote: You are not being 100% clear. Are the following true?


OK, I'll try to clarify, though it looks like you got it all anyway.

Obviously you have to specify that N <= 2 for the second case, because probabilities cannot be negative.

Moreover, that is right for the base case, where you know whether N is applied or not. In order to compute the general case, where you cannot say a priori whether N is active or not, you need to know the probability that the modifier N is applied. The general case will also allow you to compute the probability of the event occurring in at most Y years, and things like that.

I've got another probability question. A bit similar to the last one I asked, but I'm still stumped.

The goal is to get to 0. If you are at some other number, you roll to see what number you go to next. Either you stay at your current number or you move closer to 0. You continue rolling until you get to 0.

See the chart below. The first number is the number you are at, and the second number is the number you can potentially go to, followed by the chance of getting there. So for example, if you are at 1 you have a 50% chance of rolling to 0 and stopping after 1 roll, and a 50% chance of staying at 1 and trying again.

The question is: What is the expected number of rolls to get to 0 for each number? I already have the answers by simulation: 1 = 2, 2 = 2.667, 3 = 3.144. But I'm stumped as to how to actually calculate that.

Calculating at 1 is easy. Without even having to do any calculations, you can easily see that if you have a 50% chance of winning each roll, it will take on average 2 rolls to win.

Calculating 2 is trickier. You have a 25% chance of rolling 0, therefore ending in 1 roll. You have a 50% chance of rolling to 1. Since I already know the expected value at 1 is 2 rolls, that means at 2 you have a 50% chance of winning after 3 rolls. But what is the value if you roll from 2 to 2? There's a convergence there that I'm not sure how to calculate. Any help would be appreciated. Thanks!
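For reference, the self-loop isn't actually a problem: write E[x] = 1 + sum over y of P(x -> y) * E[y], move the P(x -> x) * E[x] term to the left-hand side, and divide. A small sketch, assuming (as the later posts suggest) that the chart's probabilities come from each remaining 0-bit flipping independently with probability 1/2, i.e. P(x -> y) = C(x, x - y) / 2^x:

```python
from math import comb

def expected_rolls(max_zeros):
    E = {0: 0.0}                     # already at 0: no rolls needed
    for x in range(1, max_zeros + 1):
        p_stay = comb(x, 0) / 2**x   # P(x -> x): none of the x zero-bits flip
        rest = sum(comb(x, x - y) / 2**x * E[y] for y in range(x))
        # E[x] = 1 + p_stay * E[x] + rest  =>  solve for E[x]
        E[x] = (1 + rest) / (1 - p_stay)
    return E

E = expected_rolls(3)
print(E[1], round(E[2], 3), round(E[3], 3))  # 2.0 2.667 3.143
```

This reproduces 2, 8/3 and 22/7, matching the simulation up to noise.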

Edit: Okay, I'm not typing this, sorry. Hope you can read my scribbling. Not sure why our results are different for 3. If you have any questions, please ask and I can elaborate more if needed.

Sounds like a typical get-a-hypothesis-by-juggling-with-numbers-into-proof-by-induction exercise: Basically: E[W_n]= 1/(n+1)* (E[W_n]+...+E[W_1]+E[W_0]), which means you can calculate E[W_n] if you know all "lower" expected values.

However, now that I read your numbers again, it seems that not every number has the same probability? You should definitely explain what the probability of rolling x is when you can roll numbers up to n. Do you mean to say you can only roll 0 or 1 and then subtract the result from n? In any case, you should be able to modify the recursive formula to get a proof by induction/recursion.

Edit: Ok, my formula is wrong, take the one from the screenshot in the post above. The idea should still stand. Edit: Also, 22/7 would round to 3.142 or 3.143, but never 3.144. Is that just a typo?

I'm looking at a smaller example, a 4-bit number instead of 32-bit. Take a number from 0 to 2^4-1 and convert it to binary. If in binary a number has one 0 and three 1's, what is the probability that a bit-wise OR against another 4-bit number will yield a number with all four 1's (15)? It's 50%. The chart I provided earlier is the probability of going from a 4-bit number with x 0's to a 4-bit number with y 0's. Since it's a bitwise OR operation, you will never lose 1's.

I'm sure there are better approaches to the problem, but this is the direction I'm trying to solve it from. In fact, it's looking like I should probably try a different approach, as it's quite complicated to determine those probabilities.
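Those transition probabilities can be brute-forced for the 4-bit case: start from a number with x zero bits, OR it with every 4-bit number, and tally how many zeros remain. (The specific starting number is arbitrary since only the count of zeros matters.)

```python
from collections import Counter
from fractions import Fraction

def transition_probs(x, bits=4):
    # a number whose x lowest bits are 0 and the rest are 1
    start = (1 << bits) - (1 << x)
    # OR with every `bits`-bit number and count the remaining zeros;
    # the extra | (1 << bits) pads bin()'s output to a fixed width
    counts = Counter(bin(start | other | (1 << bits))[3:].count('0')
                     for other in range(1 << bits))
    return {y: Fraction(c, 1 << bits) for y, c in counts.items()}

p = transition_probs(3)
print(p[0], p[1], p[2], p[3])  # 1/8 3/8 3/8 1/8
```

These are just binomial probabilities C(x, x - y) / 2^x, since each zero bit independently becomes 1 with probability 1/2.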

Edit: Also, 22/7 would round to 3,142 or 3,143 but never 3,144. Is that just a typo?

It was just a guess from the results of my simulation. You're correct mathematically, but my simulated results would vary further than just a rounding error.

Aren't the bits completely unrelated in that situation? Meaning you can take the problem as 32 copies of the 1 bit problem and then recombine them later on?

Possible idea for an approach, not completely thought through:

A single bit has a probability of P0(n) = 1/2^(n+1) to NOT be a 1 after n operations. And P1(n) is thus 1-(1/2^(n+1))

32 such bits thus have a probability of 1-(P1(n))^32 that at least one of them is not a 1 after n operations. Maybe it is possible to somehow get an expected value out of this.

Though I am slightly confused by the "rounded to 10 digits after the decimal point" thingy. That implies some sort of numerical or simulation-based approach in my opinion, because why would you want 10 digits if either the result could be described as an exact number in some way (fraction, roots or whatever) or the point were just a mathematical approach?
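Following that spoiler, here's one way to actually get an expected value out of P1(n), assuming the quantity of interest is the number of OR operations until all bits are 1, starting from a uniformly random initial number: E[T] = sum over n >= 0 of P(T > n), and P(T > n) = 1 - P1(n)^32.

```python
def expected_ops(bits=32, terms=200):
    # P1(n) = 1 - 1/2**(n+1): chance a given bit is 1 after n OR operations
    # with random numbers, so all `bits` bits are 1 with probability P1(n)**bits,
    # and E[T] is the tail sum of 1 - P1(n)**bits over n >= 0.
    return sum(1 - (1 - 0.5**(n + 1))**bits for n in range(terms))

print(round(expected_ops(1), 6))    # single-bit sanity check: 1.0
print(round(expected_ops(32), 10))  # the 32-bit expected value, 10 digits
```

The tail terms shrink geometrically, so 200 terms is far more than enough for 10 digits.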

On July 17 2018 06:57 Simberto wrote: Aren't the bits completely unrelated in that situation? Meaning you can take the problem as 32 copies of the 1 bit problem and then recombine them later on?

Yes, at first sight it looks like you could model this problem with i.i.d. random variables, then regard the 1-dim case and multiply your way up from there.

Yeah, I'm pretty sure there are better ways to solve it. This is just kinda how I started it, so I wanted to see how far I could get. You guys helped me out, I got exactly what I needed. Thanks!

2.) given a binary string of length 15, 4 of the positions are set to 1 (at random). the rest are zero. how many strings can be made that have no ones next to each other?

-wtf, I have no idea how to do this :/

3.) similar to #2. We have 4 As, 3 Bs, 2 Cs, 1 D, 1 E. We use all the letters to form a word at random (all letters must be used). How many words contain no Bs next to each other?

I also have no idea how to do this one, it seems very similar.

I know how to do #1 and maybe #2. #3 I thought I had but some smaller examples have convinced me otherwise. I suspect I will be able to get it more easily in the morning - a bit rusty at this.

For #1 I think you're supposed to assume that all similar shapes are the same, so choosing circle #1 results in the same outcome as if you chose circle #2, for example. In that case, you should set up some kind of linear equation that satisfies the condition (picking exactly 4 shapes from 4 types of shapes), include restrictions on the number of circles and triangles you have, and figure out the number of solutions to that equation.

For #2, I think I solved it correctly by assuming that every 1 is either the first digit in the string or preceded by a 0. Therefore, you can break up the problem into 2 cases — one where the first digit is 1, and one where the first digit is 0 — and figure out the number of ways in each case to rearrange the appropriate number of 0s and "01"s. For example, in the first case, the first digit must be 1, after which point you have three "01"s and 8 other 0s.
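Filling in the second case the same way (if the first digit is 0, the whole 15-character string is an arrangement of four "01" blocks and seven lone "0"s; that completion is my reading, not spelled out above), both cases can be checked with math.comb and brute force:

```python
from itertools import combinations
from math import comb

case_one = comb(3 + 8, 3)    # starts with 1: arrange three "01"s and eight "0"s
case_two = comb(4 + 7, 4)    # starts with 0: arrange four "01"s and seven "0"s
print(case_one + case_two)   # 495 = C(12, 4)

# brute force: place 4 ones in 15 positions, discard adjacent placements
brute = sum(1 for pos in combinations(range(15), 4)
            if all(b - a > 1 for a, b in zip(pos, pos[1:])))
print(brute)                 # 495
```

Both totals agree with C(12, 4) = 495, the stars-and-bars answer discussed below in the thread.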

On September 07 2018 13:00 DarkPlasmaBall wrote: For #1, why is it 16 and 15 instead of 17s?

This. Also a bit rusty, but shouldn't you multiply the second line by 4, and the third line by 6, because the order in which that happens doesn't matter?

For #2, seems like you can solve that with dynamic programming. The chance of having 2 1s next to one another is:

Let c1 be the chance the first digit is 1, and c2 be the chance the 2nd digit is 1, given that the first digit is 1.

Concerning #1: I am guessing we do not distinguish the circles, etc. from one another? That'd make for a whole lot more possibilities.

#2: The way these problems are usually handled is that you imagine the 11 zeros side by side and the 4 ones as 'partitioners' which you can place at any of the 12 positions in between the zeros or to either side. Think of it as 11 blue cars parked in a row. You now want to park your 4 red cars. The first one you can put in between two blues (10 spaces) or in front of all or in the back of all, which makes a total of 12 possibilities. For the next car you have one less possibility, etc. If all the red cars were different you would have 12*11*10*9 possibilities. But you cannot distinguish between two red cars, so you have to account for that. This leads to 12 over 4 as your end result. (Drawing without replacement, without caring for order: binomial coefficient.)
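The parking argument in numbers, as a quick check:

```python
from math import comb, factorial

ordered = 12 * 11 * 10 * 9          # ordered placements of the four red cars
print(ordered // factorial(4))      # 495: divide out the 4! reorderings
print(comb(12, 4))                  # 495: same thing as a binomial coefficient
```

Dividing the ordered count by 4! is exactly what the binomial coefficient does.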

I wrote it in Python. For my dynamic programming solution, I get:

1 - 0.6373626373626374 = ~36%

Using Kleinmuuhg's method, the probability is (12 over 4)/12⁴ = 0.02387152777

What am I doing wrong?

E: I also just saw that the question is not about the probabilities, but the actual combinations. Either way, there is something wrong somewhere :D

E2: Checking it for strings of length 3 and 2 1s, I think it's clear that both kleinmuuhg's method and mine work, but I fucked up the total number of possibilities. It isn't 12⁴, it's 15 choose 4. And then it all works.
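That correction checks out numerically: C(12, 4) favourable strings out of C(15, 4) total placements of the four 1s.

```python
from math import comb

p = comb(12, 4) / comb(15, 4)   # favourable / total placements of the 1s
print(round(p, 4))              # 0.3626
print(round(1 - 0.6373626373626374, 4))  # 0.3626, matching the DP result
```

So the combinatorial count and the dynamic programming probability agree once the denominator is fixed.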

A simple recursive Python solution for #2 that seems to produce reasonable results:

def get_count(zeros, ones):
    if ones == 0:   # Nothing, or only 0s left
        return 1
    if ones == 1:   # A single 1 can be placed in multiple positions
        return zeros + 1
    if zeros == 0:  # All other combinations require 0s to work
        return 0
    # The remaining string starts with either 0 or 10
    return get_count(zeros - 1, ones) + get_count(zeros - 1, ones - 1)

get_count(11, 4)

Not sure it's correct, but this gives a result of 495.

This can be solved by means of simple dynamic programming. Basically, the number of ways of placing x 1's in a binary string of length y is the sum, over the position of the last 1, of the number of ways of placing x - 1 1's in the part of the string that ends two positions earlier. Here's my C# code:

private int Solve(int maxSize, int maxCount)
{
    int[,] data = new int[maxSize + 1, maxCount + 1];
    for (int spot = 1; spot <= maxSize; spot++)
    {
        data[spot, 1] = 1 + data[spot - 1, 1];
        for (int count = 2; count <= maxCount; count++)
        {
            for (int position = spot; position - 2 >= 1; position--)
            {
                data[spot, count] += data[position - 2, count - 1];
            }
        }
    }
    return data[maxSize, maxCount];
}

#3 can't be trivially reduced to #2 since "abc" and "cba" are still distinct words even if the only "b" is always second.

The solution should be pretty straightforward with combinatorics. Anyhow, in Python (combined with the solution for #2):

def get_word_count(letter_counts):
    if sum(letter_counts.values()) == 1:
        return 1  # Only one letter left

    def sub_count(letter):
        if letter_counts.get(letter) == 0:
            return 0  # Invalid letter to continue with
        else:
            return get_word_count(
                {k: (v - 1 if k == letter else v) for k, v in letter_counts.items()})

    return sum([sub_count(letter) for letter in letter_counts.keys()])