The Big Programming Thread - Page 808
Thread Rules
1. This is not a "do my homework for me" thread. If you have specific questions, ask, but don't post an assignment or homework problem and expect an exact solution.
2. No recruiting for your cockamamie projects (you won't replace facebook with 3 dudes you found on the internet and $20).
3. If you can't articulate why a language is bad, don't start slinging shit about it. Just remember that nothing is worse than making CSS IE6 compatible.
4. Use [code] tags to format code blocks.
phar
United States, 1080 Posts
Travis - read this: http://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html
Deleted User 3420
24492 Posts
I did miss one question though and I want your opinions on it. The question was about bucket sort. It specified "given a range greater than N and uniform distribution", and asked for the big-O complexity for the average case and the worst case. For the average case I put O(n), which is correct. The worst case, I know, is typically O(n^2) for bucket sort.

However, we never really went over this term "uniform distribution". At least not in detail, and it was never stressed. I assumed it meant that the elements were distributed equally among the buckets. But apparently it's a statistics term and that's not what it means; it just means that every value in the range is equally likely. So, during the test, I went back to this question and stared at it for like 5 minutes. I was like "uniform distribution... uniform distribution...". And finally I put O(n) based on my assumption.

Today in class the professor pretty much threw out a disclaimer about this question, so I assume lots of people got it wrong. He explained what uniform distribution means, but he isn't going to give points to people who put O(n). It seems unfair. Should I say something to him about it?
Acrofales
Spain, 17832 Posts
On December 03 2016 01:34 travis wrote:
I scored a 64 / 70 on my last exam. yay! I did miss one question though and I want your opinions on it. The question was about bucket sort. It specified "given a range greater than N and uniform distribution", and asked for the big-O complexity for the average case and the worst case. For the average case I put O(n), which is correct. The worst case, I know, is typically O(n^2) for bucket sort. However, we never really went over this term "uniform distribution". At least not in detail, and it was never stressed. I assumed it meant that the elements were distributed equally among the buckets. But apparently it's a statistics term and that's not what it means; it just means that every value in the range is equally likely. So, during the test, I went back to this question and stared at it for like 5 minutes. I was like "uniform distribution... uniform distribution...". And finally I put O(n) based on my assumption. Today in class the professor pretty much threw out a disclaimer about this question, so I assume lots of people got it wrong. He explained what uniform distribution means, but he isn't going to give points to people who put O(n). It seems unfair. Should I say something to him about it?

Not sure how old you are and what the requirements are for your course, but where I went to school, uniform distribution was taught in high school, so uni professors were allowed to assume you knew that stuff.

However, I'm confused. The O(n^2) worst case for bucket sort is, insofar as I remember, when your data is NOT uniformly distributed but has clusters. If your data is uniformly distributed, then bucket sort works in O(n) time (technically I think O(n + m), with m the number of buckets), because it can neatly distribute the elements into buckets.

EDIT: wait, what was the exact wording? Because if the wording was "distributed according to a uniform distribution", then you have a *chance* of clustering, and the worst-case scenario has to take that chance into account. However, if it says "uniformly distributed", then you already know the elements are uniformly spread over the range. It's the difference between "I am going to flip a coin; what is the chance of heads?" and "I have flipped a coin and it came up heads; what is the chance that it came up heads?"
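To make the two cases concrete, here is a minimal textbook-style bucket sort in Java. This is just an illustrative sketch, not anything from the course or the thread; the class name, bucket count, and value range are arbitrary choices.

[code]
import java.util.ArrayList;
import java.util.List;

public class BucketSortSketch {

    // Textbook bucket sort for values in [0, max): scatter into m buckets,
    // insertion-sort each bucket, then concatenate. If the input is spread
    // evenly, each bucket holds O(1) elements on average, so the total work
    // is O(n + m). If the input clusters into a single bucket, the insertion
    // sort of that bucket dominates and the whole thing degrades to O(n^2).
    static void bucketSort(double[] a, double max, int m) {
        List<List<Double>> buckets = new ArrayList<>();
        for (int i = 0; i < m; i++) buckets.add(new ArrayList<>());

        for (double x : a) {                         // scatter: O(n)
            int idx = (int) Math.min(m - 1, x * m / max);
            buckets.get(idx).add(x);
        }

        int pos = 0;
        for (List<Double> b : buckets) {             // sort buckets and gather
            insertionSort(b);
            for (double x : b) a[pos++] = x;
        }
    }

    static void insertionSort(List<Double> b) {      // O(k^2) on a bucket of size k
        for (int i = 1; i < b.size(); i++) {
            double key = b.get(i);
            int j = i - 1;
            while (j >= 0 && b.get(j) > key) {
                b.set(j + 1, b.get(j));
                j--;
            }
            b.set(j + 1, key);
        }
    }

    public static void main(String[] args) {
        double[] data = {3, 4, 1, 2, 5, 7, 10, 8, 9, 6};   // values in [0, 11)
        bucketSort(data, 11, 10);
        System.out.println(java.util.Arrays.toString(data));
    }
}
[/code]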
phar
United States, 1080 Posts
On December 02 2016 22:39 tofucake wrote:
that guy doesn't work at google though :|

He's still working out of the Kirkland office, at least the last time I checked.
Acrofales
Spain, 17832 Posts
Bucket sort runs in approximately its worst-case scenario if 10% (or less) of the buckets hold 90% (or more) of the data. Given that you know your data is distributed according to a random variable with a uniform distribution over range N, what is the probability of this occurring?

+ Show Spoiler +
I don't know the answer. I know how to compute it if the size of the data is given, but it's a shitty sum of cases in a multinomial distribution. I have no idea how to derive a general formula.
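One way to put a rough number on that without a closed form is to simulate it: draw n values from the uniform distribution, bucket them, and count how often the "at most 10% of buckets hold at least 90% of the data" condition fires. The sketch below is my own illustration; the 10%/90% thresholds come from the post above, while n, m, and the trial count are arbitrary picks. For any reasonable n and m the estimate will just come out as 0, which is exactly the point: the event is astronomically unlikely.

[code]
import java.util.Arrays;
import java.util.Random;

public class ClusterProbabilitySketch {
    // Monte Carlo estimate: how often does throwing n uniform values into m
    // buckets leave at least 90% of the data in the fullest 10% of buckets?
    public static void main(String[] args) {
        int n = 100, m = 100, trials = 100_000;
        Random rng = new Random();
        int hits = 0;

        for (int t = 0; t < trials; t++) {
            int[] counts = new int[m];
            for (int i = 0; i < n; i++) counts[rng.nextInt(m)]++;

            Arrays.sort(counts);                       // ascending
            int topBuckets = Math.max(1, m / 10);      // fullest 10% of buckets
            int inTop = 0;
            for (int i = m - topBuckets; i < m; i++) inTop += counts[i];

            if (inTop >= 0.9 * n) hits++;
        }
        System.out.printf("estimated probability: %.2e%n", (double) hits / trials);
    }
}
[/code]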
Deleted User 3420
24492 Posts
On December 03 2016 02:12 Acrofales wrote:
Not sure how old you are and what the requirements are for your course, but where I went to school, uniform distribution was taught in high school, so uni professors were allowed to assume you knew that stuff. However, I'm confused. The O(n^2) worst case for bucket sort is, insofar as I remember, when your data is NOT uniformly distributed but has clusters. If your data is uniformly distributed, then bucket sort works in O(n) time (technically I think O(n + m), with m the number of buckets), because it can neatly distribute the elements into buckets. EDIT: wait, what was the exact wording? Because if the wording was "distributed according to a uniform distribution", then you have a *chance* of clustering, and the worst-case scenario has to take that chance into account. However, if it says "uniformly distributed", then you already know the elements are uniformly spread over the range.

See, now I know what it means for something to be uniformly distributed... But no, I certainly didn't know what "uniform distribution" is. Like, if there were 5 buckets and 5 elements, then it's uniformly distributed if each bucket has an element.

But it didn't say it was uniformly distributed. I can't actually remember the exact wording... Something like "assume a range greater than n, and uniform distribution". See, now this is the problem... lol. Why would he even include the terminology?

My opinion is that it isn't fair for us to be tested on a question where the answer depends on us understanding terminology that we weren't taught.
BluzMan
Russian Federation, 4235 Posts
On December 03 2016 08:14 Nesserev wrote:
For the worst case, the uniform probability distribution of the key values doesn't actually matter. Regardless of the probability distribution, the worst case is always the same, and it's always O(n²). It does make the worst case more improbable, but the worst case is almost always very improbable. That's not the point. However, the uniform probability distribution of the key values is very important for the average case. If it wasn't uniformly distributed, the average case would not be O(n), but would probably be O(m²), where m is the size of the bucket with the most elements.

That looks a lot like nitpicking, kind of like "SHA256 is unsafe because in the worst case you easily find a hash collision". Yeah, you do, but you'd better have a few dozen universes to look in. If you imagine this is about a practical application and ask yourself "I know my 1,000,000 items are uniformly distributed, what is the worst I have to prepare for?", you definitely won't be saying "O(n²)". I'd say the question is very poorly worded, at least in the form presented to us.
ZigguratOfUr
Iraq, 16955 Posts
On December 03 2016 08:43 Nesserev wrote:
No, the question is fine. That's the thing though. Talking about worst case O is very often nitpicking. Quicksort has worst case O(n²)... that doesn't stop anyone from using it.

You still have to keep it in mind. A naive implementation of quicksort hits its worst case on a sorted list, which is less than ideal. And the fact that quicksort has a worst case of O(n²) does in fact keep some people from using it. The C++ standard library's generic sort uses introsort instead of plain quicksort to avoid that worst case.
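To see the sorted-list trap concretely, here is a toy first-element-pivot quicksort in Java. It's just my own sketch for illustration, not anyone's library code: on an already sorted array the pivot is always the minimum, every partition splits into sizes 0 and n-1, and both the comparison count and the recursion depth grow with n, giving the classic O(n²) behaviour (introsort detects that kind of degeneration and switches to heapsort).

[code]
public class NaiveQuicksortSketch {
    // Toy quicksort with the first element as pivot (Lomuto-style partition).
    static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[lo];
        int lt = lo;                                  // boundary of the "< pivot" region
        for (int i = lo + 1; i <= hi; i++) {
            if (a[i] < pivot) {
                lt++;
                int tmp = a[lt]; a[lt] = a[i]; a[i] = tmp;
            }
        }
        int tmp = a[lo]; a[lo] = a[lt]; a[lt] = tmp;  // move pivot into place
        sort(a, lo, lt - 1);
        sort(a, lt + 1, hi);
    }

    public static void main(String[] args) {
        // Keep n modest: on sorted input the recursion depth equals n.
        int[] alreadySorted = new int[2000];
        for (int i = 0; i < alreadySorted.length; i++) alreadySorted[i] = i;
        sort(alreadySorted, 0, alreadySorted.length - 1);   // roughly n^2 / 2 comparisons
        System.out.println("done, first/last: " + alreadySorted[0] + ", " + alreadySorted[1999]);
    }
}
[/code]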
Acrofales
Spain, 17832 Posts
On December 03 2016 08:43 Nesserev wrote:
No, the question is fine. That's the thing though. Talking about worst case O is very often nitpicking. Quicksort has worst case O(n²)... that doesn't stop anyone from using it.

But if your data is uniformly distributed, you are, by definition, not in the worst case of bucket sort, because you are talking, literally, about the shape of your data and asserting there are no clusters. If, however, your data is pulled from some population with a uniform distribution and you don't know the distribution of your actual data, THEN you are right.

Consider the following. Please sort, using bucket sort, the following set:
[3, 4, 1, 2, 5, 7, 10, 8, 9, 6]
This is pretty much a best case for bucket sort. It is uniformly distributed data.

On the other hand, if all we know is that the data comes from a population with a discrete uniform distribution over the integers between 1 and 10, then it is possible (not likely, but possible) that, sampling the population 10 times, we get:
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Clearly this dataset is not uniformly distributed, despite coming from a uniformly distributed population.

Now of course, the latter is what we are almost always talking about, because we are almost always sorting things of the latter type. But there is a lot of ambiguity in how the question, as it was passed on to us, is worded.
BluzMan
Russian Federation, 4235 Posts
On December 03 2016 08:43 Nesserev wrote:
No, the question is fine. That's the thing though. Talking about worst case O is very often nitpicking. Quicksort has worst case O(n²)... that doesn't stop anyone from using it.

Anything that allows ambiguity is not fine, and arguing against ambiguity here is very hard since many people made the same "mistake". Big-O notation was invented to be useful and nothing else. Applying it in a way that doesn't produce useful answers is cargo-cult, bad practice, and teaching it that way is even worse. Worst-case analysis needs to answer the question "what pathological, but realistic, data do I have to be aware of?" In practice "realistic pathological" often equals "malicious", and worst-case analysis is mainly a form of DoS protection. There's no way to produce malicious but uniformly distributed data that would drag bucket sort down to quadratic, because then it wouldn't, by definition, be uniformly distributed. If your distribution guarantee is actually coming from somewhere (a statistics test, for instance), there's no way to DoS your code.
Neshapotamus
United States, 163 Posts
On December 01 2016 02:09 Prillan wrote: Maybe I'm stupid, but how would you implement a weighted graph using the above definition of an adjacency list? I was thinking you can do this: (Java)
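Roughly along these lines (just a sketch, not tested; the class and field names are placeholders): each node maps to a list of (neighbor, weight) edges, so the weight rides along in the adjacency list entry itself.

[code]
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WeightedGraphSketch {
    // Adjacency list where each entry carries the edge weight alongside the neighbor.
    static class Edge {
        final int to;
        final double weight;
        Edge(int to, double weight) { this.to = to; this.weight = weight; }
    }

    private final Map<Integer, List<Edge>> adjacency = new HashMap<>();

    void addEdge(int from, int to, double weight) {
        adjacency.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, weight));
        adjacency.computeIfAbsent(to, k -> new ArrayList<>()).add(new Edge(from, weight)); // undirected
    }

    List<Edge> neighbors(int node) {
        return adjacency.getOrDefault(node, new ArrayList<>());
    }

    public static void main(String[] args) {
        WeightedGraphSketch g = new WeightedGraphSketch();
        g.addEdge(1, 2, 0.5);
        g.addEdge(2, 3, 2.0);
        for (Edge e : g.neighbors(2)) {
            System.out.println("2 -> " + e.to + " (weight " + e.weight + ")");
        }
    }
}
[/code]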
BTW, nice job travis. Seems like you did very well on the exam.
Acrofales
Spain, 17832 Posts
On December 03 2016 09:16 BluzMan wrote:
Anything that allows ambiguity is not fine, and arguing against ambiguity here is very hard since many people made the same "mistake". Big-O notation was invented to be useful and nothing else. Applying it in a way that doesn't produce useful answers is cargo-cult, bad practice, and teaching it that way is even worse. Worst-case analysis needs to answer the question "what pathological, but realistic, data do I have to be aware of?" In practice "realistic pathological" often equals "malicious", and worst-case analysis is mainly a form of DoS protection. There's no way to produce malicious but uniformly distributed data that would drag bucket sort down to quadratic, because then it wouldn't, by definition, be uniformly distributed. If your distribution guarantee is actually coming from somewhere (a statistics test, for instance), there's no way to DoS your code.

To be fair, most problems with sort speed are caused by wrong assumptions about the underlying data. If you are sorting a list of names lexicographically, your first thought is that they're approximately uniformly distributed, right? Minus a few uncommon letters like Q or X, but it'd generally work? Actually, just going by first letter, there's going to be some big clumping. Just look here: http://home.uchicago.edu/~jsfalk/misc/baby_names/

Names are between 3 and 4 times more likely to start with a J or an A than with most other letters, and there's significant variance in the other probabilities too. Moreover, the frequency of starting letters changes over time, with gender, and of course there are regional differences too.

So, if you make a bucket sort for names and expect it to run in O(n) time, you'd (probably) be wrong for any real dataset of names. It probably won't be as slow as O(n^2) either, but rather somewhere in between. And that might trip you up if you make claims about runtime based on such estimates.