The Big Programming Thread - Page 985

Thread Rules
1. This is not a "do my homework for me" thread. If you have specific questions, ask, but don't post an assignment or homework problem and expect an exact solution.
2. No recruiting for your cockamamie projects (you won't replace facebook with 3 dudes you found on the internet and $20)
3. If you can't articulate why a language is bad, don't start slinging shit about it. Just remember that nothing is worse than making CSS IE6 compatible.
4. Use [code] tags to format code blocks.

Deleted User 3420

24492 Posts

December 27 2018 14:38 GMT

#19681

for those working in ML, here's a question out of personal interest:

For larger employers, is gathering/cleaning/organizing data to be used for ML generally done by a different person than whoever creates the ML models themselves? Is there a typical sort of organization in regards to who does what?

Excludos

Norway 8196 Posts

December 27 2018 16:20 GMT

#19682

On December 27 2018 23:38 travis wrote:
for those working in ML, here's a question out of personal interest:

For larger employers, is gathering/cleaning/organizing data to be used for ML generally done by a different person than whoever creates the ML models themselves? Is there a typical sort of organization in regards to who does what?

From my experience, cleaning data to be used for ML is often a full-time job for a team of people who have little to no knowledge of what the people working on the ML models do. But this is anecdotal and could vary from company to company, I guess. Generally, I would assume that for ML to be useful you need such a vast amount of data that taking care of it is a full-time job.

mantequilla

Turkey 779 Posts

December 27 2018 17:47 GMT

#19683

On December 27 2018 18:34 Manit0u wrote:

I guess I need to refresh my knowledge of Java world

Got a small project for my friend which could be a good way to get back in touch with Spring. Is Hibernate still a thing?

Yes, Hibernate is still around, but with JPA annotations on entities instead of Hibernate-specific ones. You can also use MongoDB and annotate entities with Spring Data annotations instead.

Check out Spring Boot (hassle-free Spring) and Liquibase (DB migrations).

Also look at JHipster. I like that project; they are always on the bleeding edge.

Manit0u

Poland 17450 Posts

December 27 2018 20:48 GMT

#19684

On December 28 2018 02:47 mantequilla wrote:

I created a project with Spring Boot, Hibernate, PostgreSQL (will need a relational DB for this) and Flyway for migrations. Will see how it goes. Thankfully I've got a senior Java dev next to me at the office who's worked for big companies like 2 Sigma, so I should get it to work.

mantequilla

Turkey 779 Posts

December 31 2018 18:57 GMT

#19685

For a course project I'm designing a reliable multicast algorithm. Problem is my brain isn't working. So this would be a long and stupid question:

There are N peers, p1, p2...pn. One peer wants to send a message to all other peers. If every peer receives the message, everyone accepts the message. If someone can't get the message, no one accepts it. I don't care about being optimal or fast etc. just delivering the message to everyone is fine.

For simplicity, let's assume there are only two peers. And we are delivering only a single packet, so there's no packet ordering problem etc.

There are two peers, p1 and p2. p1 wants to send a message to everyone (well, just p2 in this example). I build a dictionary called "global state". It includes everyone's point of view of who received the message:

At the very start it's like this, no one heard the message:


global_state = {
	p1: {
		p1: False,
		p2: False
	},
	p2: {
		p1: False,
		p2: False
	}
}

p1 wants to send a message, since p1 is the initiator, it knows that it heard the message (duh). So it marks this knowledge on global state object:


global_state_of_p1 = {
	p1: {
		p1: True,
		p2: False
	},
	p2: {
		p1: False,
		p2: False
	}
}

p1 then sends the above state info to p2. It just continually tries to send in a while loop, since it runs on UDP (course requirement). When p2 finally receives the message:
- p2 knows that p1 heard the message since it's the sender.
- p2 has also heard the message now
So p2's global state info becomes like this:


global_state_of_p2 = {
	p1: {
		p1: True,
		p2: False
	},
	p2: {
		p1: True,  # p2 knows that p1 heard the message since it's the sender.
		p2: True   # p2 has also heard the message now
	}
}

Then p2 asks this question: who doesn't know that I heard the message? Obviously the answer is only p1. If there were more than 2 peers, there would be more peers who didn't know that p2 has heard the message. So p2 sends the above global state info to these peers, in this case p1. There is a loop that continually sends the message to everyone who hasn't heard it (*)

p1 receives the above message. It's coming from p2 so it means p2 has heard the message. p1 marks its state info with this (it just OR's its internal state with incoming message)


global_state_of_p1 = {
	p1: {
		p1: True,
		p2: True
	},
	p2: {
		p1: True,
		p2: True
	}
}
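
A minimal Python sketch of the OR-merge step described above, under the assumption that the state is a nested dict of booleans keyed by peer name. The helper names (`empty_state`, `merge_state`, `receive`) are made up for illustration and are not from the original post.

```python
from copy import deepcopy


def empty_state(peers):
    """Build the N x N 'who knows what' table with every entry False."""
    return {viewer: {subject: False for subject in peers} for viewer in peers}


def merge_state(local, incoming):
    """Return a new table where an entry is True if either side had it True."""
    merged = deepcopy(local)
    for viewer, row in incoming.items():
        for subject, heard in row.items():
            merged[viewer][subject] = merged[viewer][subject] or heard
    return merged


def receive(me, sender, local, incoming):
    """Handle a state packet: OR it in, then record what its arrival proves."""
    merged = merge_state(local, incoming)
    # Getting a packet from `sender` proves that both of us have heard the message.
    merged[me][me] = True
    merged[me][sender] = True
    return merged


peers = ["p1", "p2"]
state_p1 = empty_state(peers)
state_p1["p1"]["p1"] = True  # the initiator has trivially heard its own message

state_p2 = receive("p2", "p1", empty_state(peers), state_p1)
state_p1 = receive("p1", "p2", state_p1, state_p2)
print(state_p1)
```

After the two `receive` calls, `state_p2` and `state_p1` should match the second and third tables shown above. The sketch only covers the bookkeeping; it says nothing about when it is safe to stop resending.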

Now p1 knows that everyone has heard the message. But p2 still doesn't know that p1 knows this. So p1 should somehow say to p2 "yes I know that everyone heard the message, stop bugging me!"

My algorithm fails here. I just can't terminate it because p2 can't be sure p1 got the message. If p1 sends an ack that says "I know everyone has heard the message", it can't be sure p2 has heard the ack... It goes on and on without terminating...

WarSame

Canada 1950 Posts

December 31 2018 20:57 GMT

#19686

You would only need to go 1 layer deep of acks, right?

If p1 sends to p2, p2 acks to p1 (so p1 knows p2 received it, but p2 doesn't know that p1 knows), and p1 acks the ack to p2 (so p2 knows that p1 knows), then you're good. Once that second ack arrives, both sides know that p2 received the message.

In regular networking the header will send the number of frames to the receiver and then number those frames as they're sent so that they can be acked separately. If any are missed they are resent. If any are not acked then they are resent. If every frame is acked then the message has been passed.

Similarly, you could make a message that has an ID and then send updates containing that ID and who has acked the message every time p1 receives an ack. When all receivers have acked you can add p1 to the ackers to signify that all acks have been received.

This does seem inefficient, but maybe that's the price we pay for reliability.

mantequilla

Turkey 779 Posts

December 31 2018 21:07 GMT

#19687

On January 01 2019 05:57 WarSame wrote:
You would only need to go 1 layer deep of acks, right?

If p1 sends to p2, p2 acks to p1 (so p1 knows p2 received it, but p2 doesn't know that p1 knows), and p1 acks the ack to p2 (so p2 knows that p1 knows), then you're good. Once that second ack arrives, both sides know that p2 received the message.

In regular networking the header will send the number of frames to the receiver and then number those frames as they're sent so that they can be acked separately. If any are missed they are resent. If any are not acked then they are resent. If every frame is acked then the message has been passed.

Similarly, you could make a message that has an ID and then send updates containing that ID and who has acked the message every time p1 receives an ack. When all receivers have acked you can add p1 to the ackers to signify that all acks have been received.

This does seem inefficient, but maybe that's the price we pay for reliability.

How can p1 be sure p2 got the ack to its ack?

My brain doesn't seem to get this kind of stuff, sorry.

In a TCP-like scenario: A sends a message to B and A wants to be sure B got the message.

My scenario is: the above, plus B also wants to be sure that A knows that B got the message.

WolfintheSheep

Canada 14127 Posts

December 31 2018 21:31 GMT

#19688

On January 01 2019 06:07 mantequilla wrote:

How can p1 be sure p2 got the ack to its ack?

When P2 sends the ACK to P1, it will expect a response (ACK for the ACK). If it doesn't receive it, it will resend its original ACK.

If P1 only receives 1 ACK, then it can assume both its Message and ACK were received.

You could run into scenarios where P2 is actually completely shut down and thus won't resend any messages, but then you should also be tracking which peers are actually still alive.
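
A rough sketch of the resend-until-acked loop over a lossy link, assuming a simulated channel instead of real UDP sockets. `LossyLink`, `send_until_acked` and the retry cap are all hypothetical names chosen for illustration. It also shows why the last message in any such exchange stays unconfirmed (this is usually called the Two Generals problem): the sender can run out of attempts even though the receiver did get the data.

```python
import random


class LossyLink:
    """Simulated UDP-like link: each datagram is dropped with probability `loss`."""

    def __init__(self, loss, seed):
        self.loss = loss
        self.rng = random.Random(seed)

    def deliver(self):
        """Return True if this datagram survives the trip."""
        return self.rng.random() >= self.loss


def send_until_acked(link, max_attempts):
    """Resend until an ack comes back or the attempts run out.

    Returns (acked, attempts_used, receiver_got_it).
    """
    receiver_got_it = False
    for attempt in range(1, max_attempts + 1):
        if link.deliver():          # the data packet reaches the receiver
            receiver_got_it = True
            if link.deliver():      # the ack makes it back to the sender
                return True, attempt, receiver_got_it
    # The sender gives up. Note that receiver_got_it may still be True here.
    return False, max_attempts, receiver_got_it


link = LossyLink(loss=0.3, seed=1)
print(send_until_acked(link, max_attempts=10))
```

A retry cap (or a liveness check, as suggested above) is what lets the loop terminate; it trades certainty for a bounded wait.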

Manit0u

Poland 17450 Posts

December 31 2018 22:42 GMT

#19689

Why make it overly complex? What you need is a control object and all the peers should communicate through it. This way you only have one place in your system where you have to track the information and you don't have to share state between peers (they don't have to know about other peers, how many there are etc.).

Let's assume you have n peers p and one control service c.

p1 sends message to c, c broadcasts it to all the other peers. Each peer that got the message acknowledges it to c. This makes c the single place where you can check the state of each peer for each message. It can retry etc.

If your peers need to know if everyone received their message they can simply ask c about it. Best way to introduce it would be something like this:

p1 sends to c with a set timeout. c rebroadcasts to other peers and waits for their acks. When all peers send their acks to c it sends an ack to p1. If everything was within the timeout limit it is a success, if not you mark it as failure.

This also gives you more flexibility since you can put retry logic etc. either in c or in each peer (you can even put it everywhere and you use c by default if p doesn't provide it, if it does it overrides c). You can also introduce different logic - if you don't want it to be timeout based you can make peers periodically ask c if their message was delivered to everyone.

This way you avoid this circle of hell where all the peers know about each other and have to constantly check each other (this gets really inefficient and stupid when you get to higher numbers of peers, also whenever you introduce a new peer you'd need to update them all with this information).
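
A small in-memory sketch of the control service `c` described above, assuming timestamps are passed in explicitly so the timeout logic stays easy to follow. The `Coordinator` class and its method names are hypothetical; real peers would talk to it over the network.

```python
class Coordinator:
    """Hypothetical control service: the only place delivery state is tracked."""

    def __init__(self, peer_ids):
        self.peer_ids = set(peer_ids)
        self.messages = {}  # message id -> tracking record

    def submit(self, msg_id, origin, now, timeout):
        """Register a message and return the peers it must be rebroadcast to."""
        targets = self.peer_ids - {origin}
        self.messages[msg_id] = {
            "waiting": set(targets),      # peers that have not acked yet
            "deadline": now + timeout,
            "done_at": None,              # time the last ack arrived
        }
        return targets

    def ack(self, msg_id, peer_id, now):
        """Record an ack from one peer."""
        entry = self.messages[msg_id]
        entry["waiting"].discard(peer_id)
        if not entry["waiting"] and entry["done_at"] is None:
            entry["done_at"] = now

    def status(self, msg_id, now):
        """Answer a peer asking whether its message reached everyone."""
        entry = self.messages[msg_id]
        if entry["done_at"] is not None:
            return "delivered" if entry["done_at"] <= entry["deadline"] else "failed"
        return "pending" if now <= entry["deadline"] else "failed"


c = Coordinator([1, 2, 3])
c.submit("m1", origin=1, now=0.0, timeout=5.0)
c.ack("m1", 2, now=1.0)
c.ack("m1", 3, now=3.0)
print(c.status("m1", now=3.5))
```

The `status` method covers the polling variant mentioned above: a peer can ask at any time instead of waiting for a pushed ack.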

WolfintheSheep

Canada 14127 Posts

December 31 2018 22:45 GMT

#19690

I'm kind of guessing by the use of the word "peer" that the intent is a decentralized system.

But if not, then w/e.

Manit0u

Poland 17450 Posts

December 31 2018 23:25 GMT

#19691

On January 01 2019 07:45 WolfintheSheep wrote:
I'm kind of guessing by the use of the word "peer" that the intent is a decentralized system.

But if not, then w/e.

Well, even in the p2p world you still have trackers and what not

The problem at hand was multicast so I assumed it should work more like messaging queues with fan-out approach.

mantequilla

Turkey 779 Posts

January 01 2019 00:15 GMT

#19692

On January 01 2019 07:42 Manit0u wrote:
Why make it overly complex? What you need is a control object and all the peers should communicate through it. This way you only have one place in your system where you have to track the information and you don't have to share state between peers (they don't have to know about other peers, how many there are etc.).

Let's assume you have n peers p and one control service c.

p1 sends message to c, c broadcasts it to all the other peers. Each peer that got the message acknowledges it to c. This makes c the single place where you can check the state of each peer for each message. It can retry etc.

If your peers need to know if everyone received their message they can simply ask c about it. Best way to introduce it would be something like this:

p1 sends to c with a set timeout. c rebroadcasts to other peers and waits for their acks. When all peers send their acks to c it sends an ack to p1. If everything was within the timeout limit it is a success, if not you mark it as failure.

This also gives you more flexibility since you can put retry logic etc. either in c or in each peer (you can even put it everywhere and you use c by default if p doesn't provide it, if it does it overrides c). You can also introduce different logic - if you don't want it to be timeout based you can make peers periodically ask c if their message was delivered to everyone.

This way you avoid this circle of hell where all the peers know about each other and have to constantly check each other (this gets really inefficient and stupid when you get to higher numbers of peers, also whenever you introduce a new peer you'd need to update them all with this information).

It's a distributed systems course project, so it must be a p2p architecture where all peers are equal, not a centralized server :/ I don't know if I can fit this into the project description though; maybe there's a hole in the definition that would allow a centralized server.

Lmui

Canada 6216 Posts

January 01 2019 00:50 GMT

#19693

I'm thinking about this and have a question.

Why do you need to know the state from every other peer's point of view before accepting the message?
You should just need to know, from each individual peer's point of view, that all peers have read your message before the message can be accepted, which reduces your global state from N^2 entries to just N on each peer.
I'm assuming a message is sent to a random unmessaged peer every S seconds (where S is a random number, 1 < S < 10), since performance doesn't seem to be a requirement.

The initial states if you have 3 peers, with peer 1 receiving the initial message:

global_state_p1 = {
    p1: True,
    p2: False,
    p3: False
}

global_state_p2 = {
    p1: False,
    p2: False,
    p3: False
}

global_state_p3 = {
    p1: False,
    p2: False,
    p3: False
}

P1 first sends P2 the message together with global_state_p1 (how you do reliability is up to you):

global_state_p1 = {
    p1: True,
    p2: True,
    p3: False
}

global_state_p2 = {
    p1: True,
    p2: True,
    p3: False
}

global_state_p3 = {
    p1: False,
    p2: False,
    p3: False
}

P1 (or P2) messages P3 after knowing that P2 got the message and the states are now:

global_state_p1 = {
    p1: True,
    p2: True,
    p3: True
}

global_state_p2 = {
    p1: True,
    p2: True,
    p3: False
}

global_state_p3 = {
    p1: True,
    p2: True,
    p3: True
}

And P1/P3 accept the message because they know all recipients have received it.
At some point, P2 will reach out to P3 since P3 has not yet received the message from its standpoint. P3 will return its global state back to P2, at which point the message will be accepted by P2.
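
The scheme above can be simulated in a few lines. This is only a sketch under some assumptions: each contact is treated as reliable and two-way (both sides end up with the merged view), the per-peer state is kept as a set of peer ids instead of a dict of booleans, and `gossip_round` / `run_gossip` are names invented here.

```python
import random


def gossip_round(known, a, b):
    """Peers a and b exchange views; both end up with the union plus each other."""
    union = known[a] | known[b] | {a, b}
    known[a] = set(union)
    known[b] = set(union)


def run_gossip(peer_ids, origin, seed=0):
    """Gossip until every peer believes every peer has received the message."""
    rng = random.Random(seed)
    everyone = set(peer_ids)
    known = {p: set() for p in peer_ids}   # N entries per peer, not N^2
    known[origin] = {origin}
    contacts = 0
    while any(known[p] != everyone for p in peer_ids):
        # Only peers that hold the message and still see gaps initiate contact.
        senders = [p for p in peer_ids if p in known[p] and known[p] != everyone]
        sender = rng.choice(senders)
        # Pick a peer that, from the sender's point of view, is still unmessaged.
        target = rng.choice(sorted(everyone - known[sender]))
        gossip_round(known, sender, target)
        contacts += 1
    return known, contacts


known, contacts = run_gossip([1, 2, 3], origin=1)
print(known, contacts)
```

Once a peer's set equals the full peer set it can accept the message. The sketch ignores lost packets and crashed peers entirely.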

There's one primary limitation to this but I'll leave it as an exercise for the reader

Manit0u

Poland 17450 Posts

January 01 2019 04:02 GMT

#19694

Well, if performance is not an issue and all you really care about is that the message is delivered to all peers then the simplest way to do it would be what @Lmui described above. I'd only add one other thing there: the origin...


payload = {
    msg: 'our message',
    origin: 1,
    state: [false, true, false, false]
}

This way you pass the message "around the table" and you know where it came from so you can later send it back to the originator as a final ack that everyone got it.

It still doesn't feel very efficient. Personally I'd send the message from the originator to all peers at the same time, waiting for an ack. If only the originator needs to know it's been delivered to everyone then that's it (peers do not broadcast if they're not the originator). If all peers need to know about the status (if all the others also got it) then there's a next step involved, which is the originator sending a final ack to all peers once it has got all the acks it needed.

You're sending more messages (because of back-and-forth communication) but overall it is way more efficient since it happens all at once in parallel (which is what you really want from a distributed system).
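
One way the fan-out plus final ack could look in Python, as a sketch only: in-process method calls stand in for network sends, a thread pool stands in for sending in parallel, and the `Peer` class and `multicast` function are hypothetical. The final-ack step here cannot be lost, which would not hold over real UDP.

```python
from concurrent.futures import ThreadPoolExecutor


class Peer:
    """Hypothetical peer that buffers a message until told everyone has it."""

    def __init__(self, peer_id, broken=False):
        self.peer_id = peer_id
        self.broken = broken
        self.inbox = None      # received but not yet accepted
        self.accepted = None   # accepted after the final ack

    def receive(self, message):
        """Step 1: buffer the message and ack it (a broken peer never acks)."""
        if self.broken:
            return False
        self.inbox = message
        return True

    def commit(self):
        """Step 2: the originator confirmed that all peers acked."""
        self.accepted = self.inbox


def multicast(origin, others, message):
    """Send to all peers at once; accept everywhere only if every peer acked."""
    with ThreadPoolExecutor(max_workers=max(1, len(others))) as pool:
        acks = list(pool.map(lambda peer: peer.receive(message), others))
    if not all(acks):
        return False           # someone missed it, so nobody accepts
    origin.inbox = message
    for peer in [origin, *others]:
        peer.commit()          # final ack from the originator
    return True


peers = [Peer(i) for i in range(1, 5)]
print(multicast(peers[0], peers[1:], "Hello"))
```

If any peer fails to ack, `multicast` returns `False` and no peer's `accepted` field is set, which matches the all-or-nothing requirement from the original question.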

mantequilla

Turkey 779 Posts

January 01 2019 16:40 GMT

#19695

Thanks very much for the answers guys, let me think about them for a while.

Luckily the project deadline has been extended by a few days. If I can work out a working algorithm, then I will need to plot some graphs, write some reports about it, etc.

Manit0u

Poland 17450 Posts

January 01 2019 19:26 GMT

#19696

Simple example


module DistributedSystem
  Payload = Struct.new(:message, :origin)

  class Peer
    attr_reader :id, :payload, :others
    attr_accessor :broken

    def initialize(id, broken = false)
      @id = id
      @broken = broken
    end

    def process(payload)
      puts "Peer ##{id} processing payload from ##{payload.origin || 'external'}..."

      if payload_external?(payload)
        puts 'Rebroadcasting to other peers'

        payload.origin = id

        rebroadcast(payload).merge({ id => true }).sort.to_h
      else
        puts "Peer ##{id}#{broken ? ' not' : ''} acknowledging receival from ##{payload.origin}"

        acknowledge(payload)
      end
    end

    def others=(peers)
      @others = peers.select { |p| p.id != id }
    end

    private

    def payload_external?(payload)
      payload.origin.nil?
    end

    def rebroadcast(payload)
      others.each_with_object({}) do |peer, results|
        results[peer.id] = peer.process(payload)
      end
    end

    def acknowledge(payload)
      !broken
    end
  end
end

payload = DistributedSystem::Payload.new('Hello')
peers = (1..5).to_a.map { |id| DistributedSystem::Peer.new(id) }
peers.each { |peer| peer.others = peers }

random_peer = peers.sample

puts random_peer.process(payload)

puts '*' * 50

# Mark every even-numbered peer as broken (the peers are modified in place).
peers.each { |peer| peer.broken = true if peer.id.even? }

payload = DistributedSystem::Payload.new('Hello')

random_working_peer = peers.reject(&:broken).sample
puts random_working_peer.process(payload)


Peer #2 processing payload from #external...
Rebroadcasting to other peers
Peer #1 processing payload from #2...
Peer #1 acknowledging receival from #2
Peer #3 processing payload from #2...
Peer #3 acknowledging receival from #2
Peer #4 processing payload from #2...
Peer #4 acknowledging receival from #2
Peer #5 processing payload from #2...
Peer #5 acknowledging receival from #2
{1=>true, 2=>true, 3=>true, 4=>true, 5=>true}
**************************************************
Peer #5 processing payload from #external...
Rebroadcasting to other peers
Peer #1 processing payload from #5...
Peer #1 acknowledging receival from #5
Peer #2 processing payload from #5...
Peer #2 not acknowledging receival from #5
Peer #3 processing payload from #5...
Peer #3 acknowledging receival from #5
Peer #4 processing payload from #5...
Peer #4 not acknowledging receival from #5
{1=>true, 2=>false, 3=>true, 4=>false, 5=>true}

Is this anything like what you are after?

Manit0u

Poland 17450 Posts

January 04 2019 14:36 GMT

#19697

Some nice reads on the multicast subject (and messaging in distributed systems in general):
http://250bpm.com/blog:17
http://250bpm.com/blog:5
http://250bpm.com/blog:20

And an interesting note on (supposed) superiority of C vs C++ from a perspective of maintaining software for 5 years:
http://250bpm.com/blog:4
http://250bpm.com/blog:8

Deleted User 3420

24492 Posts

January 09 2019 01:20 GMT

#19698

alright, how hard is the following to do? (in python)
My HTTP knowledge is pretty low... and my knowledge of security/authentication is even lower.

I have a ML model to predict outcomes of events (betting online).
I want to automate it.

I use a betting website that is fairly complicated GUI-wise. I know when bets come up, though. I know which fields to fill in in my browser, and I imagine I can examine the HTTP in Chrome or something?

So, at specific times, I want to run my model, see the results, and if I like the results I want to place the bet automatically with a python script, rather than having to manually do it in my browser. How hard is this? Can I learn how to do this and get it done in 1 day? (tomorrow). keep in mind that it's with real money so I also need to learn whatever is required to open a secure session with authentication and whatever.

Acrofales

Spain 18132 Posts

January 09 2019 07:49 GMT

#19699

On January 09 2019 10:20 travis wrote:
alright, how hard is the following to do? (in python)
My HTTP knowledge is pretty low... and my knowledge of security/authentication is even lower.

I have a ML model to predict outcomes of events (betting online).
I want to automate it.

I use a betting website that is fairly complicated GUI-wise. I know when bets come up, though. I know which fields to fill in in my browser, and I imagine I can examine the HTTP in Chrome or something?

So, at specific times, I want to run my model, see the results, and if I like the results I want to place the bet automatically with a python script, rather than having to manually do it in my browser. How hard is this? Can I learn how to do this and get it done in 1 day? (tomorrow). keep in mind that it's with real money so I also need to learn whatever is required to open a secure session with authentication and whatever.

Use Selenium and it should be fairly easy (maybe a bit more than a day if you're completely clueless about HTML and JS, but not very long). You can also check whether the website you're interested in has a REST API, in which case you can use that and just skip the whole GUI part. As for SSL, Python makes that extremely simple. With Selenium, the browser will take care of it. If connecting directly to an API, just use an HTTPS connection (for example http.client.HTTPSConnection) instead of plain HTTP.
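
For the direct-API route, a sketch using only the standard library might look like the following. Everything site-specific is an assumption: the host `betting.example`, the `/api/bets` path, the JSON field names and the bearer-token header are placeholders, since the actual site's API (if it has one) isn't known here.

```python
import http.client
import json


def place_bet(conn, token, event_id, stake):
    """POST a bet to a hypothetical JSON endpoint and return (status, body)."""
    body = json.dumps({"event_id": event_id, "stake": stake})  # assumed fields
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",  # assumed auth scheme
    }
    conn.request("POST", "/api/bets", body=body, headers=headers)  # assumed path
    response = conn.getresponse()
    return response.status, response.read().decode("utf-8")


# Example wiring (not executed here; host and token are placeholders):
# conn = http.client.HTTPSConnection("betting.example", timeout=10)
# status, text = place_bet(conn, token="...", event_id=123, stake=5.0)
# conn.close()
```

`place_bet` takes the connection as a parameter so that it can be exercised against a stub object before any real money is involved.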

Silvanel

Poland 4733 Posts

January 09 2019 10:14 GMT

#19700

It's fairly simple. You should be able to do it in one day, provided the webpage is well documented. If the elements on the page do not have distinctive names/descriptions and are loaded in random order, then it becomes quite complicated, but in most cases it is super straightforward.
