The Big Programming Thread - Page 932

Prev 1 930 931 932 933 934 1031 Next

Thread Rules
1. This is not a "do my homework for me" thread. If you have specific questions, ask, but don't post an assignment or homework problem and expect an exact solution.
2. No recruiting for your cockamamie projects (you won't replace facebook with 3 dudes you found on the internet and $20)
3. If you can't articulate why a language is bad, don't start slinging shit about it. Just remember that nothing is worse than making CSS IE6 compatible.
4. Use [code] tags to format code blocks.

WarSame

Canada1950 Posts

December 24 2017 17:59 GMT

#18621

Thanks, that's exactly what I was looking for! I was having trouble finding that. It's also great that I can use YAML.

Thaniri

1264 Posts

December 24 2017 18:24 GMT

#18622

I just need to rant a bit since you mentioned it...

Fuck YAML. Every new software these days uses YAML config files and I have to write puppet code to feed settings into templates that will eventually be translated into YAML format cause standardization. At least its not XML.

Meaningful whitespace makes me want to do hyperbolic things that I wouldn't actually do. YAML isn't human readable. JSON is. Give me braces. I don't care about your 2 space rule. I'll use 500 spaces if I want in JSON. Pro tip: I won't since JSON can be minified. But minified JSON is unreadable to humans? Then use jq to unminify it if you need to.

For large scale parsing, YAML is actually slower than JSON. If I'm pushing code out to a $LARGE_NUMBER of nodes it actually adds up.

What do I get in exchange? Comments? RTFM, YAML isn't meant for programming. Multi parent inheritance? I thought the big meme these days was that inheritance is Satan reincarnate.

edit: Merry Christmas everyone! Wish everyone the best on their projects in the new year.

TMG26

Portugal2017 Posts

December 24 2017 21:59 GMT

#18623

Yaml should be only used in config files because it's pretty. But even that it's a strech because parsers can't give propper error messages.

Deleted User 3420

24492 Posts

December 24 2017 23:46 GMT

#18624

Let's saying that I am making very long lists(possibly length of millions) from pairs of pseudo-random values, inserting each element one a time. In the end, I want lists without duplicates of the first value in the pairs. Which duplicate is removed is based on the value of the 2nd element. So, I can't use a set because it won't recognize that (8,31) and (8,224) are duplicates and that (8,31) should be removed because 224 > 31. I also don't think a hash is practical because, while I said the values are pseudo random, I actually will only ever get one duplicate for each respective pair. So... I would be making a huge hash table.. when memory is already becoming a problem. [quick side note, if anyone *did* know how to use a set or set like structure to accomplish this, I would be in love with them]

Anyways back to the question
Is it faster to:

a.) insert into the list in order(by element 1). So, for each inserted element, I can do a O(logk) insertion where k is the number of elements currently in the list, putting it into a sorted position. Then, if there is a duplicate, It will be very nearby, so I can check nearby positions for duplicates. I could make it even faster by doing something like maintaining minimum, maximum, mean of the list or w/e. But that's getting pretty fancy.

b.) just insert into the list with no regard for order and then sort it at the end(maybe radix sort).

So, speed is more of a concern than memory, but both are important. I know a) will be better for memory.. but I am unsure if it would be better for speed. Any thoughts/ideas?

edit: a final edit. I don't think bucket sort is feasible because I don't know about the distribution of the values.

tofucake

Hyrule19029 Posts

December 25 2017 01:36 GMT

#18625

JSON is for data transport, not human oriented configuration

YAML is good at what it does

Why do you have to write any code to interact with yaml? Use one of the bajillion standard implementations

Manit0u

Poland17243 Posts

December 25 2017 03:15 GMT

#18626

On December 25 2017 10:36 tofucake wrote:
JSON is for data transport, not human oriented configuration

YAML is good at what it does

Why do you have to write any code to interact with yaml? Use one of the bajillion standard implementations

Seconded. YAML.parse (or whatever equivalent) will give you a hash, which makes it no different from the usual JSON.parse.

What I like about YAML for config files is that it's readable, concise and has some cool features (like anchors and aliases to dry up your config).


my_node: &default
  some_key: 'some val'
  some_other_key: 'some other var'

my_other_node:
  <<: *default
  some_different_key: 1

If your nodes share common subnodes you can easily alias and reference them. Very neat if you ask me,

phar

United States1080 Posts

December 25 2017 07:55 GMT

#18627

On December 25 2017 08:46 travis wrote:
Let's saying that I am making very long lists(possibly length of millions) from pairs of pseudo-random values, inserting each element one a time. In the end, I want lists without duplicates of the first value in the pairs. Which duplicate is removed is based on the value of the 2nd element. So, I can't use a set because it won't recognize that (8,31) and (8,224) are duplicates and that (8,31) should be removed because 224 > 31. I also don't think a hash is practical because, while I said the values are pseudo random, I actually will only ever get one duplicate for each respective pair. So... I would be making a huge hash table.. when memory is already becoming a problem. [quick side note, if anyone *did* know how to use a set or set like structure to accomplish this, I would be in love with them]

Anyways back to the question
Is it faster to:

a.) insert into the list in order(by element 1). So, for each inserted element, I can do a O(logk) insertion where k is the number of elements currently in the list, putting it into a sorted position. Then, if there is a duplicate, It will be very nearby, so I can check nearby positions for duplicates. I could make it even faster by doing something like maintaining minimum, maximum, mean of the list or w/e. But that's getting pretty fancy.

b.) just insert into the list with no regard for order and then sort it at the end(maybe radix sort).

So, speed is more of a concern than memory, but both are important. I know a) will be better for memory.. but I am unsure if it would be better for speed. Any thoughts/ideas?

edit: a final edit. I don't think bucket sort is feasible because I don't know about the distribution of the values.

Does input order have to affect output order, or can input be unstable? If input can be unstable this is trivially accomplished in linear space and constant time for insertion by using a list and map.

Your concerns on the size of the hash table sound unreasonable for input ~millions.

dekibeki

Australia34 Posts

December 26 2017 00:04 GMT

#18628

On December 25 2017 16:55 phar wrote:

Show nested quote +

Even if input order has to be respected, you can do the same thing.

Each Pair becomes a node (<A,B>) in an intrusive doubly linked list.
The map maps A -> <A,B>.


For each pair x:
  if x.a is in the map:
    compare x.b with what is in the map
    if x.b is greater:
      unlink the existing node
      add x to the end of the list
      set the found record in the map to x
  else /*x.a is not in the map*/
    add x to the end of the list
    add x to the map

C++ example code
+ Show Spoiler +


#include <unordered_set>

using NumberType = ?;

struct Node {
  Node* prev = nullptr;
  Nod* next = nullptr;
  NumberType a;
  NumberType b;

  void remove() {
    prev->next = next;
    next->prev = prev;
  }
};

struct IntrusiveList {
  Node first;
  Node last;

  InstrusiveList() {
    first.next = &last;
    last.prev = &first;
  }

  void push_back(Node* n) {
    n->prev = last.prev;
    n->next = &last;
    last.prev->next = n;
    last.prev = n;
  }
}

std::unordered_map<NumberType, Node*> nodeMap;
IntrusiveList list;

void add_node(Node* node) {
  auto nodeEmplace = nodeMap.emplace(node->a,node);
  if(!nodeEmplace.second) {
    Node*& old = nodeEmplace.first.second;
    if(old->b < node->b) {
      old->remove();
      list.push_back(node);
      old = node;
    }
  } else {
    list.push_back(node);
  }
}

You can do some fancier stuff to save memory, but that's the basic idea.

Edit: For production, use the boost intrusive list, instead of the hand rolled one

TMG26

Portugal2017 Posts

December 26 2017 10:05 GMT

#18629

Literally use a heap for your problem!
In order insert faster than a list! And when you detect the duplicate if it's the new value, don't save it. It it's the one already there, swap it.

You can also implement an heap/tree in an array instead of dynamic memory allocation . Especilly usefull for huge ammounts of data that you already know how much size you will need. No need to storing pointers if it is all continuous memory in an array.

Lists are terribly slow for your problem. Discard them.
You can also use a hash table! It's even faster than the heap, if you don't need them sorted at the end. On the jash table you can use the 1st value for the hash code.

Implementing the data structure for your poblem in C would be faster. But it's really simple using the generics collections in java by using you own comparator. But the overhead is bigger, and it seems that you are wrring shit around the promblem instead of actually solving he problem.

You can also use haskell for a really elegant solution, but I don't remember how it would be stored in memory, and since the size is huge you should just go with C so you understand the problem and the common data structures better with performance/space. Haskell will only give knowledge in the structures not performance. Using libs you will only be writing odd glue.

So: pick up C. Implement a specific Heap or Hash table depending if you need sorting or not. Use continuous memory because the size is huge, and you most likely know it first hand so you only need to allocte it once!

TMG26

Portugal2017 Posts

December 26 2017 10:28 GMT

#18630

Thinking about it again, you problem is trivial using an HashMap in java if you don't need sorting. First element is key. Operations are contants. You can just check if it exists before inserting/swaping/no-op. The problem is hash code values and the pseudo random generation might give collisions. It depends on implementation. But the algoritm woth hashmap (or equivelent in other langs) is so simple that you can quickly test it. Just remember to create the map with a big enough size. Because when it's full, expanding is a huge cost, and it's not aparent in the code.

But as I said, using a lib looks like writting shit arpund the problem rather than solving it.

phar

United States1080 Posts

December 26 2017 17:33 GMT

#18631

On December 26 2017 09:04 dekibeki wrote:

Show nested quote +

On December 25 2017 16:55 phar wrote:

Even if input order has to be respected, you can do the same thing.

Each Pair becomes a node (<A,B>) in an intrusive doubly linked list.
The map maps A -> <A,B>.


For each pair x:
  if x.a is in the map:
    compare x.b with what is in the map
    if x.b is greater:
      unlink the existing node
      add x to the end of the list
      set the found record in the map to x
  else /*x.a is not in the map*/
    add x to the end of the list
    add x to the map

C++ example code
+ Show Spoiler +


#include <unordered_set>

using NumberType = ?;

struct Node {
  Node* prev = nullptr;
  Nod* next = nullptr;
  NumberType a;
  NumberType b;

  void remove() {
    prev->next = next;
    next->prev = prev;
  }
};

struct IntrusiveList {
  Node first;
  Node last;

  InstrusiveList() {
    first.next = &last;
    last.prev = &first;
  }

  void push_back(Node* n) {
    n->prev = last.prev;
    n->next = &last;
    last.prev->next = n;
    last.prev = n;
  }
}

std::unordered_map<NumberType, Node*> nodeMap;
IntrusiveList list;

void add_node(Node* node) {
  auto nodeEmplace = nodeMap.emplace(node->a,node);
  if(!nodeEmplace.second) {
    Node*& old = nodeEmplace.first.second;
    if(old->b < node->b) {
      old->remove();
      list.push_back(node);
      old = node;
    }
  } else {
    list.push_back(node);
  }
}

You can do some fancier stuff to save memory, but that's the basic idea.

Edit: For production, use the boost intrusive list, instead of the hand rolled one

Yea it still works, I just put the caution up front because OP would have to be clear on what ordering. The replacement may need to go in the middle not the end... so that wasn't super clear from the question given.

There's also the usual caution about using a linked list: if you're passing that back to a caller who is expecting constant time random access, they're gonna have a bad time, so be explicit about it (or consider returning an iter instead).

graNite

Germany4434 Posts

December 27 2017 15:28 GMT

#18632

Please review my current try of creating a class for chess pieces. I not only want to be able to play a game of chess with this but also solve chess puzzles, thats why I want a flexible board size and pieces that can placed and created by hand.

I put many comments in the code and put the modules that were needed together so you can read it easier. It should still work, print the files of the `boardmatrix` after the initialization of all pieces and give some information of the black kingside rook.
Things that I want to change:

1) `reset` function gets a `start` option which initializes all pieces.

2) I dont know how to place the pieces while creating them, thats why the placement is in the `__init__` and as a seperate place function to move them.

3) I already have functions like `kingMove`, `queenMove`, ... . Where do I put them? In the `elif` parts where I decide which kind of piece the new one is?
I feel like this would make the class too big and hard to keep track of.

+ Show Spoiler +


    # I put all modules in one file for the review and changed the variables from modulename.name to name
    # board.py
    # the board and game settings are supposed to be here
    from string import ascii_uppercase
    
    # I want the dimensions of the board to be able to be set up to 15*15
    width = 8
    height = 8
    
    # the leading zeros are there to calculate more intuitive: files[2] = B, not C.
    files = "0" + ascii_uppercase[:width]
    ranks = range(0, height + 1)
    
    # status.py
    # contains the boardmatrix which holds the information of how the board currently looks
    
    boardmatrix = [["  " for x in range(1, height + 1)]
                   for y in range(1, width + 1)]
    
    
    def reset(setting):
            # function to control the boardmatrix. i had to use the "global" to be able to change
            # the matrix with the function
        global boardmatrix
        if setting == "empty":
            boardmatrix = [["  " for x in range(1, height + 1)]
                           for x in range(1, width + 1)]
        # elif setting == "start":
            # initialize all pieces
        else:
            print("Something failed.")
    
    # classpiece.py
    # class that holds all information about a piece: color, piece, position, its short handle,
    # whether rook or king can be used to castle etc.
    #
    
    
    class piece:
        # Summons piece of given color and position
        def __init__(self, color, piece, position):
    
            # color
            if color == "w":
                self.color = "white"
            elif color == "b":
                self.color = "black"
            else:
                self.color = "none"
    
            # piece
            if piece == "p":
                self.piece = "pawn"
                self.short = color + "P"
            elif piece == "n":
                self.piece = "knight"
                self.short = color + "N"
            elif piece == "b":
                self.piece = "bishop"
                self.short = color + "B"
            elif piece == "r":
                self.piece = "rook"
                self.short = color + "R"
                self.canCastle = True
            elif piece == "q":
                self.piece = "queen"
                self.short = color + "Q"
            elif piece == "k":
                self.piece = "king"
                self.short = color + "K"
                self.canCastle = True
    
            # position and place
            self.position = position.upper()
            file = files.index(((self.position[0]).upper()))
            rank = self.position[1:]
            boardmatrix[int(file) - 1][int(rank) - 1] = self.short
    
        def place(self, position=None):
            # remove old
            oldfile = files.index(((self.position[0]).upper()))
            oldrank = self.position[1:]
            boardmatrix[int(oldfile) - 1][int(oldrank) - 1] = "  "
            # place new
            position = position or self.position
            file = files.index(((position[0]).upper()))
            rank = position[1:]
            boardmatrix[int(file) - 1][int(rank) - 1] = self.short
            # update position
            self.position = position
    
    
    # i gave each piece a unique name, i dont know how else to initialize and later do stuff with it
    # w for white, b for black
    wApawn = piece("w", "p", "a2")
    wBpawn = piece("w", "p", "b2")
    wCpawn = piece("w", "p", "c2")
    wDpawn = piece("w", "p", "d2")
    wEpawn = piece("w", "p", "e2")
    wFpawn = piece("w", "p", "f2")
    wGpawn = piece("w", "p", "g2")
    wHpawn = piece("w", "p", "h2")
    
    bApawn = piece("b", "p", "a7")
    bBpawn = piece("b", "p", "b7")
    bCpawn = piece("b", "p", "c7")
    bDpawn = piece("b", "p", "d7")
    bEpawn = piece("b", "p", "e7")
    bFpawn = piece("b", "p", "f7")
    bGpawn = piece("b", "p", "g7")
    bHpawn = piece("b", "p", "h7")
    
    # Q  for queenside, K for Kingside, W for white square, B for black squre
    wQrook = piece("w", "r", "a1")
    wQknight = piece("w", "n", "b1")
    wBbishop = piece("w", "b", "c1")
    wQueen = piece("w", "q", "d1")
    wKing = piece("w", "k", "e1")
    wWbishop = piece("w", "b", "f1")
    wKknight = piece("w", "n", "g1")
    wKrook = piece("w", "r", "h1")
    
    bQrook = piece("b", "r", "a8")
    bQknight = piece("b", "n", "b8")
    bBbishop = piece("b", "b", "c8")
    bQueen = piece("b", "q", "d8")
    bKing = piece("b", "k", "e8")
    bWbishop = piece("b", "b", "f8")
    bKknight = piece("b", "n", "g8")
    bKrook = piece("b", "r", "h8")
    
    for line in boardmatrix:
        print(line)
    print(bKrook.position)
    print(bKrook.color)
    print(bKrook.canCastle)

WarSame

Canada1950 Posts

December 27 2017 16:36 GMT

#18633

I would recommend instead that you put the location of your pieces inside of a Board class instead of the piece itself. Then you can just call print on that Board to make it print out nicely and you can call print on certain positions, etc.

I would also recommend changing canCastle to hasMoved and making that a property for every piece.

Also, make the colour an enumeration(i.e. 0=white, 1=black, 2=purple...) so that you can mess around with colour properties if you want.

place() should be called movePiece() and should be a method of the Board. You could then decide where the piece can move based off of the type of piece.

Generally if you're feeding options into a function it's a sign that things are going wrong. Have an initializeBoard() that creates a new Board and calls a fillBoard() function that has creates new pieces and deploys them according to the default layout.

This should clear up your code a lot.

If you're interested read Clean Code to see how to write neater and more legible code. Some of your functions are not very clear and that could help it.

WarSame

Canada1950 Posts

December 31 2017 00:19 GMT

#18634

I've just started doing automatic unit testing with JUnit on Android and my god is it ever nice to have that. I now know that my functions are correctly doing certain things, that updates broke some, that some didn't do what I thought or were throwing errors in certain situations, etc. I've had to refactor my code and it is so much nicer now.

sc-darkness

856 Posts

December 31 2017 00:35 GMT

#18635

It's late night, but I find this string matching algorithm hard to follow. I've got some of its core ideas but still no clue what each value from KMP table means. Maybe it's name of variables that make it hard to follow, but I find examples easier. Or maybe I'm not smart enough.

Still, I think I've implemented more or less the algorithm in C++ just by following steps but I still don't understand it enough.

Here is the algorithm: https://en.wikipedia.org/wiki/Knuth–Morris–Pratt_algorithm

Edit: It seems to be about prefixes. I've found this question: https://stackoverflow.com/questions/13792118/kmp-prefix-table
Still, not the most readable algorithm I've encountered.

emperorchampion

Canada9496 Posts

December 31 2017 23:06 GMT

#18636

Happy New Years you nerds

WarSame

Canada1950 Posts

January 01 2018 00:42 GMT

#18637

There should be a nerd new years when Unix time hits all 0s except the leading digit.

Blisse

Canada3710 Posts

January 01 2018 01:03 GMT

#18638

On December 31 2017 09:35 sc-darkness wrote:
It's late night, but I find this string matching algorithm hard to follow. I've got some of its core ideas but still no clue what each value from KMP table means. Maybe it's name of variables that make it hard to follow, but I find examples easier. Or maybe I'm not smart enough.

Still, I think I've implemented more or less the algorithm in C++ just by following steps but I still don't understand it enough.

Here is the algorithm: https://en.wikipedia.org/wiki/Knuth–Morris–Pratt_algorithm

Edit: It seems to be about prefixes. I've found this question: https://stackoverflow.com/questions/13792118/kmp-prefix-table
Still, not the most readable algorithm I've encountered.

I usually google search for a bunch of sources until I find one with the language that makes sense. can be videos, blogs, coding challenges, anything. wiki's not my favourite source for algos, only algo i use wiki is mergesort.

knp is really straightforward. if you're looking for a pattern in a text, once you've seen some part of the pattern, how many characters can you skip?


no      : 12345678901234567
text    : AABAABAABAACAABAA
pattern : AABAACAABAA

text[0] -> pattern[0], good
text[1] -> pattern[1], good
text[2] -> pattern[2], good
text[3] -> pattern[3], good
text[4] -> pattern[4], good
text[5] -> pattern[5], bad

naively, we would go, text[1] -> pattern[0], text[2] -> pattern[1]. can we do better?

notice that we've seen a repeated AA pattern already. in fact, pattern[0..1] == pattern[3..4].

so instead of checking text[5] == pattern[5], we can retry with text[5] == pattern[2], because we know pattern[0..1] == pattern[3..4].

text[5] -> pattern[2], good
text[6] -> pattern[3], good
text[7] -> pattern[4], good
text[8] -> pattern[5], bad

etc.

to determine where we jump to, we build a table of the longest proper prefix which is also a suffix


no      : 123456789012345
pattern : AABAACAABAA
table   : 01012012345

table[0] -> pattern[0..0] -> A -> 0, 'proper prefix' excludes the entire word
table[1] -> pattern[0..1] -> AA -> 1, A is a prefix and a suffix
table[2] -> pattern[0..2] -> AAB -> 0
table[3] -> pattern[0..3] -> AABA -> 1, A is a prefix and a suffix
table[4] -> pattern[0..4] -> AABAA -> 2, AA is a prefix and a suffix
...

table[9] -> pattern[0..9] -> AABAACAABA -> 4, AABA is the longest prefix and suffix

happy new year!

sc-darkness

856 Posts

January 02 2018 00:49 GMT

#18639

Thank you Blisse! I appreciate your help!

Does anyone know what happens to memory in the following scenario?


1. Check current memory usage of application (e.g. Task Manager)
2. Allocate an object.
3. Deallocate object.
4. Check if memory is the same as step 1.

In my case, memory wasn't the same as it was initially. Slightly higher (e.g. 0.5-1 MB but not the same).

Language I use is C++. Steps are as described above. The only thing I did for step 2 and 3 was to use std::list<std::string> with 1 million elements, then I cleared list.

Is Task Manager not reliable? Is OS doing something funny? I noticed the same effect when I used my own linked list implementation (it's only for learning purposes, I know std::list is better).

Edit: It turns out 'delete' doesn't really release memory immediately. Maybe it's up to the operating system.

Excludos

Norway8042 Posts

January 02 2018 07:42 GMT

#18640

On January 02 2018 09:49 sc-darkness wrote:
Thank you Blisse! I appreciate your help!

Does anyone know what happens to memory in the following scenario?


1. Check current memory usage of application (e.g. Task Manager)
2. Allocate an object.
3. Deallocate object.
4. Check if memory is the same as step 1.

In my case, memory wasn't the same as it was initially. Slightly higher (e.g. 0.5-1 MB but not the same).

Language I use is C++. Steps are as described above. The only thing I did for step 2 and 3 was to use std::list<std::string> with 1 million elements, then I cleared list.

Is Task Manager not reliable? Is OS doing something funny? I noticed the same effect when I used my own linked list implementation (it's only for learning purposes, I know std::list is better).

Edit: It turns out 'delete' doesn't really release memory immediately. Maybe it's up to the operating system.

That depends entirely on the language you're using. C and C++ should delete it immediately (unless you're using something like Qt which runs its own "delete later"). Meanwhile Java runs garbage collection, which means it deletes the objects whenever it feels like it.

Prev 1 930 931 932 933 934 1031 Next

Please or register to reply.

The Big Programming Thread - Page 932

Completed

Ongoing

Upcoming