|
Thread Rules 1. This is not a "do my homework for me" thread. If you have specific questions, ask, but don't post an assignment or homework problem and expect an exact solution. 2. No recruiting for your cockamamie projects (you won't replace facebook with 3 dudes you found on the internet and $20) 3. If you can't articulate why a language is bad, don't start slinging shit about it. Just remember that nothing is worse than making CSS IE6 compatible. 4. Use [code] tags to format code blocks. |
On May 15 2019 01:08 enigmaticcam wrote:
I've got this MS SQL script that I can't figure out. See below (generalized):
merge TableTarget TGT
using (
    select * from TableSource
) as SRC
on TGT.ColumnA = SRC.ColumnA
when matched and exists (
    select SRC.ColumnB, SRC.ColumnC
    except
    select TGT.ColumnB, TGT.ColumnC
) then update ...
when not matched by target then insert ...
I don't know what "when matched and exists" means, nor what the "except" statement means. I know what they mean generally speaking, and I know what a merge statement is, but in this context I can't figure out what they're trying to do. Any help would be appreciated.
Just read the docs?
WHEN MATCHED THEN <merge_matched> Specifies that all rows of target_table, which match the rows returned by <table_source> ON <merge_search_condition>, and satisfy any additional search condition, are either updated or deleted according to the <merge_matched> clause.
The MERGE statement can have, at most, two WHEN MATCHED clauses. If two clauses are specified, the first clause must be accompanied by an AND <search_condition> clause. For any given row, the second WHEN MATCHED clause is only applied if the first isn't. If there are two WHEN MATCHED clauses, one must specify an UPDATE action and one must specify a DELETE action. When UPDATE is specified in the <merge_matched> clause, and more than one row of <table_source> matches a row in target_table based on <merge_search_condition>, SQL Server returns an error. The MERGE statement can't update the same row more than once, or update and delete the same row.
WHEN NOT MATCHED [ BY TARGET ] THEN <merge_not_matched> Specifies that a row is inserted into target_table for every row returned by <table_source> ON <merge_search_condition> that doesn't match a row in target_table, but satisfies an additional search condition, if present. The values to insert are specified by the <merge_not_matched> clause. The MERGE statement can have only one WHEN NOT MATCHED clause.
Basically, this inserts rows into the TARGET_TABLE if they exist in the SOURCE_TABLE but not in the TARGET_TABLE or updates rows in TARGET_TABLE if SOURCE_TABLE has different data from TARGET_TABLE in specific columns.
|
I already know that. I know the purpose of the merge statement, and I've used it before. But typically I've used it like this:
merge Target
using Source
on join Source to Target
when matched then update
when not matched then insert
But the SQL I quoted earlier has an extra step in there and I don't know what it's doing:
merge Target
using Source
on join Source to Target
when matched and exists (something except something) then update
when not matched then insert
How is "when matched and exists ... then update..." different from "when matched then update..."?
|
On May 13 2019 20:59 travis wrote:
I ask because I am trying to understand how alphastar would be using an LSTM, because they must be using it for either the entire sequence of the game or for a very long sequence, because otherwise it seems useless. But... I don't really understand how.
I can't answer all the other questions because I am not too familiar with LSTMs. I can give some input on where AlphaStar would use RNNs/LSTMs though:
One problem in reinforcement learning is hidden, or rather partially observable, features of the state. Let's look at disruptor shots. If you give the agent a frame (=state) of the game, it might be able to identify a disruptor shot. But the trajectory and speed of the shot are not observable from a single state. You need a temporally linked sequence of states to extract that information. RNNs are one approach to building a sequence of states from which these partially observable features can be learned.
In easier tasks like Atari games the agent would just repeat the last action X times per step and learn from the X states returned to extract partially observable features. LSTMs were introduced to extend this method to more complex environments where a longer memory than the past X frames is required. But this is where my understanding hits its limit, I am still stuck below the Atari level of agent building ...
Usage of LSTMs might go well beyond that for AlphaStar, but the above problem is what LSTMs were introduced to solve in reinforcement learning.
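A rough Python sketch of the frame idea above, using a made-up one-dimensional "shot" whose position is all that a single frame shows. The names (`FrameStack`, `estimate_velocity`) are hypothetical, and this is not AlphaStar or Atari agent code; it only illustrates that speed becomes recoverable once a few consecutive frames are kept around.

```python
from collections import deque


class FrameStack:
    """Keep the last `size` observations so that motion can be inferred."""

    def __init__(self, size):
        # deque with maxlen silently drops the oldest frame when full
        self.frames = deque(maxlen=size)

    def push(self, frame):
        self.frames.append(frame)

    def estimate_velocity(self):
        """Average change per step over the stack; None with fewer than 2 frames."""
        if len(self.frames) < 2:
            return None
        return (self.frames[-1] - self.frames[0]) / (len(self.frames) - 1)


stack = FrameStack(size=4)
stack.push(10.0)
# A single frame says where the shot is, but not how fast it moves.
single_frame_estimate = stack.estimate_velocity()

for position in (12.0, 14.0, 16.0):
    stack.push(position)
# With four frames the speed can be read off the positions.
stacked_estimate = stack.estimate_velocity()
print(single_frame_estimate, stacked_estimate)
```

The stack only ever remembers the last `size` frames, which is the limitation a recurrent state is meant to relax.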
|
On May 15 2019 05:13 enigmaticcam wrote:
How is "when matched and exists ... then update..." different from "when matched then update..."?
Both conditions must be met (match must be found and exists must evaluate to true) for update to happen. In this particular case: matching rows were found and data in columns B and C is different between source and target rows for the match.
Edit:
Example:
SRC_TABLE
{ COL_A: 1, COL_B: 1, COL_C: 1 }
{ COL_A: 2, COL_B: 1, COL_C: 2 }
{ COL_A: 3, COL_B: 2, COL_C: 2 }
TGT_TABLE
{ COL_A: 1, COL_B: 1, COL_C: 1 }
{ COL_A: 2, COL_B: 1, COL_C: 1 }
Trying to do the above merge, joining on COL_A and checking exists/except on COL_B and COL_C, would result in:
COL_A: 1 - no action
COL_A: 2 - update
COL_A: 3 - insert
Technically this exists check isn't strictly necessary here. Worst case, you'll update rows to the same values, which shouldn't be a problem unless the update is a heavy operation, or the table has auto-updated timestamps and the like that shouldn't change when the values don't really change. Another concern is having to update a lot of rows: with exists you perform fewer update operations, but more select operations to filter the results and update only what's necessary.
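To make the example above concrete, here is a small Python sketch that mimics the merge on plain dictionaries. It only illustrates the logic and says nothing about how SQL Server actually executes MERGE; the function name `merge_with_exists_except` and the dict layout (COL_A as key, (COL_B, COL_C) as value) are made up for this post. Comparing the tuples stands in for the exists (... except ...) check. Like except, the tuple comparison treats two NULLs (None here) as equal, which a plain <> comparison in SQL would not.

```python
def merge_with_exists_except(source, target):
    """Mimic the MERGE on dicts keyed by COL_A with (COL_B, COL_C) values.

    Returns the action taken per key and modifies `target` in place.
    """
    actions = {}
    for key, src_values in source.items():
        if key not in target:
            # "when not matched by target then insert"
            target[key] = src_values
            actions[key] = "insert"
        elif src_values != target[key]:
            # "when matched and exists (src except tgt) then update"
            target[key] = src_values
            actions[key] = "update"
        else:
            # matched, but the except returned no rows: nothing to do
            actions[key] = "no action"
    return actions


source = {1: (1, 1), 2: (1, 2), 3: (2, 2)}
target = {1: (1, 1), 2: (1, 1)}
actions = merge_with_exists_except(source, target)
print(actions)  # {1: 'no action', 2: 'update', 3: 'insert'}
```

Dropping the `elif` condition (so that every match is an update) gives the same final `target`, which is the point made above: the extra check changes which rows get touched, not the end result.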
In other news, Intel has dropped the ball again: https://www.bbc.com/news/technology-48278400
|
On May 15 2019 07:17 zatic wrote:
I can give some input on where AlphaStar would use RNNs/LSTMs though: [...] Usage of LSTMs might go well beyond that for AlphaStar, but the above problem is what LSTMs were introduced to solve in reinforcement learning.
You pretty much covered the extent of my understanding as well.
I suppose that it might be similar to the mystery that is the relationships within the hidden layers of the network: it's hard to understand what relationships they are actually modeling.
So maybe what happens is that it might take say, the last 100 steps, and combine it into a much smaller space than your original input of a single frame - yet still correctly capture important relationships even though it may not be organized like your original input space.
|
|
On May 17 2019 00:40 Nesserev wrote:
Isn't it weird how every team now markets the vulnerability that they've found: ridiculous names, with a logo, and with its own website (zombieloadattack.com), etc.
It isn't that weird, they're just trying to get recognition. If you are the guy who discovered 'the Heartbleed bug' that everyone was talking about for a month, it looks pretty good on your CV.
|
@travis if that can help with your understanding of RNNs.
You can view a typical LSTM/GRU network as the simple diagram on the left here: https://www.researchgate.net/figure/The-standard-RNN-and-unfolded-RNN_fig1_318332317
You have to imagine this device (the "unit") receiving some kind of sequential input signal (x_0, x_1, ... x_n). Usually it's a time series or a natural language sentence (this is what I'm most familiar with, a sequence of words). The elements of the sequence are fed one by one to the unit, and it performs some computation based not only on the current input (of course), but also on the outcome of the previous computation step. That is the only concept of "memory": the fact that there is a recurrence relation tying the input to the result of the previous computation (hence Recurrent NN). You can imagine that this indeed allows you to somewhat remember the history of what you did with the sequence. In a way you approximate the complete sequence with some aggregate computed every step of the way.
You can choose to also output something at every step of the computation (for example, if we're talking words, you can output words at each step, like in machine translation, or if we're talking Go board state, maybe a recommended next move), but it's not required. Maybe you just want to predict something at the very end after processing the entire sequence (like a weather forecast).
An RNN unit is typically not a big structure (again, figure on the left), so why do we call that "deep" learning then? Because in the end, the complete computation performed by the network on the entire sequence can be represented by the unfolded network (figure on the right), materializing the same actual unit as multiple units at different points in time. And this network has potentially a LOT of connections/layers, depending on the length of the sequence.
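If it helps, the recurrence can be sketched in a few lines of plain Python. This is a bare "vanilla" RNN step with tanh, not an LSTM or GRU, and the weights are arbitrary numbers picked for illustration (nothing is trained); the names `rnn_step` and `run_rnn` are made up. The loop in `run_rnn` is the "unfolded" view: the same unit applied once per element of the sequence.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One step: the new state depends on the current input AND the previous state."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_x[j], x))
            + sum(w * hi for w, hi in zip(w_h[j], h_prev))
            + b[j]
        )
        for j in range(len(b))
    ]


def run_rnn(sequence, w_x, w_h, b):
    """Feed the sequence one element at a time, reusing the same weights each step."""
    h = [0.0] * len(b)  # the state starts as all zeros
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# Toy weights (arbitrary, untrained): 1 input feature, 2 state units.
w_x = [[0.5], [-0.3]]
w_h = [[0.1, 0.2], [0.0, 0.4]]
b = [0.0, 0.1]

sequence = [[1.0], [0.0], [-1.0], [0.5], [2.0]]
states = run_rnn(sequence, w_x, w_h, b)
final_state = states[-1]
print(final_state)
```

However long the sequence gets, `final_state` stays a vector of two numbers; feeding only the last element `[2.0]` on its own gives a different state, which is the "memory" at work.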
Some rapid-fire answers to your previous questions:
Is that initialized as all zeros?
Yes, there is a definition for the initial values used. For example here: https://en.wikipedia.org/wiki/Long_short-term_memory it's indeed 0 (you can see it in the equations).
But how does that work when the input to the LSTM is discrete values that correspond to labels (like, types of things)
You have to find a suitable numeric representation (often vectors) for your input sequence. For example in DL for NLP (natural language, aka actual English words), we have these things called embeddings that map a word to a vector of fixed dimension (like 50, or 300).
Are all of the last K steps are somehow stored in a single vector of a size that does not change? How does that information get combined into a single vector without completely screwing up what the inputs mean?
I hope the above clarified that question. There is basically nothing physically stored but the current state. A good analogy is our brain and our own memory. Our brain is mostly the same physical thing our whole life (like, it doesn't grow in size), but the memory encoded inside is a product of everything that we've lived. We retain a lot of what we've experienced, but we also forget a lot. As you might imagine, yes, it's possible to screw up and lose track of what the input meant (it happens that the DL model doesn't learn anything of value).
Or is it actually stored in a 3d tensor which starts as zeros and one of the dimensions corresponds to number of remembered iterations? Thanks
So as you might guess by now, in the general case of a pure RNN the answer is no, it doesn't store every iteration. BUT, there are DL architectures that actually do store all the intermediate states of the RNN, and perform some further processing using all of those. For example in language, there is this thing called an "attention layer", that takes as input all the states/products of the LSTM cells for all successive words, learns which words were retrospectively more important to focus on (and as you step back and look at the entire sequence at once, that's quite feasible), and tries to favor the more important sections of the sentence. This idea helps a lot, and attention-based RNNs are state-of-the-art in a lot of NLP tasks atm. It's possible to do that because sentences are typically bounded in size: you can afford to look at up to 100 LSTM states for 100 words at once, for example.
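Two of the answers above can be sketched together in plain Python: mapping discrete labels to fixed-size vectors, and pooling over all stored states with attention weights. Everything here is a toy assumption: the embedding table, the `query` vector and the function names are invented, and a real attention layer would learn its scoring rather than use a hand-written vector. To keep it short, the embeddings themselves stand in for the per-word RNN states.

```python
import math

# Hypothetical embedding table: each discrete label maps to a fixed-size vector.
EMBEDDINGS = {
    "the": [0.1, 0.0, 0.2],
    "cat": [0.9, 0.3, 0.1],
    "sat": [0.2, 0.8, 0.5],
}


def embed(tokens):
    """Turn a sequence of labels into a sequence of vectors."""
    return [EMBEDDINGS[token] for token in tokens]


def softmax(scores):
    """Normalize scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(states, query):
    """Weight every stored state by its dot product with `query`, then sum."""
    scores = [sum(s * q for s, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    size = len(states[0])
    pooled = [
        sum(w * state[i] for w, state in zip(weights, states))
        for i in range(size)
    ]
    return weights, pooled


# Stand-in for the stored per-word states (one vector per word).
states = embed(["the", "cat", "sat"])
query = [1.0, 0.0, 0.0]  # made up; a real layer would learn what to look for
weights, pooled = attention_pool(states, query)
print(weights, pooled)
```

Unlike the pure recurrence, this needs every state kept in memory at once, which is why it suits bounded sequences such as sentences.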
As for how to interpret what exactly a particular RNN unit does at each step (like LSTM, or the simpler GRU), and why it was designed that way, it's more nebulous, and I don't think it's something that relevant to focus on anyway (until you're really deep into this :D).
|
Because of finals I can't give your post the attention it deserves, but I just want to say that even if it takes me a while to respond, I appreciate the level of detail and I am definitely going to read it all.
|
On May 16 2019 22:59 Manit0u wrote:
Both conditions must be met (match must be found and exists must evaluate to true) for update to happen. In this particular case: matching rows were found and data in columns B and C is different between source and target rows for the match.
Thank you! Now I know why it never performs an update when I run it. Appreciate it.
|
|
Hi everyone, I have a slightly off-topic question, but I hope it's close enough to programming to be ok.
So I have a problem with open office/google docs. What I want to do is check a value in one column, say A, and if I find what I want, get the product column B * column C. And I want to do it for a few hundred rows. Sounds simple enough, but I can't fit this into one cell (doing it using multiple cells is simple, but how to do it in one?). Does anyone know?
In rough Python it would be something like:
def conditional_total(column_a, column_b, column_c):
    result = 0
    for row in range(len(column_a)):
        if column_a[row] == "ZZZ":
            result = result + column_b[row] * column_c[row]
    return result
But how to do it in open office/google docs?
|
On June 06 2019 01:09 Silvanel wrote:
What I want to do is check a value in one column, say A, and if I find what I want, get the product column B * column C. [...] But how to do it in open office/google docs?
iirc in openoffice it would be possible to:
SUMPRODUCT(A1:A300="ZZZ"; B1:B300; C1:C300)
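For anyone wondering why the condition can sit inside the formula: a simple way to think about it is that the comparison yields one 1-or-0 flag per row, and that flag is multiplied in with B and C before summing. Here is that idea in Python. How the spreadsheet evaluates it internally is an assumption on my part, and the function name `sumproduct_if` and the sample data are made up.

```python
def sumproduct_if(keys, wanted, values_b, values_c):
    """Sum b * c over the rows whose key equals `wanted`."""
    # The condition becomes a 1/0 mask, one entry per row.
    mask = [1 if key == wanted else 0 for key in keys]
    # Rows that fail the condition contribute 0 to the sum.
    return sum(m * b * c for m, b, c in zip(mask, values_b, values_c))


column_a = ["ZZZ", "AAA", "ZZZ", "BBB"]
column_b = [2, 5, 3, 7]
column_c = [10, 10, 4, 1]
total = sumproduct_if(column_a, "ZZZ", column_b, column_c)
print(total)  # 2*10 + 3*4 = 32
```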
|
Thanks. I was using SUMPRODUCT but didn't know that you can attach a condition to a range. This worked on the first try. Cheers.
|
Hah, I did my first talk at an IT conference. It was really awkward, so let me sum it up for you: Ruby developers in total: 2 (me + 1 guy in the audience). Overall level of talks and audience: ridiculous. Compared to the other talks mine was like level 2 (I wanted to do a talk about some practical implementations of fuzzy search across multiple tables in an RDBMS on the back-end), the others were level 5, and the audience level was mostly 0 (the only questions the other speakers got were coming from me). It felt really weird and I'm pretty dumbfounded by 2 things: 1. How many people come to these talks just because. 2. How many talks are about high-level infrastructure that puts you to sleep.
At least now I know that my next talk will be about interview questions and the audience will absolutely love it.
Source code for the interested: https://bitbucket.org/kkarski/rails-model-filtering-example/src/master/
|
Please disregard my previous post. I wrote it after too many beers.
New challenge (I want to compare this against code I have; the most concise and easy-to-understand solution wins). Write this thing in different languages (I need PHP, Java, C, C++, C#, Python and Javascript). I need it to run in https://repl.it/
def first_uniq_letter(str)
  return '?' unless str
  uniq = str.split('').reject { |c| str.count(c) > 1 }.first
  uniq || '?'
end
def test
  return 'fail' unless first_uniq_letter('abba') == '?'
  return 'fail' unless first_uniq_letter('') == '?'
  return 'fail' unless first_uniq_letter(nil) == '?'
  return 'fail' unless first_uniq_letter('tesseract') == 'r'
  'ok'
end
test # => 'ok'
|
quick python 2 code:
from ordered_set import OrderedSet
def first_unique_letter(check):
    if type(check) != str:
        return '?'
    for l in OrderedSet([c for c in check]):
        if check.count(l) == 1:
            return l
    return '?'
def test():
    if (first_unique_letter('abba') != '?'
            or first_unique_letter('') != '?'
            or first_unique_letter(None) != '?'
            or first_unique_letter('tesseract') != 'r'):
        return 'fail'
    return 'ok'
print test()
|
Here is concise C#. Introduce named variables and/or replace ?. with if for clarity to your liking. Important: GroupBy maintains order according to specification. Grouping also conveniently is a reference type unlike char. A similar approach I tried runs into the issue where FirstOrDefault returns default(char) instead of null.
char FirstUniqueLetter(string str)
{
    return str?.GroupBy(x => x).FirstOrDefault(g => g.Count() == 1)?.Key ?? '?';
}
|
Depending on your definitions of "conciseness" the following C# method is more or less concise than spinesheath's:
using System;
using System.Linq;
class MainClass
{
    public static void Main (string[] args)
    {
        Console.WriteLine (FirstUniqueLetter("abba"));
        Console.WriteLine (FirstUniqueLetter(""));
        Console.WriteLine (FirstUniqueLetter("tesseract"));
    }
    static char FirstUniqueLetter(string str)
    {
        foreach (char c in str.Distinct())
            if (str.Count(s => s == c) == 1)
                return c;
        return '?';
    }
}
In terms of readability I don't think there's much argument about it though.
Obviously this is pretty much the same as tofucake's Python version, don't think there will be much difference in any other language.
|
On June 28 2019 15:46 Apom wrote:
static char FirstUniqueLetter(string str) { foreach (char c in str.Distinct()) if (str.Count(s => s == c) == 1) return c; return '?'; }
According to the documentation, the result of Distinct is unordered, so your implementation does not necessarily return the first unique character. https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.distinct?view=netframework-4.8 The actual implementation is ordered, but you can't really rely on that. Plus you can just get rid of the Distinct and get the proper behavior at no real cost (if you wanted optimal performance there is a much faster approach).
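The post doesn't say which faster approach is meant. One common option, sketched here in Python 3 as a guess rather than as the intended answer, is to count all characters once and then scan the string in its original order. That avoids calling count for every character and doesn't depend on any set or Distinct ordering.

```python
from collections import Counter


def first_unique_letter(text):
    """Return the first character that occurs exactly once, or '?' if none."""
    if not isinstance(text, str):
        return '?'
    counts = Counter(text)  # one pass to count every character
    for char in text:       # second pass, in the original order
        if counts[char] == 1:
            return char
    return '?'


results = [first_unique_letter(s) for s in ('abba', '', None, 'tesseract')]
print(results)  # ['?', '?', '?', 'r']
```

This makes two linear passes over the string, versus one count call per character in the versions above.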
|
|
|
|