[Patch 4.4] CUDDLY INCOMING! ヽ(*・ω・)ﾉ - Page 98

Zess

Adun Toridas!9144 Posts

March 27 2014 21:48 GMT

#1941

On March 28 2014 06:02 killerdog wrote:
The first, and (imo) relatively simple application of the data I would try, which reddit would probably like too (more than just raw statistics and p values) would be a "what should I pick" style program.

Just choose a p value to use as a cutoff, then calculate all the counters/synergies of each individual champion. This'll give you two lists for each champion, 117 entries long (or however many champions there are, minus 1). Delete all data with p values above your cutoff.

Now each champion has an associated list showing what it's strongest with, and what it's best against. Now write a program where you can enter between 0 and 5 champions on the enemy team, and 0-4 champions on your team. The program will then calculate the probability of each available champion winning in this situation, by adding the synergy chances of your team and the "counter" chances of the other team's champions (potentially weighting by some inverse function f(p)->x so lower p values are more important), and spit out 4-5 champions which have the highest chance of winning.

Should be codeable in a day or two (plus however long creating data for each champion takes) and would be pretty interesting, plus reddit tends to love those kinds of things, especially when the statistics behind its creation are easily understandable. You can add specific roles if you want, or weight your lane opponent more heavily than the rest of their team or whatever.

I'm not really a statistician, but find more "practical" applications like that much more interesting, even if they won't be 100% reliable.
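For anyone who wants to try it, the idea above could be sketched roughly like this. Everything here is an assumption for illustration: the `SYNERGY` and `COUNTER` tables, the numbers in them, and the choice of `1 - p` as the weighting function are all made up, and the summed score is a heuristic ranking, not a calibrated win probability.

```python
# Hypothetical tables: (candidate, other champion) -> (effect on win rate, p value).
# All entries are invented for illustration, not real match data.
SYNERGY = {
    ("Lulu", "Vayne"): (0.04, 0.01),
    ("Lulu", "Zed"): (0.01, 0.40),
    ("Annie", "Vayne"): (0.02, 0.03),
}
COUNTER = {
    ("Lulu", "Katarina"): (0.03, 0.02),
    ("Annie", "Katarina"): (0.06, 0.001),
    ("Annie", "Zed"): (-0.02, 0.04),
}


def weight(p):
    """Assumed weighting f(p): smaller p values count for more."""
    return 1.0 - p


def score(candidate, others, table, cutoff):
    """Sum the weighted effects of `candidate` against/with each champion in `others`."""
    total = 0.0
    for other in others:
        # Missing pairs default to "no effect, not significant".
        effect, p = table.get((candidate, other), (0.0, 1.0))
        if p <= cutoff:  # drop everything above the chosen p value cutoff
            total += weight(p) * effect
    return total


def recommend(allies, enemies, pool, cutoff=0.05, top_n=5):
    """Rank the champions in `pool` that are not already picked by either team."""
    taken = set(allies) | set(enemies)
    scores = {
        champ: score(champ, allies, SYNERGY, cutoff)
        + score(champ, enemies, COUNTER, cutoff)
        for champ in pool
        if champ not in taken
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


picks = recommend(allies=["Vayne"], enemies=["Katarina"], pool=["Lulu", "Annie", "Vayne"])
```

Role restrictions or a heavier weight for the lane opponent would slot into `score` as an extra multiplier.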

Imoperator and roffles get temped, and GD immediately turns to discussing the merits of various statistical models for analyzing league meta-data.

This is the exact sort of thing that Yango, Kuomindong, and I have said that Sufficiency's data mining is not useful for, from various perspectives. Moreover, the fact that championselect.net is taken seriously by enough people to keep it running means this sort of modeling is fairly unnecessary as well if your goal is just to do something plebeian.

I think Sufficiency's project is very cool to see data, but you really shouldn't read too much into it because it suffers from the problems all historical data mining is subject to.

killerdog

Denmark6522 Posts

March 27 2014 22:21 GMT

#1942

On March 28 2014 06:48 xes wrote:

This is the exact sort of thing that Yango, Kuomindong, and I have said that Sufficiency's data mining is not useful for, from various perspectives. Moreover, the fact that championselect.net is taken seriously by enough people to keep it running means this sort of modeling is fairly unnecessary as well if your goal is just to do something plebeian.

I think Sufficiency's project is very cool to see data, but you really shouldn't read too much into it because it suffers from the problems all historical data mining is subject to.

I get that it won't be an actual viable tool for serious gameplay, it's not meant to be. It would just be a potentially very fun, and definitely interesting, use of the data. And it would be in a form everyone would appreciate, without the sometimes intimidating aspect of p values and long numbers. I'd imagine a user friendly form like this would probably be a lot more popular on places like reddit, where he keeps posting it.

As long as you explain your method, and explain any shortcomings in your method, then by definition there's nothing scientifically wrong with it at all.

killerdog

Denmark6522 Posts

March 27 2014 22:25 GMT

#1943

On March 28 2014 06:29 Sufficiency wrote:

I see. It seems that her AD ratios are mostly the same as her AP ratios (same on E, AP ratio is only 0.05 smaller on Q). I guess Lizard gives her a quicker early game spike.... which makes sense.

Oh wtf lol http://www.teamliquid.net/forum/closed-threads/32696-automated-ban-list-latest-theanarchy?page=1699#33965

What happened?

http://www.teamliquid.net/forum/viewpost.php?post_id=21070673 i think

Zess

Adun Toridas!9144 Posts

March 27 2014 22:49 GMT

#1944

On March 28 2014 07:21 killerdog wrote:

On March 28 2014 06:48 xes wrote:

On March 28 2014 06:02 killerdog wrote:
The first, and (imo) relatively simple application of the data I would try, which reddit would probably like too (more than just raw statistics and p values) would be a "what should I pick" style program.

Just choose a p value to use as a cutoff, then calculate all the counters/synergies of each individual champion. This'll give you two lists for each champion, 117 entries long (or however many champions there are, minus 1). Delete all data with p values above your cutoff.

Now each champion has an associated list showing what it's strongest with, and what it's best against. Now write a program where you can enter between 0 and 5 champions on the enemy team, and 0-4 champions on your team. The program will then calculate the probability of each available champion winning in this situation, by adding the synergy chances of your team and the "counter" chances of the other team's champions (potentially weighting by some inverse function f(p)->x so lower p values are more important), and spit out 4-5 champions which have the highest chance of winning.

Should be codeable in a day or two (plus however long creating data for each champion takes) and would be pretty interesting, plus reddit tends to love those kinds of things, especially when the statistics behind its creation are easily understandable. You can add specific roles if you want, or weight your lane opponent more heavily than the rest of their team or whatever.

I'm not really a statistician, but find more "practical" applications like that much more interesting, even if they won't be 100% reliable.

On March 28 2014 05:50 krndandaman wrote:
I'm legitimately confused whether some people are parodying or actually serious.

plz stop i want my league discussion back

Imoperator and roffles get temped, and GD immediately turns to discussing the merits of various statistical models for analyzing league meta-data.

This is the exact sort of thing that Yango, Kuomindong, and I have said that Sufficiency's data mining is not useful for, from various perspectives. Moreover, the fact that championselect.net is taken seriously by enough people to keep it running means this sort of modeling is fairly unnecessary as well if your goal is just to do something plebeian.

I think Sufficiency's project is very cool to see data, but you really shouldn't read too much into it because it suffers from the problems all historical data mining is subject to.

I get that it won't be an actual viable tool for serious gameplay, it's not meant to be. It would just be a potentially very fun, and definitely interesting, use of the data. And it would be in a form everyone would appreciate, without the sometimes intimidating aspect of p values and long numbers. I'd imagine a user friendly form like this would probably be a lot more popular on places like reddit, where he keeps posting it.

As long as you explain your method, and explain any shortcomings in your method, then by definition there's nothing scientifically wrong with it at all.

Perhaps, but like Sufficiency already mentioned, the predictive power of these analytic metrics is unknown and questionable. In some of the matchup datasets, there is definitely "is this a thing or is it a statistical fluke?" where your usual notion of p-value isn't useful since those are meant for "finding needles in a haystack" while this kind of data dredging is "finding stuff that isn't hay in a haystack."

However, the metrics provided have been pretty interesting (like the pentakill one, and even this wintime differential [though i don't think that differential is entirely caused by lategame/earlygame per se]). Something I would like to see is "skillcap differential." LoLKing already has the data, but it isn't presented in a very good format. The premise would be an initial pass to see the winrate differential between Bronze V and Challenger of various champions, and out of the top5 see what the per league breakdown is.

For example, Mid lulu is extremely punishing when played well and fit into a team, while also being extremely punishing to play if you aren't effective with your spells. We would expect both individual skill, and relative team "sense-making" to go down as you go down in skill, and maybe that correlates with a drop in winrate.
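The "skillcap differential" first pass described above might look something like the sketch below. The `WINRATE` table and its numbers are invented placeholders (not LoLKing data), and the function name is hypothetical.

```python
# Hypothetical win rates per champion and league; numbers are made up.
WINRATE = {
    "Lulu": {"Bronze V": 0.44, "Challenger": 0.55},
    "Annie": {"Bronze V": 0.52, "Challenger": 0.51},
    "Zed": {"Bronze V": 0.46, "Challenger": 0.53},
}


def skillcap_differential(winrates, low="Bronze V", high="Challenger", top_n=5):
    """Rank champions by (win rate at `high`) minus (win rate at `low`)."""
    diffs = {
        champ: tiers[high] - tiers[low]
        for champ, tiers in winrates.items()
        if low in tiers and high in tiers  # skip champions missing either league
    }
    return sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


top = skillcap_differential(WINRATE)
```

The per-league breakdown for the top entries would then just be a lookup of `WINRATE[champ]` for each champion in `top`.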

Sufficiency

Canada23833 Posts

March 27 2014 23:01 GMT

#1945

On March 28 2014 06:48 xes wrote:

This is the exact sort of thing that Yango, Kuomindong, and I have said that Sufficiency's data mining is not useful for, from various perspectives. Moreover, the fact that championselect.net is taken seriously by enough people to keep it running means this sort of modeling is fairly unnecessary as well if your goal is just to do something plebeian.

I think Sufficiency's project is very cool to see data, but you really shouldn't read too much into it because it suffers from the problems all historical data mining is subject to.

I don't quite understand this. Are you saying the issue with this study is that this is an observational study?

Goumindong

United States3529 Posts

March 27 2014 23:08 GMT

#1946

On March 28 2014 05:27 ZERG_RUSSIAN wrote:

Actually tho doctors prescribe medication based on a good deal of understanding of pharmacology BACKED by statistical analysis

idk about bankers

OK but do we believe that champion picks don't have an effect on the outcome of the game? The problem in doing this isn't on the theoretical side. We believe that champion picks matter, we know the system is binary with regards to outcomes and champions in the game, so we don't have to worry about complex functions. This pretty much locks down the potential theory to one which Sufficiency should be using. So we have a theory, it's as complex as it can and needs to be, and we want to measure how large the effect of champion counters is. No problem there.

The problem in going deeper than 1 or 2 interactions is that you just don't have enough data (and if you did, probably not enough time to run the calculation). There are what, 117 champions and 5 slots for one side with 112 and 5 for the other? This is something like 22 quadrillion potential games which can be played. The number of dummy terms we would have would be far larger than that even, because you want to know whether an effect is from the 2 champion or 3 champion or 4 champion interaction. Even if you had enough observations (which we don't), doing the math would break your computer.
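The "22 quadrillion" figure can be checked with a quick count, assuming the two sides are distinguishable and the order of picks within a team doesn't matter (variable names are just for illustration):

```python
from math import comb

# Choose 5 of 117 champions for one side, then 5 of the remaining 112 for the other.
first_side = comb(117, 5)
second_side = comb(112, 5)
total_matchups = first_side * second_side

print(f"{total_matchups:,} possible 5v5 compositions")
```

That lands at roughly 2.2 x 10^16, which matches the estimate in the post.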

My main problem with Sufficiency's work on champion counters is that I don't understand why the regression needs to be logistic (well, why it needs to be a regression at all, just look at win rates for one champion against another champion). I was under the impression that logistic regressions are valuable when you want to tie a binary dependent variable with a continuous independent variable and that it should be indistinguishable from a simple linear system or summary statistics when all of the predictors are binary (i.e. the situation that we find ourselves in). Additionally I wanted to know precisely what his model was because "it's logistic" doesn't actually tell me much because it doesn't explain which terms are in it and in which manner... which is the important part of a model.

Sufficiency

Canada23833 Posts

March 27 2014 23:15 GMT

#1947

On March 28 2014 08:08 Goumindong wrote:

OK but do we believe that champion picks don't have an effect on the outcome of the game? The problem in doing this isn't on the theoretical side. We believe that champion picks matter, we know the system is binary with regards to outcomes and champions in the game, so we don't have to worry about complex functions. This pretty much locks down the potential theory to one which Sufficiency should be using. So we have a theory, it's as complex as it can and needs to be, and we want to measure how large the effect of champion counters is. No problem there.

The problem in going deeper than 1 or 2 interactions is that you just don't have enough data (and if you did, probably not enough time to run the calculation). There are what, 117 champions and 5 slots for one side with 112 and 5 for the other? This is something like 22 quadrillion potential games which can be played. The number of dummy terms we would have would be far larger than that even, because you want to know whether an effect is from the 2 champion or 3 champion or 4 champion interaction. Even if you had enough observations (which we don't), doing the math would break your computer.

My main problem with Sufficiency's work on champion counters is that I don't understand why the regression needs to be logistic (well, why it needs to be a regression at all, just look at win rates for one champion against another champion). I was under the impression that logistic regressions are valuable when you want to tie a binary dependent variable with a continuous independent variable and that it should be indistinguishable from a simple linear system or summary statistics when all of the predictors are binary (i.e. the situation that we find ourselves in). Additionally I wanted to know precisely what his model was because "it's logistic" doesn't actually tell me much because it doesn't explain which terms are in it and in which manner... which is the important part of a model.

The model is as follows:

ln E(win) = b0 + I(Champion A is on one side) * b1 + I(Champion B is on the other side) * b2 + I(Champion A is on one side and Champion B is on the other side) * b3

The estimates you see are for b3, since it is the interaction term. This eliminates the effect of the overall power of Champion A and B. Repeat this for any pairs of champions A and B.
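As a rough illustration of what the b3 term measures, here is a sketch that computes it straight from win/loss counts. It assumes the usual logit link, i.e. ln(p / (1 - p)) on the left-hand side, which is what "logistic regression" normally means (the formula above is written as ln E(win)). With only these two indicators and their interaction the model is saturated, so the maximum-likelihood b3 is a difference of the four cell log-odds. The counts and the function name `interaction_b3` are made up.

```python
from math import log, sqrt

# Hypothetical (wins, losses) for A's side, keyed by
# (A is on this side, B is on the other side). Counts are invented.
CELLS = {
    (0, 0): (50, 50),
    (1, 0): (70, 30),
    (0, 1): (70, 30),
    (1, 1): (90, 10),
}


def interaction_b3(cells):
    """Return (b3, approximate standard error) for the saturated 2x2 logit model."""

    def log_odds(key):
        wins, losses = cells[key]
        return log(wins / losses)

    # b3 = what is left of the (A and B) cell after removing both main effects.
    b3 = log_odds((1, 1)) - log_odds((1, 0)) - log_odds((0, 1)) + log_odds((0, 0))
    # Large-sample variance of each cell's log-odds is 1/wins + 1/losses.
    se = sqrt(sum(1 / w + 1 / l for w, l in cells.values()))
    return b3, se


b3, se = interaction_b3(CELLS)
```

A b3 near zero relative to `se` would mean the A-versus-B matchup adds nothing beyond each champion's overall strength.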

I don't really want to explain to you why this has to be a logistic regression, or why logistic regression even makes sense. You can try reading a book on categorical data analysis to enlighten yourself.

Omnishroud

1073 Posts

March 27 2014 23:44 GMT

#1948

Returned to IMop being banned. Best week off ever.

Goumindong

United States3529 Posts

March 27 2014 23:48 GMT

#1949

All those values are binary though so it's identical to a linear system.

Gahlo

United States35173 Posts

March 28 2014 00:01 GMT

#1950

I don't know which is a worse discussion, this or warwick support.

Zess

Adun Toridas!9144 Posts

March 28 2014 00:01 GMT

#1951

Sufficiency should open up his own thread on this so we can put our cancer in there.

On March 28 2014 09:01 Gahlo wrote:
I don't know which is a worse discussion, this or warwick support.

This is basically the equivalent of a Warwick support discussion in the context of categorical data analysis.

Sufficiency

Canada23833 Posts

March 28 2014 00:02 GMT

#1952

On March 28 2014 08:48 Goumindong wrote:
All those values are binary though so it's identical to a linear system.

I hate to say this, but I think you should stop posting about this and read some books before you further embarrass yourself. Try this one, it's the classic:

http://www.amazon.com/Categorical-Data-Analysis-Alan-Agresti/dp/0470463635

Pay strong attention to 3-way contingency tables and inference on such tables.

Nothing against you in particular, but judging from your feedback so far, it seems that you have some degree of training in basic applied statistics. But all you have been doing so far is throwing jargon at me, and it sounds to me that you have no idea what is actually happening beyond trying to impress the ordinary reader with a bunch of verbal diarrhea.

I chose to avoid jargon as much as possible so it's easy to grasp for any reader. I also chose to not disclose my educational background, training, experience, and publication records because I feel it's not useful and will only make me sound condescending.

GolemMadness

Canada11044 Posts

March 28 2014 00:13 GMT

#1953

On March 28 2014 09:01 xes wrote:
Sufficiency should open up his own thread on this so we can put our cancer in there.

This is basically the equivalent of a Warwick support discussion in the context of categorical data analysis.

Yes, please make this into a blog or something.

Sponkz

Denmark4564 Posts

March 28 2014 00:35 GMT

#1954

On March 28 2014 09:01 Gahlo wrote:
I don't know which is a worse discussion, this or warwick support.

I'm still wondering what the discussion is, so far it's just derp.

On March 28 2014 08:48 Goumindong wrote:
All those values are binary though so it's identical to a linear system.

Logistic regression is supposed to be identical to a linear system, what's your point?

petered

United States1817 Posts

March 28 2014 00:41 GMT

#1955

On March 28 2014 09:13 GolemMadness wrote:

Yes, please make this into a blog or something.

I know, right? Can we please just get back to anecdotal evidence for backing up our theories? I mean, as evidenced by nearly every facet of modern day research, numbers are basically useless for analyzing anything.

Sponkz

Denmark4564 Posts

March 28 2014 00:46 GMT

#1956

On March 28 2014 09:41 petered wrote:

I know, right? Can we please just get back to anecdotal evidence for backing up our theories? I mean, as evidenced by nearly every facet of modern day research, numbers are basically useless for analyzing anything.

Numbers are fine, but when everyone starts shitstorming over what seems to be scientific interest from Sufficiency (and the urge to share it), it doesn't prove anything apart from what it is (a freaking hobby thing that he wanted to share).

And the hilarious and yet sad part is that Sufficiency seems to know what he's talking about, yet people keep disbelieving, like what the fuck guys?

TheYango

United States47024 Posts

March 28 2014 00:51 GMT

#1957

On March 28 2014 09:41 petered wrote:

I know, right? Can we please just get back to anecdotal evidence for backing up our theories? I mean, as evidenced by nearly every facet of modern day research, numbers are basically useless for analyzing anything.

"Every facet of modern research" demands numbers because they are necessary for the precision and rigor required in their respective fields. Not to mention that they all went through hundreds of years of qualitative analysis and general theory before they reached the point where numbers could be practically applied.

We're talking about a game that has only existed for less than 10 years. Even when applied to a game like baseball there was at least some qualitative understanding of the statistics being mined (and like 100+ years of baseball theory) before something like Moneyball could happen.

Nobody has the qualitative understanding of the game necessary to draw proper conclusions from data like this and build meaningful models. The qualitative understanding of what a lot of numbers actually mean isn't there yet. The statistics we have are the analogues for stuff like batting averages that were proven to be useless. The complex aggregate statistics which are actually meaningful don't even exist yet--and many of those were developed through anecdotal impressions of their relevance before statistics showed them to be so.

Goumindong

United States3529 Posts

March 28 2014 01:02 GMT

#1958

On March 28 2014 09:02 Sufficiency wrote:

I hate to say this, but I think you should stop posting about this and read some books before you further embarrass yourself. Try this one, it's the classic:

http://www.amazon.com/Categorical-Data-Analysis-Alan-Agresti/dp/0470463635

Pay strong attention to 3-way contingency tables and inference on such tables.

Nothing against you in particular, but judging from your feedback so far, it seems that you have some degree of training in basic applied statistics. But all you have been doing so far is throwing jargon at me, and it sounds to me that you have no idea what is actually happening beyond trying to impress the ordinary reader with a bunch of verbal diarrhea.

I chose to avoid jargon as much as possible so it's easy to grasp for any reader. I also chose to not disclose my educational background, training, experience, and publication records because I feel it's not useful and will only make me sound condescending.

The issue is that everything you're saying sounds the same to me.

If your independent variables take two possible values it doesn't matter if you take the log or not. The interpretation of your coefficients changes slightly but it's the same system. Because you have only two values a log transformation is indistinguishable from a linear transformation.
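One way to poke at this claim with numbers, as a sketch with invented win rates: when the model has both indicators plus their interaction, it reproduces the four cell win rates under either an identity link (plain linear) or a logit link, so the fitted values do agree. The interaction coefficient itself is measured on a different scale in each case, though, and the two need not agree about whether there is an interaction at all.

```python
from math import log

# Hypothetical win rates keyed by (A present, B present); numbers are made up.
RATES = {(0, 0): 0.5, (1, 0): 0.7, (0, 1): 0.7, (1, 1): 0.9}


def logit(p):
    """Log-odds of a probability."""
    return log(p / (1 - p))


def interaction(rates, transform=lambda p: p):
    """Interaction term of the saturated model on the scale given by `transform`."""
    return (
        transform(rates[(1, 1)])
        - transform(rates[(1, 0)])
        - transform(rates[(0, 1)])
        + transform(rates[(0, 0)])
    )


linear_b3 = interaction(RATES)        # identity link: additive in win rate
logit_b3 = interaction(RATES, logit)  # logit link: additive in log-odds
```

With these rates the linear interaction is zero (up to rounding) while the logit interaction is clearly positive, so "no interaction" on one scale can be "an interaction" on the other.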

Sufficiency

Canada23833 Posts

March 28 2014 01:02 GMT

#1959

On March 28 2014 09:51 TheYango wrote:

"Every facet of modern research" demands numbers because they are necessary for the precision and rigor required in their respective fields. Not to mention that they all went through hundreds of years of qualitative analysis and general theory before they reached the point where numbers could be practically applied.

We're talking about a game that has only existed for less than 10 years. Even when applied to a game like baseball there was at least some qualitative understanding of the statistics being mined (and like 100+ years of baseball theory) before something like Moneyball could happen.

Nobody has the qualitative understanding of the game necessary to draw proper conclusions from data like this and build meaningful models because the qualitative understanding of what a lot of numbers actually mean isn't there yet. The statistics we have are the analogues for stuff like batting averages that were proven to be useless. The complex aggregate statistics which are actually meaningful don't even exist yet.

I think you should stop posting too. I think I have lost about all the respect I had for you.

On March 28 2014 10:02 Goumindong wrote:

The issue is that everything you're saying sounds the same to me.

If your independent variables take two possible values it doesn't matter if you take the log or not. The interpretation of your coefficients changes slightly but it's the same system. Because you have only two values a log transformation is indistinguishable from a linear transformation.

Sigh.

EDIT: let me give you a more serious response. Word is that you are an economist. What you just said to me is like if I just asked you "WTF WHY DO PEOPLE RESPOND TO INCENTIVES?!?!?!".

I am serious. The matter of taking the log is really fundamental. It's like the economic principle that people respond to incentives and everything has a cost.

petered

United States1817 Posts

March 28 2014 01:04 GMT

#1960

On March 28 2014 09:51 TheYango wrote:

"Every facet of modern research" demands numbers because they are necessary for the precision and rigor required in their respective fields. Not to mention that they all went through hundreds of years of qualitative analysis and general theory before they reached the point where numbers could be practically applied.

We're talking about a game that has only existed for less than 10 years. Even when applied to a game like baseball there was at least some qualitative understanding of the statistics being mined (and like 100+ years of baseball theory) before something like Moneyball could happen.

So you need 100+ years of history to be able to understand a game through numbers? Yango you disappoint, that is nonsensical. The math behind data analysis is not that drastically different from one area of research to the next, making it possible to draw from the rich experience of other fields of research. The sheer volume of LoL games played and the numerous quantitative measures that can be drawn from each game make it a fantastic target for this type of analysis.

Now, I'll be honest, I didn't even look very much at what Sufficiency posted. I am just so confused when people flip their shit about someone trying to use numbers to better understand the game. Sure there are challenges and caveats with some studies, but that does not mean that he is going down the wrong path or that what he has presented is useless. Disagree with his study/conclusions fine, but this will inevitably be the future of understanding the game, since that is what is happening in just about every other field.
