OpenAI's Dota 2 bots vs. 5 top professionals in TI - Page 11

Forum Index > Dota 2 General
WolfintheSheep
Profile Joined June 2011
Canada, 14127 Posts
August 18 2018 08:37 GMT
#201
On August 17 2018 22:39 FreakyDroid wrote:
It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote:
it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.


After reading Evan's blog, Im glad that my observations without having any prior knowledge of how the AI worked are pretty spot on. So my question to Evan or anyone who knows about it is, is it hard to to code an AI that can, for the lack of better word, remember a more complex network of steps, or maybe plan/plot a more complex strategy that isnt that dependent on the immediate reward[s]? Basally try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?

Basically computational power is the limit. Chess/Go AI still have that problem. In theory a computer would be unstoppable, because it could just compute every possibility and win from there. But that takes so much computing power that you'll probably never have a complete analysis like that until quantum computers.

So for Chess, much like how actual Grandmasters would play, it's simpler and better to start with a full set of game openers, then the end game scenarios, and everything off-script would analyze 2-3 moves ahead.

Dota is the same, but a lot worse. Though, to be a bit fair, a complex strategy and plan is already in play with OpenAI. The end goal is to kill the enemy ancient, the opposition is 5 completely random and unpredictable agents, and the AI had to create a plan and strategy to still reach that goal.

Obviously you're talking about higher complexity of micro-strategies, but it's somewhat important to note that because of how learning AI works, it's not actually evaluating the game on risk/reward as it plays. How it acts now is the result of thousands of simulations with only supplied knowledge of the game mechanics. The AI is playing what it has determined to be the best strategy; it's just not flexible enough to do anything else.
Average means I'm better than half of you.
evanthebouncy!
Profile Blog Joined June 2006
United States, 12796 Posts
August 18 2018 09:10 GMT
#202
On August 17 2018 22:39 FreakyDroid wrote:
It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote:
it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.


After reading Evan's blog, Im glad that my observations without having any prior knowledge of how the AI worked are pretty spot on. So my question to Evan or anyone who knows about it is, is it hard to to code an AI that can, for the lack of better word, remember a more complex network of steps, or maybe plan/plot a more complex strategy that isnt that dependent on the immediate reward[s]? Basally try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?


So reinforcement learning has a very bad "sample complexity", which is to say, to learn a close-to-optimal policy (strategy) you need to play a ton of games.

The lower bound for the number of games is proportional to (1 / (1 - gamma))^3

where gamma is the "discount factor", which controls how far into the future the agent can see.

So the closer gamma is to 1.0, the further into the future you can see, but the more samples you need. This makes sense: the further you plan into the future, the more data / games you need to play in order to grasp what the best strategy is.

OpenAI is using a gamma of 0.9997, which translates to at least 37,037,037,037 Dota games.

I emphasize AT LEAST because this is a LOWER bound; the real number is likely much, much higher than this, probably 2^100 times bigger if not more.

So in a sense, yes, computation is the bottleneck.
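The bound quoted above is easy to sanity-check numerically. A quick sketch (this only reproduces the proportionality as stated, ignoring constant factors):

```python
# Sketch of the quoted RL sample-complexity lower bound: games ~ (1 / (1 - gamma))^3.
# Constant factors are ignored; only the scaling from the post is reproduced.

def lower_bound_games(gamma: float) -> float:
    """Effective planning horizon cubed, per the quoted bound."""
    horizon = 1.0 / (1.0 - gamma)  # how far into the future the agent effectively sees
    return horizon ** 3

for gamma in (0.9, 0.99, 0.9997):
    print(f"gamma={gamma}: ~{lower_bound_games(gamma):.3g} games")
```

For gamma = 0.9997 this gives roughly 3.7 × 10^10, matching the 37,037,037,037 figure in the post.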
Life is run, it is dance, it is fast, passionate and BAM!, you dance and sing and booze while you can for now is the time and time is mine. Smile and laugh when still can for now is the time and soon you die!
spudde123
Profile Joined February 2012
4814 Posts
August 18 2018 09:38 GMT
#203
On August 18 2018 17:37 WolfintheSheep wrote:
On August 17 2018 22:39 FreakyDroid wrote:
It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote:
it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.


After reading Evan's blog, Im glad that my observations without having any prior knowledge of how the AI worked are pretty spot on. So my question to Evan or anyone who knows about it is, is it hard to to code an AI that can, for the lack of better word, remember a more complex network of steps, or maybe plan/plot a more complex strategy that isnt that dependent on the immediate reward[s]? Basally try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?

Basically computational power is the limit. Chess/Go AI still have that problem. Theoretically a computer would be unstoppable because they'd just compute every possibility and win from there. But this takes so much computer power that it probably you'll never have a complete analysis like that until quantum computers.

So for Chess, much like how actual Grandmasters would play, it's simpler and better to start with a full set of game openers, then the end game scenarios, and everything off-script would analyze 2-3 moves ahead.

Dota is the same, but a lot worse. Though, to a bit fair, a complex strategy and plan is already in play with OpenAI. The end goal is to kill the enemy ancient, the opposition is 5 completely random and unpredictable agents, and the AI had to create a plan and strategy to still reach that goal.

Obviously you're talking about higher complexity of micro-strategies, but it's somewhat important to note that because of how learning AI works, it's not actually just risk/reward evaluating the game. How it acts now is the result of thousands of simulations with only supplied knowledge of the game mechanics. The AI is playing what it has determined to be the best strategy, but it's just not flexible enough to do anything else.


In OpenAI's case the bots are not directly trying to win the game; rather, they learn to maximize their manually crafted reward. Certainly some things they do require something like foresight, like prioritizing a tower kill 30 seconds from now over farming individually right now. But some things seem quite difficult to fully learn with this sort of approach.

Take warding, for example. Placing a ward alone doesn't give them any reward, so they would have to learn that placing a ward in a specific sort of game situation, in a specific spot, allows them to play better over the next six minutes or so. It'll be interesting to see how the warding develops over time. Sentries to counter invis heroes seem a bit simpler because the payoff can be more short term (though obviously even there it's often good to use obs+sentry combos to spot movements), and I suspect they could learn to use wards behind a tower to improve their tower pushes.
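The warding problem can be put in rough numbers. With the gamma of 0.9997 quoted earlier in the thread, a reward arriving t steps in the future is weighted by gamma^t; assuming, purely for illustration, one decision step per second:

```python
# How much weight a delayed reward carries under discounting: gamma ** t.
gamma = 0.9997  # discount factor quoted earlier in the thread

for minutes in (1, 6, 20):
    t = minutes * 60  # steps, under the illustrative one-step-per-second assumption
    print(f"reward {minutes:2d} min away keeps {gamma ** t:.1%} of its weight")
```

A ward paying off six minutes later still keeps roughly 90% of its weight at this gamma, so the horizon itself isn't the obstacle; the hard part is credit assignment, i.e. working out which of the thousands of earlier actions actually caused the payoff.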

Another thing that came to mind is how they deal with the prospect of losing a game. The bots aren't taught to treat the ancient blowing up as a complete disaster; it's just one aspect that gives a negative reward at some point in the future. When behind, a human team is likely to take a risk and go for a Rosh fight, for example, because they know the enemy will likely be able to push their base if they get the aegis. As far as I gather, this isn't necessarily what the bots would do. They may think that taking a fight now would result in losing it, and just continue farming because that's what maximizes their reward for the next few minutes. They may end up straight-up losing shortly after that, but from their perspective that isn't worse than probably getting crushed in a 5v5 fight right now.

But I'm not sure what exactly OpenAI's plans are. Will they continue to use Dota as a test bed to test their ability to adapt to all sorts of weird situations? Or will they just stop the project once they beat pros at TI?

FreakyDroid
Profile Joined July 2012
Macedonia, 2616 Posts
August 18 2018 12:12 GMT
#204
On August 18 2018 17:37 WolfintheSheep wrote:
Basically computational power is the limit. Chess/Go AI still have that problem. Theoretically a computer would be unstoppable because they'd just compute every possibility and win from there. But this takes so much computer power that it probably you'll never have a complete analysis like that until quantum computers.

So for Chess, much like how actual Grandmasters would play, it's simpler and better to start with a full set of game openers, then the end game scenarios, and everything off-script would analyze 2-3 moves ahead.

Dota is the same, but a lot worse. Though, to a bit fair, a complex strategy and plan is already in play with OpenAI. The end goal is to kill the enemy ancient, the opposition is 5 completely random and unpredictable agents, and the AI had to create a plan and strategy to still reach that goal.

Obviously you're talking about higher complexity of micro-strategies, but it's somewhat important to note that because of how learning AI works, it's not actually just risk/reward evaluating the game. How it acts now is the result of thousands of simulations with only supplied knowledge of the game mechanics. The AI is playing what it has determined to be the best strategy, but it's just not flexible enough to do anything else.


I think GPU computation will soon surpass CPU computation, and it already has in some areas, so my money would be on that rather than on quantum computers, at least in the near future. I use GPU rendering for 3D work, and while it is faster than CPU (arguably), it's got a nasty limitation with RAM: you can't combine the RAM from multiple GPUs. However, I'm not sure if that's the case with GPUs for machine/deep learning. The new ones with tensor cores from Nvidia seem to be tailored towards these kinds of tasks, but seeing as they have no competition at the moment, I don't think they are in any rush to improve that technology or make it available at cheaper prices.

Yeah, that was my read on the AI, which didn't knock my socks off. I know the OpenAI team were happy with the result, but I personally didn't see the bots as anything special, because the only thing that can impress me is planning and strategy rather than godlike execution. However, now that I understand the limitations it has due to compute power, I wonder what it will take to have a breakthrough in this field: better algorithms, better compute power, or maybe both.

On August 18 2018 18:10 evanthebouncy! wrote:
On August 17 2018 22:39 FreakyDroid wrote:
It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote:
it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.


After reading Evan's blog, Im glad that my observations without having any prior knowledge of how the AI worked are pretty spot on. So my question to Evan or anyone who knows about it is, is it hard to to code an AI that can, for the lack of better word, remember a more complex network of steps, or maybe plan/plot a more complex strategy that isnt that dependent on the immediate reward[s]? Basally try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?


So reinforcement learning has a very bad "sample complexity" which is to say, to learn a close-to optimal policy(strategy) you will need to play a ton of games.

The lower bound for the number of games is porportional to (1 / (1 - gamma))^3

where gamma is the "discount-factor" where it allows you to see more into the future.

So the closer gamma is to 1.0, the more into the future you can see, but the more samples you would need. This makes sense because the more and more you're planning into the future, the more and more data / games you would need to play in order to grasp what's the best startegy

OpenAI is using gamma of 0.9997, which translates to at lease 37037037037 dota games.

I emphasize AT LEASE because this is a LOWER bound, the real number is likely much much higher than this, probably 2^100 times bigger than this if not more.

So in a sense yes, the computation is the bottle neck


I've never been particularly good at math, but if gamma is 1, then wouldn't that equation be 0, which means the AI would have to play 0 games? Perhaps I'm misunderstanding the equation ... so if that's a dumb question, just leave it :D

However, even if the AI had all the computation power it needs, I still don't understand how planning ahead or devising a longer strategy would help it achieve a more human-like read on the game when it is still limited by the immediate rewards. Maybe I'm asking the wrong question, dunno.

I just watched this video, but sadly it's way too technical for me to understand
Smile, tomorrow will be worse
WolfintheSheep
Profile Joined June 2011
Canada, 14127 Posts
August 18 2018 17:38 GMT
#205
On August 18 2018 18:10 evanthebouncy! wrote:
On August 17 2018 22:39 FreakyDroid wrote:
It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote:
it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.


After reading Evan's blog, Im glad that my observations without having any prior knowledge of how the AI worked are pretty spot on. So my question to Evan or anyone who knows about it is, is it hard to to code an AI that can, for the lack of better word, remember a more complex network of steps, or maybe plan/plot a more complex strategy that isnt that dependent on the immediate reward[s]? Basally try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?


So reinforcement learning has a very bad "sample complexity" which is to say, to learn a close-to optimal policy(strategy) you will need to play a ton of games.

The lower bound for the number of games is porportional to (1 / (1 - gamma))^3

where gamma is the "discount-factor" where it allows you to see more into the future.

So the closer gamma is to 1.0, the more into the future you can see, but the more samples you would need. This makes sense because the more and more you're planning into the future, the more and more data / games you would need to play in order to grasp what's the best startegy

OpenAI is using gamma of 0.9997, which translates to at lease 37037037037 dota games.

I emphasize AT LEASE because this is a LOWER bound, the real number is likely much much higher than this, probably 2^100 times bigger than this if not more.

So in a sense yes, the computation is the bottle neck


I've never been particularly good at math, but if gamma is 1, then wouldnt that equation be 0, which means the AI would have to play 0 games? Perhaps im misunderstanding the equation ... so if thats a dumb question, just leave it :D

However, even if the AI had all the computation power it needs, I still dont understand how the problem of planning ahead or devising a longer strategy would help it achieve a more human like read on the game, when they are still limited by the immediate rewards. Maybe Im asking the wrong question dunno.
As I understand these neural networks (and to be honest, I may not fully), the AI learns by creating a gigantic set of data and finding the optimum actions amongst that set.

So the AI won't actually be planning ahead or devising a better strategy on the fly. It needs to have those already in its playset, and then the game needs to match the conditions to use that strategy.
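A toy illustration of that point (tabular and entirely hypothetical; OpenAI Five actually uses a neural network, but the principle that play time is lookup rather than planning is the same):

```python
# Hypothetical toy policy: training fills a table of action values offline;
# at play time the agent only looks up the best known action, it does not plan.
q_values = {
    ("behind", "enemy_pushing"): {"defend": 0.8, "farm": 0.3, "fight": 0.2},
    ("ahead", "enemy_farming"):  {"push": 0.9, "farm": 0.6, "fight": 0.5},
}

def act(state):
    # No lookahead here: just the argmax of what training already baked in.
    return max(q_values[state], key=q_values[state].get)

print(act(("behind", "enemy_pushing")))  # defend
```

If the game presents a situation the table (or, for a network, the training distribution) doesn't cover well, there is no fallback planner; the agent just plays the closest thing it has learned.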
Average means I'm better than half of you.
nimdil
Profile Blog Joined January 2011
Poland, 3754 Posts
August 20 2018 15:52 GMT
#206
On August 18 2018 18:10 evanthebouncy! wrote:
On August 17 2018 22:39 FreakyDroid wrote:
It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote:
it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.


After reading Evan's blog, Im glad that my observations without having any prior knowledge of how the AI worked are pretty spot on. So my question to Evan or anyone who knows about it is, is it hard to to code an AI that can, for the lack of better word, remember a more complex network of steps, or maybe plan/plot a more complex strategy that isnt that dependent on the immediate reward[s]? Basally try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?


So reinforcement learning has a very bad "sample complexity" which is to say, to learn a close-to optimal policy(strategy) you will need to play a ton of games.

The lower bound for the number of games is porportional to (1 / (1 - gamma))^3

where gamma is the "discount-factor" where it allows you to see more into the future.

So the closer gamma is to 1.0, the more into the future you can see, but the more samples you would need. This makes sense because the more and more you're planning into the future, the more and more data / games you would need to play in order to grasp what's the best startegy

OpenAI is using gamma of 0.9997, which translates to at lease 37037037037 dota games.

I emphasize AT LEASE because this is a LOWER bound, the real number is likely much much higher than this, probably 2^100 times bigger than this if not more.

So in a sense yes, the computation is the bottle neck


Can you count to ten at least?

2^100 is a ridiculously large number. I'll give you some multiplication.

Imagine that 1 game takes 1 microsecond and that you use all the x86 computers in the world.

Intel produces ~400 million CPUs a year. Let's say this translates to 400 million computers a year, and let's combine all computers created in the last 50 years. That's 20 billion computers. Now say we've used them to play Dota 2 for the last 8 years. 8 years = 252460800 seconds = 2524608 * 10^2 seconds = 2524608 * 10^8 microseconds <= 2.6 * 10^14 microseconds.

Combined game power is 20 billion (2 * 10^10) computers times the combined time of 2.6 * 10^14 microseconds.

That's 5.2 * 10^24 games. That's under 2^83. In other words, even if someone ran the simulation on a completely unrealistic number of computers for a completely unrealistic time, assuming one game takes an unrealistically short time to compute, we are still nowhere near 2^100 games, much less 2^100 * 37037037037 games.
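The back-of-envelope numbers above check out in code (same deliberately generous assumptions: ~400 million CPUs/year for 50 years, one game per microsecond per machine, 8 years of play):

```python
# Reproducing the back-of-envelope estimate above.
import math

computers = 50 * 400e6                       # 50 years of ~400M CPUs/year = 2e10 machines
microseconds = 8 * 365.25 * 24 * 3600 * 1e6  # 8 years of play, in microseconds
games = computers * microseconds             # at one game per microsecond per machine

print(f"{games:.2e} games, about 2^{math.log2(games):.0f}")
```

About 5 × 10^24 games, still comfortably under 2^83.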
Ayaz2810
Profile Joined September 2011
United States, 2763 Posts
August 20 2018 16:24 GMT
#207
Any updates on when this will take place? I would love to watch it. Even with the limitations, it should be exciting.
Vrtra Vanquisher/Tiamat Trouncer/World Serpent Slayer
Pontual
Profile Joined October 2016
Brazil, 3038 Posts
August 20 2018 16:31 GMT
#208
There will be a match each day at TI, probably after all the main stage games.

But the match that originated the thread already happened:

https://www.twitch.tv/videos/293517383
polgas
Profile Blog Joined April 2010
Canada, 1770 Posts
Last Edited: 2018-08-21 00:51:47
August 20 2018 23:33 GMT
#209


Wednesday, Thursday and Friday! (Aug 22-24)

They'll even do a match against the eventual TI winner. But no dates for those yet.

https://openai.com/five/

Leee Jaee Doong
PlayerofDota
Profile Joined May 2017
29 Posts
August 21 2018 10:12 GMT
#210
On August 18 2018 17:37 WolfintheSheep wrote:
On August 17 2018 22:39 FreakyDroid wrote:
It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote:
it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.


After reading Evan's blog, Im glad that my observations without having any prior knowledge of how the AI worked are pretty spot on. So my question to Evan or anyone who knows about it is, is it hard to to code an AI that can, for the lack of better word, remember a more complex network of steps, or maybe plan/plot a more complex strategy that isnt that dependent on the immediate reward[s]? Basally try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?

Basically computational power is the limit. Chess/Go AI still have that problem. Theoretically a computer would be unstoppable because they'd just compute every possibility and win from there. But this takes so much computer power that it probably you'll never have a complete analysis like that until quantum computers.

So for Chess, much like how actual Grandmasters would play, it's simpler and better to start with a full set of game openers, then the end game scenarios, and everything off-script would analyze 2-3 moves ahead.

Dota is the same, but a lot worse. Though, to a bit fair, a complex strategy and plan is already in play with OpenAI. The end goal is to kill the enemy ancient, the opposition is 5 completely random and unpredictable agents, and the AI had to create a plan and strategy to still reach that goal.

Obviously you're talking about higher complexity of micro-strategies, but it's somewhat important to note that because of how learning AI works, it's not actually just risk/reward evaluating the game. How it acts now is the result of thousands of simulations with only supplied knowledge of the game mechanics. The AI is playing what it has determined to be the best strategy, but it's just not flexible enough to do anything else.


What you just said is so out of date. You can absolutely compute every move in chess; computers have been able to do that for the past 20 years. The thing is, modern AI systems use intuition. It's essentially really specialized brain power to do the same thing over and over, and how it learns is through rewards: the simpler the game, the easier it is to maximize play faster; the more complex the game, the more rewards, and the harder it is to master.

Anyways, Google's DeepMind AI has already beaten the chess computer Stockfish, or whatever the name was. In fact it didn't even lose a single game against the engine that calculates literally up to 10k moves ahead; it uses intuition and is able to squeeze out victories.

The biggest problem for chess, for example, is computing the exact number of moves. It's worthless to compute 1000 moves ahead; the game has a 1/10k chance of ending like that. So it's about understanding how your opponent plays and figuring out what move he's playing toward. For humans this is usually 2-5 moves ahead for professionals, 0-2 moves for amateurs.

So a modern AI is essentially a brain that specializes in a certain thing. With repetition it is able to master things like chess, Go, and now, it seems, even Dota. The biggest problem I see with today's AI is that it needs too much repetition to learn stuff. A normal human being can learn to play decent Dota in like 20-50 games. I'm talking about not doing dumb shit like suiciding into towers, suiciding into Rosh at level 4, using salves while enemy heroes are hitting you, just basic stuff like that. For a human it would take 20-50 games to reach a decent level and not do dumb shit; for the AI it takes 2-5 million hours to learn that.

OpenAI has played hundreds of millions of hours and that is with limited hero pool and certain game modifications.

The AI's advantage is that it can play 180 years' worth of games in a single day, every day. Though you do need super expensive supercomputers and a gazillion hard drives to store all of that data, and obviously that is a lot of electricity expended for a very specific task.
Orome
Profile Blog Joined June 2004
Switzerland, 11984 Posts
August 21 2018 13:57 GMT
#211
I'm no expert on AI, but everything about chess in the above post is really just wrong. You absolutely cannot 'compute every move' in chess. The conservative estimate for the number of typical (i.e. 40-move-long) chess games is 10^120, a number that far outstrips the number of atoms in the universe. Even the estimate of reasonable chess games (i.e. excluding obvious blunders) is 10^40. We aren't even close to being able to solve chess via brute force (i.e. a minimax algorithm), and if we were, no 'intuitive AI' would ever find any edge vs a traditional computer, as the game would just be solved.

Saying 'a computer that literally calculates up to 10k moves ahead' is just as misleading as claiming GMs 'usually calculate 2-5 moves ahead'. Chess calculation is not a singular string of white and black moves, it's a game tree starting from the present position that includes all legal moves (the brute-force method, an optimised version of which is used by traditional engines) or all candidate moves (the human method), and going as far as computational or brain power allow. Calculating 4 ply (2 moves each) ahead in a complex middlegame position with 10+ candidate moves per ply is far more difficult (or often even a pointless approach for humans) than calculating 30 ply ahead for a mate in 15 where every single move is forced.

But anyway, I realise the post was more about AI than chess or traditional chess computers and I do agree that AlphaZero only taking 4 hours to surpass (a suboptimal version of) Stockfish is incredibly impressive. Though I'd be careful with words like intuition when talking about an AI. While it's true that AlphaZero evaluated far fewer positions than Stockfish, there could be many reasons for that. We simply don't know exactly why AlphaZero is as good as it is. It's one of the most fascinating things to me in all this: With traditional chess computers, we know exactly how they reach their conclusions (we even gave them the algorithms by which they judge positions). With an AI, we really don't, at least as far as I understand it (evan?). We give them the tools to learn, we understand how they learn, but the results of that learning are a black box that can surpass us.
On a purely personal note, I'd like to show Yellow the beauty of infinitely repeating Starcraft 2 bunkers. -Boxer
267
Profile Joined December 2017
64 Posts
August 21 2018 14:21 GMT
#212
Looked for the time of this thing and ended up reading a lot about AI development ^^

So anyway... when does it play tomorrow, and against whom?
polgas
Profile Blog Joined April 2010
Canada1770 Posts
August 21 2018 17:29 GMT
#213
Start time around 3:30-4:00pm, may vary:

Leee Jaee Doong
evanthebouncy!
Profile Blog Joined June 2006
United States12796 Posts
Last Edited: 2018-08-21 21:31:28
August 21 2018 21:30 GMT
#214
========================
I've never been particularly good at math, but if gamma is 1, then wouldnt that equation be 0, which means the AI would have to play 0 games? Perhaps im misunderstanding the equation ... so if thats a dumb question, just leave it :D
=========================

It's (1 / (1 - gamma) )

so if gamma is 1

you get 1 / (1 - 1) = 1 / 0 = infinite
Life is run, it is dance, it is fast, passionate and BAM!, you dance and sing and booze while you can for now is the time and time is mine. Smile and laugh when still can for now is the time and soon you die!
Nevuk
Profile Blog Joined March 2009
United States16280 Posts
August 21 2018 21:37 GMT
#215
On August 22 2018 06:30 evanthebouncy! wrote:
========================
I've never been particularly good at math, but if gamma is 1, then wouldnt that equation be 0, which means the AI would have to play 0 games? Perhaps im misunderstanding the equation ... so if thats a dumb question, just leave it :D
=========================

It's (1 / (1 - gamma) )

so if gamma is 1

you get 1 / (1 - 1) = 1 / 0 = infinite

I believe the point is that you can never have gamma = 1, yes
evanthebouncy!
Profile Blog Joined June 2006
United States12796 Posts
August 21 2018 23:31 GMT
#216
On August 22 2018 06:37 Nevuk wrote:
Show nested quote +
On August 22 2018 06:30 evanthebouncy! wrote:
========================
I've never been particularly good at math, but if gamma is 1, then wouldnt that equation be 0, which means the AI would have to play 0 games? Perhaps im misunderstanding the equation ... so if thats a dumb question, just leave it :D
=========================

It's (1 / (1 - gamma) )

so if gamma is 1

you get 1 / (1 - 1) = 1 / 0 = infinite

I believe the point is that you can never have gamma = 1, yes


I mean, if you just want gamma = 1 you'd use the episodic count H = 20000 instead.

It is fairly well agreed in the RL community that (1 / (1 - gamma)) ~ H.

One is for stationary RL (i.e. you expect the game to go on forever, with no definitive ending), the other for episodic RL (i.e. you expect the game to end at a certain point, e.g. Go, or a short-ish game of Dota).
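A quick numeric check of that rule of thumb: for gamma < 1 the geometric series sum of gamma^t converges to 1/(1 - gamma), so 1/(1 - gamma) acts as an "effective horizon" comparable to an episode length H. The values below are illustrative; H = 20000 is the episodic count quoted above.

```python
H = 20000
gamma = 1 - 1 / H                    # chosen so 1/(1 - gamma) == H

effective_horizon = 1 / (1 - gamma)  # ~20000, the discounted horizon

# Partial sums of gamma^t approach the effective horizon: rewards
# beyond a few multiples of 1/(1 - gamma) steps are discounted to
# almost nothing, just like rewards beyond the end of an episode.
partial = sum(gamma ** t for t in range(5 * H))
# partial lands within about 1% of effective_horizon
```

This is also why gamma = 1 only makes sense with an explicit episode cutoff: without discounting, the series never converges, which matches the 1/(1 - gamma) blowing up above.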
Life is run, it is dance, it is fast, passionate and BAM!, you dance and sing and booze while you can for now is the time and time is mine. Smile and laugh when still can for now is the time and soon you die!
polgas
Profile Blog Joined April 2010
Canada1770 Posts
Last Edited: 2018-08-23 03:03:29
August 23 2018 01:05 GMT
#217
Game 1 win by Pain Gaming. Humans outsmarted the bots!

https://www.twitch.tv/videos/300704237

Leee Jaee Doong
InFiNitY[pG]
Profile Blog Joined December 2002
Germany3474 Posts
August 23 2018 07:38 GMT
#218
On August 22 2018 06:30 evanthebouncy! wrote:
========================
I've never been particularly good at math, but if gamma is 1, then wouldnt that equation be 0, which means the AI would have to play 0 games? Perhaps im misunderstanding the equation ... so if thats a dumb question, just leave it :D
=========================

It's (1 / (1 - gamma) )

so if gamma is 1

you get 1 / (1 - 1) = 1 / 0 = infinite


1/0 does not equal infinity.

1/0 is undefined; strictly speaking, all you can say is that 1/(1 - gamma) grows without bound as gamma approaches 1.
"I just pressed stimpack, and somehow I won the battle" -Flash
Bigtony
Profile Blog Joined June 2011
United States1606 Posts
August 23 2018 08:09 GMT
#219
On August 23 2018 10:05 polgas wrote:
Game 1 win by Pain Gaming. Humans outsmarted the bots!

https://www.twitch.tv/videos/300704237




Will be crazy to see if the AI stops randomly using ultimates in the next game.
Push 2 Harder
sertas
Profile Joined April 2012
Sweden890 Posts
August 23 2018 08:59 GMT
#220
The bots just abuse the fact that they can react to blinks faster and click with 100% accuracy compared to humans; play-wise they play like morons.