[Programming Blog] autofan and web stuff

Loser777
Profile Blog Joined January 2008
1931 Posts
September 13 2013 10:12 GMT
#1
Well, it's another one of those mental masturbation blogs that few people will appreciate. I've recently been trying to shoehorn my TL fan club scraper into a web app. For a long time, as someone who likes "getting close to the metal" (and had 0 experience in web development), I looked down on web development, thinking it offered no real programming challenges. Boy, was I wrong. Not only is there an insane amount of knowledge needed just to get the ball rolling, so much of it is scattered in hard-to-find places. There are also just a billion ways to do whatever you want, so narrowing down a sensible approach (and software stack) for your needs is excessively difficult.

Enough bitching, here's what I've been trying to learn:
  • Server-side code with node.js
  • Client side code with jQuery
  • CSS

Node.js is fucking ridiculous. I started off with 0 knowledge of js, and within the first two hours I was actually somewhat convinced that the asynchronous-event-driven-callback model made reasonable sense. It's also mind-bogglingly simple. You toss around some "data" and a billion callbacks, and somehow you have a web server that handles requests.

jQuery is weird. Half the time I seriously question if what I'm doing makes sense. (It feels like cheating compared to the amount of detail C++ requires)

CSS is awesome. It also makes it shamefully easy to steal other people's designs (even if they are free!)

The current setup is kind of hilarious, as my node.js implementation is a hack and ALSO does none of the actual scraping/parsing. In fact, it actually creates a child process that executes the original autofan binary (recall, that's written in C/C++) each time it gets an AJAX request. There could be some overhead saved if fan club member lists are saved in a database (they are), and the server uses some model where entries are updated only if they are stale (it does not), and only calls the binary if they are stale (it does not). There would likely be issues if I wanted to migrate this to some fancier hosting (e.g. Heroku), as some of the libraries used by autofan are a bit exotic, and by exotic I mean old (libtidy).

Anyhow, if you want to give it a spin, it's hosted on my laptop over Wi-Fi (don't expect blazing speeds LOL) (server uptime will also probably be like 1%, no guarantees)
http://68.5.231.239:8888

This is just a proof-of-concept, NOTHING is ready yet.
I caution that this is not ready to replace current fan club member lists (the rules for adding members are far from definitive). Also, I recommend you choose fan club threads with 15 pages or less because I don't have a fast internet connection--and 15 pages will already take over a minute (atrocious loading time, I know).

An aside:
Upon testing autofan with some fan clubs, I noticed that administrators have their own class: they have a slightly different class for their post headers, yet the same class for the actual post. This caused autofan to freak out and think that there was a headless zombie post on pages where an admin posted. Anyway, that was fixed.

As usual, here are the git repositories if you feel like being a masochist and reading horrible code.
https://github.com/eqy/nodesandbox
https://github.com/eqy/autofan
(the postgres_exp branch is the most recent, and contains the administrator fix).

I hope that I can really understand some web dev stuff through this project, because I have a much, much more exciting TL-related idea in my head that's been brewing for a bit that I would really like to be able to implement. I only have about a week or so of summer vacation left, so it's a mad dash to see what I can learn/get done before school kicks in.

6581
c0ldfusion
Profile Joined October 2010
United States8293 Posts
September 13 2013 12:41 GMT
#2
Interesting blog! I've been hearing people rave about node.js for a while now.

As someone who also has no experience in JS, what resource did you use/do you recommend for picking it up to play around with it? Thanks.
Ao
Profile Joined July 2009
Korea (South)19 Posts
September 13 2013 19:16 GMT
#3
Hey there.

There's a language out there created by Google themselves called Go. It basically replaces node.js in its entirety and makes you wonder why node.js ever existed. See the below gist for a 3-file reproduction of your work. I'm sure there's bugs, don't use my code, disclaimer etc. etc.

The Go code itself is a bit ugly, I basically hacked this together in an hour or so and it's now 4am, tiredness = programming fail. I'm also pretty sure there's a resource leak where I'm forgetting to close a request somewhere... But I can't be bothered to figure out the semantics of which http methods require a Close() atm.

https://gist.github.com/aarondl/6554786

If you've got any questions feel free to PM me
Loser777
Profile Blog Joined January 2008
1931 Posts
September 13 2013 20:12 GMT
#4
On September 13 2013 21:41 c0ldfusion wrote:
Interesting blog! I've been hearing people rave about node.js for a while now.

As someone who also has no experience in JS, what resource did you use/do you recommend for picking it up to play around with it? Thanks.

Check out this SO topic: (which I used)
http://stackoverflow.com/questions/2353818/how-do-i-get-started-with-node-js

On September 14 2013 04:16 Ao wrote:
Hey there.

There's a language out there created by Google themselves called Go. It basically replaces node.js in its entirety and makes you wonder why node.js ever existed. See the below gist for a 3-file reproduction of your work. I'm sure there's bugs, don't use my code, disclaimer etc. etc.

The Go code itself is a bit ugly, I basically hacked this together in an hour or so and it's now 4am, tiredness = programming fail. I'm also pretty sure there's a resource leak where I'm forgetting to close a request somewhere... But I can't be bothered to figure out the semantics of which http methods require a Close() atm.

https://gist.github.com/aarondl/6554786

If you've got any questions feel free to PM me

Interesting work! I've heard of Go, but haven't done anything with it. It's probably one of the most hipstery/bleeding-edge languages out there, if not the most hipstery/bleeding-edge language. It did cross my mind as a possible implementation choice, but I'm not sure if the postgres drivers in Go are mature enough to use.

Also, interesting choice of reading the "data-user" attribute, it definitely simplifies the process of grabbing usernames. But moving the implementation language of the scraper/parser to Go from C++ defeats the purpose of mental masturbation a bit, you're skipping the entire download->tidy up->xml->navigate xml flow :0.
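To make the "data-user" shortcut concrete, here is a minimal sketch of pulling usernames straight out of the markup with a regex. This is not the code from the gist. The sample HTML is invented, and the only thing taken from the thread is the attribute name.

```go
package main

import (
	"fmt"
	"regexp"
)

// dataUserRe captures the value of a double-quoted data-user attribute.
var dataUserRe = regexp.MustCompile(`data-user="([^"]*)"`)

// usernames returns every data-user value found in html, in document order.
func usernames(html string) []string {
	var names []string
	for _, m := range dataUserRe.FindAllStringSubmatch(html, -1) {
		names = append(names, m[1])
	}
	return names
}

func main() {
	// Made-up markup; real post headers will look different.
	page := `<span data-user="Loser777">Loser777</span><span data-user="Ao">Ao</span>`
	fmt.Println(usernames(page)) // prints [Loser777 Ao]
}
```

The trade-off is the usual one for regex scraping: no tidy/XML step at all, but any change to how the attribute is quoted or named breaks it silently.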
6581
Ao
Profile Joined July 2009
Korea (South)19 Posts
September 14 2013 03:06 GMT
#5
Let me assure you that Go is not a hipster language. It's made by neckbeards for neckbeards. Node.js is much more hipster by comparison (js on the server... really?)

Here's a high quality stable postgres driver for Go:
https://github.com/bmizerany/pq

As for my particular no-C++ regex-laden implementation, it's a byproduct of laziness. I had no intention of porting that mess, though it's not hard to do if you really wanted a "walk the nodes" implementation over regex. My intention was merely to show that node.js is one-upped by Go and to recommend it highly over node.

C++ is my first language and my first great love in programming languages, however as I've matured as a programmer I've decided it's a horrific language and no one should use it unless there's an absolute necessity or requirement. So these days I'm glad I can use Go as an almost complete replacement.

Anyways, happy coding!
Loser777
Profile Blog Joined January 2008
1931 Posts
Last Edited: 2013-09-14 06:21:49
September 14 2013 06:06 GMT
#6
Even as a language to learn basic web development? I don't feel that Go is inherently any more readable or more efficient for my use case, and in many ways the model of callbacks handling events asynchronously is seen in Go too:

http.HandleFunc("/", handleIndex)
http.Handle("/standard.css", http.FileServer(http.Dir(".")))


There are a few things that surprise me though, such as seeing pages (index) having their content defined globally and not within the request handler. I guess there's a trade off: separating callbacks into modules keeps your data organized, but too many modules and tracing through the callback chains becomes a nightmare.

I still prefer my POST request handler. This:

var buf bytes.Buffer
for _, page := range pages {
    for _, signup := range page.Signups {
        _, err = buf.WriteString(signup)
        if err != nil {
            log.Fatal(err)
        }
        _, err = buf.WriteString("<br />")
        if err != nil {
            log.Fatal(err)
        }
    }
}

seems a bit more involved, appending a <br /> after each line, than

response.writeHead(200, {"Content-Type": "text/html"});
response.write("<pre>");
response.write(stdout);
response.write("</pre>");
response.end();

Overall though, I'm pretty impressed. Maybe I'll take Ken Thompson's word for it.

EDIT: defer is a pretty awesome language feature. Even in such a simple program as autofan, that would have saved me a lot of headaches where some pointer wasn't freed because the function exited early due to an error or some other condition.
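A tiny sketch of what I mean about defer and early exits. The resource type here is made up and just stands in for a buffer, file, or document handle that has to be released.

```go
package main

import (
	"errors"
	"fmt"
)

// resource stands in for anything that must be released exactly once.
type resource struct {
	name   string
	closed bool
}

// Close marks the resource as released.
func (r *resource) Close() { r.closed = true }

// process releases r on every return path because Close is deferred
// right at the top, before any early exit can happen.
func process(r *resource, fail bool) error {
	defer r.Close()
	if fail {
		return errors.New("parse error") // early exit, Close still runs
	}
	return nil
}

func main() {
	r := &resource{name: "page buffer"}
	err := process(r, true)
	fmt.Println(err, r.closed) // prints: parse error true
}
```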
6581
Ao
Profile Joined July 2009
Korea (South)19 Posts
September 14 2013 08:18 GMT
#7
On September 14 2013 15:06 Loser777 wrote:
Even as a language to learn basic web development? I don't feel that Go is inherently any more readable or more efficient for my use case, and in many ways the model of callbacks handling events asynchronously is seen in Go too:

You've developed with node.js for what I'd take to be a few hours of time. There's a natural callback spaghetti that node's awful concurrency model perpetuates, but of course you won't run into it in the small time that you've worked in it. Go does not have this; it has goroutines and channels, which are based on the CSP style of concurrency by Hoare and are leagues above callback-spaghetti, one-thread-style concurrency. So when it comes to what they're both best at doing (writing small sites, one-page application sites, or API endpoints)... Go is a great deal better at it, in performance, memory usage, and code readability. Not to mention you get all the benefits of statically compiled languages: easy deployment, static code analysis and typing errors, etc. etc.

http.HandleFunc("/", handleIndex)
http.Handle("/standard.css", http.FileServer(http.Dir(".")))

This is as far as callbacks go in the web development world of Go. I could have defined the handler inline but for the sake of cleanliness I didn't.
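For reference, "inline" would look something like the sketch below. The newMux and get helpers are names invented for the example, and the handler body is a placeholder rather than the one from the gist. The request is served in memory through net/http/httptest, so nothing listens on a port.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newMux registers the index handler inline as a function literal instead of
// passing a named function such as handleIndex.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "index page")
	})
	return mux
}

// get runs one GET request against h in memory and returns the body.
func get(h http.Handler, path string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(get(newMux(), "/")) // prints: index page
}
```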

On September 14 2013 15:06 Loser777 wrote:
There are a few things that surprise me though, such as seeing pages (index) having their content defined globally and not within the request handler. I guess there's a trade off: separating callbacks into modules keeps your data organized, but too many modules and tracing through the callback chains becomes a nightmare.

Globally defined content is a horrible idea if you're gonna make this a bigger project. The reason I did it in this example is because it really makes sense especially in such a simple scenario. I have two files, why would I read those files over and over again every time a request for it came in? This in essence is caching the entire file in memory and hence makes request responses much faster since I never hit the disk (keep in mind the kernel's file caching might also come into play here to make a read-always solution almost as fast). And on the note about callback separation, this is exactly the type of callback spaghetti I'm talking about that you'll run into later with node.
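If you want the same "read once, serve from memory" behaviour without a global, a closure does it. This is a sketch, not the gist's code. The cachedFile and demo names are made up, and the temp file stands in for index.html.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
)

// cachedFile reads path once and returns a handler that serves those bytes
// from memory on every request.
func cachedFile(path, contentType string) (http.Handler, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", contentType)
		w.Write(data)
	}), nil
}

// demo writes a temporary file, builds the handler, deletes the file, and
// then serves a request anyway, showing that requests never hit the disk.
func demo() (string, error) {
	f, err := os.CreateTemp("", "index-*.html")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name()) // cleanup if we return early; harmless otherwise
	if _, err := f.WriteString("<h1>autofan</h1>"); err != nil {
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}
	h, err := cachedFile(f.Name(), "text/html")
	if err != nil {
		return "", err
	}
	os.Remove(f.Name()) // the file is gone before any request is served

	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", "/", nil))
	return rec.Body.String(), nil
}

func main() {
	body, err := demo()
	fmt.Println(body, err)
}
```

The obvious cost is that edits to the file on disk are not picked up until the server restarts.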

On September 14 2013 15:06 Loser777 wrote:
I still prefer my POST request handler
var buf bytes.Buffer
for _, page := range pages {
    for _, signup := range page.Signups {
        _, err = buf.WriteString(signup)
        if err != nil {
            log.Fatal(err)
        }
        _, err = buf.WriteString("<br />")
        if err != nil {
            log.Fatal(err)
        }
    }
}

seems a bit more involved with a <br /> with each line instead of

response.writeHead(200, {"Content-Type": "text/html"});
response.write("<pre>");
response.write(stdout);
response.write("</pre>");
response.end();

Overall though, I'm pretty impressed. Maybe I'll take Ken Thompson's word for it.

There's a number of ways I could do this stuff. If I was using pre I'd have to append \n anyways. And also on the ugliness note: I wanted to pedantically handle every error to show you that you should. Lots of people ignore errors for example code and I think that's wrong. But reasonably I can probably remove the error handling from this code since the only error likely to come back from a buffer write is out of memory on re-allocation and then it becomes much simpler:

var buf bytes.Buffer
for _, page := range pages {
    for _, signup := range page.Signups {
        buf.WriteString(signup)
        buf.WriteString("<br />")
    }
}

I could have avoided all the concatenation into a separate memory buffer and just spammed it directly to the response stream (now that I think of this, why didn't I do it that way??)

for _, page := range pages {
    for _, signup := range page.Signups {
        io.WriteString(w, signup)
        io.WriteString(w, "<br />")
    }
}

Or I could move it into a function that would be less noisy (which is probably what should be done since I'm doing this concatenation in two places anyways...) Lots of ways to skin a cat. My only hope is that I've cast some doubt on node.js as a reasonable solution to anything over Go haha.

Good luck!
Loser777
Profile Blog Joined January 2008
1931 Posts
Last Edited: 2013-09-14 10:32:00
September 14 2013 10:30 GMT
#8
Thanks, I'll keep that in mind. Go vs. Node.js still leads to heated debates on sites like HN and SO, so it's good to hear some strong opinions.

On the other hand, my blog is now thoroughly (hopelessly?) derailed. If any actual fan club maintainers see this, please give me some feedback on whether such a tool would be useful, and in what ways you would most likely want to use it.

Once again, it's here (until I can't host the site on my laptop anymore).
http://68.5.231.239:8888/autofan
6581
tarpman
Profile Joined February 2009
Canada723 Posts
September 14 2013 18:14 GMT
#9
Nice work Loser777. I think it's simply awesome not only that you're trying this stuff out but also that you're posting it for us to see! That takes some courage. I've never posted my first project in a brand new language.

Happy to see you took my suggestion about using XPath. I hope it saved a few lines?

Scraping the topic pages individually seems time consuming. Did you consider grabbing the single page version to save a few requests?

One other suggestion... add the email address you're using for commits to your GitHub account, so your work gets attributed properly! Then the contributions calendar on your profile will get filled up.

OK no more derailing... happy coding! I hope you get some fanclub maintainers using this soon.
Saving the world, one kilobyte at a time.
Loser777
Profile Blog Joined January 2008
1931 Posts
September 14 2013 20:46 GMT
#10
On September 15 2013 03:14 tarpman wrote:
Nice work Loser777. I think it's simply awesome not only that you're trying this stuff out but also that you're posting it for us to see! That takes some courage. I've never posted my first project in a brand new language

Happy to see you took my suggestion about using XPath. I hope it saved a few lines?

Scraping the topic pages individually seems time consuming. Did you consider grabbing the single page version to save a few requests?

One other suggestion... add the email address you're using for commits to your GitHub account, so your work gets attributed properly! Then the contributions calendar on your profile will get filled up

OK no more derailing... happy coding! I hope you get some fanclub maintainers using this soon

XPath is definitely more maintainable and saved many lines. The issue with &view=all is that it's restricted to threads of a certain size. For huge fan clubs, that won't work unless I have TL+--and even then the web server would have to be "logged in" at all times.

Hmm, that's interesting what you said about contributions. I've re-tied my email to git on my work machine, but it seemed weird in that it selectively defined some activity as "contributions." Maybe that was just when I created the repository.
6581
tarpman
Profile Joined February 2009
Canada723 Posts
September 15 2013 00:44 GMT
#11
On September 15 2013 05:46 Loser777 wrote:
Hmm, that's interesting what you said about contributions. I've re-tied my email to git on my work machine, but it seemed weird in that it selectively defined some activity as "contributions." Maybe that was just when I created the repository.

That worked. Your commits are linked to your GitHub account now. Unfortunately it turns out that you have to contact support to complete the process (when you've linked your account after already pushing the commits). Up to you whether that's worth the effort...
Saving the world, one kilobyte at a time.
Ao
Profile Joined July 2009
Korea (South)19 Posts
Last Edited: 2013-09-15 01:29:16
September 15 2013 01:25 GMT
#12
I thought I'd derail it some more. I keep seeing you post that it's hosted on your laptop but there's really no good reason for that.

Google has a free version of Google App Engine and gives quite a lot of resources until they ask you to pay for it.

Amazon also has a free tier of their ec2 services.

Maybe that'll help you get it off your laptop haha.