Archiving go materials

I still miss the NHK videos and I wish I had saved them somewhere. Also, Go Game Guru had a lot of nice articles, right up until it died.

I wonder if we should archive online go content, and how we would go about it in an organized manner?



A lot of GoGameGuru content can be found via the Wayback Machine, as @stone_defender mentioned. You have to jump back a couple of years, since the more recent snapshots capture a fake website set up by spammers.
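
Jumping back to an older snapshot can also be scripted. Here is a minimal sketch, assuming the Wayback Machine's public availability endpoint and its usual JSON reply shape; the helper names are my own, and the domain and date in the usage comment are only illustrative guesses, not something I have verified for GoGameGuru.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine availability API.
API = "https://archive.org/wayback/available"


def availability_url(site, timestamp):
    """Build a query asking for the snapshot closest to `timestamp` (YYYYMMDD)."""
    return API + "?" + urlencode({"url": site, "timestamp": timestamp})


def closest_snapshot(payload):
    """Pull the snapshot URL out of a decoded API reply, or return None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def find_snapshot(site, timestamp):
    """Ask the API over the network for the snapshot nearest to `timestamp`."""
    with urlopen(availability_url(site, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))


# Usage (makes a network request; domain and date are illustrative):
#   print(find_snapshot("gogameguru.com", "20150101"))
```

Picking a timestamp from before the spam site appeared should steer the lookup toward a genuine snapshot, though you still have to check by eye what comes back.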

For example, I was able to recover their coverage of the famous Lee Sedol vs Gu Li jubango in this post:

Also, GoGameGuru released some of their content (go problems and some of their game review SGFs) on GitHub under a Creative Commons license:

Which I cataloged in a listing of go content in the Creative Commons:


I think it would be nice to have such an archive, but collecting and redistributing content might be legally problematic, unless the content has been released under some sort of free/permissive license (like Creative Commons). Even the Wayback Machine has questionable legal status.

The strict application of copyright with all rights reserved prevents a lot of archival efforts. Tragically, a lot of creative work is lost, since it never gets publicly archived and preserved before it slips “out of print” (or maybe “off the web”), which often occurs long before it would enter the public domain.


Yeah, I use the Wayback Machine all the time. It doesn’t solve all the problems:

  1. The Wayback Machine doesn’t archive everything (not all pages, not all videos).
  2. Even if something is archived there, it doesn’t matter if no one knows about it.
  3. The Wayback Machine can also die.

Yeah, copyright is a problem. Well, maybe in 50 years it’s going to be different so might as well download some.

Also we could archive videos of people who probably wouldn’t mind, e.g. Haylee’s videos.


John Fairbairn once got angry at me for using the Wayback Archive of his old articles on the Mindzine site as a reference point for posting information on Sensei’s Library, complaining that I was violating his copyright and that he might one day want to use them in a book (these articles were fifteen years old). I lost most of my respect for him at that point.


Yes, the Wayback machine is a non-ideal tool, and could perhaps even come under threat due to lack of funding or legal issues.

I think that copyright law is indeed a huge problem. Copyright law serves a useful purpose in giving content creators a time-limited monopoly to commercially exploit their work, which has the benefit of encouraging content creation. However, the ever-growing length of copyright terms only narrowly serves some corporate interests (such as keeping older Disney IP out of the public domain) while being highly detrimental to the public good, since many older, less popular works are doomed to be lost, and copyright restrictions only exacerbate the problem of preserving and archiving them. See also: Public Domain Day - Wikipedia

It’s not just 50 years, but I think over 100 years. For such an incredibly long time span, it’s not very practical for an individual to preserve and archive works. It takes an organization, or at least a plan to pass material down to someone else, requiring considerations in one’s last will and testament. With care, I think it is possible to preserve and pass down a collection of physical books to future generations. However, digital content may erode even faster, since one would probably have to convert file formats at least every decade or so, and transfer from one system to another before the various storage media become obsolete. Even now, there are collections of floppy disks where people struggle to find a machine that can read them and software that can process their contents.

It’s disappointing to hear that he had such a reaction. Really he should thank you for notifying him that his content is on the Wayback machine, if he does not wish it to be there (I believe the Wayback machine will take down content upon request by the actual owners).

I think it’s terribly short-sighted of him to be so concerned about his potential commercial interests, when some Sensei’s Library reference might only minimally impact those interests (and perhaps only benefit them by increasing publicity). Pulling his content off of the Wayback machine and preventing other people from talking about it will only increase the likelihood that his work will fall into obscurity and be forgotten by time. The overwhelming majority of academics that I know would prefer their work to be shared and read as widely as possible, since they care more about spreading ideas. Ultimately, seeking to grow the audience is probably a better business model, since the popularity of Go in the West is woefully below its potential.

I think a solution is to somehow encourage more content producers to embrace concepts like the Creative Commons and new models for revenue generation. To preserve the longevity of content and ideas, they need to be freely shareable, redistributable, adaptable, etc. Traditional copyright laws are very dated tools that create an artificial scarcity incongruent with an age of digital content.


I totally agree with everything in your post (which is really eloquent, nice job).


Some thoughts on this subject.

Historical games are archived: in books, on Waltheri / go4go, on GoGoD (most complete)

Modern professional games are archived: in books, on Waltheri, on go4go (most complete)

Amateur games are archived: in books, in magazines, on OGS and other servers. I don’t know where the largest collection of amateur games is; OGS has over 20 million games archived.

Video lectures are archived on YouTube and its Oriental equivalents.

Textual information is archived on Sensei’s Library and Wikipedia.

The open-source OGS code is archived on GitHub, and in GitHub’s Arctic Code Vault.

So, for historical games we have a current archival strategy of analog / offline digital / online digital. That is pretty stable. For modern professional games, though, our archival is mainly online; amateur games are almost never recorded offline. Video lectures are almost all online. Textual information is mixed online / offline digital.

PS. Any books out of copyright can be scanned and uploaded to Project Gutenberg. Public domain media of any kind can be uploaded to


I think the best way to archive Go content at the grassroots level would be to download content from the Internet and save it onto USB drives (USB seems like the most stable offline storage protocol right now.) Sensei’s Library and Wikipedia explicitly support safe ways of ripping data from their sites.
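
One thing worth doing alongside the USB copies is recording checksums, so that silent corruption can be spotted when a drive is re-read or copied to new media years later. A rough sketch, assuming the archive is just a folder of files; the function names and the example folder layout are hypothetical, not part of any existing tool:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash one file in chunks so large videos do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old, new):
    """List files from the old manifest that are now missing or differ."""
    return sorted(name for name in old if new.get(name) != old[name])


# Usage (hypothetical folder name):
#   before = build_manifest("go-archive")      # when the drive is written
#   after = build_manifest("go-archive")       # when it is checked later
#   print(changed_files(before, after))
```

Storing the manifest as a plain-text file next to the content (and on a second drive) keeps the check independent of any particular software surviving.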