This is very nice! Thanks a lot, anoek, for working on this.
Looking at the board from GaJ’s post, I have a few questions.
LZ’s moves are indicated with both letters and background color/intensity; how is one supposed to read that? (I.e., B looks better than A in the example, but then why not always use label A for the bot’s recommended move?)
Would it be possible to show the bot’s proposed variation as a branch in the analyze mode panel? (Not sure how much it’d help, but maybe do it just for the few largest missed-opportunity moves in the game?)
I think the discrepancy is an artifact of not enough playouts; the playout count is pretty low on the beta site. Given enough time, the two estimates would probably match up, but in a short time frame I’m assuming Leela started exploring another move and realized it might be better than the move it had already explored a lot. Note that I am not a Leela Zero expert by any stretch.
In the example game at move 42, I see “D13 41.5%”. How do I interpret this? Does this mean that LZ was surprised by the next move by GnuGo? If LZ is so much stronger than the players it reviews, then in theory it should only find blunders (negative changes) and never be lectured by the players (positive changes). A small positive change (0.5%) would be acceptable noise, but large positive changes are worrisome and could be a sign of a bug somewhere.
That shows the change in the estimated percent chance of that player winning, after that move was made.
The blue is the quick estimate given by the neural network; the purple is the estimate after Leela Zero has done a number of playouts and visits. So the blue is basically Leela Zero’s hunch, and the purple is what it thinks after it has thought about it for a while. I need to document that better…
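For what it’s worth, here’s a minimal sketch (my own names, not actual OGS code) of how a per-move change could be computed from a sequence of winrate estimates, assuming the engine always reports the winrate from the point of view of the player to move:

```python
def winrate_deltas(winrates):
    """Given winrates[i] = estimated winning chance (in %) for the player
    to move before move i+1 is played, return the change each move caused
    from the mover's own perspective."""
    # After a move, the estimate flips to the opponent's perspective,
    # so the mover's new winrate is 100 - winrates[i + 1].
    return [(100 - winrates[i + 1]) - winrates[i]
            for i in range(len(winrates) - 1)]

# Example: Black at 55% plays a move, after which White is estimated at 60%.
# Black's winrate dropped from 55% to 40%, a -15% change.
print(winrate_deltas([55.0, 60.0]))  # [-15.0]
```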
My question was how to interpret a large positive value like 41.5%. In my view the full interpretation table is something like
-30% : blunder
-5% : mistake
0% : good move
5% : LZ admits it slightly misjudged the previous position
30% : LZ admits it severely misjudged the previous position
Do you agree with this table? If not, then how would you express the meaning of a large positive value like 41.5% in English?
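To make the table concrete, here is how I would encode it (the thresholds and labels are just my proposal above, not anything the site actually does):

```python
def classify_delta(delta):
    """Map a winrate change (in %) for the player who moved to a label,
    using the thresholds from the table above."""
    if delta <= -30:
        return "blunder"
    if delta <= -5:
        return "mistake"
    if delta < 5:
        return "good move"
    if delta < 30:
        return "LZ admits it slightly misjudged the previous position"
    return "LZ admits it severely misjudged the previous position"

print(classify_delta(-41.5))  # blunder
print(classify_delta(41.5))   # LZ admits it severely misjudged the previous position
```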
Since I’m worried about a bug but you’re not, I investigated a bit more. When I analyze the example game on my computer with the same network and the same number of playouts (200), the winrate doesn’t change at all at move 42 (D13 0%). The largest positive change over the entire analysis is 8.8%, and this number drops to 6.7% with 2000 playouts.
This supports my claim that seeing 41.5% in the preview, and so many moves above 10%, indicates that something is not right.
I think you’ve got to curve the table, because win rate is nonlinear. For example, if my win rate is 29%, your table says I can never blunder, even if I lose 60 points in one move. Conversely, if the game is even, where a 2-point mistake is worth 10%, your table might flag a 6% loss as a mistake even though it doesn’t actually lose any points. I feel a table like yours should try as best as possible to emulate score, where a 4-point loss is marked as questionable, and you judge mistakes and blunders from there.
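One way to curve the table is to map winrates back to an approximate point lead with a logistic model and judge moves by points lost instead of raw percentage. This is only an illustration; the steepness constant below is chosen to match the “2-point mistake is worth 10% in an even game” figure, not anything fitted to Leela Zero:

```python
import math

K = 0.2  # illustrative steepness: ~10% winrate swing per 2 points near 50%

def winrate_to_lead(winrate):
    """Invert a logistic winrate model, winrate = 1/(1 + exp(-K*lead)),
    to get an approximate point lead from a winrate in percent."""
    p = min(max(winrate / 100.0, 1e-6), 1 - 1e-6)  # clamp away from 0 and 1
    return math.log(p / (1 - p)) / K

def points_lost(before, after):
    """Approximate points lost by a move, given the mover's winrates (%)
    before and after the move."""
    return winrate_to_lead(before) - winrate_to_lead(after)

# Near 50%, a 6% winrate drop corresponds to barely over a point...
print(round(points_lost(50, 44), 1))  # 1.2
# ...while dropping from 29% to 10% is worth several times as much.
print(round(points_lost(29, 10), 1))  # 6.5
```

With a mapping like this, the same absolute winrate change means very different things depending on how close the game is, which is exactly why a flat percentage table over- and under-flags moves.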