Where can I help translating Joseki Explorer?

It has not been the plan: it is a deliberate choice not to do that.

The choice I made was to prefer to support translation (and to continue to improve it as ideas like this come in) rather than have different - even possibly conflicting - descriptions in different languages.

Another factor is the sheer availability of translators for OJE. If we had chosen the “separate translations” path initially, then OJE would have been bare of translations, and sparse indeed for a very long time.

It may be that we are at a tipping point now, I’m not sure.

So it’s technically possible of course, but it will take persuasion that it is the right thing.

This is something we’d really love to support.

If the only way to “remove this barrier” were to have different text in different languages - maybe with a translation provided if there isn’t text in a language - then this is something we should look at.

Am I right in understanding that the essence of the barrier is “achieving natural sounding language with correct terminology”?

I would like to “prove” that we cannot do that with prompting before resorting to separate translations. We had reasonable success with system warnings this way, at least with the “target languages that were complained about”, so it seems conceivable, although Mandarin might be “the hardest we’ve yet to try”.


2 posts were split to a new topic: Improving the non-English experience at OGS

This choice is understandable, considering that you previously thought that the translations were almost perfect. But now that there is evidence to the contrary, maybe it is time to reconsider. I will provide some more arguments.

First of all, there is the general question of whether you want a truly multilingual Joseki Explorer, or simply a translated Joseki Explorer. If we stick with machine translations, there will always be texts that are slightly off (and some that are even terribly wrong). Of course, with enough context provided by continuous community feedback, the machine-translated explanations of the most common positions should be fine. But explanations written by humans will always be a better experience.
Why? While words can be translated, often there is no 1:1 correspondence between an English word and a non-English word. Moreover, speakers of other languages structure their sentences differently. A human native speaker can easily differentiate between the structure of a text and its meaning, and is able to explain a Joseki position in a way that is both correct (using the English text as reference) and natural. A machine will either stick as closely as possible to the structure of the English text (to keep the explanation correct), or unknowingly alter the meaning (by trying to make the text look natural).
I understand that this may be hard to grasp if you don’t speak a foreign language (especially not one that is very different from English). But in many cases, non-human translations simply do not feel natural.

I’m not sure if there is a way to prove this besides showing you examples of failed translations. But I can’t imagine that you will ever be able to provide enough context that covers every position, especially if the English texts are changing constantly. Keep in mind that natural-sounding language is much harder to achieve than just correct language.

Even if this is the case, consider that machine translations also fail for languages like German and French, which are closely related to English (see “Do not use (unverified) machine translations” for examples). The problem is simple: Language models do not understand the text that they are translating, so they will make mistakes.

Beyond correctness and “naturalness” of the Joseki explanations, there may be more reasons to allow different text pools. For example, the text in another language may use an analogy to explain a position that simply doesn’t make sense in English (e.g. a reference to the joseki’s name). Or there might be great resources, like articles or YouTube videos, that explain a position in a non-English language. Currently, there is no way to include such useful information for users who speak other languages.

Finally, here is my suggestion for what the Joseki Explorer could look like for non-English users (influenced also by the discussions in previous threads):

  1. If there is an explanation written in the user’s language, show it at the top. Below, show the original English text with the option to auto-translate it.
  2. Both versions of the explanation should have a timestamp (I imagine this to be useful in general, also for English users, to judge whether an explanation is up-to-date). If the timestamp of the non-English text is older than that of the English text, show a small warning that the explanation might be outdated.
  3. If there is no explanation in the user’s language, show only the original English text with the option to auto-translate it.
  4. If there is no explanation in the user’s language, show an option to help with the translations.

I strongly encourage you to at least consider hiding the machine-translated text behind a button press, or showing it below the English original. This would already look less “amateurish” than the current situation.
If you really don’t want users to make an additional button press for the translation, at least show a small warning that the text is machine-translated.
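The display rules in the numbered list above can be sketched roughly as follows. This is only an illustration under my own assumptions: `Explanation`, `View` and `build_view` are hypothetical names, not anything that exists in the OJE code, and the timestamp is assumed to be a simple "last edited" time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Explanation:
    text: str
    updated: datetime  # hypothetical "last edited" timestamp


@dataclass
class View:
    native: Optional[Explanation]  # human-written text in the user's language, shown on top
    english: Explanation           # always shown, with an option to auto-translate it
    outdated_warning: bool         # True when the native text is older than the English text
    offer_to_translate: bool       # True when no native text exists yet


def build_view(english: Explanation, native: Optional[Explanation]) -> View:
    """Decide what a non-English user sees for one position."""
    if native is None:
        # Suggestions 3 and 4: English only, plus an invitation to help translate.
        return View(None, english, outdated_warning=False, offer_to_translate=True)
    # Suggestions 1 and 2: native text first, with a warning if it may be outdated.
    return View(native, english,
                outdated_warning=native.updated < english.updated,
                offer_to_translate=False)
```

The machine translation never appears in `View` as a stored text here; it would be produced on request from `english`, which is what keeps it behind a button press.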


These are all very sensible suggestions, from my experience in my day job of working on a product where things have independently translatable text fields. The other workflow that is important to consider is letting someone know, when they change the English (primary) value, that other translations exist, and asking whether their edit is so major that all other translations should be deleted as invalid. In the OGS joseki example, if the English is “This move is good” and then someone adds “because white gets a nice wall”, then you can keep the Chinese version of “this move is good”. But if someone changes it to “This move is no longer considered joseki due to AI saying the wall is not so strong with the peep at A, and the territory black gets is too big” then you don’t want to keep the Chinese.
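A minimal sketch of that workflow, assuming the editor is asked to flag each English edit as minor or major. The names `PositionText` and `edit_english` are made up for illustration, and I have chosen to mark translations as stale rather than delete them outright, so that a mistaken "major" flag can be undone.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PositionText:
    english: str
    translations: Dict[str, str] = field(default_factory=dict)  # language code -> text
    stale: Set[str] = field(default_factory=set)                # languages needing review


def edit_english(entry: PositionText, new_text: str, major: bool) -> List[str]:
    """Apply an English edit and return the languages now flagged as stale."""
    entry.english = new_text
    if major:
        # The meaning changed: every existing translation needs review.
        entry.stale = set(entry.translations)
    # A minor edit (e.g. an added clause) leaves existing translations valid.
    return sorted(entry.stale)


entry = PositionText("This move is good", {"zh": "这步棋很好"})
edit_english(entry, "This move is good because white gets a nice wall", major=False)
edit_english(entry, "This move is no longer considered joseki", major=True)
```

After the second call the Chinese text is still stored, but `entry.stale` contains `"zh"`, so the UI could hide it or show it with a warning.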


Oh, “prove” was too strong.

I meant “take the efforts we know that we have yet to take, which may achieve the outcomes”

Keep in mind that natural-sounding language is much harder to achieve than just correct language.

Yes - but AI is getting better every day, and we have not yet done any prompt work on OJE.

I’m on standby to support anyone who wants to help with that…


I think it’s changing a bit in recent years with LLMs. You can come across examples of native speakers rating LLM translations as surprisingly natural sounding, although usually the best examples are in something like Spanish or French as opposed to Chinese.

So basically I wouldn’t rule out non-human translations like you might with Google translate 10 years ago, and progress can improve drastically over short time periods nowadays.

I think that’s also where people are drawing a line of standard that’s just too high for what’s needed.

If all the source material is English, and it’s being translated to other languages then I would say we don’t need the machine translation to capture some poetic nuance. As a first approximation it just has to convey the meaning and sentiment correctly.

If someone wrote “black is favourable here as the position is as solid as the Great Wall” and it was translated to ここに黒はいいです, that’s “good enough” as far as I would be concerned. Unless the sentence or paragraph goes into specifics, it’s probably not worth trying to idiom match to make it seem more natural or distinguish between adjectives that are functionally the same.


Multiple source languages and separate text pools can cause a lot of issues though.

You don’t want to have to check 50 languages to see which version of the OJE has the best explanation in this position, and then check it again for the next move in the position.

There’s an example I’ve encountered on another website called boardgamearena where wikis are completely separate and unrelated in different languages, and so one language will have one sentence and another will have whole paragraphs of tips and strategy.

There’s very little incentive to translate these types of things to another language, even though they might be tips from very strong players in those games.

The author themselves might not speak many of the languages that site offers wiki pages for.

It’s basically not a great all round experience to have say French players have little to no help available, and English players well written guides. There’s also no indication that these things exist, because only your language is shown to you.

You’d have to be aware in the first place that other versions of the wiki exist, and then have to search through all other languages to see if there’s something better available as help.


You’re not as worried about these kinds of issues when the project is at a huge scale like Wikipedia: there’s enough interest and sources in many languages that somebody will probably contribute something. Or for more popular games, there are guides and videos in that language that can populate the area.

It’s the more niche areas of abstract games, where you or someone wants to encourage more people to play but there’s little to no resources for players in that language, where some kinds of machine translation become useful and scalable.


I think this is a very valid point, and being able to tag a position with multiple sources could be useful. For instance I think at the moment each position is just showing one resource reference? It could point to multiple sources, and having that to be different for each language might not be a bad idea, unless it’s very conflicting. Imagine player A says this move is good and player B says it’s a big blunder.

In some cases though, maybe OJE is not even the best project for this kind of stuff. Maybe something like

could be better at collecting many videos in multiple languages for each topic in Go.

Maybe only if it’s verified in some way. It’d be very easy to destroy all the non English versions very quickly that way.

We’d definitely want a good undo button then :slight_smile: (decent backups etc).


On the suggestions I agree there’s some useful points.

It’s already like that in a sense, you get both (though it’s not indicating machine translation)

Only relevant if you’re allowing multiple out of sync versions in different languages.

Though this is assuming again multiple versions exist in multiple languages.

Except for the button press, is this not the current situation?

I agree though that it should mention AI translation or machine translation so you have a low expectation :stuck_out_tongue:

No higher a level of verification than would be needed for the person making the destructive edit to the English text, which I believe is currently handled by a limited set of trusted users having edit rights to the joseki explorer.


Getting the terminology right is something that should be possible but I’ve also seen it failing horribly for machine translations on the main site. So if there is a new string in the main gameplay flows then I might feel the need to go and update the translations just because the machine translation will not be consistent with the rest of the manual translation and sometimes will be using entirely confused terms.

I wonder, how do we prompt the AI translation for different languages? Do we include examples with the correct translations (especially with go terms, but I guess also with website terms)? Or does the translation service not support providing examples? Because it would be great to not have to correct “ranked”, “capture”, “suicide”, “handicap”, “pass”, “resign”, etc etc, every time they appear in new text.

And for modern LLM based translation, providing a few examples usually greatly improves translation correctness.
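To make the idea concrete, here is a rough sketch of what a few-shot prompt with a term glossary might look like. I don’t know how OGS actually prompts its translation service, so everything here is an assumption: `build_translation_prompt` is a hypothetical helper, the prompt wording is mine, and the German glossary entries and example pair are only illustrative. The function just builds a string; sending it to a model is left out on purpose.

```python
from typing import Dict, List, Tuple


def build_translation_prompt(
    text: str,
    target_language: str,
    glossary: Dict[str, str],
    examples: List[Tuple[str, str]],
) -> str:
    """Assemble a prompt that pins down terminology and shows worked examples."""
    lines = [
        f"Translate the following Go (baduk/weiqi) text from English to {target_language}.",
        "Always use these established translations for the listed terms:",
    ]
    for term, translation in glossary.items():
        lines.append(f"- {term} -> {translation}")
    lines.append("Examples of correct translations:")
    for source, translated in examples:
        lines.append(f"English: {source}")
        lines.append(f"{target_language}: {translated}")
    lines.append("Now translate:")
    lines.append(f"English: {text}")
    lines.append(f"{target_language}:")
    return "\n".join(lines)


prompt = build_translation_prompt(
    "White should not pass here.",
    "German",
    glossary={"resign": "aufgeben", "handicap": "Vorgabe", "pass": "passen"},
    examples=[("Black resigns.", "Schwarz gibt auf.")],
)
```

The glossary and examples would be maintained per language by the people who already do the manual translations, so that a correction only has to be made once.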


I’m not ruling out machine translations completely, and I’m all for improving them. But they should ideally be verified by humans, or at least clearly marked as machine translations.

I was not referring to “some poetic nuance” there. Explanations should be easily understandable, and different languages convey meaning in different ways. Even if a text presents the correct information, but is written in an unusual way, it becomes harder to understand and looks a lot less professional.

I don’t really get the point here, because I doubt anyone would do that. And your example of boardgamearena does not really apply to the suggestions that I made. I don’t see how it would make anyone’s life harder to have joseki explanations written in your native language by humans in addition to the English explanation, or a machine-translated version thereof.

It seems like you have misunderstood my suggestions. I wanted to propose what the OJE could look like with multiple text pools. So what I describe in your quote is not the current situation, but a situation in which the text in the blue box in your screenshot is written by a human.
If there is no explanation in the user’s language written by a human (i.e. the current situation), the machine translation should be hidden behind a button press or at least clearly marked.

That’s exactly why.

Yes, the second part is only relevant with multiple text pools. But having the timestamps could also be nice-to-have in general. Imagine if you’re looking up a joseki which you think is good, but the OJE explanation says is a bad result. If the timestamp is old, the information on OJE might be outdated. If the timestamp is recent, your information about the joseki in your mind may be outdated. Either way, it may help you to judge the explanation.

It’s when they diverge substantially you have an issue.

For example there’s a wealth more knowledge in CJK languages, although in recent years a lot more English content has been created.

Still one can imagine conflicting or substantially different information depending on whether it’s coming from amateurs or pros, and one has to decide which language is the base language to translate the text from.

Let’s say you add Finnish, and English is the base language, but the Chinese version is much better in many positions. You need someone (or a volunteer team) to manually curate this every time there’s substantive change.

That’s the main issue. Otherwise things end up way out of sync, like the Boardgame arena example.

I think the timestamp of when the user added it in this case is much less important than the “timestamp” or date of a source.

OJE in principle wants variations that have sources, not just anything a random person decided to add like josekipedia.

So it’s really whether the source is from an 80s Japanese textbook, or a post AlphaGo video, lecture, book etc that’s important, not when someone came across that and decided to add it.


This problem applies to joseki information in general, not just to translations. Already there is a team of volunteers who curate the OJE, and one may find conflicting information in other sources.

Of course, it would not be ideal to have conflicting information about a joseki in the OJE itself. That’s why I proposed timestamps (to highlight outdated translations) and always showing the English text (as a common explanation to compare against), such that conflicts can at least be found easily. That way, users can point them out and/or fix them.

Note that right now, users also have to find and point out bad translations manually, but have no way of fixing them.

That’s a great point. Some kind of “source timestamp” would be more useful than a “last edit timestamp”, while also helping with comparing explanations for the same joseki in different languages.

That is valid, and I like that approach, so I wouldn’t want to change it. Some of us would simply like to improve the OJE experience in their native language :slight_smile:

So what I’ve gathered is that the Pootle approach, which is mostly meant for small words and phrases, wouldn’t be good for OJE, also because of how frequently and how easily OJE text changes.

If one did do a direct translation system, like if you select a position, select a source language (English for now) and target language, then simply add the translation, then I agree timestamps are a good way to see if meanings have changed or if one language has fallen behind another.

My specific point was more so if you can freely add information in any language because you can have multiple language pools.

Then you can have conflicting information existing at the same time.

The difference at the moment is that while an old Go book says one thing and a new video from a pro says another, someone has made that decision already on how to present the information (correct, mistake, comments etc)

There are not two versions where you put both of the conflicting sources in.

That can happen with separate information pools, like languages, where you’re not simply asking for direct translations.

A direct translation system like you described is what I had in mind too. It indeed seems more suitable for OJE than Pootle.

I understand your concerns about conflicting information in different languages and agree with your point. It wouldn’t really make sense to have the overall evaluation (whether a move is good or bad) be dependent on the language. So if a translation system is implemented, the overall evaluation should still be a global setting.

In order to deal with this, I would suggest the following (numbers continued from my previous suggestions):

  5. Give translators specific instructions. For example, “Please provide an explanation of this position in your language. If there already is an English explanation of this position, please stick closely to it. If you disagree with the overall evaluation of this position, first discuss it in the Forums before making any changes.”
  6. Do not immediately publish new translations, but keep them pending until a “Curator” approves them. Curators could be manually selected, trustworthy members; or automatically selected based on number of accepted OJE edits; or something else.

Would that be acceptable?
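To illustrate the pending/approval idea from the list above, here is a small sketch. It assumes a fixed set of curator usernames and an in-memory queue; `Submission` and `TranslationQueue` are hypothetical names, and how curators are actually selected is left open, as in the suggestion itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Submission:
    position_id: str
    language: str
    text: str
    author: str


class TranslationQueue:
    def __init__(self, curators: Set[str]) -> None:
        self.curators = curators
        self.pending: List[Submission] = []
        # (position_id, language) -> published text
        self.published: Dict[Tuple[str, str], str] = {}

    def submit(self, submission: Submission) -> None:
        """New translations are held back instead of going live immediately."""
        self.pending.append(submission)

    def approve(self, submission: Submission, curator: str) -> None:
        """Publish a pending translation; only curators may do this."""
        if curator not in self.curators:
            raise PermissionError(f"{curator} is not a curator")
        self.pending.remove(submission)
        self.published[(submission.position_id, submission.language)] = submission.text
```

The overall evaluation of a move (good or bad) is deliberately not part of `Submission`, since that would stay a global setting shared by all languages.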

It would be @GreenAsJade that could answer that.

Putting it all together your approach is:
