Hi guys, no idea which forum category to post this in. I was looking for ways to replace words with kanji for study purposes, and found this Greasemonkey script that someone made. It replaces words even when they’re inside another word, which is what I need; for example, “ŝi ek話is” is correct and a vital function.
Ideally, there would be settings to make it replace a word only half the time, or only every third time the word appears, and you would be able to click on or hover over a replaced word to see what it stands for (hover on 兄, get “frat”). As it is, you have to keep the word list open and Ctrl+F through it to see which word was replaced. Anyway, here’s the whole script with a few examples:
// ==UserScript==
// @name Replaces Words
// @namespace random URL so that if you have duplicate script names the computer can still tell them apart
// @description Replaces text.
// @copyright JoeSimmons
// ==/UserScript==
(function () {
    'use strict';
    var words = {
        'malproksim': '遠',
        'dog\'s': '犬の',
        ' mi ': ' 私 ',
        ' mi ': ' 俺 ', // same key as the line above, so this value overrides it
        ' du ': ' 二 ',
        ///////////////////////////////////////////////////////
        '': ''
    };
    ///////////////////////////////////////////////////////////////////////////////
    var regexs = [],
        replacements = [],
        // despite the name, these are the tags whose text is left alone
        tagsWhitelist = [
            'PRE',
            'BLOCKQUOTE',
            'CODE',
            'INPUT',
            'BUTTON',
            'TEXTAREA'
        ],
        rIsRegexp = /^\/(.+)\/([gim]+)?$/,
        word,
        text,
        texts,
        i,
        userRegexp;
    // prepareRegex by JoeSimmons
    // used to take a string and ready it for use in new RegExp()
    function prepareRegex(string) {
        return string.replace(/([\[\]\^\&\$\.\(\)\?\/\\\+\{\}\|])/g, '\\$1');
    }
    // function to decide whether a parent tag will have its text replaced or not
    function isTagOk(tag) {
        return tagsWhitelist.indexOf(tag) === -1;
    }
    // so the user can add each entry ending with a comma,
    // I put an extra empty key/value pair in the object.
    // so we need to remove it before continuing
    delete words[''];
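    // convert the 'words' object to two parallel arrays (patterns and replacements)
    for (word in words) {
        if (typeof word === 'string' && words.hasOwnProperty(word)) {
            userRegexp = word.match(rIsRegexp);
            // add the search/needle/query
            if (userRegexp) {
                // entry written as '/pattern/flags': used as a regular expression
                // (any flags after the closing slash are ignored; 'g' is always used)
                regexs.push(new RegExp(userRegexp[1], 'g'));
            } else {
                // plain entry: escape it, then turn a bare * into a wildcard
                // for any run of non-space characters (\* stays a literal *)
                regexs.push(new RegExp(prepareRegex(word).replace(/\\?\*/g, function (fullMatch) {
                    return fullMatch === '\\*' ? '*' : '[^ ]*';
                }), 'gi'));
            }
            // add the replacement
            replacements.push(words[word]);
        }
    }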
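    // do the replacement
    // 6 is XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE: a fixed list of the non-empty text nodes in <body>
    texts = document.evaluate('//body//text()[ normalize-space(.) != "" ]', document, null, 6, null);
    for (i = 0; (text = texts.snapshotItem(i)); i += 1) {
        if (isTagOk(text.parentNode.tagName)) {
            // every entry is applied to the same text in turn, top to bottom
            regexs.forEach(function (value, index) {
                text.data = text.data.replace(value, replacements[index]);
            });
        }
    }
}());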
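To see what a single entry in the word list actually matches, here is a small standalone sketch. It copies prepareRegex() and the wildcard handling from the script above; the helper names entryToRegex and applyEntry, the sample sentences, and the kanji 上 are made up for illustration only.

```javascript
// Copied from the script above: escapes regex metacharacters (but not *).
function prepareRegex(string) {
    return string.replace(/([\[\]\^\&\$\.\(\)\?\/\\\+\{\}\|])/g, '\\$1');
}

// Hypothetical helper: builds the same pattern the script builds for a plain entry.
function entryToRegex(word) {
    var source = prepareRegex(word).replace(/\\?\*/g, function (fullMatch) {
        return fullMatch === '\\*' ? '*' : '[^ ]*';
    });
    return new RegExp(source, 'gi');
}

// Hypothetical helper: applies one entry to a string.
function applyEntry(text, word, replacement) {
    return text.replace(entryToRegex(word), replacement);
}

var sample = 'an overhaul and a sleepover party';
// Space in front: only matches where a word begins with "over".
console.log(applyEntry(sample, ' over', ' 上'));
// Space at the end: only matches where a word ends with "over".
console.log(applyEntry(sample, 'over ', '上 '));
// No spaces: matches anywhere, even inside other words.
console.log(applyEntry(sample, 'over', '上'));
// Periods in a plain entry are escaped, so only the literal text matches.
console.log(applyEntry('N.A.S.A. and NxAxSxAx', 'N.A.S.A.', 'NASA'));
```

The four lines printed are “an 上haul and a sleepover party”, “an overhaul and a sleep上 party”, “an 上haul and a sleep上 party” and “NASA and NxAxSxAx”.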
Instructions:
- Install Greasemonkey on Firefox, or an equivalent userscript manager on Chrome (Tampermonkey, for example). In Firefox, go to “Tools > Greasemonkey” in the top menu bar; hovering over it shows some options, one of which is “New Userscript”. Click that.
- Copy-paste the code above into it, and make sure to put a random URL where it says to (on the @namespace line).
- The word on the left is what gets replaced. The word on the right is what it gets replaced with:
'viand': '肉',
= “viand” will now appear as 肉 instead.
If you have two entries whose left sides are exactly the same, the script does not cycle between the replacements. The word list is an ordinary JavaScript object, which can hold each key only once, so the later entry overwrites the earlier one:
' mi ': ' 私 ',
' mi ': ' 俺 ',
With these two lines, “mi” is always replaced with 俺. (EDIT: I first thought this gave 私 half the time and 俺 the other half. It seemed to work sometimes but not all the time, and that was most likely just my imagination.)
  - Add spaces to control where a word matches. ' or ' with a space at both ends only replaces “or” when it stands alone as a word. ' over' with a space only in front replaces it when it begins a word, so “overhaul” gets its “over” replaced but “sleepover” doesn’t. 'over ' with a space only at the end does the opposite: “sleepover” gets replaced but “overhaul” doesn’t. Keep in mind that these are literal spaces, so a word right next to punctuation, or at the very start or end of a piece of text, won’t match. You can of course replace whole phrases as well, for example:
' at the dog ': ' hundinum ',
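  - To include an apostrophe, put a backslash in front of it, like this: \'
'dog\'s' = dog’s
  - The author of the script said that to include periods/full stops, do this: '/\N.A.S.A./'; however, I haven’t tested this part. Going by the code, an entry wrapped in slashes is treated as a regular expression, where an unescaped period matches any character, while prepareRegex() already escapes the periods in an ordinary entry, so a plain 'N.A.S.A.' should match the literal text.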
- Make a list with your words where my examples are: after the “var words = {” line but before the long line of ////////////.
- Run that word list through a “reverse sort by length” line sorter like this one. (To use that one: paste in your entries, hit “Length”, then hit “Reverse”.) The script applies the entries from top to bottom, so if a longer entry isn’t above the shorter entries it contains, the shorter one gets replaced first and the longer one never matches.
Note that the sorter counts the spaces in front of the ' as characters, so if you have entries with different amounts of leading space, like this for example:
' mi ': ' 私 ',
  ' mi ': ' 俺 ',
those won’t be counted as the same length by the sorter.
- Now copy-paste that sorted list back into the script, save it and test it out.
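  - If there are words you don’t want replaced, for example “hey” should never be replaced but “ey” is catching it, be careful with the obvious fix of adding this above “ey”:
'hey': 'hey',
Going by the code, that entry changes nothing: every entry is applied to the whole text in turn, so “ey” still matches inside “hey” afterwards. What should work instead is writing the “ey” entry as a regular expression that refuses to match after an “h”, such as '/(?<!h)ey/' on the left side (untested, and it needs a browser that supports lookbehind).
If a compound word or phrase is being replaced wrong, that one is easier: add it as its own entry with the correct form as you notice it, above the shorter entries it contains. I add bunches of words, then check a random book at gutenberg.org to catch any immediate errors.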
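The “reverse sort by length” step can also be done in code instead of with an external tool. The sketch below is not part of the original script: sortLongestFirst and replaceInOrder are hypothetical helpers, the replacement loop is a simplified, case-sensitive stand-in for the script’s own, and the kanji for “proksim” is a made-up example.

```javascript
// 'malproksim' comes from the word list in the script; 'proksim' is made up.
var exampleWords = {
    'proksim': '近',
    ' du ': ' 二 ',
    'malproksim': '遠'
};

// Hypothetical helper (not in the original script): the keys of a word list,
// longest first. Spaces inside the quotes count towards the length.
function sortLongestFirst(words) {
    return Object.keys(words).sort(function (a, b) {
        return b.length - a.length;
    });
}

// Simplified stand-in for the script's loop: plain, case-sensitive replacement
// of each word, applied one after another in the given order.
function replaceInOrder(text, words, order) {
    return order.reduce(function (result, word) {
        return result.split(word).join(words[word]);
    }, text);
}

// List as written: the shorter 'proksim' runs first and breaks the longer word.
console.log(replaceInOrder('malproksime', exampleWords, Object.keys(exampleWords)));
// Longest first: 'malproksim' is replaced as a whole.
console.log(replaceInOrder('malproksime', exampleWords, sortLongestFirst(exampleWords)));
```

The first call uses the list as written and gives “mal近e”; the second sorts it first and gives “遠e”.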
———————
This script does NOT affect text inside PRE, BLOCKQUOTE, CODE, INPUT, BUTTON and TEXTAREA tags, so “input text” forms (like when writing a forum post here on Memrise) are left alone.
To exclude pages and domains so that they aren’t affected, you can go into the script’s settings in Firefox and add them there, or you can write lines like these directly into the script’s header (* stands for “anything goes here”):
// @exclude file://*
// @exclude *.memrise.com/*
And so on. The same goes for including pages: just replace “exclude” with “include”. This is good if you want the default to be that almost no pages are affected (since my list is in Esperanto, always having it enabled would change even English words according to my word list, which isn’t good).
Unless your list of words is really short, make sure you exclude search engines, Wikipedia, Facebook, Tumblr and Memrise. If your list gets really long like mine (1,000+ entries), pages will load slowly and a “this script isn’t responding!” error might pop up, where you have to hit “let it run anyway”.
Later I’ll experiment with duplicating the script and splitting up long lists into multiple smaller ones to see if that avoids the problem.
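One idea for long lists, offered as an untested sketch rather than a fix: instead of running one regular expression per entry over every piece of text, build a single pattern out of all the plain entries and look each match up in the list. The names escapeForRegex and buildReplacer are hypothetical, the sketch ignores the script’s /regex/ and * entries, the kanji for “proksim” and the ★ are made up, and whether this actually avoids the “script isn’t responding” warning would have to be measured.

```javascript
// Hypothetical helper: escapes characters that are special in a regular expression.
function escapeForRegex(string) {
    return string.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
}

// Hypothetical helper: turns a word list into one function that replaces
// every entry in a single pass over the text.
function buildReplacer(words) {
    var keys = Object.keys(words)
        .filter(function (key) { return key !== ''; })           // drop the empty filler entry
        .sort(function (a, b) { return b.length - a.length; });  // longest entries win
    if (keys.length === 0) {
        return function (text) { return text; };
    }
    var lookup = {};
    keys.forEach(function (key) { lookup[key.toLowerCase()] = words[key]; });
    var pattern = new RegExp(keys.map(escapeForRegex).join('|'), 'gi');
    return function (text) {
        return text.replace(pattern, function (match) {
            var found = lookup[match.toLowerCase()];
            return found === undefined ? match : found;
        });
    };
}

var replaceText = buildReplacer({
    'malproksim': '遠',
    'proksim': '近',
    'hey': 'hey',
    'ey': '★',
    '': ''
});
console.log(replaceText('malproksime, hey, grey'));
```

This prints “遠e, hey, gr★”. With a single pass, text that has already been matched is not looked at again, so in this sketch an entry like 'hey': 'hey' does keep “ey” from being replaced inside it.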
———————
Example of end result:
ne 考inte eĉ 一 瞬on, 何流e ŝi 再出iĝos, Alico 後走is ĝin en la tunel小on.
= ne konsiderinte eĉ unu momenton, kiamaniere ŝi reeliĝos, Alico postkuris ĝin en la tuneleton.
I was looking at the dog
= I was looking hundinum
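Coming back to the wish at the top of the post, and to alternating between 私 and 俺 for “mi”: one way to get real cycling would be to give an entry a list of replacements and step through it on each match. This is only a sketch of that idea, not something the original script does; makeCycler and the sample sentence are assumptions for illustration.

```javascript
// Hypothetical helper: returns a function that hands out the given
// replacements in turn, starting again from the first after the last.
function makeCycler(replacements) {
    var count = 0;
    return function () {
        var next = replacements[count % replacements.length];
        count += 1;
        return next;
    };
}

var nextMi = makeCycler([' 私 ', ' 俺 ']);
var sentence = 'ke mi kaj mi kaj mi ';
// String.prototype.replace accepts a function and calls it once per match.
var cycled = sentence.replace(/ mi /g, nextMi);
console.log(cycled);
```

This prints “ke 私 kaj 俺 kaj 私 ”. Putting the original word in the list as well, for example [' mi ', ' 私 '], would leave the first occurrence alone and replace only every second one.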