Memrise2Anki Replacement

@Eltaurus I’ve made a lot of progress. Could you help me with downloading the files using the extension API? I’m not sure how the communication between the scripts works. The files are supposed to go into a subfolder, as courseID/filename.

relevant api: Chrome Extension- Change download folder for specific download - Stack Overflow

current fork: GitHub - baroxyton/CourseDump2022: Google Chrome extension to download Memrise courses as csv files
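
For anyone following along, here is a minimal sketch of how a download into a courseID/filename subfolder could be requested. It assumes a background script with the "downloads" permission. The function names, the message shape ({ type: "download", ... }) and the rule of taking the file name from the last URL segment are made up for illustration and are not taken from the fork. Content scripts cannot call chrome.downloads themselves, so the page side sends a message and the background script starts the download.

```javascript
// Build the options object for chrome.downloads.download().
// `filename` is a path relative to the browser's Downloads directory,
// so "courseID/filename" places the file in a per-course subfolder.
function buildDownloadOptions(courseId, url) {
  // Assumption: the last URL path segment is a usable file name.
  const name = decodeURIComponent(new URL(url).pathname.split("/").pop());
  return {
    url,
    filename: `${courseId}/${name}`,
    conflictAction: "uniquify", // let the browser rename the file on a name clash
  };
}

// Background side: listen for download requests coming from the content script.
// The guard keeps this file loadable outside of an extension context.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.downloads) {
  chrome.runtime.onMessage.addListener((msg) => {
    if (msg && msg.type === "download") {
      chrome.downloads.download(buildDownloadOptions(msg.courseId, msg.url));
    }
  });
}
```

From the content script, a request would then be sent with chrome.runtime.sendMessage({ type: "download", courseId, url }).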

I recently paid for an AwesomeTTS subscription and use it with imported Duolingo words, and I am now trying it with my Memrise decks. Apart from the recordings being good quality, I feel that having voices different from the ones you’re used to hearing in Duolingo or Memrise is also beneficial.

Thank you and @naynaynay23 for your precious contribution! It makes some of our lives easier and I can’t thank you guys enough!

While using the extension I stumbled upon two bugs:

1. When downloading this course:

  • If I pick NOT to download media it grabs all 644/644 words correctly.
  • If I pick to DOWNLOAD media it grabs 461/644 words and 507 media files regardless of embedding or not embedding them in CSV entries.

2. This course has multiple media per card (pronunciations by different people) but only the first one gets downloaded and embedded per CSV entry.

Thank you very much for this feedback.
I’ve updated the extension so that it would handle different numbers of audio and video files per word. It seems to work for both of your cases now.

I also slightly changed the way the data is stored in the .csv files. I hope it won’t cause much trouble, but if you have any problems importing the new tables into Anki, please feel free to ask here.

Thank you! :heart:

Though the download process works perfectly fine now, I ran into some obstacles while importing the .csv into Anki and have a few suggestions:

1. Currently media files get added as separate fields e.g.,

"Front","Back","Media"

which Anki recognizes accordingly:
[screenshot: anki_2022-10-17_13-02-40]

The problem is that Anki can’t determine which side the media belongs to and therefore can’t put it to use. And there is no option to append additional fields to existing ones.

What could be done about this is to add the media to the corresponding field at the .csv generation stage, e.g.,

"Front, Media","Back"
"Front","Back, Media"
"Front, Media","Back, Media"

2. While trying to figure out a solution to the problem above I discovered that Anki Manual says:

Do not put subdirectories in the media folder, or some features will not work.

And this explains why:

Features like automatically deleting unused media, syncing etc. won’t work.

So I guess it’s better to reference media in the .csv without subfolders, keep the download structure as is, and then instruct the user to put the files directly into the collection.media folder.

3. Currently, if a user chooses to download media, the extension generates the .csv, then shows “Would you like some help with Anki integration?”, and then waits for a response before downloading any media. I guess it would be better to show this message at the very end of the script, after all the files are saved.

Sorry if this is overwhelming; by no means do I want to take up your free time with this. If I could, I would fix it myself, but alas. :sweat_smile:

And thank you again for your brilliant work!

There is, but to do so you’ll have to modify the whole Note Type before starting the import.
In Anki go to Tools->Manage Note Types, select the Note Type you would like to edit from the list (in the screenshot you are using the Basic (and reversed card) one), then press Fields and Add.
You might want to clone the existing Note Type before adding fields though, so as to preserve the original. For that, instead of just selecting the Note Type from the list, click Add, select the Note Type from the list there and enter a new name, then proceed to adding fields to the new Note Type.

Alternatively, you can simply import the Basic with Media.apkg file, which I’ve included in the extension folder (the Anki deck it produces can be safely deleted right after). It will create the Template: Basic with Media Note Type for you, which you can further modify to your liking and use instead of Basic (and reversed card) for importing data downloaded from Memrise. (I plan to make a better template in the future as well, something closer to this)

It might seem a bit overcomplicated, but the reason for such separation of fields is that it makes subsequent editing in Anki much more flexible. If, for example, you would like to move Audio from the Front to the Back of a Card at some point, you will only need to move one corresponding line in the card template, without needing to redownload the whole csv table with different formatting and import it again into Anki (potentially losing the information about learned cards in the process).


Ah, yes. We discussed the subfolder issue with @naynaynay23, but I completely forgot about this part. I’ve removed the subfolder names from the csv table for now. (maybe it can be made to work in some way in the future)
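
To illustrate what dropping the subfolder from the table means in practice, here is a small sketch. It assumes media paths of the form "subfolder/file.ext" and uses Anki’s usual field syntax for media ([sound:...] for audio and video, an img tag for pictures). The helper names and the list of image extensions are hypothetical and not taken from the extension.

```javascript
// Keep only the file name from a download path such as "12345/hello.mp3".
function baseName(path) {
  return path.split("/").pop();
}

// Turn a downloaded media path into a reference Anki can resolve
// against files placed directly in collection.media (no subfolders).
function ankiMediaRef(path) {
  const name = baseName(path);
  // Assumption: these extensions are images, everything else is audio/video.
  const isImage = /\.(png|jpe?g|gif|webp)$/i.test(name);
  return isImage ? `<img src="${name}">` : `[sound:${name}]`;
}
```

The files themselves can still be downloaded into per-course subfolders; only the names written into the table need to be flat.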


You are right. I was thinking about the placement of this popup as well and decided to put it before downloading media, so that one could get busy reading the manual while waiting for the script to do its job. But since the downloading itself is performed by a separate process, there is no point in delaying its start, so I’ve updated the extension accordingly.


Thank you for your valuable contributions and for your kind words :blush:

It actually is very smart! I was hesitant at first but then tried the Basic with Media.apkg and even tidied it up a little in Anki’s default fashion with example cards for the future users of your extension! I sent you a pull request.


Please note that Google will enforce Manifest V3 for all Chrome extensions starting January 2023, so the extension will probably stop working in a few months. I hope it won’t require a complete rewrite.

Thank you for this addition, and also for helping with filling up the readme file.

Yes, I’m aware of the manifest version issue. I tried to write the extension using v3 initially, but it was too hard to filter out the much more prominent v2 search results, so I decided to deal with the API change later.
However, it seems that the support for v2 was recently announced to continue till 2024. I believe that Memrise will break something much sooner than that :slightly_smiling_face:

Languages I hoped Memrise would help me learn:
[image]

Languages Memrise actually makes me learn:
[image]

:upside_down_face:

It seems that in the middle of all this we forgot to answer the original question.

Yes, it is possible to extract audio with this extension now :)

@Eltaurus I opened some issues for a bad course and a suggestion. Do you want me to continue to do that or just post suggestions here?

It’s totally up to you, but if you post your suggestions here, I think more people would see them and be able to share their opinions.

Suggestion: Make an option for the extension to always download media instead of asking every time. That’s because it can take 1-5 seconds for the dialog to come up and if you move away to another tab, the dialog will close.

Also, an option to never offer help with importing into Anki.

Thanks

Another suggestion: insert the UTF-8 BOM into the downloaded CSV by default. Then you won’t have people asking all the time why Excel is showing garbage.

Here are some simple scripts that will batch convert:

I personally used the Notepad++ with Python script one. Read the comments for a properly working version, but it does the job.
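
For reference, adding the BOM at export time can be very small. The sketch below assumes the CSV is available as a string before it is saved. The names UTF8_BOM and withBom are hypothetical, and this is not necessarily how the extension does it.

```javascript
// U+FEFF at the start of the text is encoded as the bytes EF BB BF in UTF-8,
// which Excel uses as a hint to read the file as UTF-8.
const UTF8_BOM = "\uFEFF";

// Prepend the BOM unless the text already starts with one.
function withBom(csvText) {
  return csvText.startsWith(UTF8_BOM) ? csvText : UTF8_BOM + csvText;
}

// Example: the encoded output begins with the three BOM bytes.
const bytes = new TextEncoder().encode(withBom('"Front","Back"\n'));
// In a browser, the same string could be saved via
// new Blob([withBom(csv)], { type: "text/csv;charset=utf-8" }).
```

Anki’s text import generally copes with a leading BOM as well, but that is worth checking on a small file first.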

That’s neat! I’ve added that to the exporting settings and Excel seems to read the files properly now.
I also added global constants to disable the popups, and added the course ID to the saved output.

I noticed something when trying to download multiple courses with audio at the same time. If the load is too high, it’ll slow your computer to a near freeze, but that’s not the problem. The problem is that the audio files in the folders will have duplicates. The total number is correct, but I see files with (1) in their names, meaning the browser had to rename them, and it is highly likely that some audio files are missing because their place was taken by a duplicate.

I’m guessing that courses often have audio files that share the same name, so when two courses hit the same name, it’s random which file ends up being kept.

The problem doesn’t exist when you download 1 at a time, so it’s not a huge deal, but if you want to queue a bunch and go away, it may not work.

If it’s easy, I would suggest modifying the download filename by prepending/appending course ID, then when you copy into the media folder, remove the course ID to match the csv.

Or modify the csv link source and filename with course ID and then there shouldn’t be any issues.

The media files are downloaded directly into their respective subfolders and are not copied there from somewhere else. So I’m not sure how modifying names would solve this issue.

If you are looking for a way to run several instances of the script at the same time, you can try modifying the line

await sleep(100);

in the background.js file, replacing the number with some larger value like

await sleep(1000);

This will increase the interval between downloads in each thread, so it should minimize the chance of the threads clashing.
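
In case it helps to see the idea in isolation, here is a sketch of a paced download loop. The sleep helper below is a common way to write such a function and is only assumed to resemble the one in background.js. downloadAll and its startDownload callback are hypothetical names used for illustration.

```javascript
// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Start the downloads one by one, pausing between consecutive starts.
// `startDownload` is whatever actually triggers a download for one URL.
async function downloadAll(urls, startDownload, delayMs = 1000) {
  for (const url of urls) {
    startDownload(url);
    await sleep(delayMs); // a larger value means fewer downloads in flight at once
  }
}
```

A larger delay makes each course download slower, but it reduces how many files are being written at the same moment when several courses run in parallel.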


Btw, how many courses have you managed to download already?

My Chrome download folder is on a spinning disk, so it’s slow enough that I could see hundreds of *.tmp files there before my laptop finally caught up and moved them into the media folders. So I’m not sure whether it’s Chrome or the JavaScript that temporarily downloads to the download folder and then moves the files as part of an atomic operation.

I’ll try the sleep change.

I’ve downloaded dozens of the most popular courses for Korean.

Can I just say that you guys are the best? I thought I’d never get to import a course from Memrise again, but I’m glad I was wrong. I know there are more important things to tweak in this extension, but is there a way to add Memrise levels as Anki tags the way the old add-on used to? Preferably as hierarchical tags: for example, “German_1” as the parent tag, then “German_1::01_The_Basics”, “German_1::02_Asking_Yes/No_Questions” as the second-level tags, and so on.
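
To make the request concrete, here is a sketch of how such tags could be built from a course name, a level number and a level title. The function names are hypothetical. The only Anki facts relied on are that tags cannot contain spaces and that "::" separates levels of the tag hierarchy.

```javascript
// Anki tags are space-separated, so whitespace inside a tag becomes "_".
function toTagPart(text) {
  return text.trim().replace(/\s+/g, "_");
}

// Build "Course::NN_Level_Title", with NN being the zero-padded level number.
function levelTag(courseName, levelNumber, levelTitle) {
  const number = String(levelNumber).padStart(2, "0");
  return `${toTagPart(courseName)}::${number}_${toTagPart(levelTitle)}`;
}

// levelTag("German 1", 1, "The Basics") gives "German_1::01_The_Basics"
```

The resulting string could be written into an extra column of the table and mapped to Tags during import.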

That’s strange, for me the temporary download files appear in their media subfolders (which is the expected behavior), not in the root download directory. As far as the script is concerned, a subfolder name is just a part of a downloaded media file name. So the clashing of the names between different courses as you describe is rather puzzling.

I’m thinking about making a community-accessible collection of downloaded courses, as we did previously with the mems, and your contribution would be very much appreciated. Would you mind uploading what you have gathered somewhere?