⏳ Archiving the old General Discussion forum

I will attempt to archive parts of the old General Discussion forum here over the next while.

If you have threads you want to save from the General Discussion forum, paste the links to them in a reply here. If you would rather archive threads yourself, this guide shows an easy way to do it:

http://www.bitsgalore.org/2014/08/02/How-to-save-a-web-page-to-the-Internet-Archive/

The end goal is probably to paste the threads here in a format like this:

cos wrote:

alanh wrote:

One could argue that archiving the forum on archive.org is an unnecessary step perhaps for that.

Edited to add:

page 1

http://web.archive.org/web/20160506125610/http://www.memrise.com/thread/1809376/

http://web.archive.org/web/20160506125617/http://www.memrise.com/thread/1285485/

http://web.archive.org/web/20160506125622/http://www.memrise.com/thread/1809868/

http://web.archive.org/web/20160506125627/http://www.memrise.com/thread/1810640/

http://web.archive.org/web/20160506125632/http://www.memrise.com/thread/1810583/

http://web.archive.org/web/20160506125636/http://www.memrise.com/thread/1810542/

http://web.archive.org/web/20160506125640/http://www.memrise.com/thread/1810444/

http://web.archive.org/web/20160506125645/http://www.memrise.com/thread/1806134/

http://web.archive.org/web/20160506125650/http://www.memrise.com/thread/1810363/

http://web.archive.org/web/20160506125654/http://www.memrise.com/thread/1807498/

http://web.archive.org/web/20160506125658/http://www.memrise.com/thread/1809794/

Hi Arete_Hime,

Thanks for starting this off. I’ll try to make some time to contribute soon.

I was just wondering if, before the list gets too long, it would be advisable/possible for contributors to include a brief (no more than one line) description of what is contained in the thread which makes it worth archiving. For example, “contains user script for xxxxx”. It may save some duplication of effort.

Thanks. Yes, giving a description of the thread would be very helpful. I’m not going to do that myself though as I’m trying to do as much as possible automatically to save time.

I have already been doing this slowly on archive.org for some posts from the old forum that I have saved links to. Don’t think I’ll get to all of them of course; the more people the better.

By the way, one problem that may not be evident yet in using archive.org: Right now, all of the images are still hosted on memrise.com, so when you view a forum post on archive.org that loads those images, they still work. But archive.org won’t archive the images, so if memrise removes them - or just changes their locations - all of the archive.org forum posts will lose them. For some posts, that’ll just mean all the userpics go away and some other things probably, so they’ll look ugly but still be readable. But other posts, like this one for example - http://www.memrise.com/thread/1298677/ - use images as part of the post or comments. Those will lose some of their important content when viewed at archive.org if memrise takes those images down.

I hoped it did. That was actually an important reason for me to use it.

page 2

http://web.archive.org/web/20160506152236/http://www.memrise.com/thread/1801645/

http://web.archive.org/web/20160506152239/http://www.memrise.com/thread/1802409/

http://web.archive.org/web/20160506152249/http://www.memrise.com/thread/1803851/

http://web.archive.org/web/20160506152258/http://www.memrise.com/thread/1808728/

http://web.archive.org/web/20160506152304/http://www.memrise.com/thread/1808880/

http://web.archive.org/web/20160506152308/http://www.memrise.com/thread/1807597/

http://web.archive.org/web/20160506152313/http://www.memrise.com/thread/1763563/

http://web.archive.org/web/20160506152317/http://www.memrise.com/thread/1810266/

http://web.archive.org/web/20160506152324/http://www.memrise.com/thread/1809153/

http://web.archive.org/web/20160506152331/http://www.memrise.com/thread/1803689/

http://web.archive.org/web/20160506152338/http://www.memrise.com/thread/1808676/

page 3

http://web.archive.org/web/20160506161400/http://www.memrise.com/thread/1292545/

http://web.archive.org/web/20160506154544/http://www.memrise.com/thread/1745286/

http://web.archive.org/web/20160506154548/http://www.memrise.com/thread/1810025/

http://web.archive.org/web/20160506154557/http://www.memrise.com/thread/1804013/

http://web.archive.org/web/20160506154611/http://www.memrise.com/thread/1809807/

http://web.archive.org/web/20160506154640/http://www.memrise.com/thread/1798520/

http://web.archive.org/web/20160506154644/http://www.memrise.com/thread/1783972/

http://web.archive.org/web/20160506154653/http://www.memrise.com/thread/1809649/

http://web.archive.org/web/20160506154702/http://www.memrise.com/thread/1810043/

http://web.archive.org/web/20160506154705/http://www.memrise.com/thread/1359471/

http://web.archive.org/web/20160506154712/http://www.memrise.com/thread/1809992/

http://web.archive.org/web/20160506154718/http://www.memrise.com/thread/1809947/

Memrise Monitor chrome extension
http://web.archive.org/web/20160508135858/http://www.memrise.com/thread/1369581/

5 identical choices out of 6 to choose from
http://web.archive.org/web/20160508135858/http://www.memrise.com/thread/1807325/

I feel like autowatering words that I don’t know
http://web.archive.org/web/20160508140102/http://www.memrise.com/thread/1345991/

Forgive accidental typos
http://web.archive.org/web/20160508135930/http://www.memrise.com/thread/1293902/

app synching?
http://web.archive.org/web/20160508145626/http://www.memrise.com/thread/1295329/

Auto-ignore plugin
http://web.archive.org/web/20160508145630/http://www.memrise.com/thread/1808699/

Is there any “work mode” scripts/add ons/filters?
http://web.archive.org/web/20160508150351/http://www.memrise.com/thread/1809415/

Arete_Hime - Sorry, I haven’t made a contribution to this task yet. I did spend some time trawling through the old forum pages over the weekend but quickly found I was losing the will to go on living. :slight_smile: There was too big a risk that I was simply going over the same ground that you and @cos had already covered.

Anyway, it helps that the forum search tool on the Memrise Users Wiki site still works. That, at least, helps speed things up…if you can remember what it is you are actually looking for. :wink:

The short descriptors you have used in your last list will be a big help too!

Thanks for all your hard work.

I’ve now trawled through the first 50 pages of General Discussion and generally saved maybe 1-4 threads per page. As I went on I included more of the first post (I’ll go back over the first links as well and include some text). I’ll do Course Ideas probably as well over the next couple of days.

I do find myself wondering what the use of it is. We collectively know enough tips and tricks to not need to save those. And as the discussions get older they become less applicable and interesting. The two kinds of posts that keep being applicable and useful I think are userscripts and recommendations of courses.

As usual, I’ve written an AutoHotkey script to speed things up. My procedure:

Prerequisites: one open instance of Notepad (with a file called links.txt, which is the window title the script looks for) and Chrome, the AutoHotkey script active, and the Chrome extension from my first post installed.

Keep a Chrome tab open on the General Discussion/Course Ideas forum and middle-click any threads that look interesting (this opens each in a new tab) until you’ve collected a few. Right-click inside each thread you want to save and click “Save this page to the Internet Archive”, then do something else until that has finished. Highlight the text from the post that you want to use as its description and hit F9. This pastes the text into Notepad, then pastes the URL into Notepad, and finally takes you to the next open tab in Chrome.

#NoEnv  ; Recommended for performance and compatibility with future AutoHotkey releases.
; #Warn  ; Enable warnings to assist with detecting common errors.
SendMode Input  ; Recommended for new scripts due to its superior speed and reliability.
SetWorkingDir %A_ScriptDir%  ; Ensures a consistent starting directory.

SetTitleMatchMode 2  ; Match window titles that contain the given text anywhere.

F9::
{
; Copy the highlighted text from the thread in Chrome.
Send, {CTRLDOWN}c{CTRLUP}
; Switch to Notepad and paste the text on a new line at the end of the file.
WinWait, links.txt - Notepad,
IfWinNotActive, links.txt - Notepad, , WinActivate, links.txt - Notepad,
WinWaitActive, links.txt - Notepad,
Sleep, 100
Send, {CTRLDOWN}{END}{CTRLUP}{ENTER}{CTRLDOWN}v{CTRLUP}
Sleep, 300
; Switch back to Chrome, select the address bar (Ctrl+L) and copy the URL.
WinWait, Memrise - Google Chrome,
IfWinNotActive, Memrise - Google Chrome, , WinActivate, Memrise - Google Chrome,
WinWaitActive, Memrise - Google Chrome,
Sleep, 400
Send, {CTRLDOWN}l{CTRLUP}
Sleep, 200
Send, {CTRLDOWN}c{CTRLUP}
; Switch to Notepad again and paste the URL, followed by two empty lines.
WinWait, links.txt - Notepad,
IfWinNotActive, links.txt - Notepad, , WinActivate, links.txt - Notepad,
WinWaitActive, links.txt - Notepad,
Sleep, 100
Send, {CTRLDOWN}{END}{CTRLUP}{ENTER}{CTRLDOWN}v{CTRLUP}{ENTER}{ENTER}
Sleep, 300
; Return to Chrome, tap Alt, then press Ctrl+Tab to move to the next open tab.
WinWait, Memrise - Google Chrome,
IfWinNotActive, Memrise - Google Chrome, , WinActivate, Memrise - Google Chrome,
WinWaitActive, Memrise - Google Chrome,
Sleep, 300
Send, {ALT}
Sleep, 500
Send, {CTRLDOWN}{TAB}{CTRLUP}
}
return

… by the way, my internet is slow so everything takes 5 times too long. If you or anyone wants to help, please do.

Part of the point of this is that when people run across links to memrise forum threads and those links lead to a 404, they may think to try archive.org, so it’d be good if linked-to posts were there. I’ve been going through my memrise notification emails and archiving those.
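For anyone who wants to check whether a dead thread link already has a copy, here is a minimal Python sketch using the Wayback Machine availability endpoint at archive.org/wayback/available. The function names are made up for illustration, and the response layout (an "archived_snapshots" object containing a "closest" entry) is assumed from the public description of that endpoint, so treat this as a starting point rather than a tested tool.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Wayback Machine availability endpoint (assumed to behave as publicly described).
AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_query(url: str) -> str:
    """Build the availability query URL for one original page URL."""
    return AVAILABILITY_API + "?" + urllib.parse.urlencode({"url": url})


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the snapshot URL from a decoded response, or None if there is none."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(url: str, timeout: float = 30.0) -> Optional[str]:
    """Ask archive.org (network call) whether a snapshot of `url` exists."""
    with urllib.request.urlopen(availability_query(url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

Calling lookup("http://www.memrise.com/thread/1809376/") should return a web.archive.org address if a snapshot exists, or None otherwise.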

I found this tool that looks like it could really speed this whole process up: https://internetarchive.readthedocs.io

It’s apparently a command line interface that allows one to interact with archive.org. If somebody got a list of links to each of the forum posts (probably not difficult to get) this could potentially be used to upload them all with one simple command.

My biggest concern is how easily searchable any of the information would be.

Many thanks, Arete_Hime, for doing this valuable but onerous job.

Not wishing to sound ungrateful, but I wonder if, when someone visits a link, they could paste the title of the thread above the link. Otherwise we could end up looking for ages for a specific post we are interested in, unless ‘we’ had actually made a note of the thread number (which I’m afraid I never did; I only noted the title).

Thanks. I looked at it. It looks like that only covers uploading (items that are on your computer) and not archiving URLs? If it can do bulk archiving of URLs, I agree collecting them wouldn’t be a major issue.

@DW7, I did that later and I will go back over the ones for which I did not do that.

Thank you so much Arete_Hime - I thought that when I scrolled down.

It will be so helpful :star: :thumbsup:

Hey, this is a great idea. I’ve got a few things to say that I hope will help.

First, I’d like to point out that archive.org does indeed make copies of images, so we shouldn’t be worried about losing them. If you look at the URLs for images in the page source, they all have /web/ before them. If this wasn’t the case before, I suspect that it takes archive.org a while to download all the images from the original page to its servers.
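To check this for a particular saved page, one could look at where its img tags point. The sketch below is only an illustration in Python: the class and function names are invented here, and it assumes that an archived image is recognisable by a src starting with /web/ or with a web.archive.org/web/ address, as described above.

```python
from html.parser import HTMLParser
from typing import List, Tuple

# Prefixes taken to mean "served by archive.org" (an assumption, see above).
ARCHIVE_PREFIXES = (
    "/web/",
    "//web.archive.org/web/",
    "http://web.archive.org/web/",
    "https://web.archive.org/web/",
)


class ImageSources(HTMLParser):
    """Collect the src attribute of every img tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def split_image_sources(html: str) -> Tuple[List[str], List[str]]:
    """Return (archived, still_live) image URLs found in the page source."""
    parser = ImageSources()
    parser.feed(html)
    archived = [s for s in parser.sources if s.startswith(ARCHIVE_PREFIXES)]
    live = [s for s in parser.sources if not s.startswith(ARCHIVE_PREFIXES)]
    return archived, live
```

If the second list comes back non-empty for a saved thread, those images are still being loaded from the original host and would disappear if it removed them.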

Second, according to the documentation that @DrewSSP provided, it’s true that the Internet Archive API can only be used to upload documents from your local machine. However, there is a workaround for this if you have access to either Cygwin or a Linux machine.

You can pipe the results of a curl command into ia upload, as shown in the documentation here: https://internetarchive.readthedocs.io/en/latest/cli.html

However, there are two caveats to this:
First, curl downloads the HTML from a webpage to your machine. Essentially, this means you have to wait for every page to download before it can be uploaded back to archive.org. It’s automated so you don’t have to watch it, but it could be time-consuming.
Second, the upload command only works for one page at a time. You’d have to get the URLs beforehand, store them in a file, and write a script to run ia upload on each of them in turn.
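The "one URL at a time from a file" loop could look roughly like the Python sketch below. Note that it swaps in a different technique from the curl and ia upload pipeline: it asks the Wayback Machine to take its own snapshot by requesting https://web.archive.org/save/ followed by the page address, which I am assuming still accepts a plain GET request. The function names and the ten-second pause are invented for illustration.

```python
import time
import urllib.request
from typing import Iterable, List

# "Save Page Now" address prefix (assumed to accept a plain GET request).
SAVE_ENDPOINT = "https://web.archive.org/save/"


def read_urls(text: str) -> List[str]:
    """Pick out the http(s) lines from a links file, dropping duplicates."""
    seen = set()
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(("http://", "https://")) and line not in seen:
            seen.add(line)
            urls.append(line)
    return urls


def save_request_url(url: str) -> str:
    """Build the address that asks archive.org to snapshot `url`."""
    return SAVE_ENDPOINT + url


def archive_all(urls: Iterable[str], delay_seconds: float = 10.0,
                timeout: float = 120.0) -> List[str]:
    """Request a snapshot of each URL in turn; return the URLs that failed."""
    failed = []
    for url in urls:
        try:
            with urllib.request.urlopen(save_request_url(url), timeout=timeout):
                pass  # Only the fact that the request succeeded matters here.
        except OSError:  # Covers URLError, HTTPError and timeouts.
            failed.append(url)
        time.sleep(delay_seconds)  # Be gentle with the server.
    return failed
```

With a file of titles and links like the one the F9 script produces, archive_all(read_urls(open("links.txt").read())) would work through every link and hand back the ones to retry.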

There’s one more possibility: Scrapy. http://doc.scrapy.org/en/0.14/index.html#
Scrapy is a Python framework designed for building web crawlers. This would allow us to write a Python program to crawl the Memrise forum and pick out the URLs for each post. Once aggregated, the same script could be coded to submit each post to archive.org on its own.
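As a rough idea of the "pick out the URLs" step, here is a small Python sketch that uses only the standard library's html.parser in place of Scrapy. It assumes that thread links on a forum index page are ordinary a tags whose path looks like /thread/ followed by digits, as in the links posted earlier; the real page markup has not been checked, and the class and function names are made up.

```python
import re
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse

# Thread paths as they appear in the links above, e.g. /thread/1809376/
THREAD_PATH = re.compile(r"^/thread/\d+/?$")


class ThreadLinks(HTMLParser):
    """Collect absolute, de-duplicated thread URLs from a forum index page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links, then drop any query string or #fragment.
        parsed = urlparse(urljoin(self.base_url, href))
        if THREAD_PATH.match(parsed.path):
            clean = parsed._replace(query="", fragment="").geturl()
            if clean not in self.links:
                self.links.append(clean)


def extract_thread_links(html: str, base_url: str) -> List[str]:
    """Return the thread URLs found in one page of forum HTML."""
    parser = ThreadLinks(base_url)
    parser.feed(html)
    return parser.links
```

Running this over each index page in turn and collecting the results would give the list of thread links to archive.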

Furthermore, you stated that your hope is to paste the threads here. A web crawler could be used to reformat output from each original post automatically. From there, it would be simple to paste each post here.

You could skip archive.org and build the posts immediately with Scrapy, but I’m not yet sure if there is a way to automate the upload process to this site. You might run into spam blockers trying to do that.

Thanks. I’ve exceeded the time I’m willing to devote to this already, and I don’t know how to write the scripts required, so I am not going to do this.

I have now trawled through 50/230 pages of the General Discussion forum and archived 203 threads. It’s unlikely I’ll do any more from that.

I’ve trawled through 26/42 pages of Course Ideas and archived 196 threads. I may try to finish that, or maybe not. Can a 2-year-old thread still have useful info?

As the threads get older, it becomes close to impossible for me to determine if they have any useful content without reading them (which I am not going to do) as I have read none of them.

Oh, and then there are the other forums, like http://www.memrise.com/forum/3/ Feature Suggestions

If anyone wants to continue…

@arete_Hime ~ I have to confess that I don’t understand a great deal of what everyone is talking about with this thread, but I just wanted to say “thank you” for all your efforts on behalf of the Memrise community. You went above and beyond, and stepped up when no one else did. I commend you for that, and I am sure others do as well. Thank you. 謝謝 !

Thanks :slight_smile: I figured it would take me a few hours, so why not. It took a bit longer though. The only bother is that my internet is slow, making it take longer than it should.