Backing up LJ safely

  • Jan. 7th, 2009 at 8:58 AM
On the better safe than sorry tip, I just backed up my LJ. The "services" out there to do this are insecure as hell. So no way I'm putting my password into them. Instead, I'm using this local script:

http://hewgill.com/ljdump/

Pretty nifty and easy on OS X [Edit: I should add, this will work on Windows or Linux as well; you're just on your own for installing Python. Not too hard though]. You need to have Python installed, but it's standard on most newer Macs, I think. Once you do:

1) download the script and double click to decompress
2) open up ljdump.config.sample in a text editor (TextEdit will do)
3) enter your username and password for LJ between the appropriate tags
4) save as ljdump.config (in the same directory as the sample)
5) open Terminal
6) cd Downloads/ljdump-1.2 (or wherever you downloaded it to)
7) python ljdump.py

It'll create a folder under your user name, then go through and download each entry as an XML file. Then it'll grab the comments too (which is almost as important to me).

726 entries, 8039 comments

That would have taken forever any other way.

You can then re-run this every once in a while to grab new entries, update any changes to old ones you might have made, grab new comments, etc. Now I just need to write a Ruby script to pop all this into MySQL for me.

Comments

[info]flaquita wrote:
Jan. 7th, 2009 02:23 pm (UTC)
bless you...i have been trying to figure out how to do this and most of the solutions have been Windows-based...whew.
*off to archive eight years' worth of entries*
[info]jimmyether wrote:
Jan. 7th, 2009 02:30 pm (UTC)
it took just under 20 minutes total for me.
[info]silentkid wrote:
Jan. 7th, 2009 02:32 pm (UTC)
ditto
thanks for this, always been leery of the "backup" services as well. Also, thanks for pointing me towards friendfeed. need a good central repository for lj and facebook. could've used it when i had myspace, one of the reasons I cancelled my account.
[info]jimmyether wrote:
Jan. 7th, 2009 02:43 pm (UTC)
Re: ditto
FriendFeed still needs some work before I'll give them the "ubercool" rating, but it really is pretty brilliant. The real-time feed is great, which is a concept that frankly is likely to totally change the way people use the internet. But importing/sharing from any site or service with an rss feed or api... that's simple, elegant and really powerful.

And then there are services like BackType, which allow you to track the comments you make on whatever sites you run *and* anywhere else you might comment. Which, if you want, you can feed into FriendFeed as well. http://www.backtype.com/jimmyether
[info]ravengirl wrote:
Jan. 7th, 2009 03:05 pm (UTC)
THANK you. I'll do this tonight.
[info]jimmyether wrote:
Jan. 7th, 2009 03:06 pm (UTC)
Happy to help! :)
[info]callmesteam wrote:
Jan. 7th, 2009 03:50 pm (UTC)
thanks! any idea where i effed up?

Macintosh:~ callmesteam$ cd Desktop/ljdump-1.2
Macintosh:ljdump-1.2 callmesteam$ python ljdump.py
Traceback (most recent call last):
  File "ljdump.py", line 116, in <module>
    config = xml.dom.minidom.parse("ljdump.config")
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xml/dom/minidom.py", line 1913, in parse
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/xml/dom/expatbuilder.py", line 922, in parse
IOError: [Errno 2] No such file or directory: 'ljdump.config'
[info]jimmyether wrote:
Jan. 7th, 2009 06:16 pm (UTC)
Looks like you just didn't edit the config file with your lj credentials and save it as ljdump.config in the same directory. It's not finding the ljdump.config file.
[info]callmesteam wrote:
Jan. 8th, 2009 02:46 am (UTC)
i did... thats what i'm not getting. its in a folder right next to the others in the same folder.
[info]jimmyether wrote:
Jan. 8th, 2009 05:22 am (UTC)
there are only two files you really need, both in the same folder:

ljdump.config
ljdump.py

Check the spelling. If you've got that and it's not working, then it must be a permissions issue (which is strange).

cd to the ljdump-1.2 directory and do:

chmod 777 ljdump.config
[info]jimmyether wrote:
Jan. 8th, 2009 05:26 am (UTC)
you can also do:

ls -la

while in that ljdump-1.2 directory and send me the output. It'll give you the owner and permissions on each file.
[info]callmesteam wrote:
Jan. 8th, 2009 02:45 pm (UTC)
Macintosh:ljdump-1.2 callmesteam$ ls -la
total 64
drwxr-xr-x@ 7 callmesteam callmesteam 238 Jan 8 07:43 .
drwx------+ 61 callmesteam callmesteam 2074 Jan 8 07:40 ..
-rw-r--r--@ 1 callmesteam callmesteam 6148 Jan 7 08:43 .DS_Store
-rw-r--r--@ 1 callmesteam callmesteam 418 Sep 8 2006 ChangeLog
-rw-r--r-- 1 callmesteam callmesteam 156 Sep 8 2006 ljdump.config.sample
-rw-r--r--@ 1 callmesteam callmesteam 154 Jan 8 07:43 ljdump.config.txt
-rwxr-xr-x 1 callmesteam callmesteam 10770 Sep 8 2006 ljdump.py
Macintosh:ljdump-1.2 callmesteam$


i tried everything and still no dice... i read the python text too...
[info]jimmyether wrote:
Jan. 8th, 2009 03:59 pm (UTC)
ahhhhh... simple. your file is actually named ljdump.config.txt

In terminal, cd to the directory and do:

mv ljdump.config.txt ljdump.config

that'll fix it
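
For anyone who hits the same "No such file or directory" error, here is a rough sketch of a check you could run before ljdump.py. It assumes a current Python 3, and the function name find_config is made up for this example (it is not part of ljdump). It looks in a folder for ljdump.config and lists near-miss names such as ljdump.config.txt, which is what a text editor can leave behind.

```python
import os


def find_config(directory, expected="ljdump.config"):
    """Return (found, lookalikes) for the expected config file name.

    find_config is a hypothetical helper for illustration only.
    """
    names = os.listdir(directory)
    found = expected in names
    # Look-alikes start with the expected name but carry an extra suffix,
    # e.g. "ljdump.config.txt". The shipped ".sample" file is ignored.
    lookalikes = sorted(
        name
        for name in names
        if name != expected
        and name.startswith(expected)
        and not name.endswith(".sample")
    )
    return found, lookalikes
```

Calling find_config(".") from inside the ljdump-1.2 directory returns a pair: whether ljdump.config exists, and a list of any misnamed copies worth renaming with mv.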
[info]callmesteam wrote:
Jan. 8th, 2009 04:23 pm (UTC)
you're a genius. it seems to be doing its thing now.
[info]callmesteam wrote:
Jan. 8th, 2009 05:09 pm (UTC)
so it downloaded 2443 seemingly random entries. how are they sorted? the first is old, the last is old.
[info]jimmyether wrote:
Jan. 9th, 2009 01:55 am (UTC)
Mine were in order by number... L-1 being the oldest. Sort by name and it'll make some sense.
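
One catch: a plain alphabetical sort (like the one ls does) puts L-10 ahead of L-2. Here is a small sketch of sorting by the number instead. It assumes file names shaped like L-1, L-2 and so on, as above; the helper names entry_number and sort_entries are invented for this example.

```python
import re


def entry_number(name):
    """Return the integer after the dash in a name like 'L-12', or -1."""
    match = re.search(r"-(\d+)", name)
    return int(match.group(1)) if match else -1


def sort_entries(names):
    """Sort names by their leading letter, then by numeric id."""
    return sorted(names, key=lambda name: (name[:1], entry_number(name)))
```

With this, sort_entries(["L-10", "L-2", "L-1"]) puts L-1 first and L-10 last, which should match oldest-to-newest if the numbers were assigned in posting order.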
[info]xianrex wrote:
Jan. 7th, 2009 03:51 pm (UTC)
Brilliant! And luckily I happen to have an OSX at home.
[info]jimmyether wrote:
Jan. 7th, 2009 06:17 pm (UTC)
It'll work on Windows too... you just have to have Python installed. I'm not sure what hoops you have to jump through for that.
[info]a_bonsai_tree wrote:
Jan. 7th, 2009 08:30 pm (UTC)
Put it all on a big floppy disk, Xian.
[info]madbard wrote:
Jan. 7th, 2009 06:27 pm (UTC)
Is there anything wrong with setting up a temp password while you use LJBook?
[info]jimmyether wrote:
Jan. 7th, 2009 06:37 pm (UTC)
That might be one way around it. Assuming you get to change it back before someone gets in and locks you out. :) I know I know. Unlikely.

I think my issue with LJBook, aside from data being sent in the clear and god knows what being logged, is just that it outputs to PDF. I like having it all in the XML files, because I can have a simple script go through and parse them or search through them. I also don't think you get the comments with LJBook.
[info]madbard wrote:
Jan. 7th, 2009 06:49 pm (UTC)
You do get comments with LJBook. The formatting is a bit plain, though, and there are no images. How does your Python script fare on this front?
[info]jimmyether wrote:
Jan. 7th, 2009 06:59 pm (UTC)
oh, it's just XML files for each entry and comment thread. So, yeah, no formatting whatsoever. You get the original HTML code used in your post along with a lot of internal LJ data like timestamp, tags and such. But the power of the XML is that you can slurp it up into other things easily.
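
As a rough idea of what "slurping" can look like, here is a minimal sketch of a keyword search over a folder of downloaded files. It assumes a current Python 3 and only that each file is well-formed XML; it does not depend on any particular tag names. The function name search_entries is made up for this example.

```python
import os
import xml.etree.ElementTree as ET


def search_entries(folder, term):
    """Return sorted names of files in folder whose XML text contains term."""
    term = term.lower()
    hits = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip anything that is not well-formed XML
        # Join the text of every element, so subject, body and metadata
        # are all searched, case-insensitively.
        text = " ".join(root.itertext()).lower()
        if term in text:
            hits.append(name)
    return sorted(hits)
```

Something like search_entries("yourusername", "coffee") would then list the files mentioning that word, where "yourusername" stands in for the folder the script created.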
[info]madbard wrote:
Jan. 8th, 2009 06:45 pm (UTC)
I understand the programmer's love of abstraction, and the desire to have the data in malleable form that could be rendered in any manner visually. My problem would be that I'd never get around to rendering it in any form. Maybe the solution is to get both: download the XML, and have a quick-n-easy rendering on hand.
[info]jimmyether wrote:
Jan. 8th, 2009 11:38 pm (UTC)
Granted, I am a complete geek, but still... when I write I tend to write in text files with a bunch of metadata at the head anyway. Makes for really easy content searches when I want to find something. So, even if I never get to writing the tool, I can always get at what I need... a lot faster in fact than I ever could on LJ (since they never had a decent journal search for some dumb reason).
[info]iheartnerds87 wrote:
Jan. 8th, 2009 02:24 am (UTC)
is there any way to get the xml files in a more 'readable' format?
[info]jimmyether wrote:
Jan. 8th, 2009 05:14 am (UTC)
For just simply reading them, the easiest way would be to use Firefox and do "open file" on each. Then you can copy and paste into a new blog or whatever you like. The HTML will still be intact. I'm probably going to eventually write a little script that slurps them up into a database and outputs them in a local web format I can browse and search... but god knows when I'll get to that.
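
In the meantime, here is a very rough sketch of the database half of that idea. It swaps in Python 3 and SQLite (both chosen just to keep the example self-contained). It also assumes each entry file has a subject element and an event element under the root; those tag names are a guess, so check one of your own files and adjust. The function name load_entries and the table layout are invented for this example.

```python
import os
import sqlite3
import xml.etree.ElementTree as ET


def load_entries(folder, db_path=":memory:"):
    """Load entry files from folder into an SQLite table and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries "
        "(name TEXT PRIMARY KEY, subject TEXT, body TEXT)"
    )
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        # "subject" and "event" are assumed tag names, not confirmed ones.
        subject = root.findtext("subject", default="")
        body = root.findtext("event", default="")
        # Re-running replaces rows, so updated entries overwrite old copies.
        conn.execute(
            "INSERT OR REPLACE INTO entries VALUES (?, ?, ?)",
            (name, subject, body),
        )
    conn.commit()
    return conn
```

From there a query like SELECT name FROM entries WHERE body LIKE '%coffee%' gives a basic search, and the rows could be written out as local HTML pages for browsing.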


Jimmy Is...

a recording engineer, a mastering engineer, a record producer, a record label owner, a recording artist, a songwriter, a lyricist, an audiophile, a music nut, a web developer, a web marketing consultant, a Ruby on Rails enthusiast, an entrepreneur, a social networking advocate, a "Getting Things Done" zealot, a coffee junkie, a doting father, a loving husband and all-around swell guy.


