Comments on DMOZ importer
Forum Regular
Usergroup: Customer
Joined: Jul 09, 2008
Total Topics: 22
Total Comments: 110
The first one I imported was my country's computer category.
That was over 2 MB and 5,400 links.
developer
Usergroup: Administrator
Joined: Dec 20, 2001
Location: Diamond Springs, California
Total Topics: 61
Total Comments: 7868
TulipChain seems to work better now than it used to for some reason, I can do fairly large categories in it now.
The DMOZ importer will run about 50 times faster in today's release, and thus handle much larger files. It can also handle the report file format of your previous post.
Forum Regular
Usergroup: Customer
Joined: Jul 09, 2008
Total Topics: 22
Total Comments: 110
You upgraded the dmoz importer?
You're fast...
Is that a single file download, Paul?
(Note: it sure is positive that the importer can handle big files better, but we still can't gain time on the regenerate process.)
Forum Regular
Usergroup: Customer
Joined: Jul 09, 2008
Total Topics: 22
Total Comments: 110
That sounds good, huh...
But that's local... generally a server would take longer, I think...
But Paul... it was so fast I can hardly imagine the import went well...
I'm off to bed and will see it in the morning, as it has to regenerate about 3,900 cats and 22,000 links all together.
EDIT: OK, the file I did while sleeping was 4.6 MB and had 8,857 links, but the category shows 3,817.
It's still regenerating... and I don't know if that has an effect.
It looks like everything is imported, as I see that everything is inserted alphabetically...
The last cat from the big file (letter W) shows.
But strangely, it doesn't show 8,857 links.
And... when I went to my computer I should have seen the starting page of the regenerator, but instead saw a bunch of JavaScript with the name Dynamicdrive in it.
So I didn't expect any problems, as you upgraded the file and I have max execution time set to 0.
developer
Usergroup: Administrator
Joined: Dec 20, 2001
Location: Diamond Springs, California
Total Topics: 61
Total Comments: 7868
Server should be the same time, and that's about what I'd expect from my tests. Cutting directly to MySQL insertions instead of creating objects makes a huge difference. In my tests the number shown matches the number in the report file once regeneration is fully complete. If it doesn't for you, send me your report file.
And... when I went to my computer I should have seen the starting page of the regenerator, but instead saw a bunch of JavaScript with the name Dynamicdrive in it.
That sounds like a page dying before it could complete, maybe, which would mean regeneration didn't finish. If you could give me the actual HTML source of what you saw, and then tell me what you see when you press the back button, I'd have a better idea. Could be you had some sort of intensive process that ran at a set time which temporarily killed your local server.
Forum Regular
Usergroup: Customer
Joined: Jul 09, 2008
Total Topics: 22
Total Comments: 110
Hi Paul, well, copying the error is what I normally do, but I completely forgot, as I had just come out of bed when I saw it...
But the final count did show, though only after a long time regenerating, and after it ran the category count again by itself... that's what happened when I inserted another file.
All seems to be working, but the regenerating is a killer.
It takes too much time, especially as the totals increase.
Amazing how you solved the importing, because it goes at such a speed.
But a solution to the regenerating would be a good thing...
Isn't it possible for it to continue counting from its last run,
after a big file is added?
Forum Regular
Usergroup: Customer
Joined: Jul 09, 2008
Total Topics: 22
Total Comments: 110
OK Paul...
I finished another category...
Before I started I had 34,878 links and 9 categories.
The file was 5.22 MB and had 13,219 links in it.
I started the regeneration at 03:04 (my time) and it finished 8 hours later.
Total now: 48,097 links in 10 categories.
That's a bit long, huh...
And that's local, with a dual core processor... so that's fast enough...
developer
Usergroup: Administrator
Joined: Dec 20, 2001
Location: Diamond Springs, California
Total Topics: 61
Total Comments: 7868
MySQL is the bottleneck for regeneration, so there's very little I can do about it. I'll check over the code to combine updates and remove anything unnecessary in 5.0, but I doubt the improvement will be very significant.
Is it categories that are slowest to regenerate, or links? I can probably speed up links more than categories.
Running MySQL table optimization (near the top of advanced options) could also speed it up.
Forum Regular
Usergroup: Customer
Joined: Jul 09, 2008
Total Topics: 22
Total Comments: 110
Well, it's hard to tell which one is slowing down...
Now I have 10 categories (4,827 counting subcategories),
but I do have a feeling the cats are slower now... and as it goes 10 at a time, it takes a long time...
Forum Regular
Usergroup: Customer
Joined: Jul 09, 2008
Total Topics: 22
Total Comments: 110
Hi Paul,
I just noticed something weird...
For a lot of imported links the site description isn't taken along with them...
I checked the import file, and the links do include site descriptions...
Any way to do a mass "get meta description" somehow?
developer
Usergroup: Administrator
Joined: Dec 20, 2001
Location: Diamond Springs, California
Total Topics: 61
Total Comments: 7868
Looks like an issue with the new dmoz importer... I really should've only put it in 5.0. Fixed shortly.
No way to get descriptions after the fact.
Forum Regular
Usergroup: Customer
Joined: Jul 09, 2008
Total Topics: 22
Total Comments: 110
Ah, I see...
So it's kind of messed up now...
So I have to start over again...
Forum Regular
Usergroup: Customer
Joined: Jul 09, 2008
Total Topics: 22
Total Comments: 110
By the way, Paul... it happened again...
It was just a file with 8,000 links, and when regenerating I received the following error...
Not sure if it was with the cats or links...
function toggleBox(szDivID) {
  var obj = document.getElementById(szDivID);
  if (obj.style.visibility == "visible") iState = 0;
  else iState = 1;
  obj.style.visibility = iState ? "visible" : "hidden";
  obj.style.height = iState ? "100%" : "0px";
}

/* Select and Copy form element script- By Dynamicdrive.com
   For full source, Terms of service, and 100s DHTML scripts
   Visit http://www.dynamicdrive.com */

//specify whether contents should be auto copied to clipboard (memory)
//Applies only to IE 4+
//0=no, 1=yes
var copytoclip=1

function HighlightAll(theField) {
  var tempval=eval("document."+theField)
  tempval.focus()
  tempval.select()
  if (document.all&&copytoclip==1){
    therange=tempval.createTextRange()
    therange.execCommand("Copy")
    window.status="Contents highlighted and copied to clipboard"
    setTimeout("window.status=''",3800)
  }
}

function checkall() {
  var boxes = document.getElementsByName('selection[]');
  for (i=0; i < boxes.length; i++) boxes[i].checked = true;
  // ...identical loops repeat for 'link[]', 'linkedit[]', 'linkid[]',
  // 'cat[]', 'comment[]', 'member[]', 'feed[]', 'rating[]', 'attach[]',
  // 'event[]', and 'quote[]'...
}

function uncheckall() {
  var boxes = document.getElementsByName('selection[]');
  for (i=0; i < boxes.length; i++) boxes[i].checked = false;
  // ...same loops with checked = false; the pasted output cuts off here
developer
Usergroup: Administrator
Joined: Dec 20, 2001
Location: Diamond Springs, California
Total Topics: 61
Total Comments: 7868
Fixed descriptions... but out of a report file with 16,839 links I only get 1,268 loaded, so I'm going to put the old dmoz importer back into the 4.1 series and keep the experimenting to 5.0.
developer
Usergroup: Administrator
Joined: Dec 20, 2001
Location: Diamond Springs, California
Total Topics: 61
Total Comments: 7868
Implemented (in 5.0) a couple of new speedup option checkboxes, then tested with 21,096 links and 1,408 subcategories. "Regenerate everything" took 76 minutes. Links are the definite slowest spot for me, just due to the number of them.
At any rate, figure out what you want to import and do it all at once, so you only have to regenerate once.