Webmastersite.net

DMOZ importer
and regenerating after it

Comments on DMOZ importer

mariow
Forum Regular

Usergroup: Customer
Joined: Jul 09, 2008

Total Topics: 22
Total Comments: 110
mariow
Posted Jul 11, 2008 - 2:05 AM:

Hello Paul,

Is the DMOZ importer still working and compatible with the current WSN Links?
I ask because I ran it the same way as with my previous version, as described here: scripts.webmastersite.net/w...x.php?section=dmozimporter , but this time it imported the links as full HTML.
As a result it trashed my index, because the lines were very long, like this:

[<a href="http://editors.dmoz.org:8080/editors/editurl.cgi?url=http%3A%2F%2Fwww.academieanderlecht.be%2F&cat=World/Nederlands/Regionaal/Belgi%c3%ab/Brussel/Anderlecht/Onderwijs/">EDIT</a>] <span style="background-color: rgb(0, 255, 0);">[]</span> <a href="http://www.academieanderlecht.be/">Academie voor Beeldende Kunsten</a>

And that can't be right. :)
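
For reference, a minimal sketch of what that row contains. This is not the importer's code; it only illustrates that the line holds one real site link next to a DMOZ editor "EDIT" link and a colored status marker. The names `LinkCollector` and `site_links` are made up for this example, and filtering on `editors.dmoz.org` is an assumption based on the single row quoted above.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> element in a fragment."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (href, text) pairs
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def site_links(fragment):
    """Return the links in a fragment, skipping DMOZ editor "EDIT" links."""
    parser = LinkCollector()
    parser.feed(fragment)
    parser.close()
    return [(href, text) for href, text in parser.links
            if "editors.dmoz.org" not in href]


# The row quoted in the post above, split across lines for readability.
line = ('[<a href="http://editors.dmoz.org:8080/editors/editurl.cgi?'
        'url=http%3A%2F%2Fwww.academieanderlecht.be%2F&cat=World/Nederlands/'
        'Regionaal/Belgi%c3%ab/Brussel/Anderlecht/Onderwijs/">EDIT</a>] '
        '<span style="background-color: rgb(0, 255, 0);">[]</span> '
        '<a href="http://www.academieanderlecht.be/">'
        'Academie voor Beeldende Kunsten</a>')

print(site_links(line))
```

Only the last anchor survives the filter, which is the URL and title one would expect to end up in the directory.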

What I did: under "Category to merge the import into:" I picked the right category.
Then, after I clicked import, the right frame went blank, and it seems to have imported about 5 links as full HTML.
That was it.

So what's wrong?
Any ideas?
Paul
developer

Usergroup: Administrator
Joined: Dec 20, 2001
Location: Diamond Springs, California

Total Topics: 61
Total Comments: 7868
Paul
Posted Jul 11, 2008 - 4:46 PM:

Works fine on correctly-generated files, like the attached. Are you using the same 6.02 version of TulipChain? Newer TulipChain versions are likely incompatible.


Attached Files:
mariow
Posted Jul 11, 2008 - 5:54 PM:

Hello Paul,

Yes, I'm using the same version.
I've attached the generated file I used.

Attached Files:
Paul
Posted Jul 11, 2008 - 11:01 PM:

As you can see, your file has all the colors of the rainbow, which aren't in mine. Please describe each step you take to generate it. My guess is you're running "check all sites" in there.
mariow
Posted Jul 12, 2008 - 4:08 AM:

You're right, Paul, I'm using "check all sites". If you don't, you could easily end up with links that are dead, or that shouldn't be included for some other reason. The colors indicate whether something is wrong. For example, I ran it again with only the first part checked, and it showed this:

Meta Redirect Found: http://www.greenbrussel.org Add trailing slash to domain.

You see it's in yellow?
The report page says that yellow means warning.
That's how the report is generated.
So why is it wrong to use the full site check?


And another thing, Paul:

I'm playing with a completely fresh install on a test site, and I'm trying out the DMOZ import again.

The first category is a file of about 2 MB.
But when importing I get:

Warning: require_once(../classes/bedrijven.php) [function.require-once]: failed to open stream: No such file or directory in /home/public_html/includes/prestart.php on line 23

Fatal error: require_once() [function.require]: Failed opening required '../classes/bedrijven.php' (include_path='.:/usr/lib/php:/usr/local/lib/php') in /home/public_html/includes/prestart.php on line 23



You see it requires classes/bedrijven.php.
bedrijven.php?
That's the name of a category inside Computers.
Why is it creating that, or looking for it?
And if it's because the file is too big, how do I import a category that is too big?
mariow
Posted Jul 12, 2008 - 2:58 PM:

OK, I tried it again exactly how you said, Paul.
It goes wrong all the way.
Look at the attached screenshots.

Screenshot 1 shows how the index gets trashed, messing up the top 5 block.

Screenshot 2 shows the inside of the category, with the same full HTML stuff.

That can't be right, huh? :(


Attached Files:
Paul
Posted Jul 13, 2008 - 1:27 AM:

Clearly you're still generating it wrong, and showing me screenshots of what wrong files do is useless. I know that if you feed garbage data into it, you get garbage data out of it. Either show me the file you're using or test mine. I've given you a correctly formed file in post #2, and nobody else seems to have any trouble following the procedure. But if you want me to tell you what's wrong with your latest file, you'll have to show it to me.

"If you don't you could easily end up with links that are dead"

Just use the script's dead link checker.
mariow
Posted Jul 13, 2008 - 6:36 AM:

But how many mistakes can I make, Paul?
It's basically the default settings you get when opening TulipChain.
Even when I don't run the check and the file looks the same as your sample, it fails.
Look at the attached file.

So what's so different from yours?
I don't see anything.
The only difference is that your HTML goes like:

<html>
<head>
<meta http-equiv='content-type' content='text/html;charset=UTF-8'>
<title>Dmoz Link Report for Business/Textiles_and_Nonwovens/Textiles/Carpets/</title>

And mine goes like:

<html><head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8"><title>Dmoz Link Report for World/Nederlands/Computers/Adviesbureaus/</title>

The only difference is in the HTML, where more of it sits on one line.
But with the attached file, the links go into my computer category with the full a href HTML, as shown in the previously uploaded samples.

Attached Files:
  • 1.htm (65 KB, 498 downloads)
mariow
Posted Jul 13, 2008 - 12:39 PM:

OK, after a lot of trying I found out what the import problem was, Paul.
When you generate a report and save it from the browser window that opens, it gets saved the wrong way.
It was saved as "Webpage, complete",
while it should have been saved as "Webpage, HTML only".
That did the trick.
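
A rough way to spot a report that was re-saved this way, as a sketch only: it relies on nothing more than the two headers quoted earlier in this thread, where the working file has `<html>` and `<head>` on separate lines with single-quoted attributes, and the failing file has `<html><head>` fused with double-quoted attributes. The function name `looks_resaved` is made up for this example, and other browsers or TulipChain versions may not follow this pattern.

```python
def looks_resaved(head: str) -> bool:
    """Guess whether a report header was rewritten by a browser save.

    Heuristic assumption, taken from the two samples in this thread:
    the generator's own output keeps <html> and <head> on separate
    lines and single-quotes the meta attributes; the re-saved copy
    fuses the tags and double-quotes the attributes.
    """
    lowered = head.lower()
    fused_tags = "<html><head>" in lowered
    double_quoted_meta = 'http-equiv="content-type"' in lowered
    return fused_tags or double_quoted_meta


# Header of the file that imported correctly (Paul's sample).
good = ("<html>\n<head>\n"
        "<meta http-equiv='content-type' content='text/html;charset=UTF-8'>\n")

# Header of the file that imported as raw HTML (mariow's sample).
bad = ('<html><head>\n'
       '<meta http-equiv="content-type" content="text/html;charset=UTF-8">')

print(looks_resaved(good), looks_resaved(bad))
```

A "Webpage, complete" save usually also leaves a companion folder of page resources next to the .htm file, which is another hint that the wrong save type was used.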

Another thing I would like to solve is the internal server error I get when importing a large file.
It's not the error I would expect.
Where does that come from?
Paul
Posted Jul 13, 2008 - 8:12 PM:

Show me a file that gives an internal server error. Probably a matter of your php memory limit setting.
mariow
Posted Jul 13, 2008 - 11:14 PM:

Hi,

Well, that happened on the server when running a 2 MB import file.
Once I had found out how to solve the saving problem, I started importing.
But with big files it shows the internal server error, which is very weird.
I will tell my host about this.

But since I installed a new version locally, I have no problems importing.
After I'm finished, I will upload it to the server.
mariow
Posted Jul 14, 2008 - 6:03 AM:

Paul,

Maybe a weird question,
but I'm importing locally now and that goes OK.
For example, I imported a computer category of 5500 links, a little more than 2 MB.
But some are even larger, and I now received an error:

Fatal error: Maximum execution time of 500 seconds exceeded in C:\xampp\htdocs\wsnlinks\databases\mysqli.php on line 17

But that's weird, as I changed the settings in my php.ini file to:

max_execution_time = 1200 ; Maximum execution time of each script, in seconds
max_input_time = 600 ; Maximum amount of time each script may spend parsing request data
memory_limit = 128M ; Maximum amount of memory a script may consume (16MB)

So why the time error?


mariow
Posted Jul 14, 2008 - 3:45 PM:

Never mind, Paul, I found it in the dmozimporter. ;)
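
For anyone hitting the same thing: a PHP script can change its own time limit at runtime with `set_time_limit()` or `ini_set('max_execution_time', ...)`, and that takes precedence over the value in php.ini. The thread does not show the actual line in the importer, so the sketch below simply searches PHP source for such calls. The helper names and the sample source are hypothetical; the 500 in the sample only mirrors the number in the error message.

```python
import re
from pathlib import Path

# Calls that change the execution time limit at runtime, overriding php.ini.
TIME_LIMIT_CALL = re.compile(
    r"""set_time_limit\s*\(\s*(\d+)\s*\)"""
    r"""|ini_set\s*\(\s*['"]max_execution_time['"]\s*,\s*['"]?(\d+)['"]?\s*\)""",
    re.IGNORECASE,
)


def find_time_limit_overrides(source):
    """Return (line_number, seconds) for each override in PHP source text."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in TIME_LIMIT_CALL.finditer(line):
            seconds = match.group(1) or match.group(2)
            hits.append((number, int(seconds)))
    return hits


def scan_tree(root):
    """Scan every .php file under root; map file path -> overrides found."""
    results = {}
    for path in Path(root).rglob("*.php"):
        hits = find_time_limit_overrides(path.read_text(errors="replace"))
        if hits:
            results[str(path)] = hits
    return results


# Hypothetical PHP source, only to show what the search reports.
sample = "<?php\nset_time_limit(500);\nini_set('max_execution_time', '500');\n"
print(find_time_limit_overrides(sample))
```

Running `scan_tree()` on the install directory lists every file that sets its own limit, which is the place to raise the value for very large imports.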
Paul
Posted Jul 15, 2008 - 4:25 AM:

I've never seen TulipChain successfully handle a category of 5500 links, so I haven't given much thought to handling files of that size. I'm planning an eventual rewrite of the file, which should make it more efficient.
mariow
Posted Jul 15, 2008 - 4:29 AM:

Well, the largest one I have here is 11 MB and has 22,000 links.