May 4, 2011

Extract content from HTML documents | CyberSEO Pro | Support Forum

Extract content from HTML documents
September 16, 2018
11:39 am
saviulisse67
Member
Members
Forum Posts: 45
Member Since:
September 5, 2018

I would like some instructions on how CyberSEO can extract content from HTML documents. Is there a link about this? I bought CyberSEO because the product description says it can extract content not only from RSS and Atom feeds but also from HTML documents.

http://www.cyberseo.net/premium-plugins/
It can import any type of content from CSV tables, raw text dumps, HTML documents and JSON files.

September 16, 2018
12:07 pm
CyberSEO
Admin
Forum Posts: 4059
Member Since:
July 2, 2009

Yes, it can import content from any CSV, RSS, or JSON source without problems, but not from every HTML file. There are some restrictions, of course: some HTML pages can be parsed, and some cannot.

September 17, 2018
9:02 am
saviulisse67

Do you have a link that explains which HTML documents can be extracted, and how to extract them if special procedures are needed? Thank you.

September 17, 2018
12:06 pm
CyberSEO

Sure. Here it is: Login to see this link

September 17, 2018
4:42 pm
saviulisse67

I followed the link (Login to see this link) and the procedure recommended on that site. Under "Site-specific extraction rules", the page says to run the command "git clone Login to see this link". I ran the command through PuTTY on a Debian 8 Linux server, but I got a "command not found" error. I believe the server does not recognize the git command. It's not important, I think, but do you have any idea where I went wrong in the procedure? Thank you.

September 18, 2018
8:07 pm
CyberSEO

No, there is no procedure I can suggest. It's a third-party script that is not included in CyberSEO; you can use it only as a separate product. So if you have any questions about it, please contact its developers directly.
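
On the "command not found" error reported for git clone: that message from a shell generally means the program is not installed or is not on the PATH, which matches the poster's guess. Below is a minimal sketch for checking this, assuming a POSIX shell and a Debian system with apt-get. The helper name have_cmd is hypothetical and used only for illustration; the script only prints an install hint and does not install anything itself.

```shell
#!/bin/sh
# have_cmd: hypothetical helper that succeeds if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd git; then
    # git is available; show which version the shell will run.
    echo "git found: $(git --version)"
else
    echo "git not found on PATH"
    # Assumption: a Debian-based server, where git is packaged as 'git'.
    # Installing requires root privileges.
    echo "try (as root): apt-get update && apt-get install git"
fi
```

If the script reports that git is found, the original git clone command should be recognized by the shell; whether the clone itself succeeds is a separate question for the script's developers.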

Forum Timezone: Europe/Amsterdam
