Extract content from HTML documents

September 16, 2018
11:39 am

saviulisse67

Member

Members

Forum Posts: 45

Member Since:
September 5, 2018

Offline

I would like some instructions on how CyberSEO can extract content from HTML documents. Is there a link about this? I bought CyberSEO because the product description said that it can extract content not only from RSS and Atom feeds but also from HTML documents.

http://www.cyberseo.net/premium-plugins/
It can import any type of content from CSV tables, raw text dumps, HTML documents and JSON files.

September 16, 2018
12:07 pm

CyberSEO

Admin

Forum Posts: 4059

Member Since:
July 2, 2009

Offline

Yes, it can import content from any CSV, RSS, or JSON source without any problems, but not from every HTML file. There are some restrictions, of course: some HTML pages can be parsed, but some cannot.

September 17, 2018
9:02 am

saviulisse67

Do you have a link that explains which HTML documents can be extracted, and how to extract them if special procedures are required? Thank you.

September 17, 2018
12:06 pm

CyberSEO

Sure. Here it is: Login to see this link

September 17, 2018
4:42 pm

saviulisse67

I followed the link (Login to see this link) and the procedure recommended on that site. Under "Site-specific extraction rules", the page says to run the command "git clone Login to see this link". I ran the command over PuTTY on a Debian 8 Linux server, but I got a "command not found" error. I believe the server does not recognize the git command. It's not critical, I think, but do you have any idea where I went wrong with the procedure? Thank you.

September 18, 2018
8:07 pm

CyberSEO

No, there is no procedure I can suggest. It's a third-party script that is not included in CyberSEO; you can use it only as a separate product. So if you have any questions about it, please contact its developers directly.
