Extracted website incomplete at import

January 31, 2021
2:19 am

inTempoDK

Member

Members

Forum Posts: 6

Member Since:
January 11, 2021

Offline

Hi, I just tried to extract a page, but most of the data is missing.

This is what I see when I use the direct link to extract data:

Login to see this link

This is what I got:

Login to see this link

As you can see, everything after the image is gone. Why is this? It happens often (small parts missing, everything missing, etc.) with different RSS sources. Is it not possible to just import all the data "raw" and then clean it up yourself?

Correct me if I'm wrong, but the Advanced -> PHP command is run on the post after the import has already happened, correct?

January 31, 2021
2:30 am

inTempoDK

Ah, I found out that it does import everything IF I remove all the Advanced -> PHP commands.

I had these:

$post['post_content'] = preg_replace('/<div><img src=".*?" class="ff-og-image-inserted">\s*<\/div>/s', '', $post['post_content']);
$post['post_content'] = preg_replace('/<strong>.*?\?\)/s', '', $post['post_content']);

According to a regex test site, these two should work fine, so I don't know why they would remove all the content after the image.

January 31, 2021
2:50 am

inTempoDK

And just ignore this question. For some reason it didn't show up on the first try, but the preg_replace actually had a hit before the text, so it took the whole thing.

Sorry, my bad :/

While it's a nifty function, it's also dangerous :P
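For anyone hitting the same thing, here is a minimal sketch of how the second pattern can swallow far more than intended. The sample HTML is made up for illustration (the real feed item from this thread is not available), and the tighter pattern is only one possible fix, assuming the text to strip never contains a tag between `<strong>` and `?)`.

```php
<?php
// Hypothetical sample content, NOT the actual article from the thread.
$content = '<p><strong>Intro</strong> first paragraph.</p>'
         . '<p>Second paragraph.</p>'
         . '<p><strong>Read more (source?)</strong></p>';

// Pattern from the post above: the match starts at the FIRST <strong> and,
// with the /s flag, .*? may cross tags and newlines until the first "?)".
$loose = preg_replace('/<strong>.*?\?\)/s', '', $content);

// Tighter variant (an assumption about the intent): [^<]* cannot cross a
// tag boundary, so only a <strong> directly followed by plain text
// ending in "?)" is matched.
$tight = preg_replace('/<strong>[^<]*\?\)/', '', $content);

// preg_replace() returns null on a regex error, so check before printing.
if ($loose === null || $tight === null) {
    exit('Regex error: ' . preg_last_error_msg());
}

echo "loose: ", $loose, "\n";
echo "tight: ", $tight, "\n";
```

In the Advanced -> PHP box the same replacement would be applied to `$post['post_content']` instead of `$content`. Note that neither pattern removes the closing `</strong>`, so that may need its own cleanup.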

Though I have one more question: as Login to see this link doesn't seem to extract all the data on some articles (not this plugin's fault), is there a solution for that?

One example is:

Login to see this link

Any solution for this?

February 1, 2021
1:35 pm

CyberSEO

Admin

Forum Posts: 4072

Member Since:
July 2, 2009

Offline

No, there is no solution for the Full-Text-RSS script. It's a 3rd-party product and it is not included in the CyberSEO Pro distribution. You can use it as a stand-alone service under the GNU General Public License.

Forum Timezone: Europe/Amsterdam
